Plasmid-Borne Identity
What does it do?
Plasmid-Borne Identity performs GeneSeekr and MOB-suite analyses on FASTA files, and creates a report combining and summarising the outputs.
MOB-suite is a set of tools developed by the Public Health Agency of Canada for detecting plasmids in draft genome assemblies. This tool runs the mob_recon part of the suite, which first detects plasmids in the assemblies and then performs typing on them. More details on MOB-suite, including fairly extensive descriptions of the output files it produces, can be found at the MOB-suite GitHub repository.
How do I use it?
Subject
In the Subject field, put plasmid_borne_identity. Spelling counts, but case doesn't.
Description
All you need to put in the description is a list of SEQIDs you want to process, one per line.
Attachments
You are required to attach a FASTA-formatted file containing the gene(s) you want analysed.
Optional arguments
- BLAST program. NOTE: GeneSeekr (and therefore Plasmid-Borne Identity) does not check whether your query or database is the appropriate molecule type for the requested program.
  - default is blastn
  - You can select one of the following BLAST programs to use:
    - blastn - nt query: nt db
    - blastp - protein query: protein db
    - blastx - translated nt query: protein db
    - tblastn - protein query: translated nt db
    - tblastx - translated nt query: translated nt db
  - modify as follows: blast=tblastx
- Minimum cutoff for matches to be included in report.
  - default is 70
  - modify as follows: cutoff=80
- E-value cutoff
  - default is 1E-05
  - modify as follows: evalue=1E-10 or evalue=0.01
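To show how the key=value syntax and the defaults fit together, here is a minimal sketch of a parser for these options. It is not the tool's actual code: the function name parse_options and the way unrecognised lines are skipped are assumptions made for illustration. Only the option names, the defaults, and the list of BLAST programs come from the list above.

```python
# Hypothetical sketch: option names, defaults and valid programs are taken
# from the documentation; the parsing logic itself is an assumption.
VALID_BLAST = {"blastn", "blastp", "blastx", "tblastn", "tblastx"}
DEFAULTS = {"blast": "blastn", "cutoff": 70.0, "evalue": 1e-05}


def parse_options(lines):
    """Return the documented defaults, overridden by any key=value lines."""
    options = dict(DEFAULTS)
    for line in lines:
        key, sep, value = line.strip().partition("=")
        key = key.strip().lower()
        value = value.strip()
        if not sep or key not in DEFAULTS:
            # Not an option line (for example a SEQID), so leave it alone.
            continue
        if key == "blast":
            value = value.lower()
            if value not in VALID_BLAST:
                raise ValueError(f"unknown BLAST program: {value}")
            options[key] = value
        else:
            # cutoff and evalue are numeric; float() accepts 1E-10 and 0.01.
            options[key] = float(value)
    return options
```

For example, parse_options(["blast=tblastx", "cutoff=80"]) keeps the default E-value of 1E-05 and overrides the other two settings.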
Example
For an example Plasmid-Borne Identity analysis, see issue 15644.
Interpreting Results
Plasmid-Borne Identity will upload five separate reports once it is complete:
geneseekr_{BLAST_PROGRAM}.xlsx
strain name and percent identity match for all query genes
geneseekr_{BLAST_PROGRAM}_detailed.csv
strain name, and BLAST summary information, including percent match, alignment length,
subject length, e-value, number of positives, number of mismatches, and number of gaps for every gene
geneseekr_{BLAST_PROGRAM}.csv
same as geneseekr_{BLAST_PROGRAM}.xlsx, but in .csv format
mob_recon_summary.csv
shows any contigs that are predicted to be plasmids. Note that all contigs calculated to be chromosomal are ignored. Location is the name of the predicted plasmid, while Contig is the name of the contig predicted to contain plasmid sequence. One plasmid can be composed of several contigs if it could not be circularised.
plasmid_borne_summary.csv
combines information from geneseekr_{BLAST_PROGRAM}.csv and mob_recon_summary.csv. The contigs of all predicted AMR genes from the geneseekr_{BLAST_PROGRAM}_detailed.csv report are used to search the mob_recon_summary report. The plasmid predictions, as well as all the incompatibility types for that plasmid, are extracted and used in the report. Location will specify either chromosome or the name of the predicted plasmid.
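The lookup described above can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: the function name combine_reports and the column names Strain, Gene and Incompatibility are hypothetical, and only Location and Contig are named in the report description. Rows are plain dictionaries, such as those produced by csv.DictReader.

```python
def combine_reports(geneseekr_rows, mob_recon_rows):
    """Attach a plasmid or chromosome location to every gene hit.

    Column names other than Location and Contig are assumed for illustration.
    """
    # Index the plasmid predictions by (strain, contig) for fast lookup.
    plasmid_by_contig = {}
    for row in mob_recon_rows:
        key = (row["Strain"], row["Contig"])
        plasmid_by_contig[key] = (row["Location"], row["Incompatibility"])

    combined = []
    for row in geneseekr_rows:
        key = (row["Strain"], row["Contig"])
        # Chromosomal contigs are absent from the plasmid summary, so a
        # missing key means the gene sits on the chromosome.
        location, incompatibility = plasmid_by_contig.get(key, ("chromosome", ""))
        combined.append({
            "Strain": row["Strain"],
            "Gene": row["Gene"],
            "Contig": row["Contig"],
            "Location": location,
            "Incompatibility": incompatibility,
        })
    return combined
```

Keying on both strain and contig matters because contig names are often reused between assemblies, so a contig name alone would not identify a plasmid uniquely.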
How long does it take?
GeneSeekr is very fast, while MOB-suite is relatively slow - it should take a few minutes to analyze each SEQID requested.
What can go wrong?
- Requested SEQIDs are not available. If we can't find some of the SEQIDs that you request, you will get a warning message informing you of it.
- Issues with the required FASTA-formatted targets file, such as not attaching the file or incorrect FASTA formatting.
- Specifying the incorrect BLAST analysis program for the provided sequences, e.g. blastp with a nucleotide query and db.
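Because the molecule type is not checked for you, it can help to sanity-check the targets file before submitting. The sketch below is a hypothetical pre-check, not part of Plasmid-Borne Identity: the function names are made up, and the molecule guess is a simple heuristic that treats a sequence as nucleotide when it contains only IUPAC nucleotide codes. The query types per program follow the list under Optional arguments.

```python
# IUPAC nucleotide codes, including ambiguity codes.
NUCLEOTIDE = set("ACGTUNRYKMSWBDHV")
# Query molecule expected by each BLAST program.
QUERY_TYPE = {"blastn": "nt", "blastp": "protein", "blastx": "nt",
              "tblastn": "protein", "tblastx": "nt"}


def read_fasta(text):
    """Parse FASTA text into {name: sequence}, raising on malformed input."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            if not name:
                raise ValueError("FASTA header with no name")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[name] += line.upper()
    if not records:
        raise ValueError("no FASTA records found")
    return records


def guess_molecule(sequence):
    """Heuristic: only nucleotide codes means nt, anything else is protein."""
    return "nt" if set(sequence) <= NUCLEOTIDE else "protein"


def mismatched_queries(text, blast_program):
    """Return the names of records that do not suit the chosen program."""
    expected = QUERY_TYPE[blast_program]
    return [name for name, seq in read_fasta(text).items()
            if guess_molecule(seq) != expected]
```

An empty list from mismatched_queries means every record looks appropriate for the chosen program. The heuristic can misclassify a short protein that happens to use only letters shared with the nucleotide alphabet, so treat the result as a hint rather than a guarantee.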