Plasmid Atlas - A web interface to browse plasmids and their associated genes. Visit us at:
GPL-3.0 License
Plasmid Atlas is a web-based tool that empowers researchers to easily and rapidly access information related to plasmids present in NCBI's RefSeq database.
In pATLAS each node (or circle) represents a plasmid, and each link between two plasmids means that they share roughly 90% or more average nucleotide identity (corresponding to the default maximum mash distance of 0.1 used to build the database).
With this tool we have two main goals:
Tiago F Jesus, Bruno Ribeiro-Gonçalves, Diogo N Silva, Valeria Bortolaia, Mário Ramirez, João A Carriço; Plasmid ATLAS: plasmid visual analytics and identification in high-throughput sequencing data, Nucleic Acids Research, gky1073, https://doi.org/10.1093/nar/gky1073
If you are interested in learning how to use pATLAS, please refer to the GitBook documentation.
Mash (2.0) - You can download mash version 2.0.0 directly here: linux and OSX. Other releases were not tested but may be downloaded from the Mash GitHub releases page.
PostgreSQL (>= 10.0) - This script uses a PostgreSQL database to store the results: releases page
Python 3 and the corresponding pip.
To install all other dependencies just run: pip install -r requirements.txt
MASHix.py is the main script to generate the database. It produces a matrix of pairwise comparisons between the sequences in the input fasta file(s). Note that it reads multifastas, i.e., each header in a fasta is treated as a reference sequence. A usage sketch follows the options list below.
'-i','--input_references' - 'Provide the input fasta files to parse. These inputs will be joined into a master fasta.'
'-o','--output' - 'Provide an output tag'
'-t', '--threads' - 'Provide the number of threads to be used'
'-db', '--database_name' - 'This argument must be provided as the last
argument. It states the database name that must be used.'
'-k','--kmers' - 'Provide the k-mer size to be passed to mash sketch. Default: 21'
'-p','--pvalue' - 'Provide the p-value to consider a distance
significant. Default: 0.05'
'-md','--mashdist' - 'Provide the maximum mash distance to be included in the matrix. Default: 0.1'
'-no_rm', '--no-remove' - 'Specify if you do not want to remove the
output concatenated fasta.'
'-hist', '--histograms' - 'Plots histograms to check the distribution of distance values.'
'-non', '--nodes_ncbi' - 'Specify the path to the file containing nodes.dmp from NCBI'
'-nan', '--names_ncbi' - 'Specify the path to the file containing names.dmp from NCBI'
'--search-sequences-to-remove' - 'This option runs only the part of the script required to generate the filtered fasta, allowing, for instance, debugging of sequences that shouldn't be removed by the 'cds' and 'origin' keywords.'
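As referenced above, a usage sketch combining the main options (all file names and the output tag are hypothetical placeholders):
# sketch: build the database from two plasmid fastas using the options above
# plasmids_1.fas, plasmids_2.fas, nodes.dmp, names.dmp and my_run are placeholders
MASHix.py -i plasmids_1.fas plasmids_2.fas -o my_run -t 4 \
    -k 21 -p 0.05 -md 0.1 -non nodes.dmp -nan names.dmp \
    -db <database_name>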
Go to db_manager/config_default.py
and edit the following line:
SQLALCHEMY_DATABASE_URI = 'postgresql:///<custom_database_name>'
Go to db_manager/db_app/models.py and edit the following line:
__tablename__ = "<custom_table_name>"
# dump an existing database to a SQL file
pg_dump <db_name> > <file_name.sql>
# restore that dump into a database
psql -U <user_name> -d <db_name> -f <file_name.sql>
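For example, to back up a database and restore it elsewhere (the database name patlas_db and the user postgres are hypothetical):
# back up the hypothetical patlas_db database...
pg_dump patlas_db > patlas_db_backup.sql
# ...then recreate it and load the dump on the target machine
createdb patlas_db
psql -U postgres -d patlas_db -f patlas_db_backup.sql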
This script inherits a class from ODiogoSilva/Templates and uses it to parse abricate outputs, dumping them to the PostgreSQL database depending on the input type provided.
"-i", "--input_file" - "Provide the abricate file to parse to db.
It can accept more than one file in the case of
resistances."
"-db_psql", "--database_name" - "his argument must be provided as the
last argument. It states the database
name that must be used."
"-db", "--db" - "Provide the db to output in psql models."
"-id", "--identity" - "minimum identity to be reported to db"
"-cov", "--coverage" - "minimum coverage do be reported to db"
"-csv", "--csv" - "Provide card csv file to get correspondence between
DNA accessions and ARO accessions. Usually named
aro_index.csv. By default this file is already
available in patlas repo with a specific path:
'db_manager/db_app/static/csv/aro_index.csv'"
This script is located in the utils folder and can be used to generate a JSON file with the corresponding taxonomic tree. It fetches, for a given species, the genus, family and order to which it belongs. A usage sketch is given at the end of this section.
Note: for plasmids, some filtering is applied to the resulting taxids and to the list of species, which other users may want to skip.
-i INPUT_LIST, --input_list INPUT_LIST
provide a file with a list of species. Each species should be on its own line.
-non NODES_FILE, --nodes_ncbi NODES_FILE
specify the path to the file containing
nodes.dmp from NCBI
-nan NAMES_FILE, --names_ncbi NAMES_FILE
specify the path to the file containing
names.dmp from NCBI
-w, --weirdos This option allows the user to add checks for weird entries. This is mainly used to parse the plasmid RefSeq, so if you do not want this behavior, use this option.
Entries filtered by the weirdos option:
From taxonomy levels:
From species in fasta headers:
It also attempts to fix some bugs in species naming, such as the following:
Note: Yes, people like to give interesting names to bacteria...
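As a usage sketch for this script (the script name taxa_fetch.py and all file names are assumptions; check the utils folder for the actual script):
# sketch: fetch the taxonomic tree for a list of species
# taxa_fetch.py, species_list.txt, nodes.dmp and names.dmp are assumed names
python utils/taxa_fetch.py -i species_list.txt -non nodes.dmp -nan names.dmp -w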
git clone https://github.com/tiagofilipe12/pATLAS
Install its dependencies
Configure the database:
createdb <database_name>
pATLAS/patlas/db_manager/db_create.py <database_name>
Run abricate for the master fasta generated by MASHix.py (master_fasta_*.fas).
# e.g.
abricate --db card <master_fasta*.fas> > abr_card.tsv
abricate --db resfinder <master_fasta*.fas> > abr_resfinder.tsv
abricate --db vfdb <master_fasta*.fas> > abr_vfdb.tsv
abricate --db plasmidfinder <master_fasta*.fas> > abr_plasmidfinder.tsv
# e.g.
abricate2db.py -i abr_plasmidfinder.tsv -db plasmidfinder \
-id 80 -cov 90 -csv aro_index.csv -db_psql <database_name>
These steps are fully automated in the nextflow pipeline pATLAS-db-creation.
If you want to add your own plasmids to the pATLAS database without asking for them to be added to the pATLAS website, you can provide custom fasta files when building the database using the -i option of MASHix.py, as sketched below. Then follow the steps described above.
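For instance, a sketch mixing the RefSeq plasmids with your own sequences (all file names are hypothetical):
# hypothetical file names: both fastas are joined into a single master fasta
MASHix.py -i refseq_plasmids.fas my_plasmids.fas -o custom_run -t 4 \
    -non nodes.dmp -nan names.dmp -db <database_name>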
You can run pATLAS locally with minimal requirements by using patlas-compose. This will automatically handle the installation of version 1.5.2 of pATLAS and launch the service in a local instance. For that you just require:
Then, follow these simple steps:
git clone https://github.com/bfrgoncalves/patlas-compose
cd patlas-compose
docker-compose up
Wait for the line * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit) to show up, meaning that the service is now running. Then, access pATLAS at 127.0.0.1:5000 or 0.0.0.0:5000.
Note: This methodology is highly recommended.
pATLAS can be run locally if you have PostgreSQL installed and configured. Afterwards, you just need to:
git clone https://github.com/tiagofilipe12/pATLAS
Create your custom database version, generate the default pATLAS database, or download the sql file from version 1.5.2 (the tar.gz archive).
Note: if you download the sql file from version 1.5.2 you may skip steps 3 to 4 and continue with step 5.
Make sure all the necessary files are in place: MASHix.py writes an import_to_vivagraph.json file to <tag_provided_to_o_flag>/results. Place this file in the patlas/db_manager/db_app/static/json folder, as sketched below. Then enable the use of the import_to_vivagraph.json file by changing a variable named devel from false to true in patlas/db_manager/db_app/static/js/pATLASGlobals.js.
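As a sketch, assuming MASHix.py was run with -o my_run (a hypothetical tag):
# my_run is a placeholder for the tag passed to the -o flag of MASHix.py
cp my_run/results/import_to_vivagraph.json patlas/db_manager/db_app/static/json/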
createdb <your_database>
Install backend dependencies:
# within the root directory of this repository
pip install -r requirements.txt
# change directory to the static directory where `index.html` will look for
# its dependencies
cd patlas/db_manager/db_app/static/
# then install them (package.json is located in this directory)
yarn install
# You can also use a local installation of webpack.
# entry-point.js is the config file where all the imported modules are
# called
node_modules/webpack/bin/webpack.js entry-point.js
Run run.py:
# within the root directory of this repository
cd patlas/db_manager
./run.py <your_database>
Note: the database name is important, since it tells the frontend where to fetch the data.
Access pATLAS at 127.0.0.1:5000.
Using devel = true isn't very efficient, so you can let the force directed graph render in a devel = true session and then, once you are satisfied, pause the force layout using the buttons available in pATLAS and press Shift+Ctrl+Space at the same time. This will take a while, but eventually it will generate a file named filtered.json.
Once you have this file, add it to the patlas/db_manager/db_app/static/json folder and change the devel variable back to false, as sketched below. This will use the previously saved positions to render a pre-rendered network.
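As a sketch, assuming the browser saved filtered.json to the Downloads folder (the actual location may vary):
# hypothetical download location; adjust to wherever filtered.json was saved
mv ~/Downloads/filtered.json patlas/db_manager/db_app/static/json/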