thapbi-pict

Tree Health and Plant Biosecurity Initiative - Phytophthora ITS1 Classifier Tool

MIT License

thapbi-pict - THAPBI PICT v1.0.12 Latest Release

Published by peterjc 8 months ago

Released on PyPI on 2024-03-11:

https://pypi.org/project/thapbi-pict/1.0.12/

Fixes the inadvertent use of type-annotation syntax which had required Python 3.9 or later since THAPBI PICT v1.0.9. This release is tested on and requires at least Python 3.8.
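As an illustration only: one common annotation style that needs Python 3.9 or later is subscripting built-in collections, such as list[str]. Whether this was the exact construct involved is an assumption. The sketch below uses the typing module equivalents, which also work on Python 3.8. The function name count_lengths is hypothetical and not part of THAPBI PICT.

```python
from typing import Dict, List


def count_lengths(seqs: List[str]) -> Dict[int, int]:
    """Tally how many sequences have each length (hypothetical helper)."""
    # Writing list[str] / dict[int, int] here instead would raise a
    # TypeError at definition time on Python 3.8.
    tally: Dict[int, int] = {}
    for seq in seqs:
        tally[len(seq)] = tally.get(len(seq), 0) + 1
    return tally
```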

The edit-graph command now fails gracefully if the Graphviz fdp command line tool is missing.

Added heuristics for importing SINTAX style sequences when the genus is missing from the species field.

Metadata argument -x now accepts multiple columns.

thapbi-pict - THAPBI PICT v1.0.11

Published by peterjc 8 months ago

Released on PyPI on 2024-03-05:

https://pypi.org/project/thapbi-pict/1.0.11/

Harmonised the ASV naming used in the optional FASTA and BIOM output files to match the main TSV and Excel usage. The sample-tally command and stage of the pipeline can now optionally output a BIOM file.

Importing the (pre-classifier) BIOM file and matching FASTA file of ASV sequences into Qiime2 has been tested.

thapbi-pict - THAPBI PICT v1.0.10

Published by peterjc 8 months ago

Released on PyPI on 2024-02-26:

https://pypi.org/project/thapbi-pict/1.0.10/

Changed sample report 'Unique' column to be the number of unique ASVs as per the expectation in the examples. Previously this showed the number of unique species or species complex classifications (a smaller number, although often close with good database coverage and a low diversity sample).
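To make the difference between the two counts concrete, here is a minimal sketch using made-up data. The sequences and their classifications are invented for illustration, and this is not the tool's reporting code.

```python
# Hypothetical classified ASVs for one sample: (ASV sequence, classification).
sample_asvs = [
    ("ACGTACGT", "Phytophthora gloveri"),
    ("ACGTACGA", "Phytophthora gloveri"),
    ("TTGACCAA", "Phytophthora condilina"),
]

# New meaning of 'Unique': the number of distinct ASV sequences.
unique_asvs = len({seq for seq, _ in sample_asvs})

# Old meaning: the number of distinct classifications, which is smaller
# whenever several ASVs are assigned to the same species.
unique_species = len({species for _, species in sample_asvs})
```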

Adjusted the metadata counts in the read report to show 'Accepted' and 'Unique' in read report column headers, and replaced the old TOTAL row values (equal to the 'Accepted' values) with MAX values instead.

Use thousands separators in Excel for read counts etc. This should respect regional settings.

The import command now rejects species names with semi-colon in them, which would cause issues downstream as the semi-colon is used to separate multiple taxonomic matches.
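A minimal sketch of why such names are a problem, assuming multiple matches are simply joined with a semi-colon. The function names check_species_name and join_matches are hypothetical, not the tool's API.

```python
from typing import List


def check_species_name(name: str) -> str:
    """Return the name unchanged, or raise if it contains a semi-colon."""
    if ";" in name:
        raise ValueError(f"Species name contains a semi-colon: {name!r}")
    return name


def join_matches(names: List[str]) -> str:
    """Combine several taxonomic matches into one semi-colon separated field."""
    # If a name could itself contain ';', splitting this field again later
    # would yield the wrong number of entries.
    return ";".join(check_species_name(name) for name in names)
```

With this check in place, join_matches(["Phytophthora gloveri", "Phytophthora condilina"]) can be split back on the semi-colon into exactly two names.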

Updated the gg_to_sintax.py helper script to accept FASTA and TSV input files from Qiime2 archives.

thapbi-pict - THAPBI PICT v1.0.9

Published by peterjc 9 months ago

Released on PyPI on 2024-02-12:

https://pypi.org/project/thapbi-pict/1.0.9/

Using Python type annotations (internal code change).

thapbi-pict - THAPBI PICT v1.0.8

Published by peterjc 9 months ago

Released on PyPI on 2024-02-06:

https://pypi.org/project/thapbi-pict/1.0.8/

Updated the default database with the February 2024 NCBI taxonomy (e.g. Phytophthora glovera is now P. gloveri).

Added several additional curated Phytophthora to the default ITS1 database, including 15 novel taxa which we have observed in multiple samples from the environment or tree nurseries, and KP691408.1 as Phytophthora taxon Catala2017sp4 from Català et al. (2017) https://doi.org/10.1111/ppa.12541

Corrected the year in the novel species entries like Phytophthora taxon Catala2015sp1 to match the citation Català et al. (2015) https://doi.org/10.1371/journal.pone.0119311

Also added a 1s6g classifier following the existing naming pattern.

thapbi-pict - THAPBI PICT v1.0.7

Published by peterjc 9 months ago

Released on PyPI on 2024-01-29:

https://pypi.org/project/thapbi-pict/1.0.7/

Updated the default database to treat Phytophthora cambivora as a synonym of the more recent Phytophthora x cambivora description as a hybrid.

Also fixed the documentation builds on Read-The-Docs.

thapbi-pict - THAPBI PICT v1.0.6

Published by peterjc 9 months ago

Released on PyPI on 2024-01-24:

https://pypi.org/project/thapbi-pict/1.0.6/

Updated the NCBI import and curated entries in the default database.

Added a minimum sample count option to edit-graph command.

Added basic Python type annotation in helper scripts (internal change).

thapbi-pict - THAPBI PICT v1.0.5

Published by peterjc 10 months ago

Released on PyPI on 2023-11-22:

https://pypi.org/project/thapbi-pict/1.0.5/

Updated the NCBI import in the default database, and scripted most of what was a semi-manual process to do this.

thapbi-pict - THAPBI PICT v1.0.4

Published by peterjc 10 months ago

Released on PyPI on 2023-11-20:

https://pypi.org/project/thapbi-pict/1.0.4/

Dropped unused -m / --method argument to the edit-graph command.

thapbi-pict - THAPBI PICT v1.0.3

Published by peterjc about 1 year ago

Released on PyPI on 2023-09-04:

https://pypi.org/project/thapbi-pict/1.0.3/

Updated the NCBI taxonomy and bulk imported entries at genus level, along with adding two curated entries for Phytophthora condilina.

Belated update to the Batovska et al. (2021) pest insects worked example for the pooled marker report changes in v1.0.2.

thapbi-pict - THAPBI PICT v1.0.2

Published by peterjc about 1 year ago

Released on PyPI on 2023-08-18:

https://pypi.org/project/thapbi-pict/1.0.2/

Documentation updated to request citation of the now published paper:

Cock et al. (2023) "THAPBI PICT - a fast, cautious, and accurate metabarcoding analysis pipeline"
PeerJ 11:e15648 https://doi.org/10.7717/peerj.15648

The summary stage now preserves the cutadapt, singletons, etc. columns in the pooled reports for multiple markers, taking the sum or maximum as appropriate. This allows the plot_reduction.py script to be used on pooled reports.

The plot_reduction.py script has been enhanced to offer raw counts and percentages in addition to the original stacked counts mode, and the ability to pool sample groups by column(s) of the metadata.

Also a belated update to the soil_nematodes/ example for the v1.0.1 change to unoise-l read correction.

thapbi-pict - THAPBI PICT v1.0.1

Published by peterjc over 1 year ago

Released on PyPI on 2023-07-26:

https://pypi.org/project/thapbi-pict/1.0.1/

Now requires at least Python 3.7 (since Python 3.6 is no longer maintained). Fixed some rare corner-case read-corrections in unoise-l mode. Improved memory usage in the sample-tally step. Updated the tests for a slight change in chimera detection in VSEARCH 2.23.0. Adjustments to the logging in verbose mode.

Added a new script producing a data-reduction stacked plot as used for the accepted manuscript, based on the figure in the preprint.

Minor documentation changes including noting the paper has been accepted, but until it is published continue to suggest citing the preprint:

Cock et al. (2023) "THAPBI PICT - a fast, cautious, and accurate metabarcoding analysis pipeline" bioRxiv
https://doi.org/10.1101/2023.03.24.534090

thapbi-pict - THAPBI PICT v1.0.0

Published by peterjc over 1 year ago

Released on PyPI on 2023-05-19:

https://pypi.org/project/thapbi-pict/1.0.0/

Minor documentation changes since v0.14.1, including adding links to our preprint:

Cock et al. (2023) "THAPBI PICT - a fast, cautious, and accurate metabarcoding analysis pipeline" bioRxiv
https://doi.org/10.1101/2023.03.24.534090

thapbi-pict - THAPBI PICT v0.14.1

Published by peterjc over 1 year ago

Released on PyPI on 2023-03-13:

https://pypi.org/project/thapbi-pict/0.14.1/

The tool now offers optional BIOM format output (requested at the command line with the --biom switch), which requires the Python biom-format library to be installed. See:

McDonald et al. (2012) "The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome"
https://doi.org/10.1186/2047-217X-1-7

This will hopefully facilitate interoperability and downstream analysis of our tool's output.

thapbi-pict - THAPBI PICT v0.14.0

Published by peterjc over 1 year ago

Released on PyPI on 2023-02-03:

https://pypi.org/project/thapbi-pict/0.14.0/

The tool now offers UNOISE style read-correction (off by default), either a built-in implementation of the published algorithm, or by invoking the command line tools USEARCH or VSEARCH. This algorithm requires access to all the reads prior to abundance level thresholds, and thus required some restructuring of the pipeline. The read-preparation step therefore only discards singletons, with the new sample-tally step combining all the unique sequence variants (ASVs). This can optionally apply read-correction before applying the abundance thresholds (which can still be set dynamically using control samples). This is output as a sequence tally table which can be converted to BIOM format. Furthermore, the classifier output now extends the sequence tally table with additional columns containing the taxid and genus-species.
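The reordered steps can be sketched roughly as follows. This is a simplified illustration rather than the tool's implementation: the function names, the per-sample threshold logic and the threshold value are assumptions, and the optional read-correction step is only marked with a comment.

```python
from collections import Counter
from typing import Dict, List


def prepare_reads(reads: List[str]) -> Dict[str, int]:
    """Count the unique sequences in one sample, discarding only singletons."""
    counts = Counter(reads)
    return {seq: n for seq, n in counts.items() if n > 1}


def sample_tally(
    per_sample: Dict[str, Dict[str, int]], min_abundance: int
) -> Dict[str, Dict[str, int]]:
    """Pool the samples into one ASV table, then apply an abundance threshold."""
    table: Dict[str, Dict[str, int]] = {}
    for sample, counts in per_sample.items():
        for seq, n in counts.items():
            table.setdefault(seq, {})[sample] = n
    # Optional UNOISE style read-correction would go here, while all the
    # low abundance (non-singleton) sequences are still available.
    kept: Dict[str, Dict[str, int]] = {}
    for seq, by_sample in table.items():
        passing = {s: n for s, n in by_sample.items() if n >= min_abundance}
        if passing:
            kept[seq] = passing
    return kept
```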

thapbi-pict - THAPBI PICT v0.13.6

Published by peterjc almost 2 years ago

Released on PyPI on 2022-12-28:

https://pypi.org/project/thapbi-pict/0.13.6/

Miscellaneous small fixes and documentation updates, including fixing the fractional abundance threshold in sample-tally which was not quite strict enough (only relevant if used outside the pipeline).

thapbi-pict - THAPBI PICT v0.13.5

Published by peterjc almost 2 years ago

Released on PyPI on 2022-12-21:

https://pypi.org/project/thapbi-pict/0.13.5/

Miscellaneous small fixes and documentation updates, including fixing excessive memory usage in the new sample-tally command with larger datasets.

thapbi-pict - THAPBI PICT v0.13.4

Published by peterjc almost 2 years ago

Released on PyPI on 2022-12-07:

https://pypi.org/project/thapbi-pict/0.13.4/

Extends the use of the sample-tally command added in v0.13.3, which can now also perform the abundance filtering including control-driven abundance thresholds. The species versus sample tally table header now also includes an entry for which of the samples are controls.

Again, the motivation behind this change is mostly as a stepping stone towards planned future functionality.

thapbi-pict - THAPBI PICT v0.13.3

Published by peterjc almost 2 years ago

Released on PyPI on 2022-11-25:

https://pypi.org/project/thapbi-pict/0.13.3/

Introduces a new sample-tally command which is now used in the pipeline in place of the older fasta-nr command. This outputs a BIOM style TSV with one row per ASV giving the sample abundances as columns, with the ASV sequence as the final column. This TSV file is now used as input to the summary command, rather than scanning the per-sample intermediate FASTA files.
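A minimal sketch of reading a table with this shape, assuming a single header row, an ASV name in the first column, one column per sample, and the sequence in the final column. The header text, sample names and counts in the example are invented, and the real file may carry additional header lines not handled here.

```python
import csv
import io
from typing import Dict, List, TextIO, Tuple

# Hypothetical example content, not real output from the tool.
example_tsv = (
    "#ASV\tsample_A\tsample_B\tSequence\n"
    "asv1\t120\t0\tACGTACGT\n"
    "asv2\t15\t37\tTTGACCAA\n"
)


def load_tally(
    handle: TextIO,
) -> Tuple[List[str], Dict[str, Dict[str, int]], Dict[str, str]]:
    """Return the sample names, per-ASV sample counts, and per-ASV sequences."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:-1]  # everything between the ASV name and the sequence
    counts: Dict[str, Dict[str, int]] = {}
    seqs: Dict[str, str] = {}
    for row in reader:
        asv = row[0]
        seqs[asv] = row[-1]
        counts[asv] = dict(zip(samples, (int(v) for v in row[1:-1])))
    return samples, counts, seqs


samples, counts, seqs = load_tally(io.StringIO(example_tsv))
```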

The motivation behind this change is mostly as a stepping stone towards planned future functionality.

thapbi-pict - THAPBI PICT v0.13.2

Published by peterjc almost 2 years ago

Released on PyPI on 2022-11-11:

https://pypi.org/project/thapbi-pict/0.13.2/

Sped up substr classifier, especially with larger databases. Various small documentation updates, and some minor code cleanup and refactoring.