pyrodigal

Cython bindings and Python interface to Prodigal, an ORF finder for genomes and metagenomes. Now with SIMD!

GPL-3.0 License

Downloads: 37.8K
Stars: 139
Committers: 4

pyrodigal - v3.5.2 (Latest Release)

Published by github-actions[bot] about 2 months ago

Added

  • Warning in CLI when given sequences with empty identifiers.

Fixed

  • FASTA parser used in CLI crashing on empty header lines (#61).
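
To illustrate the kind of failure fixed here, the following is a minimal sketch of a FASTA reader that tolerates a bare ">" header. It is not the parser shipped in pyrodigal.cli; the function name `parse_fasta` and the warning text are assumptions made for this example.

```python
import io
import warnings


def parse_fasta(handle):
    """Yield (identifier, sequence) pairs from a FASTA text stream.

    Hypothetical helper, not pyrodigal's parser: an empty header line
    (a bare ">") yields an empty identifier and a warning, not a crash.
    """
    identifier = None
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if identifier is not None:
                yield identifier, "".join(chunks)
            # split() returns [] for an empty header, so guard the indexing
            fields = line[1:].split()
            identifier = fields[0] if fields else ""
            if not identifier:
                warnings.warn("sequence with empty identifier", stacklevel=2)
            chunks = []
        elif line and identifier is not None:
            chunks.append(line)
    if identifier is not None:
        yield identifier, "".join(chunks)


# The first record has an empty header, the second a regular one.
records = list(parse_fasta(io.StringIO(">\nATGC\n>seq2 description\nGGCC\nTTAA\n")))
```

Indexing the result of `split()` without the guard is a typical way for such a parser to raise on an empty header, although the actual cause in #61 may differ.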

pyrodigal - v3.5.1

Published by github-actions[bot] 3 months ago

Fixed

  • Outdated code in pyrodigal.cli breaking the CLI.

pyrodigal - v3.5.0

Published by github-actions[bot] 3 months ago

Added

  • Support for reading from stdin in CLI (#35).
  • Flag for changing parallel computation to use Pool instead of ThreadPool (#57).
  • Better documentation of command line interface (#56).
  • Allow changing the formatter class in pyrodigal.cli.argument_parser.
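
The Pool/ThreadPool switch can be pictured with the standard library alone. This is a rough sketch, not the CLI code: `use_processes` is a hypothetical stand-in for the new flag, and `gc_content` stands in for the per-contig gene finding.

```python
from multiprocessing.pool import Pool, ThreadPool


def gc_content(sequence):
    """Stand-in for per-contig work (the real CLI would run gene finding)."""
    return (sequence.count("G") + sequence.count("C")) / max(len(sequence), 1)


def process_contigs(contigs, jobs=2, use_processes=False):
    """Map the work over contigs with a thread pool or a process pool.

    `use_processes` is a hypothetical name standing in for the CLI flag.
    """
    # Both classes share the same map() interface, so only the class changes.
    pool_class = Pool if use_processes else ThreadPool
    with pool_class(jobs) as pool:
        return pool.map(gc_content, contigs)


if __name__ == "__main__":
    print(process_contigs(["ATGC", "GGGC", "ATAT"]))
```

A process pool requires the worker function and its arguments to be picklable, which is presumably why correct pickling of GeneFinder matters for this mode.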

Changed

  • Migrate documentation to pydata-sphinx-theme.

Fixed

  • Cython warnings with unused except * statements in MetagenomicBins.
  • Signatures of __init__ methods missing from all Cython types after the v3.0 update.
  • Small typos in documentation.

pyrodigal - v3.4.1

Published by github-actions[bot] 5 months ago

Changed

  • Refactor SIMD code to reduce number of required registers, and improve SSE2 performance.
  • Refactor Prodigal initialization functions into sparse initializer code to reduce library size.

pyrodigal - v3.4.0

Published by github-actions[bot] 5 months ago

Added

  • strict argument to Gene.translate to control translation of ambiguous codons with unambiguous translation (#54).
  • strict_translation argument to Genes.write_genbank and Genes.write_translations.
  • Support for translation tables 26 to 33 in Gene.translate.
  • Support for translation tables 26, 29, 30, 32 and 33 in GeneFinder.train.
  • Genes.score property to count the total score of all extracted genes.
  • full_id parameter to Genes.write_gff, Genes.write_translations and Genes.write_genes to control the ID field written for each gene (#53).
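
The idea behind strict translation can be sketched as follows: an ambiguous codon such as GCN encodes alanine whatever N stands for, so a non-strict translation may emit A while a strict one emits X. The code below is an illustration under that reading, using the standard genetic code (table 1). `translate_codon` is a hypothetical helper, not the Gene.translate implementation.

```python
from itertools import product

# IUPAC ambiguity codes mapped to the nucleotides they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code (translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_codon(codon, strict=True):
    """Translate one uppercase codon, treating ambiguity per `strict`."""
    if codon in CODONS:
        return CODONS[codon]
    if strict:
        return "X"
    # Expand the ambiguous codon into every concrete codon it may represent.
    candidates = {
        CODONS["".join(c)]
        for c in product(*(IUPAC.get(base, "ACGT") for base in codon))
    }
    # Only emit a residue when all expansions agree on it.
    return candidates.pop() if len(candidates) == 1 else "X"
```

With this sketch, `translate_codon("GCN", strict=False)` gives "A", while `translate_codon("ATN", strict=False)` still gives "X" because ATG (M) disagrees with the three isoleucine codons.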

Changed

  • Gene.translate now raises a warning when called with a translation table incompatible with the training info.

Fixed

  • Bug in code for masking trailing nucleotides (#55).

pyrodigal - v3.3.0

Published by github-actions[bot] 9 months ago

Added

  • CLI option to disable translation of stop codons (#51, by @zclaas).

Changed

  • Scorer internal API now separates connection scoring from overlap disentangling.

Fixed

  • Bug with computation of minimum node in connection scoring loop (hyattpd/Prodigal#108).
  • Out-of-bounds sequence access in _shine_dalgarno_exact and _shine_dalgarno_mm methods of Sequence.
  • Memory leak in Nodes.__setstate__ caused by incorrect reallocation.

pyrodigal - v3.2.2

Published by github-actions[bot] 9 months ago

Fixed

  • Always mark SSE2 support on x86-64 CPUs independently of archspec-detected features (#49).

pyrodigal - v3.2.1

Published by github-actions[bot] 11 months ago

Added

  • Option to change argument parser in pyrodigal.cli.main.

pyrodigal - v3.2.0

Published by github-actions[bot] 11 months ago

Added

  • AVX-512 implementation of the SIMD pre-filter.
  • Additional support for reading lz4-, xz- and zstd-compressed input in the CLI.
  • Option to change gene finder type in pyrodigal.cli.main.
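
Transparent decompression is commonly done by sniffing the leading magic bytes of the input. The sketch below shows that approach for the formats covered by the Python standard library (gzip, bzip2, xz); lz4 and zstd need third-party packages and are left out. `open_maybe_compressed` is a hypothetical helper, and whether the CLI detects formats this way is an assumption.

```python
import bz2
import gzip
import io
import lzma

# Leading magic bytes of each format, mapped to a stdlib opener.
MAGIC = {
    b"\x1f\x8b": gzip.open,
    b"BZh": bz2.open,
    b"\xfd7zXZ\x00": lzma.open,
}


def open_maybe_compressed(stream):
    """Return a binary reader that transparently decompresses `stream`.

    `stream` must be a seekable binary file object positioned at its start.
    """
    head = stream.read(6)
    stream.seek(0)
    for magic, opener in MAGIC.items():
        if head.startswith(magic):
            return opener(stream, "rb")
    # No known signature: treat the input as uncompressed.
    return stream


payload = b">contig1\nATGAAATAG\n"
variants = [payload, gzip.compress(payload), bz2.compress(payload), lzma.compress(payload)]
decoded = [open_maybe_compressed(io.BytesIO(data)).read() for data in variants]
```

The sketch relies on `seek`, so a non-seekable stream would need to be buffered first.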

pyrodigal - v3.1.1

Published by github-actions[bot] 12 months ago

Fixed

  • Incorrect unpickling of GeneFinder causing crashes with multiprocessing (#46).

pyrodigal - v3.1.0

Published by github-actions[bot] about 1 year ago

Added

  • Support for Python 3.12.
  • min_mask argument to GeneFinder to control the minimum length of masked regions when mask=True.
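
As a rough picture of what a minimum mask length does, the sketch below collects runs of unknown nucleotides and keeps only those at least `min_mask` long. `masked_regions` is a hypothetical helper; the half-open coordinates and the default of 50 are assumptions of this example, not documented pyrodigal behaviour.

```python
import re


def masked_regions(sequence, min_mask=50):
    """Return (begin, end) spans of unknown-nucleotide runs to mask.

    Runs of "N" shorter than `min_mask` are left unmasked. `min_mask`
    is expected to be a positive integer.
    """
    pattern = re.compile("N{%d,}" % min_mask, re.IGNORECASE)
    return [match.span() for match in pattern.finditer(sequence)]


# A 3-nucleotide run and a 6-nucleotide run of unknown bases.
sequence = "ATG" + "N" * 3 + "CCC" + "N" * 6 + "TAA"
long_runs = masked_regions(sequence, min_mask=5)
all_runs = masked_regions(sequence, min_mask=3)
```

Raising the threshold keeps short stretches of unknown bases available to the gene finder, while long gaps are still excluded.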

pyrodigal - v3.0.1

Published by github-actions[bot] about 1 year ago

Fixed

  • Genes.write_scores and Genes.write_gff crashing on empty Genes (#44).

pyrodigal - v3.0.0

Published by github-actions[bot] about 1 year ago

Added

  • MetagenomicBins collection to store a dense array of MetagenomicBin objects.
  • metagenomic_bins keyword argument to GeneFinder to control which models are used when running gene finding in meta mode (#24).
  • metagenomic_bin attribute to Genes referencing the metagenomic model with which the genes were predicted, if in meta mode.
  • Additional TrainingInfo properties (missing_motif_weight, coding_statistics).
  • Setters for all remaining TrainingInfo properties.
  • Proper TrainingInfo constructor with configuration option for all attributes.
  • TrainingInfo.to_dict method to extract all parameters from a TrainingInfo.
  • Genes.write_genbank method to write a GenBank record with all predicted genes from a sequence.
  • include_stop flag to Gene.translate and Genes.write_translations to allow excluding the stop codon from the translated sequence.
  • include_translation_table flag to Genes.write_gff to include the translation table in the GFF attributes of each gene.
  • gbk output format to the Pyrodigal CLI.
  • Sequence.unknown property exposing the number of unknown nucleotides in the sequence.
  • Sequence.start_probability and Sequence.stop_probability to estimate the probability of encountering a start and a stop codon based on the GC%.
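
One way to estimate codon probabilities from the GC% is to treat each position as an independent draw, with G and C each at half the GC fraction and A and T each at half the remainder. The sketch below follows that model. The codon sets (ATG/GTG/TTG and TAA/TAG/TGA) and the independence assumption belong to this example; the formula actually used by Sequence.start_probability and Sequence.stop_probability may differ.

```python
START_CODONS = ("ATG", "GTG", "TTG")
STOP_CODONS = ("TAA", "TAG", "TGA")


def codon_probability(codons, gc):
    """Probability that a random codon is one of `codons` at GC fraction `gc`."""
    # Per-nucleotide frequencies implied by the GC fraction alone.
    freq = {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}
    total = 0.0
    for codon in codons:
        p = 1.0
        for base in codon:
            p *= freq[base]
        total += p
    return total


start_at_half = codon_probability(START_CODONS, 0.5)
stop_at_half = codon_probability(STOP_CODONS, 0.5)
```

At 50% GC every codon has probability 1/64, so both values are 3/64. Under this model, stop codons become rarer as the GC fraction rises because they are AT-rich.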

Fixed

  • Genes.write_gff not properly reporting the number of bytes written.
  • Merge several nogil sections in Sequence constructor.
  • Several Cython functions missing a noexcept qualifier.

Changed

  • BREAKING: Rename OrfFinder to GeneFinder for consistency.
  • BREAKING: Use memoryview to expose all TrainingInfo attributes instead of manually building lists or tuples.
  • Reorganize memory management of the built-in metagenomic models.
  • Make the internal Cython module public (pyrodigal.lib) to allow importing the underlying classes in other Cython projects.
  • Use typing.Literal for allowed translation table values in pyrodigal.lib annotations.
  • Cache intermediate log-odds in Nodes._raw_coding_score to reduce calls to pow and log functions.
  • Inline connection scoring functions to reduce function call overhead.
  • Reorganize struct _node fields to reduce size in memory.
  • Make GeneFinder.find_genes and GeneFinder.train reserve memory for the Nodes based on the GC% of the input sequence.
  • Avoid storing temporary results in the generic implementation of ConnectionScorer.compute_skippable.
  • Use Cython freelist for allocating Node, Gene, MetagenomicBin and Mask.
  • Increase minimum allocation for Genes and Nodes to reduce early reallocations.

Removed

  • BREAKING: metagenomic_bin attribute of TrainingInfo.

pyrodigal - v3.0.0-alpha4

Published by github-actions[bot] about 1 year ago

Added

  • Sequence.unknown property exposing the number of unknown nucleotides in the sequence.
  • Sequence.start_probability and Sequence.stop_probability to estimate the probability of encountering a start and a stop codon based on the GC%.

Changed

  • Cache intermediate log-odds in Nodes._raw_coding_score to reduce calls to pow and log functions.
  • Inline connection scoring functions to reduce function call overhead.
  • Reorganize struct _node fields to reduce size in memory.
  • Make GeneFinder.find_genes and GeneFinder.train reserve memory for the Nodes based on the GC% of the input sequence.
  • Avoid storing temporary results in the generic implementation of ConnectionScorer.compute_skippable.

pyrodigal - v3.0.0-alpha3

Published by github-actions[bot] about 1 year ago

Fixed

  • Merge several nogil sections in Sequence constructor.
  • Several Cython functions missing a noexcept qualifier.

Changed

  • Use Cython freelist for allocating Node, Gene, MetagenomicBin and Mask.
  • Increase minimum allocation for Genes and Nodes to reduce early reallocations.

pyrodigal - v3.0.0-alpha2

Published by github-actions[bot] about 1 year ago

Added

  • Genes.write_genbank method to write a GenBank record with all predicted genes from a sequence.
  • include_stop flag to Gene.translate and Genes.write_translations to allow excluding the stop codon from the translated sequence.
  • include_translation_table flag to Genes.write_gff to include the translation table in the GFF attributes of each gene.
  • gbk output format to the Pyrodigal CLI.

Fixed

  • Genes.write_gff not properly reporting the number of bytes written.

Changed

  • Use typing.Literal for allowed translation table values in pyrodigal.lib annotations.

pyrodigal - v3.0.0-alpha1

Published by github-actions[bot] about 1 year ago

Added

  • MetagenomicBins collection to store a dense array of MetagenomicBin objects.
  • metagenomic_bins keyword argument to GeneFinder to control which models are used when running gene finding in meta mode (#24).
  • metagenomic_bin attribute to Genes referencing the metagenomic model with which the genes were predicted, if in meta mode.
  • Additional TrainingInfo properties (missing_motif_weight, coding_statistics).
  • Setters for all remaining TrainingInfo properties.
  • Proper TrainingInfo constructor with configuration option for all attributes.
  • TrainingInfo.to_dict method to extract all parameters from a TrainingInfo.

Changed

  • BREAKING: Rename OrfFinder to GeneFinder for consistency.
  • Reorganize memory management of the built-in metagenomic models.
  • Make the internal Cython module public (pyrodigal.lib) to allow importing the underlying classes in other Cython projects.
  • BREAKING: Use memoryview to expose all TrainingInfo attributes instead of manually building lists or tuples.

Removed

  • BREAKING: metagenomic_bin attribute of TrainingInfo.

pyrodigal - v2.3.0

Published by github-actions[bot] over 1 year ago

Changed

  • Bump Cython to v3.0.0.

pyrodigal - v2.2.0

Published by github-actions[bot] over 1 year ago

Changed

  • Release GIL while masking sequence regions in Sequence.__init__.
  • Use archspec instead of cpu_features for runtime feature detection.

Added

  • CLI flag to run ORF detection in parallel when input contains several contigs.

Removed

  • Support for Python 3.5.

pyrodigal - v2.1.0

Published by github-actions[bot] over 1 year ago

Changed

  • Update Prodigal to v2.6.3+c1e2d36 to fix a bug with Shine-Dalgarno detection on reverse contig edge (hyattpd/Prodigal#100).

Added

  • CLI flags to set the minimum gene size (#32, by @cjprybol).

Fixed

  • ArchLinux User Repository package generation in CI.