kaldi-active-grammar

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

AGPL-3.0 License

Downloads: 875
Stars: 329
Committers: 5

kaldi-active-grammar - v3.1.0 Latest Release

Published by daanzu almost 3 years ago

Fixed

  • Fix updating of SymbolTable multiple times for new words, so that there is only one instance for a single Model.

Changed

  • Only mark lexicon stale if it was successfully modified.
  • Removed deprecated CLI binaries from Windows build, reducing wheel size by ~65%.

Donations are appreciated to encourage development.

Artifacts

  • Models are available here and below.

kaldi-active-grammar - v3.0.0

Published by daanzu almost 3 years ago

Changed

  • Pronunciation generation for lexicon now better supports local mode (using the g2p_en package), which is now also the default mode. It is also preferred over the online mode (using CMU's web service), which is now disabled by default. See the Setup section of the README for details. The new models now include the data files for g2p_en.
  • PlainDictation output now discards any silence words from transcript.
  • lattice_beam default value reduced from 6.0 to 5.0, to hopefully avoid occasional errors.
  • Removed deprecated CLI binaries from the build for Linux/Mac.

Fixed

  • Whitespace in the model path is once again handled properly (thanks @matthewmcintire).
  • NativeWFST.has_path() now handles loops.
  • Linux/Mac binaries are now more stripped.
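
The has_path() fix above concerns path checks on graphs that contain cycles. As a rough illustration only (this is not the NativeWFST implementation; the dictionary-of-arcs representation and the function below are assumptions made for this sketch), a reachability check that tolerates loops can track which states it has already visited:

```python
from collections import deque

def has_path(arcs, start, finals):
    """Return True if any final state is reachable from start.

    arcs maps a state to an iterable of next states. This is a
    hypothetical representation chosen only for this sketch.
    """
    visited = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in finals:
            return True
        for nxt in arcs.get(state, ()):
            if nxt not in visited:  # skip states already seen, so loops terminate
                visited.add(nxt)
                queue.append(nxt)
    return False

# A graph with a self-loop on state 1 and a cycle 1 -> 2 -> 1.
looped = {0: [1], 1: [1, 2], 2: [1, 3]}
print(has_path(looped, 0, {3}))

# A pure cycle with no route to the final state.
print(has_path({0: [1], 1: [0]}, 0, {2}))
```

Without the visited set, the second call would never terminate, which is the kind of failure a loop-aware check avoids.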

Donations are appreciated to encourage development.

Artifacts

  • Models are available here and below.

kaldi-active-grammar - v2.1.0

Published by daanzu over 3 years ago

You can subscribe to announcements on GitHub (see Watch panel above), or on Gitter (see instructions).

Donations are appreciated to encourage development.

See major changes introduced in v2.0.0 and associated downloads.

Added

  • NativeWFST support for checking for impossible graphs (no successful path), which can then fail to compile.
  • Debugging info for NativeWFST.

Changed

  • lattice_beam default value reduced from 8.0 to 6.0, to hopefully avoid occasional errors.
  • Minor fix for OpenBLAS compilation for some architectures on Linux/Mac.

Fixed

  • Reloading grammars with NativeWFST.

Artifacts

  • Models are available here
  • kaldi-dragonfly-winpython: A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython: A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

kaldi-active-grammar - v2.0.0: Faster Grammar Compilation; Cleaner Codebase; Preparation For New Features

Published by daanzu over 3 years ago

You can subscribe to announcements on GitHub (see Watch panel above), or on Gitter (see instructions).

Donations are appreciated to encourage development.

Added

  • Native FST support, via direct wrapping of OpenFST, rather than Python text-format implementation
    • Eliminates grammar (G) FST compilation step
  • Internalized many graph construction steps, via direct use of native Kaldi/OpenFST functions, rather than invoking separate CLI processes
    • Eliminates need for many temporary files (FSTs, .confs, etc) and pipes
  • Example usage for allowing mixing of free dictation with strict command phrases
  • Experimental support for "look ahead" graphs, as an alternative to full HCLG compilation
  • Experimental support for rescoring with CARPA LMs
  • Experimental support for rescoring with RNN LMs
  • Experimental support for "priming" RNNLM previous left context for each utterance

Changed

  • OpenBLAS is now the default linear algebra library (rather than Intel MKL) on Linux/MacOS
    • Because it is open source and provides good performance on all hardware (including AMD)
    • Windows is more difficult for this, and support is planned for a later release
  • Default tmp_dir is now set to [model_dir]/cache.tmp
  • tmp_dir is now optional, and only needed if caching compiled FSTs (or for certain framework/option combinations)
  • File cache is now stored at [model_dir]/file_cache.json
  • Optimized adding many new words to the lexicon, in many different grammars, all in one loading session: only rebuild L_disambig.fst once at the end.
  • External interfaces: Compiler.__init__(), decoding setup, etc.
  • Internal interfaces: wrappers, etc.
  • Major refactoring of C++ components, with a new inheritance hierarchy and configuration mechanism, making it easier to use and test features with and without "activity"
  • Many build changes
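
The new default locations listed above can be pictured with a small path sketch. The helper name is hypothetical and is not part of the library; only the two file names come from the notes:

```python
import os

def default_cache_paths(model_dir):
    """Build the default paths described in the notes for a given model directory."""
    return {
        # Default tmp_dir: [model_dir]/cache.tmp
        "tmp_dir": os.path.join(model_dir, "cache.tmp"),
        # File cache: [model_dir]/file_cache.json
        "file_cache": os.path.join(model_dir, "file_cache.json"),
    }

paths = default_cache_paths("kaldi_model")
print(paths["tmp_dir"])
print(paths["file_cache"])
```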

Removed

  • Python 2.7 support: it may still work, but will not be a focus.
  • Google cloud speech-to-text removed, as an unneeded dependency. Alternative dictation is still supported as an option, via a callback to an external provider.

Deprecated

  • Separate CLI Kaldi/OpenFST executables
  • Indirect AGF graph compilation (framework==agf-indirect)
  • Non-native FSTs
  • parsing_framework==text

Artifacts

  • Models are available here
  • kaldi-dragonfly-winpython: A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython: A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

kaldi-active-grammar - v1.8.0: New Models, Noise Resistance, Better Errors, More Documentation

Published by daanzu about 4 years ago

You can subscribe to announcements on GitHub (see Watch panel above), or on Gitter (see instructions).

Donations are appreciated to encourage development.

[GitHub is matching (only) my GitHub Sponsors donations.]

Added

  • New speech models (should be better in general, and support new noise resistance)
  • Failed AGF graph compilation now automatically saves and outputs stderr
  • Example of complete usage with a grammar and microphone audio
  • Various documentation

Changed

  • Top FST now accepts various noise phones (if present in speech model), making it more resistant to noise
  • Cleaned up error handling in the compiler, so that the Dragonfly backend can automatically print an excerpt of the Rule that failed

Fixed

  • Mysterious Windows newline bug in some environments

Artifacts

  • Models are available here
  • kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

kaldi-active-grammar - v1.7.0: Support compiling some complex grammars (Caster text manipulation)

Published by daanzu about 4 years ago

You can subscribe to announcements on GitHub (see Watch panel above), or on Gitter (see instructions).

Donations are appreciated to encourage development.

[GitHub is matching (only) my GitHub Sponsors donations.]

Added

  • Add automatic saving of text FST & compiled FST files with log level 5

Changed

  • Miscellaneous naming

Fixed

  • Support compiling some complex grammars (Caster text manipulation), by simplifying during compilation (remove epsilons, and determinize)

Artifacts

  • Models are available here
  • kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

kaldi-active-grammar - v1.6.0: Easier Configuration; Public Automated Builds

Published by daanzu over 4 years ago

This should be included in the next dragonfly version.

You can subscribe to announcements on Gitter: see instructions.

Added

  • Can now pass configuration dict to KaldiAgfNNet3Decoder, PlainDictationRecognizer (without HCLG.fst).
  • Continuous Integration builds run on GitHub Actions for Windows (x64), MacOS (x64), Linux (x64).

Changed

  • Refactor of passing configuration to initialization.
  • PlainDictationRecognizer.decode_utterance can take chunk_size parameter.
  • Smaller binaries: MacOS 11MB -> 7.6MB, Linux 21MB -> 18MB.
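
The chunk_size parameter mentioned above suggests feeding audio to the decoder in pieces. How the library consumes those pieces is not described in these notes, so the following is only a sketch of the general chunking idea; the helper name and the use of bytes as the unit are assumptions:

```python
def iter_chunks(data, chunk_size):
    """Yield successive slices of at most chunk_size bytes; the last may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]

audio = bytes(10)  # placeholder standing in for raw audio bytes
chunks = list(iter_chunks(audio, 4))
print([len(c) for c in chunks])
```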

Fixed

  • Confidence measurement in the presence of multiple, redundant rules.
  • Python3 int division bug for cloud dictation.

Artifacts

  • Models are available here
  • kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

Donations are appreciated to encourage development.

[GitHub is matching (only) my GitHub Sponsors donations.]

kaldi-active-grammar - v1.5.0: Improved Recognition Confidence Estimation

Published by daanzu over 4 years ago

You can subscribe to announcements on Gitter: see instructions.

Notes

  • Improved Recognition Confidence Estimation: two new, different measures:
    • confidence: basically the difference in how much "better" the returned recognition was, compared to the second best guess (>0)
    • expected_error_rate: an estimate of how often similar utterances are incorrect (roughly out of 1.0, but can be greater)
  • Refactoring in preparation for future improvements
  • Various bug fixes & optimizations
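
The confidence measure is described above as roughly the gap between the best recognition and the second-best guess. The notes do not give the actual scoring, so this is a toy illustration under the assumption that each hypothesis has a single numeric score where higher is better; the function name is hypothetical:

```python
def confidence_margin(scores):
    """Difference between the best and second-best hypothesis scores."""
    if len(scores) < 2:
        raise ValueError("need at least two hypotheses")
    best, second = sorted(scores, reverse=True)[:2]
    return best - second

# Three hypothetical hypothesis scores (higher is better).
margin = confidence_margin([-12.5, -15.0, -20.0])
print(margin)
```

A larger margin means the top hypothesis stood out more clearly from its nearest competitor.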

Artifacts

  • Models are available here
  • kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

Donations are appreciated to encourage development.

[GitHub is matching (only) my GitHub Sponsors donations.]

kaldi-active-grammar - v1.4.0: MacOS Support, And Faster Graph Compilation

Published by daanzu over 4 years ago

Support is now included in dragonfly2 v0.22.0! You can try a self-contained distribution available below.

You can subscribe to announcements on Gitter: see instructions.

Notes

  • MacOS Support
  • Faster Graph Compilation
  • Dictation: the dictation model no longer recognizes a zero-word sequence
  • Various bug fixes & optimizations

Artifacts

  • kaldi_model_daanzu*: A better acoustic model, and varying levels of language model for dictation (bigger is generally better).
  • kaldi_model_zamia: A compatible general English Kaldi nnet3 chain model.
  • kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

Donations are appreciated to encourage development.

[GitHub is matching (only) my GitHub Sponsors donations.]

kaldi-active-grammar - v1.3.0: Preparation and Fixes for Next Generation of Models

Published by daanzu almost 5 years ago

This should be included in the next dragonfly version, or you can try a self-contained distribution available below.

You can subscribe to announcements on Gitter: see instructions.

Notes

  • Next Generation of Models: support for a new generation of models, trained on more data, and with hopefully better accuracy.
  • User Lexicon: if there is a user_lexicon.txt file in the current working directory of your initial loader script, its contents will be automatically added to the user_lexicon.txt in the active model when it is loaded.
  • Various bug fixes & optimizations
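
The User Lexicon behavior above amounts to merging one user_lexicon.txt into another. The library's own merge logic is not shown in these notes; the sketch below only illustrates the idea, assuming one entry per line (a word followed by its phones). The function name and the sample pronunciations are made up for the example:

```python
import os
import tempfile

def merge_user_lexicon(source_path, target_path):
    """Append entries from source_path that target_path does not already contain."""
    existing = set()
    needs_newline = False
    if os.path.exists(target_path):
        with open(target_path, encoding="utf-8") as f:
            text = f.read()
        existing = {line.strip() for line in text.splitlines() if line.strip()}
        needs_newline = bool(text) and not text.endswith("\n")
    added = []
    with open(source_path, encoding="utf-8") as f:
        for line in f:
            entry = line.strip()
            if entry and entry not in existing:
                existing.add(entry)
                added.append(entry)
    if added:
        with open(target_path, "a", encoding="utf-8") as f:
            if needs_newline:
                f.write("\n")  # keep the first new entry on its own line
            for entry in added:
                f.write(entry + "\n")
    return added

with tempfile.TemporaryDirectory() as tmp:
    local = os.path.join(tmp, "local_user_lexicon.txt")
    model = os.path.join(tmp, "model_user_lexicon.txt")
    with open(local, "w", encoding="utf-8") as f:
        f.write("kaldi k aa l d iy\nfoo f uw\n")  # illustrative entries only
    with open(model, "w", encoding="utf-8") as f:
        f.write("foo f uw")  # note: no trailing newline
    added = merge_user_lexicon(local, model)
    with open(model, encoding="utf-8") as f:
        merged_text = f.read()

print(added)
print(merged_text)
```

Entries already present in the target are skipped, so running the merge repeatedly does not duplicate them.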

Artifacts

  • kaldi_model_daanzu*: A better acoustic model, and varying levels of language model for dictation (bigger is generally better).
  • kaldi_model_zamia: A compatible general English Kaldi nnet3 chain model.
  • kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-dragonfly-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

Donations are appreciated to encourage development.

[GitHub is matching (only) my GitHub Sponsors donations.]

Support is now included in dragonfly2 v0.20.0! You can try a self-contained distribution available below, of either stable or development versions.

Notes

  • Improved Recognition: better graph construction/compilation should give significantly better overall recognition.
  • Weights on Any Elements: you can now easily add weights to any element (including compound elements in MappingRules), in addition to any rule/grammar.
  • Pluggable Alternative Dictation: you can optionally pass a callable as alternative_dictation to define your own, external dictation engine.
  • Stand-alone Plain Dictation Interface: the library now provides a simple interface for recognizing plain dictation without fancy active grammar features.
  • NOTE: the default model directory is now kaldi_model.
  • Various bug fixes & optimizations

Artifacts

  • kaldi_model_daanzu: A better overall compatible general English Kaldi nnet3 chain model than below.
  • kaldi_model_zamia_daanzu_mediumlm: A compatible general English Kaldi nnet3 chain model, with a larger/better dictation language model than below.
  • kaldi_model_zamia: A compatible general English Kaldi nnet3 chain model.
  • kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-dragonfly-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

Donations are appreciated to encourage development.

[GitHub is currently matching all my donations $-for-$.]

kaldi-active-grammar - v1.0.0: Faster Loading, Python3, Grammar/Rule Weights, and more

Published by daanzu about 5 years ago

Support is now included in dragonfly2 v0.18.0! You can try a self-contained distribution available below, of either stable or development versions.

Notes

  • Direct Parsing: parse recognitions directly on the FST, removing the (slow) pyparsing dependency.
    • Caster example: Loading is now ~50% faster when cached, and the Kaldi backend accounts for only ~15% of loading time.
  • Python3: both Python 2 and 3 should be fully supported now.
    • Unicode: this should also fix Unicode issues in various places in both Python 2 and 3.
  • Grammar/Rule Weights: can specify weight, where grammars/rules with higher weight value are more likely to be recognized, compared to their peers, for an ambiguous recognition.
  • Generalized Alternative Dictation: the cloud dictation feature has been generalized to make it easier to add other alternatives in the future.
  • Various bug fixes & optimizations
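
The Grammar/Rule Weights note above says that a higher weight makes a rule more likely to win an ambiguous recognition. How the library applies weights internally is not stated here; one common way to picture it, assumed purely for illustration, is to subtract the log of the weight from a rule's cost so that larger weights lower the cost:

```python
import math

def pick_rule(candidates):
    """Pick the candidate with the lowest weight-adjusted cost.

    candidates is a list of (name, cost, weight) tuples. This is a
    hypothetical structure for the sketch, not a library type.
    """
    def adjusted_cost(candidate):
        _name, cost, weight = candidate
        return cost - math.log(weight)  # higher weight gives lower cost
    return min(candidates, key=adjusted_cost)[0]

# Two rules that match equally well; only the weight differs.
winner = pick_rule([("rule_a", 10.0, 1.0), ("rule_b", 10.0, 2.0)])
print(winner)
```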

Artifacts

  • kaldi_model_zamia: A compatible general English Kaldi nnet3 chain model.
  • kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-dragonfly-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

Donations are appreciated to encourage development.

[GitHub is currently matching all my donations $-for-$.]

kaldi-active-grammar - v0.7.1: Partial Decoding, Parallel Compilation, & Various Optimizations for 15-50% Speedup

Published by daanzu about 5 years ago

Support is now included in dragonfly2 v0.17.0! You can try a self-contained distribution available below, of either stable or development versions.

Notes

  • Partial Decoding: support for having separate Voice Activity Detection timeout values based on whether the current utterance is complex (dictation) or not.
  • Parallel Compilation: when compiling grammars/rules that are not cached, multiple can be compiled at once (up to your core count).
    • Example: loading Caster without cache is ~40% faster (in addition to optimizations below).
  • Various Optimizations: loading even while cached sped up 15%.
  • Refactored temporary/cache file handling
  • Various bug fixes
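
The Parallel Compilation note above describes compiling several uncached grammars/rules at once, bounded by the core count. The notes do not say whether threads or processes are used, so the following is only a sketch of the bounded-worker pattern using threads; compile_rule is a placeholder, not a library function:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def compile_rule(name):
    # Placeholder for real compilation work; returns a pretend output name.
    return name + ".fst"

def compile_all(names):
    """Compile all rules concurrently, using at most one worker per core."""
    workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(compile_rule, names))

compiled = compile_all(["rule_a", "rule_b", "rule_c"])
print(compiled)
```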

Artifacts

  • kaldi_model_zamia: A compatible general English Kaldi nnet3 chain model.
  • kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-dragonfly-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

Donations are appreciated to encourage development.

kaldi-active-grammar - v0.6.0: Big Fixes And Optimizations To Get Caster Running

Published by daanzu about 5 years ago

Artifacts

  • kaldi_model_zamia: A compatible general English Kaldi nnet3 chain model.
  • kaldi-dragonfly-winpython: A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-dragonfly-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

Donations are appreciated to encourage development.

kaldi-active-grammar - v0.5.0: User Lexicon! Compilation Optimizations! Better Model!

Published by daanzu over 5 years ago

Notes

  • User Lexicon: you can add new words/pronunciations to the model's lexicon to be recognized & used in grammars, and the pronunciations can be either specified explicitly or inferred automatically.
  • Compilation Optimizations: compilation while loading grammars uses the disk much less, and far fewer passes are made over the graphs, as separate modules have been customized & combined.
  • Better Model: 50% more training data.

Artifacts

  • kaldi_model_zamia: [new model version required!] A compatible general English Kaldi nnet3 chain model.
  • kaldi-dragonfly-winpython: A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-dragonfly-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

Donations are appreciated to encourage development.

kaldi-active-grammar - v0.4.0: Rule reloading (for ListRefs)

Published by daanzu over 5 years ago

Donations are appreciated to encourage development.

New model below!

kaldi-active-grammar - v0.3.0: Rule unloading, preliminary cloud dictation

Published by daanzu over 5 years ago

Donations are appreciated to encourage development.

kaldi-active-grammar - v0.2.2: Add Linux Support

Published by daanzu over 5 years ago

Donations are appreciated to encourage development.

kaldi-active-grammar - Initial release

Published by daanzu over 5 years ago