neural-lm

Focus on language model fusion for speech recognition

Stars: 6

Ecosystems: Python

Deprecated; will be reimplemented in JAX. Under development: may not work until the whole pipeline is done.

Note

When a language model is used, wide beam searches often yield incomplete transcripts. With narrow beams, the problem is less visible due to implicit hypothesis pruning.

To check: whether the same problem appears in CTC + LM fusion.
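The effect in the note can be illustrated with a toy shallow-fusion rescoring. This is only a sketch: the log-probabilities are made up for illustration, the names `fused_score` and `best` are hypothetical, and the per-token length bonus is one common mitigation assumed here, not something this repo implements.

```python
def fused_score(am_logp, lm_logp, n_tokens, lm_weight=0.5, length_bonus=0.0):
    """Shallow fusion: log p_AM + lm_weight * log p_LM + length_bonus * n_tokens."""
    return am_logp + lm_weight * lm_logp + length_bonus * n_tokens


# Toy hypotheses with invented log-probabilities (illustrative only).
# "full" is the complete transcript, "truncated" stops early.
hyps = {
    "full": {"am": -6.0, "lm": -12.0, "n": 6},
    "truncated": {"am": -7.0, "lm": -6.0, "n": 3},
}


def best(hyps, **kwargs):
    """Return the name of the hypothesis with the highest fused score."""
    return max(
        hyps,
        key=lambda k: fused_score(hyps[k]["am"], hyps[k]["lm"], hyps[k]["n"], **kwargs),
    )


print(best(hyps, lm_weight=0.0))                    # acoustic model only
print(best(hyps, lm_weight=0.5))                    # LM added, no length term
print(best(hyps, lm_weight=0.5, length_bonus=1.0))  # LM plus length bonus
```

Every extra token costs LM log-probability, so once the LM term is added the shorter hypothesis can overtake the complete one. A wide beam is more likely to keep such short hypotheses alive until the end, which is consistent with the note.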

TODO

adaptive softmax for large vocabularies (because the official PyTorch implementation doesn't work with TorchScript)
ONNX and TorchScript support
GRU
RNN with tied embeddings
GRU fusion in the WeNet runtime CTC prefix beam search
Transformer-XL with cache
Transformer-XL with cache for fusion
MWER training with LM fusion
etc.
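For the CTC prefix beam search fusion item above, the following is a minimal pure-Python sketch of the general technique, not the WeNet runtime code. The function names, the `lm_score` callback (assumed to return the LM log-probability of a token prefix), and the choice to apply the LM only when ranking prefixes are all assumptions made for illustration.

```python
import math
from collections import defaultdict

NEG_INF = -math.inf


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_prefix_beam_search(log_probs, beam_size=8, blank=0,
                           lm_score=None, lm_weight=0.3):
    """log_probs: T frames, each a list of V log-probabilities.

    Returns (best_prefix, fused_log_score).
    """
    def fused(item):
        prefix, (p_b, p_nb) = item
        score = logsumexp(p_b, p_nb)  # CTC log-probability of the prefix
        if lm_score is not None and score > NEG_INF:
            score += lm_weight * lm_score(prefix)  # shallow fusion term
        return score

    # prefix -> (log p of paths ending in blank, log p ending in non-blank)
    beam = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        next_beam = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beam.items():
            for tok, lp in enumerate(frame):
                if tok == blank:
                    # Blank keeps the prefix unchanged.
                    nb, nnb = next_beam[prefix]
                    next_beam[prefix] = (logsumexp(nb, p_b + lp, p_nb + lp), nnb)
                    continue
                new_prefix = prefix + (tok,)
                nb, nnb = next_beam[new_prefix]
                if prefix and prefix[-1] == tok:
                    # A repeated token extends the prefix only after a blank...
                    next_beam[new_prefix] = (nb, logsumexp(nnb, p_b + lp))
                    # ...otherwise it collapses onto the same prefix.
                    sb, snb = next_beam[prefix]
                    next_beam[prefix] = (sb, logsumexp(snb, p_nb + lp))
                else:
                    next_beam[new_prefix] = (nb, logsumexp(nnb, p_b + lp, p_nb + lp))
        # Prune by the fused score; acoustic probabilities stay untouched.
        ranked = sorted(next_beam.items(), key=fused, reverse=True)
        beam = dict(ranked[:beam_size])
    best = max(beam.items(), key=fused)
    return best[0], fused(best)


# Toy input: 2 frames over {0: blank, 1: "a", 2: "b"}, invented probabilities.
frames = [[math.log(p) for p in (0.2, 0.5, 0.3)]] * 2


def toy_lm(prefix):
    """Hypothetical unigram LM that strongly prefers token 2."""
    return sum(math.log({1: 0.1, 2: 0.9}[t]) for t in prefix)


print(ctc_prefix_beam_search(frames)[0])                                   # → (1,)
print(ctc_prefix_beam_search(frames, lm_score=toy_lm, lm_weight=1.0)[0])   # → (2,)
```

Keeping the LM term out of the stored prefix probabilities means the CTC recursion stays exact and the fusion weight can be changed without touching it. A real runtime would more likely score tokens incrementally with a cached LM state rather than rescoring the whole prefix each frame.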
