Triton ASR Client

A command-line client for the Triton ASR service.

Installation

To get started, clone the repository and install the required packages with pip:

git clone https://github.com/yuekaizhang/Triton-ASR-Client.git
cd Triton-ASR-Client
pip install -r requirements.txt

Usage

client.py [-h] [--server-addr SERVER_ADDR] [--server-port SERVER_PORT]
                 [--manifest-dir MANIFEST_DIR] [--audio-path AUDIO_PATH]
                 [--model-name {whisper,transducer,attention_rescoring,streaming_wenet,infer_pipeline}]
                 [--num-tasks NUM_TASKS] [--log-interval LOG_INTERVAL]
                 [--compute-cer] [--streaming] [--simulate-streaming]
                 [--chunk_size CHUNK_SIZE] [--context CONTEXT]
                 [--encoder_right_context ENCODER_RIGHT_CONTEXT]
                 [--subsampling SUBSAMPLING] [--stats_file STATS_FILE]
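
As an illustration of how the flags in the synopsis combine, here is a minimal Python sketch that assembles an argument list for client.py. The helper name build_client_command and the path ./test.wav are hypothetical and used only for this example; the flag names are taken from the synopsis. The helper also rejects passing both --audio-path and --manifest-dir, since the option descriptions state they cannot be combined.

```python
import shlex
import sys


def build_client_command(server_addr="localhost", server_port=8001,
                         model_name="transducer", audio_path=None,
                         manifest_dir=None, num_tasks=50, compute_cer=False):
    """Assemble an argument list for client.py (hypothetical helper)."""
    if audio_path is not None and manifest_dir is not None:
        raise ValueError("--audio-path and --manifest-dir cannot be combined")
    cmd = [sys.executable, "client.py",
           "--server-addr", server_addr,
           "--server-port", str(server_port),
           "--model-name", model_name,
           "--num-tasks", str(num_tasks)]
    # Pass exactly one input source; with neither, client.py uses its default.
    if audio_path is not None:
        cmd += ["--audio-path", audio_path]
    elif manifest_dir is not None:
        cmd += ["--manifest-dir", manifest_dir]
    if compute_cer:
        cmd.append("--compute-cer")
    return cmd


if __name__ == "__main__":
    # "./test.wav" is a placeholder path, not a file shipped with the project.
    cmd = build_client_command(audio_path="./test.wav")
    print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run once a Triton server is reachable at the given address and port.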

Optional Arguments

-h, --help: show this help message and exit
--server-addr SERVER_ADDR: Address of the server (default: localhost)
--server-port SERVER_PORT: gRPC port of the Triton server (default: 8001)
--manifest-dir MANIFEST_DIR: Path to the manifest directory, which contains the wav.scp and trans.txt files. (default: ./datasets/aishell1_test)
--audio-path AUDIO_PATH: Path to a single audio file. Cannot be specified together with --manifest-dir. (default: None)
--model-name {whisper,transducer,attention_rescoring,streaming_wenet,infer_pipeline}: Triton model_repo module name to request: whisper with TensorRT-LLM, transducer for k2, attention_rescoring for wenet offline, streaming_wenet for wenet streaming, infer_pipeline for paraformer large offline (default: transducer)
--num-tasks NUM_TASKS: Number of concurrent tasks for sending (default: 50)
--log-interval LOG_INTERVAL: Controls how frequently we print the log. (default: 5)
--compute-cer: Compute CER (character error rate), e.g., for Chinese. If not set, WER (word error rate) is computed, e.g., for English. (default: False)
--streaming: Enable streaming ASR. (default: False)
--simulate-streaming: Strictly simulate streaming ASR. Threads sleep to mimic real-time speech. (default: False)
--chunk_size CHUNK_SIZE: Chunk size for streaming ASR (default: 16)
--context CONTEXT: Subsampling context for wenet (default: -1)
--encoder_right_context ENCODER_RIGHT_CONTEXT: Encoder right context for k2 streaming (default: 2)
--subsampling SUBSAMPLING: Subsampling rate (default: 4)
--stats_file STATS_FILE: Output of stats analysis in human readable format (default: ./stats_summary.txt)
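
The difference between CER and WER selected by --compute-cer can be illustrated with a small Python sketch. This is not the client's own scoring code; the function names edit_distance and error_rate are hypothetical, and the sketch assumes the usual definition of both metrics: Levenshtein distance between reference and hypothesis divided by the reference length, counted in characters for CER and in whitespace-separated words for WER.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_text, hyp_text, compute_cer=False):
    """Return CER if compute_cer is True, otherwise WER."""
    if compute_cer:
        # Character tokens with spaces removed, as is common for Chinese.
        ref = list(ref_text.replace(" ", ""))
        hyp = list(hyp_text.replace(" ", ""))
    else:
        # Whitespace-separated word tokens, as is common for English.
        ref = ref_text.split()
        hyp = hyp_text.split()
    if not ref:
        raise ValueError("reference text is empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    print(error_rate("the cat sat", "the cat sit"))               # WER
    print(error_rate("今天天气好", "今天天汽好", compute_cer=True))  # CER
```

For languages written without spaces between words, word-level tokenization would treat a whole sentence as one token, which is why character-level scoring is the usual choice there.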

List of Supported Triton ASR Servers

Model Repo	Description	Source	HuggingFace Link
Whisper	Offline ASR TensorRT-LLM	OpenAI
Conformer Onnx	Offline ASR Onnx FP16	Wenet	yuekai/model_repo_conformer_aishell_wenet
Conformer TensorRT	Streaming ASR TensorRT FP16	Wenet
Conformer FasterTransformer	Offline ASR FasterTransformer FP16	Wenet
Conformer CUDA-TLG decoder	Offline ASR with CUDA Decoders	Wenet	speechai/model_repo_conformer_aishell_wenet_tlg
Offline Conformer Onnx	Offline ASR Onnx FP16	k2	wd929/k2_conformer_offline_onnx_model_repo
Offline Conformer TensorRT	Offline ASR TensorRT FP16	k2	wd929/k2_conformer_offline_trt_model_repo
Streaming Conformer Onnx	Streaming ASR Onnx FP16	k2
Zipformer Onnx	Offline ASR Onnx FP16 with Blank Skip	k2
Paraformer Onnx	Offline ASR FP32	FunASR
