pyannote-rs

Pyannote audio diarization in Rust

MIT License

Stars: 15

Ecosystems: Rust, Whisper

Features

Process 1 hour of audio in less than a minute on CPU.
Faster performance with DirectML on Windows and CoreML on macOS.
Accurate timestamps with Pyannote segmentation.
Identify speakers with wespeaker embeddings.

Install

cargo add pyannote-rs

Usage

Examples

pyannote-rs uses two models for speaker diarization:

Segmentation: segmentation-3.0 identifies when speech occurs.
Speaker Identification: wespeaker-voxceleb-resnet34-LM identifies who is speaking.

Inference is powered by onnxruntime.

The segmentation model processes up to 10 seconds of audio at a time, so longer recordings are handled with a sliding window that iterates over the audio in chunks.
The embedding model processes filter banks (audio features) extracted with knf-rs.
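
The chunked iteration over 10-second windows can be sketched as follows. This is a minimal illustration, not the crate's actual code: the function name `split_into_windows` is hypothetical, the 16 kHz sample rate in the usage note is an assumption, and zero-padding the final window is one possible way to give the model a fixed-size input.

```rust
/// Splits `samples` into consecutive, non-overlapping windows of
/// `window_secs` seconds. The last window is zero-padded so that every
/// window has the same length.
///
/// Hypothetical helper for illustration; not part of the pyannote-rs API.
fn split_into_windows(samples: &[i16], sample_rate: usize, window_secs: usize) -> Vec<Vec<i16>> {
    let window_size = sample_rate * window_secs;
    assert!(window_size > 0, "window size must be positive");

    samples
        .chunks(window_size)
        .map(|chunk| {
            let mut window = chunk.to_vec();
            // Pad a short final chunk with silence (zeros).
            window.resize(window_size, 0);
            window
        })
        .collect()
}
```

Assuming 16 kHz mono audio, `split_into_windows(&samples, 16_000, 10)` yields windows of 160,000 samples each, which could then be passed to the segmentation model one at a time.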

Speaker comparison (e.g., determining if Alice spoke again) is done using cosine similarity.
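
As a rough sketch of how such a comparison might work, the code below computes cosine similarity between two embedding vectors and picks the closest known speaker. The function names and the threshold parameter are hypothetical, chosen for illustration; they are not taken from the pyannote-rs API.

```rust
/// Cosine similarity between two embeddings: dot(a, b) / (|a| * |b|).
/// Returns 0.0 if either vector has zero length (norm).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must have the same dimension");

    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();

    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Returns the index of the most similar known speaker, or `None` if no
/// known speaker reaches `threshold` (hypothetical parameter).
fn best_match(embedding: &[f32], known: &[Vec<f32>], threshold: f32) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (index, speaker) in known.iter().enumerate() {
        let score = cosine_similarity(embedding, speaker);
        let is_better = best.map_or(true, |(_, best_score)| score > best_score);
        if score >= threshold && is_better {
            best = Some((index, score));
        }
    }
    best.map(|(index, _)| index)
}
```

When `best_match` returns `None`, the caller would typically treat the segment as a new speaker and add its embedding to the known list. The right threshold depends on the embedding model and the audio, so it is left as a parameter here.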

Credits

Big thanks to pyannote-onnx and kaldi-native-fbank.

Related Projects

rusty-whisper

Rust implementation of Whisper

whisper-ctranslate2

Whisper command line client compatible with original OpenAI client based on CTranslate2.

17 Mar 2023 872

whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

25 Jan 2023 3,362

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

09 Dec 2022 8,782

whisper-node

Node.js bindings for OpenAI's Whisper. (C++ CPU version by ggerganov)

18 Dec 2022 225

faster-whisper-rs

a rust crate for easily implementing faster-whisper stt into your rust programs.

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

16 Sep 2022 64,924

lora-svc

singing voice change based on whisper, and lora for singing voice clone

08 Sep 2022 618

whisper-onnx-python

A low-footprint GPU accelerated Speech to Text Python package for the Jetpack 5 era bolstered b...

wscribe

ez audio transcription tool with flexible processing and post-processing options

21 Jul 2023 125

transcribe

Python package for accurate audio transcription with speaker diarisation

Whisper-transcription_and_diarization-speaker-identification-

How to use OpenAIs Whisper to transcribe and diarize audio files

12 Oct 2022 285

WhisperLive

A nearly-live implementation of OpenAI's Whisper.

04 May 2023 1,194

whisper-playground

Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/

02 Oct 2022 776

ChineseTaiwaneseWhisper

This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (M...