Whisper-rs speech-to-text example

An example of how to use Whisper.cpp bindings for Rust to perform speech-to-text.

Prerequisites

Rust (the run command below uses the nightly toolchain via cargo +nightly)
Download and install your preferred Whisper models
Provide some sample audio files

Download and install Whisper models

To run this example, you first need to download and install the Whisper models.

Detailed instructions on the best way to do that can be found in the Whisper.cpp README.

Provide sample audio files

Sample audio files can be placed in the ./samples folder.

Audio files must be mono, 16-bit PCM WAV files sampled at 16 kHz.

You can use ffmpeg to convert your audio/video files to this format:

ffmpeg -i <source_file> -ar 16000 -ac 1 -c:a pcm_s16le <target_file>

Note: replace <source_file> and <target_file> with the appropriate paths.
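
If you want to check a file before transcribing it, the format produced by the ffmpeg command above (PCM, 1 channel, 16000 Hz, 16 bits per sample) can be verified by reading the WAV header. The following is a minimal sketch using only the standard library; it is not part of this example's code, and the names WavFormat, parse_wav_format and is_whisper_ready are hypothetical helpers introduced here for illustration.

```rust
/// Fields read from the "fmt " chunk of a WAV file (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct WavFormat {
    pub audio_format: u16, // 1 means integer PCM
    pub channels: u16,
    pub sample_rate: u32,
    pub bits_per_sample: u16,
}

/// Parse a RIFF/WAVE byte buffer and return the contents of its "fmt " chunk.
pub fn parse_wav_format(bytes: &[u8]) -> Result<WavFormat, String> {
    if bytes.len() < 12 || &bytes[0..4] != b"RIFF" || &bytes[8..12] != b"WAVE" {
        return Err("not a RIFF/WAVE file".to_string());
    }
    let mut pos: usize = 12;
    // Walk the chunk list until the "fmt " chunk is found.
    while pos <= bytes.len() - 8 {
        let id = &bytes[pos..pos + 4];
        let size = u32::from_le_bytes([
            bytes[pos + 4],
            bytes[pos + 5],
            bytes[pos + 6],
            bytes[pos + 7],
        ]) as usize;
        let body = pos + 8;
        if id == b"fmt " {
            if size < 16 || bytes.len() - body < 16 {
                return Err("truncated fmt chunk".to_string());
            }
            let u16_at = |i: usize| u16::from_le_bytes([bytes[i], bytes[i + 1]]);
            let u32_at = |i: usize| {
                u32::from_le_bytes([bytes[i], bytes[i + 1], bytes[i + 2], bytes[i + 3]])
            };
            return Ok(WavFormat {
                audio_format: u16_at(body),
                channels: u16_at(body + 2),
                sample_rate: u32_at(body + 4),
                bits_per_sample: u16_at(body + 14),
            });
        }
        // Skip this chunk; chunk bodies are padded to an even number of bytes.
        pos = body.saturating_add(size).saturating_add(size & 1);
    }
    Err("no fmt chunk found".to_string())
}

/// True when the file matches what the ffmpeg command above produces:
/// integer PCM, mono, 16000 Hz, 16 bits per sample.
pub fn is_whisper_ready(format: &WavFormat) -> bool {
    format.audio_format == 1
        && format.channels == 1
        && format.sample_rate == 16_000
        && format.bits_per_sample == 16
}
```

A caller could read the file with std::fs::read, pass the bytes to parse_wav_format, and suggest the ffmpeg conversion when is_whisper_ready returns false.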

If you are looking for some interesting audio examples, you can check out the following resources:

Build and run

cargo +nightly run --release -- <path_to_audio_file> [path_to_model]

Where:

path_to_audio_file - path to the mono 16-bit WAV audio file to transcribe, for example ./samples/whisper_demo_16k.wav.
path_to_model - (optional) path to the folder containing the model files, for example ./models/en_16k. If omitted, the example tries to load the first .bin file found in the ./models folder.
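
The fallback described for path_to_model can be sketched roughly as follows. This is an illustration using only the standard library, not the example's actual source: first_bin_file and resolve_model are hypothetical names, and sorting the candidates is an assumption made here so that "first" is deterministic (directory listing order otherwise depends on the platform).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Return the first `.bin` file in `dir`, or `None` if there is none.
/// Candidates are sorted by path (an assumption of this sketch) so the
/// result does not depend on the order the OS lists directory entries.
pub fn first_bin_file(dir: &Path) -> io::Result<Option<PathBuf>> {
    let mut candidates: Vec<PathBuf> = fs::read_dir(dir)?
        .filter_map(|entry| entry.ok())
        .map(|entry| entry.path())
        .filter(|path| {
            path.is_file() && path.extension().and_then(|ext| ext.to_str()) == Some("bin")
        })
        .collect();
    candidates.sort();
    Ok(candidates.into_iter().next())
}

/// Use the path given on the command line if present; otherwise fall back
/// to the first `.bin` file in `models_dir`.
pub fn resolve_model(arg: Option<&str>, models_dir: &Path) -> io::Result<Option<PathBuf>> {
    match arg {
        Some(path) => Ok(Some(PathBuf::from(path))),
        None => first_bin_file(models_dir),
    }
}
```

With this shape, the second command-line argument would be passed as arg and Path::new("./models") as models_dir; a None result means no model could be found and the program should exit with an error.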

Contributing

Everyone is welcome to contribute to this project. You can contribute by reporting bugs or suggesting improvements in an issue on GitHub.

License

Licensed under the MIT License. Luciano Mammino.
