What's Changed

Fix Golang example by @streamer45 in https://github.com/snakers4/silero-vad/pull/496
fix: rust example for v5 checkpoint by @rumbleFTW in https://github.com/snakers4/silero-vad/pull/497
VadIterator first chunk bug fix by @adamnsandle in https://github.com/snakers4/silero-vad/pull/505
Add java example for wav file & support V5 model by @yuguanqin in https://github.com/snakers4/silero-vad/pull/506
add csharp example by @nganju98 in https://github.com/snakers4/silero-vad/pull/507
downgrade onnxruntime dependency by @adamnsandle in https://github.com/snakers4/silero-vad/pull/521
Tuning code by @adamnsandle in https://github.com/snakers4/silero-vad/pull/526
add neg_threshold parameter explicitly by @adamnsandle in https://github.com/snakers4/silero-vad/pull/528
Adamnsandle by @adamnsandle in https://github.com/snakers4/silero-vad/pull/529
Fixed the pyaudio example that could not run by @gengyuchao in https://github.com/snakers4/silero-vad/pull/539
Update README.md by @adamnsandle in https://github.com/snakers4/silero-vad/pull/540
Update README.md by @adamnsandle in https://github.com/snakers4/silero-vad/pull/541
Update README.md by @snakers4 in https://github.com/snakers4/silero-vad/pull/542
Adamnsandle by @adamnsandle in https://github.com/snakers4/silero-vad/pull/543
Adamnsandle by @adamnsandle in https://github.com/snakers4/silero-vad/pull/549

New Contributors

@rumbleFTW made their first contribution in https://github.com/snakers4/silero-vad/pull/497
@yuguanqin made their first contribution in https://github.com/snakers4/silero-vad/pull/506
@nganju98 made their first contribution in https://github.com/snakers4/silero-vad/pull/507
@gengyuchao made their first contribution in https://github.com/snakers4/silero-vad/pull/539

Full Changelog: https://github.com/snakers4/silero-vad/compare/v5.1...v5.1.1

silero-vad - v5.1 Latest Release

Published by snakers4 4 months ago

Experimental PIP package release

Experimental pip-package release;
Community PRs to update the examples;

What's Changed

Adamnsandle by @adamnsandle in https://github.com/snakers4/silero-vad/pull/481
Update microphone_and_webRTC_integration.py by @eltociear in https://github.com/snakers4/silero-vad/pull/475
cpp example by @filtercodes in https://github.com/snakers4/silero-vad/pull/482
Update Golang example to support model v5 by @streamer45 in https://github.com/snakers4/silero-vad/pull/489
Create python-publish.yml by @adamnsandle in https://github.com/snakers4/silero-vad/pull/492
Adamnsandle by @adamnsandle in https://github.com/snakers4/silero-vad/pull/493

New Contributors

@eltociear made their first contribution in https://github.com/snakers4/silero-vad/pull/475
@filtercodes made their first contribution in https://github.com/snakers4/silero-vad/pull/482

Full Changelog: https://github.com/snakers4/silero-vad/compare/v5.0...v5.1

silero-vad - Finally, V5 is here, 3x faster, supporting 6000+ languages!

Published by snakers4 4 months ago

Performance and Model Size

3x faster inference for TorchScript, 10% faster inference for ONNX;
Now TorchScript is as fast as ONNX;
Model size is 2x larger, 2MB vs. 1MB;

Quality

The VAD supports more than 6,000 languages now;
Significantly more robust on noisy data;
Overall 5-7% quality increase on clean data;
Quality difference for 8 kHz and 16 kHz is negligible now;
Quality difference for different window sizes is negligible => window size was deprecated;
Added benchmarks on 9 unique datasets (2 private) and one holistic multi-domain dataset;

Changes and deprecations

ONNX opset 16;
window_size_samples is deprecated: the VAD now works only with a fixed-size window;
The VAD now works with 8 kHz and 16 kHz sample rates, using fixed windows of 256 and 512 samples respectively;
Internal logic changed slightly: some context (part of the previous chunk) is now passed along with the current chunk;
Sample rates that are a multiple of 16 kHz are still supported;
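
The fixed-window and context points above can be sketched in plain Python. This is only an illustration, not the library's implementation: `fixed_windows`, `with_context`, and the 64-sample context length are hypothetical names and values chosen for the example. Only the 256- and 512-sample window sizes come from the notes above.

```python
from typing import Iterable, Iterator, List, Sequence

# Window sizes per sample rate, as stated in the release notes.
WINDOW_SAMPLES = {8000: 256, 16000: 512}


def fixed_windows(samples: Sequence[float], sample_rate: int) -> Iterator[List[float]]:
    """Yield fixed-size windows, zero-padding the final partial window."""
    if sample_rate not in WINDOW_SAMPLES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")
    size = WINDOW_SAMPLES[sample_rate]
    for start in range(0, len(samples), size):
        window = list(samples[start:start + size])
        # Pad the last window with silence so every window has the same length.
        window.extend([0.0] * (size - len(window)))
        yield window


def with_context(windows: Iterable[List[float]], context_samples: int) -> Iterator[List[float]]:
    """Prepend the tail of the previous window to each window.

    context_samples is a hypothetical illustration value, not
    necessarily the context length the model uses internally.
    """
    previous_tail = [0.0] * context_samples
    for window in windows:
        yield previous_tail + window
        previous_tail = window[-context_samples:] if context_samples else []


audio = [1.0] * 1000                      # 1000 samples of dummy audio
frames = list(fixed_windows(audio, 16000))
framed_with_context = list(with_context(frames, context_samples=64))
print(len(frames), len(frames[0]), len(framed_with_context[0]))
```

With 1000 input samples at 16 kHz this produces two 512-sample windows, the second padded with 24 zeros. Each window grows by 64 samples once the context is prepended.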

silero-vad - New V4 VAD Released

Published by snakers4 almost 2 years ago

New V4 VAD Released

Improved quality
Improved performance
Both 8k and 16k sampling rates are now supported by the ONNX model
Batching is now supported by the ONNX model
Added audio_forward method for one-line processing of one or more audio files without postprocessing
Hotfix applied - wrong model was uploaded
Minor hotfix re. PyTorch version

silero-vad - New V3 ONNX VAD Released

Published by snakers4 almost 3 years ago

We were finally able to port a model to ONNX:

Compact model (~100k params);
Both PyTorch and ONNX models are not quantized;
Same quality model as the latest best PyTorch release;
Only 16 kHz is available for now (ONNX has some issues with if-statements and/or tracing vs. scripting, and fails with cryptic errors);
In our tests, ONNX is 2-3x faster than PyTorch on short audios (chunks); the gap narrows with larger batches or long audios;
Audio examples and non-core models moved out of the repo to save space;

silero-vad - New V3 Silero VAD is Already Here

Published by snakers4 almost 3 years ago

Main changes

One VAD to rule them all! New model includes the functionality of the previous ones with improved quality and speed!
Flexible sampling rate, 8000 Hz and 16000 Hz are supported;
Flexible chunk size, minimum chunk size is just 30 milliseconds!
100k parameters;
GPU and batching are supported;
Radically simplified examples;
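
To relate the 30-millisecond minimum chunk size to sample counts at the two supported rates, the arithmetic can be sketched as follows. The helper names `duration_to_samples` and `samples_to_ms` are hypothetical and exist only for this illustration.

```python
def duration_to_samples(duration_ms: float, sampling_rate: int) -> int:
    """Number of samples that cover duration_ms at sampling_rate (Hz)."""
    return int(round(duration_ms * sampling_rate / 1000))


def samples_to_ms(num_samples: int, sampling_rate: int) -> float:
    """Duration in milliseconds of num_samples at sampling_rate (Hz)."""
    return num_samples * 1000 / sampling_rate


print(duration_to_samples(30, 16000))  # 480
print(duration_to_samples(30, 8000))   # 240
print(samples_to_ms(1536, 16000))      # 96.0
```

A 30 ms chunk is therefore 480 samples at 16000 Hz and 240 samples at 8000 Hz, while a 1536-sample chunk at 16000 Hz spans 96 ms.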

Migration

Please see the new examples.

The new get_speech_timestamps is a simplified, unified replacement for the deprecated get_speech_ts and get_speech_ts_adaptive methods.

speech_timestamps = get_speech_timestamps(wav, model, sampling_rate=16000)
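
As a rough post-processing sketch, assume the result is a list of dicts with 'start' and 'end' keys measured in samples. That return format is an assumption and is not stated in these notes. Under that assumption the timestamps can be converted to seconds with a small helper. `timestamps_to_seconds` is a hypothetical name, not part of the library.

```python
from typing import Dict, List


def timestamps_to_seconds(
    speech_timestamps: List[Dict[str, int]],
    sampling_rate: int = 16000,
    ndigits: int = 2,
) -> List[Dict[str, float]]:
    """Convert sample-based start/end offsets to seconds."""
    return [
        {
            "start": round(ts["start"] / sampling_rate, ndigits),
            "end": round(ts["end"] / sampling_rate, ndigits),
        }
        for ts in speech_timestamps
    ]


# Hand-written stand-in for a detection result (assumed format).
example = [{"start": 16000, "end": 40000}]
print(timestamps_to_seconds(example))  # [{'start': 1.0, 'end': 2.5}]
```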

The new VADIterator class serves as an example for streaming tasks, replacing the deprecated VADiterator and VADiteratorAdaptive.

vad_iterator = VADIterator(model)
window_size_samples = 1536  # chunk size in samples

# Feed the audio to the iterator one chunk at a time
for i in range(0, len(wav), window_size_samples):
    speech_dict = vad_iterator(wav[i: i + window_size_samples], return_seconds=True)
    if speech_dict:
        print(speech_dict, end=' ')
vad_iterator.reset_states()  # reset model states after each audio