Silero VAD: pre-trained enterprise-grade Voice Activity Detector
MIT License
Published by snakers4 16 days ago
A tag to upload a new pip package.
Published by snakers4 16 days ago
Full Changelog: https://github.com/snakers4/silero-vad/compare/v5.1...v5.1.1
Full Changelog: https://github.com/snakers4/silero-vad/compare/v5.0...v5.1
Published by snakers4 4 months ago
window_size_samples is deprecated; the VAD now works only with a fixed-size window.
Published by snakers4 almost 2 years ago
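As a rough illustration of the fixed-size window noted for the v5 models, the sketch below splits audio into equal chunks and zero-pads the last one. The window sizes (512 samples at 16000 Hz, 256 at 8000 Hz) are an assumption taken from the v5 usage examples, not from this changelog, and fixed_windows is a hypothetical helper, not part of the package.

```python
from typing import Iterator, List

# Assumed samples-per-window for each sampling rate (from the v5 usage
# examples, not from this changelog); verify against your installed version.
ASSUMED_WINDOW_SIZES = {16000: 512, 8000: 256}


def fixed_windows(samples: List[float], sampling_rate: int) -> Iterator[List[float]]:
    """Yield consecutive fixed-size windows, zero-padding the final one."""
    if sampling_rate not in ASSUMED_WINDOW_SIZES:
        raise ValueError(f"unsupported sampling rate: {sampling_rate}")
    size = ASSUMED_WINDOW_SIZES[sampling_rate]
    for start in range(0, len(samples), size):
        window = samples[start:start + size]
        # Pad the trailing window with silence so every chunk has equal length.
        if len(window) < size:
            window = window + [0.0] * (size - len(window))
        yield window


if __name__ == "__main__":
    audio = [0.1] * 1000  # placeholder audio, not real speech
    for chunk in fixed_windows(audio, 16000):
        print(len(chunk))
```

Each yielded chunk could then be passed to the model or to a streaming iterator one at a time.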
Published by snakers4 almost 3 years ago
We were finally able to port a model to ONNX:
Published by snakers4 almost 3 years ago
8000 Hz and 16000 Hz are supported. Please see the new examples.
The new get_speech_timestamps is a simplified and unified version of the old, deprecated get_speech_ts and get_speech_ts_adaptive methods.
speech_timestamps = get_speech_timestamps(wav, model, sampling_rate=16000)
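A minimal sketch of post-processing the result, assuming get_speech_timestamps returns a list of dicts with 'start' and 'end' keys measured in samples (its default behaviour as far as we know; check your version). The helper names and the example values are hypothetical and only illustrate the shape of the data.

```python
from typing import Dict, List


def timestamps_to_seconds(
    timestamps: List[Dict[str, int]], sampling_rate: int = 16000
) -> List[Dict[str, float]]:
    """Convert sample-index boundaries to seconds."""
    return [
        {"start": ts["start"] / sampling_rate, "end": ts["end"] / sampling_rate}
        for ts in timestamps
    ]


def total_speech_seconds(
    timestamps: List[Dict[str, int]], sampling_rate: int = 16000
) -> float:
    """Sum the duration of all detected speech segments, in seconds."""
    return sum(ts["end"] - ts["start"] for ts in timestamps) / sampling_rate


if __name__ == "__main__":
    # Hypothetical values shaped like the assumed return format.
    example = [{"start": 16000, "end": 48000}, {"start": 64000, "end": 80000}]
    print(timestamps_to_seconds(example))
    print(total_speech_seconds(example))
```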
The new VADIterator class serves as an example for streaming tasks, replacing the old, deprecated VADiterator and VADiteratorAdaptive.
vad_iterator = VADIterator(model)
window_size_samples = 1536
for i in range(0, len(wav), window_size_samples):
    speech_dict = vad_iterator(wav[i: i + window_size_samples], return_seconds=True)
    if speech_dict:
        print(speech_dict, end=' ')
vad_iterator.reset_states()  # reset model states after each audio
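The streaming loop above prints start and end events separately. A minimal sketch of pairing them into segments follows, assuming each call yields either nothing, a dict with a 'start' key, or a dict with an 'end' key. The pair_events helper and the sample events are hypothetical, not part of the package.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def pair_events(
    events: Iterable[Optional[Dict[str, float]]]
) -> List[Tuple[float, Optional[float]]]:
    """Combine streamed 'start'/'end' events into (start, end) segments."""
    segments: List[Tuple[float, Optional[float]]] = []
    open_start: Optional[float] = None
    for event in events:
        if not event:
            continue  # no boundary detected in this chunk
        if "start" in event:
            open_start = event["start"]
        elif "end" in event and open_start is not None:
            segments.append((open_start, event["end"]))
            open_start = None
    # Speech still in progress when the stream stopped has no end yet.
    if open_start is not None:
        segments.append((open_start, None))
    return segments


if __name__ == "__main__":
    # Hypothetical per-chunk results, in seconds.
    stream = [None, {"start": 0.5}, None, {"end": 1.8}, {"start": 2.4}]
    print(pair_events(stream))
```

A segment whose end is None means the audio stopped while speech was still ongoing; the caller decides whether to close it at the end of the stream.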
Published by snakers4 almost 3 years ago
This is a technical tag so that users who do not want to use newer models can simply check out this tag.