Whisper is an automatic speech recognition (ASR) model developed by OpenAI. It is an encoder-decoder transformer trained on roughly 680,000 hours of multilingual, weakly supervised audio, with a decoder that generates the transcript autoregressively. Whisper can be used for multilingual speech transcription, speech translation into English, and spoken language identification. It is robust to accents, background noise, and technical vocabulary, and OpenAI has released the model code and weights as open source to encourage research on speech processing. Because the model is publicly available, it serves as the foundation for a wide range of speech-based tools and applications.
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
This project provides an API with user-level access support to transcribe speech to text using a fine-tuned and processed Whisper ASR model
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data
Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker Diarization (Dockerfile, CI image build and test)
Open-source subtitling platform 💻 for transcribing and translating video and audio in Indic languages