Whisper is an automatic speech recognition (ASR) model developed by OpenAI. It uses an encoder-decoder transformer architecture with an autoregressive text decoder, trained on 680,000 hours of multilingual and multitask supervised audio data, and can transcribe speech in many languages as well as translate it into English. Whisper has shown strong performance across a range of speech benchmarks, and OpenAI has released the model weights and code as open source to encourage research in robust speech processing. It has enabled a wide ecosystem of transcription, subtitling, and voice-driven applications, including the projects below.
A real-time speech-to-text transcriber using the Whisper model, designed for efficiency and ease of use in the console
This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages
Open source subtitling platform 💻 for transcribing and translating video/audio files in Indic languages
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces
Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker Diarization (Dockerfile, CI image build and test)
This project provides an API with user-level access support to transcribe speech to text using a fine-tuned and processed Whisper ASR model
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data
A quick & dirty script to generate and view subtitles and transcriptions for your multimedia files using ggerganov/whisper
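Several of the projects above generate caption files from Whisper output. As a minimal sketch of how that works, the snippet below converts transcription segments (dicts with `start`, `end`, and `text` keys, mirroring the segment format produced by `transcribe()` in openai/whisper) into SRT subtitle text. The helper names are illustrative and not taken from any of the listed repositories.

```python
# Sketch: render Whisper-style transcription segments as SRT subtitles.
# Assumes segments shaped like {"start": float, "end": float, "text": str},
# as emitted by openai/whisper's transcribe().

def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments) -> str:
    """Render a list of segments as numbered SRT subtitle blocks."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(seg['start'])} --> "
            f"{srt_timestamp(seg['end'])}\n{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)

if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Hello there."},
        {"start": 2.5, "end": 5.0, "text": " This is a subtitle demo."},
    ]
    print(segments_to_srt(demo))
```

The same segment list can be reformatted for other caption formats (e.g. WebVTT uses `.` instead of `,` in timestamps); only the timestamp and block layout change.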