Whisper is an automatic speech recognition (ASR) model developed by OpenAI. It uses an encoder-decoder transformer architecture and was trained on roughly 680,000 hours of multilingual, multitask audio data collected from the web. Whisper can be used for tasks such as multilingual speech transcription, speech translation into English, and spoken language identification. OpenAI has released the model weights and inference code as open source to encourage research on robust speech processing, and the model is widely used as the basis for transcription, captioning, and other speech-based applications.
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Swift native on-device speech recognition with Whisper for Apple Silicon
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Whisper command line client compatible with original OpenAI client based on CTranslate2
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc
React hook for OpenAI Whisper with speech recorder, real-time transcription, and silence removal built-in
Replace OpenAI GPT with another LLM in your app by changing a single line of code
Local web app for transcription and translation services for audio and video using Whisper models
Solution for the ТехШторм (TechShtorm) competition run by Tatneft, on analyzing team members' activity during video conference calls
This repository contains an experimental demo application that shows how you can add client-side auto-generated captions to Amazon IVS Real-time and Low-latency streams using transformers
System/service with REST API for extracting text transcriptions from movies and audio recordings in most popular video formats
A real-time, instant dictation desktop application built on Electron that uses Whisper and GROQ under the hood
A bot that downloads, transcribes and analyzes calls to find insights for sales advisors
A web UI application built on stream-translator-gpt