Speech-To-Text-Prompter

Turn your voice into a prompt!

Prerequisites

FFmpeg must be installed to run Whisper.

Please install FFmpeg compatible with your OS from the following link.

FFmpeg : https://ffmpeg.org/download.html

After installing FFmpeg, make sure to add the FFmpeg/bin folder to your system PATH!
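To confirm that the PATH step worked, a quick check like the one below may help. This is only a sketch: the helper names `ffmpeg_available` and `ffmpeg_version_line` are made up for illustration and are not part of this extension.

```python
import shutil
import subprocess


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable can be found on the system PATH."""
    return shutil.which("ffmpeg") is not None


def ffmpeg_version_line():
    """Return the first line of `ffmpeg -version`, or None if FFmpeg is not found."""
    path = shutil.which("ffmpeg")
    if path is None:
        return None
    # Run the located executable and capture its output as text.
    result = subprocess.run([path, "-version"], capture_output=True, text=True, check=False)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    if ffmpeg_available():
        print("FFmpeg found:", ffmpeg_version_line())
    else:
        print("FFmpeg not found. Add the FFmpeg/bin folder to your PATH and reopen the terminal.")
```

If the script reports that FFmpeg is missing right after installation, reopening the terminal (so it picks up the updated PATH) is usually worth trying first.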

Installation

Run `git clone https://github.com/jhj0517/stable-diffusion-webui-Speech-To-Text-Prompter.git` inside your stable-diffusion-webui extensions folder.

Alternatively, download the repository and unzip it into your extensions folder!

How to use

Select "Speech-To-Text Prompter" in the Script drop-down.
Record your voice, select the model you want, and choose the source language (usually "Automatic detection" works fine).
Run the transcription. This may take some time. Once it's done, you can move the transcribed prompts in the UI.
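For readers curious what the steps above roughly correspond to in code, here is a minimal sketch using the `openai-whisper` Python package directly. It is not the extension's actual implementation: the function names, the `AUTO_DETECT_LABEL` constant, and the default model are assumptions for illustration. The Whisper calls used (`whisper.load_model` and `model.transcribe` with `language` and `task` options) are part of the package's public API.

```python
# Assumed label, mirroring the "Automatic detection" choice in the UI.
AUTO_DETECT_LABEL = "Automatic detection"


def build_transcribe_options(language_choice, translate_to_english=False):
    """Map UI-style choices to keyword options for Whisper's transcribe().

    language=None lets Whisper detect the spoken language itself.
    """
    language = None if language_choice == AUTO_DETECT_LABEL else language_choice
    task = "translate" if translate_to_english else "transcribe"
    return {"language": language, "task": task}


def transcribe_recording(audio_path, model_name="base",
                         language_choice=AUTO_DETECT_LABEL,
                         translate_to_english=False):
    """Transcribe an audio file and return the recognized text.

    Requires `pip install openai-whisper` and FFmpeg on the PATH.
    """
    import whisper  # imported lazily so the helper above works without it

    model = whisper.load_model(model_name)  # downloads the weights on first use
    options = build_transcribe_options(language_choice, translate_to_english)
    result = model.transcribe(audio_path, **options)
    return result["text"].strip()
```

A call such as `transcribe_recording("recording.wav", model_name="small", language_choice="ko")` would load the `small` model and treat the audio as Korean; the file name here is hypothetical.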

Available models

The extension uses OpenAI's Whisper models.

Size	Parameters	English-only model	Multilingual model	Required VRAM	Relative speed
tiny	39 M	`tiny.en`	`tiny`	~1 GB	~32x
base	74 M	`base.en`	`base`	~1 GB	~16x
small	244 M	`small.en`	`small`	~2 GB	~6x
medium	769 M	`medium.en`	`medium`	~5 GB	~2x
large	1550 M	N/A	`large`	~10 GB	1x

The `.en` models support English only. The cool thing is that you can use the "Translate to English" option with the "large" model!
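If you are unsure which size to pick, the table can be turned into a small lookup. The sketch below is illustrative only: `pick_model` is a hypothetical helper, and the VRAM figures are the approximate values from the table, so treat the result as a starting point and not a guarantee.

```python
# (size, approximate required VRAM in GB, has an English-only ".en" variant)
# Values copied from the table above, ordered from smallest to largest.
MODELS = [
    ("tiny", 1, True),
    ("base", 1, True),
    ("small", 2, True),
    ("medium", 5, True),
    ("large", 10, False),
]


def pick_model(available_vram_gb, english_only=False):
    """Return the largest model name that fits the given VRAM, or None if none fits."""
    best = None
    for size, vram_gb, has_en in MODELS:
        if vram_gb > available_vram_gb:
            continue  # too large for this GPU
        if english_only and not has_en:
            continue  # "large" has no English-only variant
        best = f"{size}.en" if english_only else size
    return best


print(pick_model(6))                      # medium
print(pick_model(12, english_only=True))  # medium.en
```

Larger models are slower (see the relative speed column), so a smaller model than the one returned here may still be the better choice for quick prompts.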
