Subtitle Generator

This repository provides a Python script to generate subtitle files (.srt) for a given video file. The script extracts audio from the video, transcribes the audio using OpenAI's Whisper model, and generates an .srt file with the transcription and timestamps.

SRT Files

SRT (.srt) is one of the most common closed caption file formats. SRT stands for "SubRip Subtitle".

An SRT file includes:

The number of the closed caption frame in sequence
Beginning and end timecodes for when the closed caption frame should appear
The closed caption itself
A blank line to indicate the start of a new closed caption sequence
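
To make the layout above concrete, here is a minimal sketch of how a single caption entry could be formatted. The function names `format_timestamp` and `format_cue` are illustrative and are not taken from `generate_subtitles.py`; the timecode form `HH:MM:SS,mmm` and the `-->` separator follow the usual SRT convention.

```python
def format_timestamp(seconds: float) -> str:
    """Convert a time in seconds to the SRT timecode form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def format_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT entry: sequence number, timecodes, caption, blank line."""
    timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
    return f"{index}\n{timing}\n{text.strip()}\n\n"


# Print a single example entry.
print(format_cue(1, 0.0, 2.5, "Hello, world."), end="")
```

Each entry ends with a blank line, so entries can simply be concatenated to form a complete file.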

Features

Extracts audio from video files using ffmpeg.
Transcribes audio to text with timestamps using OpenAI's Whisper model.
Generates .srt subtitle files.

Requirements

Python 3.7+
ffmpeg
torch
openai-whisper
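
If you want to confirm that the requirements above are in place before running anything, a small check along these lines may help. It is only a sketch: the helper name `missing_requirements` is hypothetical, and it assumes that ffmpeg is expected on your PATH and that the `openai-whisper` package is imported under the module name `whisper`.

```python
import importlib.util
import shutil
import sys


def missing_requirements(min_python=(3, 7),
                         tools=("ffmpeg",),
                         modules=("torch", "whisper")):
    """Return a list of problems found; an empty list means nothing is missing."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ is required")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    for module in modules:
        # find_spec locates a top-level module without importing it.
        if importlib.util.find_spec(module) is None:
            problems.append(f"Python module {module} is not installed")
    return problems


for problem in missing_requirements():
    print(problem)
```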

Installation

Install ffmpeg:

Make sure ffmpeg is installed on your system. You can download it from ffmpeg.org and follow the installation instructions for your operating system.

Clone the repository:

git clone https://github.com/yourusername/subtitle-generator.git
cd subtitle-generator

Create a virtual environment and activate it:

python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Install the required Python packages:
```
pip install -r requirements.txt
```

Usage

Provide the path to the video file:

Have the path to your video file ready. The script processes this file to generate subtitles.

Run the script:

The file paths are set inside the script, so update the video (.mp4), audio (.wav), and subtitle (.srt) paths in `generate_subtitles.py` before running it:
```
python generate_subtitles.py
```

Output:

The script creates an .srt file with the same name as the video file, in the same directory.
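
One way to keep the three paths consistent is to derive the .wav and .srt paths from the video path instead of editing each by hand. The sketch below assumes this naming scheme; `derived_paths` is a hypothetical helper and `videos/talk.mp4` is a placeholder path, neither taken from the script itself.

```python
from pathlib import Path


def derived_paths(video_path):
    """Return (wav_path, srt_path) located next to the given video file."""
    video = Path(video_path)
    # with_suffix swaps only the final extension, so the directory
    # and the base name stay the same as the video's.
    return video.with_suffix(".wav"), video.with_suffix(".srt")


wav_path, srt_path = derived_paths("videos/talk.mp4")
print(wav_path)
print(srt_path)
```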

Script Overview

`generate_subtitles.py`

This is the main script that performs the following steps:

Extract audio from the video file:

Uses ffmpeg to extract audio from the provided video file and save it as a .wav file.

Transcribe the audio file:

Utilizes OpenAI's Whisper model to transcribe the audio into text with timestamps.

Create the .srt file:

Generates an .srt file with the transcribed text and timestamps.
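
The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `generate_subtitles.py`: the function names, the `base` model size, and the mono 16 kHz audio settings are choices made for the example. It relies on `whisper.load_model` and `model.transcribe` from the `openai-whisper` package, where each returned segment carries `start`, `end`, and `text`.

```python
import subprocess
from pathlib import Path


def ffmpeg_extract_command(video_path, wav_path):
    """Build the ffmpeg argument list that extracts audio to a WAV file."""
    return [
        "ffmpeg",
        "-y",              # overwrite the output file if it already exists
        "-i", str(video_path),
        "-vn",             # ignore the video stream
        "-ac", "1",        # mono audio (assumed setting)
        "-ar", "16000",    # 16 kHz sample rate (assumed setting)
        str(wav_path),
    ]


def to_timestamp(seconds):
    """Convert seconds to the SRT timecode form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Render segments (dicts with 'start', 'end', 'text') as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{to_timestamp(seg['start'])} --> {to_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{seg['text'].strip()}\n")
    # Joining with a newline leaves one blank line between entries.
    return "\n".join(blocks)


def generate_subtitles(video_path, model_name="base"):
    """Extract audio, transcribe it, and write an .srt next to the video."""
    import whisper  # imported here so the helpers above work without it

    video_path = Path(video_path)
    wav_path = video_path.with_suffix(".wav")
    srt_path = video_path.with_suffix(".srt")

    # Step 1: extract audio with ffmpeg; check=True raises on failure.
    subprocess.run(ffmpeg_extract_command(video_path, wav_path), check=True)

    # Step 2: transcribe; the result dict includes timestamped segments.
    model = whisper.load_model(model_name)
    result = model.transcribe(str(wav_path))

    # Step 3: write the segments out in SRT form.
    srt_path.write_text(segments_to_srt(result["segments"]), encoding="utf-8")
    return srt_path
```

With ffmpeg and `openai-whisper` installed, calling `generate_subtitles("talk.mp4")` (a placeholder file name) would be expected to write `talk.srt` beside the video.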

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.

Acknowledgements

OpenAI Whisper for the transcription model.
ffmpeg for audio extraction.
