This project transcribes audio using Whisper and provides an API.
APACHE-2.0 License
This project is a Flask-based application that uses the Whisper library to transcribe audio files and to add subtitles to videos. Its main components are the API endpoints, the Whisper model, the Transcriber class, and the video subtitling modules. Together these cover audio transcription, text extraction, and video subtitling.
Click here to see the tutorials for this project.
To install the project, follow these steps:
Clone the repository:
git clone https://github.com/ivanrj7j/Transcription.git
Navigate to the project directory:
cd Transcription
Install the required dependencies:
pip install -r requirements.txt
To use the project, follow these steps:
Start the Flask application:
python main.py
The application will start on port 5000 by default. You can access the API endpoints using a tool like Postman or curl.
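The same requests can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoints accept a POST with a multipart upload and that the form field is named "file"; neither detail is confirmed by this README, so check the route handlers before relying on it. The helper names build_multipart and post_audio are illustrative only.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "http://localhost:5000"  # default port stated in this README


def build_multipart(field, filename, data, content_type="application/octet-stream"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def post_audio(endpoint, path, field="file"):
    """POST an audio file to an endpoint and decode the JSON reply.

    The field name "file" is an assumption, not taken from the project.
    """
    with open(path, "rb") as fh:
        body, content_type = build_multipart(field, os.path.basename(path), fh.read())
    request = urllib.request.Request(
        BASE_URL + endpoint,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, a call such as post_audio("/transcribe", "sample.wav") would return the decoded JSON response, provided the assumptions above hold.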
The project provides three main API endpoints for audio transcription:
/transcribe: Transcribes an audio file and returns the transcription as a JSON response.
/text: Extracts text from an uploaded audio file and returns it as a JSON response.
/rawSegments: Extracts raw audio segments from an uploaded audio file and returns them as a JSON response.
Whisper is a speech-to-text model from OpenAI; the project uses it to transcribe audio files.
The Transcriber class encapsulates the functionality of the Whisper model, providing methods for transcription and text extraction.
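To illustrate the idea of such a wrapper, here is a rough sketch. It is not the project's actual Transcriber: the method names text and raw_segments are hypothetical. The only Whisper behavior it relies on is that a loaded model exposes transcribe(path) and returns a dictionary with "text" and "segments" keys. A FakeModel stands in for the real model so the sketch runs without the library installed.

```python
class Transcriber:
    """Hypothetical wrapper around a Whisper-style model (illustrative only)."""

    def __init__(self, model):
        # `model` is any object with a transcribe(path) method, for example
        # the object returned by whisper.load_model("base").
        self.model = model

    def transcribe(self, audio_path):
        """Return the full result dictionary produced by the model."""
        return self.model.transcribe(audio_path)

    def text(self, audio_path):
        """Return only the transcribed text, with surrounding whitespace removed."""
        return self.transcribe(audio_path)["text"].strip()

    def raw_segments(self, audio_path):
        """Return a (start, end, text) tuple for each timed segment."""
        result = self.transcribe(audio_path)
        return [(s["start"], s["end"], s["text"].strip()) for s in result["segments"]]


class FakeModel:
    """Stand-in for a loaded Whisper model; returns a fixed result."""

    def transcribe(self, audio_path):
        return {
            "text": " hello world",
            "segments": [{"start": 0.0, "end": 1.5, "text": " hello world"}],
        }


transcriber = Transcriber(FakeModel())
```

Passing the model in through the constructor keeps model loading, which is slow, separate from the per-request calls.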
The project includes video subtitling capabilities with two new modules:
The VideoTranscriber class in video.py handles the process of adding subtitles to videos.
The subtitle.py module contains two main classes:
SubtitleConfig: Manages the configuration for subtitle appearance, including font, size, color, and positioning.
Subtitle: Represents individual subtitle segments and handles the rendering of subtitles on video frames.
These classes work together to provide a flexible and customizable video subtitling system.
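The relationship between the two classes can be sketched roughly as follows. The field names, default values, and the helper subtitle_for_frame are all assumptions made for illustration; the real classes in subtitle.py may be structured differently. Rendering itself is left out, and the sketch only shows how a segment could be matched to a video frame by timestamp.

```python
from dataclasses import dataclass, field


@dataclass
class SubtitleConfig:
    """Appearance settings (hypothetical fields and defaults)."""
    font: str = "sans-serif"
    size: int = 32
    color: tuple = (255, 255, 255)  # RGB white
    position: str = "bottom"


@dataclass
class Subtitle:
    """One timed subtitle segment; times are in seconds."""
    start: float
    end: float
    text: str
    config: SubtitleConfig = field(default_factory=SubtitleConfig)

    def is_active(self, timestamp):
        """True when the timestamp falls in the half-open range [start, end)."""
        return self.start <= timestamp < self.end


def subtitle_for_frame(subtitles, frame_index, fps):
    """Return the subtitle to draw on a frame, or None if none applies."""
    timestamp = frame_index / fps
    for subtitle in subtitles:
        if subtitle.is_active(timestamp):
            return subtitle
    return None
```

Using a half-open interval means a frame that falls exactly on a boundary between two segments belongs to the later one, so two subtitles are never selected for the same frame.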
The utils module contains helper functions used throughout the project, including audio file handling and temporary file management.
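As an illustration of temporary file management, the sketch below writes uploaded bytes to a temporary file and removes it afterwards. The function name temporary_audio_file and the default ".wav" suffix are hypothetical and not taken from the project's utils module.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_audio_file(data, suffix=".wav"):
    """Write bytes to a temporary file, yield its path, then delete it."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Wrap the OS-level descriptor so it is closed once writing finishes.
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        yield path
    finally:
        # Clean up even if the caller raised while using the file.
        if os.path.exists(path):
            os.remove(path)
```

A handler could wrap its transcription call in "with temporary_audio_file(upload_bytes) as path:" so the file is removed whether or not transcription succeeds.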
The main module initializes the Flask application, registers the API endpoints, and sets up the Whisper model and Transcriber class.
Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for more information.