Transcription and Video Subtitling API

This project is a Flask application that uses the Whisper library to transcribe audio files and to add subtitles to videos. Its main components are the API endpoints, the Whisper model, the Transcriber class, and the video subtitling modules.

Introduction

The API accepts an uploaded audio file and returns the transcription, the extracted text, or the raw segments as a JSON response. Separate modules take subtitle data, render it onto video frames, and save the subtitled video to a file.

Tutorial

Click here to see the tutorials for this project.

Installation

To install the project, follow these steps:

Clone the repository:
```
git clone https://github.com/ivanrj7j/Transcription.git
```
Navigate to the project directory:
```
cd Transcription
```
Install the required dependencies:
```
pip install -r requirements.txt
```

Usage

To use the project, follow these steps:

Start the Flask application:
```
python main.py
```
The application will start on port 5000 by default. You can access the API endpoints using a tool like Postman or curl.
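
As a rough sketch of what a client request might look like, the snippet below builds a multipart upload using only the Python standard library. The form field name `file`, the use of POST, and the helper names are assumptions that this README does not confirm; check the route definitions in main.py for the names the server actually expects.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "http://localhost:5000"  # default port mentioned above


def build_multipart(field_name, filename, data, content_type="application/octet-stream"):
    """Encode one file as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def post_audio(endpoint, audio_path, field_name="file"):
    """Upload an audio file to an endpoint and decode the JSON reply.

    `field_name` is a hypothetical form field; the server may expect another.
    """
    with open(audio_path, "rb") as handle:
        body, header = build_multipart(
            field_name, os.path.basename(audio_path), handle.read()
        )
    request = urllib.request.Request(
        BASE_URL + endpoint,
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (needs the Flask application running locally):
# print(post_audio("/transcribe", "sample.wav"))
```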

API Endpoints

The project provides three main API endpoints for audio transcription:

/transcribe: Transcribes an audio file and returns the transcription as a JSON response.
/text: Extracts text from an uploaded audio file and returns it as a JSON response.
/rawSegments: Extracts raw audio segments from an uploaded audio file and returns them as a JSON response.
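
To illustrate how the three endpoints could relate to one another, the sketch below derives each response from a single Whisper-style result dictionary. The `text`, `language`, and `segments` keys follow the dictionary that Whisper's `transcribe` returns. The payload functions, the `EXTRACTORS` table, and the response shapes are hypothetical, and reading "raw segments" as Whisper's timed transcript segments is an interpretation rather than something this README states.

```python
import json

# A trimmed example of the dictionary returned by Whisper's transcribe().
SAMPLE_RESULT = {
    "text": " Hello world. Goodbye.",
    "language": "en",
    "segments": [
        {"id": 0, "start": 0.0, "end": 1.5, "text": " Hello world."},
        {"id": 1, "start": 1.5, "end": 2.4, "text": " Goodbye."},
    ],
}


def transcribe_payload(result):
    """Assumed /transcribe response: the full result from the model."""
    return dict(result)


def text_payload(result):
    """Assumed /text response: only the joined transcript."""
    return {"text": result["text"].strip()}


def raw_segments_payload(result):
    """Assumed /rawSegments response: the timed segments, unmodified."""
    return {"segments": list(result["segments"])}


# Hypothetical mapping from endpoint path to the function that shapes its reply.
EXTRACTORS = {
    "/transcribe": transcribe_payload,
    "/text": text_payload,
    "/rawSegments": raw_segments_payload,
}


def respond(endpoint, result):
    """Serialize the payload for an endpoint as a JSON string."""
    return json.dumps(EXTRACTORS[endpoint](result))
```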

Whisper Model

Whisper is OpenAI's open-source speech recognition model. This project uses it to convert the speech in audio files to text.

Transcriber Class

The Transcriber class encapsulates the functionality of the Whisper model, providing methods for transcription and text extraction.
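
A minimal sketch of such a wrapper follows, assuming the class holds a loaded model and exposes one method per kind of output. The method names `transcribe`, `text`, and `raw_segments` and the `FakeModel` stand-in are hypothetical. With the real library the model would come from `whisper.load_model(...)`, whose `transcribe(path)` method returns a dictionary containing `text` and `segments`.

```python
class Transcriber:
    """Thin wrapper around a Whisper-like model (hypothetical interface)."""

    def __init__(self, model):
        # `model` is any object with a transcribe(path) -> dict method,
        # for example the object returned by whisper.load_model("base").
        self.model = model

    def transcribe(self, audio_path):
        """Return the model's full result dictionary."""
        return self.model.transcribe(audio_path)

    def text(self, audio_path):
        """Return only the transcript text, stripped of outer whitespace."""
        return self.transcribe(audio_path)["text"].strip()

    def raw_segments(self, audio_path):
        """Return the list of timed segments from the result."""
        return self.transcribe(audio_path)["segments"]


class FakeModel:
    """Stand-in so the sketch runs without Whisper installed."""

    def transcribe(self, audio_path):
        return {
            "text": " hello there",
            "segments": [{"start": 0.0, "end": 1.0, "text": " hello there"}],
        }
```

Passing the model into the constructor keeps the wrapper easy to test, since a stub can replace the real model.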

Video Subtitling

The project includes video subtitling capabilities with two new modules:

VideoTranscriber (video.py)

The VideoTranscriber class in video.py handles the process of adding subtitles to videos. Key features include:

Initializing with a video file, subtitle configuration, and raw subtitle data
Converting timestamps to frame indices
Interpreting SRT-like subtitle data
Applying subtitles to video frames
Saving the subtitled video to a file
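
Two of the steps above, converting timestamps to frame indices and interpreting SRT-like data, can be sketched with the standard library alone. The function names and the `HH:MM:SS,mmm --> HH:MM:SS,mmm` cue format (standard SRT) are assumptions; this README does not specify the exact format that video.py accepts.

```python
import re

TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def timestamp_to_ms(stamp):
    """Convert 'HH:MM:SS,mmm' to integer milliseconds."""
    match = TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def ms_to_frame(milliseconds, fps):
    """Nearest frame index for a time offset at the given frame rate."""
    return round(milliseconds * fps / 1000)


def parse_srt(raw, fps):
    """Turn SRT-like text into (start_frame, end_frame, text) tuples."""
    cues = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        # Locate the timing line; an index line may or may not precede it.
        timing_at = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing_at is None:
            continue  # skip blocks without a timing line
        start, end = (part.strip() for part in lines[timing_at].split("-->"))
        text = " ".join(lines[timing_at + 1:])
        cues.append(
            (ms_to_frame(timestamp_to_ms(start), fps),
             ms_to_frame(timestamp_to_ms(end), fps),
             text)
        )
    return cues
```

Working in integer milliseconds until the final division keeps floating-point error out of the timestamp arithmetic.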

Subtitle and SubtitleConfig (subtitle.py)

The subtitle.py module contains two main classes:

SubtitleConfig: Manages the configuration for subtitle appearance, including font, size, color, and positioning.
Subtitle: Represents individual subtitle segments and handles the rendering of subtitles on video frames.

These classes work together to provide a flexible and customizable video subtitling system.
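
How the two classes might fit together is sketched below with plain dataclasses. All field names, defaults, and the `anchor` and `is_visible` helpers are hypothetical. Drawing text onto a frame is left out because this README does not say which imaging library subtitle.py uses.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SubtitleConfig:
    """Appearance settings shared by many subtitles (hypothetical fields)."""

    font: str = "sans-serif"
    size: int = 32
    color: tuple = (255, 255, 255)  # RGB
    # Relative anchor position: (0.5, 0.9) means centered, near the bottom.
    position: tuple = (0.5, 0.9)

    def anchor(self, frame_width, frame_height):
        """Pixel coordinates of the text anchor for a frame of this size."""
        return (
            round(self.position[0] * frame_width),
            round(self.position[1] * frame_height),
        )


@dataclass
class Subtitle:
    """One subtitle segment, timed in frame indices."""

    text: str
    start_frame: int
    end_frame: int
    config: SubtitleConfig = field(default_factory=SubtitleConfig)

    def is_visible(self, frame_index):
        """True while the frame falls in [start_frame, end_frame)."""
        return self.start_frame <= frame_index < self.end_frame
```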

Utils Module

The utils module contains helper functions used throughout the project, including audio file handling and temporary file management.
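
Temporary file management of this kind is often written as a context manager. The sketch below shows one way to do it; the name `temporary_audio_file` and its parameters are hypothetical and not taken from the utils module.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_audio_file(data, suffix=".wav"):
    """Write uploaded bytes to a temp file, yield its path, then delete it."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        yield path
    finally:
        # Clean up even if the caller raised while using the file.
        if os.path.exists(path):
            os.remove(path)
```

Whisper's `transcribe` accepts a file path, so writing an upload to disk first and removing it afterwards is a common pattern.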

Main Module

The main module initializes the Flask application, registers the API endpoints, and sets up the Whisper model and Transcriber class.

Contributing

Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for more information.
