A low-footprint, GPU-accelerated speech-to-text Python package for the JetPack 5 era, backed by an optimized ONNX graph
MIT License
Coming soon
Right now, getting started is as simple as a pip install, either from the repository root or directly from the upstream repo:
pip install .
# or
pip install git+https://github.com/rhysdg/whisper-onnx-python.git
For JetPack 5 support with Python 3.11, run the installation script first to grab a pre-built onnxruntime-gpu wheel for aarch64 and a few extra dependencies:
sh jetson_install.sh
pip install .
Currently, usage closely follows the official package, but with a trt switch (currently being debugged, so trt=False is recommended for now). transcribe expects either an audio file path or a numpy array:
import numpy as np
import whisper

args = {
    "language": "English",
    "name": "small.en",
    "precision": "fp32",
    "disable_cupy": False,
}

temperature = tuple(np.arange(0, 1.0 + 1e-6, 0.2))

model = whisper.load_model(trt=False, **args)
result = model.transcribe(
    'data/test.wav',
    temperature=temperature,
    **args
)
You can also find an example voice transcription assistant at examples/example_assistant.py
python examples/example_assistant.py
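Since transcribe also accepts a numpy array, you can decode audio yourself and skip the file path. A minimal sketch using only the standard-library wave module (load_wav is a hypothetical helper, not part of this package; it assumes 16-bit PCM mono input, the format Whisper models expect at 16 kHz):

```python
import wave
import numpy as np

def load_wav(path):
    """Read a 16-bit PCM mono WAV file into a float32 array in [-1, 1]."""
    with wave.open(path, "rb") as wf:
        assert wf.getsampwidth() == 2, "expects 16-bit PCM"
        assert wf.getnchannels() == 1, "expects mono audio"
        frames = wf.readframes(wf.getnframes())
    # int16 samples scaled to the [-1, 1] float range Whisper expects
    return np.frombuffer(frames, dtype=np.int16).astype(np.float32) / 32768.0
```

The resulting array can then be passed in place of the path, e.g. model.transcribe(load_wav('data/test.wav'), temperature=temperature, **args).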
Ubuntu 22.04 - RTX 3080, 8-core, Python 3.11 - passing
AGX Xavier, JetPack 5.1.3, Python 3.11 - passing
CI/CD will be expanded as we go - all general instantiation tests pass so far.