🎙️🤖 Create, Customize and Talk to your AI Character/Companion in Realtime (All in One Codebase!). Have a natural seamless conversation with AI everywhere (mobile, web and terminal) using LLM OpenAI GPT-3.5/4, Anthropic Claude 2, Chroma Vector DB, Whisper Speech2Text, ElevenLabs Text2Speech 🎙️🤖
MIT License
Try our site at RealChar.ai
Not sure how to pronounce RealChar? Listen to this 👇 audio
https://github.com/Shaunwei/RealChar/assets/5101573/6b35a80e-5503-4850-973d-254039bd383c
https://github.com/Shaunwei/RealChar/assets/5101573/5de0b023-6cf3-4947-84cb-596f429d109e
https://github.com/Shaunwei/RealChar/assets/5101573/62a1f3d1-1166-4254-9119-97647be52c42
Demo settings: Web, GPT-4, ElevenLabs with voice clone, Chroma, Google Speech-to-Text
Create a new .env file:
cp .env.example .env
Paste your API keys in the .env file. A single ReByte or OpenAI API key is enough to get started.
You can also configure other API keys if you have them.
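For reference, a minimal .env might contain just one key. The variable names below are assumptions based on common conventions; check .env.example for the exact names used by your version:

```
# Minimal configuration sketch - one of these is enough to start.
# Key names are assumed to match .env.example.
OPENAI_API_KEY=sk-...
# or
REBYTE_API_KEY=your-rebyte-key
```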
Start the app with docker-compose.yaml
docker compose up
If you have issues with docker (especially on a non-Linux machine), please refer to https://docs.docker.com/get-docker/ (installation) and https://docs.docker.com/desktop/troubleshoot/overview/ (troubleshooting).
Open http://localhost:3000 and enjoy the app!
Step 1. Clone the repo
git clone https://github.com/Shaunwei/RealChar.git && cd RealChar
Step 2. Install requirements
Install portaudio and ffmpeg for audio
# for mac
brew install portaudio
brew install ffmpeg
# for ubuntu
sudo apt update
sudo apt install portaudio19-dev
sudo apt install ffmpeg
Note: ffmpeg>=4.4 is needed to work with torchaudio>=2.1.0.
Mac users may need to add the ffmpeg library path to DYLD_LIBRARY_PATH for torchaudio to work:
export DYLD_LIBRARY_PATH=/opt/homebrew/lib:$DYLD_LIBRARY_PATH
Then install all Python requirements:
pip install -r requirements.txt
If you need faster local speech-to-text, install whisperX:
pip install git+https://github.com/m-bain/whisperx.git
Step 3. Create an empty sqlite database if you have not done so before
sqlite3 test.db "VACUUM;"
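If the sqlite3 command-line tool isn't available on your machine, the same empty database file can be created with Python's standard library (test.db is the filename used by the step above):

```python
# Create an empty SQLite database file, equivalent to `sqlite3 test.db "VACUUM;"`.
import sqlite3

conn = sqlite3.connect("test.db")  # creates the file if it doesn't exist
conn.execute("VACUUM")             # forces the database file to be written to disk
conn.close()
```

Either approach produces a valid empty database that the alembic migration in the next step can populate.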
Step 4. Run db upgrade
alembic upgrade head
This ensures your database schema is up to date. Please run this every time you pull the main branch.
Step 5. Set up .env:
cp .env.example .env
Update API keys and configs following the instructions in the .env file.
Note that some features require a working login system. You can get your own OAuth2 login for free with Firebase if needed. To enable, set USE_AUTH to true and fill in the FIREBASE_CONFIG_PATH field. Also fill in the Firebase configs in client/next-web/.env.
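Putting the auth settings together, the relevant .env lines look roughly like this (the config path is an example value, not the real one - use wherever you saved your Firebase config):

```
USE_AUTH=true
FIREBASE_CONFIG_PATH=/path/to/your/firebase-config.json
```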
Step 6. Run the backend server with cli.py, or use uvicorn directly:
python cli.py run-uvicorn
# or
uvicorn realtime_ai_character.main:app
Step 7. Run frontend client:
web client:
Create an .env file under client/next-web/:
cp client/next-web/.env.example client/next-web/.env
Adjust .env according to the instructions in client/next-web/README.md.
Start the frontend server:
python cli.py next-web-dev
# or
cd client/next-web && npm run dev
# or
cd client/next-web && npm run build && npm run start
After running these commands, a local development server will start, and your default web browser will open a new tab/window pointing to this server (usually http://localhost:3000).
(Optional) Terminal client:
Run the following command in your terminal
python client/cli.py
(Optional) mobile client:
Open client/mobile/ios/rac/rac.xcodeproj/project.pbxproj in Xcode and run the app.
Step 8. Select a character to talk to, then start talking. Use GPT-4 for better conversations, and wear headphones for the best audio (to avoid echo).
Note: if you want to connect remotely to a RealChar server, an SSL setup is required to establish the audio connection.
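For local testing, a self-signed certificate can be generated with openssl (browsers and clients will warn on self-signed certificates, and a CA-issued certificate is recommended for real deployments; the file names here are just examples):

```shell
# Generate a self-signed certificate valid for 365 days (local testing only).
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout key.pem -out cert.pem -days 365 \
  -subj "/CN=localhost"
```

The resulting cert.pem/key.pem can then be passed to your server process (for example, uvicorn accepts --ssl-certfile and --ssl-keyfile flags).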
To get your ReByte API key, follow these steps:
To get your OpenAI API token, follow these steps:
(Optional) To use Azure OpenAI API instead, refer to the following section:
In your .env file, set:
OPENAI_API_TYPE=azure
If you want to use the earlier version 2023-03-15-preview:
OPENAI_API_VERSION=2023-03-15-preview
OPENAI_API_BASE=https://your-base-url.openai.azure.com
OPENAI_API_MODEL_DEPLOYMENT_NAME=gpt-35-turbo-16k
OPENAI_API_EMBEDDING_DEPLOYMENT_NAME=text-embedding-ada-002
To get your Anthropic API token, follow these steps:
To get your Anyscale API token, follow these steps:
We support faster-whisper and whisperX as local speech-to-text engines. They work with CPU and NVIDIA GPUs.
To get your Google Cloud API credentials.json, follow these steps:
Save the credentials file as google_credentials.json in the root folder of this project. Check Create and delete service account keys.
Set SPEECH_TO_TEXT_USE to GOOGLE in your .env file.
(OpenAI Whisper API) Same as the OpenAI API token.
Edge TTS is the default and is free to use.
Creating an ElevenLabs Account
Visit ElevenLabs to create an account. You'll need this to access the text to speech and voice cloning features.
In your Profile Setting, you can get an API Key.
To get your Google Cloud API credentials.json, follow these steps:
Save the credentials file as google_credentials.json in the root folder of this project. Check Create and delete service account keys.
see realtime_ai_character/character_catalog/README.md
see docs/rebyte_agent_clone_instructions.md
To use Twilio with RealChar, you need to set up a Twilio account. Then, fill in the following environment variables in your .env file:
TWILIO_ACCOUNT_SID=YOUR_TWILIO_ACCOUNT_SID
TWILIO_ACCESS_TOKEN=YOUR_TWILIO_ACCESS_TOKEN
DEFAULT_CALLOUT_NUMBER=YOUR_PHONE_NUMBER
You'll also need to install torch and torchaudio to use Twilio.
Now, you can receive phone calls from your characters by typing /call YOURNUMBER in the text box when chatting with your character.
Note: only US phone numbers and ElevenLabs-voiced characters are supported at the moment.
You can now use Anyscale Endpoint to serve Llama-2 models in your RealChar easily! Simply register an account with Anyscale Endpoint. Once you get the API key, set this environment variable in your .env file:
ANYSCALE_ENDPOINT_API_KEY=<your API Key>
By default, we show the largest servable Llama-2 model (70B) in the Web UI. You can change the model name (meta-llama/Llama-2-70b-chat-hf) to other models, e.g. the 13B or 7B versions.
If you have access to LangSmith, you can edit these environment variables to enable:
LANGCHAIN_TRACING_V2=true # set to true to enable tracing (default off)
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
LANGCHAIN_API_KEY=YOUR_LANGCHAIN_API_KEY
LANGCHAIN_PROJECT=YOUR_LANGCHAIN_PROJECT
And it should work out of the box.
* These features are powered by the ReByte platform.
Please check out our Contribution Guide!