Create a cloud server. If installing on Digital Ocean, make sure to enable the agent with advanced metrics.
For 4 users, 8 CPUs and 16 GB of RAM are recommended. After creating the machine, add its IP address to the appropriate DNS record.
Prep the packages:
sudo apt update
sudo apt upgrade -y
sudo apt install -y build-essential # needed for streamp3 package
sudo apt install -y libmp3lame-dev # needed for elevenlabs
sudo apt install -y ffmpeg # for processing elevenlabs input
Install Anaconda:
wget https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Linux-x86_64.sh
bash Anaconda3-2024.06-1-Linux-x86_64.sh -b
$HOME/anaconda3/bin/conda init
source ~/.bashrc
rm Anaconda3-2024.06-1-Linux-x86_64.sh
Clone the repo:
git clone https://github.com/kylemcdonald/voice-in-my-head.git
cd voice-in-my-head
Create the environment:
conda create -y -n vimh python=3.9
conda activate vimh
conda install -y -c conda-forge libstdcxx-ng # needed for daily-python
conda install -y pytorch torchvision torchaudio cpuonly -c pytorch
pip install -r requirements.txt
Set up nginx:
# first, edit .nginx to represent the desired subdomain
sudo apt install -y nginx
sudo ufw allow 'Nginx Full'
sudo cp .nginx /etc/nginx/sites-available/vimh.iyoiyo.studio
sudo ln -s /etc/nginx/sites-available/vimh.iyoiyo.studio /etc/nginx/sites-enabled/
Set up certbot:
sudo snap install --classic certbot
sudo ln -s /snap/bin/certbot /usr/bin/certbot
sudo certbot --nginx
Install nvm, Node, and Tailwind:
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.5/install.sh | bash
source ~/.bashrc
nvm install 16
npm install -D tailwindcss
npx tailwindcss init
npm run buildcss
Fill out the .env file with the appropriate keys.
ELEVENLABS_API_KEY=...
OPENAI_API_KEY=...
DAILY_API_KEY=...
DEEPGRAM_API_KEY=...
TURN_TIME_SECONDS=50
TOTAL_TIME_MINUTES=25
ROOM_EXPIRE_MINUTES=35
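Before starting the service, it can help to confirm that every key listed above is present and non-empty. The following is a minimal sketch, not code from this repo: `parse_env`, `missing_keys`, and `REQUIRED_KEYS` are hypothetical names, and the parser only handles plain `KEY=value` lines (no quoting or `export` prefixes). How the server itself loads the file is not shown here.

```python
# Hypothetical helper: names below are illustrative, not part of the repo.
REQUIRED_KEYS = [
    "ELEVENLABS_API_KEY",
    "OPENAI_API_KEY",
    "DAILY_API_KEY",
    "DEEPGRAM_API_KEY",
    "TURN_TIME_SECONDS",
    "TOTAL_TIME_MINUTES",
    "ROOM_EXPIRE_MINUTES",
]


def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in line: {raw!r}")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values):
    """Return the required keys that are absent or empty, in listed order."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Example with placeholder values; only some keys are filled in.
example = """\
ELEVENLABS_API_KEY=abc
TURN_TIME_SECONDS=50
TOTAL_TIME_MINUTES=25
ROOM_EXPIRE_MINUTES=35
"""
values = parse_env(example)
print(missing_keys(values))
```

To check a real file, pass its contents instead of the example string, for instance `parse_env(open(".env").read())`.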
Install the service:
bash install-service.sh
phone/background.png picture
Run the server with Flask autoreloading:
flask --app server.py --debug run
Run the server with gunicorn:
gunicorn -w 4 server:app
Shortcut for running gunicorn:
./run.sh
The duration of the experience is controlled in the .env file.
Sounds should match the audio stream:
Note that they may be slightly glitched by the compression and streaming algorithms.
They should also always fade out quickly; otherwise they can sometimes leave a lingering noise.
helpers/prepare-sound.sh will help prepare sounds for this format.
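To illustrate what "fade out quickly" means in practice, here is a minimal sketch that applies a short linear fade to the end of a list of samples. It does not reproduce helpers/prepare-sound.sh, whose contents are not shown here; the function name `fade_out` and the 50 ms default are assumptions chosen for the example.

```python
def fade_out(samples, sample_rate, fade_ms=50):
    """Return a copy of samples with a linear fade-out over the last fade_ms.

    Hypothetical helper: the fade length and linear shape are assumptions.
    """
    # Number of samples covered by the fade, capped at the clip length.
    n = min(len(samples), sample_rate * fade_ms // 1000)
    out = list(samples)
    start = len(out) - n
    for i in range(n):
        # Gain falls linearly and reaches exactly 0.0 on the final sample.
        gain = (n - 1 - i) / n
        out[start + i] = out[start + i] * gain
    return out


# Eight samples at a 1000 Hz sample rate with a 4 ms fade: the last 4 are scaled.
faded = fade_out([1.0] * 8, sample_rate=1000, fade_ms=4)
print(faded)
```

Ending on a gain of exactly zero is the point of the sketch: a clip that stops at a non-zero level is one plausible source of the lingering noise described above.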
Each row of the script has a function, an input, and an output.
The function is the name of a function inside the VoiceInMyHead class. The input is the input to that function, and the output is where the output is saved.
When you save output to a variable, you can reference that variable as an input in later rows.
If a variable is referenced in a speak line, it should be surrounded by {curly braces}. (This is because the speak lines get preprocessed and combined.)
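The row format described above can be sketched as a small interpreter. This is an illustration under stated assumptions, not the repo's implementation: `ToyVoice` is a hypothetical stand-in for the VoiceInMyHead class, its methods `listen` and `summarize` are invented for the example, and the preprocessing that combines speak lines is not modeled.

```python
class ToyVoice:
    """Hypothetical stand-in for the real class; methods are illustrative."""

    def __init__(self):
        self.spoken = []

    def listen(self, prompt):
        # A real implementation would capture and transcribe speech.
        return "Ada"

    def summarize(self, text):
        # Placeholder transformation so the output differs from the input.
        return text.upper()

    def speak(self, text):
        self.spoken.append(text)


def run_script(voice, rows):
    """Run (function, input, output) rows, storing outputs as variables."""
    variables = {}
    for function, input_value, output in rows:
        if function == "speak":
            # speak lines reference variables as {curly braces}.
            argument = input_value.format(**variables)
        else:
            # Other rows reference a saved variable by its bare name.
            argument = variables.get(input_value, input_value)
        # Look up the method on the class by the name given in the row.
        result = getattr(voice, function)(argument)
        if output:
            variables[output] = result
    return variables


rows = [
    ("listen", "What is your name?", "name"),
    ("summarize", "name", "loud_name"),
    ("speak", "Hello {name}, or should I say {loud_name}?", None),
]
voice = ToyVoice()
variables = run_script(voice, rows)
print(voice.spoken)
```

The second row shows a saved variable used as a plain input, and the third shows the same variables referenced with curly braces inside a speak line.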