LexChat is a chatbot that guides you through discussions and analyzes perspectives from conversation transcripts of the Lex Fridman Podcast. It surfaces the moments where a topic is discussed and lets you watch the episode from that timestamp. The chatbot aims to give balanced responses by considering all sides of an argument. The project is built in Python and integrates Streamlit for the web interface, Weaviate for vector search, and OpenAI for natural language processing.
I usually find myself diving deep (a true DFS) into the internet, trying to research my questions. But the noise is louder than the signal, drowning out the nuggets of wisdom that I seek. Finding the highly influential individuals or profound books on a particular subject feels like hunting for a golden needle in a haystack.
This chatbot searches through thousands of hours of talks with brilliant minds. Not just an idea, but the opposing ideas discussed in other episodes. Not just transcripts, but the exact timestamps of when an idea was discussed, for convenient listening or watching. I can tap into the collective wisdom of the internet with ease.
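To illustrate how a timestamp can point you to the exact moment in an episode, here is a minimal sketch of turning an `HH:MM:SS` transcript timestamp into a YouTube link that starts playback at that point. The function name is hypothetical and not part of the project's code; only the standard YouTube `t=` query parameter is assumed.

```python
def youtube_link_at(video_id: str, timestamp: str) -> str:
    """Build a YouTube URL that starts playback at a HH:MM:SS or MM:SS timestamp.

    Hypothetical helper for illustration; the project may do this differently.
    """
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)  # fold H, M, S into total seconds
    return f"https://www.youtube.com/watch?v={video_id}&t={seconds}s"

# Example: a topic discussed at 1:23:45 into an episode
link = youtube_link_at("dQw4w9WgXcQ", "01:23:45")
```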
You can start using the chatbot right away by visiting lexchat.streamlit.app and asking your questions.
You can also run the chatbot locally by following the setup and execution instructions below.
git clone https://github.com/qniksefat/lexitalk.git
cd lexitalk
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
From this point, you can run the chatbot through the web interface using the following command, which launches it on your localhost:
streamlit run streamlit_app.py
You can also run the chatbot through the command line interface using the following command:
python cli_app.py
The project relies on a variety of external libraries and APIs to implement its features:
Please note that the episodes are not fully up to date: transcripts cover up to episode #325, excluding episodes #84 and #100. The transcripts are available in the data/raw/all directory. They come from Lexicap, provided by Andrej Karpathy using Whisper.
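Whisper transcripts such as Lexicap's are commonly distributed as WebVTT files, where each cue has a timing line like `00:01:02.000 --> 00:01:05.500` followed by the spoken text. Assuming that layout, here is a minimal sketch of extracting (start time, text) pairs; the function name is my own illustration, not the project's actual parser.

```python
import re

# Matches WebVTT cue timing lines such as "00:01:02.000 --> 00:01:05.500"
CUE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\.\d{3} --> ")

def parse_vtt(text: str) -> list[tuple[str, str]]:
    """Return (start_timestamp, caption) pairs from a WebVTT transcript string.

    Hypothetical helper for illustration; keeps only the first text line
    of each cue, which is enough to locate a moment in an episode.
    """
    cues = []
    start = None
    for line in text.splitlines():
        m = CUE_RE.match(line)
        if m:
            start = m.group(1)          # remember the cue's start time
        elif start and line.strip():
            cues.append((start, line.strip()))
            start = None                # done with this cue
    return cues

sample = """WEBVTT

00:00:00.000 --> 00:00:04.000
The following is a conversation.

00:00:04.000 --> 00:00:08.000
Enjoy.
"""
```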
The vector store index is built in MongoDB Atlas with the following schema:
{
  "mappings": {
    "dynamic": true,
    "fields": [
      {
        "numDimensions": 1536,
        "path": "embedding",
        "similarity": "cosine",
        "type": "vector"
      },
      {
        "path": "metadata.views",
        "type": "filter"
      }
    ]
  }
}
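Against an index with this schema, a query runs as an aggregation pipeline with a `$vectorSearch` stage, filtering on the declared `metadata.views` field. Below is a minimal sketch of building such a pipeline; the index name `vector_index`, the function name, and the views threshold are illustrative assumptions, while the stage syntax follows MongoDB Atlas Vector Search.

```python
def build_vector_search_pipeline(query_embedding: list[float],
                                 min_views: int = 0,
                                 limit: int = 5) -> list[dict]:
    """Build a MongoDB Atlas $vectorSearch aggregation pipeline.

    "vector_index" is an assumed index name; the "metadata.views" filter
    uses the filter field declared in the schema above.
    """
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",         # assumed index name
                "path": "embedding",             # vector path from the schema
                "queryVector": query_embedding,  # 1536-dim embedding
                "numCandidates": limit * 20,     # candidates scanned before ranking
                "limit": limit,
                "filter": {"metadata.views": {"$gte": min_views}},
            }
        },
        {"$project": {"text": 1, "metadata": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]
```

The resulting list would be passed to a pymongo collection's `aggregate()` call.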
Contributions are welcome! Please feel free to submit a pull request or open an issue if you have any suggestions or ideas for improvement.