The Personal Voice Assistant is a sophisticated AI-driven tool designed to interact with users through natural language. Leveraging a state-of-the-art language model (LLM), this assistant provides a seamless and intuitive experience by understanding and executing functions.
MIT License
This is a personal voice assistant that can perform various tasks such as playing music from YouTube, fixing errors, and chatting with you like a normal chatbot. The assistant is built using Python and leverages several libraries and APIs to provide its functionalities.
Error Fixing Process:
When a user reports an error, the Personal Voice Assistant takes a screenshot of the current screen to capture the exact issue. This image is then processed using OpenCV to extract the text from the screenshot. The extracted text is sent to the LLaMA 3 language model, which analyzes the content and generates a relevant response or solution. The assistant then communicates the suggested fix or troubleshooting steps back to the user, ensuring a streamlined and effective resolution process.
Make sure you have Python installed on your system. You can download it from python.org.
Clone the repository:
git clone https://github.com/kiritoInd/Personal-Voice-Assistant.git
cd Personal-Voice-Assistant
Install the required packages:
pip install -r requirements.txt
Create a .env
file in the root directory of the project and add your Groq API key:
GROQ_API_KEY=your_groq_api_key
python main.py
You can add more functionalities to the assistant through the function calling list. Learn more about function calling at DataCamp's OpenAI Function Calling Tutorial.
You can use the same for meta LLama3
You can add more functionalities to the assistant through the function calling list. Learn more about function calling at DataCamp's OpenAI Function Calling Tutorial.
To add new functions, update the function_calling_template
in the code:
function_calling_template = """
<tools> {
"name": "Your Function",
"description": "Description of the function",
"parameters": {
"type": "object",
"properties": {},
"required": [],
},
} </tools>
"""
json
speech_recognition
pyttsx3
groq
Pillow
opencv-python-headless
pytesseract
datasets
torch
transformers
soundfile
sounddevice
requests
beautifulsoup4
keyboard
tkinter
This project is licensed under the MIT License - see the LICENSE file for details.