Llama_RAG_System

Overview

The Llama_RAG_System is a retrieval-augmented generation (RAG) system that answers user queries interactively with contextually relevant responses. Built on the LLaMA model and Ollama, it handles tasks such as answering general questions, summarizing content, and extracting information from uploaded PDF documents. It uses ChromaDB to store and retrieve document embeddings, and it includes web scraping to fetch up-to-date information from the internet.
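To make the retrieve-then-generate flow concrete, here is a minimal sketch of the retrieval and prompt-assembly steps. It is not the project's code: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for ChromaDB, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved context and the question into one prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# In a real deployment the chunks would come from ChromaDB, not a list.
chunks = [
    "ChromaDB stores document embeddings for retrieval.",
    "Gradio provides the web interface.",
    "Ollama runs the LLaMA model locally.",
]
question = "Which component stores embeddings?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

The resulting `prompt` is what would be passed to the language model. Swapping the toy pieces for a real embedding model and a vector store changes the quality of retrieval but not the overall shape of the flow.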

[Screenshot: Gradio app interface]

🚧 Please note: This project is currently in development. Your feedback and contributions are welcome!

Features

Local Model Execution with Ollama: Runs the LLaMA model locally through Ollama, which avoids network round trips to a hosted API and keeps data on the user's machine. Because processing stays local, users keep control of their information without sending it to external servers.
Web Scraping for Updated Answers: Scrapes web pages at query time so that answers can draw on current information rather than only on the model's training data.
PDF Document Processing: Upload PDF files for automatic text extraction and embedding.
Dynamic Query Handling: Automatically detects the type of user queries (general questions, summarization, chit-chat, etc.) and provides appropriate responses.
Gradio and Flask Interfaces: User-friendly web interfaces for interacting with the model and uploading documents.
Custom Embeddings: Utilizes ChromaDB to store and retrieve document embeddings efficiently.
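As an illustration of dynamic query handling, the sketch below routes a query to a handler type using simple keyword rules. This is an assumption about how such routing could work, not the project's implementation, which may well ask the model itself to classify the query. The labels, patterns, and the `classify_query` name are hypothetical.

```python
import re

# Hypothetical labels and keyword patterns, checked in order; first match wins.
RULES = [
    ("summarization", re.compile(r"\b(summari[sz]e|summary)\b", re.I)),
    ("chit_chat", re.compile(r"^\s*(hi|hello|hey|thanks|thank you)\b", re.I)),
]


def classify_query(query: str, has_documents: bool = False) -> str:
    """Return a coarse query type used to choose a response strategy."""
    for label, pattern in RULES:
        if pattern.search(query):
            return label
    # Fall back to a question; use uploaded documents if there are any.
    return "document_question" if has_documents else "general_question"
```

For example, `classify_query("Please summarize this report")` returns `"summarization"`, while a plain question with no uploaded PDF falls through to `"general_question"`.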

Why Use Ollama?

Ollama is an excellent option for running machine learning models locally for several reasons:

Privacy: Running the model on local infrastructure ensures that sensitive data remains within the user's environment, minimizing the risk of data breaches or leaks.
Performance: Local execution avoids network round trips to a remote API, which can shorten response times compared to cloud-based solutions, although generation speed still depends on the local hardware.
Customization: Users can adapt the model to specific needs, for example by setting a system prompt and generation parameters in an Ollama Modelfile, without depending on external service providers.
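For reference, a locally running Ollama server can be queried over its HTTP API using only the standard library. The sketch below assumes Ollama is listening on its default port (11434) and that a model tagged `llama3` has been pulled; the model tag and the helper names are assumptions, since this README does not say which tag the project uses.

```python
import json
import urllib.request

# Default address of a local Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming generate request; the model tag is an assumption."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running Ollama server):
# print(generate("Why is the sky blue?"))
```

Because the request goes to `localhost`, the prompt and the response never leave the machine, which is the privacy property described above.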

Folder Structure

The project is organized as follows:

project/
├── core/
│   ├── embedding.py             # Embedding-related functionality
│   ├── document_utils.py        # Functions to handle document loading and processing
│   ├── query.py                 # Query document functionality
│   ├── generate.py              # Response generation logic
│   └── web_scrape.py            # Web scraping functionality
│
├── scripts/
│   ├── run_flask.py             # Script to run Flask API
│   └── run_gradio.py            # Script to run Gradio interface
│
├── chromadb_setup.py            # ChromaDB setup and connection
│
└── README.md                    # Project documentation

Installation

To set up the Llama_RAG_System, follow these steps:

Clone the repository:

git clone https://github.com/NimaVahdat/Llama_RAG_System.git
cd Llama_RAG_System

Ensure that ChromaDB, Ollama, and any other required services are running.

Usage

Running the Flask API

To start the Flask API, run the following command:

python -m scripts.run_flask

Running the Gradio Interface

To launch the Gradio interface, execute:

python -m scripts.run_gradio

After running either script, you will be able to interact with the system via the provided web interface.

Contributing

Contributions are welcome! If you have suggestions for improvements or features, please fork the repository and submit a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

LLaMA for the underlying model architecture.
Ollama for local execution of machine learning models, enhancing privacy and performance.
Gradio for the interactive interface.
ChromaDB for efficient document storage and retrieval.

Contact

For any inquiries or support, please contact me.
