Research Paper Q&A Tool
A tool for searching and analyzing documents with language models. Upload PDFs, convert them to vectors, and ask questions about their content.
https://github.com/user-attachments/assets/1161b9f2-7f42-4cc5-b15e-da7f1f6401c3
Features
- PDF Upload and Vectorization: Upload PDFs and convert them into vectors using Pinecone.
- Advanced Querying: Leverage the Ollama model for intelligent document querying.
- User-Friendly Interface: Built with Streamlit for a seamless user experience.
Quick Start
Prerequisites
- Python 3.10
- Docker
- Pinecone API Key
- Ollama Model (pre-configured in the application)
Setup
1. Create a Pinecone Account and API Key
- Sign up for a Pinecone account at Pinecone.
- Create an index and generate your API key.
- Save your API key and index name, as you'll need them to run the application.
2. Configure the Models in the Code
- Open backend/core/embedding_service.py.
- Find the section where the models are defined:
# Example configuration
LLM_MODEL_NAME = "your_llm_model_name"
EMBEDDING_MODEL_NAME = "your_embedding_model_name"
- Replace "your_llm_model_name" and "your_embedding_model_name" with the actual names of the models you downloaded.
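As an optional variation, the model names could be read from environment variables instead of being edited in place. The sketch below is an assumption, not the project's actual code: the `load_model_names` helper and the environment-variable lookup are hypothetical, and only the two constant names come from the configuration above.

```python
import os

# Placeholder defaults copied from the example configuration above.
DEFAULT_LLM_MODEL = "your_llm_model_name"
DEFAULT_EMBEDDING_MODEL = "your_embedding_model_name"


def load_model_names(env=None):
    """Return (llm_model, embedding_model), failing early on placeholders.

    `env` is any mapping of settings; it defaults to os.environ.
    This helper is hypothetical and not part of the project.
    """
    env = os.environ if env is None else env
    llm = env.get("LLM_MODEL_NAME", DEFAULT_LLM_MODEL).strip()
    emb = env.get("EMBEDDING_MODEL_NAME", DEFAULT_EMBEDDING_MODEL).strip()
    for label, value in (("LLM_MODEL_NAME", llm), ("EMBEDDING_MODEL_NAME", emb)):
        # Reject empty values and the untouched "your_..." placeholders.
        if not value or value.startswith("your_"):
            raise ValueError(f"{label} is not set to a real model name")
    return llm, emb
```

Failing early like this surfaces a forgotten placeholder at startup rather than on the first query.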
4. Build and Run
4.1. With Docker
- Ensure Docker is installed on your system.
- From the project root, run:
docker-compose up --build
- Access the app at http://localhost:8501.
4.2. Without Docker
- Install dependencies:
pip install -r backend/requirements.txt
- Start the FastAPI server:
python backend/scripts/main.py
- In a new terminal, start the Streamlit frontend:
streamlit run frontend/streamlit_ui.py
- Open http://localhost:8501 to use the app.
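Whichever route you choose, a quick way to confirm that the frontend is up is to check whether anything is listening on port 8501. Here is a minimal sketch using only the standard library; the helper name `is_port_open` is hypothetical and not part of the project.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8501 is the Streamlit port used in the steps above.
    if is_port_open("localhost", 8501):
        print("Streamlit frontend is reachable on port 8501")
    else:
        print("Nothing is listening on port 8501 yet")
```

This only confirms that a process accepts connections on the port; it does not verify that the app itself loaded correctly.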
5. Enter Pinecone API Key and Index to Use
After starting the Streamlit frontend, enter your Pinecone API key and index name in the app.
6. Upload Your PDF Document
- Click the button to convert your PDF to vectors and store them in the vector database.
- You can now ask the LLM questions about your PDF.
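To give a rough idea of what "convert to vectors, then query" means, here is a toy sketch of the retrieval step. It is not the app's implementation: a bag-of-words counter stands in for the real embedding model, an in-memory list stands in for Pinecone, and every function name is hypothetical. The sample text is made up for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=50):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(question, store, k=1):
    """Return the k stored chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


if __name__ == "__main__":
    # Made-up sample text standing in for the extracted PDF content.
    document = (
        "The model is trained on ten thousand labeled images. "
        "Evaluation uses accuracy and F1 score on a held-out test set."
    )
    # "Upload": chunk the text and store each chunk with its vector.
    store = [(c, embed(c)) for c in chunk_text(document, max_words=9)]
    # "Query": retrieve the most relevant chunk for a question.
    print(top_k("Which metrics are used for evaluation?", store, k=1))
```

In the real application the retrieved chunks would be passed to the LLM as context for answering the question; this sketch stops at retrieval.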