A document question answering system that uses Llama and LangChain to give contextual, accurate answers. The system supports .txt documents, intelligent text splitting, and context-aware querying through an easy-to-use Streamlit interface.
This project implements a document question answering system built on Llama and LangChain. Users upload a document, ask questions about its content, and receive answers grounded in the most relevant parts of the text.
The system is designed to be efficient, scalable, and easily customizable, making it suitable for a wide range of applications, from personal knowledge management to enterprise-level document analysis.
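To make the text-splitting step concrete, here is a minimal sketch of fixed-size chunking with overlap in plain Python. It is a simplified stand-in, not the algorithm LangChain's splitter actually uses (which also tries to break on separators such as paragraphs and sentences). The function name `split_with_overlap` and the sample string are assumptions made for illustration.

```python
from __future__ import annotations


def split_with_overlap(text: str, chunk_size: int = 256, chunk_overlap: int = 0) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters. The defaults mirror
    the CHUNK_SIZE and CHUNK_OVERLAP defaults listed in this README.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Hypothetical sample input, small enough to inspect by eye.
sample = "abcdefghij"
chunks = split_with_overlap(sample, chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

A non-zero overlap repeats the tail of one chunk at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of producing more chunks.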
Documents are loaded with `TextLoader`.
`RecursiveCharacterTextSplitter` breaks documents into smaller, overlapping chunks.

Clone the repository:
git clone https://github.com/yourusername/document-qa-system.git
cd document-qa-system
Create a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
Install the required packages:
pip install -r requirements.txt
Download the Llama model:
Place the model file in the `models` directory
Update `MODEL_PATH` in the code to point to your model file
Start the Streamlit app:
streamlit run app.py
Open your web browser and navigate to the provided local URL (typically http://localhost:8501)
Upload a document using the file uploader in the sidebar
Enter your question in the text input field
Click the "Ask" button to generate an answer
View the answer and relevant context in the main area of the app
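The question-answering flow behind these steps can be sketched roughly as follows: score each chunk against the question, keep the top-scoring chunks, and place them in a prompt for the model. This sketch is an assumption-laden illustration, not the app's code. A bag-of-words cosine similarity stands in for the Llama embeddings, and the prompt wording, the helper names, and the sample chunks are all hypothetical.

```python
import math
import re
from collections import Counter

TOP_K = 1  # default named in this README

# Hypothetical prompt wording; the real template lives in app.py.
PROMPT = "Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def bag_of_words(text):
    """Lowercased word counts; a crude stand-in for an embedding vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(question, chunks, k=TOP_K):
    """Return the k chunks most similar to the question."""
    query = bag_of_words(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query, bag_of_words(chunk)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, chunks, k=TOP_K):
    """Fill the prompt with the retrieved context and the question."""
    context = "\n".join(top_k_chunks(question, chunks, k))
    return PROMPT.format(context=context, question=question)


# Hypothetical chunks, as if produced by the splitting step.
chunks = [
    "The warranty lasts two years from the purchase date.",
    "Shipping usually takes three to five business days.",
    "Returns are accepted within thirty days.",
]
question = "How long is the warranty?"
best = top_k_chunks(question, chunks)
prompt = build_prompt(question, chunks)
print(prompt)
```

With `TOP_K` set to 1 the model sees only the single best-matching chunk, so an answer that spans two chunks may be missed; raising the value trades a longer prompt for more context.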
Key configuration options can be found at the top of the `app.py` file:
`MODEL_PATH`: Path to the Llama model file
`CHUNK_SIZE`: Size of text chunks for splitting (default: 256)
`CHUNK_OVERLAP`: Overlap between chunks (default: 0)
`TOP_K`: Number of most relevant chunks to consider (default: 1)

Replace `LlamaCppEmbeddings` with other LangChain-compatible embedding models
Modify the `template` in the `PromptTemplate` to alter the system's response style

To reduce resource usage, lower `CHUNK_SIZE` or use a smaller language model
To improve answer quality, adjust the `CHUNK_SIZE` and `CHUNK_OVERLAP` values, or try a more advanced language model

We welcome contributions to improve the Document QA System! Here's how you can contribute:
Create a feature branch (`git checkout -b feature/AmazingFeature`)
Commit your changes (`git commit -m 'Add some AmazingFeature'`)
Push to the branch (`git push origin feature/AmazingFeature`)