eskwelabs_chatbot

A RAG chatbot that answers both Eskwelabs bootcamp-specific queries and general bootcamp-related questions.

Eskwelabs Chatbot

This project focuses on the end-to-end development of a Q&A Chatbot tailored to answer bootcamp-related queries, specifically for Eskwelabs. The methodology covers the key steps, including knowledge base embedding, Retrieval-Augmented Generation (RAG) chatbot development using LangChain, and deployment via Streamlit.

Disclaimer: This guide demonstrates how to create your own chatbot using Llama 3.1. However, Llama 3.1 struggles to make multiple tool calls in a single turn, a task that GPT-3.5 Turbo handles more reliably. For better results, GPT-3.5 Turbo is recommended over Llama 3.1. You can try out the chatbot powered by GPT-3.5 Turbo here.

Tech Stack

  • ChromaDB: Vector Store
  • LangChain: Chatbot Framework
  • Llama 3.1: Large Language Model
  • text-embedding-ada-002: Embedding Model
  • SemanticChunker: Chunking Strategy

Installation

  1. Clone the repository:
git clone https://github.com/alfonsokan/eskwelabs_chatbot.git
  2. Install the required libraries:
pip install -r requirements.txt
  3. Install an open-source LLM using Ollama. Refer to the Ollama documentation and select an LLM.

Then, run the following command in the command line:

ollama run llama3.1
  4. To launch the app, open a terminal in the repository root and run:
streamlit run app.py

Methodology

The chatbot was developed using the four-step approach outlined below.

1. Data Preparation

For this step, the documents are embedded and stored in the vector database embeddings_deployment_sentencetransformer, located in this repository.

If interested, the code for embedding the documents can be viewed here.
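
A minimal sketch of what that embedding step could look like, assuming plain-text source documents and the text-embedding-ada-002 model listed in the tech stack (file paths are illustrative; the repo's actual code, linked above, may differ):

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai import OpenAIEmbeddings

# Embedding model from the tech stack
embedding_function = OpenAIEmbeddings(model="text-embedding-ada-002")

# Load the raw knowledge base (illustrative path)
docs = TextLoader("knowledge_base/eskwelabs_info.txt").load()

# SemanticChunker splits where embedding similarity between adjacent
# sentences drops, instead of at fixed character counts
chunks = SemanticChunker(embedding_function).split_documents(docs)

# Persist the chunks to the local Chroma vector store
vectordb = Chroma.from_documents(
    documents=chunks,
    embedding=embedding_function,
    persist_directory="embeddings_deployment_sentencetransformer",
)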

2. Retriever Generation

  • Two retriever tools, the Eskwelabs Info Retriever and the General Bootcamp Info Retriever, are created from the embedded knowledge base (a sketch follows this list).
  • A third retriever is optionally used when a user submits their resume to the chatbot.
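
A sketch of how the create_db_retriever_tools helper used later in this guide might build the two knowledge-base tools with LangChain's create_retriever_tool (tool names and descriptions are assumptions, not the repo's actual code):

from langchain.tools.retriever import create_retriever_tool

def create_db_retriever_tools(vectordb):
    # Both tools query the same vector store; their descriptions tell
    # the agent which one to call for a given question
    retriever = vectordb.as_retriever(search_kwargs={"k": 4})

    eskwelabs_tool = create_retriever_tool(
        retriever,
        "eskwelabs_bootcamp_info_search",
        "Searches Eskwelabs-specific information such as curriculum, schedule, and fees.",
    )
    general_tool = create_retriever_tool(
        retriever,
        "bootcamp_vs_alternatives_search",
        "Searches general information comparing bootcamps with alternatives like degrees and self-study.",
    )
    return eskwelabs_tool, general_tool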

3. Tool-calling Agent Creation

Three parameters are needed to instantiate a tool-calling agent:

  • List of retriever tools
    resume = st.file_uploader("Upload File", type=['txt', 'docx', 'pdf'])

    # the two knowledge-base retriever tools are always included
    eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool = create_db_retriever_tools(vectordb)

    # if a resume is passed, include the resume retriever as a tool
    if resume is not None:
        with open(resume.name, "wb") as f:
            f.write(resume.getbuffer())
        resume_tool = resume_retriever_tool(resume.name)
        tools = [resume_tool, eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool]

    # if no resume is passed, do not include the resume retriever
    else:
        tools = [eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool]
  • LLM (Llama 3.1)
from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="llama3.1",
    temperature=0.1,
    num_predict=350,
    verbose=True,
)
  • Prompt passed to the chatbot
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompts import MessagesPlaceholder

prompt = ChatPromptTemplate(
    messages=[
        MessagesPlaceholder(variable_name='chat_history'),
        ('system', "You're a helpful assistant who provides concise, complete answers without getting cut off mid-statement. Stick strictly to the user's questions, avoiding any unnecessary details."),
        ('human', '{input}'),
        MessagesPlaceholder(variable_name='agent_scratchpad')
    ]
)

The tool-calling agent can then be instantiated:

from langchain.agents import create_tool_calling_agent, AgentExecutor

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)
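
The resume_retriever_tool helper used above is defined elsewhere in the repo. A hypothetical sketch of how such a helper could work, loading the uploaded file, embedding it into its own temporary store, and wrapping it as a retriever tool (loader choices and the tool name are assumptions):

from langchain.tools.retriever import create_retriever_tool
from langchain_community.document_loaders import PyPDFLoader, Docx2txtLoader, TextLoader
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

def resume_retriever_tool(file_path):
    # Pick a loader based on the uploaded file's extension
    if file_path.endswith(".pdf"):
        loader = PyPDFLoader(file_path)
    elif file_path.endswith(".docx"):
        loader = Docx2txtLoader(file_path)
    else:
        loader = TextLoader(file_path)

    # Embed the resume into its own throwaway vector store
    resume_db = Chroma.from_documents(
        loader.load(),
        embedding=OpenAIEmbeddings(model="text-embedding-ada-002"),
    )

    # Wrap the store as a tool the agent can call
    return create_retriever_tool(
        resume_db.as_retriever(),
        "resume_search",
        "Searches the user's uploaded resume for skills, education, and experience.",
    )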

4. Response Generation

  • Pass the tool-calling agent, the user input, and the chat history to generate a response.
def process_chat(agent_executor, user_input, chat_history):
    response = agent_executor.invoke(
        {'input': user_input,
         'chat_history': chat_history
         },
    )
    return response['output']
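
For example, a single turn with an empty chat history could look like this (the question is illustrative):

chat_history = []
answer = process_chat(agent_executor, "How long is the Eskwelabs bootcamp?", chat_history)
print(answer)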

Recommendations

  • Explore output quality using a ReAct agent instead of a tool-calling agent. Develop a ReAct prompt that lets the LLM generate reasoning traces before acting on a task.
  • Explore different chunking strategies and embedding models.
  • Connect the chatbot to a third-party database (Redis via Upstash) for long-term storage of chat history, as sketched below.
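
For that last item, LangChain's langchain_community package ships an UpstashRedisChatMessageHistory that could back such storage. A minimal sketch, assuming an Upstash Redis database whose REST URL and token are set as environment variables (the session ID is illustrative):

import os
from langchain_community.chat_message_histories import UpstashRedisChatMessageHistory

# One history object per user session; messages persist across app
# restarts, unlike Streamlit's st.session_state
history = UpstashRedisChatMessageHistory(
    url=os.environ["UPSTASH_REDIS_REST_URL"],
    token=os.environ["UPSTASH_REDIS_REST_TOKEN"],
    session_id="user-123",
)

history.add_user_message("What does the bootcamp cost?")
history.add_ai_message("The latest pricing is on the Eskwelabs website.")
print(history.messages)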

Appendix

Chatbot's Selective History Retrieval Mechanism

  • Each past turn's user query and chatbot response are stored in a temporary vector store.
  • For each new user query, only the most relevant parts of the vector store are retrieved and passed as chat history to the chatbot.
  • Importance: this reduces token consumption by retrieving only the relevant parts of the chat history, minimizing what is passed to the LLM.
  • Code snippet of the app with chat history implemented:
if "messages" not in st.session_state:
    st.session_state.messages = []


if "chat_history" not in st.session_state:
    st.session_state.chat_history = []

if "unique_id" not in st.session_state:
    st.session_state.unique_id = 0


if "chat_history_vector_store" not in st.session_state:
    st.session_state.chat_history_vector_store = None

if "fed_chat_history" not in st.session_state:
    st.session_state.fed_chat_history = []


# Display chat messages from history on app rerun
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# React to user input
if user_input := st.chat_input("Say something"):
    # Display user message in chat message container
    with st.chat_message("human"):
        st.markdown(user_input)
    # Add user message to chat history
    st.session_state.messages.append({"role": "user", "content": user_input})

    if st.session_state.chat_history_vector_store:
        results = st.session_state.chat_history_vector_store.similarity_search(
            query=user_input,
            k=4,
            filter={'use_case': 'chat_history'}
        )

        sequenced_chat_history = [(parse_message(result.metadata['msg_element']), result.metadata['msg_placement']) for result in results]
        # msg_placement is stored as a string, so cast to int to avoid lexicographic ordering ("10" < "2")
        sequenced_chat_history.sort(key=lambda pair: int(pair[1]))
        st.session_state.fed_chat_history = [message[0] for message in sequenced_chat_history]


    # chatbot response
    response = process_chat(agent_executor, user_input, st.session_state.fed_chat_history)

    st.session_state.chat_history.append(HumanMessage(content=user_input))
    st.session_state.chat_history.append(AIMessage(content=response))

    formatted_human_message = format_message(HumanMessage(content=user_input))
    formatted_ai_message = format_message(AIMessage(content=response))


    # Display assistant response in chat message container
    with st.chat_message("assistant"):
        st.markdown(response)
    # Add assistant response to chat history
    st.session_state.messages.append({"role": "assistant", "content": response})


    # Add the last two messages (HumanMessage and AIMessage) to the vector store
    if st.session_state.chat_history_vector_store:
        # the store already holds the embedding function set at creation,
        # so add_texts only needs texts, ids, and metadatas
        st.session_state.chat_history_vector_store.add_texts(
            texts=[st.session_state.chat_history[-2].content, st.session_state.chat_history[-1].content],
            ids=[str(st.session_state.unique_id), str(st.session_state.unique_id + 1)],
            metadatas=[
                {'msg_element': formatted_human_message, 'msg_placement': str(st.session_state.unique_id), 'use_case': 'chat_history'},
                {'msg_element': formatted_ai_message, 'msg_placement': str(st.session_state.unique_id + 1), 'use_case': 'chat_history'}
            ]
        )
        st.session_state.unique_id += 2
    else:
        # Initialize the vector store with the last two messages
        st.session_state.chat_history_vector_store = Chroma.from_texts(
            texts=[st.session_state.chat_history[-2].content, st.session_state.chat_history[-1].content], 
            ids=[str(st.session_state.unique_id), str(st.session_state.unique_id + 1)],
            metadatas=[
                {'msg_element': formatted_human_message, 'msg_placement': str(st.session_state.unique_id), 'use_case':'chat_history'},
                {'msg_element': formatted_ai_message, 'msg_placement': str(st.session_state.unique_id+1), 'use_case':'chat_history'}
            ],
            embedding=embedding_function
        )
        st.session_state.unique_id += 2
    
    st.session_state.chat_history = []  # clear the in-memory history once this turn has been embedded in the vector store
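
The snippet above relies on format_message and parse_message helpers defined elsewhere in the repo. A hypothetical sketch of one way to round-trip messages through the metadata fields (the actual helpers may differ):

from langchain_core.messages import HumanMessage, AIMessage

def format_message(message):
    # Serialize a message to "role: content" so it fits in a metadata string
    role = "human" if isinstance(message, HumanMessage) else "ai"
    return f"{role}: {message.content}"

def parse_message(formatted):
    # Invert format_message back into a LangChain message object
    role, _, content = formatted.partition(": ")
    return HumanMessage(content=content) if role == "human" else AIMessage(content=content)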

Knowledge Base Embedding

The code for the embedding of the knowledge base can be found here.

LangSmith Tracing

LangSmith can be a useful tool for debugging the chatbot application. To trace its runs, do the following:

  • Create a LangSmith account
  • Retrieve your API key
  • Create a .env file with the following variables:
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY='ENTER_API_KEY_HERE'
LANGCHAIN_PROJECT='PROJECT_NAME'
  • In the app.py file, make sure environment variables are loaded properly.
from dotenv import load_dotenv
load_dotenv()
  • LangSmith can be used to check whether the LLM calls the appropriate tool(s) for a given prompt.