eskwelabs_chatbot

A RAG chatbot that answers both Eskwelabs bootcamp-specific queries and general bootcamp-related questions.

Eskwelabs Chatbot

This project focuses on the end-to-end development of a Q&A Chatbot tailored to answer bootcamp-related queries, specifically for Eskwelabs. The methodology covers the key steps, including knowledge base embedding, Retrieval-Augmented Generation (RAG) chatbot development using LangChain, and deployment via Streamlit.

Disclaimer: This guide demonstrates how to create your own chatbot using Llama 3.1. However, Llama 3.1 struggles to make multiple tool calls in a single turn, a task that GPT-3.5 Turbo handles more reliably. For better results, GPT-3.5 Turbo is recommended over Llama 3.1. You can try out the chatbot powered by GPT-3.5 Turbo here.

Tech Stack

  • ChromaDB: Vector Store
  • LangChain: Chatbot Framework
  • Llama 3.1: Large Language Model
  • text-embedding-ada-002: Embedding Model
  • SemanticChunker: Chunking Strategy

Installation

  1. Clone the repository:
git clone https://github.com/alfonsokan/eskwelabs_chatbot.git
  2. Install the required libraries:
pip install -r requirements.txt
  3. Install an open-source LLM using Ollama. Refer to the Ollama documentation and select an LLM.

Then, run the following command in the command line:

ollama run llama3.1
  4. To launch the app, open a terminal in the repository root and run:
streamlit run app.py

Methodology

The chatbot was developed using the four-step approach outlined below.

1. Data Preparation

For this step, the documents are embedded and stored in the vector database embeddings_deployment_sentencetransformer, located in this repository.

If interested, the code for embedding the documents can be viewed here.
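
A minimal sketch of what that embedding step could look like, assuming plain-text source documents and the text-embedding-ada-002 model listed in the tech stack (file paths are illustrative; the repo's actual code, linked above, may differ):

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai import OpenAIEmbeddings

# Embedding model from the tech stack
embedding_function = OpenAIEmbeddings(model="text-embedding-ada-002")

# Load the raw knowledge base (illustrative path)
docs = TextLoader("knowledge_base/eskwelabs_info.txt").load()

# SemanticChunker splits where embedding similarity between adjacent
# sentences drops, instead of at fixed character counts
chunks = SemanticChunker(embedding_function).split_documents(docs)

# Persist the chunks to the local Chroma vector store
vectordb = Chroma.from_documents(
    documents=chunks,
    embedding=embedding_function,
    persist_directory="embeddings_deployment_sentencetransformer",
)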

2. Retriever Generation

  • Two retriever tools, the Eskwelabs Info Retriever and the General Bootcamp Info Retriever, are created from the embedded knowledge base (a sketch follows this list).
  • A third retriever is optionally used when a user submits their resume to the chatbot.
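
A sketch of how the create_db_retriever_tools helper used later in this guide might build the two knowledge-base tools with LangChain's create_retriever_tool (tool names and descriptions are assumptions, not the repo's actual code):

from langchain.tools.retriever import create_retriever_tool

def create_db_retriever_tools(vectordb):
    # Both tools query the same vector store; their descriptions tell
    # the agent which one to call for a given question
    retriever = vectordb.as_retriever(search_kwargs={"k": 4})

    eskwelabs_tool = create_retriever_tool(
        retriever,
        "eskwelabs_bootcamp_info_search",
        "Searches Eskwelabs-specific information such as curriculum, schedule, and fees.",
    )
    general_tool = create_retriever_tool(
        retriever,
        "bootcamp_vs_alternatives_search",
        "Searches general information comparing bootcamps with alternatives like degrees and self-study.",
    )
    return eskwelabs_tool, general_tool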

3. Tool-calling Agent Creation

Three parameters are needed to instantiate a tool-calling agent:

  • List of retriever tools
    resume = st.file_uploader("Upload File", type=['txt', 'docx', 'pdf'])

    # the two knowledge-base retriever tools are always included
    eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool = create_db_retriever_tools(vectordb)

    # if a resume is passed, include the resume retriever as a tool
    if resume is not None:
        with open(resume.name, "wb") as f:
            f.write(resume.getbuffer())
        resume_tool = resume_retriever_tool(resume.name)
        tools = [resume_tool, eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool]

    # if no resume is passed, do not include the resume retriever
    else:
        tools = [eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool]
  • LLM (Llama 3.1)
from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="llama3.1",
    temperature=0.1,
    num_predict=350,
    verbose=True,
)
  • Prompt passed to the chatbot
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompts import MessagesPlaceholder

prompt = ChatPromptTemplate(
    messages=[
        MessagesPlaceholder(variable_name='chat_history'),
        ('system', "You're a helpful assistant who provides concise, complete answers without getting cut off mid-statement. Stick strictly to the user's questions, avoiding any unnecessary details."),
        ('human', '{input}'),
        MessagesPlaceholder(variable_name='agent_scratchpad')
    ]
)

The tool-calling agent can then be instantiated:

from langchain.agents import create_tool_calling_agent, AgentExecutor

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)
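
The resume_retriever_tool helper used above is defined elsewhere in the repo. A hypothetical sketch of how such a helper could work, loading the uploaded file, embedding it into its own temporary store, and wrapping it as a retriever tool (loader choices and the tool name are assumptions):

from langchain.tools.retriever import create_retriever_tool
from langchain_community.document_loaders import PyPDFLoader, Docx2txtLoader, TextLoader
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

def resume_retriever_tool(file_path):
    # Pick a loader based on the uploaded file's extension
    if file_path.endswith(".pdf"):
        loader = PyPDFLoader(file_path)
    elif file_path.endswith(".docx"):
        loader = Docx2txtLoader(file_path)
    else:
        loader = TextLoader(file_path)

    # Embed the resume into its own throwaway vector store
    resume_db = Chroma.from_documents(
        loader.load(),
        embedding=OpenAIEmbeddings(model="text-embedding-ada-002"),
    )

    # Wrap the store as a tool the agent can call
    return create_retriever_tool(
        resume_db.as_retriever(),
        "resume_search",
        "Searches the user's uploaded resume for skills, education, and experience.",
    )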

4. Response Generation

  • Pass the tool-calling agent, the user input, and the chat history to generate a response.
def process_chat(agent_executor, user_input, chat_history):
    response = agent_executor.invoke(
        {'input': user_input,
         'chat_history': chat_history
         },
    )
    return response['output']
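
For example, a single turn with an empty chat history could look like this (the question is illustrative):

chat_history = []
answer = process_chat(agent_executor, "How long is the Eskwelabs bootcamp?", chat_history)
print(answer)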

Recommendations

  • Explore output quality using a ReAct agent instead of a tool-calling agent. Develop a ReAct prompt that lets the LLM generate reasoning traces before acting on a task.
  • Explore different chunking strategies and embedding models.
  • Connect the chatbot to a third-party database (Redis via Upstash) for long-term storage of chat history, as sketched below.
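
For that last item, LangChain's langchain_community package ships an UpstashRedisChatMessageHistory that could back such storage. A minimal sketch, assuming an Upstash Redis database whose REST URL and token are set as environment variables (the session ID is illustrative):

import os
from langchain_community.chat_message_histories import UpstashRedisChatMessageHistory

# One history object per user session; messages persist across app
# restarts, unlike Streamlit's st.session_state
history = UpstashRedisChatMessageHistory(
    url=os.environ["UPSTASH_REDIS_REST_URL"],
    token=os.environ["UPSTASH_REDIS_REST_TOKEN"],
    session_id="user-123",
)

history.add_user_message("What does the bootcamp cost?")
history.add_ai_message("The latest pricing is on the Eskwelabs website.")
print(history.messages)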

Appendix

Chatbot's Selective History Retrieval Mechanism

  • Each past turn's user query and chatbot response are stored in a temporary vector store.
  • For each new user query, only the most relevant parts of the vector store are retrieved and passed as chat history to the chatbot.
  • Importance: this reduces token consumption by retrieving only the relevant parts of the chat history, minimizing what is passed to the LLM.
  • Code snippet of the app with chat history implemented:
if "messages" not in st.session_state:
    st.session_state.messages = []


if "chat_history" not in st.session_state:
    st.session_state.chat_history = []

if "unique_id" not in st.session_state:
    st.session_state.unique_id = 0


if "chat_history_vector_store" not in st.session_state:
    st.session_state.chat_history_vector_store = None

if "fed_chat_history" not in st.session_state:
    st.session_state.fed_chat_history = []


# Display chat messages from history on app rerun
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# React to user input
if user_input := st.chat_input("Say something"):
    # Display user message in chat message container
    with st.chat_message("human"):
        st.markdown(user_input)
    # Add user message to chat history
    st.session_state.messages.append({"role": "user", "content": user_input})

    if st.session_state.chat_history_vector_store:
        results = st.session_state.chat_history_vector_store.similarity_search(
            query=user_input,
            k=4,
            filter={'use_case': 'chat_history'}
        )

        sequenced_chat_history = [(parse_message(result.metadata['msg_element']), result.metadata['msg_placement']) for result in results]
        # msg_placement is stored as a string, so cast to int to avoid lexicographic ordering ("10" < "2")
        sequenced_chat_history.sort(key=lambda pair: int(pair[1]))
        st.session_state.fed_chat_history = [message[0] for message in sequenced_chat_history]


    # chatbot response
    response = process_chat(agent_executor, user_input, st.session_state.fed_chat_history)

    st.session_state.chat_history.append(HumanMessage(content=user_input))
    st.session_state.chat_history.append(AIMessage(content=response))

    formatted_human_message = format_message(HumanMessage(content=user_input))
    formatted_ai_message = format_message(AIMessage(content=response))


    # Display assistant response in chat message container
    with st.chat_message("assistant"):
        st.markdown(response)
    # Add assistant response to chat history
    st.session_state.messages.append({"role": "assistant", "content": response})


    # Add the last two messages (HumanMessage and AIMessage) to the vector store
    if st.session_state.chat_history_vector_store:
        # the store already holds the embedding function set at creation,
        # so add_texts only needs texts, ids, and metadatas
        st.session_state.chat_history_vector_store.add_texts(
            texts=[st.session_state.chat_history[-2].content, st.session_state.chat_history[-1].content],
            ids=[str(st.session_state.unique_id), str(st.session_state.unique_id + 1)],
            metadatas=[
                {'msg_element': formatted_human_message, 'msg_placement': str(st.session_state.unique_id), 'use_case': 'chat_history'},
                {'msg_element': formatted_ai_message, 'msg_placement': str(st.session_state.unique_id + 1), 'use_case': 'chat_history'}
            ]
        )
        st.session_state.unique_id += 2
    else:
        # Initialize the vector store with the last two messages
        st.session_state.chat_history_vector_store = Chroma.from_texts(
            texts=[st.session_state.chat_history[-2].content, st.session_state.chat_history[-1].content], 
            ids=[str(st.session_state.unique_id), str(st.session_state.unique_id + 1)],
            metadatas=[
                {'msg_element': formatted_human_message, 'msg_placement': str(st.session_state.unique_id), 'use_case':'chat_history'},
                {'msg_element': formatted_ai_message, 'msg_placement': str(st.session_state.unique_id+1), 'use_case':'chat_history'}
            ],
            embedding=embedding_function
        )
        st.session_state.unique_id += 2
    
    st.session_state.chat_history = []  # clear the in-memory history once this turn has been embedded in the vector store
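
The snippet above relies on format_message and parse_message helpers defined elsewhere in the repo. A hypothetical sketch of one way to round-trip messages through the metadata fields (the actual helpers may differ):

from langchain_core.messages import HumanMessage, AIMessage

def format_message(message):
    # Serialize a message to "role: content" so it fits in a metadata string
    role = "human" if isinstance(message, HumanMessage) else "ai"
    return f"{role}: {message.content}"

def parse_message(formatted):
    # Invert format_message back into a LangChain message object
    role, _, content = formatted.partition(": ")
    return HumanMessage(content=content) if role == "human" else AIMessage(content=content)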

Knowledge Base Embedding

The code for the embedding of the knowledge base can be found here.

LangSmith Tracing

LangSmith can be a useful tool for debugging the chatbot application. To trace its runs, do the following:

  • Create a LangSmith account
  • Retrieve your API key
  • Create a .env file with the following variables:
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY='ENTER_API_KEY_HERE'
LANGCHAIN_PROJECT='PROJECT_NAME'
  • In the app.py file, make sure environment variables are loaded properly.
from dotenv import load_dotenv
load_dotenv()
  • LangSmith can be used to check whether the LLM calls the appropriate tool(s) for a given prompt.