A RAG chatbot that answers both Eskwelabs bootcamp-specific queries and general bootcamp-related questions.
This project focuses on the end-to-end development of a Q&A Chatbot tailored to answer bootcamp-related queries, specifically for Eskwelabs. The methodology covers the key steps, including knowledge base embedding, Retrieval-Augmented Generation (RAG) chatbot development using LangChain, and deployment via Streamlit.
Disclaimer: This guide demonstrates how to create your own chatbot using Llama 3.1. However, Llama 3.1 struggles to make multiple tool calls in a single turn, whereas GPT-3.5 Turbo handles this more reliably. For improved results, GPT-3.5 Turbo is recommended over Llama 3.1. You can try out the chatbot powered by GPT-3.5 Turbo here.
To run the chatbot locally, clone the repository and install the dependencies:
git clone https://github.com/alfonsokan/eskwelabs_chatbot.git
cd eskwelabs_chatbot
pip install -r requirements.txt
Then, run the following commands in the command line (CMD), using separate terminals for Ollama and Streamlit:
ollama run llama3.1
streamlit run app.py
The flow chart below displays the 4-step approach to developing the chatbot.
1. Data Preparation
For this step, the documents are embedded and stored in a vector database called embeddings_deployment_sentencetransformer, located in this repository.
If interested, the code for embedding the documents can be viewed here.
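For context, here is a minimal sketch of loading the persisted store at startup. It assumes the knowledge base was embedded with a sentence-transformers model via HuggingFaceEmbeddings; the model name below is an assumption, and the same embedding function used at indexing time must be reused when loading:

from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings

# Assumption: the knowledge base was embedded with this sentence-transformers model
embedding_function = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Load the persisted Chroma store from the repository directory
vectordb = Chroma(
    persist_directory="embeddings_deployment_sentencetransformer",
    embedding_function=embedding_function
)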
2. Retriever Generation
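This step wraps the vector store in retrievers that the agent can call as tools. The repository's create_db_retriever_tools (used in the next step) is not reproduced here; a minimal sketch, assuming it is built with LangChain's create_retriever_tool, with illustrative tool descriptions and an assumed k:

from langchain.tools.retriever import create_retriever_tool

def create_db_retriever_tools(vectordb):
    # One shared retriever over the knowledge base; k=4 is an assumption
    retriever = vectordb.as_retriever(search_kwargs={"k": 4})
    eskwelabs_bootcamp_info_search_tool = create_retriever_tool(
        retriever,
        name="eskwelabs_bootcamp_info_search",
        description="Searches Eskwelabs documents for bootcamp-specific information."
    )
    bootcamp_vs_alternatives_search_tool = create_retriever_tool(
        retriever,
        name="bootcamp_vs_alternatives_search",
        description="Searches documents comparing bootcamps with alternatives such as degrees and self-study."
    )
    return eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool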
3. Tool-calling Agent Creation
Three parameters are needed to instantiate a tool-calling agent: the tools, the LLM, and the prompt. First, the tools:
resume = st.file_uploader("Upload File", type=['txt', 'docx', 'pdf'])

# if a resume is passed, include the resume retriever as a tool
if resume is not None:
    with open(resume.name, "wb") as f:
        f.write(resume.getbuffer())
    resume_tool = resume_retriever_tool(resume.name)
    eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool = create_db_retriever_tools(vectordb)
    tools = [resume_tool, eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool]
# if no resume is passed, do not include the resume retriever as a tool
else:
    eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool = create_db_retriever_tools(vectordb)
    tools = [eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool]
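resume_retriever_tool is defined in the repository; a possible sketch, assuming the uploaded file is a PDF that gets loaded, chunked, and embedded into its own temporary vector store (the loader choice and chunk sizes are assumptions):

from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain.tools.retriever import create_retriever_tool

def resume_retriever_tool(file_path):
    # Assumption: PDF input; .txt and .docx uploads would need their own loaders
    docs = PyPDFLoader(file_path).load()
    chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)
    # Reuses the embedding_function from the Data Preparation step
    resume_db = Chroma.from_documents(chunks, embedding=embedding_function)
    return create_retriever_tool(
        resume_db.as_retriever(),
        name="resume_search",
        description="Searches the uploaded resume for relevant skills and experience."
    )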
Second, the LLM:

from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="llama3.1",
    temperature=0.1,
    num_predict=350,
    verbose=True
)
Third, the prompt:

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate(
    messages=[
        MessagesPlaceholder(variable_name='chat_history'),
        ('system', "You're a helpful assistant who provides concise, complete answers without getting cut off mid-statement. Stick strictly to the user's questions, avoiding any unnecessary details."),
        ('human', '{input}'),
        MessagesPlaceholder(variable_name="agent_scratchpad")
    ]
)
Afterwards, the tool-calling agent can be instantiated:
from langchain.agents import create_tool_calling_agent, AgentExecutor

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)
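As a quick sanity check, the executor can be invoked directly before wiring it into the Streamlit app (the sample question is illustrative):

result = agent_executor.invoke({'input': 'What tracks does the Eskwelabs bootcamp offer?', 'chat_history': []})
print(result['output'])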
4. Response Generation
def process_chat(agent_executor, user_input, chat_history):
    response = agent_executor.invoke(
        {'input': user_input,
         'chat_history': chat_history}
    )
    return response['output']
if "messages" not in st.session_state:
st.session_state.messages = []
if "chat_history" not in st.session_state:
st.session_state.chat_history = []
if "unique_id" not in st.session_state:
st.session_state.unique_id = 0
if "chat_history_vector_store" not in st.session_state:
st.session_state.chat_history_vector_store = None
if "fed_chat_history" not in st.session_state:
st.session_state.fed_chat_history = []
# Display chat messages from history on app rerun
for message in st.session_state.messages:
with st.chat_message(message["role"]):
st.markdown(message["content"])
# React to user input
if user_input := st.chat_input("Say something"):
    # Display user message in chat message container
    with st.chat_message("human"):
        st.markdown(user_input)
    # Add user message to chat history
    st.session_state.messages.append({"role": "user", "content": user_input})

    # Retrieve the most relevant past messages from the chat history vector store
    if st.session_state.chat_history_vector_store:
        results = st.session_state.chat_history_vector_store.similarity_search(
            query=user_input,
            k=4,
            filter={'use_case': 'chat_history'}
        )
        # Rebuild the retrieved messages in their original conversational order
        sequenced_chat_history = [
            (parse_message(result.metadata['msg_element']), result.metadata['msg_placement'])
            for result in results
        ]
        # Sort numerically; msg_placement is stored as a string
        sequenced_chat_history.sort(key=lambda pair: int(pair[1]))
        st.session_state.fed_chat_history = [message[0] for message in sequenced_chat_history]

    # chatbot response
    response = process_chat(agent_executor, user_input, st.session_state.fed_chat_history)
    st.session_state.chat_history.append(HumanMessage(content=user_input))
    st.session_state.chat_history.append(AIMessage(content=response))
    formatted_human_message = format_message(HumanMessage(content=user_input))
    formatted_ai_message = format_message(AIMessage(content=response))
    # Display assistant response in chat message container
    with st.chat_message("assistant"):
        st.markdown(response)
    # Add assistant response to chat history
    st.session_state.messages.append({"role": "assistant", "content": response})

    # Add the last two messages (HumanMessage and AIMessage) to the vector store
    if st.session_state.chat_history_vector_store:
        # add_texts reuses the embedding function the store was created with
        st.session_state.chat_history_vector_store.add_texts(
            texts=[st.session_state.chat_history[-2].content, st.session_state.chat_history[-1].content],
            ids=[str(st.session_state.unique_id), str(st.session_state.unique_id + 1)],
            metadatas=[
                {'msg_element': formatted_human_message, 'msg_placement': str(st.session_state.unique_id), 'use_case': 'chat_history'},
                {'msg_element': formatted_ai_message, 'msg_placement': str(st.session_state.unique_id + 1), 'use_case': 'chat_history'}
            ]
        )
        st.session_state.unique_id += 2
    else:
        # Initialize the vector store with the last two messages
        st.session_state.chat_history_vector_store = Chroma.from_texts(
            texts=[st.session_state.chat_history[-2].content, st.session_state.chat_history[-1].content],
            ids=[str(st.session_state.unique_id), str(st.session_state.unique_id + 1)],
            metadatas=[
                {'msg_element': formatted_human_message, 'msg_placement': str(st.session_state.unique_id), 'use_case': 'chat_history'},
                {'msg_element': formatted_ai_message, 'msg_placement': str(st.session_state.unique_id + 1), 'use_case': 'chat_history'}
            ],
            embedding=embedding_function
        )
        st.session_state.unique_id += 2

    # After embedding the conversation turn into the vector store, clear the chat history before the end of the run
    st.session_state.chat_history = []
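format_message and parse_message are helpers from the repository that move messages in and out of the vector store's metadata. A plausible sketch, assuming messages are serialized to tagged strings and parsed back into LangChain message objects on retrieval:

from langchain_core.messages import HumanMessage, AIMessage

def format_message(message):
    # Serialize a message into a tagged string so it can be stored in Chroma metadata
    role = "human" if isinstance(message, HumanMessage) else "ai"
    return f"{role}: {message.content}"

def parse_message(formatted_message):
    # Reverse of format_message: rebuild the LangChain message object
    role, _, content = formatted_message.partition(": ")
    return HumanMessage(content=content) if role == "human" else AIMessage(content=content)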
The code for embedding the knowledge base can be found here.
LangSmith can be a useful tool for debugging the chatbot application. To trace its runs, do the following:
In the .env file, set:

LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY='ENTER_API_KEY_HERE'
LANGCHAIN_PROJECT='PROJECT_NAME'

In the app.py file, make sure the environment variables are loaded properly:

from dotenv import load_dotenv
load_dotenv()