LangGraph RAG Research Agent Template

This is a starter project to help you get started with developing a RAG research agent using LangGraph in LangGraph Studio.

What it does

This project has three graphs:

  • an "index" graph (src/index_graph/graph.py)
  • a "retrieval" graph (src/retrieval_graph/graph.py)
  • a "researcher" subgraph (part of the retrieval graph) (src/researcher_graph/graph.py)

The index graph takes in document objects and indexes them, for example:

[{ "page_content": "LangGraph is a library for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows." }]

If an empty list is provided (default), a list of sample documents from src/sample_docs.json is indexed instead. Those sample documents are based on the conceptual guides for LangChain and LangGraph.
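
For reference, a minimal sketch of invoking the index graph directly in Python (the exported graph name and the "docs" state key are assumptions based on this template's layout):

from index_graph.graph import graph

# Index your own documents (the "docs" state key is an assumption):
graph.invoke({"docs": [{"page_content": "LangGraph is a library for building stateful, multi-actor applications with LLMs."}]})

# An empty list (the default) falls back to src/sample_docs.json:
graph.invoke({"docs": []})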

The retrieval graph manages a chat history and responds based on the fetched documents. Specifically, it:

  1. Takes a user query as input
  2. Analyzes the query and determines how to route it:
  • if the query is about "LangChain", it creates a research plan based on the user's query and passes the plan to the researcher subgraph
  • if the query is ambiguous, it asks for more information
  • if the query is general (unrelated to LangChain), it lets the user know
  3. If the query is about "LangChain", the researcher subgraph runs for each step in the research plan, until no more steps are left:
  • it first generates a list of queries based on the step
  • it then retrieves the relevant documents in parallel for all queries and returns the documents to the retrieval graph
  4. Finally, the retrieval graph generates a response based on the retrieved documents and the conversation context
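
A hedged sketch of how such a routing step can be implemented with structured output (the class and field names are illustrative, not the template's exact schema):

from typing import Literal
from pydantic import BaseModel
from langchain_anthropic import ChatAnthropic

class Router(BaseModel):
    """Classification of the user's query."""
    type: Literal["langchain", "more-info", "general"]

model = ChatAnthropic(model="claude-3-5-sonnet-20240620")
decision = model.with_structured_output(Router).invoke("How do I add persistence to a LangGraph graph?")
print(decision.type)  # -> "langchain"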

Getting Started

Assuming you have already installed LangGraph Studio, to set up:

  1. Create a .env file.
cp .env.example .env
  2. Select your retriever & index, and save the access instructions to your .env file.

Setup Retriever

The default value for retriever_provider is shown below:

retriever_provider: elastic-local

Follow the instructions below to get set up, or pick one of the additional options.

Elasticsearch

Elasticsearch (as provided by Elastic) is an open source distributed search and analytics engine, scalable data store and vector database optimized for speed and relevance on production-scale workloads.

Setup Elasticsearch

Elasticsearch can serve as the knowledge base provider for the retrieval agent. It can be deployed on Elastic Cloud (either as a hosted deployment or a serverless project) or run in your local environment.

Elasticsearch Serverless

  1. Sign up for a free 14-day trial of Elasticsearch Serverless.
  2. Get the Elasticsearch URL, found on the home page under "Copy your connection details".
  3. Create an API key, found on the home page under "API Key".
  4. Copy the URL and API key to your .env file created above:
ELASTICSEARCH_URL=<ES_URL>
ELASTICSEARCH_API_KEY=<API_KEY>

Elastic Cloud

  1. Sign up for a free 14-day trial of Elastic Cloud.
  2. Get the Elasticsearch URL, found under the Applications section of your deployment.
  3. Create an API key. See the official Elastic documentation for more information.
  4. Copy the URL and API key to your .env file created above:
ELASTICSEARCH_URL=<ES_URL>
ELASTICSEARCH_API_KEY=<API_KEY>

Local Elasticsearch (Docker)

docker run \
  -p 127.0.0.1:9200:9200 \
  -d \
  --name elasticsearch \
  -e ELASTIC_PASSWORD=changeme \
  -e "discovery.type=single-node" \
  -e "xpack.security.http.ssl.enabled=false" \
  -e "xpack.license.self_generated.type=trial" \
  docker.elastic.co/elasticsearch/elasticsearch:8.15.1

See the official Elastic documentation for more information on running it locally.

Then populate the following in your .env file:

# Since both Elasticsearch and LangGraph Studio run in Docker, use host.docker.internal so Studio can reach Elasticsearch.

ELASTICSEARCH_URL=http://host.docker.internal:9200
ELASTICSEARCH_USER=elastic
ELASTICSEARCH_PASSWORD=changeme
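
To verify the container is up before pointing the template at it, a quick sanity check with the official Python client (run from your host, hence localhost instead of host.docker.internal):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200", basic_auth=("elastic", "changeme"))
print(es.info()["version"]["number"])  # e.g. "8.15.1"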

MongoDB Atlas

MongoDB Atlas is a fully-managed cloud database that includes vector search capabilities for AI-powered applications.

  1. Create a free Atlas cluster:
  • Go to the MongoDB Atlas website and sign up for a free account.
  • After logging in, create a free cluster by following the on-screen instructions.
  2. Create a vector search index
  • Follow the instructions at the Mongo docs
  • By default, we use the collection langgraph_retrieval_agent.default - create the index there
  • Add an indexed filter for path user_id
  • IMPORTANT: select Atlas Vector Search NOT Atlas Search when creating the index
    Your final JSON editor configuration should look something like the following:
{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    }
  ]
}

The exact numDimensions may differ if you select a different embedding model.

  3. Set up your environment:
  • In the Atlas dashboard, click on "Connect" for your cluster.
  • Choose "Connect your application" and copy the provided connection string.
  • Create a .env file in your project root if you haven't already.
  • Add your MongoDB Atlas connection string to the .env file:
MONGODB_URI="mongodb+srv://username:[email protected]/?retryWrites=true&w=majority&appName=your-cluster-name"

Replace username, password, your-cluster-url, and your-cluster-name with your actual credentials and cluster information.
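
As an optional sanity check that the connection string works (assumes the pymongo package is installed):

import os
from pymongo import MongoClient

client = MongoClient(os.environ["MONGODB_URI"])
client.admin.command("ping")  # raises an exception if the cluster is unreachable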

Pinecone Serverless

Pinecone is a managed, cloud-native vector database that provides long-term memory for high-performance AI applications.

  1. Sign up for a Pinecone account at https://login.pinecone.io/login if you haven't already.

  2. After logging in, generate an API key from the Pinecone console.

  3. Create a serverless index:

    • Choose a name for your index (e.g., "example-index")
    • Set the dimension based on your embedding model (e.g., 1536 for OpenAI embeddings)
    • Select "cosine" as the metric
    • Choose "Serverless" as the index type
    • Select your preferred cloud provider and region (e.g., AWS us-east-1)
  4. Once you have created your index and obtained your API key, add them to your .env file:

PINECONE_API_KEY=your-api-key
PINECONE_INDEX_NAME=your-index-name
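
If you prefer to create the index from code rather than the console, a sketch using the official Pinecone Python client (the name, cloud, and region below are examples):

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-api-key")
pc.create_index(
    name="example-index",
    dimension=1536,  # match your embedding model
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)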

Setup Model

The default values for response_model and query_model are shown below:

response_model: anthropic/claude-3-5-sonnet-20240620
query_model: anthropic/claude-3-haiku-20240307
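
These strings follow a provider/model convention. As an illustration of how such a string can be resolved (using LangChain's init_chat_model; the template's own loader may differ):

from langchain.chat_models import init_chat_model

provider, model_name = "anthropic/claude-3-5-sonnet-20240620".split("/", maxsplit=1)
response_model = init_chat_model(model_name, model_provider=provider)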

Follow the instructions below to get set up, or pick one of the additional options.

Anthropic

To use Anthropic's chat models:

  1. Sign up for an Anthropic API key if you haven't already.
  2. Once you have your API key, add it to your .env file:
ANTHROPIC_API_KEY=your-api-key

OpenAI

To use OpenAI's chat models:

  1. Sign up for an OpenAI API key.
  2. Once you have your API key, add it to your .env file:
OPENAI_API_KEY=your-api-key

Setup Embedding Model

The default value for embedding_model is shown below:

embedding_model: openai/text-embedding-3-small
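
For illustration, this is roughly what the default resolves to (the template constructs the embedding model for you from this config value):

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vector = embeddings.embed_query("What is LangGraph?")
print(len(vector))  # 1536 dimensions by default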

Follow the instructions below to get set up, or pick one of the additional options.

OpenAI

To use OpenAI's embeddings:

  1. Sign up for an OpenAI API key.
  2. Once you have your API key, add it to your .env file:
OPENAI_API_KEY=your-api-key

Cohere

To use Cohere's embeddings:

  1. Sign up for a Cohere API key.
  2. Once you have your API key, add it to your .env file:
COHERE_API_KEY=your-api-key

Using

Once you've set up your retriever and saved your model secrets, it's time to try it out! First, let's add some information to the index. Open LangGraph Studio, select the "indexer" graph from the dropdown in the top-left, and then add some content to chat over. You can simply invoke it with an empty list (the default) to index sample documents from the LangChain and LangGraph documentation.

You'll know that indexing is complete when the indexer deletes the content from its graph state (since it has been persisted in your configured storage provider).
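
If you'd rather trigger indexing from code than from the Studio UI, a hypothetical sketch with the LangGraph SDK against a locally running server (the URL, graph name, and "docs" key are assumptions):

from langgraph_sdk import get_sync_client

client = get_sync_client(url="http://localhost:2024")
# Stateless run of the "indexer" graph; an empty docs list indexes the samples:
client.runs.wait(None, "indexer", input={"docs": []})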

Next, open the "retrieval_graph" using the dropdown in the top-left. Ask it questions about LangChain to confirm it can fetch the required information!

How to customize

You can customize this retrieval agent template in several ways:

  1. Change the retriever: You can switch between different vector stores (Elasticsearch, MongoDB, Pinecone) by modifying the retriever_provider in the configuration. Each provider has its own setup instructions in the "Getting Started" section above.

  2. Modify the embedding model: You can change the embedding model used for document indexing and query embedding by updating the embedding_model in the configuration. Options include various OpenAI and Cohere models.

  3. Adjust search parameters: Fine-tune the retrieval process by modifying the search_kwargs in the configuration. This allows you to control aspects like the number of documents retrieved or similarity thresholds (see the configuration sketch after this list).

  4. Customize the response generation: You can modify the response_system_prompt to change how the agent formulates its responses. This allows you to adjust the agent's personality or add specific instructions for answer generation.

  5. Modify prompts: Update the prompts used for user query routing, research planning, query generation and more in src/retrieval_graph/prompts.py to better suit your specific use case or to improve the agent's performance. You can also modify these directly in LangGraph Studio. For example, you can:

  • Modify system prompt for creating research plan (research_plan_system_prompt)
  • Modify system prompt for generating search queries based on the research plan (generate_queries_system_prompt)
  6. Change the language model: Update the response_model in the configuration to use different language models for response generation. Options include various Claude models from Anthropic, as well as models from other providers like Fireworks AI.

  7. Extend the graph: You can add new nodes or modify existing ones in the src/retrieval_graph/graph.py file to introduce additional processing steps or decision points in the agent's workflow.

  8. Add tools: Implement tools to expand the researcher agent's capabilities beyond simple retrieval and generation.
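
For example, several of the options above are plain values in the run configuration (the keys shown mirror the option names above; the exact schema lives in the template's configuration module):

config = {
    "configurable": {
        "retriever_provider": "pinecone",                # option 1: switch vector stores
        "embedding_model": "cohere/embed-english-v3.0",  # option 2: swap embeddings
        "search_kwargs": {"k": 5},                       # option 3: retrieve 5 documents
    }
}

# Assuming the retrieval graph is imported as `graph`:
graph.invoke({"messages": [("user", "What is LangGraph?")]}, config)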

Remember to test your changes thoroughly to ensure they improve the agent's performance for your specific use case.

Development

While iterating on your graph, you can edit past state and rerun your app from past states to debug specific nodes. Local changes will be automatically applied via hot reload. Try adding an interrupt before the agent calls the researcher subgraph, updating the default system message in src/retrieval_graph/prompts.py to take on a persona, or adding additional nodes and edges!
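
For instance, a hedged sketch of adding an interrupt at compile time (the node name is an assumption; check src/retrieval_graph/graph.py for the node that calls the researcher subgraph):

# In src/retrieval_graph/graph.py, where the graph is compiled:
graph = builder.compile(interrupt_before=["conduct_research"])  # node name assumed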

Follow-up requests will be appended to the same thread. You can create an entirely new thread, clearing previous history, using the + button in the top right.

You can find the latest (under construction) docs on LangGraph here, including examples and other references. Using those guides can help you pick the right patterns to adapt here for your use case.

LangGraph Studio also integrates with LangSmith for more in-depth tracing and collaboration with teammates.