gcp-llm-retrieval-augmentation

A retrieval augmentation LLM demo in GCP

LLM retrieval augmentation in Google Cloud

This demo combines GCP Vector Search and Vertex AI PaLM to pair retrieval augmentation with a conversational engine, creating a question-answering system: the user asks a question and the LLM answers it using its retrieved context.
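The retrieval-augmentation pattern described above can be sketched in plain Python. Everything here is a stand-in: `embed` is a toy bag-of-characters embedding in place of a real Vertex AI embedding model, and the assembled prompt would be sent to PaLM rather than printed.

```python
import math

def embed(text: str) -> list[float]:
    # Toy bag-of-characters embedding; the demo would call a real
    # embedding model instead.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the question; Vector Search plays
    # this role in the demo.
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, documents: list[str]) -> str:
    # Ground the LLM in the retrieved passages.
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Norman architecture spread through England after 1066.",
    "The quick brown fox jumps over the lazy dog.",
]
print(build_prompt("When did Norman architecture spread?", docs))
```

The key design point is that the LLM never sees the whole corpus, only the top-k passages most similar to the question.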

The dataset used is the Stanford Question Answering Dataset (SQuAD), a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles.
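Each SQuAD record is a (question, context, answer) triple, where the answer is a span of the context located by a character offset. The record below uses made-up values (id, title, text) purely to illustrate the shape:

```python
import json

# A single SQuAD-style record; field names match the SQuAD format,
# values are invented for illustration.
record = json.loads("""
{
  "id": "example-001",
  "title": "Example_Article",
  "context": "Architecturally, the school has a Catholic character.",
  "question": "What character does the school have?",
  "answers": {"text": ["a Catholic character"], "answer_start": [32]}
}
""")

# The answer span can be recovered from the context via answer_start.
start = record["answers"]["answer_start"][0]
text = record["answers"]["text"][0]
assert record["context"][start:start + len(text)] == text
print(record["question"], "->", text)
```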

The demo can be accessed here.

Services used

Frameworks

Prerequisites

Docs

  1. Infrastructure and Vector Search Setup: set up the required infrastructure with Terraform and create the Vector Search index
  2. Create embeddings: generate embeddings for the documents and index them in Vector Search
  3. Firestore: index the documents in Firestore
  4. LangChain Retriever and Agent: create a LangChain retriever and conversational agent
  5. Cloud Run: package the code and deploy the API to Cloud Run
  6. Firebase WebUI: create the web app
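Steps 2 and 3 split the data between two services: Vector Search stores only embeddings keyed by document id, while Firestore holds the document text, so retrieval first finds neighbour ids and then resolves them to text. A minimal sketch of that split, with in-memory dicts standing in for both services:

```python
vector_index = {}   # doc_id -> embedding (stand-in for Vector Search)
doc_store = {}      # doc_id -> document text (stand-in for Firestore)

def index_document(doc_id: str, text: str, embedding: list[float]) -> None:
    # Steps 2-3: embedding goes to the vector index, text to the doc store.
    vector_index[doc_id] = embedding
    doc_store[doc_id] = text

def nearest(query_embedding: list[float], k: int = 1) -> list[str]:
    # Vector Search would run an approximate nearest-neighbour lookup;
    # here we rank exactly by squared Euclidean distance.
    def dist(doc_id: str) -> float:
        e = vector_index[doc_id]
        return sum((a - b) ** 2 for a, b in zip(query_embedding, e))
    return sorted(vector_index, key=dist)[:k]

def fetch(doc_ids: list[str]) -> list[str]:
    # Resolve the id list from the vector index to text via the doc store.
    return [doc_store[d] for d in doc_ids]

index_document("d1", "SQuAD is a reading comprehension dataset.", [1.0, 0.0])
index_document("d2", "Cloud Run hosts the demo API.", [0.0, 1.0])
print(fetch(nearest([0.9, 0.1])))
```

In the deployed demo the LangChain retriever (step 4) performs exactly this lookup-then-fetch sequence before handing the passages to the agent.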
Related Projects