RAG document chat with Amazon Bedrock using TypeScript on Lambda
This repository is a basic, more-or-less functional, prototype-grade implementation of retrieval augmented generation (RAG) on Amazon Bedrock.
The project uses TypeScript and, I hope, a structure clear enough to follow each step, undiluted by LangChain or other magic (other than a small text-splitting library!).
When making this, it was unfortunately harder than expected to find clear, first-principles examples and guidance, especially if Python is not your language of choice. Thanks mainly to Janakiram MSV's videos on RAG and a set of examples shared by David Boyne on LinkedIn (I've been unable to find the link again, though...), I was able to get something working that should demonstrate how to think about constructing these capabilities.
I hope this repo will make the technique easier to understand and implement for you, even if you use a different tech stack.
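To make the moving parts concrete, here is a minimal sketch of the question-answering flow: embed the question, retrieve the nearest chunks from OpenSearch Serverless, and ask the LLM with that context. The model IDs, index name, field names, and endpoint URL below are illustrative assumptions, not necessarily the repo's exact values.

```typescript
import { BedrockRuntimeClient, InvokeModelCommand } from '@aws-sdk/client-bedrock-runtime';
import { Client } from '@opensearch-project/opensearch';
import { AwsSigv4Signer } from '@opensearch-project/opensearch/aws';
import { defaultProvider } from '@aws-sdk/credential-provider-node';

const REGION = 'us-east-1'; // assumption: adjust to your region
const OPENSEARCH_URL = 'https://YOUR_COLLECTION_ID.us-east-1.aoss.amazonaws.com'; // placeholder
const INDEX_NAME = 'documents'; // assumption

const bedrock = new BedrockRuntimeClient({ region: REGION });

// OpenSearch Serverless client, signed with SigV4 (service name "aoss")
const opensearch = new Client({
  ...AwsSigv4Signer({ region: REGION, service: 'aoss', getCredentials: defaultProvider() }),
  node: OPENSEARCH_URL
});

// 1. Turn the question into an embedding vector with a Titan embeddings model
async function embed(text: string): Promise<number[]> {
  const response = await bedrock.send(
    new InvokeModelCommand({
      modelId: 'amazon.titan-embed-text-v1', // assumption: any Bedrock embeddings model works
      contentType: 'application/json',
      accept: 'application/json',
      body: JSON.stringify({ inputText: text })
    })
  );
  return JSON.parse(new TextDecoder().decode(response.body)).embedding;
}

// 2. Retrieve the most similar document chunks with a k-NN query
async function retrieve(vector: number[], k = 4): Promise<string[]> {
  const result = await opensearch.search({
    index: INDEX_NAME,
    body: { size: k, query: { knn: { embedding: { vector, k } } } } // "embedding" field name is an assumption
  });
  return result.body.hits.hits.map((hit: any) => hit._source.text); // "text" field name is an assumption
}

// 3. Ask the LLM, grounding it in the retrieved context
async function ask(question: string): Promise<string> {
  const context = (await retrieve(await embed(question))).join('\n---\n');
  const response = await bedrock.send(
    new InvokeModelCommand({
      modelId: 'anthropic.claude-instant-v1', // assumption: swap for your preferred text model
      contentType: 'application/json',
      accept: 'application/json',
      body: JSON.stringify({
        prompt: `\n\nHuman: Use the following context to answer the question.\n\n${context}\n\nQuestion: ${question}\n\nAssistant:`,
        max_tokens_to_sample: 500
      })
    })
  );
  return JSON.parse(new TextDecoder().decode(response.body)).completion;
}

ask('What does Mikael say about dumb, predictable code?').then(console.log);
```

The repo's `Ask` Lambda exposes essentially this flow behind the HTTP endpoint described further down.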
In the AWS console:

1. Create an OpenSearch Serverless collection named `document-chat-demo-embeddings` with standard settings.
2. In the `Collections` view, select your collection, go into the `Indexes` tab and create a vector index.
3. Paste the contents of `opensearch-index.json` into the text field and name the index `documents`.
4. Back in the `Collections` view, make note of the OpenSearch URL; you will update the infrastructure configuration in the next step.

In `serverless.yml`:

- Update `custom.awsAccountNumber`, `custom.documentsBucketName` (your choice of random name), and `custom.openSearchUrl` to your values.

In your IDE/CLI:

- Run `npm run deploy`.

In the AWS console:

- Under `Serverless > Security > Data access policies`, open the pre-baked policy and add the Lambda functions' roles (`Ask` and `GenerateEmbeddings`) to the selected principals.

There is also a file `src/config/config.ts` that you may wish to modify if you want a different region or similar.
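For orientation, this is the kind of thing such a config file typically centralizes; the actual property names and defaults in `src/config/config.ts` may well differ.

```typescript
// Illustrative only: the real src/config/config.ts may use different names and values.
export const config = {
  region: 'us-east-1',                             // AWS region for Bedrock and OpenSearch
  embeddingModelId: 'amazon.titan-embed-text-v1',  // model used to create embeddings
  textModelId: 'anthropic.claude-instant-v1',      // model used to answer questions
  indexName: 'documents'                           // OpenSearch vector index
};
```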
Clone, fork, or download the repo as you normally would. Run `npm install`.
- `npm start`: Run application locally
- `npm run build`: Package application with Serverless Framework
- `npm run deploy`: Deploy application to AWS with Serverless Framework
- `npm run teardown`: Remove stack from AWS

You will need documents for this to use your "own data".
In the current implementation, the infrastructure emits S3 events when either PDF or TXT files are added to a `documents` folder in your bucket (create this folder if you haven't already).
However, the chunking function currently only does anything with TXT files. Feel free to extend it with PDF parsing and whatever else you might need; it's not too complicated, and since this repo is about showing the principles in a minimal, working way, I haven't felt the need to over-invest here and now. A sketch of one possible extension follows below.
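As an illustration of what that extension could look like, here is a minimal sketch that reads an uploaded object from S3 and extracts plain text from a PDF using the `pdf-parse` package. The helper name and the idea of branching on file extension are assumptions, not the repo's existing code.

```typescript
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import pdfParse from 'pdf-parse';

const s3 = new S3Client({});

// Hypothetical helper: fetch an uploaded object and return its plain text,
// branching on the file extension from the S3 event key.
export async function getDocumentText(bucket: string, key: string): Promise<string> {
  const object = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
  const bytes = await object.Body!.transformToByteArray();

  if (key.toLowerCase().endsWith('.pdf')) {
    // pdf-parse extracts the text layer of the PDF
    const parsed = await pdfParse(Buffer.from(bytes));
    return parsed.text;
  }

  // TXT (and anything else) is treated as UTF-8 text
  return Buffer.from(bytes).toString('utf-8');
}
```

From there, the existing text-splitting and embedding steps can stay exactly the same.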
To start the process of embedding vectors on document data, simply upload one of the provided documents (or any other such document) to your bucket's `documents` folder. There is a TXT and a PDF file, with essentially the same content, located in the `data` directory.
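For example, with the AWS CLI (the file and bucket names below are placeholders; use the actual file in `data/` and your own bucket):

```bash
# Placeholder names: substitute the real file in data/ and your documents bucket
aws s3 cp data/your-document.txt s3://YOUR_DOCUMENTS_BUCKET/documents/
```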
The endpoint takes a GET request with a URL-encoded question string. If you don't know how to URL-encode by heart, there are simple online tools that can help you (or see the snippet below).

For the question "What does Mikael say about dumb, predictable code?", the call would be:

```bash
curl https://RANDOM_ID.execute-api.REGION.amazonaws.com/\?ask\=What%20does%20Mikael%20say%20about%20dumb%2C%20predictable%20code%3F
```

This will respond with the LLM's answer after a few seconds.
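If you prefer to call the endpoint from code, here is a small TypeScript sketch that handles the encoding for you (the endpoint URL is a placeholder for your deployed API):

```typescript
// Placeholder endpoint: replace with your own API Gateway URL
const ENDPOINT = 'https://RANDOM_ID.execute-api.REGION.amazonaws.com/';

async function askQuestion(question: string): Promise<string> {
  // encodeURIComponent takes care of spaces, commas, question marks, etc.
  const response = await fetch(`${ENDPOINT}?ask=${encodeURIComponent(question)}`);
  return response.text();
}

askQuestion('What does Mikael say about dumb, predictable code?').then(console.log);
```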
Some of the model IDs you may come across on Amazon Bedrock:

- `amazon.titan-tg1-large`
- `amazon.titan-e1t-medium`
- `amazon.titan-embed-g1-text-02`
- `amazon.titan-text-express-v1`
- `amazon.titan-embed-text-v1`
- `stability.stable-diffusion-xl`
- `stability.stable-diffusion-xl-v0`
- `ai21.j2-grande-instruct`
- `ai21.j2-jumbo-instruct`
- `ai21.j2-mid`
- `ai21.j2-mid-v1`
- `ai21.j2-ultra`
- `ai21.j2-ultra-v1`
- `anthropic.claude-instant-v1`
- `anthropic.claude-v1`
- `anthropic.claude-v2`
- `cohere.command-text-v14`
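The exact set varies by region and over time, so rather than trusting a static list you can ask Bedrock itself. A small sketch using the AWS SDK's Bedrock control-plane client:

```typescript
import { BedrockClient, ListFoundationModelsCommand } from '@aws-sdk/client-bedrock';

// Lists the foundation models available to your account in the given region
async function listModels(region = 'us-east-1'): Promise<void> {
  const client = new BedrockClient({ region });
  const response = await client.send(new ListFoundationModelsCommand({}));
  for (const model of response.modelSummaries ?? []) {
    console.log(model.modelId);
  }
}

listModels();
```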
Note that you can't use `vector` for fields in OpenSearch - it won't work :)