ID-based RAG FastAPI: Integration with Langchain and PostgreSQL/pgvector
MIT License
This project integrates Langchain with FastAPI in an asynchronous, scalable manner, providing a framework for document indexing and retrieval using PostgreSQL/pgvector.
Files are organized into embeddings by file_id. The primary use case is integration with LibreChat, but this simple API can be used for any ID-based use case.
The main reason to use the ID approach is to work with embeddings at the file level. This allows targeted queries when combined with file metadata stored in a database, as LibreChat does.
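To illustrate the file-level idea, here is a minimal sketch in plain Python. It is not this project's implementation: the `Chunk` class, the `query_by_file_id` helper, and the toy 2-dimensional embeddings are all hypothetical, and a real deployment would delegate the filtering and ranking to the vector store.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    file_id: str            # ID of the file this chunk was cut from
    text: str
    embedding: list[float]  # toy vector; real embeddings are much longer


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query_by_file_id(chunks, file_id, query_embedding, k=2):
    """Rank only the chunks that belong to `file_id` (hypothetical helper)."""
    candidates = [c for c in chunks if c.file_id == file_id]
    candidates.sort(
        key=lambda c: cosine_similarity(c.embedding, query_embedding),
        reverse=True,
    )
    return candidates[:k]


# Illustrative data: two files whose first chunks have identical embeddings.
store = [
    Chunk("file-a", "intro of A", [1.0, 0.0]),
    Chunk("file-a", "details of A", [0.6, 0.8]),
    Chunk("file-b", "intro of B", [1.0, 0.0]),
]
results = query_by_file_id(store, "file-a", [1.0, 0.0], k=1)
print([c.text for c in results])  # ['intro of A']
```

Even though the chunk from `file-b` matches the query equally well, it is never considered, because the search is restricted to one file ID before ranking.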
The API will evolve over time to employ different querying/re-ranking methods, embedding models, and vector stores.
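Configure a .env file based on the environment variables section below.
Set up the pgvector database with Docker:
docker compose up
(also starts RAG API)
or, to start only the database:
docker compose -f ./db-compose.yaml up
Run the API with Docker:
docker compose up
(also starts PSQL/pgvector)
or, to start only the RAG API:
docker compose -f ./api-compose.yaml up
To run the API locally, set DB_HOST to the correct database hostname, then run:
pip install -r requirements.txt
uvicorn main:app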
The following environment variables are required to run the application:
RAG_OPENAI_API_KEY: The API key for OpenAI API Embeddings (if using default settings). OPENAI_API_KEY will also work, but RAG_OPENAI_API_KEY overrides it so as not to conflict with the LibreChat setting.
RAG_OPENAI_BASEURL: (Optional) The base URL for your OpenAI API Embeddings.
RAG_OPENAI_PROXY: (Optional) Proxy for OpenAI API Embeddings.
VECTOR_DB_TYPE: (Optional) The vector database type. Defaults to pgvector.
POSTGRES_DB: (Optional) The name of the PostgreSQL database, used when VECTOR_DB_TYPE=pgvector.
POSTGRES_USER: (Optional) The username for connecting to the PostgreSQL database.
POSTGRES_PASSWORD: (Optional) The password for connecting to the PostgreSQL database.
DB_HOST: (Optional) The hostname or IP address of the PostgreSQL database server.
DB_PORT: (Optional) The port number of the PostgreSQL database server.
RAG_HOST: (Optional) The hostname or IP address where the API server will run. Defaults to "0.0.0.0".
RAG_PORT: (Optional) The port number where the API server will run. Defaults to port 8000.
JWT_SECRET: (Optional) The secret key used for verifying JWT tokens for requests.
COLLECTION_NAME: (Optional) The name of the collection in the vector store. Default value is "testcollection".
CHUNK_SIZE: (Optional) The size of the chunks for text processing. Default value is "1500".
CHUNK_OVERLAP: (Optional) The overlap between chunks during text processing. Default value is "100".
RAG_UPLOAD_DIR: (Optional) The directory where uploaded files are stored. Default value is "./uploads/".
PDF_EXTRACT_IMAGES: (Optional) A boolean value indicating whether to extract images from PDF files. Default value is "False".
DEBUG_RAG_API: (Optional) Set to "True" to show more verbose logging output in the server console and to enable the PostgreSQL database routes.
CONSOLE_JSON: (Optional) Set to "True" to log as JSON for Cloud Logging aggregations.
EMBEDDINGS_PROVIDER: (Optional) One of "openai", "bedrock", "azure", "huggingface", "huggingfacetei" or "ollama", where "huggingface" uses sentence_transformers. Defaults to "openai".
EMBEDDINGS_MODEL: (Optional) Set a valid embeddings model to use from the configured provider.
RAG_AZURE_OPENAI_API_VERSION: (Optional) The version of the Azure OpenAI API. Default is 2023-05-15.
RAG_AZURE_OPENAI_API_KEY: (Optional) The API key for Azure OpenAI service. AZURE_OPENAI_API_KEY will also work, but RAG_AZURE_OPENAI_API_KEY overrides it so as not to conflict with the LibreChat setting.
RAG_AZURE_OPENAI_ENDPOINT: (Optional) The endpoint URL for Azure OpenAI service, including the resource, for example https://YOUR_RESOURCE_NAME.openai.azure.com. AZURE_OPENAI_ENDPOINT will also work, but RAG_AZURE_OPENAI_ENDPOINT overrides it so as not to conflict with the LibreChat setting.
HF_TOKEN: (Optional) Set if needed for the huggingface option.
OLLAMA_BASE_URL: (Optional) Defaults to http://ollama:11434.
ATLAS_SEARCH_INDEX: (Optional) The name of the vector search index if using Atlas MongoDB. Defaults to vector_index.
MONGO_VECTOR_COLLECTION: Deprecated for MongoDB; use ATLAS_SEARCH_INDEX and COLLECTION_NAME instead.
AWS_DEFAULT_REGION: (Optional) Defaults to us-east-1.
AWS_ACCESS_KEY_ID: (Optional) Needed for bedrock embeddings.
AWS_SECRET_ACCESS_KEY: (Optional) Needed for bedrock embeddings.
Make sure to set these environment variables before running the application. You can set them in a .env file or as system environment variables.
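As a rough illustration of how these variables and their documented defaults fit together, the sketch below resolves a handful of them from an environment mapping. It is not the project's actual configuration code: `load_settings` is a hypothetical helper, and the string-to-boolean parsing rule is an assumption.

```python
import os


def load_settings(env=None):
    """Resolve a few settings from an environment mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    return {
        # RAG_OPENAI_API_KEY takes precedence over OPENAI_API_KEY when both are set.
        "openai_api_key": env.get("RAG_OPENAI_API_KEY") or env.get("OPENAI_API_KEY"),
        "vector_db_type": env.get("VECTOR_DB_TYPE", "pgvector"),
        "host": env.get("RAG_HOST", "0.0.0.0"),
        "port": int(env.get("RAG_PORT", "8000")),
        "collection_name": env.get("COLLECTION_NAME", "testcollection"),
        "chunk_size": int(env.get("CHUNK_SIZE", "1500")),
        "chunk_overlap": int(env.get("CHUNK_OVERLAP", "100")),
        "upload_dir": env.get("RAG_UPLOAD_DIR", "./uploads/"),
        # Assumption: booleans arrive as strings such as "True" or "False".
        "pdf_extract_images": env.get("PDF_EXTRACT_IMAGES", "False").lower() == "true",
        "embeddings_provider": env.get("EMBEDDINGS_PROVIDER", "openai"),
    }


# Placeholder values, passed as a plain dict instead of the real environment.
settings = load_settings(
    {"OPENAI_API_KEY": "sk-shared", "RAG_OPENAI_API_KEY": "sk-rag", "CHUNK_SIZE": "500"}
)
print(settings["openai_api_key"], settings["chunk_size"], settings["port"])
# sk-rag 500 8000
```

Passing a dict keeps the example independent of the machine's environment; calling `load_settings()` with no argument reads `os.environ` instead.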
Instead of the default pgvector, Atlas MongoDB can be used as the vector database. To do so, set the following environment variables:
VECTOR_DB_TYPE=atlas-mongo
ATLAS_MONGO_DB_URI=<mongodb+srv://...>
COLLECTION_NAME=<vector collection>
ATLAS_SEARCH_INDEX=<vector search index>
The ATLAS_MONGO_DB_URI can be the same as or different from the one used by LibreChat. Even if it is the same, the $COLLECTION_NAME collection must be a completely new one, separate from all collections used by LibreChat. In addition, create a vector search index for the collection above (remember to assign its name to $ATLAS_SEARCH_INDEX) with the following JSON:
{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    },
    {
      "path": "file_id",
      "type": "filter"
    }
  ]
}
Follow one of the four documented methods to create the vector index.
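If the configured embeddings model produces vectors of a different length, the numDimensions value in the index definition should presumably change to match. The sketch below generates the same JSON from Python so that value is a parameter; `build_vector_index_definition` is a hypothetical helper, not part of this project.

```python
import json

# Similarity functions accepted by Atlas Vector Search indexes.
ALLOWED_SIMILARITIES = {"cosine", "euclidean", "dotProduct"}


def build_vector_index_definition(num_dimensions=1536, similarity="cosine"):
    """Return the index definition shown above as a dict (hypothetical helper)."""
    if similarity not in ALLOWED_SIMILARITIES:
        raise ValueError(f"unsupported similarity: {similarity}")
    return {
        "fields": [
            {
                "numDimensions": num_dimensions,  # must match the embedding length
                "path": "embedding",
                "similarity": similarity,
                "type": "vector",
            },
            # Filter field that enables restricting a search to one file.
            {"path": "file_id", "type": "filter"},
        ]
    }


definition = build_vector_index_definition()
print(json.dumps(definition, indent=2))
```

The printed JSON can then be pasted into whichever index creation method is used.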
Make sure your RDS Postgres instance adheres to this requirement:
The pgvector extension version 0.5.0 is available on database instances in Amazon RDS running PostgreSQL 15.4-R2 and higher, 14.9-R2 and higher, 13.12-R2 and higher, and 12.16-R2 and higher in all applicable AWS Regions, including the AWS GovCloud (US) Regions.
To set up RDS Postgres with RAG API, follow these steps:
Create an RDS Instance/Cluster using the provided AWS Documentation.
Log in to the RDS Cluster using the Endpoint connection string from the RDS Console or from your IaC Solution output.
The login is via the Master User.
Create a dedicated database for rag_api:
create database rag_api;
Create a dedicated user/role for that database:
create role rag;
Switch to the database you just created:
\c rag_api
Enable the Vector extension:
create extension vector;
Use the documentation provided above to set up the connection string to the RDS Postgres Instance/Cluster.
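As one possible way to assemble that connection string from the variables described earlier, here is a minimal sketch. It is an assumption rather than the project's own code: `build_postgres_url` is a hypothetical helper, the RDS hostname is a placeholder, the `postgresql://` URL form is the generic one and may differ from what the application constructs internally, and 5432 is used as a fallback only because it is the standard PostgreSQL port.

```python
import os
from urllib.parse import quote_plus


def build_postgres_url(env=None):
    """Build a PostgreSQL URL from the documented variables (hypothetical helper)."""
    env = os.environ if env is None else env
    # Percent-encode credentials so characters like "@" or "/" stay unambiguous.
    user = quote_plus(env["POSTGRES_USER"])
    password = quote_plus(env["POSTGRES_PASSWORD"])
    host = env["DB_HOST"]
    port = env.get("DB_PORT", "5432")  # assumption: standard PostgreSQL port
    database = env["POSTGRES_DB"]
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


# Placeholder values; the hostname below is not a real endpoint.
url = build_postgres_url(
    {
        "POSTGRES_USER": "rag",
        "POSTGRES_PASSWORD": "p@ss/word",
        "DB_HOST": "mydb.example.us-east-1.rds.amazonaws.com",
        "POSTGRES_DB": "rag_api",
    }
)
print(url)
# postgresql://rag:p%40ss%2Fword@mydb.example.us-east-1.rds.amazonaws.com:5432/rag_api
```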
Notes:
create role x with superuser;
Run the following commands to install the pre-commit formatter, which uses the black code formatter:
pip install pre-commit
pre-commit install