An end-to-end sample of RAG showcasing development, evaluation, experimentation, and deployment using PromptFlow and search products such as Cosmos DB, PostgreSQL, and Azure AI Search
MIT License
The overall intention of this repo is to host the end-to-end process for building RAG applications, showcasing development, evaluation, experimentation, and deployment aspects using PromptFlow, Azure AI Studio, and other Azure data products.
The repo currently hosts a single use case showing how to use PromptFlow and Azure AI for development, evaluation, and deployment of a RAG-based chatbot for question answering over financial transcripts. In this sample, we also demonstrate how to use other Azure database offerings for vector search.
Before you begin, ensure that you have the following installed on your machine:

- Python 3.9 or later
- VSCode
- PromptFlow extension for VSCode
- Docker
a. Steps to run the RAG app locally in VSCode:

1. Create or update the conda environment:

```shell
conda env update -f environment.yaml
```

2. Log in to Azure:

```shell
az login
```

Then make a copy of `.env.sample` and rename it to `.env`. Use this file to decide whether keys are read from the `.env` file itself or from Azure Key Vault. The keys will be used for preprocessing in step 3 and for creating PromptFlow connections in step 4.

```
# Use .env or keyvault. ENV or empty = .env, KEYVAULT = keyvault
KEYS_FROM="ENV"
KEY_VAULT_NAME=""
```

NOTE: Our convention is that variable names from Key Vault use a hyphen (`-`), while those from a `.env` file use an underscore (`_`), e.g. `OPENAI-API-BASE` vs. `OPENAI_API_BASE`.
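As an illustrative sketch only (the helper name `get_key` and its logic are assumptions, not the repo's actual code), the `KEYS_FROM` switch and the dash/underscore naming convention could be honored like this:

```python
import os

def get_key(name: str) -> str:
    """Resolve a secret by its .env-style name (e.g. OPENAI_API_BASE).

    Hypothetical helper: reads from the environment when KEYS_FROM is
    ENV or empty, otherwise from Azure Key Vault, converting the name
    to the dash-separated Key Vault convention (OPENAI-API-BASE).
    """
    if os.getenv("KEYS_FROM", "ENV").upper() == "KEYVAULT":
        # Requires the azure-identity and azure-keyvault-secrets packages.
        from azure.identity import DefaultAzureCredential
        from azure.keyvault.secrets import SecretClient

        vault_url = f"https://{os.environ['KEY_VAULT_NAME']}.vault.azure.net"
        client = SecretClient(vault_url, DefaultAzureCredential())
        return client.get_secret(name.replace("_", "-")).value
    return os.environ[name]
```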
3. Go to the `preprocessing/` folder and run the preprocessing notebooks to create a new vector database index if you haven't done so already.

4. Create the PromptFlow connections by following the instructions in the `connections/` directory.

5. In the `rag-<vector-search>` directory, open `flow.dag.yaml` visually, then choose the connections you created in any nodes that show a warning.

For vector search, you may use Azure AI Search or the native vector search capabilities of Cosmos DB and PostgreSQL Flexible Server. Currently, we include complete flows for Azure AI Search (previously Azure Cognitive Search), Cosmos DB for MongoDB vCore, Cosmos DB for PostgreSQL, and PostgreSQL Flexible Server, as shown in the `rag-azure-search`, `rag-cosmos-mongo`, `rag-cosmos-postgresql-pgvector`, and `rag-postgresql-flex-pgvector` directories, respectively.
Note: You will find two yaml files in the `rag-<vector-search>` folder. `flow.dag.yaml` is the main yaml file that orchestrates the various app components, such as retrievals, LLM calls, etc. In addition, you will find the hyperparameters of the flow in `param_config.yaml`. As an example, changes you make to `topK` in the `param_config.yaml` file will be reflected in `flow.dag.yaml` at runtime.
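For example, with a hypothetical `param_config.yaml` like the one inlined below (the actual keys in the repo may differ), the hyperparameters can be read with PyYAML:

```python
import yaml  # PyYAML

# Hypothetical param_config.yaml contents; the repo's actual schema may differ.
PARAM_CONFIG = """
topK: 5
maxTokens: 512
"""

params = yaml.safe_load(PARAM_CONFIG)
print(params["topK"])  # value that flow.dag.yaml picks up at runtime
```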
b. Steps for batch evaluation in VSCode:

In the `rag-<vector-search>` directory, open `flow.dag.yaml`. Set `topK` and `maxTokens` to use default values or provide integers for the desired values. Do not select them from data mapping, as they will not be available there.
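Hyperparameters such as `topK` and `maxTokens` are typically evaluated over several candidate values. A stdlib-only sketch of expanding a small grid into individual run configurations (the grid values here are made up; the schema of the repo's `run_config.yaml` may differ):

```python
from itertools import product

# Made-up hyperparameter grid; adjust to your own configuration values.
grid = {"topK": [3, 5, 10], "maxTokens": [256, 512]}

# One dict per combination, e.g. {"topK": 3, "maxTokens": 256}, ...
configs = [dict(zip(grid, combo)) for combo in product(*grid.values())]
print(len(configs))  # 6 configurations
```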
c. Steps for experimentation in VSCode:

Open the `flow.dag.yaml` file. Locate a prompt node and clone it; this creates a new variant and an associated jinja file. Make your changes to the prompt in the jinja file. You may also change OpenAI variables, such as temperature, in the cloned node in `flow.dag.yaml`. You may create multiple variants for cloneable nodes. Then save the file.

d. Steps for batch experimentation using the Python SDK:
Open the `batchRunEvaluations.ipynb` notebook and run through the cells. Note: to set up the configs for the batch experimentation runs, you may modify the `run_config.yaml` file for batch evaluations on several configurations, as shown in the last section of the notebook. The evaluation dataset is `evalset.csv`, which contains 10 human-curated pairs of questions and answers. Please refer to the readme file in the datasets subdirectory for alternative datasets.

e. Steps for docker deployment:
Pre-requisite: Docker. You can get Docker from here.
```shell
pf flow build --source ./rag-<vector-search> --output deploy --format docker
```

Note: replace `<vector-search>` with one of the available search options: (1) azure-search, (2) cosmos-mongo, (3) cosmos-postgresql-pgvector, (4) postgresql-flex-pgvector.
Note: the deploy folder is where the LLM app is packaged.

Inspect the `requirements.txt` file in the `deploy/flow` directory.

Inspect the connection files in `deploy/connections` and double-check information such as `api_base` and `api_version`.

Build the docker image by running the following command in the `financial_transcripts/` folder:

```shell
docker build deploy -t rag-app-serve
```
Run the container with `docker run`. Make sure to add the secret values in the command below:

```shell
docker run -p 8080:8080 -e AOAI_CONNECTION_API_KEY=<secret-value> -e ACS_CONNECTION_API_KEY=<secret-value> rag-app-serve
```
Note: check the port mapping and change if needed.
f. Steps for webapp deployment:

First, provision an Azure Container Registry and an App Service plan. Then follow the steps from PromptFlow's official documentation for webapp deployment: https://microsoft.github.io/promptflow/cloud/azureai/deploy-to-azure-appservice.html.

If you wish to use the Azure CLI, please refer to the instructions in Deployment via Azure CLI Instructions for provisioning and deploying your webapp.

Please note that if you have your own frontend, you can use the deployed web app as an endpoint and integrate it with your frontend. See the example snippet below for making requests.
```python
import requests
import json

url = 'https://<RAG-WEBAPP-NAME>.azurewebsites.net/score'
data = {
    "query": "what was azure ML revenue in FY23Q2?"
}
headers = {"Content-Type": "application/json"}

response = requests.post(url, data=json.dumps(data), headers=headers)
print(response.text)
```
NOTE: additional information about client interaction with PromptFlow serving and streaming can be found at `financial_transcripts/deployment_utilities/test_client`.
Microsoft Azure databases, such as Cosmos DB for MongoDB vCore, Cosmos DB for NoSQL, and Azure Database for PostgreSQL Flexible Server, also offer vector search capabilities that can be used in lieu of Azure AI Search. Please refer to the `vectordb-tools` directory for instructions. You may also refer to the RAG samples for an end-to-end implementation on financial transcripts.
| Tool | Description | End-to-End RAG sample |
|---|---|---|
| cosmos-mongodbvcore | Cosmos DB MongoDB vCore | rag-cosmos-mongo |
| cosmos-postgresql-pgvector | Cosmos DB PostgreSQL with pgvector extension | rag-cosmos-postgresql-pgvector |
| postgresql-flex-pgvector | Azure DB PostgreSQL Flexible Server with pgvector extension | rag-postgresql-flex-pgvector |
| cosmos-nosql | Cosmos DB NoSQL | rag-cosmos-nosql |
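Under the hood, each of these options performs the same core operation: nearest-neighbor search over embedding vectors. A minimal, dependency-free sketch of cosine-similarity ranking (the services do this server-side over an index, far more efficiently):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def top_k(query_vec, docs, k=3):
    """docs: list of (doc_id, embedding). Returns ids of the k most similar docs."""
    ranked = sorted(docs, key=lambda d: cosine_similarity(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

In practice the embeddings come from an embedding model, and the search is delegated to the database's vector index (e.g. pgvector's `<=>` cosine-distance operator).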
For any contributions, please make sure to check the Python formatting with the Black formatter.

Let's expand this repo to more interesting and complex RAG applications.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.