An end-to-end sample of RAG showcasing development, evaluation, experimentation, and deployment using PromptFlow and search products such as Cosmos DB, PostgreSQL, and Azure AI Search
MIT License
The overall intention of this repo is to host the end-to-end process for building RAG applications, showcasing development, evaluation, experimentation, and deployment aspects using PromptFlow, Azure AI Studio, and other Azure data products.
The repo currently hosts a single use case showing how to use PromptFlow and Azure AI for development, evaluation, and deployment of a RAG-based chatbot for question answering over financial transcripts. In this sample, we also demonstrate how to use other Azure database offerings for vector search.
Before you begin, ensure that you have the following installed on your machine:

- Python 3.9 or later
- VSCode
- PromptFlow extension for VSCode
- Docker
a. Steps to run the RAG app locally in VSCode:

1. Create or update the conda environment:

```shell
conda env update -f environment.yaml
```

2. Log in to Azure:

```shell
az login
```

Then make a copy of `.env.sample` and rename it to `.env`. Use this file to decide whether keys are read from the `.env` file itself or from Azure Key Vault. The keys will be used for preprocessing in step 3 and for creating PromptFlow connections in step 4.

```
# Use .env or keyvault. ENV or empty = .env, KEYVAULT = keyvault
KEYS_FROM="ENV"
KEY_VAULT_NAME=""
```

NOTE: Our convention is that variable names from Key Vault use a hyphen (`-`), while those from a `.env` file use an underscore (`_`), e.g. `OPENAI-API-BASE` vs. `OPENAI_API_BASE`.
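As an illustrative sketch only (the helper name `get_key` and its logic are assumptions, not the repo's actual code), the `KEYS_FROM` switch and the dash/underscore naming convention could be honored like this:

```python
import os

def get_key(name: str) -> str:
    """Resolve a secret by its .env-style name (e.g. OPENAI_API_BASE).

    Hypothetical helper: reads from the environment when KEYS_FROM is
    ENV or empty, otherwise from Azure Key Vault, converting the name
    to the dash-separated Key Vault convention (OPENAI-API-BASE).
    """
    if os.getenv("KEYS_FROM", "ENV").upper() == "KEYVAULT":
        # Requires the azure-identity and azure-keyvault-secrets packages.
        from azure.identity import DefaultAzureCredential
        from azure.keyvault.secrets import SecretClient

        vault_url = f"https://{os.environ['KEY_VAULT_NAME']}.vault.azure.net"
        client = SecretClient(vault_url, DefaultAzureCredential())
        return client.get_secret(name.replace("_", "-")).value
    return os.environ[name]
```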
3. Go to the `preprocessing/` folder and run the preprocessing notebooks to create a new vector database index if you haven't done so already.

4. Create the PromptFlow connections by following the instructions in the `connections/` directory.

5. In the `rag-<vector-search>` directory, open `flow.dag.yaml` visually, then choose the connections you created in any nodes that show a warning.

For vector search, you may use Azure AI Search or the native vector search capabilities of Cosmos DB and PostgreSQL Flexible Server. Currently, we include complete flows for Azure AI Search (previously Azure Cognitive Search), Cosmos DB for MongoDB vCore, Cosmos DB for PostgreSQL, and PostgreSQL Flexible Server, as shown in the `rag-azure-search`, `rag-cosmos-mongo`, `rag-cosmos-postgresql-pgvector`, and `rag-postgresql-flex-pgvector` directories, respectively.
Note: You will find two yaml files in the `rag-<vector-search>` folder. `flow.dag.yaml` is the main yaml file that orchestrates the various app components, such as retrievals, LLM calls, etc. In addition, you will find the hyperparameters of the flow in `param_config.yaml`. As an example, changes you make to `topK` in the `param_config.yaml` file will be reflected in `flow.dag.yaml` at runtime.
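For example, with a hypothetical `param_config.yaml` like the one inlined below (the actual keys in the repo may differ), the hyperparameters can be read with PyYAML:

```python
import yaml  # PyYAML

# Hypothetical param_config.yaml contents; the repo's actual schema may differ.
PARAM_CONFIG = """
topK: 5
maxTokens: 512
"""

params = yaml.safe_load(PARAM_CONFIG)
print(params["topK"])  # value that flow.dag.yaml picks up at runtime
```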
b. Steps for batch evaluation in VSCode:

In the `rag-<vector-search>` directory, open `flow.dag.yaml`. Set `topK` and `maxTokens` to use default values or provide integers for the desired values. Do not select them from data mapping, as they will not be available there.
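Hyperparameters such as `topK` and `maxTokens` are typically evaluated over several candidate values. A stdlib-only sketch of expanding a small grid into individual run configurations (the grid values here are made up; the schema of the repo's `run_config.yaml` may differ):

```python
from itertools import product

# Made-up hyperparameter grid; adjust to your own configuration values.
grid = {"topK": [3, 5, 10], "maxTokens": [256, 512]}

# One dict per combination, e.g. {"topK": 3, "maxTokens": 256}, ...
configs = [dict(zip(grid, combo)) for combo in product(*grid.values())]
print(len(configs))  # 6 configurations
```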
c. Steps for experimentation in VSCode:

Open the `flow.dag.yaml` file. Locate a prompt node and clone it; this creates a new variant and an associated jinja file. Make your changes to the prompt in the jinja file. You may also change OpenAI variables, such as temperature, in the cloned node in `flow.dag.yaml`. You may create multiple variants for cloneable nodes. Then save the file.

d. Steps for batch experimentation using the Python SDK:
Open the `batchRunEvaluations.ipynb` notebook and run through the cells. Note: to set up the configs for the batch experimentation runs, you may modify the `run_config.yaml` file for batch evaluations on several configurations, as shown in the last section of the notebook. The evaluation dataset is `evalset.csv`, which contains 10 human-curated pairs of questions and answers. Please refer to the readme file in the datasets subdirectory for alternative datasets.

e. Steps for docker deployment:
Pre-requisite: Docker. You can get Docker from here.
```shell
pf flow build --source ./rag-<vector-search> --output deploy --format docker
```

Note: replace `<vector-search>` with one of the available search options: (1) azure-search, (2) cosmos-mongo, (3) cosmos-postgresql-pgvector, (4) postgresql-flex-pgvector.
Note: the deploy folder is where the LLM app is packaged.

Inspect the `requirements.txt` file in the `deploy/flow` directory.

Inspect the connection files in `deploy/connections` and double-check information such as `api_base` and `api_version`.

Build the docker image by running the following command in the `financial_transcripts/` folder:

```shell
docker build deploy -t rag-app-serve
```
Run the container with `docker run`. Make sure to add the secret values in the command below:

```shell
docker run -p 8080:8080 -e AOAI_CONNECTION_API_KEY=<secret-value> -e ACS_CONNECTION_API_KEY=<secret-value> rag-app-serve
```
Note: check the port mapping and change if needed.
f. Steps for webapp deployment:

First, provision an Azure Container Registry and an App Service plan. Then follow the steps from PromptFlow's official documentation for webapp deployment: https://microsoft.github.io/promptflow/cloud/azureai/deploy-to-azure-appservice.html.

If you wish to use the Azure CLI, please refer to the instructions in Deployment via Azure CLI Instructions for provisioning and deploying your webapp.

Please note that if you have your own frontend, you can use the deployed web app as an endpoint and integrate it with your frontend. See the example snippet below for making requests.
```python
import requests
import json

url = 'https://<RAG-WEBAPP-NAME>.azurewebsites.net/score'
data = {
    "query": "what was azure ML revenue in FY23Q2?"
}
headers = {"Content-Type": "application/json"}

response = requests.post(url, data=json.dumps(data), headers=headers)
print(response.text)
```
NOTE: additional information about client interaction with PromptFlow serving and streaming can be found at `financial_transcripts/deployment_utilities/test_client`.
Microsoft Azure databases, such as Cosmos DB for MongoDB vCore, Cosmos DB for NoSQL, and Azure Database for PostgreSQL Flexible Server, also offer vector search capabilities that can be used in lieu of Azure AI Search. Please refer to the `vectordb-tools` directory for instructions. You may also refer to the RAG samples for an end-to-end implementation on financial transcripts.
| Tool | Description | End-to-End RAG sample |
|---|---|---|
| cosmos-mongodbvcore | Cosmos DB MongoDB vCore | rag-cosmos-mongo |
| cosmos-postgresql-pgvector | Cosmos DB PostgreSQL with pgvector extension | rag-cosmos-postgresql-pgvector |
| postgresql-flex-pgvector | Azure DB PostgreSQL Flexible Server with pgvector extension | rag-postgresql-flex-pgvector |
| cosmos-nosql | Cosmos DB NoSQL | rag-cosmos-nosql |
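Under the hood, each of these options performs the same core operation: nearest-neighbor search over embedding vectors. A minimal, dependency-free sketch of cosine-similarity ranking (the services do this server-side over an index, far more efficiently):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def top_k(query_vec, docs, k=3):
    """docs: list of (doc_id, embedding). Returns ids of the k most similar docs."""
    ranked = sorted(docs, key=lambda d: cosine_similarity(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

In practice the embeddings come from an embedding model, and the search is delegated to the database's vector index (e.g. pgvector's `<=>` cosine-distance operator).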
For any contributions, please make sure to check the Python formatting with the Black formatter.

Let's expand this repo to more interesting and complex RAG applications.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.