MIT License
This repository was created as a submission to the Streamlit LLM hackathon.
The idea is to present multiple ways of improving the Cypher-generating capabilities of LLMs, and thereby RAG applications built on knowledge graphs such as Neo4j. LangChain is used for all LLM integrations and functionality.
A demo is available on Streamlit Community Cloud: https://vc-chatbot.streamlit.app/
You need to have access to GPT-4 in order for the demo to work!
A demo database is running on demo.neo4jlabs.com. It contains companies, their subsidiaries, people related to the companies, and articles that mention the companies. The database is a subset of the Diffbot knowledge graph. You can access it with the following credentials:
URI: neo4j+s://demo.neo4jlabs.com
username: companies
password: companies
database: companies
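As a quick way to check that the credentials above work, the following is a minimal sketch using the official `neo4j` Python driver (assumed to be installed with `pip install neo4j`, version 5.8 or later). The `DemoConnection` class and the `count_nodes` helper are illustrative names, not part of this repository, and the query simply counts all nodes rather than relying on any particular label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoConnection:
    """Connection settings for the public demo database (values from this README)."""
    uri: str = "neo4j+s://demo.neo4jlabs.com"
    username: str = "companies"
    password: str = "companies"
    database: str = "companies"


def count_nodes(conn: DemoConnection) -> int:
    """Return the total number of nodes in the configured database.

    The import is local so that the settings above can be used
    even when the driver is not installed.
    """
    from neo4j import GraphDatabase  # assumption: neo4j driver >= 5.8

    with GraphDatabase.driver(conn.uri, auth=(conn.username, conn.password)) as driver:
        records, _summary, _keys = driver.execute_query(
            "MATCH (n) RETURN count(n) AS nodes",
            database_=conn.database,
        )
        return records[0]["nodes"]


if __name__ == "__main__":
    # Requires network access to demo.neo4jlabs.com.
    print(count_nodes(DemoConnection()))
```

If the call succeeds, the same settings can be reused wherever the application expects a Neo4j URI, username, password, and database name.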
The database contains both structured information about organizations and people as well as news articles.
The news articles are linked to the mentioned entity, while the actual text is stored in the Chunk
nodes alongside their text-embedding-ada-002 vector representations.
You can ask questions related to companies, such as their board members, suppliers, competitors, subsidiaries, and investors. Additionally, you can ask questions regarding the news about those organizations, or just search through news in general using semantic search.
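To illustrate what semantic search over the stored embeddings means, here is a minimal, dependency-free sketch that ranks text chunks by cosine similarity to a query vector. It is not the code used in this repository: the function names and the three-dimensional toy vectors are hypothetical, and real text-embedding-ada-002 vectors have 1536 dimensions. In practice the comparison would be delegated to the database rather than done in Python.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_embedding: Sequence[float],
    chunks: Sequence[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[str]:
    """Return the texts of the k chunks most similar to the query embedding."""
    scored = [(cosine_similarity(query_embedding, emb), text) for text, emb in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _score, text in scored[:k]]


# Hypothetical toy data standing in for Chunk nodes and their embeddings.
TOY_CHUNKS = [
    ("Company A acquires a robotics startup", [0.9, 0.1, 0.0]),
    ("Quarterly earnings beat expectations", [0.1, 0.9, 0.1]),
    ("New board member appointed", [0.0, 0.2, 0.9]),
]

if __name__ == "__main__":
    print(top_k_chunks([1.0, 0.0, 0.0], TOY_CHUNKS, k=1))
```

The question is embedded with the same model as the chunks, so that nearby vectors correspond to related text.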
The code includes a couple of improvements to the original LangChain GraphCypherQAChain:
To set up a local database replicating the dataset used in the demo, you need to follow these steps:
Run the import.cql script. You need to provide the openai_api_key to calculate the few-shot example embedding values.
Then you can run the Streamlit application by installing the requirements and setting the Streamlit secrets for the following variables:
And optionally NEO4J_DATABASE. You can then start the Streamlit application by running:
streamlit run src/app.py
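Because NEO4J_DATABASE is optional, the application needs a fallback when the secret is absent. The sketch below shows one way to handle that. The helper name `resolve_database` is hypothetical, and falling back to `neo4j` (the default database name in a standard Neo4j installation) is an assumption rather than confirmed behavior of this repository.

```python
from typing import Any, Mapping

# Assumption: fall back to Neo4j's standard default database name.
DEFAULT_DATABASE = "neo4j"


def resolve_database(secrets: Mapping[str, Any]) -> str:
    """Return the configured NEO4J_DATABASE, or the default if unset or blank."""
    value = str(secrets.get("NEO4J_DATABASE", "") or "")
    return value.strip() or DEFAULT_DATABASE


if __name__ == "__main__":
    print(resolve_database({}))
    print(resolve_database({"NEO4J_DATABASE": "companies"}))
```

Inside a Streamlit app, the same helper could be called as `resolve_database(st.secrets)`, since `st.secrets` can be read like a mapping.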
Contributions are welcome in the form of pull requests.