ChatPDF-GPT is an innovative chat interface application powered by LangChain and OpenAI, allowing users to upload and chat with PDF documents, stored in Pinecone vector database and Supabase storage.
MIT License
ChatPDF-GPT is built on LangChain, a framework for developing applications powered by language models. The application uses LangChain to provide a chat interface over PDF documents, driven by OpenAI's language models.
In this project, the language model is connected to other data sources and allows interaction with its environment, thus embodying the principles of the LangChain framework. Users can upload a PDF document, which is then processed and saved in Pinecone, a vector database, and Supabase storage. Users can then chat with the uploaded PDF, with the AI utilizing the content of the document to engage in a meaningful conversation.
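The upload-then-chat flow described above can be pictured with a small, self-contained sketch. This is not the project's implementation: a toy word-count "embedding" stands in for OpenAI embeddings, an in-memory array stands in for Pinecone, and every function name here (chunkText, embed, indexDocument, retrieve) is hypothetical. It only illustrates the general idea of splitting a document into chunks, turning them into vectors, and picking the chunks closest to a question.

```typescript
// Illustrative sketch only. The toy embedding below replaces a real
// embedding model, and the array returned by indexDocument replaces a
// vector database. All names are hypothetical.

type Vector = Record<string, number>;
type Chunk = { id: number; text: string; vector: Vector };

// Split text into chunks of at most `size` words.
function chunkText(text: string, size: number): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += size) {
    chunks.push(words.slice(i, i + size).join(" "));
  }
  return chunks;
}

// Toy embedding: frequency of each lowercase alphanumeric word.
function embed(text: string): Vector {
  // Object.create(null) avoids collisions with inherited keys such as "constructor".
  const vector: Vector = Object.create(null);
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    vector[word] = (vector[word] ?? 0) + 1;
  }
  return vector;
}

// Cosine similarity between two sparse vectors (0 when either is empty).
function cosine(a: Vector, b: Vector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const word of Object.keys(a)) {
    dot += a[word] * (b[word] ?? 0);
    normA += a[word] * a[word];
  }
  for (const word of Object.keys(b)) {
    normB += b[word] * b[word];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// "Upload" step: chunk the document text and embed each chunk.
function indexDocument(text: string, size = 50): Chunk[] {
  return chunkText(text, size).map((t, id) => ({ id, text: t, vector: embed(t) }));
}

// "Chat" step: return the topK chunks most similar to the question.
function retrieve(index: Chunk[], question: string, topK = 2): Chunk[] {
  const q = embed(question);
  return [...index]
    .sort((x, y) => cosine(y.vector, q) - cosine(x.vector, q))
    .slice(0, topK);
}
```

In a real pipeline the retrieved chunks would be passed to the language model as context for its answer; that step is omitted here because it depends on the model API in use.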
The project relies on the Next.js framework, a leading choice for creating robust, full-stack Web applications. The UI components are beautifully crafted using the Radix UI library and styled with Tailwind CSS, based on the elegant template provided by shadcn/ui.
ChatPDF-GPT is equipped with examples that illustrate various operations such as:
To test this project using the demo, you will need to provide your own credentials for OpenAI, Supabase, and Pinecone. For Supabase, you can follow the step-by-step guide below to set up your project and retrieve the necessary credentials. For OpenAI and Pinecone, no step-by-step guide is provided here, so please consult their documentation. Always follow the latest instructions provided by the respective services.
Creating a New Project in Supabase:
Retrieving the Database Connection URL:
This connection string will be used for the DATABASE_URL environment variable in your application.
This URL will be used for the DIRECT_URL environment variable in your application.
SUPABASE_URL and SUPABASE_KEY. Copy these values. The SUPABASE_URL is the URL for your project, while SUPABASE_KEY is the public anonymous key for your project.
Setting up the Supabase Bucket:
SUPABASE_BUCKET in your application.
Setting Up Environment Variables in Your Application:
DATABASE_URL
DIRECT_URL
SUPABASE_KEY
SUPABASE_URL
SUPABASE_BUCKET
These keys will allow your application to interact with the Supabase services.
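Because a missing variable usually surfaces only when the first Supabase call fails, it can help to check all five at startup. The following is a minimal sketch of such a check; the variable names come from the list above, while the helper functions (missingSupabaseVars, assertSupabaseEnv) are hypothetical and not part of the project.

```typescript
// Variable names are taken from the list above; the helpers are hypothetical.
const REQUIRED_SUPABASE_VARS = [
  "DATABASE_URL",
  "DIRECT_URL",
  "SUPABASE_KEY",
  "SUPABASE_URL",
  "SUPABASE_BUCKET",
] as const;

type EnvSource = Record<string, string | undefined>;

// Return the names of required variables that are missing or blank.
function missingSupabaseVars(env: EnvSource): string[] {
  return REQUIRED_SUPABASE_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throw one descriptive error listing every missing variable.
function assertSupabaseEnv(env: EnvSource): void {
  const missing = missingSupabaseVars(env);
  if (missing.length > 0) {
    throw new Error("Missing environment variables: " + missing.join(", "));
  }
}
```

In a Node.js or Next.js server context, one option is to call assertSupabaseEnv(process.env) once during startup so that a misconfigured .env file is reported immediately.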
Please note that while it's possible to set a policy that makes your storage bucket publicly accessible, you should do this with caution. Making your bucket publicly accessible means that anyone with the URL to an object can access it. This might be useful for testing, but for production applications, you should consider more restrictive policies to ensure the security of your data. Always consult the Supabase documentation or a security expert to understand the implications of different policies.
With this, you should be able to set up Supabase for your project and manage storage policies as per your requirements.
To set up and run ChatPDF-GPT on your local machine, follow the steps below:
Clone the project repository:
git clone https://github.com/anis-marrouchi/chatpdf-gpt.git
Navigate into the project directory and install the dependencies using pnpm:
cd chatpdf-gpt
pnpm install
Create a .env file in the root directory and fill in your credentials (OpenAI, Pinecone, Supabase) as indicated in the .env.example file.
Create the database schema using Prisma. Make sure you have run the prisma generate command first:
npx prisma generate
npx prisma migrate dev --name init
Start the server:
npm run dev
ChatPDF-GPT is an open-source project and we warmly welcome contributions from everyone. Please read our contributing guide for more details on how to get started.
This project stands on the shoulders of giants. Our work would not be possible without the vast array of libraries, frameworks, and tools that the open source community has produced. Specifically, we would like to express our appreciation to:
The LangChain team for their groundbreaking framework for applications powered by language models.
OpenAI for their state-of-the-art language models, which make the chat functionality possible.
Supabase for their open-source Firebase alternative which we used to build secure and performant backends.
Pinecone for their vector database that allows easy and efficient storage and retrieval of vector embeddings.
Next.js and Vercel for their comprehensive framework which allowed us to build this full-stack Web application with ease.
shadcn for their elegant UI components which we built upon to create a beautiful and user-friendly interface.
Radix UI for their robust, accessible and customizable component library that forms the backbone of our UI.
@react-pdf-viewer for their powerful React component, which lets users preview the actual PDF document they are interacting with.
And all the other dependencies, both listed and not listed, that contributed to the realization of this project. Our contribution is modest in comparison to their collective effort.
ChatPDF-GPT is open-source software licensed under the MIT license.