Generative AI - Chat with your data

Overview


In the digital age, we can use the power of generative AI to build innovative custom solutions that let users interact with their documents or websites through natural language conversations.

This blog post focuses on a simple use case: the user can chat with an uploaded PDF, JSON data, or an existing website in natural language using RAG (Retrieval-Augmented Generation).

The entire stack is built on open-source technologies and free-tier services, so it can be implemented without having to pay a dime.

Tech stack:

  • Node.js - popular JavaScript runtime for building backend applications
  • LangChain JS - JavaScript framework for developing applications powered by language models
  • Cohere - LLM provider with a free trial API
  • MongoDB Atlas - an integrated suite of data services centered around a cloud database that can also store vector embeddings

Pre-Requisites

  • Node.js - install version 18 or above.
  • Cohere API key - sign up for Cohere and procure an API key (free of cost)
  • MongoDB Atlas - sign up for a free MongoDB Atlas account (free of cost)

Database Setup

As stated above, we will be using MongoDB Atlas to store the vector embeddings.

For this purpose, we will have to create a database in our MongoDB Atlas account and create a vector search index.

Steps to be followed: 
  1. Log on to your MongoDB Atlas account UI and click on Overview.
  2. Create a free cluster and database user using the following guide: https://www.mongodb.com/docs/atlas/tutorial/deploy-free-tier-cluster/
  3. Navigate to Browse Collections, click on Create Database, and create a database with a collection named "vector-embeddings" (this collection will hold the embeddings).
  4. Create another collection "chat_history" within the same database.
  5. Navigate to the Atlas Search tab (it may also be named Search & Vector Search), click on Create Search Index, and enter the following JSON definition. numDimensions is 1024 because Cohere's embed-multilingual-v3.0 model produces 1024-dimensional vectors:
    
{
  "fields": [
    {
      "numDimensions": 1024,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    },
    {
      "path": "source",
      "type": "filter"
    }
  ]
}


    6. Note the connection string.

Get the connection string by navigating to Database -> Connect.
Connection strings normally have the following format:
mongodb+srv://<username>:<password>@<cluster dns>/?retryWrites=true&appName=Cluster0

The connection string can be tested using a tool like MongoDB Compass, a database client for connecting to the MongoDB database from your local machine.
Note down the connection string, as it needs to be set as an environment variable in the project.
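
Alternatively, the connection string can be verified in a few lines with the official MongoDB Node.js driver. This is just a minimal sanity check, not part of the project code:

// Minimal connectivity check using the official MongoDB Node.js driver
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGO_CONNECTION_STRING);
await client.connect();
await client.db("admin").command({ ping: 1 }); // round-trip to the cluster
console.log("Successfully connected to MongoDB Atlas");
await client.close();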

Project Setup

  • Clone the Repo:
           Repo
  • Create a .env file in the root folder with the following entries:
          COHERE_API_KEY=<value of your API key for Cohere>
          MONGO_CONNECTION_STRING=<value of the connection string for MongoDB>
  • Execute the application:
npm install
npm run dev

API Endpoints

The API endpoints can be accessed using the URL:

http://localhost:5000/api-docs/

API Walk-through

Assuming that the application is up and running on your local machine and the necessary prerequisites, including the Cohere API key and the MongoDB connection string, have been set in the .env file,
we will walk through the various API endpoints used in the application:

Training the AI Model

The API provides three endpoints to train your AI model:

/api/train/train-using-pdf:

The endpoint is used to train your AI model by providing a PDF document as input.
A sample PDF has been provided in the uploads folder. You can train the model using the sample "lunch.pdf" file.

The API endpoint performs the following operations:
  1. Process the uploaded PDF
  2. Split the PDF data into chunks using LangChain's RecursiveCharacterTextSplitter
  3. Generate vector embeddings using Cohere's embed-multilingual-v3.0 model
  4. Save the vector embeddings to the MongoDB collection 'vector-embeddings' created in the previous step
A sketch of this flow is shown below.
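
The whole flow can be sketched in a few lines of LangChain JS. This is a minimal illustration rather than the app's actual code: the import paths, the index name vector_index, the database placeholder, and the chunking parameters are assumptions that depend on your LangChain version and Atlas setup:

// Sketch of the PDF training flow (package paths, index name, and options are assumptions)
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { CohereEmbeddings } from "@langchain/cohere";
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGO_CONNECTION_STRING);
await client.connect();
const collection = client.db("<your-db>").collection("vector-embeddings");

// 1. Process the uploaded PDF
const docs = await new PDFLoader("uploads/lunch.pdf").load();

// 2. Split the PDF data into chunks
const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000, chunkOverlap: 200 });
const chunks = await splitter.splitDocuments(docs);
chunks.forEach((c) => { c.metadata.source = "pdf"; }); // tag the origin of each chunk

// 3 + 4. Generate Cohere embeddings and save them to the Atlas collection
await MongoDBAtlasVectorSearch.fromDocuments(
  chunks,
  new CohereEmbeddings({ model: "embed-multilingual-v3.0" }),
  { collection, indexName: "vector_index", embeddingKey: "embedding" }
);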


The saved vector embeddings can be viewed using a tool like MongoDB Compass.



/api/train/train-using-website:

The endpoint is used to train your AI model by providing a fully formed website URL as its source,
e.g. "https://www.mathema.de".

The API endpoint performs the following operations (see the sketch after this list):
  1. Process the website data using LangChain's CheerioWebBaseLoader
  2. Split and transform the processed website data into chunks using LangChain's RecursiveCharacterTextSplitter.fromLanguage('html') and HtmlToTextTransformer
  3. Generate vector embeddings using Cohere's embed-multilingual-v3.0 model
  4. Save the vector embeddings to the MongoDB collection 'vector-embeddings' created in the previous step
The saved vector embeddings can again be viewed using a tool like MongoDB Compass.
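
The loading and splitting steps differ from the PDF flow only in the loader and splitter. A minimal sketch, with package paths again being assumptions:

// Sketch of the website training flow (package paths are assumptions)
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { HtmlToTextTransformer } from "@langchain/community/document_transformers/html_to_text";

// 1. Load the raw HTML of the page
const docs = await new CheerioWebBaseLoader("https://www.mathema.de").load();

// 2. Split along HTML structure, then strip the markup down to plain text
const splitter = RecursiveCharacterTextSplitter.fromLanguage("html");
const transformer = new HtmlToTextTransformer();
const chunks = await splitter.pipe(transformer).invoke(docs);
chunks.forEach((c) => { c.metadata.source = "website"; });

// 3 + 4. Embedding and persistence are identical to the PDF flow above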

Note: we are using the same MongoDB collection to store the vector embeddings for both the PDF and the website.
An additional property "source" in the collection differentiates between the multiple vector embeddings.

/api/train/train-using-json:

The endpoint is used to train your AI model by providing JSON data as input.

A sample JSON file has been provided at uploads/data.json.

The API endpoint performs the following operations:
  1. Process the JSON data using LangChain's JSONLoader
  2. Generate vector embeddings using Cohere's embed-multilingual-v3.0 model
  3. Save the vector embeddings to the MongoDB collection 'vector-embeddings' created in the previous step
The saved vector embeddings can be viewed using a tool like MongoDB Compass.

Note: we are using the same MongoDB collection to store the vector embeddings for the PDF, website, and JSON sources.
The additional "source" property differentiates between the multiple vector embeddings.
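
For completeness, loading the JSON source is a one-loader change. The import path is an assumption that depends on your LangChain version:

// Sketch of the JSON training flow (loader import path is an assumption)
import { JSONLoader } from "langchain/document_loaders/fs/json";

// 1. Process the JSON data
const docs = await new JSONLoader("uploads/data.json").load();
docs.forEach((d) => { d.metadata.source = "json"; });

// 2 + 3. Embedding and persistence follow the same steps as the PDF flow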

Query the AI Model (without history)

Once the model is trained and the vector embeddings have been persisted in the database, we can prompt the model using a RAG-based approach.


/api/query/prompt

The endpoint takes the following parameters:
query - the question to be asked of the AI model
source - the source of the vector embeddings; in our case this can be pdf, website, or json.
The possible values of source can be retrieved using the sources endpoint:

/api/sources/trained-models:

The query endpoint performs the following operations:
  1. Load the vector embeddings as a retriever object based on the source provided.
  2. Build a RAG-based retrieval chain using the retriever.
  3. Use the retrieval chain to respond to the user's prompt.
A sketch of these three steps is shown below.
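
Here is a minimal sketch of these steps in LangChain JS. The prompt template, the pre-filter syntax, and the index name are assumptions, not the app's exact code; the collection handle is the one created during training:

// Sketch of the RAG query flow (prompt, filter syntax, and index name are assumptions)
import { ChatCohere, CohereEmbeddings } from "@langchain/cohere";
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { createRetrievalChain } from "langchain/chains/retrieval";

// 1. Load the stored embeddings as a retriever, filtered by source
const vectorStore = new MongoDBAtlasVectorSearch(
  new CohereEmbeddings({ model: "embed-multilingual-v3.0" }),
  { collection, indexName: "vector_index", embeddingKey: "embedding" } // collection as set up earlier
);
const retriever = vectorStore.asRetriever({
  filter: { preFilter: { source: { $eq: "pdf" } } },
});

// 2. Build a RAG-based retrieval chain
const prompt = ChatPromptTemplate.fromTemplate(
  "Answer the question using only the following context:\n{context}\n\nQuestion: {input}"
);
const combineDocsChain = await createStuffDocumentsChain({ llm: new ChatCohere(), prompt });
const chain = await createRetrievalChain({ retriever, combineDocsChain });

// 3. Respond to the user's prompt
const { answer } = await chain.invoke({ input: "What is on the lunch menu?" });
console.log(answer);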

Query the AI Model (with history)

Sometimes, as part of the query, we would like the LLM to respond based on the previous conversation history. This can easily be done by providing the previous conversation as context to the LLM.

/api/query/prompt-with-history


When using this endpoint for the first time, the chat history id will be empty and we will query the LLM with just the query and the source (as in the previous section on querying without chat history).

Once the output is generated, the application persists the conversation history into the MongoDB collection "chat_history".


The list of saved conversations can be fetched using the following endpoint:
/api/sources/chathistory:



Using the chat history id, you can then ask follow-up questions to the LLM.

The LLM will now take both the existing vector embeddings and the previous chat history into consideration to provide an optimal response.
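
One common way to wire this up in LangChain JS is a history-aware retriever that first rephrases the follow-up question using the stored history. The sketch below illustrates that pattern under the assumption that the app works this way; it reuses retriever and combineDocsChain from the previous sketch, and the rephrasing prompt and example turns are made up:

// Sketch of querying with chat history (the rephrasing prompt and turns are assumptions)
import { ChatCohere } from "@langchain/cohere";
import { ChatPromptTemplate, MessagesPlaceholder } from "@langchain/core/prompts";
import { HumanMessage, AIMessage } from "@langchain/core/messages";
import { createHistoryAwareRetriever } from "langchain/chains/history_aware_retriever";
import { createRetrievalChain } from "langchain/chains/retrieval";

// Turn the follow-up question into a standalone query using the history
const rephrasePrompt = ChatPromptTemplate.fromMessages([
  new MessagesPlaceholder("chat_history"),
  ["user", "{input}"],
  ["user", "Rephrase the question above as a standalone question."],
]);
const historyAwareRetriever = await createHistoryAwareRetriever({
  llm: new ChatCohere(),
  retriever,        // from the previous sketch
  rephrasePrompt,
});
const chain = await createRetrievalChain({
  retriever: historyAwareRetriever,
  combineDocsChain, // from the previous sketch
});

// Previous turns would be loaded from the 'chat_history' collection
const { answer } = await chain.invoke({
  input: "And is it vegetarian?",
  chat_history: [
    new HumanMessage("What is on the lunch menu?"),
    new AIMessage("Today's lunch is pasta."),
  ],
});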

/api/summarize/summarize-using-pdf

The endpoint is used to provide a quick summary of the uploaded PDF file.

The API endpoint performs the following operations:
  1. Process the uploaded PDF
  2. Split the PDF data into chunks using LangChain's RecursiveCharacterTextSplitter
  3. Use loadSummarizationChain from LangChain's chains library to create a summary of the PDF
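
A minimal sketch of the summarization step; the map_reduce chain type and the chunking parameters are assumptions, and the app may use a different configuration:

// Sketch of summarizing the PDF chunks (chain type is an assumption)
import { loadSummarizationChain } from "langchain/chains";
import { ChatCohere } from "@langchain/cohere";
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

const docs = await new PDFLoader("uploads/lunch.pdf").load();
const chunks = await new RecursiveCharacterTextSplitter({ chunkSize: 1000 }).splitDocuments(docs);

// Summarize each chunk, then combine the partial summaries
const chain = loadSummarizationChain(new ChatCohere(), { type: "map_reduce" });
const { text } = await chain.invoke({ input_documents: chunks });
console.log(text);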
