Generative AI - Chat with your data

Overview


In the digital age, we can use the power of generative AI to build innovative custom solutions that let users interact with their documents or websites through natural language conversations.

This blog post focuses on a simple use case: the user can chat with an uploaded PDF, JSON data, or an existing website in natural language using RAG (Retrieval-Augmented Generation).

The entire stack is built on open-source technologies and free-tier services, so it can be implemented without having to pay a dime.

Tech stack:

  • Node.js - popular JavaScript runtime for building backend applications
  • LangChain JS - JavaScript framework for developing applications powered by language models
  • Cohere - LLM provider with a free trial API
  • MongoDB Atlas - an integrated suite of data services centered around a cloud database that can also store vector embeddings

Pre-Requisites

  • Node.js - install version 18 or above.
  • Cohere API key - sign up for Cohere and procure an API key (free of cost)
  • MongoDB Atlas - sign up for a free MongoDB Atlas account (free of cost)

Database Setup

As stated above, we will be using MongoDB Atlas to store the vector embeddings.

For this purpose, we will have to create a database in our MongoDB Atlas account and create a vector search index.

Steps to be followed: 
  1. Log on to your MongoDB Atlas account UI and click on Overview.
  2. Create a free cluster and database user using the following guide: https://www.mongodb.com/docs/atlas/tutorial/deploy-free-tier-cluster/
  3. Navigate to Browse Collections, click on Create Database, and create a database with a collection named "vector-embeddings" (this collection will hold the embeddings).
  4. Create another collection "chat_history" within the same database.
  5. Navigate to the Atlas Search tab (it may also be named Search & Vector Search), click on Create Search Index, and enter the following JSON definition. numDimensions is 1024 because Cohere's embed-multilingual-v3.0 model produces 1024-dimensional vectors:
    
{
  "fields": [
    {
      "numDimensions": 1024,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    },
    {
      "path": "source",
      "type": "filter"
    }
  ]
}


    6. Note the connection string.

Get the connection string by navigating to Database -> Connect.
Connection strings normally have the following format:
mongodb+srv://<username>:<password>@<cluster dns>/?retryWrites=true&appName=Cluster0

The connection string can be tested using a tool like MongoDB Compass, a database client for connecting to the MongoDB database from your local machine.
Note down the connection string, as it needs to be set as an environment variable in the project.
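
Alternatively, the connection string can be verified in a few lines with the official MongoDB Node.js driver. This is just a minimal sanity check, not part of the project code:

// Minimal connectivity check using the official MongoDB Node.js driver
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGO_CONNECTION_STRING);
await client.connect();
await client.db("admin").command({ ping: 1 }); // round-trip to the cluster
console.log("Successfully connected to MongoDB Atlas");
await client.close();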

Project Setup

  • Clone the Repo:
           Repo
  • Create a .env file in the root folder with the following entries:
          COHERE_API_KEY=<value of your API key for Cohere>
          MONGO_CONNECTION_STRING=<value of the connection string for MongoDB>
  • Execute the application:
npm install
npm run dev

API Endpoints

The API endpoints can be accessed using the URL:

http://localhost:5000/api-docs/

API Walk-through

Assuming that the application is up and running on your local machine and the necessary prerequisites, including the Cohere API key and the MongoDB connection string, have been set in the .env file,
we will walk through the various API endpoints used in the application:

Training the AI Model

The API provides three endpoints to train your AI model:

/api/train/train-using-pdf:

The endpoint is used to train your AI model by providing a PDF document as input.
A sample PDF has been provided in the uploads folder. You can train the model using the sample "lunch.pdf" file.

The API endpoint performs the following operations:
  1. Process the uploaded PDF
  2. Split the PDF data into chunks using LangChain's RecursiveCharacterTextSplitter
  3. Generate vector embeddings using Cohere's embed-multilingual-v3.0 model
  4. Save the vector embeddings to the MongoDB collection 'vector-embeddings' created in the previous step
A sketch of this flow is shown below.
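
The whole flow can be sketched in a few lines of LangChain JS. This is a minimal illustration rather than the app's actual code: the import paths, the index name vector_index, the database placeholder, and the chunking parameters are assumptions that depend on your LangChain version and Atlas setup:

// Sketch of the PDF training flow (package paths, index name, and options are assumptions)
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { CohereEmbeddings } from "@langchain/cohere";
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGO_CONNECTION_STRING);
await client.connect();
const collection = client.db("<your-db>").collection("vector-embeddings");

// 1. Process the uploaded PDF
const docs = await new PDFLoader("uploads/lunch.pdf").load();

// 2. Split the PDF data into chunks
const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000, chunkOverlap: 200 });
const chunks = await splitter.splitDocuments(docs);
chunks.forEach((c) => { c.metadata.source = "pdf"; }); // tag the origin of each chunk

// 3 + 4. Generate Cohere embeddings and save them to the Atlas collection
await MongoDBAtlasVectorSearch.fromDocuments(
  chunks,
  new CohereEmbeddings({ model: "embed-multilingual-v3.0" }),
  { collection, indexName: "vector_index", embeddingKey: "embedding" }
);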


The saved vector embeddings can be viewed using a tool like MongoDB Compass.



/api/train/train-using-website:

The endpoint is used to train your AI model by providing a fully formed website URL as its source,
e.g. "https://www.mathema.de".

The API endpoint performs the following operations (see the sketch after this list):
  1. Process the website data using LangChain's CheerioWebBaseLoader
  2. Split and transform the processed website data into chunks using LangChain's RecursiveCharacterTextSplitter.fromLanguage('html') and HtmlToTextTransformer
  3. Generate vector embeddings using Cohere's embed-multilingual-v3.0 model
  4. Save the vector embeddings to the MongoDB collection 'vector-embeddings' created in the previous step
The saved vector embeddings can again be viewed using a tool like MongoDB Compass.
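
The loading and splitting steps differ from the PDF flow only in the loader and splitter. A minimal sketch, with package paths again being assumptions:

// Sketch of the website training flow (package paths are assumptions)
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { HtmlToTextTransformer } from "@langchain/community/document_transformers/html_to_text";

// 1. Load the raw HTML of the page
const docs = await new CheerioWebBaseLoader("https://www.mathema.de").load();

// 2. Split along HTML structure, then strip the markup down to plain text
const splitter = RecursiveCharacterTextSplitter.fromLanguage("html");
const transformer = new HtmlToTextTransformer();
const chunks = await splitter.pipe(transformer).invoke(docs);
chunks.forEach((c) => { c.metadata.source = "website"; });

// 3 + 4. Embedding and persistence are identical to the PDF flow above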

Note: we are using the same MongoDB collection to store the vector embeddings for both the PDF and the website.
An additional property "source" in the collection differentiates between the multiple vector embeddings.

/api/train/train-using-json:

The endpoint is used to train your AI model by providing JSON data as input.

A sample JSON file has been provided at uploads/data.json.

The API endpoint performs the following operations:
  1. Process the JSON data using LangChain's JSONLoader
  2. Generate vector embeddings using Cohere's embed-multilingual-v3.0 model
  3. Save the vector embeddings to the MongoDB collection 'vector-embeddings' created in the previous step
The saved vector embeddings can be viewed using a tool like MongoDB Compass.

Note: we are using the same MongoDB collection to store the vector embeddings for the PDF, website, and JSON sources.
The additional "source" property differentiates between the multiple vector embeddings.
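
For completeness, loading the JSON source is a one-loader change. The import path is an assumption that depends on your LangChain version:

// Sketch of the JSON training flow (loader import path is an assumption)
import { JSONLoader } from "langchain/document_loaders/fs/json";

// 1. Process the JSON data
const docs = await new JSONLoader("uploads/data.json").load();
docs.forEach((d) => { d.metadata.source = "json"; });

// 2 + 3. Embedding and persistence follow the same steps as the PDF flow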

Query the AI Model (without history)

Once the model is trained and the vector embeddings have been persisted in the database, we can prompt the model using a RAG-based approach.


/api/query/prompt

The endpoint takes the following parameters:
query - the question to be asked of the AI model
source - the source of the vector embeddings; in our case this can be pdf, website, or json.
The possible values of source can be retrieved using the sources endpoint:

/api/sources/trained-models:

The query endpoint performs the following operations:
  1. Load the vector embeddings as a retriever object based on the source provided.
  2. Build a RAG-based retrieval chain using the retriever.
  3. Use the retrieval chain to respond to the user's prompt.
A sketch of these three steps is shown below.
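
Here is a minimal sketch of these steps in LangChain JS. The prompt template, the pre-filter syntax, and the index name are assumptions, not the app's exact code; the collection handle is the one created during training:

// Sketch of the RAG query flow (prompt, filter syntax, and index name are assumptions)
import { ChatCohere, CohereEmbeddings } from "@langchain/cohere";
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { createRetrievalChain } from "langchain/chains/retrieval";

// 1. Load the stored embeddings as a retriever, filtered by source
const vectorStore = new MongoDBAtlasVectorSearch(
  new CohereEmbeddings({ model: "embed-multilingual-v3.0" }),
  { collection, indexName: "vector_index", embeddingKey: "embedding" } // collection as set up earlier
);
const retriever = vectorStore.asRetriever({
  filter: { preFilter: { source: { $eq: "pdf" } } },
});

// 2. Build a RAG-based retrieval chain
const prompt = ChatPromptTemplate.fromTemplate(
  "Answer the question using only the following context:\n{context}\n\nQuestion: {input}"
);
const combineDocsChain = await createStuffDocumentsChain({ llm: new ChatCohere(), prompt });
const chain = await createRetrievalChain({ retriever, combineDocsChain });

// 3. Respond to the user's prompt
const { answer } = await chain.invoke({ input: "What is on the lunch menu?" });
console.log(answer);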

Query the AI Model (with history)

Sometimes, as part of the query, we would like the LLM to respond based on the previous conversation history. This can easily be done by providing the previous conversation as context to the LLM.

/api/query/prompt-with-history


When using this endpoint for the first time, the chat history id will be empty and we will query the LLM with just the query and the source (as in the previous section on querying without chat history).

Once the output is generated, the application persists the conversation history into the MongoDB collection "chat_history".


The list of saved conversations can be fetched using the following endpoint:
/api/sources/chathistory:



Using the chat history id, you can then ask follow-up questions to the LLM.

The LLM will now take both the existing vector embeddings and the previous chat history into consideration to provide an optimal response.
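
One common way to wire this up in LangChain JS is a history-aware retriever that first rephrases the follow-up question using the stored history. The sketch below illustrates that pattern under the assumption that the app works this way; it reuses retriever and combineDocsChain from the previous sketch, and the rephrasing prompt and example turns are made up:

// Sketch of querying with chat history (the rephrasing prompt and turns are assumptions)
import { ChatCohere } from "@langchain/cohere";
import { ChatPromptTemplate, MessagesPlaceholder } from "@langchain/core/prompts";
import { HumanMessage, AIMessage } from "@langchain/core/messages";
import { createHistoryAwareRetriever } from "langchain/chains/history_aware_retriever";
import { createRetrievalChain } from "langchain/chains/retrieval";

// Turn the follow-up question into a standalone query using the history
const rephrasePrompt = ChatPromptTemplate.fromMessages([
  new MessagesPlaceholder("chat_history"),
  ["user", "{input}"],
  ["user", "Rephrase the question above as a standalone question."],
]);
const historyAwareRetriever = await createHistoryAwareRetriever({
  llm: new ChatCohere(),
  retriever,        // from the previous sketch
  rephrasePrompt,
});
const chain = await createRetrievalChain({
  retriever: historyAwareRetriever,
  combineDocsChain, // from the previous sketch
});

// Previous turns would be loaded from the 'chat_history' collection
const { answer } = await chain.invoke({
  input: "And is it vegetarian?",
  chat_history: [
    new HumanMessage("What is on the lunch menu?"),
    new AIMessage("Today's lunch is pasta."),
  ],
});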

/api/summarize/summarize-using-pdf

The endpoint is used to provide a quick summary of the uploaded PDF file.

The API endpoint performs the following operations:
  1. Process the uploaded PDF
  2. Split the PDF data into chunks using LangChain's RecursiveCharacterTextSplitter
  3. Use loadSummarizationChain from LangChain's chains library to create a summary of the PDF
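
A minimal sketch of the summarization step; the map_reduce chain type and the chunking parameters are assumptions, and the app may use a different configuration:

// Sketch of summarizing the PDF chunks (chain type is an assumption)
import { loadSummarizationChain } from "langchain/chains";
import { ChatCohere } from "@langchain/cohere";
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

const docs = await new PDFLoader("uploads/lunch.pdf").load();
const chunks = await new RecursiveCharacterTextSplitter({ chunkSize: 1000 }).splitDocuments(docs);

// Summarize each chunk, then combine the partial summaries
const chain = loadSummarizationChain(new ChatCohere(), { type: "map_reduce" });
const { text } = await chain.invoke({ input_documents: chunks });
console.log(text);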
