
[RAG] Implement chat memory for RAG chatbot

DESCRIPTION

We want to add a "chat memory" feature to the RAG process.

  • The feature can be enabled or disabled from the rag.properties configuration.
  • When enabled, the last N messages exchanged between the user and the assistant (LLM) are provided to the LLM in order to generate the next response. (N must be configurable in rag.properties).
  • The chat history (list of messages) should not be stored in the backend (yet), only client-side. The UI adds the chat history to the POST request (/ai/rag endpoint).
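The "last N messages" rule above can be sketched as follows. This is a minimal illustration, not the Datafari implementation: the function name and the `enabled` / `max_messages` parameters are hypothetical stand-ins for whatever keys rag.properties ends up exposing.

```python
from typing import Dict, List

Message = Dict[str, str]  # {"role": "...", "content": "..."}, as in the request example


def select_history(history: List[Message], enabled: bool, max_messages: int) -> List[Message]:
    """Return the messages that would be passed to the LLM.

    `enabled` and `max_messages` are hypothetical names for the two
    settings read from rag.properties (on/off switch and N).
    """
    # Feature disabled, or N not positive: no chat memory at all.
    if not enabled or max_messages <= 0:
        return []
    # History is sorted oldest first, so the last N items are the most recent.
    return history[-max_messages:]
```

With N = 2 and a history of four messages, only the two most recent ones are kept; with the feature disabled, the history sent by the UI is ignored.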

Example of request:

POST https://127.0.0.1/Datafari/rest/v2.0/ai/rag
{
    "query": "What is my dog's name ?",
    "lang": "fr",
    "history": [
        {
            "role":"user",
            "content": "I just adopted a black labrador. I called her Jumpy."
        },
        {
            "role":"assistant",
            "content": "How nice ! I am sure she will be happy with you."
        },
        {
            "role":"user",
            "content": "What is the capital of France?"
        },
        {
            "role":"assistant",
            "content": "La capitale de la France est Paris, d'après le document `Capitale de la France`."
        }
    ]
}

Parameters:

query: The user query.
lang: The current Datafari language
history: A list of user/assistant messages, sorted by datetime (ASC, the oldest messages come first)
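As an illustration of the three parameters, here is a hedged sketch of how a client could assemble the request body before sending it to the /ai/rag endpoint. The helper name is hypothetical, and restricting roles to "user" and "assistant" is an assumption based on the example above, not a documented constraint.

```python
import json
from typing import Dict, List

# Assumption: only the two roles shown in the example request are accepted.
ALLOWED_ROLES = {"user", "assistant"}


def build_rag_request(query: str, lang: str, history: List[Dict[str, str]]) -> str:
    """Build the JSON body for the POST request (hypothetical helper)."""
    for message in history:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"Unexpected role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError("Each history message needs a string 'content'")
    # The caller is expected to pass the history sorted oldest first.
    payload = {"query": query, "lang": lang, "history": history}
    # ensure_ascii=False keeps accented characters readable in the body.
    return json.dumps(payload, ensure_ascii=False)
```

The returned string can then be sent as the body of the POST request by any HTTP client.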

Response (based on chat history and not retrieved sources):

{
    "content": {
        "documents": [],
        "message": "Le nom de votre chien est Jumpy."
    },
    "status": "OK"
}

This example simulates the following fake conversation:

User (fake previous RAG query): I just adopted a black labrador. I called her Jumpy.

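Assistant (fake previous generated response): How nice ! I am sure she will be happy with you.

User (fake previous RAG query): What is the capital of France?

Assistant (fake previous generated response): La capitale de la France est Paris, d'après le document `Capitale de la France`.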

User (actual RAG query): What is my dog's name ?

Assistant (actual generated response): Le nom de votre chien est Jumpy.

The history contains all the user's previous requests, and all the associated generated responses.

Documentation here:

https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/3136552962/Datafari+RagAPI+-+RAG+-+ALPHA+VERSION

Enhance chat memory compatibility

The first version of the feature sends the history as multiple messages, and therefore seems incompatible with some models, such as Mistral, when they are run using the Datafari AI Agent. We want to provide a chat memory solution that works with Mistral models, either by using a mono-message approach (the whole history packed into a single message) or by finding a way to use multi-message prompts with Mistral.
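One possible shape for the mono-message option is to flatten the history into a single prompt string. The sketch below is only an assumption about how this could look: the function name, the "User" / "Assistant" labels, and the prompt wording are all hypothetical, and nothing here has been validated against Mistral or the Datafari AI Agent.

```python
from typing import Dict, List


def flatten_history(history: List[Dict[str, str]], query: str) -> str:
    """Pack the chat history and the current query into one message (hypothetical format)."""
    # Without history, the prompt is just the query itself.
    if not history:
        return query
    lines = ["Previous conversation:"]
    for message in history:
        speaker = "User" if message["role"] == "user" else "Assistant"
        lines.append(f"{speaker}: {message['content']}")
    lines.append("")  # blank line between the history and the question
    lines.append(f"Current question: {query}")
    return "\n".join(lines)
```

The resulting string would be sent as a single user message, so the model only needs to support one message per request.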

VERSION CONCERNED

6.2

CHECKLIST BEFORE CLOSING TICKET

  • Documentation
    • I have created the functional documentation in the wiki
    • I have created the technical documentation in the wiki
    • I have added javadoc comments on key functions in my code
  • Security
    • I have cleaned up any input coming from users
    • I have not put any token APIs, passwords or the like in my code
    • I am not using 3rd party libraries that are deprecated or not maintained
Edited by Emeric Bernet-Rollande