
LlamaIndex supports Zentry as a memory store. In this guide, we’ll show you how to use it.

🎉 Exciting news! ZentryMemory now supports ReAct and FunctionCalling agents.

Installation

To install the required package, run:

pip install llama-index-core llama-index-memory-Zentry

Setup with Zentry Platform

Set your Zentry Platform API key as an environment variable, replacing <your-Zentry-api-key> with your actual API key. You can obtain your API key from the Zentry Platform.

os.environ["Zentry_API_KEY"] = "<your-Zentry-api-key>"

Import the necessary modules and create a ZentryMemory instance:

from llama_index.memory.Zentry import ZentryMemory

context = {"user_id": "user_1"}
memory_from_client = ZentryMemory.from_client(
    context=context,
    api_key="<your-Zentry-api-key>",
    search_msg_limit=4,  # optional, default is 5
)

Context is used to identify the user, agent, or conversation in Zentry. At least one of the following fields must be passed to the ZentryMemory constructor:

context = {
    "user_id": "user_1",
    "agent_id": "agent_1",
    "run_id": "run_1",
}

search_msg_limit is optional and defaults to 5. It is the number of messages from the chat history used for memory retrieval from Zentry. Using more messages provides more context for retrieval, but it also increases retrieval time and may surface irrelevant results.

Note that search_msg_limit is different from limit: limit is the number of results retrieved from Zentry and is used during search.
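For example, to use more of the recent chat history when querying Zentry, you can raise search_msg_limit when constructing the memory (the value 8 below is only an illustration):

memory_long_context = ZentryMemory.from_client(
    context={"user_id": "user_1"},
    api_key="<your-Zentry-api-key>",
    search_msg_limit=8,  # use the last 8 chat messages when querying Zentry
)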

Setup with Zentry OSS

Set up Zentry OSS by providing configuration details:

To learn more about Zentry OSS, read the Zentry OSS Quickstart.

config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "test_9",
            "host": "localhost",
            "port": 6333,
            "embedding_model_dims": 1536,  # Change this according to your local model's dimensions
        },
    },
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o",
            "temperature": 0.2,
            "max_tokens": 2000,
        },
    },
    "embedder": {
        "provider": "openai",
        "config": {"model": "text-embedding-3-small"},
    },
    "version": "v1.1",
}

Create a ZentryMemory instance:

memory_from_config = ZentryMemory.from_config(
    context=context,
    config=config,
    search_msg_limit=4,  # optional, default is 5
)

Initialize the LLM

import os
from llama_index.llms.openai import OpenAI

os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"
llm = OpenAI(model="gpt-4o")

SimpleChatEngine

Use SimpleChatEngine to start a chat with an agent that uses the memory.

from llama_index.core.chat_engine import SimpleChatEngine

agent = SimpleChatEngine.from_defaults(
    llm=llm, memory=memory_from_client  # or memory_from_config
)

# Start the chat
response = agent.chat("Hi, My name is Mayank")
print(response)
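In later turns of the same session, the agent can draw on what was stored in Zentry. The follow-up below is illustrative; the actual response depends on your LLM:

# Ask a follow-up question that relies on previously stored information
response = agent.chat("What is my name?")
print(response)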

Now we will learn how to use Zentry with FunctionCalling and ReAct agents.

Initialize the tools:

from llama_index.core.tools import FunctionTool


def call_fn(name: str):
    """Call the provided name.
    Args:
        name: str (Name of the person)
    """
    print(f"Calling... {name}")


def email_fn(name: str):
    """Email the provided name.
    Args:
        name: str (Name of the person)
    """
    print(f"Emailing... {name}")


call_tool = FunctionTool.from_defaults(fn=call_fn)
email_tool = FunctionTool.from_defaults(fn=email_fn)

FunctionCallingAgent

from llama_index.core.agent import FunctionCallingAgent

agent = FunctionCallingAgent.from_tools(
    [call_tool, email_tool],
    llm=llm,
    memory=memory_from_client,  # or memory_from_config
    verbose=True,
)

# Start the chat
response = agent.chat("Hi, My name is Mayank")
print(response)
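You can continue the conversation so the agent stores a preference and later combines it with the tools. The prompts below are illustrative examples:

# Store a preference in memory
response = agent.chat("My preferred way of communication is email.")
print(response)

# Later, the agent can combine the stored preference with the email tool
response = agent.chat("Send me an update about the product.")
print(response)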

ReActAgent

from llama_index.core.agent import ReActAgent

agent = ReActAgent.from_tools(
    [call_tool, email_tool],
    llm=llm,
    memory=memory_from_client,  # or memory_from_config
    verbose=True,
)

# Start the chat
response = agent.chat("Hi, My name is Mayank")
print(response)
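As with the function-calling agent, the ReAct agent can use stored memories in later turns. The prompt below is illustrative:

# A follow-up that relies on earlier memory
response = agent.chat("What do you know about me so far?")
print(response)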

Key Features

  1. Memory Integration: Uses Zentry to store and retrieve relevant information from past interactions.
  2. Personalization: Provides context-aware agent responses based on user history and preferences.
  3. Flexible Architecture: LlamaIndex allows for easy integration of the memory with the agent.
  4. Continuous Learning: Each interaction is stored, improving future responses.

Conclusion

By integrating LlamaIndex with Zentry, you can build a personalized agent that maintains context across interactions and provides tailored recommendations and assistance.

Help

  • For more details on LlamaIndex, visit the LlamaIndex documentation.
  • For Zentry documentation, refer to the Zentry Platform.
  • If you need further assistance, please feel free to reach out to us through the following methods: