Running Zentry Locally with Ollama
Zentry can run entirely locally by using Ollama for both the embedding model and the language model (LLM). This guide walks you through the necessary steps and provides complete code to get you started.
Overview
Running Zentry against Ollama keeps both embedding and text generation on your own machine, giving you full control over your data and models. Combined with a local vector store, the entire memory pipeline is self-hosted.
Setup
Before you begin, make sure Zentry and Ollama are installed and configured on your local machine, that the models used below (`llama3.1:latest` and `nomic-embed-text:latest`) have been pulled into Ollama, and that a Qdrant instance is running on localhost.
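As a quick sanity check, you can confirm that the Ollama server is up and that the required models are present. This is a minimal sketch: it assumes Ollama's default port (11434) and the `requests` package, and uses `/api/tags`, Ollama's endpoint for listing locally available models.

```python
import requests

# Ask the local Ollama server which models it has pulled.
resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()

models = [m["name"] for m in resp.json().get("models", [])]
print("Local Ollama models:", models)

# This guide expects these two models to be available.
for required in ("llama3.1:latest", "nomic-embed-text:latest"):
    assert required in models, f"missing model: {required} (run `ollama pull {required}`)"
```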
Full Code Example
Below is a complete example of setting up and using Zentry locally with Ollama.
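The sketch assumes Zentry exposes a Mem0-style Python client: a `Memory` class with `from_config`, `add`, and `search` methods, and config keys named `vector_store`, `llm`, and `embedder`. Verify these names against your installed version; the collection name, user id, and sample texts are placeholders.

```python
from zentry import Memory  # assumed import path; adjust to your Zentry version

# Point every component at local services: Qdrant for vectors,
# Ollama for both the LLM and the embedding model.
config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "zentry_demo",  # placeholder name
            "host": "localhost",
            "port": 6333,                      # Qdrant's default port
            "embedding_model_dims": 768,       # nomic-embed-text outputs 768-dim vectors
        },
    },
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "llama3.1:latest",
            "temperature": 0.1,
            "max_tokens": 2000,
            "ollama_base_url": "http://localhost:11434",  # Ollama's default address
        },
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text:latest",
            "ollama_base_url": "http://localhost:11434",
        },
    },
}

memory = Memory.from_config(config)

# Store a memory for a user, then query it back.
memory.add("I'm vegetarian and allergic to nuts.", user_id="alice")
related = memory.search("What can Alice eat?", user_id="alice")

# Result shape is assumed; print(related) to inspect what your version returns.
for item in related["results"]:
    print(item["memory"])
```

Because every component targets localhost, swapping models is just a matter of editing the two `model` fields to any model you have pulled into Ollama.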
Key Points
- Configuration: The vector store, language model, and embedding model all point at local resources.
- Vector Store: Qdrant serves as the vector store, running on localhost (a quick connectivity check follows this list).
- Language Model: Ollama acts as the LLM provider, using the `llama3.1:latest` model.
- Embedding Model: Ollama also generates embeddings, using the `nomic-embed-text:latest` model.
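If Zentry cannot reach the vector store, you can test the local Qdrant instance directly. This minimal sketch uses the `qdrant-client` package (an assumption; install it separately) against Qdrant's default port:

```python
from qdrant_client import QdrantClient

# Connect to the same local Qdrant instance the Zentry config points at.
client = QdrantClient(host="localhost", port=6333)

# Listing collections confirms the server is reachable; after Zentry has
# run once, its collection (e.g. "zentry_demo") should appear here.
print(client.get_collections())
```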
Conclusion
Running Zentry against Ollama and a local Qdrant instance gives you a fully self-contained memory layer: storage, embedding, and generation all happen on your machine, so you keep full control over your data without giving up Zentry's memory capabilities.