Multimodal Support


Zentry extends its capabilities beyond text by supporting multimodal data. Users can include images in their interactions, and Zentry extracts the relevant information from them.

How It Works

When a user submits an image, Zentry processes it to extract textual information and other pertinent details. These details are then added to the user’s memory, enhancing the system’s ability to understand and recall multimodal inputs.

from zentry import Memory

client = Memory()

messages = [
    {
        "role": "user",
        "content": "Hi, my name is Alice."
    },
    {
        "role": "assistant",
        "content": "Nice to meet you, Alice! What do you like to eat?"
    },
    {
        "role": "user",
        "content": {
            "type": "image_url",
            "image_url": {
                "url": "https://www.superhealthykids.com/wp-content/uploads/2021/10/best-veggie-pizza-featured-image-square-2.jpg"
            }
        }
    },
]

# Calling the add method to ingest messages into the memory system
client.add(messages, user_id="alice")
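The image entry in the example above is a plain dictionary, so it can be built with a small helper when a conversation contains several images. The following is a minimal sketch, not part of Zentry: `image_message` and `local_image_as_data_url` are hypothetical helper names, and this page does not state whether Zentry accepts base64 data URLs, so treat the local-file path as an assumption to verify.

```python
import base64
import mimetypes
from pathlib import Path


def image_message(url: str, role: str = "user") -> dict:
    """Wrap an image reference in the message shape shown in the example above."""
    # Only accept web URLs or image data URLs; anything else is likely a mistake.
    if not url.startswith(("http://", "https://", "data:image/")):
        raise ValueError(f"unsupported image reference: {url!r}")
    return {
        "role": role,
        "content": {"type": "image_url", "image_url": {"url": url}},
    }


def local_image_as_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL.

    Assumption: support for data URLs is not documented on this page.
    """
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognised image file: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Build a conversation that mixes text and an image (placeholder URL).
messages = [
    {"role": "user", "content": "Hi, my name is Alice."},
    image_message("https://example.com/pizza.jpg"),
]
# messages can then be passed to client.add(messages, user_id="alice") as above.
```

Keeping the message construction in one place means the dictionary shape only has to be updated once if it ever changes.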

With this approach, you can include images in your interactions and have Zentry store what it extracts from them in the user’s memory. If you have any questions, please feel free to reach out to us using one of the following methods:

Join our community

GitHub

Ask questions on GitHub

Support

Talk to founders
