Integrate images and documents into your interactions with Zentry
Zentry extends its capabilities beyond text by supporting multimodal data, including images and documents. You can include visual and document content in your interactions, and Zentry extracts relevant information from these media types to enrich the memory system.
When a user submits an image or document, Zentry processes it to extract textual information and other pertinent details. These details are then added to the user’s memory, enhancing the system’s ability to understand and recall multimodal inputs.
    import os
    from Zentry import MemoryClient

    os.environ["Zentry_API_KEY"] = "your-api-key"

    client = MemoryClient()

    messages = [
        {
            "role": "user",
            "content": "Hi, my name is Alice."
        },
        {
            "role": "assistant",
            "content": "Nice to meet you, Alice! What do you like to eat?"
        },
        {
            "role": "user",
            "content": {
                "type": "image_url",
                "image_url": {
                    "url": "https://www.superhealthykids.com/wp-content/uploads/2021/10/best-veggie-pizza-featured-image-square-2.jpg"
                }
            }
        },
    ]

    # Calling the add method to ingest messages into the memory system
    client.add(messages, user_id="alice")

    import base64

    # Reuses the `client` created in the first example.

    # Path to the document file
    document_path = "path/to/your/document.txt"

    # Function to convert file to Base64
    def file_to_base64(file_path):
        with open(file_path, "rb") as file:
            return base64.b64encode(file.read()).decode('utf-8')

    # Encode the document in Base64
    base64_document = file_to_base64(document_path)

    # Create the message dictionary with the Base64-encoded document
    document_message = {
        "role": "user",
        "content": {
            "type": "mdx_url",
            "mdx_url": {
                "url": base64_document
            }
        }
    }

    client.add([document_message], user_id="alice")

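The Base64 example above reads the whole file into memory and encodes it, which grows the payload to four output characters for every three input bytes. The following is a minimal sketch, using only the standard library, that reports both sizes so you can check a document before sending it. `encode_with_size` is a hypothetical helper, not part of the Zentry client, and this page does not state any upload size limit.

```python
import base64
import os
import tempfile


def encode_with_size(file_path):
    """Return (base64_text, raw_size, encoded_size) for a file.

    Hypothetical helper: it only reads and encodes, and never calls Zentry.
    """
    with open(file_path, "rb") as file:
        raw = file.read()
    encoded = base64.b64encode(raw).decode("utf-8")
    return encoded, len(raw), len(encoded)


# Demonstrate on a temporary file so the sketch runs without a real document.
sample_bytes = b"Alice likes veggie pizza.\n"
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(sample_bytes)
    sample_path = tmp.name

encoded, raw_size, encoded_size = encode_with_size(sample_path)
os.remove(sample_path)

# Decoding the text gives back the original bytes.
round_trip_ok = base64.b64decode(encoded) == sample_bytes
print(raw_size, encoded_size, round_trip_ok)
```

The `encoded` string is what the example above places in the `"url"` field of the `mdx_url` message.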
    # Reuses the `client` created in the first example.

    # Define the PDF URL
    pdf_url = "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"

    # Create the message dictionary with the PDF URL
    pdf_message = {
        "role": "user",
        "content": {
            "type": "pdf_url",
            "pdf_url": {
                "url": pdf_url
            }
        }
    }

    client.add([pdf_message], user_id="alice")

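The three examples above differ only in the content type key (`image_url`, `mdx_url`, `pdf_url`), so the message dictionaries can be built by one small function. This is a minimal sketch: `build_media_message` and `SUPPORTED_TYPES` are hypothetical names, and the list of types is taken from the examples on this page rather than from a complete reference.

```python
# Content types that appear in the examples on this page (assumed, not exhaustive).
SUPPORTED_TYPES = ("image_url", "mdx_url", "pdf_url")


def build_media_message(content_type, url, role="user"):
    """Return a message dict in the shape used by the examples on this page.

    Hypothetical helper: it only builds the dictionary and does not call Zentry.
    """
    if content_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported content type: {content_type!r}")
    return {
        "role": role,
        "content": {
            "type": content_type,
            # The nested key repeats the type name, as in the examples above.
            content_type: {"url": url},
        },
    }


image_message = build_media_message(
    "image_url", "https://example.com/sample-image.jpg"
)
print(image_message["content"]["type"])  # image_url
```

The resulting dictionary can be passed to `client.add([image_message], user_id="alice")` in the same way as the hand-written messages.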
Here’s a comprehensive example showing how to work with different file types:
    import base64
    from Zentry import MemoryClient

    client = MemoryClient()

    def file_to_base64(file_path):
        with open(file_path, "rb") as file:
            return base64.b64encode(file.read()).decode('utf-8')

    # Example 1: Using an image URL
    image_message = {
        "role": "user",
        "content": {
            "type": "image_url",
            "image_url": {
                "url": "https://example.com/sample-image.jpg"
            }
        }
    }

    # Example 2: Using a text document URL
    text_message = {
        "role": "user",
        "content": {
            "type": "mdx_url",
            "mdx_url": {
                "url": "https://www.w3.org/TR/2003/REC-PNG-20031110/iso_8859-1.txt"
            }
        }
    }

    # Example 3: Using a PDF URL
    pdf_message = {
        "role": "user",
        "content": {
            "type": "pdf_url",
            "pdf_url": {
                "url": "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"
            }
        }
    }

    # Add each message to the memory system
    client.add([image_message], user_id="alice")
    client.add([text_message], user_id="alice")
    client.add([pdf_message], user_id="alice")

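The example above adds each message with its own `client.add` call, so a failure on one file would stop the calls that follow it. The sketch below shows one possible way to keep going and collect the failures. `add_all` is a hypothetical helper, `RecordingClient` is a stand-in so the code runs offline, and the `video_url` type is made up only to trigger a failure in the stand-in; none of these are part of Zentry.

```python
def add_all(client, messages, user_id):
    """Add each message separately and collect failures instead of stopping.

    `client` is any object with an `add(messages, user_id=...)` method, the
    call used throughout this page. Hypothetical helper, not part of Zentry.
    """
    failures = []
    for index, message in enumerate(messages):
        try:
            client.add([message], user_id=user_id)
        except Exception as error:  # broad on purpose: error types are not documented here
            failures.append((index, error))
    return failures


class RecordingClient:
    """Offline stand-in for MemoryClient that rejects one made-up type."""

    def __init__(self):
        self.added = []

    def add(self, messages, user_id):
        for message in messages:
            if message["content"]["type"] == "video_url":
                raise ValueError("unsupported type")
            self.added.append((user_id, message))


messages = [
    {"role": "user", "content": {"type": "image_url",
                                 "image_url": {"url": "https://example.com/sample-image.jpg"}}},
    {"role": "user", "content": {"type": "video_url",
                                 "video_url": {"url": "https://example.com/clip.mp4"}}},
]

stub = RecordingClient()
failures = add_all(stub, messages, user_id="alice")
print(len(stub.added), [index for index, _ in failures])  # 1 [1]
```

With a real client, you would pass the `MemoryClient` instance in place of `stub` and inspect `failures` to decide which files to retry.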
Using these methods, you can incorporate images, text documents, and PDFs into your interactions with Zentry.