To use LM Studio with Zentry, you'll need to have LM Studio running locally with its server enabled. LM Studio provides a way to run local LLMs with an OpenAI-compatible API.
```python
import os

from Zentry import Memory

os.environ["OPENAI_API_KEY"] = "your-api-key"  # used for the embedding model

config = {
    "llm": {
        "provider": "lmstudio",
        "config": {
            "model": "lmstudio-community/Meta-Llama-3.1-70B-Instruct-GGUF/Meta-Llama-3.1-70B-Instruct-IQ2_M.gguf",
            "temperature": 0.2,
            "max_tokens": 2000,
            "lmstudio_base_url": "http://localhost:1234/v1",  # default LM Studio API URL
        }
    }
}

m = Memory.from_config(config)

messages = [
    {"role": "user", "content": "I'm planning to watch a movie tonight. Any recommendations?"},
    {"role": "assistant", "content": "How about thriller movies? They can be quite engaging."},
    {"role": "user", "content": "I'm not a big fan of thriller movies but I love sci-fi movies."},
    {"role": "assistant", "content": "Got it! I'll avoid thriller recommendations and suggest sci-fi movies in the future."}
]

m.add(messages, user_id="alice", metadata={"category": "movies"})
```
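Once memories are added, they can be queried back for a given user. A minimal retrieval sketch, assuming Zentry's `Memory` exposes a `search` method that takes a query string and a `user_id` in the same style as the `add` call above (check your Zentry version for the exact signature):

```python
# Hypothetical retrieval call: `search` and its parameters are assumed
# to mirror the `add` API above; verify against your Zentry version.
related_memories = m.search("What kind of movies does the user like?", user_id="alice")
print(related_memories)
```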
You can also use LM Studio for both the LLM and the embedder to run Zentry entirely locally:
```python
from Zentry import Memory

# No external API keys needed!
config = {
    "llm": {
        "provider": "lmstudio"
    },
    "embedder": {
        "provider": "lmstudio"
    }
}

m = Memory.from_config(config)

messages = [
    {"role": "user", "content": "I'm planning to watch a movie tonight. Any recommendations?"},
    {"role": "assistant", "content": "How about thriller movies? They can be quite engaging."},
    {"role": "user", "content": "I'm not a big fan of thriller movies but I love sci-fi movies."},
    {"role": "assistant", "content": "Got it! I'll avoid thriller recommendations and suggest sci-fi movies in the future."}
]

m.add(messages, user_id="alice123", metadata={"category": "movies"})
```
When using LM Studio for both the LLM and embeddings, make sure you have:

- An LLM model loaded for generating responses
- An embedding model loaded for vector embeddings
- The server enabled, with the correct endpoints accessible
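A quick way to verify all three is to query LM Studio's OpenAI-compatible `/v1/models` endpoint, which lists the models the server currently has loaded. A minimal sketch, assuming the default `http://localhost:1234` address used in the config above:

```python
import requests

# List the models LM Studio currently serves through its OpenAI-compatible
# API. Assumes the default server address; adjust the port if you changed
# it in LM Studio's server settings.
resp = requests.get("http://localhost:1234/v1/models", timeout=5)
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])
```

If your LLM or embedding model doesn't appear in the list, load it in LM Studio before starting Zentry.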