Agent memory that learns from experience

Four retrieval strategies running in parallel. Token-budget optimization. Not RAG. Not a vector database wrapper. 94.6% on LongMemEval, peer-reviewed and independently reproducible.

New to agent memory? Start here →

Integration

Add memory in minutes, not sprints

No schema design. No manual tagging. No migration. Your agent starts building memory from the first conversation.

# pip install hindsight-client

from hindsight_client import Hindsight

client = Hindsight(base_url="http://localhost:8888")
client.retain(bank_id="my-bank", content="Alice prefers Slack over email")
results = client.recall(bank_id="my-bank", query="How does Alice communicate?")

Full REST API reference at hindsight.vectorize.io/api-reference

Under the hood

Four retrieval strategies. One query.

Dense vector search: semantic similarity via embeddings. Finds conceptually related memories even when the wording differs.

Sparse vector search: BM25 keyword matching. Catches exact terms and proper nouns that semantic search misses.

Graph traversal: entity relationship connections. Discovers linked context: person → project → preference.

Temporal search: time-aware retrieval with causal chains. Answers “What happened during onboarding last week?” with cause and effect.

All four run in parallel. Results merge with token budgets, not top-K. You get predictable context size, predictable cost, and the most relevant memories from four different angles.
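As a rough illustration of the fan-out described above, here is a minimal sketch of running four retrievers concurrently and merging their ranked results. The function names, signatures, and scores are invented for illustration; they are not the actual Hindsight internals.

```python
import asyncio

# Hypothetical retrievers, one per strategy. Each returns
# (memory_text, relevance_score) pairs; the data is made up.
async def dense_search(query): return [("alice prefers slack", 0.92)]
async def sparse_search(query): return [("alice joined #eng", 0.81)]
async def graph_search(query): return [("alice -> project atlas", 0.77)]
async def temporal_search(query): return [("alice onboarded last week", 0.70)]

async def recall(query):
    # All four strategies run concurrently on the same query.
    batches = await asyncio.gather(
        dense_search(query),
        sparse_search(query),
        graph_search(query),
        temporal_search(query),
    )
    # Flatten and rank by score before token-budget packing.
    return sorted((r for batch in batches for r in batch),
                  key=lambda r: r[1], reverse=True)

memories = asyncio.run(recall("How does Alice communicate?"))
```

The merged list would then be trimmed to a token budget rather than a fixed top-K, as described below.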

Token budgets, not top-K
Token budgets control how much memory fits in your prompt (e.g., 4,096 tokens), unlike top-K which counts results (e.g., top 10) regardless of size. Predictable context window usage, predictable API costs.
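A token-budget merge can be sketched in a few lines. This is an illustrative packing loop, not the real Hindsight merger, and it approximates token counts by whitespace splitting where a real system would use the model's tokenizer.

```python
# Pack ranked memories into a fixed token budget; skip anything
# that would overflow. Naive word-count "tokens" for illustration.
def pack_to_budget(ranked_memories, budget_tokens):
    packed, used = [], 0
    for text in ranked_memories:
        cost = len(text.split())
        if used + cost > budget_tokens:
            continue  # over budget: drop this memory, try the next
        packed.append(text)
        used += cost
    return packed

context = pack_to_budget(
    ["Alice prefers Slack over email",
     "Alice works on Project Atlas",
     "Alice onboarded last week"],
    budget_tokens=10,
)
# The third memory is dropped once the 10-token budget is spent.
```

Top-K would instead always return K items, however large they are; the budget caps prompt size directly.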
Conflict detection
When facts change, both states are preserved with timestamps. “Alex used to prefer email, now prefers Slack” is more useful than silently overwriting.
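The keep-both-states idea can be sketched as a tiny fact store that closes out the old value with a timestamp instead of overwriting it. The store shape and field names here are hypothetical, purely to illustrate the behavior.

```python
from datetime import datetime, timezone

# Hypothetical fact store: each (subject, attribute) keeps a full
# history of values rather than a single current one.
def retain_fact(store, subject, attribute, value, now=None):
    now = now or datetime.now(timezone.utc)
    history = store.setdefault((subject, attribute), [])
    if history:
        history[-1]["valid_until"] = now  # close out the old state
    history.append({"value": value, "valid_from": now, "valid_until": None})

store = {}
retain_fact(store, "Alex", "preferred_channel", "email")
retain_fact(store, "Alex", "preferred_channel", "Slack")
history = store[("Alex", "preferred_channel")]
# Both states survive: "used to prefer email, now prefers Slack".
```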
Entity resolution
Batch disambiguation resolves hundreds of entities in 3 database queries instead of hundreds of queries — a 99.5% reduction.
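The batching trick is the classic one: resolve every candidate name in a single IN (...) query rather than one query per entity. The schema and function below are invented for illustration, assuming a simple entities table.

```python
import sqlite3

# Batch entity lookup: one query for N names instead of N queries.
# Table layout is hypothetical.
def resolve_entities(conn, names):
    placeholders = ",".join("?" for _ in names)
    rows = conn.execute(
        f"SELECT name, id FROM entities WHERE name IN ({placeholders})",
        names,
    ).fetchall()
    return dict(rows)  # name -> canonical id; unknown names are absent

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO entities (name) VALUES (?)",
                 [("Alice",), ("Bob",), ("Project Atlas",)])
resolved = resolve_entities(conn, ["Alice", "Project Atlas", "Unknown"])
```

Hitting the database once per batch, rather than once per entity, is where the query reduction comes from.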

State of the art. Peer-reviewed.

LongMemEval is a peer-reviewed benchmark for agent memory systems. Our results are independently reproducible. The benchmark code is open source.

GPT-4o        60.2%
Zep           71.2%
Supermemory   85.2%
Hindsight     94.6%

Open source or managed. Same product.

MIT Licensed

Open Source

Built from the same codebase as Hindsight Cloud. No feature gating, no usage limits, no phone-home telemetry.

$ docker run -p 8888:8888 -e HINDSIGHT_API_LLM_API_KEY=$OPENAI_API_KEY ghcr.io/vectorize-io/hindsight

Hindsight Cloud

Managed Hindsight. We handle infrastructure, scaling, backups, and upgrades. You focus on your agent.

Same API, same MCP interface, same everything. Just no ops.

Production ready

SOC2 Type 2 certified
Annual audit of security controls, availability, and data handling.
User isolation
Tag-based security boundaries prevent cross-user data leakage during memory consolidation.
No phone-home telemetry
Self-hosted Hindsight sends nothing back. Your data stays on your infrastructure.
Peer-reviewed research
Built on research from academic institutions. Results independently reproducible.

Start building agents that learn

Open source, MIT licensed. Self-host or use Hindsight Cloud.