Hindsight vs Zep (Graphiti): Agent Memory Compared (2026)

If you're evaluating agent memory systems in 2026, Hindsight and Zep are two of the strongest options — but they solve the problem from fundamentally different directions. Zep builds a temporal knowledge graph. Hindsight runs four parallel retrieval strategies against a unified store. The right choice depends on what kind of memory your agent actually needs.

This guide breaks down both systems honestly. Zep has the best temporal modeling in the space. Hindsight offers broader retrieval and easier self-hosting. Neither is universally better. The trade-offs are real.


Hindsight vs Zep: Quick Comparison

| | Hindsight | Zep / Graphiti |
|---|---|---|
| Architecture | Multi-strategy hybrid (4 parallel retrievers) | Temporal knowledge graph |
| Core strength | Broad retrieval coverage, multi-hop reasoning | Deep temporal awareness, fact validity tracking |
| Open source | MIT license | Graphiti is open (~24K stars); Zep CE deprecated |
| Database | Embedded PostgreSQL (no external DB) | Requires Neo4j, FalkorDB, or Kuzu |
| Retrieval latency | Sub-200ms | <200ms on Zep Cloud |
| Benchmark | 91.4% on LongMemEval | 63.8% on LongMemEval (GPT-4o) |
| Self-hosting | One Docker command | Via raw Graphiti only (Zep CE deprecated) |
| Managed cloud | Yes | Yes (Zep Cloud) |
| SDKs | Python, TypeScript, Go | Python, TypeScript, Go |
| MCP support | MCP-first | Not primary |
| Compliance | Not specified | SOC2 Type 2, HIPAA |
| Pricing | Self-host free; managed cloud available | Free (1K credits) · $25/mo Flex (20K credits) · Enterprise custom |

Agent Memory Architecture: Hindsight vs Zep

Zep: Temporal Knowledge Graph

Zep is built around Graphiti, a temporal knowledge graph engine. When your agent records an episode — a conversation, an event, a task outcome — Graphiti decomposes it into entities, edges, and temporal attributes. Every fact gets a validity window: when it became true, when it stopped being true, and the confidence level of that assessment.

Entities are nodes, relationships are edges, and time is a first-class dimension on every edge. When your agent asks "What was the customer's sentiment about the product in Q3?", Zep doesn't just retrieve chunks that mention the customer. It traverses the graph, filters by the Q3 time window, and returns the facts that were valid during that period.
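To make the time-window filtering concrete, here is a minimal in-memory sketch. It is not Zep's or Graphiti's API: the `Edge` class, the field names `valid_at` and `invalid_at`, and the sample data are all illustrative assumptions, and a real deployment would run this kind of filter inside the graph database.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Edge:
    """A hypothetical fact edge: source --relation--> target, with a validity window."""
    source: str
    relation: str
    target: str
    valid_at: datetime                      # when the fact became true
    invalid_at: Optional[datetime] = None   # None means the fact is still true


def overlaps(edge: Edge, start: datetime, end: datetime) -> bool:
    """True if the edge's validity window intersects the half-open range [start, end)."""
    began_before_end = edge.valid_at < end
    ended_after_start = edge.invalid_at is None or edge.invalid_at > start
    return began_before_end and ended_after_start


def facts_valid_during(edges: List[Edge], entity: str,
                       start: datetime, end: datetime) -> List[Edge]:
    """Return the entity's outgoing edges that were valid at some point in [start, end)."""
    return [e for e in edges if e.source == entity and overlaps(e, start, end)]


# Made-up sample data for illustration only.
EDGES = [
    Edge("acme", "sentiment", "frustrated", datetime(2025, 5, 1), datetime(2025, 8, 15)),
    Edge("acme", "sentiment", "satisfied", datetime(2025, 8, 15)),
    Edge("acme", "plan", "trial", datetime(2025, 1, 10), datetime(2025, 6, 1)),
    Edge("globex", "sentiment", "neutral", datetime(2025, 7, 1)),
]

# Q3 2025 as a half-open range: 1 July up to (not including) 1 October.
q3 = facts_valid_during(EDGES, "acme", datetime(2025, 7, 1), datetime(2025, 10, 1))
for edge in q3:
    print(edge.relation, edge.target)
```

The trial-plan edge is excluded because its window closed before Q3 began, even though it mentions the same customer. That is the difference between retrieving by mention and retrieving by validity.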

The underlying graph database (Neo4j, FalkorDB, or Kuzu) handles the traversal. Zep Cloud abstracts the infrastructure away. If you self-host via Graphiti directly, you manage the graph database yourself.

Hindsight: Multi-Strategy Hybrid

Hindsight takes a different approach. Instead of a single retrieval paradigm, it runs four parallel strategies against every query: semantic search, entity-based retrieval, temporal filtering, and graph traversal. The results are merged and ranked before being returned to the agent.
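The fan-out-and-merge pattern described above can be sketched in a few lines. This is an assumption-laden illustration, not Hindsight's implementation: the four retriever functions are hypothetical stubs that return hard-coded memory IDs, and reciprocal rank fusion is swapped in as one common way to merge ranked lists, since the article does not specify the exact merging method.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Retriever = Callable[[str], List[str]]


# Hypothetical stand-ins for the four strategies. Each returns memory IDs,
# best match first. Real retrievers would query a store instead.
def semantic_search(query: str) -> List[str]:
    return ["m3", "m1", "m7"]


def entity_lookup(query: str) -> List[str]:
    return ["m1", "m4", "m3"]


def temporal_filter(query: str) -> List[str]:
    return ["m7", "m1"]


def graph_traversal(query: str) -> List[str]:
    return ["m4", "m3", "m9"]


RETRIEVERS: List[Retriever] = [semantic_search, entity_lookup, temporal_filter, graph_traversal]


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge ranked lists: each list contributes 1 / (k + rank) to an item's score."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] = scores.get(memory_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))


def recall(query: str) -> List[str]:
    """Run every retriever concurrently, then fuse their rankings into one list."""
    with ThreadPoolExecutor(max_workers=len(RETRIEVERS)) as pool:
        rankings = list(pool.map(lambda retrieve: retrieve(query), RETRIEVERS))
    return reciprocal_rank_fusion(rankings)


print(recall("Where does the customer live now?"))
```

Items that several strategies agree on (here `m1` and `m3`) rise to the top, while an item found by only one strategy (`m9`) still makes the list instead of being lost. That is the coverage argument in miniature.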

The storage layer is embedded PostgreSQL — no external graph database, no separate vector store. Facts are extracted from episodes, entities are resolved across interactions, and a reflect step synthesizes higher-order insights from accumulated knowledge. Everything lives in one place.

As the survey paper "Memory in the Age of AI Agents" documents, agent memory architectures range from temporal knowledge graphs (Zep's approach) to multi-strategy hybrid systems (Hindsight's approach). Where Zep goes deep on temporal graph modeling, Hindsight goes wide across retrieval strategies. The theory is that no single retrieval method dominates across all query types, so running several in parallel catches what any one method would miss.

The benchmark results are consistent with that bet: Hindsight scores 91.4% on LongMemEval compared to Zep's 63.8% (GPT-4o), with the gap widening on query types that require combining temporal context with semantic or entity-based reasoning.


Temporal Reasoning: Zep's Strongest Advantage

This is where Zep pulls ahead, and it's worth being direct about it. Zep has the most sophisticated temporal modeling of any agent memory system available today.

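Every edge in Zep's knowledge graph carries explicit temporal metadata. In the open-source Graphiti data model, these are a valid_at timestamp (when the fact became true), an invalid_at timestamp (when it stopped being true, for example because a later fact contradicted it), and created_at and expired_at timestamps that record when the graph itself learned of and retired the edge. This lets Zep answer questions that most memory systems fumble: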

  • "What was the customer's address before they moved last October?"
  • "When did the team switch from weekly to biweekly standups?"
  • "What was the project budget before the Q2 revision?"

These aren't simple lookups. They require understanding that facts have lifespans — that something was true during a specific window and then stopped being true. Zep tracks these transitions natively in the graph structure.
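A small sketch shows what "facts have lifespans" means in practice. Everything here is hypothetical (the `FactStore` class, its method names, the customer data). It assumes facts arrive in chronological order and treats any new value for the same subject and relation as a contradiction. A production system would detect contradictions far less naively.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    relation: str
    value: str
    valid_at: datetime                      # when the fact became true
    invalid_at: Optional[datetime] = None   # set when a later fact supersedes it


class FactStore:
    """Toy store that closes a fact's validity window when it is contradicted."""

    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def assert_fact(self, subject: str, relation: str, value: str, at: datetime) -> None:
        for fact in self.facts:
            is_open = fact.invalid_at is None
            if fact.subject == subject and fact.relation == relation and is_open:
                if fact.value == value:
                    return              # already known and still true: nothing to do
                fact.invalid_at = at    # contradicted: end the old fact's lifespan
        self.facts.append(Fact(subject, relation, value, at))

    def value_as_of(self, subject: str, relation: str, when: datetime) -> Optional[str]:
        """Return the value that was valid at `when`, or None if nothing was known."""
        for fact in self.facts:
            started = fact.valid_at <= when
            not_ended = fact.invalid_at is None or when < fact.invalid_at
            if fact.subject == subject and fact.relation == relation and started and not_ended:
                return fact.value
        return None


store = FactStore()
store.assert_fact("customer-42", "address", "12 Elm St", datetime(2023, 3, 1))
store.assert_fact("customer-42", "address", "9 Oak Ave", datetime(2025, 10, 14))

# "What was the customer's address before they moved last October?"
print(store.value_as_of("customer-42", "address", datetime(2025, 9, 30)))
```

The old address is never deleted or overwritten. It is kept with a closed window, which is what lets the store answer questions about the past as well as the present.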

Hindsight supports temporal filtering as one of its four agent memory retrieval strategies, so it can handle queries like "What happened last week?" or "Show me interactions from March." But it doesn't model fact validity windows the way Zep does. If your agent's primary job involves tracking how entities and relationships change over time — compliance workflows, audit trails, evolving customer relationships — Zep's temporal depth is hard to match.


Self-Hosting Agent Memory: Hindsight vs Zep

This is where the trade-off flips.

Zep: Community Edition Deprecated

Zep used to offer a self-hostable Community Edition. That's been deprecated. Your options today are:

  1. Zep Cloud: fully managed, credit-based pricing, SOC2 Type 2 and HIPAA compliant
  2. Raw Graphiti: the open-source graph engine (~24K GitHub stars), which you deploy and manage yourself

If you go the Graphiti route, you're taking on the operational overhead of a graph database (Neo4j, FalkorDB, or Kuzu), plus Graphiti itself, plus whatever embedding and LLM infrastructure the pipeline requires. It's powerful, but it's not a one-command deployment. You need graph DB expertise or at least the willingness to acquire it.

For teams that want Zep's temporal capabilities without managing infrastructure, Zep Cloud is the clear path. For teams that need on-prem or air-gapped deployment, Graphiti is the only option — and it requires meaningful ops investment.

Hindsight: One Docker Command

Hindsight self-hosts with a single Docker command. The storage layer is embedded PostgreSQL, so there's no external database to provision, no graph DB to tune, no separate vector store to manage. Pull the image, run it, point your agent at it.

The MIT license means no usage restrictions. You can run it in your VPC, on-prem, air-gapped — wherever you need it. The managed cloud option exists for teams that don't want to deal with infrastructure, but the self-hosted path is a first-class citizen, not an afterthought.


Agent Memory Pricing: Hindsight vs Zep

Zep Cloud

| Plan | Price | Credits | Notes |
|---|---|---|---|
| Free | $0 | 1,000 credits | Good for prototyping only |
| Flex | $25/month | 20,000 credits | Credit-based; overages billed per credit |
| Enterprise | Custom | Custom | SOC2 Type 2, HIPAA, dedicated support |

The credit-based model means costs scale with usage. Each memory operation — add, search, episode processing — consumes credits. For high-volume agents, the math can get expensive quickly. The 1,000-credit free tier is enough to test the API but not enough to run a real workload.
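A back-of-the-envelope calculation shows how quickly credits add up. The Flex price and allotment come from the table above. The credits-per-operation figure and the workload numbers are assumptions for illustration, and the overage rate is not modeled, so check Zep's current pricing before relying on any of this.

```python
from math import ceil

# From the pricing table above.
FLEX_PRICE_USD = 25.0
FLEX_CREDITS = 20_000


def monthly_credits(conversations_per_day: int,
                    ops_per_conversation: int,
                    credits_per_op: float = 1.0,   # assumption: one credit per memory operation
                    days: int = 30) -> float:
    """Estimate the credits a workload consumes in one month."""
    return conversations_per_day * ops_per_conversation * credits_per_op * days


def flex_allotments_needed(credits: float) -> int:
    """How many Flex-sized credit allotments the workload would use up."""
    return ceil(credits / FLEX_CREDITS)


# Hypothetical workload: 500 conversations a day, 4 memory operations each.
credits = monthly_credits(conversations_per_day=500, ops_per_conversation=4)
included_price_per_credit = FLEX_PRICE_USD / FLEX_CREDITS

print(f"Estimated monthly credits: {credits:,.0f}")
print(f"Flex allotments consumed:  {flex_allotments_needed(credits)}")
print(f"Price per included credit: ${included_price_per_credit:.5f}")
```

Under these assumptions a modest production agent burns through three times the Flex allotment each month, and the remainder is billed as overages at whatever per-credit rate applies.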

Hindsight

| Option | Price | Notes |
|---|---|---|
| Self-hosted | Free (MIT license) | Run your own infrastructure |
| Managed cloud | Usage-based | Hosted by Vectorize |

Self-hosting is genuinely free — no feature gating, no credit limits, no "community edition" with missing capabilities. All four agent memory retrieval strategies, entity resolution, fact extraction, and the reflect operation are available in both self-hosted and managed deployments. The managed cloud option offloads infrastructure management with usage-based pricing.


When to Choose Zep for Agent Memory

Zep is the right choice when:

  • Temporal reasoning is your primary requirement. If your agent needs to track how facts change over time — when relationships started, when policies were updated, when a customer's status changed — Zep's temporal knowledge graph is purpose-built for this.
  • You need enterprise compliance out of the box. SOC2 Type 2 and HIPAA compliance on Zep Cloud mean less work for your security team. For teams in regulated industries where compliance certification is a hard requirement, this is a significant advantage over self-hosted agent memory solutions.
  • You're comfortable with managed cloud. Zep Cloud abstracts away the graph database complexity. If you don't want to manage Neo4j and are fine with credit-based pricing, it's a clean experience.
  • Your queries are graph-shaped. "How is entity A related to entity B, and when did that relationship change?" If these are your core query patterns, a graph-native architecture is likely to serve them better than a general-purpose retriever.

When to Choose Hindsight for Agent Memory

Hindsight is the right choice when:

  • You need broad retrieval coverage. Four parallel strategies mean the system catches results that any single method — including graph traversal — would miss. Hindsight scores 91.4% on LongMemEval vs Zep's 63.8% (GPT-4o), reflecting this coverage advantage across diverse query types.
  • Self-hosting matters. One Docker command, embedded PostgreSQL, MIT license, no external dependencies. If you need on-prem, air-gapped, or just want full control over your infrastructure, Hindsight makes it simple.
  • You want MCP-first integration. Hindsight is designed around the Model Context Protocol, making it straightforward to plug into MCP-compatible agent frameworks.
  • You want to avoid graph database operations. Not every team has Neo4j expertise. Hindsight's embedded PostgreSQL approach means one fewer system to manage, monitor, and debug at 3am.
  • Predictable costs matter. Self-hosted is free with no credit limits, so there are no usage-based surprises unless you opt into the managed cloud.

Verdict: Hindsight vs Zep for Agent Memory

Zep and Hindsight represent two genuinely different philosophies for agent memory.

Zep bets that temporal knowledge graphs are the right primitive. If your agent's world is defined by entities that change over time — and your queries are fundamentally about those changes — that bet pays off. No other system tracks fact validity windows as cleanly as Zep does. The trade-off is infrastructure complexity (if self-hosting via Graphiti) or credit-based pricing (if using Zep Cloud), and the deprecation of the Community Edition narrows your deployment options.

Hindsight bets that no single retrieval strategy wins across all query types, so it runs four in parallel and merges the results. The trade-off is that none of those four strategies goes as deep on temporal modeling as Zep's dedicated graph — but the combined coverage is broader. Add in one-command self-hosting and an MIT license, and it's the simpler operational choice.

Choose Zep if temporal reasoning is your killer feature and you're either comfortable on Zep Cloud or ready to run Graphiti with a graph database.

Choose Hindsight if you want the broadest agent memory retrieval coverage, simple self-hosting, and a system that handles diverse query types without requiring graph DB expertise. The multi-strategy architecture with cross-encoder reranking is a genuine technical advantage for teams whose agents face a mix of temporal, relational, factual, and semantic queries.

As IBM's research on AI agent memory explains, the ability for agents to learn from experience is becoming a core architectural requirement. Whether that learning is structured around temporal knowledge graphs (Zep's strength) or multi-strategy retrieval with broad coverage (Hindsight's strength), the right choice depends on your specific query patterns and deployment requirements.
