GBrain vs Hindsight vs Mem0 vs Zep: Memory Compared

Four AI agent memory systems dominate the conversation right now: GBrain, the markdown-first personal brain that Y Combinator CEO Garry Tan open-sourced in April; Hindsight by Vectorize, the production memory platform holding state-of-the-art results on the BEAM benchmark at 10M tokens; Mem0, the most established commercial memory-as-a-service; and Zep, the temporal-knowledge-graph option for conversational AI.

Each is excellent for the audience it's designed for. None is the right answer for everyone. This GBrain vs Hindsight vs Mem0 vs Zep comparison covers what each system actually is, how the architectures differ, where the retrieval design wins, what integrations ship out of the box, what they cost, and which one fits which kind of team.

At-a-Glance Verdict

  • Pick GBrain if you run OpenClaw or Hermes Agent, want to author skill workflows yourself, and value plain-text markdown ownership of your brain.
  • Pick Hindsight if you want production memory infrastructure that synthesizes structure from raw facts automatically (observations + mental models), with 25+ first-class integrations and a managed cloud option.
  • Pick Mem0 if you want a mature managed service with broad SDK language coverage and don't need graph or temporal reasoning.
  • Pick Zep if your application is conversational-first and temporal-graph reasoning is a hard requirement.

The detailed comparison is below.

What Each System Is

GBrain

GBrain is an open-source markdown-first knowledge system released by Garry Tan on April 5, 2026. ~14,000 GitHub stars, MIT license. Three layers: a Brain Repo (Markdown files in git), GBrain Retrieval (Postgres + pgvector with hybrid search), and an AI Agent Skills layer (34 markdown workflow files plus 30+ contract-first operations). Built specifically for OpenClaw and Hermes Agent operators running personal brains. Self-hosted only — no managed cloud. PGLite (WASM Postgres) for zero-config local mode.

For a deeper explainer, see What Is GBrain?.

Hindsight

Hindsight is a production agent memory platform built by Vectorize and released in December 2025. ~12,800 GitHub stars, MIT license. Three core operations: retain, recall, reflect. Background consolidation creates observations — evidence-grounded beliefs with proof counts, freshness trends, and contradiction reconciliation — without operator-defined patterns. Mental models are user-defined topics that auto-refresh as new facts arrive. Multi-strategy retrieval (TEMPR) runs semantic + BM25 + graph traversal + temporal reasoning in parallel, fused via RRF and reranked with a cross-encoder. Available as managed cloud or self-hosted via Docker, Helm, or bare-metal pip install.

Mem0

Mem0 is the most established commercial agent memory layer, with broad SDK language support (Python, Node, Go) and a fully managed cloud service. Server-side LLM extraction creates facts and preferences from conversations automatically. Dual memory scope (user-level and session-level). Single-strategy semantic retrieval with metadata filtering. Self-hosting exists but is less polished than the cloud experience.

Zep

Zep is a long-term memory store designed specifically for conversational AI applications. Its distinctive feature is a temporal knowledge graph — every fact is timestamped so the agent knows when something was true. Strong on session management and user-level scoping. Cloud-managed; self-hosting is via Graphiti (the underlying graph engine), which is more involved.

Quick Comparison Table

| Dimension | GBrain | Hindsight | Mem0 | Zep |
|---|---|---|---|---|
| Created by | Garry Tan (YC CEO) | Vectorize | Mem0 Inc. | Zep |
| Released | Apr 2026 | Dec 2025 | 2023 | 2023 |
| License | MIT | MIT | OSS + paid cloud | OSS + paid cloud |
| Source of truth | Markdown in git | Structured store | Cloud DB | Temporal KG |
| Auto-learning | Operator-authored skills on cron | Observations + mental models (auto) | Fact extraction | Fact + temporal extraction |
| Retrieval | Hybrid (HNSW + tsvector + RRF) | Multi-strategy (semantic + BM25 + graph + temporal) + reranker | Semantic + metadata | Semantic + temporal graph |
| Multi-hop graph at retrieval | Has edges, not primary | Yes | No | Yes |
| Temporal reasoning | No | Yes (native) | No | Yes (graph-scoped) |
| Cross-encoder reranking | Backlink boost | Yes | No | Basic |
| Managed cloud | No | Yes | Yes | Yes |
| Self-host | Yes (PGLite or Postgres) | Yes (Docker, Helm, pip on Linux/macOS/Windows) | Limited | Via Graphiti |
| Multi-tenant by design | No | Yes | Yes | Yes |
| Model Context Protocol (MCP) server | Native + OAuth 2.1 (gbrain serve --http) | Native + OAuth 2.1 (cloud) | SDK | SDK |
| Official integrations | OpenClaw + Hermes (skill packs) | 25+ across coding tools, frameworks, orchestration, gateways | Major frameworks via SDK | Major frameworks via SDK |
| Benchmark highlight | BrainBench: P@5 49.1%, R@5 97.9% (240-page corpus) | BEAM 10M tokens: 64.1% (SOTA) | LongMemEval: varies by methodology | LongMemEval: solid conversational results |
| GitHub stars | ~14,000 | ~12,800 | High | Moderate |

Architecture: Markdown-First vs Structured-First vs Conversational-First

The four systems differ most sharply on what they treat as the source of truth and where automation lives.

GBrain treats human-readable markdown as primary. The agent reads, writes, and reasons against .md files in a git repo. The Postgres index serves queries; the markdown is what you keep, edit, and version-control. Automation runs operator-authored skills on cron — the signal-detector skill captures entity mentions on every message, enrich updates person and company pages, maintain audits citations and detects stale pages. The intelligence lives in markdown skill files the operator owns.

Hindsight treats the structured store as primary. Memories enter via retain, get extracted into facts, get consolidated into observations in the background, and feed mental models the operator defines as named topics. Automation is generative — the system synthesizes new structure (observations) from raw facts without operator-authored patterns. When new evidence contradicts an existing observation, the engine reconciles the contradiction by capturing the journey ("User was previously a React enthusiast, has since switched to Vue") rather than overwriting silently. The structured graph is the system of record; markdown isn't part of the design.
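The contradiction-reconciliation idea can be sketched in a few lines. This is a toy illustration of the behavior described above — the `Observation` shape and `reconcile` function are invented for this sketch, not Hindsight's API:

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    belief: str
    proof_count: int = 1
    history: list = field(default_factory=list)  # prior beliefs, oldest first

def reconcile(obs: Observation, new_belief: str) -> Observation:
    """On matching evidence, strengthen the belief; on contradiction,
    record the journey instead of silently overwriting."""
    if new_belief == obs.belief:
        obs.proof_count += 1         # fresh evidence for the same belief
        return obs
    obs.history.append(obs.belief)   # e.g. "was previously a React enthusiast..."
    obs.belief = new_belief          # "...has since switched to Vue"
    obs.proof_count = 1              # proof count restarts for the new belief
    return obs

o = reconcile(Observation(belief="prefers React"), "prefers Vue")
# o.belief == "prefers Vue"; o.history == ["prefers React"]
```

The point of the pattern is that the superseded belief stays queryable, so the agent can answer "what changed?" rather than only "what is true now?".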

Mem0 treats the cloud DB as primary. Memories are extracted server-side by an LLM and stored as facts and preferences with user-scoped and session-scoped metadata. Single-strategy retrieval — embed the query, find nearest neighbors, return. Simpler than the others, with the trade-off that complex queries (multi-hop, time-aware) don't have specialized retrieval paths.

Zep treats the temporal knowledge graph as primary. Conversations get extracted into entities and timestamped facts. Every fact knows when it was true, which supports queries like "what did we discuss in March that we don't talk about anymore?" The architecture is purpose-built for conversational AI where temporal context shifts matter.

Retrieval

Retrieval design is the single most important axis of any agent memory comparison.

GBrain: Hybrid Search with Multi-Query Expansion

A query into GBrain follows this pipeline: optional Claude Haiku query expansion (2 alternative phrasings) → vector search (HNSW cosine over pgvector) → keyword search (Postgres tsvector, parallel) → RRF merge with score = Σ(1/(60 + rank)) → 4-layer dedup → backlink-boosted ranking. GBrain's published BrainBench numbers report P@5 49.1% and R@5 97.9% on a 240-page Opus-generated corpus, beating the same system with the graph layer disabled by +31.4 points P@5. The graph contributes more lift than hybrid search alone — but the graph is used for ranking, not multi-hop traversal at query time.
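The RRF merge step is a standard technique and fits in a few lines. This is a generic sketch of score = Σ(1/(60 + rank)), not GBrain's actual code:

```python
def rrf_fuse(ranked_lists, k=60):
    """Reciprocal Rank Fusion: score(doc) = sum over lists of 1 / (k + rank).

    ranked_lists: lists of doc IDs, best-first (rank starts at 1).
    Returns doc IDs sorted by fused score, highest first.
    """
    scores = {}
    for results in ranked_lists:
        for rank, doc in enumerate(results, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Fusing a vector ranking with a keyword ranking:
# "a" (ranks 1 and 2) edges out "c" (ranks 3 and 1), since 1/61 + 1/62 > 1/63 + 1/61.
fused = rrf_fuse([["a", "b", "c"], ["c", "a", "b"]])
```

The k=60 constant damps the influence of any single list, which is why RRF is a popular default for merging rankers whose raw scores aren't comparable.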

Hindsight: TEMPR (Four Parallel Strategies)

Hindsight's retriever runs four strategies on every query: semantic search (vector similarity), BM25 keyword, graph traversal (multi-hop across typed entity edges), and temporal reasoning (native handling of time-based queries). Results fuse via RRF and re-score through a cross-encoder reranker. Hindsight currently holds SOTA on the BEAM 10M-token benchmark at 64.1% — 58% ahead of the next-best system. The graph traversal and temporal layers are the two strategies neither GBrain nor Mem0 ship as primary.
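The fan-out-and-fuse shape TEMPR describes can be sketched generically. The strategy functions below are stubs standing in for semantic, BM25, graph, and temporal search — this is an illustration of the pattern, not Vectorize's implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def retrieve(query, strategies, fuse, rerank):
    """Run every retrieval strategy in parallel, fuse the ranked lists,
    then re-score the fused candidates (e.g. with a cross-encoder)."""
    with ThreadPoolExecutor(max_workers=len(strategies)) as pool:
        ranked_lists = list(pool.map(lambda s: s(query), strategies))
    candidates = fuse(ranked_lists)    # e.g. reciprocal rank fusion
    return rerank(query, candidates)   # e.g. cross-encoder re-scoring

# Stub strategies; each returns memory IDs best-first.
semantic = lambda q: ["m1", "m2"]
keyword  = lambda q: ["m2", "m3"]
graph    = lambda q: ["m3", "m1"]
temporal = lambda q: ["m1", "m3"]

hits = retrieve("q", [semantic, keyword, graph, temporal],
                fuse=lambda lists: sorted({d for l in lists for d in l}),
                rerank=lambda q, docs: docs)
```

The design point is that each strategy covers a failure mode of the others: BM25 catches exact terms that embeddings blur, graph traversal reaches facts two hops away from the query's entities, and the temporal strategy handles time-scoped questions.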

Mem0: Semantic + Metadata

Mem0's retrieval is straightforward — embed the query, find nearest vectors, filter by metadata (user ID, session ID, tags). No keyword search, no graph traversal, no temporal reasoning, no reranker. This is fine at small scale and for queries that look like the embedded text. It struggles when queries require precision (specific term match) or compositional reasoning (multi-entity, time-aware).
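That single-strategy path is easy to picture in miniature. A minimal sketch, assuming an in-memory store with precomputed vectors — not Mem0's SDK or internals:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def search(query_vec, memories, user_id, top_k=2):
    """Single-strategy retrieval: filter by metadata, rank by cosine similarity."""
    scoped = [m for m in memories if m["user_id"] == user_id]
    scoped.sort(key=lambda m: cosine(query_vec, m["vec"]), reverse=True)
    return [m["text"] for m in scoped[:top_k]]

memories = [
    {"text": "likes dark mode", "vec": [1.0, 0.0], "user_id": "u1"},
    {"text": "uses vim",        "vec": [0.0, 1.0], "user_id": "u1"},
    {"text": "other user",      "vec": [1.0, 0.0], "user_id": "u2"},
]
hits = search([1.0, 0.1], memories, "u1")
# Returns "likes dark mode" first; the u2 memory is filtered out before ranking.
```

Everything rests on one similarity score, which is exactly why exact-term and multi-hop queries have no fallback path in this design.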

Zep: Semantic + Temporal Graph

Zep combines semantic search with the temporal knowledge graph. Queries can scope to time ranges and follow entity relationships through the graph. Stronger than Mem0 on conversational time-shift queries, weaker than Hindsight on broader retrieval (no BM25 keyword, no general-purpose multi-strategy fusion).
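The temporal side of such a graph reduces to validity windows on facts. A hedged sketch of the idea — field names here are illustrative, not Zep's schema:

```python
from datetime import date

def facts_valid_during(facts, start, end):
    """Facts whose validity window overlaps [start, end] -- the kind of
    time-scoped query a temporal knowledge graph answers natively."""
    return [
        f for f in facts
        if f["valid_from"] <= end
        and (f["valid_to"] is None or f["valid_to"] >= start)  # None = still true
    ]

facts = [
    {"text": "working on migration", "valid_from": date(2026, 3, 1), "valid_to": date(2026, 3, 31)},
    {"text": "uses TypeScript",      "valid_from": date(2025, 1, 1), "valid_to": None},
]
march = facts_valid_during(facts, date(2026, 3, 1), date(2026, 3, 31))
# March returns both facts; an April window drops the finished migration.
```

Queries like "what did we discuss in March that we don't talk about anymore?" then become a set difference between two window queries.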

Integration Breadth

This is where the practical experience of adopting each system diverges sharply.

GBrain ships first-class skill packs for OpenClaw and Hermes Agent. Everything else integrates through the MCP server you stand up yourself (gbrain serve for stdio, gbrain serve --http for OAuth 2.1). No first-party packages for Claude Code, Cursor, CrewAI, LangGraph, n8n, or other tools.

Hindsight ships 25+ official integrations across every major category, each documented at hindsight.vectorize.io/sdks/integrations/<name>.

Mem0 ships SDKs for Python, Node, and Go, with documented integrations for major frameworks (LangChain, CrewAI, LlamaIndex). Less category breadth than Hindsight, similar to Zep.

Zep ships SDKs for Python and TypeScript, with documented integrations for the major frameworks. Heavily oriented toward conversational AI use cases rather than coding agents or orchestration.

For teams whose stack is Claude Code, Cursor, CrewAI, n8n, Pipecat, or anything else outside the four systems' native targets, Hindsight is the only one with a first-class integration. The other three require the operator to wire it up.

Self-Host vs Managed Cloud

| Dimension | GBrain | Hindsight | Mem0 | Zep |
|---|---|---|---|---|
| Managed cloud | No | Yes | Yes (primary) | Yes (primary) |
| Self-host: Docker | docker-compose for tests | Yes (full and slim variants) | Limited | Via Graphiti |
| Self-host: Kubernetes | Manual | Helm chart | Limited | Via Graphiti |
| Self-host: Bare metal | bun install && bun link | pip install on Linux/macOS/Windows | Limited | Via Graphiti |
| Zero-config local mode | PGLite (WASM Postgres) | Embedded pg0 Postgres | Local SQLite for dev | Limited |
| Vector backend | pgvector | pgvector / pgvectorscale / vchord / ScaNN | Internal | Internal graph + vector |
| Pluggable LLM provider | OpenAI required, Anthropic optional | Any (LiteLLM-compatible) | Server-side | Server-side |

Honest read: GBrain has the simplest self-host path (PGLite means no DB server) but no managed cloud at all. Hindsight has the most flexible self-host story (Docker / Helm / pip across three OSes, four vector extensions, any LLM provider) plus a managed cloud. Mem0 and Zep are cloud-first; their self-host paths exist but aren't where the polish goes.

Pricing

Numbers below reflect public pricing as of writing.

| Dimension | GBrain | Hindsight | Mem0 | Zep |
|---|---|---|---|---|
| Self-host base | Free (MIT) | Free (MIT) | Limited | Via Graphiti (AGPL/permissive split) |
| Managed cloud free tier | N/A | Yes | Yes (10K memories) | Limited |
| Managed cloud paid plans | N/A | Usage-based | Hobby $19, Growth $79, Pro $249 | Per-seat / usage-based |
| LLM cost | Operator pays (OpenAI required) | Operator pays (any provider) | Included in cloud | Included in cloud |
| Storage cost | Operator pays (PGLite free, Postgres your choice) | Operator pays (self-host) or included (cloud) | Included | Included |
| Enterprise / custom plan | N/A | Yes | Yes (custom) | Yes |

In practice: GBrain self-host is single-digit dollars per month for an active personal brain. Hindsight self-host is similar; Hindsight Cloud has a free tier plus usage-based paid plans (see Vectorize pricing). Mem0 and Zep cloud plans scale steeply with usage and number of users.

Use Case Fit

| If you want… | Pick |
|---|---|
| A markdown-first knowledge base you author skills for and own end-to-end | GBrain |
| To run OpenClaw or Hermes Agent with operator-authored patterns | GBrain |
| Memory infrastructure for an agent product serving end users | Hindsight |
| Drop-in integration with Claude Code, Cursor, CrewAI, LangGraph, n8n, etc. | Hindsight |
| Beliefs that auto-reconcile contradictions across thousands of memories | Hindsight |
| Multi-hop graph traversal AND temporal reasoning at retrieval | Hindsight |
| SOTA-grade long-horizon retention without authoring schema or skills | Hindsight |
| A managed memory layer with one-click OAuth 2.1 MCP onboarding | Hindsight (cloud) |
| The most established commercial managed service with broad SDK language coverage | Mem0 |
| Conversational AI where temporal-graph context shifts matter most | Zep |
| Enterprise compliance posture for conversational user-scoped memory | Zep |

Decision Framework

The honest framing of the four systems:

  • GBrain is a personal brain for one operator. It rewards the operator who wants to author skill workflows and own markdown source-of-truth. It is structurally a different product class from Mem0 / Zep / Hindsight, which are agent memory platforms.
  • Hindsight is agent memory infrastructure. It rewards teams that want memory the agent calls (retain, recall, reflect) without operator-authored patterns, with 25+ integrations across the categories agents actually run in.
  • Mem0 is the established commercial managed service. It rewards teams that want broad SDK coverage and the lowest operational ceiling, accepting weaker retrieval (single-strategy) as the trade-off.
  • Zep is conversational-first temporal memory. It rewards teams whose primary axis is "what did we discuss when," accepting narrower scope as the trade-off.

If you're shopping in this category, the question to start with isn't "which has the best benchmark" — it's "which audience am I." For most teams looking at GBrain after the YC-CEO launch buzz, the honest answer is that you wanted memory infrastructure rather than a personal brain, in which case Hindsight is the closer fit.

For deeper head-to-head: see GBrain vs Hindsight, Hindsight vs Mem0, and Hindsight vs Zep.

Frequently Asked Questions

Which one has the best benchmark scores? Different systems publish different benchmarks, which makes apples-to-apples hard. Hindsight holds SOTA on BEAM at 10M tokens (64.1%, with the next-best at 40.6%). GBrain's published BrainBench numbers (P@5 49.1%, R@5 97.9% on a 240-page corpus) are internal evaluations not directly comparable to academic benchmarks. Mem0 and Zep have published LongMemEval results that vary by methodology (model used, retrieval mode, top-k). The most defensible read: Hindsight is the only one with SOTA on a long-horizon benchmark.

Can I use GBrain with Mem0 or Hindsight? GBrain is structurally a different product (markdown brain, operator-authored skills), so they don't directly conflict. Some teams use GBrain as a personal markdown brain on top of an agent that also uses Hindsight or Mem0 as its production memory layer. Different layers, different jobs.

Is the YC-CEO factor inflating GBrain? A bit, yes. GBrain has more visibility than its young version number would normally command. The architecture is genuinely thoughtful and the BrainBench numbers are honest, but the brand halo is doing some of the work. Strip the YC-CEO branding and GBrain would still be one of the better personal-brain projects in the category; with it, it's also benefiting from outsized attention.

What about Letta, Cognee, SuperMemory, MemPalace? Out of scope for this four-way comparison, but covered in GBrain alternatives (Letta, Cognee, SuperMemory) and MemPalace alternatives. The four-way here covers the systems most likely to come up when an evaluator is comparing GBrain against established production memory layers.

Are any of these vendor-lock-in risky? GBrain and Hindsight are both MIT-licensed with full self-host paths, so vendor lock-in is minimal — your data is in Postgres and (for GBrain) markdown files you own. Mem0 and Zep cloud are the typical SaaS lock-in story; data is in their cloud, governed by their export tooling. Self-host options for both exist but are less polished than the cloud paths.

Bottom Line


GBrain, Hindsight, Mem0, and Zep all solve real problems for real audiences. The mistake is assuming one of them is "the best" memory system — they're optimized for different operators, different stacks, and different definitions of memory. Pick by which audience you actually fit. If you started here because you saw the GBrain launch and wondered what production memory infrastructure looks like, evaluate Hindsight — observations and mental models maintained automatically across 25+ integrations is the closest production answer to what GBrain does for personal brains.