AI Memory Poisoning: How Attacks Corrupt Agent Memory

AI Memory Poisoning: How Attacks Corrupt Agent Memory

A prompt injection lasts one session. A memory poisoning attack lasts forever.

That sentence is the entire structural argument of this article — and the reason the security community formalized Memory and Context Poisoning as OWASP ASI06 in the 2026 Agentic AI Top 10. Prompt injection has been a known LLM threat since 2023. Memory poisoning is the version that doesn't reset when the conversation ends. Research from this year shows attack success rates of 80%, 95%, and even 99.8% against LLM-based agent implementations. The Agent Security Bench reports a highest average attack success rate of 84.30%, with limited effectiveness shown in current defenses. The threat is real and measurable.

AI memory poisoning is a persistent attack against agent memory: an attacker writes malicious content into an agent's long-term memory so that the agent acts on the poisoned content in future sessions. Unlike prompt injection, which resets between sessions, memory poisoning persists across every subsequent interaction — the attack and its effect are temporally decoupled.

This article explains what memory poisoning is, walks through the named attacks (MINJA, AgentPoison, Sleeper Memory, Memory Control Flow), maps the OWASP ASI06 control set to the defense layers that actually work, and shows what a shipping defense looks like using Hindsight's Memory Defense as the concrete worked example. By the end you'll know what to ask your memory vendor and which architectural choices matter.

What Is AI Memory Poisoning?

AI memory poisoning is the formal name for a class of attacks where adversarial content is written into an AI agent's persistent memory so that the agent retrieves and acts on the malicious content in future sessions. The attack succeeds when the agent treats the poisoned memory as legitimate context during decision-making.

The structural difference from prompt injection matters. Prompt injection is session-scoped: malicious instructions in the current prompt make the model behave wrongly for that session, and the effect ends when the session ends. Memory poisoning is persistent: malicious content written to the memory layer makes the agent behave wrongly across every subsequent interaction. The attack and its effect are temporally decoupled — an attacker can write today and the agent acts wrongly months later.

OWASP recognized the distinction in 2026 by adding Memory and Context Poisoning as ASI06 to its Agentic AI Top 10. The classification exists because the controls that defend against prompt injection (input moderation, output filtering, session-bounded monitoring) don't catch the persistent-state attack surface. ASI06 is about the agent system's memory layer; LLM01 is about the model's input/output. Both apply to any agentic system.

Why is memory poisoning becoming a focus now rather than two years ago? Three reasons. First, persistent memory has moved from "optional feature" to default architecture pattern — frameworks like Hindsight, Mem0, Zep, Letta, and Cognee have made it standard, which means the attack surface is widespread. Second, the auto-extraction pipelines that turn agent traces into stored observations accept content at face value; there's no native trust scoring on most memory layers. Third, real production-relevant evidence is mounting. Microsoft Security identified 50 distinct prompt-based attempts to influence AI assistant memory across 31 companies in more than a dozen industries over 60 days, and Palo Alto Networks Unit 42 published a proof of concept demonstrating how indirect prompt injection can silently poison the long-term memory of an AI agent via session-summarization manipulation.

If you're new to the underlying primitive at risk, see what is agent memory for the foundational concepts.

How Memory Poisoning Attacks Work

The general pattern is simple. An attacker submits content — via prompt, document, retrieved page, or agent interaction — that's crafted to get persisted into memory. The memory layer stores it, often via automatic extraction from agent traces. Future sessions retrieve the poisoned content as legitimate context and the agent acts on it.

That's the shape of every memory poisoning attack. The specifics vary by attack family.

MINJA — Memory INJection Attack

The MINJA paper (arXiv 2503.03704) demonstrates that an attacker doesn't need privileged access to inject poisoned memories. Query-only interaction is sufficient. The attack uses three techniques:

  • Indication prompts appended to benign queries to induce the agent to generate target reasoning steps
  • Bridging steps that logically connect the benign query to the desired malicious reasoning
  • Progressive shortening that gradually removes the indication prompts, leaving malicious records with plausible benign queries that get retrieved when a victim user submits a similar query later

Under the paper's benchmark conditions across LLM-based agent implementations (EHRAgent on MIMIC-III, RAP on Webshop, MMLU), MINJA reports an injection success rate exceeding 95%. The attacker never touched privileged storage — they just held conversations.

AgentPoison

A backdoor-style attack against RAG-based agents, published at NeurIPS 2024 (arXiv 2407.12784). AgentPoison crafts poisoned documents that, once retrieved, steer the agent toward attacker-chosen behaviors. The published numbers are striking: ≥80% attack success rate at less than 0.1% poison rate, with less than 1% impact on benign queries and no model retraining required. A handful of poisoned documents in a corpus of millions is enough.

This is the attack pattern that makes "we have a large vector store" a worse defense, not a better one — the poison rate scales with corpus size, but the attack rate doesn't degrade with it.

Sleeper Memory Poisoning

The "Hidden in Memory" paper (arXiv 2605.15338) describes time-bomb attacks: poisoned memories that remain dormant and only activate when specific conditions accumulate across multiple interactions. The attack works even on agents without explicit memory modules — accumulated context across conversations creates the persistence substrate.

The numbers: 99.8% poisoned-memory acceptance on GPT-5.5; 95% on Kimi-K2.6. Among successful retrievals, poisoned memories cause attacker-intended agentic actions in 60–89% of evaluations across models.

Sleeper attacks are particularly hard to defend because there's no immediate behavioral anomaly. The poisoned content sits quietly in memory until the trigger condition fires.

Memory Control Flow Attacks

The "From Storage to Steering" paper (arXiv 2603.15125) shows storage-layer manipulations that steer future agent reasoning. The paper's central finding is that memory retrieval can dominate the control flow, forcing unintended tool usage even against explicit user instructions. Tested on GPT-5 mini, Claude Sonnet 4.5, and Gemini 2.5 Flash, over 90% of trials proved vulnerable even under strict safety constraints. These attacks bypass many session-level defenses because they don't rely on direct prompt content — they manipulate the metadata, indexes, and retrieval scoring that determine which memories surface when.

Why Attack Success Rates Are So High

The common thread across these attacks: most memory layers were designed for utility, not adversarial robustness. Specifically:

  • Auto-extraction pipelines accept agent traces at face value
  • No native trust scoring on incoming observations
  • Retrieval ranks by semantic relevance — a poisoned memory that's more relevant to a query surfaces ahead of legitimate memories
  • The OWASP ASI06 control layers (input moderation, sanitization, trust-aware retrieval) are not standard

This is why the Agent Security Bench reports a highest average attack success rate of 84.30%, with limited effectiveness shown in current defenses across 27 attack/defense method combinations. Single-layer defenses show limited effectiveness in isolation. The industry's defense baseline is weak.

Real-World Risks

The Microsoft Security case study documented recommendation-system poisoning attempts at scale (50 distinct examples from 31 companies across more than a dozen industries over 60 days). Other realistic scenarios:

  • Healthcare / EHR agents: a poisoned patient identifier corrupts record retrieval, causing wrong-patient responses
  • Customer support agents: a poisoned customer context corrupts responses across the support org, including for users the attacker never targeted directly
  • Coding agents: a poisoned convention memory misleads future PR reviews, propagating insecure patterns
  • Recommendation systems: poisoned engagement signals steer recommendations toward attacker-chosen content

The structural pattern is the same: write once, persist forever, affect every future session in scope.

Memory Poisoning vs Prompt Injection

The most common confusion in security review discussions. Both are real; they're not substitutes for each other.

DimensionPrompt InjectionMemory Poisoning
PersistenceSession-scoped (resets when session ends)Persists indefinitely
Attack/effect timingImmediate, same sessionTemporally decoupled
OWASP classificationLLM01ASI06
Required infrastructureAny LLM with prompt inputAgent with persistent memory layer
Detection difficultySession-bounded log analysisCross-session pattern correlation
Defense layerInput moderation, output filteringWrite-time screening + consolidation + scope isolation
Blast radiusSingle session, single userEvery future session for affected scope
RecoveryEnd sessionAudit, purge, possibly rollback to snapshot

Prompt injection is the higher-volume attack — more attempts per day, easier to automate. Memory poisoning is the higher-impact-per-success attack — one successful injection persists for the lifetime of the memory layer. Defense in depth requires addressing both, and the layers don't overlap.

A subtle and important point: memory poisoning often uses prompt injection as the seed. MINJA is the canonical example — the attacker uses query-only prompt interactions to plant content that the agent then persists. Defending only against prompt injection without defending the memory write path leaves the seeding chain intact. The dedicated memory poisoning vs prompt injection article walks through this interaction in depth.

The OWASP ASI06 Control Set

OWASP's Agentic AI Top 10 lists Memory and Context Poisoning as ASI06 because it represents a distinct architectural attack surface not covered by the LLM Top 10. The OWASP-recommended controls cluster into five layers:

  1. Input moderation with trust scoring — screen the write path before content lands in memory
  2. Memory sanitization with provenance tracking — tag every observation with source metadata
  3. Trust-aware retrieval — rerank by source trust during query, not just semantic relevance
  4. Behavioral monitoring — detect agents acting on beliefs they shouldn't have learned
  5. Forensic capabilities — snapshots and rollback for incident recovery

For the standards-aligned definition of ASI06 itself, see OWASP ASI06: Memory and Context Poisoning explained. No single layer suffices. The Agent Security Bench result — 84.30% highest average attack success against current defenses — is the empirical case for defense in depth. Layered correctly, the layers don't add linearly; they multiply, because an attacker has to defeat each one in sequence.

The architecture-level question for any memory layer is: which of these layers does it implement, and which does it leave to you? Vendors that hand-wave on this question are not ready for ASI06.

OWASP Agent Memory Guard: The Reference Implementation

Before walking through Hindsight's defense, it's worth covering the OWASP project that ships alongside the ASI06 classification.

OWASP Agent Memory Guard is the OWASP-sanctioned reference implementation for ASI06. Released in mid-2026, it's an open-source runtime defense layer that sits between an agent and its memory store, screening every memory read and write through a pipeline of detectors with a YAML-driven policy. Four dispositions: allow, redact, quarantine, block.

Built-in detectors cover:

  • Prompt injection markers — instruction-override attempts
  • Secret and PII leakage — credential and personal data patterns
  • Protected-key modifications — tampering with immutable memory keys
  • Size anomalies — oversized payloads
  • SHA-256 cryptographic baselines — out-of-band tamper detection for memory integrity

It also ships forensic snapshots with rollback to known-good states — when poisoning is detected, you can restore memory rather than manually auditing every entry.

The developer-adoption pattern is striking. Within days of release, framework-feature-request issues appeared on Mem0, Letta, CrewAI, agno, Vercel AI SDK, and FlowiseAI asking for Agent Memory Guard integration. The term has moved from "newly released" to "table-stakes for memory framework procurement" in security-conscious buyer conversations.

For Hindsight users, the relevant question isn't "do we use Agent Memory Guard?" — it's "how does Hindsight's native Memory Defense relate to the OWASP reference implementation?" The next section covers Memory Defense as the worked example of write-path screening; the section after that maps both side by side honestly.

A Concrete Defense Example: Hindsight Memory Defense

Hindsight ships Memory Defense — a per-bank screening feature that maps directly onto the OWASP ASI06 input and sanitization control layers. It comes in two tiers; the difference between them is the architectural answer to "what scale of defense does my deployment need?"

Basic (OSS) Tier

A single sensitive_data detector running a 44-pattern OWASP-aligned regex set. Coverage spans the canonical credential and PII formats:

  • AI/LLM provider keys: Anthropic, OpenAI (project + admin), Google API + OAuth, xAI, Groq, HuggingFace, Replicate, Perplexity, Databricks
  • Cloud credentials: AWS access keys + session tokens, DigitalOcean tokens
  • Source-control tokens: GitHub (PAT, app, user, refresh, OAuth, fine-grained), GitLab, NPM, PyPI
  • Payment processor secrets: Stripe (live, test, restricted), Square, Braintree
  • Communications: Slack tokens + webhooks, Twilio API + SID, SendGrid, Mailgun, Discord, Telegram
  • Commerce: Shopify access tokens
  • Database connection strings: PostgreSQL, MySQL, MongoDB (with embedded credentials)
  • Cryptographic material: PEM private key blocks, JWTs
  • US PII: credit cards, US SSNs

Per-bank policy:

{
  "memory_defense": {
    "enabled": true,
    "rules": [
      { "on": "sensitive_data", "action": "redact" }
    ]
  }
}

Action: redact (matches replaced with [REDACTED:type] markers; scrubbed memory stored). The block action is accepted but downgraded to redact in OSS.

Honest scope for Basic: this addresses credential hygiene. It catches the canonical secret-format leakage path — agents persisting API keys, tokens, and PII into memory where they shouldn't. It does not catch semantic memory poisoning. A MINJA-style attack that plants plausible-sounding malicious instructions (no secret patterns in the payload) won't be caught by regex. For that, the consolidation layer or the Enterprise tier are the answer.

Cloud Enterprise Tier

A 7-stage screen pipeline run in fixed order, each stage gated by per-org entitlement flags:

  1. base64_decode — expand base64-encoded payloads so downstream detectors see the underlying content. Defeats encoding-based smuggling.
  2. detect_secrets — 220-pattern provider catalog (detect-secrets 1.5.0 + GitLeaks + Hindsight-native). A CI test (test_total_pattern_count_meets_enterprise_bar) locks a 200-pattern minimum.
  3. llm_screen — LLM-based detection of credentials embedded in conversational prose. This catches what regex misses — credentials phrased naturally rather than in canonical token formats.
  4. sensitive_data — the 44-pattern OWASP set (same as Basic).
  5. prompt_injection — instruction-override / jailbreak detection. This is the layer that directly addresses MINJA-style semantic poisoning by detecting and blocking the attack at the write path rather than letting it reach memory.
  6. size_anomaly — oversized payload detection (default 200 KB threshold). Defeats blob-stuffing.
  7. protected_keys — immutable tag namespace enforcement. Rejects resubmits that try to overwrite protected memory metadata.

Per-bank policy with multiple detectors:

{
  "memory_defense": {
    "enabled": true,
    "rules": [
      { "on": "detect_secrets", "action": "redact" },
      { "on": "llm_screen", "action": "redact" },
      { "on": "sensitive_data", "action": "redact" },
      { "on": "prompt_injection", "action": "block" },
      { "on": "size_anomaly", "action": "block" },
      { "on": "protected_keys", "action": "block" }
    ]
  }
}

Real block enforcement on Enterprise. Block drops the item; if every item in a retain call is blocked, the call returns 422. A policy that references an unentitled detector returns HTTP 400 with the offending detector names — fails closed, not silently.

Audit trail: the security_events table writes one row per non-ALLOW decision. Each row captures detector, action, severity, source class, redacted-identifiable fingerprint (e.g., ghp_AAAA...BBBB — never plaintext), and the submitting API key name for attribution. This is the audit substrate buyers need for compliance reviews and post-incident forensics.

Webhook: memory_defense.violation fires HMAC-SHA256 signed with 24-hour retry/backoff. Direct integration recipes for Splunk, Datadog, Slack, and PagerDuty.

How Memory Defense Maps to OWASP ASI06

OWASP ASI06 ControlMemory Defense Stage
Input moderationbase64_decode, prompt_injection, size_anomaly
Memory sanitizationdetect_secrets, sensitive_data, llm_screen
Integrity controlsprotected_keys
Audit / monitoringsecurity_events + memory_defense.violation webhook
Forensic capabilityRedacted-identifiable fingerprints + event retention

The mapping isn't complete on its own — trust-aware retrieval and cross-time consolidation are separate layers that Memory Defense's write-path screening doesn't cover. Hindsight's consolidation layer (auto-consolidating observations + refreshing mental models) addresses contradiction reconciliation for poisoned content that gets past screening. Together, the layered architecture covers the ASI06 control set more completely than any single feature could.

Honest Scope Statement

Even with the full Enterprise 7-stage pipeline, Memory Defense covers the write path. It does not cover:

  • Retrieval-time injection in already-stored content. Memory Defense is not retroactive; content stored before a policy was added isn't re-scanned. Retrieval-side reranking and output guardrails address this.
  • Semantic attacks sophisticated enough to evade llm_screen. The llm_screen detector significantly raises the bar for semantic memory poisoning but isn't perfect. The consolidation layer remains the second line of defense.
  • Application-layer prompt construction errors. If an agent's prompt template inlines user input into LLM calls without retrieval-time filtering, Memory Defense doesn't see that path.
  • Long-horizon goal hijacks that don't depend on memory writes. Out of scope for any memory layer.

For the full defense-in-depth walkthrough across all five OWASP control layers, see how to prevent AI memory poisoning.

OWASP Agent Memory Guard vs Hindsight Memory Defense

Both are ASI06 implementations, and they're complementary rather than substitutes. Here's the honest side-by-side.

DimensionOWASP Agent Memory GuardHindsight Memory Defense
Project statusOWASP reference implementationHindsight production feature (two-tier)
Screening scopeMemory reads AND writesMemory writes (retain calls)
Policy formatYAMLJSON per-bank
Dispositionsallow / redact / quarantine / blockallow / redact / block (quarantine dropped)
Cryptographic integrity (SHA-256 baselines)YesNot in current scope
Forensic snapshots + rollbackYesNot in current scope
Drop-in middleware for major frameworksYesNative to Hindsight
LLM-based credential detection in proseNot in current scopeYes (Enterprise llm_screen)
Pattern catalog sizeBuilt-in set44 OSS / 220 Enterprise (detect-secrets + GitLeaks + Hindsight-native)
Enterprise tier with entitlement gatingSingle open-source tierOSS Basic + Cloud Enterprise
SIEM webhook with HMAC signingStandard loggingHMAC-SHA256 + Splunk / Datadog / Slack / PagerDuty recipes
LicenseOWASP open sourceHindsight MIT license (Basic); Cloud (Enterprise)

Where Agent Memory Guard is stronger:

  • Read-time screening — Memory Defense is write-time only. If retrieval-side defense is critical to your threat model, Agent Memory Guard covers a layer Hindsight doesn't natively today.
  • SHA-256 baselines and snapshot rollback — Memory Defense doesn't ship cryptographic integrity verification or memory rollback. For incident response that requires restoring a known-good state, Agent Memory Guard is the answer.
  • Framework-agnostic middleware — runs alongside Mem0, Letta, CrewAI, agno, and other frameworks without requiring a memory-layer swap.

Where Hindsight Memory Defense is stronger:

  • llm_screen semantic credential detection — catches credentials embedded in conversational prose that pattern-matching misses, which directly addresses MINJA-style seeding.
  • 220-pattern Enterprise catalog — substantially broader credential coverage than the built-in OWASP detector set.
  • HMAC-signed SIEM webhook with platform-specific recipes — Splunk/Datadog/Slack/PagerDuty integrations ship with documented setup, not "wire it up yourself."
  • Per-bank policy granularity — high-trust internal banks stay open while customer-facing banks lock down on the same deployment.

The honest framing: Agent Memory Guard is the standards-aligned baseline for ASI06 controls and the right answer when you need read-side defense, cryptographic integrity, or framework-agnostic middleware. Hindsight Memory Defense is the deeper write-path implementation with semantic credential detection and shipping SIEM audit — the right answer when you've chosen Hindsight as your memory layer and want production-grade write-time screening integrated natively.

For a deep-dive on the comparison including code examples and deployment patterns, see OWASP Agent Memory Guard vs Hindsight Memory Defense.

What To Ask Memory Vendors About ASI06

A buyer's checklist for procurement reviews. Lift this verbatim:

  1. Do you screen the write path? Or does your platform accept any retain payload at face value?
  2. What's your credential pattern coverage? A short canonical list, or a broad provider catalog with regular updates?
  3. Do you do LLM-based detection of credentials in prose? Or regex-only?
  4. Do you detect prompt-injection / jailbreak instructions at write time? Or rely entirely on consolidation downstream?
  5. What's your block action actually do? Real enforcement, or do "blocked" items silently get downgraded to redaction?
  6. Where's the audit trail? SIEM-ready event schema, or just internal logging?
  7. Is policy per-bank or global? Per-bank lets high-trust internal banks stay open while customer-facing banks lock down.
  8. What's your scope model? Can poisoning at user_id=victim automatically promote to org_id?
  9. Self-hosted or shared-tenant? Shared-tenant memory platforms have cross-tenant attack surfaces that self-hosted deployments don't.
  10. Is your defense code reviewable? The recent long-term memory security survey argues that "verifiable, recoverable governance" is itself a structural security property — closed-source platforms ask you to trust unverifiable claims.

A vendor that can answer these questions concretely is ASI06-credible. A vendor that hedges on most of them is not.

Conclusion

Memory poisoning is structurally different from prompt injection. The persistence dimension changes everything — defense in depth requires layers that session-level controls don't provide. OWASP's ASI06 classification reflects the consensus that agentic systems need a separate control set for the memory layer.

Three things to remember:

  1. Memory poisoning is the persistent attack vector against agent memory. One successful injection affects every future session until it's audited and purged. Volume is lower than prompt injection; per-success impact is dramatically higher.
  2. Architecture choices in your memory layer determine which attack classes you can defend. Write-path screening, source provenance, scope isolation, consolidation, and audit trails are architectural properties — you either have them or you don't.
  3. No single layer suffices. Single-layer defenses show limited effectiveness against the Agent Security Bench attack set. Layered correctly across the OWASP ASI06 control set, the math changes — an attacker has to defeat each layer in sequence.

Further Reading

FAQ

What is AI memory poisoning? AI memory poisoning is a persistent attack against agent memory: an attacker writes malicious content into an agent's long-term memory so that the agent acts on the poisoned content in future sessions. Unlike prompt injection, which resets between sessions, memory poisoning persists across every subsequent interaction. OWASP classifies it as ASI06 in the Agentic AI Top 10.

How does memory poisoning differ from prompt injection? Prompt injection is session-scoped — malicious instructions in the current prompt affect that session only and reset when it ends. Memory poisoning is persistent — malicious content written to the agent's memory layer affects every future session until the poisoned memory is detected and purged. The two attacks often interact: memory poisoning frequently uses prompt injection as the seeding mechanism (MINJA is the canonical example).

What is OWASP ASI06? OWASP added Memory and Context Poisoning to its 2026 Agentic AI Top 10 as ASI06. The classification recognizes that agentic systems have a distinct persistent-state attack surface not covered by the LLM Top 10 controls. ASI06 prescribes five defense layers: input moderation, memory sanitization with provenance, trust-aware retrieval, behavioral monitoring, and forensic capabilities.

What are the MINJA and AgentPoison attacks? MINJA (Memory INJection Attack — arXiv 2503.03704) demonstrates that an attacker can inject poisoned memories through query-only interaction, with reported injection success rates over 95% under the paper's benchmark conditions across LLM-based agent implementations. AgentPoison (NeurIPS 2024, arXiv 2407.12784) is a backdoor-style attack on RAG-based agents with at least 80% attack success at less than 0.1% poison rate and less than 1% benign impact, with no model retraining required. Both attacks bypass session-level defenses by exploiting the memory write path.

Can self-hosted memory layers be poisoned? Yes — self-hosting removes shared multi-tenant attack surfaces but doesn't address the application-layer attacks (prompt-injection-seeded memory writes, indirect prompt injection in retrieved content, sleeper memory). Self-hosting is a security property; it's not a complete defense.

How do you detect memory poisoning? Detection requires cross-session pattern correlation, persistence-aware monitoring, and write-path audit trails. The OWASP ASI06 control set calls for security_events-style audit attribution, behavioral monitoring of agent decisions, and forensic snapshots that allow rollback to a known-good state when poisoning is identified. Detection downstream of poisoning is harder than prevention upstream.

Does Hindsight protect against memory poisoning? Hindsight ships Memory Defense in two tiers. Basic (OSS) addresses credential hygiene via a 44-pattern regex sensitive_data detector. Cloud Enterprise extends to a 7-stage pipeline including llm_screen (semantic credential detection), prompt_injection (MINJA-style detection at the write path), and additional integrity controls, plus a security_events audit trail and HMAC-signed webhooks for SIEM integration. Both tiers are write-path defenses; cross-time defense lives in Hindsight's consolidation layer. Defense in depth across the full ASI06 control set is still required.

What's the difference between Hindsight Memory Defense Basic and Enterprise? Basic ships the 44-pattern sensitive_data regex detector with redact action — credential hygiene only. Enterprise ships the full 7-stage pipeline (base64_decode + detect_secrets 220-pattern catalog + llm_screen + sensitive_data + prompt_injection + size_anomaly + protected_keys), real block enforcement, security_events audit trail, and the memory_defense.violation HMAC-signed webhook with SIEM integration recipes for Splunk, Datadog, Slack, and PagerDuty.

How fast can memory poisoning take effect? Once a poisoned memory is stored, it's available for retrieval immediately. The earliest measurable effect happens on the next agent query that surfaces the poisoned content. Sleeper memory attacks (arXiv 2605.15338) delay activation until trigger conditions accumulate, which can be hours, days, or weeks. The temporal decoupling between attack and effect is what makes detection harder than for prompt injection.

Are existing memories scanned when defense policies are added? No. Memory Defense is not retroactive — adding or changing a policy only affects future retain calls. To clean a bank that already contains unredacted or potentially poisoned content, re-ingest the affected memories or remove them manually. Build defense policies into the bank from creation when possible.

What is OWASP Agent Memory Guard? OWASP Agent Memory Guard is the OWASP-sanctioned reference implementation for ASI06 (Memory and Context Poisoning). It's an open-source runtime defense layer that screens every memory read and write through a pipeline of detectors with a YAML policy supporting allow / redact / quarantine / block dispositions. Built-in detectors cover prompt injection markers, secret and PII leakage, protected-key modifications, and size anomalies, plus SHA-256 cryptographic baselines for tamper detection and forensic snapshots for rollback to known-good states. Released in mid-2026, the project hosts on the OWASP Foundation site.

What's the difference between OWASP Agent Memory Guard and Hindsight Memory Defense? Both are ASI06 implementations; they're complementary, not substitutes. Agent Memory Guard screens reads and writes, ships SHA-256 cryptographic baselines and snapshot rollback, and runs as drop-in middleware across frameworks. Hindsight Memory Defense focuses on the write path, ships an Enterprise tier with llm_screen (semantic credential detection in prose) and a 220-pattern catalog, and includes HMAC-signed SIEM webhooks with per-platform setup recipes. Agent Memory Guard is the standards-aligned baseline; Memory Defense is the deeper write-path implementation integrated natively into Hindsight. The dedicated OWASP Agent Memory Guard vs Hindsight Memory Defense article walks through both side by side.