AI Coding Market Map
Supporting Capabilities

AI Memory Solutions

Libraries and services that give AI agents persistent memory across sessions — so they remember project context, decisions, and entity relationships instead of starting from scratch every time. Without memory infrastructure, agents lose effectiveness on any work that spans more than a single session.

Overview

Category maturity: Emerging. The category has moved past pure experimentation — multiple tools have raised institutional funding, published credible benchmarks, and earned production deployments. That said, the tooling landscape remains fragmented, retrieval architectures have not fully converged, and enterprise procurement paths are nascent for most standalone providers.

Direction of travel: The next 6-12 months will see consolidation around two architectural patterns:

  • Managed temporal knowledge graph services (Zep, Cognee) for applications requiring nuanced relational context
  • Lighter-weight key-value or vector hybrid layers (Mem0, Supermemory) for personalization-focused assistants and coding tools

Platform-level memory from OpenAI, Microsoft, AWS, and GitHub will absorb the simpler end of the market, leaving standalone infrastructure competing on precision, customization, and data residency. Open-source libraries will continue proliferating but face increasing pressure to differentiate on production reliability rather than benchmark scores.

Coalesced patterns: The industry has settled on a workable foundation:

  • Dual-memory architecture: short-term memory as an in-context sliding window or FIFO buffer, long-term memory as an external store (graph, vector, or SQL)
  • LangGraph, LlamaIndex, and most agent frameworks now expose first-class memory interfaces
  • API-first deployment with optional self-hosted open-source versions as the standard pricing model for standalone providers
  • MCP server compatibility increasingly expected as a baseline integration surface for coding agent use cases
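
The dual-memory split described above can be sketched in a few lines of Python. This is a minimal illustration, with a FIFO buffer for short-term memory and a keyword-matched list standing in for a real graph, vector, or SQL long-term backend; all names and sizes here are invented:

```python
from collections import deque

class DualMemory:
    """Illustrative dual-memory layer: FIFO short-term buffer plus a
    long-term store. Toy code, not any listed tool's implementation."""

    def __init__(self, short_term_capacity=4):
        self.short_term = deque()       # recent turns kept in context
        self.capacity = short_term_capacity
        self.long_term = []             # stand-in for an external store

    def add_turn(self, text):
        self.short_term.append(text)
        # On overflow, flush the oldest turn to the long-term store.
        if len(self.short_term) > self.capacity:
            self.long_term.append(self.short_term.popleft())

    def recall(self, query):
        # Naive keyword overlap stands in for semantic retrieval.
        terms = set(query.lower().split())
        return [t for t in self.long_term if terms & set(t.lower().split())]

mem = DualMemory(short_term_capacity=2)
for turn in ["project uses Postgres", "deploy target is Kubernetes",
             "tests run with pytest", "CI is GitHub Actions"]:
    mem.add_turn(turn)

print(list(mem.short_term))               # two most recent turns stay in context
print(mem.recall("which Postgres version"))  # older turn comes back from long-term
```

The same shape holds regardless of backend: the interesting engineering lives in what `recall` does, which is where the tools profiled below differentiate.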

Unsolved problems: Several gaps carry real operational weight:

  • Governance over what gets remembered and forgotten — including right-to-be-forgotten under GDPR and the operational question of stale memory degrading responses over time — is unsolved for most tools
  • Data residency and sovereignty for memory stores remains a trip wire for European and regulated-industry deployments
  • Multi-agent memory sharing, where multiple agents read and write a shared memory store without clobbering each other's context, has no well-established pattern
  • Memory hallucination and retrieval accuracy remain genuinely hard at scale; even the best benchmarked tools show degradation on long-horizon, high-update-frequency scenarios
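
To make the multi-agent clobbering problem concrete, here is a toy Python illustration: two agents read the same key at the same version, and a minimal version check (an optimistic-concurrency guard invented for this sketch, not a pattern drawn from any listed tool) turns a silent lost update into a detectable conflict:

```python
class SharedMemoryStore:
    """Toy shared store illustrating the lost-update problem two agents
    hit when writing the same key. Purely illustrative."""

    def __init__(self):
        self.data = {}  # key -> (value, version)

    def read(self, key):
        return self.data.get(key, (None, 0))

    def write(self, key, value, expected_version):
        _, current = self.data.get(key, (None, 0))
        if current != expected_version:
            return False  # another agent wrote first; caller must re-read and merge
        self.data[key] = (value, current + 1)
        return True

store = SharedMemoryStore()
_, v = store.read("plan")

# Agent A and agent B both read version 0, then both try to write.
assert store.write("plan", "A: refactor auth", v) is True
assert store.write("plan", "B: add tests", v) is False  # stale write rejected, not silently lost
```

Detecting the conflict is the easy part; what "merge" means for two agents' divergent memories is exactly the part with no established answer.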

Recommendations

1. If your coding agents are forgetting context between sessions, deploy Beads or Mem0 now. Beads (git-backed, zero infrastructure) is the right starting point for teams that want the fastest path to persistent coding agent memory with no cloud dependency. Mem0's OpenMemory MCP server is the right choice for cross-tool memory that follows the developer across Claude, Copilot, and VS Code. Both are deployable in hours. The cost of waiting is real: developers already get session memory in GitHub Copilot and VS Code, and the contrast with memoryless agents is becoming hard to ignore.

2. Evaluate AWS AgentCore Memory before committing to a standalone service if your stack is already on Bedrock. Amazon shipped a first-party managed memory service this quarter with IAM controls, Kinesis streaming, and episodic memory. For AWS-resident workloads, this eliminates a third-party dependency and satisfies most enterprise procurement requirements out of the box. A standalone specialist provider (Mem0, Zep, Cognee) still wins when retrieval accuracy on complex, relational, or temporally sensitive queries is the primary requirement.

3. Use temporal knowledge graphs for applications that need to remember evolving facts about entities over time. For customers, projects, tickets, and other entities whose state changes, Zep and Cognee have both validated the temporal knowledge graph pattern in production and published credible benchmarks. Standard vector RAG (approximately 60% accuracy on complex queries) is not adequate for these use cases; graph-enhanced retrieval (approximately 93% accuracy) is. Budget for the higher ingestion overhead and evaluate on a representative sample of your actual query patterns before choosing a provider.
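
Evaluating on a representative sample can be as simple as scoring each candidate behind a common interface. A minimal sketch, where the backends and labeled queries are hypothetical stand-ins for real provider clients and your own query log:

```python
def evaluate_retriever(retrieve, labeled_queries):
    """Score a retriever callable on (query, expected_answer) pairs.
    `retrieve` maps a query string to an answer string; wrap each
    candidate provider's client behind this interface."""
    hits = sum(1 for q, expected in labeled_queries if retrieve(q) == expected)
    return hits / len(labeled_queries)

# Hypothetical stand-ins for two candidate memory backends.
facts = {"current plan for acme": "enterprise", "owner of ticket 42": "dana"}

def backend_a(query):
    return facts.get(query)

def backend_b(query):  # a weaker backend that misses non-plan queries
    return facts.get(query) if "plan" in query else None

sample = [("current plan for acme", "enterprise"), ("owner of ticket 42", "dana")]
print(evaluate_retriever(backend_a, sample))  # 1.0
print(evaluate_retriever(backend_b, sample))  # 0.5
```

The point is the harness, not the toy backends: run the same labeled set against every provider before committing.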

4. Confirm data governance before deploying any managed memory service. Every tool that persists agent memory to a cloud service is, by definition, sending potentially sensitive data off-premises. Before deploying, confirm: what data goes into the memory store, who has access to it, how long it is retained, and whether it is subject to the right-to-be-forgotten requirements applicable to your customers. Zep, Cognee, and Mem0 all offer self-hosted options; AWS AgentCore Memory stays within your AWS account boundary. Start with self-hosted or in-account options and evaluate managed cloud services only after confirming the data handling terms satisfy your compliance requirements.

Trends

1. Temporal knowledge graphs are becoming the dominant retrieval architecture. The memory layer is converging on temporal knowledge graphs as the retrieval pattern of choice for production-grade agents, with Zep's Graphiti, Cognee's ECL pipeline, and several new entrants all landing on graph-first designs. The reason is straightforward: agents need to know not just what is true, but when it became true and whether it has since been superseded. Vector similarity search, the default for early-generation RAG, cannot express these temporal relationships without substantial engineering overhead. The week's benchmark data reinforces the shift: Cognee's graph-enhanced retrieval achieves approximately 93% accuracy on complex queries versus approximately 60% for standard vector RAG, and Zep achieves an 18.5% accuracy improvement on LongMemEval with 90% latency reduction.
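
The "not just what is true, but when" requirement can be made concrete with a small sketch. This is a toy temporal fact store with validity windows, invented for illustration; it is not Zep's or Cognee's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: int                  # simplified integer timestamps
    valid_to: Optional[int] = None   # None = still current

class TemporalFactStore:
    """Minimal sketch: a new fact supersedes (closes) the old one
    rather than overwriting it, so history is queryable."""

    def __init__(self):
        self.facts = []

    def assert_fact(self, subject, predicate, value, at):
        for f in self.facts:
            if f.subject == subject and f.predicate == predicate and f.valid_to is None:
                f.valid_to = at  # supersede, preserving the old validity window
        self.facts.append(Fact(subject, predicate, value, valid_from=at))

    def value_at(self, subject, predicate, at):
        for f in self.facts:
            if (f.subject, f.predicate) == (subject, predicate) \
               and f.valid_from <= at and (f.valid_to is None or at < f.valid_to):
                return f.value
        return None

store = TemporalFactStore()
store.assert_fact("acme", "plan", "free", at=1)
store.assert_fact("acme", "plan", "enterprise", at=5)

print(store.value_at("acme", "plan", at=3))  # "free" — what was true then
print(store.value_at("acme", "plan", at=9))  # "enterprise" — what is true now
```

A plain vector index has no natural place to hang the `valid_from`/`valid_to` window, which is the engineering-overhead point made above.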

2. The coding agent memory category is accelerating rapidly. Three distinct developments this week signal that memory is becoming table-stakes infrastructure for coding agents specifically. GitHub Copilot's cross-agent memory system (public preview, January 2026) gives Copilot the ability to carry context across coding, CLI, and code review sessions. Beads, Steve Yegge's git-backed memory system, has gained significant community traction with documentation, plugins, and integrations appearing quickly after launch. VS Code's February 2026 release added native agent memory spanning sessions. These converging signals suggest that coding agents without persistent memory will be treated as deficient by developers within two to three quarters.

3. Platform-level memory is commoditizing the conversational-assistant use case. OpenAI, Microsoft, and GitHub all shipped or broadly enabled memory features in Q1 2026 that are enterprise-capable and carry meaningful compliance infrastructure. Microsoft Copilot Memory reached general availability in January 2026 with Microsoft Purview eDiscovery integration. OpenAI's project-scoped enterprise memory keeps sensitive data within project boundaries. For engineering organizations primarily running AI assistants rather than autonomous agents, these platform features may satisfy the use case without additional infrastructure.

4. Benchmark proliferation is creating signal alongside noise. Multiple vendors now publish retrieval accuracy claims against LongMemEval, LoCoMo, and the DMR benchmark. Supermemory claims approximately 99% on LongMemEval_s using agentic retrieval flows; Zep publishes 94.8% on DMR; Memori reports 81.95% on LoCoMo. These benchmarks are useful for directional comparison, but each tests a different memory scenario and none fully replicates production agent workloads. The right interpretation: any tool scoring above 80% on LongMemEval is demonstrably competitive; below 70% warrants scrutiny before production deployment.

5. AWS entered the managed memory market with enterprise-grade controls. Amazon Bedrock AgentCore Memory is now a first-party managed service with streaming notifications for long-term memory updates (announced March 12, 2026), episodic memory strategies, and native IAM/policy controls. This raises the bar for standalone memory services competing for AWS-resident workloads. Teams already on Bedrock now have a low-friction memory option without adding a third-party dependency, which accelerates the question for portfolio companies: is the memory quality improvement from a specialized layer worth the integration overhead against a managed platform option?


Tools

Mem0

  • Maker: Mem0.ai (YC-backed; $24M Series A from YC, Peak XV, and Basis Set, October 2025)
  • Memory model: Hybrid vector and graph. Extracts discrete "memory facts" from conversations, stores them in a vector index for semantic retrieval, with an optional graph layer adding entity relationship links when richer relational context is needed.
  • Access: Managed API (mem0.ai) with a free tier and paid plans scaling by API calls. Open-source self-hosted version available at github.com/mem0ai/mem0 under Apache 2.0. Approximate cost at scale is not publicly disclosed beyond the free tier, but self-hosting is supported for cost-sensitive deployments.
  • Integration surface: Native LangChain and LlamaIndex integrations. MCP server available (mem0-mcp). Direct REST API and Python/TypeScript SDKs. OpenAI, Anthropic, and most major LLM providers supported as the underlying model.
  • Strengths:
    • Production-scale retrieval performance: 26% relative accuracy improvement over OpenAI memory on the LoCoMo benchmark, with 91% lower p95 latency and 90% fewer tokens consumed per query.
    • Dual deployment model (managed API or self-hosted open-source) provides flexibility for teams that need data residency controls without forgoing managed infrastructure.
    • Optional graph memory layer adds relational context for entity-rich applications without requiring a full graph database deployment.
    • Active release cadence: v1.0.8 shipped March 26, 2026 with new vector provider integrations and LLM provider additions.
  • Limitations:
    • Graph memory is most valuable when entity relationships are dense; simpler conversational memory applications can use vector-only mode and achieve strong results without the added architectural complexity.
    • The managed API is the highest-fidelity path, so teams requiring complete on-premises data residency should plan for additional engineering effort to operationalize the self-hosted version at production scale.
    • Benchmark results are published on LoCoMo; teams should validate performance on their own workload patterns before committing to production, as benchmark scenarios may not reflect their specific memory update frequency.
  • Enterprise readiness: Developing. Managed API supports data isolation; self-hosted option supports data residency. Access controls and audit logging are available but enterprise SLA documentation is not yet publicly prominent.
  • Best for: Personalization-focused AI assistants and agents where per-user memory extraction from conversational history is the primary use case, especially when integration with LangChain or LlamaIndex is already in place.
  • This week: v1.0.8 released March 26, 2026 with Turbopuffer vector database provider and MiniMax LLM provider support.
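
The extract-then-retrieve pattern described in the memory model above can be sketched without Mem0's SDK. The heuristics below (filtering out questions as "fact extraction", bag-of-words overlap as "vector similarity") are crude invented stand-ins for the LLM-driven steps, kept only to show the shape of the pipeline:

```python
def extract_facts(conversation):
    """Toy fact extraction: keep declarative lines, drop questions.
    Stands in for the LLM extraction step; illustrative only."""
    return [line.strip() for line in conversation.splitlines()
            if line.strip() and not line.strip().endswith("?")]

def score(fact, query):
    """Jaccard word overlap as a stand-in for vector similarity."""
    f, q = set(fact.lower().split()), set(query.lower().split())
    return len(f & q) / len(f | q)

memory = extract_facts("""
Alice prefers TypeScript for frontend work
What framework should we use?
The billing service owner is Alice
""")

best = max(memory, key=lambda fact: score(fact, "who owns the billing service"))
print(best)  # the stored fact most relevant to the query
```

In the real system the extraction and scoring are model-driven, and the optional graph layer links the extracted entities (Alice, the billing service) for relational queries.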

Zep

  • Maker: Getzep, Inc.
  • Memory model: Temporal knowledge graph, powered by Graphiti (now open-sourced separately). Each fact in the graph carries a validity window: when it became true and when, if ever, it was superseded. Ingests unstructured conversational data and structured business data into a unified context graph that stays current as underlying data changes.
  • Access: Managed API (getzep.com) with paid plans. Graphiti is open-sourced at github.com/getzep/graphiti. Zep's main repo (getzep/zep) provides examples and integrations; the core managed service is closed-source.
  • Integration surface: LangChain and LlamaIndex native integrations. Direct API. The Graphiti engine can be used independently for teams that want the temporal graph capability without the full Zep managed service.
  • Strengths:
    • Temporal knowledge graph architecture is among the most rigorous approaches to memory in production, handling fact updates and contradictions as first-class problems rather than edge cases.
    • Strong benchmark performance: 94.8% on the DMR benchmark (versus 93.4% for MemGPT), 18.5% accuracy improvement on LongMemEval with 90% latency reduction.
    • arXiv paper (2501.13956) documents the architecture with academic rigor, supporting enterprise procurement conversations that require technical diligence.
    • Graphiti open-source release extends the reach of the technology to teams that want the temporal graph engine without the full managed service.
  • Limitations:
    • Temporal graph construction carries higher ingestion overhead than vector-only approaches; teams should benchmark ingestion latency against their memory update frequency before deploying at high write volumes.
    • The full managed service is closed-source, so teams requiring audit of the memory processing logic will need to rely on the Graphiti open-source engine and build their own managed layer.
    • The richest use cases require structured as well as unstructured data ingestion; teams with purely conversational memory needs may find the architecture heavier than necessary.
  • Enterprise readiness: Developing. The temporal architecture provides strong foundations for memory governance (knowing when facts were true), but enterprise-facing controls documentation (SLAs, data residency, audit logs) is not publicly prominent.
  • Best for: Complex enterprise agent applications where memory must track evolving facts about entities over time, such as CRM-integrated agents, customer success workflows, and any scenario where "what was true last month" differs from "what is true today."
  • This week: Graphiti open-source release and Neo4j feature coverage are the most notable recent signals; no specific product announcements confirmed in the past 7 days.

Letta (formerly MemGPT)

  • Maker: Letta AI
  • Memory model: Stateful agent platform with explicit, editable memory blocks as first-class components of agent state. Uses an LLM-as-operating-system model: the agent manages its own memory, deciding what to persist to long-term storage and what to keep in immediate context. Memory blocks are developer-accessible and mutable.
  • Access: Open-source at github.com/letta-ai/letta. Managed hosted service available. SDK-based integration.
  • Integration surface: Native Python SDK (letta). REST API. Supports most major LLM providers. Letta Code ships as an integrated coding agent product.
  • Strengths:
    • Memory as explicit, transparent, developer-controlled state is a meaningful differentiator: engineers can inspect, edit, and reason about what an agent remembers rather than treating memory as a black box.
    • Conversations API (January 2026) enables shared memory across parallel agent interactions, directly addressing multi-agent scenarios where agents need to share context.
    • Remote Environments (March 2026) allows agents to persist work and context across different execution contexts (e.g., continuing on mobile what started on desktop).
    • Letta Code demonstrates the memory-first architecture applied directly to coding agent workflows, giving engineering teams a working reference implementation.
  • Limitations:
    • The LLM-as-OS paradigm is architecturally sophisticated and rewards investment in understanding the memory model; teams should read the documentation on memory blocks and state management before deploying.
    • Self-managed memory decisions by the LLM mean that memory quality is correlated with the underlying model's reasoning quality; teams using weaker base models should validate memory reliability before production.
    • The platform is broader than a memory layer alone; teams seeking a minimal-footprint memory-only integration will find Letta's full agent platform more than they need.
  • Enterprise readiness: Developing. Open-source flexibility supports data residency; managed service adds operational simplicity. Enterprise SLA and access control documentation is maturing but not yet fully prominent.
  • Best for: Teams building autonomous agents that need transparent, inspectable memory with developer control over what gets persisted, particularly in multi-agent or long-horizon task scenarios.
  • This week: Conversations API shipped January 21, 2026. Remote Environments shipped March 4, 2026. Letta Code announced.
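
The memory-blocks idea, named, mutable, inspectable state compiled into the prompt, can be sketched as follows. Class and method names here are invented for illustration and are not Letta's actual API:

```python
class MemoryBlock:
    """A named, mutable block of agent state, in the spirit of the
    editable memory blocks described above."""

    def __init__(self, label, value, limit=200):
        self.label, self.value, self.limit = label, value, limit

    def edit(self, new_value):
        # The agent (or a developer) rewrites its own memory explicitly,
        # bounded by a character budget.
        self.value = new_value[: self.limit]

def render_context(blocks, user_message):
    """Compile memory blocks plus the live message into a prompt.
    What persists is visible and editable, not a black box."""
    sections = [f"<{b.label}>\n{b.value}\n</{b.label}>" for b in blocks]
    return "\n".join(sections) + f"\n<user>\n{user_message}\n</user>"

persona = MemoryBlock("persona", "Concise senior reviewer.")
project = MemoryBlock("project", "Repo: payments-api. Language: Go.")
project.edit("Repo: payments-api. Language: Go. CI: failing on main.")

prompt = render_context([persona, project], "Why is CI failing?")
print(prompt)
```

The transparency is the point: an engineer can diff `project.value` across sessions the same way they would diff any other state.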

Beads (bd)

  • Maker: Steve Yegge (individual creator; open source)
  • Memory model: Git-backed structured memory and task graph. Stores agent work, tasks, dependencies, and context as a dependency-aware graph committed to the project's git repository. Memory travels with the codebase. Dependency links capture parent/child relationships (epics), blocking issues, and related work.
  • Access: Open-source at github.com/steveyegge/beads. Free to use. MCP server available for coding agent integration; CLI (bd) is the primary interface.
  • Integration surface: MCP server for coding agent integration (Claude, Cursor, etc.). CLI for direct use. Designed to work with any coding agent that supports MCP or can run shell commands.
  • Strengths:
    • Git-backed storage means memory is version-controlled, reproducible, auditable, and travels with the repository, solving context loss across agent restarts without any external service dependency.
    • Dependency-aware task graph addresses the "50 First Dates" problem directly: agents restart knowing exactly what work was completed, what is blocked, and what comes next.
    • Zero infrastructure overhead: no database, no API service, no managed cloud required. Memory is just structured files in git.
    • Strong early community momentum with plugins and integrations appearing quickly after launch.
  • Limitations:
    • Git-backed storage is well-suited to code-adjacent memory (task state, project context) and fits best in codebase-resident workflows; teams needing conversational memory or user-preference persistence across products will want a complementary solution.
    • The tool is very recently released; production stability and edge case behavior at scale are not yet validated across diverse codebases and team sizes.
    • As an individual-creator open-source project, long-term maintenance trajectory depends on community adoption velocity.
  • Enterprise readiness: Early. No managed service, no SLA, no enterprise access controls. The git-native design is strong for auditability but the tool is pre-enterprise on most other procurement criteria.
  • Best for: Engineering teams running coding agents (Claude Code, Cursor, Copilot) on complex, long-horizon projects where context loss across sessions is the primary pain point, and where the simplicity of git-native storage is a feature rather than a limitation.
  • This week: Initial public release and launch post by Steve Yegge. Strong early community engagement and third-party integrations appearing within days of release.
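
The dependency-aware restart behavior can be sketched in a few lines. Field names below are invented for illustration, not Beads' actual schema; the one property that matters is that the state is plain text an agent can commit alongside the code:

```python
import json

# Illustrative task records of the kind a git-backed system keeps as
# plain files in the repository.
tasks = {
    "T1": {"title": "design schema",   "done": True,  "blocked_by": []},
    "T2": {"title": "write migration", "done": False, "blocked_by": ["T1"]},
    "T3": {"title": "ship endpoint",   "done": False, "blocked_by": ["T2"]},
}

def ready(tasks):
    """Tasks an agent can pick up on restart: not done, and every
    blocker already finished."""
    return [tid for tid, t in tasks.items()
            if not t["done"] and all(tasks[b]["done"] for b in t["blocked_by"])]

# Because the state is just text, committing it with the code makes the
# agent's progress version-controlled and auditable like any other change.
snapshot = json.dumps(tasks, indent=2)

print(ready(tasks))  # T3 stays blocked until T2 is done
```

On restart, the agent reads the graph and knows immediately what is finished, what is blocked, and what comes next, which is the "50 First Dates" fix in miniature.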

Cognee

  • Maker: Topoteretes (seed-funded; $7.5M from Pebblebed, backed by OpenAI and FAIR founders)
  • Memory model: Knowledge graph memory using an ECL (Extract, Cognify, Load) pipeline. Ingests documents, conversations, and external data in any format, constructs a knowledge graph from the content, and enables retrieval via combined graph traversal and vector search. Designed for continuous memory updates as new information flows in.
  • Access: Open-source at github.com/topoteretes/cognee under Apache 2.0. Managed service available. PyPI package (cognee). Approximately 6 lines of code to initialize.
  • Integration surface: Python library. Supports integration with LangChain, LlamaIndex, and direct API usage. Multiple vector store backends supported (Weaviate, Qdrant, and others).
  • Strengths:
    • Graph-enhanced retrieval substantially outperforms standard RAG: approximately 93% accuracy on complex queries versus approximately 60% for vector-only RAG, based on published evaluations.
    • Production deployments at 70+ organizations, including Bayer (scientific research) and the University of Wyoming (policy documents), provide concrete evidence of real-world reliability.
    • $7.5M seed with OpenAI and FAIR founder backing provides financial runway and credibility for enterprise conversations.
    • GitHub Secure Open Source program graduation signals meaningful security review of the open-source codebase.
  • Limitations:
    • Knowledge graph construction from unstructured data involves non-trivial ingestion latency; teams with real-time or near-real-time memory update requirements should benchmark the pipeline against their data velocity.
    • The ECL pipeline is most powerful when applied to rich document corpora; simpler chat-history-only memory applications can achieve strong results with lighter-weight approaches.
    • As with most knowledge graph systems, retrieval quality depends on graph construction quality; teams should budget for tuning the extraction pipeline on their specific data types.
  • Enterprise readiness: Developing. Production deployments at enterprise organizations, seed funding with reputable backers, and GitHub Secure Open Source graduation all support a developing rating. Formal enterprise SLA and compliance documentation is maturing.
  • Best for: Organizations that need to build memory from rich, heterogeneous document corpora (research, policy, product knowledge) where relationships between concepts matter as much as individual facts.
  • This week: No specific product announcements this week; background signals include confirmed production deployments and the GitHub Secure Open Source graduation.
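
The ECL flow can be caricatured in plain Python. Capitalized-token "extraction" and co-occurrence edges are crude invented stand-ins for Cognee's LLM-driven pipeline, kept only to show why graph traversal retrieves relationships that similarity search alone misses:

```python
from collections import defaultdict
from itertools import combinations

def extract(documents):
    """Extract step: naive entity detection (capitalized tokens) stands
    in for LLM-based extraction. Illustrative only."""
    return [[w.strip(".,") for w in doc.split() if w[0].isupper()]
            for doc in documents]

def cognify(entity_lists):
    """Cognify step: entities co-occurring in a document become edges."""
    graph = defaultdict(set)
    for entities in entity_lists:
        for a, b in combinations(set(entities), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def retrieve(graph, entity):
    """Retrieve step: graph traversal returns related entities,
    complementing plain similarity search."""
    return sorted(graph.get(entity, set()))

docs = ["Bayer evaluated Cognee for research workflows.",
        "Cognee builds a knowledge graph with Qdrant as a backend."]
graph = cognify(extract(docs))
print(retrieve(graph, "Cognee"))
```

Even this toy version surfaces a relationship ("Bayer" and "Qdrant" both connect through "Cognee") that no single-document vector hit would return, which is the intuition behind the graph-versus-RAG accuracy gap cited above.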

Memary

  • Maker: kingjulio8238 (open source, community-maintained)
  • Memory model: Autonomous memory layer that wraps existing agents and automatically updates memory as the agent interacts. Supports both Ollama-served local models (Llama 3 8B/40B default) and OpenAI-hosted models. Dashboard visualization of captured memories.
  • Access: Open-source at github.com/kingjulio8238/Memary. Free. No managed service.
  • Integration surface: Python wrapper layer around existing agents. Intended to integrate with minimal code changes.
  • Strengths:
    • Low-friction integration model: wraps existing agents rather than requiring architectural changes, which reduces adoption cost for teams with agents already in production.
    • Local model support via Ollama is a meaningful differentiator for teams with data residency requirements or cost constraints that preclude cloud model usage for memory operations.
    • Dashboard visualization of agent memories provides observability that most memory tools lack.
  • Limitations:
    • The project is community-maintained without institutional backing; teams evaluating for production should review GitHub activity to confirm the maintenance trajectory meets their reliability requirements.
    • Future roadmap mentions removing the demo ReAct agent to support broader agent compatibility; teams should confirm current compatibility with their agent framework before adopting.
    • No managed service or enterprise support path; the right deployment context is self-hosted with internal engineering ownership.
  • Enterprise readiness: Early. No SLA, no managed service, no enterprise access controls. Suitable for teams willing to own operational responsibility.
  • Best for: Teams running local or cost-sensitive LLM deployments that want a lightweight, auto-updating memory layer without architectural refactoring of their existing agents.
  • This week: No new signals this week.

Motorhead

  • Maker: Metal (getmetal)
  • Memory model: Redis-backed memory server. Manages conversation context windows with automatic summarization: once a configurable message window size is reached, the server summarizes the oldest half of the window and compresses it into a rolling summary. Supports vector similarity search via Redis for message retrieval.
  • Access: Open-source at github.com/getmetal/motorhead. Free. Rust-based server, Docker deployment.
  • Integration surface: REST API. LangChain native integration (MotorheadMemory). Redis as the backing store.
  • Strengths:
    • Automatic sliding window summarization is a simple, reliable pattern for managing context length without requiring a graph database or complex retrieval architecture.
    • Rust implementation provides strong baseline performance and low resource overhead for the memory server itself.
    • Redis backing store is a familiar, well-understood technology for most engineering teams, reducing operational complexity.
  • Limitations:
    • The sliding window summarization approach works well for linear conversation history; teams needing temporal relationship tracking or entity graph retrieval should evaluate Zep or Cognee for those requirements.
    • GitHub activity on the core Motorhead repo has been quiet in 2026; teams considering adoption should assess the maintenance trajectory and note that Redis has published its own separate agent-memory-server (github.com/redis/agent-memory-server) that may represent a more actively maintained alternative for Redis-native teams.
    • No managed service, no enterprise support path.
  • Enterprise readiness: Early. Simple deployment model is a strength; limited recent development activity is a factor to weigh for production adoption.
  • Best for: Teams already running Redis infrastructure that want the simplest possible conversation memory with automatic summarization, and where relationship-aware retrieval is not required.
  • This week: No new signals this week. GitHub activity on the core repo appears limited in 2026.
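
The rolling-summary mechanic described in the memory model above is simple enough to sketch directly. The `summarize` stub stands in for an LLM call, and the window size and function names are invented for illustration:

```python
def compact(messages, summary, window=6,
            summarize=lambda msgs: " / ".join(m[:20] for m in msgs)):
    """Sketch of the sliding-window pattern: once the window fills,
    the oldest half is compressed into a rolling summary."""
    if len(messages) <= window:
        return messages, summary
    half = len(messages) // 2
    older, recent = messages[:half], messages[half:]
    new_summary = (summary + " | " if summary else "") + summarize(older)
    return recent, new_summary

messages, summary = [f"msg {i}" for i in range(8)], ""
messages, summary = compact(messages, summary)

print(messages)  # the recent half stays verbatim
print(summary)   # the older half lives on as a compressed summary
```

This keeps context bounded without any graph or vector machinery, which is exactly the trade-off: cheap and predictable, but the compressed history cannot answer relational or temporal queries.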

LangChain Memory (LangMem SDK)

  • Maker: LangChain, Inc.
  • Memory model: SDK-level abstractions for long-term memory within LangChain and LangGraph agent workflows. LangMem extracts important information from conversations, optimizes agent behavior through prompt refinement, and maintains long-term memory across sessions via LangGraph's Memory Store. Deep Agents (released March 15, 2026) adds a structured planning and memory runtime built on LangGraph.
  • Access: Open-source. LangMem available at langchain-ai.github.io/langmem. LangGraph's Memory Store is included in the LangGraph platform. Free for self-hosted; LangSmith/LangGraph Cloud for managed deployment.
  • Integration surface: Native to LangChain and LangGraph. MongoDB integration for persistent store. Filesystem-based memory option. First-party integration with the full LangChain tool ecosystem.
  • Strengths:
    • Deep integration with the most widely adopted LLM orchestration framework means zero net-new integration overhead for teams already running LangChain or LangGraph agents.
    • LangMem's prompt refinement capability lets agents improve their own instructions based on interaction history, a differentiated form of behavioral memory beyond fact recall.
    • Deep Agents runtime (March 2026) adds planning, memory, and context isolation in a single harness, reducing the number of moving parts for teams building structured multi-step agents.
  • Limitations:
    • LangChain's memory abstractions are tightly coupled to the LangChain ecosystem; teams not using LangChain will find better value in a standalone memory library.
    • The LangGraph Memory Store is a storage primitive, not an opinionated memory management system; teams needing extraction, summarization, and retrieval logic will build more themselves than with purpose-built tools like Mem0 or Zep.
    • LangChain's library surface is large and evolving rapidly; teams should pin dependency versions carefully and monitor for breaking changes.
  • Enterprise readiness: Developing. LangGraph Cloud provides a managed option with operational support. Self-hosted deployments on LangGraph are well-documented and widely used in production.
  • Best for: Engineering teams already running LangChain or LangGraph as their agent orchestration layer, for whom a deeply integrated memory module is preferable to introducing a separate memory service dependency.
  • This week: Deep Agents framework released March 15, 2026, with planning, persistent memory, and context isolation as first-class features.

LlamaIndex Memory

  • Maker: LlamaIndex, Inc.
  • Memory model: Composable memory block system. Supports StaticMemoryBlock (persistent user data), FactExtractionMemoryBlock (LLM-driven fact extraction from conversations), and VectorMemoryBlock (vector store-backed long-term memory). Short-term memory operates as a FIFO queue; overflow is flushed to long-term memory blocks. The ChatMemoryBuffer is deprecated in favor of the new composable Memory class.
  • Access: Open-source. Python and TypeScript SDKs. Free for self-hosted; LlamaCloud for managed deployment. Integrates with Weaviate, Qdrant, and other vector stores as backends.
  • Integration surface: Native to LlamaIndex agent framework. Works with LlamaIndex's query engines and data structures. LlamaCloud for hosted deployment with enterprise features.
  • Strengths:
    • Composable memory block design is architecturally clean: teams can combine static, extracted, and vector-backed memory in a single agent without bespoke integration code.
    • LLM-driven fact extraction (FactExtractionMemoryBlock) is a first-class feature, not a downstream add-on, supporting higher-quality long-term memory than simple message history storage.
    • The transition from the deprecated ChatMemoryBuffer to the new Memory class is well-documented; teams already on LlamaIndex have a clear migration path to more capable memory.
  • Limitations:
    • As with LangChain Memory, LlamaIndex Memory is most valuable within the LlamaIndex ecosystem; teams on other frameworks gain less.
    • VectorMemoryBlock requires an external vector store to be provisioned and managed; teams should plan for that operational dependency.
    • Memory capabilities are embedded in a broader framework, which means the memory-specific roadmap is driven by the framework's overall priorities rather than by memory-category requirements.
  • Enterprise readiness: Developing. LlamaCloud provides a managed enterprise option. The framework is in production at scale across many organizations.
  • Best for: Teams using LlamaIndex as their primary agent and data framework, who want native composable memory without adding a separate service dependency.
  • This week: Ongoing transition from deprecated ChatMemoryBuffer to new composable Memory class. No specific release announcements confirmed this week.
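
The composable-block idea can be sketched as objects sharing an observe/render interface. The class names below are invented in the spirit of the blocks described above; they are not LlamaIndex's actual classes:

```python
class StaticBlock:
    """Persistent user data that every prompt should carry."""

    def __init__(self, text):
        self.text = text

    def observe(self, msg):
        pass  # static data ignores the conversation

    def render(self):
        return self.text

class FactBlock:
    """Crude fact capture: remember lines containing ' is ' as facts,
    a stand-in for LLM-driven extraction."""

    def __init__(self):
        self.facts = []

    def observe(self, msg):
        if " is " in msg:
            self.facts.append(msg)

    def render(self):
        return "\n".join(self.facts)

def run(blocks, messages):
    # Each block sees every message; composing blocks needs no bespoke glue.
    for msg in messages:
        for b in blocks:
            b.observe(msg)
    return "\n---\n".join(b.render() for b in blocks)

context = run([StaticBlock("User: dana (staff eng)"), FactBlock()],
              ["the deploy branch is release/2.4", "thanks!"])
print(context)
```

A vector-backed block slots into the same interface, which is what makes the composition "architecturally clean": the agent consumes one rendered context regardless of how many block types contribute to it.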

Adoption and Traction

  • Mem0: Released v1.0.8 on March 26, 2026 with Turbopuffer vector database integration and MiniMax LLM support. v1.0.0 major release earlier this quarter included API modernization, improved vector store support, and enhanced GCP integration. Published research showing 26% relative accuracy improvement over OpenAI memory on the LoCoMo benchmark, with 91% lower p95 latency and 90% fewer tokens. GitHub repo active with regular commits.

  • Zep: Graphiti, the temporal knowledge graph engine underlying Zep, has been open-sourced and has attracted significant attention from the graph database community, including a feature post from Neo4j. arXiv paper (2501.13956) on the temporal knowledge graph architecture published in January 2026, lending research credibility.

  • Letta: Conversations API shipped January 21, 2026, enabling shared memory across parallel agent interactions. Remote Environments feature released March 4, 2026. Letta Code, a memory-first coding agent product, was announced. GitHub repo (letta-ai/letta) remains active with consistent release cadence.

  • Beads: Steve Yegge announced the release on X to a strong community response. Blog posts from Yegge on best practices and the underlying philosophy ("The Beads Revolution") have circulated in developer communities. Third-party integrations and plugins appeared within weeks of launch, signaling genuine adoption momentum. Jeffrey Emanuel (creator of MCP Agent Mail) described Beads plus Agent Mail as the core infrastructure primitives for multi-agent coding workflows.

  • Cognee: Raised a $7.5M seed round led by Pebblebed, with backing from founders of OpenAI and Facebook AI Research. Disclosed 70+ production deployments, including Bayer (scientific research workflows) and the University of Wyoming (policy document evidence graph). Graduated from the GitHub Secure Open Source program, signaling meaningful security review. PyPI package active.

  • Supermemory: Published ~99% accuracy on LongMemEval_s using agentic multi-agent retrieval flows (3 reading agents, 3 searching agents). Claims top rankings on LongMemEval, LoCoMo, and ConvoMem. MCP server available for integration with Claude and other compatible assistants. Published MemoryBench, an open benchmark suite for conversational memory evaluation.

  • AWS AgentCore Memory: Streaming notifications for long-term memory announced March 12, 2026, with Kinesis integration. Episodic memory strategy added. Quality evaluations and policy controls for trusted agent deployment announced this week.

  • GitHub Copilot Memory: Cross-agent memory system launched in public preview in January 2026 for all paid Copilot plans. Available across coding agent, CLI, and code review. Disabled by default; users must opt in. (Platform-level signal, not an infrastructure profile.)


New Entrants & Watch List

New entrants (standalone memory infrastructure):

  • Supermemory (supermemory.ai): Memory API and MCP server claiming approximately 99% accuracy on LongMemEval_s via agentic multi-agent retrieval (3 reading agents, 3 searching agents). Also publishes MemoryBench, an open unified benchmark for conversational memory evaluation, and claims top rankings on LongMemEval, LoCoMo, and ConvoMem. The MCP server enables integration with Claude and compatible assistants. The benchmark-first positioning is worth watching: if the retrieval claims hold under independent replication, Supermemory would be the highest-accuracy retrieval option in the category. A candidate for the core coverage list.
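
Supermemory has not published implementation details for its retrieval flow, so the sketch below is purely illustrative: an agentic retrieval pattern fans a query out to several "searching" agents running different strategies, then has "reading" agents judge each candidate before aggregation. The toy corpus, strategies, and relevance check are all stand-ins (a real system would call LLMs at both stages).

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpus standing in for a real memory store.
CORPUS = [
    "alice prefers tabs over spaces",
    "the deploy script lives in scripts/deploy.sh",
    "alice reviewed PR 42 last week",
]

def searching_agent(query: str, strategy: str) -> list[str]:
    """Each searcher applies a different (here: trivial) retrieval strategy."""
    terms = query.split() if strategy == "terms" else [query]
    return [doc for doc in CORPUS if any(t in doc for t in terms)]

def reading_agent(query: str, doc: str) -> bool:
    """Readers judge whether a candidate actually answers the query.
    (A real system would ask an LLM; here: require all query terms.)"""
    return all(t in doc for t in query.split())

def agentic_retrieve(query: str) -> list[str]:
    strategies = ["terms", "exact", "terms"]  # 3 searchers, per the claim
    with ThreadPoolExecutor(max_workers=3) as pool:
        hits = list(pool.map(lambda s: searching_agent(query, s), strategies))
    candidates = {doc for batch in hits for doc in batch}  # dedupe
    # Fan candidates out to reading agents for a relevance pass.
    return sorted(d for d in candidates if reading_agent(query, d))

print(agentic_retrieve("alice tabs"))
```

The design trade-off this illustrates: fan-out plus a read-time relevance pass buys recall and precision at the cost of extra agent calls per query, which is consistent with benchmark-first positioning rather than latency-first positioning.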

  • MemOS (MemTensor/MemOS on GitHub): An open-source Memory Operating System for LLMs and AI agents, unifying storage, retrieval, and management of long-term memory, with persistent skill memory designed for cross-task skill reuse and evolution. Launched an OpenClaw plugin (March 8, 2026) with local persistent SQLite, hybrid FTS5 + vector search, task summarization, skill evolution, multi-agent collaboration, and a Memory Viewer dashboard. Active GitHub community. The skill evolution and multi-agent collaboration capabilities differentiate it from simpler fact-recall libraries and warrant a closer look.
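
The hybrid FTS5 + vector idea in MemOS's local plugin can be sketched with nothing beyond the Python standard library. The schema, scoring blend, and embeddings below are illustrative assumptions, not MemOS's actual implementation (which is not shown in its documentation here); the sketch simply demonstrates combining SQLite's bm25() keyword scores with cosine similarity over stored embeddings.

```python
import json
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Full-text index over memory text, plus a side table of embeddings.
conn.execute("CREATE VIRTUAL TABLE mem_fts USING fts5(text)")
conn.execute("CREATE TABLE mem_vec (rowid INTEGER PRIMARY KEY, emb TEXT)")

_next = 0
def add(text: str, emb: list[float]) -> None:
    global _next
    _next += 1
    conn.execute("INSERT INTO mem_fts(rowid, text) VALUES (?, ?)", (_next, text))
    conn.execute("INSERT INTO mem_vec VALUES (?, ?)", (_next, json.dumps(emb)))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def hybrid_search(query: str, q_emb: list[float], alpha: float = 0.5) -> str:
    # Keyword pass via FTS5; SQLite's bm25() returns more-negative
    # values for better matches, so negate to get higher-is-better.
    fts = {r[0]: -r[1] for r in conn.execute(
        "SELECT rowid, bm25(mem_fts) FROM mem_fts WHERE mem_fts MATCH ?", (query,))}
    # Vector pass over all stored embeddings, blended with keyword score.
    scored = []
    for rowid, emb in conn.execute("SELECT rowid, emb FROM mem_vec"):
        score = alpha * fts.get(rowid, 0.0) + (1 - alpha) * cosine(q_emb, json.loads(emb))
        scored.append((score, rowid))
    top = max(scored)[1]
    return conn.execute("SELECT text FROM mem_fts WHERE rowid = ?", (top,)).fetchone()[0]

add("project uses postgres 16", [1.0, 0.0])
add("frontend is a react app", [0.0, 1.0])
print(hybrid_search("postgres", [1.0, 0.0]))
```

The appeal of this architecture for local memory plugins is that everything lives in one SQLite file: no vector database to provision, and the keyword pass catches exact identifiers that embedding similarity often misses.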

  • Memori (MemoriLabs): SQL-native, LLM-agnostic memory infrastructure described as "agent-native memory infrastructure that turns agent execution and interactions into structured, persistent state for production systems." Evaluated on the LoCoMo benchmark at 81.95% accuracy, averaging 1,294 tokens per query. The SQL-native architecture could be a meaningful differentiator for teams whose existing data infrastructure centers on relational databases rather than vector stores or graph databases.

  • AWS AgentCore Memory (Amazon Bedrock): First-party managed memory service with streaming notifications for long-term memory updates via Kinesis (March 12, 2026), episodic memory strategies, IAM/policy controls, and quality evaluations. Not a standalone third-party tool, but a significant market entrant that competes directly with Mem0, Zep, and Cognee for AWS-resident workloads. Teams already on Bedrock should evaluate this before committing to a standalone memory service.

  • Google Always On Memory Agent (Google Cloud Platform, open source): Open-sourced by a Google PM on the official GCP GitHub, MIT licensed. Built on the Google Agent Development Kit, it demonstrates a memory agent that ingests continuously, consolidates in the background, and retrieves later without relying on a conventional vector database. A reference architecture signal rather than a shipping product, but the approach of using an LLM-driven agent rather than a vector index for retrieval aligns with Supermemory's agentic retrieval claims and merits watching.

  • SimpleMem (aiming-lab, GitHub): Efficient memory framework based on semantic lossless compression, supporting cross-session persistent memory. Its open-source MCP server went live January 14, 2026. A research-origin project; the ICLR 2026-adjacent work LightMem may be related. Worth monitoring for practical adoption signals beyond the academic community.

Coding agent toolkits with memory features:

  • GitHub Copilot Memory (Microsoft/GitHub): Cross-agent memory system launched in public preview for all paid Copilot plans (January 2026). Spans coding agent, CLI, and code review. Disabled by default; users must opt in. Flag for CLI/IDE landscape: This signals that coding assistant vendors now treat persistent memory as a standard feature, not a differentiator. The memory design (preference and workflow learning across agent types, not document retrieval) reflects a user-behavior-modeling pattern. This is a leading indicator that coding agents without memory will be positioned as entry-level products within 2-3 quarters. Track under CLI/IDE landscape; not in scope as standalone infrastructure.

  • VS Code Agent Memory (Microsoft): February 2026 VS Code release (1.110) added agent memory spanning sessions for coding agents, CLI, and code review. Same signal as GitHub Copilot Memory above; included separately because it reflects VS Code's native agent runtime gaining memory capabilities independent of Copilot-specific tooling.

Reclassifications:

  • None this week.