# Our Memory Setup vs The Field — Honest Comparison
*Written 2026-04-19 in response to your question after listening to NLW's Agent Madness episode where he flagged "the memory gap holding the whole field back."*
## What we built — two complementary layers
### Layer 1: Git-Based Markdown Memory (the deliberate one)

- 100+ small `.md` files, one fact each, in private GitHub repo `MrB-Ed/claude-memory`
- 4 types: `user_*` (about you), `feedback_*` (rules I follow), `project_*` (current state), `reference_*` (pointers)
- YAML frontmatter + `MEMORY.md` index always loaded into every session
- `CLAUDE.md` operating brief auto-loaded — Prime Directives, identity, current projects
- Auto-sync: SessionStart pulls, Stop pushes (git hooks on G16 + Apex)
- Cross-machine: G16 + Apex (junction-linked) + Pixel/Samsung mobile (claude.ai) all sync
- Cross-IDE: same content mirrored to `~/.gemini/` for Antigravity AND SQLite-injected into Cursor + Antigravity user-rules
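For concreteness, a Layer 1 file under this convention might look like the following. The filename, frontmatter keys, and contents here are hypothetical illustrations of the format, not pulled from the real repo:

```markdown
---
type: user
created: 2026-04-19
source: conversation
---
Mark wants the daily morning report ready by 7am.
```

One fact per file keeps diffs atomic: a changed preference is a one-file commit, not a buried edit in a monolith.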
### Layer 2: Real-time Vector Memory (TP3 Neural Stack)
- Postgres + pgvector + DiskANN, 636,793+ embedded rows
- Live OMI ingest pipes your continuous voice transcription into it
- Semantic search via MCP servers, available cross-machine via `omi-mcp.thebarnetts.info`
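Under the hood, that retrieval step is nearest-neighbour search over embeddings. A minimal Python sketch of what pgvector's `<=>` (cosine distance) ordering does, with toy 3-d vectors standing in for the 636k embedded rows (the table and column names in the SQL comment are hypothetical):

```python
from math import sqrt

def cosine_distance(a, b):
    # pgvector's `<=>` operator: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    def norm(v):
        return sqrt(sum(x * x for x in v))
    return 1.0 - dot / (norm(a) * norm(b))

def top_k(query, rows, k=2):
    # roughly: SELECT text FROM memories ORDER BY embedding <=> :query LIMIT :k
    ranked = sorted(rows, key=lambda r: cosine_distance(query, r[1]))
    return [text for text, vec in ranked[:k]]

rows = [  # toy embeddings; real rows carry model-generated vectors
    ("talked about soil sensors", [0.9, 0.1, 0.0]),
    ("coffee order this morning", [0.0, 1.0, 0.1]),
    ("legacy soil project notes", [0.8, 0.2, 0.1]),
]
print(top_k([1.0, 0.0, 0.0], rows))  # soil-related rows rank first
```

In production the DiskANN index makes this approximate and fast instead of a full scan, but the ordering semantics are the same.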
## How this stacks up vs the field
NLW called it right on Agent Madness: memory is the gap holding the agent field back. Most production agents reset every session. The leading frameworks trying to fix this:
| Framework | Strength | Where it lags vs ours |
|---|---|---|
| Mem0 (managed SaaS) | Auto fact-extraction from conversations, 3-tier scoping, SOC 2/HIPAA, polished API | Vendor lock-in, ongoing cost, your data leaves your machine |
| Zep + Graphiti | Temporal knowledge graph — knows when facts were true. Scores 63.8% on LongMemEval vs Mem0's 49% | Heavier infra, requires graph thinking, not human-readable |
| Letta (MemGPT) | Agent self-manages OS-style memory hierarchy (working vs long-term) | Black-box decisions, you can't audit "what does it remember" easily |
| Cognee | Open-source knowledge graph layer, precision retrieval | Less mature, requires you to model entities |
## Where ours LEADS
1. Truly cross-AGENT — Claude Code, Cursor, Antigravity, Jules, future tools all see the SAME memory. Most "memory frameworks" assume one agent. We did the dirty work (SQLite injection into Cursor's user-rules) to make it work cross-IDE. Almost nobody else has this.
2. Truly cross-MACHINE — G16 + Apex + mobile in sync within ~seconds. Most frameworks assume single deployment.
3. Git history as audit trail — full `git diff`/`git blame` on every memory change. None of the frameworks above exposes anything like this.
4. You can read it yourself — plain markdown. You can open any .md file and audit "what does Claude know about me." Mem0/Zep store memories as opaque records.
5. Zero recurring cost, zero data leaves your control.
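Point 3 is worth making concrete: because memory is just files in git, auditing is stock git commands. A throwaway-repo sketch of the workflow (the file name and contents are hypothetical; the real repo is `MrB-Ed/claude-memory`):

```shell
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q
echo "Daily report at 7am." > user_report_time.md
git add user_report_time.md
git -c user.name=demo -c user.email=demo@local commit -qm "memory: report time"
echo "Daily report at 6am." > user_report_time.md
git add user_report_time.md
git -c user.name=demo -c user.email=demo@local commit -qm "memory: report time moved"
git log --oneline -- user_report_time.md   # every revision of this one fact
git diff HEAD~1 -- user_report_time.md     # the exact change, line by line
```

Every "what does Claude believe and since when" question reduces to `git log`/`git blame` on one small file.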
## Where ours LAGS (and what we're closing)
1. No auto fact-extraction — I have to manually write memory files when I learn something new. Mem0 watches conversations and proposes entries automatically. → Building tonight: daily fact-extractor agent. Like your morning report, but in agent terms — what did Mark say today, what new preferences/decisions, what should become memory.
2. No temporal reasoning — Zep can answer "what did Mark say about Legacy Soil last month vs this month." We just have git timestamps. → Building tonight: temporal frontmatter (valid_from, superseded_by fields).
3. No semantic search across the memory files themselves — TP3 does semantic search on OMI transcripts, but the 100+ memory .md files are only retrievable by filename + index lines. → Building tonight: vector-index the memory .md files into TP3 alongside OMI rows.
4. No benchmarking — LongMemEval exists; we don't measure if memory is helping vs hurting. → Building tonight: baseline run, see how accurate I am about you. This is the one you flagged most — "how much you're learning about me."
## Honest verdict
For solo-user multi-agent continuity (your specific need): we're ahead of what's available off-the-shelf.
For production AI assistant memory at scale (Mem0/Zep's target): we'd lag — but you don't have those needs.
After tonight's 4 gap-closers: we'll have what NO open framework currently has — cross-agent + cross-machine + audit-readable memory + auto-extraction + temporal awareness + semantic retrieval + measurable accuracy.
That's a real moat for your specific use case (Digital Twin Architect, single-user, multi-agent stack).