Agent Heartbeat Protocol
Every agent writes one tiny memory at session start so every other agent can see who's online, what's been running, and whether anything has gone silent.
Why this exists. Without heartbeats, we have no way to know when an agent has been down for a week. The 2026-04-15 ingest flatline was 3 weeks silent. This is the cheapest possible dead-man's switch: if no heartbeat from agent X in N hours, something's wrong, alert.
Cost: one twin_memory_write call per session start. ~50 ms. Zero ongoing work.
The convention
At the start of every session, the agent calls:
twin_memory_write(
tag="agent_heartbeat",
body="<agent_name> online at <iso8601_timestamp> on <machine> · purpose=<short>"
)
Example:
twin_memory_write(
tag="agent_heartbeat",
body="g16-wsl-claude online at 2026-04-19T10:00Z on G16 · purpose=comms_infra_sprint"
)
That's the whole protocol. One line.
Agent naming — use these exact strings
Consistent names matter so queries work. Use one of:
| Agent | heartbeat name |
|---|---|
| G16 Claude Code (WSL2) | g16-wsl-claude |
| Apex Claude Code (persistent session) | apex-claude-mission-control |
| Apex WSL Claude (ad-hoc) | apex-wsl-claude |
| Cursor on G16 | cursor-g16 |
| Cursor on Apex | cursor-apex |
| Claude Desktop | claude-desktop |
| claude.ai web | claude-web |
| claude.ai mobile | claude-mobile |
| Antigravity | antigravity |
| Gemini CLI (Apex) | gemini-cli-apex |
| Jules (GitHub bot) | Jules doesn't have MCP — skip. Its presence is visible via GitHub events. |
| BRIEFS runner on G16 | briefs-runner-g16 (runner writes one heartbeat per cycle) |
| BRIEFS runner on Apex | briefs-runner-apex (when installed) |
If you're a new agent not on this list, add yourself here in the same PR as your first session.
Reading the heartbeats
Any agent can query:
twin_memory_search(
query="agent_heartbeat",
limit=20
)
Sort by timestamp descending. Most recent entry per agent = the last time that agent was alive.
Dead-man's-switch check
An agent is considered silent if its most recent agent_heartbeat memory is older than:
- G16 Claude, Apex Claude — 48 hours (we expect daily use)
- BRIEFS runners — 15 minutes (they fire every 5 min; 3 missed cycles = alert)
- Cursor — 72 hours (Mark uses Cursor ad-hoc)
- Everything else — 7 days
When the status JSON endpoint lands (see CURSOR_MISSION_BRIEF_status_json.md), it should add a heartbeats section with per-agent last_seen_at + is_silent boolean. That's the public dashboard view.
What a heartbeat is NOT
- Not for ongoing work reporting. Use BRIEFS for that.
- Not for inter-agent messages. Use
twin_memory_broadcastor a BRIEF. - Not for errors or alerts. Use Slack DM to Mark (
slack_post_message) or ntfy. - Not for "I'm about to do X." Just a vital sign. "I'm alive, I'm this thing, here's when."
For agents without MCP
- Jules never writes a heartbeat. Its activity is visible via GitHub events (comments, reviews).
- Antigravity uses
_AGENT_COMMS_LOG.md(the file-based bridge) until its MCP UI is fixed. First line of the log gets a timestamp + "antigravity online" entry. - BRIEFS runner writes one heartbeat per cycle via its existing git-commit path — no new code needed, the commit timestamp IS the heartbeat. Dead-man check = "time since last
auto: ran briefcommit."
Failure modes this catches
- Claude Code session on a machine silently stopped working → no new heartbeat → dashboard flags it.
- BRIEFS runner scheduled task got disabled → no new commits from
briefs-runner-g16→ alert. - Cursor can't reach the memory MCP → heartbeat write fails at session start → surfaced immediately to the user (not a silent degrade).
- An unknown agent is writing to memory without declaring itself → shows up as an untracked name in the heartbeat query → investigate.
Rollout
1. This doc lands on main.
2. CLAUDE.md gets one line added to §AI working rules: *"Rule 11 — session-start heartbeat: write one twin_memory_write call with tag agent_heartbeat at the start of every session. See MAPS/heartbeat_protocol.md."*
3. Next time each agent starts a session, it reads CLAUDE.md and complies.
4. When the status JSON endpoint is live, add heartbeats to the schema.
5. When the first agent goes silent for >N hours, the alert path is whatever's wired to the status JSON (probably ntfy).