ChatGPT + Codex primer — what to do while Claude is rate-limited

2026-05-21 — your read on switching off Claude for a stretch, plus a working list of things to attack with ChatGPT in the meantime

You asked whether ChatGPT (and OpenAI's Codex agent platform) is "a little bit better at the moment," whether the two of us can work in tandem, and what to give the other side to chew on while you're waiting for Claude credits to refresh. Honest take: it depends on the task. There are things ChatGPT/Codex is genuinely sharper at right now, and things that would be a hassle to migrate. Below is the comparison, then a concrete work list sized for ChatGPT to chew on while Claude is dark.

The 30-second read

Don't ditch Claude. Run them in tandem. Use ChatGPT/Codex when Claude is rate-shut OR when the task is squarely in their strength zone (heavy multi-file Python/TS refactors, fresh-from-scratch scaffolds, math-heavy reasoning, deep research with their newer search agents). Keep Claude as the operating spine — it's the agent that knows your memory repo, your prime directives, the Pixel/AutoVoice/Bidet stack intimately, and your communication preferences without re-priming.

The two coexist cleanly because your memory lives in a git repo (MrB-Ed/claude-memory) that any agent can clone. ChatGPT can read it, but it won't extend it automatically; getting it to write memories back would take a separate, explicit workflow.

Current state of ChatGPT + Codex (late 2025 / early 2026)

The OpenAI side has shipped a lot recently. Here's what's load-bearing for your stack:

| Capability | What it is | Bottom line for your stack |
|---|---|---|
| GPT-5 | OpenAI's flagship; longer context (~256-400K depending on tier), stronger coding/math, more reliable tool use. | Comparable to Claude Opus / Sonnet on most tasks; arguably edges Claude on raw math + algorithmic reasoning. |
| Codex (the new agentic Codex, not the 2021 one) | OpenAI's coding-agent product. Has its own CLI (codex), a cloud version, and a GitHub-integrated mode. Multi-file refactors, can spawn parallel subagents. | Direct competitor to Claude Code. Strong on multi-file Python/JS work. Cloud mode runs on OpenAI infra (different billing pool from your Claude credits). |
| ChatGPT memory | Long-term memory feature in the consumer ChatGPT app. Remembers facts across sessions. | Useful for personal context but NOT a substitute for your file-based memory repo. Less precise, less greppable, can't be reviewed/edited like markdown. |
| ChatGPT Agent mode / Operator | Browser-driving + computer-use agent. Headless or via attached Chrome. | Roughly the chrome-devtools MCP pattern you already have. Same headless-Chrome caveat. |
| MCP support | OpenAI has shipped MCP server support in ChatGPT Pro/Team for direct tool integration. | Your existing MCP servers (chrome-devtools, twin-memory, Gmail, etc.) may work with ChatGPT with config tweaks. Compatibility is real but not 1:1 with Claude Code. |
| Deep research mode | Multi-step search + synthesis. ChatGPT runs ~20-50 queries and writes a long report. | Genuinely good for your "AI Radar / what changed this week" kind of work. Cheap relative to running a Claude subagent for the same. |
| Pricing tiers | Plus $20/mo (light), Pro $200/mo (heavy, includes Codex cloud + Operator), Team/Enterprise above. | Pro tier is the right shelf for running tandem with Claude Max. Plus alone won't keep up with your usage. |

Where Codex is genuinely stronger than Claude Code today

Where Claude (Code in particular) is still ahead for YOUR stack

Tandem mode — is it feasible?

Yes, with one trick: treat ChatGPT as a contractor, not a co-worker.

What to give ChatGPT/Codex to work on RIGHT NOW

Sized for the next few hours while Claude is rate-shut. These are tasks where Codex's strengths match what's on the board AND that won't conflict with the Claude-side work currently in flight.

1. Heavy Spotify Phase 2 build — transcript + semantic search layer

Phase 1 (recently-played pull → Postgres) is built and about to start running. Phase 2 per the May 20 spec adds transcripts + semantic search across listening history. This is multi-file Python + DB work — exactly Codex cloud's lane.

2. Bidet phone hardening — APK distribution + BYOK + free-hosted

The post-contest plan ([[project_bidetai_app_post_contest_strategy_2026-05-14]]) calls for three distribution paths: sideload APK, BYOK web, and a free-hosted version on Cloudflare Workers AI. Codex is well-suited to scaffold a Workers AI proxy with rate limiting + KV-backed user state. Pure code work, no live-stack dependencies.
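
To make the "rate limiting + KV-backed user state" piece concrete, here is a minimal sketch of a fixed-window limiter. It is written in Python only to show the logic; the real proxy would live in a Worker. The class name, the key format, and the plain dict standing in for a KV namespace are all assumptions for illustration, not part of the plan.

```python
import time
from typing import Dict, Optional


class FixedWindowLimiter:
    """Fixed-window request counter. `store` is a dict standing in for a KV namespace (assumption)."""

    def __init__(self, limit: int, window_seconds: int,
                 store: Optional[Dict[str, int]] = None) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.store: Dict[str, int] = {} if store is None else store

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        """Return True and count the request if the user is under the limit for this window."""
        now = time.time() if now is None else now
        window = int(now // self.window_seconds)   # index of the current window
        key = f"rl:{user_id}:{window}"             # hypothetical key layout
        count = self.store.get(key, 0)
        if count >= self.limit:
            return False
        self.store[key] = count + 1
        return True


limiter = FixedWindowLimiter(limit=2, window_seconds=60)
first_window = [limiter.allow("user-1", now=0) for _ in range(3)]
next_window = limiter.allow("user-1", now=60)
```

A read-then-write counter like this is only approximate if the backing store cannot increment atomically, so treat the limit as a soft cap when porting the idea.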

3. Bidet Whisper fine-tune pipeline (Tier 2 long-arc)

Tangents #34 + #38 in your backlog. 22.5 hours of paired (audio, transcript) corpus already in Drive. Tier 2 spec: LoRA (r=32, alpha=64) on Whisper-large-v3, weekend run on Apex's RTX 5060 Ti. Codex can lay down the training script + dataset loader + evaluation harness now so the actual training run is a one-command launch once the hardware is upgraded.
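
For a feel of what rank 32 with alpha 64 means in practice, here is a small arithmetic sketch. It assumes the standard LoRA scaling of alpha / r and assumes the adapters sit on square 1280 x 1280 projections (1280 is the hidden width of the Whisper large models). Which modules actually get adapted is not in the spec, so the per-matrix figure is illustrative only.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Factor applied to the low-rank update under standard LoRA scaling."""
    return alpha / r


def lora_params_per_matrix(d_in: int, d_out: int, r: int) -> int:
    """Trainable weights added to one adapted matrix: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


D_MODEL = 1280  # assumed target shape: square attention projections

scaling = lora_scaling(r=32, alpha=64)
per_matrix = lora_params_per_matrix(D_MODEL, D_MODEL, r=32)
full_matrix = D_MODEL * D_MODEL
fraction = per_matrix / full_matrix  # share of the full matrix that is trainable
```

Under those assumptions each adapted matrix trains 5% as many weights as the frozen matrix it sits on, which is why the run is plausible on a single consumer GPU.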

4. Memory typing at ingest (RRF retrieval foundation)

Tangents #27 + #26: type each TP3 memory as fact | event | instruction | task at ingest, then implement RRF retrieval across pgvector cosine + Postgres FTS + metadata-key lookup + HyDE. Compounding gains on /omi/ask + dashboard recall. This is real algorithm + plumbing work — Codex's sweet spot.
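
Reciprocal rank fusion itself is small. The sketch below fuses several ranked ID lists by summing 1 / (k + rank) per list, with k = 60 as the conventional default. The function name and the memory IDs are made up for illustration; producing the individual rankings (pgvector, FTS, metadata lookup, HyDE) is out of scope here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def rrf_fuse(rankings: Iterable[Sequence[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse ranked lists of IDs; each list contributes 1 / (k + rank) to an ID's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical retriever outputs, best hit first.
vector_hits = ["m3", "m1", "m7"]
fts_hits = ["m1", "m9", "m3"]
metadata_hits = ["m1"]

fused = rrf_fuse([vector_hits, fts_hits, metadata_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

Because RRF only looks at ranks, the retrievers' raw scores never need to be put on a common scale, which is the main reason it suits a mix of cosine distance and FTS rank.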

5. Captain's-log routing fix + kind field

Tangent #25: 503 of 506 source=email_ingest rows are actually OMI conversation summaries misrouted via a Make scenario. Fix the Make scenario to point to omi_summary, add kind=article|transcript|email|note at ingest, backfill. Codex can produce the migration SQL + a Make blueprint patch + a backfill script.

6. Webhooks + HMAC for Apex → reports push

Tangent #28: replace any remaining polling pattern with HMAC-signed webhook pushes from Apex to reports.thebarnetts.info. Standard Webhooks library pattern. Greenfield code, well-suited to Codex.
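
As a rough picture of what HMAC-signed pushes involve, here is a minimal sketch using only the Python standard library. It signs a message ID, a timestamp, and the raw body together, then verifies with a constant-time comparison and a replay window. This is a generic scheme, not the Standard Webhooks library: the function names, the hex encoding, and the 300-second tolerance are assumptions, and the real spec defines its own headers and encoding.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(secret: bytes, msg_id: str, timestamp: int, body: bytes) -> str:
    """HMAC-SHA256 over 'msg_id.timestamp.body', hex-encoded."""
    signed = f"{msg_id}.{timestamp}.".encode("utf-8") + body
    return hmac.new(secret, signed, hashlib.sha256).hexdigest()


def verify(secret: bytes, msg_id: str, timestamp: int, body: bytes, signature: str,
           tolerance_seconds: int = 300, now: Optional[float] = None) -> bool:
    """Reject stale timestamps first, then compare signatures in constant time."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > tolerance_seconds:
        return False
    expected = sign(secret, msg_id, timestamp, body)
    return hmac.compare_digest(expected, signature)


secret = b"example-shared-secret"   # placeholder; the sender and receiver share this
body = b'{"report": "ready"}'
signature = sign(secret, "msg_1", 1000, body)
accepted = verify(secret, "msg_1", 1000, body, signature, now=1000)
```

Signing the raw bytes rather than a re-serialized payload matters: any reformatting of the JSON between sender and receiver would otherwise break verification.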

7. Dreaming pattern port — auto-extract feedback memories from session logs

Tangent #43 (Anthropic's "Dreaming" pattern). Batch-read last 7 days of ~/.claude/projects/-home-g16/*.jsonl, run through Gemini or GPT to surface recurring corrections + working patterns, write candidates to memory/_candidates/<date>/, ntfy you Sun night, accept/edit/reject UI at memory.thebarnetts.info/candidates. Codex can build the candidate-generator + the UI.
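
The batch-read step could look something like the sketch below: walk a log directory, keep .jsonl files touched in the last N days, and yield each parseable JSON object. Filtering by file modification time is an assumption (per-record timestamps may be the better cut), the function name is made up, and nothing here assumes a particular record schema.

```python
import json
import time
from pathlib import Path
from typing import Any, Dict, Iterator, Optional


def recent_records(log_dir: Path, days: int = 7,
                   now: Optional[float] = None) -> Iterator[Dict[str, Any]]:
    """Yield JSON objects from *.jsonl files modified within the last `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    for path in sorted(log_dir.glob("*.jsonl")):
        if path.stat().st_mtime < cutoff:
            continue  # file not touched inside the window
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip truncated or corrupt lines instead of aborting the batch
                if isinstance(record, dict):
                    yield record
```

Feeding the yielded records to the summarizing model in batches, and writing its output under the candidates directory, would sit on top of this generator.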

8. Whisper-mark Tier 3 voice corpus cleanup (research)

The chunk-1 LoRA learned OMI's bad style (no caps, hallucinations). Need to either self-distill, hand-curate, or force-align via Bidet recordings. ChatGPT's deep-research mode could survey current state-of-the-art for cleaning ASR-label corpora before retraining. Output: a comparison report.

What to keep AWAY from ChatGPT/Codex (for now)

How to prime ChatGPT efficiently when you do switch over

  1. Open ChatGPT Pro (you need Pro for Codex cloud + Operator).
  2. Paste the handoff document from /private/r/2026-05-21-handoff-to-next-agent.html into the first message of a fresh chat. Tell it: "Read this top-to-bottom, then I'll give you a task."
  3. Give it access to the memory repo via Codex GitHub integration (point it at MrB-Ed/claude-memory, read-only).
  4. Pick ONE task from the list above and let it run. Don't task-stack until you've calibrated tone + competence on the first run.
  5. When it finishes: have it write a results report to /tmp/chatgpt__.html in the format used by Mark's Reports. You (or Claude) deploy it when convenient.

My honest recommendation

Keep Claude Max as the spine. Add ChatGPT Pro ($200/mo) as the tandem agent. They cover each other's weaknesses, both work on the shared memory repo (with ChatGPT in read-only mode), and you stop being held hostage by either rate limit.

If you can only pick one, keep Claude for THIS specific stack — too much institutional context lives in how Claude reads your memory repo and your hard rules. But the right answer for the next year of how you work is "both."

One concrete first run to calibrate

If you're going to spin up ChatGPT Pro tonight, give it this single task and judge by how it goes:

"Build the Spotify Phase 2 transcript fetcher + pgvector embedding pipeline based on the spec in /private/r/2026-05-20-spotify-history-spec.html. Mirror the structure of tp3_spotify_pull.py. Output: tp3_spotify_transcripts.py + tp3_spotify_transcripts_ddl.sql + a test plan. Don't run it on Apex yet — Mark will review first."

That tests: spec comprehension, code style consistency with an existing repo file, willingness to NOT over-execute, and output quality. ~2 hour task in Codex cloud. Compare its first draft against what you remember Claude producing for Phase 1 last night.

2026-05-21 EOD. Pair report: Handoff to the next agent.