AI Radar — 2026-05-08

Scored for Mark Barnett: cost-conscious, local-first, Apex (16 GB VRAM / 13.8 GB RAM), TP3 stack, Bidet AI, teaching workflow.


Actions I recommend Mark take this week (top 3, max)

  1. Enjoy the doubled Claude Code limits (no action required). Anthropic's May 6 update doubled the 5-hour rate limit on Pro/Max/Team/Enterprise plans and removed the peak-hours throttle; API Tier 1 Opus limits rose 1500% (input) and 900% (output). Backed by 220K+ SpaceX Colossus GPUs coming online this month. Source: anthropic.com/news/higher-limits-spacex

  2. Test Gemma 4 26B-A4B on Apex Ollama (see the smoke-test sketch after this list). Google's MoE model has 26B total params with only ~4B activated per token, so per-token compute is in 4B-dense territory. Note that all 26B weights still have to be resident, though: at Q4 (~4.5 bits/param) that is about 26e9 × 4.5 / 8 ≈ 14.6 GB, so it should just fit Apex's 16 GB VRAM, snug rather than roomy. It is outperforming frontier models on multimodal tasks and trending at 8.73M downloads. Command: ollama pull gemma4:26b-a4b-it. Source: huggingface.co/google/gemma-4-26B-A4B-it-assistant

  3. Skim the MCP 2026 roadmap. The SEP (Spec Enhancement Proposal) process is now open, and transport scalability and agent-to-agent communication are the next two milestones, both directly relevant to TP3's MCP architecture. No action needed now, but watch the transport work: it may simplify the SSE-vs-stdio decision (a minimal illustration of that choice follows the ranked candidates below). Source: blog.modelcontextprotocol.io/posts/2026-mcp-roadmap
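
A minimal smoke test for action 2, assuming Ollama is serving its default local API on port 11434 and that the gemma4:26b-a4b-it tag exists as named in the source; the prompt and the speed threshold below are illustrative, not part of the report.

```python
import requests

MODEL = "gemma4:26b-a4b-it"  # tag as given above; assumed available after `ollama pull`

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": MODEL,
        "prompt": "Summarize mixture-of-experts routing in two sentences.",
        "stream": False,
    },
    timeout=600,
)
resp.raise_for_status()
data = resp.json()
print(data["response"])

# Ollama reports eval_duration in nanoseconds; derive decode speed from it.
tok_per_s = data["eval_count"] / (data["eval_duration"] / 1e9)
print(f"{tok_per_s:.1f} tokens/sec")
```

If decode speed lands well below ~10 tokens/sec, some layers are likely spilling to CPU; check ollama ps for the GPU/CPU split.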


Ranked candidates

  1. Claude Code rate limits 2× (Anthropic, May 6). Score: 21.
     Why it matters for Mark: direct win. Pro/Max get the doubled 5-hour block, the peak throttle is gone, and Opus API Tier 1 limits are up 1500% tokens/min. Same price, more throughput, and the SpaceX compute deal removes scarcity risk.
     Integration cost: zero; already live.

  2. Gemma 4 26B-A4B MoE (HuggingFace/Google, May 5). Score: 21.
     Why it matters for Mark: MoE keeps per-token compute at 4B-dense levels, and the full 26B weights come to roughly 14-15 GB at Q4, a snug but workable fit for Apex's 16 GB VRAM. Multimodal (vision+text); 8.73M downloads signal community validation. The prior radar recommended a 4B model; this is the frontier jump.
     Integration cost: low; ollama pull plus a single test run.

  3. MCP 2026 roadmap + Jama Connect MCP launch (MCP blog + GlobeNewswire, May 4). Score: 17.
     Why it matters for Mark: TP3 runs 4 MCP servers, and the transport scalability SEP is next on the roadmap. Jama as the first enterprise MCP adopter signals a maturing ecosystem, with more MCP-compatible tools incoming. A minimal transport sketch follows this list.
     Integration cost: info-only this week.

  4. DeepSeek V4-Flash, MIT license (HuggingFace, April 24, trending). Score: 15.
     Why it matters for Mark: 158B MoE, 1M context, $0.14/M tokens via API, or local with 24GB+ VRAM. V4-Flash significantly outperforms V3, but Apex's 16GB VRAM is ~8GB short for a comfortable local run; wait for further quantization or test with CPU offload (see the offload sketch below).
     Integration cost: medium; VRAM-constrained on Apex.

  5. Kimi K2.6 (Moonshot AI; Ollama cloud, latent.space, May 1). Score: 13.
     Why it matters for Mark: open weights on HuggingFace, a SOTA coding model rivaling Opus 4.6 on benchmarks, 256K context. Full-precision weights are 610GB, local use needs heavy quantization, and the Ollama listing is cloud-only. Not local-first today.
     Integration cost: high; not locally runnable on Apex yet.

  6. OpenAI Voice API with reasoning + TTS (OpenAI, May 7). Score: 13.
     Why it matters for Mark: new realtime voice models with reasoning and translation built in could enhance the Ray-Bans TTS pipeline beyond the current Tasker /brief chain. It costs money (API), so watch pricing before acting; a hedged sketch follows this list.
     Integration cost: low code-wise, but costs $; needs Mark's approval.

  7. GPT-5.5 Instant, the new ChatGPT default (OpenAI, May 5). Score: 11.
     Why it matters for Mark: 52.5% fewer hallucinated claims on high-stakes prompts vs GPT-5.3 and AIME 81.2 (up from 65.4), but API pricing of $5/$30 per M tokens is more expensive than Claude, the benchmarks are self-reported with no independent confirmation yet, and there is no TP3 integration path.
     Integration cost: N/A; no action warranted.
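
To make the SSE-vs-stdio decision in candidate 3 concrete, here is a minimal MCP server sketch assuming the official MCP Python SDK's FastMCP helper; the server name and tool are hypothetical stand-ins, not part of TP3. The transport argument is the seam the roadmap's transport-scalability work would touch.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("tp3-demo")  # hypothetical server name

@mcp.tool()
def ping() -> str:
    """Liveness check, a stand-in for a real TP3 tool."""
    return "pong"

if __name__ == "__main__":
    # stdio: spawned per-client as a subprocess; simplest for local, single-client use.
    # "sse": served over HTTP instead, reachable by multiple clients on the network.
    mcp.run(transport="stdio")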
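For candidate 4's CPU-offload test, Ollama accepts a num_gpu option (the number of layers kept GPU-resident) on its generate API. A hedged sketch, assuming a quantized V4-Flash build eventually lands on the Ollama registry; the deepseek-v4-flash tag and the layer count are hypothetical.

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-v4-flash",  # hypothetical tag; no official Ollama build yet
        "prompt": "Hello",
        "stream": False,
        "options": {"num_gpu": 24},    # cap GPU-resident layers; the rest run on CPU
    },
    timeout=1200,
)
print(resp.json()["response"])
```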
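And for candidate 6, if the pricing checks out, the OpenAI Python SDK's speech endpoint would slot in ahead of the existing Tasker /brief chain. A sketch assuming the current SDK; tts-1 is a placeholder for the May 7 voice models, whose names are not in the source.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment; every call costs money

speech = client.audio.speech.create(
    model="tts-1",  # placeholder; swap in the new voice model once pricing is known
    voice="alloy",
    input="Morning brief: three items on today's radar.",
)
speech.write_to_file("brief.mp3")  # hand the audio file to the Tasker /brief chain
```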

Cut with reason


Sources scanned


Cost of this run

Generated 2026-05-08 19:06 by tp3_scripts/ai_radar/run_radar.ps1.