AI Radar — 2026-05-08
Scored for Mark Barnett: cost-conscious, local-first, Apex (16 GB VRAM / 13.8 GB RAM), TP3 stack, Bidet AI, teaching workflow.
Actions I recommend Mark take this week (top 3, max)
- Enjoy the doubled Claude Code limits — no action required. Anthropic's May 6 update doubled the 5-hour rate limit on Pro/Max/Team/Enterprise and removed the peak-hours throttle. API Tier 1 Opus input and output limits went up 1500% and 900% respectively. Backed by 220K+ SpaceX Colossus GPUs going live this month. Source: anthropic.com/news/higher-limits-spacex
- Test Gemma 4 26B-A4B on Apex Ollama — Google's MoE model has 26B total params with only ~4B activated per token, so it decodes at roughly the speed of a 4B dense model. Note that all 26B weights still load into memory (~13–15 GB at Q4), so the fit on Apex's 16 GB VRAM is workable but tight rather than roomy. It's outperforming frontier models on multimodal tasks and trending at 8.73M downloads. Command: `ollama pull gemma4:26b-a4b-it`. Source: huggingface.co/google/gemma-4-26B-A4B-it-assistant
- Skim the MCP 2026 roadmap — the SEP (Spec Enhancement Proposal) process is now open. Transport scalability and agent-to-agent communication are the next two milestones, both directly relevant to TP3's MCP architecture. No action needed now, but watch for the transport upgrade — it may simplify the SSE vs stdio decision. Source: blog.modelcontextprotocol.io/posts/2026-mcp-roadmap
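Once the Gemma pull finishes, a quick request against Ollama's local REST API confirms the model loads and answers. A minimal Python sketch — the endpoint is the stock Ollama default, the model tag comes from the pull command above, and the actual network call is left commented so the snippet is safe to run without a server:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint
MODEL = "gemma4:26b-a4b-it"  # tag from the pull command above

def build_request(prompt: str) -> dict:
    """Build a non-streaming generate request for Ollama's REST API."""
    return {"model": MODEL, "prompt": prompt, "stream": False}

def smoke_test(prompt: str = "Reply with the single word: ready") -> str:
    """POST the prompt to the local Ollama server and return the reply text."""
    payload = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]

# print(smoke_test())  # uncomment with the Ollama server running
```

First-token latency on this call is also a cheap proxy for whether the weights fit in VRAM or are spilling to CPU.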
Ranked candidates
| # | Candidate | Source | Score | Why it matters for Mark | Integration cost |
|---|---|---|---|---|---|
| 1 | Claude Code rate limits 2× | Anthropic (May 6) | 21 | Direct: Pro/Max get 2× 5-hr block, peak throttle gone, Opus API +1500% token/min. Same price, more throughput. SpaceX compute deal removes scarcity risk. | Zero — already live |
| 2 | Gemma 4 26B-A4B MoE | HuggingFace/Google (May 5) | 21 | MoE: ~13–15 GB of weights at Q4 fits Apex's 16GB VRAM, and only ~4B params active per token keeps decode speed near a 4B model. Multimodal (vision+text). 8.73M downloads = community validated. Prior radar recommended 4B; this is the frontier jump. | Low — ollama pull, single test run |
| 3 | MCP 2026 roadmap + Jama Connect MCP launch | MCP blog + GlobeNewswire (May 4) | 17 | TP3 runs 4 MCP servers. Transport scalability SEP is next on roadmap. Jama as first enterprise MCP adopter signals ecosystem maturing — more MCP-compatible tools incoming. | Info-only this week |
| 4 | DeepSeek V4-Flash (MIT, Ollama) | HuggingFace (April 24, trending) | 15 | 158B MoE, MIT license, 1M context, $0.14/M tokens via API or local with 24GB+ VRAM. V4-Flash outperforms V3 significantly. Apex's 16GB VRAM is ~8GB short for comfortable local run — wait for further quantization or test with CPU offload. | Medium — VRAM constrained on Apex |
| 5 | Kimi K2.6 (Moonshot AI) | Ollama cloud, latent.space (May 1) | 13 | Open weights (HuggingFace), SOTA coding model rivaling Opus 4.6 on benchmarks, 256K context. Weights are 610GB full-precision; local needs heavy quantization. Ollama listing is cloud-only. Not local-first today. | High — not locally runnable on Apex yet |
| 6 | OpenAI Voice API — reasoning + TTS | OpenAI (May 7) | 13 | New realtime voice models with reasoning + translation baked in. Could enhance the Ray-Bans TTS pipeline beyond the current Tasker /brief chain. Costs money (API); worth watching pricing before acting. | Low code-wise, but costs $ — needs Mark's approval |
| 7 | GPT-5.5 Instant (new ChatGPT default) | OpenAI (May 5) | 11 | 52.5% fewer hallucinated claims on high-stakes prompts vs GPT-5.3, AIME 81.2 (up from 65.4). API pricing $5/$30 per M tokens. More expensive than Claude; benchmark is self-reported — no independent confirmation yet. No TP3 integration path. | N/A — no action warranted |
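The VRAM calls in rows 2, 4, and 5 all hinge on the same rule of thumb: quantized weight footprint scales with total parameters (MoE counts every expert), not active parameters. A rough estimator, added here for illustration — it ignores KV cache and runtime overhead, and ~4.5 bits/weight is a typical Q4_K_M figure, not a number from the source:

```python
def quantized_weight_gb(total_params_b: float, bits_per_weight: float = 4.5) -> float:
    """Approximate in-memory footprint of quantized weights in GB.

    total_params_b: total parameter count in billions (MoE counts ALL experts).
    bits_per_weight: ~4.5 is typical for Q4_K_M GGUF quants.
    """
    return total_params_b * 1e9 * bits_per_weight / 8 / 1e9

# Gemma 4 26B-A4B: ~14.6 GB of weights at Q4 -- tight but plausible on 16 GB
print(round(quantized_weight_gb(26), 1))
# DeepSeek V4-Flash 158B: ~88.9 GB of weights at Q4 -- hence the CPU-offload caveat in row 4
print(round(quantized_weight_gb(158), 1))
```

Leaving a couple of GB of headroom for KV cache on top of these figures is prudent before calling a model "fits".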
Cut with reason
- Anthropic enterprise investment (Blackstone/Goldman/H&F) — corporate investment news, zero impact on Mark's stack or cost.
- OpenAI ads in ChatGPT — Consumer feature, not in Mark's workflow.
- OpenAI Codex safety post — Internal policy/ops. No actionable signal for Mark.
- MiMo-V2.5-Pro (Xiaomi, 1T params) — Dropped this week, zero Ollama support yet, no quantized builds, no community testing. Monitor next week.
- Sulphur-2-base (text-to-video) — Not in any current use case for Mark.
- Anthropic finance agents — Enterprise vertical, not relevant.
- GPT-5.5-Cyber (trusted access for security) — Security research tool behind verified access program. Not applicable.
Sources scanned
- Gmail newsletters: Search returned auth scope error — Gmail MCP requires re-auth with broader read scope. Gap: newsletter scan skipped this week. Recommend fixing Gmail MCP scope before next run.
- Anthropic blog: 3 articles published May 4–6, 2026. All fetched successfully.
- OpenAI blog RSS: 20 articles published May 4–8, 2026. Fetched and parsed.
- Google DeepMind blog: Page loaded but no posts dated after May 1 were surfaced (most recent visible: April 2026 Gemma 4 announcement). Likely cache/CDN lag — counted as partial gap.
- HuggingFace trending: Top 15 models by trending score fetched. 8 updated within last 3 days.
- HN Algolia API: Top AI stories pulled. Results skewed toward older dates (May 2025 items surfacing due to index lag) — used web search to verify recency.
- WebSearch supplemental: 5 targeted searches for DeepSeek V4, Kimi K2.6, Claude limits, MCP roadmap, GPT-5.5 pricing.
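The HN index-lag gap above can be guarded against at query time: the Algolia HN API accepts a numericFilters clause on created_at_i, so stale items never surface. A sketch of the URL construction (the query string and 7-day window are illustrative, not from the radar script):

```python
import time
from urllib.parse import urlencode

HN_API = "https://hn.algolia.com/api/v1/search_by_date"

def recent_hn_query(query: str, max_age_days: int = 7) -> str:
    """Build an Algolia HN search URL restricted to stories newer than max_age_days."""
    cutoff = int(time.time()) - max_age_days * 86400
    params = {
        "query": query,
        "tags": "story",
        "numericFilters": f"created_at_i>{cutoff}",  # Unix-epoch lower bound
    }
    return f"{HN_API}?{urlencode(params)}"

# e.g. urllib.request.urlopen(recent_hn_query("MCP roadmap")) and parse the JSON "hits"
```

Filtering server-side would remove the need for the manual web-search recency check next run.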
Cost of this run
- Input tokens: ~55,000 (context + fetched pages + search results)
- Output tokens: ~1,800
- Estimated cost: ~$0.19 (Sonnet 4.6 at $3/$15 per M tokens)
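The estimate above checks out arithmetically. The per-run formula, using the stated $3/$15 per-million-token Sonnet 4.6 rates:

```python
def run_cost_usd(input_tokens: int, output_tokens: int,
                 in_rate: float = 3.0, out_rate: float = 15.0) -> float:
    """Run cost in USD given per-million-token rates ($3 in / $15 out per the report)."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

print(round(run_cost_usd(55_000, 1_800), 2))  # -> 0.19
```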
Generated 2026-05-08 19:06 by tp3_scripts/ai_radar/run_radar.ps1.