AI Radar — Mark Barnett — 2026-05-09
Coverage window: 2026-05-02 → 2026-05-09. Bidet-phone contest deadline: 2026-05-24 (15 days out).
Top 3 actions this week
- Lock the bidet-phone submission shape now, not next weekend. DEV's Gemma 4 Challenge ($3K, deadline 5/24) is the right venue; Kaggle's Good Hackathon ($200K, deadline 5/18) is a stretch you should not try to retrofit into. DEV requires a public write-up + working repo — start the README outline tonight while transcription is fresh on Pixel 8 Pro. Source: dev.to challenge page, Kaggle competition page.
- Audit Gemma 4 E4B's audio encoder before the contest ends — it may eat Whisper-tiny. E4B has a built-in audio encoder (50% smaller than 3N's, 40ms frames) that produces reasoned text, not just transcripts. For a brain-dump app this is one less model to ship. Spend 2 hours: feed it a Pixel-recorded sample, compare output and latency vs. your current Whisper-tiny+Gemma chain. If E4B-audio holds up, drop Whisper, halve your APK, and write the contest narrative around "single-model on-device." If it doesn't, you've got a defensible "why we kept Whisper" paragraph. Source: MindStudio Gemma 4 audio encoder breakdown, Gemma 4 model card.
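The 2-hour spike above is mostly a timing harness. A minimal sketch of that harness, assuming nothing about either model's API — the two lambdas are hypothetical placeholders you'd swap for the real E4B-audio call and the Whisper-tiny+Gemma chain:

```python
import time
import statistics

def bench(pipeline, sample, runs=5):
    """Run a transcription pipeline several times; return (last output, median latency in ms)."""
    latencies = []
    out = None
    for _ in range(runs):
        t0 = time.perf_counter()
        out = pipeline(sample)
        latencies.append((time.perf_counter() - t0) * 1000)
    return out, statistics.median(latencies)

# Hypothetical placeholders — replace with the real calls for each path.
whisper_then_gemma = lambda audio: f"transcript+reasoning for {len(audio)} bytes"  # two-model chain
e4b_audio_direct = lambda audio: f"reasoned text for {len(audio)} bytes"           # single-model path

sample = b"\x00" * 16000  # stand-in for a Pixel-recorded clip
for name, fn in [("whisper+gemma", whisper_then_gemma), ("e4b-audio", e4b_audio_direct)]:
    out, ms = bench(fn, sample)
    print(f"{name}: {ms:.1f} ms median -> {out[:40]}")
```

Median over a few runs smooths out first-call warmup, which matters on-device; compare outputs by eye, since "reasoned text" and "transcript" aren't diffable.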
- Pin Claude Code to ≤ v2.1.135 OR test 2.1.138 in a throwaway worktree before merging anything important. Anthropic shipped 2.1.136 → 2.1.138 between 5/7 and 5/9; 2.1.137 specifically fixed a Windows VS Code activation regression, and a worktree.baseRef change (5/7) altered default branching behavior. You run Claude Code constantly on G16/Apex — a silent change in the default worktree base will bite. Source: Claude Code changelog, releasebot.io.
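To pin, one route is `npm install -g @anthropic-ai/claude-code@2.1.135` (adjust if you installed another way). Separately, setting the worktree base explicitly removes the silent-default risk regardless of version. A sketch of the settings.json fragment, assuming the key is spelled the way the changelog names it (`worktree.baseRef`, values `head` or `fresh` per the changelog) — verify against the settings reference before relying on it:

```json
{
  "worktree": {
    "baseRef": "head"
  }
}
```

Merge by hand into your existing settings file rather than appending, or the JSON breaks.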
Ranked candidates
| # | Item | Score | Why-it-matters-for-Mark | Integration cost |
|---|---|---|---|---|
| 1 | Gemma 4 E4B built-in audio encoder | 9 | Could collapse bidet-phone's two-model pipeline into one. Direct to your live contest build. | 2-3h spike, reversible |
| 2 | DEV Gemma 4 Challenge logistics ($3K, 5/24) | 9 | This is the contest you're already in. Write-up format and judging rubric drive the submission shape. | 0h research, 4-6h writing across the week |
| 3 | Claude Code 2.1.136-138 changes (worktree.baseRef, MCP /clear bug, autoMode hard_deny) | 8 | You hit this daily on G16+Apex. worktree.baseRef=fresh vs head will silently change which commit your subagents branch from. | 30 min — read changelog, set explicit value in settings.json |
| 4 | Claude Opus 4.7 + 1M context at standard pricing | 7 | $5/$25 per MTok unchanged BUT new tokenizer uses ~1.0-1.35x more tokens per task. Your $140/mo budget could drift up silently. New effort + task budgets knobs are real cost levers. | 1h: re-benchmark a typical session, decide on default effort tier |
| 5 | LiteRT-LM production GA + Qualcomm/MediaTek NPU support | 7 | Pixel 8 Pro has a Tensor G3 NPU you're not touching today. LiteRT-LM is the official path to NPU acceleration for Gemma 4 on Android. Future bidet-phone v2 lever, not contest-week. | High — re-architect inference layer. Park for post-contest. |
| 6 | whisper.cpp commit c81b2dab (5/7): Ruby GVL-free transcribe, Windows fixes | 6 | You don't use the Ruby bindings, but the Windows build improvements matter for any Apex-side whisper.cpp work. Low-priority refresh. | 15 min — git pull && rebuild if you use whisper.cpp directly |
| 7 | Anthropic "Dreaming" research preview (agents self-improve overnight from past sessions) | 5 | Same conceptual territory as your TP3 memory pipeline. Worth a 30-min read post-contest to see whether it overlaps your reflection layer. Not GA. | 0 right now — gated access |
| 8 | OpenAI GPT-5.5 Instant (5/5) + ads-in-ChatGPT pilot (5/7) | 3 | You're not on OpenAI. Hallucination drop (-52.5% vs 5.3) is industry signal, not action. Ads launch is a "watch from a distance" item. | 0 |
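Item 4's token-drift risk is easy to quantify before the re-benchmark. A back-of-envelope sketch — prices are the $5/$25 per MTok from the table, the 1.35x is the top of the reported tokenizer range, and the monthly volumes are illustrative, chosen to land on the $140/mo budget:

```python
IN_PRICE, OUT_PRICE = 5.0, 25.0  # USD per million tokens (Opus 4.7, per item 4)

def monthly_cost(in_mtok, out_mtok, inflation=1.0):
    """Monthly spend given million-token volumes and a tokenizer inflation factor."""
    return (in_mtok * IN_PRICE + out_mtok * OUT_PRICE) * inflation

baseline = monthly_cost(18, 2)        # ~18 MTok in, 2 MTok out: the $140/mo budget
worst = monthly_cost(18, 2, 1.35)     # same work, 1.35x tokens after the tokenizer change
print(f"baseline ${baseline:.0f}/mo -> worst case ${worst:.0f}/mo (+${worst - baseline:.0f})")
```

At the same usage, the budget line moves from $140 to ~$189/mo in the worst case, which is why the effort-tier knob is worth an hour.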
Cut with reason
- DeepMind × EVE Online partnership (5/6). Cool research story, zero developer surface. Skip.
- SpaceX/Colossus 220K-GPU deal. Capacity press release; doesn't change your day. Skip.
- EU Digital Omnibus on AI provisional agreement (5/7). Enforcement starts 2027-2028. Not actionable in 2026-05.
- Five Eyes "Careful Adoption of Agentic AI" guidance. Government posture doc, not a tool. File mentally; don't read.
- Hugging Face trending: VibeVoice, MinerU2.5, EverMemOS, UniVidX. None map to your stack right now. EverMemOS (self-organizing LLM memory) is closest to TP3 — re-check in 30 days when there's a reference implementation, not just a paper.
- ChatGPT for Excel/Sheets global rollout. You don't live in spreadsheets. Skip.
- CAISI/NIST evaluating frontier models pre-release. Compliance news. Skip.
- Hacker News AI-attacks coverage (Claude Code extortion case, Mexico tax data breach via Claude+ChatGPT). Real but not your threat model — your stack is private, single-user, and Tailscale-fronted. Note the existence; don't chase.
- Wan2.2-TI2V-5B and other text-to-video. Out of scope for your projects.
- Anthropic Code Review / CI auto-fix / Routines. All look genuinely useful but every one is a 2-4h evaluation; queueing them until after 5/24 is the right call given the contest deadline. Re-surface 5/25.
Sources scanned
- Anthropic news + Claude Code changelog (releasebot.io/anthropic, claudefa.st changelog, code.claude.com/docs/changelog)
- Simon Willison live blog of Code w/ Claude 2026 (5/6)
- Claude Opus 4.7 launch + pricing pages (anthropic.com/news/claude-opus-4-7, platform.claude.com pricing, llm-stats.com, Caylent deep-dive)
- Google Gemma 4 launch + model card (blog.google, ai.google.dev/gemma, deepmind.google/models/gemma)
- Hugging Face Gemma 4 blog post (huggingface.co/blog/gemma4) for tooling matrix
- Android Developers Blog: Gemma 4 on Android (android-developers.googleblog.com)
- MindStudio: Gemma 4 E2B/E4B audio encoder analysis
- Kaggle Gemma 4 Good Hackathon page + DEV Gemma 4 Challenge page
- LiteRT/LiteRT-LM: developers.googleblog.com, github.com/google-ai-edge/LiteRT-LM, infoq.com (5/2026)
- whisper.cpp releases page (github.com/ggml-org/whisper.cpp/releases) — latest commit c81b2dab 2026-05-07
- OpenAI news (openai.com/index/gpt-5-5-instant, releasebot.io/openai, TechCrunch 5/5)
- Hacker News front pages 5/2 and 5/7; rockcybermusings AI security weekly 5/1-5/7
- Bloomberg + 9to5Google for DeepMind/EVE; nextgov + CNBC for CAISI testing program
Cost of this run
8 WebSearch calls + 2 WebFetch calls. Estimated Anthropic spend: ~$0.35-0.55 in input/output tokens (Opus 4.7, ~30K input / ~2K output equivalent). Comfortably under the $1/run cap.