Mark's Reports
Updated 2026-05-09 09:13 EDT · generated manually while the AI Radar weekly cron delivery is being repaired

AI Radar — Mark Barnett — 2026-05-09

Coverage window: 2026-05-02 → 2026-05-09. Bidet-phone contest deadline: 2026-05-24 (15 days out).


Top 3 actions this week

  1. Lock the bidet-phone submission shape now, not next weekend. DEV's Gemma 4 Challenge ($3K, deadline 5/24) is the right venue; Kaggle's Good Hackathon ($200K, deadline 5/18) is a stretch you should not try to retrofit into. DEV requires a public write-up + working repo — start the README outline tonight while transcription is fresh on Pixel 8 Pro. Source: dev.to challenge page, Kaggle competition page.

  2. Audit Gemma 4 E4B's audio encoder before the contest ends; it may eat Whisper-tiny. E4B has a built-in audio encoder (50% smaller than 3N's, 40ms frames) that produces reasoned text, not just transcripts. For a brain-dump app, that is one less model to ship. Spend 2 hours: feed it a Pixel-recorded sample and compare output quality and latency against your current Whisper-tiny+Gemma chain. If E4B-audio holds up, drop Whisper, halve your APK size, and write the contest narrative around "single-model on-device." If it doesn't, you've got a defensible "why we kept Whisper" paragraph. Source: MindStudio Gemma 4 audio encoder breakdown, Gemma 4 model card.

  3. Pin Claude Code to ≤ v2.1.135, or test 2.1.138 in a throwaway worktree before merging anything important. Anthropic shipped 2.1.136 → 2.1.138 between 5/7 and 5/9; 2.1.137 specifically fixed a Windows VS Code activation regression, and the new worktree.baseRef setting (5/7) changed default branching behavior. You run Claude Code constantly on G16/Apex; a silent change in the default worktree base will bite. Source: Claude Code changelog, releasebot.io.
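Action 3's "set an explicit value in settings.json" step can be sketched as the fragment below. The dotted worktree.baseRef key and its head/fresh values come from the changelog items in this report, but the exact shape Claude Code expects (flat dotted key vs nested object) is an assumption here; confirm against the 2.1.136+ release notes before relying on it. "head" is just the example value — set whichever behavior your subagent workflow was already assuming.

```json
{
  "worktree.baseRef": "head"
}
```

Pair it with pinning the CLI itself (installing the @2.1.135 version explicitly) until 2.1.138 survives a throwaway-worktree test.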


Ranked candidates

| # | Item | Score | Why it matters for Mark | Integration cost |
|---|------|-------|-------------------------|------------------|
| 1 | Gemma 4 E4B built-in audio encoder | 9 | Could collapse bidet-phone's two-model pipeline into one. Direct to your live contest build. | 2-3h spike, reversible |
| 2 | DEV Gemma 4 Challenge logistics ($3K, 5/24) | 9 | This is the contest you're already in. Write-up format and judging rubric drive the submission shape. | 0h research, 4-6h writing across the week |
| 3 | Claude Code 2.1.136-138 changes (worktree.baseRef, MCP /clear bug, autoMode hard_deny) | 8 | You hit this daily on G16+Apex. worktree.baseRef=fresh vs head will silently change which commit your subagents branch from. | 30 min: read changelog, set explicit value in settings.json |
| 4 | Claude Opus 4.7 + 1M context at standard pricing | 7 | $5/$25 per MTok unchanged, but the new tokenizer uses ~1.0-1.35x more tokens per task, so your $140/mo budget could drift up silently. New effort and task-budget knobs are real cost levers. | 1h: re-benchmark a typical session, decide on default effort tier |
| 5 | LiteRT-LM production GA + Qualcomm/MediaTek NPU support | 7 | Pixel 8 Pro has a Tensor G3 NPU you're not touching today. LiteRT-LM is the official path to NPU acceleration for Gemma 4 on Android. A bidet-phone v2 lever, not contest-week. | High: re-architect the inference layer. Park for post-contest. |
| 6 | whisper.cpp commit c81b2dab (5/7): Ruby GVL-free transcribe, Windows fixes | 6 | You don't use the Ruby bindings, but the Windows build improvements matter for any Apex-side whisper.cpp work. Low-priority refresh. | 15 min: git pull && rebuild if you use whisper.cpp directly |
| 7 | Anthropic "Dreaming" research preview (agents self-improve overnight from past sessions) | 5 | Same conceptual territory as your TP3 memory pipeline. Worth a 30-min read post-contest to see whether it overlaps your reflection layer. Not GA. | 0 right now: gated access |
| 8 | OpenAI GPT-5.5 Instant (5/5) + ads-in-ChatGPT pilot (5/7) | 3 | You're not on OpenAI. The hallucination drop (-52.5% vs 5.3) is industry signal, not action. Ads launch is a watch-from-a-distance item. | 0 |
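Row 1's spike has a natural harness shape: run both pipelines on the same Pixel-recorded sample and compare median latency plus transcripts side by side. The sketch below is a generic A/B timer, not real model code; the two lambdas are placeholders standing in for the Whisper-tiny+Gemma chain and the E4B audio path, so swap in your actual inference calls.

```python
import time
from typing import Callable, Dict


def compare_pipelines(audio: bytes,
                      pipelines: Dict[str, Callable[[bytes], str]],
                      runs: int = 3) -> Dict[str, dict]:
    """Run each transcription pipeline `runs` times on the same audio,
    recording median wall-clock latency and the last transcript."""
    results = {}
    for name, fn in pipelines.items():
        timings = []
        text = ""
        for _ in range(runs):
            t0 = time.perf_counter()
            text = fn(audio)
            timings.append(time.perf_counter() - t0)
        timings.sort()
        results[name] = {"median_s": timings[len(timings) // 2],
                         "transcript": text}
    return results


# Placeholder callables: replace with the real two-model chain and E4B call.
sample = b"\x00" * 16000
report = compare_pipelines(sample, {
    "whisper_tiny_plus_gemma": lambda a: "two-model transcript",
    "e4b_audio": lambda a: "single-model transcript",
})
for name, r in report.items():
    print(name, round(r["median_s"], 4), r["transcript"])
```

Median over a few runs beats a single timing on-device, where thermal state and background work add noise; keep the sample identical across pipelines so the transcript diff is meaningful.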

Cut with reason


Sources scanned


Cost of this run

8 WebSearch calls + 2 WebFetch calls. Estimated Anthropic spend: ~$0.35-0.55 in input/output tokens (Opus 4.7, ~30K input / ~2K output equivalent). Comfortably under the $1/run cap.
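The tokenizer-drift worry from ranked item 4 reduces to one multiplication, sketched below with this run's token counts and the $5/$25 per MTok figures quoted above. It models raw token cost only; WebSearch/WebFetch overhead and cache writes are not included, and the 1.35x multiplier is the top of the quoted range.

```python
def run_cost(input_tokens: int, output_tokens: int,
             token_multiplier: float = 1.0,
             in_price_per_mtok: float = 5.0,
             out_price_per_mtok: float = 25.0) -> float:
    """Raw token cost in USD after applying a tokenizer-drift multiplier."""
    return (input_tokens * token_multiplier / 1e6 * in_price_per_mtok
            + output_tokens * token_multiplier / 1e6 * out_price_per_mtok)


# This run's ~30K input / ~2K output, at 1.0x and at the 1.35x worst case.
print(round(run_cost(30_000, 2_000), 2))                         # 0.2
print(round(run_cost(30_000, 2_000, token_multiplier=1.35), 3))  # 0.27
```

The same function scales up to the monthly check: a 1.35x multiplier alone moves a $140/mo budget toward $189/mo at unchanged usage, which is the silent drift item 4 warns about.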