Mark's Reports


AI Radar — week of 2026-04-25

The weekly scout is back. The permissions fix worked: sources scanned, candidates ranked, top 3 actions called out.

Filter: Mark's Prime Directive — cost-conscious, local-first, robust, anti-hype. Sources: 7 Gmail AI newsletters, Anthropic / OpenAI / DeepMind blogs, HuggingFace trending, HN top AI. Cost of this run: ~$0.09 (Sonnet 4.6).

What was broken last week: AI Radar's headless session was permission-blocked at all 9 sources two weeks running (4/20 partial, 4/24 fully blocked). Permissions added to Apex's .claude/settings.json, scheduled-task working directory pointed at the TP3 repo, and run_radar.ps1 patched. Tonight's run is the first full successful scan in three weeks.
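The permission fix above amounts to allow-listing the scanner's web fetches in Claude Code's settings. A minimal sketch, assuming the nine sources resolve to domains like the ones below (the real allow-list is whatever Apex's `.claude/settings.json` actually contains):

```json
{
  "permissions": {
    "allow": [
      "WebFetch(domain:anthropic.com)",
      "WebFetch(domain:openai.com)",
      "WebFetch(domain:deepmind.google)",
      "WebFetch(domain:huggingface.co)",
      "WebFetch(domain:news.ycombinator.com)"
    ]
  }
}
```

With the scheduled task's working directory pointed at the TP3 repo, these project-level permissions are picked up by the headless session instead of being silently denied.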

Top 3 actions for this week

  1. Pull Gemma 4 E4B on Apex via Ollama — `ollama pull gemma4:e4b` fetches a model that runs in ~5 GB RAM (well within Apex's 13.8 GB budget). Free, local, multimodal chat/inference. Test it as an Oracle backend: it hits the local-LLM sovereignty goal with zero API cost. (No spend.)
  2. Audit Claude Code token spend this week — Claude Opus 4.7 ships a new tokenizer that consumes up to 35% more tokens at the same $5/$25 per-million price. If Claude Code auto-upgraded, your bill may be climbing without you noticing. Check your Anthropic usage dashboard before the month closes.
  3. Evaluate OpenAI Privacy Filter for TP3 ingest — 1.5B param model (only 50M active), runs locally, scrubs PII from Omi transcripts before they hit the embedding step. One-time ~2-4h integration into tp3_omi_ingest_worker.py. Available free on HuggingFace now. (No spend — model is open-weight.)
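The token audit in action 2 is simple arithmetic, but worth making concrete: at a fixed $5/$25 per-million price, a tokenizer that emits 35% more tokens is a 35% price increase in disguise. A quick sketch — the 35% figure and per-million rates come from this report; the monthly volumes below are hypothetical placeholders to replace with numbers from the Anthropic usage dashboard:

```python
# Cost check behind action 2: same list price, more tokens per request.
PRICE_IN = 5 / 1_000_000    # $ per input token (report's stated rate)
PRICE_OUT = 25 / 1_000_000  # $ per output token (report's stated rate)

def monthly_cost(input_tokens: float, output_tokens: float) -> float:
    """Monthly spend at the stated per-token prices."""
    return input_tokens * PRICE_IN + output_tokens * PRICE_OUT

# Hypothetical month: 10M input / 2M output tokens before the upgrade.
before = monthly_cost(10_000_000, 2_000_000)
# Same workload after auto-upgrade, if the new tokenizer uses 35% more tokens.
after = monthly_cost(10_000_000 * 1.35, 2_000_000 * 1.35)
print(f"before=${before:.2f} after=${after:.2f} delta=${after - before:.2f}")
# → before=$100.00 after=$135.00 delta=$35.00
```

The delta scales linearly with volume, so the bigger the agentic workload, the more a silent model upgrade costs.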

Ranked candidates

| # | Candidate | Source | Score | Why it matters | Integration cost |
|---|-----------|--------|-------|----------------|------------------|
| 1 | Gemma 4 + Ollama v0.20.6 | DeepMind blog · ollama.com | 22/25 | E4B (4B params, 5 GB RAM) runs locally on Apex today. Multimodal, 128K context, Apache 2.0. Ollama v0.20.6 adds native Gemma 4 support + Llama 4 Scout. Direct hit on local-LLM sovereignty. | `ollama pull gemma4:e4b` · ~5 min |
| 2 | OpenAI Privacy Filter | openai.com · huggingface.co | 18/25 | Open-weight, 1.5B total / 50M active params, 96% F1 PII detection, runs fully local, 128K context. Scrubs names / emails / passwords / API keys from Omi transcripts before cloud embedding. | Python integration into ingest worker · ~2-4 h |
| 3 | Claude Opus 4.7 | anthropic.com/news | 17/25 | 87.6% SWE-bench (up from 80.8%), high-res vision (3.75 MP), task budgets for agentic loops. But: new tokenizer uses up to 35% more tokens — same list price hides a real cost increase. Watch your bill. | Already live · audit spend |
| 4 | Gemini 3.1 Flash TTS | deepmind.google | 14/25 | Improved expressiveness and naturalness in AI voice output. Relevant to your morning-digest-to-podcast goal and Bidet AI audio phase. Cloud API only — costs money. | API integration · costs $X/mo, needs your approval |
| 5 | Gemini air-gapped + Deep Research on private data | The Keyword (Apr 24) | 13/25 | Gemini can now run air-gapped (on-prem / VPC) and Deep Research can operate on private enterprise data. Enterprise-tier pricing, not accessible today — but directionally confirms local-sovereign AI is becoming mainstream. | No action — monitor for pricing release |
| 6 | MCP Server Cards (v2.1 spec) | modelcontextprotocol.io | 13/25 | Standard .well-known URL for MCP server discovery — lets registries find TP3 MCP endpoints without connecting. Relevant when omi-mcp-public is exposed more broadly. 10K+ active public MCP servers now. | Future-state · no immediate action |
| 7 | Qwen3.6-35B-A3B-GGUF (Unsloth) | huggingface.co (trending #8) | 13/25 | 1.49M downloads, GGUF-quantized for local inference. Strong multimodal. But: needs ~20 GB RAM at 4-bit — Apex can't run it. File for when Apex gets more RAM or you add a GPU. | Not feasible on Apex today |
| 8 | Google Colab MCP Server | developers.googleblog.com | 13/25 | Open-source MCP server letting any AI agent connect to Google Colab notebooks. Free Colab tier still applies. Could let Antigravity or Apex Claude run notebooks without local GPU. | ~30 min to wire up |
| 9 | Headless Agents | VentureBeat digest (Apr 23) | 11/25 | Theme: enterprises running agents without human-in-the-loop at scale; the governance gap is real. Useful mental model for how TP3 scheduled tasks + Oracle should be framed. | Informational only |
| 10 | OpenAI WebSockets in Responses API | openai.com | 10/25 | Streaming audio + reduced latency for agent workflows. Relevant only if Mark builds with the OpenAI API — he doesn't today. | N/A for current stack |
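Candidate #1's "~5 min" integration is essentially one REST call once the model is pulled. A minimal sketch of wiring it up as an Oracle backend — the model tag `gemma4:e4b` comes from this report; the endpoint and payload shape are Ollama's standard non-streaming `/api/generate` REST interface:

```python
# Query a locally pulled model through the Ollama server on Apex.
# No API key, no spend, nothing leaves the machine.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "gemma4:e4b") -> dict:
    """Non-streaming generate payload for Ollama's REST API."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local(prompt: str) -> str:
    """Send a prompt to the local model and return its full response text."""
    data = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(ask_local("Summarize today's radar in one line."))
```

Swapping Oracle's backend then reduces to pointing its completion call at `ask_local` instead of a cloud API.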

Cut, with reason

Sources scanned

Verified source URLs

Run cost: ~18K input tokens + ~1.2K output tokens · ~$0.09 at Sonnet 4.6 rates. Source markdown lives at tp3_neural_stack/MAPS/ai_radar_2026-04-25.md.
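For reproducibility, the run cost is a two-term product. A back-of-envelope check — token counts come from this report, but the per-million rates below are assumptions (current Sonnet-class list prices; "Sonnet 4.6" rates aren't stated here), so treat the result as an order-of-magnitude sanity check rather than an exact reconciliation:

```python
# Sanity-check the reported per-run cost from token counts.
RATE_IN = 3.0    # $ per million input tokens (assumed rate)
RATE_OUT = 15.0  # $ per million output tokens (assumed rate)

def run_cost(in_tokens: int, out_tokens: int) -> float:
    """Dollar cost of one radar run at the assumed per-million rates."""
    return in_tokens / 1e6 * RATE_IN + out_tokens / 1e6 * RATE_OUT

cost = run_cost(18_000, 1_200)
print(f"~${cost:.2f}")  # lands in the same sub-dime range as the reported ~$0.09
```

At these assumed rates the run comes out just over seven cents; slightly higher actual rates would account for the reported ~$0.09.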