Mark's Reports


AI Radar — week of 2026-04-25

The weekly scout is back. The permissions fix worked: sources scanned, candidates ranked, top 3 actions called out.

Filter: Mark's Prime Directive — cost-conscious, local-first, robust, anti-hype. Sources: 7 Gmail AI newsletters, Anthropic / OpenAI / DeepMind blogs, HuggingFace trending, HN top AI. Cost of this run: ~$0.09 (Sonnet 4.6).

What was broken last week: AI Radar's headless session was permission-blocked at all 9 sources two weeks running (4/20 partial, 4/24 fully blocked). Permissions added to Apex's .claude/settings.json, scheduled-task working directory pointed at the TP3 repo, and run_radar.ps1 patched. Tonight's run is the first full successful scan in three weeks.
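The permission fix above amounts to allow-listing the scanner's web fetches in Claude Code's settings. A minimal sketch, assuming the nine sources resolve to domains like the ones below (the real allow-list is whatever Apex's `.claude/settings.json` actually contains):

```json
{
  "permissions": {
    "allow": [
      "WebFetch(domain:anthropic.com)",
      "WebFetch(domain:openai.com)",
      "WebFetch(domain:deepmind.google)",
      "WebFetch(domain:huggingface.co)",
      "WebFetch(domain:news.ycombinator.com)"
    ]
  }
}
```

With the scheduled task's working directory pointed at the TP3 repo, these project-level permissions are picked up by the headless session instead of being silently denied.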

Top 3 actions for this week

  1. Pull Gemma 4 E4B on Apex via Ollama — `ollama pull gemma4:e4b` fetches a model that runs in ~5 GB RAM (well within Apex's 13.8 GB budget). Free, local, multimodal chat/inference. Test it as an Oracle backend: it hits the local-LLM sovereignty goal with zero API cost. (No spend.)
  2. Audit Claude Code token spend this week — Claude Opus 4.7 ships a new tokenizer that consumes up to 35% more tokens at the same $5/$25 per-million price. If Claude Code auto-upgraded, your bill may be climbing without you noticing. Check your Anthropic usage dashboard before the month closes.
  3. Evaluate OpenAI Privacy Filter for TP3 ingest — 1.5B param model (only 50M active), runs locally, scrubs PII from Omi transcripts before they hit the embedding step. One-time ~2-4h integration into tp3_omi_ingest_worker.py. Available free on HuggingFace now. (No spend — model is open-weight.)
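The token audit in action 2 is simple arithmetic, but worth making concrete: at a fixed $5/$25 per-million price, a tokenizer that emits 35% more tokens is a 35% price increase in disguise. A quick sketch — the 35% figure and per-million rates come from this report; the monthly volumes below are hypothetical placeholders to replace with numbers from the Anthropic usage dashboard:

```python
# Cost check behind action 2: same list price, more tokens per request.
PRICE_IN = 5 / 1_000_000    # $ per input token (report's stated rate)
PRICE_OUT = 25 / 1_000_000  # $ per output token (report's stated rate)

def monthly_cost(input_tokens: float, output_tokens: float) -> float:
    """Monthly spend at the stated per-token prices."""
    return input_tokens * PRICE_IN + output_tokens * PRICE_OUT

# Hypothetical month: 10M input / 2M output tokens before the upgrade.
before = monthly_cost(10_000_000, 2_000_000)
# Same workload after auto-upgrade, if the new tokenizer uses 35% more tokens.
after = monthly_cost(10_000_000 * 1.35, 2_000_000 * 1.35)
print(f"before=${before:.2f} after=${after:.2f} delta=${after - before:.2f}")
# → before=$100.00 after=$135.00 delta=$35.00
```

The delta scales linearly with volume, so the bigger the agentic workload, the more a silent model upgrade costs.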

Ranked candidates

| # | Candidate | Source | Score | Why it matters | Integration cost |
|---|-----------|--------|-------|----------------|------------------|
| 1 | Gemma 4 + Ollama v0.20.6 | DeepMind blog · ollama.com | 22/25 | E4B (4B params, 5 GB RAM) runs locally on Apex today. Multimodal, 128K context, Apache 2.0. Ollama v0.20.6 adds native Gemma 4 support + Llama 4 Scout. Direct hit on local-LLM sovereignty. | `ollama pull gemma4:e4b` · ~5 min |
| 2 | OpenAI Privacy Filter | openai.com · huggingface.co | 18/25 | Open-weight, 1.5B total / 50M active params, 96% F1 PII detection, runs fully local, 128K context. Scrubs names / emails / passwords / API keys from Omi transcripts before cloud embedding. | Python integration into ingest worker · ~2-4 h |
| 3 | Claude Opus 4.7 | anthropic.com/news | 17/25 | 87.6% SWE-bench (up from 80.8%), high-res vision (3.75 MP), task budgets for agentic loops. But: new tokenizer uses up to 35% more tokens — same list price hides a real cost increase. Watch your bill. | Already live · audit spend |
| 4 | Gemini 3.1 Flash TTS | deepmind.google | 14/25 | Improved expressiveness and naturalness in AI voice output. Relevant to your morning-digest-to-podcast goal and Bidet AI audio phase. Cloud API only — costs money. | API integration · costs $X/mo, needs your approval |
| 5 | Gemini air-gapped + Deep Research on private data | The Keyword (Apr 24) | 13/25 | Gemini can now run air-gapped (on-prem / VPC) and Deep Research can operate on private enterprise data. Enterprise-tier pricing, not accessible today — but directionally confirms local-sovereign AI is becoming mainstream. | No action — monitor for pricing release |
| 6 | MCP Server Cards (v2.1 spec) | modelcontextprotocol.io | 13/25 | Standard .well-known URL for MCP server discovery — lets registries find TP3 MCP endpoints without connecting. Relevant when omi-mcp-public is exposed more broadly. 10K+ active public MCP servers now. | Future-state · no immediate action |
| 7 | Qwen3.6-35B-A3B-GGUF (Unsloth) | huggingface.co (trending #8) | 13/25 | 1.49M downloads, GGUF-quantized for local inference. Strong multimodal. But: needs ~20 GB RAM at 4-bit — Apex can't run it. File for when Apex gets more RAM or you add a GPU. | Not feasible on Apex today |
| 8 | Google Colab MCP Server | developers.googleblog.com | 13/25 | Open-source MCP server letting any AI agent connect to Google Colab notebooks. Free Colab tier still applies. Could let Antigravity or Apex Claude run notebooks without local GPU. | ~30 min to wire up |
| 9 | Headless Agents | VentureBeat digest (Apr 23) | 11/25 | Theme: enterprises running agents without human-in-the-loop at scale; the governance gap is real. Useful mental model for how TP3 scheduled tasks + Oracle should be framed. | Informational only |
| 10 | OpenAI WebSockets in Responses API | openai.com | 10/25 | Streaming audio + reduced latency for agent workflows. Relevant only if Mark builds with the OpenAI API — he doesn't today. | N/A for current stack |
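Candidate #1's "~5 min" integration is essentially one REST call once the model is pulled. A minimal sketch of wiring it up as an Oracle backend — the model tag `gemma4:e4b` comes from this report; the endpoint and payload shape are Ollama's standard non-streaming `/api/generate` REST interface:

```python
# Query a locally pulled model through the Ollama server on Apex.
# No API key, no spend, nothing leaves the machine.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "gemma4:e4b") -> dict:
    """Non-streaming generate payload for Ollama's REST API."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local(prompt: str) -> str:
    """Send a prompt to the local model and return its full response text."""
    data = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(ask_local("Summarize today's radar in one line."))
```

Swapping Oracle's backend then reduces to pointing its completion call at `ask_local` instead of a cloud API.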

Cut, with reason

Sources scanned

Verified source URLs

Run cost: ~18K input tokens + ~1.2K output tokens · ~$0.09 at Sonnet 4.6 rates. Source markdown lives at tp3_neural_stack/MAPS/ai_radar_2026-04-25.md.
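For reproducibility, the run cost is a two-term product. A back-of-envelope check — token counts come from this report, but the per-million rates below are assumptions (current Sonnet-class list prices; "Sonnet 4.6" rates aren't stated here), so treat the result as an order-of-magnitude sanity check rather than an exact reconciliation:

```python
# Sanity-check the reported per-run cost from token counts.
RATE_IN = 3.0    # $ per million input tokens (assumed rate)
RATE_OUT = 15.0  # $ per million output tokens (assumed rate)

def run_cost(in_tokens: int, out_tokens: int) -> float:
    """Dollar cost of one radar run at the assumed per-million rates."""
    return in_tokens / 1e6 * RATE_IN + out_tokens / 1e6 * RATE_OUT

cost = run_cost(18_000, 1_200)
print(f"~${cost:.2f}")  # lands in the same sub-dime range as the reported ~$0.09
```

At these assumed rates the run comes out just over seven cents; slightly higher actual rates would account for the reported ~$0.09.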