Saturday 4/25 evening — what we actually did today
A plain-language read on today's big rebuild work, what it gets you, and what's next. Posted 2026-04-25 ~9:30 PM ET.
The one-paragraph version
Today we started turning your TP3 Neural Stack from "a pile of Python scripts on Windows" into "a clean, reproducible system in containers." The hard, irreversible prep work is done — backups taken, repo tagged, scheduled tasks exported. Tonight we rebooted both machines and 7 of 8 services came back automatically. The one that didn't (the shared-memory MCP) is the exact gap the next phases close. You can now diagnose, restart, or rebuild your whole stack faster, and we're 60% of the way through the safe-prep phase before any risky changes happen.
What we did today — in plain English
The setup (Phase 0 of the rebuild)
Before Phase 0, your TP3 stack ran like this: a bunch of Python scripts launched by Windows scheduled tasks, each with its own dependencies, talking to a Postgres database that lives directly on Apex. If a script broke, you'd have to remember which one, what version, what library, and whether anything else depended on it. After a reboot, things came back in some half-recovered state — sometimes everything worked, sometimes it didn't, and you couldn't always tell which.
Phase 0 is the prep step that lets us safely move all that into Docker containers. Containers are like sealed shipping crates — each service has its own crate with all its dependencies inside, and the crate runs the same way every single time. Today the work was about making that move safe:
- Backups saved to C:\Users\Breezy\tp3_pre_migration_backup\2026-04-25\. If anything goes sideways, we can roll back to today's data in minutes.
- pre-migration-2026-04-25 tag pushed on main. One command rewinds the code to where it was at start-of-day.
The collateral fixes (not strictly Phase 0, but on the path)
While prepping for the rebuild, two things were hurting Bidet right now and got fixed today:
- processor.py replaced with Apex's Ollama-first version, the Gemini-key gate at app.py line 265 removed, and tp3_configured() no longer treats a missing Gemini key as "not configured." Bidet sessions on G16 now ingest into TP3 via local embeddings.
- .claude/settings.json, scheduled task working directory pointed at the TP3 repo, and run_radar.ps1 now sets cwd explicitly. (See the AI Radar report for tonight's run-it-now results.)
The reboot test (tonight)
You rebooted Apex earlier today and G16 separately. Both came back up. On Apex, 7 of the 8 services came back on their own within seconds: postgres, minio, ingest, embed, bidet, pinger, and autoheal, all started by Docker Compose and all healthy. That's the rebuild paying off already: a year ago a fresh boot meant 20 minutes of "did everything actually start?" checking.
The shared-memory MCP (omi-mcp.thebarnetts.info) is NOT in the Docker Compose stack yet. After reboot it stayed dead, returning 502 errors to every agent that tried to read or write shared memory. We brought it back manually with Restart-OmiMCP.ps1. This is the exact kind of thing Phase 5+ moves into containers so it auto-recovers like the others.
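For a sense of how "the 8th gets named" could be automated after a reboot, here is a minimal sketch. It assumes you feed it the output of `docker compose ps --format json`, which (depending on the Compose version) is either one JSON array or one JSON object per line, with `Service`, `State`, and `Health` fields. The function names and the `omi-mcp` service label are hypothetical, picked for illustration; this is not the script that ran tonight.

```python
import json


def parse_compose_ps(text):
    """Parse `docker compose ps --format json` output.

    Accepts both shapes seen across Compose versions:
    a single JSON array, or one JSON object per line.
    """
    text = text.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def services_needing_attention(entries, expected):
    """Return sorted names of expected services that are missing,
    not running, or reporting a non-healthy health status."""
    seen = {entry.get("Service"): entry for entry in entries}
    bad = []
    for name in expected:
        entry = seen.get(name)
        if entry is None:
            # Not managed by Compose at all, or never started.
            bad.append(name)
            continue
        running = entry.get("State") == "running"
        # Health is empty when a container defines no healthcheck;
        # this sketch treats that as acceptable.
        healthy = entry.get("Health", "") in ("", "healthy")
        if not (running and healthy):
            bad.append(name)
    return sorted(bad)


if __name__ == "__main__":
    # Hypothetical expected list: the seven Compose services plus the MCP.
    expected = ["postgres", "minio", "ingest", "embed",
                "bidet", "pinger", "autoheal", "omi-mcp"]
    sample = "\n".join(
        json.dumps({"Service": s, "State": "running", "Health": "healthy"})
        for s in expected[:-1]
    )
    print(services_needing_attention(parse_compose_ps(sample), expected))
```

Run against a sample where only the seven Compose services report in, it lists the one name that is missing, which is the same answer we got by hand tonight.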
What you actually get from this
| Before today | After today (and where Phase 0 finishes) |
|---|---|
| Reboot = "did everything come back? not sure" | Reboot = 7/8 services healthy in 60 seconds, the 8th gets named |
| Service crashes = manual restart, hope you remember which script | autoheal kills the bad one, restart-policy brings it back |
| Code-vs-environment drift causes silent failures (today's faster-whisper bug) | Image is built from requirements.txt — if it's not in the image, it's not in the running code |
| Rolling back = "what was the state yesterday again?" | One git checkout pre-migration-2026-04-25 + restore postgres backup |
| Each service has its own setup quirks | Same shape every time: docker compose up |
What's still pending in Phase 0
- 0.0.0.0 (server bind setting), but Bidet's processor treats it as a client URL. Two ways to fix: change the Windows env var to a URL, or patch the processor to normalize bare hosts. Going with the patch: cleaner, doesn't touch Windows globals.
- os.environ.get() in the scripts, diff against .env, find anything missing before container migration. Quick.
- .env. So if a service stops pinging, you get notified.
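To make the "normalize bare hosts" patch concrete, here is a rough sketch of what such a helper might look like. It is not the actual change to Bidet's processor. The function name, the fallback to 127.0.0.1, and the default port of 11434 (Ollama's usual default) are all assumptions for illustration, and it only handles hostnames and IPv4 addresses.

```python
from urllib.parse import urlsplit

# Assumed default port (Ollama's usual default); not stated in this report.
DEFAULT_PORT = 11434


def normalize_host(value, default_port=DEFAULT_PORT):
    """Turn a server-style bind value (e.g. "0.0.0.0") into a client URL.

    Hypothetical helper: a bind-all address is not something a client can
    connect to, so it is swapped for loopback. Values that are already
    URLs pass through with scheme, host, and port preserved.
    """
    value = (value or "").strip()
    if not value:
        return f"http://127.0.0.1:{default_port}"
    if "://" not in value:
        # Bare host or host:port, so assume plain HTTP.
        value = "http://" + value
    parts = urlsplit(value)
    host = parts.hostname or "127.0.0.1"
    if host == "0.0.0.0":
        # Bind-all address: connect over loopback instead.
        host = "127.0.0.1"
    port = parts.port or default_port
    return f"{parts.scheme}://{host}:{port}"
```

With this shape, the Windows env var can stay exactly as it is for the server side, and the client side calls the helper once when it reads the setting.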
Where we go from here
- Finish Phase 0 (~2 hours of work split across the items above). Once these are done, Phase 0 is closed and the system is ready for the actual migration.
- Phase 1 — Re-embed. Re-process the ~3,500 rows that were embedded with Gemini back into local embeddings. COPY-only — original rows untouched until the new ones verify.
- Phase 2-4 — Container migration. Move ingest, embed, and the supporting Python services into Compose. This is what's already running for postgres + minio + bidet.
- Phase 5 — Bidet in Compose (already done partway — Bidet container is up).
- Phase 6 — MCP servers in Compose (this closes tonight's gap — omi-mcp + biometric-mcp auto-recover after reboots).
- Phase 7 — Clean-run window. One full week with no manual touch. If the stack survives a week without intervention, the rebuild is real.
- After that — LiteLLM harness layer. The 4/14 plan to put one OpenAI-compatible proxy in front of all your local + cloud LLMs with routing rules. Big win for "go all local" but only after the operator layer is rock-solid.
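The "COPY-only" idea in Phase 1 can be sketched roughly like this: build the new embeddings as separate records, check them against the source rows, and only then decide what to do with the originals. Everything here is hypothetical (the row shape, the function names, and the `embed_fn` callable that stands in for the local embedding model); the real run would work against Postgres rather than Python lists.

```python
def reembed_copy_only(rows, embed_fn, expected_dim):
    """Build new embedding records without modifying the source rows.

    rows: iterable of dicts with "id" and "text" keys (assumed shape).
    embed_fn: callable that maps text to a sequence of floats.
    """
    staged = []
    for row in rows:
        vector = embed_fn(row["text"])
        if len(vector) != expected_dim:
            raise ValueError(
                f"row {row['id']}: expected {expected_dim} dims, got {len(vector)}"
            )
        # A fresh record that points back at the original; nothing is overwritten.
        staged.append({"source_id": row["id"], "embedding": list(vector)})
    return staged


def staged_matches_source(rows, staged):
    """Verify that every source row has exactly one staged copy."""
    source_ids = [row["id"] for row in rows]
    staged_ids = [record["source_id"] for record in staged]
    return sorted(source_ids) == sorted(staged_ids)
```

Only once `staged_matches_source` passes (plus whatever quality spot-checks you want) would the new embeddings be promoted, so a bad run costs nothing but time.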
What I need from you (and only you)
- Granulator path pick — Path 1 (cement mixer), Path 2 (Japanese-style stainless bowl), or Path 3 (cake turntable POC)? See the Pearl granulator report. Once you pick, I prep a printable shopping list.
- Bonsai-8B test — local model worth a download for side-by-side with gemma3:4b. Y/N. Download-only, no spend.
- Confirm I should add MCP server containerization to the Phase 6 plan as an explicit task (so it doesn't slip again).
Lessons saved tonight (so they don't repeat)
- Tailscale-first hypothesis. When G16 can't reach Apex services, the default theory is a local Tailscale problem, not a server outage. Confirmed tonight when the omi-mcp 502 turned out to be partly a tunnel-down and partly MCP-orphaned.
- omi-mcp not in compose. Documented as a gap so future agents post-reboot know to fire Restart-OmiMCP.ps1 until Phase 6 fixes it for good.
- Post-rebuild state snapshot. What's running, what's disabled, and which scheduled tasks now belong to compose vs. standalone, all captured for any future agent context.