Drafted 2026-05-23 after today's 22-minute brain dump crashed during transcription. Plan only — no code changed yet. Review and redline before any touch.
- app.py (888 lines), always-on-top dark window, title Bidet AI — Honest Answers
- recorder.py: sounddevice 16 kHz mono, RMS silence detection, beep ladder at 30/45/60s (NOT the 10/15/20 in HANDOFF; already drifted)
- transcriber.py: openai-whisper library, model medium, GPU fp16 on RTX 4070, one big blocking model.transcribe() call, regex de-loop pass after
- processor.py: Ollama local-first (gemma3:4b) with Gemini cloud fallback. Glossary + speaker-context prepended to every prompt
- distributor.py: webhook POST + Google Drive upload (Drive currently skipped, no DRIVE_FOLDER_ID in .env)
- tp3_ingest.py: MinIO audio archive + Postgres ingest with Gemini embedding (works when Ollama is up)
- ~/.bidet/prompts.json: user-overridable per-tab, plus custom tab support
- Clean Raw (builtin clean)
- Clean Analysis (builtin analysis)
- Clean for AI (builtin forai)
- Clean for Judges (builtin judges) + a Generate Judges Pitch button below the notebook
- bidet_ai.ico exists in repo root (256×256 RGBA, looks old per Mark)
- app.py has no iconbitmap / iconphoto call, so the running window shows Tk's default feather. The .ico is just sitting there.
- Today's working sessions in bidet_ai.log: 30+ prior runs, all 30s–3min audio → transcribe in 30s–2min, "Whisper done: X chars" always logs. One outlier today.
Log shows exactly one line for today:
12:16:01 INFO Starting Whisper transcription of …audio_2026-05-23_12-16-01.wav
Then nothing. No Whisper done, no Pipeline failed, no exception. App went black.
The pipeline() thread in app.py:659 is wrapped in try/except Exception and would have logged anything Python-level. So model.transcribe() either:
- hung inside openai-whisper and never returned. With compression_ratio_threshold=2.0 and condition_on_previous_text defaulted on, openai-whisper has a known long-audio failure mode: 30s chunks where it gets stuck repeating, retries internally, and never returns. The dedup regex runs after transcribe; it cannot help if transcribe never finishes.
- died in native code with nothing logged. stderr is swallowed (io.StringIO() at the top of app.py and transcriber.py; see lines 17–20 of both). If the Whisper C extension faulted, the GUI process would die silently.
- kept running but starved the UI. pipeline() runs in a daemon thread so Tk should still pump, but very long GPU work with the Python GIL released and re-acquired in chunks can stall Tk's event loop on Windows. Result: the window stops repainting, Mark sees "black," kills it.

We cannot know which without re-running with proper stderr capture. Mitigation regardless of which one it was:
- faster-whisper (CTranslate2 backend). Already installed in system Python (Mark used it to recover the 22-min audio today; it succeeded on CPU int8 in reasonable time and would be much faster on GPU). It:
  - has a VAD filter (vad_filter=True) that solves the loop-on-silence problem at the source instead of regex'ing it later
- Real stderr/stdout to a log file instead of StringIO(). The current if sys.stderr is None: sys.stderr = io.StringIO() block silently eats every CUDA / cuBLAS / ctranslate2 message. Route it to a rotating file handler so the next mystery crash has a tail.

- Window title: Bidet AI — Honest Answers → Bidet AI. One line in app.py:146.
- Wire the icon: self.iconbitmap(default=str(PROJECT_ROOT / "bidet_ai.ico")) in _build_ui. NEW icon asset required from Mark: the current .ico is the old logo he wants replaced. Need him to drop the new PNG/SVG in repo root; I'll convert it to a multi-resolution .ico (16/32/48/64/128/256) for taskbar + alt-tab + title bar.
- Unclean Raw: today Clean Raw shows the cleaned output, not the raw transcript. Mark wants the raw verbatim Whisper output as Tab 1. Add a new builtin raw tab; current clean moves to Tab 2 with a different prompt.
- Clean for Me: use the cornell directive (already wired in FORMAT_DIRECTIVES). Prompt rewrite needed: current CLEAN_PROMPT returns prose; Mark wants tight grouped bullets.
- Clean for AI: keep the current forai behavior. Label rename only.
- Clean Summary: today analysis returns a structured-headings report (summary / topics / actions / decisions / questions / follow-ups). Mark wants a prose summary "anyone could read." Trim headings, output 3–6 paragraphs.
- Retire the Clean for Judges tab + Generate Judges Pitch button. Contest is shipped, no longer needed. Per the sister rule from today (feedback_never_delete_work_archive_2026-05-23), archive, don't delete:
  - Move JUDGES_PROMPT, format_for_judges, _run_judges, _fill_judges into archive/judges_mode.py so the strings survive
  - Remove the judges tab from the default tab list + remove the button from _build_ui
  - ~/.bidet/prompts.json migration: drop the judges entry on first run after upgrade so Mark's existing state doesn't re-add it
- ~/.bidet/prompts.json on Mark's machine has the old four-tab structure. On startup, detect the old schema and rewrite it to the new four-tab structure, preserving any overrides Mark made on clean / analysis / forai. Back up the old file as prompts.json.bak.20260523 first.

| Upgrade | Why | Cost |
|---|---|---|
| faster-whisper backend with segment streaming + VAD | Fixes today's crash class, faster, partial output survives crash | ~1hr swap. Side-by-side test against openai-whisper on Mark's voice sample required before cutover (HANDOFF requirement) |
| Live transcription progress | Currently zero feedback during a 5+ min transcribe. Show segment count + last 80 chars of newest segment in the status bar | 30 min, comes free with faster-whisper iterator |
| Real stderr/stdout to log file | The io.StringIO() swallow is the reason today's crash is a mystery | 10 min |
| Wire the icon | Visible win every time Mark sees the taskbar | 5 min once new icon arrives |
| Waveform meter on record button | Visual confirmation mic is hot; today's silence detection is invisible until a beep | 1hr, sounddevice already provides RMS |
| Crash-resilient pipeline | Write raw_*.txt to disk as transcribe streams, not after it completes. Drive/TP3 ingest the partial if app dies. | 1hr |
| Whisper model bump to large-v3 faster-whisper | Mark's recovery today used it; tiny WER improvement on his voice. RTX 4070 12GB handles it in fp16 with ~5 GB headroom | Test required first |
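
The segment-streaming, live-progress, and crash-resilient rows in the table could fit together roughly as sketched here. This is a minimal sketch, not the planned implementation: stream_segments_to_file, transcribe_streaming, and the on_progress callback are hypothetical names, and the output path is whatever raw_*.txt file the pipeline would choose. It relies on faster-whisper's WhisperModel.transcribe() returning a lazy segment generator plus an info object, with vad_filter=True enabling the VAD pass.

```python
from pathlib import Path
from typing import Callable, Iterable, Optional


def stream_segments_to_file(
    segments: Iterable,
    out_path: Path,
    on_progress: Optional[Callable[[int, str], None]] = None,
) -> str:
    """Write each segment to out_path as it arrives and return the full text.

    `segments` is any iterable of objects with a `.text` attribute, which is
    what faster-whisper's segment generator yields.
    """
    parts = []
    with open(out_path, "w", encoding="utf-8") as fh:
        for count, seg in enumerate(segments, start=1):
            text = seg.text.strip()
            parts.append(text)
            fh.write(text + "\n")
            fh.flush()  # hand the line to the OS now so a later crash keeps it
            if on_progress is not None:
                on_progress(count, text[-80:])  # segment count + last 80 chars
    return " ".join(parts)


def transcribe_streaming(wav_path: str, out_path: Path, on_progress=None) -> str:
    """Hypothetical stand-in for the single blocking model.transcribe() call."""
    # Imported lazily so the helper above can be exercised without the library.
    from faster_whisper import WhisperModel

    # Model size, device, and precision mirror the current setup (medium, GPU fp16).
    model = WhisperModel("medium", device="cuda", compute_type="float16")
    # transcribe() returns a lazy generator; decoding happens while iterating.
    segments, _info = model.transcribe(wav_path, vad_filter=True)
    return stream_segments_to_file(segments, out_path, on_progress)
```

Because every segment is flushed as soon as it is decoded, a hang or native fault partway through would still leave the completed segments on disk, and the same callback could feed the status bar.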
tk.Tk in 90% of cases. Risk: minor layout drift, +1 dep. Payoff: looks more 2026.

- Unclean Raw, rework clean/analysis/forai prompts for Clean for Me / Clean Summary / Clean for AI (1hr)
- ~/.bidet/prompts.json migration with .bak (15 min)

New deps proposed (Phase A): faster-whisper>=1.2.1 (already in system Python; add to venv + requirements.txt). No others.
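
The prompts.json migration with .bak might look something like the sketch below. It is illustrative only: the plan does not show the file's schema, so this assumes a flat JSON object keyed by builtin tab name (clean, analysis, forai, judges). migrate_prompts and RETIRED_TABS are hypothetical names. Overrides on the surviving tabs are preserved simply because their entries are copied through untouched.

```python
import json
import shutil
from pathlib import Path

RETIRED_TABS = {"judges"}  # tab keys dropped by this upgrade (assumed key name)


def migrate_prompts(path: Path, backup_suffix: str = ".bak.20260523") -> bool:
    """Drop retired tab entries from prompts.json, backing the file up first.

    Returns True if the file was rewritten, False if nothing needed changing.
    Assumes the file is a flat JSON object keyed by tab name.
    """
    if not path.exists():
        return False
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, dict) or not RETIRED_TABS.intersection(data):
        return False  # already migrated, or a shape this sketch does not know

    backup = path.with_name(path.name + backup_suffix)
    if not backup.exists():  # never overwrite an earlier backup
        shutil.copy2(path, backup)

    migrated = {k: v for k, v in data.items() if k not in RETIRED_TABS}
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(migrated, indent=2, ensure_ascii=False), encoding="utf-8")
    tmp.replace(path)  # swap in one step so a crash cannot leave a half-written file
    return True
```

Calling it once at startup, before the tabs are built, would make the second and later runs a no-op, since the retired key is gone after the first pass.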
Open question for Mark: New icon asset? Source file (PNG/SVG) and target style. I'll convert to multi-res .ico.
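
For the conversion step, a one-off script along these lines would probably do. It is a sketch under assumptions: it uses Pillow, which is not among the proposed app dependencies and would only be needed on the machine doing the conversion; build_ico and ICO_SIZES are hypothetical names; and an SVG source would have to be rasterized to PNG first, since Pillow does not read SVG.

```python
from pathlib import Path

# Sizes listed in the plan: taskbar, alt-tab, and title bar variants.
ICO_SIZES = [(s, s) for s in (16, 32, 48, 64, 128, 256)]


def build_ico(src_png: Path, dst_ico: Path) -> Path:
    """Convert a square PNG into a multi-resolution .ico using Pillow."""
    from PIL import Image  # lazy import: Pillow is an assumed tooling dependency

    with Image.open(src_png) as img:
        # Pillow scales the source down to each requested size; sizes larger
        # than the source image are skipped, so start from at least 256x256 art.
        img.convert("RGBA").save(dst_ico, format="ICO", sizes=ICO_SIZES)
    return dst_ico
```

The resulting file would replace bidet_ai.ico in the repo root, so the iconbitmap call in the plan needs no path change.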