31-minute Bidet AI brain dump — cleaned for others
Original session: 2026-05-09, 16:49 ET → 17:21 ET. 31 min 28 sec. 18,648 chars raw. The on-device "Clean for others" generate hung at ~17:22 ET; this document is the recovery, produced from the recovered RAW transcript.
I built this app because I couldn't write report-card comments
I'm a middle-school teacher. Every quarter I have to write personalized comments for every student, and I couldn't. I'd sit and stare at the screen and overthink and overanalyze and question myself until what came out was generic, because generic was all I had energy left to produce. Then I started doing voice brain-dumps and feeding the audio to AI to organize, and that changed everything. I built a separate program that takes those brain-dumps and inserts the cleaned-up comments straight into the gradebook. Bidet AI is the generalization of that. It's the same trick (speak freely, let AI organize) but for any moment when the keyboard is the bottleneck.
For me, that's most moments.
What the app actually does
Bidet AI is an Android app. You tap Record, you talk for as long as you want, you tap Stop. The phone transcribes everything on-device — no cloud, no telemetry, nothing leaves the phone. Then you get three views of the same content:
- RAW — the verbatim transcript, exactly as you said it
- Clean for me — reorganized for your own re-reading: topic-grouped, repetitions collapsed, the parts you said to yourself preserved
- Clean for others — polished prose someone else can read and understand. The version I'd send to a colleague, drop into a doc, paste into a writeup
There's a "Show what changed" toggle on both Clean tabs that highlights what the AI added or removed compared to your raw, so you stay the author of your own intellect — you can see the model's edits, accept or push back. (That's the Vygotskyan-scaffolding answer to the "AI writes for you, you lose the skill" critique. The model doesn't replace your thinking; it makes your thinking legible.)
Why this matters beyond me
There are a lot of people who feel like they're not understood, and a lot of people who feel like they don't understand others. It's isolating. Bidet AI is a way to communicate in someone else's language without losing your own. It interprets you when people don't get you; even when a general-purpose AI can't parse what you're saying, it makes you understandable.
The first audience I think about is my own students. Some of them have handwriting that's almost unreadable. Some have typing accommodations on their IEPs. Some can't get a thought out in writing fast enough to finish before they lose it. If a kid can tell me the story of what we just studied — disjointed, jumbled, full of "and then, like, and so" — Bidet AI can produce a version I can read, and I can see whether they actually got it. The transcript becomes the accommodation. That's the policy hook for getting this approved as an assistive technology under existing 504 / IEP frameworks.
The contest decision
We're submitting to the Kaggle Gemma 4 Good Hackathon. Deadline May 18. Verified prize tree:
- Main Track — for the best overall projects ($100K total, four placements)
- Impact Track — Future of Education ($10K) — "multi-tool agents that adapt to the individual and empower the educator" — that's literally the pitch
- Special Technology Track — Cactus prize ($10K) — "best local-first mobile or wearable application that intelligently routes tasks between models" — that's literally the architecture: Whisper-tiny for speech, Gemma 4 for the cleaning, both on-device (a rough sketch follows below)
The three buckets stack — a single submission can win all three. Theoretical ceiling around $70K from one APK. I'm not betting on first place — I'd settle for "got their attention." But the rubric is 70% video pitch + vision narrative and 30% technical depth. I've already engineered the technical depth; the video is the lever.
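To ground the routing claim: a minimal sketch of the Whisper → Gemma split, under assumptions. `SpeechToText`, `TextCleaner`, and `BrainDumpPipeline` are illustrative names; the real on-device bindings (a whisper.cpp wrapper, a LiteRT-LM Gemma session) have their own APIs.

```kotlin
// Hypothetical on-device pipeline: the speech model handles audio,
// the language model handles cleanup. All names are illustrative.
interface SpeechToText { suspend fun transcribe(audio: ByteArray): String }
interface TextCleaner { suspend fun clean(raw: String, instruction: String): String }

class BrainDumpPipeline(
    private val stt: SpeechToText, // e.g. a Whisper-tiny JNI wrapper
    private val llm: TextCleaner   // e.g. a LiteRT-LM Gemma session
) {
    data class Views(val raw: String, val forMe: String, val forOthers: String)

    suspend fun process(audio: ByteArray): Views {
        val raw = stt.transcribe(audio) // route 1: audio goes to the speech model
        val forMe = llm.clean(          // route 2: text goes to the language model
            raw, "Reorganize by topic for the speaker's own re-reading; keep asides.")
        val forOthers = llm.clean(
            raw, "Rewrite as polished prose a colleague can read.")
        return Views(raw = raw, forMe = forMe, forOthers = forOthers)
    }
}
```

Each task goes to the model that's good at it, which is exactly the behavior the Cactus rubric names.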
The video concept
I have a cartoon intro idea I want to use: take the bidet metaphor literally for the first 8 seconds.
A character walks up to a toilet, bends over like he's about to throw up. Instead the top of his skull opens and his brain falls in. Skull closes. Then a fountain of water sprays the brain back up out of the toilet, all clean and sparkly. The brain hovers there, gleaming.
"Take a brain dump. Bidet AI cleans the mess."
That's the hook. I want to generate that 8-second clip with Google's AI Studio (Gemini Pro / Veo) and a new logo with Gemini's image API.
After the cartoon, I want the rest of the three minutes to be:
1. The personal story — me, briefly, on why I built it
2. A sped-up phone demo — record → RAW → Clean for me → Clean for others, side-by-side, so you can see the transformation
3. A who-it's-for sweep — students with handwriting trouble, ELL kids, ADHD, dyslexic, late-deaf, anyone whose voice outruns their fingers
4. An open ending — I want judges to fill in the next use case themselves. "Oh yeah, and what about..." That's the win.
The honest blockers
- Whisper-vs-Gemma flavor question: the app currently ships in two flavors (see the build-config sketch after this list). The Whisper flavor (Whisper-tiny → Gemma) works reliably today. The Gemma-only flavor (audio straight into Gemma 4's multimodal mode) is the more impressive technical story but hasn't reliably produced a transcript yet. We pick the Whisper flavor by default unless Gemma mode lands solidly in the next 48 hours. The Cactus prize doesn't require a single model — it explicitly rewards "intelligently routing between models," which is what Whisper → Gemma does.
- Unsloth fine-tune narrative: I'd love to ship a per-user fine-tune ("the more it learns me, the better it is") to qualify for the Unsloth $10K bonus. The blocker is that runtime LoRA adapters don't load on Gemma 4 + LiteRT-LM today; the workaround is a full 3.6 GB model swap per retrain, which is defensible if retraining happens quarterly. We test the converter path this week. If it works, it's in. If not, we mention it as roadmap and skip the bonus.
- One submission per team. I have three builds — phone, desktop, server. I can't submit them separately. The desktop and server builds are proof the architecture generalizes; only the phone APK gets shipped.
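For the flavor decision in the first blocker, a sketch of what the two-flavor split can look like in Gradle's Kotlin DSL. The dimension name, flavor names, and `AUDIO_VIA_GEMMA` field are guesses for illustration, not the repo's actual config.

```kotlin
// build.gradle.kts (app module): illustrative flavor split; names are guesses.
android {
    buildFeatures { buildConfig = true }
    flavorDimensions += "engine"
    productFlavors {
        create("whisper") {   // Whisper-tiny -> Gemma: the default ship
            dimension = "engine"
            buildConfigField("boolean", "AUDIO_VIA_GEMMA", "false")
        }
        create("gemmaOnly") { // audio straight into Gemma 4 multimodal
            dimension = "engine"
            buildConfigField("boolean", "AUDIO_VIA_GEMMA", "true")
        }
    }
}
```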
What's already done
- Three-tab UI shipped (RAW · Clean for me · Clean for others) with the "Show what changed" diff toggle
- Long-press copy on RAW plus an explicit Copy button, because long-press wasn't reliable (a generic snippet of what the button does is sketched after this list)
- Both flavor APKs built, signed, installed on my Pixel 8 Pro with Gemma 4 E4B (3.6 GB) cached locally
- 31-minute brain dump captured cleanly today — 63 audio chunks, full transcript, exactly as I spoke it
- Verified Kaggle prize tree direct from the contest page (logged-in session)
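For the Copy button, the fix is a couple of framework calls. This is a generic Android clipboard snippet, not the app's own code.

```kotlin
import android.content.ClipData
import android.content.ClipboardManager
import android.content.Context

// Generic Android clipboard copy: what an explicit Copy button does
// when long-press text selection is unreliable.
fun copyToClipboard(context: Context, text: String) {
    val clipboard =
        context.getSystemService(Context.CLIPBOARD_SERVICE) as ClipboardManager
    clipboard.setPrimaryClip(ClipData.newPlainText("RAW transcript", text))
}
```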
What's blocking me right now
The Generate step on Clean for others hung for 5+ minutes on this 31-minute input and never returned. That's the next thing to fix before the contest demo — output token cap, streaming output so I can see progress, and a foreground service so screen-sleep doesn't kill it.
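A sketch of those three fixes together. `StreamingLlm` is an interface invented here for illustration (the real LiteRT-LM call will differ); the foreground-service scaffolding is standard Android.

```kotlin
import android.app.Notification
import android.app.NotificationChannel
import android.app.NotificationManager
import android.app.Service
import android.content.Intent
import android.os.IBinder
import kotlinx.coroutines.*

// Hypothetical streaming generator; invented for illustration.
interface StreamingLlm {
    suspend fun generate(prompt: String, maxTokens: Int, onToken: (String) -> Unit)
}

class CleanGenerateService : Service() {
    private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
    lateinit var llm: StreamingLlm // wired up at app init; illustrative

    override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
        // Fix 3: foreground service, so screen-sleep doesn't kill the job.
        startForeground(1, buildNotification())
        val prompt = intent?.getStringExtra("prompt")
        if (prompt == null) { stopSelf(); return START_NOT_STICKY }
        scope.launch {
            val sb = StringBuilder()
            // Fix 1: a token cap bounds the 31-minute worst case.
            // Fix 2: stream partials so the UI shows progress, not a hang.
            llm.generate(prompt, maxTokens = 2048) { token ->
                sb.append(token)
                publishPartial(sb.toString()) // e.g. via a flow or broadcast
            }
            stopSelf()
        }
        return START_NOT_STICKY
    }

    private fun buildNotification(): Notification {
        val channelId = "generate"
        getSystemService(NotificationManager::class.java).createNotificationChannel(
            NotificationChannel(channelId, "Generation", NotificationManager.IMPORTANCE_LOW))
        return Notification.Builder(this, channelId)
            .setContentTitle("Cleaning your brain dump")
            .setSmallIcon(android.R.drawable.stat_sys_download)
            .build()
    }

    private fun publishPartial(text: String) { /* push to the UI; illustrative */ }
    override fun onDestroy() { scope.cancel(); super.onDestroy() }
    override fun onBind(intent: Intent?): IBinder? = null
}
```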
This document is the Clean-for-others version of a 31-minute spoken brain dump that the on-device app failed to generate. The recovered RAW transcript is also published — see source link.