Casting: Two voices (teenage girl + teenage boy), delivered in UNISON for each title line (Acts 1-2). Beat 10 chant layers an additional voice underneath for density. Beat 12 climax adds a THIRD voice (optional debate-team feel). Final manifesto is ONE voice: an older narrator, adult, warm.
| Time | Line | Voice | Delivery |
|---|---|---|---|
| 0:00–0:02 | (alarm clock) | SFX only | Black screen. Crisp analog alarm. Cuts abruptly to first image. |
| 0:02–0:06 | "I miss sleep." | Girl + Boy (unison) | Flat. Tired. Deadpan. No drama yet. |
| 0:06–0:10 | "I am a mess." | Girl + Boy (unison) | Slight self-aware edge. Still flat. |
| 0:10–0:14 | "I study my play." | Girl + Boy (unison) | Quiet focus. Almost whispered. |
| 0:14–0:18 | "I put it all out there." | Girl + Boy (unison) | First notch louder. Conviction starts. |
| 0:18–0:22 | "I work through the pain." | Girl + Boy (unison) | Grit. Teeth a bit clenched on "pain." |
| 0:22–0:26 | "I work behind the scene." | Girl + Boy (unison) | Steady. Honoring the crew. Slight pride. |
| 0:26–0:30 | "I must be perfect." | Girl + Boy (unison) | Tight. The pressure line. Jaw tight. |
| 0:30–0:34 | "Sometimes I miss." | Girl + Boy (unison) | The pivot. Honest. Quieter than the line before. Weight under it. |
| 0:34–0:41 | "The work is never done." | Girl + Boy (unison) | Hold 7 sec. Slowest delivery in the piece. Acceptance + resolve. Breath between words. Then silence before the music SNAPS. |
| 0:41–0:45 | "I like to win. I like to win." | Dual + layered chant | Boom-boom tempo. Each "I like to win" punches on the beat as images cut (one arm × 4). Chant stacks — by the 2nd repetition, add a third voice underneath. |
| 0:45–0:49 | "I like to win. I like to win." | Dual + layered chant continues | Same tempo, images switch to both-arms pairs. Energy peaking. |
| 0:49–0:55 | "I WON." | 3+ voices together, louder | The climax. Past tense. Full voice. Land HARD on "won." Hand raised, trophy raised — hold 3 sec. Music drops to a single low note. |
| 0:55–0:58 | "Work is work. Find your grind." | Single adult narrator | Warm. Grounded. Not a shout. A father/coach voice delivering the truth. Navy + Knighthead logo card on screen. |
Use the Gemini 2.5 multi-speaker TTS model in AI Studio. It supports named speaker roles in a single prompt. Paste this into the Text-to-Speech playground:
Generate each line as a separate clip so you can place it on the timeline precisely. Export as WAV or MP3. Gemini 2.5 native TTS is available on the free tier in AI Studio up to a usage limit, so this shouldn't cost anything.
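If it helps to keep the separate clips organized, here is a minimal sketch of the voiceover table as a cue sheet in Python. The timings and lines are copied from the table; the `Cue` class, the `clip_name` file-naming scheme, and the shortened voice labels are assumptions for illustration, not anything AI Studio requires. It only checks that the cues run back to back and prints a suggested file name per clip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    start: int   # seconds from the top of the spot
    end: int     # seconds from the top of the spot
    line: str    # spoken line (or SFX note)
    voice: str   # who delivers it (labels shortened from the table)

    @property
    def duration(self) -> int:
        return self.end - self.start


# Timings and lines copied from the voiceover table.
CUES = [
    Cue(0, 2, "(alarm clock)", "SFX only"),
    Cue(2, 6, "I miss sleep.", "Girl + Boy"),
    Cue(6, 10, "I am a mess.", "Girl + Boy"),
    Cue(10, 14, "I study my play.", "Girl + Boy"),
    Cue(14, 18, "I put it all out there.", "Girl + Boy"),
    Cue(18, 22, "I work through the pain.", "Girl + Boy"),
    Cue(22, 26, "I work behind the scene.", "Girl + Boy"),
    Cue(26, 30, "I must be perfect.", "Girl + Boy"),
    Cue(30, 34, "Sometimes I miss.", "Girl + Boy"),
    Cue(34, 41, "The work is never done.", "Girl + Boy"),
    Cue(41, 45, "I like to win. I like to win.", "Layered chant"),
    Cue(45, 49, "I like to win. I like to win.", "Layered chant"),
    Cue(49, 55, "I WON.", "3+ voices"),
    Cue(55, 58, "Work is work. Find your grind.", "Adult narrator"),
]


def clip_name(index: int, cue: Cue) -> str:
    """Hypothetical naming scheme: zero-padded beat number plus a slug of the line."""
    slug = "".join(c.lower() if c.isalnum() else "_" for c in cue.line).strip("_")
    return f"{index:02d}_{slug}.wav"


def is_contiguous(cues) -> bool:
    """True when every cue starts exactly where the previous one ends."""
    return all(a.end == b.start for a, b in zip(cues, cues[1:]))


if __name__ == "__main__":
    assert is_contiguous(CUES), "gap or overlap in the cue sheet"
    for i, cue in enumerate(CUES):
        print(f"{cue.start:>2}s-{cue.end:>2}s  {cue.duration}s  {clip_name(i, cue)}")
```

Numbering the alarm as clip 00 keeps the file numbers in line with the beat numbers used above (the chant starts at 10, the climax is 12).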
Goal: a 60-second track that mirrors the three-act structure: quiet grind → slow build → punching victory. It should NOT have lyrics, which would fight the voiceover.
| Section | Time | Feel |
|---|---|---|
| Act 1 (Grind) | 0:02–0:18 | Sparse piano or muted synth pulse. Low BPM (70-80). One note every 2 beats. Leaves space for voiceover. |
| Act 2 (Price) | 0:18–0:34 | Add low strings / cello drone underneath. BPM same. Tension rises under "I must be perfect." Drops to near-silence under "Sometimes I miss." Reverb tail. |
| Breath (Never Done) | 0:34–0:41 | Music drops to ONE sustained low cello or pad note. 7 seconds of near-silence with that one held note. The "hold your breath" moment. |
| Act 3 (Victory) | 0:41–0:55 | SNAP into a tempo-doubled (140 BPM) drum + bass + brass build. Four 4-beat phrases punching on the boom-boom-boom of the chant. Each "I like to win" lands on a downbeat. On "I WON," everything hits at once. Taiko-drum impact. |
| Manifesto | 0:55–0:58 | Music drops to ONE reverb-tailed piano note + low drone. "Work is work. Find your grind." over that. Fade to silence. |
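As a quick sanity check on the Act 3 tempo, here is a small Python sketch of the beat arithmetic. It assumes the four 4-beat phrases are meant to span the 8-second chant window (0:41–0:49 in the voiceover table); the function names are made up for illustration. At 140 BPM, 16 beats last a little under 7 seconds, so the chant would either need a short fill before "I WON" or a slightly slower tempo to land each "I like to win" on a downbeat.

```python
def seconds_for_beats(beats: int, bpm: float) -> float:
    """How long a given number of beats lasts at a given tempo."""
    return beats * 60.0 / bpm


def bpm_to_fit(beats: int, seconds: float) -> float:
    """Tempo at which a given number of beats fills a window exactly."""
    return beats * 60.0 / seconds


CHANT_WINDOW_S = 49 - 41   # chant runs 0:41-0:49 in the voiceover table
PHRASE_BEATS = 4 * 4       # four 4-beat phrases

if __name__ == "__main__":
    at_140 = seconds_for_beats(PHRASE_BEATS, 140)
    exact = bpm_to_fit(PHRASE_BEATS, CHANT_WINDOW_S)
    print(f"16 beats at 140 BPM last {at_140:.2f} s of the {CHANT_WINDOW_S} s window")
    print(f"16 beats fill {CHANT_WINDOW_S} s exactly at {exact:.0f} BPM")
```

Under that assumption, 120 BPM puts a downbeat every 2 seconds, which matches four chants across the 8-second window; 140 BPM still works if the extra second or so is treated as a build into the climax.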
Google AI Studio's music model (Lyria RealTime or Lyria 2) accepts text prompts. Paste this:
aistudio.google.com → open the TTS playground → paste the voiceover prompt above → generate + download each speaker line as a separate WAV

When you're ready to start recording/generating, let me know and I'll walk through it with you step by step.