Find Your Grind — Voiceover Script + Music Brief

Voiceover Script

Casting: Two voices (teenage girl + teenage boy), delivered in UNISON for each title line (Acts 1-2). The Beat 10 chant layers extra takes for density and brings in a third voice by the second repetition. The Beat 12 climax uses three or more voices together (optional debate-team feel). The final manifesto is ONE voice: an older adult narrator, warm.

Time	Line	Voice	Delivery
0:00–0:02	(alarm clock)	SFX only	Black screen. Crisp analog alarm. Cuts abruptly to first image.
0:02–0:06	"I miss sleep."	Girl + Boy (unison)	Flat. Tired. Deadpan. No drama yet.
0:06–0:10	"I am a mess."	Girl + Boy (unison)	Slight self-aware edge. Still flat.
0:10–0:14	"I study my play."	Girl + Boy (unison)	Quiet focus. Almost whispered.
0:14–0:18	"I put it all out there."	Girl + Boy (unison)	First notch louder. Conviction starts.
0:18–0:22	"I work through the pain."	Girl + Boy (unison)	Grit. Teeth a bit clenched on "pain."
0:22–0:26	"I work behind the scene."	Girl + Boy (unison)	Steady. Honoring the crew. Slight pride.
0:26–0:30	"I must be perfect."	Girl + Boy (unison)	Tight. The pressure line. Jaw tight.
0:30–0:34	"Sometimes I miss."	Girl + Boy (unison)	The pivot. Honest. Quieter than the line before. Weight under it.
0:34–0:41	"The work is never done."	Girl + Boy (unison)	Hold 7 sec. Slowest delivery in the piece. Acceptance + resolve. Breath between words. Then silence before the music SNAPS.
0:41–0:45	"I like to win. I like to win."	Dual + layered chant	Boom-boom tempo. Each "I like to win" punches on the beat as images cut (one arm × 4). Chant stacks — by the 2nd repetition, add a third voice underneath.
0:45–0:49	"I like to win. I like to win."	Dual + layered chant continues	Same tempo, images switch to both-arms pairs. Energy peaking.
0:49–0:55	"I WON."	3+ voices together, louder	The climax. Past tense. Full voice. Land HARD on "won." Hand raised, trophy raised — hold 3 sec. Music drops to a single low note.
0:55–0:58	"Work is work. Find your grind."	Single adult narrator	Warm. Grounded. Not a shout. A father/coach voice delivering the truth. Navy + Knighthead logo card on screen.
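Before building the timeline, it can help to confirm that the cue times run back to back with no gaps or overlaps. The following is a minimal sketch, assuming the cue sheet is typed in by hand from the table; the names CUES, to_seconds, and check_cues are made up for illustration and are not part of any tool mentioned here.

```python
# Cue sheet copied from the table: (start, end, line), times as "M:SS".
CUES = [
    ("0:00", "0:02", "(alarm clock)"),
    ("0:02", "0:06", "I miss sleep."),
    ("0:06", "0:10", "I am a mess."),
    ("0:10", "0:14", "I study my play."),
    ("0:14", "0:18", "I put it all out there."),
    ("0:18", "0:22", "I work through the pain."),
    ("0:22", "0:26", "I work behind the scene."),
    ("0:26", "0:30", "I must be perfect."),
    ("0:30", "0:34", "Sometimes I miss."),
    ("0:34", "0:41", "The work is never done."),
    ("0:41", "0:45", "I like to win. I like to win."),
    ("0:45", "0:49", "I like to win. I like to win."),
    ("0:49", "0:55", "I WON."),
    ("0:55", "0:58", "Work is work. Find your grind."),
]


def to_seconds(stamp):
    """Convert an "M:SS" timestamp to whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def check_cues(cues):
    """Return (durations, problems) for a list of (start, end, line) cues."""
    durations = []
    problems = []
    previous_end = 0
    for start, end, line in cues:
        s, e = to_seconds(start), to_seconds(end)
        if s != previous_end:
            problems.append(f"gap or overlap before {line!r} at {start}")
        if e <= s:
            problems.append(f"non-positive duration for {line!r}")
        durations.append((line, e - s))
        previous_end = e
    return durations, problems


durations, problems = check_cues(CUES)
for line, seconds in durations:
    print(f"{seconds:>2}s  {line}")
print("total:", sum(sec for _, sec in durations), "seconds")
print("problems:", problems or "none")
```

If a cue time is changed later, rerunning the check flags any gap or overlap it introduces.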

Generating the voiceover in Google AI Studio

Use the Gemini 2.5 multi-speaker TTS model in AI Studio. It supports named speaker roles in a single prompt. Paste this into the Text-to-Speech playground:

[Girl voice: teenage, flat, tired] I miss sleep.
[Boy voice: teenage, flat, tired] I miss sleep.
[Girl voice: self-aware] I am a mess.
[Boy voice: self-aware] I am a mess.
[Girl voice: quiet focus] I study my play.
[Boy voice: quiet focus] I study my play.
[Girl voice: conviction starting] I put it all out there.
[Boy voice: conviction starting] I put it all out there.
[Girl voice: grit] I work through the pain.
[Boy voice: grit] I work through the pain.
[Girl voice: steady, proud] I work behind the scene.
[Boy voice: steady, proud] I work behind the scene.
[Girl voice: tight, pressured] I must be perfect.
[Boy voice: tight, pressured] I must be perfect.
[Girl voice: quiet honesty] Sometimes I miss.
[Boy voice: quiet honesty] Sometimes I miss.
[Girl voice: slow, resolute, pauses between words] The work is never done.
[Boy voice: slow, resolute, pauses between words] The work is never done.
[Girl voice: chant, punchy] I like to win. I like to win.
[Boy voice: chant, punchy] I like to win. I like to win.
[Girl voice: chant continues] I like to win. I like to win.
[Boy voice: chant continues] I like to win. I like to win.
[Girl + Boy unison, shouted, triumphant] I WON.
[Adult narrator: warm, grounded, slow] Work is work. Find your grind.

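Generate each line as a separate clip so you can place it on the timeline precisely, then export as WAV or MP3. The prompt reads the girl and boy takes one after the other, so the unison effect comes from stacking the two clips on the same beat in the editor. As far as I know, the multi-speaker mode is limited to two speakers, so plan on generating the adult narrator line in its own single-speaker pass. Gemini 2.5 native TTS is available on AI Studio's free tier up to a usage limit, so a script this short shouldn't cost anything.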
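To keep the separate clips organized, the bracketed prompt can be split into one entry per speaker turn, each with a suggested file name. This is only a sketch using Python's standard library: it does not call any TTS service, the prompt is shortened to four turns, and the function name plan_clips and the vo_NN_speaker.wav naming pattern are assumptions made up for illustration.

```python
import re

# Shortened sample of the voiceover prompt (four turns only).
PROMPT = (
    "[Girl voice: teenage, flat, tired] I miss sleep. "
    "[Boy voice: teenage, flat, tired] I miss sleep. "
    "[Girl + Boy unison, shouted, triumphant] I WON. "
    "[Adult narrator: warm, grounded, slow] Work is work. Find your grind."
)

# One turn = a [bracketed tag] followed by text up to the next "[".
TURN = re.compile(r"\[([^\]]+)\]\s*([^\[]+)")


def plan_clips(prompt):
    """Split the prompt into clip entries with hypothetical file names."""
    clips = []
    for index, (tag, text) in enumerate(TURN.findall(prompt), start=1):
        # Speaker is the part before ":" when present, else before the first ",".
        separator = ":" if ":" in tag else ","
        speaker, _, direction = tag.partition(separator)
        slug = re.sub(r"[^a-z0-9]+", "_", speaker.lower()).strip("_")
        clips.append({
            "file": f"vo_{index:02d}_{slug}.wav",
            "speaker": speaker.strip(),
            "direction": direction.strip(),
            "text": text.strip(),
        })
    return clips


clips = plan_clips(PROMPT)
for clip in clips:
    print(clip["file"], "|", clip["direction"], "|", clip["text"])
```

Numbering the files in script order means they sort correctly in the editor's media bin, which makes it easier to drop each one on its beat mark.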

Music Brief

Goal: a 60-second track that mirrors the three-act structure. Quiet grind → slow build → punching victory. Should NOT have lyrics (would fight the voiceover).

Structural plan

Section	Time	Feel
Act 1 (Grind)	0:02–0:18	Sparse piano or muted synth pulse. Low BPM (70-80). One note every 2 beats. Leaves space for voiceover.
Act 2 (Price)	0:18–0:34	Add low strings / cello drone underneath. BPM same. Tension rises under "I must be perfect." Drops to near-silence under "Sometimes I miss." Reverb tail.
Breath (Never Done)	0:34–0:41	Music drops to ONE sustained low cello or pad note. 7 seconds of near-silence with that one held note. The "hold your breath" moment.
Act 3 (Victory)	0:41–0:55	SNAP into a tempo-doubled (140 BPM) drum + bass + brass build. Four 4-beat phrases punching on the boom-boom-boom of the chant. Each "I like to win" lands on a downbeat. On "I WON," everything hits at once. Taiko-drum impact.
Manifesto	0:55–0:58	Music drops to ONE reverb-tailed piano note + low drone. "Work is work. Find your grind." over that. Fade to silence.
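The tempo figures in the plan can be sanity-checked with a little arithmetic. The sketch below assumes the chant occupies 0:41-0:49 (from the voiceover table) and that the "four 4-beat phrases" means 16 beats in total; the function names are made up for illustration.

```python
def seconds_for_beats(beats, bpm):
    """How long a given number of beats lasts at a tempo."""
    return beats * 60.0 / bpm


def beats_in_window(seconds, bpm):
    """How many beats fit in a time window at a tempo."""
    return seconds * bpm / 60.0


def bpm_to_fill(beats, seconds):
    """Tempo at which the beats exactly fill the window."""
    return beats * 60.0 / seconds


chant_window_s = 49 - 41   # chant runs 0:41-0:49
phrase_beats = 4 * 4       # four 4-beat phrases

# Act 1: one note every 2 beats at 70 BPM
print(round(seconds_for_beats(2, 70), 2))               # 1.71
# Act 3: length of the four phrases at 140 BPM
print(round(seconds_for_beats(phrase_beats, 140), 2))   # 6.86
# Beats available in the 8-second chant window at 140 BPM
print(round(beats_in_window(chant_window_s, 140), 2))   # 18.67
# Tempo at which 16 beats would fill the window exactly
print(bpm_to_fill(phrase_beats, chant_window_s))        # 120.0
```

Under these assumptions the four phrases take about 6.9 seconds at 140 BPM, leaving roughly a second of slack in the 8-second chant window, so the image cuts may need a small nudge to stay on the downbeats.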

Google AI Studio — music generation prompt

Google AI Studio's music model (Lyria RealTime or Lyria 2) accepts text prompts. Paste this:

A 60-second cinematic commercial score for a high school sports + arts motivational ad. NO LYRICS, instrumental only.
Structure:
- 0-16 sec: sparse, quiet, muted piano + low cello drone, slow pulse, 70 BPM, introspective, leaving lots of space
- 16-32 sec: add low strings and a building ostinato, tension rising, still 70 BPM, NO drums yet, room for a voiceover on top
- 32-40 sec: DROP to a single sustained low cello note, near-silence, reverb tail, suspense
- 40-52 sec: SNAP to 140 BPM with taiko drums, driving bass, brass swells, FOUR four-beat phrases, each phrase punches harder than the last, classic sports-commercial build
- 52-55 sec: massive peak, full orchestra + taiko hit, sustained brass chord
- 55-60 sec: drop to single piano note + low drone, fade to silence
Emotional arc: quiet grind → pressure → breath → explosive victory → grounded resolution.
Cinematic, moody, Hans Zimmer meets Two Steps From Hell. Instrumental only, no vocals, no lyrics.
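The section boundaries in this prompt differ by a second or two from the structural plan (for example, 16 sec here versus 0:18 in the plan). If you would rather have the prompt follow the plan's timings exactly, the structure lines can be generated from the plan. This is a small sketch; PLAN abbreviates the "Feel" column in my own words, and structure_lines is a made-up helper name.

```python
# (start_sec, end_sec, abbreviated feel) taken from the structural plan.
PLAN = [
    (2, 18, "sparse muted piano, slow pulse, 70 BPM"),
    (18, 34, "add low strings, tension rising, no drums"),
    (34, 41, "single sustained low cello note, near-silence"),
    (41, 55, "SNAP to 140 BPM, taiko drums, bass, brass build"),
    (55, 58, "single piano note + low drone, fade to silence"),
]


def structure_lines(plan):
    """Render one "- start-end sec: feel" line per section."""
    return [f"- {start}-{end} sec: {feel}" for start, end, feel in plan]


print("\n".join(structure_lines(PLAN)))
```

Whether the model honors exact timestamps is not guaranteed, so treat the printed lines as guidance for the prompt and expect to trim or slide the track in the editor.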

Fallback plan if Lyria output isn't usable

Udio / Suno v4.5 (not what Mark asked for, but the highest-quality AI music): $10/mo; 30-sec free samples without a subscription
Epidemic Sound (paid library, licensed for YouTube + commercial use): search "cinematic sports motivation"
YouTube Audio Library (free, royalty-free): search "dramatic cinematic sports"; works for YouTube, and school use is fine

What Mark does from here

Go to aistudio.google.com → open the TTS playground → paste the voiceover prompt above → generate + download each speaker line as a separate WAV
In the same AI Studio session → open Lyria/Music Generation → paste the music prompt → generate a 60-sec track → download as MP3 or WAV
In PowerDirector 365 → import the image files (see the storyboard's PowerDirector Import List), the VO WAV files, and the music file
Place each image on the timeline per the storyboard timing
Drop VO lines on their beat marks; music bed underneath
Add title-line text overlays (Roboto Slab 900, navy/gray, uppercase) per the script
Ken Burns slow push on each still; fast-cut the victory chant beats
Export 1920x1080 H.264

When you're ready to start recording or generating, let me know and I'll walk through it with you step by step.