Supertonic voices — pick yours

Other knobs we can turn

Beyond the 10 built-in voices, Supertonic exposes these per-call parameters — we can A/B any combination:

Parameter	Range	What it changes
`total_steps`	5 (low) to 12 (high)	Generation quality. Default is 8 (medium). Higher values give better articulation but slower generation. For your stack, 8 is the sweet spot; 10-12 is worth trying if you want maximum polish.
`speed`	0.7 (slow) to 2.0 (fast)	Playback speed without changing pitch. The current samples use 1.05 (slightly faster than natural). 0.95-1.1 sounds most natural; 1.2-1.5 is useful for digest-style fast briefings.
`lang`	31 languages	English, Spanish, French, Arabic, Korean, German, Portuguese, Italian, etc. The same voice can pronounce the same text in different language modes. `"na"` is language-agnostic (best for mixed-language text).
Voice cloning	any audio sample	You can train a voice from a 5-10 second sample of your own voice, so that ntfy alerts speak in your voice. This is listed on the Supertonic demo page as "Voice Builder \| Cloning Demo". It is a deeper integration, worth exploring once the base pipeline is live.
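To make the A/B idea concrete, here is a minimal Python sketch that checks a parameter set against the ranges in the table and expands a few candidate values into a test grid. The names `TtsParams` and `ab_grid` are hypothetical helpers, not part of Supertonic, and the `"en"` language code is an assumption (the table only confirms `"na"`).

```python
from dataclasses import dataclass
from itertools import product

# The 10 built-in voices: F1..F5 and M1..M5.
VOICES = [f"{gender}{n}" for gender in "FM" for n in range(1, 6)]


@dataclass(frozen=True)
class TtsParams:
    """One candidate setting; ranges mirror the table above."""
    voice: str = "F1"
    total_steps: int = 8      # default (medium)
    speed: float = 1.05       # value used for the current samples
    lang: str = "en"          # ASSUMPTION: language code not confirmed by the source

    def __post_init__(self):
        if self.voice not in VOICES:
            raise ValueError(f"unknown voice: {self.voice}")
        if not 5 <= self.total_steps <= 12:
            raise ValueError(f"total_steps out of range: {self.total_steps}")
        if not 0.7 <= self.speed <= 2.0:
            raise ValueError(f"speed out of range: {self.speed}")


def ab_grid(voices, steps, speeds, lang="en"):
    """Return every combination of the given voices, step counts, and speeds."""
    return [TtsParams(v, s, sp, lang) for v, s, sp in product(voices, steps, speeds)]
```

For example, `ab_grid(["F1", "M2"], [8, 10], [1.0, 1.05, 1.15])` yields 12 settings to render and compare by ear.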

What I'm asking

Tell me three things:

Which voice number (F1-F5 or M1-M5)?
Speed: stick with 1.05x, slow it to 1.0x, or speed up to 1.15x?
Voice cloning: want to record a 10-second sample of your own voice and have all ntfy alerts speak in it? (Yes / no / later)

Once you pick, I'll lock that into the Tasker integration. The current 8 actions in your TP3 Notification task end with a Say action that uses GoogleTTS. I'll replace it with an HTTP Request to the Supertonic endpoint plus a Music Play on the returned WAV. Same triggers, better voice; it runs on G16 today and migrates to Apex after Saturday.
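Before wiring this into Tasker, the same request can be tried from any machine on the LAN. The Python sketch below mirrors what the HTTP Request action would send and saves the returned audio to a file. The host and port come from the footer of this page; the `/tts` route and the JSON field names are assumptions for illustration and may not match what the Supertonic server actually expects.

```python
import json
import urllib.request

BASE_URL = "http://192.168.1.185:7788"  # host and port from the footer of this page
TTS_PATH = "/tts"                       # ASSUMPTION: hypothetical route, check the server


def build_request(text, voice="F1", speed=1.05, total_steps=8):
    """Build (but do not send) a POST request carrying the synthesis parameters."""
    # ASSUMPTION: the JSON field names below are illustrative, not confirmed.
    payload = {"text": text, "voice": voice, "speed": speed, "total_steps": total_steps}
    return urllib.request.Request(
        BASE_URL + TTS_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_wav(text, out_path, **params):
    """Send the request and write the response body (expected to be WAV) to out_path."""
    req = build_request(text, **params)
    with urllib.request.urlopen(req, timeout=30) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

Calling `synthesize_to_wav("Backup finished", "alert.wav", voice="M2", speed=1.15)` would then produce a file that plays the same way the Music Play action would play it on the phone.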

Supertonic 3 v1.3.1 · HTTP serve at 192.168.1.185:7788 · survives shell exit via setsid nohup
