Tutorials

Step-by-step recipes for common tasks. Pick the one closest to what you're building.

Clone a voice and synthesise — Upload a reference clip plus a consent recording, then synthesise speech in your custom voice. Cloning is zero-shot, so there is no training wait.
Design a voice (no reference clip) — Create a voice from a natural-language description alone. No reference audio, no consent recording, no cloned likeness — the engine generates a synthetic speaker that fits your prompt.
Stream TTS over WebSocket — Open a streaming session, push text in chunks, drain audio frames in real time, and barge in with an interrupt.