Rekvon AI · voice cloning that breathes

One clip in. Broadcast-ready audio out.

The proven pipeline, wrapped in three moves.

Clone

Upload a reference clip of at least two seconds. Rekvon AI validates it, resamples to the engine rate, and builds a reusable voice profile.

Generate

Paste any script. Each chunk is synthesized in the cloned voice, then stitched with breath and 15 ms crossfades.

Ship

High-pass, gentle compression, loudness at −16 LUFS. Download one mastered mono WAV with live progress.

Controls that shape the performance, not just the words.

Every render is tunable and reproducible. Dial the delivery, lock the voice, and master to broadcast in a single pass.

Broadcast master, built in

Every render is high-passed at 80 Hz, gently compressed, and normalised to −16 LUFS, a common loudness target for podcasts and spoken-word streaming. It leaves the pipeline ready to publish, not ready to fix.

−16 LUFS · 80 Hz high-pass · 15 ms crossfade · 200 ms breath

seed

Consistent voice

Fix the seed and the same script renders identically, every single time.

Pace & speed

Speed the delivery up or slow it down, with no pitch drift.

Expressiveness

Push emotion up or hold it flat with real ChatterBox exaggeration and pacing controls.

Languages

English, Hindi in Devanagari, and romanized Hinglish.

Four engines. One voice profile. Auto-selected per host.

MLX Fish S2 Pro

Apple Silicon · 44.1 kHz

Studio-grade fidelity on macOS, for final masters where quality is everything.

ChatterBox live

CPU / CUDA · 24 kHz

The deployed, CPU-viable engine. 23 languages and roughly 3.7× real-time on six cores.

XTTS-v2

Linux · 24 kHz

Coqui multilingual as an optional install, with full pace and speed control.

Demo

Any host · no ML

A placeholder tone for CI and UI work. Rekvon AI never swaps engines silently on you.

One price. No wall to try it.

$0.10 / minute

Metered by the second on generated audio. No subscription, no seat fees, and no payment gateway to get started.

Every engine included · Unlimited voices · −16 LUFS mastered WAV download · Live render progress · Your voices stay private

Start free →

Questions, answered.

Do I own the cloned voices and the audio?

Yes. Every voice profile and every render is scoped to your account. Other users cannot see or use them, and nothing is shared without your permission.

Which languages work today?

English and Hindi in Devanagari both render at high quality on the production engines. Romanized Hinglish is supported on a best-effort basis through the English model, so results are approximate.

What can I upload as a reference clip?

WAV, MP3, M4A and most common formats, up to 25 MB. Clips shorter than two seconds are rejected so the clone has enough to work with.
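
For the curious, the two checks above can be sketched in a few lines of Python. This is an illustration only, not Rekvon AI's actual validator: the function name `check_reference_wav` is hypothetical, it handles WAV input only (via the standard-library `wave` module), and it assumes "25 MB" means 25 × 1024 × 1024 bytes.

```python
import io
import wave

MAX_BYTES = 25 * 1024 * 1024  # "up to 25 MB"; reading MB as MiB is an assumption
MIN_SECONDS = 2.0             # clips shorter than this are rejected


def check_reference_wav(data: bytes) -> float:
    """Return the clip duration in seconds, or raise ValueError if rejected.

    Hypothetical helper: WAV only, checks size first, then duration.
    """
    if len(data) > MAX_BYTES:
        raise ValueError("reference clip exceeds 25 MB")
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            duration = wav.getnframes() / wav.getframerate()
    except (wave.Error, EOFError) as exc:
        raise ValueError("not a readable WAV file") from exc
    if duration < MIN_SECONDS:
        raise ValueError("reference clip is shorter than two seconds")
    return duration
```

Checking the size before decoding keeps an oversized upload from being parsed at all.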

How does quality differ across engines?

ChatterBox is the deployed CPU engine, running around 3.7× real-time on six cores across 23 languages. Fish S2 Pro delivers studio-grade fidelity on Apple Silicon.

How am I billed?

At $0.10 per minute of generated audio, metered per second. Pricing is display-only for now, so there is no payment step to try the full pipeline.
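
As a worked example of per-second metering, here is a minimal Python sketch. The $0.10 per minute rate comes from the pricing above; the function name `render_cost` and the rounding to four decimal places are assumptions made for illustration, not a statement of how Rekvon AI rounds.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE_PER_MINUTE = Decimal("0.10")  # dollars per minute of generated audio


def render_cost(seconds: int) -> Decimal:
    """Cost in dollars for `seconds` of generated audio, metered per second.

    Rounding to 4 decimal places (half up) is an assumption for this sketch.
    """
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    cost = RATE_PER_MINUTE * Decimal(seconds) / Decimal(60)
    return cost.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(render_cost(90))  # a 90-second render: 1.5 minutes at $0.10
```

Under these assumptions a 90-second render comes to $0.15, and a 60-second render to exactly $0.10.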

What do I get out of a render?

One mastered mono WAV per render: chunked, stitched with breath and crossfades, then high-passed, compressed, and loudness-locked to −16 LUFS.
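
To make the stitching step concrete, here is a rough Python sketch of a 15 ms crossfade between two mono chunks. It assumes a linear fade and a 24 kHz sample rate (the ChatterBox rate listed above); the function name `crossfade_join` is hypothetical, the real pipeline may use a different fade curve, and breath insertion and mastering are left out.

```python
def crossfade_join(a, b, sample_rate=24_000, fade_ms=15):
    """Join two mono sample lists, overlapping the tail of `a` with the head of `b`.

    Linear crossfade (an assumption): `a` fades out while `b` fades in.
    """
    # Number of overlapping samples, capped by the shorter chunk.
    n = min(sample_rate * fade_ms // 1000, len(a), len(b))
    if n == 0:
        return list(a) + list(b)
    mixed = []
    for i in range(n):
        t = (i + 1) / (n + 1)  # fade position, strictly between 0 and 1
        mixed.append(a[len(a) - n + i] * (1 - t) + b[i] * t)
    return list(a[:len(a) - n]) + mixed + list(b[n:])
```

At 24 kHz a 15 ms fade overlaps 360 samples, so the joined audio is 360 samples shorter than the two chunks laid end to end.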

A voice that breathes, not a robot that reads.