Custom Voice Training Pro

Cloud GPUs, Private Models, Quality Metrics
1. Upload Training Data

For best results, upload at least 30 minutes of clean audio with matching transcripts. More data generally yields better quality.

Drag and drop your audio files, or click to browse.

Supported formats: MP3, WAV, FLAC. Maximum total upload size: 2 GB.
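
The format and size limits can be checked locally before uploading. The following is a minimal sketch, not the service's own validation: the function name `validate_upload` and the `(filename, size_in_bytes)` input shape are hypothetical, and it assumes "2 GB" means 2 × 1024³ bytes, which the page does not specify.

```python
from pathlib import PurePath

# Values taken from the upload note: MP3, WAV, FLAC; 2 GB total.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".flac"}
# Assumption: "2 GB" is read as 2 * 1024**3 bytes; the page does not say
# whether it means decimal or binary gigabytes.
MAX_TOTAL_BYTES = 2 * 1024**3


def validate_upload(files):
    """Check a batch of (filename, size_in_bytes) pairs before uploading.

    Returns a list of human-readable problems; an empty list means the
    batch passes both checks.
    """
    problems = []
    total = 0
    for name, size in files:
        # Compare extensions case-insensitively so "B.MP3" is accepted.
        ext = PurePath(name).suffix.lower()
        if ext not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported format {ext or '(none)'}")
        total += size
    if total > MAX_TOTAL_BYTES:
        problems.append(
            f"total size {total} bytes exceeds {MAX_TOTAL_BYTES} bytes"
        )
    return problems
```

Calling `validate_upload([("take1.wav", 10_000_000), ("notes.ogg", 5_000)])` would report one problem, for the unsupported `.ogg` file.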

0 files
0 minutes uploaded
Transcripts: auto-generated, or upload your own.
2. Configure Training
3. Start Training
Training typically completes in 1-4 hours depending on dataset size and quality settings.
Audio Minutes: 0
Credits Required: 50
Est. Time: ~2h
Your Credits: 10

Training requires a Pro plan or credits.

My Trained Models: 0

No trained models yet.

Best Practices
  • 30+ minutes of audio (more is better)
  • Clean audio, minimal background noise
  • Single speaker per training job
  • Varied speech (not just reading)
  • Consistent recording conditions
  • Accurate transcripts improve quality

Training Credits

Standard
30 credits
  • Up to 30 min audio
  • Basic fine-tuning
  • ~1 hour training
High Quality (Recommended)
50 credits
  • Up to 2 hours audio
  • Extended training epochs
  • ~2-3 hours training
Ultra
100 credits
  • Unlimited audio
  • Maximum quality
  • ~4-6 hours training
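
The tier limits above can be turned into a small lookup for estimating cost. This is only a sketch under stated assumptions: the names `Tier`, `cheapest_tier`, and `credits_short` are hypothetical, and the rule "pick the cheapest tier whose audio cap fits" is an assumption rather than documented behavior. The page itself shows 50 credits required with 0 minutes uploaded, which suggests it defaults to the recommended tier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    credits: int
    max_audio_minutes: Optional[int]  # None means unlimited


# Figures copied from the tier list, ordered cheapest first.
TIERS = [
    Tier("Standard", 30, 30),
    Tier("High Quality", 50, 120),  # "Up to 2 hours audio"
    Tier("Ultra", 100, None),
]


def cheapest_tier(audio_minutes):
    """Return the lowest-cost tier whose audio cap covers the dataset.

    Assumption: cheapest-fit selection; the real product may choose
    differently.
    """
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    for tier in TIERS:
        cap = tier.max_audio_minutes
        if cap is None or audio_minutes <= cap:
            return tier
    raise AssertionError("unreachable: Ultra has no cap")


def credits_short(audio_minutes, balance):
    """Credits still needed at the cheapest fitting tier (0 if enough)."""
    return max(0, cheapest_tier(audio_minutes).credits - balance)
```

For example, with a 10-credit balance and 45 minutes of audio, `cheapest_tier(45)` selects High Quality (50 credits), so `credits_short(45, 10)` returns 40.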