HeyGen Video Translate
HeyGen Video Translate is an AI video translation and dubbing model that translates the spoken content of a video into another language, with lip-sync and voice cloning. Localize your videos for global audiences without reshooting.
Why It Stands Out
- Full video translation: Automatically translates speech and generates new audio in the target language.
- Lip-sync technology: AI adjusts lip movements to match the translated audio for natural-looking results.
- Voice cloning: Preserves the original speaker's voice characteristics in the new language.
- Multi-language support: Translate into English, Spanish, French, Hindi, Italian, German, Polish, Portuguese, Chinese, Japanese, Dutch, and more.
- Simple workflow: Just upload a video and select your target language — no manual dubbing required.
Parameters
| Parameter | Required | Description |
|---|---|---|
| video | Yes | Source video, provided as an uploaded file or a publicly accessible URL. |
| output_language | Yes | Target language for translation (English, Spanish, French, etc.). |
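
As a rough illustration of how the two required parameters fit together, the sketch below validates them and packs them into a dictionary. The parameter names `video` and `output_language` come from the table above. The helper name `build_translate_request`, the plain-dict payload shape, and the use of a language name such as "Spanish" as the value are assumptions made for illustration, not a documented API.

```python
# Hypothetical helper: the payload shape is an assumption, not a documented API.
def build_translate_request(video: str, output_language: str) -> dict:
    """Check that both required parameters are present and return them as a dict."""
    video = video.strip()
    output_language = output_language.strip()
    if not video:
        raise ValueError("video is required (uploaded file reference or public URL)")
    if not output_language:
        raise ValueError("output_language is required")
    return {"video": video, "output_language": output_language}


# Example: a publicly accessible URL and a target language named on this page.
request = build_translate_request("https://example.com/clip.mp4", "Spanish")
print(request)
```

Whether the service expects a language name or a language code is not stated on this page, so treat the value format as a placeholder.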
Supported Languages
- English
- Spanish
- French
- Hindi
- Italian
- German
- Polish
- Portuguese
- Chinese
- Japanese
- Dutch
- And more...
How to Use
1. Upload your video: drag and drop a file or paste a public URL.
2. Select the output language for the translation.
3. Click Run and wait for processing to complete.
4. Preview and download the translated video.
Best Use Cases
- Global Content Distribution — Localize marketing videos, ads, and promotional content for international markets.
- E-learning & Training — Translate educational videos and courses for multilingual audiences.
- Social Media Expansion — Reach new audiences by translating viral content into multiple languages.
- Corporate Communications — Distribute internal videos and announcements across global teams.
- Entertainment & Media — Dub interviews, documentaries, and video content for foreign markets.
Pricing
| Metric | Price |
|---|---|
| Per second of video | $0.0375 / s |
Total cost = duration of video (in seconds) × $0.0375
Examples
- 30s video → 30 × $0.0375 = $1.125
- 60s video → 60 × $0.0375 = $2.25
- 5 min (300s) video → 300 × $0.0375 = $11.25
- 10 min (600s) video → 600 × $0.0375 = $22.50
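
The pricing formula above can be reproduced with a few lines of Python. This is a minimal sketch: the rate of $0.0375 per second comes from the pricing table, while the function name `estimate_cost` is hypothetical. `Decimal` is used instead of floats so the arithmetic stays exact.

```python
from decimal import Decimal

PRICE_PER_SECOND = Decimal("0.0375")  # USD per second, from the pricing table


def estimate_cost(duration_seconds: float) -> Decimal:
    """Return duration (in seconds) multiplied by the per-second price, in USD."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Convert via str() so float inputs such as 12.5 do not pick up binary noise.
    return Decimal(str(duration_seconds)) * PRICE_PER_SECOND


# The four durations used in the examples above.
for seconds in (30, 60, 300, 600):
    print(f"{seconds}s -> ${estimate_cost(seconds)}")
```

How fractional seconds and sub-cent amounts are rounded for billing is not stated here, so the sketch returns the unrounded product.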
Pro Tips for Best Quality
- Use videos with clear speech and minimal background noise for optimal translation accuracy.
- Single-speaker videos typically produce better lip-sync results than multi-speaker content.
- Ensure the original audio is high quality — translation quality depends on accurate speech recognition.
- For best lip-sync, use videos where the speaker's face is clearly visible and well-lit.
Notes
- Ensure uploaded video URLs are publicly accessible.
- Processing time varies based on video duration and current queue load.
- Please ensure your content complies with usage guidelines.
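
For the note on publicly accessible URLs, a cheap local pre-check can catch obvious mistakes before submitting a video. The sketch below is an assumption about what a useful check might look like, not part of the service. The function name `looks_publicly_reachable` is hypothetical, and the check is purely syntactic: it makes no network request and cannot confirm that a URL actually resolves or serves a video.

```python
import ipaddress
from urllib.parse import urlparse


def looks_publicly_reachable(url: str) -> bool:
    """Syntactic pre-check only: rejects non-HTTP schemes and local/private hosts."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname
    if host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # The host is a DNS name; it cannot be judged without a lookup.
        return True
    # Literal IP address: accept only globally routable addresses.
    return ip.is_global


for candidate in (
    "https://example.com/clip.mp4",
    "http://192.168.1.10/clip.mp4",
    "file:///tmp/clip.mp4",
):
    print(candidate, looks_publicly_reachable(candidate))
```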