HeyGen Avatar V Digital Twin API

HeyGen Avatar V Digital Twin is a fast AI avatar video generation model that creates natural digital twin videos from text or audio with lip-sync, optional captions, background removal, and MP4/WebM output. It is offered as a ready-to-use REST inference API for digital humans, virtual presenters, product explainers, marketing videos, training content, social media clips, and professional avatar video workflows, with simple integration, no cold starts, and affordable pricing.

HeyGen Avatar IV Digital Twin

HeyGen Avatar IV Digital Twin generates a talking avatar video from a selected HeyGen digital twin avatar and an uploaded audio clip. It is designed for presenter videos, spokesperson content, personalized avatar delivery, and other avatar-driven speaking workflows with flexible output controls.

Why Choose This?

  • Digital twin avatar workflow: Use a prebuilt digital twin avatar to generate speaking video from audio.

  • Audio-driven speech performance: Upload an audio clip to drive the avatar’s timing, expression, and speaking delivery.

  • Flexible framing controls: Choose aspect ratio, fit mode, and output resolution to match your target platform.

  • Optional background removal: Enable remove_background for supported matting-enabled avatars.

  • Optional caption export: Enable caption to generate a sidecar SRT subtitle file alongside the video.

  • Production-ready API: Suitable for personalized presenter videos, internal communications, ads, explainers, and virtual spokesperson workflows.

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| avatar | Yes | Selected HeyGen digital twin avatar. |
| audio | Yes | Audio clip used to drive the avatar video. |
| fit | No | How the avatar is framed in the output, such as cover. |
| remove_background | No | Remove the avatar background. Requires a matting-enabled avatar. |
| caption | No | Generate a sidecar SRT caption file alongside the video. |
| output_format | No | Output video format. Default: mp4. |
| resolution | No | Output resolution, such as 720p. |
| aspect_ratio | No | Output aspect ratio, such as 16:9. |
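
As a rough illustration of how these parameters fit together, the sketch below assembles a request payload and checks that the two required fields are present. The function name build_payload, the idea that avatar is an identifier string and audio is a URL string, and the JSON-style dictionary shape are all assumptions made for illustration; only the parameter names and the mp4 default come from the table above.

```python
from typing import Any, Dict

# Optional parameter names taken from the parameter table.
OPTIONAL_KEYS = {
    "fit",
    "remove_background",
    "caption",
    "output_format",
    "resolution",
    "aspect_ratio",
}


def build_payload(avatar: str, audio: str, **options: Any) -> Dict[str, Any]:
    """Assemble a payload dict (hypothetical helper, not part of any SDK)."""
    if not avatar:
        raise ValueError("avatar is required")
    if not audio:
        raise ValueError("audio is required")
    unknown = set(options) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # output_format defaults to mp4 according to the table.
    payload: Dict[str, Any] = {
        "avatar": avatar,
        "audio": audio,
        "output_format": "mp4",
    }
    payload.update(options)
    return payload


# Example values are placeholders, not real avatar IDs or files.
example = build_payload(
    "my-digital-twin",
    "https://example.com/voice.mp3",
    caption=True,
    aspect_ratio="16:9",
    resolution="720p",
)
```

Rejecting unknown keys early is a design choice in this sketch; it surfaces typos such as remove_bg before any request is sent.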

How to Use

  1. Choose your avatar — select the digital twin avatar you want to use.
  2. Upload your audio — provide the voice track that should drive the avatar.
  3. Adjust framing (optional) — choose fit, resolution, and aspect_ratio.
  4. Enable extras (optional) — turn on remove_background or caption if needed.
  5. Submit — run the model and download the generated avatar video.
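
For API use, the same steps could be wrapped in an HTTP POST. The sketch below only constructs the request object and does not send it. The endpoint URL is a placeholder, and the Bearer-token header and JSON body are assumptions, since this README does not document the endpoint or the authentication scheme; check the provider's API reference for the real values.

```python
import json
import urllib.request

# Placeholder only: the real endpoint is not given in this README.
ENDPOINT = "https://api.example.com/heygen-avatar-digital-twin"


def make_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request. Hypothetical helper."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Bearer auth is an assumption, not confirmed by the source.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = make_request(
    "YOUR_API_KEY",
    {
        "avatar": "my-digital-twin",               # step 1: choose avatar
        "audio": "https://example.com/voice.mp3",  # step 2: provide audio
        "aspect_ratio": "16:9",                    # step 3: framing
        "caption": True,                           # step 4: extras
    },
)
# Step 5 would be urllib.request.urlopen(request) against the real endpoint.
```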

Example Use Case

Generate a polished office presenter video from a digital twin avatar and a short voice clip for internal announcements or marketing content.

Pricing

Pricing is based on the uploaded audio duration.

| Audio Duration | Cost |
| --- | --- |
| 5s | $0.60 |
| 6s | $0.72 |
| 7s | $0.84 |
| 8s | $0.96 |
| 10s | $1.20 |
| 15s | $1.80 |

Billing Rules

  • Base price is $0.12 per second
  • Minimum billed duration is 5 seconds
  • Audio duration is rounded up to the next whole second
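
The billing rules can be expressed as a small cost estimator. This is a minimal sketch: the function names are hypothetical, and it assumes the three rules above are the complete pricing logic. Decimal is used so that the results match the price table exactly rather than picking up floating-point rounding.

```python
import math
from decimal import Decimal

PRICE_PER_SECOND = Decimal("0.12")  # base price from the billing rules
MIN_BILLED_SECONDS = 5              # minimum billed duration


def billed_seconds(audio_seconds: float) -> int:
    """Round up to the next whole second, then apply the 5-second minimum."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return max(MIN_BILLED_SECONDS, math.ceil(audio_seconds))


def estimate_cost(audio_seconds: float) -> Decimal:
    """Estimated cost in USD for one run with the given audio duration."""
    return PRICE_PER_SECOND * billed_seconds(audio_seconds)


print(estimate_cost(3.2))   # billed as 5 seconds
print(estimate_cost(7.1))   # billed as 8 seconds
print(estimate_cost(15))    # billed as 15 seconds
```

For example, a 7.1-second clip is rounded up to 8 seconds and costs $0.96, while anything shorter than 5 seconds is billed at the $0.60 minimum.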

Best Use Cases

  • Digital spokesperson videos — Create branded speaking-avatar content quickly.
  • Internal communications — Deliver announcements, updates, and training clips with a consistent avatar.
  • Marketing and ads — Produce talking-head promo content without filming.
  • Explainers and onboarding — Turn voice scripts into presenter-led video.
  • Localized delivery — Reuse the same avatar with different voice tracks and captions.

Pro Tips

  • Upload clean audio for better speaking rhythm and lip-sync quality.
  • Keep clips short while testing avatar style and framing.
  • Use caption when the final video needs accessible subtitles.
  • Only enable remove_background if the selected avatar supports matting.
  • Match aspect_ratio to your final platform, such as 16:9 for widescreen delivery.

Notes

  • avatar and audio are required.
  • Billing uses the uploaded audio duration, with a minimum of 5 seconds.
  • Audio duration is rounded up to the next whole second before billing.
  • remove_background requires a matting-enabled avatar.
  • caption generates a sidecar SRT file, not burned-in subtitles.
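
Because captions arrive as a separate SRT file, downstream tools have to read the cues themselves. The sketch below parses standard SRT text into (start, end, text) tuples. It assumes the sidecar file follows the common SRT layout (index line, HH:MM:SS,mmm --> HH:MM:SS,mmm timing line, then text); the sample content is invented for illustration and is not real model output.

```python
import re
from typing import List, Tuple

# Matches an SRT timing line such as "00:00:02,500 --> 00:00:05,000".
CUE_TIME = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def _to_seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(text: str) -> List[Tuple[float, float, str]]:
    """Return a list of (start_seconds, end_seconds, caption_text) cues."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIME.search(line)
            if match:
                g = match.groups()
                caption = "\n".join(lines[i + 1:]).strip()
                cues.append((_to_seconds(*g[:4]), _to_seconds(*g[4:]), caption))
                break
    return cues


# Invented sample, not actual output from the model.
SAMPLE = """1
00:00:00,000 --> 00:00:02,500
Hello team.

2
00:00:02,500 --> 00:00:05,000
Here is this week's update.
"""

for start, end, caption in parse_srt(SAMPLE):
    print(f"{start:.1f}-{end:.1f}: {caption}")
```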

Related Models

  • HeyGen Avatar IV avatar workflows — Useful when you need other avatar generation modes or delivery options.
  • Talking avatar video workflows — Useful when you want image-based or non-digital-twin avatar generation instead of a preset HeyGen twin.
  • Subtitle and caption workflows — Useful when you need more advanced subtitle styling or burn-in options.

Availability: This page uses AI models provided by third parties.