Infinitetalk Video To Video

InfiniteTalk is an audio-driven conversational AI video generation model. It creates talking or singing videos from a single video and an audio input. The endpoint costs $0.15 per 5 seconds of generated video at 480p or $0.30 per 5 seconds at 720p, and supports a maximum generation length of 10 minutes.

Features

InfiniteTalk Video-to-Video

What is InfiniteTalk?

InfiniteTalk creates new videos by combining an input silent video and an audio track. It ensures precise lip synchronization while aligning head, face, and body movements with the audio. With optional masking and prompting, you can control which areas move and how the scene appears. The model also maintains visual identity for natural and consistent results.

Why it looks great

  • Accurate lip synchronization: matches lip motion precisely to the audio.
  • Full-body coherence: aligns head pose, facial expressions, and posture with speech.
  • Mask control: optional mask images let you define which regions can move.
  • Instruction following: prompts can guide style, pose, or behavior.
  • Identity preservation: ensures consistent visual identity across all frames.

Pricing

| Output Resolution | Cost per 5 seconds | Max Length |
|-------------------|--------------------|------------|
| 480p              | $0.15              | 10 minutes |
| 720p              | $0.30              | 10 minutes |

Billing Rules

  • Standard (480p) Rate: $0.03 per second
  • HD (720p) Rate: $0.06 per second (double the Standard Rate)
  • Minimum Charge: All videos are billed for a minimum of 5 seconds (at least $0.15 at 480p, or $0.30 at 720p).
  • Billing Cap: To keep your costs predictable, billing is capped at 600 seconds (10 minutes).
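As a rough illustration of the billing rules above, the following Python sketch estimates the charge for a clip. The helper name `estimate_cost` is hypothetical. It assumes billing is linear in seconds between the 5-second minimum and the 600-second cap, with no rounding of fractional seconds; actual invoices may round differently.

```python
from decimal import Decimal

# Per-second rates taken from the billing rules above.
RATE_PER_SECOND = {"480p": Decimal("0.03"), "720p": Decimal("0.06")}
MIN_BILLED_SECONDS = 5    # minimum charge window
MAX_BILLED_SECONDS = 600  # billing cap (10 minutes)


def estimate_cost(duration_seconds, resolution="480p"):
    """Return the estimated charge in USD as a Decimal (hypothetical helper)."""
    if resolution not in RATE_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Clamp the billed duration to the documented minimum and cap.
    billed = min(max(duration_seconds, MIN_BILLED_SECONDS), MAX_BILLED_SECONDS)
    # Decimal(str(...)) avoids binary floating-point error in the total.
    return RATE_PER_SECOND[resolution] * Decimal(str(billed))
```

Under these assumptions, a 3-second clip at 480p is billed as 5 seconds ($0.15), and a 12-second clip at 720p comes to $0.72.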

How to Use

  1. Upload the audio file.
  2. Upload a video as the base.
  3. (Optional) Upload a mask image to control which regions can move.
  4. (Optional) Write a prompt to guide the style, pose, or expressions.
  5. Select the output resolution (480p or 720p).
  6. Set the seed if you want reproducibility.
  7. Submit the job and download the generated video.

Note

  • Max clip length per job: 10 minutes
  • Processing speed: ~10–30 seconds of wall time per 1 second of video (varies by resolution and queue load)
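To turn the processing-speed note into a planning figure, a small sketch like the one below can help. The helper name `estimate_wall_time` is hypothetical, and the bounds simply apply the approximate 10 to 30 seconds of wall time per second of video quoted above; real times depend on resolution and queue load.

```python
def estimate_wall_time(video_seconds):
    """Return (low, high) processing-time bounds in seconds (hypothetical helper)."""
    if video_seconds <= 0:
        raise ValueError("video_seconds must be positive")
    # Roughly 10 to 30 seconds of wall time per 1 second of output video.
    return (10 * video_seconds, 30 * video_seconds)
```

For a 60-second clip this gives 600 to 1800 seconds, so a job of that length may take somewhere between 10 and 30 minutes.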

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task ("audio" and "video" are required; replace the placeholders with your own inputs)
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/infinitetalk/video-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "audio": "<YOUR_AUDIO>",
    "video": "<YOUR_VIDEO>",
    "resolution": "480p",
    "seed": -1
}'

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
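The same submit-then-query flow can be sketched in Python using only the standard library. The two URLs and the headers come from the curl commands above. The helper names (`build_submit_request`, `build_result_request`, `send`, `run_job`) are hypothetical, as are the polling interval and timeout. The sketch also assumes that both endpoints return the envelope described under Response Parameters (`data.id`, `data.status`), which this page does not state explicitly for the result endpoint.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
SUBMIT_URL = f"{API_BASE}/wavespeed-ai/infinitetalk/video-to-video"


def build_submit_request(api_key, audio, video, resolution="480p", seed=-1):
    """Build the POST request that submits a task (nothing is sent yet)."""
    payload = {"audio": audio, "video": video,
               "resolution": resolution, "seed": seed}
    headers = {"Content-Type": "application/json",
               "Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(
        SUBMIT_URL, data=json.dumps(payload).encode("utf-8"),
        headers=headers, method="POST")


def build_result_request(api_key, request_id):
    """Build the GET request that queries a task result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"}, method="GET")


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def run_job(api_key, audio, video, poll_interval=5.0, timeout=3600.0, **options):
    """Submit a task, then poll until it completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    # Assumption: the submission response carries data.id and data.status.
    task = send(build_submit_request(api_key, audio, video, **options))["data"]
    while task["status"] not in ("completed", "failed"):
        if time.monotonic() > deadline:
            raise TimeoutError(f"task {task['id']} did not finish in time")
        time.sleep(poll_interval)
        task = send(build_result_request(api_key, task["id"]))["data"]
    return task
```

A call such as `run_job(os.environ["WAVESPEED_API_KEY"], audio, video, resolution="720p")` (after `import os`) would then return the final task object. The expected format of the `audio` and `video` strings is not specified on this page, so supply them as you would in the curl request.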

Parameters

Task Submission Parameters

Request Parameters

| Parameter  | Type    | Required | Default | Range           | Description |
|------------|---------|----------|---------|-----------------|-------------|
| audio      | string  | Yes      | -       | -               | The audio for generating the output. |
| video      | string  | Yes      | -       | -               | The video for generating the output. |
| mask_image | string  | No       | -       | -               | Optional mask image to specify the person in the video to animate. |
| prompt     | string  | No       | -       | -               | The positive prompt for the generation. |
| resolution | string  | No       | 480p    | 480p, 720p      | The resolution of the output video. |
| seed       | integer | No       | -1      | -1 ~ 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
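The constraints in the request parameter table can be checked on the client before a task is submitted. The following is a minimal sketch; `build_payload` is a hypothetical helper, and the checks mirror only what the table states (required fields, allowed resolutions, seed range). The API may enforce additional rules not listed here.

```python
ALLOWED_RESOLUTIONS = ("480p", "720p")
SEED_MIN, SEED_MAX = -1, 2147483647


def build_payload(audio, video, mask_image=None, prompt=None,
                  resolution="480p", seed=-1):
    """Validate inputs against the parameter table and return the JSON body."""
    if not audio or not video:
        raise ValueError("audio and video are required")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise ValueError("seed must be an integer")
    if not SEED_MIN <= seed <= SEED_MAX:
        raise ValueError(f"seed must be between {SEED_MIN} and {SEED_MAX}")
    payload = {"audio": audio, "video": video,
               "resolution": resolution, "seed": seed}
    # Optional fields are included only when the caller provides them.
    if mask_image is not None:
        payload["mask_image"] = mask_image
    if prompt is not None:
        payload["prompt"] = prompt
    return payload
```

Leaving `seed` at -1 requests a random seed; passing a fixed value in range is what makes a run reproducible.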

Response Parameters

| Parameter              | Type    | Description |
|------------------------|---------|-------------|
| code                   | integer | HTTP status code (e.g., 200 for success) |
| message                | string  | Status message (e.g., “success”) |
| data.id                | string  | Unique identifier for the prediction, Task Id |
| data.model             | string  | Model ID used for the prediction |
| data.outputs           | array   | Array of URLs to the generated content (empty when status is not completed) |
| data.urls              | object  | Object containing related API endpoints |
| data.urls.get          | string  | URL to retrieve the prediction result |
| data.has_nsfw_contents | array   | Array of boolean values indicating NSFW detection for each output |
| data.status            | string  | Status of the task: created, processing, completed, or failed |
| data.created_at        | string  | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error             | string  | Error message (empty if no error occurred) |
| data.timings           | object  | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
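Once a response has been decoded from JSON, the fields above can be interpreted roughly as follows. This is a sketch: `extract_outputs` is a hypothetical helper, and it relies only on the documented `data.status`, `data.outputs`, and `data.error` fields.

```python
def extract_outputs(response):
    """Return output URLs when the task is done, or None while it is pending.

    Raises RuntimeError if the task has failed.
    """
    data = response["data"]
    status = data["status"]
    if status == "failed":
        # data.error is documented as empty when no error occurred.
        raise RuntimeError(data.get("error") or "task failed")
    if status != "completed":
        # "created" or "processing": outputs are still empty.
        return None
    return list(data["outputs"])
```

A caller can treat a `None` return as "query again later" and a returned list as the final set of generated video URLs.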

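Result Request Parameters

The result request takes the task ID (data.id from the submission response) in place of ${requestId} in the URL path.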

© 2025 WaveSpeedAI. All rights reserved.