Longcat Avatar 1.5 Multi

LongCat Avatar 1.5 Multi converts a single image and two audio inputs into multi-character talking or singing videos at up to 720p, capped at 30 seconds per clip. Ready-to-use REST API, no cold starts, affordable pricing.

Features

What is LongCat Avatar 1.5 Multi?

LongCat Avatar 1.5 Multi animates a single image containing two people from two distinct audio tracks, producing super-realistic, lip-synchronized multi-character video with natural dynamics and identity preserved across frames.

Why it looks great

  • Per-speaker lip sync: each speaker’s mouth tracks its own audio precisely.

  • Full-body coherence: head motion, facial expressions, and posture align with the assigned audio for each speaker.

  • Identity preservation: both faces remain visually consistent throughout the clip.

  • Flexible turn-taking: choose between simultaneous speech and ordered turn-taking via the order parameter.

Pricing

| Output Resolution | Cost per 5 seconds | Max Length |
|---|---|---|
| 480p | $0.15 | 30 seconds |
| 720p | $0.30 | 30 seconds |

Billing Rules

  • Standard (480p) Rate: $0.03 per second
  • HD (720p) Rate: $0.06 per second (double the Standard Rate)
  • Minimum Charge: All videos are billed for a minimum of 5 seconds (at least $0.15 at 480p, or $0.30 at 720p).
  • Billing Cap: Billing is capped at 30 seconds.
  • order=meanwhile: billed by max(left_audio_duration, right_audio_duration).
  • order=left_right / right_left: billed by left_audio_duration + right_audio_duration (both audios played sequentially).
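
The billing rules above can be sketched as a small estimator. This is an illustration, not official billing logic: the function names are hypothetical, and the sketch assumes fractional seconds are billed proportionally with no rounding, which the rules do not state.

```python
# Rates per second in USD, taken from the pricing table above.
RATE_PER_SECOND = {"480p": 0.03, "720p": 0.06}
MIN_BILLED_SECONDS = 5   # minimum charge rule
MAX_BILLED_SECONDS = 30  # billing cap rule


def billed_seconds(left_s, right_s, order="meanwhile"):
    """Return the billed duration for two audio tracks (durations in seconds)."""
    if order == "meanwhile":
        raw = max(left_s, right_s)      # both speakers talk at once
    elif order in ("left_right", "right_left"):
        raw = left_s + right_s          # audios play one after the other
    else:
        raise ValueError(f"unknown order: {order!r}")
    # Apply the 5-second minimum, then the 30-second cap.
    return min(max(raw, MIN_BILLED_SECONDS), MAX_BILLED_SECONDS)


def estimate_cost(left_s, right_s, order="meanwhile", resolution="480p"):
    """Estimate the charge in USD (assumption: no per-second rounding)."""
    seconds = billed_seconds(left_s, right_s, order)
    return round(seconds * RATE_PER_SECOND[resolution], 2)
```

For example, an 8-second left track and a 12-second right track are billed as 12 seconds with order=meanwhile and as 20 seconds with order=left_right.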

How to Use

  1. Upload the left_audio and right_audio files.
  2. Upload your image (it should clearly show two people, left and right).
  3. Select the speaking order (left_right, right_left, or meanwhile).
  4. Select the resolution (480p or 720p).
  5. (Optional) Add a prompt to guide expression, style, or pose.
  6. Submit the job and download the result once it’s ready.
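
For API users, the steps above map onto a JSON request body. A minimal sketch follows; `build_payload` is a hypothetical helper, and it assumes the image and audio fields are passed as URLs of already-uploaded files (the parameter table only says they are strings).

```python
def build_payload(image, left_audio, right_audio,
                  order="meanwhile", resolution="480p",
                  prompt=None, seed=-1):
    """Assemble the request body from the inputs chosen in steps 1-5."""
    payload = {
        "image": image,              # picture showing two people, left and right
        "left_audio": left_audio,    # audio for the person on the left
        "right_audio": right_audio,  # audio for the person on the right
        "order": order,              # left_right, right_left, or meanwhile
        "resolution": resolution,    # 480p or 720p
        "seed": seed,                # -1 requests a random seed
    }
    if prompt:                       # the prompt is optional, so omit it when empty
        payload["prompt"] = prompt
    return payload
```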

Note

  • Max clip length per job: up to 30 seconds. Longer audio is automatically trimmed.
  • Processing speed: approximately 10-30 seconds of wall time per 1 second of video (varies by resolution and queue load).
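
As a rough planning aid, the two notes above can be combined into a wall-time estimate. This is only a sketch: `estimate_wall_time` is a hypothetical name, and the 10x to 30x factors are the approximate figures quoted above, not guarantees.

```python
def estimate_wall_time(audio_seconds, max_clip_seconds=30.0):
    """Return (low, high) processing-time estimates in seconds.

    audio_seconds is the total clip duration implied by the chosen order;
    anything beyond the 30-second limit is trimmed before estimating.
    """
    clip = min(audio_seconds, max_clip_seconds)
    return clip * 10, clip * 30
```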

More Versions


  • InfiniteTalk — Higher-end avatar video workflow for more advanced speaking performance.
  • InfiniteTalk Multi — Multi-character avatar workflow for more complex scenes.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


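# Submit the task
# image, left_audio, and right_audio are required; the example.com values are placeholders
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/longcat-avatar-1.5/multi" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "image": "https://example.com/two-people.jpg",
    "left_audio": "https://example.com/left.mp3",
    "right_audio": "https://example.com/right.mp3",
    "order": "meanwhile",
    "resolution": "480p",
    "seed": -1
}'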

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
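
The same two calls can be sketched in Python using only the standard library. The endpoint URLs and headers come from the curl commands above; the function names are hypothetical, and nothing is sent until `send` is called.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "wavespeed-ai/longcat-avatar-1.5/multi"


def submit_request(payload, api_key):
    """Build the POST request that submits a task."""
    return urllib.request.Request(
        f"{API_BASE}/{MODEL_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def result_request(request_id, api_key):
    """Build the GET request that queries a task's result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request, timeout=60):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A typical flow would be `send(submit_request(payload, key))`, then repeated `send(result_request(task_id, key))` calls until the task finishes, with the key read from the WAVESPEED_API_KEY environment variable.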

Parameters

Task Submission Parameters

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| image | string | Yes | - | - | The image for generating the output. |
| left_audio | string | Yes | - | - | The audio of the person on the left for generating the output. |
| right_audio | string | Yes | - | - | The audio of the person on the right for generating the output. |
| prompt | string | No | - | - | The positive prompt for the generation. |
| order | string | No | meanwhile | meanwhile, left_right, right_left | The order of the two audio sources in the output video. "meanwhile" plays both audio sources at the same time; "left_right" plays the left audio first, then the right; "right_left" plays the right audio first, then the left. |
| resolution | string | No | 480p | 480p, 720p | The resolution of the output video. |
| seed | integer | No | -1 | -1 ~ 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
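
A client-side check against the table above can catch malformed requests before they are sent. The sketch below is illustrative only: `validate_payload` is a hypothetical helper, and the server's own validation may differ.

```python
ALLOWED_ORDERS = {"meanwhile", "left_right", "right_left"}
ALLOWED_RESOLUTIONS = {"480p", "720p"}
REQUIRED_FIELDS = ("image", "left_audio", "right_audio")
MAX_SEED = 2147483647


def validate_payload(payload):
    """Return a list of problems found in a request body (empty if none)."""
    errors = []
    for name in REQUIRED_FIELDS:
        value = payload.get(name)
        if not isinstance(value, str) or not value:
            errors.append(f"{name} is required and must be a non-empty string")
    # Optional fields fall back to the documented defaults.
    if payload.get("order", "meanwhile") not in ALLOWED_ORDERS:
        errors.append("order must be meanwhile, left_right, or right_left")
    if payload.get("resolution", "480p") not in ALLOWED_RESOLUTIONS:
        errors.append("resolution must be 480p or 720p")
    seed = payload.get("seed", -1)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(seed, bool) or not isinstance(seed, int) or not -1 <= seed <= MAX_SEED:
        errors.append("seed must be an integer between -1 and 2147483647")
    if "prompt" in payload and not isinstance(payload["prompt"], str):
        errors.append("prompt must be a string")
    return errors
```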

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction, Task Id |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
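
A submission response with the fields above could be unpacked as follows. This is a sketch: `parse_submission` is a hypothetical helper, and treating any `code` other than 200 as an error is an assumption based on the table's example.

```python
def parse_submission(response):
    """Extract the task ID, result URL, and status from a submission response."""
    if response.get("code") != 200:
        # Assumption: a non-200 code means the submission was rejected.
        raise RuntimeError(f"submission failed: {response.get('message')}")
    data = response["data"]
    return data["id"], data["urls"]["get"], data["status"]
```

The returned task ID is the value to pass as `id` when querying the result.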

Result Request Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |

Result Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction, the ID of the prediction to get |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |

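
Because `data.status` moves through created and processing before reaching completed or failed, clients usually poll the result endpoint. The loop below is a minimal sketch: `wait_for_result` is a hypothetical helper, `fetch` is any caller-supplied function that returns the decoded result response, and the interval and attempt limit are arbitrary defaults.

```python
import time


def wait_for_result(fetch, interval=2.0, max_attempts=300, sleep=time.sleep):
    """Poll until the task completes, then return data.outputs.

    fetch: zero-argument callable returning the result response as a dict.
    """
    for _ in range(max_attempts):
        data = fetch()["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]       # list of URLs to the generated video
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        sleep(interval)                  # still created or processing
    raise TimeoutError("task did not finish within the polling limit")
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.
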
© 2025 WaveSpeedAI. All rights reserved.