LongCat Avatar 1.5 Multi

LongCat Avatar 1.5 Multi converts a single image and two audio inputs into multi-character talking or singing videos at up to 720p, capped at 30 seconds per clip. Ready-to-use REST API, no cold starts, affordable pricing.

Features

What is LongCat Avatar 1.5 Multi?

LongCat Avatar 1.5 Multi animates a single image of two people using two separate audio tracks, one per person. It produces realistic, lip-synchronized multi-character video with natural motion and identities preserved across frames.

Why it looks great

Per-speaker lip sync: each speaker’s mouth tracks its own audio precisely.
Full-body coherence: head motion, facial expressions, and posture align with the assigned audio for each speaker.
Identity preservation: both faces remain visually consistent throughout the clip.
Flexible turn-taking: choose between simultaneous speech and ordered turn-taking via the order parameter.

Pricing

Output Resolution	Cost per 5 seconds	Max Length
480p	$0.15	30 seconds
720p	$0.30	30 seconds

Billing Rules

Standard (480p) Rate: $0.03 per second
HD (720p) Rate: $0.06 per second (double the standard rate)
Minimum Charge: All videos are billed for a minimum of 5 seconds (at least $0.15 at 480p, or $0.30 at 720p).
Billing Cap: Billing is capped at 30 seconds.
order=meanwhile: billed by max(left_audio_duration, right_audio_duration).
order=left_right / right_left: billed by left_audio_duration + right_audio_duration (both audios played sequentially).
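
The billing rules above can be sketched as a small cost estimator. This is an illustrative sketch, not the service's billing code: the function names are hypothetical, and it assumes fractional seconds are billed as-is without rounding, which the rules above do not specify.

```python
from decimal import Decimal

# Per-second rates taken from the billing rules above.
RATE_PER_SECOND = {"480p": Decimal("0.03"), "720p": Decimal("0.06")}
MIN_BILLED_SECONDS = Decimal("5")
MAX_BILLED_SECONDS = Decimal("30")


def billed_seconds(left_s, right_s, order="meanwhile"):
    """Return the billable duration for two audio lengths given in seconds."""
    left, right = Decimal(str(left_s)), Decimal(str(right_s))
    if order == "meanwhile":
        raw = max(left, right)       # both tracks play at the same time
    elif order in ("left_right", "right_left"):
        raw = left + right           # tracks play one after the other
    else:
        raise ValueError(f"unknown order: {order!r}")
    # Apply the 5-second minimum, then the 30-second cap.
    return min(max(raw, MIN_BILLED_SECONDS), MAX_BILLED_SECONDS)


def estimate_cost(left_s, right_s, order="meanwhile", resolution="480p"):
    """Estimate the charge in USD (assumption: no rounding of seconds)."""
    if resolution not in RATE_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution!r}")
    return billed_seconds(left_s, right_s, order) * RATE_PER_SECOND[resolution]
```

For example, two 20-second tracks with order=left_right add up to 40 seconds, so the estimate uses the 30-second cap.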

How to Use

Upload the left_audio and right_audio files.
Upload your image (it should clearly show two people, left and right).
Select the speaking order (left_right, right_left, or meanwhile).
Select the resolution (480p or 720p).
(Optional) Add a prompt to guide expression, style, or pose.
Submit the job and download the result once it’s ready.
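
When calling the API instead of using the web form, the same choices become fields of a JSON request body. The sketch below assembles and validates that body using the parameter names, allowed values, and defaults from the Request Parameters table. The helper name `build_payload` is hypothetical, and the format expected for the image and audio strings is not specified here, so they are passed through unchanged.

```python
ALLOWED_ORDERS = ("meanwhile", "left_right", "right_left")
ALLOWED_RESOLUTIONS = ("480p", "720p")
MAX_SEED = 2147483647


def build_payload(image, left_audio, right_audio, order="meanwhile",
                  resolution="480p", prompt=None, seed=-1):
    """Build the request body for a task; raise ValueError on invalid input."""
    required = (("image", image), ("left_audio", left_audio),
                ("right_audio", right_audio))
    for name, value in required:
        if not value:
            raise ValueError(f"{name} is required")
    if order not in ALLOWED_ORDERS:
        raise ValueError(f"order must be one of {ALLOWED_ORDERS}")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    if not -1 <= seed <= MAX_SEED:
        raise ValueError("seed must be between -1 and 2147483647")

    payload = {
        "image": image,
        "left_audio": left_audio,
        "right_audio": right_audio,
        "order": order,
        "resolution": resolution,
        "seed": seed,  # -1 asks the service to pick a random seed
    }
    if prompt:  # the prompt is optional, so omit it when empty
        payload["prompt"] = prompt
    return payload
```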

Note

Max clip length per job: 30 seconds. Longer audio is automatically trimmed.
Processing speed: approximately 10-30 seconds of wall time per 1 second of video (varies by resolution and queue load).
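
As a rough planning aid, the two notes above can be combined into a wall-time estimate, for instance to pick a client-side timeout. This is a sketch under the stated figures only: the function name is hypothetical, and real times vary with resolution and queue load.

```python
MAX_CLIP_SECONDS = 30          # longer audio is trimmed to this length
WALL_SECONDS_PER_VIDEO_SECOND = (10, 30)  # approximate range from the note above


def estimate_wall_time(audio_seconds):
    """Return (clip_seconds, fastest_wall_seconds, slowest_wall_seconds)."""
    clip = min(audio_seconds, MAX_CLIP_SECONDS)
    low, high = WALL_SECONDS_PER_VIDEO_SECOND
    return clip, clip * low, clip * high
```

For 45 seconds of audio this returns a 30-second clip with an estimated 300 to 900 seconds of processing.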

More Versions

Single-character version

InfiniteTalk — Higher-end avatar video workflow for more advanced speaking performance.
InfiniteTalk Multi — Multi-character avatar workflow for more complex scenes.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


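# Submit the task (image, left_audio, and right_audio are required; replace the placeholder values)
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/longcat-avatar-1.5/multi" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "image": "YOUR_IMAGE",
    "left_audio": "YOUR_LEFT_AUDIO",
    "right_audio": "YOUR_RIGHT_AUDIO",
    "order": "meanwhile",
    "resolution": "480p",
    "seed": -1
}'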

# Get the result (set requestId to the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
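
The same submit-then-query flow can be sketched in Python with only the standard library. The endpoint paths, headers, and response fields (`data.id`, `data.status`, `data.outputs`, `data.error`) come from this page; the function names, the 5-second polling interval, and the 900-second timeout are assumptions chosen for illustration, and HTTP error handling is left out.

```python
import json
import os
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "wavespeed-ai/longcat-avatar-1.5/multi"


def _request(method, url, body=None):
    """Send an authenticated request and return the decoded JSON response."""
    headers = {"Authorization": f"Bearer {os.environ['WAVESPEED_API_KEY']}"}
    data = None
    if body is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    req = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def submit_task(payload):
    """Submit a task and return its ID (data.id in the response)."""
    return _request("POST", f"{API_BASE}/{MODEL_PATH}", payload)["data"]["id"]


def get_result(task_id):
    """Fetch the current prediction data for a task."""
    return _request("GET", f"{API_BASE}/predictions/{task_id}/result")["data"]


def wait_for_result(task_id, fetch=get_result, interval=5.0, timeout=900.0,
                    sleep=time.sleep):
    """Poll until the task completes; interval and timeout are assumed values."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch(task_id)
        if data["status"] == "completed":
            return data["outputs"]
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout}s")
        sleep(interval)
```

A typical call would be `wait_for_result(submit_task(payload))`, where `payload` holds the request parameters described below. The `fetch` and `sleep` arguments exist so the loop can be exercised without network access.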

Parameters

Task Submission Parameters

Request Parameters

Parameter	Type	Required	Default	Range	Description
image	string	Yes	-	-	The source image showing the two people to animate.
left_audio	string	Yes	-	-	The audio for the person on the left.
right_audio	string	Yes	-	-	The audio for the person on the right.
prompt	string	No	-	-	The positive prompt for the generation.
order	string	No	meanwhile	meanwhile, left_right, right_left	The order in which the two audio sources play in the output video. "meanwhile": both play at the same time. "left_right": the left audio plays first, then the right. "right_left": the right audio plays first, then the left.
resolution	string	No	480p	480p, 720p	The resolution of the output video.
seed	integer	No	-1	-1 ~ 2147483647	The random seed to use for the generation. -1 means a random seed will be used.

Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data.id	string	Unique identifier for the prediction (the task ID)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds

Result Request Parameters

Parameter	Type	Required	Default	Description
id	string	Yes	-	Task ID

Result Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data	object	The prediction data object containing all details
data.id	string	Unique identifier for the prediction (the task ID that was queried)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
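
To make the response fields above concrete, here is a sketch that maps a response body onto a small typed object. The `Prediction` class and `parse_prediction` helper are hypothetical names, and the sketch assumes optional fields may be missing or null, which this page does not state.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Statuses after which the task will not change any further.
TERMINAL_STATUSES = {"completed", "failed"}


@dataclass
class Prediction:
    id: str
    model: str
    status: str
    outputs: List[str] = field(default_factory=list)
    error: str = ""
    created_at: str = ""
    inference_ms: Optional[int] = None  # data.timings.inference, in milliseconds

    @property
    def is_done(self) -> bool:
        return self.status in TERMINAL_STATUSES


def parse_prediction(body: dict) -> Prediction:
    """Convert a decoded response body into a Prediction."""
    data = body["data"]
    timings = data.get("timings") or {}
    return Prediction(
        id=data["id"],
        model=data.get("model", ""),
        status=data["status"],
        outputs=list(data.get("outputs") or []),
        error=data.get("error") or "",
        created_at=data.get("created_at", ""),
        inference_ms=timings.get("inference"),
    )
```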
