Wan 2.1 Mocha
MoCha performs video-to-video character swaps using reference images, replacing a video’s character without per-frame pose or depth maps. It is served through a ready-to-use REST inference API with no cold starts and affordable pricing.
Features
MoCha 🎭 — AI Video Character Replacement
MoCha is an end-to-end video character replacement system that seamlessly swaps the main character in a video with a new one provided via reference images. Unlike traditional methods, it requires no explicit per-frame structural guidance (such as pose or depth maps), while maintaining realistic motion, lighting, and facial expressions throughout the clip.
🌟 Key Features
- 🧠 Structure-Free Replacement: No need for pose or depth maps; MoCha automatically aligns motion, expression, and body posture.
- 🎥 Motion Preservation: Accurately transfers the source actor’s motion, emotion, and camera perspective to the target character.
- 🎨 Identity Consistency: Maintains the new character’s facial identity, lighting, and style across frames without flickering.
- ⚙️ Easy Setup: Works with a single image and a source video, with no need for complex preprocessing or rigging.
- 💡 High Realism, Low Effort: Suited to film, advertising, digital avatars, and creative character transformation.
💰 Pricing
| Resolution | Price per 5s | Price per second | Max Length |
|---|---|---|---|
| 480p | $0.20 | $0.04 / s | 120 s |
| 720p | $0.40 | $0.08 / s | 120 s |
Billing Rules
- Minimum charge: 5 seconds - any video shorter than 5 seconds is billed as 5 seconds.
- Maximum billed duration: 120 seconds (2 minutes)
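The pricing table and billing rules above can be turned into a small estimator. This is a minimal sketch, not the service's billing code: the function name is hypothetical, it assumes clips longer than 120 seconds are billed as 120 seconds, and it assumes fractional seconds are billed proportionally, since the rounding rule is not documented here.

```python
# Per-second rates from the pricing table, kept in cents to avoid float drift.
RATE_CENTS_PER_SECOND = {"480p": 4, "720p": 8}
MIN_BILLED_SECONDS = 5    # shorter clips are billed as 5 seconds
MAX_BILLED_SECONDS = 120  # assumption: longer clips are capped at 120 billed seconds


def estimate_cost_usd(resolution: str, duration_seconds: float) -> float:
    """Estimate the charge in USD for one clip (hypothetical helper)."""
    if resolution not in RATE_CENTS_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Clamp the duration into the billed range before applying the rate.
    billed_seconds = min(max(duration_seconds, MIN_BILLED_SECONDS), MAX_BILLED_SECONDS)
    return round(RATE_CENTS_PER_SECOND[resolution] * billed_seconds / 100, 2)
```

For example, a 3-second clip at 480p is estimated at the 5-second minimum, and a 10-second clip at 720p at ten times the 720p per-second rate.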
⚙️ How to Use
- Upload image: A clear reference image of the new character (recommended formats: JPG / PNG; avoid WEBP).
- Upload video: The motion source; MoCha extracts pose and expression dynamics from this clip.
- Add prompt (optional): Guide the output, e.g. “preserve outfit; natural expressions; no background changes.”
- Select resolution: Choose between 480p and 720p.
- Generate: Wait a moment while MoCha processes the replacement.
- Review & Iterate: Fix a seed to reproduce results, or vary it for A/B comparisons.
🧩 Tips for Best Results
- Match Pose & Composition: Keep your reference image’s camera angle, body orientation, and framing close to the target video.
- Keep Aspect Ratios Consistent: Use the same aspect ratio between your input image and video.
- Limit Video Length: For best stability, keep clips under 60 seconds — longer clips may show slight quality degradation.
- Lighting Consistency: Match lighting direction and tone between image and video to minimize blending artifacts.
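Two of the tips above (aspect ratio and clip length) can be checked before uploading. The sketch below is illustrative only: the function name and the 2% aspect-ratio tolerance are assumptions, and the image and video dimensions are expected to come from whatever media tool you already use.

```python
from typing import List, Tuple

ASPECT_TOLERANCE = 0.02       # assumption: relative difference still treated as "same ratio"
RECOMMENDED_MAX_SECONDS = 60  # from the tips above
HARD_MAX_SECONDS = 120        # maximum length from the pricing table


def precheck_inputs(
    image_size: Tuple[int, int],
    video_size: Tuple[int, int],
    video_seconds: float,
) -> List[str]:
    """Return human-readable warnings; an empty list means no issue was found."""
    warnings = []
    image_ratio = image_size[0] / image_size[1]  # width / height
    video_ratio = video_size[0] / video_size[1]
    if abs(image_ratio - video_ratio) / video_ratio > ASPECT_TOLERANCE:
        warnings.append("image and video aspect ratios differ")
    if video_seconds > HARD_MAX_SECONDS:
        warnings.append("clip exceeds the 120 s maximum length")
    elif video_seconds > RECOMMENDED_MAX_SECONDS:
        warnings.append("clip is longer than 60 s; quality may degrade")
    return warnings
```

Pose, framing, and lighting are not covered here, since they cannot be judged from dimensions alone.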
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
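# Submit the task (the image and video values are placeholder URLs; replace them with your own inputs)
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/wan-2.1/mocha" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
  "image": "https://example.com/reference.png",
  "video": "https://example.com/source.mp4",
  "resolution": "480p",
  "seed": -1
}'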
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
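The same two calls can be made from Python with only the standard library. This is a sketch under a few assumptions: the helper names are hypothetical, the `requestId` in the result URL is taken to be the `data.id` value returned by the submit call, and error handling is left out.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
SUBMIT_URL = f"{API_BASE}/wavespeed-ai/wan-2.1/mocha"


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request that submits a task."""
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key: str, request_id: str) -> urllib.request.Request:
    """Build the GET request that queries a task's result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


def submit_and_fetch(api_key: str, payload: dict) -> dict:
    """Submit a task, then query its result once (no waiting or retries)."""
    submitted = send(build_submit_request(api_key, payload))
    request_id = submitted["data"]["id"]  # assumption: data.id is the requestId
    return send(build_result_request(api_key, request_id))
```

A typical call would read the key from the `WAVESPEED_API_KEY` environment variable and pass a payload containing `image`, `video`, and any optional parameters.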
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| image | string | Yes | - | - | The image for generating the output. |
| video | string | Yes | - | - | The video for generating the output. |
| prompt | string | No | - | - | The positive prompt for the generation. |
| resolution | string | No | 480p | 480p, 720p | The resolution of the output video. |
| seed | integer | No | -1 | -1 ~ 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
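The constraints in the request table can be enforced client-side before a task is submitted. The following is a minimal sketch with a hypothetical `build_payload` helper; it only mirrors the defaults and ranges listed above and does not check that the image or video values are reachable.

```python
from typing import Optional

ALLOWED_RESOLUTIONS = ("480p", "720p")
MAX_SEED = 2147483647


def build_payload(
    image: str,
    video: str,
    prompt: Optional[str] = None,
    resolution: str = "480p",
    seed: int = -1,
) -> dict:
    """Validate the inputs against the request table and return the JSON body."""
    if not image or not video:
        raise ValueError("image and video are required")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    if not -1 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be between -1 and {MAX_SEED}")
    payload = {"image": image, "video": video, "resolution": resolution, "seed": seed}
    if prompt:  # prompt is optional, so it is only sent when provided
        payload["prompt"] = prompt
    return payload
```

Passing a fixed non-negative `seed` keeps runs reproducible, while leaving it at -1 requests a random seed for each run.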
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
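Because `data.outputs` stays empty until `data.status` is `completed`, a client usually polls the result endpoint. The loop below is a sketch of that pattern based only on the fields in the table above; the function name, the 2-second interval, and the attempt limit are assumptions, and `fetch_result` stands for any callable that returns the decoded result JSON.

```python
import time
from typing import Callable, List

PENDING_STATUSES = {"created", "processing"}


def poll_until_done(
    fetch_result: Callable[[], dict],
    interval_seconds: float = 2.0,
    max_attempts: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until the task completes and return the output URLs."""
    for _ in range(max_attempts):
        data = fetch_result().get("data") or {}
        status = data.get("status")
        if status == "completed":
            return list(data.get("outputs") or [])
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed without an error message")
        if status not in PENDING_STATUSES:
            raise ValueError(f"unexpected status: {status!r}")
        sleep(interval_seconds)  # still created or processing; wait and retry
    raise TimeoutError(f"task still pending after {max_attempts} attempts")
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays.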