Alibaba Happyhorse 1.1 Reference To Video
Alibaba HappyHorse 1.1 Reference to Video generates new video scenes from reference images, preserving character identity, visual style, and scene consistency. It is available on WavespeedAI through a ready-to-use REST inference API with no cold starts and affordable pricing.
Features
Alibaba Happy Horse 1.1 Reference-to-Video
Alibaba Happy Horse 1.1 Reference-to-Video generates new video scenes guided by one or more reference images, helping maintain consistent characters, styles, and visual identity across the output. It combines reference-image grounding with natural-language prompting to create cinematic videos in 720p or 1080p.
- Need to generate directly from text? Try Alibaba Happy Horse 1.1 Text-to-Video.
- Need to animate a single starting frame instead? Try Alibaba Happy Horse 1.1 Image-to-Video.
Why Choose This?
- Reference-guided consistency: Use up to 9 reference images to preserve character identity, visual style, outfit details, and overall scene language.
- Prompt + image control: Combine reference images with a text prompt to control the scene, action, mood, and camera behavior more precisely.
- Cinematic motion: Generate smooth, expressive video motion while keeping important visual elements stable and recognizable.
- Flexible output settings: Choose output resolution, aspect ratio, duration, and seed to match your creative and production needs.
- Production-ready API: Access the model through a REST inference API with no cold starts for scalable integration into apps and workflows.
Parameters
| Parameter | Required | Description |
|---|---|---|
| images | Yes | Reference image URLs. Supports 1–9 images. |
| prompt | Yes | Text description of the desired scene, action, style, or motion. |
| resolution | No | Output resolution: 720p (default) or 1080p. |
| aspect_ratio | No | Output aspect ratio. Default: 16:9. |
| duration | No | Video length in seconds. Range: 3–15, default 5. |
| seed | No | Random seed for reproducibility. Range: 0–2147483647. |
How to Use
- Upload your reference images: provide `1–9` image URLs that define the character, style, or visual identity you want to preserve.
- Write your prompt: describe the target scene, action, camera behavior, lighting, and mood.
- Choose resolution: use `720p` for lower-cost iteration or `1080p` for higher-quality final output.
- Set aspect ratio: choose the format that best fits your target platform or composition needs.
- Set duration: choose a clip length between `3` and `15` seconds.
- Set a seed (optional): use a fixed seed for more reproducible generations.
- Submit: generate and download your video.
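The steps above map onto a single JSON request body. The following is a minimal sketch of assembling that body in Python, using the field names and defaults from the Parameters table. The helper name `build_payload`, the example image URL, and the seed value are illustrative assumptions, not part of the API.

```python
from typing import Optional


def build_payload(
    images: list[str],
    prompt: str,
    resolution: str = "720p",
    aspect_ratio: str = "16:9",
    duration: int = 5,
    seed: Optional[int] = None,
) -> dict:
    """Assemble a request body from the documented fields and defaults."""
    payload = {
        "images": list(images),
        "prompt": prompt,
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
    # seed is optional, so it is only included when a fixed value is wanted
    if seed is not None:
        payload["seed"] = seed
    return payload


# Placeholder URL and seed, for illustration only
draft = build_payload(
    images=["https://example.com/character-front.jpg"],
    prompt="The same character walking through a softly lit city street at night",
    seed=42,
)
print(draft)
```

Keeping the defaults in one helper makes it easy to iterate at `720p` and then change only `resolution` for the final render.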
Example Prompt
A cinematic fashion scene with the same character walking through a softly lit modern city street at night, gentle camera tracking, subtle wind in the hair and clothing, elegant movement, realistic lighting, premium commercial style
Pricing
Per 5 Seconds
| Resolution | Cost |
|---|---|
| 720p | $0.70 |
| 1080p | $1.40 |
Example Costs
| Resolution | 3s | 5s | 10s | 15s |
|---|---|---|---|---|
| 720p | $0.42 | $0.70 | $1.40 | $2.10 |
| 1080p | $0.84 | $1.40 | $2.80 | $4.20 |
Billing Rules
- Base price: `720p` costs $0.70 per 5 seconds
- 1080p surcharge: `1080p` costs 2× the `720p` rate
- Total price formula: `total_price = 0.70 × (resolution == "1080p" ? 2 : 1) × duration / 5`
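The billing formula can be checked against the Example Costs table with a few lines of Python. This is a sketch only: the function name `estimate_price` is an assumption, and `Decimal` is used here simply to avoid floating-point rounding in currency values.

```python
from decimal import Decimal

# 720p rate per 5 seconds, taken from the pricing table
BASE_PRICE_PER_5S = Decimal("0.70")


def estimate_price(resolution: str, duration: int) -> Decimal:
    """Apply the documented formula: base x (2 if 1080p else 1) x duration / 5."""
    if resolution not in ("720p", "1080p"):
        raise ValueError("resolution must be '720p' or '1080p'")
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    multiplier = 2 if resolution == "1080p" else 1
    return BASE_PRICE_PER_5S * multiplier * duration / 5


# Reproduce the Example Costs table
for resolution in ("720p", "1080p"):
    costs = [estimate_price(resolution, seconds) for seconds in (3, 5, 10, 15)]
    print(resolution, [f"${cost:.2f}" for cost in costs])
```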
Best Use Cases
- Character consistency across scenes — Keep the same person, outfit, or visual identity across multiple generated videos.
- Brand and campaign content — Maintain a stable look and style across ad creatives, promos, and commercial storytelling.
- Style-preserving video generation — Use reference images to anchor art direction, color palette, and visual tone.
- Narrative concepting — Generate new scenes based on known characters or environments for storyboarding and ideation.
- Social media and short-form content — Create visually consistent clips tailored to different platforms and aspect ratios.
- Creative prototyping — Explore motion and scene variations while preserving core reference details.
Pro Tips
- Use clear, high-quality reference images that strongly represent the character, outfit, or style you want to preserve.
- Include multiple reference images when consistency across facial features, costume details, or design elements is important.
- Be specific in your prompt about scene, action, camera motion, lighting, and mood.
- Use `720p` for rapid testing, then switch to `1080p` for final-quality renders.
- Reuse the same `seed` when you want more reproducible outputs.
- Start with shorter durations to validate identity consistency and motion before generating longer clips.
Notes
- Both `images` and `prompt` are required.
- `images` supports `1–9` reference image URLs.
- Ensure all image URLs are publicly accessible.
- Supported video duration is `3–15` seconds.
- Supported resolutions are `720p` and `1080p`.
- Pricing scales linearly with `duration`.
- `1080p` pricing is exactly 2× the `720p` rate.
- Please ensure your content complies with applicable usage policies.
Related Models
- Alibaba Happy Horse 1.1 Text-to-Video — Generate cinematic videos directly from natural-language prompts.
- Alibaba Happy Horse 1.1 Image-to-Video — Animate a single reference image into a cinematic video clip.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
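# Submit the task (prompt and images are required; the image URL is a placeholder, replace it with your own publicly accessible reference image)
curl --location --request POST "https://api.wavespeed.ai/api/v3/alibaba/happyhorse-1.1/reference-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
"prompt": "A cinematic fashion scene with the same character walking through a softly lit modern city street at night",
"images": ["https://example.com/reference-1.jpg"],
"resolution": "720p",
"aspect_ratio": "16:9",
"duration": 5
}'
# Get the result (requestId is the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"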
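For integrations that prefer Python over curl, the two calls above could be expressed with the standard library roughly as follows. This is a sketch under assumptions: the helper names `submit_request`, `result_request`, and `send` are illustrative, and only the URLs and headers are taken from the curl commands.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "alibaba/happyhorse-1.1/reference-to-video"


def submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that submits a task."""
    return urllib.request.Request(
        f"{API_BASE}/{MODEL_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def result_request(api_key: str, request_id: str) -> urllib.request.Request:
    """Build (but do not send) the GET request that fetches a task result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A typical flow would read the key from the `WAVESPEED_API_KEY` environment variable, call `send(submit_request(key, payload))`, and then call `send(result_request(key, task_id))` with the returned task ID.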
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | Text description of the desired scene. |
| images | array | Yes | [] | 1 ~ 9 items | Array of reference image URLs (1-9). |
| resolution | string | No | 720p | 720p, 1080p | Output video resolution. |
| aspect_ratio | string | No | 16:9 | 16:9, 9:16, 1:1, 4:3, 3:4 | The aspect ratio of the generated video. |
| duration | integer | No | 5 | 3 ~ 15 | Video length in seconds (3-15). |
| seed | integer | No | - | -1 ~ 2147483647 | Random seed for reproducibility. |
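Checking a request body against the table above before submitting can catch mistakes without spending a paid call. The sketch below encodes the ranges exactly as listed in this table (including the `-1` lower bound for `seed`). The function name `validate_payload` and the error messages are assumptions, and the server may apply additional checks that are not documented here.

```python
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}


def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems found; an empty list means the payload looks valid."""
    errors = []

    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        errors.append("prompt is required and must be a non-empty string")

    images = payload.get("images")
    if not isinstance(images, list) or not 1 <= len(images) <= 9:
        errors.append("images must be a list of 1 to 9 URLs")

    if payload.get("resolution", "720p") not in ALLOWED_RESOLUTIONS:
        errors.append("resolution must be 720p or 1080p")

    if payload.get("aspect_ratio", "16:9") not in ALLOWED_ASPECT_RATIOS:
        errors.append("aspect_ratio must be one of 16:9, 9:16, 1:1, 4:3, 3:4")

    duration = payload.get("duration", 5)
    # bool is a subclass of int in Python, so it is excluded explicitly
    if isinstance(duration, bool) or not isinstance(duration, int) or not 3 <= duration <= 15:
        errors.append("duration must be an integer from 3 to 15")

    seed = payload.get("seed")
    if seed is not None:
        if isinstance(seed, bool) or not isinstance(seed, int) or not -1 <= seed <= 2147483647:
            errors.append("seed must be an integer from -1 to 2147483647")

    return errors
```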
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
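As an illustration of how these fields fit together, the sketch below pulls the task ID and polling URL out of a submission response. The function name `parse_submission` and the sample response values are hypothetical, and treating any `code` other than 200 as a failure is an assumption based on the example in the table.

```python
def parse_submission(response: dict) -> tuple[str, str]:
    """Return (task_id, poll_url) from a decoded submission response."""
    if response.get("code") != 200:
        raise RuntimeError(f"submission failed: {response.get('message')}")
    data = response["data"]
    return data["id"], data["urls"]["get"]


# Hypothetical response shaped after the table above; all values are placeholders
sample = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "abc123",
        "model": "alibaba/happyhorse-1.1/reference-to-video",
        "outputs": [],
        "urls": {"get": "https://api.wavespeed.ai/api/v3/predictions/abc123/result"},
        "status": "created",
        "created_at": "2023-04-01T12:34:56.789Z",
        "error": "",
        "timings": {"inference": 0},
    },
}

task_id, poll_url = parse_submission(sample)
print(task_id, poll_url)
```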
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier of the prediction being retrieved |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed). |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
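Because `data.outputs` stays empty until `data.status` is `completed`, a client usually polls the result endpoint. The following sketch shows one way to do that using the four documented statuses. The function name `wait_for_outputs`, the polling interval, and the attempt limit are assumptions; `fetch_result` stands for any callable that returns the decoded result response.

```python
import time
from typing import Callable


def wait_for_outputs(
    fetch_result: Callable[[], dict],
    interval: float = 2.0,
    max_attempts: int = 150,
) -> list:
    """Poll until the task completes, then return the list of output URLs."""
    for _ in range(max_attempts):
        data = fetch_result()["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        # status is created or processing: wait and try again
        time.sleep(interval)
    raise TimeoutError("task did not finish within the polling budget")
```

Passing the fetcher in as a callable keeps the loop independent of any particular HTTP client and makes it straightforward to test with canned responses.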