Skywork AI SkyReels V4 Reference-to-Video
SkyReels V4 Reference to Video is a fast AI reference-to-video generation model that creates high-quality videos from reference images or a reference video and a text prompt using the SkyReels V4 omni-video workflow. It is available through a ready-to-use REST inference API for character-consistent videos, product visuals, branded storytelling, social media clips, advertising creatives, concept videos, and professional reference-based video generation workflows, with simple integration, no cold starts, and affordable pricing.
Features
Skywork AI SkyReels V4 Reference-to-Video
Skywork AI SkyReels V4 Reference-to-Video generates videos from reference images, with optional reference video guidance for stronger motion and scene control. It supports standard and fast modes, multiple resolutions, optional sound effects, and prompt-driven generation for character consistency, product motion, scene transfer, and other reference-based video workflows.
Why Choose This?
- Reference-guided video generation: Use up to 3 reference images to guide identity, appearance, style, or scene elements.
- Optional reference video support: Add up to 1 reference video when you want stronger motion or temporal guidance.
- Two generation modes: Choose `std` for higher-quality output or `fast` for quicker, lower-cost generation.
- Multiple resolutions: Supports `480p`, `720p`, and `1080p` to balance quality and budget.
- Flexible aspect ratios: Choose `16:9`, `9:16`, or `1:1` depending on your target platform.
- Production-ready workflow: Suitable for character-driven video, product clips, stylized motion design, and reference-based storytelling.
Parameters
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | The prompt describing the generated video. Reference tags are added automatically when missing. |
| images | No | Reference image URLs. Upload up to 3 images. |
| ref_videos | No | Optional reference video URL. Upload at most 1 video. |
| aspect_ratio | No | Aspect ratio of the generated video. Supported values: 16:9, 9:16, 1:1. Default: 16:9. |
| duration | No | Duration of the generated video in seconds. Range: 3–15. Default: 5. |
| resolution | No | Output video resolution. Supported values: 480p, 720p, 1080p. Default: 1080p. |
| sound | No | Whether to generate sound effects with the video. Default: false. |
| mode | No | Quality/performance mode. Supported values: std, fast. Default: std. fast mode currently requires sound=false. |
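The limits in the table can be checked on the client before a request is sent. The following is a minimal Python sketch of that idea, not an official SDK: `build_payload` is a hypothetical helper name, and leaving `ref_videos` out of the body when no video is supplied is an assumption rather than documented behavior.

```python
from __future__ import annotations

from typing import Any

# Allowed values and defaults are taken from the parameter table above.
ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
RESOLUTIONS = {"480p", "720p", "1080p"}
MODES = {"std", "fast"}


def build_payload(
    prompt: str,
    images: list[str] | None = None,
    ref_videos: list[str] | None = None,
    aspect_ratio: str = "16:9",
    duration: int = 5,
    resolution: str = "1080p",
    sound: bool = False,
    mode: str = "std",
) -> dict[str, Any]:
    """Validate inputs against the documented limits and return a request body."""
    images = list(images or [])
    ref_videos = list(ref_videos or [])
    if not prompt.strip():
        raise ValueError("prompt is required")
    if len(images) > 3:
        raise ValueError("images accepts at most 3 URLs")
    if len(ref_videos) > 1:
        raise ValueError("ref_videos accepts at most 1 URL")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if not isinstance(duration, int) or not 3 <= duration <= 15:
        raise ValueError("duration must be an integer from 3 to 15 seconds")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if mode not in MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    if mode == "fast" and sound:
        raise ValueError("fast mode currently requires sound=false")

    payload: dict[str, Any] = {
        "prompt": prompt,
        "images": images,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "resolution": resolution,
        "sound": sound,
        "mode": mode,
    }
    if ref_videos:  # assumption: the field is simply omitted when unused
        payload["ref_videos"] = ref_videos
    return payload
```

Rejecting `mode="fast"` combined with `sound=True` locally catches the one cross-parameter constraint the table mentions before any request is made.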
How to Use
- Write your prompt: describe the motion, scene, camera movement, and visual direction you want.
- Add reference images (optional): upload up to 3 images to guide identity, look, or scene consistency.
- Add a reference video (optional): upload 1 video if you want stronger motion or temporal guidance.
- Choose aspect ratio: select `16:9`, `9:16`, or `1:1`.
- Set duration: choose a clip length between 3 and 15 seconds.
- Choose resolution: use `480p`, `720p`, or `1080p` depending on quality and cost needs.
- Choose mode: use `std` for higher quality or `fast` for quicker generation.
- Enable sound (optional): turn this on if you want generated sound effects. If using `fast`, keep `sound=false`.
- Submit: run the model and download the generated video.
Example Prompt
A cinematic fashion shot with smooth camera movement, elegant character motion, soft studio lighting, premium commercial pacing, and stable identity across the sequence.
Pricing
Pricing depends on duration, resolution, mode, and whether you use a reference video.
Without Reference Video
Standard Mode
| Resolution | Per Second | 5s Cost |
|---|---|---|
| 480p | $0.11 | $0.55 |
| 720p | $0.14 | $0.70 |
| 1080p | $0.35 | $1.75 |
Fast Mode
| Resolution | Per Second | 5s Cost |
|---|---|---|
| 480p | $0.08 | $0.40 |
| 720p | $0.11 | $0.55 |
| 1080p | $0.275 | $1.375 |
With Reference Video
Standard Mode
| Resolution | Per Second | 5s Cost |
|---|---|---|
| 480p | $0.18 | $0.90 |
| 720p | $0.25 | $1.25 |
| 1080p | $0.625 | $3.125 |
Fast Mode
| Resolution | Per Second | 5s Cost |
|---|---|---|
| 480p | $0.15 | $0.75 |
| 720p | $0.20 | $1.00 |
| 1080p | $0.50 | $2.50 |
Example Costs
Without Reference Video · Standard Mode
| Resolution | 3s | 5s | 10s | 15s |
|---|---|---|---|---|
| 480p | $0.33 | $0.55 | $1.10 | $1.65 |
| 720p | $0.42 | $0.70 | $1.40 | $2.10 |
| 1080p | $1.05 | $1.75 | $3.50 | $5.25 |
Without Reference Video · Fast Mode
| Resolution | 3s | 5s | 10s | 15s |
|---|---|---|---|---|
| 480p | $0.24 | $0.40 | $0.80 | $1.20 |
| 720p | $0.33 | $0.55 | $1.10 | $1.65 |
| 1080p | $0.825 | $1.375 | $2.75 | $4.125 |
With Reference Video · Standard Mode
| Resolution | 3s | 5s | 10s | 15s |
|---|---|---|---|---|
| 480p | $0.54 | $0.90 | $1.80 | $2.70 |
| 720p | $0.75 | $1.25 | $2.50 | $3.75 |
| 1080p | $1.875 | $3.125 | $6.25 | $9.375 |
With Reference Video · Fast Mode
| Resolution | 3s | 5s | 10s | 15s |
|---|---|---|---|---|
| 480p | $0.45 | $0.75 | $1.50 | $2.25 |
| 720p | $0.60 | $1.00 | $2.00 | $3.00 |
| 1080p | $1.50 | $2.50 | $5.00 | $7.50 |
Billing Rules
- Base multiplier starts from $0.10 per second
- Pricing scales linearly with `duration`
- Prices differ between `std` and `fast`
- Adding a `ref_videos` input increases the rate
- `sound` does not affect pricing directly
- `fast` mode currently requires `sound=false`
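Because pricing is linear in duration, the cost of a clip is the per-second rate multiplied by the number of seconds. The sketch below encodes the per-second rates from the pricing tables in a lookup keyed by reference-video use, mode, and resolution. `estimate_cost` is a hypothetical helper name, and `Decimal` is used only to keep the arithmetic exact.

```python
from decimal import Decimal

# Per-second rates in USD, copied from the pricing tables in this section.
# Key: (has_reference_video, mode, resolution).
RATES = {
    (False, "std", "480p"): Decimal("0.11"),
    (False, "std", "720p"): Decimal("0.14"),
    (False, "std", "1080p"): Decimal("0.35"),
    (False, "fast", "480p"): Decimal("0.08"),
    (False, "fast", "720p"): Decimal("0.11"),
    (False, "fast", "1080p"): Decimal("0.275"),
    (True, "std", "480p"): Decimal("0.18"),
    (True, "std", "720p"): Decimal("0.25"),
    (True, "std", "1080p"): Decimal("0.625"),
    (True, "fast", "480p"): Decimal("0.15"),
    (True, "fast", "720p"): Decimal("0.20"),
    (True, "fast", "1080p"): Decimal("0.50"),
}


def estimate_cost(
    resolution: str = "1080p",
    duration: int = 5,
    mode: str = "std",
    has_ref_video: bool = False,
) -> Decimal:
    """Return the estimated cost in USD: per-second rate times duration."""
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    try:
        rate = RATES[(has_ref_video, mode, resolution)]
    except KeyError:
        raise ValueError(f"no rate for mode={mode!r}, resolution={resolution!r}") from None
    return rate * duration
```

For example, `estimate_cost("720p", 10, "fast", has_ref_video=True)` multiplies the $0.20 rate by 10 seconds, which agrees with the $2.00 entry in the example cost table.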
Best Use Cases
- Character consistency: Use reference images to keep the same person or style across a clip.
- Motion-guided generation: Add a reference video when movement or pacing matters.
- Product and fashion videos: Build polished, controlled motion clips from still references.
- Creative prototyping: Use `fast` mode for quick iteration before moving to `std`.
- Reference-based storytelling: Combine image identity guidance with optional motion guidance.
Pro Tips
- Use reference images when identity, styling, or appearance matters most.
- Add a reference video only when you need stronger motion or temporal guidance.
- Keep prompts focused on motion, camera behavior, and scene progression.
- Use `fast` mode for rough iteration, then switch to `std` for final-quality output.
- Keep `sound=false` in `fast` mode.
- Start with shorter durations to validate the concept before generating longer clips.
Notes
- `prompt` is required.
- `images` supports up to 3 reference images.
- `ref_videos` supports up to 1 reference video.
- `duration` supports 3–15 seconds.
- `resolution` defaults to `1080p`.
- `mode` defaults to `std`.
- `fast` mode currently requires `sound=false`.
- Pricing depends on `duration`, `resolution`, `mode`, and whether a reference video is included.
Related Models
- Skywork AI SkyReels V4 Text-to-Video — Generate videos directly from text prompts.
- Skywork AI SkyReels V4 Image-to-Video — Generate videos from a first-frame image with optional middle and end frame guidance.
- Skywork AI SkyReels V3 Reference-to-Video — Earlier reference-based video workflow in the SkyReels lineup.
- Skywork AI SkyReels V3 Extend Video — Continue an existing video clip with newly generated footage.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task (prompt is required; the other fields show their defaults)
curl --location --request POST "https://api.wavespeed.ai/api/v3/skywork-ai/skyreels-v4/reference-to-video" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
  --data-raw '{
    "prompt": "A cinematic fashion shot with smooth camera movement and soft studio lighting.",
    "aspect_ratio": "16:9",
    "duration": 5,
    "resolution": "1080p",
    "sound": false,
    "mode": "std"
  }'
# Get the result (set requestId to the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
  --header "Authorization: Bearer ${WAVESPEED_API_KEY}"
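The same two calls can be made from Python with only the standard library. The sketch below mirrors the curl commands: the URLs and headers come from the example above, while the helper names (`build_submit_request`, `build_result_request`, `send`) and the 30-second timeout are illustrative assumptions.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "skywork-ai/skyreels-v4/reference-to-video"


def build_submit_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build the POST request that submits a generation task."""
    return urllib.request.Request(
        f"{API_BASE}/{MODEL_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(request_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request that fetches the result for a task ID."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

In practice the API key would be read from the `WAVESPEED_API_KEY` environment variable, and the `data.id` field of the submission response would be passed to `build_result_request`.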
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The prompt describing the generated video. Reference tags are added automatically when missing. |
| images | array | No | [] | - | Reference image URLs. Upload up to 3 images. |
| ref_videos | array | No | - | - | Optional reference video URL. Upload at most 1 video. |
| aspect_ratio | string | No | 16:9 | 16:9, 9:16, 1:1 | Aspect ratio of the generated video. |
| duration | integer | No | 5 | 3 ~ 15 | Duration of the generated video in seconds. |
| resolution | string | No | 1080p | 480p, 720p, 1080p | Output video resolution. |
| sound | boolean | No | false | - | Whether to generate sound effects with the video. |
| mode | string | No | std | std, fast | Quality/performance mode. Fast mode currently requires sound to be false. |
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
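To show how these fields fit together, the sketch below pulls the task ID and the result URL out of a submission response. The sample dictionary is hypothetical: it is shaped like the table above, but every value in it is made up. Treating any `code` other than 200 as a failure is also an assumption.

```python
from __future__ import annotations


def task_from_submission(response: dict) -> tuple[str, str]:
    """Return (task_id, result_url) from a submission response."""
    if response.get("code") != 200:  # assumption: only 200 means success
        raise RuntimeError(f"submission failed: {response.get('message')}")
    data = response["data"]
    return data["id"], data["urls"]["get"]


# Hypothetical sample; field names follow the table, values are invented.
SAMPLE_SUBMISSION = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "example-task-id",
        "model": "skywork-ai/skyreels-v4/reference-to-video",
        "outputs": [],
        "urls": {
            "get": "https://api.wavespeed.ai/api/v3/predictions/example-task-id/result"
        },
        "status": "created",
        "created_at": "2023-04-01T12:34:56.789Z",
        "error": "",
        "timings": {"inference": 0},
    },
}
```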
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the task ID that was queried) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
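Since `data.status` moves through `created` and `processing` before reaching `completed` or `failed`, a client typically polls the result endpoint until a terminal status appears. The sketch below shows one way to structure that loop. `wait_for_outputs`, the 2-second interval, and the polling budget are assumptions, and `fetch_result` stands in for whatever function performs the GET request and returns the decoded JSON.

```python
import time
from typing import Callable


def wait_for_outputs(
    fetch_result: Callable[[], dict],
    poll_interval: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Poll until the task completes and return the list of output URLs."""
    for _ in range(max_polls):
        data = fetch_result()["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        if status not in ("created", "processing"):
            raise RuntimeError(f"unexpected status: {status!r}")
        sleep(poll_interval)  # still pending, so wait before the next poll
    raise TimeoutError("task did not finish within the polling budget")
```

Passing the fetch and sleep functions in as arguments keeps the loop independent of any particular HTTP client and makes it straightforward to exercise with canned responses.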