
Alibaba Happy Horse 1.0 Reference-to-Video


Alibaba Happy Horse 1.0 (Reference-to-Video) generates new video scenes guided by reference images, maintaining consistent characters, styles, and visual identity. It is served through a ready-to-use REST API with no cold starts, and pricing starts at $0.70 per 5 seconds at 720p.




Alibaba Happy Horse 1.0 Reference-to-Video generates new video scenes guided by one or more reference images, helping maintain consistent characters, styles, and visual identity across the output. It combines reference-image grounding with natural-language prompting to create cinematic videos in 720p or 1080p.

Why Choose This?

  • Reference-guided consistency: Use up to nine reference images to preserve character identity, visual style, outfit details, and overall scene language.

  • Prompt + image control: Combine reference images with a text prompt to control the scene, action, mood, and camera behavior more precisely.

  • Cinematic motion: Generate smooth, expressive video motion while keeping important visual elements stable and recognizable.

  • Flexible output settings: Choose output resolution, aspect ratio, duration, and seed to match your creative and production needs.

  • Production-ready API: Access the model through a REST inference API with no cold starts for scalable integration into apps and workflows.

Parameters

| Parameter    | Required | Description                                                        |
|--------------|----------|--------------------------------------------------------------------|
| images       | Yes      | Reference image URLs. Supports 1–9 images.                         |
| prompt       | Yes      | Text description of the desired scene, action, style, or motion.   |
| resolution   | No       | Output resolution: 720p (default) or 1080p.                        |
| aspect_ratio | No       | Output aspect ratio. Default: 16:9.                                |
| duration     | No       | Video length in seconds. Range: 3–15, default 5.                   |
| seed         | No       | Random seed for reproducibility. Range: 0–2147483647.              |
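
The parameter ranges above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the official API: the helper name `build_payload` is hypothetical, and it assumes the request body uses the parameter names from the table as JSON keys. Supported `aspect_ratio` values are not listed on this page, so that value is passed through unchecked.

```python
ALLOWED_RESOLUTIONS = ("720p", "1080p")
MAX_SEED = 2147483647


def build_payload(images, prompt, resolution="720p", aspect_ratio="16:9",
                  duration=5, seed=None):
    """Validate inputs against the documented ranges and return a request body.

    Hypothetical helper: the field names mirror the parameter table, but the
    exact request schema should be confirmed against the API reference.
    """
    if not 1 <= len(images) <= 9:
        raise ValueError("images must contain 1 to 9 reference image URLs")
    if not prompt.strip():
        raise ValueError("prompt is required")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError("resolution must be 720p or 1080p")
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")

    payload = {
        "images": list(images),
        "prompt": prompt,
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,  # not validated: supported values are not listed
        "duration": duration,
    }
    # seed is optional; include it only when the caller supplies one.
    if seed is not None:
        if not 0 <= seed <= MAX_SEED:
            raise ValueError("seed must be between 0 and 2147483647")
        payload["seed"] = seed
    return payload
```

Calling `build_payload(["https://example.com/ref.png"], "A character walking at night")` returns a body with the documented defaults (720p, 16:9, 5 seconds) and no `seed` key.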

How to Use

  1. Upload your reference images — provide 1–9 image URLs that define the character, style, or visual identity you want to preserve.
  2. Write your prompt — describe the target scene, action, camera behavior, lighting, and mood.
  3. Choose resolution — use 720p for lower-cost iteration or 1080p for higher-quality final output.
  4. Set aspect ratio — choose the format that best fits your target platform or composition needs.
  5. Set duration — choose a clip length between 3 and 15 seconds.
  6. Set a seed (optional) — use a fixed seed for more reproducible generations.
  7. Submit — generate and download your video.
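
For API use, the steps above roughly correspond to a single POST request. The sketch below only constructs the request and does not send it. The endpoint URL and the bearer-token header are assumptions for illustration, since this page does not document them; take the real values from the provider's API reference.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the URL from the provider's API reference.
API_URL = "https://api.example.com/v1/alibaba/happyhorse-1.0/reference-to-video"


def make_request(api_key, payload):
    """Build (but do not send) a JSON POST request for one generation job."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


request = make_request(
    "YOUR_API_KEY",
    {
        "images": ["https://example.com/character-front.png"],  # step 1
        "prompt": "The same character walking through a city street at night",  # step 2
        "resolution": "720p",    # step 3
        "aspect_ratio": "16:9",  # step 4
        "duration": 5,           # step 5
        "seed": 12345,           # step 6 (optional)
    },
)
```

Passing `request` to `urllib.request.urlopen` would submit it (step 7). The response format is not described here, so how the finished video is retrieved is left open.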

Example Prompt

A cinematic fashion scene with the same character walking through a softly lit modern city street at night, gentle camera tracking, subtle wind in the hair and clothing, elegant movement, realistic lighting, premium commercial style

Pricing

Per 5 Seconds

| Resolution | Cost  |
|------------|-------|
| 720p       | $0.70 |
| 1080p      | $1.40 |

Example Costs

| Resolution | 3s    | 5s    | 10s   | 15s   |
|------------|-------|-------|-------|-------|
| 720p       | $0.42 | $0.70 | $1.40 | $2.10 |
| 1080p      | $0.84 | $1.40 | $2.80 | $4.20 |

Billing Rules

  • Base price: 720p costs $0.70 per 5 seconds
  • 1080p surcharge: 1080p costs 2× the 720p rate
  • Total price formula:
    total_price = 0.70 × (resolution == "1080p" ? 2 : 1) × duration / 5
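
As a worked check, the formula can be written as a small function. This is a sketch under stated assumptions: the function name `total_price` mirrors the formula, `duration` is taken to be a whole number of seconds, and `Decimal` is used so the results match the dollar amounts in the Example Costs table exactly.

```python
from decimal import Decimal

BASE_PRICE_PER_5S = Decimal("0.70")  # 720p price per 5 seconds


def total_price(resolution, duration):
    """Return the price in USD for one run, following the billing formula."""
    if resolution not in ("720p", "1080p"):
        raise ValueError("resolution must be 720p or 1080p")
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    multiplier = 2 if resolution == "1080p" else 1
    price = BASE_PRICE_PER_5S * multiplier * Decimal(duration) / 5
    return price.quantize(Decimal("0.01"))


print(total_price("720p", 3))    # 0.42
print(total_price("1080p", 15))  # 4.20
```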

Best Use Cases

  • Character consistency across scenes — Keep the same person, outfit, or visual identity across multiple generated videos.
  • Brand and campaign content — Maintain a stable look and style across ad creatives, promos, and commercial storytelling.
  • Style-preserving video generation — Use reference images to anchor art direction, color palette, and visual tone.
  • Narrative concepting — Generate new scenes based on known characters or environments for storyboarding and ideation.
  • Social media and short-form content — Create visually consistent clips tailored to different platforms and aspect ratios.
  • Creative prototyping — Explore motion and scene variations while preserving core reference details.

Pro Tips

  • Use clear, high-quality reference images that strongly represent the character, outfit, or style you want to preserve.
  • Include multiple reference images when consistency across facial features, costume details, or design elements is important.
  • Be specific in your prompt about scene, action, camera motion, lighting, and mood.
  • Use 720p for rapid testing, then switch to 1080p for final-quality renders.
  • Reuse the same seed when you want more reproducible outputs.
  • Start with shorter durations to validate identity consistency and motion before generating longer clips.
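
The draft-then-final workflow from these tips can be sketched as a pair of settings that share the same references, prompt, and seed. The helper `draft_and_final` is hypothetical, and the field names assume the request body uses the parameter names as keys. Reusing a seed across different resolutions or durations is not guaranteed to reproduce the same motion, so treat the draft as an approximate preview.

```python
def draft_and_final(images, prompt, seed, final_duration=10):
    """Return (draft, final) settings that share references, prompt, and seed.

    Hypothetical helper: the draft uses the cheapest documented settings
    (720p, 3 seconds); the final uses 1080p at the requested duration.
    """
    if not 3 <= final_duration <= 15:
        raise ValueError("final_duration must be between 3 and 15 seconds")
    shared = {"images": list(images), "prompt": prompt, "seed": seed}
    draft = {**shared, "resolution": "720p", "duration": 3}
    final = {**shared, "resolution": "1080p", "duration": final_duration}
    return draft, final


draft, final = draft_and_final(
    ["https://example.com/front.png", "https://example.com/side.png"],
    "The same character turning toward the camera, soft evening light",
    seed=2024,
)
```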

Notes

  • Both images and prompt are required.
  • images supports 1–9 reference image URLs.
  • Ensure all image URLs are publicly accessible.
  • Supported video duration is 3–15 seconds.
  • Supported resolutions are 720p and 1080p.
  • Pricing scales linearly with duration.
  • 1080p pricing is exactly 2× the 720p rate.
  • Please ensure your content complies with applicable usage policies.
