WaveSpeed.ai

PixVerse V5.6

pixverse/pixverse-v5.6/text-to-video

PixVerse V5.6 transforms text prompts into realistic videos with smooth motion and natural detail in seconds, which suits stories, ads, and social clips. It is available through a ready-to-use REST inference API with no cold starts and per-run pricing.

Your request costs $0.45 per run.

For $10 you can run this model approximately 22 times.

README

PixVerse v5.6 — Text-to-Video

PixVerse v5.6 Text-to-Video turns a written scene description into a short animated clip. You control resolution (360p–1080p), duration (5s / 8s / 10s) and aspect ratio (16:9, 4:3, 1:1, 3:4, 9:16), while the model handles camera motion, lighting and transitions for you.

Highlights

  • Multiple resolutions – 360p, 540p, 720p, 1080p for previews through to final export.
  • Flexible aspect ratios – 16:9, 4:3, 1:1, 3:4, 9:16 to match feeds, stories and banners.
  • Variable duration – 5, 8 or 10 seconds per clip.
  • Audio co-generation – Optionally generate synchronized audio alongside your video.
  • Prompt reasoning (thinking_type) – optional system-side enhancement that can refine and structure your prompt.
  • Negative prompt support – steer the model away from artefacts such as "watermark", "text", "distortion".
  • Seed control – fix a seed for reproducible generations, or vary it for multiple takes.

Parameters

Parameter             | Required | Description
prompt                | Yes      | Up to 2048 characters describing the scene, pacing and camera moves
resolution            | No       | 360p, 540p, 720p, 1080p (default: 540p)
duration              | No       | 5, 8, or 10 seconds (default: 5); 10s is not available at 1080p
resolution_ratio      | No       | 16:9, 4:3, 1:1, 3:4, 9:16 (default: 1:1)
generate_audio_switch | No       | Enable audio generation for the video
thinking_type         | No       | auto (default), enabled, or disabled
negative_prompt       | No       | Terms to avoid (e.g., watermark, text, logo, glitch)
seed                  | No       | Integer for reproducibility
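
The constraints in the parameter table can be checked on the client before a request is sent. The sketch below is not an official client: `build_payload` is a hypothetical helper, and the JSON value types (integer `duration`, boolean `generate_audio_switch` defaulting to off) are assumptions, since the table does not state them.

```python
# Allowed values copied from the parameter table.
ALLOWED_RESOLUTIONS = ("360p", "540p", "720p", "1080p")
ALLOWED_DURATIONS = (5, 8, 10)
ALLOWED_RATIOS = ("16:9", "4:3", "1:1", "3:4", "9:16")
ALLOWED_THINKING = ("auto", "enabled", "disabled")
MAX_PROMPT_CHARS = 2048


def build_payload(prompt, resolution="540p", duration=5, resolution_ratio="1:1",
                  generate_audio_switch=False, thinking_type="auto",
                  negative_prompt=None, seed=None):
    """Validate arguments against the documented limits and return a dict.

    Hypothetical helper; value types are assumed, not taken from the page.
    """
    if not prompt or len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt must be 1-{MAX_PROMPT_CHARS} characters")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"unsupported duration: {duration!r}")
    if resolution == "1080p" and duration == 10:
        raise ValueError("10s duration is not available at 1080p")
    if resolution_ratio not in ALLOWED_RATIOS:
        raise ValueError(f"unsupported resolution_ratio: {resolution_ratio!r}")
    if thinking_type not in ALLOWED_THINKING:
        raise ValueError(f"unsupported thinking_type: {thinking_type!r}")
    if seed is not None and not isinstance(seed, int):
        raise ValueError("seed must be an integer")

    payload = {
        "prompt": prompt,
        "resolution": resolution,
        "duration": duration,
        "resolution_ratio": resolution_ratio,
        "generate_audio_switch": generate_audio_switch,
        "thinking_type": thinking_type,
    }
    # Optional fields are only included when the caller supplies them.
    if negative_prompt:
        payload["negative_prompt"] = negative_prompt
    if seed is not None:
        payload["seed"] = seed
    return payload


if __name__ == "__main__":
    print(build_payload("Anime rooftop at sunset, slow dolly-in",
                        resolution="720p", duration=8, seed=42))
```

Rejecting the 1080p / 10s combination locally avoids submitting a request that the table says is not available.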

How to Use

  1. Write the prompt — Describe key shots, mood and motion, e.g. "Anime rooftop at sunset, slow dolly-in toward the character, hair and clothes moving in the wind, cinematic lighting."

  2. Set resolution & aspect ratio

    • 9:16 for TikTok, Reels
    • 16:9 for YouTube videos
    • 1:1 for Instagram
    • 4:3 for feed posts and thumbnails
  3. Choose duration — 5s for quick previews or punchy hooks; 8s for more developed mini-scenes.

  4. Enable audio (optional) — Check generate_audio_switch for synchronized sound.

  5. Adjust thinking & negative_prompt (optional) — Set thinking_type to "enabled" if you want the system to help structure a complex prompt. Add negative prompt to suppress unwanted elements.

  6. Set seed — Keep a seed fixed while you tweak the prompt; change it when you want new takes.

  7. Run and download — Generate the clip, review, then iterate as needed.
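
Steps 2 and 6 above can be sketched as two small batch builders: one holds the seed fixed while the prompt changes, the other holds the prompt fixed while the seed changes. The function names, the platform keys, and the seed range (0 to 2^31 - 1) are assumptions for illustration; the page only says that `seed` is an integer.

```python
import random

# Aspect ratios from step 2; the lowercase platform keys are hypothetical.
PLATFORM_RATIOS = {
    "tiktok": "9:16",
    "reels": "9:16",
    "youtube": "16:9",
    "instagram": "1:1",
}


def prompt_variants(prompts, seed, platform):
    """Same seed, different prompts: isolates the effect of wording changes."""
    ratio = PLATFORM_RATIOS[platform]
    return [{"prompt": p, "seed": seed, "resolution_ratio": ratio}
            for p in prompts]


def seed_takes(prompt, n_takes, platform, rng=None):
    """Same prompt, distinct seeds: produces several takes of one scene."""
    rng = rng or random.Random()
    ratio = PLATFORM_RATIOS[platform]
    # sample() draws without replacement, so no seed is repeated.
    seeds = rng.sample(range(2**31), n_takes)  # assumed seed range
    return [{"prompt": prompt, "seed": s, "resolution_ratio": ratio}
            for s in seeds]


if __name__ == "__main__":
    for job in prompt_variants(
            ["Rooftop at sunset, slow dolly-in",
             "Rooftop at sunset, slow dolly-in, wind in hair"],
            seed=1234, platform="tiktok"):
        print(job)
    for job in seed_takes("Rooftop at sunset, slow dolly-in", 3, "youtube"):
        print(job)
```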

Pricing

Base Price (Video Only)

Resolution | 5s    | 8s    | 10s
360p       | $0.35 | $0.70 | $0.77
540p       | $0.35 | $0.70 | $0.77
720p       | $0.45 | $0.90 | $0.99
1080p      | $0.75 | $1.50 | n/a

Audio Add-on (when generate_audio_switch is enabled)

Resolution  | 5s     | 8s     | 10s
360p / 540p | +$0.45 | +$0.45 | +$0.45
720p        | +$0.35 | +$0.45 | +$0.45
1080p       | +$0.75 | +$0.45 | n/a

Billing Rules

  • 1080p does not support 10-second duration.
  • Audio generation is billed as an add-on to the base video price.
  • Total cost = Base Price + Audio Add-on (if enabled)
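
The billing rule above, total cost = base price + audio add-on, can be written as a lookup over the two pricing tables. This is a minimal sketch with the prices copied from this page; `estimate_cost` is a hypothetical name, and actual billing is determined by the service, not by this function.

```python
from decimal import Decimal

# Base price (video only), keyed by resolution then duration in seconds.
BASE_PRICE = {
    "360p": {5: "0.35", 8: "0.70", 10: "0.77"},
    "540p": {5: "0.35", 8: "0.70", 10: "0.77"},
    "720p": {5: "0.45", 8: "0.90", 10: "0.99"},
    "1080p": {5: "0.75", 8: "1.50"},  # no 10s option at 1080p
}

# Audio add-on, charged only when generate_audio_switch is enabled.
AUDIO_ADDON = {
    "360p": {5: "0.45", 8: "0.45", 10: "0.45"},
    "540p": {5: "0.45", 8: "0.45", 10: "0.45"},
    "720p": {5: "0.35", 8: "0.45", 10: "0.45"},
    "1080p": {5: "0.75", 8: "0.45"},
}


def estimate_cost(resolution, duration, audio=False):
    """Return the estimated price in USD as a Decimal."""
    try:
        total = Decimal(BASE_PRICE[resolution][duration])
    except KeyError:
        raise ValueError(
            f"unsupported combination: {resolution!r}, {duration!r}s") from None
    if audio:
        total += Decimal(AUDIO_ADDON[resolution][duration])
    return total


if __name__ == "__main__":
    print(estimate_cost("720p", 5))               # 0.45
    print(estimate_cost("720p", 5, audio=True))   # 0.80
    # Whole runs that fit in a $10 budget at 720p / 5s without audio.
    print(Decimal("10") // estimate_cost("720p", 5))  # 22
```

`Decimal` is used instead of `float` so that sums such as 0.45 + 0.35 stay exact to the cent.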

Best Use Cases

  • Social Media Content — Create engaging short-form videos for TikTok, Instagram Reels, and YouTube Shorts.
  • Marketing & Ads — Generate eye-catching promotional clips without filming.
  • Concept Visualization — Bring creative ideas to life for pitches and presentations.
  • Music Videos — Produce visual content with synchronized audio.
  • Storytelling — Create narrative scenes for creative projects.

Pro Tips

  • Write shot-by-shot for best results (wide → medium → close-up, etc.).
  • Keep the number of major events small; let the model focus on a few strong beats.
  • Higher resolutions (720p / 1080p) are better for export and editing; use 360p / 540p for fast iteration.
  • For vertical platforms, set resolution_ratio to 9:16 or 3:4 from the start to avoid awkward cropping.
  • Use negative_prompt to avoid common issues like "blurry, distorted, watermark, text overlay".
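
The first and last tips can be combined into a small prompt-assembly helper. The "Shot N:" labelling and the comma-separated negative prompt are assumptions chosen for readability; the page does not document a required prompt syntax, and both function names are hypothetical.

```python
MAX_PROMPT_CHARS = 2048  # prompt limit from the parameter table

# Terms taken from the negative_prompt tip above.
DEFAULT_NEGATIVE = ["blurry", "distorted", "watermark", "text overlay"]


def compose_prompt(shots, style=None):
    """Join shot descriptions in order into one prompt string."""
    parts = [s.strip().rstrip(".") for s in shots if s.strip()]
    if not parts:
        raise ValueError("at least one shot description is required")
    prompt = " ".join(f"Shot {i}: {p}." for i, p in enumerate(parts, 1))
    if style:
        prompt += f" Style: {style.strip().rstrip('.')}."
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    return prompt


def compose_negative(extra=()):
    """Merge default and extra terms, dropping duplicates but keeping order."""
    terms = list(dict.fromkeys([*DEFAULT_NEGATIVE, *extra]))
    return ", ".join(terms)


if __name__ == "__main__":
    print(compose_prompt(
        ["Wide shot of a harbor at dawn",
         "Medium shot of a fisherman coiling rope",
         "Close-up of a gull taking off"],
        style="cinematic lighting"))
    print(compose_negative(["logo", "blurry"]))
```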

Notes

  • Audio generation adds processing time but creates fully finished videos.
  • For best results, be specific about lighting, mood, and movement in your prompt.