PixVerse V6 generates high-quality videos from text prompts, with flexible duration (1-15 s), resolutions from 360p to 1080p, and optional audio generation. It is served through a ready-to-use REST inference API with no cold starts and per-second pricing.
$0.1 per run · ~10 runs per $1
PixVerse V6 is PixVerse's latest text-to-video model, delivering high-fidelity cinematic video from natural language prompts. With resolution options from 360p to 1080p, flexible aspect ratios, optional synchronized audio generation, and a thinking mode for complex scenes, it supports a wide range of creative and production workflows.

- **High-fidelity video generation:** Produces detailed, visually coherent video with accurate motion, lighting, and scene composition from text descriptions.
- **Four resolution tiers:** Generate at 360p, 540p, 720p, or 1080p to balance quality and cost against your delivery needs.
- **Optional audio generation:** Enable generate_audio_switch to produce synchronized ambient sound and atmosphere alongside the video.
- **Thinking mode:** The thinking_type parameter lets the model apply extended reasoning to complex or nuanced scene descriptions.
- **Flexible aspect ratios:** Supports multiple orientations to fit social, cinematic, and broadcast formats.
- **Prompt Enhancer:** A built-in tool that automatically improves your scene descriptions for richer output.

| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Text description of the scene, motion, camera style, and atmosphere. |
| aspect_ratio | No | Output aspect ratio. Default: 16:9. |
| resolution | No | Output resolution: 360p, 540p, 720p (default), or 1080p. |
| duration | No | Clip length in seconds, from 1 to 15. Default: 5. |
| generate_audio_switch | No | Whether to generate synchronized audio for the video. Default: off. |
| thinking_type | No | Reasoning mode for scene generation. Default: auto. |
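
The parameters above can be assembled into a request body along these lines. This is a minimal sketch, not the service's official client: `build_payload` is a hypothetical helper, the value types are assumptions (a boolean for generate_audio_switch, whole seconds for duration), and no endpoint URL or authentication scheme is shown because this page does not state them.

```python
from typing import Any, Dict

# Values taken from the parameter table and the stated 1-15 s duration range.
ALLOWED_RESOLUTIONS = ("360p", "540p", "720p", "1080p")
MIN_DURATION_S, MAX_DURATION_S = 1, 15


def build_payload(
    prompt: str,
    aspect_ratio: str = "16:9",
    resolution: str = "720p",
    duration: int = 5,
    generate_audio_switch: bool = False,  # assumption: boolean; the table only says "off"
    thinking_type: str = "auto",
) -> Dict[str, Any]:
    """Return a request body using the documented parameter names and defaults."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    if not MIN_DURATION_S <= duration <= MAX_DURATION_S:
        raise ValueError("duration must be between 1 and 15 seconds")
    return {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "duration": duration,
        "generate_audio_switch": generate_audio_switch,
        "thinking_type": thinking_type,
    }


if __name__ == "__main__":
    # Only the prompt is required; every other field falls back to its default.
    print(build_payload("A lighthouse at dusk, slow aerial orbit, soft fog"))
```

Validating resolution and duration locally catches out-of-range values before a request is sent. The accepted values for aspect_ratio and thinking_type are not listed here, so the sketch passes them through unchecked.
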

Price per second of generated video, by resolution:

| Resolution | Without Audio | With Audio |
|---|---|---|
| 360p | $0.025/s | $0.035/s |
| 540p | $0.035/s | $0.045/s |
| 720p | $0.045/s | $0.060/s |
| 1080p | $0.090/s | $0.115/s |
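
The per-second rates above translate into a per-clip estimate as sketched below. `estimate_cost` is a hypothetical helper, and the sketch assumes billing is strictly linear in duration, with no minimum charge or rounding, which this page does not confirm.

```python
from decimal import Decimal

# USD per second, copied from the pricing table: (without audio, with audio).
RATES_USD_PER_SECOND = {
    "360p": (Decimal("0.025"), Decimal("0.035")),
    "540p": (Decimal("0.035"), Decimal("0.045")),
    "720p": (Decimal("0.045"), Decimal("0.060")),
    "1080p": (Decimal("0.090"), Decimal("0.115")),
}


def estimate_cost(resolution: str, duration_s: int, with_audio: bool = False) -> Decimal:
    """Estimate the price of one clip as rate * duration (assumed linear billing)."""
    if resolution not in RATES_USD_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution}")
    without_audio_rate, with_audio_rate = RATES_USD_PER_SECOND[resolution]
    rate = with_audio_rate if with_audio else without_audio_rate
    return rate * duration_s


if __name__ == "__main__":
    # Compare the default clip (5 s at 720p) with and without audio.
    print(estimate_cost("720p", 5))
    print(estimate_cost("720p", 5, with_audio=True))
```

Under these assumptions, a default 5-second 720p clip comes to $0.225 without audio and $0.30 with audio. `Decimal` is used instead of floats so the sums stay exact to the cent fractions shown in the table.
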