Introducing LTX-2 19B Text-to-Video on WaveSpeedAI

LTX-2 19B Launches on WaveSpeedAI: Text-to-Video Generation with Synchronized Audio

The race to create production-ready AI video generators just reached a new milestone. LTX-2 19B, Lightricks’ groundbreaking text-to-video foundation model, is now available on WaveSpeedAI—bringing synchronized audio-video generation, multiple performance modes, and up to 20-second clips to creators, marketers, and developers.

Unlike traditional video AI models that generate silent clips requiring separate audio post-production, LTX-2 19B produces complete audiovisual experiences in a single pass. Footsteps sync perfectly with walking animations. Ambient soundscapes match the visual environment. Speech-like tones and environmental audio emerge naturally from your text prompt—no audio editing required.

What is LTX-2 19B?

LTX-2 19B is the first DiT-based (Diffusion Transformer) audio-video foundation model to combine synchronized sound and video generation in one unified system. With 19 billion parameters, it represents a fundamental architectural shift in how AI generates multimedia content.

Released by Lightricks in late 2025 and now fully open-sourced, LTX-2 has already been recognized as one of the most developer-friendly video AI models on the market. It runs efficiently on consumer GPUs, delivers production-ready outputs at resolutions up to 1080p, and—critically for WaveSpeedAI users—is available through a ready-to-use REST API with no cold starts and affordable per-second pricing.

The model supports flexible aspect ratios (16:9 landscape and 9:16 vertical), variable durations from 5 to 20 seconds, and three resolution tiers (480p, 720p, 1080p) to balance quality, speed, and cost.

Key Features That Set LTX-2 Apart

Synchronized Audio-Video Generation

The defining feature of LTX-2 is its ability to generate audio that naturally aligns with visual content. When you prompt for “a thunderstorm over a city skyline,” you get lightning flashes and the rumble of thunder. A “jazz pianist performing in a dim club” produces not just animated hands on keys, but the ambient soundscape of a live performance.

This isn’t background music layered on top—it’s contextual audio generated through the same diffusion process that creates the visuals, ensuring temporal and semantic alignment.

Production-Ready Quality

LTX-2 19B has been benchmarked against top-tier competitors like Sora 2 and Kling 2.6. While Sora 2 leads in photorealism for certain use cases, LTX-2 delivers a compelling balance: naturally reactive characters, temporally consistent motion, and—uniquely—20-second video generation, compared to Sora 2’s 12-second cap.

According to industry comparisons, LTX-2 achieves near-parity with Sora 2 in visual quality while costing approximately 40% less per generation and offering longer duration outputs.

Flexible Resolution and Aspect Ratios

WaveSpeedAI’s implementation gives you full control over output format:

480p: Fast iteration, lowest cost—ideal for rapid prototyping and testing multiple prompts
720p: Balanced quality and cost, suitable for most social media and web use cases
1080p: Maximum detail for final deliverables, presentations, and high-end content

You can switch between 16:9 landscape (YouTube, desktop) and 9:16 vertical (TikTok, Instagram Reels, Stories) to match platform requirements without additional tooling.

Variable Duration Control

Generate clips from 5 to 20 seconds—long enough to establish a narrative beat, show a product demo, or create a complete social media snippet. This extended duration sets LTX-2 apart from competitors and reduces the need for stitching multiple generations together.

Real-World Use Cases

Social Media Content

Create TikTok, Reels, and Stories clips with built-in audio in seconds. No need for separate audio sourcing, licensing, or manual syncing. Prompt “skateboarding through a neon-lit tunnel” and get a complete clip ready to upload.

Product Demonstrations

Generate promotional videos with ambient sound that enhances the visual narrative. A prompt like “coffee being poured into a ceramic mug in a sunlit kitchen” produces steam, motion, and the sound of liquid hitting porcelain.

Marketing and Advertising

Produce ad content with cohesive audiovisual design. LTX-2’s ability to generate contextually appropriate audio means your product shots come with matching soundscapes—no stock audio library required.

Prototyping and Concept Visualization

Quickly visualize ideas for stakeholder reviews. Iterate at 480p to test prompt variations, then render finals at 1080p once the concept is locked. Setting a fixed seed keeps results reproducible across iterations.
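
The iterate-then-finalize workflow can be sketched as a pair of request payloads that differ only in resolution. This is a minimal sketch, not the platform's prescribed method: the `prompt`, `resolution`, `aspect_ratio`, and `duration` keys follow the SDK example in this article, while the `seed` key name and the helper `draft_and_final` are assumptions made for illustration.

```python
def draft_and_final(prompt, seed, aspect_ratio="16:9", duration=5):
    """Return (draft, final) payloads that share everything except resolution."""
    shared = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "seed": seed,  # assumed key name for the optional seed setting
    }
    draft = {**shared, "resolution": "480p"}   # cheapest tier, for prompt iteration
    final = {**shared, "resolution": "1080p"}  # full-detail tier, once the concept is locked
    return draft, final


draft, final = draft_and_final("coffee being poured into a ceramic mug", seed=42)
print(draft["resolution"], final["resolution"])
```

Whether a given seed reproduces the same composition across resolution tiers is not stated here, so it is safer to treat the 1080p render as a fresh check than as a guaranteed match.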

Content Creators and YouTubers

Generate B-roll, intros, or narrative sequences with synchronized sound. The 20-second duration window is ideal for establishing shots, transitions, or standalone story beats.

How to Get Started on WaveSpeedAI

Using LTX-2 19B on WaveSpeedAI is straightforward:

1. Navigate to the model page: https://wavespeed.ai/models/wavespeed-ai/ltx-2-19b/text-to-video
2. Write your prompt: Describe the scene, action, and any specific audio cues (e.g., “footsteps on gravel,” “distant thunder,” “jazz piano”)
3. Configure settings:
   - Resolution: Choose 480p (fast iteration), 720p (balanced), or 1080p (final quality)
   - Aspect ratio: 16:9 for landscape, 9:16 for vertical
   - Duration: 5–20 seconds based on your content needs
   - Seed (optional): Set a fixed value for reproducible results
4. Run: Submit your request and receive a video with synchronized audio, with no post-production required

WaveSpeedAI handles all infrastructure: no cold starts, optimized inference, and per-second billing. You pay only for what you generate, with transparent pricing starting at $0.06 for a 5-second 480p clip.

Python SDK Example

import wavespeed

output = wavespeed.run(
    "wavespeed-ai/ltx-2-19b/text-to-video",
    {
        "prompt": "A golden retriever playing in autumn leaves, slow motion",
        "resolution": "720p",
        "aspect_ratio": "16:9",
        "duration": 10
    },
)

print(output["outputs"][0])  # Video URL with audio
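
Before submitting a request like the one above, it can help to check the settings locally against the limits this article describes (three resolutions, two aspect ratios, 5 to 20 seconds). The helper below is a hypothetical sketch, not part of the SDK: `build_request` is an invented name, the `seed` key is an assumed parameter name, and the API may enforce additional rules not covered here.

```python
ALLOWED_RESOLUTIONS = ("480p", "720p", "1080p")
ALLOWED_ASPECT_RATIOS = ("16:9", "9:16")
MIN_DURATION, MAX_DURATION = 5, 20  # seconds, per the range stated in this article


def build_request(prompt, resolution="720p", aspect_ratio="16:9", duration=5, seed=None):
    """Validate settings locally and return a payload dict (hypothetical helper)."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt must not be empty")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}")
    if not MIN_DURATION <= duration <= MAX_DURATION:
        raise ValueError(f"duration must be between {MIN_DURATION} and {MAX_DURATION} seconds")

    payload = {
        "prompt": prompt.strip(),
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
    if seed is not None:
        payload["seed"] = seed  # assumed key name for the optional seed
    return payload


payload = build_request(
    "A golden retriever playing in autumn leaves, slow motion",
    duration=10,
)
print(payload)
```

The resulting dict has the same shape as the second argument passed to `wavespeed.run` in the example above, so invalid settings fail fast without spending a generation.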

Pricing That Scales

WaveSpeedAI offers usage-based pricing that scales with resolution and duration:

Resolution	5s	10s	15s	20s
480p	$0.06	$0.12	$0.18	$0.24
720p	$0.08	$0.16	$0.24	$0.32
1080p	$0.12	$0.24	$0.36	$0.48

This pricing model ensures you can iterate freely at lower resolutions and reserve high-quality renders for final outputs—maximizing both creative flexibility and cost efficiency.
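
The table is linear in duration, so each row reduces to a per-second rate (the 5-second price divided by 5). The estimator below is a rough sketch under that assumption: `estimate_cost` is a hypothetical helper, and how durations that are not multiples of 5 seconds are actually billed or rounded is not stated in this article.

```python
from decimal import Decimal

# Per-second rates derived from the pricing table above (5 s price / 5).
RATE_PER_SECOND = {
    "480p": Decimal("0.012"),
    "720p": Decimal("0.016"),
    "1080p": Decimal("0.024"),
}


def estimate_cost(resolution, duration_s):
    """Estimate the price in USD for one clip, assuming linear per-second pricing."""
    if resolution not in RATE_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if not 5 <= duration_s <= 20:
        raise ValueError("duration must be between 5 and 20 seconds")
    return RATE_PER_SECOND[resolution] * duration_s


# Compare ten 480p drafts plus one 1080p final against eleven 1080p renders (10 s each).
iterate_cheaply = 10 * estimate_cost("480p", 10) + estimate_cost("1080p", 10)
all_full_quality = 11 * estimate_cost("1080p", 10)
print(iterate_cheaply, all_full_quality)
```

Using `Decimal` rather than floats keeps the estimates exact for currency arithmetic; the comparison at the end illustrates the savings from drafting at 480p.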

Why Choose WaveSpeedAI?

WaveSpeedAI provides the infrastructure advantages you need for production workflows:

No cold starts: Instant inference, even after extended idle periods
Fast inference: Optimized GPU allocation for minimal wait times
Affordable pricing: Pay only for the seconds and resolution you use
REST API: Simple integration into existing workflows, automation pipelines, or custom applications
Transparent billing: No hidden fees, subscription tiers, or compute credits

Pro Tips for Best Results

Be specific about audio: While audio generates automatically, describing sounds in your prompt (“thunderstorm,” “jazz music,” “footsteps”) helps guide the model
Match aspect ratio to platform: Use 9:16 for vertical-first platforms (TikTok, Stories), 16:9 for YouTube and desktop
Iterate at 480p: Dial in your prompt at lower cost, then re-render at 1080p for final delivery
Use fixed seeds: When testing prompt variations, lock the seed to isolate the effect of your changes
Combine multiple clips: For longer content, generate 20-second segments and edit them together in post
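
For the last tip, planning how to split a longer piece into clips is simple arithmetic. The sketch below assumes the API accepts any whole number of seconds between 5 and 20, which this article does not confirm; `plan_segments` is a hypothetical helper that uses as few clips as possible and keeps their lengths within one second of each other.

```python
MIN_CLIP, MAX_CLIP = 5, 20  # seconds, per the duration range described in this article


def plan_segments(total_seconds):
    """Split a target length into the fewest clips of near-equal, valid duration."""
    if total_seconds < MIN_CLIP:
        raise ValueError(f"total must be at least {MIN_CLIP} seconds")
    count = -(-total_seconds // MAX_CLIP)      # ceiling division: fewest clips needed
    base, extra = divmod(total_seconds, count)
    # The first `extra` clips carry one additional second so the lengths sum to the total.
    return [base + 1] * extra + [base] * (count - extra)


print(plan_segments(45))
print(plan_segments(50))
```

Even segment lengths avoid a short leftover clip at the end, which would otherwise risk falling below the 5-second minimum.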

The Future of Audiovisual AI

LTX-2 19B represents a fundamental shift in video AI—from generating silent clips to producing complete audiovisual experiences. As the first DiT-based audio-video foundation model, it sets a new baseline for what creators should expect from generative video tools.

With WaveSpeedAI handling infrastructure and Lightricks’ open-source model providing cutting-edge generation quality, you can focus on what matters: creating compelling content.

Try LTX-2 19B Today

Ready to generate your first synchronized audio-video clip? Head to the LTX-2 19B model page on WaveSpeedAI and start creating. Whether you’re a solo creator, marketing team, or developer building automated content pipelines, LTX-2 19B delivers production-ready results at a price that scales with your needs.

Start generating now: https://wavespeed.ai/models/wavespeed-ai/ltx-2-19b/text-to-video