Scaling AI Video Generation: How Novita AI Achieves Dual Optimization of Efficiency and Cost with WaveSpeedAI
“WaveSpeedAI has significantly improved our inference efficiency and helped us cut video generation costs by up to 67%. With faster and more reliable video processing, we’re able to deliver an exceptional user experience at scale.”
— Junyu Huang, Novita AI COO
Customer Background
Novita AI is an AI inference infrastructure company that provides creators, developers, and enterprises with reliable and efficient video generation inference services. It supports the deployment of multiple mainstream video generation models, covering both image-to-video and text-to-video generation end to end, and serves global creative users and AI platforms at resolutions from 720P to 1080P.
Challenges Before WaveSpeedAI
As the number of models and service complexity increased, Novita AI faced several challenges in its inference architecture and operations:
- Complex resource scheduling due to multi-model deployment: Supporting multiple models such as Wan 2.1, Kling V1.6, and Hunyuan Video, each with different memory and computational requirements, resulted in significant differences in inference efficiency.
- High costs for HD inference with underutilized GPUs: Especially for 720P and 1080P video generation tasks, individual inference cycles consumed large amounts of GPU memory, leading to high per-unit generation costs.
- Unstable latency under high concurrency: Some large models experienced significant response delays during peak user traffic, negatively affecting end-user experience and platform reputation.
Collaboration with WaveSpeedAI
To address these challenges, Novita AI established a deep collaboration with WaveSpeedAI, focusing on the optimized deployment of the following core models:
- Wan 2.1 Image-to-Video / Text-to-Video
- Hunyuan Video Fast
- Kling V1.6 Image-to-Video / Text-to-Video
With WaveSpeedAI’s support, Novita AI was able to fine-tune each model individually and dynamically schedule GPU resources across a unified pool, thereby maximizing both performance and cost efficiency.
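To make the idea of scheduling across a unified pool more concrete, here is a minimal Python sketch of one common approach, best-fit placement by free GPU memory. It is not a description of the scheduler WaveSpeedAI or Novita AI actually uses: the `Gpu` class, the `schedule` function, the job names, and every memory figure are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    """One device in the shared pool (hypothetical model of a GPU)."""
    name: str
    free_gb: float
    jobs: list = field(default_factory=list)


def schedule(jobs, pool):
    """Place jobs on GPUs with a best-fit rule; return the jobs that did not fit.

    jobs: list of (model_name, required_memory_gb) tuples.
    pool: list of Gpu objects, mutated in place.
    """
    unplaced = []
    # Handle the largest jobs first so they are not starved by small ones.
    for model, need_gb in sorted(jobs, key=lambda job: job[1], reverse=True):
        candidates = [gpu for gpu in pool if gpu.free_gb >= need_gb]
        if not candidates:
            unplaced.append(model)
            continue
        # Best fit: the GPU with the least free memory that still fits the job.
        target = min(candidates, key=lambda gpu: gpu.free_gb)
        target.free_gb -= need_gb
        target.jobs.append(model)
    return unplaced


# Illustrative values only; these are not measured memory footprints.
pool = [Gpu("gpu-0", 80), Gpu("gpu-1", 80)]
jobs = [
    ("wan-2.1-t2v", 40),
    ("hunyuan-video-fast", 30),
    ("wan-2.1-i2v", 45),
    ("wan-2.1-t2v", 40),
]
unplaced = schedule(jobs, pool)
for gpu in pool:
    print(gpu.name, gpu.jobs, f"{gpu.free_gb} GB free")
```

Best-fit packing is only one possible policy. A production scheduler would also need to account for queueing, latency targets, and model load times, none of which are modeled here.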
Results & Benefits
✅ Inference Performance Optimization: Inference efficiency improved by up to 25%, with average video generation time reduced by 30–40%.
Model | Resolution | Pre-Optimization Time | Post-Optimization Time |
---|---|---|---|
Hunyuan Video Fast | 720P | 2 minutes | 1 minute 30 seconds |
Wan 2.1 Text-to-Video | 1280×720 | 2 minutes 24 seconds | 1 minute 55 seconds |
Wan 2.1 Image-to-Video | 1280×720 | 3 minutes 10 seconds | 2 minutes 30 seconds |
Kling V1.6 Image-to-Video | 1080P / 5s | $0.98 / video | $0.92 / video |
✅ Cost Structure Optimization: Average per-call cost reduced by over 30%, with up to 67% savings in high-resolution scenarios.
Model | Resolution | Pre-Optimization Cost | Post-Optimization Cost | Cost Reduction |
---|---|---|---|---|
Hunyuan Video Fast | 720P | $0.18 / sec | $0.06 / sec | -66.7% |
Wan 2.1 Text-to-Video | 1280×720 | $0.06 / sec | $0.04 / sec | -33.3% |
Wan 2.1 Image-to-Video | 1280×720 | $0.08 / sec | $0.06 / sec | -25.0% |
Kling V1.6 Image-to-Video | 1080P / 5s | $0.49 / video | $0.46 / video | -6.1% |
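For readers who want to check the arithmetic, the Cost Reduction column can be reproduced from the pre- and post-optimization prices. The short Python sketch below uses only the prices listed in the table; the function and variable names are illustrative, and the average is a simple unweighted mean across the four rows, which is an assumption about how the "over 30%" figure might be derived.

```python
# (pre-optimization price, post-optimization price) copied from the cost table.
# Units are USD per second, except the Kling row, which is USD per video.
prices = {
    "Hunyuan Video Fast (720P)": (0.18, 0.06),
    "Wan 2.1 Text-to-Video (1280x720)": (0.06, 0.04),
    "Wan 2.1 Image-to-Video (1280x720)": (0.08, 0.06),
    "Kling V1.6 Image-to-Video (1080P / 5s)": (0.49, 0.46),
}


def cost_reduction_pct(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`, rounded to one decimal place."""
    return round((before - after) / before * 100, 1)


reductions = {
    name: cost_reduction_pct(before, after)
    for name, (before, after) in prices.items()
}
# Unweighted mean across rows (an assumption; real traffic mix would weight these).
average = round(sum(reductions.values()) / len(reductions), 1)

for name, pct in reductions.items():
    print(f"{name}: -{pct}%")
print(f"Unweighted average: -{average}%")
```

Run as written, this yields 66.7%, 33.3%, 25.0%, and 6.1% for the four rows, with an unweighted average of about 32.8%.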
✅ Improved System Stability: Model responses are more stable under high concurrency, video generation success rates increased, and failure rates dropped to below 0.05%, significantly enhancing the user experience.
Looking Ahead
Novita AI will continue to deepen its collaboration with WaveSpeedAI to further enhance the flexibility and stability of multi-model deployment, explore more efficient video inference frameworks, and continuously optimize its cost structure. With WaveSpeedAI’s technical strengths, Novita AI is confident in its ability to deliver faster, more stable, and more cost-effective video generation services to global customers, pushing the boundaries of technology and business value in AI media generation.
Try them now!
🔗Wan-2.1-14b-vace
🔗Hunyuan Video
🔗MiniMax Video 01
🔗Kling V1.6
© 2025 WaveSpeedAI. All rights reserved.