text-to-image

wavespeed-ai/wan-2.1/text-to-image-lora

Text-to-image generation powered by Wan 2.1, delivering ultra-realistic images with photographic authenticity and fine detail. We adapted the Wan 2.1 video model to also support image generation and found that it achieves state-of-the-art image quality. This endpoint also supports LoRAs.


width
height
If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.
If set to true, the safety checker will be enabled.
If set to true, the function will wait for the image to be generated and uploaded before returning the response. It allows you to get the image directly in the response. This property is only available through the API.
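
As a rough sketch, the options above might be combined into a request body like this. Only `width` and `height` are named on this page; the keys `prompt`, `enable_base64_output`, `enable_safety_checker`, and `enable_sync_mode`, the helper `build_payload`, and the 1024 pixel size are assumptions made for illustration, so check the API documentation for the real names and limits. No network call is made.

```python
def build_payload(prompt, width, height, base64_output=False,
                  safety_checker=True, sync_mode=False):
    """Assemble a request body as a plain dict (illustrative only)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return {
        "prompt": prompt,  # assumed key name
        "width": width,
        "height": height,
        # Hypothetical key names: the page describes these three options
        # without naming them.
        "enable_base64_output": base64_output,  # BASE64 string instead of a URL
        "enable_safety_checker": safety_checker,
        "enable_sync_mode": sync_mode,  # wait for the image before responding
    }


payload = build_payload(
    "A kid playing with a dog in a sunlit park",
    width=1024,
    height=1024,
    sync_mode=True,
)
```

Keeping the payload construction separate from the HTTP call makes it easy to validate inputs before a paid run is triggered.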

Your request will cost $0.025 per run.

For $1 you can run this model approximately 40 times.
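
The pricing arithmetic can be checked with a short sketch. The per-run price comes from the pricing line above; the helper names `runs_for_budget` and `cost_of_runs` are hypothetical, and `Decimal` is used only to avoid floating-point rounding.

```python
from decimal import Decimal

PRICE_PER_RUN = Decimal("0.025")  # USD per run, as stated on this page


def runs_for_budget(budget_usd):
    """Return how many whole runs a budget covers at the per-run price."""
    return int(Decimal(str(budget_usd)) // PRICE_PER_RUN)


def cost_of_runs(n):
    """Return the total cost in USD of n runs."""
    return PRICE_PER_RUN * n


print(runs_for_budget(1))  # 40
print(cost_of_runs(40))    # 1.000
```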

Examples

A young woman hanging laundry on a sunny balcony, soft shadows, fluttering clothes, warm afternoon light, urban neighborhood, photorealistic, 35mm lens
A couple reading books in a tiny café during rainy afternoon, raindrops on window, warm tones, vintage decor, cozy mood, photorealistic
A kid playing with a dog in a sunlit park, fallen leaves, trees swaying in wind, golden hour lighting, spontaneous joyful expression
A family eating dinner at a small round table, homemade dishes, ambient kitchen light, lively conversation, slightly cluttered background
B0x13ng Boxing Video The boxer throws a quick series of jabs and then a right cross.
B0x13ng Boxing Video Two boxers are in the ring, but one of them is significantly shorter than the other. The shorter female fighter in black shorts is aggressively throwing punches, trying to reach her taller opponent. The taller fighter, wearing black shorts, simply holds out his glove on the shorter fighter’s forehead, keeping her at a frustrating distance.
A fierce female fighter walking into the ring, hooded robe, determined expression, spotlight on her, crowd in shadows, LoRA: B0x13ng Boxing Video, dramatic sports energy
Boxer resting in the corner of the ring between rounds, exhausted posture, trainer speaking to him, water splashing, towel in hand, grungy gym mood, cinematic realism
cinematic wide shot of a massive bio-mechanical city on a desert planet at dusk. Towers are fused with organic, plant-like structures. A lone wanderer in a weathered cloak walks towards the city gates. The twin suns are setting, casting long shadows and a deep orange glow across the landscape. Matte painting, epic scale, concept art in the style of Syd Mead and John Harris.
a steaming bowl of authentic Japanese tonkotsu ramen, rich and creamy broth, perfectly cooked chashu pork with a seared surface, a soft-boiled egg with a gooey yolk, glistening noodles, garnished with fresh green onions and nori seaweed. Professional studio lighting, dramatic side light creating deep shadows, shallow depth of field, background is a dark, rustic ramen shop.

README

Wan 2.1 AI Video Model

We present Wan2.1, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation. Wan2.1 offers these key features:

  • 👍 SOTA Performance: Wan2.1 consistently outperforms existing open-source models and state-of-the-art commercial solutions across multiple benchmarks.
  • 👍 Multiple Tasks: Wan2.1 excels in Text-to-Video, Image-to-Video, Video Editing, Text-to-Image, and Video-to-Audio, advancing the field of video generation.
  • 👍 Visual Text Generation: Wan2.1 is the first video model capable of generating both Chinese and English text, featuring robust text generation that enhances its practical applications.
  • 👍 Powerful Video VAE: Wan-VAE delivers exceptional efficiency and performance, encoding and decoding 1080P videos of any length while preserving temporal information, making it an ideal foundation for video and image generation.