
text-to-audio

wavespeed-ai/ace-step

ACE-Step is a foundation model for music generation that creates up to 4 minutes of music with lyrics from text. The model preserves fine-grained acoustic details, enabling advanced control mechanisms such as voice cloning, lyric editing, remixing, and track generation.


Generation is billed at $0.0002 per second of generated audio, so $1 buys approximately 5,000 seconds of output.


README

ACE-Step: Text to Audio 🎵

ACE-Step Text-to-Audio is a next-generation AI music generation model that composes complete songs, including vocals, instrumentals, and lyrics, directly from text descriptions. It enables creators to produce professional-quality music up to 4 minutes long from a simple prompt and a few style tags.

✨ Key Features

  • 🎶 Text-to-Music Generation: Transform plain descriptions into coherent music tracks with melody, rhythm, and lyrics. Example: “A soulful R&B song with emotional vocals and smooth piano chords.”

  • 🎚 Style Tag Control: Enter multiple tags such as lofi, hiphop, drum and bass, trap, chill to guide genre, tempo, and energy.

  • 🎤 Vocal & Lyric Creation: Generates original vocals and synchronized lyrics that fit your prompt’s tone and rhythm.

  • 🪄 Voice Cloning & Remixing (Advanced): Optionally replicate vocal tone or remix existing musical ideas using the same control interface.

  • 🎧 Fine-Grained Acoustic Fidelity: Maintains dynamic balance, spatial quality, and instrument clarity for professional-grade sound.

  • 🕒 Flexible Duration: Adjustable from a few seconds to 4 minutes (240 seconds), ideal for everything from jingles to full songs.

🎡 Use Cases

  • Music Production & Songwriting β€” Generate complete demos or backing tracks instantly.
  • Film, Game & Media Scoring β€” Create mood-specific tracks with precise control over emotion and pacing.
  • Advertising & Content Creation β€” Design catchy audio for short-form content or brand storytelling.
  • Education & Experimentation β€” Teach structure, genre, or lyric composition with immediate feedback.
  • Soundtrack Prototyping β€” Preview musical direction before full studio production.

⚙️ Parameters

| Parameter | Description |
| --- | --- |
| tags* | List of genres or styles (e.g., lofi, hiphop, drum and bass, chill) |
| lyrics | (Optional) Provide custom lyrics or leave blank for auto-generated ones |
| duration | Music length in seconds (up to 240) |
| seed | Fix for reproducibility or randomize for new variations |
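
As a rough illustration of how these four parameters fit together, the Python sketch below assembles and validates a request body before it is sent. It makes several assumptions: the helper name `build_payload` is hypothetical, the field names simply mirror the parameter names listed above, tags are assumed to be sent as one comma-separated string, and leaving out `seed` is assumed to mean "randomize". It does not call any real endpoint, so check the official API reference for the actual request schema.

```python
from __future__ import annotations

MAX_DURATION_S = 240  # upper bound stated for the duration parameter


def build_payload(tags: list[str], lyrics: str | None = None,
                  duration: int = 30, seed: int | None = None) -> dict:
    """Assemble a request body (hypothetical helper, not an official client)."""
    cleaned = [t.strip() for t in tags if t.strip()]
    if not cleaned:
        raise ValueError("tags is required: provide at least one genre or style")
    if not 0 < duration <= MAX_DURATION_S:
        raise ValueError(f"duration must be between 1 and {MAX_DURATION_S} seconds")

    # Assumption: tags travel as a single comma-separated string.
    payload = {"tags": ", ".join(cleaned), "duration": duration}
    if lyrics:            # blank or missing lyrics are left for the model to generate
        payload["lyrics"] = lyrics
    if seed is not None:  # assumption: an omitted seed means a random one
        payload["seed"] = seed
    return payload


payload = build_payload(["lofi", "hiphop", "chill"], duration=60, seed=42)
```

Reusing the same `seed` with identical inputs is how a result would be reproduced; dropping it requests a new variation.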

💰 Pricing

| Metric | Price |
| --- | --- |
| Per second of generated audio | $0.0002 / s |
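
Because billing is per second of generated audio, the cost of a track can be estimated directly from its duration. The short Python sketch below works through that arithmetic. The function names are illustrative only, and it assumes the charge is exactly the duration multiplied by the listed rate, with no minimum fee or rounding rule.

```python
from decimal import Decimal

PRICE_PER_SECOND_USD = Decimal("0.0002")  # listed rate per second of generated audio
MAX_DURATION_S = 240                      # documented maximum track length


def estimate_cost(duration_s: int) -> Decimal:
    """Estimated charge in USD for one track of the given length."""
    if not 0 < duration_s <= MAX_DURATION_S:
        raise ValueError(f"duration must be between 1 and {MAX_DURATION_S} seconds")
    return PRICE_PER_SECOND_USD * duration_s


def seconds_for_budget(budget_usd: Decimal) -> int:
    """Whole seconds of audio a budget covers at the per-second rate."""
    return int(budget_usd // PRICE_PER_SECOND_USD)


print(estimate_cost(240))                # 0.0480
print(seconds_for_budget(Decimal("1")))  # 5000
```

Under this assumption, a full-length 240-second track costs about $0.048, and $1 covers roughly 5,000 seconds of audio.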

🎢 Summary

ACE-Step Text-to-Audio empowers musicians, content creators, and storytellers to compose songs from words alone β€” blending lyrical intelligence, genre control, and acoustic quality into one seamless creative tool.