Introducing ACE-Step 1.5 on WaveSpeedAI

The Future of AI Music Creation Is Here: ACE-Step 1.5

Music creation has long been the domain of trained musicians, expensive studios, and hours of painstaking production. That changes today. We’re excited to announce the availability of ACE-Step 1.5 on WaveSpeedAI — an AI music generation model that transforms simple text descriptions into full-length songs, complete with vocals and lyrics in over 50 languages.

Whether you’re a content creator looking for custom background music, a songwriter prototyping ideas, or a developer building audio-powered applications, ACE-Step 1.5 puts professional-quality music generation at your fingertips for a fraction of a cent per second.

What Is ACE-Step 1.5?

ACE-Step 1.5 is a text-to-audio model that generates music from two simple inputs: style tags that describe the genre, mood, and instrumentation, and optional structured lyrics that guide the vocal performance. The model can produce tracks up to four minutes long with high acoustic fidelity, supporting everything from lo-fi ambient instrumentals to full pop songs with verses, choruses, and bridges.

What sets ACE-Step 1.5 apart is its combination of quality, flexibility, and cost. The model supports over 50 languages for lyric generation and handles complex song structures with section markers like [Verse], [Chorus], and [Bridge]. It costs $0.0003 per second of generated audio, so a full four-minute (240-second) track comes to about $0.07.
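The pricing arithmetic can be checked with a short sketch. It assumes billing is strictly linear in generated seconds at the quoted rate, with no minimum charge or rounding rules; the constant and function names are illustrative and not part of any SDK.

```python
# Assumption: cost scales linearly with generated seconds at the rate
# quoted in this post; minimum charges or rounding rules are not modeled.
PRICE_PER_SECOND_USD = 0.0003
MAX_DURATION_SECONDS = 240  # four-minute cap quoted in this post


def estimate_cost_usd(duration_seconds: int) -> float:
    """Estimate the USD cost of one generated track."""
    if not 0 < duration_seconds <= MAX_DURATION_SECONDS:
        raise ValueError("duration must be greater than 0 and at most 240 seconds")
    return round(duration_seconds * PRICE_PER_SECOND_USD, 4)


if __name__ == "__main__":
    for seconds in (30, 60, 240):
        print(f"{seconds:>3} s -> ${estimate_cost_usd(seconds):.4f}")
```

Under that assumption, a 60-second clip comes to $0.018 and a 240-second track to $0.072.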

Key Features

Up to 4-minute tracks: Generate full-length songs up to 240 seconds, enough for complete musical compositions with multiple sections and transitions.
Tag-based style control: Define your sound with comma-separated tags such as "steampunk, electro swing, jazz, piano, ticking clock" or "pop, female vocals, upbeat, guitar, 120bpm". Mix and match genres, instruments, moods, and tempos.
Structured lyrics support: Write lyrics with standard song structure markers — [Verse], [Chorus], [Bridge], [Outro] — and the model arranges the music accordingly.
50+ language support: Generate vocals in dozens of languages, making it ideal for global content creation and multilingual projects.
Instrumental mode: Leave the lyrics field empty to generate purely instrumental tracks — perfect for background music and soundscapes.
Reproducible results: Use seed values to regenerate identical outputs, ensuring consistency across iterations.
Flexible duration control: Set the track length in seconds, from short jingles to full-length compositions of up to 240 seconds.
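As a sketch of how the tag and lyric inputs described above might be assembled in code, the helpers below join tags into a comma-separated string and lay out lyrics under section markers. The function names are hypothetical, and the blank line between sections simply mirrors the formatting used in the SDK example in this post rather than a documented requirement.

```python
from typing import Iterable, List, Tuple


def format_tags(tags: Iterable[str]) -> str:
    """Join style tags into one comma-separated string, skipping blanks."""
    cleaned = [tag.strip() for tag in tags if tag.strip()]
    return ", ".join(cleaned)


def format_lyrics(sections: List[Tuple[str, List[str]]]) -> str:
    """Render (marker, lines) pairs as '[Marker]' blocks separated by blank lines."""
    blocks = []
    for marker, lines in sections:
        blocks.append("\n".join([f"[{marker}]", *lines]))
    return "\n\n".join(blocks)


tags = format_tags(["pop", "female vocals", "upbeat", "guitar", "120bpm"])
lyrics = format_lyrics([
    ("Verse", ["Walking down the city streets at night"]),
    ("Chorus", ["We're alive, we're alive tonight"]),
])
print(tags)
print(lyrics)
```

Passing an empty list to format_lyrics yields an empty string, which corresponds to the instrumental mode described above.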

Real-World Use Cases

Content Creation and Social Media

Creating original music for YouTube videos, TikTok content, podcasts, and Instagram reels has traditionally meant either licensing stock music or hiring composers. ACE-Step 1.5 lets creators generate custom tracks tailored to their content’s mood and pacing. Need an upbeat 30-second intro? A mellow two-minute background track for a tutorial? Describe it with tags, and you have original music in seconds.

Game and App Development

Game developers and app builders can generate dynamic soundtracks, menu music, and ambient audio without licensing headaches. The tag-based system makes it easy to create thematically consistent music across different scenes or levels — dark ambient for dungeons, triumphant orchestral for boss victories, relaxing acoustic for menus.

Music Production and Songwriting

Songwriters and producers can use ACE-Step 1.5 as a rapid prototyping tool. Write your lyrics, choose a style direction with tags, and hear a full arrangement in moments. Iterate on ideas at virtually zero cost before committing to studio production. At less than two cents per minute of generated audio, experimentation becomes essentially free.

Bulk Audio Generation

Businesses that need large volumes of original music — media companies, advertising agencies, e-learning platforms — can generate hundreds of unique tracks cost-effectively. The API-first approach makes it straightforward to integrate music generation into automated content pipelines.
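An automated pipeline of this kind might look like the sketch below. It is not an official batching API: generate_batch is a hypothetical helper, the run argument stands in for a callable with the same shape as wavespeed.run in the SDK example in this post, and the output["outputs"][0] access is taken from that example. Requests are sent one at a time; rate limits and retries are not modeled.

```python
from typing import Any, Callable, Dict, List

MODEL_ID = "wavespeed-ai/ace-step-1.5"  # model ID from the SDK example in this post


def generate_batch(
    jobs: List[Dict[str, Any]],
    run: Callable[[str, Dict[str, Any]], Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Run each job in turn and record either the audio URL or the error."""
    results = []
    for job in jobs:
        try:
            output = run(MODEL_ID, job)
            results.append({"job": job, "url": output["outputs"][0], "error": None})
        except Exception as exc:  # keep going so one failure does not stop the batch
            results.append({"job": job, "url": None, "error": str(exc)})
    return results


def fake_run(model: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Offline stand-in for the SDK call so the sketch runs without network access."""
    if not payload.get("tags"):
        raise ValueError("tags is required")
    return {"outputs": ["https://example.invalid/track.mp3"]}


jobs = [
    {"tags": "dark ambient, drones", "duration": 30},
    {"tags": "", "duration": 30},  # deliberately invalid to show error capture
]
for result in generate_batch(jobs, fake_run):
    print(result["url"] or f"failed: {result['error']}")
```

To call the real service, pass wavespeed.run as the run argument in place of fake_run.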

Multilingual and Global Projects

With support for over 50 languages, ACE-Step 1.5 is uniquely suited for projects that span markets and cultures. Generate the same song concept with lyrics in English, Japanese, Spanish, and Korean — each with natural-sounding vocal delivery.
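One way to set this up is to keep the tags and duration fixed and vary only the lyrics. The sketch assumes the vocal language follows from the lyric text itself; no separate language parameter is described in this post, so none is used. The lyric lines and the build_language_variants helper are illustrative.

```python
from typing import Any, Dict


def build_language_variants(
    tags: str, duration: int, lyrics_by_language: Dict[str, str]
) -> Dict[str, Dict[str, Any]]:
    """Return one request payload per language, sharing tags and duration."""
    return {
        language: {"tags": tags, "lyrics": lyrics, "duration": duration}
        for language, lyrics in lyrics_by_language.items()
    }


variants = build_language_variants(
    tags="pop, female vocals, upbeat, guitar, 120bpm",
    duration=60,
    lyrics_by_language={
        "en": "[Chorus]\nWe're alive, we're alive tonight",
        "es": "[Chorus]\nEstamos vivos, estamos vivos esta noche",
        "ja": "[Chorus]\n今夜、私たちは生きている",
    },
)
for language, payload in variants.items():
    print(language, payload["lyrics"].splitlines()[1])
```

Each payload can then be submitted the same way as a single-language request.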

Getting Started on WaveSpeedAI

Using ACE-Step 1.5 on WaveSpeedAI is straightforward. You can start generating music through the model page or integrate it directly into your applications via the API.

Here’s a quick example using the WaveSpeed Python SDK:

import wavespeed

output = wavespeed.run(
    "wavespeed-ai/ace-step-1.5",
    {
        "tags": "pop, female vocals, upbeat, guitar, 120bpm",
        "lyrics": "[Verse]\nWalking down the city streets at night\nNeon signs are painting everything in light\n\n[Chorus]\nWe're alive, we're alive tonight\nNothing's gonna stop us feeling right",
        "duration": 120,
    },
)

print(output["outputs"][0])  # Audio output URL

The tags parameter is the only required field. Add lyrics for vocal tracks, set duration to control track length (up to 240 seconds), and optionally use seed for reproducible results.
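The parameter rules in the previous paragraph can be captured in a small payload builder. This is a sketch: build_request is a hypothetical helper, optional fields are simply omitted when unset (whether the API prefers an omitted or an empty lyrics field is an assumption), and the only limit enforced is the 240-second cap stated above.

```python
from typing import Any, Dict, Optional

MAX_DURATION_SECONDS = 240  # upper bound stated in this post


def build_request(
    tags: str,
    lyrics: Optional[str] = None,
    duration: Optional[int] = None,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a request payload: tags is required, everything else is optional."""
    if not tags.strip():
        raise ValueError("tags is required")
    if duration is not None and not 0 < duration <= MAX_DURATION_SECONDS:
        raise ValueError(f"duration must be in 1..{MAX_DURATION_SECONDS} seconds")
    payload: Dict[str, Any] = {"tags": tags}
    if lyrics:  # empty or missing lyrics means an instrumental track
        payload["lyrics"] = lyrics
    if duration is not None:
        payload["duration"] = duration
    if seed is not None:  # fixed seed for reproducible output
        payload["seed"] = seed
    return payload


draft = build_request("lo-fi, ambient, piano", duration=30, seed=42)
print(draft)
```

The resulting dictionary can be passed as the second argument to wavespeed.run, as in the example above.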

A few tips to get the best results:

Be specific with tags: The more descriptive your tags, the more targeted the output. Combine genre, instrument, mood, and tempo tags for precise control.
Use structure markers: Lyrics with [Verse], [Chorus], and [Bridge] markers produce more musically coherent arrangements than unstructured text.
Start short, then extend: Prototype with 30-60 second clips before generating full-length tracks to quickly find the right style direction.
Try instrumental first: Generate without lyrics to evaluate the musical style, then add vocals once you’re happy with the sound.

Why WaveSpeedAI?

Running ACE-Step 1.5 on WaveSpeedAI gives you several advantages over self-hosted alternatives:

No cold starts: Your requests are processed immediately — no waiting for model loading or GPU allocation.
Fast inference: Optimized infrastructure delivers generated audio quickly, even for full four-minute tracks.
Affordable pricing: At $0.0003 per second of generated audio, even heavy usage stays remarkably cheap.
Simple API: A clean REST API and Python SDK mean you can integrate music generation into any workflow in minutes.
No hardware requirements: Skip the hassle of provisioning GPUs and managing model weights. Just send a request and get your audio.

Start Creating Music Today

ACE-Step 1.5 represents a genuine step forward in making music creation accessible to everyone. Whether you need a single custom track or thousands of unique compositions, the combination of quality, flexibility, and affordability makes it a compelling tool for creators and developers alike.

Head over to the ACE-Step 1.5 model page to start generating music right now — no setup required, no subscription needed. Describe your sound, write your lyrics, and let the model do the rest.