Introducing ACE-Step Audio-to-Audio: Transform Your Music with AI-Powered Remixing and Lyric Editing

The world of AI-powered music creation just took a massive leap forward. WaveSpeedAI is excited to announce the availability of ACE-Step Audio-to-Audio, a groundbreaking music transformation model that lets you remix, restyle, and rewrite songs directly from uploaded audio files. Whether you’re a producer looking to create genre-bending remixes or a content creator needing to adapt tracks for different contexts, this model opens up unprecedented creative possibilities.

What is ACE-Step Audio-to-Audio?

ACE-Step is an open-source foundation model developed collaboratively by ACE Studio and StepFun, two pioneers in AI music technology. The 3.5-billion-parameter model was publicly released in May 2025 and handles both music generation and music transformation.

Unlike traditional audio tools that simply apply filters or effects, ACE-Step understands music at a deep structural level. It analyzes the rhythm, tempo, melodic structure, and acoustic characteristics of your input audio, then regenerates it according to your creative direction—while preserving the elements that make the original track recognizable.

The model integrates diffusion-based generation with Sana’s Deep Compression AutoEncoder (DCAE) and a lightweight linear transformer, enabling it to maintain fine-grained acoustic details that other AI music tools often lose. The result is professional-grade audio quality that stands up to production standards.

Key Features

Remix Mode: Transform the musical style of any track while preserving its core rhythm, tempo, and melodic structure. Turn a pop song into a lo-fi chillhop version, convert an indie track to EDM, or create synthwave remixes of acoustic originals.
Lyrics Mode: Edit or completely replace vocal content while keeping instrumental layers intact. This powerful feature uses flow-edit technology for localized lyric modifications, preserving the original melody, vocal timbre, and accompaniment.
Style Control via Tags: Guide your transformations using intuitive genre and mood tags like “jazz,” “cinematic,” “trap,” “ambient chill,” or “electronic.” The model understands musical contexts and applies appropriate stylistic changes.
High Fidelity Preservation: Unlike many AI audio tools that introduce artifacts or muddy the sound, ACE-Step retains fine acoustic and timbral details from the original audio, ensuring professional-grade output quality.
Reproducible Outputs: Use the seed parameter to reproduce specific results or create slight variations on successful transformations. This makes the creative process more predictable and collaborative.
Multilingual Support: The underlying model supports 19 languages, with exceptional performance in English, Chinese, Spanish, Japanese, German, French, Portuguese, Italian, and Korean—perfect for international music projects.
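
The seed workflow described under Reproducible Outputs can be sketched in a few lines of Python. This is a minimal illustration, not the WaveSpeedAI client: it assumes a request is represented as a plain dictionary, that identical inputs plus an identical seed give the same output, and that the seed is passed under the key "seed". The helper name seed_variations and the other dictionary keys are hypothetical.

```python
import copy


def seed_variations(base_request, base_seed, count):
    """Return `count` copies of a request that differ only in their seed.

    Re-sending one of these requests unchanged should reproduce its result;
    sending the whole list yields a set of variations to compare.
    The request layout is a hypothetical stand-in for a real API payload.
    """
    variants = []
    for offset in range(count):
        variant = copy.deepcopy(base_request)  # never mutate the caller's dict
        variant["seed"] = base_seed + offset
        variants.append(variant)
    return variants


# Hypothetical request: only "seed" is named in the feature list above.
request = {"audio": "https://example.com/track.mp3", "tags": "lo-fi, chillhop"}
for variant in seed_variations(request, base_seed=1234, count=3):
    print(variant["seed"], variant["tags"])
```

Keeping the seed next to each result makes it easy to hand a collaborator the exact settings behind a version they liked.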

Real-World Use Cases

For Music Producers and DJs

Create unique remixes of existing tracks without starting from scratch. ACE-Step generates genre-transfer remixes that keep the rhythm, tempo, and melodic structure of the original while moving it into entirely new sonic territory. Produce multiple variations for A/B testing in your sets or find the perfect version for different venues and audiences.

For Content Creators

Adapt licensed music for different platforms, campaigns, or cultural contexts. Need a track to feel more upbeat for a product launch video? Want to create a mellower version for a documentary segment? ACE-Step handles these transformations while keeping the result recognizably tied to the original source.

For Artists and Songwriters

Experiment with your own compositions in ways that would take hours in a traditional DAW. Quickly prototype how your song might sound as a different genre, or test new lyrical directions without re-recording vocals. The model’s ability to preserve melodic structure while changing style makes it an invaluable composition tool.

For Localization and Adaptation

Rewrite lyrics for different markets while maintaining the musical integrity of the original track. The multilingual capabilities make it particularly powerful for artists and labels looking to expand their reach internationally.

Getting Started on WaveSpeedAI

Using ACE-Step Audio-to-Audio on WaveSpeedAI is straightforward:

Upload Your Audio: Provide an MP3 or WAV file of the track you want to transform.
Describe the Original: Add tags that describe the current genre and style of your track (e.g., “pop,” “acoustic,” “upbeat”).
Set Your Target: Specify tags for your desired output style (e.g., “jazz,” “electronic,” “cinematic”).
Choose Your Mode: Select “remix” to change the musical style or “lyrics” to modify vocal content.
Generate: Let the model work its magic. With WaveSpeedAI’s optimized infrastructure, you’ll have results in seconds, not minutes.

For lyrics mode, you can optionally input the original lyrics for better contextual understanding and provide new lyrics to be generated. The model handles the rest, matching your new words to the existing rhythm and melody.
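
For readers who plan to script these steps, the sketch below assembles the same inputs into a single request body. It is only an illustration under stated assumptions: the function name build_request and every field name in the payload (audio, original_tags, tags, edit_mode, lyrics, original_lyrics, seed) are hypothetical and may not match the actual WaveSpeedAI schema, so check the model page before relying on them.

```python
import json
from urllib.parse import urlparse

ALLOWED_MODES = {"remix", "lyrics"}
ALLOWED_EXTENSIONS = (".mp3", ".wav")


def build_request(audio_url, original_tags, target_tags, mode="remix",
                  original_lyrics=None, new_lyrics=None, seed=None):
    """Collect the inputs from the steps above into one dictionary.

    All payload keys are hypothetical placeholders, not a documented schema.
    """
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    # Check the extension on the URL path so query strings do not interfere.
    if not urlparse(audio_url).path.lower().endswith(ALLOWED_EXTENSIONS):
        raise ValueError("audio must be an MP3 or WAV file")
    if mode == "lyrics" and not new_lyrics:
        raise ValueError("lyrics mode needs the new lyrics to generate")

    payload = {
        "audio": audio_url,
        "original_tags": ", ".join(original_tags),  # describes the input track
        "tags": ", ".join(target_tags),             # describes the desired output
        "edit_mode": mode,
    }
    if mode == "lyrics":
        payload["lyrics"] = new_lyrics
        if original_lyrics:  # optional extra context for the model
            payload["original_lyrics"] = original_lyrics
    if seed is not None:
        payload["seed"] = seed
    return payload


body = build_request(
    "https://example.com/track.mp3",
    original_tags=["pop", "acoustic", "upbeat"],
    target_tags=["jazz", "cinematic"],
    seed=42,
)
print(json.dumps(body, indent=2))
```

Validating the mode and file type locally catches the most common mistakes before a request is ever sent.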

Explore the full capabilities at wavespeed.ai/models/wavespeed-ai/ace-step/audio-to-audio.

Why WaveSpeedAI?

Running ACE-Step on WaveSpeedAI gives you distinct advantages:

Blazing Fast Inference: Our optimized infrastructure returns results in seconds, so you spend your time iterating on ideas rather than waiting on renders.
No Cold Starts: Your requests run immediately on warm, production-ready infrastructure. Time is creativity, and we don’t waste yours.
Affordable Pricing: At just $0.0002 per second of generated audio, experimenting with different styles and variations costs pennies, not dollars.
Simple REST API: Integrate ACE-Step into your production pipelines, creative tools, or applications with a straightforward API interface.
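
To put the pricing in concrete terms, a three-minute track is 180 seconds, so one generation costs 180 × $0.0002 = $0.036, and ten variations of it cost $0.36. The short sketch below does the same arithmetic for any duration. It assumes billing is strictly linear in the length of the generated audio, with no minimum charge or rounding, which the pricing line above does not spell out; the helper name estimate_cost is hypothetical.

```python
from decimal import Decimal

PRICE_PER_SECOND = Decimal("0.0002")  # USD per second of generated audio, as quoted above


def estimate_cost(duration_seconds, variations=1):
    """Estimate the USD cost of generating `variations` outputs of a given length.

    Assumes purely linear pricing; real billing may round or apply minimums.
    """
    if duration_seconds < 0 or variations < 0:
        raise ValueError("duration and variations must be non-negative")
    # Decimal avoids binary floating-point noise in small currency amounts.
    return PRICE_PER_SECOND * Decimal(str(duration_seconds)) * variations


print(f"One 3-minute track: ${estimate_cost(180):.4f}")
print(f"Ten variations:     ${estimate_cost(180, variations=10):.4f}")
```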

The Future of Music Transformation

The vision behind ACE-Step extends beyond just another music tool. As stated by its creators, the goal is to “build the Stable Diffusion moment for music”: a foundation model that democratizes professional-quality music production. The Audio-to-Audio capability represents just one piece of this broader ecosystem, which includes specialized features for rap generation, stem generation, and more.

For producers, creators, and music enthusiasts, this means access to capabilities that previously required expensive studios, years of training, or teams of specialists. ACE-Step puts professional remix and adaptation tools in everyone’s hands.

Start Remixing Today

ACE-Step Audio-to-Audio is available now on WaveSpeedAI. Whether you’re reimagining classic tracks, adapting content for new audiences, or simply exploring what your music could become, the tools are ready.

Visit wavespeed.ai/models/wavespeed-ai/ace-step/audio-to-audio to start transforming your audio. With instant inference, no setup required, and pricing that encourages experimentation, there’s never been a better time to discover what AI-powered music transformation can do for your creative workflow.