Mirelo AI Sfx V1.5 Video To Video

Mirelo SFX V1.5 generates sound effects and audio synchronized to any video. It is served through a ready-to-use REST inference API with no cold starts and affordable pricing.

Features

Mirelo SFX v1.5 turns your videos into synchronized sound effects using advanced multimodal AI. It listens, sees, and imagines — automatically generating realistic or cinematic sound layers that perfectly match the visual rhythm. Whether it’s footsteps, explosions, or ambient noise, this model brings motion to life.

Why it sounds great

AI-driven sound synthesis – Generates sound effects that fit object motion, timing, and energy directly from video frames.
Cinematic awareness – Detects on-screen actions (impacts, motion, intensity) and produces corresponding effects.
Multiple variations – Generate several sound versions for the same video for creative control and sound design diversity.
High coherence – Outputs seamlessly loopable audio segments aligned to scene transitions.
Plug-and-play – Just upload a video clip, set num_samples, and receive ready-to-use sound effects.

Limits and Performance

Max duration per job: up to 10 seconds (minimum billing covers 5 seconds)
Processing speed: typically 6–12 seconds per generation
Input: MP4 or MOV video, uploaded directly or supplied as a URL
Output: AI-generated synchronized sound effects (WAV or MP3)

Pricing

Duration range (seconds)	Billing rule	Cost
0–5 s	Minimum charge (5 s)	$0.007 × num_samples × 5 = $0.035 × num_samples
5–10 s	Actual duration billed	$0.007 × num_samples × duration ≈ $0.007 × num_samples per second
>10 s	Capped at 10 s	$0.07 × num_samples maximum per run
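
The billing rules in the table reduce to clamping the clip duration to the 5 to 10 second range, then multiplying by the $0.007 per-second rate and by num_samples. A minimal sketch of that arithmetic follows. It assumes fractional durations are billed as-is (the table does not say whether they are rounded), and `estimate_cost` is a hypothetical helper name, not part of the API.

```shell
# Estimate the cost of one run in US dollars.
# Usage: estimate_cost <duration_seconds> <num_samples>
estimate_cost() {
  local duration="$1" num_samples="$2"
  awk -v d="${duration}" -v n="${num_samples}" 'BEGIN {
    billed = d
    if (billed < 5)  billed = 5    # minimum charge covers 5 seconds
    if (billed > 10) billed = 10   # billing is capped at 10 seconds
    printf "%.3f\n", 0.007 * n * billed
  }'
}

estimate_cost 3 2    # prints 0.070 (billed as 5 s)
estimate_cost 8 3    # prints 0.168
estimate_cost 15 4   # prints 0.280 (capped at 10 s)
```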

How to Use

Upload a video (drag & drop or paste a URL).
(Optional) Write a prompt to describe sound context (e.g., “soft footsteps on wood,” “metal clangs,” “cinematic ambience”).
Set num_samples — the number of different sound versions to generate.
(Optional) Fix seed for reproducibility or randomize for variation.
Click Run — preview and download results.
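
When calling the API instead of using the web form, the first four steps above correspond to fields of the JSON request body (`video`, `prompt`, `num_samples`, `seed`). The sketch below builds that body with jq, which the API example in this document already relies on. `build_request_body` is a hypothetical helper, the video URL is a placeholder, and leaving `prompt` out of the body when it is empty is an assumption rather than documented behavior.

```shell
# Build a JSON request body; jq handles quoting and escaping of the prompt.
# Usage: build_request_body <video_url> <prompt or ""> <num_samples> <seed>
build_request_body() {
  local video_url="$1" prompt="$2" num_samples="$3" seed="$4"
  jq -n -c \
    --arg video "${video_url}" \
    --arg prompt "${prompt}" \
    --argjson num_samples "${num_samples}" \
    --argjson seed "${seed}" \
    '{video: $video, num_samples: $num_samples, seed: $seed}
     + (if $prompt == "" then {} else {prompt: $prompt} end)'
}

# Fixed seed (42) for reproducibility; pass -1 to request a random seed.
build_request_body "https://example.com/clip.mp4" "soft footsteps on wood" 3 42
```

The resulting string can be passed to curl with `-d`, in place of the hand-written body in the API example.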

Pro tips for best quality

Use short, focused clips (≤10s) to maintain strong visual-sound alignment.
For cinematic realism, include context in the prompt (e.g., “rainy street, distant thunder”).
Generate multiple samples to audition variations before final mixdown.
Adjust seed for subtle variations in timing and sound character.

Note

Each sample is generated independently; total cost scales linearly with num_samples.
Minimum billing covers 5 seconds even for shorter clips.
Works best with clear, high-contrast motion — busy scenes may mix sound layers automatically.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result

set -euo pipefail

export WAVESPEED_API_KEY="your-api-key"

REQUEST_BODY=$(cat <<'JSON'
{
  "video": "https://interactive-examples.mdn.mozilla.net/media/cc0-videos/flower.mp4",
  "num_samples": 2,
  "seed": -1
}
JSON
)

# 1. Submit the prediction.
SUBMIT_RESPONSE=$(curl --silent --show-error --fail-with-body \
  -X POST "https://api.wavespeed.ai/api/v3/mirelo-ai/sfx-v1.5/video-to-video" \
  -H "Authorization: Bearer ${WAVESPEED_API_KEY}" \
  -H "Content-Type: application/json" \
  -d "${REQUEST_BODY}")

TASK=$(printf '%s' "${SUBMIT_RESPONSE}" | jq 'if type == "object" and has("data") then .data else . end')
PREDICTION_ID=$(printf '%s' "${TASK}" | jq -r '.id // empty')
if [ -z "${PREDICTION_ID}" ]; then
  printf 'Submission response did not contain a prediction id\n' >&2
  exit 1
fi
RESULT_URL=$(printf '%s' "${TASK}" | jq -r '.urls.get // empty')
if [ -z "${RESULT_URL}" ]; then RESULT_URL="https://api.wavespeed.ai/api/v3/predictions/${PREDICTION_ID}/result"; fi

# 2. Poll until the prediction finishes.
while true; do
  RESPONSE=$(curl --silent --show-error --fail-with-body \
    "${RESULT_URL}" \
    -H "Authorization: Bearer ${WAVESPEED_API_KEY}")
  RESULT=$(printf '%s' "${RESPONSE}" | jq 'if type == "object" and has("data") then .data else . end')
  STATUS=$(printf '%s' "${RESULT}" | jq -r '.status // empty')

  case "${STATUS}" in
    completed) printf '%s\n' "${RESULT}" | jq '.outputs'; break ;;
    failed|cancelled|timeout) printf '%s\n' "${RESULT}" | jq . >&2; exit 1 ;;
    created|processing) sleep 2 ;;
    *) printf 'Unexpected status: %s\n' "${STATUS}" >&2; exit 1 ;;
  esac
done

Parameters

Task Submission Parameters

Request Parameters

Parameter	Type	Required	Default	Range	Description
video	string	Yes		-	URL of the input video to generate sound effects for.
prompt	string	No		-	Text prompt to guide sound effect generation
num_samples	integer	No	2	2 ~ 4	Number of sound effects to generate
seed	integer	No	-1	-	The random seed to use for the generation. -1 means a random seed will be used.
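
Since num_samples is limited to the range 2 to 4, it can be worth checking the value locally before submitting a task. The function below is a minimal sketch of such a check. `validate_num_samples` is a hypothetical helper, and how the API itself responds to out-of-range values is not described here.

```shell
# Return 0 if the argument is an integer in the documented 2..4 range,
# otherwise print a message to stderr and return 1.
validate_num_samples() {
  local n="$1"
  case "${n}" in
    ''|*[!0-9]*)
      printf 'num_samples must be an integer, got: %s\n' "${n}" >&2
      return 1 ;;
  esac
  if [ "${n}" -lt 2 ] || [ "${n}" -gt 4 ]; then
    printf 'num_samples must be between 2 and 4, got: %s\n' "${n}" >&2
    return 1
  fi
}

validate_num_samples 3 && printf 'ok\n'
```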

Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data.id	string	Unique identifier for the prediction (task ID)
data.model	string	Model ID used for the prediction
data.outputs	array	Output values, usually URL strings; some models return text strings or structured result objects (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds

Result Request Parameters

Parameter	Type	Required	Default	Description
id	string	Yes	-	Task ID

Result Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data	object	The prediction data object containing all details
data.id	string	Unique identifier for the prediction
data.model	string	Model ID used for the prediction
data.outputs	array<string | object>	Array of generated outputs (empty when status is not completed). Items are usually URL strings, but may be text strings or structured result objects, depending on the model.
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to poll for the prediction result
data.status	string	Status: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
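
Because `data.outputs` can hold items other than URL strings, a client that wants to download the generated audio may prefer to filter the array first. The sketch below reads a result response on stdin and prints only the entries that look like HTTP(S) URLs. `output_urls` is a hypothetical helper, and the sample response is made-up illustration data, not real API output.

```shell
# Print each URL-like string in outputs, one per line.
# Accepts either the full response or the bare data object.
output_urls() {
  jq -r '
    (if type == "object" and has("data") then .data else . end)
    | (.outputs // [])[]
    | select(type == "string" and startswith("http"))
  '
}

# Hypothetical completed response, for illustration only.
SAMPLE='{"code":200,"data":{"status":"completed","outputs":["https://example.com/a.wav",{"note":"non-URL item"},"https://example.com/b.wav"]}}'
printf '%s' "${SAMPLE}" | output_urls
```

Each printed URL could then be fetched with, for example, `curl --fail --location --remote-name "<url>"`.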
