AI Talking Photos brings your photos to life: upload a portrait and text, and watch the person speak. Supports clips of 5 to 15 seconds. Ready-to-use REST inference API, no cold starts, affordable pricing.
$0.30 per run · about 33 runs for $10
AI Talking Photos makes any portrait speak. Upload a photo, type what you want the person to say, and AI generates a realistic talking video with accurate lip-sync — no filming, no voiceover recording required.
Realistic lip-sync generation: AI maps the text to natural lip movements and facial expressions for believable, human-quality talking video.
Any portrait, any text: Works on photos of real people, illustrations, historical figures, or fictional characters; if there's a face, it can talk.
Adjustable duration: Generate clips from 5 to 15 seconds to match your content length.
Reproducible results: Use the seed parameter to lock in a specific output for consistent iterations.
| Parameter | Required | Description |
|---|---|---|
| image | Yes | Portrait photo to animate (URL or file upload). |
| text | Yes | The text you want the person to speak. |
| duration | No | Video length in seconds. Range: 5–15. Default: 5. |
| seed | No | Random seed for reproducible results. Use -1 for a random seed. |
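
The parameters above can be assembled into a request body along these lines. This is a minimal sketch: the helper name `build_payload`, the example image URL, and the choice of a JSON body are assumptions for illustration, not part of the documented API. Only the four parameter names, the 5 to 15 second range, and the defaults come from the table.

```python
import json

# Documented duration range, in seconds.
MIN_DURATION, MAX_DURATION = 5, 15


def build_payload(image, text, duration=5, seed=-1):
    """Assemble and validate a request body (hypothetical helper).

    image and text are required; duration defaults to 5 and seed to -1
    (random), matching the parameter table.
    """
    if not image:
        raise ValueError("image is required")
    if not text:
        raise ValueError("text is required")
    if not MIN_DURATION <= duration <= MAX_DURATION:
        raise ValueError(
            f"duration must be between {MIN_DURATION} and {MAX_DURATION} seconds"
        )
    return {"image": image, "text": text, "duration": duration, "seed": seed}


# Example: a 10-second clip with a fixed seed so the result can be reproduced.
payload = build_payload(
    "https://example.com/portrait.jpg",  # placeholder URL, not a real asset
    "Hello there!",
    duration=10,
    seed=42,
)
print(json.dumps(payload, indent=2))
```

Validating the duration on the client avoids paying for a request that the service would presumably reject; the endpoint URL and authentication are left out because they are not given here.
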
| Duration | Cost |
|---|---|
| 5s | $0.30 |
| 10s | $0.60 |
| 15s | $0.90 |
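
For budgeting, the table can be turned into a small lookup. The sketch below is illustrative only: `PRICE_CENTS` and `runs_within_budget` are hypothetical names, and it covers just the three listed durations. The listed prices work out to $0.06 per second, but whether intermediate durations are billed pro rata is not stated, so the code does not assume it.

```python
# Prices from the table, stored as integer cents to avoid float rounding.
PRICE_CENTS = {5: 30, 10: 60, 15: 90}


def runs_within_budget(budget_cents, duration=5):
    """Return how many whole runs of the given duration fit in the budget."""
    if duration not in PRICE_CENTS:
        raise ValueError(f"no listed price for a {duration}s clip")
    return budget_cents // PRICE_CENTS[duration]


for seconds in (5, 10, 15):
    print(f"{seconds}s clips for $10: {runs_within_budget(1000, seconds)}")
# 5s clips for $10: 33
# 10s clips for $10: 16
# 15s clips for $10: 11
```
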