#talking-head
6 articles
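daVinci-MagiHuman: The Open-Source Model That Crushes Every Digital Human Generator
daVinci-MagiHuman is a 15-billion-parameter open-source model that generates lip-synced talking-head video in under 2 seconds on a single H100. It beats Ovi 1.1 (80% win rate) and LTX 2.3 (60.9% win rate), is released under the Apache 2.0 license, and supports multiple languages.
2 min read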
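daVinci MagiHuman Image-to-Video Now Available on WaveSpeedAI
daVinci MagiHuman Image-to-Video is a 15-billion-parameter open-source model that animates a reference image into cinematic video, with optional audio sync. Its performance is comparable to WAN 2.5. It supports resolutions up to 1080p and durations of 5 to 10 seconds. Available through a REST API at $0.04 per second, with no cold starts.
2 min read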
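daVinci MagiHuman Text-to-Video Now Available on WaveSpeedAI
daVinci MagiHuman Text-to-Video generates cinematic, human-centric video from text prompts, with optional audio sync. It is a 15-billion-parameter open-source model that supports resolutions up to 1080p and durations of 5 to 10 seconds. Available through a REST API at $0.04 per second, with no cold starts.
1 min read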
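InfiniteTalk Fast Video-to-Video Multi Now Available on WaveSpeedAI
InfiniteTalk Fast multi-character lip sync turns a video and two audio tracks into realistic talking or singing video. It is 50% cheaper than the standard version and supports videos up to 10 minutes long. Available through a ready-to-use REST inference API with strong performance, no cold starts, and affordable pricing.
1 min read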
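InfiniteTalk Video-to-Video Multi Now Available on WaveSpeedAI
InfiniteTalk Video-to-Video Multi generates realistic multi-character lip-synced video from a video and two audio inputs. It supports 480p and 720p resolution and durations up to 10 minutes, and keeps full-body motion coherent. Available through a ready-to-use REST inference API with strong performance, no cold starts, and affordable pricing.
1 min read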
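SoulX FlashHead: Real-Time AI Talking Heads at 96 FPS
SoulX FlashHead generates real-time streaming talking-head video at 96 FPS, with zero identity drift and support for unlimited-length video. Try it now on WaveSpeedAI.
2 min read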