
What Is HappyHorse-1.0? The Mystery #1 AI Video Model

HappyHorse-1.0 just hit #1 on Artificial Analysis with no public team. Here's what's confirmed and what still needs verification.


Hey guys, Dora here. I track the Artificial Analysis Video Arena leaderboard most weeks — blind user votes, Elo ratings, no lab self-reporting. Last week, a name I’d never seen sat at the top of both the text-to-video and image-to-video rankings. HappyHorse-1.0. No known team. No brand. GitHub and HuggingFace links that say “coming soon.”

If you evaluate video models before integrating them into a pipeline — and you’ve learned to be skeptical of leaderboard hype — this is a breakdown of what’s confirmed, what’s only claimed, and what the gap between those two means for decisions right now.

How HappyHorse-1.0 Appeared on the Radar

Artificial Analysis Video Arena: what this leaderboard is and why it matters

Artificial Analysis runs a video arena. Users submit a text prompt or reference image. The system generates outputs from two models. Users see both side by side, don’t know which model made which, and pick the one they prefer.

Those votes feed into an Elo rating system — same math used in chess rankings. A model’s score rises when users pick it, drops when they don’t, adjusted for opponent strength. The result is a ranking based entirely on aggregate human preference under blind conditions. No cherry-picked lab submissions. No self-reported benchmarks.

Blind user votes and Elo: not self-reported benchmarks

Every other video model ranking I’ve seen has the same problem — the people reporting the numbers are the people who built the model. Artificial Analysis removes that. The quality signal comes entirely from users who don’t know what they’re voting on.

Elo differences are relative. A 60-point gap means one model wins roughly 58-59% of head-to-head matchups. A 5-point gap is noise.
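
For intuition, the win rate implied by an Elo gap follows directly from the standard expected-score formula. Here is a minimal sketch in Python, plugging in the leaderboard numbers discussed below; the function name is mine, the math is the usual chess formula.

```python
def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Expected win rate of model A over model B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# HappyHorse-1.0 vs. Seedance 2.0, T2V without audio (Elo 1333 vs. 1273)
print(f"{elo_win_probability(1333, 1273):.1%}")  # ~58.5% expected head-to-head win rate

# A 5-point gap, by contrast, is barely better than a coin flip
print(f"{elo_win_probability(1245, 1240):.1%}")  # ~50.7%
```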

T2V #1 (Elo 1333), I2V #1 (Elo 1392) — what these numbers mean in context

As of early April 2026, HappyHorse-1.0’s positions on the Artificial Analysis leaderboard:

| Category | Elo | Rank |
| --- | --- | --- |
| Text-to-Video (no audio) | 1333 | #1 |
| Image-to-Video (no audio) | 1392 | #1 |
| Text-to-Video (with audio) | 1205 | #2 |
| Image-to-Video (with audio) | 1161 | #2 |

The previous #1 in T2V without audio was Dreamina Seedance 2.0 at 1,273. A 60-point Elo gap is not small. In the I2V no-audio category, HappyHorse leads Seedance 2.0 by 37 points.

With audio included, the picture inverts — Seedance 2.0 edges HappyHorse out for #1. The gap there is narrow: 14 points in T2V with audio, 1 point in I2V with audio.

One thing to stay honest about: Elo scores for newly added models are more volatile than those of established ones. Seedance 2.0 has over 7,500 vote samples in the T2V category. HappyHorse’s sample count isn’t publicly broken out yet. These numbers will shift as more votes come in. The direction of that shift is unknown. This conclusion has an expiration date — models update fast.

What We Know About the Model

Everything in this section comes from happyhorse-ai.com. I’m flagging that upfront because none of these technical claims have been independently verified by a third party as of this article’s publication date (April 8, 2026).

Single self-attention Transformer architecture, 40-layer design (claimed by happyhorse-ai.com, unverified)

The site describes a single unified Transformer with 40 layers. Text tokens, a reference image latent, and noisy video and audio tokens are — according to the site — jointly denoised within one token sequence. The first and last 4 layers reportedly use modality-specific projections. The middle 32 layers share parameters across all modalities. No cross-attention.
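
To make that description concrete, here is a hypothetical sketch of what such an architecture could look like in PyTorch. Everything below (hidden size, class names, how the modality projections are wired) is my own guess based solely on the site's wording, not HappyHorse's actual implementation; nobody outside the team has seen the real code.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the architecture as described on happyhorse-ai.com:
# one self-attention Transformer over a single joint token sequence, 40 layers,
# modality-specific projections in the first and last 4 layers, 32 shared middle
# layers, no cross-attention. Dimensions are illustrative; nothing is verified.

D = 1024  # hidden size (illustrative, not claimed anywhere)
MODALITIES = ["text", "image", "video", "audio"]

class SharedSelfAttentionBlock(nn.Module):
    """Pre-norm self-attention + MLP block with parameters shared across modalities."""
    def __init__(self, d: int, heads: int = 16):
        super().__init__()
        self.norm1 = nn.LayerNorm(d)
        self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d)
        self.mlp = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.mlp(self.norm2(x))

class ModalityProjectedBlock(SharedSelfAttentionBlock):
    """Same block, but each modality's tokens pass through their own projection first."""
    def __init__(self, d: int):
        super().__init__(d)
        self.proj = nn.ModuleDict({m: nn.Linear(d, d) for m in MODALITIES})

    def forward(self, x, modality_ids):
        # modality_ids: one modality name per token position in the joint sequence
        projected = torch.stack(
            [self.proj[m](tok) for m, tok in zip(modality_ids, x.unbind(dim=1))], dim=1
        )
        return super().forward(projected)

class HappyHorseSketch(nn.Module):
    """Joint denoiser over one sequence: text tokens + reference-image latent +
    noisy video tokens + noisy audio tokens. Purely illustrative."""
    def __init__(self):
        super().__init__()
        self.early = nn.ModuleList(ModalityProjectedBlock(D) for _ in range(4))
        self.middle = nn.ModuleList(SharedSelfAttentionBlock(D) for _ in range(32))
        self.late = nn.ModuleList(ModalityProjectedBlock(D) for _ in range(4))

    def forward(self, tokens, modality_ids):
        for blk in self.early:
            tokens = blk(tokens, modality_ids)
        for blk in self.middle:
            tokens = blk(tokens)
        for blk in self.late:
            tokens = blk(tokens, modality_ids)
        return tokens  # denoised video/audio tokens are read off their positions

# Example: 8 text tokens, 1 image latent, 16 video tokens, 8 audio tokens
tokens = torch.randn(1, 33, D)
ids = ["text"] * 8 + ["image"] * 1 + ["video"] * 16 + ["audio"] * 8
out = HappyHorseSketch()(tokens, ids)
```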

A separate marketing site (happy-horse.art) claims 15 billion parameters. That number doesn’t appear on the primary domain or in any independent reporting.

The architecture description is specific enough to be falsifiable — if and when the weights become available, someone will verify or contradict it within hours.

Multilingual audio-video generation: Chinese, English, Japanese, Korean, German, French (claimed)

The site lists six natively supported languages for joint audio-video generation: Chinese, English, Japanese, Korean, German, and French. The happy-horse.art page adds Cantonese as a seventh and mentions “ultra-low WER lip-sync.”

I have no way to test these claims. No weights, no API, no reproducible demo. The arena outputs visible on Artificial Analysis don’t systematically test multilingual audio capability.

Text-to-video and image-to-video in one pipeline (reported)

The site describes a unified pipeline handling both T2V and I2V. This is consistent with the leaderboard data — HappyHorse-1.0 appears in both arenas under the same model name, suggesting a single model rather than separate specialized models.

The site also claims joint audio synthesis — dialogue, ambient sounds, and Foley effects generated alongside video in one pass. The #2 rankings in the “with audio” categories suggest the audio generation exists and is competitive, even if it’s not leading.

What’s Still Unverified

Team identity: pseudonymous per Artificial Analysis, speculated Asia-origin

Nobody has publicly claimed credit for HappyHorse-1.0. Artificial Analysis themselves used the word “pseudonymous” when announcing the model’s addition to the arena — meaning it was submitted without a verifiable team or organization attached.

Community speculation on X has pointed toward an Asia-based origin. The reasoning is partly the multilingual capabilities (CJK languages prominently featured), partly timing patterns that resemble previous stealth drops from Chinese AI labs. None of this constitutes confirmation. Speculation about origin is not identification of origin.

The happyhorse-ai.com site states: “Base model, distilled model, super-resolution model, and inference code — all released.” It also says: “Everything is open.”

As of April 8, 2026, both the GitHub link and the Model Hub link on that same site say “coming soon.” They point nowhere. I searched HuggingFace and GitHub for HappyHorse weights. Nothing.

The site says everything is released. The links say it isn’t. The claim doesn’t match what’s actually available.

Parameter count and hardware requirements: no independent confirmation

The 15B parameter claim appears on a secondary site (happy-horse.art), not the primary domain. The primary site mentions inference speeds — roughly 2 seconds for a 5-second clip at 256p, roughly 38 seconds for 1080p on an H100 — but these are self-reported vendor numbers. No third party has published independent benchmarks on inference speed or memory requirements.
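
For a sense of what those claimed numbers would imply, the quick arithmetic below converts them into a real-time factor (seconds of H100 compute per second of output video). I'm assuming the 1080p figure refers to the same 5-second clip, which the site doesn't state explicitly; these remain the vendor's own numbers, not measurements.

```python
# Vendor-reported inference claims from happyhorse-ai.com (unverified), converted
# to an implied real-time factor: generation seconds per second of output video.
claims = {
    "256p":  {"clip_seconds": 5, "gen_seconds": 2},
    "1080p": {"clip_seconds": 5, "gen_seconds": 38},  # assumed same 5s clip
}

for res, c in claims.items():
    factor = c["gen_seconds"] / c["clip_seconds"]
    print(f"{res}: ~{factor:.1f}x real time on a single H100 (claimed)")
# 256p:  ~0.4x real time (faster than real time, if accurate)
# 1080p: ~7.6x real time
```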

Without downloadable weights, nobody outside the model’s creators can verify parameter count, architecture details, or hardware requirements. This is where my data ends.

WAN 2.7 speculation: what’s driving it, why it remains unconfirmed

Some community members have speculated that HappyHorse-1.0 is actually WAN 2.7 — a next version from Alibaba’s WAN video model family — tested under a pseudonym before official launch.

The logic: WAN 2.6 sits on the Artificial Analysis leaderboard at Elo 1,189 for T2V (well below HappyHorse). Anonymous model drops before launches have become a pattern in the Chinese AI ecosystem. The Pony Alpha situation in February 2026 is the clearest precedent — a mystery model appeared on OpenRouter, triggered a guessing game, and turned out to be Z.ai’s GLM-5 doing a stealth stress test.

But parallel patterns don’t prove identity. The architecture description on HappyHorse’s site doesn’t obviously match the publicly known WAN architecture. No leaked weights, no API fingerprinting, no insider confirmation has connected the two. I don’t know, and saying so is better than making something up.

Why “Mystery Origin” Is Relevant for Builders

Elo is blind — quality signal is real regardless of team identity

Users who voted HappyHorse outputs above Seedance 2.0 and Kling 3.0 didn’t know what they were voting for. If the model consistently wins blind comparisons, that tells you something real about output quality — regardless of who built it.

The quality signal doesn’t require knowing the team. It requires trusting the methodology.

Access uncertainty: no stable API or public weights today

Quality signal and practical usability are two different things. As of today: no public API, no downloadable weights, no documented pricing, no SLA.

For anyone building a pipeline or shipping a product, HappyHorse-1.0 doesn’t exist as an option yet. The leaderboard rank is real. The access is not.

What to watch: GitHub release, weight availability, API access signals

Three things would move HappyHorse from “leaderboard entry” to “real option”: a GitHub repository with actual weights and inference code, a HuggingFace model card with verifiable details and a license, or an API endpoint with documented pricing.

None exist as of this writing.
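
If you'd rather automate the watching than refresh pages, a minimal check against the HuggingFace Hub works. The search term is just a guess at how the weights would be named if they ever appear; adjust it if they land under a different repo name.

```python
# Minimal watch script: ask the HuggingFace Hub whether any model matching
# "happyhorse" has been published yet. Search term is speculative.
from huggingface_hub import HfApi

matches = list(HfApi().list_models(search="happyhorse"))
if matches:
    for model in matches:
        print(model.id)  # repo id of a matching model
else:
    print("No HappyHorse weights on the Hub yet.")
```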

Where It Sits in the Current Video Model Landscape

Current T2V and I2V leaderboard context

Top of the Artificial Analysis T2V leaderboard (no audio), early April 2026:

| Rank | Model | Elo | API Available | Released |
| --- | --- | --- | --- | --- |
| #1 | HappyHorse-1.0 | 1333 | No | Apr 2026 |
| #2 | Seedance 2.0 720p | 1273 | No public API | Mar 2026 |
| #3 | SkyReels V4 | 1245 | Yes ($7.20/min) | Mar 2026 |
| #4 | Kling 3.0 1080p Pro | 1241 | Yes ($13.44/min) | Feb 2026 |
| #5 | PixVerse V6 | 1240 | Yes ($5.40/min) | Mar 2026 |

I2V (no audio) follows the same pattern: HappyHorse at 1,392, Seedance 2.0 at 1,355, PixVerse V6 at 1,338, Grok Imagine Video at 1,333, Kling 3.0 Omni at 1,297.

The two highest-quality models by Elo — HappyHorse and Seedance 2.0 — are both inaccessible via public API. Positions 3 through 5 in T2V are separated by 5 Elo points — a statistical tie.

Why this matters for teams evaluating video generation stacks

Two separate questions. Which model produces the best blind-comparison output? HappyHorse-1.0, based on current data. Which model can you actually integrate today? Not HappyHorse.

The practical leaderboard starts at position #3. SkyReels V4 offers the best quality-to-price ratio among accessible options. Kling 3.0 Pro costs more but runs 1080p natively. PixVerse V6 is the cheapest per minute in the top tier.
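
To compare those prices per clip rather than per minute, the arithmetic below assumes a 5-second generation; prices are the per-minute figures listed in the table above, current as of this writing and subject to change.

```python
# Per-clip cost for a 5-second generation, from the listed per-minute API prices.
# This is arithmetic over published prices, not a recommendation.
per_minute = {
    "SkyReels V4": 7.20,
    "Kling 3.0 1080p Pro": 13.44,
    "PixVerse V6": 5.40,
}

clip_seconds = 5
for model, price in per_minute.items():
    print(f"{model}: ${price * clip_seconds / 60:.2f} per {clip_seconds}s clip")
# SkyReels V4:         $0.60
# Kling 3.0 1080p Pro: $1.12
# PixVerse V6:         $0.45
```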

If HappyHorse releases weights or an API in the coming weeks, the calculus changes. That’s a real possibility — stealth-drop-then-release has played out multiple times this year. It’s also possible nothing materializes for months.

FAQ

Who made HappyHorse-1.0?

Unknown. Artificial Analysis describes it as “pseudonymous.” Community speculation points to an Asia-based team, but no organization has claimed it.

Is HappyHorse-1.0 available to use right now?

Not in any production-ready way. GitHub and Model Hub links say “coming soon.” No public API, no downloadable weights, no documented pricing as of April 8, 2026.

Is HappyHorse-1.0 the same as WAN 2.7?

Unconfirmed. The speculation exists because anonymous pre-launch drops are common in the Chinese AI ecosystem — the Pony Alpha / GLM-5 precedent being the most recent. No direct evidence connects HappyHorse to Alibaba’s WAN family.

How does Artificial Analysis rank video models?

Blind user voting. Users compare two videos from the same prompt without knowing which model made which, then pick their preference. Votes feed into an Elo rating system.

When will HappyHorse-1.0 weights be released?

No timeline given. “Coming soon” for both GitHub and Model Hub. No public commitment to hold anyone to.

The leaderboard numbers are real. Everything else — team, weights, access, timeline — is still pending verification.
