Models.dev API: Routing, Pricing, and Capability Checks

I’m Dora. I hit this while cleaning up a model router config.

The router had three sources of truth: one local YAML file, one provider dashboard, and one pricing spreadsheet that nobody wanted to touch. The spreadsheet was the dangerous one. It looked precise because it had numbers. It was not precise. It was just old.

That is where the Models.dev API becomes useful. Not as runtime infrastructure. Not as a provider API. As a structured metadata feed for model catalogs, rough pricing checks, and pre-routing filters.

I checked the public endpoints again before writing this. The Models.dev official README lists api.json, models.json, and catalog.json as public JSON endpoints. The Models.dev API page also presents the project as an open model metadata catalog, not an inference provider.

This piece documents how I would use it in a catalog and routing workflow. This is where my data ends: Models.dev helps with selection data. It does not replace provider docs, live model availability, rate-limit checks, or production SLAs.

What the Models.dev API Gives You

Provider data, model metadata, and combined catalog

The useful split is this:

Endpoint	What I use it for	What I do not use it for
`api.json`	Provider-specific model entries, pricing, limits, capabilities	Final runtime truth
`models.json`	Provider-agnostic model metadata	Provider-specific overrides
`catalog.json`	Combined catalog when I need both layers together	Blind routing decisions

I start with api.json because most routing work begins with provider availability. A router does not call “GPT-5” in the abstract. It calls a provider endpoint with a provider’s model ID, auth method, limits, billing behavior, and failure modes.

The model-only layer is still useful. Models.dev stores provider-agnostic facts such as model name, family, release date, last updated date, knowledge cutoff, reasoning support, tool calling, structured output, temperature support, limits, modalities, weights, licenses, links, and benchmark references. That is the right layer for a canonical model catalog.

Provider entries can then override those facts. Models.dev explicitly explains that provider fields can differ from the underlying model metadata when the same model is served with different limits, pricing, modalities, or feature support.

That sentence matters more than the endpoint list. It is the routing warning.

Which JSON endpoints are confirmed before writing

Confirmed before writing:

curl https://models.dev/api.json
curl https://models.dev/models.json
curl https://models.dev/catalog.json

I would still verify them during implementation. Not because I doubt the docs. Because catalog sync jobs fail in boring ways: CDN cache changes, schema changes, provider renames, numeric fields becoming null, and one model entry carrying an unexpected status.

Good metadata ingestion is defensive. Bad ingestion trusts shape because one sample looked clean.

Build a Local Model Catalog

Syncing fields your app actually needs

I would not mirror the full JSON into application logic. I would sync a smaller internal table:

type LocalModelCatalogEntry = {
  canonicalModelId: string
  providerId: string
  providerModelId: string
  displayName: string

  contextLimit?: number
  inputLimit?: number
  outputLimit?: number

  inputModalities: string[]
  outputModalities: string[]

  supportsReasoning?: boolean
  supportsToolCall?: boolean
  supportsStructuredOutput?: boolean
  supportsTemperature?: boolean

  inputCostPerMillion?: number
  outputCostPerMillion?: number
  cacheReadCostPerMillion?: number
  cacheWriteCostPerMillion?: number

  releaseDate?: string
  lastUpdated?: string
  status?: string
  sourceSyncedAt: string
}

The names are mine. The fields are mapped from documented Models.dev concepts, not invented schema promises.

For costs, I only map fields that exist in the metadata. If a cost field is absent, I do not fill it with zero unless the provider explicitly says the cost is zero. Missing and free are not the same thing. That little mistake will make a budget estimator lie.

I paused here.

A catalog that tracks everything becomes another stale catalog. A catalog that tracks the fields used in routing, budget checks, and UI filtering is easier to keep honest.

Normalizing provider and canonical model IDs

I keep two IDs. The canonical ID answers: “What is this model?” The provider model ID answers: “What string do I pass to this provider?” Those are not the same problem.

OpenAI’s model list API is a good example. It returns available model objects with provider-specific identifiers. That is provider API truth for OpenAI calls. It is not a universal canonical identity system.

So my mapping looks like this:

const routeTarget = {
  canonicalModelId: "openai/gpt-5.5",
  providerId: "openai",
  providerModelId: "gpt-5.5",
}

For a different provider serving the same or similar model family, I do not assume the same ID, context, pricing, tool behavior, or throttling.

Provider fallback breaks here when teams get casual. Having many tools is not the problem. Having to manage your tools is.

Capability-Based Model Filtering

Context, modalities, reasoning, tools, structured output

The first filter should be boring.

function passesWorkflowRequirements(model, workflow) {
  if (model.contextLimit < workflow.minContextTokens) return false
  if (!model.inputModalities.includes(workflow.requiredInput)) return false
  if (!model.outputModalities.includes(workflow.requiredOutput)) return false
  if (workflow.needsToolUse && !model.supportsToolCall) return false
  if (workflow.needsJSON && !model.supportsStructuredOutput) return false
  if (workflow.needsReasoning && !model.supportsReasoning) return false
  return true
}

That gets rid of models that should never enter the routing pool.

Google’s Gemini models API is a useful reminder of why provider-side metadata still matters. The provider’s own API is where I would check model availability, supported methods, and current capability details before letting a candidate enter production traffic.

Models.dev can help shortlist. The provider API confirms what is callable in your account and region.

Filtering out models that fail workflow requirements

My routing pool usually has three stages:

Stage	Question	Data source
Catalog filter	Could this model fit the workflow?	Models.dev metadata
Provider validation	Can my account call it now?	Provider API or dashboard
Runtime test	Does it behave under my load?	Internal evals, logs, retries

The mistake is skipping stage two.

A model can pass the catalog filter and still fail production. It may be unavailable for your account. It may have lower limits on a partner platform. It may support tools in one API surface but not another. It may have beta status. It may return structured output differently from what your parser expects.

So that’s where the bottleneck was. Not the metadata. The assumptions around the metadata.

Pricing Sync and Cost Estimation

Turning metadata into rough budget checks

I use catalog pricing for pre-flight estimation, not accounting.

A simple estimator is enough:

function estimateRunCostUsd(model, inputTokens, outputTokens) {
  const input = (inputTokens / 1_000_000) * (model.inputCostPerMillion ?? 0)
  const output = (outputTokens / 1_000_000) * (model.outputCostPerMillion ?? 0)
  return input + output
}

That number is useful when comparing routing candidates. It is not an invoice.

OpenAI’s model docs show why pricing and capability data need direct verification. Provider docs are where I would check current model IDs, context windows, supported tools, and pricing before committing production budgets.

Models.dev is a useful sync layer. Provider pricing pages and API docs remain the final source for billing-sensitive decisions.

Handling stale prices and provider overrides

I add three controls. First, every synced row gets sourceSyncedAt. Second, every estimate gets a confidence label:

type CostConfidence = "catalog_estimate" | "provider_verified" | "contract_override"

Third, enterprise or committed-use pricing lives outside Models.dev. That belongs in your own billing config. This matters with provider fallback. Same model name. Different provider. Different price. Different context limit. Different rate limit. Sometimes different tool support.

Blind switching is how a fallback path becomes a billing incident.

Models.dev is open-source and designed to be updated by the community. That is useful. It is also the reason I would never let a sync job silently overwrite production cost assumptions without review.

Routing and Fallback Patterns

My fallback pattern is not “if provider A fails, call provider B.”

It is more like this:

const workflow = {
  minContextTokens: 128000,
  requiredInput: "text",
  requiredOutput: "text",
  needsToolUse: true,
  needsJSON: true,
  maxEstimatedCostUsd: 0.25,
}

const candidates = catalog
  .filter(model => passesWorkflowRequirements(model, workflow))
  .filter(model => estimateRunCostUsd(model, 20000, 4000) <= workflow.maxEstimatedCostUsd)
  .filter(model => providerHealth[model.providerId] === "healthy")
  .sort(byPreferredProviderThenCostThenLatency)

Then I test the top candidates against the actual workflow. Not a synthetic “hello world.” The real prompt shape. Real tool schema. Real output parser. Real timeout. Real retry settings.

One fewer switch. Sounds small. Adds up fast.

What to test before production routing

Before I let a model enter production routing, I check:

Test area	What I verify
Auth	API key, project access, region, account tier
Availability	Model callable through the intended provider API
Limits	Context, output, file/input constraints
Tools	Tool calling behavior and error shape
Structured output	JSON validity and schema drift
Cost	Catalog estimate vs provider docs vs internal billing
Rate limits	Burst, sustained traffic, retry-after behavior
Fallback safety	Whether fallback output is compatible with the caller
Observability	Logs include provider, model ID, retry count, cost estimate

The most boring row is usually the one that saves the incident.

I do not treat Models.dev as a routing engine. I treat it as model metadata JSON that helps build the routing engine’s shortlist.

FAQ

Which Models.dev JSON endpoint should I use first?

For routing work, I start with api.json because provider-specific data is usually the first thing the application needs. For a model catalog UI, catalog.json may be more convenient because it combines provider data and canonical model facts. For clean model identity work, models.json is the narrower source.

Can Models.dev replace provider API docs?

No. It can reduce the time spent building a shortlist. It cannot prove your account can call a model, what your exact rate limits are, whether your enterprise contract changes pricing, or whether a provider changed behavior after the catalog was updated.

How should I handle stale model pricing?

Treat synced pricing as an estimate until verified. Store sync timestamps. Add provider-level overrides. Keep contract pricing in your own billing config. Review changes before they affect routing or budget alerts.

What fields matter most for model routing?

The fields I care about first are provider ID, provider model ID, context limit, output limit, modalities, reasoning support, tool call support, structured output support, status, and rough input/output costs. Benchmarks are useful later. They do not replace workflow tests.

Conclusion

The Models.dev API is useful because it turns scattered model information into something a builder can sync, filter, and inspect. That is the right scope.

It gives a catalog enough structure to answer early routing questions: which provider serves which model, what the rough limits are, what capabilities are documented, and what pricing might look like before deeper verification. It does not give runtime guarantees. It does not make provider APIs interchangeable. It does not remove the need to test tool use, structured output, context behavior, rate limits, and cost under your own workload.

My current rule is simple. Use Models.dev API data to build the shortlist. Use provider APIs and production tests to approve the route. To be verified again when the schema changes.

Previous posts: