What Is design.md for Coding Agents?

Last week I asked Claude Code to add three new screens to a side project. By screen two, the button radius had drifted, the headline font had quietly shifted, and the “secondary” gray was a different gray. Same prompt structure. Same model. Same session. The output just stopped caring about what came before.

This is the friction design.md for coding agents is trying to remove. It’s a plain markdown file that holds your design system in a shape an AI agent can read every time it generates UI — not as a one-off prompt, but as persistent context. Stitch reads it. Claude Code reads it. Cursor reads it. So does anything else that picks up context files from a repo root.

I’m Dora, and I spend most of my time exploring AI-assisted development workflows. This piece documents what the file actually is, why Google open-sourced the spec, and where it earns its place in a real workflow versus where it doesn’t.

What design.md Is and Why Google Labs Published It

File format and design-system role

A DESIGN.md file is two layers stacked into one document. The top is YAML front matter — colors, typography, spacing, rounded corners, components — written as structured tokens. Below that is markdown prose that explains what the tokens are for and how to use them. The tokens give an agent exact values. The prose tells it why those values exist.
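To make the two-layer idea concrete, here is a minimal sketch of how a script might separate the layers. It assumes the front matter is delimited by lines containing only `---`, and the function name `split_design_md` and the sample Overview text are my own placeholders, not part of the spec.

```python
def split_design_md(text: str) -> tuple[str, str]:
    """Split a DESIGN.md string into (front_matter, body).

    Assumption: the file opens with a line containing only '---' and the
    front matter ends at the next such line.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text  # no front matter: treat everything as prose
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            front = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return front, body
    raise ValueError("front matter opened with '---' but never closed")


# Hypothetical sample file, trimmed to a few lines for illustration.
SAMPLE = """---
name: Heritage
colors:
  primary: "#1A1C1E"
---
## Overview
Calm, editorial, high contrast.
"""

front_matter, body = split_design_md(SAMPLE)
print(front_matter)  # the token layer (still raw YAML text)
print(body)          # the prose layer
```

The token layer comes back as raw YAML text. A real tool would hand it to a YAML parser, which I left out to keep the sketch dependency-free.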

Here’s roughly what the front matter looks like, lifted from the official DESIGN.md specification on GitHub:

```yaml
---
name: Heritage
colors:
  primary: "#1A1C1E"
  tertiary: "#B8422E"
  neutral: "#F7F5F2"
typography:
  h1:
    fontFamily: Public Sans
    fontSize: 3rem
rounded:
  sm: 4px
spacing:
  sm: 8px
  md: 16px
---
```

After the front matter come the ## Overview, ## Colors, ## Typography, and remaining sections in plain markdown. Section order is fixed: Overview, Colors, Typography, Layout, Elevation, Shapes, Components, Do’s and Don’ts. Sections can be skipped, but the ones present have to appear in that order.
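The ordering rule is easy to check mechanically. The sketch below is my own illustration of that rule, not the official linter's logic. It assumes sections are `## ` headings, ignores headings it doesn't recognize, and treats a repeated section as out of order. All three are assumptions on my part.

```python
import re

# Canonical order as listed above.
SECTION_ORDER = [
    "Overview", "Colors", "Typography", "Layout",
    "Elevation", "Shapes", "Components", "Do's and Don'ts",
]


def normalize(title: str) -> str:
    # Treat curly and straight apostrophes as the same character.
    return title.strip().replace("\u2019", "'")


def sections_in_order(body: str) -> bool:
    """Return True if the recognized '## ' headings appear in canonical order."""
    found = [normalize(t) for t in re.findall(r"^## (.+)$", body, flags=re.MULTILINE)]
    positions = [SECTION_ORDER.index(t) for t in found if t in SECTION_ORDER]
    # Strictly increasing positions: skipped sections are fine, repeats are not.
    return all(a < b for a, b in zip(positions, positions[1:]))


print(sections_in_order("## Overview\n\n## Typography\n\n## Components\n"))
print(sections_in_order("## Typography\n\n## Colors\n"))
```

The first call skips Colors and still passes. The second fails because Typography appears before Colors.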

The format ships with a CLI, published on npm as @google/design.md. The lint command (npx @google/design.md lint) validates structure and checks WCAG AA contrast ratios on component color pairs. The diff command compares two files and flags token-level regressions. The export command outputs Tailwind v3 config, Tailwind v4 @theme CSS, or W3C DTCG-format JSON.
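For a sense of what a WCAG AA contrast check involves, here is a small sketch using the relative-luminance and contrast-ratio formulas from WCAG 2.x. It is not the CLI's code, and pairing `primary` with `neutral` is my own choice of example. I don't know which pairs the linter actually tests.

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG 2.x relative luminance of an sRGB color given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    channels = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve for each channel.
    r, g, b = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: str, bg: str) -> float:
    """Contrast ratio between two colors; the argument order does not matter."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


AA_NORMAL_TEXT = 4.5  # WCAG AA minimum for normal-size text

# Hypothetical pair: primary text on the neutral background from the example.
ratio = contrast_ratio("#1A1C1E", "#F7F5F2")
print(f"{ratio:.2f}", "pass" if ratio >= AA_NORMAL_TEXT else "fail")
```

Large text has a lower AA threshold (3:1), so a fuller check would also need to know the font size of each component.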

Why persistent visual context matters for coding agents

The original problem is simple. An agent generating UI has no memory of your design system unless you give it something structured to read. You can describe your palette in a prompt, get a button back, then ask for a card and watch the spacing logic reset. The model isn’t bad at writing code. It has no anchor.

design.md is the anchor. Stitch passes it in as context on every generation request. Claude Code and Cursor pick it up the same way they pick up CLAUDE.md or AGENTS.md — a file in the repo root that the agent reads before answering. Google’s framing in the open-source announcement is that this lets agents “know exactly what a color is for” rather than guessing intent from a prompt.
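If you are calling a model yourself rather than through one of these tools, "persistent context" boils down to sending the same file with every request. This is a rough sketch of that idea only. The `build_prompt` helper and the `<design_md>` wrapper tag are hypothetical, and this is not how Stitch, Claude Code, or Cursor assemble context internally.

```python
def build_prompt(design_md: str, request: str) -> str:
    """Prefix a UI request with the full design file so every call sees the same rules."""
    return (
        "Follow this design system exactly.\n\n"
        "<design_md>\n" + design_md.strip() + "\n</design_md>\n\n"
        "Task: " + request.strip()
    )


# Hypothetical, heavily trimmed design file.
design_md = (
    '---\ncolors:\n  tertiary: "#B8422E"\n---\n'
    "## Colors\nTertiary is for interaction only.\n"
)

for request in ["Build the settings screen", "Build the billing screen"]:
    prompt = build_prompt(design_md, request)
    print(prompt.splitlines()[-1])  # only the task line changes between calls
```

The point is that the design context is identical across calls and only the task changes, which is exactly what a hand-written prompt tends to get wrong by screen two.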

I tested this on the same Claude Code project that drifted last week. Added a DESIGN.md at the root, regenerated the three screens. Button radius held. Headline font held. The “secondary” gray was the same gray in all three places. Hypothesis confirmed.

The Real Problem It Solves in AI UI Generation

Consistency across screens and iterations

The thing that broke in my earlier session wasn’t one screen’s quality. Each individual screen looked fine. The problem was that screen two didn’t know what screen one had decided. Every generation started from a slightly different interpretation of “your brand.”

This is the failure mode design.md targets. The goal isn’t “make the UI prettier”; it’s “make it the same UI across generations.” The tokens are normative. When the front matter says tertiary: "#B8422E", the agent has no room to interpret it as “a warm orange.” It’s that hex value or it’s wrong.
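"That hex value or it's wrong" is something you can test after generation. Here is a minimal sketch of a drift check I might run on generated CSS. The `off_palette_colors` helper and the sample CSS are hypothetical, and it only looks at six-digit hex colors, so shorthand hex, `rgb()`, and named colors would slip through.

```python
import re

# Hex values from the example front matter.
PALETTE = {"#1A1C1E", "#B8422E", "#F7F5F2"}


def off_palette_colors(source: str, palette: set[str]) -> set[str]:
    """Return six-digit hex colors found in `source` that are not in `palette`."""
    allowed = {c.upper() for c in palette}
    found = {m.upper() for m in re.findall(r"#[0-9a-fA-F]{6}\b", source)}
    return found - allowed


# Hypothetical generated output with one drifted color.
generated_css = """
.button { background: #b8422e; color: #F7F5F2; }
.badge  { background: #C0503A; }
"""

print(off_palette_colors(generated_css, PALETTE))  # {'#C0503A'}
```

The badge color is close to the tertiary token but not equal to it, which is the kind of near-miss that is hard to spot by eye.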

For high-frequency workflows, where you’re generating five, ten, twenty screens a week and someone has to maintain them, this matters more than first-output quality. One inconsistency per screen at scale becomes a cleanup job. Design tokens for coding agents, defined once in a file, kill that cleanup job before it starts.

Why prose plus tokens is stronger than tokens alone

This is the part I almost dismissed at first. Why write paragraphs of “Boston Clay is the sole driver for interaction” when the hex value is already in the YAML?

Because the agent uses the prose to make judgment calls the tokens don’t cover. The tokens tell it the tertiary color is #B8422E. The prose tells it that color is for interaction only — not for decorative accents, not for headlines. When a prompt is ambiguous (“add a notification badge”), the prose decides whether the badge gets the interaction color or a neutral.

The same logic applies to “Do’s and Don’ts” — explicit guardrails like “never use drop shadows on cards” or “always use sentence case for button labels.” Negative constraints carry weight that pure tokens can’t express. This is a design spec for agents, not a CSS file.

The format isn’t bound to Stitch. The spec is Apache 2.0, the CLI is on npm, and the W3C DTCG export means design.md tokens can flow into any tool that reads the W3C Design Tokens Format Module. Stitch is one consumer. Claude Code, Cursor, Antigravity, and Gemini CLI are others.
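To show roughly what "flowing into DTCG tooling" means, here is a sketch that wraps the example color tokens in DTCG-style objects with `$type` and `$value` keys. The `colors_to_dtcg` helper and the top-level `colors` group name are my assumptions, the CLI's real export may be shaped differently, and newer DTCG drafts describe a structured color value rather than a bare hex string.

```python
import json


def colors_to_dtcg(colors: dict[str, str]) -> dict:
    """Wrap flat color tokens in DTCG-style objects with $type and $value."""
    return {
        "colors": {  # hypothetical group name
            name: {"$type": "color", "$value": value}
            for name, value in colors.items()
        }
    }


tokens = {"primary": "#1A1C1E", "tertiary": "#B8422E", "neutral": "#F7F5F2"}
dtcg = colors_to_dtcg(tokens)
print(json.dumps(dtcg, indent=2))
```

I left typography and spacing out on purpose, because their DTCG value shapes are more involved and I didn't want to guess at them.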

Who Should Care About design.md

I’ll be honest about the boundary here, because design.md isn’t a universal fit.

It earns its place if:

You’re generating UI with coding agents at least weekly, across more than one screen
You have a design system — even a thin one — that you want preserved across generations
You work with more than one agent or tool and want the same brand context in all of them
You’re tired of pasting “remember our palette is X, Y, Z” into every prompt

It doesn’t earn its place if:

You generate one-off mockups and discard them
Your “design system” is whatever the agent produces this time
You’re inside a fully Figma-led workflow with a mature token pipeline already running through Style Dictionary or similar — design.md is lighter than what you have

For AI-native product teams running design.md alongside other AI workflow files (CLAUDE.md, AGENTS.md), it slots in cleanly. One markdown file per concern. No build step. No JSON schema to fight. The cost of trying it is one file in the repo root and a lint command.

For platforms building agent-driven generation surfaces — including unified AI generation layers that route requests across multiple models and need to keep brand context consistent across each call — design.md is the closest thing the ecosystem has right now to a portable contract between a brand and an agent. According to the Google Labs announcement, the format was built specifically to be exportable and importable across tools. That portability is the point.

FAQ

What is design.md used for?

It’s a markdown file that gives coding agents persistent context about a design system — colors, typography, spacing, components, plus prose explaining how to apply them. The agent reads it every time it generates UI, so the output stays consistent across screens and sessions without you re-specifying brand rules in each prompt.

Is it only for Google Stitch workflows?

No. The format originated in Stitch but Google open-sourced the spec under Apache 2.0. Any AI tool that reads context files — Claude Code, Cursor, GitHub Copilot, Antigravity, Gemini CLI — can use it. The CLI exports to Tailwind config, CSS variables, or W3C DTCG JSON, so the tokens flow into non-agent tooling as well.

Why does AI-generated UI need a design-system file?

Because models have no memory of your brand between generations. Without a structured AI design-system file, every prompt re-interprets your design language from scratch. With one, the tokens act as hard constraints and the prose covers judgment calls. The difference shows up most clearly at scale: five screens generated with design.md hold their style; five screens generated without it drift.

Which teams should experiment with it first?

Teams already generating UI with coding agents weekly. Solo developers and small product teams running Claude Code or Cursor get the fastest payoff — drop the file in, regenerate, see the consistency. Larger orgs with mature Figma + Style Dictionary pipelines should treat design.md as a complement, not a replacement: use it to give agents a digestible subset of the existing system.

Conclusion

design.md isn’t a revolutionary file format. It’s a markdown file with YAML at the top. That’s the entire point — the format LLMs read best is the one they were trained on the most, and that’s plain text.

What it actually does is shift the question from “how do I describe my design in this prompt” to “where does my design live so every agent can read it.” One file, one location, every tool that picks it up gets the same answer. One fewer thing to re-specify. Sounds small. Adds up fast.

I’ve had it in two projects for a week. It works. Whether teams will maintain these files with the same rigor as a real design system, or let them rot the way READMEs rot, is still an open question. Run it on a project of your own. That’ll tell you more than anything I say.
