What Is Claude Managed Agents?

I counted the components our team built just to keep one Claude-based agent running in production. Agent loop, sandbox, state persistence, error recovery, credential scoping, session tracing. Six systems. Four had nothing to do with what the agent actually does. Someone in the channel asked “why did this take three months” and I didn’t have a good answer that wasn’t just the word “plumbing.”

That’s the context in which Anthropic shipped Claude Managed Agents — public beta, live as of April 8, 2026. This is not a new model. Not Claude 5, not a new Opus variant, not a reasoning upgrade. It’s an infrastructure layer — a managed runtime and agent harness that sits on top of the models you’re already using. If you evaluate it as a model release, you’ll misread what it does. Evaluate it as infra, and you’ll understand why it exists.

What Claude Managed Agents Actually Is

Not a new model: a managed agent harness and infrastructure layer

You define an agent — model, system prompt, tools, MCP servers, skills — and Anthropic runs it. The harness handles the agent loop, tool execution, sandboxing, session management, and event history. You don’t build the loop. You don’t manage the runtime.
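
To make “define an agent” concrete, here is a minimal Python sketch of what such a definition might hold. The class name, the field names, and the tool names are my own illustration of the five parts listed above, not Anthropic’s request schema, so check the API reference for the real shape.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AgentDefinition:
    """Hypothetical container for the five things you configure."""
    model: str
    system_prompt: str
    tools: list[str] = field(default_factory=list)
    mcp_servers: list[str] = field(default_factory=list)
    skills: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Serialize to JSON; the key names are illustrative only.
        return json.dumps(asdict(self), indent=2)


agent = AgentDefinition(
    model="claude-sonnet-4-6",
    system_prompt="You triage incoming bug reports.",
    tools=["bash", "web_search"],  # hypothetical tool names
)
print(agent.to_json())
```

The point of the sketch is the split of responsibilities: everything in this object is yours to design, and everything that executes it is the harness’s job.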

Anthropic’s engineering team describes it as a “meta-harness” — a system designed to accommodate future harnesses as models improve, rather than encoding fixed assumptions about what Claude can or can’t do. Their engineering blog post on Managed Agents architecture draws an analogy to how operating systems virtualized hardware into abstractions general enough for programs that didn’t exist yet. Whether that analogy holds long-term is an open question. For now, the practical upshot: you configure, Anthropic operates.

What “harness” means: tool execution, session management, sandbox, event history

A Managed Agent session gives Claude access to a cloud container with pre-installed packages (Python, Node.js, Go, etc.), network access rules, and mounted files. Claude can read files, run bash commands, browse the web, and execute code inside a sandboxed environment. The harness handles prompt caching and context compaction automatically. Event history is persisted server-side and fetchable in full — you’re not losing state when sessions run long.

Public beta status: beta header required on all endpoints

Every API request to Managed Agents endpoints requires the beta header. If you’re using the SDK, it sets the header automatically. If you’re calling the API directly with curl or your own HTTP client, you add it manually. Skip it and your requests fail. I missed this on my first attempt (the error message is clear, but the requirement is easy to overlook when skimming the docs).

The Problem It Solves

The DIY agent problem: months of infra work before shipping

Here’s what building a production agent looked like before this. You had the model. You had the prompt. You had the tool definitions. But between “works in a notebook” and “runs reliably for customers” sat months of undifferentiated infrastructure work.

What builders had to build before: agent loops, sandboxes, state management, error recovery

Agent loops with retry logic. Sandboxes isolating tool execution from production systems. State management for long-running sessions. Checkpointing so agents could resume after interruptions. Permission scoping so an agent couldn’t exceed its boundaries. Observability to trace what happened when things broke. None of it optional. All of it the same across teams.
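
For a sense of scale, here is roughly the smallest version of the first item on that list: an agent loop with tool retries. It’s a toy sketch in Python with a stubbed model function and action shapes I made up for illustration. It is not how Anthropic’s harness or the Agent SDK implement the loop.

```python
from typing import Any, Callable


def run_agent_loop(
    model_step: Callable[[list[dict]], dict],
    tools: dict[str, Callable[..., Any]],
    max_turns: int = 10,
    max_retries: int = 2,
) -> tuple[str, list[dict]]:
    """Call the model until it returns a final answer, running tools in between."""
    history: list[dict] = []
    for _ in range(max_turns):
        action = model_step(history)
        if action["type"] == "final":
            return action["text"], history
        tool = tools[action["tool"]]
        result: Any = None
        for _attempt in range(max_retries + 1):
            try:
                result = tool(**action["args"])
                break
            except Exception as exc:
                # Keep the latest error; it is reported if every retry fails.
                result = f"error: {exc}"
        history.append({"tool": action["tool"], "result": result})
    raise RuntimeError("agent exceeded max_turns without finishing")


# Stand-in "model": asks for one tool call, then finishes.
def fake_model(history: list[dict]) -> dict:
    if not history:
        return {"type": "tool", "tool": "add", "args": {"a": 2, "b": 3}}
    return {"type": "final", "text": f"sum is {history[-1]['result']}"}


answer, trace = run_agent_loop(fake_model, {"add": lambda a, b: a + b})
print(answer)
```

Even this version has no sandbox, no persistence, no permission checks, and no tracing. Each of those is another system on the list.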

What Managed Agents abstracts away

All of the above. You define the agent and the environment. Anthropic handles tool orchestration, context management, error recovery, and execution tracing. The quickstart guide walks through the full lifecycle: create agent, configure environment, launch session, send events, stream results. The first three steps are the three API calls it takes to get a session running; after that you send events and stream results.

That doesn’t mean zero work. You still design your system prompt, choose which tools to expose, set permission boundaries. But the infrastructure plumbing is off your plate.

Key Capabilities

Secure sandboxed code execution

Each session runs in an isolated cloud container. You configure what’s installed, what network access is allowed, what files are mounted. The agent operates inside that boundary. Not your servers, not your risk surface.

Persistent long-running sessions and stateful event history

Sessions persist. Event history is stored server-side and accessible via API. This is built for tasks running minutes or hours, not single-turn interactions. If something interrupts a session, the event log lets you resume.
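
Here is one way the resume step could look on the client side, sketched in Python. I’m assuming each event carries a monotonically increasing sequence number, and the event shape below is stand-in data I invented; the real events will look different.

```python
def events_after(events: list[dict], last_seen_seq: int) -> list[dict]:
    """Return events the client has not processed yet, in order."""
    return sorted(
        (e for e in events if e["seq"] > last_seen_seq),
        key=lambda e: e["seq"],
    )


# Stand-in for the server-side log (hypothetical event shape).
log = [
    {"seq": 1, "type": "user_message"},
    {"seq": 2, "type": "tool_call"},
    {"seq": 3, "type": "tool_result"},
    {"seq": 4, "type": "agent_message"},
]

# The client crashed after handling seq 2; on restart it catches up.
missed = events_after(log, last_seen_seq=2)
print([e["type"] for e in missed])
```

Because the log lives on the server, the only state the client has to keep is the position of the last event it handled.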

Multi-agent coordination (research preview — separate invitation required)

This is the one people keep getting wrong. Multi-agent coordination — where one agent spins up other agents — exists, but it’s in research preview. Same for outcomes and memory features. These are not available by default. You need a separate invitation to access them. Don’t architect around these being generally available today. I’ve seen multiple write-ups describe multi-agent as a shipping feature. It’s not.

Built-in prompt caching and compaction

The harness manages context automatically — caching repeated prompt content, compacting context when sessions run long. No need to implement your own truncation or summarization.

Session tracing in Claude Console

Every session is traceable in the Claude Console. Full event stream — what the agent did, which tools it called, what results came back. For debugging production agents, this matters more than most bullet points on a feature page.

How It Relates to Other Claude Products

The naming gets confusing. Anthropic now has several overlapping surfaces for building with Claude. Here’s how they map.

Messages API: Direct model access. You send a prompt, get a response. No harness, no agent loop, no tool orchestration. You build everything yourself. This is what most developers have been using through the standard API.

Claude Agent SDK: Same tools and agent loop that power Claude Code, packaged as a Python/TypeScript library. But you manage the runtime. Your infra, your scaling, your sandboxing.

Claude Managed Agents: Anthropic manages both the loop and the runtime. You configure. They operate. The difference from the Agent SDK isn’t capability — it’s who owns the infra.

Claude Code / Cowork: End-user products, not API primitives. Code is an agentic coding assistant. Cowork handles desktop knowledge work. Built on the same agent patterns, but they’re finished products, not building blocks.

The mental model: Messages API is the raw material. Agent SDK is the toolkit you run in your own shop. Managed Agents is the hosted workshop. Code and Cowork are the finished goods.

Who Should Pay Attention

Teams shipping long-running or async agents without dedicated infra

If you’ve been delaying an agent feature because nobody wants to own the sandbox and session infrastructure, this is aimed directly at you. Anthropic’s rate limits documentation confirms Managed Agents endpoints are rate-limited separately from the Messages API, so your existing Messages API traffic won’t eat into the request quotas for agent sessions. Organization-level spend limits still apply across both.

Enterprise teams needing sandboxing and permissions out of the box

Managed Agents ships with scoped permissions, identity management, and execution tracking. For regulated environments where “we built our own sandbox” doesn’t satisfy compliance, having Anthropic manage that layer removes one category of risk.

Builders prototyping agents who want to skip the harness work

Still in the “does this agent concept work” phase? Three API calls to a running session. That’s a real reduction in time-to-first-experiment.

Current Limitations

Beta header required

Every request needs it. SDK handles it automatically. Manual API calls don’t. This is a beta product — behaviors may change between releases.

Outcomes / multi-agent / memory: research preview only, separate invitation needed

These features require separate access approval through Anthropic. They are not part of the default public beta. Plan accordingly.

Rate limits: create endpoints 60 rpm, read endpoints 600 rpm, plus org-level spend limits

Managed Agents endpoints have their own rate limits, separate from Messages API limits. Create endpoints (agents, environments, sessions): 60 requests per minute. Read endpoints (session status, event fetching): 600 rpm. Organization-level spend limits and tier-based rate limits still apply on top. For prototyping and early production, these are fine. If you’re planning hundreds of concurrent sessions, factor these numbers into capacity planning early.
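
If you want to stay under those numbers on your side instead of discovering them through rejected requests, a small client-side limiter is enough. This is a generic sliding-window sketch in Python, not anything from Anthropic’s SDK, and I’m assuming the limits are counted per rolling 60 seconds, which the docs may define differently.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._stamps: deque[float] = deque()

    def try_acquire(self) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False


create_limiter = SlidingWindowLimiter(limit=60)   # agents, environments, sessions
read_limiter = SlidingWindowLimiter(limit=600)    # session status, event fetching

# A burst of 100 create attempts: only the first 60 are let through.
allowed = sum(create_limiter.try_acquire() for _ in range(100))
print(allowed)
```

Call try_acquire() before each request and back off or queue when it returns False. The injectable clock is only there to make the limiter testable without sleeping.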

FAQ

Is Claude Managed Agents free to use?

No. You pay standard Claude API token pricing for model usage, plus $0.08 per session-hour for active runtime (idle time excluded). Web search within sessions costs $10 per 1,000 searches. No free tier for Managed Agents — you need API access and credits.
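
A quick back-of-envelope helper in Python, using only the two rates quoted above. Token spend depends on your model and traffic, so I’ve left it as a number you pass in rather than guessing at per-token prices; the function name and the workload in the example are my own.

```python
SESSION_HOUR_USD = 0.08        # per active session-hour (idle time excluded)
SEARCH_USD_PER_1000 = 10.00    # web search within sessions


def estimate_cost(active_hours: float, web_searches: int,
                  token_cost_usd: float = 0.0) -> float:
    """Runtime and search charges, plus whatever token spend you pass in."""
    runtime = active_hours * SESSION_HOUR_USD
    search = web_searches / 1000 * SEARCH_USD_PER_1000
    return round(runtime + search + token_cost_usd, 2)


# 200 sessions, each active for 30 minutes and running 5 searches.
print(estimate_cost(active_hours=200 * 0.5, web_searches=200 * 5))
```

For that workload the runtime comes to $8 (100 session-hours) and search to $10 (1,000 searches), so $18 before tokens. In most setups the token bill will be the larger number.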

Does it work with Claude Opus 4.6 and Sonnet 4.6?

Yes. You specify the model when creating an agent. Both claude-opus-4-6 and claude-sonnet-4-6 are supported. Opus for deep reasoning tasks, Sonnet for the speed-cost balance most production workloads need.

What’s the difference between Managed Agents and Claude Agent SDK?

Capability-wise, similar — same tools, same agent loop patterns. The difference is operational. Agent SDK: you run the agent on your infra. Managed Agents: Anthropic hosts and operates the runtime. Full control vs. skip the infra work.

Can I use Claude Managed Agents in production today?

The public beta is accessible to all API users. But it’s a beta. Behaviors may change. Research preview features (multi-agent, outcomes, memory) are not generally available. For production, build on the default public beta feature set and plan for iteration.

Do I need a special beta header to use it?

Yes. Every API call requires the anthropic-beta: managed-agents-2026-04-01 header. The SDK sets it automatically. Raw HTTP requests need it added manually, or calls get rejected.
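
If you’re building requests by hand, this is roughly what adding the header looks like in Python’s standard library. The beta header value is the one quoted above. I’m assuming Managed Agents uses the same x-api-key and anthropic-version headers as the Messages API, and the URL path is a guess for illustration, so confirm both against the API reference. The request is built but never sent.

```python
import os
import urllib.request

BETA_HEADER = "managed-agents-2026-04-01"


def build_request(url: str, api_key: str, body: bytes = b"{}") -> urllib.request.Request:
    """Build (but do not send) a POST request with the beta header set."""
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "anthropic-beta": BETA_HEADER,
            "content-type": "application/json",
        },
    )


req = build_request(
    "https://api.anthropic.com/v1/agents",  # assumed path, check the API reference
    api_key=os.environ.get("ANTHROPIC_API_KEY", "sk-placeholder"),
)
# urllib stores header names capitalized, hence "Anthropic-beta".
print(req.get_header("Anthropic-beta"))
```

Wrapping header construction in one function means there’s exactly one place to update when the beta version string changes.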
