
UC Berkeley Agentic AI Risk Management Profile: Autonomy, Containment, and Control (2026)

The first academic framework extending NIST AI RMF for autonomous agents. Bounded autonomy, defense-in-depth, and an autonomy spectrum that changes how you classify and govern agents.

By Abhishek G Sharma · Published March 21, 2026 · 13 min read
Berkeley's layered containment model for agentic AI systems

Why UC Berkeley Built a Separate Agentic Profile

UC Berkeley's Center for Long-Term Cybersecurity (CLTC) released the Agentic AI Risk-Management Standards Profile on February 11, 2026. The authors — Nada Madkour, Jessica Newman, Deepika Raman, Krystal Jackson, Evan R. Murphy, and Charlotte Yuan — started from a straightforward observation: NIST AI RMF 1.0 doesn't address autonomous agents because it was written before they existed at enterprise scale.

NIST's framework assumes request-response AI systems. You send a prompt, you get a response, a human evaluates it. Agentic AI breaks every one of those assumptions. Agents plan multi-step action sequences, invoke tools and APIs with production credentials, delegate to other agents, persist memory across sessions, and take actions that cascade through real systems. The risk surface isn't a property of the model alone — it's an emergent property of how the agent is configured, what tools it can access, and what environment it operates in.

Stanford Law's March 2026 analysis put it bluntly: "Kill switches don't work if the agent writes the policy." The implication is that single-layer controls — even well-designed ones — aren't sufficient for systems that can modify their own operational context. Berkeley's profile responds with two core concepts that have become foundational for enterprise agentic governance: bounded autonomy and defense-in-depth containment.

The profile is a 67-page research document, not a regulation. It doesn't carry legal force. But it's already being cited by the US government (CLTC submitted a response to a federal request on AI agent security considerations in March 2026) and by industry groups developing agentic governance standards. For practitioners, the value isn't compliance — it's conceptual clarity. Berkeley gives you a vocabulary and a framework for thinking about agent risk that existing standards don't provide.

Bounded Autonomy
Designing agents to operate within explicitly defined boundaries — tool access, data scope, financial limits, decision scope — with hard stops on out-of-bounds behavior.
Defense-in-Depth
Layering multiple independent defenses so that no single control failure causes system compromise.
Autonomy Spectrum
Agent autonomy classified from L0 (no autonomy) to L5 (full autonomy), with governance requirements scaling proportionately.
Agent Cards
Structured documentation capturing an agent's limitations, intended use, risk mitigations, and operational boundaries — analogous to model cards.

The Autonomy Spectrum (L0–L5)

Berkeley's most immediately useful contribution is the autonomy spectrum. Instead of treating agents as a binary — either autonomous or not — the profile classifies agency across six levels. This classification drives every subsequent governance decision: what controls you need, how frequently humans must intervene, and what containment measures are appropriate.

| Level | Name | Characteristics | Governance Approach |
|---|---|---|---|
| L0 | No Autonomy | Follows instructions exactly, no deviation | Minimal controls |
| L1 | Task Suggestion | Suggests tasks, user approves each one | User oversight sufficient |
| L2 | Delegated Tasks | Executes pre-approved task sequences | Human approval of each sequence |
| L3 | Planned Execution | Plans multi-step actions, human approves the plan | Human approval of plan before execution |
| L4 | Autonomous Execution | Executes within strict bounds, minimal human intervention | Bounded autonomy design required |
| L5 | Full Autonomy | Sets own goals and executes independently | Maximum containment controls |

The governance implications are direct. L0 and L1 agents can operate under trust-based governance — you trust the system and verify occasionally. L2 and L3 agents need trust-but-verify approaches where humans approve plans or sequences before execution. L4 and L5 agents demand verification-centric governance: bounded autonomy design, defense-in-depth containment, continuous monitoring, and circuit breakers.

Most enterprise agents deployed in 2026 operate at L2 or L3. The ones that cause governance headaches are the L4 agents that organizations treat as L2 — agents with tool access, API credentials, and database permissions that were granted incrementally without anyone updating the governance classification.
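The classification audit described above can be sketched in a few lines of Python. This is a minimal sketch, not part of the Berkeley profile: the inventory fields and agent names are illustrative assumptions, and the only rule encoded is the heuristic from the implementation steps later in this article (an agent holding live credentials with no per-action human approval is at least L3).

```python
# Governance tiers by autonomy level, per the Berkeley spectrum.
GOVERNANCE_TIERS = {
    0: "trust-based", 1: "trust-based",
    2: "trust-but-verify", 3: "trust-but-verify",
    4: "verification-centric", 5: "verification-centric",
}

def effective_level(declared: int, has_credentials: bool,
                    per_action_approval: bool) -> int:
    # An agent holding live credentials without per-action human
    # approval is at least L3, whatever its documentation says.
    if has_credentials and not per_action_approval:
        return max(declared, 3)
    return declared

def audit(agents: list[dict]) -> list[str]:
    """Flag agents whose actual autonomy exceeds their declared level."""
    findings = []
    for a in agents:
        actual = effective_level(a["declared_level"], a["has_credentials"],
                                 a["per_action_approval"])
        if actual > a["declared_level"]:
            findings.append(
                f"{a['name']}: declared L{a['declared_level']} "
                f"({GOVERNANCE_TIERS[a['declared_level']]}), operates at "
                f"L{actual} ({GOVERNANCE_TIERS[actual]})")
    return findings

# Hypothetical inventory: invoice-bot is the classic "L4 treated as L2" case.
inventory = [
    {"name": "invoice-bot", "declared_level": 2,
     "has_credentials": True, "per_action_approval": False},
    {"name": "faq-helper", "declared_level": 1,
     "has_credentials": False, "per_action_approval": True},
]
findings = audit(inventory)  # flags invoice-bot as under-governed
```

The useful output isn't the tier labels; it's the list of agents whose controls were calibrated for a lower autonomy level than they actually operate at.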

The AI Controls Toolkit (ACT) Tier 2 assessment maps every agent in your organization to Berkeley's autonomy spectrum and identifies where governance controls don't match the actual autonomy level.
Compare toolkit tiers to find the right level for your organization.

Bounded Autonomy Design

Bounded autonomy is Berkeley's answer to the question: "How do you let agents act autonomously without losing control?" The answer: define explicit boundaries across five dimensions, enforce them technically (not just in policy), and trigger hard stops — not soft warnings — when an agent attempts to exceed its bounds.

Five Boundary Types

The profile specifies bounds across five dimensions: tool/API access (an explicit whitelist), data access scope, financial authorization limits, temporal constraints on task duration, and decision scope.

The critical distinction Berkeley makes: policy documentation alone isn't a bound. If the bound isn't technically enforced — if the agent could theoretically exceed it and only get flagged in a log somewhere — it's not a bound, it's a suggestion. Bounded autonomy requires both the policy artifact (what the agent is allowed to do, documented and auditable) and the technical enforcement (controls that actually prevent out-of-bounds behavior in real time).
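What "technical enforcement" means in practice can be sketched for two of the five boundary types, a tool whitelist and a financial cap. The `BoundedExecutor` and `BoundViolation` names are hypothetical; the point is the hard stop: an exception that halts the action, not a log line that flags it afterward.

```python
class BoundViolation(Exception):
    """Raised on any out-of-bounds attempt: a hard stop, not a warning."""

class BoundedExecutor:
    def __init__(self, allowed_tools: set[str], spend_cap: float):
        self.allowed_tools = allowed_tools  # tool/API whitelist
        self.spend_cap = spend_cap          # financial limit
        self.spent = 0.0

    def invoke(self, tool: str, cost: float = 0.0) -> str:
        # Both checks run BEFORE the action, so an out-of-bounds
        # attempt is prevented, not merely recorded.
        if tool not in self.allowed_tools:
            raise BoundViolation(f"tool '{tool}' is not whitelisted")
        if self.spent + cost > self.spend_cap:
            raise BoundViolation(
                f"spend {self.spent + cost:.2f} exceeds cap {self.spend_cap:.2f}")
        self.spent += cost
        # ... dispatch to the real tool here ...
        return f"{tool} ok"
```

An agent wired through an executor like this can still act autonomously within bounds; it simply cannot complete an action the policy artifact never authorized.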

Defense-in-Depth Containment

Defense-in-depth is a security concept borrowed from military doctrine and applied to network security for decades. Berkeley adapts it for agentic AI: layer multiple independent defenses so that no single control failure allows an agent to cause harm. The operative word is "independent" — each layer must function even if adjacent layers fail.

Six-Layer Defense Stack

Berkeley's six layers, from innermost to outermost: (1) execution environment isolation, (2) capability restrictions, (3) network isolation, (4) continuous monitoring with anomaly detection, (5) circuit breakers that halt the agent automatically on threshold breach, and (6) a kill switch for emergency manual deactivation.

Berkeley recommends treating high-capability agents (L4 and L5) as untrusted entities — not because you assume malice, but because you prepare for worst-case behavior. This framing is borrowed from zero-trust network architecture and it applies directly: the agent proves its trustworthiness through containment, not through promises in its system prompt.

A common objection from engineering teams: "We built this agent. We know what it does. Why treat it as untrusted?" The answer is that you built the agent's configuration, not the foundation model driving its reasoning. You don't control how the model interprets edge cases, ambiguous instructions, or novel input combinations. Defense-in-depth protects against the gap between your intent and the agent's behavior — which, in production at scale, is always larger than you expect.

For SMEs, a practical starting point: implement layers 1, 4, and 5 first (execution isolation, continuous monitoring, and circuit breakers). These three layers catch the majority of incidents. Add layers 2, 3, and 6 (capability restrictions, network isolation, and kill switch) in a second phase once the monitoring baseline is established and you know what "normal" agent behavior looks like.

Six-layer defense-in-depth containment for agentic systems

New Risk Categories for Agentic AI

Berkeley identifies four risk categories that are unique to agentic systems — risks that don't exist in traditional AI and aren't addressed in NIST AI RMF 1.0: action overreach, tool-chain exploitation, inter-agent cascade, and autonomy without accountability. These categories overlap with but aren't identical to the OWASP Agentic Top 10 (which focuses on security vulnerabilities specifically).

Each risk category maps to specific NIST AI RMF subcategories that Berkeley extends. For example, "Action Overreach" maps to MAP 2.2 (risk identification) but requires additional subcategory-level specificity that the base framework doesn't provide. The OWASP Agentic Top 10 guide covers the security-specific risks (ASI01–ASI10) in detail.

Mapping to ISO 42001 and NIST AI RMF

Berkeley's profile explicitly extends NIST AI RMF. The table below maps each core concept to both NIST subcategories and ISO 42001 controls, showing where the profile supplements existing standards and where it introduces entirely new requirements.

| Berkeley Concept | ISO 42001 | NIST AI RMF | Implementation Actions |
|---|---|---|---|
| Autonomy Spectrum (L0–L5) | A.6.1 (scope) | MAP 2.2, GOVERN 1.4 | Classify every agent by autonomy level; match governance controls to classification |
| Bounded Autonomy | A.6.1, A.4.3, A.9.2 | MAP 1.1, MAP 2.2 | Document bounds per agent; implement technical enforcement for all five boundary types |
| Defense-in-Depth | A.6.2, A.10 | MANAGE 2.1, MANAGE 3.1 | Deploy 6-layer containment; test each layer independently; schedule kill-switch drills |
| Harm Pathway Analysis | Clause 6.1 (risk) | MAP 3.1, MAP 3.3 | Map causal chains from agent action to real-world harm; prioritize controls by pathway severity |
| Agent Cards | A.8 (communication) | GOVERN 4.2 | Create Agent Card documentation per system; update on every capability change |
| Agentic Risk Categories | Not addressed | Extended MAP 2.2 | Add action overreach, tool-chain, cascade, and accountability gap to risk register |

ISO 42001 covers about 40% of Berkeley's recommendations through existing Annex A controls. The remaining 60% — especially autonomy classification, defense-in-depth, and agentic risk categories — requires supplementary governance that the standard doesn't address.
See the full ISO 42001 implementation guide for base-layer controls.

UC Berkeley vs Singapore IMDA

Berkeley and IMDA are the two purpose-built agentic AI governance frameworks released in 2026. They address different aspects of the same problem, and most organizations will need both.

| Dimension | UC Berkeley CLTC | Singapore IMDA MGF |
|---|---|---|
| Primary Focus | Technical risk management | Operational governance |
| Origin | Academic research | Government regulatory body |
| Core Concepts | Bounded autonomy, defense-in-depth, autonomy spectrum | Four dimensions, meaningful human accountability |
| Answers the Question | "What technical controls do I need?" | "What governance structure do I need?" |
| Multi-Agent | Addresses delegation chains | Limited coverage |
| Implementation Specificity | Low (conceptual, academic) | Moderate (artifact-level) |
| Regulatory Trajectory | Cited by US government | Adopted as international baseline |

The complementarity is straightforward: IMDA tells you what governance to build (accountability structures, disclosure requirements, lifecycle processes). Berkeley tells you what technical controls to implement (containment layers, boundary enforcement, autonomy classification). Use both. See the full IMDA implementation guide for the governance layer.

Practical Implementation Steps for SMEs

Berkeley's profile is academic in tone and conceptual in structure. Translating it into operational controls requires a structured approach. Here's how an SME with 3 to 10 agents can implement bounded autonomy and defense-in-depth in 6 to 8 weeks.

Common Mistakes to Avoid

Organizations that attempt to implement Berkeley's concepts without structured guidance tend to repeat the same errors: classifying agents by intended rather than actual capability, documenting bounds in policy without technical enforcement, and deploying containment layers that have never been tested.

Step-by-Step Implementation

Step 1: Classify every agent on the autonomy spectrum. Map each agent to L0 through L5. Be honest — if an agent has API credentials and can take actions without per-action human approval, it's at least L3 regardless of what the deployment documentation says. Classify based on what the agent can actually do, not what you intended it to do.

Step 2: Define bounded autonomy specifications. For each L3+ agent, document all five boundary types: tool/API whitelist, data access scope, financial limits, temporal constraints, and decision scope boundaries. Each boundary needs both a policy artifact (documented, auditable) and a technical enforcement mechanism (code that actually prevents violations).
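The policy-artifact half of Step 2 can be captured as a single frozen record per agent covering all five boundary types. The schema and field names below are assumptions for illustration; Berkeley prescribes the boundary types, not a format. The record is the auditable artifact; enforcement code (Step 3) reads from it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: the spec is changed by review, not at runtime
class BoundarySpec:
    agent_name: str
    tool_whitelist: frozenset[str]   # 1. tool/API access
    data_scopes: frozenset[str]      # 2. data access scope
    spend_cap_usd: float             # 3. financial limits
    max_task_duration_s: int         # 4. temporal constraints
    decision_scope: frozenset[str]   # 5. decisions the agent may make alone

    def permits_tool(self, tool: str) -> bool:
        return tool in self.tool_whitelist

# Hypothetical agent spec, documented and auditable.
spec = BoundarySpec(
    agent_name="invoice-bot",
    tool_whitelist=frozenset({"search", "send_email"}),
    data_scopes=frozenset({"crm:read"}),
    spend_cap_usd=50.0,
    max_task_duration_s=900,
    decision_scope=frozenset({"draft_reply"}),
)
```

Making the record immutable at runtime is a deliberate choice: widening a bound should go through change review and update the Agent Card, never happen silently in production.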

Step 3: Deploy defense-in-depth containment. Start with the three most impactful layers: execution isolation (containers), continuous monitoring (log every action), and circuit breakers (automatic halt on threshold breach). Add capability restrictions, network isolation, and kill switches in the second phase. Test each layer independently — a circuit breaker that's never been triggered in testing is an assumption, not a control.

Step 4: Create Agent Cards for each system. Document each agent's purpose, autonomy level, tool access, data scope, boundary specifications, containment layers, known limitations, and accountable human. Agent Cards should be living documents updated every time the agent's capabilities change — not static PDFs that were accurate six months ago.
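A living Agent Card can be as simple as a record that refuses to go stale: every capability change appends to a changelog and refreshes the timestamp. The field names below are illustrative assumptions mirroring the documentation items listed above, not a schema from the profile.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AgentCard:
    name: str
    purpose: str
    autonomy_level: int            # L0-L5 on the Berkeley spectrum
    tool_access: list[str]
    data_scope: list[str]
    containment_layers: list[str]
    known_limitations: list[str]
    accountable_human: str
    changelog: list[str] = field(default_factory=list)
    last_updated: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

    def record_capability_change(self, note: str) -> None:
        """Any capability change updates the card, not just the code."""
        self.changelog.append(note)
        self.last_updated = datetime.now(timezone.utc)
```

A card whose `last_updated` predates the agent's most recent deployment is itself an audit finding: the documentation and the system have diverged.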

Step 5: Add agentic risk categories to your risk register. Extend your existing risk register with Berkeley's four new categories: action overreach, tool-chain exploitation, inter-agent cascade, and autonomy without accountability. Assess each agent against all four categories. This step typically surfaces risks that weren't visible under traditional AI risk assessment frameworks — particularly inter-agent cascade risks in organizations running multiple agents that share data or delegate tasks to each other.
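Extending the register can be mechanical: one open assessment entry per agent per category, so nothing slips through unassessed. The register shape (a list of dicts) is an assumption for the sketch; the four category names come from the profile.

```python
# Berkeley's four agentic risk categories (Step 5).
AGENTIC_CATEGORIES = [
    "action overreach",
    "tool-chain exploitation",
    "inter-agent cascade",
    "autonomy without accountability",
]

def extend_register(register: list[dict], agents: list[str]) -> list[dict]:
    """Add one open assessment entry per (agent, agentic category)."""
    for agent in agents:
        for category in AGENTIC_CATEGORIES:
            register.append({"agent": agent, "category": category,
                             "status": "unassessed"})
    return register

register = extend_register([], ["invoice-bot", "faq-helper"])
```

Every entry starts as "unassessed" rather than "low": forcing an explicit assessment per category is what surfaces the inter-agent cascade risks that traditional frameworks miss.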

The AI Controls Toolkit (ACT) Tier 2 translates Berkeley's academic concepts into operational templates — Agent Card templates, bounded autonomy specification sheets, defense-in-depth checklists, and agentic risk assessment worksheets.
Compare toolkit tiers to see what's included.

Frequently Asked Questions

When did UC Berkeley release the Agentic AI Risk Profile?

UC Berkeley's Center for Long-Term Cybersecurity (CLTC) released the Agentic AI Risk-Management Standards Profile on February 11, 2026. It is a 67-page research document that extends NIST AI RMF 1.0 specifically for autonomous AI agents. It is research-based guidance, not a formal standard or regulation.

What is the autonomy spectrum in the Berkeley profile?

The autonomy spectrum classifies AI agent autonomy from L0 (no autonomy, follows instructions exactly) through L5 (full autonomy, sets own goals). Governance requirements scale proportionately: L0 and L1 agents need trust-based controls, L2 and L3 need trust-but-verify approaches, and L4 and L5 agents require full verification-centric governance with maximum containment.

What does bounded autonomy mean for AI agents?

Bounded autonomy means designing agents to operate within explicitly defined operational boundaries: authorized tool and API whitelists, scoped data access permissions, financial authorization caps, task duration limits, and decision scope boundaries. Any action outside bounds triggers human escalation or agent halt through hard stops, not soft warnings. It requires both policy documentation and technical enforcement.

What is defense-in-depth containment for agentic AI?

Defense-in-depth containment layers multiple independent defenses against agent misbehavior: execution environment isolation, capability restrictions, network isolation, continuous monitoring with anomaly detection, circuit breakers for automatic halt on threshold breach, and a kill switch for emergency manual deactivation. The principle is that no single control failure should cause system compromise.

What are Agent Cards?

Agent Cards are structured documentation artifacts — similar to model cards — that capture an agentic system's limitations, provenance, intended use, risk mitigation decisions, and operational boundaries. They serve as the primary documentation tool communicating bounded autonomy design to auditors, regulators, and deployment teams. The Berkeley profile positions them as the agentic counterpart to model cards.

Assess Your AI Governance Readiness

The free assessment covers governance ownership, AI inventory, risk workflow, policy baseline, and evidence readiness across all frameworks. Estimated completion: ~15 minutes. No login required.