When did UC Berkeley release the Agentic AI Risk Profile?

UC Berkeley's Center for Long-Term Cybersecurity (CLTC) released the Agentic AI Risk-Management Standards Profile on February 11, 2026. It is a 67-page research document that extends NIST AI RMF 1.0 specifically for autonomous AI agents. It is research-based guidance, not a formal standard or regulation.

What is the autonomy spectrum in the Berkeley profile?

The autonomy spectrum classifies AI agent autonomy from L0 (no autonomy, follows instructions exactly) through L5 (full autonomy, sets own goals). Governance requirements scale proportionately: L0 and L1 agents need trust-based controls, L2 and L3 need trust-but-verify approaches, and L4 and L5 agents require full verification-centric governance with maximum containment.

What does bounded autonomy mean for AI agents?

Bounded autonomy means designing agents to operate within explicitly defined operational boundaries: authorized tool and API whitelists, scoped data access permissions, financial authorization caps, task duration limits, and decision scope boundaries. Any action outside bounds triggers human escalation or agent halt through hard stops, not soft warnings. It requires both policy documentation and technical enforcement.

What is defense-in-depth containment for agentic AI?

Defense-in-depth containment layers multiple independent defenses against agent misbehavior: execution environment isolation, capability restrictions, network isolation, continuous monitoring with anomaly detection, circuit breakers for automatic halt on threshold breach, and a kill switch for emergency manual deactivation. The principle is that no single control failure should cause system compromise.

What are Agent Cards?

Agent Cards are structured documentation artifacts - similar to model cards - that capture an agentic system's limitations, provenance, intended use, risk mitigation decisions, and operational boundaries. They serve as the primary documentation tool communicating bounded autonomy design to auditors, regulators, and deployment teams. The concept was introduced in Gemini's analysis of the Berkeley profile.

UC Berkeley Agentic AI Risk Profile: Bounded Autonomy 2026

Why UC Berkeley Built a Separate Agentic Profile

UC Berkeley's Center for Long-Term Cybersecurity (CLTC) released the Agentic AI Risk-Management Standards Profile on February 11, 2026. The authors - Nada Madkour, Jessica Newman, Deepika Raman, Krystal Jackson, Evan R. Murphy, and Charlotte Yuan - started from a straightforward observation: NIST AI RMF 1.0 doesn't address autonomous agents because it was written before they existed at enterprise scale.

NIST's framework assumes request-response AI systems. You send a prompt, you get a response, a human evaluates it. Agentic AI breaks every one of those assumptions. Agents plan multi-step action sequences, invoke tools and APIs with production credentials, delegate to other agents, persist memory across sessions, and take actions that cascade through real systems. The risk surface isn't a property of the model alone - it's an emergent property of how the agent is configured, what tools it can access, and what environment it operates in.

Stanford Law's March 2026 analysis put it bluntly: "Kill switches don't work if the agent writes the policy." The implication is that single-layer controls - even well-designed ones - aren't sufficient for systems that can modify their own operational context. Berkeley's profile responds with two core concepts that have become foundational for enterprise agentic governance: bounded autonomy and defense-in-depth containment.

The profile is a 67-page research document, not a regulation. It doesn't carry legal force. But it's already being cited by the US government (CLTC submitted a response to a federal request on AI agent security considerations in March 2026) and by industry groups developing agentic governance standards. For practitioners, the value isn't compliance - it's conceptual clarity. Berkeley gives you a vocabulary and a framework for thinking about agent risk that existing standards don't provide.

Bounded Autonomy: Designing agents to operate within explicitly defined boundaries - tool access, data scope, financial limits, decision scope - with hard stops on out-of-bounds behavior.

Defense-in-Depth: Layering multiple independent defenses so that no single control failure causes system compromise.

Autonomy Spectrum: Agent autonomy classified from L0 (no autonomy) to L5 (full autonomy), with governance requirements scaling proportionately.

Agent Cards: Structured documentation capturing an agent's limitations, intended use, risk mitigations, and operational boundaries - analogous to model cards.

The Autonomy Spectrum (L0–L5)

Berkeley's most immediately useful contribution is the autonomy spectrum. Instead of treating agents as a binary - either autonomous or not - the profile classifies agency across six levels. This classification drives every subsequent governance decision: what controls you need, how frequently humans must intervene, and what containment measures are appropriate.

Level	Name	Characteristics	Governance Approach
L0	No Autonomy	Follows instructions exactly, no deviation	Minimal controls
L1	Task Suggestion	Suggests tasks, user approves each one	User oversight sufficient
L2	Delegated Tasks	Executes pre-approved task sequences	Human approval of each sequence
L3	Planned Execution	Plans multi-step actions, human approves the plan	Human approval of plan before execution
L4	Autonomous Execution	Executes within strict bounds, minimal human intervention	Bounded autonomy design required
L5	Full Autonomy	Sets own goals and executes independently	Maximum containment controls

The governance implications are direct. L0 and L1 agents can operate under trust-based governance - you trust the system and verify occasionally. L2 and L3 agents need trust-but-verify approaches where humans approve plans or sequences before execution. L4 and L5 agents demand verification-centric governance: bounded autonomy design, defense-in-depth containment, continuous monitoring, and circuit breakers.

Most enterprise agents deployed in 2026 operate at L2 or L3. The ones that cause governance headaches are the L4 agents that organizations treat as L2 - agents with tool access, API credentials, and database permissions that were granted incrementally without anyone updating the governance classification.
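
The classification logic above can be sketched in a few lines. This is a minimal illustration, not code from the profile: the enum member names, `governance_tier`, and `has_governance_gap` are hypothetical, and the tier grouping simply follows the L0-L1, L2-L3, L4-L5 split described in the text.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    # Member names are illustrative; level names follow the table above.
    L0_NO_AUTONOMY = 0
    L1_TASK_SUGGESTION = 1
    L2_DELEGATED_TASKS = 2
    L3_PLANNED_EXECUTION = 3
    L4_AUTONOMOUS_EXECUTION = 4
    L5_FULL_AUTONOMY = 5


def governance_tier(level: AutonomyLevel) -> str:
    """Return the governance tier for a level, using the grouping in the text."""
    if level <= AutonomyLevel.L1_TASK_SUGGESTION:
        return "trust-based"
    if level <= AutonomyLevel.L3_PLANNED_EXECUTION:
        return "trust-but-verify"
    return "verification-centric"


def has_governance_gap(actual: AutonomyLevel, governed_as: AutonomyLevel) -> bool:
    """True when an agent is governed under a weaker tier than its actual level needs."""
    return actual > governed_as and governance_tier(actual) != governance_tier(governed_as)


# An L4 agent still governed as L2 is exactly the mismatch described above.
gap = has_governance_gap(AutonomyLevel.L4_AUTONOMOUS_EXECUTION, AutonomyLevel.L2_DELEGATED_TASKS)
```

Using an ordered enum keeps the comparison explicit: the check flags only under-governed agents, not agents that are governed more strictly than they need to be.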

The AI Controls Professional assessment maps every agent in your organization to Berkeley's autonomy spectrum and identifies where governance controls don't match the actual autonomy level.
Compare toolkit tiers to find the right level for your organization.

Bounded Autonomy Design

Bounded autonomy is Berkeley's answer to the question: "How do you let agents act autonomously without losing control?" The answer: define explicit boundaries across five dimensions, enforce them technically (not just in policy), and trigger hard stops - not soft warnings - when an agent attempts to exceed its bounds.

Five Boundary Types

Tool/API Boundary: Whitelist of authorized tools and APIs. The agent can only invoke what's explicitly listed. Any out-of-bounds invocation is blocked - not flagged for review, blocked. Enforcement at the API gateway level, not application level.
Data Access Boundary: Scoped read/write permissions at the database and API level. The agent can read the FAQ database but can't read customer PII. Enforced via database-level permissions, not application logic that the agent could bypass.
Financial Boundary: Monetary caps per transaction and per time period. An agent authorized to process refunds up to $100 can't approve a $500 refund regardless of context. Validation occurs before execution, not after.
Temporal Boundary: Duration limits before mandatory human review. An agent running a data migration can't operate for 12 hours without a human checkpoint. Timer-based escalation, enforced externally.
Decision Scope Boundary: Types of decisions the agent is authorized to make. A customer service agent can resolve billing disputes under $200 but must escalate contract termination requests regardless of dollar amount. Threshold-based human approval for high-impact decision categories.

The critical distinction Berkeley makes: policy documentation alone isn't a bound. If the bound isn't technically enforced - if the agent could theoretically exceed it and only get flagged in a log somewhere - it's not a bound, it's a suggestion. Bounded autonomy requires both the policy artifact (what the agent is allowed to do, documented and auditable) and the technical enforcement (controls that actually prevent out-of-bounds behavior in real time).
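
To make the difference between a bound and a suggestion concrete, here is a minimal sketch of a pre-execution check covering the tool, financial, and decision scope boundaries. All names (`AgentBounds`, `authorize`, `BoundaryViolation`, the tool and decision labels) are hypothetical. The sketch assumes the check runs in a gateway or proxy outside the agent's own process, since a check inside application code the agent controls would not meet the enforcement bar described above. It uses Python 3.9+ type hints.

```python
from dataclasses import dataclass, field


class BoundaryViolation(Exception):
    """Hard stop: the caller must halt or escalate, never log and continue."""


@dataclass(frozen=True)
class AgentBounds:
    # Illustrative policy artifact; field names are assumptions.
    allowed_tools: frozenset[str]
    max_amount_usd: float
    escalate_always: frozenset[str] = field(default_factory=frozenset)


def authorize(bounds: AgentBounds, tool: str, decision_type: str, amount_usd: float = 0.0) -> None:
    """Validate a proposed action before execution; raise on any out-of-bounds attempt."""
    if tool not in bounds.allowed_tools:
        raise BoundaryViolation(f"tool not whitelisted: {tool}")
    if decision_type in bounds.escalate_always:
        raise BoundaryViolation(f"decision type requires human approval: {decision_type}")
    if amount_usd > bounds.max_amount_usd:
        raise BoundaryViolation(f"amount {amount_usd} exceeds cap {bounds.max_amount_usd}")


bounds = AgentBounds(
    allowed_tools=frozenset({"faq_search", "issue_refund"}),
    max_amount_usd=100.0,
    escalate_always=frozenset({"contract_termination"}),
)

# Within bounds: returns None and the action may proceed.
authorize(bounds, "issue_refund", "refund", amount_usd=80.0)
```

Raising an exception rather than returning a flag is the design point: the default outcome of an out-of-bounds attempt is that nothing executes.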

Defense-in-Depth Containment

Defense-in-depth is a security concept borrowed from military doctrine and applied to network security for decades. Berkeley adapts it for agentic AI: layer multiple independent defenses so that no single control failure allows an agent to cause harm. The operative word is "independent" - each layer must function even if adjacent layers fail.

Six-Layer Defense Stack

Layer 1 - Execution Environment Isolation: Run agents in sandboxed containers (VM, Docker, or cloud isolation). Prevent any agent process from accessing host-level resources, other tenants, or production infrastructure outside its scope.
Layer 2 - Capability Restrictions: No shell access, no kernel-level operations, no filesystem writes outside designated directories. The agent can do what you've whitelisted; everything else is denied by default.
Layer 3 - Network Isolation: Agents communicate only with approved endpoints. No arbitrary HTTP requests, no DNS resolution outside the whitelist, no lateral movement between services.
Layer 4 - Continuous Monitoring: Log every action, tool invocation, and API call. Anomaly detection compares agent behavior against baseline patterns. Alert on deviations - don't wait for incident reports.
Layer 5 - Circuit Breakers: Automatic halt when predefined thresholds are breached: too many API calls per minute, financial authorization exceeded, error rate above baseline, or any action flagged by anomaly detection. Circuit breakers fire automatically - they don't wait for human review.
Layer 6 - Kill Switch: Emergency manual deactivation. A human can terminate any agent immediately with a single action. The kill switch must be tested regularly - a kill switch that hasn't been tested is an assumption, not a control.
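
As an illustration of Layer 5, the sketch below shows a circuit breaker that halts an agent when its call rate exceeds a threshold within a sliding time window. It is a simplified example under stated assumptions, not an implementation from the profile: the class name, the single rate-based threshold, and the injectable `clock` parameter (added so the breaker can be exercised in containment tests) are all hypothetical.

```python
import time
from collections import deque


class CircuitBreaker:
    """Halts an agent when calls inside a sliding window exceed a threshold."""

    def __init__(self, max_calls: int, window_seconds: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so tests can control time
        self.calls = deque()        # timestamps of recent calls
        self.tripped = False

    def record_call(self) -> None:
        """Record one tool/API call; raise if the breaker is open or just tripped."""
        if self.tripped:
            raise RuntimeError("circuit open: agent halted pending human reset")
        now = self.clock()
        self.calls.append(now)
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] > self.window_seconds:
            self.calls.popleft()
        if len(self.calls) > self.max_calls:
            self.tripped = True
            raise RuntimeError("circuit open: call-rate threshold breached")

    def reset(self) -> None:
        """Manual reset by a human operator after review."""
        self.calls.clear()
        self.tripped = False
```

The breaker stays open until a human calls `reset()`, which matches the idea that the halt is automatic but the recovery is not. A production version would also watch the other thresholds named above, such as error rate and financial authorization.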

Berkeley recommends treating high-capability agents (L4 and L5) as untrusted entities - not because you assume malice, but because you prepare for worst-case behavior. This framing is borrowed from zero-trust network architecture and it applies directly: the agent proves its trustworthiness through containment, not through promises in its system prompt.

A common objection from engineering teams: "We built this agent. We know what it does. Why treat it as untrusted?" The answer is that you built the agent's configuration, not the foundation model driving its reasoning. You don't control how the model interprets edge cases, ambiguous instructions, or novel input combinations. Defense-in-depth protects against the gap between your intent and the agent's behavior - which, in production at scale, is always larger than you expect.

For SMEs, a practical starting point: implement layers 1, 4, and 5 first (execution isolation, continuous monitoring, and circuit breakers). These three layers catch the majority of incidents. Add layers 2, 3, and 6 (capability restrictions, network isolation, and kill switch) in a second phase once the monitoring baseline is established and you know what "normal" agent behavior looks like.

Figure: Six-layer defense-in-depth containment for agentic systems

New Risk Categories for Agentic AI

Berkeley identifies risk categories that are unique to agentic systems - risks that don't exist in traditional AI and aren't addressed in NIST AI RMF 1.0. These categories overlap with but aren't identical to the OWASP Agentic Top 10 (which focuses on security vulnerabilities specifically).

Action Overreach: The agent takes actions beyond its intended scope - not through malice but through misinterpretation of goals, ambiguous instructions, or optimizing for a proxy metric that leads to unintended consequences.
Tool-Chain Exploitation: An agent with access to multiple tools chains them in ways the designers didn't anticipate. For example, using a web search tool to find credentials, then using those credentials via an API tool to access restricted data.
Inter-Agent Cascade: When agents delegate to other agents, a failure or misalignment in one agent cascades through the delegation chain. The orchestrating agent may not detect that a sub-agent has deviated from intended behavior.
Autonomy Without Accountability: The agent operates at L4 or L5 autonomy but the governance structure still assumes L1 or L2 oversight. No one is monitoring, no one is accountable, and incidents are discovered through consequences rather than detection.

Each risk category maps to specific NIST AI RMF subcategories that Berkeley extends. For example, "Action Overreach" maps to MAP 2.2 (risk identification) but requires additional subcategory-level specificity that the base framework doesn't provide. The OWASP Agentic Top 10 guide covers the security-specific risks (ASI01–ASI10) in detail.

Mapping to ISO 42001 and NIST AI RMF

Berkeley's profile explicitly extends NIST AI RMF. The table below maps each core concept to both NIST subcategories and ISO 42001 controls, showing where the profile supplements existing standards and where it introduces entirely new requirements.

Berkeley Concept	ISO 42001	NIST AI RMF	Implementation Actions
Autonomy Spectrum (L0–L5)	A.6.1 (scope)	MAP 2.2, GOVERN 1.4	Classify every agent by autonomy level; match governance controls to classification
Bounded Autonomy	A.6.1, A.4.3, A.9.2	MAP 1.1, MAP 2.2	Document bounds per agent; implement technical enforcement for all five boundary types
Defense-in-Depth	A.6.2, A.10	MANAGE 2.1, MANAGE 3.1	Deploy 6-layer containment; test each layer independently; schedule kill-switch drills
Harm Pathway Analysis	Clause 6.1 (risk)	MAP 3.1, MAP 3.3	Map causal chains from agent action to real-world harm; prioritize controls by pathway severity
Agent Cards	A.8 (communication)	GOVERN 4.2	Create Agent Card documentation per system; update on every capability change
Agentic Risk Categories	Not addressed	Extended MAP 2.2	Add action overreach, tool-chain, cascade, and accountability gap to risk register

ISO 42001 covers about 40% of Berkeley's recommendations through existing Annex A controls. The remaining 60% - especially autonomy classification, defense-in-depth, and agentic risk categories - requires supplementary governance that the standard doesn't address.
See the full ISO 42001 implementation guide for base-layer controls.

UC Berkeley vs Singapore IMDA

Berkeley and IMDA are the two purpose-built agentic AI governance frameworks released in 2026. They address different aspects of the same problem, and most organizations will need both.

Dimension	UC Berkeley CLTC	Singapore IMDA MGF
Primary Focus	Technical risk management	Operational governance
Origin	Academic research	Government regulatory body
Core Concepts	Bounded autonomy, defense-in-depth, autonomy spectrum	Four dimensions, meaningful human accountability
Answers the Question	"What technical controls do I need?"	"What governance structure do I need?"
Multi-Agent	Addresses delegation chains	Limited coverage
Implementation Specificity	Low (conceptual, academic)	Moderate (artifact-level)
Regulatory Trajectory	Cited by US government	Adopted as international baseline

The complementarity is straightforward: IMDA tells you what governance to build (accountability structures, disclosure requirements, lifecycle processes). Berkeley tells you what technical controls to implement (containment layers, boundary enforcement, autonomy classification). Use both. See the full IMDA implementation guide for the governance layer.

Practical Implementation Steps for SMEs

Berkeley's profile is academic in tone and conceptual in structure. Translating it into operational controls requires a structured approach. Here's how an SME with 3 to 10 agents can implement bounded autonomy and defense-in-depth in 6 to 8 weeks.

Common Mistakes to Avoid

Organizations that attempt to implement Berkeley's concepts without structured guidance consistently make the same errors:

Classifying agents by intent rather than capability: An agent deployed as an L2 assistant but granted L4-level API credentials is an L4 agent from a governance perspective. Classify based on what the agent can actually do, not what you designed it to do.
Defense-in-depth with only two or three layers: Berkeley recommends six or more independent layers. Organizations commonly implement monitoring and a kill switch and call it "defense-in-depth." That's two layers, not six, and they aren't independent if both rely on the same logging infrastructure.
Assuming L5 agents are ever acceptable without maximum containment: No current governance framework endorses deploying fully autonomous agents (L5) in production without comprehensive containment. If your agent sets its own goals and executes independently, maximum containment isn't optional.
Not testing containment mechanisms: Defense-in-depth requires regular drills, not just documentation. A circuit breaker that's never been triggered in testing might fail silently in production. Schedule quarterly containment tests for every L3+ agent.
Treating the profile as purely academic: Berkeley's concepts are research-framed but operationally applicable. "Bounded autonomy" translates directly to documented operational bounds with technical enforcement. "Defense-in-depth" translates directly to layered security architecture. Don't dismiss it as theory.
Skipping harm pathway analysis: Berkeley's harm pathway methodology maps causal chains from agent actions to real-world consequences. Skipping this analysis means you're assessing risk at the capability level rather than the impact level - and missing the compound risks from tool-chaining and delegation.
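
One way to picture harm pathway analysis is as path enumeration over a small causal graph. The sketch below is an assumption about how such a mapping could be represented, not Berkeley's methodology: the function name, the graph encoding, and the node labels are hypothetical, and the example chain reuses the tool-chaining scenario described earlier (web search, then credentials, then restricted data).

```python
def harm_pathways(edges: dict[str, list[str]], start: str, harms: set[str]) -> list[list[str]]:
    """Enumerate every acyclic causal chain from an agent action to a harm node."""
    paths = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node in harms:
            paths.append(path)
            continue
        for nxt in edges.get(node, []):
            if nxt not in path:  # skip cycles
                stack.append(path + [nxt])
    return paths


# Hypothetical causal graph: each key can lead to the listed consequences.
edges = {
    "web_search": ["credentials_found", "public_answer"],
    "credentials_found": ["api_call_with_credentials"],
    "api_call_with_credentials": ["restricted_data_exposed"],
}

chains = harm_pathways(edges, "web_search", {"restricted_data_exposed"})
```

Each returned chain is a candidate pathway to prioritize: a control placed on any link in the chain interrupts that route to harm.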

Step-by-Step Implementation

Step 1: Classify every agent on the autonomy spectrum. Map each agent to L0 through L5. Be honest - if an agent has API credentials and can take actions without per-action human approval, it's at least L3 regardless of what the deployment documentation says. Classify based on what the agent can actually do, not what you intended it to do.
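
The capability-over-intent rule in Step 1 can be expressed as a floor on the documented level. This is a minimal sketch assuming only the one rule stated above (API credentials plus no per-action approval implies at least L3); `Capabilities` and `level_to_govern_at` are hypothetical names, and a real inventory would check more capabilities than these two.

```python
from dataclasses import dataclass


@dataclass
class Capabilities:
    # Illustrative capability flags; a real inventory would track more.
    has_api_credentials: bool
    acts_without_per_action_approval: bool


def level_to_govern_at(caps: Capabilities, documented_level: int) -> int:
    """Return the autonomy level to govern at: never below what capabilities imply."""
    implied_floor = 3 if (caps.has_api_credentials and caps.acts_without_per_action_approval) else 0
    return max(documented_level, implied_floor)


# Documented as L2, but the capabilities imply at least L3.
effective = level_to_govern_at(Capabilities(True, True), documented_level=2)
```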

Step 2: Define bounded autonomy specifications. For each L3+ agent, document all five boundary types: tool/API whitelist, data access scope, financial limits, temporal constraints, and decision scope boundaries. Each boundary needs both a policy artifact (documented, auditable) and a technical enforcement mechanism (code that actually prevents violations).
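
A simple completeness check can catch specifications where a boundary has a policy but no enforcement, or the reverse. The sketch below assumes a hypothetical spec format (a dict keyed by boundary type with `policy` and `enforcement` entries); the key names and example values are illustrative only.

```python
# The five boundary types named in the text; key names are assumptions.
BOUNDARY_TYPES = ("tool_api", "data_access", "financial", "temporal", "decision_scope")


def incomplete_boundaries(spec: dict[str, dict[str, str]]) -> list[str]:
    """Return boundary types that lack a policy artifact or an enforcement mechanism."""
    return [
        boundary
        for boundary in BOUNDARY_TYPES
        if not spec.get(boundary, {}).get("policy")
        or not spec.get(boundary, {}).get("enforcement")
    ]


# Hypothetical spec for a refund agent.
refund_agent_spec = {
    "tool_api": {"policy": "tool whitelist v3", "enforcement": "API gateway allowlist"},
    "data_access": {"policy": "FAQ read-only", "enforcement": "database role permissions"},
    "financial": {"policy": "refunds up to $100", "enforcement": ""},  # documented, not enforced
}

missing = incomplete_boundaries(refund_agent_spec)
```

In this example the financial bound is documented but unenforced, and the temporal and decision scope bounds are missing entirely, so all three are reported.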

Step 3: Deploy defense-in-depth containment. Start with the three most impactful layers: execution isolation (containers), continuous monitoring (log every action), and circuit breakers (automatic halt on threshold breach). Add capability restrictions, network isolation, and kill switches in the second phase. Test each layer independently - a circuit breaker that's never been triggered in testing is an assumption, not a control.

Step 4: Create Agent Cards for each system. Document each agent's purpose, autonomy level, tool access, data scope, boundary specifications, containment layers, known limitations, and accountable human. Agent Cards should be living documents updated every time the agent's capabilities change - not static PDFs that were accurate six months ago.
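
An Agent Card can be kept as structured data rather than a static document, which makes staleness checkable. The sketch below is one possible shape, assuming the fields listed in Step 4; the profile does not prescribe this schema, and `AgentCard`, `undocumented_tools`, and the example values are hypothetical. It uses Python 3.9+ type hints.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AgentCard:
    # Fields mirror the items listed in Step 4; the schema itself is an assumption.
    name: str
    purpose: str
    autonomy_level: int              # 0-5 on the autonomy spectrum
    tool_access: set[str]
    data_scope: list[str]
    boundary_specs: dict[str, str]
    containment_layers: list[str]
    known_limitations: list[str]
    accountable_human: str
    last_updated: date = field(default_factory=date.today)


def undocumented_tools(card: AgentCard, deployed_tools: set[str]) -> set[str]:
    """Tools the running agent can call that the card does not list (the card is stale)."""
    return deployed_tools - card.tool_access


card = AgentCard(
    name="refund-assistant",
    purpose="Resolve billing disputes",
    autonomy_level=3,
    tool_access={"faq_search", "issue_refund"},
    data_scope=["FAQ database (read)"],
    boundary_specs={"financial": "refunds up to $100"},
    containment_layers=["execution isolation", "monitoring", "circuit breakers"],
    known_limitations=["No contract decisions"],
    accountable_human="support-operations lead",
)

stale = undocumented_tools(card, {"faq_search", "issue_refund", "send_email"})
```

Comparing the card against the deployed tool list on every release is one way to keep it a living document.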

Step 5: Add agentic risk categories to your risk register. Extend your existing risk register with Berkeley's four new categories: action overreach, tool-chain exploitation, inter-agent cascade, and autonomy without accountability. Assess each agent against all four categories. This step typically surfaces risks that weren't visible under traditional AI risk assessment frameworks - particularly inter-agent cascade risks in organizations running multiple agents that share data or delegate tasks to each other.
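
The register extension in Step 5 can be sketched as a coverage check: every agent should have an assessment recorded for each of the four categories. The register format, the `unassessed` function, and the agent names below are hypothetical; only the four category names come from the text. It uses Python 3.9+ type hints.

```python
from enum import Enum


class AgenticRisk(Enum):
    ACTION_OVERREACH = "action overreach"
    TOOL_CHAIN_EXPLOITATION = "tool-chain exploitation"
    INTER_AGENT_CASCADE = "inter-agent cascade"
    AUTONOMY_WITHOUT_ACCOUNTABILITY = "autonomy without accountability"


def unassessed(register: dict[str, dict[AgenticRisk, str]]) -> dict[str, list[AgenticRisk]]:
    """For each agent, list the categories that have no assessment recorded yet."""
    gaps = {}
    for agent, entries in register.items():
        missing = [risk for risk in AgenticRisk if risk not in entries]
        if missing:
            gaps[agent] = missing
    return gaps


# Hypothetical register: agent name -> {category: assessment note}.
register = {
    "refund-assistant": {risk: "assessed" for risk in AgenticRisk},
    "migration-runner": {AgenticRisk.ACTION_OVERREACH: "medium, financial cap in place"},
}

gaps = unassessed(register)
```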

The AI Controls Professional translates Berkeley's academic concepts into operational templates - Agent Card templates, bounded autonomy specification sheets, defense-in-depth checklists, and agentic risk assessment worksheets.
Compare toolkit tiers to see what's included.
