[Figure: the four interlocking functions of the NIST AI RMF. GOVERN sits at the center and drives MAP, MEASURE, and MANAGE.]

Understanding the AI RMF Architecture Before You Start

The NIST AI Risk Management Framework isn't a checklist. It's a methodology for thinking about AI risk systematically. That distinction matters because most organizations treat it as a compliance exercise—"check off the 72 subcategories"—and miss the point entirely. The framework is designed to be iterative, contextual, and proportionate to risk. A low-risk internal scheduling tool doesn't need the same treatment as a credit-scoring model that affects people's lives.

The framework is built on four core functions: GOVERN (organizational policies, culture, and accountability), MAP (context, impacts, and risk identification), MEASURE (metrics, testing, and evaluation), and MANAGE (risk treatment, monitoring, and incident response). GOVERN is the cross-cutting function—it doesn't sit in a sequence with the other three. It informs and integrates with all of them continuously. Think of GOVERN as the operating system and MAP, MEASURE, MANAGE as the applications running on it.

Within these four functions sit 19 categories and 72 subcategories. Each subcategory describes a specific outcome the organization should work toward, not a specific control to implement. That's a critical distinction from ISO 42001, where Annex A controls are prescriptive. NIST tells you what to achieve; your organization decides how to achieve it.

The framework also introduces seven trustworthiness characteristics that cut across everything: valid and reliable, safe, secure and resilient, accountable and transparent, explainable and interpretable, privacy-enhanced, and fair with harmful bias managed. Every subcategory ultimately serves one or more of these characteristics. When you're prioritizing which subcategories to implement first, these characteristics are your decision filter.

One more concept worth understanding before you start: Profiles. A current profile documents where your organization stands today against the framework. A target profile documents where you want to be. The gap between the two becomes your implementation roadmap. Without profiles, you've got no baseline and no way to measure progress. The Profiles concept is borrowed from the NIST Cybersecurity Framework and works exactly the same way here—you can create profiles at the organizational level, per business unit, or per AI system depending on your governance granularity.
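
To make the profile gap concrete, here is a minimal Python sketch of a gap analysis, assuming an illustrative 0–3 maturity scale and a handful of the priority subcategories from this guide; the scores are placeholders, not framework requirements:

```python
# Minimal profile gap analysis. The 0-3 maturity scale and the scores
# below are illustrative placeholders, not part of the framework.
current_profile = {"GV.1.1": 1, "GV.2.1": 0, "MP.2.3": 2, "MS.2.5": 0, "MG.3.1": 1}
target_profile  = {"GV.1.1": 3, "GV.2.1": 2, "MP.2.3": 3, "MS.2.5": 2, "MG.3.1": 3}

# Gap per subcategory: target minus current (missing subcategories count as 0).
gaps = {sub: target_profile[sub] - current_profile.get(sub, 0)
        for sub in target_profile}

# Largest gaps first: this ordering becomes the implementation roadmap.
for sub, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{sub}: gap {gap}")
```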

Starting from zero? The free readiness assessment gives you a current-state snapshot across all four functions in about 15 minutes. No login required.

GOVERN — Building the Organizational Foundation

GOVERN contains six categories covering the organizational infrastructure for AI risk management. This is where most implementations should start, because without governance structure, the other three functions have no anchor. You can't identify risks (MAP) or measure them (MEASURE) if nobody owns the process, nobody has set risk tolerances, and there's no AI policy to guide decisions.

The highest-priority subcategories for SMEs are GV.1.1 (AI risk management policies established), GV.2.1 (roles and responsibilities assigned), GV.4.1 (risk tolerances defined), and GV.5.2 (third-party AI risk management policies). If you're consuming vendor AI APIs—which most SMEs are—GV.5.2 is non-negotiable. You need documented expectations for how third-party AI providers manage risk on your behalf, and contracts that enforce those expectations.

Evidence artifacts for GOVERN include: AI governance policy, RACI matrix with named individuals, risk appetite statement specifying acceptable AI risk levels, third-party AI vendor inventory with risk assessments, and training records showing staff understand the governance framework. The policy doesn't need to be long—2–4 pages covering scope, principles, risk tolerance, roles, and review cadence is sufficient for most SMEs.

A common failure pattern in GOVERN: organizations create the policy but don't operationalize it. The policy says "all AI systems require risk assessment before deployment," but nobody enforces it, and two months later the engineering team deploys a new chatbot without telling the governance lead. The fix is to embed GOVERN into existing workflows—add an AI risk checkpoint to your change management process, include AI governance as a standing item in management meetings, and make the RACI matrix a living document that people actually reference.
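
One way to enforce that checkpoint is a deployment gate in your release pipeline that refuses to ship any system missing a completed risk assessment in the inventory. A minimal sketch, assuming a hypothetical JSON inventory file and field names:

```python
# Deployment gate: block release of any AI system that is missing from the
# risk-assessed inventory. The JSON file format and field names are
# illustrative assumptions.
import json
import sys

def check_ai_governance_gate(system_id: str,
                             inventory_path: str = "ai_inventory.json") -> None:
    with open(inventory_path) as f:
        inventory = {e["system_id"]: e for e in json.load(f)}

    entry = inventory.get(system_id)
    if entry is None:
        sys.exit(f"BLOCKED: {system_id} is not in the AI inventory. "
                 "Register it and complete a risk assessment before deploying.")
    if entry.get("risk_assessment_status") != "complete":
        sys.exit(f"BLOCKED: {system_id} has no completed risk assessment.")
    print(f"OK: {system_id} passed the AI governance checkpoint.")

if __name__ == "__main__":
    check_ai_governance_gate(sys.argv[1])
```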

GV.3 (workforce diversity) is one category that surprises organizations. NIST explicitly recommends that AI risk management teams include people with diverse backgrounds, disciplines, and perspectives—not just engineers. If your AI governance programme is run entirely by your technical team, you're missing the sociotechnical dimension that NIST considers fundamental. Include legal, compliance, product, customer-facing, and domain expert perspectives in your governance structure.

MAP — Understanding Your AI Context and Risks

MAP is where you figure out what you've actually got. Five categories cover context analysis, system categorization, risk-benefit evaluation, risk tolerance per system, and interdependencies. The output of MAP is a complete picture of your AI environment—what systems exist, what risks they carry, who they affect, and how they interact with each other and with your broader business processes.

For SMEs, the critical subcategories are MP.2.3 (AI system categorization by risk level) and MP.3.5 (impact assessment for systems that affect individuals). Every AI system in your inventory needs a risk categorization—not just "high/medium/low" but a structured assessment considering autonomy level, data sensitivity, decision impact, and affected population. Systems that make or influence decisions about people (hiring, lending, pricing, access to services) need a full stakeholder impact assessment.

The practical deliverable from MAP is an AI system inventory with per-system risk profiles. Each entry should include: system name and purpose, data inputs and outputs, risk categorization, affected stakeholders, responsible owner, and identified risks with impact ratings. This inventory becomes the foundation for everything in MEASURE and MANAGE. If it's incomplete or inaccurate, everything downstream is compromised.
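
A machine-readable shape for those entries makes the reviews and downstream MEASURE work much easier. One possible schema, sketched in Python with illustrative field names (nothing here is prescribed by NIST):

```python
# One possible schema for an AI system inventory entry. Field names and
# the risk enum are illustrative assumptions, not prescribed by NIST.
from dataclasses import dataclass, field
from enum import Enum

class RiskLevel(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

@dataclass
class AISystemEntry:
    name: str
    purpose: str
    data_inputs: list[str]
    data_outputs: list[str]
    risk_level: RiskLevel
    affected_stakeholders: list[str]
    owner: str  # named individual, per the RACI matrix
    identified_risks: dict[str, str] = field(default_factory=dict)  # risk -> impact rating

inventory = [
    AISystemEntry(
        name="support-chatbot",
        purpose="Answer customer support questions",
        data_inputs=["customer messages"],
        data_outputs=["generated replies"],
        risk_level=RiskLevel.MEDIUM,
        affected_stakeholders=["customers", "support team"],
        owner="Jane Doe",
        identified_risks={"confabulation": "medium", "prompt injection": "high"},
    )
]
```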

One subtlety that catches SMEs: MAP isn't a one-time activity. Every time you deploy a new AI system, change an existing one's scope, or integrate a new third-party AI service, you need to re-run the relevant MAP subcategories. The AI inventory should be treated like an asset register—it's a living document with change control, not a project deliverable you file and forget. Set a quarterly review cadence as a minimum, with event-triggered updates for material changes.

MEASURE — Evaluating Your AI Systems

MEASURE answers the question: how do you know your AI systems are actually performing as intended and not causing harm? Four categories cover assessment practices; risk evaluation; test, evaluation, verification, and validation (TEVV); and risk prioritization. This is where technical evaluation meets governance—you're producing evidence that your systems are trustworthy, not just hoping they are.

The subcategories that matter most for SMEs are MS.2.5 (bias and fairness evaluation), MS.2.6 (explainability assessment), and MS.3.3 (red-teaming and adversarial testing). Bias evaluation requires testing your models against protected characteristics and documenting the results—even if the results aren't perfect, the documentation of what you tested and what you found is the evidence regulators want to see. Explainability assessment means you can describe, in terms your stakeholders understand, how the system reaches its outputs. Red-teaming means stress-testing the system against adversarial inputs before deployment.

Evidence artifacts for MEASURE include: bias evaluation reports with methodology and findings, performance benchmark documentation, pre-deployment test results (including adversarial scenarios), explainability documentation per system, and a documented TEVV schedule showing ongoing evaluation rather than one-time testing. The frequency of evaluation should be proportionate to risk—high-risk systems might need quarterly review, while low-risk tools might only need annual checks.

A practical approach for resource-constrained SMEs: don't try to build a custom testing infrastructure from day one. Leverage existing open-source fairness toolkits (AI Fairness 360, Fairlearn) for bias evaluation, document the methodology you used and the thresholds you applied, and keep the results in a structured format that an auditor could review. The evidence doesn't need to be perfect—it needs to be consistent, documented, and proportionate to the system's risk level. A three-page bias evaluation report for a customer-facing credit model is reasonable. A one-paragraph note for an internal meeting scheduler is equally appropriate given the risk difference.
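
As an example of what that looks like with Fairlearn (one of the toolkits named above), the sketch below computes per-group selection rates and the demographic parity difference; the data and group labels are tiny placeholders:

```python
# Disaggregated selection rates by group using Fairlearn, plus the
# demographic parity difference. The data here is a tiny placeholder.
from fairlearn.metrics import MetricFrame, demographic_parity_difference, selection_rate

y_true = [1, 0, 1, 1, 0, 0, 1, 0]                  # actual outcomes
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]                  # model decisions
group  = ["A", "A", "A", "B", "B", "B", "B", "A"]  # protected characteristic

frame = MetricFrame(
    metrics={"selection_rate": selection_rate},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=group,
)
print(frame.by_group)  # selection rate per group -> goes in the evidence report

dpd = demographic_parity_difference(y_true, y_pred, sensitive_features=group)
print(f"Demographic parity difference: {dpd:.2f}")  # compare to documented threshold
```

The per-group output and the parity difference, together with the thresholds you applied, are exactly the methodology-plus-findings content the bias evaluation report should capture.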

MANAGE — Treating and Monitoring AI Risks

MANAGE is where risk treatment happens. Four categories cover risk treatment decisions, continuous monitoring, incident response and recovery, and residual risk acceptance. This is also the function with the strongest connection to the Colorado AI Act safe harbor—your ability to demonstrate ongoing MANAGE activities is the primary evidence that you're actively governing AI risk, not just documenting it.

MG.2.2 (continuous monitoring of deployed AI systems) and MG.3.1 (AI-specific incident response plan) are the highest-priority subcategories. Monitoring means tracking model performance, drift, bias indicators, and anomalous behavior in production—not just checking that the system is "up." Incident response means having a documented playbook specifically for AI-related incidents: what happens when a model makes a discriminatory decision, when an agent takes an unauthorized action, or when a data pipeline introduces corrupted training data.
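
To show what "monitoring drift" can mean in practice, the sketch below compares recent production inputs against a training-time reference window with a two-sample test and alerts when they diverge. The data is synthetic, and the test choice and threshold are assumptions to tune per system:

```python
# Simple input-drift check: compare a production feature sample against the
# training-time reference with a two-sample KS test. The alert threshold is
# an illustrative assumption; tune it per system and document it in the
# monitoring procedure.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)   # training-time feature values
production = rng.normal(loc=0.3, scale=1.0, size=1_000)  # recent production values

stat, p_value = ks_2samp(reference, production)
if p_value < 0.01:  # drift alert threshold (assumption)
    print(f"DRIFT ALERT: KS statistic={stat:.3f}, p={p_value:.2e} - investigate and log.")
else:
    print("No significant input drift detected.")
```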

MANAGE also covers decommissioning. When you retire an AI system, you need documented evidence of how data was handled, which users were notified, and how dependencies were unwound. Most organizations forget this until an auditor asks for it. Build the decommissioning procedure now, even if you don't need it yet.

Evidence artifacts for MANAGE include: risk treatment records with documented decisions (accept, mitigate, transfer, avoid), monitoring dashboards showing ongoing system health, AI incident response plan, incident logs with root cause analysis, corrective action records, and decommissioning records for retired systems.

The overall implementation sequence follows a 90-day rollout pattern for most SMEs. Weeks 1–4 focus on GOVERN (policy, roles, inventory). Weeks 5–8 tackle MAP and MEASURE (impact assessment, metrics, testing). Weeks 9–12 complete MANAGE (incident playbooks, monitoring setup, decommissioning procedures). This isn't rigid—you'll iterate across functions as you go—but it gives you a defensible baseline in three months. After the initial 90 days, the framework operates as a continuous cycle: GOVERN drives policy updates, MAP identifies new risks, MEASURE evaluates them, and MANAGE treats them. The cycle never stops.

Building your AI incident response plan? The AI Controls Toolkit (ACT) Tier 2 Professional includes an AI-specific IR playbook template covering goal hijack, unauthorized tool invocation, model drift, bias detection, and data pipeline incidents.

Priority Subcategories with Evidence Checklists

You don't need to implement all 72 subcategories on day one. Start with the highest-impact items per function and expand from there. This table shows the priority subcategories for an SME deploying 3–10 AI systems.

| Function | Subcategory | Outcome | Evidence Artifact |
|----------|-------------|---------|-------------------|
| GOVERN | GV.1.1 | AI risk management policies established | AI governance policy (2–4 pages) |
| GOVERN | GV.2.1 | Roles and responsibilities assigned | RACI matrix with named individuals |
| GOVERN | GV.4.1 | Risk tolerances defined | Risk appetite statement |
| GOVERN | GV.5.2 | Third-party AI risk policies | Vendor AI inventory + due diligence records |
| MAP | MP.2.3 | AI systems categorized by risk | AI system inventory with risk profiles |
| MAP | MP.3.5 | Impact assessment for affected individuals | Stakeholder impact assessment per high-risk system |
| MEASURE | MS.2.5 | Bias and fairness evaluated | Bias evaluation report with methodology |
| MEASURE | MS.2.6 | Explainability assessed | Explainability documentation per system |
| MEASURE | MS.3.3 | Red-teaming and adversarial testing | Pre-deployment test results |
| MANAGE | MG.2.2 | AI systems continuously monitored | Monitoring dashboard + alert log |
| MANAGE | MG.3.1 | AI incident response plan | IR playbook + incident log |
| MANAGE | MG.4.1 | Residual risk documented | Risk acceptance memos with sign-off |

NIST AI RMF vs ISO 42001

These aren't competing standards. They're complementary approaches that serve different purposes. NIST gives you a risk management methodology. ISO 42001 gives you a certifiable management system. Most serious AI governance programmes will use both—NIST for the risk thinking, ISO 42001 for the audit-ready structure.

| Dimension | NIST AI RMF 1.0 | ISO/IEC 42001:2023 |
|-----------|-----------------|--------------------|
| Legal status | Voluntary federal framework | International certifiable standard |
| Structure | 4 functions, 19 categories, 72 subcategories | Harmonized Structure (Clauses 4–10) + Annex A (38 controls) |
| Approach | Sociotechnical, risk-informed, outcome-based | Management system, process-based, prescriptive |
| Language | "Should" throughout (descriptive) | "Shall" for clauses, "should" for Annex A (prescriptive) |
| Scope | Broader: includes culture, human factors, societal impact | Narrower: AI systems within defined management system boundary |
| Documentation mandate | Outcome-focused (you decide what to produce) | Explicit "shall retain documented information" requirements |
| Audit mechanism | No formal audit or certification | Stage 1 + Stage 2 certification audit by accredited body |
| SME accessibility | Flexible but ambiguous (no clear "done" state) | Structured but resource-intensive (clear "done" = certified) |
| Colorado safe harbor | Explicitly named as qualifying framework | Explicitly named as qualifying framework |
| GenAI extension | NIST AI 600-1 (July 2024): 12 GenAI risks | Not addressed (standard predates GenAI-specific guidance) |
| Agentic AI coverage | Not addressed (predates agentic frameworks) | Partial via Annex A.9.2 (human oversight) |

The practical takeaway: if you need a certificate for procurement or regulatory purposes, implement ISO 42001. If you need a risk methodology that covers broader societal impacts and human factors, implement NIST AI RMF. If you're serious about AI governance, implement both. The AI Controls Toolkit (ACT) Tier 1 unified controls matrix maps every NIST subcategory to the corresponding ISO 42001 clause, so you don't duplicate effort.

One common mistake: organizations assume they must choose one framework. That's wrong. The two standards use different terminology for overlapping concepts, but the underlying work products are compatible. An AI risk assessment produced for NIST AI RMF GV.1.1 and MP.3.5 can satisfy ISO 42001 Clause 6.1 with minimal adaptation. A bias evaluation report produced for NIST MS.2.5 serves as evidence for ISO 42001 Annex A.7 (data governance). The trick is building your evidence once and mapping it to both frameworks, which is exactly what the AI Controls Toolkit (ACT) Tier 1 unified controls matrix enables.
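
The build-once-map-twice approach can be as simple as keying each evidence artifact to every framework reference it satisfies. A minimal sketch; the data structure is an illustrative assumption, and the mappings shown are the ones discussed above:

```python
# One evidence artifact, many framework references. The structure is an
# illustrative assumption; the mappings are the examples discussed above.
evidence_map = {
    "ai_risk_assessment.pdf": ["NIST GV.1.1", "NIST MP.3.5", "ISO 42001 Clause 6.1"],
    "bias_evaluation_report.pdf": ["NIST MS.2.5", "ISO 42001 Annex A.7"],
}

def evidence_for(framework_ref: str) -> list[str]:
    """Return every artifact that satisfies a given framework reference."""
    return [doc for doc, refs in evidence_map.items() if framework_ref in refs]

print(evidence_for("NIST MS.2.5"))  # -> ['bias_evaluation_report.pdf']
```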

[Figure: 12-week implementation roadmap in three phases: GOVERN first, then MAP and MEASURE, then MANAGE. Twelve weeks from policy to production monitoring.]

The Generative AI Extension (NIST AI 600-1)

Published July 2024, NIST AI 600-1 extends the base AI RMF with 12 risks that are either unique to or significantly exacerbated by generative AI systems. If your organization deploys any LLM, image generator, code assistant, or RAG-based system, this profile should be treated as a mandatory supplement to the base framework.

The 12 GenAI-specific risks are:

- Confabulation (hallucinated outputs presented as fact)
- Harmful bias amplified by training data
- CBRN information risks (dangerous knowledge generation)
- Data privacy violations through memorization or extraction
- Intellectual property infringement
- Generation of obscene or abusive content
- Value chain and component integration risks
- System prompt manipulation and jailbreaking
- Training data poisoning
- Output homogenization reducing diversity
- Environmental impact from compute resources
- Human-AI configuration challenges, where users over-rely on or misunderstand AI outputs

For each risk, the profile provides suggested actions within the GOVERN, MAP, MEASURE, and MANAGE functions. Organizations that implemented the base RMF before July 2024 should conduct a gap assessment against AI 600-1 to identify where their existing controls fall short for generative AI. The most common gaps are around confabulation monitoring, system prompt security, and training data provenance tracking.

The practical implication for SMEs: if you're using any commercial LLM (GPT-4, Claude, Gemini, or similar) through an API, you've inherited at least six of these twelve risks. Confabulation monitoring means you need a process for validating AI-generated outputs before they reach customers. System prompt security means your prompts aren't exposed to end users who could manipulate them. Data privacy means you've verified that your vendor's data handling practices don't allow training on your proprietary data. These aren't theoretical concerns—they're operational requirements that should appear in your GOVERN policies and MANAGE monitoring procedures.
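
As an illustration of a confabulation checkpoint, the sketch below routes ungrounded or overconfident generations to human review before they reach a customer. The checks are crude placeholders; real validation is system-specific (for example, grounding against retrieved sources in a RAG pipeline):

```python
# Illustrative output-validation gate for customer-facing LLM responses.
# Both checks are crude stand-ins: the grounding test is a simple string
# overlap, and the overconfidence markers are placeholder phrases.
BLOCKED_CLAIM_MARKERS = ["guaranteed", "always", "100%"]  # illustrative list

def release_or_escalate(response: str, source_snippets: list[str]) -> str:
    # 1. Grounding: does anything from the retrieved sources appear in the
    #    answer? (A real system would use a proper entailment/citation check.)
    grounded = any(s.lower() in response.lower() for s in source_snippets)
    # 2. Overconfident phrasing that often accompanies confabulation.
    overclaims = any(m in response.lower() for m in BLOCKED_CLAIM_MARKERS)

    if not grounded or overclaims:
        return "ESCALATE: route to human review before sending to customer."
    return response

# Example: an answer with no support in the retrieved sources gets escalated.
print(release_or_escalate("Refunds are always guaranteed.", ["Refunds take 5-7 days."]))
```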

NIST also published a preliminary draft of IR 8596 (Cybersecurity Framework Profile for AI), which addresses cybersecurity-specific risks for AI systems. As of March 2026 this remains in draft status, but organizations in high-security environments should monitor it for inclusion in future implementation cycles.

Colorado AI Act Safe Harbor Mapping

Colorado SB 24-205 (effective June 30, 2026) explicitly names NIST AI RMF compliance as a qualifying framework for its affirmative defense. If you can demonstrate that you've implemented the framework for your high-risk AI systems, you've got a statutory safe harbor against enforcement actions alleging algorithmic discrimination.

The mapping between Colorado obligations and NIST functions is direct. The Act's transparency and accountability requirements map to GOVERN (GV.1, GV.2). Algorithmic impact assessment obligations map to MAP (MP.3.5) and MEASURE (MS.2.5, MS.2.6). Ongoing monitoring, consumer appeal mechanisms, and correction obligations map to MANAGE (MG.2, MG.3). Notification requirements for adverse decisions map across GOVERN and MANAGE.

For deployers specifically, the Colorado Act requires an impact assessment within 90 days of enforcement (by September 28, 2026). That impact assessment must cover the system's purpose, intended benefits, known limitations, risk mitigation measures, and potential for algorithmic discrimination. If you've already produced evidence for MAP MP.3.5 (stakeholder impact assessment) and MEASURE MS.2.5 (bias evaluation), you've got most of the content you need. The Act essentially requires what NIST already recommends—the difference is that Colorado makes it a legal obligation rather than a voluntary best practice.

The critical point: "compliance with NIST AI RMF" in the Colorado context means demonstrable implementation with evidence artifacts—not a slide deck saying "we follow NIST." Regulators will ask for the AI governance policy, the risk assessments, the bias testing reports, the incident logs, and the corrective action records. The evidence table earlier in this guide shows exactly what to produce for each priority subcategory.

Colorado deadline: June 30, 2026. The AI Controls Toolkit (ACT) Tier 1 unified controls matrix includes a Colorado column mapping each NIST subcategory directly to the corresponding Colorado AI Act obligation. The AI Controls Toolkit (ACT) Tier 2 Professional adds the full implementation project plan, policy templates, and evidence templates.
Read the Colorado AI Act compliance guide →