
The Translation Problem: Evidence Requirements and Stakeholder Variation in Educational AI Governance

  • Writer: Ryan James Purdy
  • Mar 19






AI Governance in Education Series

Memorandum No. 3


Ryan James Purdy

Purdy House Publishing & Consulting


December 2025


Working Paper


Abstract

Memorandum No. 1 documented the operational gap in AI governance frameworks for education: the absence of implementation infrastructure despite abundant principles and regulatory requirements. Memorandum No. 2 examined the forcing functions now closing that gap through insurance exclusions, regulatory timelines, and liability exposure. This memorandum addresses the translation layer: the problem of converting governance commitments into the specific evidentiary formats that different stakeholders require. The same governance domain produces different documentation demands depending on whether an insurer, regulator, procurement authority, or board is asking. Aspirational frameworks describe what institutions should value; insurance questionnaires specify what institutions must produce. The resulting fragmentation means an institution can hold an AI policy that simultaneously satisfies its board, fails its insurer's supplemental application, meets state guidance, and stalls vendor procurement. The governance exists; the translation does not.

This memorandum analyzes evidence requirements across four critical domains: transparency and explainability, third-party vendor management, human oversight protocols, and bias testing. It establishes a parallel between the current fragmented state of AI governance assurance and cybersecurity before SOC 2 provided a shared attestation language, suggesting a multi-year trajectory toward standardization. It maps the binding requirements arriving in 2026, including Verisk endorsement availability in January, Colorado CAIA enforcement in June, and EU AI Act high-risk obligations in August, and examines what interim infrastructure institutions require to navigate non-harmonized requirements. The analysis concludes that translation capacity, whether built internally or engaged externally, is necessary during the period when standards have not converged and institutions must nevertheless demonstrate governance to multiple stakeholders with different evidence languages.

Keywords: AI governance, education policy, evidence translation, insurance underwriting, regulatory compliance, SOC 2, ISO 42001, NIST AI RMF, stakeholder variance, attestation frameworks


Key Findings

This memorandum's analysis of evidence translation requirements across stakeholder types yields five principal findings:

The translation problem is structural, not temporary. Different stakeholders evaluate AI governance adequacy against different criteria and require different evidence formats. A policy document that satisfies board oversight may fail an insurer's supplemental application, meet state guidance, and stall vendor procurement simultaneously. The governance exists; the translation does not.

Evidence requirements vary across four critical domains. Transparency and explainability, third-party vendor management, human oversight protocols, and bias testing each produce different documentation demands depending on which stakeholder is asking. Aspirational frameworks describe what institutions should value; insurance questionnaires specify what institutions must produce.

The current period resembles cybersecurity before SOC 2. Fragmented requirements, carrier-specific questionnaires, and the absence of a standardized attestation framework characterized assurance over service organization controls before 2010. The SOC 2 transition took roughly five years from announcement to widespread market recognition. AI governance faces greater complexity and may require longer.

Binding requirements arrive in 2026. January 2026 brings Verisk (ISO) endorsement availability for AI exclusions. June 30, 2026, brings Colorado CAIA enforcement. August 2, 2026, triggers EU AI Act high-risk system obligations. Insurance renewals throughout 2026 will be the first cycle where AI governance documentation materially affects coverage terms.

Interim translation capacity is necessary. Standardization is developing but will not arrive in time to relieve near-term compliance pressure. Institutions must build governance infrastructure that satisfies multiple stakeholders with different evidence languages, either through internal capacity development or external specialist support.


Table of Contents

Abstract

Key Findings

Series Context

Scope, Method, and Limits

1. The Translation Problem

1.1 The Aspirational-Operational Divide

1.2 What Insurers Actually Ask For

1.3 Where Variance Appears

1.4 The Structural Result

2. Evidence Requirements and Stakeholder Variation

2.1 Transparency and Explainability

2.2 Third-Party AI Vendor Management

2.3 Human Oversight Protocols

2.4 Bias Testing and Fairness Audits

2.5 Summary

3. The Pre-SOC 2 Parallel

3.1 The Pre-Standardization Problem

3.2 The SOC 2 Resolution

3.3 The Current State of AI Assurance

3.4 The Parallel and Its Limits

4. The Interim Infrastructure Period

4.1 Current Fragmentation

4.2 Timeline of Binding Requirements

4.3 Likely Convergence Path

4.4 The Interim Imperative

Conclusion

References

About the Author

About Purdy House Institute


Series Context

This memorandum is the third in a series examining AI governance requirements for educational institutions. Memorandum No. 1, "The Operational Gap in Educational AI Governance," established that existing AI governance frameworks are aspirational or regulatory but not operational, leaving institutions without implementation infrastructure.1 Memorandum No. 2, "The Forcing Function," documented how insurance market exclusions, loss signals, and regulatory timelines are converging to compel governance adoption.2 This memorandum addresses the translation layer: how institutions convert governance commitments into the specific evidentiary formats that different stakeholders require.

Scope, Method, and Limits

Scope. This memorandum examines evidence translation requirements for educational institutions and EdTech vendors navigating AI governance demands from multiple stakeholders. The analysis addresses the education sector specifically, focusing on multi-stakeholder evidence translation rather than implementation procedures.

Method. The analysis synthesizes aspirational frameworks (UNESCO Recommendation on the Ethics of AI, OECD AI Principles, NIST AI Risk Management Framework), binding regulatory instruments (EU AI Act, Colorado SB 24-205), and insurance market artifacts (Verisk ISO filings, carrier endorsements, underwriter questionnaire patterns). Evidence categories are derived from Memorandum No. 2's analysis of insurance underwriting requirements.

Limits. This memorandum does not constitute legal advice, audit opinion, or insurance coverage opinion. It does not provide implementation templates or operational procedures. Operational infrastructure requires sector-specific development beyond the scope of this working paper series.


1. The Translation Problem

Educational institutions now face AI governance demands from multiple directions simultaneously. Insurers request documentation for underwriting decisions. Regulators issue guidance with varying degrees of specificity. Procurement authorities require vendor attestations. Boards seek assurance that institutional exposure is managed. Parents expect transparency about how algorithmic systems affect their children. Each stakeholder speaks a different evidence language, and each evaluates governance adequacy against different criteria.

The problem confronting schools and vendors is not an absence of AI governance. Many institutions have adopted AI policies of some kind, whether standalone documents or amendments to existing acceptable use frameworks. The problem is translation. The same governance domain (human oversight, for example, or bias mitigation) produces different documentation requirements depending on who is asking. A policy statement that satisfies a board's request for evidence of due diligence may fail an insurer's supplemental application, which demands not sentiments but artifacts: named personnel, documented protocols, dated logs. State guidance may ask institutions to "ensure meaningful human control" without specifying what documentation would demonstrate compliance. A vendor seeking to work with a school district may need to produce evidence in yet another format, one that aligns with procurement checklists that reference neither the insurer's questionnaire nor the state's guidance document.

This creates a structural condition in which an institution can hold an AI policy that simultaneously satisfies its board, fails its insurer's supplemental application, meets state guidance, and stalls vendor procurement. The governance exists. The translation does not.
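The condition described above can be made concrete with a small illustration. The following is a minimal sketch, not a template: the stakeholder names and artifact labels are hypothetical simplifications of the examples given in this section, and state guidance is omitted because, as noted, it often does not specify what documentation would demonstrate compliance.

```python
# Hypothetical artifact labels, loosely drawn from the examples in this section.
# They are illustrative assumptions, not any carrier's or regulator's actual list.
REQUIRED_EVIDENCE: dict[str, set[str]] = {
    "board": {"ai_policy"},
    "insurer": {"named_personnel", "documented_protocols", "dated_logs"},
    "procurement": {"vendor_attestation"},
}


def evidence_gaps(held: set[str]) -> dict[str, list[str]]:
    """Return, per stakeholder, the required artifacts the institution cannot produce."""
    gaps: dict[str, list[str]] = {}
    for stakeholder, required in REQUIRED_EVIDENCE.items():
        missing = sorted(required - held)
        if missing:
            gaps[stakeholder] = missing
    return gaps


# An institution that holds only a board-approved AI policy.
gaps = evidence_gaps({"ai_policy"})
```

Under these assumed requirements, the board entry is absent from `gaps` while the insurer and procurement entries list what is missing: the same governance posture passes one evaluator and fails two others.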

1.1 The Aspirational-Operational Divide

The dominant AI governance frameworks available to educational institutions speak in principles. UNESCO's Recommendation on the Ethics of Artificial Intelligence emphasizes proportionality, transparency, human oversight, and fairness.3 The OECD's AI Principles, adopted in 2019 and revised in November 2023, call for accountability, robustness, and human-centered values.4 The NIST AI Risk Management Framework organizes governance around functions (Govern, Map, Measure, Manage) that describe continuous processes rather than discrete deliverables.5 Even state-level guidance documents, which vary considerably in specificity, tend toward aspirational language: institutions should "consider" bias, "ensure" appropriate use, "maintain" human oversight.

These frameworks serve legitimate purposes. They establish shared vocabulary, signal institutional priorities, and provide conceptual architecture for governance programs. What they do not provide is operational infrastructure: the specific documentation, workflows, and evidence artifacts that allow an institution to demonstrate compliance to an external evaluator with defined requirements.

The distinction matters because different stakeholders occupy different positions along the aspirational-operational spectrum. A board reviewing an AI policy may be satisfied by language demonstrating that the institution takes AI governance seriously and has assigned responsibility to appropriate personnel. The board's evaluation criteria are often qualitative: does the policy exist, does it reflect institutional values, does it assign accountability. These are reasonable questions for fiduciary oversight, but they do not map to the evidence formats that other stakeholders require.

1.2 What Insurers Actually Ask For

Insurance underwriters do not typically evaluate whether an institution values fairness. They ask whether the institution can produce a bias audit log for a specific model. They do not assess whether human oversight is important to the institution's AI philosophy. They ask whether documented protocols exist for human-in-the-loop intervention in specific workflows, and whether logs demonstrate that those protocols have been executed. They do not ask whether the institution has an AI policy. They ask whether the institution maintains an AI inventory: a complete, timestamped list of all AI systems in use, including vendor-provided tools, with risk classifications, data inputs, deployment dates, and named owners.6
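The AI inventory described above is, in effect, a structured record set. The following is a minimal sketch of what one such record might contain, assuming the fields named in the paragraph; the class name, field names, and completeness rule are hypothetical and do not reproduce any carrier's questionnaire.

```python
from dataclasses import dataclass, asdict
from datetime import date, datetime, timezone


@dataclass(frozen=True)
class AISystemRecord:
    """One entry in a hypothetical AI inventory; field names are illustrative."""
    name: str
    vendor_provided: bool
    risk_classification: str  # classification labels are an assumption
    data_inputs: tuple
    deployment_date: date
    owner: str                # a named person rather than a department


def missing_fields(record: AISystemRecord) -> list:
    """List the fields left empty, so gaps surface before an evaluator asks."""
    gaps = []
    for field_name in ("name", "risk_classification", "owner"):
        if not getattr(record, field_name).strip():
            gaps.append(field_name)
    if not record.data_inputs:
        gaps.append("data_inputs")
    return gaps


def inventory_snapshot(records: list) -> dict:
    """Produce a timestamped inventory and flag incomplete entries by system name."""
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "systems": [asdict(r) for r in records],
        "incomplete": {r.name: missing_fields(r) for r in records if missing_fields(r)},
    }
```

A record with an empty `owner` field would appear under `incomplete` in the snapshot, which is the kind of gap a policy statement alone does not reveal.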

The shift from aspirational to operational language represents more than a difference in tone. It reflects a difference in evidentiary logic. Aspirational frameworks ask institutions to commit to principles. Operational requirements ask institutions to produce artifacts that demonstrate those principles have been implemented. The former allows qualitative self-attestation; the latter demands documented evidence of existence and execution. Either the bias audit log exists and shows testing was performed, or it does not. Either the incident response drill was conducted and documented, or it was not. Either the vendor risk assessment is on file, or it is not.

This evidentiary standard (documentation sufficient to satisfy an external evaluator with audit authority) can be termed "attestation-grade" evidence. It is the threshold that separates governance-as-statement from governance-as-practice. The challenge is that no authoritative crosswalk exists between the aspirational frameworks institutions have adopted and the attestation-grade evidence that underwriters increasingly require. A school that has built its AI governance program around UNESCO principles or state guidance documents may discover at renewal that its insurer's supplemental application asks questions the existing documentation cannot answer.

1.3 Where Variance Appears

The translation problem is compounded by variance across the stakeholders requesting evidence. Different carriers, different jurisdictions, and different institutional contexts produce different documentation requirements, even when the underlying governance domain is nominally the same.

Regional Variance. Insurers operating under EU AI Act influence increasingly request conformity assessments and fundamental rights impact assessments for high-risk AI systems. The Act's Article 14 imposes specific human oversight documentation requirements, and Article 26 establishes deployer obligations that extend to educational institutions using covered systems.7 Insurers underwriting EU-exposed institutions must account for these requirements in their risk evaluation. By contrast, insurers operating primarily in U.S. markets have developed documentation expectations shaped by state-level developments. Colorado's Consumer Protections for Artificial Intelligence Act (SB 24-205), with enforcement delayed to June 30, 2026, via SB25B-004, requires impact assessments and risk management programs for consequential AI decisions, a category that encompasses many educational applications.8 The New York Department of Financial Services issued Circular Letter No. 7 in 2024 establishing expectations for AI governance among regulated entities, creating a reference point that other state regulators and insurers have noted.9 These frameworks share conceptual DNA but differ in specificity, terminology, and documentation formats.

Carrier Variance. Different insurance carriers have adopted different postures toward AI risk, and those postures manifest in different documentation demands. Some carriers have focused underwriting attention on employee training logs, seeking evidence that personnel using AI systems have received instruction on appropriate use, limitations, and escalation procedures. Other carriers have emphasized vendor oversight, reflecting concern that institutions may be exposed to AI-related claims through third-party tools they did not develop and cannot fully control. Several carriers have filed AI-related exclusions and underwriting rules; public filings and law firm analyses indicate that some of this language conditions coverage on the insured's ability to identify or detect content created through AI or third-party AI use, a standard that requires not merely vendor due diligence but ongoing monitoring capacity that most educational institutions do not possess.10, 11 Reinsurance markets have introduced performance warranty models that condition coverage on technical due diligence demonstrating model robustness, a standard derived from product liability logic that translates poorly to educational procurement contexts.

Institutional Variance. K-12 districts, community colleges, research universities, and international schools face different regulatory overlays, different liability exposures, and different operational realities. The K-12 district operates under FERPA, COPPA, IDEA, and state student privacy laws; its AI governance must address minor children as data subjects and service recipients. The research university may deploy AI systems developed internally, triggering developer obligations under emerging frameworks that the K-12 district, as a pure deployer, does not face. The international school operating across jurisdictions confronts the compounding effect of multiple regulatory regimes (GDPR at one campus, state law at another, neither clearly applicable to a third) without the compliance infrastructure that multinational corporations maintain.

1.4 The Structural Result

The cumulative effect of these variances is that no single AI governance document can satisfy all stakeholders. An institution that produces a policy optimized for board consumption (values-forward, principle-based, organizationally appropriate) will find that document insufficient for an insurer's supplemental application, which demands specific artifacts in specific formats. An institution that produces documentation optimized for one carrier's questionnaire may discover that a different carrier asks different questions, or that the same carrier's questionnaire has evolved between renewal cycles. An institution that builds governance infrastructure around current state guidance may find that guidance superseded, contested, or federally preempted before the documentation has been operationalized.

The December 11, 2025, federal executive order on AI policy illustrates this regulatory instability. The order establishes an AI Litigation Task Force charged with challenging state AI laws deemed inconsistent with federal innovation priorities, explicitly names Colorado's algorithmic discrimination provisions as a target, and directs federal agencies to consider conditioning grant funding on states refraining from AI regulation the administration considers onerous.12 State-level guidance cannot serve as stable ground for governance investment when the regulatory foundation itself is contested.

This instability does not reduce the need for governance. It increases it. When external standards are fragmented, contested, and evolving, the burden shifts to institutions to build internally defensible frameworks that can adapt as requirements clarify. The governance vacuum is not permission to defer. It is a transfer of responsibility from legislatures and regulators to institutional leadership.

The translation problem, then, is not a temporary condition awaiting regulatory resolution. It is a structural feature of the current period, one that will persist until standardized assurance frameworks emerge to mediate between aspirational principles and attestation-grade evidence. Memorandum No. 2 in this series established that such frameworks are developing but not yet mature, and that the interim period requires institutions to navigate fragmented requirements without the benefit of harmonized standards.

The sections that follow examine how specific evidence categories manifest across stakeholder types, how the insurance industry's assurance expectations are evolving, and what the interim period demands of institutions and vendors operating in educational contexts.


2. Evidence Requirements and Stakeholder Variation

Memorandum No. 2 identified eight evidence categories that recur across underwriting questionnaires and regulatory guidance: AI inventory and classification, governance structure, risk management documentation, bias testing and fairness audits, explainability and transparency, human oversight protocols, data governance, and third-party AI vendor management. This section examines four of these categories in depth: transparency and explainability, third-party AI vendor management, human oversight protocols, and bias testing and fairness audits. The analysis describes evidence classes and variance patterns, not implementation procedures or templates.

The pattern across categories is consistent. Aspirational frameworks describe what institutions should value. Insurance questionnaires specify what institutions must produce. Regulatory requirements vary by jurisdiction and institutional type. Documentation built for one stakeholder often fails the evidence tests of another, not because the governance is inadequate but because the format, granularity, or terminology does not match what the evaluator expects.

2.1 Transparency and Explainability

Aspirational frameworks treat transparency as a foundational principle. UNESCO's Recommendation on the Ethics of Artificial Intelligence calls for AI systems to be "intelligible" and for affected persons to receive "meaningful explanations."13 The OECD's AI Principles include transparency among their five core values, emphasizing that stakeholders should be able to understand AI system outcomes.14 NIST's AI Risk Management Framework positions transparency as enabling accountability, noting that organizations should "foster a culture of transparency."15 These commitments establish transparency as a governance priority without specifying what documentation would demonstrate compliance.

Insurance underwriters operationalize transparency differently. Emerging questionnaires ask whether the institution can produce documentation explaining how specific AI systems reach decisions affecting students or employees. The question is not whether the institution values transparency but whether it possesses artifacts: structured model documentation describing intended use and performance characteristics, decision logic summaries, or disclosure notices provided to affected parties. Some carriers ask whether the institution has provided disclosures to students or parents that AI is used in decisions affecting them, and whether those disclosures are documented with timestamps and distribution records.16

Regional variation shapes these requirements. The EU AI Act (Articles 13 and 26) requires high-risk AI systems to enable output interpretation and to notify affected persons of AI involvement in decisions.17 Insurers underwriting EU-exposed institutions verify compliance with these provisions, creating cross-jurisdictional documentation demands. In the United States, transparency requirements remain fragmented. New York's proposed S.1169A would require explanations for certain algorithmic decisions affecting consumers, but the bill has not been enacted. Colorado's CAIA requires that consumers receive statements when consequential decisions involve AI, but specific content and format requirements await regulatory guidance.

For educational institutions, transparency creates particular challenges. A school using an AI system for course recommendations or early warning interventions may be unable to explain how the system reaches its outputs because the vendor has not provided that information, the model architecture does not permit simple explanation, or the institution lacks personnel with technical capacity to interpret documentation even if provided. The transparency commitment in the school's AI policy may be genuine, but the artifacts an insurer or regulator requests may not exist and may not be obtainable. In this domain, aspirational guidance asserts intelligibility as a value, while underwriters evaluate documented explanations as artifacts, creating a translation gap when schools can commit to transparency but cannot produce evidence of it.

Transparency requirements intersect with vendor management because institutions often cannot explain AI systems they do not develop.

2.2 Third-Party AI Vendor Management

Aspirational frameworks acknowledge vendor relationships as a governance concern but do not specify what evidence of oversight an institution should produce. The NIST AI RMF includes third-party risk within its "Govern" function; UNESCO's Recommendation calls for due diligence in supply chains.18 The guidance establishes that vendor management matters without defining acceptable documentation.

Insurance underwriters have developed more specific expectations, shaped by loss experience in adjacent domains. Cyber insurance underwriting already requires evidence of third-party risk management: HECVAT assessments for higher education vendors, SOC 2 Type 2 audit reports, contractual security clauses, and evidence that vendors maintain appropriate coverage.19 AI-specific underwriting extends these expectations to algorithmic risk. Questionnaires increasingly ask whether institutions conduct AI governance assessments of vendors, whether contracts include clauses addressing algorithmic accountability, and whether the institution can demonstrate ongoing monitoring of vendor AI practices.

Some carriers have adopted exclusion language that creates significant exposure for educational institutions. Policies and endorsements introducing exclusions or conditions related to AI-generated content may require not merely initial vendor vetting but continuous monitoring capacity: the ability to detect when a vendor has introduced AI functionality, changed model behavior, or produced outputs that create liability exposure.20 Most educational institutions lack infrastructure for this level of vendor surveillance. They adopt tools at the classroom level without centralized procurement oversight. Vendor AI capabilities change post-contract without notification. The gap between what exclusions require and what institutions can demonstrate creates coverage uncertainty that documentation alone cannot resolve.

Contractual variance compounds the challenge. Some insurers expect specific clauses in vendor agreements: audit rights, algorithmic transparency requirements, liability allocation for AI-related claims, restrictions on using institutional data to train models. Other insurers accept evidence that the institution has a vendor management policy without requiring specific contract language. An institution that negotiates contracts to satisfy one carrier's expectations may find those contracts inadequate for another carrier's requirements, or may discover that vendors refuse to accept the clauses insurers expect. The translation gap in vendor management is structural: aspirational guidance calls for due diligence, underwriters demand specific contractual artifacts, and vendors control what contractual terms they will accept.

Vendor management challenges compound with human oversight, as third-party tools may not support the logging and override functions underwriters increasingly require.

2.3 Human Oversight Protocols

Aspirational frameworks universally endorse human oversight without specifying what documentation would demonstrate that control exists. UNESCO emphasizes that humans should retain "the ability to intervene in or reverse AI-based decisions."21 The OECD Principles call for human accountability throughout the AI lifecycle. These commitments are directionally clear and operationally vague.

Insurance and regulatory requirements translate human oversight into artifact demands. The EU AI Act's Article 14 requires that high-risk AI systems be designed so that they "can be effectively overseen by natural persons," and Article 26 requires deployers to assign personnel to oversight functions with documented competence, training, and authority.22 Reinsurance market practices increasingly condition coverage on evidence of documented human oversight, treating its absence as a risk factor affecting premium calculation or coverage eligibility. Questionnaires ask whether institutions have defined which decisions require human review before action, whether review protocols are documented, and whether logs demonstrate that reviews actually occur.23

Human oversight is easily asserted but difficult to evidence. A teacher may review AI-suggested scores, but without logged documentation, oversight leaves no audit trail. An administrator overriding an AI-generated discipline recommendation may do so based on professional judgment, but if the override decision is not recorded with rationale, the human oversight that occurred produces no artifact. The gap is not between institutions that exercise oversight and those that do not; it is between institutions that can prove oversight occurred and those that cannot.
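The difference between oversight that occurred and oversight that can be proven comes down to whether a record was written at the time of review. The following is a minimal sketch of such a record, assuming a simple append-only list as the log; the class, function, and field names are hypothetical, and a real deployment would need durable, access-controlled storage.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class OversightEvent:
    """A single logged human review of an AI output; fields are illustrative."""
    system: str
    reviewer: str
    ai_recommendation: str
    human_decision: str
    rationale: str
    reviewed_at: str

    @property
    def overridden(self) -> bool:
        # The reviewer departed from what the system recommended.
        return self.human_decision != self.ai_recommendation


def record_review(system, reviewer, ai_recommendation, human_decision, rationale, log):
    """Append a timestamped review record to `log`; refuse entries without a rationale."""
    if not rationale.strip():
        raise ValueError("a review without a recorded rationale leaves no usable artifact")
    event = OversightEvent(
        system=system,
        reviewer=reviewer,
        ai_recommendation=ai_recommendation,
        human_decision=human_decision,
        rationale=rationale,
        reviewed_at=datetime.now(timezone.utc).isoformat(),
    )
    log.append(json.dumps(asdict(event)))  # one JSON line per review
    return event
```

The design choice worth noting is the refusal to log a review without a rationale: an override recorded without reasons is closer to the unlogged case than to attestation-grade evidence.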

Variation across AI applications within a single institution creates additional complexity. Human oversight appropriate for an AI writing assistant (periodic spot-checking) differs from oversight appropriate for an AI system flagging students for dropout intervention (documented review of every flagged case before action) or an AI tool influencing disciplinary decisions (mandatory human decision-maker with documented authority to override). A single human oversight policy cannot specify appropriate protocols for all AI applications. Granular, application-specific documentation is required, and most institutions have not developed the infrastructure to produce it. In this domain, aspirational guidance asserts meaningful human control as essential, while underwriters evaluate logged intervention records as evidence, creating a translation gap when schools exercise judgment but cannot document it.
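The three oversight tiers named above can be expressed as an application-level registry rather than a single policy sentence. The following is a minimal sketch under that assumption; the application keys, the enum, and the rule that unregistered tools may not act are all hypothetical choices made for illustration.

```python
from enum import Enum


class OversightLevel(Enum):
    SPOT_CHECK = "periodic spot-checking"
    REVIEW_BEFORE_ACTION = "documented review of every flagged case before action"
    HUMAN_DECIDES = "mandatory human decision-maker with documented override authority"


# Hypothetical registry keyed by application; names are illustrative only.
PROTOCOLS = {
    "writing_assistant": OversightLevel.SPOT_CHECK,
    "dropout_flagging": OversightLevel.REVIEW_BEFORE_ACTION,
    "discipline_support": OversightLevel.HUMAN_DECIDES,
}


def may_act(application: str, human_review_logged: bool) -> bool:
    """Decide whether an AI output may be acted on under the registered protocol."""
    level = PROTOCOLS.get(application)
    if level is None:
        # Assumption: a tool with no registered protocol has no defensible basis to act.
        return False
    if level is OversightLevel.SPOT_CHECK:
        return True
    # Both stricter tiers require a logged human review before any action.
    return human_review_logged
```

Under this sketch, `may_act("dropout_flagging", False)` is refused while `may_act("writing_assistant", False)` is allowed, which is the granularity a single institution-wide oversight statement cannot express.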

Human oversight protocols, even when documented, do not address whether AI systems produce discriminatory outcomes, a question that bias testing must answer.

2.4 Bias Testing and Fairness Audits

Aspirational frameworks treat fairness as a universal commitment without prescribing how an institution should test for bias or what methodology would demonstrate compliance. UNESCO calls for AI systems to avoid discrimination and promote equity.24 The OECD Principles include fairness among core values. These frameworks establish that fairness matters; they do not establish what evidence proves it.

Insurers and regulators are developing more specific expectations, though standardization remains limited. Reinsurance technical due diligence processes evaluate model robustness and performance characteristics, including fairness metrics, as conditions of coverage under emerging AI assurance offerings. Colorado's CAIA requires developers and deployers of high-risk AI systems to implement risk management programs that include testing for algorithmic discrimination, defined as differential treatment or impact on protected classes.25 The statute anticipates adverse outcome testing and disparate impact analysis, but detailed implementation guidance remains incomplete. In financial services, fair lending enforcement under the Equal Credit Opportunity Act and the Fair Housing Act provides a template: institutions must demonstrate that credit models do not produce discriminatory outcomes across protected categories, using regression testing and disparate impact analysis with established methodologies. No equivalent methodology exists for education.

The absence of standardized bias audit methodology is a critical gap. Healthcare has FDA clearance processes that address algorithmic performance across patient populations. Financial services has decades of fair lending enforcement establishing acceptable testing approaches. Education has no sector-specific standard. A school seeking to demonstrate that its AI systems do not produce biased outcomes must determine what populations to test, what metrics to apply, what thresholds constitute acceptable disparity, and how to document findings. These methodological decisions are not specified by any authoritative body. The term "audit" in this context refers to evidentiary documentation of testing, not an established sector audit regime.
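To illustrate what the methodological decisions involve, the following is a minimal sketch of one common disparity measure: the ratio of each group's favorable-outcome rate to that of a reference group. The 0.8 threshold is borrowed from the four-fifths rule of thumb used in U.S. employment selection guidance; applying it to education is an assumption made here for illustration, not an established standard. The group labels and counts are invented.

```python
# Threshold borrowed from employment-selection practice; NOT an education standard.
FOUR_FIFTHS = 0.8


def selection_rates(outcomes: dict) -> dict:
    """Map each group to its favorable-outcome rate, given (favorable, total) counts."""
    rates = {}
    for group, (favorable, total) in outcomes.items():
        if total <= 0:
            raise ValueError(f"group {group!r} has no observations")
        rates[group] = favorable / total
    return rates


def impact_ratios(outcomes: dict, reference_group: str) -> dict:
    """Divide each group's rate by the reference group's rate."""
    rates = selection_rates(outcomes)
    reference_rate = rates[reference_group]
    if reference_rate == 0:
        raise ValueError("reference group has a zero favorable-outcome rate")
    return {group: rate / reference_rate for group, rate in rates.items()}


def flagged_groups(outcomes: dict, reference_group: str, threshold: float = FOUR_FIFTHS) -> list:
    """Return the groups whose ratio falls below the chosen threshold."""
    ratios = impact_ratios(outcomes, reference_group)
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Invented counts: (students recommended for an advanced course, students evaluated).
hypothetical = {"group_a": (60, 100), "group_b": (42, 100)}
flagged = flagged_groups(hypothetical, reference_group="group_a")
```

With these invented counts, `group_b` has a ratio of roughly 0.7 and is flagged. Every input to that result (the populations compared, the reference group, the metric, and the threshold) is a choice that, as the paragraph above notes, no authoritative body has yet specified for education.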

Insurers recognize this gap. Broker guidance notes that education sector algorithmic bias claims have not yet produced actuarially significant loss history, meaning underwriters lack data to price this risk with confidence. Some carriers are waiting for governance frameworks to mature before fully engaging AI risk in education. Others apply healthcare and financial services expectations by analogy, asking for bias audit documentation without specifying acceptable formats or methodologies.

For educational institutions, bias testing intersects with civil rights obligations that predate AI governance frameworks. Title VI, Title IX, Section 504, and state civil rights laws already prohibit discrimination in educational programs. AI systems that produce disparate outcomes may trigger liability under these existing frameworks regardless of AI-specific regulation. Documentation demonstrating bias testing serves dual purposes: satisfying emerging AI governance expectations and providing evidence of due diligence under established civil rights compliance. In this domain, aspirational guidance asserts fairness as essential, while underwriters and regulators increasingly expect documented testing results, creating a translation gap when schools cannot produce evidence of testing they have not performed because no methodology tells them how.

2.5 Summary

The four categories examined here demonstrate a consistent structural problem. Aspirational frameworks establish that transparency, vendor management, human oversight, and fairness matter. Insurance and regulatory requirements translate these values into documentation demands. The specific artifacts required vary by carrier, jurisdiction, and institutional context. No single document satisfies all stakeholders.

This variance is not temporary. The sections that follow examine why standardization has not occurred, how the current period resembles earlier phases of assurance framework development in other domains, and what the interim period demands of institutions operating without settled standards.


3. The Pre-SOC 2 Parallel

The translation problem documented in Sections 1 and 2 is not novel. Other sectors have faced analogous conditions: fragmented stakeholder requirements, carrier-specific questionnaires, and the absence of standardized assurance frameworks. The cybersecurity domain provides a useful precedent. Educational AI governance in 2025 resembles cybersecurity compliance in the years before SOC 2 became a common assurance artifact. Understanding how that earlier fragmentation resolved illuminates where AI governance may be heading and why the current period demands interim translation capacity.

3.1 The Pre-Standardization Problem

In 1992, the American Institute of Certified Public Accountants (AICPA) issued Statement on Auditing Standards No. 70 (SAS 70). The standard was designed narrowly: it addressed internal controls over financial reporting at service organizations.26 A company outsourcing payroll processing, for example, could obtain a SAS 70 report demonstrating that the service provider maintained adequate controls relevant to the client's financial statements. The standard served its intended purpose within that scope.

By the mid-2000s, SAS 70 had migrated far beyond its original domain. Data centers, cloud providers, and technology vendors began obtaining SAS 70 reports and presenting them to customers as evidence of security practices. The market lacked a shared language for security assurance, and SAS 70 became a proxy. Vendors used a financial reporting tool to signal security because no alternative existed. As the AICPA later acknowledged, a "popular misunderstanding" developed that service organizations could become "SAS 70 certified," though no such certification existed.27 Industry commentary from this period noted that vendors frequently overstated the significance of SAS 70 reports, implying levels of assurance the standard was never designed to provide.

The result was a high-transaction-cost environment for trust. Every business-to-business relationship involving data or technology required bespoke negotiation over security evidence. Customers sent unique questionnaires, sometimes hundreds of questions in spreadsheet format. Vendors spent thousands of hours answering these questionnaires with varying levels of rigor, often substituting marketing claims for documented controls. The same vendor might answer the same question differently depending on which customer asked and which staff member responded. Practitioners described significant duplication and inconsistency in responding to security assessments, with bottlenecks emerging as questionnaire volume increased.

In 2005, the Shared Assessments program emerged as a partial response. Formed by major banks and Big Four accounting firms, Shared Assessments created the Standardized Information Gathering (SIG) questionnaire, attempting to reduce duplication by establishing a common instrument.28 The SIG improved efficiency but did not solve the underlying problem: it remained a questionnaire based on vendor self-reporting, not an independent attestation. Vendors could answer questions without third-party verification. The market still lacked a mechanism for auditor-based validation against defined criteria.

3.2 The SOC 2 Resolution

In April 2010, the AICPA announced the end of SAS 70 and the creation of Service Organization Control (SOC) reports. The new framework separated financial controls (SOC 1, the successor to SAS 70's intended use) from security and operational controls (SOC 2). SOC 2 examinations report on controls relevant to the Trust Services Criteria categories: Security, Availability, Processing Integrity, Confidentiality, and Privacy.29

SSAE 16 (Statement on Standards for Attestation Engagements No. 16) became effective for service organization reports with periods ending on or after June 15, 2011, formally retiring SAS 70.30 SOC 2 reports, introduced alongside this transition under the AICPA's attestation standards, provided independent auditor attestation that a service organization's controls met defined criteria over a specified period (Type 2 reports) or at a point in time (Type 1 reports).

SOC 2 created a shared language that multiple stakeholders could reference. A vendor undergoing a SOC 2 audit could satisfy numerous customer due diligence requirements with a single artifact. The model reduced transaction costs for both vendors and customers: instead of answering hundreds of unique questionnaires, a vendor could point to an independently attested report; instead of designing and administering custom assessments, a customer could request a SOC 2 report and evaluate it against their risk tolerance.

The framework also created a professional services category. SOC 2 readiness assessments emerged as a recognized offering: consultants who helped organizations prepare for audits by identifying gaps, implementing controls, and building documentation infrastructure before the formal attestation engagement. Boutique firms developed alongside Big Four practices, serving organizations at different scale and complexity levels. The readiness assessment function was explicitly interim: it existed to help organizations achieve a standard, not to replace it. But during the years when SOC 2 adoption was spreading, readiness services filled a necessary gap.

Cyber insurance underwriting increasingly recognized SOC 2 as evidence of security maturity. While no universal requirement emerged, underwriters began treating SOC 2 reports as relevant signals when evaluating applicants. The insurance market did not mandate SOC 2, but it recognized independently attested controls as credible evidence of security posture.31

3.3 The Current State of AI Assurance

No equivalent to SOC 2 exists for AI governance. The market is in a pre-standardization phase analogous to cybersecurity before 2010.

ISO/IEC 42001, published in December 2023, provides the closest approximation to a standardized AI management system framework.32 The standard establishes requirements for an AI Management System (AIMS), covering governance, risk management, and operational controls for AI development and deployment. Certification is available, and early adopters have pursued it. However, adoption remains limited. The standard is sector-agnostic; it does not specify education-sector evidence formats, address student data considerations, or map to the particular governance challenges documented in Section 2. Certification costs and complexity limit accessibility for smaller organizations, including most educational institutions and EdTech vendors.

The NIST AI Risk Management Framework provides conceptual architecture but is not an attestation standard. NIST AI RMF organizes governance around functions (Govern, Map, Measure, Manage) and offers extensive guidance on risk identification and mitigation. What it does not provide is criteria against which an independent auditor could attest. An organization can claim alignment with NIST AI RMF, but no certification or attestation mechanism validates that claim. The framework remains aspirational rather than auditable.33

Insurance underwriters have not converged on standardized AI questionnaires. As documented in Section 1, different carriers ask different questions, emphasize different evidence categories, and apply different risk evaluation criteria. The fragmentation resembles the period before standardized security questionnaires gained traction: each underwriter administers its own instrument, and applicants must translate their governance artifacts into whatever format each questionnaire demands.

The translation problem persists because no shared language exists. Schools cannot point to an independently attested AI governance report that satisfies multiple stakeholders. Vendors cannot obtain a single certification that addresses the concerns of diverse customers. Each transaction requires bespoke evidence production, and each stakeholder evaluates that evidence against criteria that may not be disclosed, may not be consistent, and may change between evaluation cycles.

3.4 The Parallel and Its Limits

The structural parallel between AI governance today and cybersecurity before SOC 2 is clear. Both domains faced fragmented requirements, absence of standardized attestation, high transaction costs for trust verification, and market demand for some form of credible assurance. Both saw partial solutions emerge (SIG questionnaires then, ISO 42001 now) that improved conditions without fully resolving the underlying problem.

The parallel has limits. SOC 2 emerged from a specific institutional context: the AICPA held established authority over attestation standards in the United States and had the capacity to create a new framework when market conditions demanded it. AI governance lacks an equivalent institutional anchor. ISO provides international standardization but operates on longer timelines and produces frameworks that require national adoption and sector-specific adaptation. No single body currently holds the authority to create an AI governance attestation standard that would be recognized across jurisdictions and stakeholder types in the way SOC 2 achieved recognition in its domain.

The timeline for standardization is uncertain. SOC 2 took roughly five years from announcement (2010) to widespread market recognition (mid-2010s), and adoption continues to evolve. AI governance standardization may follow a similar multi-year trajectory, or it may take longer given the greater complexity of AI systems compared to traditional IT controls and the cross-jurisdictional regulatory fragmentation documented in Sections 1 and 2.

What the parallel does suggest is that the interim period requires translation capacity. Before SOC 2, organizations navigating fragmented security requirements needed advisors who understood multiple stakeholder languages and could help produce documentation satisfying diverse evaluators. After SOC 2, that function diminished (though it did not disappear) as standardized artifacts reduced translation burden. AI governance is in the pre-standardization phase. The function of translating between stakeholder requirements, helping organizations identify gaps, and building documentation infrastructure is not permanent market structure. But it is necessary now, during the period when standards have not converged and institutions must nevertheless demonstrate governance to insurers, regulators, procurement authorities, and boards.

The following section examines the timeline of this interim period and what it demands of institutions operating without settled standards.


4. The Interim Infrastructure Period

The SOC 2 parallel examined in Section 3 suggests a trajectory: fragmented requirements eventually consolidate into standardized assurance frameworks, reducing translation burden and transaction costs. That trajectory, however, operates on a multi-year timeline. SOC 2 took roughly five years from announcement to widespread market recognition. AI governance standardization faces greater complexity: more diverse stakeholder types, cross-jurisdictional regulatory fragmentation, and technical heterogeneity that exceeds traditional IT controls. The interim period, during which institutions must navigate non-harmonized requirements, is not a brief transition. It is a sustained condition requiring dedicated infrastructure.

4.1 Current Fragmentation

The 2025-2026 landscape is characterized by overlapping, non-aligned requirements from multiple sources.

At the state level, as of March 2025, agencies in at least 28 states have published or adopted AI guidance for K-12 education, though comprehensiveness and specificity vary dramatically.34 Some states have published detailed frameworks with implementation expectations; others have issued general principles without operational guidance. Colorado's Consumer Protections for Artificial Intelligence Act (SB 24-205), with enforcement delayed to June 30, 2026, via SB25B-004, represents the most prescriptive state approach, explicitly requiring impact assessments and risk management programs for consequential AI decisions in education.35 New York's proposed S.1169A would establish education-specific requirements including bias audits and transparency disclosures, though the bill has not been enacted. California has directed development of AI guidance for educational settings without yet imposing binding requirements. The result is a patchwork: institutions operating across state lines face different expectations in different jurisdictions, with no mechanism for mutual recognition or compliance portability.

At the federal level, there is no single, binding, education-sector AI governance standard that specifies operational evidence formats. The December 11, 2025, executive order established an AI Litigation Task Force to challenge state laws deemed inconsistent with federal innovation priorities, explicitly naming Colorado's algorithmic discrimination provisions as a target.36 The order directs federal agencies to consider conditioning grant funding on states refraining from AI regulation the administration considers onerous. This posture creates uncertainty rather than clarity: state requirements may be contested before they can be enforced, but no federal alternative has been proposed that would provide operational guidance to institutions. FERPA, COPPA, Section 504, and related civil rights requirements remain in force, creating baseline obligations that AI governance must satisfy, but these statutes predate AI adoption and do not address algorithmic decision-making specifically.

At the international level, the EU AI Act imposes binding obligations on educational institutions using high-risk AI systems, with enforcement beginning August 2, 2026, for Annex III systems (which include educational and vocational training applications that may determine access to education or assess students).37 The Act requires conformity assessments, technical documentation, human oversight protocols, and transparency disclosures. Institutions with operations in EU member states, or vendors serving EU-based schools, face compliance obligations that U.S. guidance does not address and that U.S. insurers may not recognize as sufficient for domestic underwriting purposes.

Insurance carriers have developed their own questionnaires and evidence requirements, as documented in Sections 1 and 2. These requirements are not standardized across carriers, creating variance that institutions must navigate at each renewal cycle. The NAIC Model Bulletin on Use of AI Systems by Insurers, adopted in December 2023 and implemented by over twenty-four states as of March 2025, establishes governance expectations for insurers themselves but does not standardize what insurers should require of policyholders.38, 39

4.2 Timeline of Binding Requirements

The interim period is bounded by identifiable milestones, though the endpoint remains uncertain.

January 2026 marks the proposed effective date for Verisk (Insurance Services Office, or ISO, distinct from the standards body behind ISO/IEC 42001) endorsements CG 40 47, CG 40 48, and CG 35 08, enabling carriers to attach AI-specific exclusions to commercial general liability policies.40 Adoption varies by carrier and line; however, standardized endorsement language now exists that enables broad market deployment, and underwriting questionnaires are converging toward a recurring set of evidence categories.

June 30, 2026, brings Colorado CAIA enforcement, creating the first binding U.S. state requirement for AI governance in consequential decisions including education.41 Institutions and vendors operating in Colorado face compliance obligations; multi-state organizations may treat Colorado requirements as a de facto floor.

August 2, 2026, triggers EU AI Act high-risk system obligations for educational applications.42 Institutions in member states and vendors serving EU markets must demonstrate conformity with documentation, oversight, and transparency requirements.

Throughout 2026, insurance renewals will be the first cycle where AI governance documentation materially affects coverage terms. Institutions without documented governance face the exclusionary postures and heightened underwriting scrutiny that Memorandum No. 2 documented.

2027 intensifies pressure. If New York enacts S.1169A, implementation would likely follow an eighteen-month pattern similar to Colorado's delay, placing enforcement in 2027. Additional states will advance AI legislation through 2026-2027 sessions, creating cascade effects as jurisdictions observe peer-state models. As incidents mature into disputes, complaints, and claims, underwriting will increasingly incorporate loss signals and litigation patterns.

2028 and beyond represent a maturing enforcement environment. EU AI Act enforcement patterns will be established. Actuarial models will incorporate settled claims data. Early governance frameworks will have operational track records. The window for institutional initiative in governance design narrows as external requirements increasingly define adequacy.
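
The dated milestones above can be held as a small compliance calendar. The sketch below is illustrative only: the `Milestone` type, the scope tags, the `upcoming` helper, and the sample institution are assumptions rather than part of any statute or endorsement, and the January 1 date is the proposed effective date given in note 40.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Milestone:
    effective: date
    source: str
    applies_to: str  # hypothetical scope tag used for filtering


# Dates are taken from Section 4.2; scope tags are assumptions.
MILESTONES = [
    Milestone(date(2026, 8, 2), "EU AI Act high-risk obligations", "eu"),
    Milestone(date(2026, 1, 1), "Verisk (ISO) AI endorsements, proposed", "insurance"),
    Milestone(date(2026, 6, 30), "Colorado CAIA enforcement", "colorado"),
]


def upcoming(milestones, today, scopes):
    """Return (days_remaining, milestone) pairs for in-scope future milestones,
    ordered by effective date."""
    pending = [
        m for m in milestones
        if m.applies_to in scopes and m.effective >= today
    ]
    pending.sort(key=lambda m: m.effective)
    return [((m.effective - today).days, m) for m in pending]


# Hypothetical institution: Colorado operations, commercial liability policy,
# no EU exposure, reviewing its calendar on March 1, 2026.
schedule = upcoming(MILESTONES, date(2026, 3, 1), {"insurance", "colorado"})

for days, milestone in schedule:
    print(f"{days} days until {milestone.source} ({milestone.effective})")
```

An institution with EU exposure would add the `eu` scope and see the August 2 obligation appear after the Colorado date. The undated pressures described for 2027 and beyond do not fit this structure and are deliberately left out.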

4.3 Likely Convergence Path

Standardization is developing but not imminent. Several convergence mechanisms are in motion.

ISO/IEC 42001 provides the most plausible foundation for international harmonization. Published in December 2023, the standard establishes requirements for AI Management Systems that align conceptually with both EU AI Act obligations and NIST AI RMF categories. As adoption increases, ISO 42001 certification may become a recognized signal across jurisdictions, similar to how ISO 27001 certification serves as evidence of information security management. However, ISO 42001 is sector-agnostic; education-specific interpretive guidance does not yet exist.43

NIST AI RMF continues to serve as a reference framework in U.S. regulatory contexts. Colorado's CAIA explicitly references NIST AI RMF and ISO 42001 as compliance guidance. New York's proposed S.1169A makes similar references. If these frameworks become statutory touchstones, organizations aligning with them now will be positioned for compliance as requirements crystallize.

The NAIC may develop AI-specific guidance for policyholder underwriting, similar to how cyber insurance questionnaires became more standardized over the past decade. No such initiative has been announced, but the pattern of insurance market standardization following regulatory and loss signal pressure suggests this development is plausible.

EU enforcement patterns will influence global practice. As conformity assessment procedures mature and notified bodies issue decisions, the EU AI Act will generate interpretive precedent that other jurisdictions may reference. Organizations with EU exposure will develop compliance documentation that may prove adaptable to other markets.

The convergence timeline cannot be precisely estimated. Based on the SOC 2 precedent and the complexity factors distinguishing AI governance, a multi-year period of fragmentation should be anticipated, likely extending into the late 2020s. Institutions planning governance infrastructure should not assume that standardization will arrive in time to relieve near-term compliance pressure.

4.4 The Interim Imperative

Standardization is coming but not soon enough. Insurance renewals occur annually. Procurement cycles continue. Regulatory enforcement begins in 2026. Institutions cannot wait for harmonization to resolve the translation problem documented in this memorandum.

The interim period requires institutions to build governance infrastructure that satisfies multiple stakeholders with different evidence languages. This is the translation function: producing documentation that addresses insurer questionnaires, regulatory requirements, procurement expectations, and board oversight demands, recognizing that each stakeholder evaluates adequacy against different criteria. In practice, interim infrastructure means repeatable evidence packaging: inventories, accountability attestations, oversight logs, and vendor records that can be rendered into multiple stakeholder formats.
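
One way to picture repeatable evidence packaging is a single system record rendered into several stakeholder views. The following is a sketch under stated assumptions: the `SystemRecord` fields, the per-stakeholder field lists, and the sample record are hypothetical and do not reproduce any carrier questionnaire or regulatory template.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SystemRecord:
    """One inventory entry for an AI system (hypothetical field set)."""
    name: str
    vendor: str
    purpose: str
    accountable_owner: str
    human_review_logged: bool
    last_bias_test: Optional[str]  # ISO date string, or None if never tested


# Assumption: which fields each evaluator asks for. Real questionnaires vary.
STAKEHOLDER_VIEWS = {
    "insurer": ["name", "vendor", "human_review_logged", "last_bias_test"],
    "regulator": ["name", "purpose", "accountable_owner", "last_bias_test"],
    "board": ["name", "purpose", "accountable_owner"],
}


def render(record, stakeholder):
    """Project one record into a stakeholder's view and list missing evidence."""
    data = asdict(record)
    view = {field: data[field] for field in STAKEHOLDER_VIEWS[stakeholder]}
    # A field counts as a gap when it is absent (None) or attests to nothing (False).
    gaps = [field for field, value in view.items() if value is None or value is False]
    return {"stakeholder": stakeholder, "evidence": view, "gaps": gaps}


# Hypothetical record: governance exists, but no bias test has been run.
record = SystemRecord(
    name="Early Warning Model",
    vendor="Example EdTech Co.",
    purpose="Flag students for academic intervention",
    accountable_owner="Director of Student Services",
    human_review_logged=True,
    last_bias_test=None,
)

for stakeholder in STAKEHOLDER_VIEWS:
    packet = render(record, stakeholder)
    print(stakeholder, "gaps:", packet["gaps"])
```

The same record passes the board view with no gaps and fails the insurer and regulator views on bias testing. Keeping one source record and many renderings means a gap is closed once rather than once per questionnaire.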

Institutions typically respond in one of two ways. Some attempt to build translation capacity internally, developing expertise in insurance underwriting expectations, regulatory requirements across jurisdictions, and documentation practices that produce attestation-grade evidence. This approach requires investment in personnel, training, and ongoing monitoring of evolving requirements. For large institutions with dedicated compliance functions, internal capacity may be feasible.

Alternatively, institutions use external specialist support from advisors who have mapped the stakeholder landscape and can translate governance artifacts across evaluator types. This approach recognizes that the expertise required to navigate fragmented requirements is specialized and that building such expertise internally may not be cost-effective for institutions whose primary mission is education, not compliance consulting. The readiness assessment function that emerged around SOC 2 adoption provides a precedent: specialists who help organizations prepare for standards that are still maturing, producing documentation that satisfies current evaluators while positioning for future requirements.

The interim function is not permanent. As standards converge and attestation frameworks mature, the translation burden will diminish. Organizations with ISO 42001 certification, once that certification gains market recognition, will face reduced need for bespoke evidence production. But that future state has not arrived. The current period demands interim infrastructure.


Conclusion

This memorandum has documented the translation problem confronting educational institutions and EdTech vendors: the same governance requirements manifest as different evidence demands depending on which stakeholder is asking. Aspirational frameworks describe what institutions should value. Insurance questionnaires specify what institutions must produce. Regulatory requirements vary by jurisdiction. No single document satisfies all evaluators.

The problem is structural, not temporary. Section 1 established that the aspirational-operational divide creates translation gaps across evidence categories. Section 2 demonstrated how four specific domains (transparency, vendor management, human oversight, bias testing) produce different artifacts for different stakeholders. Section 3 placed this fragmentation in historical context, showing that cybersecurity faced analogous conditions before SOC 2 provided a shared assurance language. Section 4 has mapped the interim period: binding requirements arriving in 2026, convergence developing but not imminent, and institutions facing compliance pressure that standardization will not relieve in time.

Memorandum No. 1 in this series established that existing AI governance frameworks are aspirational or regulatory but not operational. Memorandum No. 2 established that insurance exclusions, loss signals, and regulatory timelines are converging to force governance adoption. This memorandum has addressed the translation layer: how institutions convert governance commitments into the specific evidentiary formats that different stakeholders require.

The governance infrastructure institutions build now will serve them as requirements converge. Documentation aligned with NIST AI RMF and ISO 42001 categories positions institutions for Colorado CAIA compliance, anticipated state requirements, and EU AI Act obligations where applicable. Evidence practices that satisfy current insurer questionnaires build operational competence for future underwriting cycles. Human oversight protocols, bias testing documentation, vendor management records, and transparency disclosures produced now become the audit trail that regulators, insurers, and litigators will evaluate later.

Those who defer will inherit externally defined compliance standards, implementing under deadline pressure what others designed under market initiative. The translation problem does not resolve itself. It requires infrastructure, whether built internally or engaged externally, to bridge the gap between what frameworks prescribe and what evaluators accept as proof.


References

[1] Purdy, R. J. (2025). Memorandum No. 1: The Operational Gap in Educational AI Governance. Purdy House Institute Working Paper Series.

[2] Purdy, R. J. (2025). Memorandum No. 2: The Forcing Function. Purdy House Institute Working Paper Series.

[3] UNESCO. (2021). Recommendation on the Ethics of Artificial Intelligence. Paris: UNESCO. https://unesdoc.unesco.org/ark:/48223/pf0000381137

[4] OECD. (2019, revised November 8, 2023). Recommendation of the Council on Artificial Intelligence (OECD/LEGAL/0449). Paris: OECD. https://legalinstruments.oecd.org/en/instruments/OECD-LEGAL-0449

[5] National Institute of Standards and Technology. (2023). AI Risk Management Framework (AI RMF 1.0). Gaithersburg, MD: NIST. https://www.nist.gov/itl/ai-risk-management-framework

[6] Cherry Bekaert. (2025, November 24). AI in Insurance: How to Build a Compliant Governance Framework. https://www.cbh.com/insights/articles/ai-in-insurance-how-to-build-a-compliant-governance-framework/

[7] Regulation (EU) 2024/1689 of the European Parliament and of the Council (EU AI Act), Articles 14 and 26. Official Journal of the European Union. https://eur-lex.europa.eu/eli/reg/2024/1689/oj

[8] Colorado General Assembly. (2024). SB 24-205: Colorado Artificial Intelligence Act. Enforcement delayed to June 30, 2026, via SB25B-004. https://leg.colorado.gov/bills/sb24-205

[9] New York Department of Financial Services. (2024). Circular Letter No. 7: Use of Artificial Intelligence Systems and External Consumer Data and Information Sources. Albany, NY. https://www.dfs.ny.gov/industry_guidance/circular_letters/cl2024_07

[10] Hunton Andrews Kurth LLP. (2025). Berkley Companies File Exclusions and Underwriting Rules for AI. Insurance Recovery Blog. https://www.huntonak.com/en/insights/berkley-companies-file-exclusions-and-underwriting-rules-for-ai.html

[11] Clark Hill PLC. (2025). Berkley Companies File Exclusions and Underwriting Rules for AI. https://www.clarkhill.com/news-events/news/berkley-companies-file-exclusions-and-underwriting-rules-for-ai/

[12] Exec. Order No. 14,365, 90 Fed. Reg. 2,053 (December 11, 2025). Ensuring a National Policy Framework for Artificial Intelligence. https://www.federalregister.gov/d/2025-00632

[13] UNESCO. (2021). Recommendation on the Ethics of Artificial Intelligence, Article 40.

[14] OECD. (2019, revised 2023). OECD AI Principles. https://oecd.ai/en/ai-principles

[15] NIST. (2023). AI Risk Management Framework 1.0, Govern Function.

[16] BIPC. (2025, October 8). When Algorithms Underwrite: Regulators Demanding Explainable AI Systems. https://www.bipc.com/when-algorithms-underwrite-insurance-regulators-demanding-explainable-ai-systems

[17] Regulation (EU) 2024/1689, Articles 13 and 26.

[18] (a) NIST. (2023). AI Risk Management Framework 1.0, Govern Function, GV-6.1. (b) UNESCO. (2021). Recommendation on the Ethics of Artificial Intelligence, Article 47.

[19] Higher Education Information Security Council. (2020). HECVAT: Higher Education Community Vendor Assessment Toolkit. EDUCAUSE. https://library.educause.edu/resources/2020/4/higher-education-community-vendor-assessment-toolkit

[20] See note 10.

[21] UNESCO. (2021). Recommendation on the Ethics of Artificial Intelligence, Article 41.

[22] Regulation (EU) 2024/1689, Articles 14 and 26.

[23] Lloyd's Market Association. (2025, September 18). Report on AI Impact on International E&O Market. London. https://lmalloyds.com/lma-report-highlights-impact-of-artificial-intelligence-on-international-eo-market/

[24] UNESCO. (2021). Recommendation on the Ethics of Artificial Intelligence, Articles 43-44.

[25] Colorado General Assembly. (2024). SB 24-205, Section 6-1-1702.

[26] American Institute of Certified Public Accountants. (1992). Statement on Auditing Standards No. 70: Service Organizations. New York: AICPA.

[27] Roosa CPA. (2025). From SAS 70 to SSAE 16 to SSAE 18. https://roosacpa.com/sas70/

[28] Shared Assessments. (2025). What is the SIG? https://sharedassessments.org/about-sig/

[30] GlobalPrivacyBlog. (2011, May 4). Goodbye SAS 70, Hello SSAE 16 and Introducing SOC2 and SOC3. https://www.globalprivacyblog.com/2011/05/goodbye-sas-70-hello-ssae-16-and-introducing-soc2-and-soc3/

[31] Compass ITC. (2024, September 9). A Detailed History of SOC 2 Compliance. https://www.compassitc.com/blog/a-detailed-history-of-soc-2-compliance

[32] International Organization for Standardization. (2023). ISO/IEC 42001:2023 Information Technology - Artificial Intelligence - Management System. Geneva: ISO. https://www.iso.org/standard/81230.html

[33] NIST. (2023). AI Risk Management Framework 1.0.

[34] Education Commission of the States. (2025, July 2). AI Pilot Programs in K-12 Settings. https://www.ecs.org/ai-artificial-intelligence-pilots-k12-schools/

[35] Colorado General Assembly. (2024). SB 24-205; (2025). SB25B-004. https://leg.colorado.gov/bills/sb25b-004

[36] See note 12.

[37] Regulation (EU) 2024/1689, Article 113 and Annex III.

[38] National Association of Insurance Commissioners. (2023). Model Bulletin on Use of AI Systems by Insurers. https://content.naic.org/sites/default/files/inline-files/2023-12-4%20Model%20Bulletin_Artificial%20Intelligence.pdf

[39] Fenwick & West LLP. (2025, December 14). Tracking the Evolution of AI Insurance Regulation. https://www.fenwick.com/insights/publications/tracking-the-evolution-of-ai-insurance-regulation

[40] Verisk Insurance Solutions. (2025). ISO Endorsements CG 40 47, CG 40 48, CG 35 08: Exclusion for Generative Artificial Intelligence. Proposed effective date January 1, 2026. https://www.verisk.com/insurance/iso/emerging-issues/artificial-intelligence/

[41] Colorado General Assembly. (2024). SB 24-205, Section 6-1-1701 et seq.

[42] Regulation (EU) 2024/1689, Article 113.

[43] ISO. (2023). ISO/IEC 42001:2023.


About the Author

Ryan James Purdy is an assurance professional, independent researcher, and writer focused on AI governance in education and other regulated sectors. His work examines how AI policy and regulatory requirements translate into institutional implementation.

About Purdy House Publishing & Consulting

Purdy House Institute is an independent research imprint publishing working papers on AI governance in education and related regulated contexts. The AI Governance in Education series examines gaps between AI governance policy and institutional implementation.

Correspondence: jamespurdy624@gmail.com



 
 
 
