AI Agents in Action Foundations for Evaluation and Governance 2025


LLM-based AI agents, for example, introduce new risks such as goal misalignment, behavioural drift, tool misuse and emergent coordination failures that traditional software governance models are unable to manage. Unlike conventional software, agents increasingly assume roles resembling those of human decision-makers rather than static tools. Governance models designed solely for access control and system reliability are therefore no longer sufficient. A more useful comparison is the governance applied to human users, who earn permissions, accountability and trust by demonstrating performance over time. Similarly, trust in AI agents can be established by testing their behaviour against validated cases, running them in human-in-the-loop configurations and expanding autonomy only once reliability has been sufficiently demonstrated. In both cases, the principle of least privilege remains essential, with access limited to the information and actions necessary for the task.

This report provides a forward-looking analysis of the evolving landscape of AI agents, focusing on the capabilities, infrastructure, classification and safeguards necessary for responsible deployment. To this end, it is structured around four pillars: classification, evaluation, risk assessment and governance. Together, these form the foundation for a progressive approach to adoption and deployment. Figure 1 presents the general content of this report, which helps guide the responsible adoption and deployment of AI agents. The goal is to equip adopters, providers, technical leaders, organizational decision-makers and other stakeholders with a shared understanding of the current state of agentic systems and emerging oversight practices.
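The progression described above, from least privilege through human-in-the-loop operation to expanded autonomy earned via validated cases, can be sketched in code. The following is a minimal illustrative model, not a prescribed implementation from this report; the tier names, actions and promotion threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical trust tiers: an agent starts at tier 0 with least-privilege
# access and gains additional permitted actions only as it climbs tiers.
TRUST_TIERS = {
    0: {"read_docs"},                               # everything else needs a human
    1: {"read_docs", "draft_reply"},                # low-risk actions permitted
    2: {"read_docs", "draft_reply", "send_reply"},  # expanded autonomy
}

@dataclass
class AgentTrustRecord:
    tier: int = 0
    passed: int = 0
    failed: int = 0

    def record_case(self, success: bool, promote_after: int = 20) -> None:
        """Update the record after one validated test case. Trust is earned
        slowly (promotion after a streak of passes) and lost quickly
        (any failure demotes the agent one tier)."""
        if success:
            self.passed += 1
            if self.passed % promote_after == 0 and self.tier < max(TRUST_TIERS):
                self.tier += 1
        else:
            self.failed += 1
            self.tier = max(0, self.tier - 1)

    def is_permitted(self, action: str) -> bool:
        """Principle of least privilege: allow only actions in the current tier."""
        return action in TRUST_TIERS[self.tier]


record = AgentTrustRecord()
# Initially the agent may only read; sending a reply requires a human.
print(record.is_permitted("send_reply"))   # False at tier 0
for _ in range(20):                        # 20 validated successes
    record.record_case(True)
print(record.tier)                         # promoted to tier 1
record.record_case(False)                  # one failure demotes immediately
print(record.tier)                         # back to tier 0
```

The asymmetry between promotion and demotion reflects the report's framing: autonomy expands gradually once reliability has been demonstrated, but a single observed failure returns the agent to closer human oversight.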
Building on established AI governance principles and frameworks, such as those developed by the Organisation for Economic Co-operation and Development (OECD),2 the National Institute of Standards and Technology (NIST),3 the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC)4 and others, this paper introduces additional principles addressing autonomy, authority, operational context and systemic risk that extend existing governance guidance through an agent-focused lens. The insights have been informed by working group meetings, workshops and extensive interviews with members of the Safe Systems and Technologies working group of the AI Governance Alliance.