AI Agents in Action: Foundations for Evaluation and Governance, 2025
LLM-based AI agents, for example, introduce new
risks such as goal misalignment, behavioural drift,
tool misuse and emergent coordination failures
that traditional software governance models are
unable to manage. Unlike conventional software,
agents are increasingly assuming roles that
resemble those of human decision-makers rather
than static tools. This means that governance
models designed solely for access control and
system reliability are no longer sufficient. A more
useful comparison is the governance applied
to human users, who must earn permissions
and trust by demonstrating accountability and
performance over time. Similarly, trust in AI agents can be
established by testing their behaviour against
validated cases, running them in human-in-the-
loop configurations and gradually expanding
autonomy only once reliability has been sufficiently
demonstrated. In both cases, the principle of least
privilege remains essential, with access limited to
information and actions necessary for the task.
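The progression described above – validated test cases, human-in-the-loop operation, and autonomy expanded only as reliability is demonstrated, all under least privilege – can be sketched in code. This is a minimal illustrative sketch, not a mechanism from the report: the tier names, thresholds and permission sets are all hypothetical.

```python
# Hypothetical sketch of a graduated-autonomy policy for an AI agent.
# Tier names, thresholds and permissions are illustrative assumptions,
# not prescriptions from the report.
from dataclasses import dataclass


@dataclass(frozen=True)
class AutonomyTier:
    name: str
    min_pass_rate: float        # share of validated test cases passed
    min_supervised_runs: int    # human-in-the-loop runs completed
    allowed_actions: frozenset  # least privilege: only what the task needs


TIERS = [
    AutonomyTier("supervised", 0.0, 0, frozenset({"read"})),
    AutonomyTier("assisted", 0.90, 50, frozenset({"read", "draft"})),
    AutonomyTier("autonomous", 0.99, 500, frozenset({"read", "draft", "execute"})),
]


def grant_tier(pass_rate: float, supervised_runs: int) -> AutonomyTier:
    """Return the highest tier whose reliability evidence has been met."""
    eligible = [t for t in TIERS
                if pass_rate >= t.min_pass_rate
                and supervised_runs >= t.min_supervised_runs]
    return max(eligible, key=lambda t: t.min_pass_rate)


def is_permitted(tier: AutonomyTier, action: str) -> bool:
    """Least privilege: deny any action outside the tier's allow-list."""
    return action in tier.allowed_actions
```

For example, an agent passing 95% of validated cases after 120 supervised runs would be granted the middle tier and denied the "execute" action until it meets the higher thresholds.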
This report aims to provide a forward-looking analysis
of the evolving landscape of AI agents, focusing
on the capabilities, infrastructure, classification and
safeguards necessary for responsible deployment.
To this end, it is structured around four pillars – classification, evaluation, risk assessment
and governance – which together form the basis
for a progressive approach to adoption and
deployment. Figure 1 presents the overall structure
of this report, which is intended to guide the responsible
adoption and deployment of AI agents.
The goal is to equip adopters, providers, technical
leaders, organizational decision-makers and other
stakeholders with a shared understanding of the
current state of agentic systems and emerging
oversight practices. Building on established
AI governance principles and frameworks,
such as those developed by the Organisation
for Economic Co-operation and Development
(OECD),2 National Institute of Standards and
Technology (NIST),3 International Organization for
Standardization (ISO)/International Electrotechnical
Commission (IEC)4 and others, this paper
introduces additional principles addressing
autonomy, authority, operational context and
systemic risk, extending existing governance
guidance through an agent-focused lens. The
insights have been informed by working group
meetings, workshops and extensive interviews with
members of the Safe Systems and Technologies
working group of the AI Governance Alliance.