Artificial Intelligence and Cybersecurity: Balancing Risks and Rewards 2025
TABLE 2: Examples of existing control capabilities that need to be developed

Control – Description
– Training data security: Data inputs need to be protected and managed to avoid deliberate poisoning of, and accidental damage to, the AI system.
– Prompt curation: Prompts need to be curated to mitigate the risks of prompt injection and jailbreaking.
– Output verification: The integrity and reliability of AI outputs need to be verified. Currently, this is mostly done by humans.
– Monitoring and detection: The behaviour of AI systems needs to be monitored to detect manipulation in a timely manner.
– Red teaming and adversarial testing: Guidelines and tools are required for red-teaming AI models, systems and processes that use AI outputs. This is particularly critical for regulated sectors that already mandate such testing. AI systems could themselves be harnessed to red-team AI models with greater efficacy.
FIGURE 6: Application of risk controls to the attack surface

Risk controls:
– Application software security: Ringfencing AI systems until their security is validated, before they are put into production and integrated with critical business processes
– Monitoring and detection: Tools for monitoring AI-system behaviours to detect manipulation
– Inventory of core AI assets: Ensuring that all new assets (devices and software) relating to AI infrastructure are mapped
– Inventory of supporting infrastructure: Mapping the infrastructure supporting the new AI infrastructure, such as databases and APIs, to ensure that its criticality is understood and that it is protected accordingly
– Data protection: Ensuring that new data requiring protection is identified, and its criticality (e.g. impact on business processes via the AI models) is mapped
– Incident-response management: Incident-response procedures and refreshed business-continuity plans to account for the impacts of AI-related cyber risks
– Incident-response management: Approaches and tools for recovering AI systems that have been compromised ("roll-back" procedures for AI models)
– Penetration testing: Guidelines and tools for red-teaming AI models
– Monitoring and detection: Approaches and tools for verifying the integrity and reliability of AI outputs, including the role of human oversight

Attack surface:

Input
– Data poisoning
– Prompt injection
– Model evasion (input data causing altered model behaviour)

Model development and update
– Malign insertion of vulnerabilities (backdoors)
– Developer errors
– Compromise of development environment

Training
– Training data poisoning
– Compromise of training environment

Core AI infrastructure: Model
– Exploitation of vulnerabilities
– Alteration of model code

Output
– Manipulation of data post-output (e.g. through API compromise)

Monitoring and logging
– Manipulation of monitoring tools' integrity
– Data leakage from monitoring tools
– Compromise of monitoring tool access

Directly supporting infrastructure (reachable via lateral movement):

Data storage
– Leakage of data
– Manipulation or insertion of data (leading to model poisoning)

Underlying hardware/software stack, operating system
– Exploitation of vulnerabilities leading to compromise of underlying infrastructure

APIs and interfaces
– Exploitation of vulnerabilities leading to data compromise at APIs

Business applications (what the output data is used for; non-exhaustive list), at risk from manipulated input or output data:
– Driving business processes
– Presenting information to end users/clients (recommendation engines, chatbots)
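Several of the attack-surface items above (training data poisoning, manipulation of stored data, alteration of model code) can be addressed with the same underlying mechanism: an integrity manifest that hashes every data and model artifact at ingestion and re-verifies it before use. A minimal Python sketch follows; the file names and helper functions are illustrative assumptions, not prescribed by the report:

```python
import hashlib

def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Map each artifact name to the SHA-256 digest of its contents."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}

def verify(files: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return names of artifacts that are new or whose contents changed."""
    current = build_manifest(files)
    return [name for name, digest in current.items() if manifest.get(name) != digest]

# Simulated workflow: snapshot the training corpus, then detect tampering.
corpus = {"train.csv": b"label,text\n0,hello\n"}
manifest = build_manifest(corpus)
corpus["train.csv"] += b"1,poisoned row\n"  # simulated data-poisoning insertion
print(verify(corpus, manifest))  # ['train.csv']
```

The same manifest, stored with each released model version, also supports the "roll-back" procedures listed under incident-response management: a known-good version can be identified by comparing digests rather than trusting file names or timestamps.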