Artificial Intelligence and Cybersecurity: Balancing Risks and Rewards 2025
TABLE 2: Examples of existing control capabilities that need to be developed

Control – Description
– Training data security: Data inputs need to be protected and managed to avoid deliberate poisoning of, and accidental damage to, the AI system.
– Prompt curation: Prompts need to be curated to mitigate the risks of prompt injection and jailbreaking.
– Output verification: The integrity and reliability of AI outputs need to be verified. Currently, this is mostly done by humans.
– Monitoring and detection: The behaviour of AI systems needs to be monitored to detect manipulation in a timely manner.
– Red teaming and adversarial testing: Guidelines and tools are required for red-teaming AI models, systems and processes that use AI outputs. This is particularly critical for regulated sectors that already mandate such testing. AI systems could themselves be harnessed to red-team AI models with greater efficacy.
FIGURE 6: Application of risk controls to the attack surface

Risk controls:
– Application software security: Ringfencing AI systems until their security is validated, before they are put into production and integrated with critical business processes
– Monitoring and detection: Tools for monitoring AI-system behaviours to detect manipulation
– Inventory of core AI assets: Ensuring that all new assets (devices and software) relating to AI infrastructure are mapped
– Inventory of supporting infrastructure: Mapping the infrastructure supporting the new AI infrastructure, such as databases and APIs, to ensure that its criticality is understood and that it is protected accordingly
– Data protection: Ensuring that new data requiring protection is identified, and its criticality (e.g. impact on business processes via the AI models) is mapped
– Incident-response management: Incident-response procedures and refreshed business-continuity plans to account for the impacts of AI-related cyber risks
– Incident-response management: Approaches and tools for recovering AI systems that have been compromised ("roll-back" procedures for AI models)
– Penetration testing: Guidelines and tools for red-teaming AI models
– Monitoring and detection: Approaches and tools for verifying the integrity and reliability of AI outputs, including the role of human oversight

Attack surface:

Input
– Data poisoning
– Prompt injection
– Model evasion (input data causing altered model behaviour)

Model development and update
– Malign insertion of vulnerabilities (backdoors)
– Developer errors
– Compromise of development environment

Training
– Training data poisoning
– Compromise of training environment

Core AI infrastructure: Model
– Exploitation of vulnerabilities
– Alteration of model code

Output
– Manipulation of data post-output (e.g. through API compromise)

Monitoring and logging
– Manipulation of monitoring tools' integrity
– Data leakage from monitoring tools
– Compromise of monitoring tool access

Directly supporting infrastructure (reachable via lateral movement):

Data storage
– Leakage of data
– Manipulation or insertion of data (leading to model poisoning)

Underlying hardware/software stack, operating system
– Exploitation of vulnerabilities leading to compromise of underlying infrastructure

APIs and interfaces
– Exploitation of vulnerabilities leading to data compromise at APIs

Business applications (what the output data is used for; non-exhaustive list), at risk from manipulated input or output data:
– Driving business processes
– Presenting information to end users/clients (recommendation engines, chatbots)
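Several of the attack-surface items above (training data poisoning, manipulation of stored data, alteration of model code) can be addressed with the same underlying mechanism: an integrity manifest that hashes every data and model artifact at ingestion and re-verifies it before use. A minimal Python sketch follows; the file names and helper functions are illustrative assumptions, not prescribed by the report:

```python
import hashlib

def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Map each artifact name to the SHA-256 digest of its contents."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}

def verify(files: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return names of artifacts that are new or whose contents changed."""
    current = build_manifest(files)
    return [name for name, digest in current.items() if manifest.get(name) != digest]

# Simulated workflow: snapshot the training corpus, then detect tampering.
corpus = {"train.csv": b"label,text\n0,hello\n"}
manifest = build_manifest(corpus)
corpus["train.csv"] += b"1,poisoned row\n"  # simulated data-poisoning insertion
print(verify(corpus, manifest))  # ['train.csv']
```

The same manifest, stored with each released model version, also supports the "roll-back" procedures listed under incident-response management: a known-good version can be identified by comparing digests rather than trusting file names or timestamps.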