The Intervention Journey A Roadmap to Effective Digital Safety Measures 2025

Page 33 of 45 · WEF_The_Intervention_Journey_A_Roadmap_to_Effective_Digital_Safety_Measures_2025.pdf

TABLE 4: Example behavioural interventions

Threat modelling mentality
Description: An internal company approach that focuses on understanding how systems and policies can be exploited or "gamed" by malicious users, and on designing defences against such exploitation.
Example organization: Niantic. Runs trust and safety red-team exercises to identify gaps or potential problem areas within the product before launch.

Service design
Description: Designing platform features that let users control their experience through options such as disabling comments, blocking users or setting privacy filters. Denying bad actors their desired impact can lead them to cease their offensive behaviour.
Example organization: Instagram. Allows users to turn off comments on posts, restrict unwanted interactions and block specific accounts to minimize exposure to harmful content or harassment.

User identification (e.g. detecting repeat offenders)
Description: Systems designed to identify repeat offenders or malicious actors using behavioural patterns, unique identifiers or account verification, helping detect banned users trying to rejoin.
Example organization: Facebook. Uses behavioural pattern recognition to detect suspicious activity, such as users whose accounts were disabled creating new accounts under different identities.

Detection of behavioural signals (e.g. keywords/phrases)
Description: Algorithms that monitor for specific behavioural signals or keywords associated with harmful content, such as harassment, hate speech or threats, to prevent and mitigate harm.
Example organization: Tinder. The "Are You Sure?" feature warns users to think twice about their opening line, using AI to detect harmful language and alerting the sender that their message may be offensive.

Positive reinforcement for safe behaviour
Description: Platforms that reward or encourage positive online behaviours, such as respectful interactions, and highlight users who contribute to the platform's safety and positive culture.
Example organization: Reddit. The karma system rewards users with "karma" points for positive contributions to discussions, promoting respectful behaviour and discouraging trolling or abuse.

Behaviour-focused warnings
Description: Displaying warnings based on user actions, such as alerting a user who is engaging in harmful behaviour, or escalating enforcement (e.g. temporary suspension) after repeated offences.
Example organization: YouTube. Provides warnings when users violate content policies, offering them the chance to change their behaviour before harsher penalties, such as account suspension, are imposed.

Source: Niantic (2023), Our Approach to Safety; Instagram Help Centre (n.d.), Managing Your Privacy Settings; Meta (2024), How Enforcement Technology Works; Tinder Newsroom (2021), Tinder Introduces "Are You Sure?", an Industry-First Feature That Is Stopping Harassment Before It Starts; Reddit (2024), What Is Karma?; YouTube Help (n.d.), Community Guidelines Strike Basics on YouTube.

3.4 Behavioural interventions

Behavioural interventions focus on changing individuals' and groups' actions and habits to reduce digital risks. These interventions apply strategies from psychology and behavioural science to promote safe online behaviours and discourage risky or harmful activities. Tailoring interventions to the specific situation of the individual ensures that the support provided is appropriate and effective. Organizations must develop the capability to implement behavioural interventions by applying approaches that support the effective design and deployment of such strategies. Additionally, balancing privacy concerns with the need to report potential radicalization to law enforcement poses a challenge. It is important to navigate these concerns carefully, ensuring that interventions are both effective in addressing risks and respectful of individual privacy rights.
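The repeat-offender detection described in Table 4 can be illustrated with a minimal sketch: new signups are compared against fingerprints of previously banned accounts. The fingerprint fields used here (a device identifier and a coarse behavioural bucket) are illustrative assumptions, not any platform's actual signals; production systems combine many behavioural and technical indicators.

```python
# Hypothetical sketch of repeat-offender detection via fingerprint matching.
# Field names and matching logic are illustrative assumptions only.
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    device_id: str        # hypothetical device identifier
    cadence_bucket: int   # coarse bucket of typing-speed behaviour


banned_fingerprints: set[Fingerprint] = set()


def record_ban(fp: Fingerprint) -> None:
    """Remember the fingerprint of an account that was banned."""
    banned_fingerprints.add(fp)


def looks_like_banned_user(fp: Fingerprint) -> bool:
    """Flag a new account whose fingerprint matches a banned one."""
    return fp in banned_fingerprints
```

A real system would use fuzzy similarity across many signals rather than exact set membership, since determined bad actors vary their identifiers; the sketch only conveys the core idea of linking new accounts to past enforcement actions.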
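Similarly, the behavioural-signal detection behind features like Tinder's "Are You Sure?" can be sketched in its simplest form as a pre-send check on outgoing messages. The denylist and prompt text below are hypothetical; the actual feature uses AI language models, not a word list.

```python
# Illustrative "think twice" pre-send check. The term list and prompt
# wording are hypothetical assumptions, not any platform's real logic.
from typing import Optional

HARMFUL_TERMS = {"idiot", "loser", "ugly"}  # hypothetical denylist


def should_warn(message: str) -> bool:
    """Return True if the outgoing message contains flagged language."""
    tokens = {t.strip(".,!?").lower() for t in message.split()}
    return not tokens.isdisjoint(HARMFUL_TERMS)


def compose_prompt(message: str) -> Optional[str]:
    """Produce an 'Are you sure?'-style prompt, or None if no warning is needed."""
    if should_warn(message):
        return "Are you sure you want to send this? It may be hurtful."
    return None
```

The design point this illustrates is that the intervention happens before the message is delivered, giving the sender a chance to self-correct rather than relying solely on after-the-fact moderation.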