The Intervention Journey: A Roadmap to Effective Digital Safety Measures 2025
TABLE 4: Example behavioural interventions

Intervention: Threat modelling mentality
Description: An internal company approach that focuses on understanding how systems and policies can be exploited or "gamed" by malicious users, and on designing defences against such exploitation.
Example organization: Niantic – runs trust and safety red team exercises to identify gaps or potential problem areas within the product before launch.

Intervention: Service design
Description: Designing platform features that allow users to control their experience, such as disabling comments, blocking users or setting privacy filters. Denying bad actors their desired impact can lead them to cease their offensive behaviour.
Example organization: Instagram – allows users to turn off comments on posts, restrict unwanted interactions and block specific accounts to minimize exposure to harmful content or harassment.

Intervention: User identification (e.g. detecting repeat offenders)
Description: Systems designed to identify repeat offenders or malicious actors using behavioural patterns, unique identifiers or account verification, helping detect banned users trying to rejoin.
Example organization: Facebook – uses behavioural pattern recognition to detect suspicious activity, such as users whose accounts have been disabled creating new accounts under different identities.

Intervention: Detection of behavioural signals (e.g. keywords/phrases)
Description: Algorithms that monitor for specific behavioural signals or keywords associated with harmful content, such as harassment, hate speech or threats, to prevent and mitigate harm.
Example organization: Tinder – its "Are You Sure?" feature uses AI to detect harmful language, prompting senders to think twice about their opening line by alerting them that their message may be offensive.

Intervention: Positive reinforcement for safe behaviour
Description: Platforms that reward or encourage positive online behaviours, such as respectful interactions, and highlight users who contribute to the platform's safety and positive culture.
Example organization: Reddit – its karma system awards "karma" points for positive contributions to discussions, promoting respectful behaviour and discouraging trolling or abuse.

Intervention: Behaviour-focused warnings
Description: Displaying warnings based on user actions, such as alerting a user who is engaging in harmful behaviour, or escalating enforcement (e.g. temporary suspension) after repeated offences.
Example organization: YouTube – warns users when they violate content policies, giving them a chance to change their behaviour before harsher penalties such as account suspension are imposed.
Source: Niantic (2023), Our Approach to Safety; Instagram Help Centre (n.d.), Managing Your Privacy Settings; Meta (2024), How Enforcement Technology Works; Tinder Newsroom (2021), Tinder Introduces "Are You Sure?", an Industry-First Feature That Is Stopping Harassment Before It Starts; Reddit (2024), What Is Karma?; YouTube Help (n.d.), Community Guidelines Strike Basics on YouTube.

3.4 Behavioural interventions
Behavioural interventions focus on changing
individuals’ and groups’ actions and habits to
reduce digital risks. These interventions apply
strategies from psychology and behavioural science
to promote safe online behaviours and discourage
risky or harmful activities.
Tailoring interventions to the specific situation of the individual ensures that the support provided is appropriate and effective. Organizations must build the capability to design and deploy such interventions well, drawing on established behavioural-science approaches.
Additionally, balancing privacy concerns with
the need to report potential radicalization to law
enforcement poses a challenge. It is important to
navigate these concerns carefully, ensuring that
interventions are both effective in addressing risks
and respectful of individual privacy rights.
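Two of the mechanisms described in Table 4, keyword-based detection of behavioural signals and behaviour-focused warnings with escalating enforcement, can be sketched together in code. This is a minimal illustration, not any platform's actual system: the wordlist, thresholds and `check_message` function are hypothetical, and production systems typically rely on machine-learning classifiers rather than fixed keyword lists.

```python
# Hypothetical sketch: flag a message against a small wordlist, then choose an
# enforcement action that escalates with the sender's history of offences.

FLAGGED_TERMS = {"idiot", "loser"}  # placeholder wordlist; real systems use ML classifiers


def check_message(message: str, prior_offences: int) -> str:
    """Return an enforcement action for a message, given the sender's record."""
    # Normalize tokens: strip common punctuation and lowercase before matching.
    words = {w.strip(".,!?").lower() for w in message.split()}
    if not words & FLAGGED_TERMS:
        return "allow"
    if prior_offences == 0:
        return "warn"           # "Are you sure?"-style nudge on first offence
    if prior_offences < 3:
        return "warn_strongly"  # repeated behaviour draws a stronger alert
    return "suspend"            # escalate to temporary suspension after repeat offences


print(check_message("have a nice day", 0))   # → allow
print(check_message("you are an idiot", 0))  # → warn
print(check_message("you are an idiot", 5))  # → suspend
```

The escalation thresholds here are arbitrary; the point is the pattern the table describes: warn first, then strengthen enforcement as offences repeat.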