Diverse and inclusive data

Equitable data is not a luxury; diverse and inclusive datasets are essential for creating AI that reflects and serves all of humanity. The World Economic Forum’s Global Future Council on Data Equity defines data equity as the shared responsibility for fair data practices that respect and promote human rights, opportunity and dignity. Data equity is a fundamental responsibility that requires strategic, participative, inclusive, proactive and coordinated action. It aims to create a world where data-based systems promote fair, just and beneficial outcomes for all individuals, groups and communities.[15]

National language models are an important new way to support data equity. A PPP between the United Arab Emirates government and G42 has developed one of the world’s first LLMs based specifically on Modern Standard Arabic (understood across the Middle East) and diverse regional spoken dialects.[16] Known as “Jais”, the LLM draws on local media reports and social media posts to ensure that locally spoken languages are included in its development, while also considering cultural norms. Taking inspiration from Google Research’s language inclusion work and the concept of digital language banks, Jais should act as a catalyst for meeting region-specific model requirements.

Additionally, Cohere has developed Aya, a dataset (more specifically, a digital language bank) that represents one of the largest collections of multilingual models covering 114 languages, including rare and local dialects.[17] The Aya models and datasets have been released publicly with the intention of safely advancing the R&D of multilingual capabilities.

Data ownership and sharing

The controlled ownership of data enables governments to regulate how data is shared internationally, thereby reducing misuse and promoting trust in AI applications.
The complexity of data ownership is now increasing with the emergence of the agent economy and multi-agent interactions, where data is modified many times during use. The past few years have seen a shift to data residency restrictions, often justified as essential to national security. These restrictions are now shaping data centre investment as tech companies look to comply with data residency requirements, operational compliance and, in some cases, the need for individual consent. Microsoft’s recent announcement of significant investment into cloud services in Saudi Arabia,[18] for example, is partly driven by market demand and partly by evolving regional data residency requirements.

Data protection and privacy

Emerging privacy challenges such as deepfakes, AI-generated misinformation and high-profile data breaches are increasing mistrust in AI. Tools such as the World Economic Forum’s Digital Trust Framework can support regulators and industry leaders in considering shared goals and values in the development, use and application of AI.[19]

Disclosure requirements oblige organizations to share information about their data practices, including how data is collected, used and protected. Broadening these requirements to include AI-derived data enhances data protection, because it requires companies to clarify how they use AI to process and generate insights from personal information.[20] Expanding the requirements in this way may mean that companies need to offer their users an opt-in or opt-out choice before the purpose for which their data is used is expanded.

Data life cycle management

Regulatory tools remain key to safeguarding the privacy and security of data. Existing national data governance frameworks can be adapted and employed to ensure data is managed responsibly in the context of AI development, deployment and use.
International agreements on cross-border data flows are becoming increasingly vital tools for minimizing regulatory obstacles, enhancing collaborative research and knowledge sharing related to AI, and building trust in data sharing. Collaboration with stakeholders at regional and global levels can lead to the development of shared terminology for concepts relating to privacy and data protection, thereby promoting clarity and effective communication between all stakeholders. Data intermediaries and stewards, along with leadership from chief data officers, have an important role to play in guiding the data strategy for collecting, sharing and using data.

Data free flow with trust (DFFT) policies reinforce the need to govern the flow of data, in terms of both the data type and how it is used;[21] however, more comprehensive data governance is required to ensure that AI is developed responsibly and ethically. This does not happen organically, and, given the recent advancement of AI, governments must address their wider data governance approach to ensure that data is managed responsibly, with safeguards to protect privacy, security and ownership.

A Blueprint for Intelligent Economies 2024