2.3 Advanced AI agents

The architecture of many current AI agents is based on or linked to LLMs, which are configured in complex ways. Figure 3 presents a simplified overview of the key components leading to current breakthroughs in AI agents and their growing range of capabilities.

FIGURE 3: Key components of advanced AI agents

The AI agent's workflow begins with user input, which is directed to the agent’s control centre. The user input could be a prompt given to carry out an instruction. The control centre directs the user input to the model, which forms the core algorithmic foundation of the AI agent. This model could be an LLM or an LMM, depending on the application’s needs. The model then processes the input data from the user’s instructions to generate the desired result.17

At the core of the architecture is the control centre, a crucial component that manages the flow of information and commands throughout the system. It acts as the orchestration layer, directing inputs to the model and routing the output to appropriate tools or effectors. In simple terms, this layer orchestrates the flow of information between 1) user inputs, 2) decision-making and planning, 3) memory management, 4) access to tools and 5) the effectors of the system enabling action in digital or physical environments.18

The decision-making and planning component of an AI agent uses the model’s outputs to plan and decide on multistep processes. Advanced features such as chain-of-thought (CoT) reasoning are implemented in this component, allowing the AI agent to engage in multistep reasoning and planning. CoT is a technique in which an AI agent systematically processes and articulates intermediate steps to reach a conclusion. This enhances the agent’s ability to solve complex problems in a transparent manner, as each step of the model’s underlying reasoning is reproduced in natural language.19

Memory management is vital for the continuity and relevance of operations.
This component ensures that the AI agent remembers previous interactions and maintains context. This is essential for tasks that require historical data to inform decisions or for maintaining conversational context in chatbots.

Tools enable the AI agent to access and interact with multiple functions or modalities. For example, in an online setting, an AI agent could have access to external tools such as web searches to gather real-time information, scheduling tools to manage appointments and send reminders, and project management software to track tasks and deadlines. In terms of modalities, an AI agent could use natural language processing tools alongside image recognition capabilities to perform tasks that require understanding of both text-based and visual data sources.

Once decisions are made or plans set, the effectors component of the AI agent executes the required actions. This could involve interacting

[Figure 3 diagram labels: AI agent; Environment; Percepts; Sensors; Actions; Effectors; Learning; User input; Control centre; Model; Decision-making and planning; Memory management; Tools; Digital infrastructure; Physical infrastructure]
Source: World Economic Forum
Navigating the AI Frontier: A Primer on the Evolution and Impact of AI Agents
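The component flow described in this section (user input, control centre, model, decision-making and planning, memory management, tools and effectors) can be illustrated with a minimal Python sketch. It is not taken from the report: every name in it (`ControlCentre`, `Memory`, `stub_model`, the `search` tool and the `notify` effector) is a hypothetical placeholder, and the model is replaced by a stub that returns a fixed two-step plan instead of calling a real LLM or LMM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def stub_model(prompt: str) -> str:
    """Stand-in for an LLM/LMM call (assumption): returns a fixed two-step plan."""
    return "search: weather in Geneva\nnotify: send summary to user"


@dataclass
class Memory:
    """Memory management: keeps prior interactions so later steps have context."""
    history: List[str] = field(default_factory=list)

    def remember(self, item: str) -> None:
        self.history.append(item)

    def context(self) -> str:
        return "\n".join(self.history)


@dataclass
class ControlCentre:
    """Orchestration layer: routes input to the model and plan steps to handlers."""
    model: Callable[[str], str]
    tools: Dict[str, Callable[[str], str]]
    effectors: Dict[str, Callable[[str], str]]
    memory: Memory = field(default_factory=Memory)

    def plan(self, user_input: str) -> List[str]:
        # Decision-making and planning: turn the model output into explicit steps.
        prompt = self.memory.context() + "\n" + user_input
        lines = self.model(prompt).splitlines()
        return [line.strip() for line in lines if line.strip()]

    def handle(self, user_input: str) -> List[str]:
        self.memory.remember(f"user: {user_input}")
        results = []
        for step in self.plan(user_input):
            # Each step has the hypothetical form "<handler name>: <argument>".
            name, _, argument = step.partition(":")
            name, argument = name.strip(), argument.strip()
            handler = self.tools.get(name) or self.effectors.get(name)
            outcome = handler(argument) if handler else f"no handler for '{name}'"
            self.memory.remember(f"{name}: {outcome}")
            results.append(outcome)
        return results


agent = ControlCentre(
    model=stub_model,
    tools={"search": lambda query: f"results for '{query}'"},
    effectors={"notify": lambda message: f"sent: {message}"},
)
outputs = agent.handle("What is the weather in Geneva?")
print(outputs)
```

In this sketch the control centre is the only component that talks to the others, which mirrors the orchestration role described above. A real agent would replace `stub_model` with a model call and the lambdas with actual tool and effector integrations.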

Navigating the AI Frontier 2024