Agentic AI has gone from being a neologism at the beginning of the year to becoming the de rigueur agenda item of every corporate strategy meeting. Never mind that agentic behaviour has been quietly doing its job as an integral part of GenAI ever since ChatGPT burst onto the stage.
At the most basic level, any act that exercises willful agency and utilizes an LLM can be considered agentic. Thus, asking a chatbot to fetch a response from a website or a knowledge corpus qualifies as agentic use.
However, Agentic AI becomes most potent when it aligns with a business process and orchestrates not just decision making but action execution as well. Most of the current hoopla around Generative AI centers on this scenario.
In the FOMO frenzy, most organizations are diving headlong into Agentic AI to drive any kind of automation. Every automation is being re-labelled as agentic without any consideration of whether that is the optimal route to driving productivity in the process.
This has resulted in the co-existence of multiple Agentic AI frameworks in the same company that are either functionally or regionally siloed. These frameworks come in multiple incarnations: some are offered by commercial platforms to improve stickiness and capitalize on a lack of in-depth understanding, some are open source or come from niche companies, and some are offered by hyperscalers and large platform/services companies.
If one were to extrapolate this scenario one year into the future, one can clearly see how managing this Agentic AI sprawl will become nightmarish. Most business processes are cross-functional and rely on software platforms that are either cross-functional by design or built for easy integration with best-of-breed, function-specific platforms. To drive end-to-end process optimization, an agent created in one function may need to invoke a platform belonging to another function, a platform that is also being invoked for a very similar task by a different agent within that function. Thus, one will end up with redundant agents competing for resources and increasing costs. Moreover, if these agents have been created using different frameworks, they will not be able to “talk” to each other.
To prevent such a snafu from ever materializing, organizations need to critically look at the following two aspects:
AgentOps: How to build an agent architecture that ensures interoperability and standardizes resource requests across different agentic frameworks?
FinOps: Given that most agents follow the ReAct principle, how to control the cost of reasoning, which can spiral out of control in a hurry?
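To make the FinOps concern concrete, here is a minimal sketch of a guardrail around a ReAct-style reason/act loop. The `call_llm` and `run_tool` callables and the budget numbers are hypothetical stand-ins, not part of any specific framework; the point is simply that step and token caps can be enforced outside the model.

```python
# Minimal sketch of a FinOps guardrail around a ReAct-style loop.
# `call_llm` and `run_tool` are hypothetical stand-ins for your model
# and tool integrations; the budget values are illustrative only.

MAX_STEPS = 5            # hard cap on reason/act iterations per task
TOKEN_BUDGET = 20_000    # total tokens the agent may consume per task

def run_agent(task: str, call_llm, run_tool) -> str:
    tokens_used = 0
    context = task
    for _ in range(MAX_STEPS):
        # call_llm returns the reasoning trace, the next action (or None), and tokens used
        thought, action, tokens = call_llm(context)
        tokens_used += tokens
        if tokens_used > TOKEN_BUDGET:
            return "Aborted: token budget exceeded"
        if action is None:                 # the model decided it is done
            return thought
        observation = run_tool(action)     # execute the chosen tool/action
        context += f"\nThought: {thought}\nAction: {action}\nObservation: {observation}"
    return "Aborted: step limit reached"
```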
AgentOps: I propose a 3-layer architecture with full interoperability both within and across layers.
Agent Creation Layer: An agent can do one of a few things:
Simple Q&A: Respond to generic questions from the user by fetching answers from the LLM
Access resource (read-only): This can be independent of or concomitant with Simple Q&A (#1). Here, the agent accesses a data or information source and gets a response. This may require access to enterprise systems or the internet.
Execute actions (Write): Execute specific action(s) based upon pre-defined conditions.
The key question is: what or who orchestrates these actions? At the most basic level, a human can do it through repeated prompting. A common approach is to use the built-in orchestration capability of most agent engines (Crew, LangGraph, CoPilot …). A more robust approach is to create standard interaction pathways that are framework-agnostic. This is what the Protocol layer contains.
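Before moving to the Protocol layer, a hedged sketch of what a framework-agnostic agent contract could look like, covering the three capability types above (Q&A, read-only resource access, action execution). All class and method names are illustrative, not tied to any real framework.

```python
# Illustrative, framework-agnostic agent contract for the Agent Creation Layer.
# Names here are hypothetical; concrete frameworks would provide the implementations.

from abc import ABC, abstractmethod
from typing import Any

class Agent(ABC):
    @abstractmethod
    def answer(self, question: str) -> str:
        """Simple Q&A: respond using the underlying LLM."""

    @abstractmethod
    def read(self, resource_uri: str, query: str) -> Any:
        """Read-only access to an enterprise system, knowledge corpus, or the web."""

    @abstractmethod
    def act(self, action: str, payload: dict) -> dict:
        """Execute a pre-defined action when its conditions are met (the write path)."""
```

Standardizing on such a contract is what lets agents built on different engines be swapped or composed without rewriting the orchestration logic.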
Agent Orchestration (Protocol) Layer: The core idea behind the orchestration layer is to abstract away the differences between agentic frameworks by creating common, interoperable pathways. This greatly simplifies managing the multiple agentic frameworks that will invariably end up in different parts of an organization. In addition to the agent interactions defined above, an agent may also interact with another agent from the same or a different framework. The idea is that all agent-to-agent and agent-to-resource interactions should happen in a standardized, framework-agnostic way. MCP and A2A are quickly becoming the dominant protocols here: MCP uses client-server principles and RPC calls from the client/agent to resources, while A2A uses a similar construct, albeit for inter-agent interactions.
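As a rough illustration of what "framework-agnostic" means in practice, MCP is built on JSON-RPC, so an agent's request to a resource server reduces to a standard message shape. The tool name and arguments below are made up, and transport details (stdio or HTTP) are omitted; this is a simplified sketch, not the full protocol.

```python
# Simplified illustration of the JSON-RPC request shape an MCP client/agent
# sends to a resource server. The tool name and arguments are hypothetical.

import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_knowledge_base",   # hypothetical tool exposed by the server
        "arguments": {"query": "open purchase orders for supplier X"},
    },
}

print(json.dumps(request, indent=2))
```

Because every framework speaks the same wire format, the same resource server can serve agents built on different engines without bespoke adapters.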
Language Model Layer: This layer is critical both for defining the roles and interactions of agents and for building efficiency measures, such as memory management, that improve performance and lower cost.
Language model: The choice of language model is critical in an Agentic AI framework as it governs both the accuracy and the associated cost of running an agent. Most agentic systems use the ReAct (Reasoning and Acting) framework, and its reasoning part consumes a lot of inference tokens. Therefore, using a generic LLM may not always be a cost-viable choice. One needs to look at options such as distilled or fine-tuned models that are smaller and customized to the context in which the agent will be operating. This not only makes them more accurate and less likely to hallucinate, it also reduces operating costs significantly.
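A back-of-the-envelope comparison makes the point. All prices and token counts below are placeholders, not vendor quotes; substitute your own figures for the generic and fine-tuned models you are evaluating.

```python
# Illustrative per-task cost comparison for a reasoning-heavy agent.
# Prices and token counts are placeholders only.

def cost_per_task(context_tokens: int, generated_tokens: int,
                  price_in_per_1k: float, price_out_per_1k: float) -> float:
    # context_tokens are billed at the input rate, generated tokens at the output rate
    return (context_tokens / 1000) * price_in_per_1k + \
           (generated_tokens / 1000) * price_out_per_1k

# Hypothetical figures: a large generic LLM vs a smaller fine-tuned model
large_generic = cost_per_task(8000, 1000, price_in_per_1k=0.010, price_out_per_1k=0.030)
small_tuned   = cost_per_task(3000,  500, price_in_per_1k=0.001, price_out_per_1k=0.002)

print(f"Generic model:    ${large_generic:.4f} per task")
print(f"Fine-tuned model: ${small_tuned:.4f} per task")
```

Multiplied across thousands of agent invocations per day, even a modest per-task difference dominates the economics of an agentic deployment.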
Memory management: As the number of agents and the actions they perform increases, it may be helpful to use a memory management framework. Such frameworks usually consist of short- and long-term memory components. Maintaining context in the short term while retaining static knowledge in the long term helps on multiple fronts: it reduces latency and cost while improving consistency and accuracy. This step is highly recommended for organizations planning to scale agentic applications across the enterprise, since cost management becomes critical when coordinating a large number of agents and their actions.
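A minimal sketch of the two-tier idea follows: a bounded short-term buffer for conversational context plus a simple long-term store for static knowledge. A production system would typically back the long-term tier with a vector store or database; this in-memory version is illustrative only, and all names are hypothetical.

```python
# Illustrative two-tier memory manager: short-term conversational buffer
# plus a long-term store for static facts. In-memory only, for sketching.

from collections import deque

class AgentMemory:
    def __init__(self, short_term_limit: int = 20):
        self.short_term = deque(maxlen=short_term_limit)  # recent turns, oldest evicted first
        self.long_term: dict[str, str] = {}               # static knowledge, keyed by topic

    def remember_turn(self, role: str, content: str) -> None:
        self.short_term.append((role, content))

    def store_fact(self, key: str, fact: str) -> None:
        self.long_term[key] = fact

    def build_context(self, keys: list[str]) -> str:
        # Static knowledge first, then the recent conversation window
        facts = [self.long_term[k] for k in keys if k in self.long_term]
        turns = [f"{role}: {content}" for role, content in self.short_term]
        return "\n".join(facts + turns)
```

Keeping static knowledge out of every prompt and bounding the conversational window is precisely what trims redundant tokens, which is where the latency and cost savings come from.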