AI agents: are they here yet?
AI agents that can automate complex tasks and decision making are the talk of the technosphere at the moment. But how close are we to models that can be trusted to handle important functions on our behalf?
The idea of 'smart' assistants that can perform simple tasks for us is now a familiar one: Siri and Alexa are practically household names. However, agentic AI, the newest buzz coming out of tech companies, takes the concept much further. Agentic AI promises to go beyond simple algorithms to help us with complex tasks. But, as with most things in the AI sector, it's always wise to step back and ask yourself whether you're dealing with reality or speculation.
Agentic AI would have much more autonomy to, as Scientific American puts it, “interact with external apps to perform tasks on behalf of individuals or organizations.” Even as some raise red flags about potential safety concerns, others ask whether such agents are even possible.
The new agentic AI
The defining difference between the AI we have known and the agentic AI that is being promised by everyone from OpenAI to Meta to Google is that it can do more than summarize or create content. These agents are supposed to be able to make decisions and take action to work toward a specific goal.
Within this definition there are a number of different types of agents. In fact, there's a spectrum of automated assistance that goes from the most elementary (for example, a thermostat) to the most sophisticated (Amazon's automated retail stores).
Arcee.ai usefully summarizes the different options:
- Simple reflex agents “respond to what they see now without using memory or past experiences.” A simple example of this type of agent is a thermostat that can respond to temperature changes (contrasted with a model-based version in the sketch after this list).
- Model-based reflex agents “have access to both the external and its own internal world, which gives them more data and flexibility when making decisions.” A smart thermostat, for example, compares current temperatures to past temperature patterns, how many people are at home, and weather conditions before making changes.
- Goal-based agents can “decide how their actions will contribute to achieving their goals.” A navigation system can perceive the environment and evaluate routes, suggesting changes when necessary to get the driver to their destination as quickly as possible.
- Utility-based agents can decide “whether an action will achieve a goal and evaluate how desirable the outcome of that action is.” Robo-advisors in the financial industry help make investment decisions based on a person’s risk tolerance.
- Learning agents “can improve their performance through experience.” DeepMind’s AlphaGo has learned how to play a board game and improved until it was able to beat world champion players.
- Hierarchical agent systems “break down complex tasks into simpler subtasks, organizing them in hierarchical structure.” The “Just Walk Out” technology in Amazon Go stores breaks tasks down into everything from store management and inventory tracking to sensor data processing.
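To make the first two categories concrete, here is a minimal Python sketch of a simple reflex thermostat versus a model-based one. The class names, thresholds, and occupancy rule are our own illustration, not from Arcee.ai:

```python
# A minimal sketch contrasting the first two agent types above. All names,
# thresholds, and rules here are invented for illustration.

class SimpleReflexThermostat:
    """Simple reflex agent: acts only on the current reading, no memory."""
    def __init__(self, target=21.0):
        self.target = target

    def act(self, current_temp):
        return "heat_on" if current_temp < self.target else "heat_off"

class ModelBasedThermostat:
    """Model-based reflex agent: keeps internal state (recent readings)
    and combines it with the current percept before acting."""
    def __init__(self, target=21.0):
        self.target = target
        self.history = []  # internal model: past temperature readings

    def act(self, current_temp, occupants):
        self.history.append(current_temp)
        recent = self.history[-3:]
        # Invented rule: an empty house with a stable temperature needs no heat.
        stable = len(recent) == 3 and max(recent) - min(recent) < 0.5
        if occupants == 0 and stable:
            return "heat_off"
        return "heat_on" if current_temp < self.target else "heat_off"

simple = SimpleReflexThermostat()
print(simple.act(19.5))  # heat_on: reacts to the current reading alone

smart = ModelBasedThermostat()
for temp in (19.5, 19.6, 19.5):
    action = smart.act(temp, occupants=0)
print(action)  # heat_off: internal state says the empty house is stable
```

The same percept (a cool room) produces different actions depending on whether the agent carries an internal model, which is the whole distinction between the two categories.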
As you can see, some of this technology is already at work in real-world applications, while other uses are envisaged as just around the corner.
Your AI travel agent?
Silvio Savarese, chief scientist at cloud-based software company Salesforce, told Scientific American: “An app-based agent tasked with planning your summer vacation, for example, could book your flights, secure tables at restaurants and reserve your lodging while remembering your preference for window seats, your peanut allergy and your fondness for hotels with a pool.”
Ideally, an app-based agent would also be able to react when the best flight is booked up, checking other airlines or dates and handling roadblocks along the way just as a human would.
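What that roadblock-handling might look like is easiest to see in code. The sketch below is purely illustrative: the flight data and helper functions are invented, and a real agent would call an actual airline or aggregator API, but the fallback logic is the point.

```python
# Hypothetical sketch of the fallback behavior described above. The flight
# inventory and functions are invented for illustration only.

FLIGHTS = [  # toy inventory
    {"airline": "AirA", "date": "2025-07-01", "price": 320, "seats_left": 0},
    {"airline": "AirB", "date": "2025-07-01", "price": 410, "seats_left": 2},
    {"airline": "AirA", "date": "2025-07-02", "price": 290, "seats_left": 5},
]

def search_flights(date, airline):
    return [f for f in FLIGHTS if f["date"] == date and f["airline"] == airline]

def book_best_available(preferred_dates, airlines):
    """Try the cheapest seat on the preferred date first; on failure,
    widen the search the way a human traveler would."""
    for date in preferred_dates:        # fall back to later dates...
        for airline in airlines:        # ...and to other airlines
            options = sorted(search_flights(date, airline),
                             key=lambda f: f["price"])
            for flight in options:
                if flight["seats_left"] > 0:
                    return flight       # success: stop searching
    return None                         # nothing viable: escalate to the user

print(book_best_available(["2025-07-01", "2025-07-02"], ["AirA", "AirB"]))
# -> the AirB flight on 2025-07-01 (the preferred AirA flight is sold out)
```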
Meanwhile, Amazon is already working on a personal shopping assistant that will not only recommend products but actually purchase them. Other Amazon AI projects have run into difficulties, as we shall see.
But what if agents hallucinate?
Much of the interest in agentic AI has been sparked by the remarkable growth of GenAI technologies such as ChatGPT and the potential for harnessing their capabilities in AI agents. But to be useful, an AI agent must be able to act effectively in the domain in which it works, and that implies a thorough knowledge of that domain.
Knowledge of the world, however, is exactly what GenAI models are missing (as we noted in a previous post, "What's missing in GenAI?"). The result is a tendency to 'hallucinate' (that is, make stuff up) that developers have been unable to eliminate entirely.
While hallucinations in a research assistant may be merely an irritation, in a model designed to make real-world decisions on its user's behalf they could be catastrophic. This is the issue that has dogged attempts to harness GenAI technologies in one of the world's most popular digital assistants: Amazon's Alexa.
Alexa's cautionary tale
Amazon has already spent years trying to turn its popular Alexa voice-powered assistant into an AI-driven agent, but big problems persist. “Hallucinations have to be close to zero,” Rohit Prasad, leader of the artificial general intelligence (AGI) team at Amazon, told the Financial Times. “It’s still an open problem in the industry, but we are working extremely hard on it.”
Some commentators on the story suggest that poor development strategy may have contributed to Amazon's difficulties, but the fundamental issue remains the reliability of the models.
Bullish projections
Despite the very real troubles some companies are having ironing out AI’s issues, Gartner predicts 33% of enterprise software applications will include agentic AI by 2028. In 2024, less than 1% of enterprise software apps used this technology. Business process automation (BPA), supply chain management, and finance are all likely candidates to benefit from advances in agentic AI.
Gartner recognizes the shortcomings of current solutions but is confident they will be overcome: "A big gap exists between current LLM-based assistants and full-fledged AI agents, but this gap will close as we learn how to build, govern and trust agentic AI solutions."
Gartner's faith in the eventual success of agentic AI is shared, surprisingly, by renowned AI skeptic Gary Marcus: "The funny thing is, this time I think the hype is right — at least in the long term. I do genuinely think we will all have our own AI agents, and companies will have armies of them. And they will be worth trillions...."
But he warns not to expect them “this year (or next, or the one after that, and probably not this decade, except in narrow use cases). All that we will have this year are demos.”
Wait and see
Another question might be whether consumers actually want agentic AI and will adopt the technology in big enough numbers to make the juice worth the squeeze. As Digiday reports, one marketing executive said, “I think we’re a long way from the point where enough people are comfortable allowing an AI agent to spend money on their behalf and take control of the decision making.”
To avoid another Metaverse-style boondoggle, AI companies may be better off focusing on business applications for now and letting people book their own vacations.