AI doesn’t hallucinate – why the biggest pitfall for users is attributing human traits to the technology
This year, Air Canada lost a case brought by a customer who was misled by its AI chatbot into buying tickets at full price after being assured he could later claim a refund under the company's bereavement policy. The airline tried to argue that the bot was “responsible for its own actions.” The tribunal rejected that argument, and the company not only had to pay compensation but was also publicly criticized for trying to distance itself from its own chatbot. The lesson is clear: companies are liable for their AI models, even when those models make mistakes no one directly controls.
The rapidly evolving world of AI, especially generative AI, is viewed by businesses with a mixture of awe and apprehension. AI is seen as a double-edged sword: a catalyst that can boost productivity and let businesses do far more with less effort, but one with pitfalls that can lead to anything from customer dissatisfaction to lawsuits.
The best known of these pitfalls is “AI hallucination”: when an AI model produces answers that are incorrect, irrelevant, or nonsensical.
“Fortunately, this is not a very widespread problem. At the high end, it only occurs in two to maybe ten percent of cases. Nevertheless, it can be very dangerous in a business environment. Imagine asking an AI system to diagnose a patient or land a plane,” says Amr Awadallah, an AI expert who will give a talk at VDS2024 on how generative AI is changing business and how to avoid its pitfalls.

Yet most AI experts dislike the term. The terminology, and the misunderstanding it reflects of how these errors actually arise, can lead to pitfalls with ripple effects down the road.
As the former VP of Product Intelligence Engineering at Yahoo! and VP of Developer Relations for Google Cloud, Awadallah has seen the technology evolve throughout his career. He has since founded Vectara, a company that applies AI and neural-network techniques for natural language processing to help businesses improve search relevance.
We spoke to him to gain clarity on why this term is so controversial, what companies need to know about “AI hallucinations,” and whether or not they can be solved.
Why AI models don’t “hallucinate”
The term hallucination implies that an AI model that provides false information is perceiving something that isn't there. But that's not what happens behind the lines of code that make these models work.
It's very common for us humans to fall into this kind of trap. Anthropomorphism, the innate tendency to attribute human characteristics, emotions, or intentions to non-human entities, is a mechanism we use to cope with the unknown by viewing it through a human lens. The ancient Greeks used it to give human traits to their deities; today we're most likely to use it to interpret the actions of our pets.
There is a particular risk of falling into this trap with AI because this technology has become so widespread in such a short period of time, yet very few people really understand what it is and how it works. To understand such a complex subject, we use shortcuts.
“I think the media played a big role in this because it's a catchy term that grabs attention. So they picked it up, and today it has become the standard term,” says Awadallah.
But just as assuming that every wagging tail in the animal world is friendly can mislead us, so can misinterpreting the output of an AI.
“We attribute more to AI than is actually there. It doesn't think the way we do. It just tries to predict what the next word should be, given all the words that came before,” explains Awadallah.
If he had to give the phenomenon a name, he would call it “confabulation.” Confabulations are essentially words or phrases added to fill in gaps in a way that makes the information seem believable, even when it is false.
“[AI models] are strongly incentivized to answer every question. They don't want to tell you, ‘I don't know,’” says Awadallah.
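To make that concrete, here is a toy sketch of next-word prediction. The hand-written bigram table below is an illustrative stand-in for a real LLM's learned probabilities, but the mechanic is the same: the model picks a statistically plausible continuation, with no notion of whether it is true.

```python
import random

# Toy next-word predictor. A real LLM learns probabilities like these from
# billions of documents; either way, the model samples a plausible
# continuation given the words so far, with no concept of truth.
next_word_probs = {
    ("the", "capital"): {"of": 0.9, "city": 0.1},
    ("capital", "of"): {"france": 0.5, "spain": 0.3, "atlantis": 0.2},
}

def predict_next(prev_two: tuple[str, str]) -> str:
    """Sample the next word from the learned distribution."""
    probs = next_word_probs[prev_two]
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights)[0]

# A fluent but false continuation ("atlantis") is always a possibility.
print(predict_next(("capital", "of")))
```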
The danger is that while some confabulations are easy to spot because they border on the absurd, most of the time an AI presents its false information very credibly. And the more we rely on AI to boost our productivity, the more likely we are to take its plausible-sounding answers at face value. That means companies need to keep humans overseeing every task an AI performs, and to devote more time and resources to that oversight, not less.
The answers an AI model provides are only as good as the data it has access to and the input you give it. Because AI relies on patterns in its training data rather than reasoning, its answers may be flawed when that data is wrong or sparse for a particular query, and they can shift with the nature and context of the query itself. Cultural context, for example, can produce different perspectives and different responses to the same question.
Narrow-domain knowledge systems, internal AI models designed to retrieve information from a specific dataset such as a company's own documents, have only a limited amount of memory available. That is far more than a human can retain, but it is not unlimited. Ask questions beyond the scope of that memory and the model will still be motivated to answer by predicting what the next words might be, as the sketch below illustrates.
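Here is a rough sketch of that limit. The window size and the word-count tokenizer are assumptions for illustration; real systems use proper tokenizers, and limits vary by model.

```python
# Sketch of a fixed context window. The 8,192-token figure is assumed
# purely for illustration.
CONTEXT_WINDOW = 8192

def count_tokens(text: str) -> int:
    """Crude approximation: real systems use a proper tokenizer."""
    return len(text.split())

def build_prompt(question: str, documents: list[str]) -> str:
    """Pack retrieved documents until the window is full; the rest is dropped."""
    parts, used = [], count_tokens(question)
    for doc in documents:
        cost = count_tokens(doc)
        if used + cost > CONTEXT_WINDOW:
            break  # anything past this point is invisible to the model
        parts.append(doc)
        used += cost
    return "\n\n".join(parts + [question])
```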
Can AI disinformation be solved?
There has been much discussion about whether “confabulations” can be resolved or not.
Awadallah and his team at Vectara are tackling confabulation in narrow knowledge systems by grounding a model's answers in documents retrieved from a trusted dataset, a technique called retrieval-augmented generation (RAG), paired with an AI model whose specific task is to check the outputs of other AI models against those sources.
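As a rough illustration of the RAG pattern, here is a minimal sketch. All helpers below are toy stand-ins (keyword-overlap retrieval, a canned model call), not Vectara's actual API.

```python
# Toy sketch of retrieval-augmented generation (RAG).

def retrieve(docs: list[str], question: str, top_k: int = 3) -> list[str]:
    """Rank documents by word overlap with the question (toy retrieval)."""
    q_words = set(question.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:top_k]

def call_llm(prompt: str) -> str:
    """Stand-in for a real model call (e.g. an API request)."""
    return f"(model answer grounded in: {prompt[:60]}...)"

def answer_with_rag(question: str, docs: list[str]) -> str:
    passages = retrieve(docs, question)
    # Instruct the model to answer only from the retrieved passages and to
    # admit ignorance otherwise, removing the incentive to confabulate.
    prompt = (
        "Answer using ONLY the sources below. If they don't contain the "
        "answer, reply 'I don't know.'\n\n"
        + "\n".join(f"[{i}] {p}" for i, p in enumerate(passages))
        + f"\n\nQuestion: {question}"
    )
    return call_llm(prompt)
```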
Of course, Awadallah admits that, just as with human fact-checkers, there is always the possibility that an AI fact-checker will miss something, a so-called false negative.
For open-domain AI models like ChatGPT, which are designed to retrieve information on any topic from across the web, dealing with confabulation is harder. Some researchers recently published a promising paper on using “semantic entropy” to detect AI disinformation: the same question is put to an AI multiple times, and the answers are scored on how widely their meanings vary.
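Here is a simplified sketch of that idea. The paper uses a second model to judge whether two answers mean the same thing; a crude string normalization stands in for that below, so treat this as an outline of the scoring step only.

```python
import math
from collections import Counter

def semantic_entropy(answers: list[str]) -> float:
    """Group equivalent answers, then measure how spread out the groups are.
    High entropy suggests the model is guessing rather than recalling."""
    def canon(a: str) -> str:
        # Toy equivalence check: case/punctuation-insensitive match.
        return "".join(c for c in a.lower() if c.isalnum() or c == " ").strip()

    counts = Counter(canon(a) for a in answers)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())

confident = ["Paris.", "paris", "Paris"]    # one meaning -> entropy 0.0
guessing = ["Paris.", "Lyon", "Marseille"]  # three meanings -> entropy ~1.10
print(semantic_entropy(confident), semantic_entropy(guessing))
```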
As we move closer to eliminating AI confabulation, an interesting question arises as to whether we really want AI to be 100% factual and accurate. Could limiting its answers also limit our ability to use it for creative tasks?
Join Amr Awadallah for the seventh edition of VDS and learn more about how companies can harness the power of generative AI while avoiding the risks at VDS2024, taking place in Valencia from October 23 to 24.