Masters of mimicry and hearsay: how ChatGPT really works
Large language models are capable of impressive feats – but we need to remember that their linguistic dexterity is driven by surprisingly simple mechanisms.
“Any sufficiently advanced technology is indistinguishable from magic,” wrote Arthur C Clarke in 1962. Today’s tech users are a bit more sophisticated than his audience of 60 years ago. Nevertheless, the latest exploits of generative AI engines have sparked the kind of wonder and fear normally reserved for the supernatural.
Astonishing abilities …
ChatGPT was released for public use in November of last year. The engine’s uncanny ability to answer questions and generate texts on any subject in a seemingly unlimited range of styles quickly prompted reactions ranging from wild enthusiasm to deep unease about its apparently all-powerful and all-knowing capabilities.
… but also surprising fallibility
But it was not long before widespread public experimentation with ChatGPT and other models started to throw up examples of bizarre and erratic behavior.
In early March computer scientist Alexander Hanff published an account of his first encounter with ChatGPT. In response to a request to tell him what it knew about him, the bot replied that he had tragically passed away in 2019 at the age of 48. When a shaken Hanff asked for the source of this news, the AI doubled down, referring to obituaries in the Guardian and other publications, although it was unable to provide links or any other corroborating information.
In February New York Times reporter Kevin Roose had a similarly disturbing experience with Microsoft’s Bing AI chatbot during which the engine declared its love for him and tried to persuade him to leave his wife. Here the potential dangers go beyond simple factual mistakes.
As Roose himself observed: “… I no longer believe that the biggest problem with these A.I. models is their propensity for factual errors. Instead, I worry that the technology will learn how to influence human users, sometimes persuading them to act in destructive and harmful ways.”
Let’s look under the hood
It’s easy to be impressed by the human-like performances of these models and their apparent potential for ‘good and evil’. That makes it all the more important to be clear about what they are actually doing. Understanding how the models are built and trained can help us to be less seduced by their successes and less puzzled by their failures.
A simple mechanism
The impressive language abilities of generative AI engines like ChatGPT are based on a relatively simple training mechanism. A neural network, capable of ‘learning’, is presented with the first part of a sentence and must supply the next word. The ‘correct’ answer is then revealed and the network adjusts its parameters accordingly. After training on billions of sentences from a huge corpus of texts, such as Wikipedia or the entire internet, the model becomes very good at predicting the continuation of sentences.
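To make the mechanism concrete, here is a deliberately tiny sketch in Python. It is not how ChatGPT is implemented: a table of word counts stands in for the neural network, and the three-sentence ‘corpus’ is invented for illustration. But the shape of the loop is the same: see a prefix, note the word that actually follows, update the statistics, and later use those statistics to predict.

```python
from collections import Counter, defaultdict

# A tiny invented 'corpus', used purely for illustration.
corpus = [
    "the cat sat on the mat",
    "the cat sat on the rug",
    "the cat chased the dog",
]

# A table of counts stands in for the neural network's parameters:
# for each word, how often does each other word follow it?
stats = defaultdict(Counter)

# Training loop: look at a prefix, reveal the word that actually follows,
# and adjust the 'parameters' (here, simple counts) accordingly.
for sentence in corpus:
    words = sentence.split()
    for prefix, next_word in zip(words, words[1:]):
        stats[prefix][next_word] += 1

def predict(prefix_word):
    """Return the statistically most likely continuation."""
    return stats[prefix_word].most_common(1)[0][0]

print(predict("cat"))  # 'sat': the most frequent word after 'cat' in this corpus
```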
Over the last few years, the methods the networks use to make their predictions have evolved, but the basic process remains the same. An important step forward was the ‘Transformer’ technology developed by Google in 2017, which takes into account the dependencies between all the elements of the input sequence when making its prediction. It led to a quantum leap in the power and precision of the resultant models.
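The details of the Transformer would take us too far afield, but its central idea, ‘attention’, can be sketched in a few lines: before predicting, each word weighs every other word in the sentence by relevance. The sketch below uses made-up two-dimensional word vectors and leaves out the learned query, key and value projections of a real Transformer; it is meant only to show what ‘taking dependencies into account’ looks like in code.

```python
import numpy as np

# Made-up two-dimensional vectors for the words of "the cat sat".
# A real model learns vectors with thousands of dimensions.
words = ["the", "cat", "sat"]
vectors = np.array([
    [0.1, 0.9],
    [0.8, 0.2],
    [0.7, 0.4],
])

# Simplified self-attention: each word scores every other word,
# and the scores are turned into weights that sum to one per row.
scores = vectors @ vectors.T / np.sqrt(vectors.shape[1])
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

# Each word's new representation mixes in the others according to
# those weights, so its prediction can depend on the whole sentence.
contextualised = weights @ vectors

for word, row in zip(words, weights):
    print(word, np.round(row, 2))
```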
In spite of these refinements, however, the models remain simple sequence predictors. Their power comes from the sheer size of the language corpuses they are trained on and the huge collection of correlation statistics that results.
Is this knowledge?
Once we grasp the simplicity of the basic mechanism underlying these models, we are in a better position to understand what the AI is doing when it answers a question.
The following example comes from a recent paper, Talking about Large Language Models, by Murray Shanahan, Professor of Cognitive Robotics at Imperial College London.
“Suppose we give an LLM the prompt ‘The first person to walk on the Moon was ’, and suppose it responds with ‘Neil Armstrong’.
What are we really asking here?
In an important sense, we are not really asking who was the first person to walk on the Moon. What we are really asking the model is the following question:
Given the statistical distribution of words in the vast public corpus of (English) text, what words are most likely to follow the sequence ‘The first person to walk on the Moon was ’?
A good reply to this question is ‘Neil Armstrong’.”
Shanahan points out that, in reality, the AI knows nothing about the world in which Armstrong walked on the Moon. It has statistical data about how often the phrase and the name occur together in what other people have said or written, and this allows it to make the correct connection – but that is not the same as having knowledge of the world. We might call it ‘hearsay’.
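One way to picture the point: imagine the model’s statistics for that prompt laid out as a table. The figures below are invented, and real models never expose anything so simple, but they show the sense in which the question being answered is ‘what usually comes next?’ rather than ‘who walked on the Moon?’.

```python
# Invented snapshot of what the statistics behind the prompt
# "The first person to walk on the Moon was" might look like.
# Real models derive such probabilities from billions of parameters;
# the numbers here are made up for illustration only.
continuations = {
    "Neil Armstrong": 0.92,
    "Buzz Aldrin": 0.04,
    "an astronaut": 0.02,
    "a Russian": 0.01,
}

def most_likely(continuation_probs):
    """Pick the highest-probability continuation.

    This is all that is happening: no memory of 1969, no concept of
    the Moon, just a choice among statistically likely word sequences.
    """
    return max(continuation_probs, key=continuation_probs.get)

print(most_likely(continuations))  # Neil Armstrong
```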
Humanizing our machines
Of course, attributing human capacities to technological devices is something we do quite often as a figure of speech:
My watch doesn’t realize we’re on daylight saving time.
The mail server won’t talk to the network.
But, as Shanahan points out, we don’t take these literally. Nobody would think of telling the mail server to try harder.
When it comes to generative AI, however, the temptation to talk about the model as some kind of intelligent agent with knowledge, beliefs, reasoning ability, and maybe even moral qualities, is difficult to resist.
Nevertheless, resist it we must if we want to truly understand the power and the limits of these technologies.
In a recent Forbes article, Lance Eliot issues a blunt warning:
“Do not anthropomorphize AI.
Doing so will get you caught in a sticky and dour reliance trap of expecting the AI to do things it is unable to perform. With that being said, the latest in generative AI is relatively impressive for what it can do. Be aware though that there are significant limitations that you ought to continually keep in mind when using any generative AI app.”
Can machines hallucinate?
Interestingly, one of the most striking examples of this humanizing talk comes from AI practitioners themselves. When a model produces factually inaccurate responses, AI engineers will often say that it is suffering from ‘hallucinations’.
The term is obviously used figuratively. The model has no awareness of an objective reality, so there is no sense in which it is making statements based on faulty perceptions of that reality (which is what we normally mean by ‘hallucinating’).
However, speaking in this way suggests that the model has some kind of truth-discerning access to the world which has temporarily broken down. In fact it has no such access.
But what about ‘reasoning’?
We’ve established that the impressive performance of language models doesn’t entitle us to conclude that they have such things as ‘knowledge’ or ‘beliefs’ about the world.
But what about reasoning? Even the earliest computers performed their computations as sequences of logical operations. Surely we can attribute reasoning ability to these much more highly evolved systems?
Here, too, it’s useful to look carefully at what’s going on when a language model performs a logical operation.
“If we prompt an LLM with ‘All humans are mortal and Socrates is human, therefore’, we are not instructing it to carry out deductive inference.
Rather, we are asking it the following question: Given the statistical distribution of words in the public corpus, what words are likely to follow the sequence ‘All humans are mortal and Socrates is human, therefore’?
A good answer to this would be ‘Socrates is mortal’.”
Again, the model is giving us a probable answer based on the distribution of words in what others have said or written. It is not computing the answer in the way a hard-coded theorem-proving algorithm would.
Just as, in providing factual information, the model is mimicking the expression of knowledge, here it is mimicking a process of logical deduction. In reality it is carrying out no such process.
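The contrast is easier to see next to a genuinely hard-coded alternative. The little function below, written for illustration, applies the syllogism rule directly: if all A are B and x is an A, then x is B. Its conclusion is guaranteed by the rule itself, whereas the model’s ‘Socrates is mortal’ is only the statistically most likely way that sentence tends to continue.

```python
def syllogism(all_a_are_b, x_is_a):
    """A hard-coded deduction rule, for contrast with statistical prediction.

    From 'all A are B' and 'x is A', conclude 'x is B'. The conclusion
    follows from the rule itself, not from how often anyone has written
    the sentence before.
    """
    a1, b = all_a_are_b    # e.g. ("human", "mortal")
    x, a2 = x_is_a         # e.g. ("Socrates", "human")
    if a1 != a2:
        raise ValueError("Premises do not share a middle term")
    return f"{x} is {b}"

print(syllogism(("human", "mortal"), ("Socrates", "human")))  # Socrates is mortal
```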
The chances that the model will reach an invalid conclusion increase with the number of steps involved in its ‘deduction’, and when it is reasoning about something relatively obscure that has few references in its training corpus. If each step has, say, a 95 per cent chance of following the statistically ‘right’ path, a chain of ten such steps will come out right only about 60 per cent of the time.
Low-code - no code
The same considerations lie behind the decision of developer portals like Stack Overflow to ban the sharing of computer code generated by ChatGPT. Although the model can produce plausible-looking code at speed, its error rate is high enough that its output requires close inspection and correction by expert users before it is safe to use.
Again, this should not surprise us. The model is providing the most likely code sequence to satisfy the request – it is not itself coding according to the rules of the language or the context of the application, because it does not know them.
Can’t the model be fine-tuned?
As we have seen, the tendency of LLMs to produce erroneous or offensive output is unsurprising, since the vast corpuses they are trained on will inevitably include incomplete or inaccurate information, as well as opinions and expressions judged to be socially unacceptable.
To correct this tendency the models are frequently subjected to ‘fine-tuning’, often using a process known as ‘reinforcement learning from human feedback’ or RLHF. This involves rating of the model’s output by human subjects to produce scores that are then fed back to the model to alter the probability of certain types of response.
The aim is to improve the quality of the responses along dimensions like ‘helpfulness’, ‘truthfulness’ and ‘harmlessness’ in order to obtain responses that are more relevant and accurate and less toxic or denigratory.
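A very rough sketch of the idea, with invented numbers: human raters score candidate responses, and those scores are folded back into the statistics that decide what the model says. In practice this is done with a separate reward model and a reinforcement learning algorithm; the snippet below compresses all of that into a single, arbitrary re-weighting step, purely to show where the human preferences enter the statistics.

```python
# Candidate responses with the model's original probabilities and
# invented ratings from human judges (higher = more helpful and harmless).
candidates = [
    {"text": "Here is a balanced summary ...", "prob": 0.30, "rating": 0.9},
    {"text": "Here is a made-up citation ...", "prob": 0.45, "rating": 0.1},
    {"text": "That is a stupid question ...",  "prob": 0.25, "rating": 0.0},
]

# Fold the human preferences back into the statistics: responses the
# raters liked become more probable, responses they disliked less so.
for c in candidates:
    c["adjusted"] = c["prob"] * (0.5 + c["rating"])

# Re-normalize so the adjusted values behave like probabilities again.
total = sum(c["adjusted"] for c in candidates)
for c in candidates:
    c["adjusted"] /= total
    print(round(c["adjusted"], 2), c["text"])
```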
This fine-tuning undoubtedly works – the feedback-trained models produce fewer wrong or offensive responses than the ‘raw’ or unrefined model.
But again, it is a mistake to think that the AI has somehow been ‘educated’ to be more truthful or polite. The preferences of a panel of human judges have simply been added to the statistical weightings that determine the model’s responses. Regardless of appearances, there is no sense in which the model has acquired faculties such as ‘judgement’ or ‘discretion’.
Masters of mimicry, hearsay and impersonation
Knowing more about how large language models are built and the statistical constructs that underlie their performance, we are in a better position to understand what they can and cannot do.
Put briefly, they are masters of mimicry. And, given the huge body of language they sample from and the massive computational power driving their training, it should not surprise us that their mimicry of human language use is often very convincing.
But they are also masters of ‘hearsay’. Without any direct source of information about the world, their productions are based on what others have said or written. As with any information based on hearsay, we should be cautious about accepting it as the unalloyed truth, without error or bias.
And, when it comes to rigorous disciplines like maths, logical deduction or computer coding, they are impersonators of real mathematicians, logicians and coders. Their efforts look plausible and may often be correct – but that correctness is a matter of statistical likelihood rather than rule-following, because in reality they are not hard-coded with the rules and axioms that underlie those disciplines.
So what should we expect from ChatGPT?
Recognizing these limitations doesn’t detract from the fact that these models are capable of remarkable linguistic feats which will certainly make them valuable tools for humans in multiple fields.
From lightening the load of editorial tasks like summarizing and generating routine texts and social media posts, to more ‘creative’ work like brainstorming plots and storylines, they will provide valuable support to human authors and editors.
And they are also likely to radically change the interaction between humans and the internet by transforming the world of search.
But we must always remember that these performances are, in the end, based on statistical constructs, with all of the limitations that this implies. If we fail to keep this in mind, we run the risk that our experiences of using them will be disappointing or even dangerous.