‘Artificial Intelligence’ is a term that lumps together a diverse array of different things. The vagueness of the term enables overly bold claims to be made about genAI which, whilst the technology is impressive, misrepresent its strengths and limitations.
LLMs like ChatGPT are statistical approximations of their training data. They generate text by predicting the next token in a given context based on the complex patterns and associations encoded within their model weights.
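To make this concrete, here is a minimal sketch of next-token prediction using GPT-2 via the Hugging Face `transformers` library (the model choice is purely illustrative; production LLMs are far larger but work on the same principle):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (batch, seq_len, vocab_size)

# The logits at the final position score every vocabulary item
# as a candidate next token; softmax turns them into probabilities.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id):>10}  p={prob:.3f}")
```

Generation is simply this step repeated: pick a token from the distribution, append it to the context, and score the vocabulary again.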
‘Hallucination’ is a misnomer for inaccurate information generated by genAI. LLMs are always calculating the next most probable token; they do not distinguish between accurate and inaccurate output.
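Continuing the sketch above, the same sampling loop runs whether or not the prompt has a true answer. The author below is invented for illustration; the model will still produce a confident-looking completion, because it only scores token likelihood, never truth:

```python
# 'Dr. Elara Voss' and her paper are fictional: there is no fact to retrieve,
# yet the model will still emit a plausible-sounding title token by token.
prompt = "The 2019 paper by Dr. Elara Voss on quantum gravity is titled"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

output_ids = model.generate(
    input_ids,
    max_new_tokens=20,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```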
So-called ‘thinking’ or ‘reasoning’ models similarly do not ‘think’: they are trained to generate ‘thinking tokens’ before generating the tokens of the main response, but the process still remains a calculation of the next most probable token.
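A toy sketch makes the point (this is not any real model’s interface, and `END_THINK_ID` and `EOS_ID` are hypothetical placeholders for the delimiter tokens a given model learns during fine-tuning): the ‘thinking’ phase and the ‘answering’ phase are the same decode loop, with no separate reasoning engine.

```python
import torch

def greedy_decode(model, input_ids, stop_id, max_new_tokens=256):
    """One next-token loop; 'thinking' and 'answering' differ only in
    which learned stop token ends the phase, not in the computation."""
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = model(input_ids).logits
        next_id = logits[0, -1].argmax().view(1, 1)  # most probable token
        input_ids = torch.cat([input_ids, next_id], dim=1)
        if next_id.item() == stop_id:
            break
    return input_ids

# Phase 1: generate until the learned end-of-thinking delimiter.
# ids = greedy_decode(model, prompt_ids, stop_id=END_THINK_ID)
# Phase 2: continue the very same loop to produce the visible answer.
# ids = greedy_decode(model, ids, stop_id=EOS_ID)
```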
Evidence suggests there are limits to the extent to which genAI generalises: it does well on problems well represented within its ‘training distribution’, but fails on problems outwith it.