The Case That A.I. Is Thinking

Kanerva’s book disappeared from view, and Hofstadter’s own star faded, except when he surfaced occasionally to criticize a new artificial-intelligence system. In 2018, he wrote about Google Translate and similar technologies: “There is still something deeply missing from the approach, which is conveyed with a single word: comprehension.” But GPT-4, which launched in 2023, marked a turning point for Hofstadter. “I’m amazed by some of the things the systems do,” he told me recently. “This would have been inconceivable even just ten years ago.” The staunchest deflationist could no longer deflate. Here was a program that could translate as well as an expert, make analogies, improvise, generalize. Who were we to say it didn’t understand? “They do things that are a lot like thinking,” he said. “You could say they are thinking, only in a somewhat strange way.”

LLMs seem to have a “seeing as” machine at their core. They represent each word with a series of numbers that denote its coordinates (its vector) in a high-dimensional space. In GPT-4, a word’s vector has thousands of dimensions, which capture its shades of similarity to and difference from every other word. During training, a large language model nudges a word’s coordinates every time it makes a prediction error; words that appear together in texts drift closer together in the space. The result is an extraordinarily dense representation of usage and meaning, in which analogy becomes a matter of geometry. In the classic example, if you take the vector for “Paris,” subtract “France,” and add “Italy,” the nearest vector is “Rome.” LLMs can “vectorize” an image, too, encoding what it contains, its mood, even the expressions on people’s faces, in enough detail to redraw it in a particular style or write a paragraph about it. When Max asked ChatGPT for help with the park sprinkler, the model wasn’t just spewing text. The photograph of the pipes was compressed, along with Max’s prompt, into a vector that captured its most important features. That vector served as an address for retrieving nearby words and concepts. Those ideas, in turn, summoned others as the model built up a picture of the situation, and it composed its response with those ideas “in mind.”
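To make the geometry concrete, here is a minimal sketch in Python. It uses a tiny hand-built vocabulary with made-up, interpretable dimensions rather than real learned embeddings (actual LLM vectors have thousands of opaque dimensions), but it runs the same Paris-minus-France-plus-Italy arithmetic and picks the nearest remaining word by cosine similarity:

```python
import numpy as np

# Toy, hand-assigned word vectors, for illustration only.
# Dimensions loosely mean: [city-ness, country-ness, France, Italy, Germany].
vectors = {
    "paris":   np.array([1.0, 0.0, 1.0, 0.0, 0.0]),
    "france":  np.array([0.0, 1.0, 1.0, 0.0, 0.0]),
    "rome":    np.array([1.0, 0.0, 0.0, 1.0, 0.0]),
    "italy":   np.array([0.0, 1.0, 0.0, 1.0, 0.0]),
    "berlin":  np.array([1.0, 0.0, 0.0, 0.0, 1.0]),
    "germany": np.array([0.0, 1.0, 0.0, 0.0, 1.0]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogy(a, b, c):
    """Return the word whose vector lies closest to vec(a) - vec(b) + vec(c)."""
    query = vectors[a] - vectors[b] + vectors[c]
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(query, candidates[w]))

print(analogy("paris", "france", "italy"))  # -> "rome"
```

In a trained model, nobody assigns the dimensions meanings the way this toy does; the geometry emerges on its own, from millions of small corrections to prediction errors.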

A few months ago, I was reading an interview with an Anthropic researcher, Trenton Bricken, who has worked with colleagues to probe the insides of Claude, the company’s suite of AI models. (His research has not been peer-reviewed or published in a scientific journal.) His team has identified sets of artificial neurons, or “features,” that fire when Claude is about to say one thing or another. The features turn out to work like volume knobs for concepts; turn one up and the model will talk about little else. (In a sort of mind-control experiment, the feature representing the Golden Gate Bridge was cranked up; when a user asked Claude for a chocolate-cake recipe, the suggested ingredients included “1/4 cup dry fog” and “1 cup warm sea water.”) In the interview, Bricken mentioned Google’s Transformer architecture, a recipe for building neural networks that underlies the major AI models. (The “T” in ChatGPT stands for “Transformer.”) He argued that the mathematics at the heart of the Transformer architecture closely follows a model proposed decades earlier, by Pentti Kanerva, in “Sparse Distributed Memory.”
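The comparison Bricken draws is easiest to see in the attention step of a Transformer, which behaves like a soft, address-based memory read. The sketch below is an illustrative NumPy rendering of scaled dot-product attention, not Anthropic’s code or Kanerva’s original formulation: the keys play the role of addresses, the values the stored contents, and even a noisy query pulls out the memory filed nearest to it.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def attention_read(query, keys, values):
    """One attention read: score the query against every key (address),
    turn the scores into weights, and return a weighted blend of the values."""
    scores = query @ keys.T / np.sqrt(keys.shape[-1])
    weights = softmax(scores)
    return weights @ values, weights

rng = np.random.default_rng(0)
keys = rng.normal(size=(8, 16))               # eight stored "addresses"
values = rng.normal(size=(8, 16))             # the contents filed under each address
query = keys[3] + 0.1 * rng.normal(size=16)   # a noisy cue resembling address 3

readout, weights = attention_read(query, keys, values)
print(weights.round(2))   # most of the weight typically lands on slot 3
print(readout.shape)      # (16,) -- a blend dominated by values[3]
```

Kanerva’s sparse distributed memory works on the same principle: a cue that is merely close to a stored address still recovers what was written there.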

Should we be surprised by the correspondence between AI and our own brains? After all, LLMs are artificial neural networks that psychologists and neuroscientists helped develop. What is more surprising is that, when the models practiced something rote (predicting words), they began to behave in brain-like ways. Today, the fields of neuroscience and artificial intelligence are intertwining; brain experts are using AI as a kind of model organism. Evelina Fedorenko, a neuroscientist at MIT, has used LLMs to study how the brain processes language. “I never thought I would be able to think about these kinds of things in my life,” she told me. “I never thought we’d have models that were good enough.”

It has become commonplace to say that AI is a black box, but arguably the opposite is true: a scientist can inspect the activity of individual artificial neurons and even alter them. “Having a working system that exemplifies a theory of human intelligence is the dream of cognitive neuroscience,” Kenneth Norman, a Princeton neuroscientist, told me. Norman has created computer models of the hippocampus, the brain region where episodic memories are stored, but in the past they were so simple that he could feed them only crude approximations of what might pass through a human mind. “Now you can model memory with the exact stimuli you give a person,” he said.
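That transparency is mechanical rather than metaphorical. With a framework like PyTorch, a researcher can attach a hook to any layer of a network, record what its units do, and overwrite them mid-computation. The example below is a generic sketch of that workflow on a toy model (the architecture, the unit index, and the tenfold boost are all arbitrary choices for illustration), in the spirit of the volume-knob experiments described above.

```python
import torch
import torch.nn as nn

# A toy two-layer network standing in for a real model; the same hook
# pattern works on any nn.Module whose activations you want to watch or edit.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

recorded = {}

def record_and_amplify(module, inputs, output):
    recorded["hidden"] = output.detach().clone()  # read the layer's "neurons"
    boosted = output.clone()
    boosted[:, 5] *= 10.0                         # crank one unit up, volume-knob style
    return boosted                                # the returned tensor replaces the activation

hook = model[1].register_forward_hook(record_and_amplify)

x = torch.randn(1, 8)
y = model(x)                       # the forward pass runs with the edited activation
print(recorded["hidden"][0, 5], y)  # what unit 5 did, and the altered output

hook.remove()
```

No equivalent handle exists for a biological neuron, which is why a model you can pause, probe, and rewire is so appealing to people like Norman.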
