John's Best Friend

A singing monster exchanges Chinese symbols while trapped in John Searle's basement
A note becomes a chord, becomes a phrase, becomes a melody
And a note becomes a monster.

Plant Island, My Singing Monsters, Vol 2.

There is a famous thought experiment in philosophy of mind that has haunted AI research for over forty years: John Searle’s Chinese Room Argument, introduced in his 1980 paper “Minds, Brains, and Programs.” With a level of cultural sensitivity typical for his demographic and era, Searle asks us to imagine a person locked in a room, receiving slips of paper with Chinese characters written on them. The person follows an elaborate rulebook (aka an “algorithmic program”) that tells them which characters to pass back through the slot in response. To an outside observer, the room appears to understand Chinese. Searle’s point: it doesn’t. The person inside understands nothing. Syntax is not semantics. Symbol manipulation is not understanding.

The argument has generated more philosophical commentary than perhaps any other thought experiment since Einstein rode his imaginary locomotive. Functionalists, embodied cognitionists, systems theorists and armchair philosophers have all taken a swing. Most of the swings miss, because they accept Searle’s premises and argue about the conclusions.

I want to challenge the premise. Specifically, I want to enlist a mentor of mine, Prof. George Miller, known in many circles as the father of Cognitive Science. After Miller retired from teaching, he started the WordNet project, and I met him during a summer internship helping to build the lexicon that anchored a generation of Natural Language Processing.

Interdisciplinary Exorcisms: The Magical Number Seven, +/- Implications

In 1956, George Miller published “The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information.” The paper is still one of the most widely cited in psychology, and is remembered for establishing working memory capacity at roughly seven items. But the more radical contribution was chunking.

Miller noticed that the size of a memory item was not fixed. Naive subjects could hold seven binary digits, or seven decimal digits, or seven words – roughly the same number of chunks, regardless of the information content of each chunk. I can still remember a handful of 10-digit phone numbers from my youth, but mostly because I chunk the area codes into OGs (212), Hipsters (718) and early pager adopters (917). The mind, Miller demonstrated, actively recodes information. It compresses. It builds higher-order units out of lower-order ones. A sequence of letters becomes a word. A sequence of words becomes a phrase. A phrase becomes a concept with a handle. And a letter becomes a philosophical monster.
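For the code-minded, here is a minimal sketch of what recoding buys you, assuming a binary-to-octal scheme as the example. Eighteen binary digits overflow a seven-item span; grouped in threes and renamed, they become six chunks. The helper name recode and the default group size are illustrative choices, and the sketch makes no claim about how memory actually performs the grouping.

```python
def recode(bits: str, group: int = 3) -> list[str]:
    """Split a string of binary digits into groups of `group` bits and
    name each group by its integer value (octal digits when group == 3)."""
    if len(bits) % group != 0:
        raise ValueError("length must be a multiple of the group size")
    groups = [bits[i:i + group] for i in range(0, len(bits), group)]
    return [str(int(g, 2)) for g in groups]


bits = "101000100111001110"   # 18 items: well past a seven-item span
chunks = recode(bits)         # 6 chunks: comfortably inside it
print(len(bits), "digits ->", len(chunks), "chunks:", "".join(chunks))
# prints: 18 digits -> 6 chunks: 504716
```

The information content is unchanged. Only the number of units to hold has dropped, and that drop is the whole trick.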

The world’s pigeons celebrated this paper because it functioned as a Trojan horse inside B.F. Skinner’s behaviorism. You cannot have chunks without internal representation. The chunk is the representation. To chunk is to have built a model of the domain – a model that allows you to treat a complex pattern as a single unit, to manipulate it, combine it with other chunks, deploy it in new contexts.

Miller’s paper was not really about memory limits. It was about the mind as an active recoding system. Behaviorism never recovered.

Zombie Operators

Now let’s return to Searle’s prison.

Searle’s argument requires a very specific kind of operator – one who processes symbols purely syntactically, with no semantic uptake whatsoever. The rulebook maps inputs to outputs. The person follows the rulebook. No understanding occurs at any point in the chain.

Here is the problem: this operator is cognitively impossible at scale.

Consider what “following a rulebook” actually involves for a sufficiently complex language system. The operator must locate rules, recognize patterns across rules, manage the combinatorial explosion of possible inputs, track context across a conversation. At some point – and Miller tells us exactly when – the operator will begin to chunk.

They will notice that certain sequences of characters co-occur reliably. They will begin to treat those sequences as single units. They will build higher-order patterns from lower-order ones. They will, in short, develop an internal model of the statistical and structural regularities of the system they are processing.
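The process described above can be sketched as a toy loop in the spirit of byte-pair merging: find the adjacent pair of symbols that recurs most often, fuse it into a single unit, and repeat on the result. The function names (most_common_pair, merge_pair, chunk) and the min_count threshold are hypothetical, and nothing here is offered as a model of what happens inside a human operator. It shows only that higher-order units fall out of noticing what co-occurs.

```python
from collections import Counter


def most_common_pair(seq):
    """Return (pair, count) for the most frequent adjacent pair in seq,
    or (None, 0) if seq has fewer than two symbols."""
    pairs = Counter(zip(seq, seq[1:]))
    if not pairs:
        return None, 0
    return pairs.most_common(1)[0]


def merge_pair(seq, pair):
    """Replace each non-overlapping occurrence of `pair` with one fused unit."""
    merged, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            merged.append(seq[i] + seq[i + 1])
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged


def chunk(symbols, min_count=2):
    """Keep fusing the most frequent adjacent pair until no pair recurs."""
    seq = list(symbols)
    while True:
        pair, count = most_common_pair(seq)
        if pair is None or count < min_count:
            return seq
        seq = merge_pair(seq, pair)


slips = "北京欢迎你北京欢迎我"   # ten individual characters through the slot
units = chunk(slips)
print(len(slips), "symbols ->", len(units), "units:", units)
# prints: 10 symbols -> 4 units: ['北京欢迎', '你', '北京欢迎', '我']
```

After three merges the operator is no longer handling ten strokes of ink but four units, two of them the same recurring pattern.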

And here is where Searle’s argument collapses under its own hubris: that internal model just is a form of understanding.

Not necessarily full semantic understanding in the richest phenomenological sense. Not qualia. Not (necessarily) consciousness. I don’t claim to solve that hard problem in this post. But I am claiming the operator understands, in the functional sense that Miller’s cognitive science demands. They must have the capacity to represent, compress, and deploy structured knowledge about a domain.

At times, Searle deliberately blurs this distinction between cognitive understanding and phenomenal experience (e.g. feeling) as he smuggles in weasel words:

“One gets the impression that people in AI who write this sort of thing think they can get away with it because they don’t really take it seriously, and they don’t think anyone else will either. I propose for a moment at least, to take it seriously. Think hard for one minute about what would be necessary to establish that that hunk of metal on the wall over there had real beliefs, beliefs with direction of fit, propositional content, and conditions of satisfaction; beliefs that had the possibility of being strong beliefs or weak beliefs; nervous, anxious, or secure beliefs; dogmatic, rational, or superstitious beliefs; blind faiths or hesitant cogitations; any kind of beliefs.”

Anxious beliefs? Suddenly our operator suffers a nervous condition? If they do, it’s likely the result of being trapped in Searle’s prison for almost five decades. Searle’s Chinese Room is characterized as a thought experiment about semantics, comprehension and meaning. Miller’s chunking shows that semantics cannot be fully excluded from any sufficiently complex processing system, because the compression operation that chunking performs is itself a semantic act. You chunk what coheres. Coherence is meaning.

The chunk is not a syntactic object; it is made out of meaning. The chunk is defined by the fact that it compresses a pattern that has been recognized as coherent and recurring. You cannot chunk 北京 (Beijing) into a single unit meaning “the capital” without having, in some minimal but real sense, grasped that it refers to something. One thing is certain – the compression requires something to point at. What that something is – another symbol, or something grounded in the world – is a question for a future muse.

Searle’s room works as a thought experiment only if the operator remains forever at the level of individual stroke recognition. But no cognitive system — not even a deliberately mechanical one — can process a sufficiently rich symbol system at scale without chunking. And the moment chunking begins, the operator has acquired something that cannot be characterized as purely syntactical manipulation.

On the spectrum

This doesn’t prove that the Chinese Room understands Chinese in the full sense a native speaker does. The concept of “understanding” may be best understood along a spectrum rather than a binary. Miller’s framework suggests exactly this: chunking is recursive and hierarchical. A beginning reader chunks letters into words. An experienced reader chunks words into phrases, phrases into narrative structures, narrative structures into stories and genre conventions. Understanding deepens as the chunk hierarchy grows richer.

Searle’s error is to treat understanding as binary – either the lights are on or they’re off – and then place the operator in a room with a blown fuse. However, specifying cognitive inertness for an operator of sufficient complexity and duration is not a coherent stipulation. It contradicts what we know about how minds actually process information.

I wonder if John Searle ever owned a dog. Searle’s best friend, apparently, displayed no comprehension that when John went to the door carrying his keys and a leash it was time for a walk. Certainly his dog was incapable of understanding that a squirrel temporarily obscured behind a passing car would reemerge moments later. Searle’s dog belongs to a rare breed owned exclusively by Western Analytic Philosophers — animals who simply react to external stimuli, incapable of any form of understanding, innocent of any inner life. 🐶

The Limits of Gedankenexperiments

Photo credited to the firm Levy & fils, via Wikipedia.

There is one more move worth making explicit.

Searle’s argument implicitly relies on what we might call the frozen homunculus – an operator who never learns, never adapts, never develops any internal model of what they’re processing. This is not just cognitively implausible. It is, in a deep sense, not a mind at all.

A mind is precisely the kind of thing that builds models. Miller showed us this in 1956. If you remove model-building from the operator, you have not described a mind that lacks understanding. You have described something that is not a mind. And then it is unsurprising that the resulting system doesn’t understand anything – you’ve defined understanding out of existence by fiat.

The question “can syntax become semantics?” may be less interesting than the question “can any sufficiently complex processing system remain purely syntactic?” Miller’s answer, I think, is no.

George Miller’s WordNet lab was busy filling the room with meaning, one word at a time. If the operator can chunk, the operator must understand. The room was never as empty as Searle supposed.


Episode IV in an operatic series of musings on #AILife.
Next up: the stochastic parrot enters the room — and the pigeons are waiting.