Linguists in the tradition reaching back to Saussure, and the structuralists who followed in the 1930s, modeled linguistics as the study of a static object. They believed that by identifying and describing the fixed features and parts of language, the structures that serve communication, they would lift the field to another level. Ask the question “Where does language reside?” and the structuralists would point to the work of the lexicographer and the grammarian. In those books! they would say.
Wittgenstein's Philosophical Investigations, published posthumously in 1953, suggested something radically different—that meaning emerges not from fixed definitions or abstract grammatical structures but from the way language is actually used in social contexts. He rejected the idea that words have essential meanings that can be cataloged in dictionaries. Instead, he proposed that meaning derives from participation in "language games," conventional patterns of linguistic interaction tied to specific activities and “forms of life.”
For Wittgenstein, language resided not in books but in life, in the dynamic interplay between voices engaged in shared activities. His insight that the meaning of a word is its use in the language challenged the structuralist conception of language as a static system, laying groundwork for the usage-based approaches that would later emerge in linguistics. The fruits of this early philosophy of language ripened into what we see today as cognitive grammar, construction grammar, and computational grammar. The fact that language exists both inside the individual and in society (and now in the parameters of a machine) complicates relations among these sub-theories, but none can be called structuralist. “What is language?” isn’t the interesting question. “How does language work?” That’s interesting.
Construction Grammar
Adele Goldberg's work represents a key development in modern linguistic theory with considerable significance in the emergence of our recent capacity to chat with machines. Her approach to Construction Grammar, formalized in her influential 1995 book Constructions: A Construction Grammar Approach to Argument Structure, argued that grammatical patterns themselves—not just words—carry meaning.
Language is a structured inventory of form-function pairings ("constructions") that exist at varying levels of complexity and abstraction, not a list of words plus a separate grammar. A construction is formally defined as
A conventional pairing of form and function, where some aspect of the form or function is not strictly predictable from its component parts or from other previously established constructions.
Goldberg, a protégée of George Lakoff of metaphor fame, and other early network modelers of language like Richard Hudson, who began his doctoral studies working with Michael Halliday (functional grammar), swam against the current of mainstream linguistics, which maintained a strict separation between the lexicon (Webster’s Dictionary) and grammar and syntax (The Cambridge Grammar of the English Language, 1,800 pages in length).
Goldberg's innovations demonstrated that syntactic patterns like the ditransitive [Subj V Obj1 Obj2] construction, one of those lower-level abstractions in the inventory, contribute systematic meaning even when used with verbs that don't inherently encode that ditransitive meaning. I bake you an example. Witness:
"He baked her a cake" conveying transfer despite "bake" not being a transfer verb.
Her research program, developed across subsequent works like Constructions at Work (2006) and Explain Me This (2019), provides compelling evidence that constructions form an organized network of related patterns that facilitate both language acquisition and creative language use. At issue now is whether Chomsky’s language acquisition device, a biological endowment that theoretically comes as standard equipment, is universal, or whether language is learned entirely from the bottom up, from the least abstract particles to the most abstract structures, with human consciousness as a guide.
The Construction Continuum: From Specific to Schematic
In traditional grammar, we separate words from rules. Construction grammar rejects this division, replacing it with a continuum of form-meaning pairings at different levels of abstraction.
Morphemes anchor one end of this continuum. These smallest meaningful units, like the past-tense "-ed" to put it in the past or the plural "-s" to make it many, pair specific forms with consistent functions. They combine to form words: "dog" and "dogged," but not "dogly"; "run" and "running," but not "runningful"; "beauty" and "beautiful," but not "beautiplicity." We store these conventional sound-meaning pairs as unitary items and transform them into sense in the silence of consciousness.
Complex words bridge toward larger structures. Compounds like "doghouse" and derivations like "runner" show how word-level constructions follow patterns while maintaining lexical status. Partially schematic word-level patterns like "pre-X" (preschool, prewar) and "X-able" (readable, doable) reveal how words leave open slots. Natural language has evolved to privilege function as the nexus of meaning and form.
Partially filled idioms occupy the middle ground. Expressions like "jog someone's memory" and "drive someone crazy" keep some elements fixed while allowing others to vary. They function as single units despite containing open positions. Fully lexicalized idioms like "kick the bucket," "going great guns," and "three-cueing system" permit almost no variation, acting essentially as multi-word vocabulary items.
Argument structure constructions move toward greater abstraction. The ditransitive construction [Subj V Obj1 Obj2] in "She baked him a cake" or "He cooked her up a story" contributes transfer meaning regardless of the specific verb used. The caused-motion pattern [Subj V Obj Oblique] in "She sneezed the napkin off the table" similarly imposes motion semantics on non-motion verbs.
Information structure constructions organize discourse flow. The topicalization construction "That book, I really enjoyed" foregrounds one element without changing core meaning. It functions systematically across countless possible sentences.
Highly schematic constructions occupy the abstract end of the continuum. The correlative construction [The X-er the Y-er] expresses systematic relationships with minimal lexical specification. The passive construction [Subj aux VPpp (PPby)] reorganizes information structure across countless possible verbs.
This continuum reveals that all linguistic knowledge consists of learned pairings of form with meaning or function. The traditional division between vocabulary and grammar dissolves, replaced by a unified conception of constructions differing only in their level of abstraction and specificity.
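One way to picture this unified inventory is as a single lookup table pairing forms with functions at every level of abstraction. The sketch below is a toy, assuming regular expressions as a crude stand-in for schematic forms; the patterns and glosses are illustrative only:

```python
import re

# Each entry pairs a form (a regex over words) with a function gloss.
# From concrete morpheme to fully schematic template, the format is the same.
CONSTRUCTICON = [
    ("morpheme",          r"\b\w+ed\b",                    "past tense"),
    ("word-level schema", r"\bpre\w+\b",                   "'before X'"),
    ("partially filled",  r"\bjog \w+'s memory\b",         "cause X to remember"),
    ("fully schematic",   r"\b[Tt]he \w+er,? the \w+er\b", "correlated increase"),
]

def scan(text: str) -> None:
    for level, form, function in CONSTRUCTICON:
        for match in re.finditer(form, text):
            print(f"{level:18} {match.group(0)!r:26} -> {function}")

scan("Let me jog everyone's memory: the sooner, the better. She prepaid and baked us a cake.")
```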
Wilson Taylor's Cloze Method: An Early Application of Construction Theory
Wilson Taylor's development of the Cloze procedure in 1953 was a pioneering application of principles that would later crystallize as construction grammar, decades before the formal establishment of Construction Grammar in the 1980s by Fillmore, Kay, and Goldberg.
The procedure, developed while Taylor was at the University of Illinois, involves systematically deleting words from a text and asking readers to fill in the blanks. In theory, if you can ____ this text you _____ fill in the _____ with _____ that ____ sense. As an aside, this is exactly what language models do, except they have no understanding of the meaning of the sentence.
Named after the Gestalt psychology concept of "closure" (the tendency to complete familiar patterns), this method was originally designed to measure text readability. What makes Taylor's approach revolutionary is that it implicitly recognizes that language comprehension depends on more than just vocabulary knowledge—it requires understanding patterns and relationships between words.
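The mechanical part of the procedure is simple enough to state in code. Here is a minimal sketch, assuming the commonly used every-fifth-word deletion rate:

```python
def make_cloze(text: str, n: int = 5) -> tuple[str, list[str]]:
    """Blank out every nth word, returning the cloze text and the answer key."""
    words = text.split()
    answers = []
    for i in range(n - 1, len(words), n):
        answers.append(words[i])
        words[i] = "_____"
    return " ".join(words), answers

passage = ("Named after the Gestalt concept of closure, the method measures "
           "how well a reader can complete the patterns a writer has used.")
cloze_text, key = make_cloze(passage)
print(cloze_text)  # the blanked passage a reader (or a language model) must complete
print(key)         # the deleted words, for scoring
```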
What has always seemed to me so profound about Taylor’s idea was his focus on performance, not on competence, the default in education. As we’ve seen, competence can be knowing the meaning of vocabulary items and speaking in comprehensible structures, aligning with the metrics of the theory. Performance can be wild, unruly, practical, political. We don’t always have to fill in the word we’re expected to supply. Can you fill in the blanks others are hoping you will ignore? Do you know what that strange comment means? What should you do with your next turn to talk? In the words of the poet Bob Dylan, the next sixty seconds can be like an eternity.
Merging the Theories: Cloze Meets Construction
This fundamental insight aligns with Construction Grammar's later assertion that language consists of learned pairings of form and function at various levels of abstraction. When readers successfully complete a Cloze test, they demonstrate not just knowledge of individual words but an understanding of constructions—the conventional patterns that link form with meaning. Just as Construction Grammar would later argue that we learn language through exposure to recurring patterns, Taylor's Cloze procedure revealed how readers develop expectations about which words should appear in particular contexts based on recurring patterns in language.
Consider how Taylor's approach diverged from traditional readability formulas that simply counted elements like word length or sentence complexity. In his 1953 paper, Taylor noted that his method "takes a measure of the likeness between the patterns a writer has used and the patterns the reader is anticipating while he is reading" (p. 417). Taylor, remarkably, foreshadows Construction Grammar's emphasis on form-meaning correspondences as the basic units of language.
Where traditional approaches separated words (lexicon) from rules (grammar), both Taylor's Cloze method and Construction Grammar recognize that language knowledge exists along a continuum. The Cloze procedure works precisely because language users store and access partially schematic patterns—they can predict what words fit particular slots based on their experience with similar patterns. This is essentially the same insight that Construction Grammar formalized decades later by proposing that constructions exist at varying levels of abstraction, from specific words to highly schematic templates.
Taylor's method also anticipated Construction Grammar's usage-based approach by measuring actual language use rather than abstract competence. The Cloze procedure doesn't ask readers to identify parts of speech or apply grammatical rules. Nor, I might add, does it ask readers to apply phonetic processing, babbling sounds in hopes of landing on a righteous morpheme. Those notions would be worse than useless. It measures their ability to use contextual patterns to predict missing elements, reflecting their internalized knowledge of language constructions developed through exposure.
Who would be surprised to learn that this kind of sense-making depends on prior knowledge and experience? Who among us would insist that language machines have such prior knowledge and experience? Why is it difficult to win a consensus that these machines are breathing in and blowing back more or less random patterns of language that coalesce in correlation coefficients?
While Taylor himself didn't frame his work in terms of construction theory (which wouldn't emerge for another three decades), his Cloze procedure represented an early recognition that language comprehension relies on our ability to recognize and complete conventional linguistic patterns—precisely the insight that would later form the foundation of Construction Grammar.
Human, Meet Machine
When a student interacts with a chatbot about a classroom project, we witness a fascinating collision of language worlds that Taylor, Wittgenstein, and Goldberg would each analyze quite differently.
The human brings a lifetime of embodied experience—memories of dog fur, sunset colors, heartbreak, the smell of baking bread. When they type "I need help with my climate change project," they're not just stringing words together; they're drawing on rich conceptual networks formed through years of social interaction. Their understanding of "project" comes from dozens of previous school assignments, negotiations with peers, and teacher feedback. This is what Wittgenstein would recognize as true participation in a language game—the student knows not just the words but the entire social participation framework supporting academic work.
The chatbot, meanwhile, has never felt rain on its face or stayed up late worrying about a deadline. It has learned statistical relations between constructions ranging from morphemes to discourse markers, precisely the kind of pattern recognition that Taylor's Cloze procedure measures. When the human writes "I need to create a compelling..." the machine predicts likely completions: "argument," "presentation," "narrative." But unlike the human, it has no personal stake in the conversation, no anxiety about grades, no curiosity about the subject matter. It is performing a sophisticated version of the Cloze test, filling in blanks with impressive statistical accuracy.
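That blank-filling can be watched directly with an off-the-shelf masked language model. Below is a sketch using the Hugging Face transformers library, assuming it and the bert-base-uncased model are available; single-token masking is a simplification of the autoregressive prediction chat models actually perform:

```python
from transformers import pipeline

# A masked language model runs a machine version of the Cloze test:
# it ranks candidate fillers for the blank by statistical fit alone.
fill = pipeline("fill-mask", model="bert-base-uncased")

for candidate in fill("I need to create a compelling [MASK] for my project."):
    print(f"{candidate['token_str']:>14}  p={candidate['score']:.3f}")
```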
This creates an uncanny interaction. The student, operating within Wittgenstein's framework, approaches the conversation as a social exchange. They make assumptions about shared knowledge: "You know what my teacher wants, right?" They express frustration: "No, that's not what I meant!" They use incomplete sentences, slang, and cultural references. They are engaging in what they perceive as a collaborative language game.
The chatbot, meanwhile, is executing what Goldberg would recognize as construction-based processing. It identifies the ditransitive pattern in "Can you give me some examples?" and produces an appropriate response structure. It recognizes the [X is Y] construction in "Climate change is caused by..." and completes it with statistically appropriate continuations. It performs these feats without understanding a single thing about climate science in the way humans understand—through direct experience with weather, through concern for the future, through political debate with family members.
What's remarkable is how well this works despite the fundamental asymmetry. The human writes, "I'm totally lost on this assignment," expressing genuine emotional disorientation. The language model, which has never been "lost" on anything (having no sense of orientation to begin with), recognizes this as a construction signaling help-seeking behavior and responds with reassurance: "Don't worry, I can help break this down step by step."
Taylor would find this fascinating—the machine succeeds at his Cloze test spectacularly, predicting words in context with superhuman accuracy. Yet it lacks what Taylor recognized as essential to real comprehension: the connection between language and lived experience. The machine doesn't actually anticipate meaning; it anticipates statistical patterns.
Goldberg might note how the chatbot leverages construction grammar principles without truly understanding them. It knows that "How do I structure my essay?" requires a response about organization rather than about physical structures, not because it understands the polysemy of "structure" but because it has seen similar constructions paired with certain types of responses millions of times.
When the student shares a draft and asks, "Does this sound good?", they're participating in a familiar academic ritual. They are seeking validation, hoping for guidance. The chatbot responds with appropriate feedback, not because it has ever written an essay that "sounded good" or felt pride in its work, but because it has identified the construction as a request for evaluation and can guess which patterns typically follow such requests.
This is the profound difference between human and machine language processing: the human operates within Wittgenstein's world of language as a lived, social practice tied to forms of life, while the machine operates in Taylor's world of pattern completion, stripped of the embodied knowledge that gives those patterns meaning. The conversation works because constructions—those conventional form-meaning pairings Goldberg identified—provide enough structure for meaningful interaction despite this fundamental disconnect.
The context window is the machine's only "memory," a pale imitation of the rich experiential substrate humans draw upon. The neural architecture is a shadow play of human cognition, producing convincing performances without understanding the script. But the language itself—that shared conventional system of meaning—creates a bridge across this ontological chasm, allowing human and machine to participate in what seems like the same conversation without ever making real contact.
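The point about memory can be made concrete. Here is a toy sketch of that forgetting, with a crude word count standing in for real tokenization; the budget and message format are invented for illustration:

```python
def fit_context(messages: list[str], budget: int = 50) -> list[str]:
    """Keep only the most recent messages that fit the token budget.
    Everything older simply ceases to exist for the model."""
    kept, used = [], 0
    for msg in reversed(messages):  # walk from the newest message back
        cost = len(msg.split())     # crude proxy for a token count
        if used + cost > budget:
            break                   # older turns fall off the edge of memory
        kept.append(msg)
        used += cost
    return list(reversed(kept))     # restore chronological order

history = [f"turn {i}: " + "word " * 20 for i in range(10)]
print(fit_context(history))         # only the two most recent turns survive
```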
Key Takeaways and Discussion Questions
1. The Illusion of Shared Context
The human approaches conversation with a language model assuming shared experiential context that doesn't exist. The model succeeds by recognizing and producing appropriate constructions without the embodied knowledge that typically grounds them.
Discussion Question: How might our educational practices change if we acknowledged that AI language models engage with concepts like "research," "understanding," and "learning" in fundamentally different ways than humans do? Should we develop a new vocabulary to distinguish between human and AI cognitive processes, perhaps even a local glossary?
2. Construction Grammar as Bridge
Construction Grammar inadvertently provides the theoretical framework that explains why LLM interactions work despite the asymmetry of experience. By focusing on form-meaning pairings rather than on the origins of meaning itself, it creates a middle ground where human and machine can meet.
Discussion Question: If constructions (form-meaning pairings) can be successfully manipulated without understanding their experiential basis, what does this suggest about the relationship between language and thought? Does Construction Grammar need to be revised to account for the success of systems that use language without grounding it in lived experience?
3. The Cloze Procedure's Limitations
Taylor's Cloze procedure turns out to be an insufficient measure of true comprehension, because machines can excel at predicting missing words while lacking the conceptual understanding that humans bring to the same task.
Discussion Question: If language models can perform exceptionally well on tests designed to measure human reading comprehension, what new assessment methods might we need to develop that better differentiate between statistical pattern matching and genuine understanding? What would Taylor think of how his procedure has evolved?
4. The Transformation of Language Games
Wittgenstein's concept of language games presumed human participants with shared forms of life. Chatbot interactions create new types of language games where one participant lacks any form of life at all, yet the interaction proceeds as if both were playing by the same rules.
Discussion Question: Are we witnessing the birth of new forms of language games that Wittgenstein never anticipated—games where one participant has no stake in the outcome, no personal history, and no cultural embeddedness? How does this change the nature of what language is and how it functions in society?
5. The Ethics of Simulated Understanding
The chatbot creates the impression of understanding, empathy, and collaborative engagement without experiencing any of these states, raising questions about the ethics of these interactions, especially in educational contexts.
Discussion Question: What ethical responsibilities do educators have when introducing AI language models into learning environments? Should students be explicitly taught about the differences between human understanding and machine pattern recognition, or is the illusion of shared understanding pedagogically valuable in its own right?
"He baked her a cake" conveying transfer despite "bake" not being a transfer verb."
(Putting a comma after "her" changes the meaning.)
As a songwriter/poet I try to line up words with complementary multiple meanings to convey what I would call "word chords"
In lines and stanzas.
This evening I listened to a 30 minute YouTube rap on Wittgenstein.
Terry your posts are alway clear and precise. Thank you.
Love it as usual.
"Lived Experience" as a distinct, non-statistical element in human comprehension may be misplaced here.
Lived experience is likely a feedback mechanism that generates probabilities, not unlike RLHF. There seem to be three distinct "modes" of language and grammatical acquisition.
1. Raw input - Reading, listening, etc. Where word streams enter our awareness
2. Feedback - Production response, where we generate possible streams and receive feedback, including corrections from a parent or teacher, social feedback (e.g., faux pas), or incomprehension by a reader or listener.
3. Research - Deliberate seeking of information by an agent to discern meaning, like looking in a dictionary, conducting research, etc.
Social cues and embodiment may serve as feedback mechanisms that differ in form but not in function from their machine counterparts.