Psychological Plausibility and Writing Instruction
The Criterion
Psychological plausibility is a criterion for evaluating theoretical models of cognitive processes. A model is psychologically plausible if its architecture is compatible with what we know about how minds actually work. The criterion applies wherever theorists propose mechanisms to explain mental operations: perception, memory, reasoning, language.
For example, a filing cabinet model of memory—discrete records stored in locations and retrieved intact—describes recall but misrepresents the mechanism. Memory is reconstructive: each act of recall rebuilds the experience from fragments distributed across the brain.
A model can be descriptively accurate—it can correctly predict outputs—while positing mechanisms that no mind could plausibly execute. Such a model succeeds as formal description but fails as cognitive theory.
The filing cabinet correctly predicts that you'll remember your wedding day. But it can't explain why memories change over time, why false memories form, why retrieval cues matter, or why post-event information contaminates recall.
These phenomena point to the actual mechanism, now well established in cognitive neuroscience: distributed fragments are reconstructed, shaped by current context. The filing cabinet describes the output (you remember things) while positing a mechanism (intact storage) that doesn't match what brains do. It is descriptively useful, cognitively false.
The distinction matters because description and explanation are different goals. A descriptively adequate grammar specifies which sentences are grammatical and which are not. An explanatorily adequate grammar does this while also modeling how speakers produce and understand those sentences in real time.
The first asks: What are the patterns? The second asks: What are the mental operations that generate those patterns? A grammar can answer the first question brilliantly while giving short shrift to the second question.
This is what happened with transformational grammar. Chomsky acknowledged that his theory ultimately had to connect with language use, and he adjusted transformational grammar considerably over decades, though he remained skeptical of many critics. But the most transparent reading of early transformational theory invited a derivational processing interpretation, and empirical work showed that interpretation to be at best highly indirect.
The Transformational Model
In the 1950s and 1960s, Noam Chomsky revolutionized linguistics with a model of extraordinary formal elegance (Chomsky, 1957, 1965). Transformational grammar proposed that sentences have two levels of structure. Deep structure represents the underlying semantic relationships—who did what to whom, which elements modify which others.
Surface structure represents the sentence as speakers actually produce it, that is, the linear sequence of words, the audible or visible form. Transformational rules convert deep structures into surface structures through ordered operations—movements, deletions, insertions, substitutions.
Consider a passive sentence: “The cake was eaten by the child.” In transformational grammar, this surface form derives from a deep structure resembling the active. At the underlying level, “the child” is the subject and “the cake” is the object, in that order.
The grammar generates the passive by applying a series of transformations: 1) move the deep object to surface subject position, 2) move the deep subject into a prepositional phrase with “by,” 3) insert the auxiliary “was,” 4) convert the verb to its past participle form.
Each transformation takes a structure as input and yields a modified structure as output. The derivation proceeds step by step, rule by rule, until the surface form emerges.
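To make the procedural character concrete, here is a deliberately simplified sketch of such a derivation as code. The structure format, the fixed rule ordering, and the function name are illustrative inventions, not an implementation of any actual transformational grammar.

```python
# A deliberately simplified sketch of a serial derivation. The structure format,
# the rule ordering, and the function name are illustrative inventions.

deep_structure = {
    "subject": "the child",
    "verb": "eat",
    "object": "the cake",
    "voice": "active",
}

def passivize(structure):
    """Apply the passive transformations one after another, in a fixed order."""
    s = dict(structure)
    s["subject"] = s.pop("object")                 # 1) deep object -> surface subject
    s["by_phrase"] = "by " + structure["subject"]  # 2) deep subject -> "by" phrase
    s["aux"] = "was"                               # 3) insert the auxiliary
    s["verb"] = "eaten"                            # 4) verb -> past participle
    s["voice"] = "passive"
    return s

surface = passivize(deep_structure)
print(surface["subject"], surface["aux"], surface["verb"], surface["by_phrase"])
# -> the cake was eaten by the child
```

The point is the architecture: each step consumes the output of the previous one, so nothing about the surface form exists until the whole sequence has run.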
The model was powerful. The relationship between active and passive sentences, between declaratives and questions, between affirmative and negative forms—all could be described as transformational relationships.
Sentences that seemed superficially different could be shown to derive from identical deep structures. Sentences that seemed superficially similar could be shown to have distinct derivational histories. The grammar unified a vast range of phenomena under a small set of principles.
But the model made a strong claim about mental processing. If transformational grammar describes what speakers know, and if speakers use that knowledge when producing and understanding sentences, then the mind must be computing these derivations.
The deep structure must be mentally represented. The transformations must be mentally executed, one after another, in order. The surface structure must emerge as the output of this derivational procedure.
Psycholinguists of the time recognized that this claim was testable. If the mind computes derivations, then derivational complexity should correlate with processing difficulty. Sentences requiring more transformations should take longer to understand, produce more errors, and demand more cognitive resources.
A passive sentence, requiring more transformational steps than an active, should be measurably harder to process. A negative passive question, requiring still more steps, should be harder yet. The Derivational Theory of Complexity, as it was called, made specific predictions about human performance.
By and large, the predictions failed. Psycholinguistic experiments throughout the 1960s and into the 1970s tested whether derivational complexity predicted processing time. Fodor, Bever, and Garrett (1974) synthesized a decade of such research and found the correlations weak or absent.
Passive sentences were not uniformly harder than actives. Negation did not add processing cost in the way the theory predicted. The number of transformations in the formal derivation did not reliably predict the difficulty speakers experienced.
The elegant machinery of deep structures and transformational rules appeared to be a formal convenience—a way of capturing linguistic generalizations—not a description of mental computation.
Transformational grammar was descriptively powerful but, taken as a processing model, psychologically implausible. Its architecture required the mind to do things the mind did not appear to do.
The Constraint-Based Alternative
The response, developed through the 1970s and 1980s, was to rebuild grammatical theory on different architectural foundations. The key figures were Joan Bresnan and Ronald Kaplan, who developed Lexical-Functional Grammar (LFG), and Carl Pollard and Ivan Sag, who developed Head-Driven Phrase Structure Grammar (HPSG; Pollard & Sag, 1994).
Both frameworks retained the insight that sentences have multiple levels of structure—you still need to represent both surface form and underlying grammatical relationships—but they abandoned the claim that one level derives from another through sequential rule application. This abandonment becomes important in the development of language-based AI.
As Asudeh and Toivonen (2010) note, “Bresnan and Kaplan were concerned with the related issues of psychological plausibility and computational tractability. They wanted to create a theory that could form the basis of a realistic model for linguistic learnability and language processing” (p. 1). The goal was a grammar whose architecture matched what minds plausibly do.
The shift was from procedural to declarative models. Transformational grammar is procedural; it specifies a sequence of operations to perform. Start with this deep structure, apply this rule, then this rule, then this rule, and you will arrive at that surface structure. The grammar is a recipe.
Constraint-based grammars are declarative; they specify conditions that well-formed structures must satisfy without prescribing an order of operations. A grammatical sentence is one whose multiple levels of representation simultaneously satisfy all the relevant constraints.
Transformational grammar is often written procedurally, as sequences of operations on structures; constraint‑based grammars can be implemented so that all relevant conditions are available at once, supporting parallel constraint satisfaction. The grammar defines a space of possibilities; grammatical sentences are those that satisfy the constraints.
Think of a building code versus a recipe. A recipe says: first do this, then this, then this. A building code says: the structure must bear this load, maintain this clearance, use these materials. You can verify the code in any order; you just check whether each condition is met.
“The cake was eaten by the child” is grammatical not because the mind performed operations on an active sentence but because the sentence satisfies multiple conditions simultaneously: “eat” requires an eater and something eaten; the passive form of “eat” specifies that the thing eaten appears as subject; “cake” fills that role; the eater is optional but if present appears in a “by” phrase; “child” fills that role. All conditions met. No derivation occurred—just parallel satisfaction of constraints specified in the words’ user’s manuals.
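A minimal sketch of that declarative picture, assuming invented constraint functions and a toy structure format (this is not the notation of any particular constraint-based grammar):

```python
# A minimal sketch of the declarative alternative: no derivation, just conditions
# to check. The constraint functions and the structure format are invented
# simplifications for illustration.

candidate = {
    "verb": "eaten",           # passive participle of "eat"
    "subject": "the cake",     # the thing eaten
    "by_phrase": "the child",  # the optional eater
}

constraints = [
    lambda s: s.get("verb") == "eaten",               # passive form of "eat" is used
    lambda s: s.get("subject") == "the cake",         # the thing eaten appears as subject
    lambda s: "by_phrase" not in s or s["by_phrase"], # the eater, if present, sits in a "by" phrase
]

# The checks can run in any order, or all at once; grammaticality is simply
# the conjunction of the constraints being satisfied.
grammatical = all(check(candidate) for check in constraints)
print(grammatical)  # -> True
```

Nothing in the checks cares which one runs first; that order-independence is what makes parallel evaluation possible.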
This difference in architecture has profound implications for processing. A purely procedural grammar requires serial computation: step one, then step two, then step three, in order. A declarative grammar permits parallel computation: multiple constraints can be evaluated simultaneously, and the system settles into states that satisfy them. Purely or fully serial architectures are implausible as whole‑mind models; real cognition combines extensive parallel activation with local serial bottlenecks.
Cognitive science in the 1970s and 1980s was converging on parallel distributed processing as a model of mental computation. The brain does not appear to work like a serial computer executing one instruction at a time. It works like a network where many processes operate simultaneously, where information flows in multiple directions, where the system settles into stable states through the mutual satisfaction of many constraints at once.
Perception works this way. We do not first identify edges, then shapes, then objects, then meanings in strict sequence. All levels interact in parallel, with bottom-up and top-down information integrating continuously until interpretation stabilizes.
If language processing works similarly—if understanding a sentence means simultaneously satisfying phonological, syntactic, semantic, and pragmatic constraints rather than computing a step-by-step derivation—then a grammar built on parallel constraint-satisfaction is a better model of mental operations.
Words as User’s Manuals
The constraint-based frameworks share another crucial feature: they are lexicalist, grounded in words. In transformational grammar, much of the work is done by transformational rules that operate on structures. In LFG and HPSG, much of the work is done by lexical entries—by the words themselves. Each word comes packaged with rich information about how it can be used.
Think of a lexical entry as a user’s manual for a word. The manual specifies everything you need to know to use that word in a sentence: its phonological form (how it sounds), its syntactic category (noun, verb, adjective), its semantic content (what it means), and—crucially—its combinatorial possibilities (what it can combine with and how).
Consider the verb “eat.” Its lexical entry specifies that it is a verb, that it describes an action of consuming, and that it requires two participants: an eater and something eaten. In the terminology, it has two argument roles: an agent (the one who eats) and a patient (the thing eaten).
The entry further specifies how these semantic roles map to grammatical functions. In the active use of “eat,” the agent maps to subject and the patient maps to object: “The child ate the cake.” In the passive use, the mapping is different: the patient maps to subject, and the agent is either suppressed or expressed in an oblique phrase: “The cake was eaten (by the child).”
This is not a transformation. There is no derivation from active to passive. There are two entries—or two argument structures within a single entry—specifying two different ways to use the same verb. The speaker does not start with an active deep structure and apply rules to derive a passive surface structure. The speaker accesses the lexical information for “eat,” selects the appropriate argument mapping, and builds a structure that satisfies the constraints that mapping imposes.
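A rough sketch of what such a lexical entry might look like as a data structure, loosely in the spirit of lexicalist frameworks; the field names and the two-mapping layout are hypothetical simplifications, not LFG or HPSG notation.

```python
# A rough sketch of a lexical entry as a "user's manual." The field names and
# the two-mapping layout are hypothetical simplifications.

EAT = {
    "form": "eat",
    "category": "verb",
    "meaning": "consume",
    "argument_roles": ["agent", "patient"],  # an eater and a thing eaten
    "mappings": {
        "active":  {"agent": "subject", "patient": "object"},
        "passive": {"patient": "subject", "agent": "by-phrase (optional)"},
    },
}

def grammatical_functions(entry, use):
    """Look up how semantic roles map to grammatical functions for a given use."""
    return entry["mappings"][use]

print(grammatical_functions(EAT, "active"))   # {'agent': 'subject', 'patient': 'object'}
print(grammatical_functions(EAT, "passive"))  # {'patient': 'subject', 'agent': 'by-phrase (optional)'}
```

Choosing the passive is then a matter of selecting a mapping already listed in the entry, not of transforming an active structure.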
Multiply this by every word in the lexicon. Each noun specifies what semantic roles it can fill. Each adjective specifies what kinds of nouns it can modify and where it can appear. Each preposition specifies what relationships it can express and what complements it can take. Each word is a bundle of information about combinatorial possibilities—a user’s manual specifying the conditions under which that word can successfully combine with others.
When a speaker wants to express a thought, the process is not: construct an abstract deep structure, then apply transformations to derive words. The process is: search the lexicon for words whose semantic content matches the thought you want to express, then combine those words in ways that satisfy their combinatorial constraints. The mind is not running a derivational procedure. Instead, it is satisfying multiple user’s manuals simultaneously.
This is where parallel processing becomes essential. A sentence is not built word by word, with each word’s constraints fully satisfied before the next word is considered. Multiple words are activated at once, and their constraints interact. The system settles into a configuration that satisfies as many constraints as possible across all the words simultaneously.
This is how a speaker can produce a sentence almost instantaneously—not by computing a long derivation step by step, but by activating lexical items in parallel and letting their constraints mutually resolve.
This is why we have “um.” “Um” is the sound of constraints mutually resolving.
The grammar does not specify a procedure to follow. The grammar specifies conditions to meet. The mind meets them in parallel, not in sequence. What makes constraint-based grammar psychologically plausible is that its architecture matches the parallel, constraint-satisfying nature of actual cognition.
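One crude way to picture “settling” is as a scoring pass over candidate configurations, with every constraint consulted at once. The candidates, constraints, and scoring scheme below are invented for illustration and are far simpler than any serious model of parallel activation.

```python
# A toy picture of "settling": several candidate configurations are scored against
# all constraints at once, and the system keeps whichever satisfies the most.
# The candidates, constraints, and scoring scheme are invented simplifications.

def implies(premise, conclusion):
    """Material implication over booleans: the constraint only bites when the premise holds."""
    return (not premise) or conclusion

candidates = [
    ("the child", "ate", "the cake"),
    ("the cake", "ate", "the child"),
    ("the cake", "was eaten by", "the child"),
]

def satisfied(subject, verb, complement):
    return [
        verb in ("ate", "was eaten by"),                         # some form of "eat" is used
        implies(verb == "ate", subject == "the child"),          # active: the eater is the subject
        implies(verb == "was eaten by", subject == "the cake"),  # passive: the eaten thing is the subject
    ]

scores = {c: sum(satisfied(*c)) for c in candidates}
best = max(candidates, key=scores.get)
print(best, scores[best])  # a configuration that satisfies all three constraints
```

No derivational history is computed; the winner is simply the configuration that satisfies the most constraints.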
It’s important to understand that transformational grammar today is not the 1965 version; within the Chomskyan camp, transformations have been radically pared down, and many procedural elements have been reinterpreted as conditions on representations.
These changes converge conceptually with a hybrid view of the mind and brain where parallel activation at deeper levels feeds into more serial commitments in production and comprehension.
The Writing Process Mistake
Now consider how writing is taught. The standard model presents composition as a sequence of stages: brainstorm, then outline, then draft, then revise, then edit. First you generate ideas. Then you organize them into a structure. Then you write sentences that execute that structure. Then you improve those sentences. Then you fix surface errors. Each stage has its place in the sequence. Each stage is completed before the next begins. The model is procedural. It specifies an order of operations.
Composition pedagogy resembles early derivational thinking in privileging clean stages, and both run into trouble when confronted with data about messy, recursive practice. Just as the transformational model requires computation to proceed step by step, the staged model requires composing to proceed stage by stage.
And just as the transformational model doesn’t provide a complete description of how minds actually process language, the staged model fails as a description of how writers actually compose.
Research on composing processes established this decades ago. Flower and Hayes (1981) studied writers thinking aloud as they composed and found that writing is recursive, not linear. Writers do not complete brainstorming before they begin outlining, or outlining before they begin drafting. They loop back constantly.
A sentence drafted mid-process reveals a problem with the outline, so the writer revises the outline. A new idea emerges during revision, so the writer returns to drafting. The allegedly sequential stages interpenetrate and recur throughout the process.
Perl (1979) found the same recursiveness in basic writers, those who struggle most with composition. Even writers with limited fluency do not move linearly through stages. They circle back, rethink, restart.
Sommers (1980) showed that skilled revisers do not merely “clean up” drafts in a final editing pass. They reconceive their arguments through the act of revision. Revising is not a stage that follows drafting; revising is part of how drafting happens. The distinction between stages dissolves under empirical observation.
E.M. Forster captured the phenomenology: “How can I tell what I think till I see what I say?” The act of writing produces thought; it does not transcribe thought that existed prior to writing. The sentence you draft teaches you something you did not know before you wrote it.
That new knowledge changes what you want to say next. Which changes the structure of the piece. Which sends you back to revise what you already wrote. The process is primarily recursive and emergent, not purely sequential and derivational. Lexical “user’s manuals” can drive an incremental construction process that is still fundamentally constraint‑based and parallel in activation.
The staged model persists in classrooms because it is teachable and assessable. You can assign an outline on Monday, a draft on Wednesday, a revision on Friday. You can create rubrics that assess each stage. But this administrative convenience does not reflect cognitive reality. It converts a messy, recursive, parallel process into a linear procedure—just as transformational grammar converted the parallel constraint-satisfaction of language processing into a serial derivation.
The staged writing process is psychologically implausible in closely analogous ways to the early derivational reading of transformational grammar. Both over‑privilege neat sequential mechanisms where actual cognition shows pervasive recursiveness and parallel activation with only local serial production commitments. Both mistake formal convenience for psychological reality.
The Constraint-Based Alternative for Composition
What would a psychologically plausible model of composition look like? It would be declarative, not procedural. It would specify conditions that effective writing satisfies, not steps to execute in order. It would allow for parallel processing and recursive movement, not linear progression.
The analogy to constraint-based grammar is precise. Just as a grammatical sentence satisfies multiple constraints simultaneously—syntactic, semantic, pragmatic—an effective piece of writing satisfies multiple constraints simultaneously: it must say something true or interesting (semantic constraint), it must be organized so readers can follow (structural constraint), it must use language appropriate to its audience (register constraint), it must maintain coherence across sentences and paragraphs (cohesion constraint).
The composing mind, like the language-processing mind, works by activating multiple representations simultaneously and letting their constraints interact until the system settles into a stable configuration. The writer considers the argument they want to make, the evidence they have available, the structure that might organize it, the sentences that might express it—all at once, in recursive interaction, not in sequential stages.
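To make the analogy literal for a moment, a draft can be treated the way a candidate sentence structure is: as an object checked against several named constraints in no particular order. The draft fields and constraint checks below are toy stand-ins; real judgments about truth, coherence, and register cannot be reduced to checks like these.

```python
# A toy illustration of composition as constraint satisfaction. The draft fields
# and constraint checks are hypothetical placeholders for writerly judgments.

draft = {
    "claim": "Staged models misdescribe composing",
    "evidence": ["Flower & Hayes 1981", "Perl 1979", "Sommers 1980"],
    "audience": "writing teachers",
    "paragraph_links": True,
}

constraints = {
    "semantic":   lambda d: bool(d["claim"]),         # says something true or interesting
    "structural": lambda d: len(d["evidence"]) > 0,   # organized around evidence
    "register":   lambda d: d["audience"] is not None,# pitched to an audience
    "cohesion":   lambda d: d["paragraph_links"],     # paragraphs connect
}

unmet = [name for name, check in constraints.items() if not check(draft)]
print(unmet or "all constraints satisfied")  # revision targets whatever remains unmet
```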
Implications for Teaching
If the staged writing process is psychologically implausible, why do we teach it? For the same reason transformational grammar persisted despite experimental disconfirmation. It is elegant, systematic, and teachable. It gives teachers something concrete to assign and assess. It gives students something concrete to do. It makes the terrifying openness of composition manageable by breaking it into steps.
But the cost is high. When students internalize the staged model, they often experience writing as a series of obligatory performances rather than a recursive exploration. They produce outlines that don’t inform their drafts because the outline was written to satisfy an assignment, not to discover structure.
They submit drafts that don’t benefit from revision because revision was conceived as a separate stage rather than an ongoing process. They experience the disconnect between how writing is taught and how writing actually works, and they conclude either that they are bad at writing or that writing instruction is pointless.
A psychologically plausible pedagogy would teach writing as constraint satisfaction. It would help students understand the multiple constraints that effective writing must meet, and it would give them practice satisfying those constraints recursively rather than sequentially.
It would present revision not as a stage but as a mode of thinking—something that happens throughout composing, not after it. It would value the recursive loop, the mid-draft discovery, the outline revised in light of the sentence that wouldn’t come right.
Most importantly, it would assess writing in ways that honor this recursiveness. If composing is recursive and emergent, then the evidence of genuine composing is not a neat sequence of artifacts—outline, then draft, then revision—but traces of cognitive work that don’t fit a linear story.
The student who wrote the draft first and reverse-engineered the outline has not cheated the process; that student may have composed more authentically than the one who followed the stages in order. A psychologically plausible assessment would look for evidence of thinking, not evidence of compliance with a procedural model.
Conclusion
Psychological plausibility is not an abstract theoretical concern. It is a criterion for distinguishing models that describe cognitive outputs from models that explain cognitive processes. Transformational grammar described linguistic patterns with formal precision, but its derivational architecture required mental operations that empirical research could not confirm.
Constraint-based grammars achieve the same descriptive coverage while positing mechanisms—parallel constraint satisfaction over richly specified lexical entries—that align with what we know about how minds work.
The same logic applies to the teaching of writing. The staged writing process describes one possible path from idea to text, but its sequential architecture does not match how composing actually unfolds in minds. Writers work recursively, not linearly. They satisfy multiple constraints in parallel, not one at a time.
A psychologically plausible pedagogy would honor this reality instead of imposing a procedural fiction.
The lesson in both cases is the same. Formal elegance and teachability are not the same as cognitive accuracy. A model can be useful without being true. And when the goal is to understand—or to teach—how minds actually work, psychological plausibility is not optional.
References
Asudeh, A., & Toivonen, I. (2010). Lexical-Functional Grammar. In B. Heine & H. Narrog (Eds.), The Oxford handbook of linguistic analysis (pp. 425–458). Oxford University Press.
Chomsky, N. (1957). Syntactic structures. Mouton.
Chomsky, N. (1965). Aspects of the theory of syntax. MIT Press.
Flower, L., & Hayes, J. R. (1981). A cognitive process theory of writing. College Composition and Communication, 32(4), 365–387.
Fodor, J. A., Bever, T. G., & Garrett, M. F. (1974). The psychology of language: An introduction to psycholinguistics and generative grammar. McGraw-Hill.
Perl, S. (1979). The composing processes of unskilled college writers. Research in the Teaching of English, 13(4), 317–336.
Pollard, C., & Sag, I. A. (1994). Head-driven phrase structure grammar. University of Chicago Press.
Sommers, N. (1980). Revision strategies of student writers and experienced adult writers. College Composition and Communication, 31(4), 378–388.
