Comprehension Mirages and Faulty Learning from Text: An Ongoing Problem for AI and Disciplinary Pedagogy
Food for thought: If reading comprehension has always been about prior knowledge and metacognitive monitoring, why are we still debating phonics versus meaning? And how will we know when we're teaching students to read both human and AI texts without fear or favor? Or do we want them to fear or favor one above the other?
From Passive to Active Reader
By 1989, near the zenith of the Whole Language movement in the United States, reading comprehension research had crystallized its views on the reader’s active role in meaning construction, a stark contrast to what preceded it. This new, active role was rich in pedagogical implications driven by insights from what David Pearson (1985) called scientific “interlopers” like ‘psycholinguistics’ and ‘metacognition.’ It’s almost hard to think about a world lacking the word ‘metacognition,’ isn’t it? How did teachers get along without it?
The passive reception of meaning during reading had long been problematized, beginning with Huey’s (1908) eye movement studies, but in the early 1970s, just as the cognitive revolution was reaching reading research, mainstream reading comprehension was still dominated by behaviorist and transmission models. In those models, getting from a text to comprehension was easy so long as the reader could decode and recognize all of the words.
Pearson’s (1985) article, “The Comprehension Revolution: A Twenty-Year History of Process and Practice Related to Reading Comprehension,” is an invaluable and timely resource for teachers to read and discuss today in professional development groups. It offers a quick dose of theoretical eyedrops with immediately useful wisdom, articulated by a key figure with his finger on the pulse of a period of rapid epistemic development in reading and learning, a period that still compels us to do better and think more deeply about our mission.
Ripples from the waves being made by reading researchers at the time gave younger teachers like me an opportunity to get a foothold in the assessment policy arena in California, buoyed by a tide of possibilities for creating a thinking, meaning-centered curriculum. Despite the conservative backlash of 1996, which ushered in a long period of retreat into controlled, extrinsically driven, mechanical literacy pedagogy under surveillance from the federal government, enough of us survived into the early part of the 21st century to keep up the resistance. Long live prior knowledge!
Old Wine in New Bottles
Intriguing questions, free for the asking, began to materialize on the theoretical drawing board among teachers designing what we were calling authentic reading assessments in 1989. Literary reading had never been assessed in California. The State Department of Education was going to change that. Reading tests of comprehension focused on getting the facts. What about reading when facts weren’t the point?
What is a robust, aesthetic response to a literary text? How does a response develop? How does this response differ from the response to a transaction with an informational text? What is the relationship between reading comprehension and aesthetic response? How does a teacher invite and nurture aesthetic responses through classroom assignments? How does an assessor deal with the inherent paradox of measuring an emotional response to a reading in the moment without destroying its meaning?
In 1989, California decided to start with assessing aesthetic reading and later complete the project by figuring out how to authentically assess informational reading, which was “beyond the bubble,” i.e., outside the reach of multiple choice items. But the broader field of reading research was focused on a more universal question of earthshaking importance to basic comprehension assessment, a question which is still being asked in the NAEP arena.
If the reader’s prior knowledge shapes comprehension, and if the writer’s presumptions about the reader’s prior knowledge play a definitive role in shaping comprehension, then reading is an epistemic, not a linguistic, practice. Assessment of the reader’s comprehension must entail a measure of the reader’s prior knowledge before reading, the impact of the reading on that prior knowledge, and a measure of the writer’s level of performance in accommodating the reader. If a reader knows little and the writer does little, is a bad reading performance a measure of the reader’s ability to comprehend?
Understanding the critical significance of prior knowledge in literacy events upended any chance for the Simple View of Reading, which had been circulating for a decade or more, to be taken seriously at that point, i.e., as any more than a talking point in graduate reading courses used to introduce the real scientific stuff. In the Simple View, reading is quite simply a matter of speaking and listening, except that the reader has to decode the words. The missing piece is phonics. Teach phonics and comprehension takes care of itself.
Reducing textual matters to speaking and listening left out all things textual, along with everything the study of mind and intelligence had brought to bear on them. Through the lens of cognitive science, the field of reading integrated philosophy, psychology, artificial intelligence, neuroscience, linguistics, and anthropology, and probably more. These were fighting words at the time, the Simpletons vs. the Complexifiers. Interestingly enough, even while we struggle with the emergence of AI, we continue to fight the war between the Simples and the Complexes.
The Illusion of Knowing
The illusion of comprehension as a phenomenon amenable to empirical study has been discussed since the early 1980s. It’s essentially a false positive: the reader judges that they comprehend something (positive) when in reality they do not get it (false). The opposite also occurs: a reader believes they haven’t comprehended something they actually have understood, though, as near as I can tell, this false negative hasn’t been widely studied. Whether we are talking about human-written text or artificially generated text, the illusion of confidence in understanding is a phenomenon which can be approached instructionally under the rubric of self-regulated learning and metacognition.
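To make the false-positive framing concrete, here is a minimal sketch in Python that crosses a reader’s judgment with a measured outcome, borrowing labels from signal detection. The function and its names are my illustration, not an instrument from the comprehension literature.

```python
# A minimal sketch of the judgment-vs.-reality framing above.
# Labels are borrowed from signal detection; the function and its
# names are illustrative assumptions, not a published instrument.

def classify_judgment(feels_understood: bool, actually_understood: bool) -> str:
    """Cross a reader's comprehension judgment with a measured outcome."""
    if feels_understood and actually_understood:
        return "accurate confidence"           # hit
    if feels_understood and not actually_understood:
        return "illusion of knowing"           # false positive
    if not feels_understood and actually_understood:
        return "underestimated comprehension"  # false negative (less studied)
    return "accurate doubt"                    # correct rejection

print(classify_judgment(feels_understood=True, actually_understood=False))
# -> illusion of knowing
```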
Recent research on metacognitive monitoring has made great strides in understanding how monitoring can fail to make comprehension mirages, the illusion of understanding, self-detectable during student reading events. According to Avhustiuk et al. (2018), types of metacognitive monitoring can vary according to
“…criteria of reliability (accurate and inaccurate monitoring), level of performance (local and global monitoring), temporal implication (on-line and off-line monitoring), learning achievements (subject-specific and general monitoring), cognition plot (monitoring of comprehension, monitoring of metamemory, and monitoring of performance), level of understanding [analytical (explicit) and non-analytical (implicit) monitoring], basis of judgements (information-based and experience-based monitoring)” (p. 318).
I’m dismayed by an underlying assumption in discussions of the role of AI in reading and writing: that reading text strategically, whether human-written or bot output, is somehow “not the problem.” The problem for reading and the bot, in this telling, is some vague, scary monster called “cognitive offloading,” and it is a moral problem, not a reading problem.
We hear ad nauseam that “the AI problem is writing, cheating, and plagiarism.” This could be just another instantiation of the common view of the reader as under the control of the teacher. Readers read what they are assigned to read for the purpose determined by the teacher. Either they read, they get it, and we’re fine. Or they don’t read, they don’t get it, and we have a problem.
Do we know if the student tried to read the human text and fell under the spell of an illusion of comprehension? Do we know if the reader took the step of seeking help from a bot when the illusion began to crumble? Do we call that lazy or dishonest? Do we warn students about the illusion of comprehension of human text and about what might be an appropriate turn to the bot? What about massive confusion, no illusion, just confusion? Do we assume the reader doesn’t care enough to work it out alone? How do we help students recognize when to verify bot output? Or do we simply forbid the bot?
A Metacognitive Toolkit
Reading comprehension monitoring is a teachable, complex cognitive skill involving interconnected processes. It requires diligent, predictable, interactive attention over sustained time periods. Students must simultaneously track understanding, evaluate the steps they are taking, and adjust strategies based on self-generated feedback.
We know that successful readers actively coordinate multiple types of awareness before, during, and after reading. A wealth of pedagogical strategies has been empirically studied to inform prereading, during-reading, and post-reading instructional episodes, all beholden to the overarching importance of metacognition.
Taking the seven distinct monitoring dimensions from Avhustiuk et al. (2018), I’ll offer a provisional framework of pedagogical tools as a starting point for integrating metacognitive instruction, one that scaffolds the growth of calibrating self-understanding with text-understanding across human-text, AI-text, and hybrid human-AI situations.
Seven Dimensions of Student Monitoring
1. Reliability: Developing Accurate Self-Assessment
Students vary in how accurately they judge their own understanding. Some feel confident while struggling with the basics; others underestimate solid comprehension. This calibration gap significantly impacts learning.
Building accuracy through:
Regular prediction-performance comparisons
Confidence rating activities with feedback analysis (a simple scoring sketch follows this list)
Self-questioning that probes specific understanding aspects
Distinguishing familiarity from true comprehension
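One way to operationalize prediction-performance comparisons is to track the gap between a student’s confidence rating and their actual quiz score over time. A minimal sketch in Python, assuming a 0-100 scale and hypothetical numbers; the function names are mine:

```python
# A minimal sketch of "absolute calibration accuracy": the gap
# between a student's confidence rating and their actual score.
# The 0-100 scale, names, and sample data are illustrative
# assumptions, not a prescribed instrument.

def calibration_gap(confidence: float, score: float) -> float:
    """Absolute gap between predicted and actual performance (0-100 scale)."""
    return abs(confidence - score)

def mean_calibration_gap(ratings: list[tuple[float, float]]) -> float:
    """Average gap across (confidence, score) pairs; lower values
    mean the student's self-assessment is better calibrated."""
    return sum(calibration_gap(c, s) for c, s in ratings) / len(ratings)

# Example: a student who is consistently overconfident.
weekly_ratings = [(90, 60), (85, 70), (95, 65)]  # (predicted %, actual %)
print(mean_calibration_gap(weekly_ratings))  # 25.0 -> a large illusion-of-knowing gap
```

Shared with the student as feedback, a shrinking gap over successive readings is one observable sign that self-assessment is becoming more reliable.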
2. Performance Level: Local and Global Awareness
Monitoring operates at two levels simultaneously:
Local: Sentence-by-sentence understanding
Global: Overall comprehension of larger sections
Skilled readers coordinate both seamlessly. Developing readers often focus on details while losing the bigger picture.
Effective instruction:
Teaches frequent local comprehension checks
Develops synthesis across larger text sections
Integrates both levels rather than treating them separately
3. Timing: During vs. After Learning
During-reading monitoring:
Enables real-time problem-solving
Prevents small problems from becoming larger obstacles
Focuses on recognizing confusion signals as they emerge
After-reading monitoring:
Supports deeper reflection on processes and outcomes
Builds transferable metacognitive awareness
Includes self-assessments and strategy planning
4. Context: Subject-Specific and General Skills
Subject-specific monitoring involves students developing disciplinary ways of seeing and thinking that mirror expert cognitive processes within each field. Mathematics monitoring requires recognizing computational reasonableness and solution pathway validity, i.e., skills that develop through understanding mathematical coherence and logical relationships. Scientific monitoring emphasizes evidence evaluation, hypothesis testing, and the coordination of theory with empirical data. Historical monitoring involves source credibility analysis, perspective recognition, and the ability to detect bias and evaluate competing narratives.
The ability to reconstruct conceptual frameworks and explanations, with clearly specified relations among the parts, is a valuable general metacognitive skill that bridges human and AI textual interactions.
5. Target: What Students Monitor
Students can track multiple learning aspects:
Comprehension: Immediate content understanding (quick tests like noting, paraphrasing)
Memory: Integration of content in chunks or captions with graphic links
Performance: Micro goals established for each task, with enforced look-backs
6. Depth: Explicit vs. Implicit Awareness
Explicit monitoring:
Conscious evaluation using specific task criteria (how the information will be applied)
Systematic approaches and deliberate adjustments (making an immediate adjustment to improve focus and confidence)
Checklist-based self-assessment
Implicit monitoring:
Respect for intuitive feelings about comprehension quality
Registering emotional feedback about understanding and confusion
Gut-level awareness of learning process
Both contribute to comprehension success and should be integrated.
7. Source: Information vs. Experience-Based Judgments
Information-based:
Assessing alignment between familiar vocabulary and new information (identifying unfamiliar vocabulary and concepts and deliberately applying them)
Objective assessment of understanding of specific knowledge (self-tests, chapter study check-up devices, AI interrogation)
Direct content verification using human sources
Experience-based:
Learning process quality indicators (self-awareness of typical work sessions, identifying subpar performances and adjusting)
Reading fluency demands, effort requirements, emotional responses
Process efficiency insights
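To pull the seven dimensions together, here is a hypothetical sketch of how a teacher might record them as a single observation profile for one reading event. The categories paraphrase Avhustiuk et al. (2018); the data structure and field names are my illustration, not theirs.

```python
# A hypothetical checklist record for one observed reading event.
# Categories paraphrase Avhustiuk et al. (2018); the structure is
# an illustration, not a published instrument.

from dataclasses import dataclass, field

@dataclass
class MonitoringProfile:
    reliability: str = "unrated"  # accurate vs. inaccurate self-assessment
    level: str = "unrated"        # local vs. global
    timing: str = "unrated"       # during vs. after reading
    context: str = "unrated"      # subject-specific vs. general
    target: str = "unrated"       # comprehension, memory, or performance
    depth: str = "unrated"        # explicit vs. implicit
    source: str = "unrated"       # information- vs. experience-based
    notes: list[str] = field(default_factory=list)

# Example: tagging one observed reading event.
event = MonitoringProfile(reliability="overconfident", level="local",
                          timing="during", source="experience-based")
event.notes.append("Reread paragraph 3 after a confusion signal")
```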
Creating Systematic Development
Building comprehensive monitoring requires progressive instruction:
Early Stage:
Basic awareness and simple strategies
Recognizing obvious confusion signals
Extensive modeling with immediate feedback
Intermediate Stage:
Coordinating multiple monitoring dimensions
Adapting strategies to different contexts
Selecting appropriate approaches for tasks
Advanced Stage:
Flexible application across diverse contexts
Automatic and effective monitoring
Teaching strategies to others
The Ultimate Goal
Develop students who monitor learning automatically and effectively across all contexts, especially during reading events involving access to human texts and/or AI output. Independent learners recognize problems quickly, select appropriate strategies efficiently, and adjust approaches based on ongoing feedback. They take responsibility for their own understanding and performance quality.