What Brain Activity Reveals About Human-AI Collaboration During Writing
This post examines a study titled “Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task,” first published on arXiv in 2025. Recall that arXiv is an open-access preprint repository for scholarly articles, primarily in physics, mathematics, computer science, quantitative biology, statistics, and related fields.
Researchers use arXiv to share their original manuscripts—known as “preprints”—publicly before or during peer review, accelerating dissemination and feedback within the scientific community. arXiv serves as a sort of testing ground where research designs can demonstrate their staying power, and the 2025 draft of the article posted there is the document I am using.
“Cognitive debt” in the title refers literally to how much a writer owes to AI for doing the task, a quantity that can be difficult to discern qualitatively. Unfortunately, teachers have no sure way to calculate their students’ cognitive debt to the machine, an assessment that would be highly useful. These researchers, however, used what has become a standard research technique to measure precise amounts of debt, one totally impractical in a classroom. Here is their research design in their own words (we’ll unpack this design below):
“We assigned participants to three groups: LLM group, Search Engine group, Brain-only group, where each participant used a designated tool (or no tool in the latter) to write an essay. We conducted 3 sessions with the same group assignment for each participant. In the 4th session we asked LLM group participants to use no tools (we refer to them as LLM-to-Brain), and the Brain-only group participants were asked to use LLM (Brain-to-LLM). We recruited a total of 54 participants for Sessions 1, 2, 3, and 18 participants among them completed session 4.”
Their data collection methods were redundant to increase reliability and validity, but their primary method involved looking directly into the brain. In their own words:
“We used electroencephalography (EEG) to record participants' brain activity in order to assess their cognitive engagement and cognitive load, and to gain a deeper understanding of neural activations during the essay writing task. We performed NLP analysis, and we interviewed each participant after each session. We performed scoring with the help from the human teachers and an AI judge (a specially built AI agent).”
According to the findings and analysis, the Brain-to-AI study reveals a key insight: AI tools don't just help with writing—they may rewire how we think. Whether this rewiring enhances or undermines deep cognitive engagement may depend entirely on when we bring AI into the process.
Again, these findings have not been peer-reviewed; they are published on a site intended to circulate research papers that are still percolating, and they are discussed here because they have been so widely impactful. My personal judgment is this: under the laboratory conditions in which the study was done, the findings do not startle. But I sense a radically oversimplified view both of writing and of AI. Let me share how I am thinking about whether the timing is all.
To bot at the start of a writing project, to bot at the end, or never to bot at all—that is the core question facing teachers? Really? Can teachers effectively control the sequence? Should they?
The Research Design: A Window into the Writing Brain
The MIT Media Lab study employed research tools designed to capture real-time neural activity during different writing conditions. The fifty-four university students from MIT, Harvard, Wellesley, Tufts, and Northeastern were divided into three groups, each representing a different relationship with technology during writing:
The LLM Group used GPT-4 exclusively for essay writing
The Search Engine Group could access any website except AI tools
The Brain-only Group worked without any external resources
Researchers used a 32-channel Neuroelectrics Enobio headset to record EEG signals at 500 Hz, capturing brain activity across multiple frequency bands (alpha, beta, theta, and delta). Each participant completed three essay-writing sessions over several months, choosing from SAT prompts covering topics like loyalty, happiness, choices, forethought, philanthropy, art, and courage. Essays averaged 820 words and were completed under controlled laboratory conditions.
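For readers curious what those “frequency bands” mean in practice, here is a minimal, hypothetical sketch (not the study’s actual analysis pipeline) of how power in the delta, theta, alpha, and beta bands can be estimated from a signal sampled at 500 Hz, as in the study’s recording setup:

```python
import numpy as np

fs = 500                      # sampling rate in Hz, matching the study
t = np.arange(0, 4, 1 / fs)   # 4 seconds of synthetic "EEG"

# Synthetic signal: a dominant 10 Hz (alpha-range) rhythm plus noise.
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 10 * t) + 0.3 * rng.standard_normal(t.size)

# Power spectrum via the FFT.
freqs = np.fft.rfftfreq(signal.size, d=1 / fs)
power = np.abs(np.fft.rfft(signal)) ** 2

# Conventional band edges in Hz (definitions vary slightly across labs).
bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 12), "beta": (13, 30)}
band_power = {
    name: power[(freqs >= lo) & (freqs < hi)].sum()
    for name, (lo, hi) in bands.items()
}

dominant = max(band_power, key=band_power.get)
print(dominant)   # prints "alpha": the 10 Hz rhythm lands in the alpha band
```

Connectivity analyses like those in the study go further, comparing band-limited activity across the 32 channels, but the band decomposition above is the common first step.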
The study's fourth session introduced a crucial twist: participants were reassigned into "Brain-to-LLM" (those who previously wrote without AI now used it for revision) and "LLM-to-Brain" (those who previously used AI now wrote without it) groups. This crossover design allowed researchers to examine how prior AI exposure affected subsequent neural patterns.
The Good: Enhanced Integration and Synthesis
The most striking positive finding emerged from the Brain-to-LLM group in Session 4. These participants—who first wrote essays using only their own cognition then revised using AI—showed dramatically elevated neural connectivity across all measured frequency bands. This "network-wide spike" in alpha, beta, theta, and delta band connectivity suggests that using AI as a revision tool engages more extensive brain networks than any other condition tested.
This enhanced connectivity appears to reflect deeper cognitive processing, as the brain works to integrate AI suggestions with existing mental models. The researchers interpreted this as evidence of "better integration of content," where students actively synthesized their original thoughts with AI-generated improvements. Essay quality scores from both AI judges and human teachers confirmed that this approach produced superior written work.
The Search Engine group also demonstrated benefits, showing "higher interconnectivity" and "high integration flow" compared to baseline. This suggests that actively seeking and synthesizing information from multiple sources—whether AI or traditional search—engages beneficial neural processes that pure solo writing may not activate.
The Bad: Cognitive Dependency and Reduced Ownership
The concerning findings center on students who began their writing journey with AI assistance. The LLM group showed progressively decreasing neural connectivity across sessions, which researchers termed "neural efficiency adaptation." While efficiency might sound positive, in this context it suggests the brain is doing less work—potentially learning less—when AI handles initial creation.
More troubling were the behavioral and psychological impacts. LLM-first users demonstrated:
"Significantly reduced ability to quote from their essay"—they couldn't remember what they had supposedly written
"Impaired perceived ownership" of their work
"Low effort visual integration and attentional engagement"
Heavy reliance on copy-paste behaviors with "minimal editing"
Essays that showed "not significant distance to the default ChatGPT answer"
The neural data supports these observations. By Session 3, the LLM group showed "lower interconnectivity due to familiar setup," suggesting their brains had adapted to offload cognitive work to the AI. The researchers noted a concerning pattern where "lower-performing students rely passively on AI, limiting deeper engagement."
The Unknowns: Long-term Implications and Individual Variations
The problem with a study like this is its undeniable tendency to play to reader biases. Mine, for example. I lean toward the view that AI can inhibit my understanding of and reliance on my own thoughts if I engage with a bot before writing in my own words what I’m thinking, why I want to write about it, what I actually believe I know about it, and why it might be worth taking a reader’s valuable time. My intuition tells me AI should be discouraged during the initial stages of writing. I’ve actually written a post exploring this point.
Bias aside, several critical questions remain unanswered by this study, partly due to the study's preliminary status and partly due to inherent limitations, including:
Temporal Dynamics: How do these neural patterns evolve over longer periods? Are students who offload going to be permanently lobotomized? The study captured only a few months of AI use. Would the reduced connectivity in AI-first users eventually plateau or continue declining? Would they learn to approach the machine differently? Can they be taught to do so? Would the enhanced connectivity from AI revision be sustained with regular use?
Individual Differences: The study noted vocabulary and skill development varied among students but didn't fully explore these individual trajectories. How do factors like prior writing ability, technical comfort, or motivation affect neural responses to AI assistance? What about measures of metacognitive strategies and dispositions?
Transfer Effects: Do the neural patterns observed during AI-assisted writing under assignment conditions affect other authentic writing tasks? If students become neurally "efficient" (less engaged) when using AI to write their assignments, does this reduced engagement transfer to non-AI writing tasks, say, writing a love letter or an obituary?
Real Life: This study has poor ecological validity. Writing assignments happen in the real world, and research on writing has a long history of forgetting about real life. In fact, much of the literature since Arthur Applebee has documented how inhospitable the real classroom is to student agentic writing. As teachers begin to understand the dynamics of AI, the reality is going to be much more complex and fluid.
I recall first reading William Labov’s 1972 study Language in the Inner City and being astonished at his findings. At the time, serious research was lacking to provide insight into Black English Vernacular. Labov’s “logic of nonstandard English,” which is not represented in standardized measures of language, made clear how assessments designed around middle-class norms distorted, trivialized, or missed the genuinely sophisticated linguistic performance of these students.
Ecological validity refers to how well research or testing settings reflect real-world situations. Labov made clear that sterile classroom assessments or laboratory studies (such as the arXiv study we’re discussing) fail to capture the dynamic, context-dependent ways youth actually use language.
Developmental Concerns: The participant pool consisted of university students with developed writing skills. How might these neural effects differ for younger students still developing fundamental literacy abilities? What is the cost-benefit analysis and how much does it depend on pedagogical approach?
Methodological Strengths and Limitations
The study's sophisticated EEG methodology captured nuanced neural dynamics impossible to observe through behavioral measures alone. The crossover design in Session 4 provided crucial insights about sequence effects. The combination of neural, behavioral, and performance data offers a multi-dimensional view of AI's impact.
However, several limitations constrain our interpretations:
Small sample size (18 participants per group) limits generalizability
Lack of peer review means statistical analyses and interpretations remain unvalidated
The controlled laboratory setting may not reflect naturalistic AI use
The study didn't examine mixed human-AI paragraphs or sentences, only complete essays
Implications for Educational Practice
Despite its preliminary status, this research suggests several provisional principles for educators:
Sequence Matters: Starting with human-generated content before introducing AI appears to preserve cognitive engagement while still benefiting from AI enhancement. Teachers might deliberately engage students in experiments to give them the opportunity to feel the differences in vivo.
Active Integration is Key: The heightened neural activity during AI revision suggests that positioning AI as a revision partner rather than a first-draft generator may optimize learning. This positive finding doesn’t in any way close off possibilities for using AI as an exploratory tool before students’ ideas on a topic or question have settled. Teachers might similarly design pedagogical opportunities for students to experience these differences and learn from one another.
Monitor for Dependency: The progressive reduction in neural engagement among AI-first users warns against allowing students to become cognitively dependent on AI tools. Teachers might assign readings on this topic and encourage reflective introspection leading to metacognitive conversations.
Preserve Ownership: The inability of AI-first users to remember or quote their own work suggests that genuine authorship—and the learning it entails—requires human-initiated thought. Teachers might take this principle seriously and privilege student-initiated writing projects relevant to the domain under study in the classroom. As Arthur Applebee's work taught us, the norm is for teachers to control the writing task, topic, genre, length, structure, and formality, leaving little space for students to have “human-initiated thoughts.”
A Cautionary Tale of Cognitive Trade-offs
The Brain-to-AI study offers a very preliminary glimpse into the neural mechanics of human-AI collaboration in academic writing. While it reveals AI's potential to enhance revision and synthesis processes, it also warns of cognitive disengagement when AI substitutes for rather than supplements human thought. The research suggests that how we integrate AI into the writing process may be more important than whether we use it at all.
Until peer review validates these findings, we should treat them as valuable hypotheses rather than established facts. The best use, in my opinion, is as springboards for student experimentation, providing a real-world purpose for those AI logs many teachers have been using. In addition to helping their teacher understand their AI insights, these logs could serve as self-study opportunities, a structured and mentored chance to learn about “AI and ME” rather than generic AI and generic students.
Even as preliminary research, this study raises essential questions about the cognitive costs and benefits of educational AI adoption. As we rush to integrate AI tools into classrooms, we must consider not just what students do and what they produce, but what happens inside their minds during production and how they feel about it. Teachers aren’t used to inquiring about how students feel about their work, though they are prone to complaining about it.
The neural evidence suggests that maintaining human cognitive engagement requires thoughtful, sequenced integration of AI tools: not wholesale adoption or rejection, but careful choreography of human and artificial intelligence. Using these principles may indeed make AI less of a threat and more of a support. But if they are applied in a highly controlled ecology where students are told when, how, what, how long, and why to write, with no attention to students’ self-regulating how AI is working for them, the tool could easily become one of thought control rather than liberation.
Reference List for Further Reading
Kosmyna, N., Hauptmann, E., Yuan, Y. T., Situ, J., Liao, X.-H., Beresnitzky, A. V., Braunstein, I., & Maes, P. (2025). Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task. arXiv preprint arXiv:2506.08872.
"ChatGPT's Impact On Our Brains According to an MIT Study." TIME, June 16, 2025.
Georgiou, G. P. (2025). ChatGPT produces more 'lazy' thinkers: Evidence of cognitive engagement decline. arXiv preprint arXiv:2507.00181.
"Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task." Reddit, June 14, 2025.
"The Implications of the MIT Research Paper 'Your Brain on ChatGPT' at Work." The Career Toolkit Blog, June 23, 2025.
"Overview: Your Brain on ChatGPT - MIT Media Lab." MIT Media Lab, June 17, 2025.
"The Cognitive Debt of Digging Through Preprints." The BS Detector (Substack), June 18, 2025.
