Overview and Core Argument
Assessment in writing instruction has long presented challenges for educators, especially classroom teachers and their students. These challenges are amplified when teaching with artificial intelligence, a rapidly changing force that presents ethical, epistemic, and aesthetic uncertainties. Recent research continues to unearth persistent, disturbing patterns of unreliability and inequity in grading while suggesting promising paths toward more equitable and consequential evaluation practices.
Context of the Problem
The problem is stark on its face. Approximately 60% of teacher-assigned grades do not accurately reflect students' demonstrated knowledge (Feldman, 2024). University grades remain highly variable across departments and within disciplines, despite years of work by accreditation agencies to clarify program learning outcomes and to document learning with stable, durable indicators.
Traditional approaches to writing assessment, particularly the use of rubrics early in the writing process, may actually impede learning by circumventing students' opportunities to develop intrinsic understanding of effective writing (Furman, 2024). As Furman argues, providing predetermined criteria "blocks student entry into the most powerful opportunity to learn what it is that makes good writing" and "reduces cognitive struggle" necessary for genuine learning.
Analysis of Dilemmatic Spaces
To understand these challenges more deeply, educators can draw on Fransson and Grannäs' (2013) concept of "dilemmatic spaces": classroom ecologies where competing values, obligations, and commitments intersect in conflict. Rather than viewing assessment challenges as isolated decision points (use a rubric, don't use a rubric), this framework helps us recognize them as persistent, resistant tensions that must be thoughtfully navigated toward the best possible outcome under the circumstances.
Knowledge-Demonstration Space
In AI writing instruction, three key dilemmatic spaces leap out at me. First, the knowledge-demonstration space (Feldman, 2024) sets the stage for a struggle between measuring specific criteria and allowing for innovative thinking. This struggle is especially intense in STEM fields, and no less so in writing classrooms where students have free use of AI. In this context, teachers face a sharp tension: maintaining rigorous technical standards while fostering the creative solutions that AI enables. As Nakamura et al. (2024) demonstrated, this balance directly impacts students' epistemic curiosity and engagement with complex material. Traditional assessment metrics often fail to capture innovative approaches that may be equally or more valid in our rapidly evolving technological landscape.
Power-Agency Space
Second, the power-agency space reflects conflicts between teacher obligations to ensure standards and student needs for agency in defining good writing. The power-agency space in AI writing assessment presents a whole set of paradoxes that intersect with broader questions of equity and empowerment. Traditional assessment approaches, as Feldman (2024) demonstrated, often privilege teacher authority in ways that can mask or depress actual student capabilities, particularly for historically underserved populations. When teachers hold exclusive, often idiosyncratic or mainstream control over defining "good writing," they perpetuate what Fransson and Grannäs (2013) described as "changing conditions of values, decisions, responsibilities and authority" that can systematically disadvantage certain groups of students.
This dilemma becomes even more pronounced in the context of AI writing instruction, where the field itself is rapidly evolving and traditional hierarchies of expertise may need reconsideration. Emerging evidence (personal communications) suggests that students bring valuable perspectives and experiences with AI technologies that can enrich understanding of what constitutes effective writing in this new reality. As Furman (2024) noted, traditional rubric-based approaches can create a "strange weakening of self-efficacy" by emphasizing external validation over internal understanding.
The research on epistemic curiosity provides valuable insights here as well. Nakamura et al. (2024) found that student engagement and learning increased significantly when learners had opportunities to construct knowledge collaboratively rather than simply receiving it from authority figures. Their study identified six thematic factors that contribute to epistemic curiosity, including "positive appraisal" and "cognitive puzzles." How students position themselves in relation to the subject matter, to their peers, and to the broader educational context is a matter of self-appraisal. When assessment practices honor and encourage positive appraisal, they help create what Furman (2024) might call a more authentic and empowering learning environment.
This tension between teacher obligations and student agency reflects what Fransson and Grannäs (2013) describe as the "micro-political" nature of educational spaces. Teachers have the impossible task of navigating professional responsibilities to ensure academic standards while creating conditions that foster genuine student ownership of learning. When teachers implement more equitable assessment practices that balance these competing demands, both grade accuracy and student achievement improve (Feldman, 2024).
A potential resolution emerges from understanding assessment not as a unidirectional exercise of authority, but as what Fransson and Grannäs (2013) call a "relational category wherein one object is related to another." In this view, teachers and students become co-creators of assessment criteria, with each bringing valuable perspectives to the process. This approach maintains academic rigor while acknowledging what Nakamura et al. (2024) identify as students' "underlying desires" to engage meaningfully with course content.
For writing about AI specifically, this might mean creating assessment frameworks that:
1. Begin with student exploration of what makes writing with AI effective;
2. Incorporate both teacher expertise about writing and student insights about AI technologies;
3. Allow for ongoing revision of rubric criteria as understanding develops;
4. Include regular metacognitive, social, structured reflection on the assessment process itself.
This approach recognizes what Fransson and Grannäs (2013) described as the "changeable boundaries of the space" while ensuring that professional standards are maintained. It transforms the power-agency dilemma from a zero-sum conflict into an opportunity for mutual growth and learning.
Learning-Assessment Space
Third, the learning-assessment space involves balancing clear guidance with strong epistemic curiosity to explore and analyze complex ideas. At its core, this space requires teachers to model spontaneous intellectual engagement with student ideas while nurturing students' own drive to understand. When teachers demonstrate genuine curiosity about students' ideas and approaches, especially in an emerging area like AI, they create what Nakamura et al. (2024) identified as a "positive affect linked to the manifestation of epistemic curiosity."
This modeling goes beyond simple encouragement. Teachers must show students how experienced readers and writers genuinely interact with ideas, demonstrating both appreciation and critical analysis. For example, when responding to student writing about AI ethics, a teacher might first express real interest in the student's perspective ("Your analysis of algorithmic bias raises an intriguing point I hadn't considered...") before moving into more structured feedback. When the teacher's role is understood as spontaneous and situated, this disposition inherently fosters positive self-appraisals. An intelligent adult is paying attention!
The role of self-appraisal becomes even more crucial. As students receive feedback that acknowledges their intellectual contributions, they develop what Fransson and Grannäs (2013) call a "professional sense of the ethics of teaching." They learn to evaluate their own work not just against external criteria, but through the lens of meaningful contribution to ongoing discussions. This develops what Nakamura et al. (2024) identify as "desires behind the manifestation of epistemic curiosity"—students' intrinsic drive to understand and contribute to complex conversations.
Assessment in this space must therefore be recursive and dialogic. Teachers provide clear guidance about technical requirements while modeling authentic intellectual engagement. Students develop their own capacity for self-appraisal by receiving and reflecting on feedback from peers and the teacher that values their contributions. This creates what Furman (2024) might call a "metacognitive space" where students learn not just to write with AI, but to think deeply about how they think and write with AI.
Constructing Positive Appraisals
The challenge lies in maintaining this delicate balance within institutional constraints. As Feldman (2024) noted, traditional grading systems often struggle to capture these more nuanced aspects of learning. However, when teachers consciously create spaces for both structured guidance and open-ended exploration, they foster what Nakamura et al. (2024) described as the "positive affect" that drives deeper learning. Mastery comes not from meeting criteria, but from working hard to engage meaningfully with complex ideas and texts.
Drawing on recent research about equitable grading practices (Feldman, 2024), we can construct a more nuanced assessment framework that acknowledges these dilemmatic spaces while promoting both rigor and fairness. This approach begins with co-construction of understanding, where students and teachers jointly explore what constitutes effective writing with AI through doing it, analyzing its effects, and collaboratively reflecting on its appropriate applications.
The framework must also be dynamic, incorporating both fixed criteria and flexible elements that reward innovative thinking. As Nakamura et al. (2024) demonstrated in their study of epistemic curiosity in L2 learning, student engagement increases when learners have opportunities to explore and discover their learning processes rather than simply following prescribed rules. This affirms the value of building in regular reflection points where assessment criteria can be reviewed and revised.
Multiple perspectives should be incorporated through student self-assessment, peer feedback, and teacher assessment and evaluation. Research suggests that when teachers improve their grading practices by breaking with tradition and deeply examining the pernicious effects traditional grading can have on motives like epistemic curiosity or willingness to write, both grade accuracy and student achievement increase (Feldman, 2024). The process should be transparent and fair with open discussion of the tensions inherent in assessing work with emerging technologies.
This integrated approach maintains academic rigor while creating safe spaces for the deep engagement and critical thinking essential for understanding and working with AI. Rather than trying to eliminate assessment dilemmas or wishing they would go away, we can find ways to foster positive self-appraisal, epistemic curiosity, and human connections in writing classrooms (in all classrooms, really), addressing both practical concerns about grade accuracy and the insights that emerge in the throes of dilemmas.
References:
Feldman, J. (2024). Can we trust the transcript? Recognizing student potential through more accurate grading. Washington, DC: The Equitable Grading Project. Retrieved from https://equitablegradingproject.org/wp-content/uploads/2024/01/can-we-trust-the-transcript.pdf
Fransson, G., & Grannäs, J. (2013). Dilemmatic spaces in educational contexts – towards a conceptual framework for dilemmas in teachers' work. Teachers and Teaching: Theory and Practice, 19(1), 4-17. https://doi.org/10.1080/13540602.2013.744195
Furman, W. (2024). Rubric design: A designer's perspective. Journal of the Scholarship of Teaching and Learning, 24(4), 221-237. https://doi.org/10.14434/josotl.v24i4.35789
Nakamura, S., Darasawang, P., & Reinders, H. (2024). A classroom-based study on the antecedents of epistemic curiosity in L2 learning. Journal of Psycholinguistic Research. Advance online publication. https://doi.org/10.1007/s10936-022-09839-x