Educational testing in the U.S. is big business, though a precise annual dollar figure is elusive. Companies that publish and market standardized tests and related performance assessments, however, are just the tip of the iceberg. The hidden infrastructure anchoring these companies has been buried in layers of Discourse shaping ongoing Conversations among Authorized Voices on the Problem of Testing for at least a century. This Discourse costs money—lots of it.
American universities spend large sums of money year after year to provide a scientific knowledge base that lends credibility to standardized testing practices. For just one example, a 1989 article in the journal Applied Measurement in Education presented “…a taxonomy of 43 multiple-choice item-writing rules derived from an analysis of 46 authoritative textbooks….”
Backing out these hidden costs of standardized testing in any principled way is a daunting task. The salaries of the researchers who combed through those 46 textbooks, the tuition and fees those researchers paid to acquire the credentials to write and publish the article, the cost of publishing the journal Applied Measurement in Education, the expenses of the plethora of journals publishing this sort of Discourse, the royalties paid to the authors of those 46 textbooks on educational evaluation—you get the idea.
*
Pearson and Hamm (2005) published a provocative article about the history of the reading comprehension test, something we recognize today as unconsciously, automatically, and even involuntarily as we recognize “textbook.” Who among us does not have a full-blown schema of a reading comprehension test? Though these tests are social constructs, they feel as real as corn plants, and Pearson and Hamm made the important point that comprehension assessment did not begin in 1914 with the invention of the Kansas Silent Reading Test:
In 1914 Frederick J. Kelly was the Director of the Training School at the State Normal School in Emporia, Kansas. He developed what he called the “Kansas Silent Reading Test” not to assess whether students understood a reading assignment in the context of the classroom, but to assess the reading abilities of students, a quantum leap in decontextualized reading. This move shifted the object of assessment from determining whether a student understood what the teacher intended to be understood to determining whether a student could understand a proxy text, an anonymous and autonomous text brought into the warmth of the classroom from the cold, cruel world.
In 1935 the invention of the IBM 805 test scoring machine sealed the deal: choosing the right answer after reading a passage would come to define reading comprehension in the public mind. According to Pearson and Hamm (2005), the reduced cost of scoring and the efficiency of administration afforded rich possibilities for expanding the multiple-choice paradigm. Prior to the IBM machine, for example, the Scholastic Aptitude Test (SAT) had been mostly an essay exam.
Whether we use Gee’s (2014) definition of capital-D Discourse (the social positioning of individual participants in activity settings, through language but also uniforms, clothes, tools, values, beliefs, and so on, to create meanings) or Latour’s (1987) notion of a “black box” (the active parts of a scientifically settled procedure for solving problems or completing tasks), the IBM innovation amplified both the spread and the social value of the right-answer mindset. One might argue that its appearance on the stage had far-reaching effects on literacy in much the same way that AI will.
Here’s the bottom line: multiple-choice tests, once joined to the scoring machine, changed reality much as the change from flintlocks and long rifles to the AK-47 did. Created as a more efficient alternative to oral exams and essay questions, the standardized test stripped away context and anesthetized the public into revising its mental representations of human beings from living, breathing fellow travelers into abstractions bundled inside abstractions like “ability to compose and comprehend.” Percentile ranks and achievement designations afforded a quick look at body counts.
*
Returning to the tip of the iceberg, I hesitate to report the dollar amount invested in the products and services of the five largest test publishers in the U.S., largely because even the source of this figure hedges its bets. Nobody really knows whether it’s $1.7 billion or $2.4 billion or something else. Clearly, it’s a significant chunk of change in the grand scheme of things.
A giant in the industry, Houghton Mifflin Harcourt (HMH) just last year acquired NWEA, known for MAP Growth, its formative assessment widely used in school districts. MAP’s expertise in formative assessment “…offers guidance to educators and school administrators trying to get a gauge on student academic progress” (EdWeek, January 10, 2023). I’ve been trying to get a gauge on student progress myself for quite a long time.
The “problem” of formative assessment wasn’t really on anyone’s officially sanctioned radar before 2010, when the Common Core State Standards absorbed authority and became real through state-level political rituals and ceremonies. I remember a time when formative assessment was a kind of guerrilla warfare, studied by researchers but discounted by school administrators. Given the newly minted “problem,” NWEA stepped into the breach and—voilà—MAP. Witness from the HMH website:
*
CTB/McGraw-Hill, a test company whose headquarters on California’s coastline I once visited for a few days, used to be a powerhouse. My experience with CTB came when the California Department of Education bravely threw down the gauntlet to the Knight of the Right Answer with its paradigm-shifting CLAS Reading Test. Essentially, CLAS was back to the future, time traveling to the days before the Kansas Silent Reading Test. My assigned task was to report back to the CLAS team on any possible links to multiple-choice tests that CLAS could invoke for political persuasion.
Alas, returning to the future, a report in the Monterey Herald (June 10, 2015) noted that CTB was laying off workers as it was folded into a larger structure:
“CTB/McGraw-Hill’s Ryan Ranch headquarters is expected to remain open but 33 employees are expected to be laid off as part of Minnesota-based Data Recognition Corporation’s acquisition of ‘key assets’ of the CTB assessment business announced Tuesday.”
Data Recognition Corporation, the testing company that acquired “key assets” of CTB/McGraw-Hill, appears to be positioning itself to give ETS a literal run for the money. Its menu of “assessment solutions” broadens the market to include business and government, including the military. ETS, of course, has risen from its birth decades ago with a meritocratic mission to become a robust, trusted friend of the industry across the globe.
*
In 1996 I was hired as a consultant to help the teachers at Douglass Middle School in Chicago raise their reading scores enough to forestall “reconstitution.” Chicago Public Schools was in the middle of a rather brutal political scheme to destroy all of its schools performing below a predetermined threshold of schoolwide standardized test scores. The idea was to close non-performing schools and release school personnel to find new jobs in schools that were performing.
I accepted the job, looking forward to working with teachers in their classrooms and in workshops. Contracted for twice-monthly full-day visits over an eight-month calendar, I set about framing and fleshing out a strategy, preparing myself to do some good if I could. My first day set me straight.
The teachers were required to administer a series of mock tests every three weeks, using summative assessments as formative assessments. Instruction during the intervening three weeks was “planned” in lockstep with the “needs” identified by the mock tests. Anybody familiar with the terms “situated cognition” or “legitimate peripheral participation” recognizes instantly that pedagogical cognition had been situated not in the minds of teachers but in the parameters of the test-makers.
The teachers were forced to become automatons to save their livelihoods. There wasn’t an inch of room for me to work as an invited guest within a community of practice. I arrived at the school from O’Hare Airport that first Friday eager to begin work. First, I was scheduled to meet with the principal. I waited in a chair in the lobby of the office for almost an hour. She was very busy but had not forgotten me. When I sat down in her office, she explained the mock test strategy. She told me I could provide the most help if I didn’t mind helping out in the “war room.”
The war room was as busy as a colony of leaf-cutter ants. Teachers had turned in stacks of Scantron sheets from the mock testing the day before, and Scantron machines were whirring when I entered. The person who escorted me to the war room introduced me as a “helper” from the office who could feed documents into a machine. The war room was on high alert. Teachers were working on a minimum-day schedule so they could meet after lunch to review test data and plan their instruction for the upcoming three weeks.
You may remember the hijacking of the redesign of the reading comprehension portion of the National Assessment of Educational Progress. Chester Finn and his merry band of conservatives successfully fought a committee of leaders in the field of reading who were eager to shape the new test design to include strategies for factoring a reader’s prior knowledge of the test passage’s topic into the comprehension score.
Somehow, likely with little declarative knowledge of the dehumanizing, decontextualizing history of multiple-choice tests, these conservatives sensed a significant threat to the socially reproductive sorting mechanism standardized tests provide for Power. They stood firm. To measure autonomous reading ability, reading tests must be autonomous, self-contained measurement machines. Any hint of human subjectivity, whether in the reader or the test-maker, violates norms of testing set in place by the Kansas Silent Reading Test, created before automobiles were commonplace.
They won.