October 1, 2006 | Vol. 64, No. 2

All About Accountability / Diagnostic Assessment: A Measurement Mirage?

James Popham

      Today's teachers toil in what we might charitably call a “test-rich” environment. Rarely does a week go by that teachers don't find themselves required to administer high-stakes accountability tests, early-warning versions of those tests, or piles of pretend tests during “practice-makes-perfect” preparation sessions. In the United States, this test frenzy was triggered by No Child Left Behind. In many other nations, state-level or provincial accountability policies have fostered somewhat less frenetic but similar preoccupations with education assessment.
      The architects of these testing programs, almost without exception, contend that their assessments provide “instructionally diagnostic” data for teachers. Indeed, most education accountability tests are accompanied by reams of supportive rhetoric asserting that students' performances on these tests will enable teachers not only to diagnose student strengths or weaknesses but also to devise more appropriate classroom pursuits for those “instructionally diagnosed” students. In almost all instances, however, the claim that an accountability test is instructionally diagnostic is a flat-out falsehood.
      For the past few years, I've had conversations with classroom teachers across the United States and Canada about this important issue. During those candid interchanges, I always try to get a fix on precisely what those teachers do with the results of their state's or their province's accountability tests. Rarely can teachers spell out in fog-free fashion just how these assessment results inform their instructional decisions.
      The feckless nature of most accountability test reports has been fostered, in the main, by well-intentioned but naive attempts on the part of curriculum authorities to assess, without much exaggeration, everything. Just take a gander at any state's currently approved collection of content standards, that is, the state's official curricular aims. Invariably, you'll find an enormous array of skills and knowledge that the state's students are supposed to learn. There is almost always too much stuff to teach during the instructional time available. In truth, most states' current content standards more closely resemble wish lists than realistic curricular aspirations.
      From an assessment perspective, these sprawling sets of curricular aims make it literally impossible to devise instructionally diagnostic tests. When we test students, we are attempting to arrive at an accurate inference about a student's unseen skills or knowledge. That's when assessment validity comes into play. We can't tell, merely by looking, whether a student (1) possesses a key cognitive skill, or (2) has mastered a particular body of knowledge. We need students to display their skills or knowledge on a test so that we can infer what's going on inside their skulls.
      But if there are too many curricular aims to assess, then the test's designers can't include enough items to enable teachers to arrive at valid inferences about a student's mastery of those aims. It is foolish to believe that teachers can make meaningful instructional decisions about a student on the basis of a student's performance on one or two items—even one or two really good items. For instance, think of how silly it would be to try to get an accurate fix, using only an item or two, on a student's ability to multiply pairs of double-digit numbers.
      Sometimes, in the “spirit of instructional diagnosticity,” designers of state accountability tests will release actual test items along with students' per-item performances. This is a misguided attempt to provide teachers with instructionally diagnostic insights. Teachers aren't really advantaged by looking at their students' performances on one or two items, even actual items.
      How, then, can instructionally diagnostic tests become a reality instead of public relations rhetoric? The answer is all too straightforward: We must set out to measure fewer curricular targets. Modest sets of genuinely powerful curricular aims—such as a student's ability to compose original persuasive, narrative, or expository essays—must subsume key subskills and enabling bodies of knowledge. This sort of prioritizing would provide sufficient room on a test to include enough items for each curricular aim assessed, enabling teachers to arrive at reasonably accurate estimates of student mastery of those aims. Teachers could formulate sensible, data-based instructional decisions on the basis of easily interpreted reports. If the designers of accountability tests were really inventive, teachers would receive some useful clues regarding the sorts of mistakes their students are making, for example, the number of students who chose incorrect responses specifically designed to detect a common misconception.
      Educators need to bring pressure, lots of it, on those in charge of state-level accountability tests to make these tests, without sham, instructionally diagnostic. But such a transformation in accountability testing will only transpire when we insist on using tests that help kids—instead of using tests that only kid teachers.

      James Popham is Emeritus Professor in the UCLA Graduate School of Education and Information Studies. At UCLA he won several distinguished teaching awards, and in January 2000, he was recognized by UCLA Today as one of UCLA's top 20 professors of the 20th century.

      Popham is a former president of the American Educational Research Association (AERA) and the founding editor of Educational Evaluation and Policy Analysis, an AERA quarterly journal.

      He has spent most of his career as a teacher and is the author of more than 30 books, 200 journal articles, 50 research reports, and nearly 200 papers presented before research societies. His areas of focus include student assessment and educational evaluation. One of his recent books is Assessment Literacy for Educators in a Hurry.

      From our issue: Reading, Writing, Thinking