How should a teacher in 1996 assess his students' understanding of 1492?
That was the question confronting Larry Lewin after his class of 8th graders had studied, in depth, the historic events of 1492. Lewin wanted to assess his students' understanding of the relationships among Columbus, the Spanish, and the Native American Tainos, but he didn't want to give a traditional short-answer test, which would "put a ceiling" on how much learning his students could demonstrate. Tests are also "a turnoff" for students, says Lewin, who teaches at James Monroe Middle School in Eugene, Ore.
So Lewin asked his students to write a persuasive letter to the monarchs Ferdinand and Isabella of Spain. In their letters, students were expected to define the encounter between Europeans and Native Americans by a key word of their choice—discovery, visit, invasion—and then argue why that word fit. Students had to call upon their newly acquired knowledge of history to defend their point of view. When finished, the letters were assessed against criteria that the class had helped generate.
The letter-writing task revealed more about students' learning than a traditional test would have, Lewin says, because it called on students to do something with their knowledge—not just regurgitate it. For the same reason, the task was "definitely more inherently motivating," he believes. "Kids are more motivated to write to dead monarchs than to take a test."
This performance assessment—and the challenges it poses—are representative of the new trends in assessment. To give readers an overview of these trends, Education Update spoke with 10 experts in the field, who offered their perspectives on how the assessment of student learning is changing, and why. Chief among these trends are educators' efforts to assess active learning and to base assessments on clearly defined standards.
Assessing Active Learning. To assess learning goals that transcend mere recall, educators are experimenting with performance assessments, such as projects, exhibitions, and portfolios. The urge to get beyond rote learning is widespread, experts agree. Today's teachers don't want to just dispense knowledge, Lewin says. Besides teaching about history, literature, and mathematics, teachers also want to give students opportunities to be historians, literary critics, and mathematicians. This new emphasis on active learning requires assessment tasks that call on students to write, debate, create products, conduct experiments, and so on.
Using such performance assessments sends a new message to students, says Jay McTighe, director of the Maryland Assessment Consortium. A reliance on multiple-choice tests conveys to students that what's worth learning is bits of knowledge that can be quickly recalled or recognized. Performance assessments, on the other hand, convey to students that educators value in-depth understanding, the ability to apply knowledge in new situations, and high-quality work.
Standards. Educators are also moving to a standards-based system: They are rejecting norm-referenced tests, which measure students against one another, in favor of criterion-referenced assessments. The latter judge student work against standards (typically embodied in scoring rubrics), so all students potentially can do well. The bell-shaped curve is no longer taken for granted.
The shift to standards-based assessment helps create "a culture of success," says Charlotte Danielson, project coordinator for the South Brunswick (N.J.) Township Schools. Having rejected the norm-referenced paradigm, teachers can say to students, "Here's the challenge—and everybody in here can get up to an acceptable level. We don't have to have any failure."
To create a culture of success, "it helps enormously to involve students in developing rubrics," which spell out the criteria for different levels of performance, Danielson says. In the past, the assessment process has been "kind of mysterious," she contends; students were "shooting blind." But when students are involved in developing assessment criteria, "the kids know what they have to do to [perform] well—and guess what? They do it."
Lewin's letter-writing task illustrates what the shift to rubrics entails. Before his students wrote their letters to the Spanish king and queen, Lewin discussed with them the characteristics of an excellent persuasive letter. Together they brainstormed specific criteria by which the letters would be evaluated. Like Danielson, Lewin believes it's important to use "class-generated" assessment criteria, so students will feel ownership in the way their work is rated.
Getting the criteria right is not necessarily easy. For a task such as this one, Lewin notes, the tendency is "to get hung up on writing traits, such as mechanics or tone, rather than focusing on the real target"—students' historical understanding. "It's tricky," he concedes. "Writing is important, but historians can reveal their understanding in many ways."
Three Kinds of Criteria
Basing assessment on explicit criteria may help demystify the assessment process for students. For teachers, however, such an approach may not be as straightforward as it first appears. When teachers use a criteria-based approach, they need to use a variety of criteria to be fair, says Thomas Guskey of the University of Kentucky, editor of ASCD's 1996 Yearbook, Communicating Student Learning.
Naturally, teachers use "product criteria" to assess the outcome of an assessment task—the essay, performance, or project that students produce. But most teachers also use "process criteria" and "progress criteria," Guskey says. Process criteria are used to assess elements such as effort, homework, and class participation. Progress criteria are used to assess "how far students have come."
Using only product criteria can be unfair—a point Guskey illustrates with a hypothetical gym-class situation. Imagine assessing two students: one is a brilliant athlete; the other has poor movement skills but always tries his hardest and is unfailingly sportsmanlike. Using only product criteria (such as how high the student can jump and how fast he can run) would not recognize the second student for the things he does well.
Using all three kinds of criteria to derive a single grade, however, creates another problem, Guskey points out: Interpreting grades becomes very difficult. What does a B mean? Yet assigning separate grades in each area—for product, process, and progress—can also have undesirable effects. If a student is given an A for effort but a low grade for the product, he or she gets the message that "you tried but didn't cut it."
Another troubling consequence of standards-based assessment is that some students may fail outright. "If I as a teacher know what standards of performance are in my class," Lewin says, "a kid in fact could not succeed." One possible scenario he envisions is that the students who used to get Cs—the ones who showed up and didn't make waves—would suddenly not be passing. "What are we going to do for the kids who try and yet don't meet the standard?" he wonders. The lack of a clear answer to that question fuels educators' "huge trepidation around adopting standards."
Designing Tasks and Rubrics
Undaunted by these challenges, many educators are working to design their own performance assessments. The "top interest" among progressive teachers is how to design classroom-based performance tasks, Lewin believes. Teachers want to create opportunities for students to reveal their thinking, through tasks that motivate them to "give their very best."
Designing performance assessments is not easy, experts agree. "It's really very hard, partly because teachers have not been asked to be very clear about their goals," Danielson says. Teachers are not used to designing assessments, notes education consultant Bena Kallick. "Teachers focus on activities to get the kids engaged," she says. They tend to think of what they're going to do, not what the result will be. "It's very hard to get teachers to plan backwards" from student outcomes, she says.
Usually teachers ask the wrong question first, says Michael Hibbard, assistant superintendent of the Region 15 Public Schools in Middlebury, Conn. Teachers ask, "What do we do?"—putting the focus immediately on designing tasks—when they need to ask, "What do we want kids to know and be able to do? How well? What does quality look like?" Teachers need to answer these questions very clearly first, Hibbard says.
Teachers must also confront the challenge of developing—or finding—scoring rubrics. Teachers often need to create rubrics from scratch, Lewin says, and they must revise their rubrics repeatedly to "really get them right." Using off-the-shelf rubrics, however, is controversial.
The value of "generic" rubrics, which cut across subject areas and topics, is the subject of a "raging discussion" in the field, says Joan Herman, associate director of the Center for Research on Evaluation, Standards, and Student Testing (CRESST) at UCLA. Using generic rubrics can be "a good strategy," Herman herself believes. For example, teachers of any subject could use a generic scoring rubric for explanation tasks—regardless of the content of students' explanations. Such a rubric could include criteria for students' use of principles in their explanations and for evidence of misconceptions, among other criteria. Some experts, however, believe that rubrics should be closely tailored to particular assessment tasks, not generic.
Teachers also need to bear in mind the concepts of reliability and validity. "The concepts are important at the classroom level, but the technical methodology is not," Herman says. The link between assessment and content is a validity issue for the classroom, she notes: Does the performance task really assess understanding of the content?
To help ensure reliability, teachers can take simple steps such as regrading a couple of papers from the top of the pile after they have graded the whole stack—to guard against applying a sliding standard, Herman says. Some districts have worked out "complex auditing schemes" to ensure inter-rater reliability, she adds. Measures such as blind scoring are used to ensure that scores are based on the quality of the performances, not on who rated them.
New worries about subjectivity arise, says Hibbard, as traditional assessments such as spelling tests yield ground to performance tasks that are not as cut-and-dried to score.
Hibbard's district has found a practical solution to this problem, he says: avoiding "holistic rubrics" in favor of an "analytical approach" based on lists of very detailed criteria. Using these lists, teachers can give students feedback on each element of a task and explain a student's overall grade with great specificity, he explains.
These "analytical assessment lists" also clarify expectations for assignments, giving parents a way to help their children with homework, rather than just harassing them to do it, Hibbard says. If a student is struggling with writing an essay, for example, the parent could ask to see the assessment list, which indicates that the essay should include three main ideas, a strong opening statement and conclusion, rebuttal of opposing viewpoints, and so on. "Now people know what 'good' is," Hibbard says. "It's so logical and clear; it's not hocus-pocus."
Hibbard warns against a pitfall he calls "process creep"—the tendency to "lose" content when emphasizing student performances. Students may be "making stuff, writing—but where's the content?" he asks rhetorically. Learning content is critical; learning is "not just doing nifty things." If too much emphasis is shifted to doing things, content "can become ambiguous, foggy, or absent," he cautions.
If the public perceives that educators don't care about content, schools are in trouble, Hibbard says. In his district, educators emphasize to parents that they are not devaluing or reducing content; rather, they are "expanding" their program so that students can use the knowledge they learn. "We keep saying that over and over."
Convincing the Public
The need to reassure parents reflects the challenge of gaining public support for performance assessments, which may be a hard sell. The public is suspicious of educators in general, says Herman, and thus prone to see new assessments as "the educational bureaucracy one more time trying to pull the wool over the public's eyes, so they can get away with not teaching the basics."
Parents grew up in a norm-referenced system, says Guskey. "That's the expectation they have." Communicating the results of performance assessments to parents "poses a real challenge for educators," he says. When teachers give feedback based on a rubric or a list of competencies, parents often ask, "How's my kid really doing?" They want to know how their child is doing relative to the other students in the class.
One way teachers can cope with this reaction, Guskey suggests, is to give students two marks: one for progress with regard to the skill in question ("in process" or "mastery," for example), and one to indicate where the student's level of proficiency lies in relation to expectations ("behind," "on target," or "ahead"). Similarly, "to send a portfolio home is not all that helpful," Guskey says. A better approach is "to provide some idea of the standards set for any particular entry" or an analysis of progress made over the course of the year.
Another way to persuade the public to accept performance assessments is to bring them into the assessment development process, Herman maintains. If new assessments are strongly linked to standards endorsed by the public, "there is no reason to expect a lack of buy-in," she says.
When parents are involved in scoring a language arts assessment, they get "a broader awareness of new teaching approaches and the complexity of teaching higher-order thinking skills," says Charlotte Higuchi, director of the language arts project at CRESST. Parent involvement in assessing students' performance yields far more benefits than "a machine spewing a score," she says. "You get a score, but you also get teachers and parents understanding what kids need, and what needs to be done" to improve their learning, through looking directly at students' work. Parents understand that norm-referenced tests tell very little about children, Higuchi says.
The products of performance assessments can also be persuasive—as when parents read a finely crafted essay that makes strong arguments. "Parents of my kids are really proud when they read that letter to Ferdinand and Isabella," Lewin reports.
Parents also need to be reassured that performance assessments are complementing—not replacing—traditional tests. Even most advocates of new assessments don't want to jettison traditional tests, including multiple-choice ones. They want to find the right mix of assessment practices.
"There's no one best assessment method," McTighe says. Whether to use multiple-choice tests, performances, projects, exhibitions, or portfolios depends on what's being assessed, for what purpose, and how results will be used, he says. "To be an advocate of performance assessment doesn't mean you're a foe of standardized assessment," Guskey adds.
Looking Forward
Performance assessments are not going away, experts agree. Today, "issues of assessment are as powerful as ever," Kallick says. "There's a lot more highlighting of the significance of assessment data. People are seeing the limitations of standardized tests and understanding the need for more than one lens."
The future of performance assessment looks somewhat brighter at the classroom level than at the state level, however. "Among school-level educators, the interest is definitely on the rise," Herman says, whereas at the state level there has been "retrenchment." In states such as California and Arizona, legislatures are showing "a conservative shift back toward basic skills," which can be assessed by multiple-choice tests.
State-level performance assessment has been "oversold for accountability purposes," Herman says. Using such assessments to provide data about individual students is problematic. Yet, "in most states, these efforts have fallen on political, not technical, grounds," she notes.
"There's a feeling that [large-scale] performance assessment has not fulfilled its promise," Higuchi says, but the promises may have been extravagant. Some states predicted that students would make really big learning gains, she says. "You don't make gains unless the teaching changes."
In Vermont, where a RAND study found that portfolio assessment was not very reliable, "they tried to implement a large-scale assessment very quickly," McTighe says. As a result, there was not enough time to train teachers or to develop procedures for reliable evaluation.
On a brighter note, most states now have curriculum frameworks, McTighe says, which influence curriculum and assessment. As the demand for accountability increases, so do calls for assessments tied to state outcomes. The question is: to what extent will these be performance assessments?
In Maryland, a statewide performance assessment of students in grades 3, 5, and 8 is tied to state learning outcomes in the core subjects, McTighe explains. Each student takes only some components of the test, which is designed to assess school and district performance. The test incorporates primary source literature, factual information, and real-world scenarios, from which students must draw conclusions. In addition, the science portion of the test includes a hands-on element, and the writing portion allows peer response and revision.
Also encouraging to McTighe are the efforts of some districts to build performance assessments into their curriculums, so teachers don't have to develop the assessments on their own time. He is also heartened by the emergence of regional and state consortiums devoted to performance assessment. These consortiums are "working smarter by collaborating," he says.
Yet, despite such progress, "I'm not confident that large-scale performance assessment is really going to win out," McTighe says, because of its technical fragility and higher costs. With the exception of writing assessments, large-scale performance assessment is still "in its early days." McTighe hopes political pressures "don't cause policymakers to abandon performance assessment before it's had a chance to mature." If we'd done that with writing assessment, he says, we'd still be using multiple-choice tests of mechanics and grammar—an "impoverished" way to assess whether students can write.
McTighe also hopes that teachers and school systems will recognize the value of "a performance orientation." At the classroom level, the costs of performance assessment—both in time and funds—are not as limiting, he says.
Classroom teachers have always used performance assessments, simply by assigning and grading work, Herman points out. Teachers "naturally resonate" to this form of assessment. "In their minds and hearts, they know there's more to learning than multiple choice," she says. For teachers, the "spin-off benefits" of performance assessment are "absolutely enormous," Danielson adds. Performance assessment obliges them to be clear about their goals, and so "everything about their teaching gets better."
ASCD Assessment Conference
Experts quoted in this article will be among the presenters at the ASCD Conference on Teaching and Learning: Assessment, which will take place October 21-23, 1996, in Dallas, Texas. The registration fee for members is $350 and for nonmembers, $400. (If registrations are postmarked by July 15, these fees are $299 and $350 respectively.) For more information, contact ASCD's Call Service Center at 1-800-933-2723; press 2.
Questioning Our Motives
Many progressive educators are working intently to develop and refine new systems of performance assessment. That worries Alfie Kohn, author of Punished by Rewards, who suggests that these educators may be overlooking some fundamental issues.
Kohn is concerned that the enthusiasm over performance assessment may cause educators to "pay far more attention to how we're assessing students than to the more critical question of why." If educators are using assessment to motivate students to improve, then "that enterprise is doomed, regardless of how jazzy the technique for assessment may be," he asserts. This is because students who are preoccupied with how they're doing tend to lose interest in what they're doing—a phenomenon that Kohn says is borne out by research from educational psychology.
The motive for assessing students should be to provide them with information that can fuel their interests, Kohn contends. A good assessment model supports students' desire to learn, rather than imposing a set of demands and expectations on them, which will blight their intrinsic motivation. "If students continue arguing about a topic after class, if they read by choice, if they chatter excitedly about something they've figured out—then we're doing something right," Kohn says. "Too much emphasis on how they're doing makes that less likely to happen."
Even the most authentic assessment system can be overdone, Kohn warns—noting that some assessment experts want to turn the school day into "one long assessment activity." This trend is worrisome because research suggests there is a fundamental difference in students' goals depending on whether they are learning or performing, and these goals "can pull in opposite directions."
Although it's important for students to think about how well they're doing, students also need "substantial time freed from pressure to improve, to be deeply engaged with the ideas they're playing with," Kohn says. Students need time to be tentative, when they can stop worrying about how good they are. "Fixing students' attention on their performance may come at the expense of their playing with ideas, words, and numbers," he cautions.
Another danger of making students preoccupied with their performance is that "kids explain assessment results in terms of innate ability rather than effort," Kohn says. "After a while, students are inclined to take assessment as an indication of how smart or dumb they are." Once that has happened, students want only easy tasks, to preserve their image of themselves as smart. Students who are afraid to face challenges or take risks are "not what we want," Kohn points out.
Nevertheless, there is room for the "How am I doing?" side of the equation. Educators should involve students in the process of evaluating their work, Kohn believes.
At the beginning of the school year, for example, students should reflect on what constitutes superb work, Kohn recommends. What makes for an interesting story? A well-designed experiment? A persuasive essay? Working together, the teacher and students should develop some assessment criteria that both can apply. Students should also be encouraged to weigh their own work against good and bad exemplars. "This kind of evaluation is less punitive," Kohn says, and it makes assessment part of the learning process.
Yet even exemplars can have a negative effect, if students focus on doing better than last year's students. If the point becomes "not to learn but to beat someone else," Kohn says, "that is really pernicious."
Underlying these issues is a value judgment about what educators' main objective should be. To explain this point, Kohn uses an illustration from physical education. Do we want students to run faster? he asks. Or do we want students to be intrigued with running and to run by choice outside the school setting?
Educators typically take for granted that the goal of schooling is to continually improve students' performance, Kohn says. A better goal, he maintains, is to support students' continued motivation to learn—to foster intrinsic interest in learning that does not spring from rubrics and tests. If educators pursue the latter goal, "excellence will follow," Kohn believes. On the other hand, if students are ceaselessly pushed to improve, "some will drop out; some will stay in school but have no taste for it," he says.
Kohn concedes that shifting emphasis away from improving performance is difficult. Public officials—school boards, legislators—"have a gun to our heads," he observes. "They are not seeing student interest as a key outcome." Instead, they want higher test scores.
Kohn also emphasizes the need for a curriculum that is inherently interesting. "If what students are asked to do is not important and engaging," he says, "then it's a futile and manipulative enterprise to use assessment tools to get them to do it." Educators shouldn't just ask, "Are students learning effectively?" but also, "Is what they are given worth learning?"
While he offers these caveats about assessment, Kohn emphasizes that he's a supporter of performance assessment and, by and large, "on the same side" as its advocates. "In many districts, it would be an enormous step forward merely to go to a rubric, away from letter grades," he says. "That alone would be cause to party well into the night."