November 2005 | Volume 63 | Number 3
Assessment to Promote Learning
Jay McTighe and Ken O'Connor
Teachers in all content areas can use these seven assessment and grading practices to enhance learning and teaching.
Classroom assessment and grading practices have the potential not only to measure and report learning but also to promote it. Indeed, recent research has documented the benefits of regular use of diagnostic and formative assessments as feedback for learning (Black, Harrison, Lee, Marshall, & Wiliam, 2004). Like successful athletic coaches, the best teachers recognize the importance of ongoing assessments and continual adjustments on the part of both teacher and student as the means to achieve maximum performance. Unlike the external standardized tests that feature so prominently on the school landscape these days, well-designed classroom assessment and grading practices can provide the kind of specific, personalized, and timely information needed to guide both learning and teaching.
Classroom assessments fall into three categories, each serving a different purpose. Summative assessments summarize what students have learned at the conclusion of an instructional segment. These assessments tend to be evaluative, and teachers typically encapsulate and report assessment results as a score or a grade. Familiar examples of summative assessments include tests, performance tasks, final exams, culminating projects, and work portfolios. Evaluative assessments command the attention of students and parents because their results typically “count” and appear on report cards and transcripts. But by themselves, summative assessments are insufficient tools for maximizing learning. Waiting until the end of a teaching period to find out how well students have learned is simply too late.
Two other classroom assessment categories—diagnostic and formative—provide fuel for the teaching and learning engine by offering descriptive feedback along the way. Diagnostic assessments—sometimes known as pre-assessments—typically precede instruction. Teachers use them to check students' prior knowledge and skill levels, identify student misconceptions, profile learners' interests, and reveal learning-style preferences. Diagnostic assessments provide information to assist teacher planning and guide differentiated instruction. Examples of diagnostic assessments include prior knowledge and skill checks and interest or learning preference surveys. Because pre-assessments serve diagnostic purposes, teachers normally don't grade the results.
Formative assessments occur concurrently with instruction. These ongoing assessments provide specific feedback to teachers and students for the purpose of guiding teaching to improve learning. Formative assessments include both formal and informal methods, such as ungraded quizzes, oral questioning, teacher observations, draft work, think-alouds, student-constructed concept maps, learning logs, and portfolio reviews. Although teachers may record the results of formative assessments, they shouldn't factor these results into summative evaluation and grading.
Keeping these three categories of classroom assessment in mind, let us consider seven specific assessment and grading practices that can enhance teaching and learning.
On the first day of a three-week unit on nutrition, a middle school teacher describes to students the two summative assessments that she will use. One assessment is a multiple-choice test examining student knowledge of various nutrition facts and such basic skills as analyzing nutrition labels. The second assessment is an authentic performance task in which each student designs a menu plan for an upcoming two-day trip to an outdoor education facility. The menu plan must provide well-balanced and nutritious meals and snacks.
The current emphasis on established content standards has focused teaching on designated knowledge and skills. To avoid the danger of viewing the standards and benchmarks as inert content to “cover,” educators should frame the standards and benchmarks in terms of desired performances and ensure that the performances are as authentic as possible. Teachers should then present the summative performance assessment tasks to students at the beginning of a new unit or course.
This practice has three virtues. First, the summative assessments clarify the targeted standards and benchmarks for teachers and learners. In standards-based education, the rubber meets the road with assessments because they define the evidence that will determine whether or not students have learned the content standards and benchmarks. The nutrition vignette is illustrative: By knowing what the culminating assessments will be, students are better able to focus on what the teachers expect them to learn (information about healthy eating) and on what they will be expected to do with that knowledge (develop a nutritious meal plan).
Second, the performance assessment tasks yield evidence that reveals understanding. When we call for authentic application, we do not mean recall of basic facts or mechanical plug-ins of a memorized formula. Rather, we want students to transfer knowledge—to use what they know in a new situation. Teachers should set up realistic, authentic contexts for assessment that enable students to apply their learning thoughtfully and flexibly, thereby demonstrating their understanding of the content standards.
Third, presenting the authentic performance tasks at the beginning of a new unit or course provides a meaningful learning goal for students. Consider a sports analogy. Coaches routinely conduct practice drills that both develop basic skills and purposefully point toward performance in the game. Too often, classroom instruction and assessment overemphasize decontextualized drills and provide too few opportunities for students to actually “play the game.” How many soccer players would practice corner kicks or run exhausting wind sprints if they weren't preparing for the upcoming game? How many competitive swimmers would log endless laps if there were no future swim meets? Authentic performance tasks provide a worthy goal and help learners see a reason for their learning.
A high school language arts teacher distributes a summary of the summative performance task that students will complete during the unit on research, including the rubric for judging the performance's quality. In addition, she shows examples of student work products collected from previous years (with student names removed) to illustrate criteria and performance levels. Throughout the unit, the teacher uses the student examples and the criteria in the rubric to help students better understand the nature of high-quality work and to support her teaching of research skills and report writing.
A second assessment practice that supports learning involves presenting evaluative criteria and models of work that illustrate different levels of quality. Unlike selected-response or short-answer tests, authentic performance assessments are typically open-ended and do not yield a single, correct answer or solution process. Consequently, teachers cannot score student responses using an answer key or a Scantron machine. They need to evaluate products and performances on the basis of explicitly defined performance criteria.
A rubric is a widely used evaluation tool consisting of criteria, a measurement scale (a 4-point scale, for example), and descriptions of the characteristics for each score point. Well-developed rubrics communicate the important dimensions, or elements of quality, in a product or performance and guide educators in evaluating student work. When a department or grade-level team—or better yet, an entire school or district—uses common rubrics, evaluation results are more consistent because the performance criteria don't vary from teacher to teacher or from school to school.
Rubrics also benefit students. When students know the criteria in advance of their performance, they have clear goals for their work. Because well-defined criteria provide a clear description of quality performance, students don't need to guess what is most important or how teachers will judge their work.
Providing a rubric to students in advance of the assessment is a necessary, but often insufficient, condition to support their learning. Although experienced teachers have a clear conception of what they mean by “quality work,” students don't necessarily have the same understanding. Learners are more likely to understand feedback and evaluations when teachers show several examples that display both excellent and weak work. These models help translate the rubric's abstract language into more specific, concrete, and understandable terms.
Some teachers express concern that students will simply copy or imitate the example. A related worry is that showing an excellent model (sometimes known as an exemplar) will stultify student creativity. We have found that providing multiple models helps avoid these potential problems. When students see several exemplars showing how different students achieved high-level performance in unique ways, they are less likely to follow a cookie-cutter approach. In addition, when students study and compare examples ranging in quality—from very strong to very weak—they are better able to internalize the differences. The models enable students to more accurately self-assess and improve their work before turning it in to the teacher.
Before beginning instruction on the five senses, a kindergarten teacher asks each student to draw a picture of the body parts related to the various senses and show what each part does. She models the process by drawing an eye on the chalkboard. “The eye helps us see things around us,” she points out. As students draw, the teacher circulates around the room, stopping to ask clarifying questions (“I see you've drawn a nose. What does the nose help us do?”). On the basis of what she learns about her students from this diagnostic pre-test, she divides the class into two groups for differentiated instruction. At the conclusion of the unit, the teacher asks students to do another drawing, which she collects and compares with their original pre-test as evidence of their learning.
Diagnostic assessment is as important to teaching as a physical exam is to prescribing an appropriate medical regimen. At the outset of any unit of study, certain students are likely to have already mastered some of the skills that the teacher is about to introduce, and others may already understand key concepts. Some students are likely to be deficient in prerequisite skills or harbor misconceptions. Armed with this diagnostic information, a teacher gains greater insight into what to teach, by knowing what skill gaps to address or by skipping material previously mastered; into how to teach, by using grouping options and initiating activities based on preferred learning styles and interests; and into how to connect the content to students' interests and talents.
Teachers can use a variety of practical pre-assessment strategies, including pre-tests of content knowledge, skills checks, concept maps, drawings, and K-W-L (Know-Want to learn-Learn) charts. Powerful pre-assessment has the potential to address a worrisome phenomenon reported in a growing body of literature (Bransford, Brown, & Cocking, 1999; Gardner, 1991): A sizeable number of students come into school with misconceptions about subject matter (thinking that a heavier object will drop faster than a lighter one, for example) and about themselves as learners (assuming that they can't and never will be able to draw, for example). If teachers don't identify and confront these misconceptions, they will persist even in the face of good teaching. To uncover existing misconceptions, teachers can use a short, nongraded true-false diagnostic quiz that includes several potential misconceptions related to the targeted learning. Student responses will signal any prevailing misconceptions, which the teacher can then address through instruction. In the future, the growing availability of portable, electronic student-response systems will enable educators to obtain this information instantaneously.
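For teachers who collect quiz responses electronically, the tallying step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the second quiz statement, the class responses, the function names, and the 50 percent threshold for calling a misconception "prevailing" are all hypothetical and do not come from the research cited here.

```python
# Hypothetical answer key for a short, nongraded true-false diagnostic quiz.
# The first statement echoes the example in the text; the second is invented.
HEAVY = "A heavier object drops faster than a lighter one."
GRAVITY = "Objects fall because of gravity."
ANSWER_KEY = {HEAVY: False, GRAVITY: True}

# Hypothetical responses: one dict per student, mapping statement -> answer.
RESPONSES = [
    {HEAVY: True, GRAVITY: True},
    {HEAVY: True, GRAVITY: True},
    {HEAVY: True, GRAVITY: False},
    {HEAVY: False, GRAVITY: True},
]


def misconception_rates(answer_key, responses):
    """Return the share of students who answered each statement incorrectly.

    A statement a student left blank is counted as incorrect.
    """
    rates = {}
    for statement, correct in answer_key.items():
        wrong = sum(1 for r in responses if r.get(statement) != correct)
        rates[statement] = wrong / len(responses)
    return rates


def prevailing(rates, threshold=0.5):
    """List statements missed by at least `threshold` of the class (assumed cutoff)."""
    return [s for s, rate in rates.items() if rate >= threshold]


rates = misconception_rates(ANSWER_KEY, RESPONSES)
for statement in prevailing(rates):
    print(f"Address in instruction: {statement} ({rates[statement]:.0%} incorrect)")
```

The results are used only to plan instruction; in keeping with the diagnostic purpose, nothing here produces a grade.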
As part of a culminating assessment for a major unit on their state's history and geography, a class of 4th graders must contribute to a classroom museum display. The displays are designed to provide answers to the unit's essential question: How do geography, climate, and natural resources influence lifestyle, economy, and culture? Parents and students from other classrooms will view the display. Students have some choice about the specific products they will develop, which enables them to work to their strengths. Regardless of students' chosen products, the teacher uses a common rubric to evaluate every project. The resulting class museum contains a wide variety of unique and informative products that demonstrate learning.
Responsiveness in assessment is as important as it is in teaching. Students differ not only in how they prefer to take in and process information but also in how they best demonstrate their learning. Some students need to “do”; others thrive on oral explanations. Some students excel at creating visual representations; others are adept at writing. To make valid inferences about learning, teachers need to allow students to work to their strengths. A standardized approach to classroom assessment may be efficient, but it is not fair because any chosen format will favor some students and penalize others.
Assessment becomes responsive when students are given appropriate options for demonstrating knowledge, skills, and understanding. Allow choices—but always with the intent of collecting needed and appropriate evidence based on goals. In the example of the 4th grade museum display project, the teacher wants students to demonstrate their understanding of the relationship between geography and economy. This could be accomplished through a newspaper article, a concept web, a PowerPoint presentation, a comparison chart, or a simulated radio interview with an expert. Learners often put forth greater effort and produce higher-quality work when given such a variety of choices. The teacher will judge these products using a three-trait rubric that focuses on accuracy of content, clarity and thoroughness of explanation, and overall product quality.
We offer three cautions. First, teachers need to collect appropriate evidence of learning on the basis of goals rather than simply offer a “cool” menu of assessment choices. If a content standard calls for proficiency in written or oral presentations, it would be inappropriate to provide performance options other than those involving writing or speaking, except in the case of students for whom such goals are clearly inappropriate (a newly arrived English language learner, for example). Second, the options must be worth the time and energy required. It would be inefficient to have students develop an elaborate three-dimensional display or an animated PowerPoint presentation for content that a multiple-choice quiz could easily assess. In the folksy words of a teacher friend, “With performance assessments, the juice must be worth the squeeze.” Third, teachers have only so much time and energy, so they must be judicious in determining when it is important to offer product and performance options. They need to strike a healthy balance between a single assessment path and a plethora of choices.
Middle school students are learning watercolor painting techniques. The art teacher models proper technique for mixing and applying the colors, and the students begin working. As they paint, the teacher provides feedback both to individual students and to the class as a whole. She targets common mistakes, such as using too much paint and not enough water, a practice that reduces the desired transparency effect. Benefiting from continual feedback from the teacher, students experiment with the medium on small sheets of paper. The next class provides additional opportunities to apply various watercolor techniques to achieve such effects as color blending and soft edges. The class culminates in an informal peer feedback session. Skill development and refinement result from the combined effects of direct instruction, modeling, and opportunities to practice guided by ongoing feedback.
It is often said that feedback is the breakfast of champions. All kinds of learning, whether on the practice field or in the classroom, require feedback based on formative assessments. Ironically, the quality feedback necessary to enhance learning is limited or nonexistent in many classrooms.
To serve learning, feedback must meet four criteria: It must be timely, specific, understandable to the receiver, and formed to allow for self-adjustment on the student's part (Wiggins, 1998). First, feedback on strengths and weaknesses needs to be prompt for the learner to improve. Waiting three weeks to find out how you did on a test will not help your learning.
In addition, specificity is key to helping students understand both their strengths and the areas in which they can improve. Too many educators consider grades and scores as feedback when, in fact, they fail the specificity test. Pinning a letter (B-) or a number (82%) on a student's work is no more helpful than such comments as “Nice job” or “You can do better.” Although good grades and positive remarks may feel good, they do not advance learning.
Specific feedback sounds different, as in this example:
Your research paper is generally well organized and contains a great deal of information on your topic. You used multiple sources and documented them correctly. However, your paper lacks a clear conclusion, and you never really answered your basic research question.
Sometimes the language in a rubric is lost on a student. Exactly what does “well organized” or “sophisticated reasoning” mean? “Kid language” rubrics can make feedback clearer and more comprehensible. For instance, instead of saying, “Document your reasoning process,” a teacher might say, “Show your work in a step-by-step manner so the reader can see what you were thinking.”
Here's a simple, straightforward test for a feedback system: Can learners tell specifically from the given feedback what they have done well and what they could do next time to improve? If not, then the feedback is not specific or understandable enough for the learner.
Finally, the learner needs opportunities to act on the feedback—to refine, revise, practice, and retry. Writers rarely compose a perfect manuscript on the first try, which is why the writing process stresses cycles of drafting, feedback, and revision as the route to excellence. Not surprisingly, the best feedback often surfaces in the performance-based subjects—such as art, music, and physical education—and in extracurricular activities, such as band and athletics. Indeed, the essence of coaching involves ongoing assessment and feedback.
Before turning in their science lab reports, students review their work against a list of explicit criteria. On the basis of their self-assessments, a number of students make revisions to improve their reports before handing them in. Their teacher observes that the overall quality of the lab reports has improved.
The most effective learners set personal learning goals, employ proven strategies, and self-assess their work. Teachers help cultivate such habits of mind by modeling self-assessment and goal setting and by expecting students to apply these habits regularly.
Rubrics can help students become more effective at honest self-appraisal and productive self-improvement. In the rubric in Figure 1 (p. 13), students verify that they have met a specific criterion—for a title, for example—by placing a check in the lower left-hand square of the applicable box. The teacher then uses the square on the right side for his or her evaluation. Ideally, the two judgments should match. If not, the discrepancy raises an opportunity to discuss the criteria, expectations, and performance standards. Over time, teacher and student judgments tend to align. In fact, it is not unusual for students to be harder on themselves than the teacher is.
The rubric also includes space for feedback comments and student goals and action steps. Consequently, the rubric moves from being simply an evaluation tool for “pinning a number” on students to a practical and robust vehicle for feedback, self-assessment, and goal setting.
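The comparison of student and teacher judgments can also be expressed as a small Python sketch. It is a hypothetical model, not a reproduction of the rubric in Figure 1: apart from "title," the criterion names are invented, and the class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CriterionRating:
    """One rubric row: the student's check and the teacher's check."""
    criterion: str
    student_met: bool
    teacher_met: bool


def discrepancies(ratings):
    """Return the criteria on which student and teacher judgments differ."""
    return [r.criterion for r in ratings if r.student_met != r.teacher_met]


# Hypothetical ratings for one piece of student work.
ratings = [
    CriterionRating("title", student_met=True, teacher_met=True),
    CriterionRating("labeled axes", student_met=True, teacher_met=False),
    CriterionRating("accurate data", student_met=False, teacher_met=True),
]

# Each criterion listed is a prompt for a conversation about expectations.
print(discrepancies(ratings))
```

Note that the third row flags a student who was harder on the work than the teacher was, which is as much an occasion for discussion as the reverse.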
Initially, the teacher models how to self-assess, set goals, and plan improvements by asking such prompting questions as,
Questions like these help focus student reflection and planning. Over time, students assume greater responsibility for enacting these processes independently.
Educators who provide regular opportunities for learners to self-assess and set goals often report a change in the classroom culture. As one teacher put it,
My students have shifted from asking, “What did I get?” or “What are you going to give me?” to becoming increasingly capable of knowing how they are doing and what they need to do to improve.
A driver education student fails his driving test the first time, but he immediately books an appointment to retake the test one week later. He passes on his second attempt because he successfully demonstrates the requisite knowledge and skills. The driving examiner does not average the first performance with the second, nor does the new license indicate that the driver “passed on the second attempt.”
This vignette reveals an important principle in classroom assessment, grading, and reporting: New evidence of achievement should replace old evidence. Classroom assessments and grading should focus on how well—not on when—the student mastered the designated knowledge and skill.
Consider the learning curves of four students in terms of a specified learning goal (see fig. 2, p. 14). Bob already possesses the targeted knowledge and skill and doesn't need instruction for this particular goal. Gwen arrives with substantial knowledge and skill but has room to improve. Roger and Pam are true novices who demonstrate a high level of achievement by the end of the instructional segment as a result of effective teaching and diligent learning. If their school's grading system truly documented learning, all these students would receive the same grade because they all achieved the desired results over time. Roger and Pam would receive lower grades than Bob and Gwen, however, if the teacher factored their earlier performances into the final evaluation. This practice, which is typical of the grading approach used in many classrooms, would misrepresent Roger and Pam's ultimate success because it does not give appropriate recognition to the real—or most current—level of achievement.
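The arithmetic behind this comparison can be made concrete with a brief Python sketch. The scores are hypothetical numbers chosen to match the description of the four students; they are not taken from Figure 2, and the two function names are assumptions for illustration.

```python
from statistics import mean

# Hypothetical scores (0-100) on one learning goal across four assessments.
# Illustrative only; these values are not data from the article's figure.
scores = {
    "Bob": [95, 95, 95, 95],    # already proficient at the start
    "Gwen": [75, 85, 90, 95],   # substantial prior knowledge, room to improve
    "Roger": [30, 55, 80, 95],  # novice who reaches a high level
    "Pam": [20, 50, 85, 95],    # novice who reaches a high level
}


def averaged_grade(history):
    """Grade that factors every earlier performance into the final mark."""
    return mean(history)


def most_recent_grade(history):
    """Grade in which new evidence of achievement replaces old evidence."""
    return history[-1]


for student, history in scores.items():
    print(
        f"{student:6} averaged={averaged_grade(history):5.1f} "
        f"most_recent={most_recent_grade(history)}"
    )
```

With these assumed scores, all four students earn the same mark when the most recent evidence counts, while averaging places Roger and Pam well below Bob and Gwen despite identical final achievement.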
Not available for electronic dissemination.
Two concerns may arise when teachers provide students with multiple opportunities to demonstrate their learning. Students may not take the first attempt seriously once they realize they'll have a second chance. In addition, teachers often become overwhelmed by the logistical challenges of providing multiple opportunities. To make this approach effective, teachers need to require their students to provide some evidence of the corrective action they will take—such as engaging in peer coaching, revising their report, or practicing the needed skill in a given way—before embarking on their “second chance.”
As students work to achieve clearly defined learning goals and produce evidence of their achievement, they need to know that teachers will not penalize them for either their lack of knowledge at the beginning of a course of study or their initial attempts at skill mastery. Allowing new evidence to replace old conveys an important message to students—that teachers care about their successful learning, not merely their grades.
The assessment strategies that we have described address three factors that influence student motivation to learn (Marzano, 1992). Students are more likely to put forth the required effort when there is
By using these seven assessment and grading practices, all teachers can enhance learning in their classrooms.
Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2004). Working inside the black box: Assessment for learning in the classroom. Phi Delta Kappan, 86(1), 8–21.
Bransford, J. D., Brown, A. L., & Cocking, R. R. (Eds.). (1999). How people learn: Brain, mind, experience, and school. Washington, DC: National Research Council.
Gardner, H. (1991). The unschooled mind. New York: BasicBooks.
Marzano, R. (1992). A different kind of classroom: Teaching with dimensions of learning. Alexandria, VA: ASCD.
Wiggins, G. (1998). Educative assessment: Designing assessments to inform and improve student performance. San Francisco: Jossey-Bass.
Jay McTighe is coauthor of The Understanding by Design series (ASCD, 1998, 1999, 2000, 2004, 2005). Ken O'Connor is author of How to Grade for Learning: Linking Grades to Standards (Corwin, 2002).
Copyright © 2005 by Association for Supervision and Curriculum Development