A More Perfect Union
Marrying Standardized and Formative Testing
The tedious debate over the merits of summative and formative tests in the classroom seems to have found a compromise: we will never get rid of standardized tests, but we have realized that we need to use more formative tests to know how to help students improve their skills and understanding. Regardless of the sort of test used, the real issue is how teachers determine a student’s level of conceptual understanding—and what level of understanding is appropriate, or even possible, at different ages or different stages of development.
A math colleague recently told me about a problem he was having with his 2nd graders: "One of the goals of our math curriculum is to enable the students to articulate their mathematical reasoning. We would like them to explain, 'The problem said two more came, so I knew I needed to add,' but instead we get, 'I knew 2+4 was 6.'" Although this level of abstract conceptual understanding is not typical of 2nd graders, the teacher was convinced that because a few students could use the words he wanted to hear, all of them ought to be equally capable. Further exploration revealed that even those who used the words didn’t understand the concept; they were merely parroting the language the teacher insisted they use.
Developing conceptual understanding is a process that takes the learner through stages of increasingly sophisticated comprehension, stages that require understanding the relationships among smaller ideas that make up a larger abstraction. Moving through these stages requires a brain that is developmentally ready to do so. Tests, whether summative or formative, typically fail to take these factors into account. Teachers continue to mistake sound for sense. Despite all the chatter, little has changed since I was in school, memorizing the definition of osmosis. As a 9th grader, I had no clue what "the passage of water through a semi-permeable membrane" meant, but I got an A on the test.
Things may finally be changing. Last year, I met Zachary Stein, a doctoral candidate at Harvard’s Graduate School of Education. He told me about his work there with professors Kurt Fischer and Theo Dawson on a project to redesign tests so that they reveal students' current levels of conceptual understanding. The tests also provide teachers with the necessary insight to plan the next steps toward developing more complex understanding in students, using developmentally appropriate learning activities. These tests function as standardized formative assessments, says Stein (in press).
Their project, called DiscoTest, is named for the Latin root for discourse. To construct these tests, researchers and teachers interviewed individuals ranging in age from 5 to about 20 years old to determine their skill level or level of understanding of a concept. These interviews revealed the roughly 14 phases, or "learning sequence," that learners go through, from initial understanding to mastery of the skill or concept. Tests consist of open-ended questions, and student answers can be assessed in just a few minutes by using rubrics created from the particular learning sequence for a specific skill or concept.
For example, below is one of six questions DiscoTest (2009) developed to test a student's understanding of energy as it relates to the ball and spring:
Essay 2: What is happening to the energy of the ball in the following situation?
A student writes, “Gravitational force is pulling the ball toward the ground and slightly compressing the spring. The compressed spring has elastic potential energy.”
The answer is rated in three areas—gravity/force, kinetic energy, and potential energy—using statements that the teacher (or the student) can select as accurate descriptions of the answer:
Gravity/force: The student’s answer. . .
- Does not mention gravity.
- Mentions gravity, but does not relate the concept to the problem in a coherent manner.
- Claims that gravity is present (holding down the spring and/or ball).
- Claims that gravity is present and (may be) holding down the spring and/or ball.
- Claims that gravity is present and the ball may have a great enough mass to compress the spring.
Kinetic energy: The student’s answer. . .
- Does not mention kinetic energy.
- Mentions kinetic energy, but does not apply the concept coherently.
- Claims that the ball has no energy.
- Claims that the ball has no kinetic energy.
- Claims that the ball has no kinetic energy because it is not moving.
Potential energy: The student’s answer. . .
- Does not mention potential energy.
- Mentions potential energy, but does not apply the concept coherently.
- Claims that the spring (or the ball) may (or may not) have potential energy.
- Claims that neither the spring nor the ball has potential energy because they are just sitting there.
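For readers who think in code, the rubric above can be pictured as a small data structure: three areas, each with a list of statements from which a rater picks one. The Python sketch below is only an illustration. The names `RUBRIC` and `rate_answer` are hypothetical, treating a statement's position in its list as a level is an assumption, and the selections in the example are one rater's illustrative choices, not a rating published by DiscoTest.

```python
# Hypothetical sketch of the rubric above; not DiscoTest's actual software.
# Statements are copied from the lists in the article. Using list position
# as a "level" is an assumption made for illustration only.
RUBRIC = {
    "gravity/force": [
        "Does not mention gravity.",
        "Mentions gravity, but does not relate the concept to the problem in a coherent manner.",
        "Claims that gravity is present (holding down the spring and/or ball).",
        "Claims that gravity is present and (may be) holding down the spring and/or ball.",
        "Claims that gravity is present and the ball may have a great enough mass to compress the spring.",
    ],
    "kinetic energy": [
        "Does not mention kinetic energy.",
        "Mentions kinetic energy, but does not apply the concept coherently.",
        "Claims that the ball has no energy.",
        "Claims that the ball has no kinetic energy.",
        "Claims that the ball has no kinetic energy because it is not moving.",
    ],
    "potential energy": [
        "Does not mention potential energy.",
        "Mentions potential energy, but does not apply the concept coherently.",
        "Claims that the spring (or the ball) may (or may not) have potential energy.",
        "Claims that neither the spring nor the ball has potential energy because they are just sitting there.",
    ],
}


def rate_answer(selections):
    """Return {area: (index, statement)} for the statement chosen in each area."""
    profile = {}
    for area, index in selections.items():
        statements = RUBRIC[area]  # raises KeyError for an unknown area
        if not 0 <= index < len(statements):
            raise ValueError(f"no statement {index} in area {area!r}")
        profile[area] = (index, statements[index])
    return profile


# Illustrative selections by a hypothetical rater.
example = rate_answer(
    {"gravity/force": 2, "kinetic energy": 0, "potential energy": 2}
)
for area, (index, statement) in example.items():
    print(f"{area}: statement {index + 1} of {len(RUBRIC[area])}: {statement}")
```

The result is a profile across the three areas rather than a single score, which is the kind of picture of student thinking the rubric is meant to give.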
The picture of student understanding that emerges from assessing answers to open-ended questions, as opposed to choices from a list of possible answers, provides real insight into "the concepts the students are working with, how they understand the concepts, what their line of reasoning is, and how well they can explain their thinking" (Stein, Dawson, & Fischer, in press). Informed by this insight into students' thinking, teachers can provide developmentally appropriate learning activities to help students move toward the next level of understanding. Those who need standardized comparisons could also receive a more meaningful sense of students’ abilities in a developmentally appropriate context, because DiscoTest could provide comparisons between groups of students.
However, Stein sees the real value of the tests in their ability to provide meaningful insight into an individual's level of understanding and to improve that understanding. "This is the difference between using the test to rank kids relative to one another and using the test to aid in education," says Stein.
This new approach to testing could fundamentally change how we look at schooling by bringing summative and formative testing together. With this marriage, understanding replaces memorization as the criterion for measuring learning (and, ultimately, teaching). Instead of merely rewarding students for memorizing the definition of osmosis, the test would reveal their level of mastery of the concept. Although this new vision for assessment is in its infancy and developing the tests will take time, there is now hope that the trivial pursuits of the SAT will give way to tests that not only provide meaningful insight into student understanding but also are coordinated with the sort of teaching teachers actually want to do.
DiscoTest. (2009). Energy teaser (ET0001) coding page. Retrieved from http://discotest.org/et001/et001codingtry.php?energy001key=1100000008
Stein, Z., Dawson, T. L., & Fischer, K. W. (In press). Redesigning testing: Operationalizing the new science of learning. In M. S. Khine & I. M. Saleh (Eds.), The new science of learning: Computers, cognition and collaboration in education. Berlin: Springer Press.
Alden S. Blodget is a former assistant head and English teacher at Lawrence Academy in Groton, Mass., currently working as an education consultant.