February 1, 1993
Vol. 50
No. 5

Toronto's Benchmark Program

Benchmarks are designed to demystify the goals of education and illuminate the nature of good performances for teachers, students, and parents.

For more than 30 years, the City of Toronto chose not to evaluate its elementary student population against any internal or external standards. Teachers at Toronto's more than 100 elementary schools assessed students according to their own preferred methods, including self-made tests, cumulative samples of students' work, and teacher-student conferencing. For reporting to parents, they developed their own schemes—ranging in style from highly personalized anecdotal reports to numbers and computer-coded comments.
  1. Parents have the right to participate meaningfully in decision making about their children's education.
  2. Parents have a corresponding right to know, in a timely way, how well or how poorly their children are doing in school.
  3. Parents are entitled to have their children's achievements determined in relation to systemwide standards.

What Is a Benchmark?

Now, five years later, the Toronto Board of Education has developed more than 100 Benchmarks for Language and Mathematics at grades 3, 6, and 8 representing, respectively, the end of the Primary, Junior and Senior Elementary Divisions. In addition, work at the Secondary Division is well under way.
Toronto defines a Benchmark as information to which teachers, students, and parents can refer daily as they teach, learn, and assess achievement. Benchmarks may be shaped in any number of ways, depending on the needs of the users, but the standards they represent are clear to all. The philosophy behind the program is that instruction, learning, and evaluation should occur simultaneously in the classroom on a continuous basis with many authentic performance activities.
In order to achieve this, teachers use the Benchmarks as models in designing classroom activities that incorporate the Ministry of Education and Toronto Board of Education objectives. Teachers then observe students as they perform these activities in the classroom and evaluate students according to the levels of performance depicted in the Benchmarks.

Observation and Holistic Evaluation

In developing the Language and Mathematics Benchmarks, the Toronto Board combined observation and holistic evaluation. Figure 1 shows part of a Grade 8 Language Benchmark. (The part that exists on videotape is not described.) At the beginning are clusters of loosely connected, broadly stated learning objectives drawn from the Ministry and Board guidelines. Next is a brief description of a global activity, designed by committees of classroom teachers, which operationalizes the objectives. The activity represents only one of an infinite number of open-ended tasks that could be designed and has no pre-determined answer or “correct” problem-solving procedure.

Figure 1. Toronto Language Arts Benchmark for Grade 8

Key Objectives

  • Use oral language to think, learn, and communicate.

  • Read a painting and demonstrate an understanding that effective reading, of whatever type, is a constant search for meaning.

  • Demonstrate aesthetic appreciation.

Two Paintings: Viewing (Art Book) and Responding Orally

This Benchmark was developed to assess students' ability to obtain information from pictures and to explain, describe, and evaluate that information. Two paintings, “The Fledgling” by E. Lindner and “The Young Canadian” by D.P. Brown, were chosen from the book High Realism in Canada by P. Duval.

Students who did well in the activity described and explained the significant features of the paintings and gave reasonable interpretations of them. They expressed their ideas fluently with appropriate language and vocabulary. They drew reasonable conclusions from the paintings and appreciated that their ideas and feelings had value and were worthy of expression.

Before the activity, students were asked questions about paintings and were rated for prior knowledge and experience. The evaluators then showed the first painting and asked the students to describe it as exactly as possible. After responding, the students were asked: (1) What does the painting make you think about? (2) How does it make you feel? (3) Why do you think the artist did the painting this way? The evaluators next showed the second painting to the students and repeated the same series of questions. Finally, both paintings were shown together, and the students were asked to describe their similarities, make up a title for each painting, and discuss why they preferred one painting over the other. Upon completion, students were asked to evaluate the activity in terms of interest in the paintings, enjoyment of the task, and ease of describing each painting.

The Grade 6 Benchmark, L6-2, shows how grade 6 students performed on this same task when evaluated with identical criteria.

Holistic Scoring Criteria Used for Each Painting: Individually, Both Paintings, and Overall Response

Level Five — 20% — The student describes and explains the significant features of the painting(s) and gives reasonable interpretations. The student expresses opinions about the painting(s) and often links them to personal experiences. The student is able to clarify assumptions and make inferences and can explain them to the evaluator. Views are stated clearly and concisely. The student is articulate and self-confident. Feelings are considered to be valuable and worthy of expression. The student explores ideas with the evaluator using appropriate vocabulary and conversational style. The comparison of the two paintings goes beyond the obvious and is often thoughtful and insightful.

Level Four — 32% — The student describes several significant features of the painting(s) with some interpretation. Some opinions about the painting(s) are expressed. The student's description is accurate but not figurative. Views are expressed with some fluency, and feelings are briefly stated. The comparison of the two paintings is reasonable.

Level Three — 39% — The student describes and explains one or two features of the painting(s) with minimal elaboration. Opinions are rarely expressed, and the description may include inaccuracies. The student may be hesitant in stating feelings and may describe only a single similarity between the paintings.

Level Two — 7% — The student identifies and describes a few details. No opinions are expressed. The student has difficulty expressing feelings, making comparisons, and/or articulating ideas.

Level One — 2% — The student is unable to respond or gives a very limited response.

An example of a Grade 3 Mathematics Benchmark task (number 16 of 26 Benchmarks) is as follows:

Geometric Solids: Classifying Objects by Various Attributes

Students sort a set of 12 wooden geometric solids in two different ways. Where necessary, students were given an explanation of what is meant by sorting. In each case, the students were asked to explain the rule they used to sort the solids into sets.
Both the language and the mathematics activities involve the use of oral language, and the mathematics activity involves concrete materials. Both tasks also require interaction between the student and the evaluator—sometimes according to a structured format, sometimes with specially designed prompts from the evaluator, and sometimes spontaneously. In short, the activities resemble authentic tasks that might take place in a regular classroom setting.
Finally, because they are performance-based, the activities require that students and their work be observed. Consequently, for all such activities, representative samples of students are videotaped.

Holistic Scoring Criteria

The development of standards, of course, suggests the need for some form of scoring. Because the Language Benchmark task in Figure 1 and the Mathematics Benchmark activity described above had been designed in a global fashion, and because such tasks had never before been considered in the creation of standards, developing criteria prior to the scoring proved difficult. It was more beneficial to observe the students' performances on the videotapes and simultaneously determine the criteria. Thus, the standards came to be defined as what Toronto students “can” do, not what they “should” do. Because Toronto's student population is large and diverse in its economic, social, and cultural makeup, these standards provide sound and valuable information. For the type of Benchmark illustrated here, we created criteria for five levels of performance so that a standard shows the full range of the ability of Toronto students.
The resulting holistic criteria:
  • describe many problem-solving, process, and higher-order thinking skills;
  • are true to the whole child performing a global task;
  • reflect the complexity and interconnectedness among the parts of the performance;
  • include, in an integrated fashion, many affective variables such as perseverance, confidence, and willingness, and some metacognitive variables such as monitoring one's problem-solving strategies;
  • and, because they were drawn from students' performances, incorporate many unanticipated, as well as anticipated, outcomes and skills.

Honoring Teacher Professionalism

Benchmarks are packaged so that the standards can be observed. The missing element of Figure 1 is the exemplar of students' performances on videotape. Other Benchmarks show exemplars of students' performances in print format. Teachers do not administer Benchmarks to students as they would conduct traditional standardized tests. Instead, they use Benchmarks as reference materials, an approach that has resulted in a variety of centralized and school-based initiatives.
To begin, all K–8 school principals and vice-principals participated in one week of inservice about Benchmark materials and philosophy as well as strategies for school-based collaborative use. A few months later, school teams of administrators and teachers participated in centrally designed inservices. Since then, school-based teams have continued to collaborate in implementing the Benchmark Program. The school teams are using funds from the Toronto Board and the expertise of curriculum consultants and staff development personnel to tailor the program to their unique needs. Some entire school staffs have gone off-site and devoted professional development days to the study of Benchmarks. At other locations, small groups of teachers use preparation time and lunch hours to share their understandings. Teachers also take the Benchmarks home and study them on their own time. The use of Benchmarks as reference materials promotes a collaborative culture, professional dialogue, and individual reflection.

Promoting Shared Accountability

At Whitney Public School, all teachers are collaborating to implement the program. Located in a wealthy area of Toronto, Whitney serves 476 students from junior kindergarten through grade 6. Even though Benchmarks have been designed with grade 3, 6, and 8 students, the range of performance for each Benchmark is so wide that they can be easily used across all grades.
The teachers—organized in family groupings of grades 1/2, 3/4, and 5/6—agree on two or three Benchmarks at the most appropriate level and work toward a collective understanding of the tasks, criteria, and exemplars. They then discuss the skills students need to perform the tasks, how students learn the skills, and how to teach the skills. Teachers also look closely at the role that language plays in students' demonstration of the many higher-order thinking skills associated with each Benchmark.
Because Benchmarks support the developmental whole-language approach (Whitney's school philosophy), teachers find that the program goes hand-in-hand with their classroom practices. During a second phase of sharing, teachers bring to the group samples of teaching/learning activities that promote the development of Benchmark skills and discuss how to integrate learning and evaluation.

Illuminating Good Performances

This year the Toronto Board of Education is involving K–8 teachers in designing a new system for recording student achievement and for communicating with parents. The goal is to have the system reflect the methods and philosophy of the Benchmark Program and, thus, create consistency across the elementary and secondary levels.
As in many of Toronto's elementary schools, teachers at Whitney Public School are experimenting with a variety of reporting forms and approaches to teacher-parent interviews. They are looking forward to presenting some of their creative ideas to the system later in the year.
All of Toronto's teachers will bring to this exercise three years of experience using Benchmarks to evaluate and engage students and to inform and engage parents. Unlike externally developed and scored tests, Benchmarks allow teachers, students, and parents to collaborate and remain in control of learning and evaluation. Because Benchmarks can be observed, they demystify the goals of education and illuminate the nature of good performances. Students use them as models of excellent performances. Parents consider them to better understand today's complex menu of educational objectives and to make meaningful decisions about their children's education.

End Notes

1 For an in-depth discussion, see: S. Larter (1991), Benchmarks: The Development of a New Approach to Student Evaluation (Toronto: Toronto Board of Education).

2 Because Benchmarks are not to be used as tests, the scripts and scoresheets designed to administer the activities have not been made available to educators.

Sylvia Larter has been a contributor to Educational Leadership.

From our issue
The Challenge of Higher Standards