While there has been much recent attention to standards for curriculum and for assessment,[3] public and professional discussion of standards for instruction tends to focus on procedural and technical aspects, with little attention to more fundamental standards of quality. Is achievement more likely to be authentic when the length of class periods varies, when teachers teach in teams, when students participate in hands-on activities, or when students spend time in cooperative groups, museums, or on-the-job apprenticeships?
We were cautious not to assume that technical processes or specific sites for learning, however innovative, necessarily produce experiences of high intellectual quality. Even activities that place students in the role of a more active, cooperative learner and that seem to respect student voices can be implemented in ways that do not produce authentic achievement. The challenge is not simply to adopt innovative teaching techniques or to find new locations for learning, but deliberately to counteract two persistent maladies that make conventional schooling inauthentic:
- Often the work students do does not allow them to use their minds well.
- The work has no intrinsic meaning or value to students beyond achieving success in school.
To face these problems head-on, we articulated standards for instruction that represented the quality of intellectual work but that were not tied to any specific learning activity (for example, lecture or small-group discussion). Indeed, the point was to assess the extent to which any given activity—traditional or innovative, in or out of school—engages students in using their minds well.
Instruction is complex, and quantification in education can often be as misleading as informative. To guard against oversimplification, we formulated several standards, rather than only one or two, and we conceptualized each standard as a continuous construct from “less” to “more” of a quality, rather than as a categorical (yes or no) variable. We expressed each standard as a dimensional construct on a five-point scale. Instructions for rating lessons include specific criteria for each score (1 to 5) on each standard. Space does not permit us to present criteria for every possible rating, but for each standard we first distinguish between high- and low-scoring lessons and then offer examples of criteria for some specific ratings. Raters consider both the number of students to whom the criterion applies and the proportion of class time during which it applies.[4]
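As a rough illustration of the rating scheme just described, one rater's scores for one lesson could be recorded as follows. This is only a sketch: the record type `LessonRating`, its field names, and the identifiers in `STANDARDS` are hypothetical and do not come from the scoring manual. The only facts taken from the text are that there are five standards and that each is scored from 1 to 5.

```python
from dataclasses import dataclass

# The five standards named in the article; the identifiers are our own.
STANDARDS = (
    "higher_order_thinking",
    "depth_of_knowledge",
    "connectedness",
    "substantive_conversation",
    "social_support",
)


@dataclass(frozen=True)
class LessonRating:
    """One rater's scores for one observed lesson (hypothetical record type)."""

    lesson_id: str
    rater: str
    scores: dict  # standard name -> integer score from 1 to 5

    def __post_init__(self):
        # Every standard must be scored, and no unknown standard may appear.
        missing = set(STANDARDS) - set(self.scores)
        unknown = set(self.scores) - set(STANDARDS)
        if missing or unknown:
            raise ValueError(f"missing={sorted(missing)}, unknown={sorted(unknown)}")
        # Each score must be a whole number on the five-point scale.
        for name, score in self.scores.items():
            if not isinstance(score, int) or not 1 <= score <= 5:
                raise ValueError(f"{name}: score must be an integer from 1 to 5")


# Example: a lesson rated 3 on every standard.
rating = LessonRating("lesson-01", "rater-A", {name: 3 for name in STANDARDS})
```

Keeping the five scores separate, rather than collapsing them into one number, mirrors the decision to formulate several standards instead of only one or two.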
The five standards are: higher-order thinking, depth of knowledge, connectedness to the world beyond the classroom, substantive conversation, and social support for student achievement (see fig. 1).
Figure 1. Five Standards of Authentic Instruction
1. Higher-Order Thinking
lower-order thinking only 1... 2... 3... 4... 5... higher-order thinking is central
2. Depth of Knowledge
knowledge is shallow 1... 2... 3... 4... 5... knowledge is deep
3. Connectedness to the World Beyond the Classroom
no connection 1... 2... 3... 4... 5... connected
4. Substantive Conversation
no substantive conversation 1... 2... 3... 4... 5... high-level substantive conversation
5. Social Support for Student Achievement
negative social support 1... 2... 3... 4... 5... positive social support
Higher-Order Thinking
The first scale measures the degree to which students use higher-order thinking.
Lower-order thinking (LOT) occurs when students are asked to receive or recite factual information or to employ rules and algorithms through repetitive routines. As information-receivers, students are given pre-specified knowledge ranging from simple facts and information to more complex concepts. Students are in this role when they recite previously acquired knowledge by responding to questions that require recall of pre-specified knowledge.
Higher-order thinking (HOT) requires students to manipulate information and ideas in ways that transform their meaning and implications, such as when students combine facts and ideas in order to synthesize, generalize, explain, hypothesize, or arrive at some conclusion or interpretation. Manipulating information and ideas through these processes allows students to solve problems and discover new (for them) meanings and understandings. When students engage in HOT, an element of uncertainty is introduced, and instructional outcomes are not always predictable.
Criteria for higher-order thinking:
3 = Students primarily engage in routine LOT operations for a good share of the lesson. There is at least one significant question or activity in which some students perform some HOT operations.
4 = Students engage in at least one major activity during the lesson in which they perform HOT operations. This activity occupies a substantial portion of the lesson, and many students perform HOT.
Depth of Knowledge
From “knowledge is shallow” (1) to “knowledge is deep” (5), the next scale assesses students' depth of knowledge and understanding. This term refers to the substantive character of the ideas in a lesson and to the level of understanding that students demonstrate as they consider these ideas.
Knowledge is thin or superficial when it does not deal with significant concepts of a topic or discipline—for example, when students have a trivial understanding of important concepts or when they have only a surface acquaintance with their meaning. Superficiality can be due, in part, to instructional strategies that emphasize coverage of large quantities of fragmented information.
Knowledge is deep or thick when it concerns the central ideas of a topic or discipline. For students, knowledge is deep when they make clear distinctions, develop arguments, solve problems, construct explanations, and otherwise work with relatively complex understandings. Depth is produced, in part, by covering fewer topics in systematic and connected ways.
Criteria for depth of knowledge:
2 = Knowledge remains superficial and fragmented; while some key concepts and ideas are mentioned or covered, only a superficial acquaintance or trivialized understanding of these complex ideas is evident.
3 = Knowledge is treated unevenly during instruction; that is, deep understanding of something is countered by superficial understanding of other ideas. At least one significant idea may be presented in depth and its significance grasped, but in general the focus is not sustained.
Connectedness to the World Beyond the Classroom
The third scale measures the extent to which the class has value and meaning beyond the instructional context. In a class with little or no value beyond the classroom, activities are deemed important for success only in school (now or later). Students' work has no impact on others and serves only to certify their level of compliance with the norms of formal schooling.
A lesson gains in authenticity the more there is a connection to the larger social context within which students live. Instruction can exhibit some degree of connectedness when (1) students address real-world public problems (for example, clarifying a contemporary issue by applying statistical analysis in a report to the city council on the homeless); or (2) students use personal experiences as a context for applying knowledge (such as using conflict resolution techniques in their own school).
Criteria for connectedness:
1 = Lesson topic and activities have no clear connection to issues or experience beyond the classroom. The teacher offers no justification for the work beyond the need to perform well in class.
5 = Students work on a problem or issue that the teacher and students see as connected to their personal experiences or contemporary public situations. They explore these connections in ways that create personal meaning. Students are involved in an effort to influence an audience beyond their classroom; for example, by communicating knowledge to others, advocating solutions to social problems, providing assistance to people, or creating performances or products with utilitarian or aesthetic value.
Substantive Conversation
From “no substantive conversation” (1) to “high-level substantive conversation” (5), the fourth scale assesses the extent of talking to learn and understand the substance of a subject. In classes with little or no substantive conversation, interaction typically consists of a lecture with recitation in which the teacher deviates very little from delivering a preplanned body of information and set of questions; students routinely give very short answers. Teachers' lists of questions, facts, and concepts tend to make the discourse choppy, rather than coherent; there is often little or no follow-up of student responses. Such discourse is the oral equivalent of fill-in-the-blank or short-answer study questions.
High levels of substantive conversation are indicated by three features:
1. There is considerable interaction about the ideas of a topic (the talk is about disciplined subject matter and includes indicators of higher-order thinking such as making distinctions, applying ideas, forming generalizations, raising questions, and not just reporting experiences, facts, definitions, or procedures).
2. Sharing of ideas is evident in exchanges that are not completely scripted or controlled (as in a teacher-led recitation). Sharing is best illustrated when participants explain themselves or ask questions in complete sentences and when they respond directly to comments of previous speakers.
3. The dialogue builds coherently on participants' ideas to promote improved collective understanding of a theme or topic.
Criteria for substantive conversation:
To score 2 or above, conversation must focus on subject matter as in feature (1) above.
2 = Sharing (2) and/or coherent promotion of collective understanding (3) occurs briefly and involves at least one example of two consecutive interchanges.
4 = All three features of substantive conversation occur, with at least one example of sustained conversation (that is, at least three consecutive interchanges), and many students participate.
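The two criteria stated above can be read as simple checks on an observer's notes. The sketch below is one possible reading, not the scoring manual's procedure. The class `ConversationObservation`, its field names, and both functions are hypothetical, and "many students participate" is left as the observer's judgment because no threshold is given. Criteria for scores 1, 3, and 5 are not presented here, so the sketch does not attempt to assign them.

```python
from dataclasses import dataclass


@dataclass
class ConversationObservation:
    """Hypothetical observer notes for one lesson."""

    on_subject_matter: bool          # feature (1): talk is about ideas of the topic
    sharing: bool                    # feature (2): exchanges not fully scripted
    builds_coherently: bool          # feature (3): dialogue builds on participants' ideas
    longest_exchange: int            # longest run of consecutive interchanges observed
    many_students_participate: bool  # observer's judgment; no numeric threshold stated


def meets_score_2(obs):
    """Stated criteria for a 2: feature (1), plus feature (2) and/or (3),
    with at least one example of two consecutive interchanges."""
    return (
        obs.on_subject_matter
        and (obs.sharing or obs.builds_coherently)
        and obs.longest_exchange >= 2
    )


def meets_score_4(obs):
    """Stated criteria for a 4: all three features, at least one sustained
    conversation (three or more consecutive interchanges), and many
    students participating."""
    return (
        obs.on_subject_matter
        and obs.sharing
        and obs.builds_coherently
        and obs.longest_exchange >= 3
        and obs.many_students_participate
    )


# Example: brief sharing on subject matter, but nothing sustained.
brief = ConversationObservation(True, True, False, 2, False)
print(meets_score_2(brief), meets_score_4(brief))
```

Note that a lesson with lively talk that is not about subject matter fails both checks, which reflects the requirement that conversation focus on subject matter to score 2 or above.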
Social Support for Student Achievement
The social support scale involves high expectations, respect, and inclusion of all students in the learning process. Social support is low when teacher or student behavior, comments, and actions tend to discourage effort, participation, or willingness to express one's views. Support can also be low when no such overt acts occur but the overall atmosphere of the class is negative as a result of previous behavior. Token acknowledgments, even praise, by the teacher of student actions or responses do not necessarily constitute evidence of social support.
Social support is high in classes when the teacher conveys high expectations for all students, including that it is necessary to take risks and try hard to master challenging academic work, that all members of the class can learn important knowledge and skills, and that a climate of mutual respect among all members of the class contributes to achievement by all. “Mutual respect” means that students with less skill or proficiency in a subject are treated in ways that encourage their efforts and value their contributions.
Criteria for social support:
2 = Social support is mixed. Both negative and positive behaviors or comments are observed.
5 = Social support is strong. The class is characterized by high expectations, challenging work, strong effort, mutual respect, and assistance in achievement for almost all students. Both teacher and students demonstrate a number of these attitudes by soliciting and welcoming contributions from all students. Broad student participation may indicate that low-achieving students receive social support for learning.
Using the Framework to Observe Instruction
We are now using the five standards to estimate levels of authentic instruction in social studies and mathematics in elementary, middle, and high schools that have restructured in various ways. Our purpose is not to evaluate schools or teachers, but to learn how authentic instruction and student achievement are facilitated or impeded by:
- organizational features of schools (teacher workload, scheduling of instruction, governance structure);
- the content of particular programs aimed at curriculum, assessment, or staff development;
- the quality of school leadership;
- school and community culture.
We are also examining how actions of districts, states, and regional or national reform projects influence instruction. The findings will describe the conditions under which “restructuring” improves instruction for students and suggest implications for policy and practice.
Apart from its value as a research tool, the framework should also help teachers reflect upon their teaching. The framework provides a set of standards through which to view assignments, instructional activities, and the dialogue between teacher and students and among students. Teachers, either alone or with peers, can use the framework to generate questions, clarify goals, and critique their teaching. For example, students may seem more engaged in activities such as cooperative learning or long-term projects, but heightened participation alone is not sufficient. The standards provide further criteria for examining the extent to which such activities actually put students' minds to work on authentic questions.
In using the framework, either for reflective critiques of teaching or for research, it is important to recognize its limitations. First, the framework does not try to capture in an exhaustive way all that teachers may be trying to accomplish with students. The standards attempt only to represent in a quantitative sense the degree of authentic instruction observed within discrete class periods. Numerical ratings alone cannot portray how lessons relate to one another or how multiple lessons might accumulate into experiences more complex than the sum of individual lessons. Second, the relative importance of the different standards remains open for discussion. Each suggests a distinct ideal, but it is probably not possible for most teachers to show high performance on all standards in most of their lessons. Instead, it may be important to ask, “Which standards should receive higher priority and under what circumstances?”[5]
Finally, although previous research indicates that teaching for thinking, problem solving, and understanding often has more positive effects on student achievement than traditional teaching, the effects of this specific framework for authentic instruction on student achievement have not been examined.[6]
Many educators insist that there are appropriate times for traditional, less authentic instruction—emphasizing memorization, repetitive practice, silent study without conversation, and brief exposure—as well as teaching for in-depth understanding.
Rather than choosing rigidly and exclusively between traditional and authentic forms of instruction, it seems more reasonable to focus on how to move instruction toward more authentic accomplishments for students. Without promising to resolve all the dilemmas faced by thoughtful teachers, we hope the standards will offer some help in this venture.
References
Archbald, D., and F.M. Newmann. (1988). Beyond Standardized Testing: Assessing Authentic Academic Achievement in the Secondary School. Reston, Va.: National Association of Secondary School Principals.
Brown, A., and A. Palinscar. (March 1989). “Coherence and Causality in Science Readings.” Paper presented at the annual meeting of the American Educational Research Association, San Francisco.
Carnegie Corporation of New York. (1989). Turning Points: Preparing American Youth for the 21st Century. Report on the Carnegie Task Force on the Education of Young Adolescents. New York: Carnegie Council on Adolescent Development.
Carpenter, T. P., and E. Fennema. (1992). “Cognitively Guided Instruction: Building on the Knowledge of Students and Teachers.” In Curriculum Reform: The Case of Mathematics in the United States. Special issue of International Journal of Educational Research, edited by W. Secada, pp. 457–470. Elmsford, N.Y.: Pergamon Press, Inc.
Elmore, R. F., and Associates. (1990). Restructuring Schools: The Next Generation of Educational Reform. San Francisco: Jossey-Bass.
Knapp, M. S., P.M. Shields, and B.J. Turnbull. (1992). Academic Challenge for the Children of Poverty: Summary Report. Washington, D.C.: U.S. Department of Education, Office of Policy and Planning.
Murphy, J. (1991). Restructuring Schools: Capturing and Assessing the Phenomena. Nashville, Tenn.: National Center for Educational Leadership, Vanderbilt University.
National Council on Education Standards and Testing. (1992). Raising Standards for American Education. A Report to Congress, the Secretary of Education, the National Education Goals Panel, and the American People. Washington, D.C.: U.S. Government Printing Office, Superintendent of Documents, Mail Stop SSOP.
Newmann, F. M. (1991). “Linking Restructuring to Authentic Student Achievement.” Phi Delta Kappan 72, 6: 458–463.
Newmann, F.M., and D. Archbald. (1992). “The Nature of Authentic Academic Achievement.” In Toward a New Science of Educational Testing and Assessment, edited by H. Berlak, F.M. Newmann, E. Adams, D.A. Archbald, T. Burgess, J. Raven, and T.A. Romberg, pp. 71–84. Albany, N.Y.: SUNY Press.
Newmann, F. M., G.G. Wehlage, and S.D. Lamborn. (1992). “The Significance and Sources of Student Engagement.” In Student Engagement and Achievement in American Secondary Schools, edited by F.M. Newmann, pp. 11–30. New York: Teachers College Press.
Resnick, L. (1987). Education and Learning to Think. Washington, D.C.: National Academy Press.
Smith, M. S., and J. O'Day. (1991). “Systemic School Reform.” In The Politics of Curriculum and Testing: The 1990 Yearbook of the Politics of Education Association, edited by S.H. Fuhrman and B. Malen, pp. 233–267. Philadelphia: Falmer Press.
Wehlage, G. G., R.A. Rutter, G.A. Smith, N. Lesko, and R.R. Fernandez. (1989). Reducing the Risk: Schools as Communities of Support. Philadelphia: Falmer Press.
Notes
1. See Carnegie Corporation of New York (1989), Elmore and Associates (1990), and Murphy (1991).
2. See Archbald and Newmann (1988), Newmann (1991), Newmann and Archbald (1992), Newmann et al. (1992), and Wehlage et al. (1989).
3. For example, see the arguments for standards in National Council on Education Standards and Testing (1992), and Smith and O'Day (1991).
4. In three semesters of data collection, correlations between raters were .7 or higher, and precise agreement between raters was about 60 percent or higher for each of the dimensions. A detailed scoring manual will be available to the public following completion of data collection in 1994.
5. The standards may be conceptually distinct, but initial findings indicate that they cluster together statistically as a single construct. That is, lessons rated high or low on some dimensions tend to be rated in the same direction on others.
6. Evidence for positive achievement effects of teaching for thinking is provided in diverse sources such as Brown and Palinscar (1989), Carpenter and Fennema (1992), Knapp et al. (1992), and Resnick (1987). However, no significant body of research to date has clarified key dimensions of instruction that produce authentic forms of student achievement as defined here.
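The two rater-agreement statistics mentioned in the note on data collection (correlation between raters and precise agreement) can be computed as sketched below for two raters scoring the same lessons on one standard. This is an illustration under assumptions: the notes do not say which correlation coefficient was used, so Pearson's r is assumed here, and the scores in the example are invented purely for demonstration and are not data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores (assumed coefficient)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def exact_agreement(x, y):
    """Share of lessons on which both raters gave the identical score."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Invented scores on a single standard for six lessons (not real data).
rater_a = [1, 2, 3, 4, 5, 3]
rater_b = [1, 2, 3, 5, 5, 2]

print(f"correlation: {pearson(rater_a, rater_b):.2f}")
print(f"exact agreement: {exact_agreement(rater_a, rater_b):.0%}")
```

The two statistics answer different questions: raters can rank lessons in nearly the same order (high correlation) while still differing by a point on many of them (lower exact agreement).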
Authors' note: This paper was prepared at the Center on Organization and Restructuring of Schools, supported by the U.S. Department of Education, Office of Educational Research and Improvement (Grant No. R117Q0005-92) and by the Wisconsin Center for Education Research, School of Education, University of Wisconsin-Madison. Major contributions to the development of these standards have been made by Center staff members. The opinions expressed here are those of the authors and do not necessarily reflect the views of the supporting agencies.
Fred M. Newmann is Director of the Center on Organization and Restructuring of Schools and Professor of Curriculum and Instruction, University of Wisconsin-Madison, Wisconsin Center for Education Research, 1025 W. Johnson St., Madison, WI 53706. Gary G. Wehlage is Associate Director of the Center on Organization and Restructuring of Schools and Professor of Curriculum and Instruction, University of Wisconsin-Madison, same address.