Excerpts from "Creating Better Student Assessments"


Improving America's Schools: A Newsletter on Issues in School Reform (Spring 1996)


Example of a Performance Assessment Task
(from Delaware's Interim Assessment Program)

A 5th-grade mathematics performance assessment task in Delaware's Interim Assessment Program presents the situation of students planning a county all-star basketball game. Students respond to a series of fourteen questions that are organized into four exercises. Ten questions call on students to accomplish such tasks as estimating game revenues; solving money problems; converting among percents, fractions, and decimals; and applying basic mathematical operations. [Performance standards for this task are provided in the full text. See link below.]


What Are Promising Ways to Assess Student Learning?

Performance assessments may include any of the following:

* Open-ended or constructed response items that ask students to respond in their own words -- to "construct" their answers -- to questions that may have multiple good answers. Students usually reason out their solutions as part of their answers. Usually students can answer these questions in just a few minutes, and in that way they differ from some of the performance activities described below.

* Performance-based items or events: questions, tasks, or activities that require students to perform an action. Although performances can involve demonstrations or presentations, most typically they involve students explaining how they would answer the question or solve a problem by writing a few sentences or paragraphs, drawing and explaining a diagram, or performing an experiment. Such tasks may take from 15 minutes to an hour or more and may involve some work with a group of students who think through the answers and later provide their own individually written answers.

* Projects or experiments: extended performance tasks that may take several days or even several weeks to complete. Students generate problems, consider options, propose solutions, and demonstrate their solutions. Students often work in groups, at least for some of the project, to analyze options and to consider ways to present their thinking and conclusions.

* Portfolios: collections of student work that show teachers and others who may "score" portfolios the range and quality of student work over a period of time and in various content areas. There are almost as many approaches to compiling and evaluating portfolios as there are proponents of this form of assessment. Portfolios can be used both formally and informally; ideally, portfolios capture the evolution of students' ideas and can be used instructionally and as progress markers for students, teachers, and program evaluators.


What Research Says About Student Assessment: Effects on Instruction

[P]reliminary observations of classroom instruction in Kentucky and Vermont, two states with portfolio assessment, indicate that teachers spend more time training students to think critically and solve complex problems than they did previously.

VERMONT. After studying Vermont's portfolio assessment program during the first two years of its implementation, the RAND Corporation concluded that the effects of portfolio assessment on instruction were "substantial and positive." Half the teachers surveyed by RAND reported an increase in the time students spent working in pairs or small groups. Almost three-fourths of the principals interviewed said the program produced positive changes in instruction at their schools. Between 70 percent and 89 percent of the math teachers reported more discussion of math, explanation of solutions, and writing about math in their classrooms since the advent of the portfolios; three-fourths reported having students spend more time applying math knowledge to new situations, and roughly 70 percent reported devoting more class time to writing math reports. Principals in half the sample schools reported expanding portfolio assessments to other grade levels, an indication that they approved of portfolios' effect on instructional practice in their schools (Koretz, Stecher, Klein, & McCaffrey, 1994).

According to Dan Koretz and his colleagues at the RAND Corporation, one of the reasons that Vermont's portfolio assessment program has succeeded in changing instruction is the state's enormous investment in professional development for teachers. In other states, the Council for Educational Development and Research found that a lack of professional development for teachers and principals hampers states' ability to change instruction. Without such assistance, researchers fear that the classroom changes resulting from the new measures will be superficial, at best.

KENTUCKY. A recent evaluation of Kentucky's assessment program conducted by the Evaluation Center at Western Michigan University found that students in Kentucky are writing more and doing more group work as a result of the new state testing program. Teachers, district assessment coordinators, and superintendents reported almost unanimously that writing had improved in Kentucky (Western Michigan Evaluation Center, 1994). Lorraine M. McDonnell of the University of California-Santa Barbara arrived at similar findings in a case study of 24 teachers in six Kentucky schools. McDonnell noted more thematic and conceptual curriculum units, more projects, and more group work, especially at the elementary level (Olson, 1995).

Other studies of state-sponsored, performance-based assessment systems have had less positive findings. Mary Lee Smith and her colleagues at Arizona State University conducted case studies of instruction in four Arizona schools during the first two years of the Arizona Student Assessment Program. They found little instructional change, except in a suburban school that was already moving toward the curriculum and instruction advocated by the state. Smith and her colleagues attributed the lack of change to the absence of complementary state policies that promote good teaching and learning; other than providing test forms and scoring workshops, the state paid little attention to professional development. Some districts provided support for teachers to change their teaching, while others did not. Classroom responses to the new performance-based testing program varied according to the district's capacity for financing professional and curriculum development and the extent to which prevailing values and assumptions matched the state mandate (Smith, Noble, Cabay, Heinecke, Junker, & Saffron, 1994).


Issues Involved in Developing Assessments

Three interrelated issues should guide educators and policy makers in developing new assessments:

TECHNICAL QUALITY. Establishing technical quality involves reviewing development plans for new assessments or applying review criteria to assessments developed by other groups. The National Center for Research on Evaluation, Standards, and Student Testing (CRESST) has developed criteria for reviewing assessments on the basis of:

-- cognitive complexity (such as problem solving, critical thinking, and reasoning),
-- content quality (challenging and important subject matter),
-- meaningfulness (tasks are worth students' time and students understand their value),
-- language appropriateness,
-- transfer and generalizability,
-- fairness,
-- reliability,
-- consequences (the assessment has the desired effects on children, teachers, and the educational system).

CREDIBILITY. New assessments must be introduced in a way that builds public support. Parents and community members must understand what the assessments accomplish, why they are needed, and how they fit with other ways of testing students. .... Public understanding and support can be gained by giving parents, teachers, and community members opportunities to review and even try to answer some of the new assessments.

FEASIBILITY. The expectations for teachers, development costs, and scoring and reporting costs for new assessments must be reasonable. Some assessment systems in other countries have failed because their administrative requirements could not be met by regular classroom teachers with little training in assessment. In addition, when many assessments are introduced at once, teachers may not be able to redirect all of their instruction to meet all of the new goals.

Assessment design, development, and scoring should be approached in ways that support the adaptation of existing assessment models to local or state needs, without reinventing the wheel. For example, common approaches to measuring content understanding can be applied to various subjects, reducing the cost of training teachers to rate student work in various topics; over time, the costs of scoring student work will drop.


To view the complete text of this newsletter published by the U.S. Department of Education, go to http://www.ed.gov/pubs/IASA/newsletters/ and look for the Spring 1996 issue of Improving America's Schools.

