Excerpts from "Creating Better Student Assessments"
Improving America's Schools: A Newsletter on Issues in School Reform
(Spring 1996)
Example of a Performance Assessment Task
(from Delaware's Interim Assessment Program)
A 5th grade mathematics performance assessment task in Delaware's Interim
Assessment Program presents the situation of students planning a county
all-star basketball game. Students respond to a series of fourteen questions
that are organized into four exercises. Ten questions call on students to
accomplish such tasks as estimating game revenues; solving money problems;
converting among percents, fractions, and decimals; and applying basic mathematical
operations. [Performance standards for this task are provided in the full
text. See link below.]
What Are Promising Ways to Assess Student Learning?
Performance assessments may include any of the following:
* Open-ended or constructed response items that ask students to respond
in their own words -- to "construct" their answers -- to questions
that may have multiple good answers. Students usually reason out their solutions
as part of their answers. Usually students can answer these questions in
just a few minutes, and in that way they differ from some of the performance
activities described below.
* Performance-based items or events: questions, tasks, or activities that
require students to perform an action. Although performances can involve
demonstrations or presentations, most typically they involve students explaining
how they would answer the question or solve a problem by writing a few sentences
or paragraphs, drawing and explaining a diagram, or performing an experiment.
Such tasks may take from 15 minutes to an hour or more and may involve some
work with a group of students who think through the answers and later provide
their own individually written answers.
* Projects or experiments: extended performance tasks that may take several
days or even several weeks to complete. Students generate problems, consider
options, propose solutions, and demonstrate their solutions. Students often
work in groups, at least for some of the project, to analyze options and
to consider ways to present their thinking and conclusions.
* Portfolios: collections of student work that show teachers and others
who may "score" portfolios the range and quality of student work
over a period of time and in various content areas. There are almost as
many approaches to compiling and evaluating portfolios as there are proponents
of this form of assessment. Portfolios can be used both formally and informally;
ideally, portfolios capture the evolution of students' ideas and can be
used instructionally and as progress markers for students, teachers, and
program evaluators.
What Research Says About Student Assessment: Effects on Instruction
[P]reliminary observations of classroom instruction in Kentucky and Vermont,
two states with portfolio assessment, indicate that teachers spend more
time training students to think critically and solve complex problems than
they did previously.
VERMONT. After studying Vermont's portfolio assessment program during the
first two years of its implementation, the RAND Corporation concluded that
the effects of portfolio assessment on instruction were "substantial
and positive." Half the teachers surveyed by RAND reported an increase
in the time students spent working in pairs or small groups. Almost three-fourths
of the principals interviewed said the program produced positive changes
in instruction at their schools. Between 70 percent and 89 percent of the
math teachers reported more discussion of math, explanation of solutions,
and writing about math in their classrooms since the advent of the portfolios;
three-fourths reported having students spend more time applying math knowledge
to new situations, and roughly 70 percent reported devoting more class time
to writing math reports. Principals in half the sample schools reported
expanding portfolio assessments to other grade levels, an indication that
they approved of portfolios' effect on instructional practice in their schools
(Koretz, Stecher, Klein, & McCaffrey, 1994).
According to Dan Koretz and his colleagues at the RAND Corporation, one
of the reasons that Vermont's portfolio assessment program has succeeded
in changing instruction is the state's enormous investment in professional
development for teachers. In other states, the Council for Educational Development
and Research found that a lack of professional development for teachers
and principals hampers states' ability to change instruction. Without such
assistance, researchers fear that the classroom changes resulting from the
new measures will be superficial, at best.
KENTUCKY. A recent evaluation of Kentucky's assessment program conducted
by the Evaluation Center at Western Michigan University found that students
in Kentucky are writing more and doing more group work as a result of the
new state testing program. Teachers, district assessment coordinators, and
superintendents reported almost unanimously that writing had improved in
Kentucky (Western Michigan Evaluation Center, 1994). Lorraine M. McDonnell
of the University of California-Santa Barbara arrived at similar findings
in a case study of 24 teachers in six Kentucky schools. McDonnell noted
more thematic and conceptual curriculum units, more projects, and more
group work, especially at the elementary level (Olson, 1995).
Other studies of state-sponsored, performance-based assessment systems have
had less positive findings. Mary Lee Smith and her colleagues at Arizona
State University conducted case studies of instruction in four Arizona schools
during the first two years of the Arizona Student Assessment Program. They
found little instructional change, except in a suburban school that was
already moving toward the curriculum and instruction advocated by the state.
Smith and her colleagues attributed the lack of change to the absence of
complementary state policies that promote good teaching and learning; other
than providing test forms and scoring workshops, the state paid little attention
to professional development. Some districts provided support for teachers
to change their teaching, while others did not. Classroom responses to the
new performance-based testing program varied according to the district's
capacity for financing professional and curriculum development and the extent
to which prevailing values and assumptions matched the state mandate (Smith,
Noble, Cabay, Heinecke, Junker, & Saffron, 1994).
Issues Involved in Developing Assessments
Three interrelated issues should guide educators and policy makers in developing
new assessments:
TECHNICAL QUALITY. Establishing technical quality involves reviewing development
plans for new assessments or applying review criteria to assessments developed
by other groups. The National Center for Research on Evaluation, Standards,
and Student Testing (CRESST) has developed criteria for reviewing assessments
on the basis of:
-- cognitive complexity (such as problem solving, critical thinking, and
reasoning),
-- content quality (challenging and important subject matter),
-- meaningfulness (tasks are worth students' time and students understand
their value),
-- language appropriateness,
-- transfer and generalizability,
-- fairness,
-- reliability,
-- consequences (the assessment has the desired effects on children, teachers,
and the educational system).
CREDIBILITY. New assessments must be introduced in a way that builds public
support. Parents and community members must understand what the assessments
accomplish, why they are needed, and how they fit with other ways of testing
students. .... Public understanding and support can be gained by giving
parents, teachers, and community members opportunities to review and even
try to answer some of the new assessments.
FEASIBILITY. The expectations for teachers, development costs, and scoring
and reporting costs for new assessments must be reasonable. Some assessment
systems in other countries have failed because administrative requirements
could not be met by regular classroom teachers with little training
in assessment. In addition, when many assessments are introduced at once,
teachers may not be able to redirect all of their instruction to meet all
of the new goals.
Assessment design, development, and scoring should be approached in ways
that support the adaptation of existing assessment models to local or state
needs without reinventing the wheel. For example, common approaches
to measuring content understanding can be applied to various subjects, reducing
the cost of training teachers to rate student work in various topics; over
time, the costs of scoring student work will drop.
To view the complete text of this newsletter published by the U.S. Department
of Education, go to http://www.ed.gov/pubs/IASA/newsletters/
and look for the Spring 1996 issue of Improving America's Schools.