VALUE-ADDED TESTING IN TENNESSEE
Series of stories in the Memphis Commercial Appeal
November 29-30, 1998
Measuring Schools: Value-Added Puts Tennessee On Map
Which Is Your School?
The Premise of Value-Added Assessment
UT Statistics Professor Didn't Count On This Reception
Accountability? For Schools, Yes; For Teachers, No
Editorial: School Value Accountability Tool Deserves Full, Fair Use
Measuring Schools: Value-Added Puts Tennessee On Map
By Mickie Anderson
One of the biggest movements in education is happening in Tennessee's public
schools.
Value-added assessment, the brainchild of a University of Tennessee, Knoxville,
professor, has churned up tons of data about school performance - enough
to confirm a few common-sense theories about education, as well as debunk
a few.
Value-added assessment is based on a complicated statistical theory, but
its purpose is simple - to measure how much the students in a particular
school or even a particular classroom improve over the course of a year
compared with their peers.
It's viewed as a performance measure of teachers and schools as much as
of students. And Tennessee is the only state doing it.
Dallas schools use a similar system, but Tennessee's, created by UT professor
Bill Sanders, is considered more advanced.
"A lot of people around the country are taking notice of Sanders's
research," said Mike Petrilli of the Fordham Foundation, an education
reform group.
Not everyone embraces value-added assessment, though.
The state's largest teachers association has doubts. Some teachers and
principals prefer less complicated assessment methods. And Tennessee Education
Commissioner Jane Walters has been vocal in her stance against relying
too much on test-based systems like Sanders's for analyzing the performance
of schools or teachers.
Nor are value-added scores compiled for every Tennessee classroom or teacher.
So far, they exist only for fourth- through eighth-grade teachers and secondary
math teachers.
Despite the skepticism and limitations, some educators have dived into
the tricky value-added assessment method and found weak spots in their
schools that needed tweaking.
Snowden principal Catherine Battle, for example, said her school has used
value-added assessment to target academic areas that need emphasis the
next year.
"We don't use it as 'Oh, you're a crummy teacher,'" she says. "But
let's say I've noticed a new third-, fourth- or fifth-grade teacher, and
the class is on par in all areas except maybe social studies. Then I would
use it to ask: How can I help you with this?"
After some initial skepticism, Memphis Supt. Gerry House has taken a keen
interest in the scores since schools using her reform models showed substantial
gains under the value-added system.
Shelby County Schools' assistant superintendent for instruction, Fred Johnson,
said he hasn't studied value-added assessment deeply enough to comment
about it.
Some educators are paying attention, however, especially in light of the
trends uncovered by value-added assessment.
Since Sanders developed the method in the mid-1980s, researchers have been
able to pin down a number of insights about the dynamics of classrooms,
kids and teachers.
Among the biggest findings:
-- More than anything else, teachers matter.
"What we consistently find, and have found since the beginning, is
that the difference in the effectiveness among teachers is the single largest
factor affecting the academic growth of students," Sanders says.
In a study of fifth-grade math scores in two large Tennessee school systems,
Sanders found that students who had three straight years of low-performing
teachers had percentile scores 52 to 54 points lower than students who
had high-performing teachers three years in a row.
A percentile score shows how well a student did compared with students
around the country. A 75 would mean the student did as well as, or better
than, 75 percent of the test-takers.
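The percentile definition above can be sketched in a few lines of Python. This is only an illustration: the function name and the eight scores are made up, and a real testing program would rank each student against a large national norm group rather than a short list.

```python
def percentile_rank(score, all_scores):
    """Percent of test-takers who scored at or below `score`."""
    if not all_scores:
        raise ValueError("all_scores must not be empty")
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100.0 * at_or_below / len(all_scores)


# Hypothetical scores for eight test-takers (invented for illustration).
scores = [410, 450, 470, 500, 520, 560, 600, 640]

# Six of the eight scores are at or below 560, so the rank is 75.
rank = percentile_rank(560, scores)
print(rank)
```

In this toy example a score of 560 lands at the 75th percentile, meaning the student did as well as, or better than, 75 percent of the group.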
In test score terms, 50 percentile points is an astronomical difference,
the kind that could mean a child being challenged in advanced courses or
turning brain-dead in average ones.
"It's basically taking an engineering, math-based career off the table
for them by the time they're in the fifth grade," says Dave Shearon of
the Nashville school board.
Shearon, elected this summer, calls that luck-of-the-draw difference tragic
- but avoidable.
Last month, he suggested his school system use teacher effectiveness data
to plot where the most effective and least effective teachers work, using
a system of green and red dots.
But Shearon tabled the plan after Nashville's teachers, smack in the middle
of hot contract negotiations, went into battle mode. They mocked the plan,
wearing red and green dot-stickers in protest.
Shearon says he'll bring the idea up again when negotiations cool.
He insists the data wouldn't be used to assign teachers to low-performing
schools, saying he doesn't believe the teachers would be effective if they're
unhappy.
"But my suspicion is we might see some of our best teachers volunteering
to go back into those schools," he said.
-- Teachers matter - corollary two.
The negative effects of a bad teacher, especially on a child who suffers
poor-performing teachers two or more years in a row, can linger long after
the child leaves that classroom.
Just as important are the effects of a great teacher. A child who gets
a high-performing teacher will enjoy academic benefits for years to come.
Residual academic effects from extremely effective and ineffective teachers
were measurable three years later, Sanders's study showed.
-- Teacher effect - part three.
The good teacher effect isn't compensatory. That means a good teacher can
make gains with her students, but she can't completely wipe out the lingering
negative effects of a bad teacher.
-- You can't judge a good teacher by her school Zip Code.
A study of two large Tennessee school districts found that black schoolchildren
were overrepresented in the worst teachers' classrooms by about 10 percent
and underrepresented in the best teachers' classrooms by about the same
amount.
But by and large, Sanders, his RS6000 computer and his team of statisticians
have found effective teachers in all kinds of schools, from rich to poor.
-- The building-change phenomenon.
Researchers have found huge drops in test scores - more strikingly in Memphis's
seventh-graders than almost anywhere else - the year after students move
from their elementary school to middle school or junior high.
Sanders says researchers don't believe the trauma of switching schools,
or even the trauma of early adolescence, is to blame.
"Our hypothesis is that the receiving schools do not usually have
a good fix on where the feeder schools left off, so the receiving schools
tend to re-teach a good chunk of old material until they get a fix on where
the kids are," he says.
Schools often waste time re-teaching material, dulling students' scores
on that year's achievement test, he believes.
The other two theories don't pan out because test scores of students who
move from one school to another don't always drop. And children are being
tested against their own age group, so teenage angst shouldn't come into
play.
"This is one of the things schools might want to address and give
some attention to, because those impacts are large and they're consistent,"
he said.
"It's been hurtful to Memphis, relative to its overall cumulative
gains, because the seventh grade is nearly flat," Sanders says.
Memphis's seventh-grade scores, which have shown far less progress than
the national average, have dragged down city scores overall, he said.
-- Test results show that in most public schools, both inner-city and rural,
the pace of instruction is geared to the lowest-scoring kids.
Sanders says his studies have shown that if students with above-average
test scores in their early school years are stuck in a series of classrooms
where the pace is slow, they slow down, too.
"Then it becomes a self-fulfilling prophecy: That these early above-average
kids, after being in that series of classrooms, will no longer be above
average, but indeed will start regressing," he says.
Much of the debate surrounding value-added assessment mirrors the arguments
that have raged for years about standardized testing in general.
"There is no statistical system that will allow you to look at a student's
performance and tell you what the teacher quality is. Tests don't test
what a student knows," said Al Mance of the Tennessee Education Association.
"(Teaching) isn't like a person standing around making widgets and
you're only going to hire someone who can manufacture 20 widgets a day.
When you're talking about human beings . . . a person can give input on
one end, and the reaction to it can be quite different," Mance says.
The TEA doesn't oppose value-added assessment in general, Mance says, but
the 45,000-member group does object to legislators attaching "high stakes"
to the results, such as the portion of the Education Improvement Act that
allows the state to put continually low-performing schools on probation.
John Stone, an East Tennessee State University professor who is outspoken
in his criticism of the education establishment, says Mance's "widgets"
analogy doesn't fly.
"Their whole modality of argument is to simply find fault," he says.
"They damn the good because it isn't perfect. There is no perfect system
of measuring what an individual teacher does with a group of students,
and no matter what, to some extent, the method will always distort the
outcome."
But as a method of evaluating schools, teachers and students, Stone says,
value-added assessment is about as fair as it gets.
"Nobody would accept it if banks argued that you shouldn't do audits because
it doesn't perfectly capture what they're doing with the money," he said.
"But that's the implication in those arguments."
Which Is Your School?
Education researchers say they've been able to sketch several types
of schools based on value-added assessment data.
"SHED" - This pattern is most prevalent in inner-city, urban schools
and is also the most frequently noted pattern among schools statewide. The
shed pattern is an indication that a school is tailoring lessons to the
lowest-achieving students, while having less success with average- and
high-achieving students.
"REVERSE SHED" - Most common in suburban districts, such as Shelby
County Schools. Shows that a school is having the most success teaching
its highest-achieving students, while average- and low-achieving students
may not be faring as well. A prevalent pattern for private schools.
"TEEPEE" - A less frequent pattern among Tennessee schools, indicates
that a school is having the most success with its average-achieving students,
while low- and high-achieving students aren't gaining as much.
"V" - Also an infrequent pattern among Tennessee schools, a "V" pattern
suggests a school is having success with its high- and low-achieving students,
but is less successful with average kids.
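The four patterns can be told apart by comparing the average gains of a school's low-, average- and high-achieving students. The Python sketch below is a rough illustration only: the function name, the three-group split and the strict comparisons are assumptions, not the researchers' actual classification rules.

```python
def classify_pattern(low_gain, average_gain, high_gain):
    """Label a school by which achievement group shows the biggest gains.

    Assumption: each argument is the average yearly gain for that group
    of students. Ties are reported as "no clear pattern".
    """
    if low_gain > average_gain > high_gain:
        return "shed"          # most success with the lowest achievers
    if low_gain < average_gain < high_gain:
        return "reverse shed"  # most success with the highest achievers
    if average_gain > low_gain and average_gain > high_gain:
        return "teepee"        # most success with average students
    if average_gain < low_gain and average_gain < high_gain:
        return "V"             # least success with average students
    return "no clear pattern"


# Made-up gains for four hypothetical schools.
print(classify_pattern(12, 8, 4))   # shed
print(classify_pattern(4, 8, 12))   # reverse shed
print(classify_pattern(5, 10, 5))   # teepee
print(classify_pattern(10, 5, 10))  # V
```

The names come from the shape the three gains would trace on a chart, read from low achievers on the left to high achievers on the right.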
The Premise of Value-Added Assessment
The premise of value-added assessment is this: Looking at scores from
standardized tests doesn't give a clear picture of what happens in a classroom
during a school year.
Value-added assessment is meant to measure not how much a child knows but
how much that child has learned over the course of a year - how much "value"
the teacher and school have added.
The system was developed by University of Tennessee-Knoxville professor
Bill Sanders.
To arrive at his assessments, Sanders uses a child's current scores on
the Tennessee Comprehensive Assessment Program tests, up to five years'
worth of the child's old test scores, and expected improvements based on
the national average gain.
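The basic idea of comparing a child's gain with the expected gain can be shown with simple arithmetic. The Python sketch below is a deliberately simplified illustration, not Sanders's method: his system relies on mixed-model statistics and up to five years of scores, while this example uses a single prior year, and every name and number in it is hypothetical.

```python
def gain_versus_expected(prior_score, current_score, national_average_gain):
    """How far a student's one-year gain is above or below the expected gain."""
    actual_gain = current_score - prior_score
    return actual_gain - national_average_gain


def classroom_average(students, national_average_gain):
    """Average of the students' gains relative to the expected gain.

    `students` is a list of (prior_score, current_score) pairs.
    """
    differences = [
        gain_versus_expected(prior, current, national_average_gain)
        for prior, current in students
    ]
    return sum(differences) / len(differences)


# Three hypothetical students and an assumed national average gain of 25 points.
students = [(700, 730), (650, 670), (600, 640)]
result = classroom_average(students, national_average_gain=25)
print(result)  # gains of 30, 20 and 40 average 5 points above the expected 25
```

A positive result suggests the class gained more than expected and a negative one less, regardless of how high or low the raw scores were to begin with.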
Each year the system generates a batch of reports to teachers, principals
and administrators showing the progress made by students in school systems,
schools and even individual classrooms.
The value-added system has become attractive to school systems around the
country that spent the last decade or so jumping onto the school accountability
bandwagon. As a measure of the effectiveness of schools and teachers, it
adds more depth to the picture provided by raw test scores.
For example, a child who continually scores well on standardized tests
might make her parents happy, but it may be that the child's teacher didn't
push her far enough or fast enough. Maybe she didn't gain enough compared
to other kids around the country.
And for children in the lowest-scoring schools, value-added assessment
is a way to gauge whether the school is helping kids catch up with their
better-scoring peers.
To catch up - no easy task - students' scores must show that they're outgaining
students at other schools each year, what Sanders calls the "ratcheting effect."
Value-added assessment is calculated only for third- through eighth-grade
students because they're the only students who take the TCAP tests.
But Sanders has developed a value-added assessment to measure the progress
of seventh- through 12th-graders in five different math courses.
Based on scores from end-of-course tests, it measures the gain made by
students against the statewide average. The same method soon may be used
to assess secondary school students in as many as 14 other subjects.
Sanders notes that it's almost pointless to look at value-added scores
without also looking at percentile scores. That's because a school with
dreadfully low test scores might show deceptively high value-added scores.
The opposite also could be true.
A school that practically aces standardized tests might not show any value-added
gains. The test scores alone would give parents a bogus picture about whether
their child is working up to potential.
UT Statistics Professor Didn't Count On This Reception
By Mickie Anderson
Timing is everything.
For years, University of Tennessee professor Bill Sanders banged on Nashville
doors like a salesman, pestering anyone who would listen to his method
for measuring school effectiveness.
Now the world's at his door.
Sanders and his "value-added" method for evaluating schools are the biggest
things going in education, the hot topic at educrat cocktail parties, the
stuff school researchers would kill to sink their teeth into.
In education circles, it's put Tennessee, the only state with such a system,
on the map.
It was a harmonic convergence - a mix of political timing and dumb luck
- that pushed Sanders, an agriculture research statistician by trade, into
the field of education.
It was the mid-1980s and Gov. Lamar Alexander's career ladder proposal
was all the news. (Legislators froze the teacher merit pay plan last year.)
Sanders, long an advocate of a little-known brand of mathematics called
"mixed-model statistics," was teaching a group of graduate students about
the method.
Coming up with an on-the-fly sample problem, he began to show the class
how one could use mixed-model statistics to evaluate how much progress
a school made in a year's time.
"That was just pulled totally, completely out of the air," the
56-year-old statistician said.
His colleague, Robert McLean, corralled him after class.
McLean insisted the two write to the governor, outlining how a school evaluation
system could work.
Sanders got the OK to try out his theory using Knox County schools' test
data.
"I mean, to show you how naive I was, I thought the whole world was
waiting for this, I really did," he recalls with amusement. "We
were go-go-go-go-go."
"So, anyway, I got done, and I called up my contact in Nashville,
and I said, 'I'm done.' And his question was 'With what?'
"That was my first clue that the whole world wasn't waiting on this,"
Sanders says, chuckling at the memory.
Sanders did his first three reports based on student test data from Knox
County, Blount County and Chattanooga City Schools.
Despite some significant findings, Sanders couldn't get the time of day.
So he put it aside.
Satisfied that he'd proved it could be done, he figured an education researcher
one day might pick up where he left off.
But then the scene shifts.
It's now 1989, and the small schools vs. big schools lawsuit has again
made education a hot-button issue in Tennessee.
About a week after the November election that year, a brand-new legislator
called Sanders at home. He had been on an airplane with a retired UT professor
who had chatted the legislator up about Sanders's work.
In 1992, after months of hearings and discussion and debate, Sanders's
value-added assessment was made part of the state's massive Education Improvement
Act.
Sen. Andy Womack (D-Murfreesboro), one of the sponsors of the legislation
that incorporated Sanders's plan, isn't quite convinced that Tennessee
schools are using the value-added data as much as they could.
But so far, he says, the plan is on the right track, noting that he gets
calls from legislators around the country interested in what Tennessee
is up to.
"I think it has brought attention to the fact that the only way to
evaluate school performance is whether value is being added or not," Womack
said.
"It gives us a new criteria for judging and evaluating schools that
never existed previously."
Sanders, who's quick with an analogy to explain his sometimes-confusing
system, has logged more frequent-flier miles than a flight attendant, preaching
his value-added gospel to interested school districts and groups.
With so much attention on the program, he's busy fielding offers and figuring
how to handle his new fame.
"Sometimes I'm reminded of the dog that chased the car," he says.
"When the car stops, what do you do with it?"
Accountability? For Schools, Yes; For Teachers, No
Evaluations Not Tied To Tests, Critics Say
By Mickie Anderson The Commercial Appeal
When legislators were asked to raise the state's sales tax to generate
more money for schools back in 1992, they held their breath and did it,
insisting on one thing in exchange: accountability.
Indeed, school systems are held accountable on several fronts, including
academic improvement as measured by the state's value-added assessment
system.
But teachers aren't.
For one thing, the value-added assessment system is based on scores from
the Tennessee Comprehensive Assessment Program, the annual standardized
tests taken by students in grades three through eight.
That means there is no comprehensive value-added assessment for first-
and second-graders or high school students. (There is an assessment that
measures the progress of seventh- through 12th-graders in certain math courses.)
But even for fourth- through eighth-grade teachers, value-added assessment
plays a limited role in their evaluations or accountability.
State law says value-added scores can be used in teachers' formal evaluations
when three years' worth of data has been collected, a mark some teachers
reached two years ago.
But new guidelines issued by the state Board of Education governing teacher
evaluations ask just two things: what the teacher has learned from the
value-added data and how the teacher intends to use the findings to improve.
The evaluation form used in Memphis refers several times to test scores.
But there is no direct tie between student scores and the teacher's rating.
Nor can parents use value-added assessments to choose their children's
teachers. By state law, reports that reflect on individual teachers are
not public.
State legislators conducted weeks of hearings before adopting the value-added
system six years ago. Some say state Education Commissioner Jane Walters's
lack of oversight and enthusiasm for the system has yanked most of the
teeth from their attempt to hold teachers accountable.
"I think the intent was very clear, and that was that it was to be
used to identify teachers, who year after year, didn't have gains, then
identify the teachers whose students were showing gains year after year,"
said Sen. Tommy Haun (R-Greeneville).
Walters has heard the "you-just-don't-want-to-get-rid-of-the-bad-teachers"
criticism before. It's off-base, she says.
She favors the use of teacher effectiveness data in evaluations but is
leery of putting significant emphasis on one score.
Instead of trying to go after the worst teachers, Walters says, energy
would be better spent coaching the majority in the middle.
"Getting rid of that bottom 5 percent is not going to be as effective
as boosting those in the middle," she said.
A lot of things can affect a teacher-effectiveness report, she said, such
as a school spending more time fund-raising than teaching, or constant
classroom interruptions.
While Walters says she does not oppose value-added assessment, she has
suggested scrapping some of the data-gathering, and last year convinced
legislators to end mandatory second-grade testing.
When the value-added system was approved, Ned McWherter was governor, and
Charles Smith ran the state's Education department. Both were ardent supporters
of the system.
Research based on data turned up through the massive assessment system
has consistently found one thing: That teacher quality has more to do with
how students will do on standardized achievement tests than anything else.
Analysis showed that students who had the least effective teachers three
years straight had standardized test scores 52 to 54 percentile points
lower than students lucky enough to have highly effective teachers three
years in a row.
"That's a very powerful insight into how important it is to have good
teachers," said Diane Ravitch of the Brookings Institution, a public
policy think tank.
"The typical educator's response is, 'It's all in the home background,'"
she said. "But this is very powerful evidence that besides all
that, there's still something important going on in the classroom. And
that's terrifying to many people."
Although many poor students face tough academic challenges, value-added
assessment tracks individual students against their own progress, eliminating
the poverty factor.
Haun said that, in carrot-and-stick terms, the legislation was never meant as
a stick to rap on teachers' heads. The intent was to identify the best
teachers as a way to help poor teachers improve. In fact, legislators ensured
in the law that individual teachers' records wouldn't be public.
But if a teacher simply couldn't be salvaged, Haun said, then the value-added
assessment scores could be used to help terminate the employee.
"If the attitude is, 'I'm not going to change,'" he said, "then (value-added
assessment) could be used as a stick."
The state does hold school systems accountable for improvements, and the
possibility of district probation is spelled out in the law.
That has happened only once, when Hancock County schools were put on probation
last year. Besides having poor test scores, the district also was criticized
for misusing funds, for using outdated textbooks and for its top-heavy
administration.
The gap between school systems' accountability and teacher accountability
troubles some.
"You find out that the law might not be meeting what your intent was,"
Haun said. "The downfall is in the oversight."
But when having a poor teacher for three years in a row is enough to "knock
a kid out of the box," as Ravitch puts it, it shouldn't be an option.
"A lot of people don't want to know. They don't even want to ask the
question," Ravitch says. "But you have to step back and say 'What
do we have these schools for?' They're not employment agencies. These kids
have to be protected."
School Value Accountability Tool Deserves Full, Fair Use
Editorial
THE USE of "value-added assessment" to measure the annual improvement
- or lack of it - of Tennessee's public schools gives parents, taxpayers,
lawmakers and school officials a potentially powerful tool to hold principals
and teachers responsible for their performance. Its application should
not be unduly limited by efforts to avoid such accountability.
Value-added assessment uses pupil scores on standardized achievement tests
over several years to measure how a district, a school and even an individual
classroom are progressing. It seeks to establish how much the district,
school and classroom contribute each year to student learning, relative
to other systems, schools and classrooms across the state and nation.
The method was developed by a professor of statistics at the University
of Tennessee-Knoxville and is used more widely in our state than in any
other. It permits more sophisticated measures of school improvement than
do simple comparisons of aggregate test scores.
Of course no single statistical device can fully assess the effectiveness
of a particular school or teacher, any more than a single test score should
determine a student's admission to college. But used in combination with
other types of evaluations, value-added assessment can suggest how well
a school is educating the children in its charge, and how it can improve.
There seems no real basis, other than reflexive resistance to change, for
the opposition the method has generated among Tennessee teacher unions
and some state education officials. Self-interested objections to tying
value-added assessment to classroom evaluations - and appropriately publicizing
those evaluations - cannot provide the last word on the method's merits.
Many of the findings of value-added assessment seem self-evident: The quality
of teaching, more than any other factor, determines how well pupils do
on standardized tests. Bad teaching in early grades can place children
at a disadvantage from which they may never recover.
By contrast, good teaching can provide a solid academic foundation that
lasts for years. And the best teachers are not always found in the districts
and schools that are perceived to be the most prestigious.
Value-added assessment measures have confirmed the success of Memphis City
Schools reform initiatives. Several city schools have used the method's
findings to identify weaknesses and define remedial measures.
The findings also raise important policy questions. Can curricula be coordinated
better, to prevent the large declines in test scores that occur when students
move from elementary school to middle or junior high school? What are the
consequences for gifted students of policies that set the pace of classroom
instruction to accommodate low-achieving pupils?
The method has its drawbacks. Because its database consists of scores on
Tennessee Comprehensive Assessment Program tests, it generally excludes
pupils in the earliest grades and in high school who do not take those
tests, although new measurements are under development for secondary grades.
Many aspects of teaching quality are subjective and not susceptible to
statistical measurement. Test scores may not take into account the unique
obstacles a school or class faces.
Like other statistical measures, value-added scores also may be deceptive
at the extremes - in this case, identifying improvements in the lowest-
and highest-achieving schools. And any system that links rewards to test
scores can encourage schools to "teach to the test" instead of
teaching, period.
With those caveats, though, value-added assessment can help measure the
performance of teachers and school principals. Suggesting that assessment
data identifying consistently poor performance should not affect continued
employment makes no more sense than arguing that the marks on a child's
report card should have no bearing on whether he or she is promoted to
the next grade.
Tennessee's Education Improvement Act - and the tax increase the state
levied to pay for it - are based on the premise that school districts must
be publicly accountable for the resources they receive.
That should apply to schools, teachers and administrators as well. Value-added
assessment is one important way of imposing such accountability - if politicians
and interest groups allow it to work as it is designed to.