(Vol. 1, No. 2 - Spring 1997)



The Long Beach Unified School District
Testing Program


(A December 1996 memo prepared by Dr. Lynn Winters, assistant superintendent for research, assessment and evaluation.)


PART I: THE DEVELOPMENT OF A DISTRICT ASSESSMENT SYSTEM

What is an assessment system?


Test validity tied to testing purpose. One test cannot serve all testing purposes. A test is "valid" only insofar as it provides accurate information for a particular decision. Some tests, such as the SAT, are valid for sorting students and, when combined with the high school GPA, are accurate for predicting freshman college grades. Other tests, such as the Golden State or Advanced Placement examinations, are valid for identifying which students have mastered a particular course of study. Classroom tests, such as the 'end of chapter' exam, are valid for identifying which students understand the important concepts and principles in a unit and which need more help.

Valid uses for norm-referenced tests. At the district level, we have multiple reasons for testing: monitoring progress toward meeting the standards, public accountability, and improving instruction. Not every test we administer serves all purposes. For example, tests such as ITAS are made to compare students to national norm groups. These tests are designed so that half of the norm group scores below and half above the median score. Items are chosen to "create a normal distribution". Thus items are chosen for their difficulty levels (so that about 50% of the students get them right and 50% get them wrong) as much as for what content they measure. These tests are appropriate for identifying low and high performing groups against a "standard" that the larger community understands, the national norm. They do not provide diagnostic information for individual students, nor are they particularly well-matched to local curriculum (though an argument could be made that they reflect a broad national consensus of what students should know).

Valid uses for district level performance tests. Performance tests, such as the Integrated Task, Direct Writing, and Open Ended Mathematics, provide excellent samples of what students can actually generate on their own. They identify student weaknesses in communication and in the ability to describe, explain, represent knowledge or solve problems. Because performance tests take a lot of time to administer, they generally consist of only one or two tasks (items). When you have very few samples of student work (or items), you do not have very good predictors of what students can do over a wide variety of tasks in a particular subject matter. And because you get a limited sample of student behavior on a performance test, you can underestimate student content knowledge or be inaccurate in your description of the student's real performance level.

Valid uses for classroom tests. Classroom-based tests such as running records, projects, experiments, essays, or collections of student work (portfolios) are excellent examples of how well students understand the local curriculum. They pretty much "define" what teachers think Standards mean in a particular classroom. Classroom-administered assignments or tests are the primary method of giving students immediate information about how well they are doing. These tests, unless scored by several raters and administered under "standard" conditions, may not be useful for predicting a student's performance in another class or in subsequent years, and may not be credible to audiences who want an "objective" measure of the schools. Nevertheless, well-designed classroom assignments and assessments with standard scoring criteria tied to district Standards are more powerful tools for improving student achievement than norm-referenced or district-level performance tests.

What the above discussion boils down to is this: when you have several testing purposes, you need several tests. The use of multiple measures to meet a variety of testing purposes is often referred to as an "assessment system".


What is our timeline for having an assessment system for Long Beach?

The district has approximately 32 broad content standards in the core curriculum: Language Arts, Mathematics, Science and Social Studies. Since no one test could possibly assess all 32 Standards, our strategy is to develop an assessment system comprised of multiple measures and different testing strategies. Some of the Standards are readily assessed by norm-referenced multiple choice tests. Some can be assessed by district level performance tests. Other Standards, especially those focused on student thinking processes or requiring extended amounts of assessment time (exhibitions, demonstrations, experiments, multi-step performances) are best monitored at the classroom level. It will take several years to develop a complete assessment system comprised of district level, school level, and classroom tests.

When complete, the system will serve three major purposes:

-- monitoring progress toward Standards;
-- accountability; and
-- improving instruction.

School Year 1996-97:
Assessment of Key Standards in Language Arts and
Mathematics; Piloting Science and Social Studies Tasks,
Setting Performance Standards.


Language Arts and Mathematics

This year we have three pieces of the system in place:

1. A multiple choice norm-referenced test (ITAS) that assesses Language Arts Standards 5, 6, 9 and 11 and Mathematics Standard 7.

2. Performance Assessments in Direct Writing and Open Ended Mathematics that assess Language Arts Standards 10 and 11 and one Mathematics Standard (depending upon the prompt chosen).

3. Running Records in grades K-3 to assess Language Arts Standards 4 and 5.

Science and Social Studies Pilots

We are piloting a hands-on science assessment in grade 7 tied to one Science Standard. The science assessment appears on the test schedule. Teacher-developed history-social science performance tasks are being tried out in classrooms during the school year and will be reviewed this summer. At that time, we will work with the Curriculum Leaders to devise a system for monitoring student progress on the social studies standards using classroom-based assessments.

Setting Performance Standards

In order to monitor student progress on district Standards, we must set performance standards, that is, levels of student performance that define what "meeting" the standards means. Both the state and federal governments have designated four levels of performance: "Advanced", "Proficient", "Basic or Partially Proficient", and "Below Basic." The performance standard setting process determines what "cut scores" are used to decide whether students are "Proficient" (meeting the Standard), "Advanced", "Below Basic" and so on. This year's focus for setting performance levels is Language Arts and Mathematics. Next year district committees will set performance standards for science and history-social studies. Performance levels will not be set for all 32 Standards this first year. We will begin by identifying a few key standards in language arts and mathematics, then in science and social studies. These "key" standards will serve as a focus for district level assessment and monitoring of student progress toward meeting Standards.
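As a rough illustration of how cut scores work, the sketch below sorts a test score into one of the four performance levels. The cut score values and the 0-100 score scale are hypothetical (the district committees had not yet set actual values), and treating "Advanced" as also meeting the Standard is an assumption.

```python
from bisect import bisect_right

# Hypothetical cut scores on an assumed 0-100 scale; these are NOT district values.
# Each cut score is the lowest score that earns the next level up.
CUT_SCORES = [40, 60, 85]
LEVELS = [
    "Below Basic",
    "Basic or Partially Proficient",
    "Proficient",
    "Advanced",
]


def performance_level(score, cuts=CUT_SCORES, levels=LEVELS):
    """Return the performance level whose score band contains `score`."""
    # bisect_right counts how many cut scores are at or below the score,
    # which is the index of the matching level.
    return levels[bisect_right(cuts, score)]


def meets_standard(score):
    """Assumption: 'Proficient' and 'Advanced' both count as meeting the Standard."""
    return performance_level(score) in ("Proficient", "Advanced")
```

With these made-up cut scores, a student scoring 59 would be "Basic or Partially Proficient" and a student scoring 60 would be "Proficient". Moving a cut score by even a point changes which students are counted as meeting the Standard, which is why the setting process matters.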

School Year 1997-98:
Implementing Improved Measures in English and Spanish

New tests, designed to assess Standards rather than what appears in the commonly used textbooks, are beginning to appear on the market. Some of these tests are norm-referenced, multiple choice tests, some are performance tests, some are "modules of science experiments" and the like. We are convening a test-review committee beginning in January, 1997 to look at what we are currently using and to ask hard questions about what best assesses our local Standards. We know, for example, that we will need different Spanish tests because we are currently using translated tests with no national norms. We need measures for LEP students to assess progress towards meeting the ELD Standards.

ITAS is being revised this spring because its norms are more than 5 years old and it no longer will appear on the "state approved" test list. The new ITAS will incorporate the complete Iowa Tests of Basic Skills as well as a standards-based test for a total testing time of 5 hours instead of our current 3 hours. Since virtually all test publishers are revising their tests this year to meet state and Title I requirements for Standards-based tests with current norms, we will probably want to review the new tests as well as the revised ITAS.

PART II: CHANGES IN THIS YEAR'S TESTING SCHEDULE

Testing ALL Students


In the past the district has allowed schools to exempt large numbers of students from testing. While students should not be tested inappropriately, district initiatives at grades 3 and 8, with early warning at grades 4 and 7, and Title I regulations require that we collect assessment information from ALL students, including LEP exempt and Special Education. We are working with Alex Morales in PALMS and Eloise Thompson in Special Education to develop guidelines for testing previously-exempt populations. The inclusion of all students in some kind of testing (not necessarily ITAS and performance) will not "penalize" schools in any way. We have the ability to present test scores for special populations along with summary building scores so that you may compare the performance of different populations of students and their "impact" on the building score.

With the disappearance of CLAS (California's previous statewide assessment), newspapers no longer print local test scores and compare them to other districts. The public focus is now on SAT scores or other tests that are the same across districts. Thus, there's no reason to fear that inclusion of potentially low-scoring students could hurt a school. We also provide a place on student answer sheets for indicating which students were tested under "modified" conditions so that those students may be scored separately.

Performance Testing in Mid-March

The purpose of the performance test is NOT to compare schools but to help teachers diagnose problems students have in conceptual understanding and communication. Because it is diagnostic, we encourage you to continue to administer classroom level assessments similar to the district test and to use the test as a focus for talking about student work at your site. The test is not used to track the growth of individual students. It is used to see whether groups of students are improving over time. The district performance test is an "in-progress" assessment of where we are and what we need to do to get students up to the Standard in writing and mathematics.

Direct Writing and Open Ended Mathematics Assessments
(in English and Spanish) at Grades 3, 5, 6, 8 and 10


In an effort to cut down the amount of spring testing time and to complete performance test scoring in one day instead of two, we are not testing all grades at the district level. The sites will probably wish to test the "off grades" (grades 2, 4, 7 and possibly 9) with site-level performance assessments. The district performance tests will be scored on Monday, May 19th. Tuesday, May 20th is available for scoring at your site. We will send information about how you can order assessments for the "off grade levels". You will need to organize the scoring and pull anchor papers at your site as these won't be available districtwide.

Direct Writing

We will be administering a direct writing prompt in English and Spanish instead of the Integrated Task this year. The direct writing task will take one period instead of three. SRS did not develop new Integrated Tasks, so if we are to have "secure" prompts, we need to use the Direct Writing prompt. The CAS2 scoring guides will be used to score these tests.

The writing domains assigned to each grade level remain the same and one will be chosen for the direct writing test. Below is a summary of the possible domains:
Grade 3: autobiographical incident; problem-solution; report of information; story
Grade 5: evaluation; firsthand biography; observational; problem-solution
Grade 6: firsthand biography; problem-solution; report of information; story
Grade 8: evaluation; firsthand biography; observational; speculation
Grade 10: controversial issue; evaluation

Open Ended Mathematics

The Open Ended Mathematics prompts will be developed by the district. Teachers will receive a notebook with a set of prompts for all mathematics Standards in their grade level. Mathematics consultant Dixie Dawson will choose ONE prompt to revise and use during testing. This prompt will be translated into Spanish. The scoring guides for these prompts will be included with the model OEM tasks. These district-developed scoring criteria will be used for the district test.

Running Records for Grades K-3 (English and Spanish)

Teachers will be issued a scan sheet in June to record student results on running records for both fiction and non-fiction books in grades K, 1, 2, and 3. The results will be summarized at the district and returned to the sites to be used as "early warning" information for the Grade 3 Reading Initiative.
