Varied Measures for Varied Purposes
Today's guest contributor is Susan H. Fuhrman, President, Teachers College, Columbia University; Chair, The Consortium For Policy Research In Education (CPRE).
Just about everyone agrees there's too much stress on testing in American schools. Ever since No Child Left Behind in 2001, schools have been testing every student in English/Language Arts and Mathematics in Grades 3-8 and once in high school. The same state standardized tests have been used for multiple purposes: to monitor student learning, to hold schools accountable and to evaluate teacher performance. Putting so much weight on these tests has meant that they drive instruction, forcing teachers to attend to the specific content likely to be tested rather than the whole curriculum, and crowding out important subjects like social studies, physical education and the arts.
New developments in research and practice open up the possibility of varying the measures used for the main purposes of assessment: tracking student progress and guiding the improvement of student learning; getting information about school progress for incorporation in accountability systems; and evaluating teachers. Building a hybrid system of assessment would relieve the undue pressure now placed on a single yearly state test.
Student performance should be regularly tracked by teachers, who are positioned to be the best assessors of student learning, through in-class tests, midterms, finals, and grades for homework and classwork. Excellent curricular approaches, like the Teachers College Reading and Writing Project, incorporate formative assessment processes. In addition, some states and districts are developing item banks of good test questions that teachers can use in classrooms. Eventually, new learning software and games used by students in some school subjects will track student learning and supplement teacher in-class tests with embedded measures of student progress. New, increasingly ubiquitous information systems compile all these data so that educators can observe the progress of every child. Parents and students can also access versions of these reports.
Hopefully, additional research and investment will enable us to design better measures that are embedded in the curriculum, and to aggregate and standardize the results of such assessments to measure school, classroom or group progress. Until that day, schools, districts and even states could include some common measures among all the data they collect, to assure that all students are learning essential material. Given all this accumulating student-level information, it is hard to see why standardized state testing of each individual student would need to take place in as many grades as it does at present, if at all.
For school-level information, we should consider a matrix sampling approach. Assessment experts have long argued that one way for tests to cover the whole curriculum, rather than narrowing in on the few areas that fit in a single test session, is to give different students different questions. Matrix sampling, used by the National Assessment of Educational Progress (NAEP), enables coverage of many more areas of content than giving everyone the exact same test (as we do now, in an approach called census testing). If we separate the need to track individual students from the need for school-level information, it is possible that we can introduce sampling for the latter purpose and mitigate the narrowing that comes from teachers having to focus on the limited content most tests can cover. With some common questions, in a partial matrix sampling approach, some individual-level scores might be available.
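To make the idea concrete, here is a minimal sketch in Python of how matrix sampling might work. It is an illustration under simplifying assumptions, not a description of NAEP's actual design: the function names (`build_forms`, `assign_forms`, `school_summary`), the content areas and the item labels are all hypothetical.

```python
import random
from collections import defaultdict


def build_forms(items_by_area, num_forms):
    """Deal every item in the pool across test forms, round-robin.

    items_by_area maps a content area to its list of items. Each form
    ends up with only a fraction of the pool, but the forms together
    cover all of it.
    """
    forms = [[] for _ in range(num_forms)]
    position = 0
    for area, items in items_by_area.items():
        for item in items:
            forms[position % num_forms].append((area, item))
            position += 1
    return forms


def assign_forms(student_ids, num_forms, seed=0):
    """Shuffle students, then spread them as evenly as possible over forms."""
    rng = random.Random(seed)  # fixed seed so the assignment is repeatable
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    return {sid: i % num_forms for i, sid in enumerate(shuffled)}


def school_summary(responses):
    """Pool (area, correct) responses into a proportion correct per area.

    The result describes the school as a whole, not any one student,
    because no student answered every item.
    """
    totals = defaultdict(lambda: [0, 0])  # area -> [number correct, number answered]
    for area, correct in responses:
        totals[area][0] += int(correct)
        totals[area][1] += 1
    return {area: right / answered for area, (right, answered) in totals.items()}


# Hypothetical item pool: three content areas, four items each.
pool = {
    "reading": ["r1", "r2", "r3", "r4"],
    "math": ["m1", "m2", "m3", "m4"],
    "science": ["s1", "s2", "s3", "s4"],
}
forms = build_forms(pool, num_forms=4)
assignment = assign_forms(range(20), num_forms=4)
```

In this toy setup each student sees only 3 of the 12 items, yet every item is answered by some group of students, so the school-level summary reflects the whole pool. A partial matrix design would add a small block of common items to every form.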
California successfully petitioned the federal government to accept a sampled approach in lieu of traditional No Child Left Behind testing this spring while it pilots new tests from the Smarter Balanced Assessment Consortium. A few smaller states have also opted to use sampled field tests in lieu of traditional census testing. Perhaps a policy window has opened. I hope to collaborate in some research that focuses on how a sampling approach can satisfy policymaker needs for school-level and subgroup information. While the consortia working on Common Core-aligned assessments are designing tests that provide individual scores, the cost and burden of such assessments over time would fall on the states. If a sampling approach, perhaps at benchmark grades, provides good information on school progress, it might be an attractive alternative for state policymakers.
We also have to figure out what to do about teacher evaluation. Despite growing resistance, many states are using state test measures to indicate how much difference in student learning can be traced to individual teachers. Moving to a matrix sample design for school accountability, as just discussed, will not provide sufficient information to link standardized test scores to individual teachers and is resisted by those who think a single test best captures teacher influence. Fortunately, research has shown that there are valid ways to examine teacher contributions: through the work they assign and their students complete, through student surveys and through observations of practice. Putting more effort into developing efficient systems using such measures seems warranted.
Susan H. Fuhrman
Teachers College, Columbia University
The Consortium For Policy Research In Education (CPRE)