Which Side Is Right About Evaluating Teachers?
Now that the fall semester is underway, it won't be long before teachers are evaluated on their instructional effectiveness. In years past, the process applied largely to new teachers who did not yet have tenure. But most states today require that even veteran teachers be evaluated. High on the list of strategies for this purpose is the value-added model. Two papers published by prestigious organizations two years apart almost to the day present contrasting views about this controversial metric.
The Economic Policy Institute was first when it released its paper on Aug. 29, 2010 ("Problems With the Use of Student Test Scores to Evaluate Teachers"). Ten scholars concluded that "used with caution, value-added modeling can add useful information to comprehensive analyses of student progress and can help support stronger inferences about the influences of teachers, schools, and programs on student growth." However, they emphasized that the "foundation of teacher evaluation systems" should be the professional judgment of competent supervisors and peers. In short, the value-added model should play only a supplemental role.
Then on Sept. 5, 2012, the Manhattan Institute's Marcus A. Winters weighed in with his view based on his study of data from Florida public schools ("Transforming Tenure: Using Value-Added Modeling to Identify Ineffective Teachers"). He concluded that "public schools can indeed use VAM to help identify teachers for tenure or removal." However, he hastened to emphasize that "this report does not argue that VAM should be used in isolation to evaluate teachers for tenure or to make any other employment decisions."
Actually, I see more agreement between the papers than initially meets the eye. Both acknowledge that evaluating teachers is a complex undertaking. Both also urge the use of multiple measures in making these high-stakes decisions. The major difference concerns reliability. Winters maintains that claims about the value-added model's unreliability should be rejected. In contrast, the ten scholars believe that "estimates of teacher effectiveness are highly unstable." They cite the Board on Testing and Assessment of the National Research Council of the National Academy of Sciences: "... VAM estimates of teacher effectiveness should not be used to make operational decisions because such estimates are far too unstable to be considered fair or reliable."
There's another issue that warrants further debate. It has to do with Type I (false positive) and Type II (false negative) errors. Which is worse: labeling a bad teacher as good or labeling a good teacher as bad? Winters believes that the existing system of evaluation defaults in favor of teachers ("Putting value-added model to the test: Study finds student scores can predict teacher effectiveness," The Atlanta Journal-Constitution, Sept. 5). I don't doubt that some teachers remain in the classroom when they don't deserve to be there. But there is an equal danger of removing teachers who help students in ways that the value-added model does not capture. I'm referring to non-cognitive outcomes, which the standardized tests currently in use do not measure. I pointed this out in my letter to the editor, published in The New York Times Book Review on Sept. 9 ("The Character Hypothesis").
I hope that many more studies will be devoted to the subject because the issue is far from settled. At this point, however, I think the value-added model can cautiously be used as one piece of information. My only concern, and I want to emphasize it, is that as pressure mounts for quantifiable data about teacher effectiveness, the value-added model will take over the entire process. That would not be good for teachers or students.