Releasing Virginia's Teacher Evaluation Data Would Be a Bad Idea
Yesterday, in a page-one story in the Washington Post, Emma Brown and Moriah Balingit reported on the Virginia lawsuit that's seeking to force the state to release the individual evaluation data for thousands of teachers across Virginia. The suit prevailed in a Richmond courtroom in January and is currently being challenged by state education officials and the Virginia Education Association. While I'd rather be siding with crusading parents than with bureaucrats, I think the effort to release data on the students of individual teachers is hugely problematic. Why? My reasoning hasn't changed since the L.A. Times famously first reported individual teacher value-added scores back in August 2010. Since the issue hasn't changed, let's just go to the tape. Here's a lightly trimmed version of what I wrote the day after the Times story ran:
[Given] the impressive journalistic moxie it showed, I really wanted to endorse the LAT's effort. But I can't. Now, don't get me wrong. I'm all for using student achievement to evaluate and reward teachers and for using transparency to recognize excellence and shame mediocrity. But I have three serious problems with what the LAT did.
First, as I've noted here before, I'm increasingly nervous at how casually reading and math value-added calculations are being treated as de facto determinants of "good" teaching. As I wrote back in April, "There are all kinds of problems with this unqualified presumption. At the most technical level, there are a dozen or more recognized ways to specify value-added calculations. These various models can generate substantially different results, with a third of each result varying with the specifications used." [Note: The Virginia model doesn't attempt to control for the impact of poverty or other demographic considerations.]
Second, beyond these kinds of technical considerations, there are structural problems. For instance, in those cases where students receive substantial pull-out instruction or work with a designated reading instructor, LAT-style value-added calculations are going to conflate the impact of the teacher and this other instruction. How much of this takes place varies by school and district, but I'm certainly familiar with locales where these kinds of "nontraditional" (something other than one teacher instructing 20-odd students) arrangements account for a hefty share of daily instruction. This means that teachers who are producing substantial gains might be pulled down by inept colleagues, or that teachers who are not producing gains might look better than they should. Currently, there is nothing in the design of data systems that can correct for these kinds of common challenges.
Third, there's a profound failure to recognize the difference between responsible management and public transparency. Transparency for public agencies entails knowing how their money is spent and how they're faring, and expecting organizational leaders to report on organizational performance. It typically doesn't entail reporting on how many traffic citations individual LAPD officers issued or what kind of performance review a National Guardsman was given by his commanding officer. Why? Because we recognize that these data are inevitably imperfect, limited measures and that using them sensibly requires judgment. Sensible judgment becomes much more difficult when decisions are made in the glare of the public eye.
So, where do I come out? I'm for the smart use of value-added by districts or schools. I'm all for building and refining these systems and using them to evaluate, reward, and remove teachers. But I think it's a mistake to get in the business of publicly identifying individual teachers in this fashion. I think it confuses as much as it clarifies, puts more stress on primitive systems than they can bear, and promises to unnecessarily entangle a useful management tool in personalities and public reputations.
Sadly, this little drama is par for the course in K-12. In other sectors, folks develop useful tools to handle money, data, or personnel, and then they just use them. In education, reformers taken with their own virtue aren't satisfied by such mundane steps. So, we get the kind of overcaffeinated enthusiasm that turns value-added from a smart tool into a public crusade. (Just as we got NCLB's ludicrously bloated accountability apparatus rather than something smart, lean, and a bit more humble.) When the shortcomings become clear, when reanalysis shows that some teachers were unfairly dinged, or when it becomes apparent that some teachers were scored using sample sizes too small to generate robust estimates, value-added will suffer a heated backlash. And, if any states get into this public I.D. game (as some are contemplating), we'll be able to add litigation to the list. This will be unfortunate, but not an unreasonable response—and not surprising. After all, this is a movie we've seen too many times.