Irreconcilable Differences: Why NYC’s Surveys Provide a Misleading Portrait of School Quality
So I can only imagine how the NYC Department of Education analysts felt when they sat down to analyze the data from student, parent, and teacher surveys this year. It turns out that you get as much valid and reliable information out of these surveys as Gibson managed to pull out of Sarah Palin.
The problem is a very simple – and very predictable – one. Survey responses constitute 10% of the Progress Report Grades, and schools face very real consequences if they receive a poor grade. Faced with such pressures, we expect that the adults who fully understand these consequences – parents and teachers – will paint a rosier picture of the school than truly exists.
If all schools did this equally, the inflation of survey responses would not be a problem; we could still rank schools by their perceptions of safety, engagement, or what have you. We would not have a clean measure of how safe a school is overall, but we would know how safe it was relative to other schools – a central objective of the grading system.
Alas, schools face different incentives to inflate their survey responses. If you’re a teacher filling out a survey in an F school, you know that your school could very well be closed if its grade doesn’t improve. Compared to a teacher filling out a survey in an A school, you’re more likely to put on a happy face.
One way to get at this problem is to compare changes in the teacher responses to the survey with changes in the student responses. We know that students and teachers don’t see eye-to-eye about school conditions, so we don’t expect them to provide comparable assessments of the school in any given year. But if teachers report improvement at a rate that far outpaces the improvement reported by the students, and this happens more in D and F schools than A and B schools, we have pretty good evidence that teachers have inflated their responses.
To get a handle on survey inflation, I did a basic calculation for each of the 4 survey domains: safety, communication, academic expectations, and engagement. Using the example of safety, I calculated:
(2008 Teacher Survey Score for Safety – 2007 Teacher Survey Score for Safety) –
(2008 Student Survey Score for Safety – 2007 Student Survey Score for Safety)
At schools with positive scores on this measure, teachers report a pace of improvement that outpaces the improvement that students report. Kids are often the best check on us wily adults, and it turns out that they function as a first-rate BS detector in this case. I should also note that students may be pressured to inflate their scores too, so if anything, the difference between the teacher and student changes is a lower bound on survey inflation.
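As a rough sketch, the calculation above could be coded like this. The function name and the example numbers are mine for illustration; they are not drawn from the actual DOE survey files:

```python
def inflation_score(teacher_now, teacher_prev, student_now, student_prev):
    """Teacher-reported change minus student-reported change in one
    survey domain (e.g. safety) between two years. A positive value
    means teachers report improvement that outpaces what students report."""
    return (teacher_now - teacher_prev) - (student_now - student_prev)

# Hypothetical school: teachers say safety rose from 6.0 to 7.5,
# while students say it rose only from 6.5 to 7.0.
print(inflation_score(7.5, 6.0, 7.0, 6.5))  # -> 1.0
```

Averaging these scores within each Progress Report grade (A through F) then gives the grade-by-grade comparison shown in the graphs.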
The first graph below reports the average of these differences for the safety measure for high schools receiving A to F grades. At A schools, students and teachers saw improvement happening equally – there is almost no difference between the change in teacher scores and the change in student scores. At F schools, there are tremendous differences between the rate of improvement reported by teachers and students.
The teacher-student discrepancy exists for every measure on the survey. Next, let’s look at the engagement measure for high schools.
Bottom line: survey inflation exists across the board, but is worst at D and F schools. If you’d like figures for the other domains or school levels, feel free to email me. The irony, of course, is that instead of having better information about how things are going in NYC schools, incorporating the surveys in the grading scheme has fundamentally corrupted this measure.