My heart went out to Charlie Gibson last week, as he stared into those doe eyes that will not blink and realized that he could not wrangle a single straight answer out of Miss Wasilla.
So I can only imagine how the NYC Department of Education analysts felt when they sat down to analyze the data from student, parent, and teacher surveys this year. It turns out that you get as much valid and reliable information out of these surveys as Gibson managed to pull out of Sarah Palin.
The problem is a very simple – and very predictable – one. Survey responses constitute 10% of the Progress Report Grades, and schools face very real consequences if they receive a poor grade. Faced with such pressures, we expect that the adults who fully understand these consequences – parents and teachers – will provide a rosier picture of the school than truly exists.
If all schools did this equally, the inflation of survey responses would not be a problem; we could still rank schools by their perceptions of safety, engagement, or what have you. We would not have a clean measure of how safe a school is overall, but we would know how safe it was relative to other schools – a central objective of the grading system.
Alas, schools face different incentives to inflate their survey responses. If you’re a teacher filling out a survey in an F school, you know that your school could very well be closed if its grade doesn’t improve. Compared to a teacher filling out a survey in an A school, you’re more likely to put on a happy face.
One way to get at this problem is to compare changes in the teacher responses to the survey with changes in the student responses. We know that students and teachers don’t see eye-to-eye about school conditions, so we don’t expect them to provide comparable assessments of the school in any given year. But if teachers report improvement at a rate that far outpaces the improvement reported by the students, and this happens more in D and F schools than A and B schools, we have pretty good evidence that teachers have inflated their responses.
To get a handle on survey inflation, I did a basic calculation for each of the four survey domains: safety, communication, academic expectations, and engagement. Using the example of safety, I calculated:
(2008 Teacher Survey Score for Safety – 2007 Teacher Survey Score for Safety) –
(2008 Student Survey Score for Safety – 2007 Student Survey Score for Safety)
At schools that have positive scores on this measure, teachers report a pace of improvement that outpaces the improvement that students report. Kids are often the best check on us wily adults, and it turns out that they function as a first-rate BS detector in this case. I should also note that students may be pressured to inflate their scores, so if anything, the difference between the teacher and student changes is a lower bound measure of survey inflation.
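For readers who want to reproduce this, here is a minimal sketch of the calculation in Python, with the per-school gap then averaged within each Progress Report letter grade. The function names, the dictionary keys, and the scores are all made up for illustration; they are not the actual survey data or the DOE's file layout.

```python
from collections import defaultdict
from statistics import mean


def inflation_gap(teacher_2007, teacher_2008, student_2007, student_2008):
    """Teacher-reported change minus student-reported change for one domain."""
    return (teacher_2008 - teacher_2007) - (student_2008 - student_2007)


def mean_gap_by_grade(schools):
    """Average the per-school gap within each letter grade."""
    gaps = defaultdict(list)
    for school in schools:
        gap = inflation_gap(school["t07"], school["t08"],
                            school["s07"], school["s08"])
        gaps[school["grade"]].append(gap)
    return {grade: mean(values) for grade, values in gaps.items()}


# Hypothetical safety scores, for illustration only (not real survey results).
# t07/t08 = teacher scores in 2007/2008, s07/s08 = student scores.
example_schools = [
    {"grade": "A", "t07": 7.0, "t08": 7.5, "s07": 6.0, "s08": 6.5},
    {"grade": "F", "t07": 5.0, "t08": 7.0, "s07": 5.0, "s08": 5.5},
    {"grade": "F", "t07": 4.0, "t08": 5.5, "s07": 4.5, "s08": 4.5},
]

result = mean_gap_by_grade(example_schools)
print(result)
```

A gap near zero means teachers and students report the same amount of change; a positive gap means teachers report more improvement than students do. Because the measure is a difference of changes, any stable disagreement between teachers and students about the level of safety drops out.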
The first graph below reports the average of these differences for the safety measure for high schools receiving A to F grades. At A schools, students and teachers saw improvement happening equally – there is almost no difference between the change in teacher scores and the change in student scores. At F schools, there are tremendous differences between the rate of improvement reported by teachers and students.

The teacher-student discrepancy exists for every measure on the survey. Next, let’s look at the engagement measure for high schools.

Bottom line: survey inflation exists across the board, but is worst at D and F schools. If you’d like figures for the other domains or school levels, feel free to email me. The irony, of course, is that instead of having better information about how things are going in NYC schools, incorporating the surveys in the grading scheme has fundamentally corrupted this measure.



This is a good piece of analysis and certainly the kind of triangulation that needs to always be in place when using survey data. But, I wonder if it is a bit of a leap to attribute the difference to ulterior motives deriving from the inclusion of the survey scores on the report card.
In my district the teacher survey is carried out by the union to guard the privacy of every teacher involved and ensure participation. The results are primarily published on the union website. As a parent I just stumbled on this data gold mine by accident, but have found that it provides very helpful information in selecting schools. But, while some topics are very insightful (how high a regard the staff has for the principal, for instance), I have seen similarly questionable viewpoints when it comes to other things. I generally attribute this to teachers' different point of view. Kids know a lot more about what goes on in the hallways, bathrooms, cafeterias and on buses, while the teachers regard life mostly from within a classroom. Hence, they see the prevalence of harassment very differently.
But the other thing that I suspect may be driving some of the "inflation" is a level of denial of problems in low performing schools. If one believes that they are really doing well, but what can you expect from kids who come from the kind of background they come from, one might rate student engagement overly high. If one believes that the kids bring social problems in the door with them, but that the school is the safest part of their day, again one would answer accordingly. I would call this denial, and it is a pretty common symptom of dysfunctional systems in which problems appear to be insurmountable. If one accepts that teachers become teachers because they want to do wonderful things to improve the lives of children (and I am willing to believe that this is pretty much the case), it can be an incredible letdown to encounter one's human powerlessness to turn kids around. So rather than adjust one's own self-view, the result is to minimize problems and to find scapegoats. The cure for dysfunctional situations, of course, is always to start by admitting that there is a problem. This is something that we are only beginning to be able to do.
How the responses are being used matters to respondents (in this case, teachers).
My school was not a pleasant place to work and teachers were less than happy with the admin -- and I think they would've reported as such on a survey. But when it was announced that the school would be closing, teachers were upset -- many didn't want to leave. If the staff knew that the school might be closed based, in part, on their responses to the survey, then I'd guess that their responses would be positively biased.
Hi Margo,
I agree that kids and teachers see things differently - and that is very clear in the data - but that is netted out of this analysis since I'm looking at change across two years. Why would we expect teachers in D and F schools to report dramatic improvements, while students do not report these improvements?
Perhaps cross-checking against the parent data would address your concern, so I will take a look at that later. I did look at the parent response rates, which shot up enormously at D and F schools, but not at A and B schools. This supports Corey's assertion that parents and teachers may band together to support an institution that they find valuable when they see it as endangered.
I am an elementary school parent and I think you are completely spot-on when you say "parents and teachers may band together to support an institution that they find valuable when they see it as endangered."
When I answered the survey the first year, I didn't realize that there were essentially right and wrong answers and that giving the wrong ones could result in my child's school being negatively assessed. So this year, I pretty much figured out what the DOE wanted me to say and said it--even though what I value (and my kid's school values) does not necessarily accord with DOE priorities.
As a progressive institution, my kid's school is already in a constant struggle with the system. Why would I (or teachers) want to jeopardize the school's valuable mission, even if we aren't 100% satisfied?
It's too bad, because a survey (though I'd include a whole different set of questions) could yield useful information. I am not a data person so maybe I am not aware of the exact definition, but this "survey" seems more like a test to me.