UPDATED: Scholar Questions Induction Study Results
The Institute of Education Sciences' study on intensive teacher induction is getting a lot of buzz, and at least one scholar is worried that it may be giving an inaccurate picture of the effects of mentoring.
The study found positive effects in both reading and math that, relatively speaking, are quite large. As the New Teacher Center's Liam Goldrick points out on his blog, the effect sizes for mentoring given in this report are larger than those produced in other large-scale randomized studies, such as one on Teach For America.
But Jonah Rockoff, an associate professor at Columbia University's School of Business who's studied induction programs in New York City, urged caution about the results. For one, he noted that despite these large achievement boosts, the data don't show any other effects that would seem to confirm the results. Teachers didn't report feeling any more prepared, for instance.
He also directed my attention to this factor: The researchers used "covariates" to control for the effects of teachers' background characteristics and other factors on the data. Without such controls, the effect estimates drop and are no longer statistically significant for reading. You can find this on p. 95 of the report.
By Rockoff's read, covariates shouldn't be necessary in an experiment if the "treatment" and "control" groups are really appropriate comparisons. One problem with large-scale education studies, he said, is that teachers often change classes, grade levels and subjects. In this case, as the population of teachers declined due to attrition and other movements, only a small subset of teachers was tied to at least two years of student-achievement data. And if that shrinking pool of teachers resulted in a material difference in the composition of the treatment and control groups, then that might skew the results.
"The big issue is whether treatment and control groups still look like one another, among the subset of teachers with multiple years of data," Rockoff told me.
UPDATE: Steven Glazerman of Mathematica, one of the number guys behind the study, submits this response:
"We agree with Jonah Rockoff that one should be cautious about interpreting the test score impact findings. We sounded that note of caution in our report and in its executive summary because the results are not robust to alternative ways to estimate the program's impact. However, the most credible estimates adjust for chance differences between student and teacher characteristics, including students' achievement before they entered the study teacher's classroom. These show a positive and significant impact on reading and math in two-year districts and no impact in one-year districts. Rockoff is correct that it is not necessary in a true experiment to adjust for these chance differences by including covariates. However, failing to do so would cause us to ignore data and rely on a less precise estimate. In this case especially, when the test score analysis sample is small, the precision gains are substantial: the standard error associated with our estimates decreases by 29% in a model with covariates compared to a model without covariates."