What About Everyone Else? Statistical Significance and Sample Size
Guest blog post by Jaclyn Zubrzycki
I recently wrote about a study on the effectiveness of school vouchers from the Brookings Institution's Matthew Chingos and Harvard University's Paul E. Peterson. The study has received some attention, both positive and negative, and the authors also wrote an opinion piece in yesterday's Wall Street Journal to publicize its release. But I'm not going to talk about the politics of vouchers here (though you can check out the article, which went live yesterday, or our vouchers issue page for more detail). Instead, we're going to split hairs about the way the results for different subgroups are presented in the study.
The study showed that, among a group of about 2,700 New York City students in the late 90s, winning a lottery to use a school voucher and then using that voucher led to a significant increase in college enrollment for black students, who accounted for a little more than 40% of the students studied (good news!). The study also showed, however, that winning the voucher lottery or using the voucher had NO overall impact on the college enrollment of the group as a whole, and no impact on the college enrollment of Hispanic students, the other large subgroup in the study.
But wait a minute, you might ask: if African-American students' college enrollment increased with vouchers, and Hispanic students' remained the same with vouchers, but the OVERALL rate also remained the same with vouchers, whose college enrollment went down? Shouldn't we also be reading that vouchers DECREASED the likelihood that some other group would attend college? What about Asian students, white students, or students who identify as "other"? The equation doesn't balance!
It's a great question, and one I also asked the researchers. But it turns out that the results from those groups weren't highlighted for good reason. Students from the other subgroups who received vouchers DID attend college at a slightly lower rate than students who did not receive vouchers, but the sample sizes for white, Asian, and other students are so small that the differences are not statistically significant, or are "noisy," as researcher Matt Chingos explained.
"We find negative effects for white and Asian students that are very imprecisely estimated because this group is very small (only 105 white and Asian students in total)," Mr. Chingos explained in an email. Researcher Casey Cobb, a professor at the University of Connecticut's Neag School of Education, noted that, in this case, the fact that the total enrollment rate remained the same while the enrollment rate of blacks grew did NOT necessarily mean that another subgroup had to have seen a significant change. Here's a short but hopefully helpful explanation of what makes a finding statistically significant.
In this particular case, Mr. Chingos said, the characteristics of the treatment and control groups for these particular subgroups—the students who identified as Asian, white, or other who did and did not receive voucher offers—seemed to be slightly different as well, which could also lead to this difference. But even so, the very small number of students involved means that we can't draw confident conclusions.
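For fellow nerds, here is a rough sketch of why a group that small gives such "noisy" estimates. It is not the researchers' method, and every enrollment rate and group split below is made up for illustration; only the figure of 105 white and Asian students comes from the study. The sketch uses a simple textbook approximation (a 95 percent margin of error for a difference between two proportions) to compare the same five-point gap in a tiny group and in a hypothetical large one.

```python
import math


def diff_with_margin(p_treat, n_treat, p_ctrl, n_ctrl, z=1.96):
    """Return (difference, margin of error) for two enrollment rates.

    Uses the normal approximation for a difference in proportions;
    z=1.96 corresponds to a roughly 95 percent confidence level.
    """
    diff = p_treat - p_ctrl
    std_err = math.sqrt(
        p_treat * (1 - p_treat) / n_treat + p_ctrl * (1 - p_ctrl) / n_ctrl
    )
    return diff, z * std_err


def is_significant(diff, margin):
    """A gap is 'significant' here if it is larger than its margin of error."""
    return abs(diff) > margin


# HYPOTHETICAL rates: 40% enroll with a voucher offer, 45% without.
# Small group: the study's 105 students, split 53/52 (the split is assumed).
small_diff, small_margin = diff_with_margin(0.40, 53, 0.45, 52)

# Large group: a made-up 1,000 students per arm, same five-point gap.
large_diff, large_margin = diff_with_margin(0.40, 1000, 0.45, 1000)

print(f"small group: gap {small_diff:+.3f}, margin +/-{small_margin:.3f}, "
      f"significant: {is_significant(small_diff, small_margin)}")
print(f"large group: gap {large_diff:+.3f}, margin +/-{large_margin:.3f}, "
      f"significant: {is_significant(large_diff, large_margin)}")
```

With these invented numbers, the five-point drop is swamped by a margin of error of nearly 19 points in the small group, while the same drop in the large group clears a margin of a little over 4 points. The gap is identical; only the number of students changed.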
So, a good reminder: The size of the study matters, and sometimes there's a difference between what's statistically significant and what seems to intuitively make sense. In this case, despite the overall flat results, one group of students' gain was not necessarily another group's loss.
Yours in statistical nerdery,