Yes, Beltway Wonks, Sampling Error Does Matter
Dan Koretz, Harvard psychologist and author of Measuring Up: What Educational Testing Really Tells Us, provides a very clear explanation of why Carey is wrong:
A few readers might be wondering: if all students in a school (or at least nearly all) are being tested, where does sampling error come into play? After all, in the case of polls, sampling error arises because one has in hand the responses of only a small percentage of the people who will actually vote. This is not the case with most testing programs, which ideally test almost all students in a grade.
This question was a matter of debate among members of the profession only a few years ago, but it is now generally agreed that sampling error is indeed a problem even if every student is tested. The reason is the nature of the inference based on scores. If the inference pertaining to each school...were about the particular students in that school at that time, sampling error would not be an issue, because almost all of them were tested. That is, sampling would not be a concern if people were using scores to reach conclusions such as "the fourth-graders who happened to be in this school in 2000 scored higher than the particular group of students who happened to be enrolled in 1999." In practice, however, users of scores rarely care about this. Rather, they are interested in conclusions about the performance of schools. For the inferences, each successive cohort of students enrolling in the school is just another small sample of the students who might possibly enroll, just as the people interviewed for one poll are a small sample of those who might have been. (p. 170)
Addressing complexities like sampling error is not just exploiting a "loophole" to avoid NCLB sanctions. Rather, it's an assurance that when we label a school as "in need of improvement," we're not wrongly assigning that label. It strikes me as deeply ironic that even as NCLB endorses "scientifically-based" research, many wonks continue to turn their noses up at the central conventions of the science of statistics.
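To make the cohort-as-sample point in the quoted passage concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than anything from Koretz's book: a hypothetical school whose underlying proficiency rate never changes (60 percent), a cohort of 50 tested students each year, and each student's result treated as an independent draw. Every student is tested, yet the school's reported rate still moves from year to year.

```python
import math
import random


def simulate_cohort_rates(true_rate, cohort_size, n_years, seed=0):
    """Return the observed proficiency rate for each yearly cohort.

    Assumption: each student independently scores proficient with
    probability `true_rate`, which stays fixed across years.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    rates = []
    for _ in range(n_years):
        # Test every student in this year's cohort.
        passed = sum(rng.random() < true_rate for _ in range(cohort_size))
        rates.append(passed / cohort_size)
    return rates


def standard_error(true_rate, cohort_size):
    """Standard error of a proportion under the binomial model."""
    return math.sqrt(true_rate * (1 - true_rate) / cohort_size)


# Hypothetical school: quality is constant, only the cohorts change.
rates = simulate_cohort_rates(true_rate=0.60, cohort_size=50, n_years=10)
se = standard_error(0.60, 50)

for year, rate in enumerate(rates, start=1):
    print(f"year {year}: {rate:.0%} proficient")
print(f"standard error for one cohort: {se:.3f}")
```

Under these assumptions the standard error for a single cohort is about 7 percentage points, so swings of roughly 14 points in either direction are plausible even though nothing about the school has changed. Real cohorts are not independent random draws, so treat this as a sketch of the logic, not an estimate for any actual school.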