The Limits of Testing and the Future of Accountability
In this post, Jack Schneider and Paul Reville continue their conversation about the role of standardized testing in public education.
Schneider: You ended our last post with some claims that I'd like to respond to and continue discussing.
First, you seem to be saying that schools are to blame for narrowing the curriculum in response to accountability policies. In your words: "the problem is less about the tests and more about the way schools respond to the tests." But that gives policymakers a pass for not anticipating a very obvious consequence of high-stakes testing. If I told you that I was going to hold you accountable for the number of steps you take each day, and I handed you a pedometer, it wouldn't matter that I encouraged you to get other forms of exercise. I would be dictating your behavior. There's just no way around that fact. Without a balanced scorecard for evaluating quality, we will continue to see the work of schools distorted by standardized tests.
You also suggested in our last post that we need to develop better measures of factors like social and emotional health before we can incorporate them into accountability frameworks. And I'm all in favor of getting things right rather than merely getting them done. But I would also challenge the notion that our measures for things like student engagement are any worse than our measures for school knowledge. What does a standardized test tell us about student learning? Something, certainly. But not a tremendous amount. And not without substantial error. So where's the consistency? Either we should say: "Hey, we're in the Stone Age right now and none of our tools are really sufficient." Or we should admit that our tools are imperfect and deploy the full range of them.
Reville: You're right that multiple-choice tests are very limited. I'm proud that our Massachusetts tests, the MCAS, have generally had very high proportions of open-response questions. The best tests I ever experienced as a student were tests that forced me to write long essays. They would be corrected by hand in a very detailed way and promptly returned to me. This kind of testing is expensive, so states often default to more primitive, multiple-choice tests. However, with the new testing consortia pioneering some innovative forms of measuring student learning, I am confident that we're ready to advance into a new and better era of testing.
Now, I do think policymakers should take responsibility for the consequences of their actions. In the case of the narrowing of the curriculum, I think the fault lies not with the goals or the tests but with the fact that educators and schools needed more time to accomplish these and other education goals. Educators didn't necessarily ask for more time, nor did policymakers supply it. I believe that prioritizing learning and skill development in English, math, and science was a justifiable policy goal and one which would require more time if schools were still to attend to all the other subjects. It was a mistake not to offer more time in order to build the capacity of the school system to improve performance on the core subjects while delivering a well-rounded education.
I'm not a psychometrician, so I don't have a good grasp of the state of the art in terms of measuring social and emotional development. The tests I've seen in this area have serious limitations. And while I agree that our standardized tests in conventional subjects also have limitations, I don't think your characterization of them as "Stone Age" is fair. The MCAS, for example, while far from perfect, is authentic, valid, and reliable. I'm convinced that the next generation of subject tests will be even more helpful to teachers and students.
I believe part of the reason that we haven't seen tests of social-emotional development introduced much at the school, district or state levels is that there's not much agreement on whether or not schools should be focused on this domain. Many policymakers and families feel that social-emotional learning is best left to families and communities and that schools should stick to the three R's. I don't subscribe to this view but it's widespread. Policymakers won't include these kinds of factors in our accountability systems until substantial demand for them emerges. Educators could be a major force in bringing this subject to the forefront of policy discussions.
Schneider: My first question here is about what providing "more time" would accomplish (I assume we're talking about a longer school day here?). You suggest that "more time" would prevent the curriculum from narrowing. But I'm not sure that's true, since schools would still be held accountable for standardized test scores in English and math. So wouldn't "more time" simply lead to more of the same, with untested subjects being crowded out of the curriculum? And what would more time do to prevent educators from feeling the pressure to emphasize test prep, since that's what they're being held accountable for? And, perhaps most importantly: how would this prevent schools from being punished for the demographic background of their student populations, since some students tend to score lower on tests than others?
My second question is about what kind of evidence you think policymakers would need to see in order to push for a more balanced scorecard of school quality. Richard Rothstein and Rebecca Jacobsen, for instance, conducted a survey with a representative sample of Americans, and found that "social skills and work ethic" rated third (behind "basic skills in core subjects" and "critical thinking and problem solving") among the many aims that schools can promote. Also highly valued by Americans are "citizenship and community responsibility," "preparation for skilled work," "physical health," "emotional health," and "the arts and literature." Stakeholders clearly want schools to be promoting these aims.
Reville: I believe in tests that are as rich and authentic as the educational goals we're trying to achieve. Good tests aren't susceptible to "test prep." Strong accountability systems reward educators for improvement, not for the status of their students.
I think more time would give teachers more capacity to cover the subjects they feel are important. If teachers have an adequate amount of time, I have faith in their judgment to balance the learning needs of their students. I do think it's time to start factoring some of the "non-cognitive" qualities we value into our accountability system so they get the attention they deserve in the curriculum. We also need time for subjects like history, career readiness, languages, economics, psychology and the arts.
Policymakers are reluctant to add time because of the associated costs and the general resistance from students and the public to the extension of school hours and schedules. I believe we need to use this moment in history to reconsider education for the 21st century. Researchers, employers and the general public can be persuasive with policymakers in making the case for broadening educational goals to include factors like interpersonal skills, persistence and resilience. Success for all should be the goal, and we're learning that the factors associated with success include, but are not limited to, academic skills. Educators can play a leadership role in persuading society that schools should be responsible for addressing non-academic factors. Finally, it's good to keep in mind an old axiom about testing and education generally: "not everything that's important can be measured and not everything we can measure is important."
Schneider: I agree with most of what you say here. I would add only that if good tests aren't susceptible to test prep, we have pretty clear evidence that our current tests aren't particularly good. And state policies tend to exacerbate the problem by attaching such high stakes to those tests. It doesn't need to be this way.