Designer of Value-Added Tests a Skeptic About Current Test Mania
Follow me on Twitter at @AnthonyCody
Defenders of our current obsession with test scores claim that new, better tests will rescue us from the educational stagnation caused by a test prep curriculum. One of those new types of tests is the adaptive test, which adjusts the difficulty of questions as students work, so that students are always challenged. This gives a better measure of student ability than a traditional test, and can be given in the fall and spring to measure student growth over the year. This approach is increasingly being used to determine the "value" individual teachers add to their students' academic ability, which is then used as a significant factor in teacher evaluation -- as required by the Department of Education as a condition for relief from No Child Left Behind.
One might expect the designer of these tests to be happy with the many uses now being found for the data they produce. But Jim Angermeyr, one of the architects of the value-added assessment, is not so thrilled. He worked with the Northwest Evaluation Association to develop tests, and more recently has served as director of research and evaluation with the Bloomington Public Schools. In this fascinating interview with the Minneapolis Post, he shares some of his concerns as he prepares to retire from the field.
His first concern is the way test scores are being used to rate teachers:
We [test designers] have a healthy respect for error and how to measure it. And always a certain amount of caution when you're interpreting results.
That caution grows as the groups get smaller, like looking at a classroom instead of a whole school. And that caution grows even more when the stakes increase because increasing the stakes can lead to all kinds of distortions, whether it's the cheating that goes on in some of the schools that you've been reading about around the country, or whether it's just the general over-emphasis on testing to the exclusion of other things.
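To make the point about group size concrete, here is a minimal sketch of the textbook standard error of an average score. It is not how any actual value-added model works. The 15-point spread in student scores, the group sizes, and the function name `standard_error` are all my own assumptions, picked only for illustration, and the formula assumes students' scores are independent of one another.

```python
import math


def standard_error(score_sd: float, group_size: int) -> float:
    """Standard error of a group's average score.

    Assumes each student's score is an independent draw with the same
    standard deviation (a simplification, not a value-added model).
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return score_sd / math.sqrt(group_size)


# Hypothetical numbers, chosen only for illustration.
SCORE_SD = 15.0
GROUPS = [("school", 400), ("grade level", 100), ("classroom", 25)]

for label, n in GROUPS:
    se = standard_error(SCORE_SD, n)
    print(f"{label:12s} n={n:4d}  standard error = {se:.2f} points")
```

Under these assumptions, the uncertainty around a single classroom's average is four times as large as the uncertainty around the whole school's average, because the error shrinks only with the square root of the number of students.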
Dr. Angermeyr helps us put testing in its place. He says:
Where the distortion comes in is that you can only test a limited amount of the domain. Even if it's a domain like mathematics, you can't cover everything. And so you make assumptions about kids' skills in that broader domain. Do we have eighth graders who are good readers based on a pretty small sample of questions and items?
Testing professionals know that you're just sampling the domain and you don't try to make inferences further than that. But nonprofessionals do that all the time. "American students are 51st in the world in reading." There are a lot of assumptions that are made before you can get to that conclusion, but people leap right over that.
If I was running the world, I would severely reduce the accountability stakes for tests. I would certainly eliminate things like No Child Left Behind. I would probably take away the current waiver. Even if it looks better, sometimes it's still really the same wolf in different clothing.
I would do away with standards, to be honest. Even though on paper they sound kind of cool, they assume all kids are the same and they all make progress the same way and move in lockstep. And that's just not accurate. Standards distort individual differences among kids. And that's bad.
I would put testing back as a local control issue in school districts. I would take the emphasis off of evaluating and [compensating] teachers. I would put the emphasis on good training for principals and curriculum specialists and teachers on how to interpret data and use it for the kind of diagnosis and assessment that it was originally intended for.
This resonates powerfully with what teachers have been saying since the beginning of No Child Left Behind. It reminds me especially of the work that Doug Christensen led in Nebraska several years back, focused on developing local control of testing and standards.
But Jim Angermeyr is also aware of the power of data to provide our leaders with the ability to simplify complex issues.
It's politicians and some policymakers who believe tests can do more than they really can. And there's not enough people stopping and saying wait a minute. When you can summarize a whole bunch of complicated things in a single number, that has a lot of power and it's hard to ignore, especially when it tells a story that you want to promote. And that's where it gets really twisted.
There are quite a few of us saying "wait a minute." There is a National Resolution on High Stakes Testing that has gathered the support of hundreds of organizations and thousands of individuals.
This message is also echoed in the latest news out of Florida, where the state School Board Association recently adopted a resolution condemning the overuse of high-stakes tests, and objecting to their use as the primary basis for evaluating teachers, administrators, schools and districts.
Perhaps if those designing the tests raise their voices alongside those of us who are giving the tests, and the students taking the tests, and their parents as well, we can bring about the change we need.
What do you think? Can we return testing to its proper place as a diagnostic tool?