Counting 'Proficient' Students Creates Bad Accountability
By Morgan Polikoff
Consider two California schools. Both schools have 40 percent of students scoring proficient on state tests. Because both schools are well short of the state's accountability targets, they are subject to intervention. According to the accountability classifications used under No Child Left Behind and those proposed for the new federal Every Student Succeeds Act, these two schools look exactly the same.
But a glance under the hood shows that the two schools are not at all the same. School A has performance that looks like the following: 20% below basic, 40% basic, 20% proficient, 20% advanced. School B has performance of 40% below basic, 20% basic, 40% proficient, 0% advanced. In terms of average achievement levels, School B is clearly worse off than School A. If the goal of an accountability system is to target resources and support to the lowest-performing schools, it would be a mistake to treat these two schools the same.
NCLB All Over Again
Yet that is exactly what NCLB did through its Adequate Yearly Progress calculation, and what the draft ESSA regulations would continue. All that mattered for a school's performance rating was the proportion of its students at or above the state-defined proficiency threshold. Research showed that this design led to negative unintended consequences:
- It encouraged schools to focus their energies on students just below this proficiency cut score, since improving student achievement for already-proficient (or very-far-below-proficient) students would not help the schools' accountability ratings.
- It is statistically a very poor measure of school performance, because schools' relative rankings depend heavily on where the proficiency cut score is set.
- It is a very poor measure of achievement gaps—actual changes in achievement gaps can sometimes be opposite to changes identified by comparing proficiency rates, again because of where proficiency levels are set and the distribution of student achievement.
There are more arguments about the problems of this accountability policy design, which I detail in the letter I wrote to the U.S. Department of Education. (It was signed by nearly 100 scholars and practitioners, including former presidents of the American Educational Research Association, former Commissioners of the National Center for Education Statistics and the Institute of Education Sciences, and more than a dozen K-12 educators and school district data and accountability officers. The letter was also covered by several news outlets.)
There's Time To Fix This
Fortunately, there is a window of opportunity for the U.S. Department of Education to address these issues in ESSA through its regulatory process. The regulations being developed by the Department will have profound consequences for the kinds of accountability policies that are actually written and implemented under ESSA.
Based on the research evidence, I argue in the letter that the Department should interpret the law to allow states to use alternative measures of performance. In particular, I recommend one of two options:
- The simple average of students' scores on the assessment (what's known as average scale scores).
- A "proficiency index" that gives schools credit for performance all along the achievement distribution (for instance, 0 points for far below basic, 25 for below basic, 50 for basic, 75 for proficient, and 100 for advanced).
In the letter I recommend the first approach because it uses the most information available in the assessment data. However, both approaches would be far superior to rating schools according to the percent of students who achieved at least a "proficient" score.
So how would this work for our two example schools? Under either approach sketched above, School A would come out ahead of School B, as it should. Using the proficiency index described above, School A would score 60 and School B would score 50 on the 0-to-100 scale, a substantial difference reflecting the fact that students in School A are higher-achieving on average than those in School B. An average scale score approach would reflect the performance of kids in each school even more precisely.
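The proficiency-index arithmetic above is just a weighted average of point values. A minimal sketch in Python, using the point scheme and the two schools' distributions given above (the dictionary names are mine, for illustration only):

```python
# Point values for each performance level, per the index described above.
# ("Far below basic" would earn 0 points, but neither example school
# has students at that level.)
POINTS = {"below basic": 25, "basic": 50, "proficient": 75, "advanced": 100}

def proficiency_index(distribution):
    """Weighted average of level points; shares are fractions summing to 1."""
    return sum(share * POINTS[level] for level, share in distribution.items())

school_a = {"below basic": 0.20, "basic": 0.40, "proficient": 0.20, "advanced": 0.20}
school_b = {"below basic": 0.40, "basic": 0.20, "proficient": 0.40, "advanced": 0.00}

print(proficiency_index(school_a))  # 60.0
print(proficiency_index(school_b))  # 50.0
```

Unlike a percent-proficient measure, which scores both schools at 40, the index separates them because every level of the achievement distribution contributes to the total.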
Why am I so concerned about such a seemingly esoteric design feature of accountability policy?
In short, incentives matter. Educators understandably pay attention to the design of accountability systems. Given the stakes involved, which can sometimes include reconstituting schools, it is not surprising that educators would shape their instructional responses to the policy's incentives, even when those responses are educationally unsound. Given the once-in-a-generation opportunity that a new federal law offers, it's important for its accountability policies to encourage good practices, not bad ones.
The letter, however, did not describe my ideal accountability system, which would be one based heavily on student achievement growth. That kind of policy would encourage schools to focus on growing ALL students, and it would be much fairer to schools that serve large proportions of low income and other historically underserved students. Unfortunately, such a policy would not be allowable under ESSA as I read the statute.
I sincerely hope that the Department takes my letter seriously, because such a simple change could have profound effects on the incentives in ESSA. Others have written letters that make the same or similar points, so there will be ample voices behind these policy changes. Regardless, it is encouraging that so many researchers, educators, and advocates are interested in improving accountability policy for the next generation.
Morgan Polikoff is associate professor of education at the University of Southern California. His areas of expertise include K-12 education policy; the Common Core standards; assessment policy; alignment among instruction, standards, and assessments; and the measurement of classroom instruction. He blogs at https://morganpolikoff.com, and this article was adapted from his earlier post.