
Between a Political Rock and a Statistical Hard Place


Some days, skoolboy feels bad for the hard-working folks in the New York City Department of Education. They’re caught between a political rock and a statistical hard place. The political rock is the New York State accountability system, which complies with No Child Left Behind’s requirements to test students annually in grades 3-8 in Mathematics and English Language Arts, and to classify students, based on their test scores, as Not Meeting Learning Standards (Level I), Partially Meeting Learning Standards (Level II), Meeting Learning Standards (Level III), or Meeting Learning Standards with Distinction (Level IV). The performance of students, and of subgroups of students, is then aggregated to assess each school’s progress toward the goal of 100% proficiency for all students by the year 2014. The mechanism for this is a series of grade-specific exams, with a broad (but arbitrary, as Dan Koretz explains in Measuring Up) standard-setting process that defines the scores on each exam corresponding to the four proficiency levels. Whatever a student’s scale score on the exam, he or she is classified into a particular proficiency level.
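The classification step itself is purely mechanical: a scale score is compared against grade-specific cutoffs. Here is a minimal sketch in Python; the cutoff values are invented for illustration and are not the actual New York State cutoffs, which vary by grade and year.

```python
# Hypothetical illustration of scale-score-to-proficiency-level classification.
# The cutoff values below are invented, not the real New York State cutoffs.

CUTOFFS = {  # grade -> (Level II cutoff, Level III cutoff, Level IV cutoff)
    4: (612, 650, 702),
}

def proficiency_level(grade: int, scale_score: int) -> int:
    """Return the proficiency level (1-4) implied by a scale score."""
    level_2, level_3, level_4 = CUTOFFS[grade]
    if scale_score >= level_4:
        return 4
    if scale_score >= level_3:
        return 3
    if scale_score >= level_2:
        return 2
    return 1

# Two students with very different scores land in the same level:
print(proficiency_level(4, 613))  # one point above the Level II cutoff
print(proficiency_level(4, 649))  # one point below the Level III cutoff
```

The point of the sketch is that all of the information between the cutoffs is discarded: a score one point above the Level II cutoff and a score one point below the Level III cutoff are reported identically.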

The statistical hard place is that the proficiency levels are only part of the story. The NYC DOE has found that the scale scores matter: a student whose scale score is halfway between the cutoffs for Level II and Level III, and who is therefore classified at Level II, has a higher probability of graduating from high school on time than a student whose scale score is right at the cutoff for Level II. The scale scores have predictive validity—that is, they predict educational outcomes that we think of as important—but they don’t have the political currency of the proficiency levels specified by the state and the federal government.

There’s no evidence, to skoolboy’s knowledge, that achieving a proficiency level on NCLB-style exams has any predictive validity over and above the scale scores on which the levels are based. (Another regression discontinuity design study waiting to happen.) But I’ll wager that it doesn’t.

Whether or not the state/NCLB proficiency levels matter, the NYC DOE is stuck. They have to pay homage to the state standards, even though their internal evidence shows that partial progress—“learning quite a bit,” in skoolboy’s terms—really does matter for students’ futures, and therefore is something that schools should be held accountable for.

And I don’t disagree. I would be comfortable (though not ecstatic) with school progress reports that used changes in scale scores to quantify how much students had learned from one year to the next, under two conditions: (a) if the exams were vertically linked, and (b) if the uncertainty in the estimates of school-level effects on the average change were taken into account. Neither of these conditions is met in the current New York City School Progress Reports.

Navigating the political rock and the statistical hard place is definitely a challenge, both rhetorically and in the construction of the School Progress Reports. Rhetorically, the DOE is obliged to argue that a student who is Level III in fourth grade and Level II in fifth grade has lost ground—that student has fallen off the sharp Level III cliff—because the state and federal accountability metrics treat this as a sharp discontinuity. But as a practical matter, the student may not have fallen off a cliff; rather, she may be just a little bit lower on a gradual hill in fifth grade than we’d like, but still higher on the hill than she was in fourth grade. And the DOE’s internal analyses document that anyone who is higher on the hill is better off than someone lower.
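The cliff-versus-hill point can be shown in a few lines: on a vertically linked scale, a student can drop a proficiency level between grades even while her scale score rises, because the Level III cutoff is higher in the higher grade. All cutoffs and scores below are invented for illustration.

```python
# Hedged illustration of the "cliff vs. hill" point. On a vertically linked
# scale, the Level III cutoff rises with grade, so a level drop need not
# mean a score drop. Cutoffs and scores here are invented.

LEVEL_3_CUTOFF = {4: 650, 5: 680}  # hypothetical grade-specific cutoffs

grade_4_score, grade_5_score = 668, 672

g4_level = 3 if grade_4_score >= LEVEL_3_CUTOFF[4] else 2
g5_level = 3 if grade_5_score >= LEVEL_3_CUTOFF[5] else 2

print(g4_level, g5_level)             # Level III in fourth grade, Level II in fifth
print(grade_5_score > grade_4_score)  # ...yet her scale score went up
```

By the accountability metric she fell off a cliff; by the scale score she climbed the hill.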

What’s the DOE to do? Well, it could continue to escalate the rhetoric directed toward its critics. (I note with alarm that the DOE went from calling me by my blogging name “skoolboy” on Monday to calling me “Professor Pallas of Teachers College” on Wednesday—whose alleged proclivity for giving A’s to all of his students will come as a surprise to many of them. What’s next? Examining my teeth?) Or it could speak honestly and openly about the challenge of incorporating political and technical realities into the School Progress Reports. I think readers know which path skoolboy recommends.

1 Comment

You have to feel sorry for the brain trust working on this. Up until this week, if you told the American public that you had hired a group of really smart economists to develop some complicated statistical models that explain the entire universe, many people would have been very impressed. But that was before those really smart economists crashed the entire global economy.

People in education may not know this, but one of the main reasons these big companies are in so much trouble is the fact that, in 2004, the SEC rules changed to allow companies to take on more debt. Instead of fixed rules for debt to assets ratios, companies hired economists to create computer models to predict risk. Based on analyses of millions of data points, these models predicted that their companies should be allowed to borrow more and more and more money. Anyone with common sense would have said, "That sounds like a really bad idea." But the economists had science and statistics and computer models to prove it would be OK. The fundamentals were sound. Little to no risk. Yup, yup, yup.

So, this very same week, we have economists reporting a model of student achievement. To be blunt, the results make no d*mn sense to anyone. Schools shouldn't be jumping all over the place in terms of progress scores or achievement. That is simply not reality. A school that moved up should be proudly proclaiming how they spun gold from wet straw, but the principal says nothing changed. What strange statistical model is spitting out this nonsense?

I suppose you could say that it looks like they have taken a normalized distribution of residuals, standardized within the peer index classification, and then assigned letter grades based on -2, -1, 0, 1, and 2 standard errors. Or is it simply a predicted change score with peer grouping as a covariate? I don't know. You can do some fancy stuff with expensive computers. But recoding statistical noise into a letter grade does not an accountability system make.
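The recode the commenter is guessing at is easy to sketch. Assuming, purely for illustration, that each school’s residual is standardized within its peer group and then binned into letter grades, the mechanism looks like this. The bin cutoffs below (-1.5, -0.5, 0.5, 1.5) are invented: the comment’s “-2 … 2” would yield six bins for five grades, so this is just one plausible reading.

```python
# A sketch of the guessed-at mechanism: standardize residuals within a
# peer group, then recode the z-score into a letter grade. The cutoffs
# are invented for illustration.
import statistics

def letter_grades(residuals):
    """Map a peer group's residuals to A-F via within-group z-scores."""
    mean = statistics.mean(residuals)
    sd = statistics.stdev(residuals)

    def grade(z):
        if z >= 1.5:
            return "A"
        if z >= 0.5:
            return "B"
        if z >= -0.5:
            return "C"
        if z >= -1.5:
            return "D"
        return "F"

    return [grade((r - mean) / sd) for r in residuals]

# Even if these residuals differ only by statistical noise, standardizing
# within the group spreads the schools across the letter-grade range:
print(letter_grades([2.1, 0.4, -0.3, -1.8, 0.9, -0.2, 3.0, -2.5]))
```

Note what the recode guarantees: because the grades are assigned relative to the peer group, some schools will always land at the top and bottom of the scale, whether or not the underlying differences are meaningful—which is exactly the commenter’s complaint.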

Now that the results are out and the validity in question for a second year, critics are being told: "Statistics don't lie." But ordinary people know that companies shouldn't be allowed to borrow 30 times more than what they own so they can gamble the money on other stocks. You don't need a Ph.D. in economics to know what that smells like. New York's "Progress Reports" have the same foul stench. Maybe everyone will feel differently about it in 10 or 20 years when we have forgotten the damage done, but right now something about value-added models seems just a bit too...risky.
