
# School Progress Grade Effects on NYC Achievement: Tame, Fierce, or a Hot Mess?

skoolboy ventured into the rarefied air of NYC’s Harvard Club yesterday to hear Marcus Winters present his new Manhattan Institute research on the effects of the 2006-07 New York City School Progress Reports on students’ 2008 performance on state math and English tests in grades four through eight. The analysis uses a regression-discontinuity design, capitalizing on the fact that schools received a continuous total score summarizing their performance on school environment (15%), student performance (30%) and student growth (55%), while firm cut-offs on that score distinguish schools receiving an F from those receiving a D, those receiving a D from those receiving a C, etc. This means that there might be schools on either side of a given cut-off that are very similar in their total scores, and presumably in other school characteristics, allowing researchers to study the test-score consequences of obtaining a specific letter grade.
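For readers who haven't met a regression-discontinuity design before, here is a minimal sketch of the idea on made-up data. It is not Winters's specification: the cut-off of 30 points, the 5-point bandwidth, the outcome scale and the built-in 3-point boost are all hypothetical, and the helper names `fit_line` and `rd_jump` are invented for illustration. The sketch fits a separate straight line to schools just below and just above the cut-off and reads off the gap between the two lines at the cut-off itself.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_jump(scores, outcomes, cutoff, bandwidth):
    """Gap at the cutoff: predicted outcome just below minus just above."""
    pairs = list(zip(scores, outcomes))
    below = [(s, y) for s, y in pairs if cutoff - bandwidth <= s < cutoff]
    above = [(s, y) for s, y in pairs if cutoff <= s <= cutoff + bandwidth]
    a_lo, b_lo = fit_line(*zip(*below))
    a_hi, b_hi = fit_line(*zip(*above))
    return (a_lo + b_lo * cutoff) - (a_hi + b_hi * cutoff)


# Hypothetical schools: total scores from 20.0 to 40.0 in half-point steps.
CUTOFF = 30.0      # assumed boundary between two letter grades
TRUE_BOOST = 3.0   # assumed test-score boost for landing below the cutoff

scores = [20.0 + 0.5 * k for k in range(41)]
outcomes = [650.0 + 0.4 * s + (TRUE_BOOST if s < CUTOFF else 0.0) for s in scores]

estimate = rd_jump(scores, outcomes, CUTOFF, bandwidth=5.0)
print(round(estimate, 6))
```

Because the fake data contain no noise, the estimate recovers the built-in boost. With real schools the lines are fit to noisy data, and the answer can move around depending on the bandwidth and the functional form chosen.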

The two tables below summarize the impact of the Progress Report grades on student math and English proficiency, respectively. Both tables contrast the consequences of getting an A, B, D or F with a reference category, a C grade. A green up-arrow indicates that students in a school that received a particular Progress Report Grade did better than students in C schools, whereas a red down-arrow indicates that students did worse than students in C schools. An X indicates that student performance did not differ significantly from that of students in C schools at the p<.05 level.

There’s a lot of X’s. In math, students in F schools did better than students in schools receiving higher grades, although this seems to be primarily due to an effect in grade 5. Students in D schools also did better than those in schools receiving higher grades, also due to their advantages in grade 5, apparently. In English, the letter grade a school received did not have any consequences for student performance.

Although Winters and discussant Jonah Rockoff were careful to note the limits of the analyses and of what they can tell us about the incentive effects of accountability systems, both characterized the results as pretty clear evidence that schools reacted to receiving an F or a D in ways that boosted student achievement. This was particularly noteworthy, they argued, because so little time had elapsed between when a school learned that it had received a D or F and when students were tested: January for English, and March for mathematics.

Well, yeah, the short time between receiving the grade and the testing is certainly an issue, and surfaced as the likely explanation for why no effects of the School Progress Report grades were found in English. But skoolboy is still worried about math. There were no statistically reliable consequences for getting a D or an F in grades 4, 6, 7 and 8; only in grade 5 is there a test-score boost. How are we to make sense of this? If the letter grades are such a powerful incentive, wouldn’t they affect the performance of students in all of the grades in a school, not just fifth-graders?

Cool person Amy Ellen Schwartz posed a very smart question from the audience. "What about those A and B schools doing worse than the C schools in 5th grade math? What does that mean?" she asked. The panelists didn’t want to address that head-on, in skoolboy’s view, but he will: Looking at 5th grade mathematics, there’s as much evidence of the receipt of an A or a B causing a school to coast as there is evidence of the receipt of a D or an F causing a school to be more productive. Probably not a popular interpretation among the true believers in the power of incentives in the room.

But the bigger story is one of what Winters called "tame" effects. No effects of the School Progress Report grades in English, and limited evidence of effects in Math. A short time-horizon between the “treatment” of receiving the grades and student testing. Ambiguous incentives, both positive and negative, associated with the grades. A very weak theory of how the grades would be expected to increase student performance. It’s a wonder that Winters found anything at all.

A last point: Winters suggested that there were dire predictions that schools would "give up" if they got low Progress Report grades, and his findings, he said, did not show that. Although there were editorials at the time of the initial release of the Progress Reports last fall expressing concern that schools might be stigmatized by getting a C, D or F when students were performing at generally high levels, I question whether anyone thought that schools, and the educators who work in them, would "give up." The more predictable reaction, which I think was borne out, was that principals, teachers and parents would simply not believe the Progress Report grades accurately characterized what they saw on a day-to-day basis. A lot of stakeholders don’t believe that the Progress Report grades are reliable measures of school performance, and given what eduwonkette and I have shown about the instability in the student progress measures at the heart of the system, those beliefs are well-founded.

A brief version of the research can be found here. The technical version is now available at the same location.

When he says there's a boost in 5th grade math scores, is he comparing this year's 5th graders to last year's 5th graders, or this year's 5th graders to themselves when they were in 4th grade last year?

Also, what's the dependent variable -- did he compute a gain score using a test that isn't vertically equated, or is he using overall proficiency?

Lastly, I've also never heard the theory that schools would give up if they received a low grade -- I'm not really sure why or how that would happen.

There's a cute video making the rounds showing what a stop-sign would look like if it was designed by a big corporation.

I think there is plenty of room to create a much funnier video showing what a stop-sign would look like if it was designed by the educational community.

Corey: The analysis regresses student i's score in school s at the end of year t on a cubic function of the student's prior year test score and a bunch of covariates. The analyses that pool students across grades are doing something I wouldn't recommend, given the lack of vertical equating of the test across grades.
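Written out, one plausible reading of that description is the equation below. The notation is an assumption made for illustration, not taken from the paper: $Y_{ist}$ is student $i$'s score in school $s$ at the end of year $t$, $Y_{i,t-1}$ is the same student's prior-year score, $G_{gs}$ is an indicator for school $s$ having received letter grade $g$ (with C as the omitted reference category), and $X_{ist}$ stands in for the covariates.

```latex
Y_{ist} = \beta_0 + \beta_1 Y_{i,t-1} + \beta_2 Y_{i,t-1}^{2} + \beta_3 Y_{i,t-1}^{3}
        + \sum_{g \in \{A,B,D,F\}} \gamma_g \, G_{gs}
        + X_{ist}'\delta + \varepsilon_{ist}
```

In this form each $\gamma_g$ is the estimated difference from C schools, and the pooling problem is that $Y_{ist}$ and $Y_{i,t-1}$ are treated as if they sat on one common scale across grades.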

It seems to me that the statistics here are even more marginal than they first appear...

They make 48 independent measurements, and 6 of them appear to be significant at the 95% confidence level -- did I get that part right?

What are the chances of that happening even if there were no actual underlying effect?

Rachel: I wouldn't characterize the analyses pooling grades 4 through 8 as independent of grade-specific regressions, since the data in the grade-specific regressions also appear in the pooled all-grades regression. So it's basically 4 coefficients out of 40 that are reliably different from zero, which could easily be due to chance.
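Rachel's question can be put as a back-of-the-envelope binomial calculation. The sketch below makes an assumption that is certainly too strong, namely that every test is independent and that each has exactly a 5% chance of a false positive when there is no real effect. The function name `prob_at_least` is made up for illustration.

```python
from math import comb


def prob_at_least(k, n, alpha=0.05):
    """P(at least k of n independent null tests reject at level alpha)."""
    return sum(
        comb(n, j) * alpha**j * (1 - alpha) ** (n - j)
        for j in range(k, n + 1)
    )


# Rachel's count: 6 significant results out of 48 tests.
p_rachel = prob_at_least(6, 48)

# skoolboy's count once the pooled regressions are set aside: 4 out of 40.
p_skoolboy = prob_at_least(4, 40)

print(f"6 or more of 48 by chance: {p_rachel:.3f}")
print(f"4 or more of 40 by chance: {p_skoolboy:.3f}")
```

Under those assumptions the first probability comes out at roughly 3 percent and the second at roughly 14 percent. The coefficients within a grade share the same students and the same reference category, so they are not really independent and these figures are only a rough guide.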

The Manhattan Institute is to be commended for this research. Researchers are seldom willing to do the extensive data preparation necessary for this kind of work: collapsing, combining, weighting, and rescaling scaled, collapsed, and weighted variables over and over again. Just organizing a regression with so many control variables can be difficult. They are to be applauded for their willingness to report that, time and time again, the Progress Report results had no effect on student performance. This research reflects the kind of transparency long overdue in educational accountability.

However, despite their meticulous attention to detail, these researchers may have overlooked a basic relationship in the data. See their Table 3. Look at the data by letter grade from 2006-2007. Consider just two variables: Overall Progress Report Score and Percent Black. Using scientific statistical methods, we can investigate these data. Let me modestly propose, in keeping with the rigor of educational accountability research, that this little data set is most certainly worthy of intensive scrutiny. Below is the table, reformatted.

| Grade | Overall Progress Report Score | Percent Black |
|---|---|---|
| F | 23.6 | 45.5% |
| D | 35.0 | 44.5% |
| C | 44.7 | 36.3% |
| B | 56.6 | 32.2% |
| A | 72.6 | 26.4% |

Only a trained statistical eye could possibly see the trend.

A regression analysis employing Percent Black as a predictor and Overall Progress Report Score as the dependent measure results in an unadjusted r-square of 0.9586, a statistically significant effect (Pearson correlation = -.97910, p

A planned follow-up analysis indicates that the effect is particularly strong among the middle to high scoring schools. Subsetting the data to include only C or better schools (n=3), the regression model results in an unadjusted r-square of 0.9998 (Pearson correlation = -0.99990, p

146.99002 + (-2.81423 x Percent_Black) = Overall_Progress_Report_Score

Based on observed data, these estimates are highly accurate to within one half of one percent across the three grades.

| Grade | Percent Black | Observed Overall Score | Predicted Overall Score (rounded to 2 decimal places) |
|---|---|---|---|
| C | 36.3 | 44.7 | 44.83 |
| B | 32.2 | 56.6 | 56.37 |
| A | 26.4 | 72.6 | 72.69 |
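Anyone who wants to check the arithmetic can refit both regressions from the numbers quoted above. This is only a sketch: the helper `ols` is written out by hand for illustration, and the inputs are the five letter-grade averages from the comment (n=5, then n=3), not school-level data.

```python
from math import sqrt


def ols(xs, ys):
    """Simple least squares of y on x; returns (intercept, slope, pearson_r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy / sqrt(sxx * syy)


# Letter-grade averages as quoted in the comment above.
percent_black = {"F": 45.5, "D": 44.5, "C": 36.3, "B": 32.2, "A": 26.4}
overall_score = {"F": 23.6, "D": 35.0, "C": 44.7, "B": 56.6, "A": 72.6}


def fit(grades):
    """Fit overall score on percent black for the chosen letter grades."""
    xs = [percent_black[g] for g in grades]
    ys = [overall_score[g] for g in grades]
    return ols(xs, ys)


intercept_all, slope_all, r_all = fit(["F", "D", "C", "B", "A"])
intercept_top, slope_top, r_top = fit(["C", "B", "A"])

print(f"all five grades: r = {r_all:.5f}, r-square = {r_all ** 2:.4f}")
print(f"C or better:     r = {r_top:.5f}, r-square = {r_top ** 2:.4f}")
print(f"C or better:     score = {intercept_top:.5f} + ({slope_top:.5f} x Percent_Black)")
```

The refit should land on the same correlations and coefficients quoted in the comment, to within rounding. With three points a nearly perfect straight line is not hard to come by.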

Given that the far majority of the schools in New York are awarded C or better scores, this equation will allow us to predict with high accuracy the Overall Progress Report Score of almost any school using only the percent of students who are black.

Who needs ARIS when you have a pencil? Imagine the money we could save. With the budget crisis upon us, the cost savings could be used to fund enough independent research organizations to provide full employment for every economist in Manhattan. With more time and more money, just imagine what an entire building full of Ph.D.'s might discover.

An alternate hypothesis which may explain a lot of the failure of American public education systems in general to respond to objective negative numbers with significant improvement:

The industry is systemically and pervasively incompetent to do what it is expected to do because it has - successfully - ignored research from relevant non-education fields (psychology, neuroscience, psychiatry, social/group sociology/psychology, etc.) for dogs' years while failing to allow or perform its own rigorous education industry-limited research, and where there is education industry-limited valid research, it has successfully rejected all efforts to mandate significant use of same.

Educrats and speducrats know what they know. And it just isn't good enough. Time to break down the artificial and highly fictitious boundaries they have created between human life in the rest of the world and that in the allegedly unique, rarefied world of an American public school. Cognition is cognition; behavior is behavior; group dynamics is group dynamics. Surprise! These have all been studied - well - elsewhere. Time to bring those lessons into the school environment.

> A planned follow-up analysis indicates that the effect is particularly strong among the middle to high scoring schools. . . .
>
> Given that the far majority of the schools in New York are awarded C or better scores, this equation will allow us to predict with high accuracy the Overall Progress Report Score of almost any school using only the percent of students who are black.

That's a good analysis; however, if we look at home-buying patterns for parents with children who can help schools achieve high ranking, we see that these parents intuitively seem to gravitate to neighborhoods with low diversity features. I wonder what rule of thumb they're using?
