
More Signs of the Apocalypse!

Here's my take on the New York tenure law discussion going on around the blogs:

1) The backdoor process was unsavory, and now threatens to displace an important discussion about the limits of value-added measures in New York. Sherman Dorn offers some fertile thoughts on the process issue. Also worth noting that last week's outragists were hardly outraged about the secrecy surrounding NYC's teacher experiment.

2) Critics would do well to separate the likely effects of this law from their unhappiness with the process. Consider Robert Gordon's post, which interprets the law's effects as follows:

This means that in deciding whether to give a teacher a presumptive right to teach for 30+ years, a principal may not consider evidence of whether the teacher is helping students learn. The principal can consider whether the teacher maintains neat bulletin boards, whether the teacher attends meetings on how to pay for pencils, and whether the teacher is sufficiently deferential in the hallway. But the principal may not consider, based on achievement data, whether children are learning.

Do classroom observations provide no "evidence of whether the teacher is helping students learn"? Value-added measures, after all, are simply a proxy for student learning, and observations also provide proxy data on student learning. Gordon assumes that principals cannot identify teachers with especially low value-added in the absence of test score data. But if value-added measures mean anything, very low performers should be getting poor subjective evaluations too. It turns out that principals are actually pretty good at identifying teachers with low value-added based on subjective evaluations (see this post). If a teacher is a consistent low performer, the three admissible forms of evidence in tenure decisions - 1) observations, 2) peer review, and 3) an evaluation of how teachers use data to inform instruction - already provide lots of information about how teachers affect student learning.

3) To my knowledge, no one has provided a viable technical solution to the midyear testing issue. Given existing problems with value-added and the added complication of midyear testing dates, it would be wildly irresponsible to put these measures into place in NY without further study.

If you want new reasons (not related to testing dates) to sweat about the fallibility of value-added, check out this paper, which was presented last weekend at AEFA by Tim Sass (in collaboration with RAND's J.R. Lockwood and Dan McCaffrey). They looked at the year-to-year stability of value-added estimates in Florida, and found that it's often the case that teachers who are in the bottom 20% of value-added estimates in one year are not in the bottom 20% the next year. In Broward County, only 41.4% of teachers who were in the bottom 20% in one year were in the bottom 20% the next year, too. In Orange County, only 31.7% of the teachers who were in the bottom 20% in one year were also there the next year!
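To see how measurement noise alone can produce this kind of churn, here is a minimal simulation sketch in Python. It is not the model used in the Sass, Lockwood and McCaffrey paper. It assumes each teacher has a fixed "true" effect, that each year's estimate adds independent normally distributed noise, and that `reliability` (a hypothetical parameter, like the function name itself) is the share of the variance in the estimates that comes from the true effect.

```python
import random


def bottom_quintile_persistence(n_teachers, reliability, seed=0):
    """Share of year-1 bottom-20% teachers who are also bottom-20% in year 2.

    Illustrative assumptions (not taken from the paper):
    - each teacher has a fixed true effect, drawn from a normal distribution;
    - each year's estimate is the true effect plus independent normal noise;
    - `reliability` is the fraction of estimate variance due to the true
      effect (0 = pure noise, 1 = no noise).
    """
    rng = random.Random(seed)
    true_sd = reliability ** 0.5
    noise_sd = (1.0 - reliability) ** 0.5

    true_effects = [rng.gauss(0.0, true_sd) for _ in range(n_teachers)]
    year1 = [t + rng.gauss(0.0, noise_sd) for t in true_effects]
    year2 = [t + rng.gauss(0.0, noise_sd) for t in true_effects]

    cutoff = n_teachers // 5  # size of the bottom 20%

    def bottom(scores):
        # Indices of the teachers with the lowest estimates in a given year.
        ranked = sorted(range(n_teachers), key=lambda i: scores[i])
        return set(ranked[:cutoff])

    first, second = bottom(year1), bottom(year2)
    return len(first & second) / len(first)


if __name__ == "__main__":
    for r in (0.0, 0.25, 0.5, 0.75, 1.0):
        share = bottom_quintile_persistence(20000, r)
        print(f"reliability={r:.2f}  persistence={share:.3f}")
```

Under these assumptions, pure noise puts about one in five bottom-quintile teachers back in the bottom quintile by chance, and noise-free estimates put all of them back. Where real estimates fall between those two extremes is an empirical question that the sketch cannot answer.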

Update: Robert Gordon cherrypicks a finding from the Jacob and Lefgren paper to make his point. Perhaps if he'd read beyond the abstract and looked at the magnitude of the value-added advantage over principal ratings in predicting future student achievement (a whopping .036 SD in reading and .074 SD in math), he would realize that all is not lost. And again, this minuscule value-added advantage is coming from the middle of the distribution, not the top and bottom - and the bottom is the relevant issue in tenure decisions. From the same paper:

While value-added measures of teacher effectiveness generally do a better job at predicting future student achievement than principal ratings, the two measures do about equally well in identifying the best and worst teachers. With regard to parent satisfaction, we find that a principal’s overall rating of a teacher is a substantially better predictor of future parent requests for that teacher than either the teacher’s experience, education and current compensation or the teacher’s value-added achievement measure.

Moreover, what kind of predictive advantage can we expect inaccurate/noisy value-added estimates to have over principals' evaluations?

I look forward to some ridiculous cases in which the courts are asked to interpret the meaning of the phrase "student performance data." If the New York state legislature wants to preclude administrators making tenure decisions based on standardized test scores, they should say so explicitly. I recently attended an event in which student artwork was on display, and a music teacher directed a small ensemble of children performing. That artwork and the students' performances are student performance data, and I'd like to think that their teachers had something to do with the quality of the work. The parents certainly thought so.

(A brief pause here for Andy Rotherham, Charlie Barone and their minions to say, "Aha! No evidence that NCLB chased art and music out of that school!")

SB -

Thanks for saying it for us!

Welcome to the minions.

I think you're getting the idea.

--- Charlie

Point #2 is hard to take seriously, given the ferocity with which the anti-accountability crowd protests any time someone wants to give a principal any sort of evaluative or, God forbid, hiring/firing responsibility. Principals, we are to believe, are all incompetent and capricious demagogues who will wield any smidgen of power with the malevolence of a robber baron. Now the story changes: who needs objective measures of accountability when we have teacher observations?

The smugness of the anything-but-accountability folks as they have their anti-testing cake and wolf it down is hardly bearable when the stakes are so high. To pretend that everything is hunky dory in unionized teacher land as long as the principal is doing the evals is disingenuous at best.

Socrates, I don't see this as an "anything but accountability" argument. My point in #2 is simply that principals/APs are not operating in the dark, as some pols/bloggers would have us believe.

As is clear from point 1, I'm no more happy with the process than you are. But I don't think you're acknowledging the difficulty of producing these "objective measures" even when we have September-June tests, and NY's testing schedule makes these issues even more complicated.

Yeah, I'm sorry about that - my comments were more directed at the typical opponents of accountability than they were at you. You accurately state the limits of testing, but the majority of those in the blogosphere who support point #2 don't address said limitations with much nuance. For them (and for all the knee-jerk anti-NCLB-ites), testing is bad, period. And similarly, principal evaluations are bad, period. Thus, the argument goes, all accountability is suspect and should be completely abolished in the name of protecting teachers.

I agree with you that the mid-year testing dates are preposterous, and that testing as a single measure of accountability is similarly problematic. Most of those who support this legislature's utterly absurd determination, though, don't approach the issue with any degree of subtlety; in their minds, the only good teacher evaluation is no evaluation. Clearly, what we need is good, objective measures of accountability to be used alongside the more subjective measures that provide color to the stark numbers we get from test scores.

Chris Cerf presented the NYC DOE's value-added work to the Panel for Educational Policy, the body responsible for approving policy decisions for the city's public schools. As the Manhattan representative, I've not taken a position on the value-added work, mostly because my requests to see the technical information, including the specifics of the designed test, have not been granted.

That said, I can understand the teachers union's point of view. The Bloomberg administration's DOE has no credibility with regard to the use of data and statistics. Data is routinely manipulated and deceptively presented to bolster the administration's policy agenda. My biggest complaint is with how parent survey results were manipulated to purport to show class size was not the primary concern of parents. But you only have to look at the previous post on this blog to see another example. The DOE press release heralded the new gifted and talented policy as a major step forward for closing the achievement gap, while even a cursory review of acceptance numbers showed lower-income districts fell further behind.

The question is not whether "all is lost" without value-added data. The question is whether value-added data contribute useful information or should be categorically barred from use.

Similarly, the question is not whether it is possible to make an accurate subjective judgment without value-added data. The question is whether complementing subjective judgment with value-added data is likely to yield a more accurate conclusion. Common sense suggests that the more information you have, the more likely you are to be accurate.

We are not talking about pornography here. (Not to imply that you would advocate banning pornography.) We are talking about data on student performance. Why is it okay to ban using student performance data to inform decisions affecting student performance? For goodness sake!

By the way, I discussed the Jacob & Lefgren paper at some length on a panel at CAP two years ago. The sharpest critic there argued that the paper showed subjective ratings are "like chance" and not to be trusted.

Take care.

Mr. Gordon: eduwonkette is on record that she is not opposed to the use of value-added data for decision-making in schools. Trust me, she likes value-added data. And I think she'd agree with me that the New York state legislature ban on the use of any student performance data in the making of tenure decisions is ridiculous. (But she can speak for herself, and I'm sure she will.)

But here's the thing, and I'm delighted that you've weighed in. The only value-added system currently under development in New York City is fundamentally flawed, for the reasons that eduwonkette has discussed -- especially the reliance on mid-year testing. I have yet to hear any defense of the NYC value-added system, let alone a persuasive defense. Are you, as a consultant to the NYC DOE, prepared to offer one? The issue in New York City is not value-added in general -- it's the particulars of this system.

And may I encourage you to encourage the NYC DOE to do a better job of explaining the value-added system under development to teachers, parents, principals, researchers, and the general public? Performance evaluation systems have the best chance of being perceived as legitimate by diverse stakeholders if they are perceived to be fair. Stakeholders can only judge the fairness of an evaluation system if they understand what goes into the evaluation. Right now, the biggest obstacle to perceiving the NYC system as fair is that a teacher's estimated value-added depends on both the performance of the teacher her students had in the previous year and the performance of the teacher her students will have in the following year. That's not fair!

Robert Gordon is the author of the highly flawed "fair student funding" system, which, in the name of equity, would have cut the budget of half of the failing schools in NYC by an average of $400,000.

Similarly, every formula this administration has come up with has been simplistic and statistically illiterate -- and without any awareness of its damaging effects on schools, teachers and students.

Take the school grading system and the merit pay scheme -- with more than half of the grade or reward based upon gains or losses of one year's worth of test scores alone at the school level, which nearly every expert has said is statistically invalid.

So Gordon is asking us to trust the DOE to use this test score data responsibly in conjunction with other information in making tenure decisions?

They are the last people I'd trust to use any sort of data in a transparent, reliable manner, since they get this stuff wrong nearly 99% of the time. It's like giving a gun to a convicted serial killer.

Fool me once, shame on you; fool me twice, or three times, or four times, or in this case, ad infinitum -- shame on me.

The finding you selected from the paper of Tim Sass, J.R. Lockwood, and Dan McCaffrey is just what we all hope for in the best of systems: Deficient performers are unobservable the following year, after choosing other careers; or they are coached, mentored, improved, matured, etc into becoming better teachers.
I'm dubious of these performance assessment methods as well, but I would choose as evidence of the random error leading to regression the fact -- if it is so -- that large fractions of the BEST teachers, as indicated by these metrics, drop out of that top tier the following year. Very effective teachers should be revealed as effective year after year. Even policy makers would have to concede that the assessment methods are worthless if the performance statistics say otherwise.

But all of that is modeling-theoretical. Can't we learn to accept more about the fallibility of statistical models from the "surprises" on Wall Street? Good teachers, no differently from other good workers, run into problems, perhaps with a challenging mix of students; and they seek and get help, in good systems, in overcoming them.






