
It's Our Secret! The NYC Teacher Experiment

The NY Times reported yesterday on an ongoing experiment on teacher effectiveness in NYC schools. Principals in the treatment group (140 schools) receive extensive value-added information on each teacher, and then are asked to evaluate the teachers. Principals in the control group do not receive these reports but also provide evaluations of their teachers. As far as I can tell, the goal is to determine how principals' evaluations are affected by having access to value-added data. By the summer, the NYC DOE will decide how these data will be used, and Deputy Chancellor Chris Cerf has even suggested releasing individual teachers' effectiveness data publicly. You can watch this video for more information about the experiment.

While much could be said about the challenges of estimating reliable value-added measures for teachers or the move to use test scores as the primary measure of teacher effectiveness, I'll save those for later. (See more posts about measuring teacher effectiveness here.) Instead, I want to talk about the issue of research ethics in scientific experiments. It turns out that many teachers in participating schools have not been notified of the study.

Secret experiments have an odious history in science. The most notable example is the Tuskegee experiment, in which African-American men with syphilis were recruited into a study but not told of the purpose of the study or notified of their diagnosis. Their disease was left untreated so that researchers could track its progression. Once this experiment broke publicly, Congress passed legislation that, many commissions and administrative changes later, ultimately required universities receiving federal grants to form Institutional Review Boards to oversee all research. Human subjects policies require university researchers to receive the consent of all subjects and to make them aware of the potential risks of the study.

My point is not that the NYC experiment's secrecy is the moral equivalent of the Tuskegee experiment. The Department of Education is not bound by any university's human subjects policy, and it is their right to examine whatever data they please to produce new knowledge. (Note that the university researchers involved are bound by IRB standards if they plan to publish off of these data.) But the Hippocratic Oath of the research community - that subjects should be aware that they are part of a study - has been grossly violated. And it does not help the reputation or future of "scientifically based research" in education when studies are conducted in secret. Even if this were not a research study, a decent boss notifies employees when the criteria on which they are evaluated change.

Where is this going next? Notably, Cerf's suggestion that individual teachers' data should be publicly released has precedent in New York. The New York State Department of Health started collecting similar data on doctors' effects on mortality in the early 1990s. In 1991, New York Newsday filed a Freedom of Information request, which forced the Department of Health to publicly release doctor-level data. Since then, individual doctors' data have been publicly reported. Assuming the same Freedom of Information statutes apply to education, it may not be long before we can examine the "value-added scores" of NYC teachers while waiting for the C train to show up.

Back to data-driven decision making tomorrow.

David K. Cohen once wrote a paper on sociologist Willard Waller, whom he described as "hating school but loving education." I'd characterize the New York City Department of Education as loving data but hating research. The senior administrators are true believers in the power of data to drive decisions, but there is a remarkable lack of understanding of the fact that data don't speak for themselves. Real-world research is characterized by ambiguity and uncertainty, and not because researchers are incompetent or propelled by ideology. Where is this uncertainty represented in the school progress reports or teacher value-added reports?

The fact that the UFT knew about this for months but kept it quiet is as outrageous as the actions of the DOE.

Randi Weingarten's UFT has known about the program for months but kept quiet about it. She claims she did not know the specific schools, which we all know would have been easy for the union to find out so it could warn the teachers. And even if they couldn't find out, a public exposure at the time would have allowed teachers in all schools to confront their principals and ask point blank if they were part of the program. That would have forced the principals to tell them or basically lie to their faces. At the very least the UFT could have thrown a monkey wrench into the DOE's plans but chose the sounds of silence.

Therefore, view Randi Weingarten's words of outrage - I guess she wasn't all too outraged all these months - and promise to fight the plan as the usual empty words designed to obfuscate the issue and confuse the members.

So what is the "risk" here?

When I came across this article this weekend I was thinking: "I can't wait to hear Eduwonkette's take on the situation."

Skoolboy raised the idea that struck me most. The data produced by examining one teacher's classroom and comparing it to data from teachers with "similar students" present a very narrow view of what is actually going on in that classroom.

But clearly the most egregious problem is the secrecy involved. Teachers have a right to know they are being analyzed and compared and by what criteria. And then the powers that be wonder why teachers won't let go of tenure. It's situations like these that make people hold on to that practice. There has to be a better way to evaluate teachers than to simply let principals design their own measuring system. Possibly we could use peer or self review systems so that the process is less acrimonious.

I found it quite humorous that the principal interviewed in the article revealed that he is part of this study. I am sure he had a great day today.

The chapter leader at that school is excellent and works hard defending her members - she is also a very well respected teacher and works with the opposition (ICE) to Weingarten - not that it means much. We'll get some idea from her as to the reaction and report back.

Thanks, everyone, for commenting. Skoolboy, I'm with you on measurement error. As Sander said correctly in the article, we may be able to identify positive and negative outliers, but the idea of arraying teachers in a continuum goes against everything we know about estimating school and teacher effects.
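
To make the measurement-error point concrete, here is a minimal simulation sketch. It is not the DOE's model or any published value-added specification. It simply assumes that each hypothetical teacher has a normally distributed "true" effect and that the estimate adds independent normal noise of the same size. The function names, the 200-teacher sample, and the noise level are all illustrative assumptions.

```python
import random
import statistics


def simulate(n_teachers=200, true_sd=1.0, noise_sd=1.0, seed=0):
    """Return (true_effects, estimates) for hypothetical teachers.

    Each estimate is the true effect plus independent normal noise.
    All parameter values are illustrative assumptions, not NYC figures.
    """
    rng = random.Random(seed)
    true_effects = [rng.gauss(0.0, true_sd) for _ in range(n_teachers)]
    estimates = [t + rng.gauss(0.0, noise_sd) for t in true_effects]
    return true_effects, estimates


def ranks(values):
    """Rank positions (0 = lowest); ties are broken by original order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    position = [0] * len(values)
    for rank, index in enumerate(order):
        position[index] = rank
    return position


def mean_rank_shift(true_effects, estimates):
    """Average number of places a teacher moves between true and estimated rank."""
    pairs = zip(ranks(true_effects), ranks(estimates))
    return statistics.mean(abs(a - b) for a, b in pairs)


def top_decile_above_median(true_effects, estimates):
    """Share of the top 10% by estimate whose true effect beats the true median."""
    n_top = max(1, len(estimates) // 10)
    by_estimate = sorted(
        range(len(estimates)), key=lambda i: estimates[i], reverse=True
    )
    median_true = statistics.median(true_effects)
    hits = sum(1 for i in by_estimate[:n_top] if true_effects[i] > median_true)
    return hits / n_top


if __name__ == "__main__":
    true_effects, estimates = simulate()
    print("mean rank shift:", mean_rank_shift(true_effects, estimates))
    print("top decile truly above median:",
          top_decile_above_median(true_effects, estimates))
```

Under these assumptions one would expect most teachers in the estimated top tenth to be genuinely above the median, while the average teacher's position in the full ordering still moves by many places. That is the gap between flagging outliers and arraying everyone on a continuum. How large the gap is depends entirely on the assumed noise level, which this sketch does not estimate from any real data.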

Norm, as I understand it, this went from being announced as an academic study - with no consequences for anyone - to a policy proposal. If it was just a study, perhaps widespread notification was unnecessary - but as Leonie Haimson suggested when she sent me this video in September, we all knew where this could go. And it did.

Stuart, the risk is that teachers' future employment, as well as the current relationship they have with their administrators, is fundamentally endangered by participation in the experiment. By any IRB's standard, this is a major risk; I am certain that my institution's IRB would not approve a protocol that did not involve notification of study participants. If the study investigators could assure that this information would be unassailably "correct," this might be less of a problem. However, the estimation of teacher effects is complicated by the fact that model specification makes a big difference.

Marnie, I agree that test scores provide a very narrow view of teacher effectiveness. As Jesse Rothstein demonstrated in this paper, gain scores are very dangerous when students aren't randomly assigned to teachers.


Keep the comments coming!

At the risk of condemnation from many in the educational establishment, I believe Bloomberg is onto something with the collection of this data. However, in no way do I condone his clandestine methodology. What was he attempting to hide?

From a pragmatic view, what teacher worth a dime would shy away from the opportunity to be judged on a merit pay system?

Additionally, what parent would willingly place their child in a class where the teacher was reluctant to be judged on quantifiable, objective results?

For more on this topic try "googling" William Sanders, University of Tennessee, value-added assessments. You might find the NYTimes piece left out some noteworthy information regarding this practice which will be coming to a school district near you, soon.

Principals in the treatment group (140 schools) receive extensive value-added information on each teacher, and then are asked to evaluate the teachers.

Even from a research point of view, I don't get this. What do they think they are going to learn from comparing the teacher evaluations of principals who get the data and those who don't?

If I were developing a hypothesis for the outcome of that experiment, it would be that principals' evaluations in the "with data" case will tend to align with the data they have. Would that tell me anything about the usefulness of the data? I don't see how.

A more (to me) interesting question would be to compare the evaluations principals give in the absence of the data with the data themselves. How well does a principal's subjective evaluation correlate with test-score results? How consistent is it from principal to principal? What qualities do principals seem to value that don't correlate with test score "value-added?"

Is there something about the DOE's "research" that I'm missing??

Unfortunately there are far too many education research studies in which the participants are not informed. Many of these fall in the "action research" category.

Currently many students in education programs (including the one from which I graduated) are required to complete an "action research" project. This is touted as a counterbalance to--or even protection against--the shortcomings of quantitative research. Under "action research," just about anything qualifies as data. Moreover, the researcher is directly involved in the environment under study, and his or her current and future actions may be influenced by the results of the research.

For example: a teacher might conduct an "action research" project on the effect of positive phone calls to parents on homework completion rates. If it appears that positive phone calls home result in more homework turned in, then the teacher may make far more of those positive phone calls than he or she would otherwise.

Since anything qualifies as "data" (including the researcher's own observations, reflections, and impressions), there is no guarantee that the outcome will diverge substantially from what the researcher expects or wants. Moreover, since it is a "reflective" form of research, the researchers (mostly classroom teachers) are not required to obtain consent from their students, even if the students' work, comments, actions, facial expressions, and other characteristics will form a large part of the "data."

Those in favor of "action research" argue that the main purpose is for teachers to reflect on their own practice, and that the results will not be used for anything else; therefore it is not necessary to inform all participants. Unfortunately, such laxity encourages both methodological and ethical carelessness. Many education students want nothing more than to get the project over, get the grade, and get back to teaching without all the extra pressure.

(I completed one of those "action research" projects, with great reservations. I put substantial effort into it, and devoted a section of the essay to ethical concerns. It proved worthwhile for me, and may have helped improve my teaching. That said, the concerns did not go away.)

I do not have qualms about qualitative research per se. But when it is sloppy or superficial, it gives all the more power and credence to narrowly conceived quantitative research. "Look at what the fuzzy folks are doing," the D3M warriors will chortle. "It doesn't even hold up as research." The "fuzzy" crowd may respond by pointing at the holes in quantitative research; neither side acknowledges the other.

(The DoE, for its part, has concocted a dangerous mixture of the fuzzy and the concrete: the tests are rather fuzzy in what they test, yet the results are presented as hard numbers. But that's another story.)

Ironically, it may be in the DoE's interest to allow "action research" to proliferate among novice teachers. It undermines the validity of qualitative research (go D3M!), while at the same time loosening the ethical standards around educational research in general. Also, it makes way for the "inquiry teams" that are now mandated at schools. There's so much "research" going on at once that we take it for granted.

We need high methodological and ethical standards for both qualitative and quantitative research. Both types are needed in education. If one type loses its reputation, the other assumes unnatural power. Moreover, questionable research practices can prove contagious. We are all research-practitioners; everything is data; and no one needs to be informed. Take those premises to their logical conclusion, and you end up with a surveillance society that calls itself introspective (or maybe "value-self-added") and believes its own jargon.

This is scary because I feel like it would be one more freedom being infringed upon. This hasn't happened yet in SC but I could imagine the powers that be liking the idea. For anyone who is not from NY and thinking -oh, that doesn't apply to me- it could be heading your way!




