
It's Our Secret! The NYC Teacher Experiment

The NY Times reported yesterday on an ongoing experiment on teacher effectiveness in NYC schools. Principals in the treatment group (140 schools) receive extensive value-added information on each teacher, and then are asked to evaluate the teachers. Principals in the control group do not receive these reports but also provide evaluations of their teachers. As far as I can tell, the goal is to determine how principals' evaluations are affected by having access to value-added data. By the summer, the NYC DOE will decide how these data will be used, and Deputy Chancellor Chris Cerf has even suggested releasing individual teachers' effectiveness data publicly. You can watch this video for more information about the experiment.

While much could be said about the challenges of estimating reliable value-added measures for teachers or the move to use test scores as the primary measure of assessing teacher effectiveness, I'll save those for later. (See more posts about measuring teacher effectiveness here.) Instead, I want to talk about the issue of research ethics in scientific experiments. It turns out that many teachers in participating schools have not been notified of the study.

Secret experiments have an odious history in science. The most notable example is the Tuskegee experiment, in which African-American men with syphilis were recruited into a study but not told of the purpose of the study or notified of their diagnosis. Their disease was left untreated so that researchers could track its progression. Once this experiment broke publicly, Congress passed legislation that, many commissions and administrative changes later, ultimately required universities receiving federal grants to form Institutional Review Boards to oversee all research. Human subjects policies require university researchers to receive the consent of all subjects and to make them aware of the potential risks of the study.

My point is not that the NYC experiment's secrecy is the moral equivalent of the Tuskegee Experiments. The Department of Education is not bound by any university's human subjects policy, and it is their right to examine whatever data they please to produce new knowledge. (Note that the university researchers involved are bound by IRB standards if they plan to publish from these data.) But the Hippocratic Oath of the research community - that subjects should be aware that they are part of a study - has been grossly violated. And it does not help the reputation or future of "scientifically based research" in education when studies are conducted in secret. Even if this were not a research study, a decent boss notifies employees when the criteria on which they are evaluated change.

Where is this going next? Notably, Cerf's suggestion that individual teachers' data should be publicly released has precedent in New York. The New York State Department of Health started collecting similar data on doctors' effects on mortality in the early 1990s. In 1991, New York Newsday filed a Freedom of Information request, which forced the Department of Health to publicly release doctor level data. Since then, individual doctors' data have been publicly reported. Assuming the same Freedom of Information statutes apply to education, it may not be long before we can examine the "value-added scores" of NYC teachers while waiting for the C train to show up.

Back to data-driven decision making tomorrow.
10 Comments

David K. Cohen once wrote a paper on sociologist Willard Waller, whom he described as "hating school but loving education." I'd characterize the New York City Department of Education as loving data but hating research. The senior administrators are true believers in the power of data to drive decisions, but there is a remarkable lack of understanding of the fact that data don't speak for themselves. Real-world research is characterized by ambiguity and uncertainty, and not because researchers are incompetent or propelled by ideology. Where is this uncertainty represented in the school progress reports or teacher value-added reports?

The fact that the UFT knew about this for months but kept it quiet is as outrageous as the actions of the DOE.

Randi Weingarten of the UFT has known about the program for months but kept quiet about it. She claims she did not know the specific schools, which, as we all know, would have been easy for the union to find out so it could warn the teachers. And even if the union couldn't find out, public exposure at the time would have allowed teachers in all schools to confront their principals and ask point blank if they were part of the program. That would have forced the principals to tell them or basically lie to their faces. At the very least the UFT could have thrown a monkey wrench into the DOE's plans but chose the sounds of silence.

Therefore, view Randi Weingarten's words of outrage - I guess she wasn't all too outraged all these months - and promise to fight the plan as the usual empty words designed to obfuscate the issue and confuse the members.

So what is the "risk" here?

When I came across this article this weekend I was thinking: "I can't wait to hear Eduwonkette's take on the situation."

Skoolboy raised the idea that struck me most. The data produced by examining one teacher's classroom and comparing that data to teachers with "similar students" presents a very narrow view of what is actually going on in that classroom.

But clearly the most egregious problem is the secrecy involved. Teachers have a right to know they are being analyzed and compared and by what criteria. And then the powers that be wonder why teachers won't let go of tenure. It's situations like these that make people hold on to that practice. There has to be a better way to evaluate teachers than to simply let principals design their own measuring system. Possibly we could use peer or self review systems so that the process is less acrimonious.

I found it quite humorous that the principal interviewed in the article revealed that he is part of this study. I am sure he had a great day today.

The chapter leader at that school is excellent and works hard defending her members - she is also a very well respected teacher and works with the opposition (ICE) to Weingarten - not that it means much. We'll get some idea from her as to the reaction and report back.

Thanks, everyone, for commenting. Skoolboy, I'm with you on measurement error. As Sander said correctly in the article, we may be able to identify positive and negative outliers, but the idea of arraying teachers in a continuum goes against everything we know about estimating school and teacher effects.

Norm, as I understand it, this went from being announced as an academic study - with no consequences for anyone - to a policy proposal. If it was just a study, perhaps widespread notification was unnecessary - but as Leonie Haimson suggested when she sent me this video in September, we all knew where this could go. And it did.

Stuart, the risk is that teachers' future employment, as well as the current relationship they have with their administrators, is fundamentally endangered by participation in the experiment. By any IRB's standard, this is a major risk; I am certain that my institution's IRB would not approve a protocol that did not involve notification of study participants. If the study investigators could assure that this information would be unassailably "correct," this might be less of a problem. However, the estimation of teacher effects is complicated by the fact that model specification makes a big difference.

Marnie, I agree that test scores provide a very narrow view of teacher effectiveness. As Jesse Rothstein demonstrated in this paper, gain scores are very dangerous when students aren't randomly assigned to teachers:

http://eduwonkette2.blogspot.com/2007/12/do-value-added-models-add-value-new.html

Keep the comments coming!

At the risk of condemnation from many in the educational establishment, I believe Bloomberg is onto something with the collection of this data. However, in no way do I condone his clandestine methodology. What was he attempting to hide?

From a pragmatic view, what teacher worth a dime would shy away from the opportunity to be judged on a merit pay system?

Additionally, what parent would willingly place their child in a class where the teacher was reluctant to be judged on quantifiable, objective results?

For more on this topic try "googling" William Sanders, University of Tennessee, value-added assessments. You might find the NYTimes piece left out some noteworthy information regarding this practice which will be coming to a school district near you, soon.

"Principals in the treatment group (140 schools) receive extensive value-added information on each teacher, and then are asked to evaluate the teachers."

Even from a research point of view, I don't get this. What do they think they are going to learn from comparing the teacher evaluations of principals who get the data and those who don't?

If I were developing a hypothesis for the outcome of that experiment, it would be that principals' evaluations in the "with data" case will tend to align with the data they have. Would that tell me anything about the usefulness of the data? I don't see how.

A more (to me) interesting question would be to compare the evaluations principals give in the absence of the data with the data themselves. How well does a principal's subjective evaluation correlate with test-score results? How consistent is it from principal to principal? What qualities do principals seem to value that don't correlate with test score "value-added?"

Is there something about the DOE's "research" that I'm missing??

Unfortunately there are far too many education research studies in which the participants are not informed. Many of these fall in the "action research" category.

Currently many students in education programs (including the one from which I graduated) are required to complete an "action research" project. This is touted as a counterbalance to--or even protection against--the shortcomings of quantitative research. Under "action research," just about anything qualifies as data. Moreover, the researcher is directly involved in the environment under study, and his or her current and future actions may be influenced by the results of the research.

For example: a teacher might conduct an "action research" project on the effect of positive phone calls to parents on homework completion rates. If it appears that positive phone calls home result in more homework turned in, then the teacher may make far more of those positive phone calls than he or she would otherwise.

Since anything qualifies as "data" (including the researcher's own observations, reflections, and impressions), there is no guarantee that the outcome will diverge substantially from what the researcher expects or wants. Moreover, since it is a "reflective" form of research, the researchers (mostly classroom teachers) are not required to obtain consent from their students, even if the students' work, comments, actions, facial expressions, and other characteristics will form a large part of the "data."

Those in favor of "action research" argue that the main purpose is for teachers to reflect on their own practice, and that the results will not be used for anything else; therefore it is not necessary to inform all participants. Unfortunately, such laxity encourages both methodological and ethical carelessness. Many education students want nothing more than to get the project over, get the grade, and get back to teaching without all the extra pressure.

(I completed one of those "action research" projects, with great reservations. I put substantial effort into it, and devoted a section of the essay to ethical concerns. It proved worthwhile for me, and may have helped improve my teaching. That said, the concerns did not go away.)

I do not have qualms about qualitative research per se. But when it is sloppy or superficial, it gives all the more power and credence to narrowly conceived quantitative research. "Look at what the fuzzy folks are doing," the D3M warriors will chortle. "It doesn't even hold up as research." The "fuzzy" crowd may respond by pointing at the holes in quantitative research; neither side acknowledges the other.

(The DoE, for its part, has concocted a dangerous mixture of the fuzzy and the concrete: the tests are rather fuzzy in what they test, yet the results are presented as hard numbers. But that's another story.)

Ironically, it may be in the DoE's interest to allow "action research" to proliferate among novice teachers. It undermines the validity of qualitative research (go D3M!), while at the same time loosening the ethical standards around educational research in general. Also, it makes way for the "inquiry teams" that are now mandated at schools. There's so much "research" going on at once that we take it for granted.

We need high methodological and ethical standards for both qualitative and quantitative research. Both types are needed in education. If one type loses its reputation, the other assumes unnatural power. Moreover, questionable research practices can prove contagious. We are all research-practitioners; everything is data; and no one needs to be informed. Take those premises to their logical conclusion, and you end up with a surveillance society that calls itself introspective (or maybe "value-self-added") and believes its own jargon.

This is scary because I feel like it would be one more freedom being infringed upon. This hasn't happened yet in SC but I could imagine the powers that be liking the idea. For anyone who is not from NY and thinking -oh, that doesn't apply to me- it could be heading your way!

Comments are now closed for this post.
