My Value-Added Bucket List

Last week's teacher effects brouhaha brings me back to where this blog started - not eduwonk channeling Britney, but rather how to measure teacher effectiveness. We know a lot more about estimating teacher effects on student test scores than we did 10 years ago. (Readers know well that I am just as concerned with academic and social outcomes of education that are not measured by test scores, but that is for another post.) Nonetheless, big picture questions linger, and Mary Lou Retton-worthy technical gymnastics won't make teachers feel comfortable with value-added until these questions are answered. Here's what I'd like to know before moving forward:

1) How do schools affect teachers' ability to be effective in the classroom? The current assumption about teacher effects is that they reside within the teacher - i.e., that a teacher is "good" or "bad" independent of the school context in which s/he is working. But we don't know if a teacher is equally effective across multiple schools, or if some component of a teacher's effectiveness is "firm-specific." For example, Harvard health economist Robert Huckman has examined doctors' effectiveness across hospitals and found that human capital isn't entirely portable (The Effect of Organizational Context on Individual Performance). Is this also true in education?

2) How, and how much, do colleagues matter? Having higher quality colleagues may make you a better teacher yourself. We need to know whether "teacher peer effects" exist, and if so, how important they are. (For more, see No Teacher is An Island). Colleagues matter in a second way in middle and high school, where kids have different teachers for different subjects. That your students have an exceptional English teacher makes it easier for your kids to write lab reports in science, and prior year teachers may matter as well. We need to know how these crossover effects operate, and how large they are.

3) Are the same teachers who are effective in promoting short-term score gains also effective in promoting longer-term academic growth? We currently estimate teacher effects on what happens on a year-end test - but what we're really after is teachers' long-term effects on their students. We're not interested in short-term score inflation, but in improved learning that lasts. (See this New Yorker article about the trouble with hedge fund bonuses.) A new paper, "How Long Do Teacher Effects Persist?" by Spyros Konstantopoulos, provides some insight here.

4) Are the same teachers who are good at promoting math skills also good at promoting reading skills? Does being an "effective teacher" mean that you are good at one or good at both? Current estimates of the correlation between teachers' math and reading effects are in the neighborhood of .50-.60. Squaring those figures, a teacher's estimated effect in one subject accounts for only about 25 to 36 percent of the variance in the other.

5) How large are student peer effects, and how does the existence of peer effects complicate our ability to estimate teacher effects? Classrooms are interactive organisms, not individuals sitting in separate cells. Teachers are well aware of this fact, and talk about classes from hell/heaven. Peer effects can be random - i.e. a couple of kids who chemically react and pull the class down with them - or socially patterned. For example, classes with a higher proportion of girls result in both girls and boys performing better (See More Girls=More Learning). How should our knowledge of peer effects in the classroom affect the way we model teacher effects?

6) What about non-random assignment? Non-random assignment may be the biggest threat to value-added systems. (See The Great Sorting Machine for more.) It's important from a technical perspective (see Do Value-Added Estimates Add Value?), but also from a legitimacy perspective. Teachers know that principals can bury them by sticking them with tough kids.

7) Are all gains created equal? Should gains for high performers be treated differently than gains for low performers? In other words, should a gain of 10 scale score points for a high scoring kid be treated the same as a gain of 10 points for a low scoring kid?
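
To make the non-random assignment worry in question 6 concrete, here is a toy simulation in Python. It is only a sketch under invented assumptions, not any district's actual value-added model: two hypothetical teachers have exactly the same true effect, each student has an unobserved "growth propensity," and the teacher estimate is a naive class-average gain. The function name and every number in it are made up for illustration.

```python
import random


def simulate_mean_gains(sorted_assignment, n_students=2000, seed=0):
    """Return naive mean-gain estimates for two equally effective teachers.

    All parameters are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    true_effect = 5.0  # assumed: both teachers add 5 points per student

    # Assumed: each student has an unobserved growth propensity.
    students = [rng.gauss(0, 3) for _ in range(n_students)]

    if sorted_assignment:
        # The principal hands the lowest-propensity students to teacher B.
        students.sort()
    else:
        # Students are assigned to classes at random.
        rng.shuffle(students)

    half = n_students // 2
    class_b, class_a = students[:half], students[half:]

    def mean_gain(propensities):
        # Observed gain = teacher effect + student propensity + test noise.
        gains = [true_effect + p + rng.gauss(0, 2) for p in propensities]
        return sum(gains) / len(gains)

    return mean_gain(class_a), mean_gain(class_b)


if __name__ == "__main__":
    for label, flag in (("random", False), ("sorted", True)):
        est_a, est_b = simulate_mean_gains(sorted_assignment=flag)
        print(f"{label} assignment: A={est_a:.2f}  B={est_b:.2f}  "
              f"gap={est_a - est_b:.2f}")
```

Under random assignment the two estimates should land close together. Under sorted assignment teacher B looks several points worse despite being identical to teacher A by construction. Real value-added models control for prior scores and other covariates, so the size of the bias in practice is an empirical question; the sketch only shows why sorting on things the model cannot see is a problem.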

Why do these big picture questions matter? Each has modeling implications. More importantly, they matter because teachers have these concerns about value-added estimates and they deserve to have their questions answered. From following the use of value-added in Dallas at the Dallas ISD Blog, it appears that few teachers actually understand how their CEI scores are calculated. Researchers and wonks interested in trying value-added need to do a better job of explaining these systems to teachers, of making them comprehensible, and of addressing concerns like those raised above.

My one line position on value-added? It's not ready.

You are right that Value Added models aren't ready, and as always I loved your analysis.

Anybody ought to know that peer effects on teenagers are huge. That brings me back to your promise to address troubled kids and disciplinary consequences.

How many games would a football team win if it lost all of its "skill players" and leaders? That's what happens when "creaming" to magnet schools pulls the most motivated kids out of neighborhood schools. Our basketball team won three State championships and always was a semi-finalist until magnet schools took away the team leaders. We still have the raw talent but we are now beaten by forty points every night. The key to our championships was three two-parent families who provided players who knew how to "play smart." Those three fathers had a mentoring effect on several dozen other players. But when none of the players have had fathers or friends with fathers who could provide guidance, everything changes. Should we expect anything less dramatic in the classroom?

That issue is huge in explaining the quality of teaching in high poverty schools. In schools like mine, you might as well post a sign saying "young teachers need not apply." Not trying to diss the twenty-somethings, but I've never seen a 23 year old who survived a year. On the other hand, not many principals would take a chance on teachers who aren't approaching thirty, so maybe it's not a fair test.

Similarly, even ineffective teachers are better than classrooms with no teachers at all. When disorder reaches a certain point, it is virtually impossible to hire a full staff. It's not unusual for our students to spend all three hours before lunch in rooms where nobody has even tried to conduct class. Year in and year out, my students give the same estimate. Of their eight teachers, they average three who are able to conduct class. Many would have been excellent if they had disciplinary backing, but several never had any chance of functioning in our school.

That makes it politically impossible for administrators to provide the disciplinary backing they would like.

Steve Raudenbush and his colleague Doug Willms have some interesting things to say about the question of how schools influence teachers' effectiveness in raising test scores. They argue that there are two different kinds of causal effects regarding schools, and one is of interest to parents (e.g., "how would my child do in School X compared to her performance in School Y?"), whereas the other is of more interest to policymakers in the context of an accountability system (e.g., "How would a child do in School X with policies and practices P compared to the child's performance in School X with a different set of policies and practices?"). You can substitute teacher for child here to get to the teacher value-added concern. Analytically, there are a number of problems in estimating this second type of effect: practices are not assigned randomly to schools, and, more strikingly, we don't have a vocabulary for operationalizing the practices that might matter, and thereby introducing them into a value-added model. The gist of Raudenbush's argument (warning: not for the statistically faint-of-heart) is available in http://jeb.sagepub.com/cgi/reprint/29/1/121.pdf

No one has addressed the psychological effects of these experiments. Look at places where teachers have been lured into these plans with money. The experiment always begins with apprehension, a sort of reluctance. The policy wonks explain that this fear is because the teachers have been brainwashed by the unions and don’t understand the science at work. Perhaps. It is also possible that experienced professionals know in their gut when something just feels wrong, even if they can’t explain why.

But they participate anyway because the pull of the money is just so strong, the promise of some financial reward for years of hard work seems so right, and, in some cases, “leadership” has promised them that the results will be fair. Once the decision is made to participate, initial reluctance is replaced with a sense of excitement and teachers soon forget many of their worries. After all, teachers are human: Who could pass on a free lottery ticket, especially when you think you will win, especially when you think you will win because you deserve it.

But the morning after, teachers invariably wake up to regret and shame, at least when they know the outcome. They learn that teachers they know work hard did not get a reward. They see less deserving teachers rewarded. No one can explain why. The fairness of the experiment becomes less clear when they see who is left out and how the money is divided. Some winners become ashamed of the money they got and will not even admit to winning; some of the very people who don’t want bonuses published are the ones who got one. Other winners wonder secretly if they may actually be that much better than their peers; after all of those years of playing a supporting role, maybe they should have played the lead? How does that feel?

The losers feel duped. They review in their minds everything they thought they were doing right. They must, as the system is intended to do, start to question everything about what they do. What was working, what wasn’t? But in many of these experiments, they don’t get any feedback, no explanation, no guidelines for improvement, just a report card with a big red “F.” How does that feel?

And after the checks are cashed, the teachers are in the awful situation of having to admit that, despite everything they have ever believed about themselves, they may be doing what they do for the money. Not the kids. Not the community. Just for the money. At that point, they are stuck with the realization that they have been kidding themselves for 5, 10, or 20 years by saying they were in it because they cared about teaching and kids and learning. Even worse, in some places, teachers will have to reconcile themselves to the fact that they chose to participate when their peers living nearby said “No, no thanks,” despite the money.

And then… the final twist. The teachers find out that real, objective researchers believe the results were statistically unsound, or there was an error in the calculation, or the analyses can’t be used to tell most good teachers from bad ones. Millions of dollars were awarded, winners and losers chosen, and even the people in charge can’t say if the results were correct. The winners had no right to brag and the losers had no need to apologize. How does that feel then?

Read the comments and reaction of teachers who have participated in many of these experiments. Their reactions blur from outrage to confusion to pure shame. They wanted the money. They have to admit that they went against their own beliefs – for the money. What happens when you invalidate someone’s lifelong sense of purpose? What happens when you upend their sense of self? How does that feel?

Ultimately, of course, today’s teachers could be replaced by a teaching staff that is completely motivated by financial reward…supplemented by a cadre of Princeton liberal arts grads who need a steady temp job for a couple of years. But what will education be like when these are the only people left to teach our children? If the current schemes are just an experiment, is that the final solution?

The problem I see with the value-added approach from an ethical/democratic perspective is that this approach is entirely based on a contemporary view of democracy as opposed to a classical view of democracy. Our schools were created partly to ensure that democracy persisted in America. “A democracy is more than a form of government; it is primarily a mode of associated living, of conjoint communicated experience” (John Dewey).

Carr and Hartnett (1996) found that in a classical democracy the aim of education is "To initiate individuals into the values, attitudes and modes of behavior appropriate to active participation in democratic institutions." In a contemporary democracy the aim of education is "To offer a minority an education appropriate to future political leaders; the majority an education fitted to their primary social role as producers, workers and consumers."

I don't see how making school an arena for free market competition would add any benefit to our democratic society. If we are going to talk about value added measures we need to talk about "Values," not just test scores. In case you have never seen it, TLN's TeacherSolutions came up with its own take on pay-for-performance. The two ideas that would really add some value to this discussion are:
• Rewarding small teams of teachers who raise student achievement together;
• Rewarding teachers who accept challenging assignments in high-needs schools and strengthen connections between school and community.

The Teachers’ Union Contract has been a major point of contention in Washington, D.C. The Washington Teachers’ Union has been negotiating a three-year contract that would eliminate seniority and replace it with a merit system based on differential compensation, which bases teachers’ pay on factors such as improved student test scores rather than years of service.
With the seniority tradeoff, teachers have the possibility of making $100,000 with five years' certification. According to the most recent survey conducted by the American Federation of Teachers, the national average salary for teachers is $47,600. The change would increase the worth of teachers’ human capital, helping to redefine who we consider elite professionals. At the same time, rather than hiring teachers of quality, we may see teachers who want to make money, teach to the exam, and bypass pedagogical methods that improve students’ critical thinking abilities.
In addition, the merit-based system would give the Chancellor the ability to offer talented teachers the position of their choice. This aspect of the policy has implications for the quality of teaching as well, because teachers who have their choice tend to choose high-performing institutions. Ideally, the best teachers would go to the students with the highest need, but many teachers want to go to high-performing schools because these schools have the best resources and facilities. If we want to get better teachers where students need them, we need to address the underlying problem of resource mobilization, ensuring that the systematic supports are in place for schools to offer their students the highest quality education.

