
Through the lens of social science, eduwonkette takes a serious, if sometimes irreverent, look at some of the most contentious education policy debates. (Find eduwonkette's complete archives prior to Jan. 6, 2008 here.)

December 23, 2008

Survivor: The TFA Edition

skoolboy remains fascinated by the way in which Teach for America, a program serving perhaps 3% of the students in the districts in which it operates, can seem like the tail wagging the dog. Like eduwonkette, I see many virtues to the program, but do not view it as a solution to the nation's challenge of developing a corps of skilled career teachers to serve our children and youth.

TFA recruits make a two-year commitment to teaching in a high-needs school, and the limited nature of this commitment is a recurring source of concern. If TFA recruits stay just two years and then leave, then the schools they serve face a revolving door of teachers shuffling in and out. TFA, for its part, cites recent evidence that TFA recruits are at least as effective in the classroom as other novice teachers. Moreover, TFA champions the enduring value of having its recruits see the challenges facing high-needs schools, if only for a few years, and claims that many recruits stay in the field of education beyond the two-year commitment.

There’s some new evidence on this latter point, emerging in the doctoral dissertation research of Morgaen Donaldson, formerly with Harvard’s Project on the Next Generation of Teachers, and now an Assistant Professor in the School of Education at the University of Connecticut. Donaldson surveyed the 2000, 2001 and 2002 cohorts of TFA recruits, obtaining 2029 responses, for a 62% response rate. Focusing on voluntary departures (approximately 16% in the sample were involuntary), she modeled the likelihood of staying in the initial placement school over time, as well as the likelihood of transferring to another school or leaving teaching altogether.

The charts below are from fitted hazard models that describe the cumulative probability of "survival" in the initial placement school across years, as well as the probability of voluntarily resigning from teaching for the first time. The first chart shows that about 90% of TFA recruits (voluntarily) remain in the initial placement school for a second year, and about 44% stay for a third year. These figures decline steadily over time, with about 22% staying in the initial placement school for a fourth year, 15% for a fifth year, and 9% for a sixth year.

TFA-initial.JPG

The probability of voluntarily staying in the teaching profession over time is higher than the likelihood of staying in the initial placement school, since some TFA recruits, like teachers in general, transfer to other schools. The fitted models suggest that about 94% of TFA recruits remain in teaching for a second year, and 60% teach for a third year. 44% remain in teaching for a fourth year, 35% for a fifth year, and 29% for a sixth year.

TFA-total.JPG
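For readers who want to see what the cumulative figures imply year by year, here is a minimal sketch that converts them into conditional leaving rates (the "hazard" in a discrete-time hazard model). The percentages are the rounded values quoted above; the dictionary and function names are illustrative and are not taken from Donaldson's study, so the results are approximate.

```python
# Cumulative probability of still being in the initial placement school /
# in teaching at the start of each year, as quoted in the post (rounded).
# Year 1 is 1.0 by definition: every recruit starts there.
STAY_IN_SCHOOL = {1: 1.00, 2: 0.90, 3: 0.44, 4: 0.22, 5: 0.15, 6: 0.09}
STAY_IN_TEACHING = {1: 1.00, 2: 0.94, 3: 0.60, 4: 0.44, 5: 0.35, 6: 0.29}


def conditional_retention(survival):
    """Return {year: share of those present the previous year who remain}."""
    years = sorted(survival)
    return {
        year: survival[year] / survival[prev]
        for prev, year in zip(years, years[1:])
    }


def hazards(survival):
    """Return {year: share of the previous year's stayers who leave}."""
    retained = conditional_retention(survival)
    return {year: 1.0 - kept for year, kept in retained.items()}


if __name__ == "__main__":
    curves = [("initial school", STAY_IN_SCHOOL), ("teaching", STAY_IN_TEACHING)]
    for label, curve in curves:
        print(label)
        for year, hazard in hazards(curve).items():
            print(f"  leave before year {year}: {hazard:.0%} of those still present")
```

Read this way, the sharpest drop comes right at the end of the two-year commitment: roughly half of the recruits still in their initial school in year two are gone before year three, and a bit over a third of those still teaching in year two have left teaching by then.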

It’s difficult to know whether these rates of persisting in the initial school placement or in teaching at large are high or low. As usual, the question is, compared to what? TFA recruits are placed in schools that are claimed to be "hard to staff," and they may be challenging places to work, regardless of the route that brought the teachers to such schools. If the attrition rates for other novice teachers in these schools are just as high as those observed for TFA recruits, it’s harder to argue that TFA is exacerbating the problem of building a stable, high-quality teaching force in high-needs schools. Donaldson’s study doesn’t shed any light on this issue.

I’ll have a bit more to say about Morgaen Donaldson’s research on how working conditions affect the persistence of TFA recruits in their initial schools tomorrow.

December 12, 2008

EduJello Wrestling, Round 1! Gladwell vs. Gladwell

A thought experiment: If Malcolm Gladwell, author of Outliers, were to jello wrestle his alter ego on central matters of public education, who would come out on top?

In his article in the New Yorker this week, Gladwell's argument is that it's hard to predict who will become a great pro quarterback or teacher before job candidates start playing or teaching. Like most engaged in the teacher quality debate, Gladwell assumes that there are "good" and "bad" teachers, and this quantity exists a priori. But it's just impossible to observe it before a teacher steps into the classroom. It's not about training. And it's not about some schools providing more supportive environments for teaching than others. For Gladwell, it's about individual "withitness," and we can't see it until after the teacher has walked through her classroom door.

It was surprising to see Gladwell focus so heavily on the potential of the individual player or teacher, given that he just penned a book about the importance of social contexts and chance in producing human greatness. As he put it, "The tallest oak in the forest is the tallest not just because it grew from the hardiest acorn; it is the tallest also because no other trees blocked its sunlight, the soil around it was deep and rich, no rabbit chewed through its bark as a sapling, and no lumberjack cut it down before it matured."

So where's the "forest" for a quarterback or teacher? It's a team. Or a school. Even the most gifted quarterbacks end up with pretty crappy pass completion stats if their teammates consistently miss the ball. And a great quarterback doesn't look so great if he's a poor fit for the team he's playing with. The same goes for teachers. So my fingers are crossed that the Gladwell who recognizes the importance of the environments - not just individuals - wins this match.

November 19, 2008

The ATR Deal: An Acknowledgement that Teacher Price Incentives Aren't All They're Cracked Up to Be?

Following up on a long discussion last spring about teachers displaced from their schools and not rehired - teachers who are part of New York City's "Absent Teacher Reserve" - the city and the union have reached a deal. Principals will not have to pay more for hiring more experienced teachers (for eight years), and will also receive a cash incentive, equal to half of a starting teacher's salary, for hiring a teacher from the ATR.

In effect, this deal undoes a central - and in my opinion, unfortunate - component of weighted student funding, through which it costs principals more to hire an experienced teacher than an inexperienced one. Without a doubt, experienced teachers have been remaining in the pool longer than their less experienced counterparts. What the union and the DOE have debated is whether this is a cost issue or a quality issue. Some figures from the original New Teacher Project report: Because of seniority rules, 44% of teachers excessed in 2006 had 0-3 years of experience, while 22% of teachers in this pool had 13+ years of experience. Of the 235 teachers who remained unplaced as of December 2007, only 25% had 0-3 years of experience, while 42% had 13+ years of experience. (See graph below.)

Now we'll have a strong test of the claim that these are "bad teachers" that no principal wants to hire. But beyond the ATR issue, it's worth thinking about how the deal - and principals' reactions to it - may affect the future of teacher price incentives in New York City and beyond. Sure, experienced teachers should be more evenly distributed across schools, but I've never seen any evidence that making principals pay more for them is going to achieve that outcome. In the worst case scenario, we end up with a tragedy of the commons dilemma in which individual principals, each acting in their own short-term interest, end up turning experienced teachers away from their schools, and the collective impact is to push them out of the district altogether.

NTP%20graph.jpg

November 17, 2008

Lessons for Performance Pay from the Financial Crisis?

This fall, we've heard a lot about how short-term pay incentives on Wall Street encouraged traders to take huge risks, and ultimately ushered us into our current financial mess. Ask the folks at Lehman Brothers - the decisions that maximize profits in the short-term don't always pan out in the long-term.

It's curious that at the same time, journalists and talking heads have pushed performance pay for teachers onto center stage. Proponents of performance pay often want to use one year of test score data in order to pass out bonuses - in other words, reward the attainment of short-term goals. But as Jay Mathews noted in his article this morning, the kind of growth we're after is what makes a student perform well in college, in the workplace, and in life.

There are good reasons to believe that the instructional strategies that produce short-term growth may not always serve students well in the long-term. So I suspect that there are at least two types of high value-added schools or teachers - those whose effects persist, and others who just produce short-term gains. Might we learn something from the financial crisis and hold some of that performance pay in escrow? Or revamp our accountability systems so they aren't focused solely on short-term gains?

And a teacher quality bonus: Robert Pondiscio offers a smart response to the Mathews article.

October 8, 2008

Following Up on the Art Siebens Discussion

Thanks to everyone who's participated in the spirited discussion below on the Art Siebens case. Commenters have raised a number of important questions, among them:

1) Will eliminating tenure increase the quality of teachers in DCPS? Where will this fleet of new exceptional teachers come from? Do principals have incentives to keep the best teachers? Will principals nix “bad” teachers, or will teachers who are outspoken take the fall, too? Might it not be prudent to make investments in improving the teachers that we have, rather than just replacing them in large numbers?

2) What are the implications of arbitrary firing for the teaching profession overall? As John Thompson wrote, “It does not take many arbitrary decisions to destroy a career before you poison the entire well of teaching talent. Would you commit to a career and buying a house etc. if you had a 2 or 3 or 5% chance per year to run afoul of someone who could destroy your career?”

3) What do we learn from examples like Art Siebens? Is his experience reason enough to abandon the idea of eliminating tenure? How many mistakes are too many? And what kind of appeals system should be in place?

4) When the budget gets tight and the private money runs dry, who is going to pay for six figure teacher salaries?

So join the discussion. I'll leave you with some of the moving testimonials about Art Siebens, of which you can find many examples here and here. And if you'd like to learn more about the effort to reinstate Art Siebens, you can visit this website.
He didn't just teach us the material, he sang it to us. Dr. Siebens, in all of his excited glory, would break out his guitar, forcing us groaning teenagers to sing to the tunes of "I heard it through the Grapevine" (The Nephrons like a Grapevine about the adrenal system) and "Poor wand'ring one" from the musical the Pirates of Penzance (Poor Wandering Bun- a song about digestion). Junior year, Biology was everywhere- in the class, on the radio, and even in my dreams. Can you name a teacher in your lifetime that had this power? - Devorah Flax-Davidson, valedictorian, 2005

When I was a first-time teacher nine years ago, Dr. Siebens took the time to provide me with demonstrations of each Biology lab that I had to use for the entire year. He also provided me with all the teaching materials I needed as D.C. Public Schools (DCPS) did not have a Biology curriculum at the time....Now entering my ninth year as a teacher, I still use his well-crafted and creative Biology songs to engage students with the content of my lessons. To say that Dr. Siebens is a valuable resource to Wilson's staff and students is a gross understatement. - Damian Kreske, Former Biology teacher at Wilson High School

This was not a decision about the children and what they learn; that much is certain. This might be a decision about adults and pecking orders, power over the union, an attack on contract rights, but it is without any doubt not about getting the best teacher in the classroom and giving the students the best possible education. Dr. Siebens is the best example of teaching excellence we encountered in 15 years of experience with DCPS. - Ross Eisenbrey, parent of former student

If a man like this is not a fit, then who is a fit? He obviously, you know, if we listen to the lyrics, he loves his job; He loves kids. I mean, you don’t have to be a rocket scientist or a nuclear physicist in order to understand the commitment that he has to young people. - DC City Council Chair Vincent Gray

Letting Dr Siebens go is a very bad thing for the school in many ways. It shows teachers that being great teachers does not matter, it will go unrewarded. It shows the parents that you have no understanding of what they think is important - which is the education of the children. But most importantly, it shows the students that you and other grown ups do not really care about them or understand what they think is important when they think about school - that is, good teachers who care about them as students. - Susan Churchill, parent of two current students

June 30, 2008

How Much Math Does a Teacher Need to Know to Teach Math?

I once asked a colleague if he’d read a particular book. “Read it?” he replied incredulously. “I haven’t even taught it!” A former college English professor, he came by the joke honestly. The first time I taught a course that I had never taken myself, I acknowledged the absurdity, at least to myself. I stayed about a week ahead of my students. Out-of-field teaching? Not exactly. I was teaching a course that was in my field, but outside of my immediate area of expertise. The teaching assignment was justified on the grounds that, as a Ph.D.-holder, I was deeply grounded in the core theoretical perspectives and research traditions in my discipline, and that I could therefore pick up the literature in a subfield quickly and accurately, and teach that literature competently. (At the time, no one was concerned with pedagogical content knowledge, the idea that there is practical knowledge of how to teach a subject that differs from mastery of the subject itself.)

Last week, the National Council on Teacher Quality released a report on the mathematics preparation of elementary school teachers who teach mathematics. The report indicts education schools for failing to select and prepare elementary teachers who have an adequate mastery of mathematics. Singling out algebra as a topic that is shortchanged in preparation programs, the authors offer a number of sensible recommendations for states, education schools, textbook publishers, and institutions of higher education.

The Teacher Education and Development Study in Mathematics (TEDS-M), a comparative study of how 18 countries, including the U.S., prepare mathematics teachers at the primary and lower secondary grades, is currently underway under the auspices of the International Association for the Evaluation of Educational Achievement. We’ll learn a great deal from this study that will complement the NCTQ recommendations.

It seems obvious that teachers must have knowledge of the subject matter they will actually teach. But how much more knowledge should a teacher have than what she or he is seeking to assist students in learning? The case of secondary school mathematics is instructive. Is it enough for a high school trigonometry teacher to know trigonometry cold – but not, say, real analysis, or ordinary differential equations?

In the US, many states have content specialty tests that prospective teachers must pass prior to assuming full-time teaching positions; presumably these tests tell us something about the mathematical content that states think is important for teachers to master. The four-hour Massachusetts test covers number sense and operations; patterns, relations, and algebra; geometry and measurement; data analysis, statistics, and probability; trigonometry, calculus, and discrete mathematics; and integration of knowledge and understanding. Approximately 23% of the test is devoted to patterns, relations, and algebra, and there are 100 multiple-choice items and two constructed-response items. From tests such as these, we can infer that some states do not demand that high school math teachers have an extensive understanding of the discipline of mathematics.

One of the reasons I was unhappy with much of the press reporting on the Urban Institute’s study of Teach for America teachers’ effects on end-of-course tests in Algebra I, Algebra II, and Geometry (among other subjects) in North Carolina is that it shifted the locus of policy discussion to whether to expand alternate routes to teacher certification, without addressing the more challenging questions about what knowledge about subject matter and about how to teach it is optimal for student learning in particular subjects in high school. The reality is that even if we could count on the incremental achievement observed in the Urban Institute study, lots of other countries would still be kicking our butts in international assessments of mathematics and other subjects. I think we’d be better off examining how these countries prepare secondary math teachers – and teachers in other subjects – to see if there are approaches that we can adapt to the U.S. context. One thing that we might learn is that other countries demand much higher levels of subject matter competence from their elementary and secondary school teachers than we do.

June 28, 2008

Guest Blogger Sarah Reckhow: Easy to Blame

Sarah Reckhow taught at Frederick Douglass High School in Baltimore from 2002 to 2004 and was a Teach for America corps member. Currently, she is a Ph.D. candidate in political science at UC Berkeley. Her dissertation explores the role of national philanthropies and community organizers in urban education policymaking.

Liam Julian’s review of “Hard Times at Douglass High” boils down a complicated stew of frustration, hope, and absurdity to a singular and simplistic point—many of the teachers are “just plain bad at their jobs.” Julian does begin with a fair remark—this documentary is not a systematic assessment of No Child Left Behind. Nonetheless, the film offers a vivid portrait of common NCLB observations and enough contextual information to make Julian’s reductive reaction dubious.

NCLB is most present in the film as a looming threat with vague and rarely applied consequences, including state takeover. The filmmakers bring us in on test day—students listlessly staring at test booklets, falling asleep, staring off into space. Many students did not take the tests seriously, assuming that the tests had no consequences or feeling too indifferent to try. We also hear from faculty commenting that they are forced to find ways to accommodate failing seniors at the end of the year in order to artificially raise the graduation rate.

We meet a state observer walking the halls with the academic dean. The state observer rattles off the various actions that may be taken if Douglass does not improve. At the end of the film, we learn that the state board of education finally tried to take over Douglass during 2005-2006, but the move was blocked by the state legislature. An impending gubernatorial election between Baltimore Mayor O’Malley and Governor Ehrlich added a heavy dose of partisan politics to that debate. The film implies that Ms. Grant, the principal in the film, was removed due to the school’s low performance. In fact, she was removed due to a school athletics scandal. Nonetheless, the school was “restructured” by the district in 2006, and the administration was replaced. The NCLB accountability system, as practiced at urban schools like Douglass, tends to operate like a merry-go-round; principal turnover rates in Baltimore are very high. School leaders get on board, ride until they get dizzy and stumble off, and then new leaders come aboard.

The bulk of Julian’s column focuses on Douglass’ teachers and seems oddly divorced from policy considerations. Drawing on clips from the film, he offers armchair criticism of discipline and teaching methods, arguing that “the staff members at Douglass aren’t cutting it.” Even if this were true, Julian draws no clear policy lessons from his conclusion. It seems unlikely that Douglass hired only ineffective teachers from an otherwise talented pool of applicants.

Though there are great teachers at Douglass like Ms. Ray (she is featured in the film, but we never go in her classroom), it is also true that there are not enough. The film offers pieces to form an explanation—vacancies that go unfilled, long term substitute teachers, and a shortage of experienced teachers. The film features a 9th grade English class; the teacher makes a difficult choice to resign midway through the year. Substitutes come in, and the class flounders. The school has also hired a number of Teach for America corps members; some continue to teach there, but many have not stayed beyond the two year commitment, including me. All of these point to a clear problem of supply—Douglass cannot hire and keep enough good teachers to meet its needs. Teachers like Ms. Ray have heart and commitment that few of us can muster for even a few years, let alone decades.

The film does not provide new criticisms of NCLB, nor will it surprise anyone that the school struggles with teacher recruitment and retention. Viewers might be more startled by taking the longer view of Frederick Douglass High School: the school was founded in 1883 and has illustrious graduates including Thurgood Marshall; more than a century later, it is segregated, marginalized, and struggling.

Yet grumbling about the teachers who work in this difficult environment is not the answer. In fact, the film offers some illuminating scenes of teaching and learning at its best, only they don’t take place in a “typical” classroom setting. These include the school’s debate team, choir, band, and music production class. The students involved in these activities display precisely the attitudes we want schools to instill—pride, enthusiasm, and curiosity. Furthermore, the students are expected to perform well and rise to the occasion. Much of the commentary on this film has focused on Douglass at its worst, but much can be learned from Douglass at its best.

June 15, 2008

Everyone's Favorite Sound Bite About Highly Effective Teachers Put to the Test

"By our estimates from Texas schools, having an above average teacher for five years running can completely close the average gap between low-income students and others."
-Steve Rivkin, Rick Hanushek, and John Kain (2005)

"Having a top-quartile teacher rather than a bottom-quartile teacher four years in a row would be enough to close the black-white test score gap."
-Robert Gordon, Tom Kane, and Doug Staiger (2006)

"There are big differences in the amounts and kinds of learning that different teachers help produce....these effects are cumulative."
- Kati Haycock, Education Trust

It's everyone's favorite sound bite: good teachers alone can close racial and socioeconomic achievement gaps. But if the entire teacher effect doesn't persist from year-to-year - that is, a student only retains some fraction of the learning advantage they get from having a highly effective teacher - these claims simply don't hold up.

In a new paper, "The Persistence of Teacher-Induced Learning Gains", Brian Jacob, Lars Lefgren, and David Sims estimate how much of the teacher effect fades out over time. It turns out that kids lose more of these short-term test score gains than we (or at least I) thought:

"Our estimates suggest that only about one-fifth of the test score gain from a high value-added teacher remains after a single year. Given our standard errors, we can rule out one-year persistence rates above one-third. After two years, about one-eighth of the original gain persists."

Yes, you read that correctly. Even if you rely on the upper bound estimates of teacher effect persistence from this study, only a third of that gain sticks around. If you take their point estimate, only 20% of this gain persists. If gains fade out at this rate, we may be overstating the ability of highly effective teachers to contribute to students' long-term academic skills, says Jacob:

"Our results indicate that contemporary teacher value-added measures may overstate the ability of teachers, even exceptional ones, to influence the ultimate level of student knowledge since they conflate variation in short-term and long-term knowledge. Given that a school’s objective is to increase the latter, the importance of teacher value-added measures as currently estimated may be substantially less than the teacher value-added literature indicates."

Jacob and colleagues conclude that we should revisit the "5 great teachers can erase gaps" claim that is so common in education policy discourse:

"Previous researchers have referenced a counterfactual world in which a series of high value-added effects for a hypothetical student with a string of good teachers may be simply added together. Given this scenario, researchers and policymakers have advocated the widespread use of such value-added measures in a variety of education policies including teacher compensation and teacher/school accountability. Our results suggest some caution should be taken in focusing on such measures of teacher effectiveness. If value-added test score gains do not persist over time, adding up consecutive gains does not correctly account for the benefits of higher value-added teachers. Of course, the same caution should be attached to any educational intervention. Hence, the broader implication from this work is that researchers and policymakers should make greater effort to track the long-run impact of education policies and programs."

If you can't access the paper, I've linked to Brian Jacob's contact info above, or shoot me an email (eduwonkette (at) gmail (dot) com).

Update: * To clarify, this paper does not find that teachers "don't matter." If every teacher moved students forward 2 grade levels - an effect twice as large as the gains we expect teachers to produce - we would find no "teacher effects" on test scores, i.e. having one teacher versus another wouldn't matter because all teachers would be equally effective in increasing test scores. But teachers would still make enormous contributions to students' learning in this scenario - they still "matter." Jacob and colleagues' point is simply that the difference between having a below-average and an above-average teacher may be inflated in the current literature because we've been focusing on short-term rather than long-term gains.

* Chad Aldeman at The Quick and the Ed misunderstands the implications of the paper: "These findings in no way challenge previous studies indicating teacher effects accumulate over time." This study does find that teacher effects accumulate - a student does, after all, hold on to 20% of the teacher-induced learning gains - but they do not accumulate in the additive way that those quoted at the top of this post have suggested. The above quotes assume that students carry forward the full gain, and that as a result, we can close the achievement gap by giving students five 84th percentile teachers in a row. If teacher-induced gains decay at the rate documented in this paper, this sound bite does not hold up.
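To make the difference between "additive" and "decaying" accumulation concrete, here is a minimal sketch. The one-year (one-fifth) and two-year (one-eighth) persistence rates are the point estimates quoted above. The paper's figures for longer lags are not quoted in this post, so the values for lags 3 and 4 are an assumption (held flat at one-eighth), and all names are illustrative.

```python
# Share of a teacher-induced gain still present k years after the year it
# was produced. Lags 1 and 2 are the point estimates quoted in the post;
# lags 3 and 4 are NOT quoted there and are ASSUMED flat at one-eighth.
PERSISTENCE = {0: 1.0, 1: 0.20, 2: 0.125, 3: 0.125, 4: 0.125}


def cumulative_gain(yearly_gain, years, persistence):
    """Gain held at the end of `years` consecutive equally effective teachers."""
    return sum(yearly_gain * persistence[lag] for lag in range(years))


def additive_gain(yearly_gain, years):
    """The sound-bite assumption: each year's gain is carried forward in full."""
    return yearly_gain * years


if __name__ == "__main__":
    decayed = cumulative_gain(1.0, 5, PERSISTENCE)
    full = additive_gain(1.0, 5)
    print(f"with fade-out: {decayed:.3f} units")
    print(f"additive:      {full:.1f} units")
    print(f"ratio:         {decayed / full:.1%}")
```

Under these assumptions, five consecutive teachers who each add one unit leave a student with about 1.6 units rather than 5, under a third of what simple addition implies. Any assumption that keeps the later lags at or below the two-year estimate leaves the total no higher than that.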

June 9, 2008

A Plea to Stop the Drama on Teacher Misconduct

Providing shock and awe news on the gritty trespasses committed by teachers is a cottage industry. Now there are entire blogs committed to this enterprise, the most disgusting of which is Detention Slip. Rather than discussing these stories in a productive way, something that more astute observers have consistently done (See Scott McLeod on cell phone videos or Corey Bower on teachers losing their cool), the goal is to discredit teachers and public education in general.

There are 3.2 million public school teachers in America. Even if one hundredth of one percent (.01%) of them did egregious things, we would still see about 2 awful stories in the news for every day of the school year. I am not suggesting that we ignore these issues, but asking that we put them in perspective. Every profession struggles with how to enforce and maintain professional norms (see Robert Pondiscio on a Hippocratic Oath for education). Those discussions are important. But I question the wisdom of focusing so much attention on these stories when they are in no way typical of the behavior of public educators.
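The back-of-the-envelope figure above is easy to check. This minimal sketch assumes a 180-day school year, a number the post does not state; the variable and function names are illustrative.

```python
TEACHERS = 3_200_000      # public school teachers, from the post
EGREGIOUS_SHARE = 0.0001  # one hundredth of one percent (.01%)
SCHOOL_DAYS = 180         # ASSUMPTION: typical length of a U.S. school year


def stories_per_school_day(teachers, share, school_days):
    """Expected incidents per school day if `share` of teachers offend in a year."""
    incidents_per_year = teachers * share
    return incidents_per_year / school_days


rate = stories_per_school_day(TEACHERS, EGREGIOUS_SHARE, SCHOOL_DAYS)
print(f"{TEACHERS * EGREGIOUS_SHARE:.0f} incidents per year")
print(f"about {rate:.1f} per school day")
```

That works out to roughly 320 incidents a year, or a little under two per school day, which is where the "about 2" comes from.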

June 3, 2008

In Which Mike Petrilli and I Play Debbie and Diane

Mike Petrilli and I have a friendly off-blog scuffle at least once a week, and here's our latest quarrel. Over at Flypaper, Mike wrote, "After arguing about race for forty years, many of which saw an expansion of the achievement gap between white and black students, even the left-left coast is agreeing that student performance is more important than the racial make-up of a classroom."

Here are my two cents on this false choice: Even if you only care about student achievement, racial composition is important. Put simply, it's more difficult to attract and retain high-quality teachers in schools that are racially isolated. There are oodles of papers on this topic, but here is a good one. Mike has more to say about this point, so I'll let him take it from here...

May 28, 2008

Kopp Out

Last week, Robert Pondiscio put forth an ingenious proposal to leverage the service of recent college grads who teach for two years through Teach for America:

Instead of throwing TFAers into the worst teaching situations in the cities you serve, place them in some of the best, highest-performing schools….Place them in that high-functioning school for two years as pinch-hitters for some of our best, most experienced teachers, and send those master teachers to the same schools to which you’re sending TFA corps members now. We can call it the Teach For America Fellowship, and throw in a nice extra chunk of change to incentivize those master teachers.

Kopp rejected the idea. Here’s her argument, and my thoughts on each point:

1) It is a rare person who has what it takes to excel as a teacher in a low-income community, and it’s not at all a given that teachers who do well in more privileged communities will do well in urban and rural areas.

Sure. But is there any reason to believe that the most talented experienced teachers who are willing to teach in a high-needs school for two years will do worse than recent college grads with no teaching experience?

2) The individuals who come to Teach For America are coming because they want to work with the nation’s most disadvantaged children (and it is unlikely that most of them would decide to channel their energy toward teaching in more privileged contexts).

This is an empirical question. Many recent college grads understand that they would be doing a greater service to disadvantaged kids by putting an exceptional experienced teacher in that classroom – certainly many of my undergrads who’ve considered TFA have thought about this issue. Those who want to teach for more than two years might value learning to teach in a supportive environment. And let’s be honest – many TFA applicants gravitate towards TFA simply because it is selective. Finally, Robert’s proposal need not supplant TFA’s current recruitment efforts; this fellowship could operate as a stand-alone program.

3) The recent Urban Institute study that looked at the impact of high school teachers in the state of North Carolina over a six-year period provides evidence that our strategy has a positive impact for kids.

We have discussed the generalizability of this study at length on this blog (see Teach for America Study Wrap Up and In Which We Make Sweeping Generalizations from a Sample of 69 Teach for America Teachers in North Carolina). Beyond those caveats, this study provides little insight into the likely effects of Robert’s proposal.

Why not? The Urban Institute study does not examine the distribution of teacher effects for experienced teachers – i.e., how effective are the top 10% of experienced teachers? Instead, it focuses on the average effects of TFA teachers versus experienced non-TFA teachers. Average effects are not helpful in evaluating the potential effectiveness of a program that would select the most exceptional experienced teachers.

4) Our strategy of channeling the energy of the nation’s future leaders into urban and rural schools is important for the long-term effort to ensure educational excellence and equity…Their initial teaching experience in under-resourced communities is foundational to their lifelong commitment to effecting the systemic changes necessary to ensure educational opportunity for all.

If Kopp’s point 2 is correct – TFA applicants are dedicated to improving the lives of the most disadvantaged children – this commitment pre-dates, though perhaps is bolstered by, their TFA teaching experience. And as many other bloggers have suggested, the TFA commitment could be lengthened to three years – two years in a low-needs school, and one in a high-needs school.

Robert, you should run with this idea - whether it is supported by TFA or another organization. Worst case scenario - we would gather useful data on a number of important teacher quality questions. Best case scenario - this "classroom swap" helps staff the toughest schools with the best experienced teachers, and disadvantaged kids benefit immensely.

May 23, 2008

skoolboy wonders: Could a Parrot Pass the New York State ELA Exam?

A few days ago, A Voice in the Wilderness broke the story that the retest for the New York State English Language Arts exam had a task that required students to write a position paper arguing that inexperienced people can provide leadership, after listening to a speech by Wendy Kopp, founder of Teach For America. Some were appalled by the one-sided nature of the task, likening it to propaganda. eduwonkette’s take was that the task would be more defensible if students were given information on both sides and then asked to choose a side to argue.

The scoring guide for the task is now available online, and it leads me in a different direction. I’m not close enough to high school English classrooms to know what a realistic level of competency is.

Here’s the task. Students were told that they would listen to a speech about young people who have become leaders in their communities. They were provided with the following situation:

Your leadership group has been debating whether leaders should have experience in their chosen fields. As part of this debate, you have decided to write a position paper in which you argue that inexperienced people can provide leadership. In preparation for your paper, listen to a speech by Wendy Kopp. Then use relevant information from the speech to write your position paper.

Students were instructed to be sure to: Tell your audience what they need to know about why inexperienced people can provide leadership; Use specific, accurate, and relevant information from the speech to support your argument; Use a tone and level of language appropriate for a position paper for members of your leadership group; Organize your ideas in a logical and coherent manner; Indicate any words taken directly from the speech by using quotation marks or by referring to the speaker; and Follow the conventions of standard written English.

The passage, reproduced below, is from Wendy Kopp’s commencement speech at the University of North Carolina in 2006.

Thinking back to my own senior year in college, I wasn’t intending to start something like Teach for America—or to start anything at all for that matter. As a college senior I was applying to two-year corporate training programs, seeking out political internships, and generally struggling in my search for something that I really wanted to do. My generation was dubbed the “Me Generation.” People thought all we wanted to do was focus on ourselves and make a lot of money. But that didn’t strike me as right. I felt as if thousands of us talented, driven graduating seniors were searching for a way to make a social impact but simply couldn’t find the opportunity to do so.

Well, during my senior fall, I helped organize a conference about education reform, where one of the topics was the shortage of qualified teachers in urban and rural communities. It was at that conference that I thought of an idea: Why doesn’t our country have a national teacher corps that recruits us to teach in low-income communities the same way we’re being recruited to work on Wall Street?

From that moment, I was possessed by this idea—I thought it would make a huge difference in kids’ lives, and that ultimately it could change the very consciousness of our country, by influencing the thinking and career paths of a generation of leaders.

So I did the obvious thing. I wrote a long and very passionate letter to the President of the United States suggesting he start this corps. That didn’t get very far—I received a job rejection letter in response. So in my undergraduate senior thesis, I declared that I would try to create such a corps myself, as a non-profit organization. When my thesis advisor looked at my budget, which showed that to recruit 500 new teachers into this corps during the first year would cost two-and-a-half million dollars, he asked me if I knew how hard it was to raise $2,500, let alone two-and-a-half million dollars. Aided by my inexperience, I was unphased by his question. When school district officials and potential funders laughed at the notion that the Me Generation would jump at the chance to teach in urban and rural communities, their concerns, too, went unheard.

That year 2,500 graduating seniors competed to enter Teach For America, in response to a grassroots recruitment campaign—flyers under doors since there was no email back then! And one year after I graduated, with two-and-a-half million dollars in hand from the corporate and foundation community, I was looking out on an auditorium full of 489 recent college graduates who had joined Teach For America’s first corps.

My very greatest asset in reaching this point was that I simply did not understand what was impossible. I would soon learn the value of experience, but Teach For America would not exist today were it not for my naivete.

I see this same phenomenon every day as I watch 23-year-olds walking into classrooms and setting goals for themselves and their students that most people believe to be entirely unrealistic. The conventional wisdom is that there is only so much schools can do to overcome the challenges of poverty and the lack of student motivation and parental involvement that is perceived to come with it. But then there’s Liam Honigsberg, a Teach For America corps member in Phoenix whom I met a couple of weeks ago. His school’s vice principal saw that he had a degree in cognitive neuroscience and, naturally, called him the day before school started to ask him to teach a math class wholly comprised of seniors who were in danger of not graduating because they had not been able to pass the math portion of the state’s exit exam. It was a daunting task. Liam’s students seemed to be entirely uninterested in math. Their performance levels ranged from not having passed algebra to not having passed geometry. But Liam determined that they could and would gain the skills to graduate. The Arizona Republic estimated last year that 5,000 students didn’t graduate in Arizona because they didn’t pass that exit exam, and yet thanks to Liam’s idealism, all of his students will walk across the stage this spring.

Just over 100 miles from here, Tammi Sutton and Caleb Dolan were teaching middle schoolers in Gaston County. Tammi and Caleb were just 25 years old when they decided that to truly ensure their students had the opportunities they deserved, they would have to actually go out and start a new school in their community—a school that would set their students up to go to college. This was a pretty radical idea in Gaston County and there were many skeptics. In spite of the many who said it could not be done, Tammi and Caleb designed a program with rigorous expectations that would run from 7:30 in the morning until 5 at night, on two Saturdays a month and three weeks during the summer.

There were many who said this could not be done. Yet now their 8th graders—students who came to them in 5th grade performing anywhere between the 1st to the 4th grade levels—are performing at a level that places their school among the state’s top 15 schools in reading, writing, and math.

Teach For America’s story, and Liam, Tammi and Caleb, show us that your inexperience is a real asset. I hope you will put it to good use.

Here’s the anchor paper, scored 4, the top score (on a scale from 1 to 4). Text verbatim.

As shown in Wendy Kopp’s speech, experience is not required to be a leader. I believe leaders can be anyone who has the drive and motivation to be seccessful in the task that is at hand. Experience is aquired through years of doing the same thing over and over again, leadership does not require that.

Wendy Kopp, the woman who stated Teach for America, was inexperienced when she started the program, yet she was very seccessful. She had the drive and motavation necessary to be a leader and never gave up. Many people believed her program would never be a success because her generation was dubbed the “me” generation. The “me” generation is a generation in which money and themselves are all that matter. However, peoples thoughts about how her program would never be a sucess did not stop her. Wendy Kopp started out by writing a letter to the president, this was unseccessful. She decided to write her undergraduate thesis on her idea for Teach for America and the teacher told her it was not possible, it required too much money. Wendy was still determined, so she went to buisnesses to asked for donations and she got laughed at. They believed she could not do it. She believed her generation had people who wanted to make a social impact. Urban and rural areas needed experienced teachers and her program was designed to help. Once she finally got the money, her program was a success, about 489 recent graduates joined her program.

Liam is a part of Teach for America. He was determined to make every senior in his class graduate, although, he did not have much support because many people thought they were hopless cases. Liam taught in Arizona, in a class of seniors who needed to pass a math exam to graduate. In Arizona about five thousand students did not graduate last year. Liam’s did.

Then there was Tammy and Caleb. They started a new school in Gaston County to teach children that were considered hopeless. Tammy and Caleb took 5th graders who were considered at the 1st to 4th grade level and made them model students by 8th grade. Thier school is now a top school.

Experience is not needed to be a effective leader. Motivation and determination is all that is necessary. Wendy Kopp is the proof of that.

The scoring commentary states the following:

Meaning: The response reveals an in-depth analysis of the text making clear and explicit connection between information and ideas in the text and the assigned task.

Development: The response develops ideas clearly and fully, making effective use of relevant and specific details from the text to argue that inexperienced, but determined, people can provide leadership.

Organization: The response maintains a clear and appropriate focus on how motivation and determination, rather than experience, are necessary for leadership. The response exhibits a logical and coherent structure through use of appropriate transitions.

Language Use: The response uses appropriate language, with some awareness of audience and purpose. The response occasionally makes effective use of sentence structure or length.

Conventions: The response demonstrates partial control of conventions, exhibiting occasional errors in spelling, punctuation, capitalization, and grammar that may hinder comprehension.

So readers, what do you think? Is the problem here the task, or what’s scored as an excellent response to it, or both?

May 16, 2008

In Which We Make Sweeping Generalizations from a Sample of 69 Teach for America Teachers in North Carolina

A special shoutout goes to the New York Times editorial board for making national policy recommendations based on the Urban Institute's study of Teach for America in North Carolina, which included a whopping 69 Teach for America teachers - a .5% sample of all TFA teachers placed during those years. The study found that North Carolina TFA math and science teachers produced results slightly better (about a tenth of a standard deviation) than experienced teachers in the same school. Because every state in the country is just like North Carolina, the NYT argues that "states that want students to do better in math and science need to focus recruitment on more selective colleges instead of on traditional teacher education programs, which are often little more than diploma mills."

There is a long discussion of that study here. As I wrote then:

I’m all for Teach for America as a stopgap, but the achievement gap claim is fanciful thinking. Why? By comparison, the black-white gap in NAEP math achievement in grade 12 is approximately 1 standard deviation (and is likely larger because many black students have left by grade 12). An advantage of .04-.1 standard deviations over teachers with 3-5 years experience in the same school is not going to significantly close the achievement gap. This is not an advantage over teachers in the nearest suburb or the best schools in the city that don’t staff TFA teachers, and is hardly a convincing rationale to permanently staff tough schools with a revolving corps of academically talented 2-year teachers.

Teacher Salaries, ATRs, and Closing Schools: A Preview

I've got NYC's school-level teacher salary data fired up, and will write a few posts using these data next week. Here's a preview. New York City is slated to close 14 schools this year, though many will not close immediately, but will phase out over the coming years. Per the whole "Absent Teacher Reserve" (ATR) debate (here, here, here, and here), how many teachers are employed at these schools, and what are their average salaries?

These schools employ a total of 822 teachers, and a number of these schools have relatively high average salaries. Given current budgeting rules, through which schools are allocated dollars rather than positions, what's the chance a principal will, all else equal, hire an excessed teacher from Franklin K. Lane who makes $80,000 when he can hire a teacher with 3 years experience for about $46,000? (See the teacher salary scale here.)

If you've got questions that you'd like to see answered using the teacher salary data, please leave me ideas below.

closing-schools-salary.jpg

May 9, 2008

Who Slipped a Mickey in John Merrow's Kool-Aid?

It wasn't me, that's for sure. John Merrow shed crocodile tears this morning over public education's "upside-down universe where student outcomes are not allowed to be connected to teaching." He gives a special shout out to New York, though conveniently fails to mention that our tests are administered smack in the middle of the year. We'll give him a pass for forgetfulness - but watch your drink next time, J.

You can't swing a fish anymore without hitting a glitch in value-added models - try some of the papers from the recent Wisconsin Value-Added Conference on the complexities of measurement error (value-added models must contend with measurement error in both last year's and this year's scores), interval scaling (few tests are scaled so that 1 unit of growth at the bottom of the scale means the same thing as 1 unit of growth at the top of the scale), and non-random assignment (see Jesse Rothstein's new paper on just how large these biases can be). Or you can refer to these earlier posts:

* skoolboy on: The Status of the Status Quo in Education Policy
* More Signs of the Apocalypse! (More on NY's Teacher Tenure Law)
* After NY's Teacher Tenure Law, Blogosphere Plays Union Pinata
* My Value-Added Bucket List
* Do Value-Added Models Add Value? A New Paper Says Not Yet
* The Oops Factor in Measuring Teacher Effectiveness
* Ignoring the Great Sorting Machine
* No Teacher is an Island
* What Does It Mean for a Teacher to Be Good?

Alexander Russo, also commenting on Merrow, makes the mistake of equating teachers' use of tests and quizzes to evaluate students with the use of students' test scores to evaluate teachers. It's just a bad comparison. Teachers give tests, assignments, reports, homework, etc. in order to evaluate students and to see what they've learned. These measures are part of an extended interactive process through which a teacher hopes to move students forward. The purpose is not simply to label a student as "good" or "bad" based on one assessment. But when we evaluate teachers based on students' scores, the teacher is being evaluated on a narrower set of skills than students are, and high stakes are attached to a single test. So the intent of the process is different; few value-added plans are designed to help teachers improve, but focus instead on assigning rewards and sanctions.

The measurement issues are also different. In an elementary school year, a teacher probably collects 900 data points on student performance (let's say 5 a day); with teacher value-added, we end up with 20-25 data points a year. Teacher value-added is, in short, a low precision enterprise. Readers, what do you think of Alexander's comparison?
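To put a rough number on that precision gap, here is a minimal sketch, not a model of any actual value-added system. It assumes each data point is an independent, equally noisy measurement, so that the standard error of an average shrinks with the square root of the number of points. It also assumes a 180-day school year, which is what 5 points a day needs to reach 900. Both assumptions, and all names in the code, are mine.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean of n independent measurements with spread sigma."""
    return sigma / math.sqrt(n)


SCHOOL_DAYS = 180        # assumption: length of the school year
POINTS_PER_DAY = 5       # from the post: "let's say 5 a day"
classroom_points = SCHOOL_DAYS * POINTS_PER_DAY

value_added_points = 25  # upper end of the 20-25 range quoted above

# sigma cancels in the ratio, so any positive value gives the same answer.
ratio = standard_error(1.0, value_added_points) / standard_error(1.0, classroom_points)

print(f"classroom data points per year: {classroom_points}")
print(f"the value-added average is about {ratio:.1f} times noisier")
```

Under these simplifying assumptions, the average built from 25 points is about six times noisier than the one built from 900. Real test scores and classroom assessments are neither independent nor equally noisy, so treat the figure as an order-of-magnitude illustration only.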

Happy weekend, everyone!

May 7, 2008

Joel Klein Blames Idle Teachers for $4 Gas, Subprime Crisis

joel_klein.jpg
Forget Secretary of Education - this guy should be running the Fed. This morning, the Daily News reported that "Schools Chancellor Joel Klein said the teachers union - and policies that keep instructors from their classrooms - bear some of the blame for next school year's budget cuts."

You've got to give the man props for having the cojones to craft a budgeting rule that creates disincentives to hire teachers from closing schools on Monday, spend $80 million on a data warehousing system that doesn't work on Tuesday, hire a legion of PR and executive staff at McKinsey prices on Wednesday, pay for British quality review evaluators to fly across the pond on Thursday, and on Friday, blame the freaking teachers union for his lack of fiscal discipline and America's economic downturn. Those are epic cojones, really.

So if we could get back to the real issues - I'd like to know the answers to these questions:

1) What percentage of ATRs are carrying full loads but haven't been formally hired? Now that the UFT has established that many ATRs are serving as regular teachers, a third party needs to formally study this question. I do wonder why these data weren't collected and analyzed as part of the original report.

2) How do budgeting rules affect experienced teachers' odds of being hired? Yesterday, Daly clarified that some excessed teachers are on local budgets (34% of those excessed in 2006), but there are good reasons to believe that it's the younger teachers who are on local budgets. As I understand it, here's the budgeting rule: If the teacher comes from a closing school, the ATR goes on central payroll. If a school is simply deciding that it wants to close down one of its programs, or its student enrollment goes through the ordinary dips, the ATR remains on the school's budget.

In a comment, Daly reported that senior ATRs are more likely to come from closing schools - it follows, then, that experienced teachers are more likely to be on the central budget. If experienced teachers are more likely to be centrally financed, this may explain, in part, why they are more likely to remain in the ATR pool.

If, as the TNTP report said, we need a solution "that recognizes the value, commitment and service of New York City’s teachers," we first need to understand why experienced teachers are more likely to remain in the ATR pool. More hard numbers on these issues would be a good start.

Image credit: Gotham Gazette

May 5, 2008

Guest Blogger Tim Daly on The New Teacher Project's Report

Tim Daly is the President of The New Teacher Project and the lead author of "Mutual Benefits."

Over the past several days, representatives of the United Federation of Teachers (UFT) and others have sought to challenge specific findings of “Mutual Benefits,” our recently released study on New York City’s school staffing policies. We appreciate the UFT’s engagement in this dialogue and welcome their participation.

The New Teacher Project (TNTP) researched and released “Mutual Benefits” with the goal of sparking a substantive, data-driven policy debate from which better policies would emerge. We are glad to see this debate taking shape and remain optimistic that it will lead to reforms that better serve New York City students.

As our paper indicates, the current policy on teachers in the Absent Teacher Reserve (ATRs) is flawed in four fundamental ways:

1. Teachers in the ATR have no incentive to search for positions aggressively and no requirement to apply for positions
2. Teachers have earned and will continue to earn tenure while serving in the ATR
3. There is no limit to the amount of time teachers may serve in the ATR, earning full salary and benefits regardless of their placement status
4. The ATR includes a higher concentration of teachers with documented performance problems than the overall teacher population, and that concentration is growing over time

It is important to note that our assessment of these flaws in the current policy has not, to our knowledge, been rebutted or addressed by any criticism of the paper to date. We stand by these findings and continue to believe that, if unaddressed, the stresses that these flaws put on the school system will inevitably undermine the fair, open and efficient staffing process now in place in New York City.

Though the arguments by the UFT and others against our findings and recommendations have not centered on these core issues to date, many of them mischaracterize our research and threaten to distract everyone involved from the real issues at hand. Below we respond to each of the primary arguments leveled against our report, as discussed primarily in posts on the UFT’s official blog, EdWize.org, and on Eduwonkette.com. We have asked both sites to post this response as part of the larger discussion.

One-third of ATRs are teaching “regular programs” on a full-time basis.

This assertion is inaccurate and misleading for several reasons, including:

1) It wrongly includes guidance counselors

The UFT estimates that 200 or more individuals in the ATR are “teaching full programs, with regularly scheduled classes, just as they had done when they were regular assigned to schools.” However, the UFT includes not only teachers but also guidance counselors in this figure. Our report does not include data on guidance counselors or address their hiring patterns at any point. Guidance counselors should therefore be excluded from this calculation. Data from New York City’s payroll system appear to indicate that approximately 85 guidance counselors remained in excess as of April 2007.

2) It includes District 79 teachers, whose excessing and hiring processes were anomalous

In his posting on EdWize.org, Leo Casey of the UFT claims that 270 of the 665 teachers in the ATR are from District 79 alternative schools. Neither figure is correct. According to the NYCDOE’s payroll system, 123 teachers from District 79 schools were in the ATR as of December 2007. These teachers were not included in the 665 figure or our study in general because District 79 underwent a substantial and atypical restructuring in 2007 that led to many teachers changing schools. The rules governing the hiring process for these teachers differed from those for other excessed teachers.

For this reason, TNTP did not include 2007 excessed teachers from District 79 schools in its analysis; it would have been misleading to consider them along with other teachers whose excess process was quite different and far more typical of the city’s normal hiring process. If the UFT believes that the restructuring process for alternative schools should have happened differently, that is a worthy debate – but it is quite separate from this one.

Even so, District 79 teachers fared very well in obtaining new placements. Overall, only 24 percent of teachers excessed from District 79 in 2007 still had not found a new position by December—lower than the unselected rate for teachers who were not from District 79 schools.

3) It is based on an unreliable data source

Last, the UFT’s data is of questionable quality and requires more scrutiny and explanation. It is not enough to conclude that because a teacher reports working a full class schedule that the teacher is actually filling a full-time, permanent vacancy. Self-reported data is vulnerable to a host of inaccuracies. For example, the teacher could be substituting for a teacher who is on long-term leave but who will return again. Verification of the UFT’s claim would require communication with the building principal and an examination of the course allocation for each school. It would require knowing whether the only factor preventing principals from placing ATRs into permanent positions is the budget issue raised by the UFT, or whether they are assigning them to classes merely because they have been instructed to do this as the best way to accommodate ATRs who are housed in their buildings.

It is entirely possible that some teachers in the ATR are effectively teaching on a full-time basis. Indeed, as we have noted before, it is difficult to know exactly how principals are putting these teachers to use. In instances where a reserve pool teacher truly is filling a permanent position, we believe that teacher should be formally appointed to the position. That is a reasonable and fair outcome. Limiting the amount of time a teacher may serve in the reserve pool, as we recommend, may in fact provide an incentive for principals to appoint these teachers to positions formally (or risk losing them).


Why Buy the Teacher When You Can Have the Teaching for Free?

New Yorkers love themselves some incentives. We have incentives for students to do well on tests and incentives for parents to take their kids to the doctor. Now that we can't enjoy a meal without contemplating its caloric content, we have guilt-based incentives to eat Pinkberry yogurt instead of Beard Papa's cream puffs. Last week, the New Teacher Project argued that teachers in the "Absent Teacher Reserve" have no incentive to get a job. This morning, it's clear that, in many cases, principals have no incentive to hire them.

On Friday, I showed that experienced teachers are more likely to remain in the Absent Teacher Reserve, and asked what role financial incentives might play in producing this outcome. Of teachers excessed in 2006, only 22% had 13+ years of experience. Of the 2006 teachers who remained unplaced as of December 2007, 42% had 13+ years of experience. Under Fair Student Funding, which allocates dollars rather than positions to principals, a rational principal would choose a $40,000 teacher over a $75,000 one, all else equal. But FSF didn't come online until 2007, and thus can't account for this pattern.

But a more basic incentive problem predates Fair Student Funding - ATR teachers are off-budget. Imagine that you're a principal and through the ATR pool, you've identified a teacher with 20 years of experience that you'd like to have on board. You can give the teacher a full-time class and acquire him at no cost, or you can shell out a pile of money. In the former scenario, the teacher is happy (he's getting paid a full salary and has no reason to leave) and the principal is happy - he's scored a free teacher.

This morning, Elizabeth Green reported that 29% of the absent teacher reserve pool (194 of the 665 teachers) are teaching full courseloads. Edwize provides a list of schools in which teachers have full-time positions and more details. While we're kvetching about aligning incentives, we should get them aligned for principals, too.

May 2, 2008

Why You Should Read the Fine Print in the New Teacher Project Report

From the coverage of the New Teacher Project's report, “Mutual Benefits: New York City’s Shift to Mutual Consent in Teacher Hiring,” you'd think that the 235 teachers excessed in 2006 and remaining in the "absent teacher reserve" in December 2007 are the worst of NYC's worst teachers. Consider the National Council on Teacher Quality's retelling: "They are also a generally substandard bunch, with a higher rate of unsatisfactory ratings on their personnel records than their more successful peers. For those content to do very little in life, why give up the life of an excessed teacher?" Or, as the NTP's press release put it, "By September 2007, unselected excessed teachers from 2006 were six times as likely to have received a prior “Unsatisfactory” rating as other New York City teachers."

So what percentage of these teachers have never received an Unsatisfactory rating? 81 percent. What percentage of these teachers have received an Unsatisfactory rating more than one time in their careers? Only 6 percent - about 14 teachers. I am not denying that these rates are higher than the NYC teacher population as a whole. They are. But the raw numbers provide much needed context, and we shouldn't have to dig deep in the report to find them.
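For readers who want the raw headcounts behind those percentages, here is a quick sketch of the arithmetic. The 235, 81% and 6% figures come from the text above. The "exactly once" category is my inference (whatever is left after the other two), and because the published percentages are rounded, the counts are approximate.

```python
UNPLACED = 235        # excessed in 2006, still unplaced in December 2007
PCT_NEVER_U = 81      # never rated Unsatisfactory
PCT_MULTIPLE_U = 6    # rated Unsatisfactory more than once

# Assumption: everyone else was rated Unsatisfactory exactly once.
PCT_ONE_U = 100 - PCT_NEVER_U - PCT_MULTIPLE_U


def headcount(percent: int, total: int = UNPLACED) -> int:
    """Convert a rounded percentage of the pool into an approximate headcount."""
    return round(total * percent / 100)


counts = {
    "never rated U": headcount(PCT_NEVER_U),
    "rated U exactly once (inferred)": headcount(PCT_ONE_U),
    "rated U more than once": headcount(PCT_MULTIPLE_U),
}

for label, n in counts.items():
    print(f"{label}: about {n} of {UNPLACED}")
```

On these figures, roughly 190 of the 235 teachers have a clean record, which is the context the headline "six times as likely" comparison leaves out.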

The issue of age discrimination in teacher hiring also remains unresolved by this report, despite eduwonk's protest on this point. And there are good reasons to keep a close eye on age discrimination in NYC. With the advent of "Fair Student Funding," principals have strong incentives to hire teachers that cost less. And as the age of principals continues to decline, we might expect that young principals will prefer to supervise younger teachers.

To be sure, the NTP report provides evidence that experienced teachers are somewhat less engaged in the job seeking process than inexperienced teachers. Unfortunately, it doesn't provide enough evidence to convince me that previous teacher ratings and job seeking patterns can fully explain the pattern exhibited in the graph below. The blue bars show the experience levels of the pool of teachers excessed in 2006, while the red bars show the experience levels of teachers who remained unplaced in December 2007.

Because of seniority rules, 44% of teachers excessed in 2006 had 0-3 years experience, while 22% of teachers in this pool had 13+ years of experience. Of the 235 teachers who remained unplaced as of December 2007, only 25% of these teachers had 0-3 years of experience, while 42% had 13+ years of experience. (All numbers are taken from the NTP report, though it wisely never put these two sets of numbers in a figure together.)

NTP%20graph.jpg
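The two sets of shares can be combined into a single comparison. The sketch below is my own back-of-the-envelope calculation, not something the NTP report presents. It relies on the fact that the unplaced teachers are a subset of the 2006 excessed pool, so a group's chance of remaining unplaced is proportional to its share of the unplaced divided by its share of the pool. The report's percentages are rounded, so the result is approximate.

```python
# Shares quoted above from the NTP report.
pool_share = {"0-3 years": 0.44, "13+ years": 0.22}      # all teachers excessed in 2006
unplaced_share = {"0-3 years": 0.25, "13+ years": 0.42}  # still unplaced, December 2007


def representation_ratio(group: str) -> float:
    """Share among the unplaced divided by share of the original pool.

    Above 1 means the group is over-represented among unplaced teachers.
    """
    return unplaced_share[group] / pool_share[group]


novice = representation_ratio("0-3 years")
veteran = representation_ratio("13+ years")

# The overall unplaced rate cancels out, leaving a relative likelihood.
relative_likelihood = veteran / novice

print(f"0-3 years: {novice:.2f}, 13+ years: {veteran:.2f}")
print(f"13+ year teachers were about {relative_likelihood:.1f} times as likely to remain unplaced")
```

By this rough measure, teachers with 13+ years of experience were a little more than three times as likely to remain unplaced as teachers with 0-3 years. The calculation says nothing about why, which is exactly the open question.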

My point is not that we should preserve the current staffing rule, or that we should turn back the clock - mutual consent is an important principle. The DOE and UFT need to strike a deal, but first we need to understand the nature of the problem. Framing these teachers as a uniform bunch of incompetent louts does little to advance this understanding.

Update: The New Teacher Project's Tim Daly comments below.

April 14, 2008

More Signs of the Apocalypse!

Here's my take on the New York tenure law discussion going on around the blogs:

1) The backdoor process was unsavory, and now threatens to displace an important discussion about the limits of value-added measures in New York. Sherman Dorn offers some fertile thoughts on the process issue. Also worth noting that last week's outragists were hardly outraged about the secrecy surrounding NYC's teacher experiment.

2) Critics would do well to separate the likely effects of this law from their unhappiness with the process. Consider Robert Gordon's post, which interprets the law's effects as follows:

This means that in deciding whether to give a teacher a presumptive right to teach for 30+ years, a principal may not consider evidence of whether the teacher is helping students learn. The principal can consider whether the teacher maintains neat bulletin boards, whether the teacher attends meetings on how to pay for pencils, and whether the teacher is sufficiently deferential in the hallway. But the principal may not consider, based on achievement data, whether children are learning.

Do classroom observations provide no "evidence of whether the teacher is helping students learn?" Value-added measures, after all, are simply a proxy for student learning, and observations also provide proxy data on student learning. Gordon assumes that principals cannot identify teachers with especially low value-added in the absence of test score data. But if value-added measures mean anything, very low performers should be getting poor subjective evaluations too. It turns out that principals are actually pretty good at identifying teachers with low value-added based on subjective evaluations (see this post). If a teacher is a consistent low performer, the three admissible forms of evidence in tenure decisions - 1) observations, 2) peer review, and 3) an evaluation of how teachers use data to inform instruction - already provide lots of information about how teachers affect student learning.

3) To my knowledge, no one has provided a viable technical solution to the middle of the year testing issue. Given existing problems with value-added and the added complication of midyear testing dates, it would be wildly irresponsible to put these measures into place in NY without further study.

If you want new reasons (not related to testing dates) to sweat about the fallibility of value-added, check out this paper, which was presented last weekend at AEFA by Tim Sass (in collaboration with RAND's J.R. Lockwood and Dan McCaffrey). They looked at the year-to-year stability of value-added estimates in Florida, and found that it's often the case that teachers who are in the bottom 20% of value-added estimates in one year are not in the bottom 20% the next year. In Broward County, only 41.4% of teachers who were in the bottom 20% in one year were in the bottom 20% the next year, too. In Orange County, only 31.7% of the teachers who were in the bottom 20% in one year were also there the next year!
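To get a feel for why noisy estimates churn like this, here is a minimal simulation sketch. It is not the method used in the Florida paper. It assumes each yearly estimate is a stable teacher effect plus independent normal noise, and the function name and the `reliability` parameter (the assumed share of a single year's variance that is stable) are hypothetical.

```python
import random


def bottom_quintile_persistence(reliability, n_teachers=20000, seed=0):
    """Share of year-1 bottom-20% teachers who are also bottom-20% in year 2.

    Assumption: estimate = stable effect + independent yearly noise, with
    `reliability` the fraction of single-year variance that is stable.
    """
    rng = random.Random(seed)
    signal_sd = reliability ** 0.5
    noise_sd = (1.0 - reliability) ** 0.5

    year1, year2 = [], []
    for _ in range(n_teachers):
        true_effect = rng.gauss(0.0, signal_sd)
        year1.append(true_effect + rng.gauss(0.0, noise_sd))
        year2.append(true_effect + rng.gauss(0.0, noise_sd))

    k = n_teachers // 5  # size of the bottom quintile

    def bottom(scores):
        # Indices of the k lowest-scoring teachers.
        order = sorted(range(n_teachers), key=lambda i: scores[i])
        return set(order[:k])

    return len(bottom(year1) & bottom(year2)) / k


if __name__ == "__main__":
    for r in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"reliability={r:.2f}  persistence={bottom_quintile_persistence(r):.3f}")
```

With a reliability of 1.0 nobody moves, and with 0.0 about one in five teachers stays in the bottom quintile purely by chance. Trying values in between shows how much noise a given persistence rate would imply under these assumptions.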

Update: Robert Gordon cherrypicks a finding from the Jacob and Lefgren paper to make his point. Perhaps if he'd read beyond the abstract and looked at the magnitude of the value-added advantage over principal ratings in predicting future student achievement (a whopping .036 SD in reading and .074 SD in math), he would realize that all is not lost. And again, this minuscule value-added advantage is coming from the middle of the distribution, not the top and bottom - and the bottom is the relevant issue in tenure decisions. From the same paper:

While value-added measures of teacher effectiveness generally do a better job at predicting future student achievement than principal ratings, the two measures do about equally well in identifying the best and worst teachers. With regard to parent satisfaction, we find that a principal’s overall rating of a teacher is a substantially better predictor of future parent requests for that teacher than either the teacher’s experience, education and current compensation or the teacher’s value-added achievement measure.

Moreover, what kind of predictive advantage can we expect inaccurate/noisy value-added estimates to have over principals' evaluations?

April 9, 2008

After NY's Teacher Tenure Law, Blogosphere Plays Union Pinata

That's what I get for predicting that the big ed news of the week would be Mario Chalmers' shot. The NY legislature has put a two year hold on the use of test scores for teacher tenure decisions, and will convene a commission to study the issue in the meantime. First, check out these links to Joel Klein's op-ed, Randi Weingarten's op-ed, the NY Times article, and the NY Post article.

Neither policymakers nor the public understands the complexity of estimating value-added models, so I would have preferred educating both about the conditions that must be in place to use these measures validly over formally barring the use of test scores. Perhaps that was naive on my part, as Joel Klein wanted to ignore these limitations and move ahead with value-added (see his op-ed above).

But I worried that formally barring test scores from consideration would give union bashers another opportunity to distract attention from the larger problems faced by public education. And now the union pinata match is on. Joe Williams' post stands out for its histrionics. Featuring a mushroom cloud, Williams prognosticates, "When we are all standing at public education's funeral someday in the near future, remember to do a cough-chant of "murderer" when Dick Ianuzzi or anyone else from NYSUT tries eulogize." Kevin Carey digs deep and pulls out Paris Hilton-worthy dramatics: "It's hard to imagine a more unambiguous declaration of the union's total disregard for student learning when its members' jobs are at stake." Socrates calls the legislators "union-mouthpieces." Joel Klein, in his op-ed, even blames unions for the existence of achievement gaps:

Protecting grownups rather than making sure students can read and do math is how our country has gotten into the educational mess it's in today. It's the reason we have shameful racial achievement gaps separating our white and Asian students from our African-American and Latino students.

That's why there are no achievement gaps in North Carolina and Texas!

Yet none of these guys acknowledges the elephant in the room in New York: tests are given in January. That means that a value-added measure would estimate the effects of teacher pairs, not individual teachers: one teacher teaches students from January to June, and another from September to January. Even if two teachers are equally effective, a novice 4th grade teacher who receives students from a 10-year superstar 3rd grade teacher is going to look better than a novice 4th grade teacher who receives students from another novice teacher.
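The teacher-pair problem can be put in a few lines of arithmetic. This is only a sketch: the effect sizes are made up, and the even split of the January-to-January window between the two teachers (`prior_share=0.5`) is an assumption, not a figure from the state testing calendar.

```python
def measured_gain(prior_teacher_effect, current_teacher_effect, prior_share=0.5):
    """Gain between two January tests, as it would be credited to the current teacher.

    prior_share is the assumed fraction of the January-to-January window
    that students spent with the prior-grade teacher (January to June).
    """
    current_share = 1.0 - prior_share
    return prior_share * prior_teacher_effect + current_share * current_teacher_effect


NOVICE = 0.0     # hypothetical full-year effect of a novice teacher
SUPERSTAR = 0.2  # hypothetical full-year effect of a veteran superstar

# Two equally effective novice 4th grade teachers, different 3rd grade predecessors.
gain_after_superstar = measured_gain(SUPERSTAR, NOVICE)
gain_after_novice = measured_gain(NOVICE, NOVICE)
print(gain_after_superstar, gain_after_novice)
```

The two novices are identical by construction, yet the one who inherits the superstar's students shows the larger gain. Setting `prior_share=0.0`, which mimics September and June testing, makes the difference disappear.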

If NYC wants to get serious about value-added, tests need to be given in September and June, and these tests need to be designed to measure growth, which NY state's tests are not.

The good news is that principals are actually pretty good at identifying which teachers have high or low value-added, even in the absence of these data, and they can use this insight to inform their tenure decisions. Take a look at this paper by Brian Jacob and Lars Lefgren, based on a study in which the authors estimated value-added models, but also had principals conduct subjective performance evaluations. They found that principals can identify teachers with high and low value-added; for tenure, the goal is to deny tenure to teachers with especially low value-added. Moreover, Jacob and Lefgren found that, "a principal’s overall rating of a teacher is a substantially better predictor of future parent requests for that teacher than either the teacher’s experience, education and current compensation or the teacher’s value-added achievement measure." They concluded:

To the extent that the most important staffing decisions involve sanctioning incompetent teachers and/or rewarding the best teachers, a principal-based system may also produce achievement outcomes roughly comparable to a test-based accountability system. In addition, increasing a principal’s ability to sanction and reward teachers would likely improve educational outcomes valued by parents but not readily captured by standardized tests.

See below the fold for more wonky stuff on the testing calendar.


April 4, 2008

Is Teaching an Overrated Career?

Over at the Faculty Room, they're discussing the US News and World Report claim that teaching is an overrated career. Devin Ozdogu shares an old excerpt from Whitney Tilson's guru, Linda Darling-Hammond:

HELP WANTED. College graduate with academic major (master’s preferred). Challenging opportunity to serve 150 clients daily on tight schedule, developing up to five different products each day to meet individual needs. Adaptability helpful since suppliers cannot always deliver goods and support services on time. Diversified position allows employee to exercise typing, clerical, law enforcement, and social work skills between assignments and after hours. Ideal candidate will enjoy working in isolation from colleagues. Typical work week 50 hours. Nature of work precludes use of telephones or computers, but work has many intrinsic rewards. Starting salary $24,661 with chance to earn $36,495 after 15 years.

eduwonkette guest blogger alumnus Sean Corcoran responds, too, and suggests that being a US News "best of" listwriter may be the most overrated career of all.

April 3, 2008

Value-Added Preview

Wisconsin's Center for Educational Research has posted abstracts for the National Conference on Value-Added Modeling. Take a look here.

April 2, 2008

Quotes of the Day

Call me old fashioned and curmudgeonly, but I can't stand it when the wonks break out in a "research shows" chorus with no references. If research so valiantly and definitively shows it, you should be able to tell us whose research shows it.

The quote of the day is a tie; both quotes hail from the Teachers College forum on class size this afternoon.

1) In introducing NYC Department of Education's Garth Harries, who is the "Chief Portfolio Officer" and the former "engagement manager" at McKinsey, TC prof Carolyn Riehl said, "These titles - we usually don't think about them in education - so I'm sure it makes for some great cocktail party conversation."

2) After Harries spoke at length about how the effect of teacher quality is much larger than the effect of reduced class size, an audience member asked him to cite some studies supporting this claim. Harries replied, "Uh, I can't quote to you on what the research is, but I can (pause) get it to you." Research shows!

For a paper on teacher effects using the STAR data, see "How Large are Teacher Effects?"

March 30, 2008

Teach For America Study Wrap-Up

Some readers requested a closer look at the Urban Institute's Teach for America study presented at AERA last week. To this reader, the study is convincing, and provides strong and viable evidence against those who argue that Teach for America teachers negatively affect their kids’ educations. However, I was not sold on the authors’ conclusion that teacher retention should take the backseat to teacher selection.

First, what did the study find? Take the study's most conservative estimates for all eight high school subjects (7 math and science subjects, plus English I), which compare North Carolina TFA teachers with non-TFA teachers in the same school. The Teach for America advantage is .064 standard deviations. Compared to teachers with <3 yrs experience, teachers with 3-5 years of experience provide an advantage of .024 standard deviations, teachers with 6-10 years of experience offer a .015 gain, and those with 11 or more years of experience offer a gain of .007 standard deviations.

The authors concluded that "the Teach for America effect, at least in the grades and subjects investigated, exceeds the impact of additional years of experience, implying that TFA teachers are more effective than experienced secondary school teachers….programs like TFA that focus on recruiting and selecting academically talented recent college graduates and placing them in schools serving disadvantaged students can help reduce the achievement gap, even if teachers stay in teaching only a few years.”

But small is small. I’m all for Teach for America as a stopgap, but the achievement gap claim is fanciful thinking. Why? By comparison, the black-white gap in NAEP math achievement in grade 12 is approximately 1 standard deviation (and is likely larger because many black students have left by grade 12). An advantage of .04 standard deviations over teachers with 3-5 years experience in the same school is not going to significantly close the achievement gap. This is not an advantage over teachers in the nearest suburb or the best schools in the city that don’t staff Teach for America teachers, and is hardly a convincing rationale to permanently staff tough schools with a revolving corps of academically talented 2-year teachers.
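For readers who want to check the arithmetic, here is a small sketch using only the figures quoted above. It follows this post's own back-of-the-envelope subtraction rather than anything computed in the study, and the constant and function names are hypothetical.

```python
# Figures quoted in this post, in student-level standard deviations.
TFA_ADVANTAGE = 0.064  # TFA vs. non-TFA teachers in the same school
EXPERIENCE_ADVANTAGE = {  # vs. teachers with <3 years of experience
    "3-5 years": 0.024,
    "6-10 years": 0.015,
    "11+ years": 0.007,
}
BLACK_WHITE_GAP = 1.0  # approximate grade 12 NAEP math gap


def tfa_edge_over(band):
    """TFA advantage net of the experience advantage for one band."""
    return round(TFA_ADVANTAGE - EXPERIENCE_ADVANTAGE[band], 3)


def share_of_gap(effect_sd):
    """Effect expressed as a fraction of the black-white gap."""
    return effect_sd / BLACK_WHITE_GAP


for band in EXPERIENCE_ADVANTAGE:
    edge = tfa_edge_over(band)
    print(f"{band}: edge {edge:.3f} SD, {share_of_gap(edge):.1%} of the gap")
```

The 3-5 year row reproduces the .04 figure used above, which is 4 percent of a gap of roughly one standard deviation.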

So my primary disagreement with this study stems from its conclusion, “policy makers should focus more on issues of teacher selection, and less on issues of teacher retention, if the concern is the performance of disadvantaged secondary school students especially in math and science.” For this to be true, we must assume that a school is simply an amalgam of pods in which teachers teach, such that a teacher’s decision to leave is independent of other teachers’ future efficacy. In other words, the authors presuppose that teacher turnover has no effect on the school as an organization, and that teacher quality is solely an individual attribute, rather than the joint product of individuals and organizations. (And what do we make of the tiny effects of experience? Is it possible that the most talented math and science teachers left to pursue more lucrative opportunities?)

It’s nearly impossible to build a stable school community and an ethos of sustained change in the face of regular turnover. Herein we have the classic chicken and egg problem in education: how do we create places where good teachers want to work - a key component of which is a stable professional community – if we can’t get strong teachers to stay? Programs like Teach for America are a fine band-aid, but they are hardly a solution.

March 28, 2008

AERA filing: Good Teachers: Who Are They? Where Are They? When Do They Stay and Move?

skoolboy went to a session Thursday that was billed as about all things teachers -- mobility, retention, etc. But the session was a bait-and-switch: three CALDER (National Center for Analysis of Longitudinal Data in Education Research) papers, only two of which were about teachers. Tim Sass led off with a paper on charter high school effects on high school graduation and college attendance in Chicago and the state of Florida. Using eighth grade test scores and demographic variables as controls, and studying students who attended charter middle schools to control for selection bias, Sass and his colleagues found that students who attended charter high schools were 11 to 14 percentage points more likely to graduate from high school, and 10 to 13 percentage points more likely to attend college, than similar students who did not attend charter high schools. He concluded that expanding school choice at the high school level may be part of an effective policy to reduce high school dropout rates and promote college attendance.

Sunny Ladd presented a paper coauthored by Charlie Clotfelter and Jake Vigdor on high school teacher credentials and student achievement. Examining North Carolina end-of-course tests in English I, Algebra I, biology, geometry, and ELP (social studies), Ladd modeled achievement as a function of teacher credentials and characteristics, classroom characteristics, and student fixed effects. Students of teachers who entered via lateral entry rather than a regular license had lower test scores, whereas students with more experienced teachers and National Board certified teachers had higher test scores. Certification in the subject taught enhanced test scores by .08 standard deviations -- a sizeable amount, given that low SES black students scored .12 standard deviations below other students. Ladd found that teacher credentials explain 1/5 to 1/3 of the overall variation in teacher quality, and that teacher credentials are distributed unevenly across schools, with black students and students in high-poverty schools less likely to have highly-qualified teachers. Thus, racial differences in access to teacher credentials contribute to the black-white achievement gap.

Jane Hannaway reported on a study of Teach For America effects on high school math and science outcomes in North Carolina. (Basically the same data that Ladd used.) Estimating a cross-subject student fixed effects model, Hannaway found that students of TFA teachers performed better than students of several different comparison groups of teachers. At least in high school, she concluded, there is a greater payoff to teacher selection than to teacher retention.

Dan Goldhaber, discussing the papers, raised questions about the generalizability of the findings, and argued that the question that policymakers are likely to ask -- "What kind of a bet am I making?" in picking a policy alternative -- would best be addressed by a distribution of likely outcomes, not a point estimate of the average effect. There were a number of other thoughtful comments.

These are all skilled researchers, who analyzed their data with great care. And yet I came away disappointed in two respects. First, these presentations were largely atheoretical. They answered a set of "what works?" questions, but didn't yield much in the way of insights about mechanisms. Second, the two North Carolina papers relied on end-of-course test scores, but I was dismayed that Ladd and Hannaway didn't really know very much about the tests. One of the challenges in large-scale longitudinal data analysis is that just getting the data in shape to analyze is a big deal. But tests have psychometric properties, and no one in the room knew very much about them -- or about what the history and details of teacher certification requirements in North Carolina were. Since these were central concerns in the North Carolina papers, I left uneasy.

March 24, 2008

Strange Bedfellows, Even for Jersey

eduwonk, Joe Williams, and I make strange bedfellows, but let me join them in criticizing a proposed state law barring the use of test scores to make tenure decisions. Yes, I worry that value-added models could be done all wrong. Yes, value-added models have a long way to go before they offer valid and reliable information. But a state law is too heavy-handed, and sets a bad precedent.

March 7, 2008

Guest Blogger Sean Corcoran: The Teaching Penalty

Sean Corcoran is an economist who teaches at the Steinhardt School of Education at NYU. He is a co-author (with Sylvia Allegretto and Larry Mishel) of The Teaching Penalty, a report released today by the Economic Policy Institute.

“I don’t see why a good teacher should be paid less money than a bad senator . . . It is unconscionable that the average salary of a lawyer is $79,000 a year and the average salary of a teacher is $39,000 a year.”
- John McCain, Republican debate at Dartmouth College, October 29, 1999

“We are going to have to take the teaching profession seriously. This means paying teachers what they are worth. There is no reason why an experienced, highly qualified teacher shouldn't earn $100,000.”
- Barack Obama, from The Audacity of Hope

A charter school in New York City recently announced that it will pay its teachers a base salary of $125,000, with opportunities for extra pay when the school performs well. This announcement may come as a surprise to charter supporters who believe that charter schools are capable of doing much more with less, but the school’s founder Zeke Vanderhoek may be on to something.

A large and growing body of research has demonstrated that teacher quality is one of the most (if not the most) important resources schools contribute to the academic success of their students. At the same time, the average quality of teachers has steadily fallen over time, and an increasingly smaller fraction of the most cognitively skilled graduates are choosing to teach (for more on this see here).

Vanderhoek believes that significantly higher salaries will bring these top graduates back to the classroom, and he may be right. Economists have linked this steady decline in teacher quality since 1960 to the rise in career opportunities for women and the sizable gap between teacher salaries and those of other professionals.

Sylvia Allegretto, Lawrence Mishel, and I offer an in-depth analysis of this teacher pay gap in a new book to be released today by the Economic Policy Institute. (This book is in part an update of our 2004 analysis). The results are discouraging. In 2006, public school teachers earned 15% less per week than similar workers, a gap roughly one percentage point larger than in 2003. Only ten years before, the weekly pay difference between teachers and non-teachers was a mere 4.3%. But the 1990s economic boom largely left teachers behind, as average earnings growth for college graduates far surpassed that of teachers. (Average earnings plateaued after 2000, but the relative pay of teachers never recovered).

The recent slip in relative teacher pay is only a small part of a much longer decline in the attractiveness of teaching. Using Census data on teachers and other professionals, we find that the annual teacher pay differential has grown from parity (or a 14.7 percent pay premium for female teachers) in 1960 to a 20 percentage point gap in 2000 (or an almost 30 percent gap for female teachers).

Our analysis is sure to bring out the usual “teachers have it easy” chorus, which claims that teachers’ supposed light work schedule and “summers off” adequately compensate them for their lower annual salaries. (See this report by the Manhattan Institute, for example, which argues that teachers are one of the highest paid professions). In our book, we take a closer look at these arguments and find they are mostly overblown. Either way, policymakers interested in raising the quality of the teacher workforce should be much more concerned about the big picture than petty quibbles over the number of hours teachers work each week or each year.

The fact is, college graduates weigh the relative attractiveness of each profession when deciding which line of work to pursue. And I’ve seen little evidence to suggest that our most highly skilled graduates are interested in part-year employment that pays low salaries and offers the opportunity to vacation or work at Sears during the summer. Vanderhoek recognizes that teaching is a profession that must compete with many others for top talent, and that the traditional compensation package has little to no chance of winning that talent over. His experiment is unlikely to change the face of the teaching profession overnight, but I think it’s a big step in the right direction.

March 6, 2008

Pay for Performance in the Corporate World

We often hear that education needs to operate more like the private sector. But few corporations tie their employee bonuses to quantifiable output in the same way that some performance pay plans tie teacher pay to scores. (See How Does Performance Pay Work in Other Sectors?)

For those who believe that corporate employees rise and fall based on the fates of their companies, here's a story ripped from the headlines: Washington Mutual is shielding executive performance pay from the housing crisis fallout. From the Wall Street Journal article:

In the filing, the human-resources committee of WaMu's board, which approved the compensation targets, cited the "challenging business environment and the need to evaluate performance across a wide range of factors." The committee said it will "exercise its discretion" to determine the exact amount of the cash bonuses for executives covered by the plan and "subjectively evaluate company performance in credit risk management and other strategic actions."...WaMu directors wanted to develop a plan that would not penalize executives for market conditions beyond their control but would also allow discretion to judge individual performance, according to a person familiar with the board's thinking.

By extension, should NYC teachers participating in the bonus program get a break because of "market conditions beyond their control," i.e. budget cuts?

Another CEO sums up corporate performance pay nicely:

John Buckingham, CEO of Al Frank Asset Management Inc. in Laguna Beach, Calif., which holds about 119,000 shares of WaMu according to FactSet Research Systems Inc., said the board was being realistic by trying to show that it still is possible for executives to earn a bonus. "You have to do things to keep them," he said. "It might not be politically correct, because the captain's supposed to go down with the ship. But in the real world, that's not how it works."

For more on compensation and accountability in other sectors, check out Richard Rothstein's new paper, "Holding Accountability to Account: How Scholarship and Experience in Other Fields Inform Exploration of Performance Incentives in Education."

February 20, 2008

Performance Pay Goes to Opryland

For those interested in performance pay, the papers from the upcoming National Center on Performance Incentives conference in Nashville are posted here.

Update: My bad - Opryland apparently closed in 1997.

February 4, 2008

Reader Comment on Performance Pay

I had to excerpt this passionate comment on teacher performance pay. Rather than asking what its implications are for student achievement, this reader focused on what it means for teachers' personal and professional identities. This is an angle I'd never considered before - thank you, anonymous reader. You can read the full comment here.

Look at places where teachers have been lured into these plans with money. The experiment always begins with apprehension, a sort of reluctance. The policy wonks explain that this fear is because the teachers have been brainwashed by the unions and don’t understand the science at work. Perhaps. It is also possible that experienced professionals know in their gut when something just feels wrong, even if they can’t explain why.

But they participate anyway because the pull of the money is just so strong, the promise of some financial reward for years of hard work seems so right, and, in some cases, “leadership” has promised them that the results will be fair. Once the decision is made to participate, initial reluctance is replaced with a sense of excitement and teachers soon forget many of their worries. After all, teachers are human: Who could pass on a free lottery ticket, especially when you think you will win, especially when you think you will win because you deserve it.

But the morning after, teachers invariably wake up to regret and shame, at least when they know the outcome. They learn that teachers they know work hard did not get a reward. They see less deserving teachers rewarded. No one can explain why. The fairness of the experiment becomes less clear when they see who is left out and how the money is divided. Some winners become ashamed of the money they got and will not even admit to winning; some of the very people who don’t want bonuses published are the ones who got one. Other winners wonder secretly if they may actually be that much better than their peers; after all of those years of playing a supporting role, maybe they should have played the lead? How does that feel?

The losers feel duped. They review in their minds everything they thought they were doing right. They must, as the system is intended to do, start to question everything about what they do. What was working, what wasn’t? But in many of these experiments, they don’t get any feedback, no explanation, no guidelines for improvement, just a report card with a big red “F.” How does that feel?

And after the checks are cashed, the teachers are in the awful situation of having to admit that, despite everything they have ever believed about themselves, they may be doing what they do for the money. Not the kids. Not the community. Just for the money. At that point, they are stuck with the realization that they have been kidding themselves for 5, 10, or 20 years by saying they were in it because they cared about teaching and kids and learning. Even worse, in some places, teachers will have to reconcile that they choose to participate when their peers living nearby said “No, no thanks,” despite the money.

And then… the final twist. The teachers find out that real, objective researchers believe the results were statistically unsound or there was an error in the calculation or the analyses can’t be used to tell most good teachers from bad ones. Millions of dollars were rewarded, winners and losers chosen, and even the people in charge can’t say if the results were correct. The winners had no right to brag and the losers had no need to apologize. How does that feel then?

January 31, 2008

My Value-Added Bucket List

Last week's teacher effects brouhaha brings me back to where this blog started - not eduwonk channeling Britney, but rather how to measure teacher effectiveness. We know a lot more about estimating teacher effects on student test scores than we did 10 years ago. (Readers know well that I am as concerned with academic and social outcomes of education that are not measured by test scores, but that is for another post.) Nonetheless, big picture questions linger, and Mary Lou Retton-worthy technical gymnastics won't make teachers feel comfortable with value-added until these questions are answered. Here's what I'd like to know before moving forward:

1) How do schools affect teachers' ability to be effective in the classroom? The current assumption about teacher effects is that they reside within the teacher - i.e., that a teacher is "good" or "bad" independent of the school context in which s/he is working. But we don't know if a teacher is equally effective across multiple schools, or if some component of a teacher's effectiveness is "firm-specific." For example, Harvard health economist Robert Huckman has examined doctors' effectiveness across hospitals and found that human capital isn't entirely portable (The Effect of Organizational Context on Individual Performance). Is this also true in education?

2) How, and how much, do colleagues matter? Having higher quality colleagues may make you a better teacher yourself. We need to know whether "teacher peer effects" exist, and if so, how important they are. (For more, see No Teacher is An Island). Colleagues matter in a second way in middle and high school, where kids have different teachers for different subjects. That your students have an exceptional English teacher makes it easier for your kids to write lab reports in science, and prior year teachers may matter as well. We need to know how these crossover effects operate, and how large they are.

3) Are the same teachers that are effective in promoting short-term score gains effective in promoting longer term academic growth? We currently estimate teacher effects on what happens on a year-end test - but what we're really after is teachers' long-term effects on their students. We're not interested in short-term score inflation, but in improved learning that lasts. (See this New Yorker article about the trouble with hedge fund bonuses.) A new paper, "How Long Do Teacher Effects Persist?" by Spyros Konstantopoulos provides some insight here.

4) Are the same teachers that are good at promoting math skills good at promoting reading skills? Does being an "effective teacher" mean that you are good at one or good at both? Current estimates of the correlation between teachers' math and reading effects are in the neighborhood of .50-.60.

5) How large are student peer effects, and how does the existence of peer effects complicate our ability to estimate teacher effects? Classrooms are interactive organisms, not individuals sitting in separate cells. Teachers are well aware of this fact, and talk about classes from hell/heaven. Peer effects can be random - i.e. a couple of kids who chemically react and pull the class down with them - or socially patterned. For example, classes with a higher proportion of girls result in both girls and boys performing better (See More Girls=More Learning). How should our knowledge of peer effects in the classroom affect the way we model teacher effects?

6) What about non-random assignment? Non-random assignment may be the biggest threat to value-added systems. (See The Great Sorting Machine for more.) It's important from a technical perspective (see Do Value-Added Estimates Add Value?), but also from a legitimacy perspective. Teachers know that principals can bury them by sticking them with tough kids.

7) Are all gains created equal? Should gains for high performers be treated differently than gains for low performers? In other words, should a gain of 10 scale score points for a high scoring kid be treated the same as a gain of 10 points for a low scoring kid?

Why do these big picture questions matter? Each has modeling implications. More importantly, they matter because teachers have these concerns about value-added estimates and they deserve to have their questions answered. From following the use of value-added in Dallas at the Dallas ISD Blog, it appears that few teachers actually understand how their CEI scores are calculated. Researchers and wonks interested in trying value-added need to do a better job of explaining these systems to teachers, of making them comprehensible, and of addressing concerns like those raised above.
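To make the modeling stakes concrete, here is a minimal sketch of question 6, using only made-up numbers. It assumes a deliberately simple gain-score measure (final score minus prior score), which is not how Dallas's CEI or any real value-added system is calculated. Every name and figure in it is hypothetical. Both teachers add exactly the same number of points, but one class is filled with students who were already on an upward path.

```python
from statistics import mean

# Hypothetical numbers for illustration only; none come from a real district.
TEACHER_EFFECT = 5  # both teachers add exactly the same 5 points

# Each student: (prior_score, own_growth_trend), where the trend is the growth
# the student would have shown with any teacher.
class_a = [(70, 3), (75, 3), (80, 2), (85, 2)]    # students on an upward path
class_b = [(40, 0), (45, 0), (50, -1), (55, -1)]  # the "tough" class


def final_score(prior, trend):
    """End-of-year score: prior score + student's own trend + teacher effect."""
    return prior + trend + TEACHER_EFFECT


def mean_final(students):
    """Naive comparison of end-of-year score levels."""
    return mean(final_score(p, t) for p, t in students)


def mean_gain(students):
    """Simple gain-score 'value-added': average of (final - prior)."""
    return mean(final_score(p, t) - p for p, t in students)


if __name__ == "__main__":
    print("mean final:", mean_final(class_a), mean_final(class_b))
    print("mean gain: ", mean_gain(class_a), mean_gain(class_b))
```

With these invented numbers, the classes differ by 33 points in final score levels and by 3 points in mean gain (7.5 versus 4.5), even though the two teachers contribute identically. Subtracting prior scores removes the sorting on levels, but not the sorting on growth trajectories, which is one reason non-random assignment is hard to model away.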

My one line position on value-added? It's not ready.

January 25, 2008

Timely Tidbits on Unintended Consequences

Freakonomics and Marginal Revolution face off on unintended consequences - it's timely food for thought about the potential consequences of adopting value-added as the primary measure of teacher effectiveness. As I've noted before, value-added as one of many measures works for me; value-added as the master measure - which I fear it would become - does not. Why? Teaching is a multifaceted task, and value-added measures use a simplistic evaluation rubric to monitor a complex task. Alex Tabarrok sums up the potential problem here:

The law of unintended consequences is what happens when a simple system tries to regulate a complex system. The political system is simple, it operates with limited information (rational ignorance), short time horizons, low feedback, and poor and misaligned incentives. Society in contrast is a complex, evolving, high-feedback, incentive-driven system. When a simple system tries to regulate a complex system you often get unintended consequences.

An unintended consequence of blogging is that I am about to miss a deadline, so I've got to bounce. Stay tuned for more value-added debate next week. Enjoy the weekend, everyone!

January 24, 2008

The NYC Teacher Experiment Revisited

Over at the Ed Sector, there's some confusion about my concern with the ethics of the NYC teacher experiment (see here). To be clear, my problem is not that NYC is collecting value-added data. As I have written before, standardized tests have a role to play in teacher assessment alongside holistic evaluation of teachers' effectiveness. But as eduwonk himself noted, the methodological issues are hairy and as of yet unresolved.

The concern expressed in my earlier post was how this experiment was conducted in secret and, in my opinion, in violation of generally accepted human subjects policies. The entire enterprise of social science relies on potential study participants trusting researchers to minimize risks and fully disclose the purpose of their study. Every time a gaffe like this happens, it undermines researchers' ability to build trust with study participants in the future. Let's review the chronology:

1) In September, an academic experiment headed by two very talented researchers, Jonah Rockoff (Columbia Business School) and Tom Kane (Harvard Grad School of Ed), was announced. It was presented as an experiment intended to generate academic knowledge, not to inform human resources decisions in real time. (You can watch a video of a study recruitment session here.)

2) Academic research is bound not only by common sense research ethics, but by the conventions of university Institutional Review Boards. What this means is that when academic researchers conduct research intended to produce generalizable knowledge - i.e. if researchers want to publish off of these data - the experiment has to proceed within generally accepted research ethics and a university IRB has to approve it. (Even if this was not an academic research project, the DOE should have notified teachers of an intervention of potential consequence for them. After all, the data are not just being collected, but distributed to principals in the experiment's treatment group.)

IRBs are primarily concerned with the harm that researchers could do to subjects by intervening in their lives, and applicants to IRBs must demonstrate that their project poses minimal risks, that participants have been notified of these risks, and that participants have consented to the research. Teachers did not need to consent in this case, as they are government employees and their employers can collect whatever data they want.

However, it is difficult for me to understand how one could justify not notifying teachers in the study. After all, the information given to their principal - which, given the ongoing methodological problems with value-added, may or may not be accurate - has the potential to permanently change their principals' perceptions of them and their future employment prospects. Moreover, this treatment is not being applied universally to NYC teachers. By simply having the bad luck to be selected into the study's treatment condition, some teachers are affected and others are not.

It is important to note that a "live experimental" study like this one is different from the secondary data analysis studies that eduwonk cites. He wrote:

By that logic, all these various studies with panel data, choice studies using lotteries, etc...all constitute human experimentation and are wrong.

Studies based on secondary data analysis are fundamentally different - and are treated differently by IRBs - because researchers are analyzing "dead" data that have no effect on real people's lives. Ongoing research projects in which interventions are made in real people's lives are held to a different standard. And should be.

3) According to Edwize and the NYT article, teachers were not notified of the study. What went wrong is that at some point this went from an academic study to a human resources project that Chris Cerf wants to take prime time. Perhaps he misspoke, or the NYT article had this wrong, but it appears that these data, collected under the auspices of an academic research study, may be used as early as June. As eduwonk noted, simply gathering the data is not a problem. The problem is that under the cover of "academic research," data are being given to principals in ways that affect teachers' future employment without teachers' knowledge.

The irony, of course, is that none of this would be a big deal if the project had been announced to teachers. When I watched the recruitment session video back in September, it didn't seem like a big deal at all. I bookmarked that this was an interesting experiment conducted by two researchers whose work is first rate, and assumed that the experiment would proceed under normal conditions (i.e. full disclosure of the study). For reasons I don't fully understand, it didn't. And here we are.

There's much more to say about the methodological and broader philosophical issues with value-added measures. I'll follow up with a post on these issues later.

Update: eduwonk and I continue our bridging differences exercise. He wrote:

Her position here would be a lot more compelling if (a) this were an actual experiment in the way she and other anti-Klein partisans are seeking to describe it rather than what it is. In addition --and again-- the fact is that we don't know what they are doing with the data so at this point all these leaps to various consequences are unfounded.

But we do know what they are doing with the information, at least in the context of this experiment (and, as I have explained above, it is an experiment). Principals in the treatment group are given value-added data reports on each of their teachers. These principals' perceptions of teachers' academic effectiveness are thus affected - correctly or incorrectly - by this information. Saying "principals can't use it" is like trying to strike evidence from the record in a courtroom. Jurors' perceptions are already influenced, and the damage is done.

January 22, 2008

It's Our Secret! The NYC Teacher Experiment

The NY Times reported yesterday on an ongoing experiment on teacher effectiveness in NYC schools. Principals in the treatment group (140 schools) receive extensive value-added information on each teacher, and then are asked to evaluate the teachers. Principals in the control group do not receive these reports but also provide evaluations of their teachers. As far as I can tell, the goal is to determine how principals' evaluations are affected by having access to value-added data. By the summer, the NYC DOE will decide how these data will be used, and Deputy Chancellor Chris Cerf has even suggested releasing individual teachers' effectiveness data publicly. You can watch this video for more information about the experiment.

While much could be said about the challenges of estimating reliable value-added measures for teachers or the move to use test scores as the primary measure of assessing teacher effectiveness, I'll save those for later. (See more posts about measuring teacher effectiveness here.) Instead, I want to talk about the issue of research ethics in scientific experiments. It turns out that many teachers in participating schools have not been notified of the study.

Secret experiments have an odious history in science. The most notable example is the Tuskegee experiment, in which African-American men with syphilis were recruited into a study but not told of the purpose of the study or notified of their diagnosis. Their disease was left untreated so that researchers could track its progression. Once this experiment broke publicly, Congress passed legislation that, many commissions and administrative changes later, ultimately required universities receiving federal grants to form Institutional Review Boards to oversee all research. Human subjects policies require university researchers to receive the consent of all subjects and to make them aware of the potential risks of the study.

My point is not that the NYC experiment's secrecy is the moral equivalent of the Tuskegee experiment. The Department of Education is not bound by any university's human subjects policy, and it is their right to examine whatever data they please to produce new knowledge. (Note that the university researchers involved are bound by IRB standards if they plan to publish off of these data.) But the Hippocratic Oath of the research community - that subjects should be aware that they are part of a study - has been grossly violated. And it does not help the reputation or future of "scientifically based research" in education when studies are conducted in secret. Even if this was not a research study, a decent boss notifies employees when the criteria on which they are evaluated change.

Where is this going next? Notably, Cerf's suggestion that individual teachers' data should be publicly released has precedent in New York. The New York State Department of Health started collecting similar data on doctors' effects on mortality in the early 1990s. In 1991, New York Newsday filed a Freedom of Information request, which forced the Department of Health to publicly release doctor level data. Since then, individual doctors' data have been publicly reported. Assuming the same Freedom of Information statutes apply to education, it may not be long before we can examine the "value-added scores" of NYC teachers while waiting for the C train to show up.

Back to data-driven decision making tomorrow.

January 18, 2008

They Never Say "Thanks for Improving My Test Scores!"

New York City posted the nomination narratives from its "Thank a Teacher" awards program. Here's the first one, about a physics teacher named Sidney Harris:

Mr. Harris’s expertise was in physics but what he taught me went far beyond science. He pushed me. He shaped the way I thought about my future. And he set expectations for me that were, before then, unimaginable.

What was his value-added on this kid's Physics Regents? We'll never know, but Mr. Harris' former student Joel Klein says: "I really believe I am chancellor today in no small measure because of Sidney Harris." Read a handful of these narratives and then ask yourself if we should evaluate teachers primarily based on their students' test scores.

January 8, 2008

Birthday Presents for NCLB: Some Thoughts on School vs. Teacher Effects

rod-paige-armstrong-william.gif
Today is NCLB’s 6th birthday. NCLB is, at its core, a policy predicated on the idea that schools vary widely in their ability to improve students’ test scores. By holding schools accountable, the hope is that “bad” schools will become more like “good” ones. (Note - this is a post about NCLB on NCLB's terms, so I'm going to focus on test scores. For more posts on NCLB, take a look here.)

However, as I wrote yesterday, once we take into account students’ background characteristics, school effects on standardized test scores are pretty small. The good news is that teacher effects on test scores are quite large (you can find more posts on teacher effectiveness here). In short, the differences between teachers in improving test scores are much larger than the differences between schools. This finding has significant implications for the potential success of school-based efforts to improve test scores, as Barbara Nye, Spyros Konstantopoulos, and Larry Hedges wrote in their paper, “How Large Are Teacher Effects?”:

Many policies attempt to improve achievement by substituting one school for another (e.g. school choice) or changing the schools themselves (e.g. whole school reform). The rationale for these policies is based on the fact that there is variation in school effects. If teacher effects are larger than school effects, then policies focusing on teacher effects as a larger source of variation in achievement may be more promising than policies focusing on school effects.
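A toy calculation can show what "teacher effects are larger than school effects" means in variance terms. The sketch below is only an illustration with invented gain scores for two hypothetical schools; it is not the method or the data used in the paper quoted above. It compares how much school averages differ from each other with how much teacher averages differ inside the same school.

```python
from statistics import mean, pvariance

# Hypothetical student gain scores, grouped by school and then by teacher.
schools = {
    "School A": {"Teacher 1": [2, 4], "Teacher 2": [8, 10]},
    "School B": {"Teacher 3": [3, 5], "Teacher 4": [9, 11]},
}


def teacher_means(school):
    """Average gain for each teacher's class in one school."""
    return [mean(gains) for gains in school.values()]


def between_school_variance(data):
    """Variance of school averages (classes here are all the same size)."""
    school_means = [mean(teacher_means(s)) for s in data.values()]
    return pvariance(school_means)


def within_school_teacher_variance(data):
    """Average, across schools, of the variance among teacher averages."""
    return mean(pvariance(teacher_means(s)) for s in data.values())


if __name__ == "__main__":
    print("between schools:", between_school_variance(schools))
    print("between teachers within schools:",
          within_school_teacher_variance(schools))
```

In this made-up example the variance between school averages is 0.25, while the variance between teachers within the same school is 9. Which school a student attends tells you little here; which teacher she gets tells you a lot. Real analyses also have to separate these components from student background and measurement error, which this sketch ignores.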

(You can click to enlarge the picture above - courtesy of the Halloween Edu-Parade, Rod Paige is Armstrong Williams.)

January 7, 2008

Do Schools Matter?

Ask your companions at a dinner party about their elementary or high school, and you will learn that everyone has a theory about what made it “good” or “bad.” The amazing teachers. The decrepit building. The souped up science labs. The pungent cafeteria food. Unique extracurricular activities. The football team’s reign of terror. And the lists go on. When it comes to our schools, we all fancy ourselves mini-experts. Most of us are convinced that some schools are better and others worse. And above all, we are certain that which school our kids attend matters.

What does it mean to say that schools matter, i.e. to claim that there are “school effects?” Essentially, this is a claim that, all else equal, going to one school versus another makes a substantial difference in a child’s outcomes. We all suspect that there are real quality differences between schools. But the trouble is that many studies find that differences between schools are dwarfed by differences within schools.

When the outcome in question is test scores, researchers have found that school effects are quite small. For example, once family characteristics are taken into account, private schools don’t come out ahead of public schools. (Catholic schools are a notable exception, though the most convincing studies find test score effects only on the students who are least likely to attend these schools.) Though city parents fight for their kids to get into selective elementary schools precisely because they are assumed to be “better schools,” economists Julie Cullen and Brian Jacob found that kids winning a kindergarten lottery to attend selective schools in Chicago don’t end up with higher test scores. (More details here.)

Does this mean that all schools are the same? Could 100 million parents be wrong? I don’t think so. These parents are only wrong if their sole goal is to pump up their kids’ test scores. But parents have a broad range of goals for their kids, and it’s not clear that test scores are the top priority. For example, Richard Rothstein and Rebecca Jacobsen found that parents, when asked to prioritize the goals of public schools, collectively value social skills and work ethic, citizenship and community responsibility, and emotional health more than the acquisition of basic academic skills. If we researchers took our heads out of the sand and studied the many goals of education, we might find that schools matter more than we think.

This is not to say that parents aren’t in it for academics – they are – but perhaps that parents see academic growth more broadly than the acquisition of test scores. Parents visit schools, and they discover that some kids are dissecting pig hearts, while others read out of a textbook. They see that some schools require their students to write frequently in a variety of different styles, and provide their teachers with reasonable workloads that allow them to provide meaningful feedback. They notice that some schools offer art and music, while others have cut out these “extras.” And they know that some schools get their kids excited about learning, while others are passing out worksheets.

We’re so used to equating test scores with educational quality now that it’s easy to forget the big picture. Schools may not matter much for test scores. But that doesn’t mean that schools don’t matter.
The opinions expressed in eduwonkette are strictly those of the author and do not reflect the opinions or endorsement of Editorial Projects in Education, or any of its publications.
