
Should Teacher Evaluation Depend on Student Test Scores?


Dear Deborah,

What a lovely tribute to Ted Sizer! I did not know Ted nearly as well as you did, but I admired him very much. He was very much the gentleman, and truly a gentle man. I had many disagreements with Jerry Bracey over the years; he was not gentle at all. Nonetheless, it is sad that these two men will no longer be among us, as they were both completely independent, a quality that is in short supply these days.

Which brings us back to the Obama agenda for education. Most educators are dubious about this agenda, but unwilling to speak up. The profession encourages timidity, I am sorry to say, because no one is supposed to speak out unless their supervisor approves, and the superintendents these days are looking at that big pile of cash in D.C. and hankering for a piece of it. So no one speaks up.

But that's why we are here, so let's have at it.

As you know, one of the big-ticket items on the Obama agenda is a proposal to evaluate teachers by looking at changes in their students' test scores. As I explain in my forthcoming book, this idea comes out of studies by various economists who say that credentials and experience count for nothing, and that if we value improvements in student performance, we should judge teachers by their students' scores. If the scores go up, the teacher is "effective," and if they don't go up, the teacher is a loser.

This approach has become wildly popular among the chattering classes. They think it is akin to a business that makes a profit (a winner) and one that loses money (a loser). They do not know of the studies by economists demonstrating that this particular measure of effectiveness is highly unstable. A teacher may have a class that gets higher scores one year, but not the next; or lower scores one year, but not the next. And then there is the fundamental problem, as all psychometricians warn us, that tests should be used for the purpose for which they were intended, and not for other purposes. In other words, a test of fifth grade reading tests whether students in the fifth grade are able to read material appropriate for children their age. It cannot then be used to determine whether their teacher was good or bad.
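To see why a measure built on one class's scores can bounce around from year to year, here is a minimal sketch, not drawn from any of the studies mentioned above. It assumes a made-up model in which each class average is a fixed teacher effect plus the luck of which students happen to land in the room. The function names and every number (200 teachers, 25 students per class, the two standard deviations) are hypothetical choices for illustration only.

```python
import random


def simulate_class_means(n_teachers=200, class_size=25,
                         teacher_sd=2.0, student_sd=15.0, seed=0):
    """Return two years of class-average scores for the same teachers.

    Assumed model (hypothetical): class average = fixed teacher effect
    + average of independent student-level noise for that year's class.
    """
    rng = random.Random(seed)
    # Each teacher's "true" effect stays the same in both years.
    effects = [rng.gauss(0, teacher_sd) for _ in range(n_teachers)]

    def one_year():
        # A fresh class of students is drawn for every teacher each year.
        return [
            effect + sum(rng.gauss(0, student_sd)
                         for _ in range(class_size)) / class_size
            for effect in effects
        ]

    return one_year(), one_year()


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


year1, year2 = simulate_class_means()
year_to_year_correlation = pearson(year1, year2)
```

Under these assumed numbers the noise in a class average has variance 15²/25 = 9 against a teacher-effect variance of 2² = 4, so the expected year-to-year correlation is 4/(4+9), or about 0.31. Different assumptions would give a different figure; the point is only that small classes and noisy scores can swamp a real but modest teacher effect.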

Writers who know nothing about education love the idea, however. For example, The New York Times published an editorial on Oct. 29 about the new teachers' contract in New Haven, Conn., which will allow test scores to count when evaluating teachers. The Times was happy about that, but disappointed that the contract did not spell out a precise formula "in which the student achievement component carries the preponderance of the weight." Instead, the details will be determined, to the Times' chagrin, by a committee that includes teachers and administrators.

By coincidence, the Century Foundation published an issue brief by Gordon MacInnes on the same day titled "Eight Reasons Not to Tie Teacher Pay to Standardized Test Results." Among the reasons are these: "Even reliable standardized tests are valid only when they are used for their intended purposes"; students are not randomly assigned to schools or to classes; state data systems are in their infancy, and it is far too soon to produce reliable and accurate longitudinal data; the assumption behind such plans is that teachers are holding back on their efforts because they are not paid enough (when it is far likelier that teachers, schools, and legislators "simply don't know how to improve educational prospects for poor children"); such an approach will inhibit collaboration among teachers; and most teachers don't teach a subject or grade that is subject to regular testing.

I have been trying to figure out how a school would function if the advocates of tying test scores to teacher evaluation prevail. At least three years of data would be needed, though five years would be better. At the end of the three-to-five years, the teachers who did not get gains would be fired and replaced by teachers who have no track record at all. Every year, a new group of teachers who had not produced gains would be fired, and another untested group of teachers would take their place. Most teachers, as MacInnes points out, would be exempt because they don't teach reading or math. But for the unfortunate minority who do teach the tested subjects, there would be an annual game of musical chairs. There would be constant churn, with untried teachers thrown into the trenches. Some might make it (though it will take three years or more to be sure), but many will be ousted.

Does any other profession work this way?

Correct me if I am wrong, Deborah, but I don't think this describes what any of the high-performing nations in the world do.



Thank you for this post. The answer to your question can only be a "no, not only" or a simple "no". Yesterday we had a conference led by John Moravec from Education Futures at de Waag Society here in Amsterdam. Unsurprisingly, part of our discussions centred exactly around your question. The problem with basing teacher performance on students' test results is the incentive model it creates on both sides. Both students and teachers are forced to view education as a product rather than a process, or journey. Teacher innovation is discouraged and their pedagogical value quantified instead of qualified.
At Knowmads Amsterdam we do things very, very differently, and we are not alone. All over the world initiatives are springing up that adopt a more dynamic and self-empowering model for educating, and they are beginning to show tangibly valuable results.
Please visit or get in touch for more information / feedback


OK, we're really getting to the meat of the reform program here, and it gets very hard to explain for some reason.

Many more organizations use metrics than just hard core maximize-profits-or-be-damned businesses. There are, for example, many businesses whose strategy involves customer service. They are looking at the long, long run, and may be willing to suffer even negative returns in the short run as long as they are making customers happy and increasing their base.

No doubt the Red Cross uses metrics to assure better performance in the next disaster than in the last.

Of course we have the health profession, which uses metrics to decrease deaths, post-care infections, pain, and other measures.

So why do educators so oppose metrics?

Part of the answer is that the metrics are not good. Now, you may blame the chattering classes for this, but we who think on policy blame the education profession for not stepping up to the metrics plate fifteen and twenty-five years ago. Yet that doesn't mean no educators have successfully used metrics to improve!

Can we learn here of schools that have used metrics toward success? There are some dramatic examples, right?

Meanwhile, we might also pause and consider why metrics creation and adoption have been so slow to percolate across the public ed spectrum.

My challenge to educators would be, if you don't like the bad state-mandated metrics, why not work toward good bottom-up, student-centered metrics?

And why not ask the NEA to deliver the message; to rewrite its program initiatives so they focus on educational quality (i.e. stop telling the country how to run health care, stop telling the 18th Ohio Congressional District how to vote), and start putting on the NEA web home every day a call to Detroit schools to implement solid student-centered metrics?

If you're paying dues to the NEA, why not go ahead and ask them to be the "party of yes" on education quality?

Dear Diane,

Thanks for the thought-provoking letter.

To Ed Jones--if you have read recent studies of the use of metrics in the medical profession, you know that there are some very negative outcomes in terms of disincentives to try and save lives in some cases. Why? Because the metrics are poor.

Any field that deals directly with human beings is going to have great difficulty in developing effective metrics because humans are unpredictable despite what some economists tell us.

Tim Sass suggests we use value-added metrics over a 3-5 year time period (I think this is what you were alluding to) and that we include other measures such as principal evaluations. Although this is a great suggestion, one problem I have found (as reported by Deb Viadero on her edweek blog) is that principals often don't stay more than a year or two in low-performing/high-poverty schools, so the principals would not be able to effectively evaluate teachers over any length of time.

Does any other profession work this way? Yes, according to Malcolm Gladwell: http://www.newyorker.com/reporting/2008/12/15/081215fa_fact_gladwell Though I grant that even if his analysis is sound about how this works in the financial-advice field, putting people in charge or even semi in charge of children for three years and then removing them is a different story.

A few words from the chattering classes if you don't mind. Back in the day, as a junior high school student, I was required to read "Cheaper by the Dozen." It may not be great literature, but it was thematically consistent (I believe the theme for the year was something like family life), had "stood the test of time," but not yet become a total dinosaur. It presented a portrait of a large tightly-knit family with a highly opinionated patriarch, who incidentally was a pioneer (as was his wife) in time-study. At one point this father prepared to provide one of the errant offspring with a swat on the behind, to which the mother protested, "oh, no, not on the base of the spine!" This was apparently one in a string of such protests over time to which Father now protested, "not on the base of the spine, not across the face--where? On the bottom of the feet like the by-jingoed Chinese?"

It is clear why this book is no longer required reading for middle schoolers. But, it came to mind as I read through today's post. Putting aside for a moment the inconsistent willingness to evaluate students (and reward or punish accordingly through grades, course failure, suspension or other removal--or stars, stickers, awards and honors) using test scores on a regular basis, I come to Mr. Gilbreth's exasperated conclusion. We cannot evaluate teachers using the clueless subjectivity of supervisors, nor tests such as Praxis developed directly for such purposes, nor the views of students or parents, and finally not based on student academic outcomes measured by standardized tests. Then how? Clearly the Gilbreths had a base conflict about corporal punishment that was playing out in the string of prohibitions regarding specific body parts. Maybe it's time to consider whether this is a base conflict about the existence of any kind of teacher evaluation at all.

It is more than annoying to this clueless observer that it doesn't seem possible to broach the topic of teacher evaluation without going immediately to merit pay based on test scores. It would be enormously helpful as a sideline stakeholder to move the discussion away from this single facet and look more holistically at the question of evaluation. How ought teachers be evaluated? Despite the asserted timidity in the profession (which BTW enjoys a level of due process higher than most in order to protect teachers' right to express opinions), I have heard plenty of educator opinions about what won't work in terms of evaluation. Where are the opinions with regard to what works and, more important, the support for why it works?

I agree that it is easy to come up with folks who will salute an over-simplified conception of the way that capitalism works vis a vis the pay of workers. While very few work in an actual environment of pay for effort or outcome, we are well indoctrinated to believe that this is the way it works. And schools are as responsible as anyone in providing that indoctrination. I recall carrying home my first elementary school report card. We were cautioned not to open it and look at it, and not to discuss our grades with our friends. Our grades are like our fathers' (it was a long time ago) paychecks--we don't talk about these things. Grades are what we earn for our efforts--like paychecks.

Just yesterday I participated in an online discussion with teachers who wanted no limitations placed on their ability to fail students who had "earned it." Striking phrase. Students should get what they earn--based on teachers' subjective evaluation and a variety of grades including tests--the majority of which are teacher-made and lack even minimal consideration of reliability or validity.

I would suggest that a good evaluation system is one that is capable of detecting variance in quality with sufficient specificity to provide guidance for improvement. A system in which 98% get the highest rating is only helpful if it is being used to set a minimum standard--any below the highest rating ought not be renewed, or advanced, or whatever. This describes the status quo--really only useful to weed out those who need to be gotten rid of at any cost.

A good performance-based evaluation system for students sets evaluation criteria in advance and measures them in a variety of ways, with opportunities for intervention and improvement along the way. Rubrics, or other means are used to provide feedback regarding not just whether a goal has been reached, but how near or far one has landed and what specifically needs to change in order to advance.

Frankly, I am of a mind that the standardized tests should not be the stumbling block that they are. Our inability to show much improvement has revealed to my mind, not the fallibility of the tests, but our inability to make changes focused on improvement. We seem to be frozen in some time warp of remembering how it was before we knew how bad it was. I understand the exasperation of teachers who are already working as hard as they can (recognizing also that this is not the universal state). But we need to recognize that it is not working harder, but working differently that will yield improvements.

We have rewarded for years the acquisition of the Master's degree, through the pay scale. This has resulted in a cash cow for Universities--spawning some drive-by programs of dubious value. I have no doubt that we can use a similar incentive to reward either behaviors, or programs that produce a more favorable outcome. But in the end, our eyes have to be on that outcome, however measured.

I would like to address the issue of ownership. Our students are tested so much their junior year that by the time they get to our state test, they are numb to the process. Since there is no ownership for them for this test, since they do not get a grade or acceptance into the college of their choice, they have little need for it. Some students hurry through and don't check their work. What is the point - they get nothing from it, and I am seeing they don't really care about doing their best so the school is successful.

And what about those students who cannot take tests, for whatever reason? Test anxiety affects many students; the minute a standardized test, or any other for that matter, is put before them, they freeze. How is it the teacher's fault that those students do poorly?

There are too many uncontrollable factors in test taking to think that a test can actually measure even how well a student can read. Judging teachers' effectiveness based on the test scores of their students is absolutely ludicrous.

There is a difference between a good and a great teacher. I see a variety of teachers ranging from good to great as I go from school to school speaking to students. As any parent who moves into a new neighborhood soon finds out, everyone has an opinion on or about the teachers at the school. Teaching requires objective and subjective measures to be sure, but it can be evaluated. Currently a number of states do not even require teachers to be evaluated.

I believe teachers should be evaluated on 5 points including objective and subjective criteria:
1 - Beginning and end of term content tests - Did the class learn?
2 - Evaluation by Principal/Vice Principal and peer evaluation, a minimum of 3 times a year
3 - Parent evaluations
4 - Student evaluations
5 - Various/other, as a place for anything either positive or negative related to the teacher's performance to be included, plus an opportunity for teacher self-evaluation related to personal goals

Wow... All I have to say is, what about the other progress of the child? What about the teen who skipped 50% of class last year, but is inspired to come to school this year because of a teacher? What about the kid who never talked to anyone, but now has friends and is happier going to school because of a teacher? I like the five points that the previous post listed. I just can't see compensation, evaluations, etc. completely tied to one facet of a child's education. It makes no sense, and of course the teacher is once again the fall guy...

The stakeholders in education are ALL WRONG. This institution is bloated and seems to reward the participants in a hierarchy that starts with teachers, proceeds to Principal, then Sup--with parents in the middle somewhere who exercise their stake by looking at report cards.

Where is the student in this scheme of inappropriate stakeholders? The institution as it exists (and, by definition, will continue to exist with no more than cosmetic changes allowed) assures that the students, who should be the primary stakeholders, will never under its watch experience logical consequences. And no, that does not mean spankings for bad grades; it means strengthening the connection between acquiring basic and manual skills and consequently being able to DO something relevant and meaningful with those skills.


More good stuff on your thought-provoking post, as usual.

On 11/1/09 (?), Jay Mathews had a piece in the Washington Post on the "Perils of Rating Teachers."

He noted, "The (DC) program is already underway. Fifty percent of each teacher’s rating will be based on how much their students improve over last year on the D.C. Comprehensive Assessment System test when compared to the average gain of a similar mix of students district-wide. Forty percent will be based on five 30-minute classroom observations by their administrators and district evaluators selected for their teaching experience, each followed by a discussion with the teacher on what looked good and what didn’t. Assessment of a teacher’s support for colleagues and the school, and the school’s overall tests gains, round out the rating...all other teachers will be rated more heavily on their classroom evaluations...because their discipline is not tested"
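The arithmetic behind a weighting scheme like the one quoted can be sketched in a few lines. This is only an illustration, not the district's actual formula: the 0-100 scale, the function names, and the component names are hypothetical, and the 10 percent "other" share is simply inferred as what remains after the 50 and 40 percent pieces, since the quote does not say how that remainder is split.

```python
# Weights taken from the quoted description. The 10% "other" share is
# inferred as the remainder (100% - 50% - 40%); its internal split between
# support for colleagues and school-wide gains is not specified in the quote.
WEIGHTS = {"test_gain": 0.50, "observations": 0.40, "other": 0.10}


def observation_score(visits):
    """Average of the classroom observation scores (five visits in the quoted plan)."""
    return sum(visits) / len(visits)


def composite_rating(scores):
    """Weighted average of component scores, each on a hypothetical 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

For example, a teacher scoring 40 on test gains, 90 on observations, and 80 on the remainder would come out at 0.5 x 40 + 0.4 x 90 + 0.1 x 80 = 64, which shows how heavily a single noisy test component can pull down otherwise strong ratings.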

The last I read (2007), Tennessee, which of course is where this all started with William Sanders, was using student test scores as eight percent of a teacher's evaluation. No, I would not want to operate Massachusetts schools based on anything done in Tennessee schools.

Stephen Sawchuk from the EdWeek blog, "Teacher Beat" referred to measuring the value added by a teacher by examining that teacher's student test scores as "not ready for prime time." While I'll agree VAMs may not be ready for prime time (completely), they are an important and critical step away from the subjective system currently in place in too many US schools today.

Somewhere between ten and fifty percent of a teacher's evaluation could easily be based on their students' test scores. To ignore this objective data is borderline counter-productive. Use multiple measures such as outlined in the DC scenario above, at least until VAMs get refined to the point where they can be considered more reliable. Even Arne Duncan has recently come out in favor of test scores as part of a multiple measures approach to rating teachers.

As well, why not take the item analysis that is usually part of the testing contract, and use this information to improve instruction? It can be noted in a teacher's evaluation that they did well on X, Y, and Z but might have fallen down on A & B. I have to believe most teachers, especially if they see this information in their evaluation, are going to make every effort to improve upon their weak areas. Don't you?

BTW: I owe you an apology. Massachusetts standards are miles ahead of the several other states' standards I've examined over the past couple of weeks. They're knowledge-based and not laced with the vagaries incorporated into the standards of many other states. They tell teachers what they are actually responsible for covering at each grade level. I never paid attention to them because there was never a need, but I can easily see now how these well-defined standards would be helpful for a teacher in Boston, Springfield, or Brockton. Again, my apologies.

It never ceases to amaze me that the conversations among people in and out of the education profession rarely mention how human beings learn. A class of students is not a monolith. It is a group of individuals who have been thrown together according to age, mostly. They are all individuals who learn by various means - visual or auditory or kinesthetic or degrees of each. To find out the effectiveness of teaching, one needs to look at where each student started from at the beginning of the school year and where they are at the end of the school year. You may find that some students who performed well on a test did not progress as well as some who did not.
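The difference between ranking students by where they end up and ranking them by how far they have come can be shown with a toy calculation. The three students and their scores below are invented purely for illustration, and the helper names are hypothetical.

```python
def gains(records):
    """Map each student to end-of-year score minus start-of-year score."""
    return {name: end - start for name, start, end in records}


def rank(values_by_name):
    """Names sorted from highest value to lowest."""
    return sorted(values_by_name, key=values_by_name.get, reverse=True)


# (name, start-of-year score, end-of-year score) -- invented example data.
records = [("Ana", 85, 88), ("Ben", 40, 62), ("Cal", 60, 70)]

end_scores = {name: end for name, _, end in records}
by_end_score = rank(end_scores)   # Ana has the highest final score
by_gain = rank(gains(records))    # Ben has made the most progress
```

In this made-up class the student with the top final score gained the least over the year, while the student with the lowest final score gained the most, so the two rankings come out in opposite order.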

Some years ago, the Center for Research and Evaluation at UCLA did a longitudinal study to determine why students in a particular school consistently performed well on standardized tests. The conclusion was that they performed well due to the frequency of testing to which they were subjected. So, the question is how can one be evaluated on the basis of the whole without looking at the very different parts?

As a former teacher of middle and high school students, I believe the current focus on teacher evaluation is misguided. From my experience, teachers who are truly out of their depth do not last long - they leave to pursue other careers. Using evaluations and test score data to try to separate teachers into marginally different categories of competence is an effort rife with problems.

For one thing, principals' evaluations are inherently subjective (a calm, orderly class to one principal is a slow-paced, boring class to another principal). As others have posted, student test scores are variable from year to year and cannot reliably be used to judge a teacher in a single year (plus, many subjects are not tested).

In my opinion, efforts should instead be focused on making each teacher the best teacher he or she can be. This would include providing more planning time for educators, cutting back on paperwork and other administrative duties, and using administrators to handle discipline rather than the teachers.

If a teacher can spend less time on classroom management, and has more time to prepare for their classes, this will almost automatically create better classes, regardless of the strengths or weaknesses of the individual teacher. When I had the opportunity to teach in environments with low discipline problems and adequate planning time, my classes flowed smoothly. Planning time could be increased by taking away one period per teacher and replacing it with a planning period and/or limiting the number of different subjects a teacher has at any given time.

None of these comments seem to address the basic problem that when the students enter school in KINDERGARTEN they are ALREADY substantially behind. Years of studies show that this population doesn't change much. If the student's progress is ONLY measured by test scores & the teachers are evaluated ONLY on that basis...then as many teachers as CAN, will move AS SOON AS THEY CAN to the schools with higher achieving students. That is factual. Look at the DATA on TEACHER RETENTION in lower achieving schools. How does it help our children in any way whatsoever for the lower achieving schools to have a literal revolving door of new teachers each year, teachers who move on as soon as they are able to? The statistics on teacher retention in these schools are horrendous.

Sofia--I wonder why it is that we trust data that we agree with but find other data that disturb us to be suspect. We trust data that show that children are "already behind" when they start kindergarten. We reject the data that show that this only gets worse the longer they are in school.

The problem with using test performance to evaluate teachers is that tests do not do any such evaluation. Tests given to students evaluate students' performance. Nothing in the test results clearly evaluates the effectiveness of the teacher. There is no means, for instance, to determine if a low fifth grade reading score is the result of the fifth grade teacher's effectiveness or the result of the teachers in kindergarten, 1st, 2nd, 3rd, or 4th grades, or a combination of all of these teachers.
Standardized tests can be useful in evaluating the needs of students to better manage their education. Regular testing by teachers can aid them in evaluating the progress of each student. Some standardized testing can be useful for general evaluation of what students are capable of and in need of. Teacher-created tests are the most effective measure of student progress. Tests, in and of themselves, are not educational. They are simply measures. The use of any exam as a means of reforming education is like claiming that a thermometer can cure the flu.

There is no question that teachers differ, kids differ, and parents differ. But that's beside the point. If we are to teach all kids to read, both SES differences AND differences in teacher characteristics have to be overcome.

This is a feasible aspiration. But it will be done through instruction, not through testing or "standards." Mandated tests throw no light on the instruction a kid has received or on where the instruction should best go in the future.

It's tragically ironic that we've learned nothing about how to improve instruction during the two decades of the "standards and standardized tests" movement. At best, we've nudged the normal distribution up a bit.

One observes a "normal distribution" only when the determinants of a phenomenon are multiple and complex--as in flipping a coin. Delivering reliable instructional accomplishments is anything but flipping coins.

With reliable instructional accomplishment one observes a highly skewed curve with performance stacking up at the high end. Under these conditions, both student SES and teacher characteristics wash out as causal factors. The differences are still there, but the aspired accomplishment is being attained despite the differences.

There are districts that have used in the past, and districts that are now using, various forms of "merit pay." None of these has knocked the normal curve out of shape, and historically they've been abandoned.

The National Academy of Sciences warns against the practice, ETS policy papers have warned against the practice. The MacInnes paper Diane cites warns against the practice.

The only thing the practice has going for it is that Secretary Duncan, Bill Gates, education deformers posing as reformers and business leaders "like the idea."

Would that these were the "chattering class." They are the ruling class. What in the world happened to "Change we can believe in" and "Yes we can"?

Teacher evaluation should not be tied to student test scores. It is a ridiculous concept. When teachers have control over the children that are placed in their classroom, then this might be viable. There seem to be some teachers who are regularly given students who have difficulties learning due to family issues, transience, poor attendance, etc. Then, there are teachers whose classes are from stable homes. I don't mind working with children who have extra difficulties to overcome, but how can one be evaluated the same way as the teacher whose children do not face these issues?

Just as systems for evaluating teachers are fraught with problems, which are likely to get worse given Duncan's intentions, the ways in which teacher education programs evaluate student teachers are problematic. Checklists and rubrics that seek to identify every possible characteristic of an accomplished beginning teacher don't work, and grading a student teacher is problematic when the success of the experience that person has depends on so many contextual variables that are not in his or her control. In fact, many programs have switched to pass/fail for the student teaching experience. The ways in which student teachers are evaluated after an observation also tend to get divided into "this worked" and "this needs work." That sets the student teacher up for dreading the negative feedback, when what is needed is an open and honest assessment of what might have gone better and why, so that the student teacher is motivated to try harder the next time, to pay more attention, to plan better, and so on. It may seem that I am splitting hairs here, but perhaps I can explain it from the perspective of my own experience supervising student teachers. After an observation, they always want to jump in with everything that went wrong. I steer them in a slightly different direction by asking them to tell me what they noticed about the students. I am explicit about changing the focus from being specifically on their performance, to a focus on understanding what the students were talking about and doing. This is what I believe is most important, for the personal and professional development of the beginning teacher, for the students, and it is what should be at the heart of teacher evaluation. As long as we rely on metrics and numbers and principals' drive-by half-hour visits, we are doomed to see an even greater exodus of talent from the teaching profession.

Also from the Washington Post piece I noted above, George Parker, President of the Washington (DC) teachers union stated, "I've been sending the plan (IMPACT) to experts around the country with more favorable feedback than I expected." This guy is diametrically opposed to what Michelle Rhee is attempting to do.

I simply don't see what people are so afraid of, especially in light of the anachronistic evaluation system in place in our schools today. You'd think many (good/better) teachers would welcome the opportunity to make more money than what's currently offered for a graduate degree and/or time in the classroom, neither of which is indicative of teacher effectiveness. Talk about a civil service system for running our schools.

The ruling class of chattering capability doesn't stop very long to think about matters. As Teacher implies, the school, not the individual teacher, is the smallest independent unit for assessing merit.

Cooperation/collaboration among school personnel is far more consequential than competition. As in any team effort, differential pay can be awarded, but this is typically done on the basis of greater responsibility and/or greater contribution to achieving the common accomplishment.

The "reformers" would be smarter to mandate that standardized achievement tests NOT be used as a consideration in teacher pay. Relying on ungrounded numbers that no-one (except "computers") understands removes the reason for thinking and makes the contingencies mindless.

Unless individual schools have more budget discretion and instructional decision-making authority than they do now, the smallest reasonable unit is the district.

As things now stand, the further away from the students, teachers, and parents one gets, the less the accountability/responsibility and the greater the control of instruction. That just doesn't make sense.

The irrationality is further compounded by all the categorical funding categories that are imposed by federal and state authorities. This was wasteful when school budgets were growing, but it becomes ridiculous where budgets are contracting at the district and school level. A lot of "horse trading" is inevitably going on. A paper trail will be maintained as best administrators can to remain "in compliance" but it can easily be penetrated.

The fact that the bulk of jobs "saved" to date have been in education reflects the simple fact that personnel salaries make up the bulk of el-hi expenses. Capital costs are budgeted and managed entirely separately, as are other non-instructional costs. There is no functional accountability or management of these funds.

As Tom Friedman of the NYTimes has wisely noted, the U.S. could benefit from domestic Nation Building. Nowhere does this hold stronger than with the Nation's public schools. When will the bashing stop?

I just want to tell Margo/Mom (and anyone else) that we ARE, as teachers, proposing solutions to the evaluation problem that don't depend on standardized test scores. I believe it is possible to have rigorous, meaningful evaluation without depending on test scores. You can read my ideas, and join the conversation, at my blog.

Tying student test scores to teacher pay is meant to increase teacher "performance" but what we are trying to do is increase student learning. If we had similar accountability to the banking industry we could still teach whatever we wanted as long as we demonstrated the goal has been achieved, i.e., "learning". But, we are more concerned about the test score than the learning that is meant to be demonstrated by the test. As an NBCT I am totally for accountability and high levels of learning and teaching but until we develop a support structure for accomplished teaching from teacher prep to master teacher whatever we put in place will really evaluate how we support teachers not their effectiveness. Besides, there are plenty of economists who have studied how money is not an effective motivator in any profession and actually distracts from the goal. It focuses us on the gaming of the test scores and not the goals of the test scores. Student Learning.

Why not let the students have an input into evaluating their teachers? They know what their teachers do in the classrooms. They know how what their teachers do affects them. Encourage talk into creating rubrics/criteria that students can use for such evaluation. Isn't that what happens at workshops, conferences etc.? If one can base teacher evaluation on such unreliable and variable criteria as test scores, then maybe you will get a better evaluation by asking the students.

Why are teachers the 'fall' guys? Yes, as educators we have all learned the 'right and wrong' pedagogy. Yes, as educators we are all applying this knowledge in daily lessons. So far so good.

This is where I begin to drift in another direction.

Here are two of my main concerns: 1.) Educators that are tenured DO have a tendency to become complacent - teaching becomes a job and a guaranteed paycheck to some (not all); and 2.) Educators today are not teaching 21st century skills - this includes technology-based learning. Instead of writing an assignment - how about assigning a podcast? Why not study something like DNA virtually in Second Life, instead of just reading about it in a book? Want to remind a student of something - send a Tweet.

I think the real problem in education is that teachers are now teaching 'for the test' - not necessarily for the learning and critical thinking they can inspire. If teachers are evaluated based on student test results - I believe the outcome will be devastating for the student. No one will win that contest.

During my long career, I had the experience of being a bad teacher, a mediocre teacher and an excellent teacher. Because of this, I believe I have some good ideas about the use of tests for the purpose of evaluating teachers.

When I taught in a private school in an affluent suburb in the Midwest, most of my students scored above the 90th percentile on standardized tests. To this day I can't believe I did such an outstanding job in my very first year of teaching. The only explanation I have is the probability that I was a naturally gifted teacher. The pay at the school was quite poor, though, and since I was so talented, I thought I deserved the relatively high salary of the public school. So I applied to an urban school in one of the poorest cities in the United States. At this school my students scored below the 10th percentile. To this day I don't understand how my teaching could have gotten so bad in such a short time. Perhaps I was having personal problems at the time, but I no longer remember.

The strangest thing occurred at the beginning of my third year at the inner-city school. A professor from a local university came looking for a "master teacher" for his "urban teacher education project" and asked the principal for a recommendation. For some reason she recommended ME. The professor observed a lesson and concurred with her! And so I became a master teacher at the university and was awarded with extra money and a fellowship!! Why they would choose such a bad teacher, I don't know but I was grateful for it then and now.

For the rest of my career, I taught the children of immigrants. They were too young to be tested, but once my former students entered the upper grades, most scored between the 30th and 40th percentiles, so you could say that I was a mediocre teacher for most of my tenure.

I think there is much to be learned from my experience. Every teacher can be made "outstanding" by transferring her to an affluent school. So, for example, if "Miss Jones" in East L.A. has low test scores, she can be sent to Beverly Hills for one year to improve. Once her test scores come out above the 90th percentile, she can be certified as "outstanding" and be sent back to the poor school to work her magic.

So yes, I do think test scores should be used to evaluate teachers, but only if they (teachers) can be moved around.

The best teachers can manage kids with respect and fairness. In an urban school this skill is paramount. Learning doesn't happen in a noisy, chaotic class. Students can learn self-control from a good teacher, appearing to be engaged, but does this impart academic success? It can, but parents who are pushing their kids are more important. Research those ethnic groups that have had the most academic success. So why not reward parents too?

Should teacher evaluation depend on student test scores? No. It should instead depend on teacher exams and certification step scores.

State by state, too few teachers get passing scores on the academics portion of their certification exams. States regularly have to reset the rules after the fact, because they can't find enough teachers that pass the academic bar.

Teacher evaluation should be so designed to advantage primarily the teachers who are proficient in their own subject matter.

One of the biggest problems we have in a nearby district is that some of the schools in the very low SES areas have extremely high turnover - as much as 180% a year! How can a teacher be evaluated on student scores when the students that began the school year are not the students being tested?

Years ago I read a hypothetical article in which dentists were evaluated on the number of cavities their patients had, and were paid accordingly. Of course, 'patients' in nice neighborhoods with secure households had lovely teeth, and well-paid dentists. Those who chose to serve the needy worked diligently, but could not make up for the lack of care and nutrition that are needed from birth on to maintain healthy teeth. These dentists were deemed ineffective and did not receive merit pay, despite their best efforts. Too many cavities!

The students that struggle the most are those whose families move frequently. Even the most wonderful, brilliant, caring, educated and talented teacher cannot influence a child who is gone.

andrei radulescu-banu - perhaps, but where are we getting all these teachers that, "..are proficient in their own subject matter." ?

Hmmm, we might have to follow what they do in the private sector when they can't find enough qualified applicants for a position ... offer more money. Don't think that will happen though. : )

Reminds me of Enron's Rank-And-Yank.
Rank everybody, and if someone is in the bottom 10% two years in a row fire them.

Brilliant businessmen, those Enron guys! Managed to raise prices by denying power to people, and trashed the retirement accounts of tens of thousands... all through their application of mathematics.

Nev, teachers proficient in their subject matter are indeed in short supply; that is exactly the reason why they should be recognized during their evaluation, and rewarded with higher pay. This would indirectly prod teachers to look for Education School degrees with a better subject matter/pedagogy balance.

Asking the wrong questions will always yield random answers.

Schools are intended to be institutions of learning and citizenship and assimilation into a country and peer generation.

Testing is useful as a metric of individual student progress and where applicable this metric can be used to compare that student to an age/grade norm within that peer group as well as idealized norms.

Tying teacher effectiveness to such tests turns the testing upside down. No longer is the metric based on understanding the student's needs. Now test-taking skills are elevated to high-stakes, high-art, and high-stress.

Teachers, as a matter of survival, will have to ride their students like racehorses to ensure that their test-taking skills and test content preparation are honed. We are no longer concerned with the learning needs of the student as an individual. We are now substituting the political needs of arm chair economists, tea-baggers, bean-counters, and ultra-right extremists for those of the client population.

If these were barn-yard animals involved in cock-fighting activities, federal agents would intervene because such sport is cruel and unusual. But these are children whose lives are as meaningless as pit-bulls and roosters to the plutocrats and get-ever-tougher politicians whose teacher blood-lust must be satiated with the narcotic high of 'accountability'.

If this country were interested in education, we would be advocating a diversity of teaching styles and support so that the needs of the students come first and that means working with all teachers to elevate the professional service of the school.

For example, teachers should turn in the tests they give their classes for peer evaluation. Are the tests well-thought out? Do they capture worthwhile information?

And what about follow-up? Has the teacher, after giving a test, subsequently built lessons that reinforce the previous material and trajectories of students, or is everything a one-off, drive-by lesson?

Those are metrics of teaching prowess that are worth weighing and tracking. Chronic offenders can either improve or be shown the door in more than one way.

- Frank Krasicki

While I believe our nation's assessments and performance metrics must get better and more unified, it is without a doubt that teachers must be held accountable for student learning in their classrooms. For years we've gone without accountability, without clear standards and what has happened is that generally, students in middle and upper income areas and schools have performed fairly well (however, not when compared internationally). New accountability has highlighted that we have a drastic achievement gap in our country based on SES and largely on race.

As a teacher who works in a school where over 90% of our students are FARM recipients, I look around and see most teachers not even trying--lowered expectations, busy work, and compliance are expected rather than urgently working to catch our students up. In taking the long view, the first step is clear accountability and incentive for teachers to begin to be grounded in the extent their students are learning. Our world must begin to ask more often "to what extent are my students learning? How do I know? What am I going to go do about it?"

Then coupled with this, we need to give teachers the support and development to then go do something about it. Without clear benchmarks, we can never know whether we're making adequate progress. The root of the problem is when these benchmarks (tests and standards) become the end themselves.

In my mind the standards and assessments are a baseline for our kids. These assessments do not require our students to be rocket scientists. They require our students to have a basic level of competency in fundamental skills that enable our children to then have the opportunity to go to college, work, or trade school if they choose.

I have not read all of the above posts. Tying student achievement to teacher evaluation is absurd as both are presently assessed badly.
I retired as a high school physics teacher in an urban area part of the NY metro area. My students were of a significantly higher caliber than the 9th grade students taking earth science yet most of them failed the NY State Regents exam in Physics year after year. The teachers teaching 9th grade earth science had more stressful teaching conditions than I did and a much higher failure rate.
When I started teaching Physics at that school in 1997, about 30 students per year enrolled in Physics. By my second year, I had increased the enrollment to 4 classes (100 students) and a total of 33 students passed the Physics Regents. The exam was made optional in subsequent years. Failure rates increased and guidance counselors told the students to avoid Physics, "it isn't needed for college." College admissions people told me that they would prefer to see students with "C's" in Physics rather than students with only 3 years of science.
Teachers are assigned to teach students. Those assignments are often based on seniority. Groups of students may be in poorly functioning classes of at least 30 where neither the kids nor the teacher has much chance of success. Teacher and student "success" are rarely considered in structuring learning environments in urban high schools with limited resources.
After I retired from the above school district, I taught Physics in an upper middle class town in Connecticut. All of the college bound students were expected to take Physics and, in a school with the same size population, there were 5 Physics teachers. Not all of the students were successful. The overall rate of teacher and student success was MUCH higher and the learning environment was much more student and teacher friendly.
I am knowledgeable about both student and teacher assessment and have extensive training in both areas. I also have many years as a teacher trainer in content, process, and technology areas. I have administrative certification in NY, CT, and NJ at the school and district level though I have never been a school administrator.
My apologies. I haven't given this posting a great deal of thought as I am on vacation in the Dominican Republic.
The effort in New Haven is most unlikely to succeed. There isn't enough money to keep people in teaching under these circumstances (the pay in NY was 30% greater than in the CT town).
Good luck!

It's not about the test scores, it's about classroom instruction. The analogy to the dentist and cavities is a good one. What if dentists were held to a "no cavity" in any patient standard similar to what high schools will be held to shortly when 100% of their students will need to pass their state exam and meet state standards? Doctors with no illness left uncured and no body left on the operating table? For those who know nothing about teaching, please read Marzano and look at the National Board Certification process to know what skills need to be mastered to see what a master teacher is. To learn about evaluating teachers, read Danielson.
In the 33 years I have been a high school teacher, I have served as a state consulting teacher to a tenured teacher whose principal placed her on remediation, the final step before being released, for poor performance. It was a lot of work by all involved and due process assured the teacher that performance, not subjective opinions, were evident and given as just reasons for being terminated. Yes, poor teachers, tenured or not, can be released when thorough, objective steps are used to present the evidence for termination.
This is my seventeenth year evaluating the more than 25 teachers in my department. I have been mentored, attended workshops, and have received feedback on my evaluations and have become a mentor to other department chairs. I have worked with young teachers who have gone on to become National Board Certified Teachers, counseled other teachers that education is not a career for them, and put the time and effort in to gather evidence through observations to begin the remediation of a tenured teacher who eventually saw the handwriting on the wall and resigned during the school year. I know what good instruction and good teaching are. I don't need to look at test scores to know who my good, better, and master teachers are.
I have also spent 21 years coaching high school athletics. And as other writers have noted, it's the talent on the field or on the court that determines a winning record. The public should be reminded of what Ray Meyer of DePaul University said when asked about his final four basketball team. With apologies to Ray, 'All of a sudden you (reporters) think I'm a great coach because we're a final four team. I did my best coaching and teaching a few years ago when we weren't a winning ball team. Where were you guys then?' Coach Meyer was not unskilled for many years; he worked hard teaching to a limited talent pool. Coach Meyer didn't all of a sudden become a genius when he had an outstanding talent pool. He, as all good teachers do, got the most out of his students year in and year out because he went to conferences to learn good teaching techniques, reflected daily on his instruction, and tried to become better and better as an instructor. If you look at Ray Meyer's win/loss record as the test and measurement of his skills, he would be average at best based solely on the scores. Not fair to him and not fair to other teachers.

Collectively, this blog puts the cart before the horse.

The opinions over merit pay and psychometrics do range widely. Each commenter seems to have a different recipe for reaching accountability too. These are all good things to be expected among intelligent and professional adults.

However, all the comments imply something like the following: "Under no circumstances must a parent or 'stakeholder' be allowed to buy- or not buy- public school services."

This fundamental right of a free person happens to be the most powerful first step in accountability. All other important issues, like psychometrics and merit pay, come only later in the process. Who would deny it?

I have a couple of comments here...

First, though, to Ed: As an active Red Cross Disaster Volunteer for more than 20 years, I can tell you that the Red Cross does use metrics. I am a member of the "response evaluation team" that collects the data... but our data includes a hefty dose of interview, done either by phone or in person. We interview clients, volunteers, and paid staff, and we vary the information collected by the position the person holds. We also evaluate volunteers who serve 5 or more days (although any volunteer who is on a job 3 or more days can ask). Volunteers can be sanctioned for inappropriate behavior, and when that happens their local chapter is notified. This is also part of my job, although not a part I like... it's hard to fire somebody who doesn't get paid, especially when you also don't get paid. I can tell you, however, that those metrics don't necessarily lead to "better disasters". Each disaster is different. Each client is different. Larger disasters are harder than smaller disasters. Disasters in areas where local chapters work together in regions are better run than disasters in areas where the chapters cannot or do not work together.
With that said: My paying job is a special education teacher. My biggest problem with tying my pay and evaluation to my students' test scores is that I have students with dual-diagnosis of autism, depression & anxiety disorder, cognitive impairment (mental retardation) and behavior disorder, severe and multiple disabilities who are non-verbal. Yes, I teach reading & math. I also teach history and science. We use lots of technology. And I can prove through work samples and checklists that my students DO learn. They become more independent, more able to access their community and communicate. They develop friendships and participate in school activities. Those are all things that my students aren't supposed to be able to do because of their disabilities. But that isn't on the state tests; if you want to give me pay based on their test scores, you need to first find and use a test that appropriately meets the needs of these students. It isn't fair to tell me that my students don't show improvement if the measuring device is inaccurate.


Unfortunately, testing is one of the surest ways teachers can measure how much their students are retaining. I understand test anxiety and sometimes these tests are not the best indications of how smart a student is. However, until they can come up with an alternative that more accurately determines a student's level it is all these teachers have to go by.

As I read all of the comments so far, I must make this comment: "You can give the best teacher in the world the wrong combination of students and it will be a horrendous year for both the teacher and the students."

There has been lots of discussion above about Daggett's Rigor and Relevance, and there has been very little discussion about Relationship. If there is a relationship between the teacher and the students as well as a positive relationship amongst the students, THEN it will be easier to teach, inspire and learn in that classroom.

Perhaps there is a misspelling of the word "Rigor." I taught second grade for ten years and the students often struggled with lower case manuscript r's and v's. Many times my second graders would spell "rigor" as "vigor."
Since rigor is a precursor of rigor mortis then perhaps we should instead put VIGOR into the relationship part of the teacher/student relationship. Students don't care how much we teachers know, but they do know how much we care.

Finally this discussion appears to be going in a more relevant direction. Till recently the only questionable link to progress was the teacher. As a relatively new educator (this is my 10th year in the High School classroom teaching Math at a suburban school with open enrollment from the neighboring city, 30% free/reduced lunches) I have much to learn. As a 57 year old I feel I have much to offer my students. My experience goes beyond the classroom as I have spent my life in several different careers, shaping both my views as well as my classroom approach. I spent the 70's as a steel worker, (making double typical teaching wages). When a crate of steel bars was brought to me to be welded, there was consistency in the fit and quality of material. I was expected to achieve a certain level/quality of production and was paid accordingly. I have yet to see a classroom of students with the consistency of that crate of steel bars! I have students who are hungry for food as well as attention, who are sleeping on friends' sofas, who are expecting a child, all of these affecting school performance. Then some moron-politician wants to equate my pay to this student's performance. Last year I taught two identical classes back to back. One had seven Fs the other none, (30 in each). Had I only the low-performing class, I guess I might have been fired based on these proposed standards. My saving grace is I only need to teach for 5 more years before this "nonsense" gets a foothold. My fear is for my Son and Daughter who teach Math and Chemistry in Urban schools and all of the wonderful young people I work with who have sacrificed "living wages" to be a part of what should be a noble profession.

Tom Fischer,

Good post. Do you agree that these moron politicians selling merit pay models have to create fictions that objectify the classroom just so the logic appears internally intact?

In other words, these technocrats have to assume away the uniqueness of each child, teacher and situation, dehumanize it all, and treat variables like they have the predictability and malleability of physical things- like steel if you will- so that their management dictates seem scientific. What say you?

I apologize if any offense was taken in the direction of the President whom I worked tirelessly for last year. After posting and rereading I felt there may be some misunderstanding. The politician I had in mind was the schoolboard member we have that has decided final exams were unimportant. And we no longer have them....but that's a whole other blog!

Stan Greenbaum, thank you for a very informative post on your experience as a Physics teacher.

I look and compare the Physics taught these days in my town schools. I am really concerned to see that Physics is relegated as an optional one year class in High School, where in many other places around the world it is taught to every kid, every year since 6th grade. How can that be changed in the States - what is your opinion of that?

The 2nd question I'd like to ask - what are in your opinion the best Physics curricula commercially available on the market?


Nuts, no time now to read all these great discussion points with care.

Tom F: when a crate of steel bars was brought to you, there was a consistency. Indeed. And some form of student-centered metrics will allow a teacher to at least gauge what the "properties" of the student really are. It's not enough to know that little Kylee got C- in 4th grade math, and gee, she'll get a C- in 5th as well. The state tests don't help with this, but they ought to get you thinking of how your school can.

MDS mentions interviews and other ways of measuring employee effectiveness. That's great; no one outside the system will have much problem with that as long as accountable management has some substantial say. What we know doesn't work is to automatically pay someone solely based on their years of coming back.

Nuts again. I really need to go work on some Test-Driven Software Development. That's where we design the tests before we design the software. Hopefully, I'll get paid when it's all done and someone likes it.

Thank you for this article. It also expresses some things that I agree with. On the one hand, I very strongly believe in teacher accountability. I think it is very, very important that we hold ourselves to the serious task of helping students grow from whatever their individual starting mark is when they come to us. I even think that others should hold us accountable for such growth. But recently, I have seen some disturbing trends in how we are held accountable.

First, I wish that our tests measured growth.

Second, I wish that our tests measured more than just low-level skills, which are too often the focus of the EOCs. They don't test super important things such as whether our students can compose an essay of the quality that a college would require or do a real-world project that requires planning, researching, formulating opinions over time, and completing multiple steps to create a final product. The EOCs don't even test if our kids can question!!! Instead, they are given all the questions AND the answers. Education becomes the process of teaching kids to choose from what is given to them. Education SHOULD be about helping kids to create their own questions and their own answers. There has to be a way to test this more authentically!

Finally, I am appalled by how test scores are used. It does not seem to me that failing scores, on their own, prescribe the needed reform. It seems to me that there are so many factors affecting our students' achievement that COULD be addressed but aren't. To me, it seems we focus on the wrong things as we try to raise our scores. To me, it seems our school needs some serious restructuring and far more resources to get our students engaged and caught up.

I was in so much pain seeing the data that places our school 417 out of 417. While I do not think that the EOC is the best measure of our students' worth (or ours as educators), it is clear that our school is not able to give so many of our students what they need in order to go on from high school and live the sort of lives we would hope for them -- lives that are happy, productive, and where they have choice and freedom to do things they love, not just things they are forced to do by a lack of skills. I hope and pray that we can improve this situation, but I do not think that EOCs will ever be the key to improvement as they stand now.

With you in solidarity,
Carolynn Molleur

Comments are now closed for this post.

