Report: Tennessee Teacher-Observation Scores Inflated
Too often, the officials charged with evaluating Tennessee teachers on their practices have failed to properly identify and help low-performing teachers, concludes a report by the Tennessee department of education on the implementation of its statewide teacher evaluation system.
Teachers generally earned high scores on their observations, but their students often didn't show corresponding levels of academic growth.
"If you're struggling and the student achievement data shows you that, but your boss tells you you're OK, you are not going to change your practices," Tennessee Commissioner of Education Kevin Huffman said in an interview. "We have to make sure that the system is applied consistently."
In addition, the report recommends reducing the weight placed on schoolwide performance in teachers' reviews, one of the main complaints educators have had about the evaluation system.
The report draws on feedback from some 7,500 teachers and every district superintendent, as well as thousands of additional surveys of teachers and principals.
Implementation of the Tennessee system has been closely watched because it was one of the first evaluation systems funded under Race to the Top to be taken statewide. Some of the implementation glitches—teachers and principals alike who felt overburdened by its requirements—have been widely documented, including at Education Week. Such complaints led the education department to offer some flexibility last fall.
Still, the Tennessee Education Association has been largely critical of the program, and has pressed unsuccessfully for this year to be retroactively considered a pilot year, with no stakes attached to the results. (Teachers who repeatedly earn low scores can lose tenure status.)
One component of the system seems likely to change as a result of the report. The schoolwide "value added" score, which represents a school's overall performance in helping to boost students' math and reading test scores, counts for those teachers who don't have specific student exam results in their field or subject. Many teachers complained that this score didn't reflect their individual work, and that its weight should be reduced.
An independent report issued by a state nonprofit watchdog group came to a similar conclusion in June. (The lack of individual results in nontested grades and subjects has been a nationwide challenge.)
The department's report says it will recommend reducing the weight given to this component. For one, it will seek to include more teachers in individual growth measures, partly by drawing on the efforts of teacher groups that have developed alternative measures in subjects like fine arts and physical science. While the development of additional measures may result in some more standardized tests in other subjects, "we are definitely not trying to come up with an individual assessment for every subject and situation," in the mold of Florida's Hillsborough County district, Huffman said.
The schoolwide measure won't be eliminated altogether, however. Principals reported that they saw their staff collaborating more to introduce academic concepts into subjects like fine arts as a result of that measure, and the report recommends preserving it to some degree.
In a troubling finding, the report shows a disparity between the system's two main components: the observation scores given by principals and the academic growth of students. For instance, 16 percent of teachers got a 1, the lowest score, on the student-growth component, but only 0.2 percent of teachers got that score on their observations. The report suggests that observers are giving teachers higher scores than are warranted, a situation that prevents teachers from receiving high-quality feedback on their performance or guidance on how to improve.
The cultural norms of the teaching profession, which have often tried to minimize differences in performance, as well as the everyone-knows-your-name nature of some of the state's school districts, may partly explain this disparity, Huffman said. "There are real human challenges here," he said. "In a rural community where people know each other well, you're not just my principal. You sit the next pew up from me at church."
But a TEA official suggested in comments in a Tennessean article that perhaps it's the value-added scores that are out of whack, rather than the observation scores.
The education department made a series of recommendations for improving the system, though some of them, such as reducing the focus on schoolwide growth, need legislative approval. The state's board of education will take up others at a meeting later this month.
Among other recommendations in the report:
• Teachers earning the top score on the student-growth measure would be able to count it as their final evaluation score, though they would still receive feedback from observations. They would also qualify for a streamlined evaluation process.
• Evaluators whose scores deviate significantly from the value-added measures would be retrained in the observation rubric, a four-day certification process run by the Santa Monica, Calif.-based National Institute for Excellence in Teaching.
• Teachers getting the lowest score on either the qualitative or quantitative component would receive additional observations.
• The state board of education should exercise its right to withdraw approval of district-devised alternative evaluation models if the results show a disparity between value-added and observation scores.