Making Sense of School Improvement Program Evaluations: The Case of TEEM
Staci Hupp of the Dallas Morning News writes that the Texas Education Agency has released a third-party review of the Texas Early Education Model (TEEM) managed by the Texas Health Science Center (THSC) in Houston. The study, conducted by Edvance Research, finds that TEEM does no better at preparing kids for school than other preschool programs. (The report is available with the article.)
In the 2006-07 school year, TEEM served 27,000 children. Ultimately, the program is to be offered statewide via an interesting dissemination model: Head Start and private childcare centers are offered substantial financial incentives to adopt it. The initiative has cost Texas taxpayers $45 million since 2003. It will cost a great deal more if implemented across the state. So the evaluation matters.
Hupp reports the usual reactions from those who would like to kill TEEM off and from the developers who want to keep going. What’s a policymaker – or taxpayer – to do?
Perhaps the first point to consider is the rapid scale-up of TEEM. When the stakes are perhaps several decades of expenses associated with the education of every Texas preschooler, prudent decision makers should establish a timetable and a set of conditions for “go/no go” decisions. In a program’s earliest stages, that suggests working with just enough children to get the timely, reliable and accurate evaluation data required to support a decision to terminate a research effort – and no more. My own experience with New American Schools’ multi-team, multiyear development and scale-up effort suggests that the THSC decision to serve roughly 1,200 students in 100 classrooms in its first year (2003-04), and especially 4,000 students in 250 classrooms in its second (2004-05), was a waste of money. More could have been learned of TEEM’s potential – with greater confidence – if THSC had focused its effort on fewer schools.
The second point is that the period of evaluation – just the first two years of the program, SY 2003-04 and 2004-05 – is unlikely to supply clear go/no go evidence. No one familiar with program development, scale-up or evaluation would expect “slam dunk” results for or against TEEM in this time frame. The model itself is new; program leader Dr. Susan Landry described her enterprise as a start-up; initial implementation at any school is disruptive; scaling up implementation support capacity is hard; the selection of schools for initial dissemination is rarely strategic; and so on. After getting past the “this dog won’t hunt” hurdle, what program managers and policymakers should be looking for are clues to the factors associated with program success or failure: not necessarily findings of statistical significance that will support defense of a doctoral dissertation, but information that, in the hands of those with experience in the creative aspect of program development, offers new avenues for thinking.
OK, so what was the finding from these two years?
The bottom line from the Executive Summary:
There was considerable variation both between and within communities with regards to student performance and teacher outcomes. For about half of the communities, students in the treatment groups (with TEEM) improved more than students in the control groups (without TEEM), and for the other half of the communities students in the control groups improved more than the students in the treatment groups on the student outcome measures. TEEM did lead to overall improvement for teachers, although there was considerable variation, with teachers in both control and treatment groups obtaining both positive and negative difference scores on the teacher outcome measure.
Saturday: Some variation of the “about as good as what we’ve always done” finding seems to be the norm in the evaluation of school improvement programs. What should we make of it? What can we do about it?