Through the lens of social science, eduwonkette takes a serious, if sometimes irreverent, look at some of the most contentious education policy debates. (Find eduwonkette's complete archives prior to Jan. 6, 2008 here.)

Data-Driven Decision Making Gone Wild: How Do We Know What Data to Trust to Inform Decision-Making?

skoolboy returns to weigh in on data-driven decision making:

I’m as much a fan of data as the next guy. But I worry that proponents of data-driven decision-making are understating just how hard it is to use data thoughtfully.

I’d like to describe the strategy championed by the New York City Department of Education, and point out the difficulties involved. The logic that the DOE is promoting is (a) use data to identify an area where a school is lagging, either in relation to some absolute standard or to other similar schools; (b) use the available data systems to identify similar schools that are doing better in this area; (c) ask these more effective schools what they are doing that accounts for their success; and (d) adapt their suggestions for use in the school.

It’s not as easy as it looks to determine which schools are doing better than others. Two different criteria are relevant: is the difference in performance between two schools large enough to matter, which is sometimes termed educational significance or practical significance; and is the difference in performance between two schools real, or could it just be due to chance, which is typically described as statistical significance. Ideally, we are interested in differences that are both practically and statistically significant. But a difference could be large yet not statistically significant (which is often the case when we have a small sample of information about performance), or statistically significant yet very small (in which case we are pretty sure that the difference is real, but it’s just not very important). (Yes, statistical significance does matter!)

This is kind of abstract, so here’s an example, drawn from the NYC Department of Education’s Survey Access tool, which reports the results of the system’s first round of Learning Environment Surveys in the spring of 2007. The Department’s spiffy PowerPoint presentation imagines the principal and a group of teachers in (mythical) IS 402 identifying teacher engagement as an issue. In particular, teachers in this school generally disagreed that “Obtaining information from parents about student learning needs is a priority at my school.” Using the Survey Access tool, it’s possible to identify 12 similar NYC schools (i.e., middle schools with an enrollment over 700 and at least 25% ELL students), seven of which have more positive scores on this question. In the top school, the Eleanor Roosevelt School, 71% of the teachers strongly agreed or agreed with the statement, whereas in the bottom school, 13% of the teachers strongly agreed or agreed. (In mythical IS 402, 36% of the 31 teachers who responded to the survey strongly agreed or agreed.)

So why not just look at the seven schools above IS 402? Because the percentage of teachers strongly agreeing or agreeing is an estimate of the true percentage that would be observed if all teachers in the school responded to the survey. (In these 12 schools the teacher response rate ranged from 26% to 53%; in mythical IS 402, 40% of the teachers responded.) Our interest is in the population of teachers in the school, not just the sample that chose to respond. And there’s a degree of uncertainty in these estimates. If a different group of 31 teachers in IS 402 had responded, just by chance, we might not have obtained an estimate of 36% strongly agreeing or agreeing. In fact, with a sample of 31 teachers responding and a sample estimate of 36%, the percentage of all of the teachers in IS 402 agreeing or strongly agreeing could plausibly range from 23% to 49%. (There’s a finite population correction in there, for those who care about such things.) That’s a pretty big range, and the range of possible values is pretty large for the other dozen schools as well.
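
For readers who want to check the arithmetic, here is a minimal Python sketch of how an interval like 23% to 49% could be computed. It assumes a 95% confidence level and the usual normal approximation for a proportion, neither of which is stated in the post, and it infers a staff of about 78 teachers from 31 respondents at a 40% response rate. The function name `proportion_interval` is made up for illustration.

```python
import math


def proportion_interval(p_hat, n, population, z=1.96):
    """Normal-approximation interval for a proportion,
    shrunk by a finite population correction (FPC)."""
    # Standard error of a sample proportion, ignoring the population size.
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    # FPC: the interval narrows as the sample covers more of the population.
    fpc = math.sqrt((population - n) / (population - 1))
    margin = z * se * fpc
    return p_hat - margin, p_hat + margin


n = 31            # teachers who responded (from the post)
p_hat = 0.36      # share agreeing or strongly agreeing (from the post)
population = 78   # ASSUMPTION: 31 / 0.40 is about 77.5, rounded up

low, high = proportion_interval(p_hat, n, population)
print(f"{low:.0%} to {high:.0%}")
```

Under these assumptions the sketch gives roughly the same 23% to 49% range quoted above. A different confidence level or an exact (non-normal) method would shift the endpoints somewhat.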

Of the seven schools above IS 402, just one of them, the Eleanor Roosevelt School, is really head-and-shoulders above it in a statistical sense. The other six are statistically indistinguishable, because there’s so much overlap in the intervals in which the true percentage of all of the teachers strongly agreeing or agreeing in each school lies.
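
The overlap comparison can be sketched in a few lines of Python. In the minimal example below, only the IS 402 interval comes from the figures in this post; the two comparison schools and their intervals are hypothetical, since the post does not report intervals for the other schools, and `intervals_overlap` is an illustrative helper rather than anything in the Survey Access tool.

```python
def intervals_overlap(a, b):
    """Return True if intervals a = (low, high) and b = (low, high)
    share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


is_402 = (0.23, 0.49)    # plausible range for IS 402 (from the post)

# HYPOTHETICAL intervals, invented purely for illustration.
school_x = (0.30, 0.58)  # a school only modestly above IS 402
school_y = (0.58, 0.84)  # a school well above IS 402

for name, interval in [("school_x", school_x), ("school_y", school_y)]:
    if intervals_overlap(is_402, interval):
        print(f"{name}: overlaps IS 402, not clearly different")
    else:
        print(f"{name}: no overlap with IS 402, likely a real difference")
```

Checking for overlap is a rough screen, not a formal test: intervals that do not overlap indicate a real difference, but two schools whose intervals overlap slightly could still differ significantly under a direct two-proportion test.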

Would the principal and teachers in IS 402 learn something from asking the staff in these seven other schools how they do things? Sure! It doesn’t hurt to think about new ways of doing business. Will doing so raise performance in IS 402? Probably not, because an assessment of statistical significance suggests that, with the exception of Eleanor Roosevelt, these other schools really aren’t doing better, and therefore there’s no reason to think that adopting their practices will yield genuine improvements.

Data-driven decision makers, beware of spurious comparisons.

Comments

Well, I like data as much as the next guy too, of course. Now we can argue all day about theory (well, you can if you want) but the idea of using tests to assess working people without their knowledge seems plainly repugnant and unethical.

Chancellor Klein here reminds me of no one more than the fabled Dean Wormer in Animal House pronouncing, "You're all on double-secret probation!"

Of course, when your chief accountability officer literally runs from involved parents, the quest for scapegoats is daunting indeed.

Hi NYC Educator,

Just to clarify - skoolboy is not writing about the NYC Teacher Experiment, but the larger idea of D3M in education. If you check out the comments on the previous teacher experiment posts, you'll see he's on the same page.

Oops. I apologize. I knew he was on the same page, though, as I've read his comments elsewhere.

Did you mean to select a trivial example?

"IS 402 identifying teacher engagement as an issue. In particular, teachers in this school generally disagreed that “Obtaining information from parents about student learning needs is a priority at my school.”"

This example points up another problem in DDDM --> my favorite professor always harped on "the error of misplaced precision." It's a good thing she did, because it's a lesson I learned well.

If you actually went looking for schools that had high ratings on this topic, I would suggest that you were wasting everyone's time. It's a common problem when people have too much data -- all the data is treated the same, whether it's important or not important.

To me, that survey question might -- might! -- be useful in trying to discover factors that might contribute to a bigger problem, but I would never look at it as a problem all by itself.

To me, the biggest problem in working with schools on using data is this: keeping the big picture in mind. In many cases, consultants make things worse, because they amass so much data (and so many bar graphs) that they make it look as though it's incomprehensible (without the help of a paid consultant, of course).

Hi Kathy,

I agree completely with you that schools can be awash in data, and that this can make it hard to see the big picture. (I haven't seen the insides of NYC's vaunted ARIS system, but I suspect that the goal is to have everything but the kitchen sink in there.) Understanding the big picture requires time for reflection and judgment, and such time is in short supply in schools and school systems being driven by short-term carrots and sticks. Moreover, the professional development to cultivate the necessary skills can't be done in a one-day workshop. In consequence, there is a grave risk that data-driven decision-making can devolve into a caricature of grabbing whatever data a system places within arm's length and running with them, whether or not those data are relevant to the big picture.

The example I used is the actual example the NYC DOE uses to demonstrate the value of the approach. You're right to worry that if this is the best they can come up with to demonstrate the value of D3M, they have a pretty superficial understanding of the process.

Skoolboy and Kathy: It is heartening to discover some real data wonks working in schools. I am always disheartened when among the academic community of a school there is not the ability to marshal the resources of mathematician/statistician, writer, planner, evaluator, scientist, etc--essentially the kind of team needed to plan and implement reforms. It's frightening, on the one hand, to think that among a teaching staff these pieces are not available (as in you can't teach what you don't know), or on the other hand, that they haven't figured out how to draw productively on one another's skills.

I would say yes to both of your points. If you use a methodology that includes seeking out those who are doing better, you need to be sure that your data really show that they are. And you don't want to pick out the least helpful data point simply because it is the one that is lowest--you have to understand what it means, and whether it is helpful in achieving your goals.

But where I disagree is on this issue of no time to do the appropriate thing because of the pressures to succeed. When I look at schools that have spent 8 years in improvement status (I believe the latest example comes from Chicago), and proclaim "nothing works," I have to think that rushing through the process has not gained them anything at all--whether educational gains for students, or surcease from pressures on adults.

I believe that there is a profound tendency "in the field" to aim squarely at the foot when confronted with the necessity to move forward. Certainly the change in scenery is avoided (and who knows what might be on the road ahead), but at what cost?

Agreed! I have found that, with a bit of guidance and an emphasis on common sense, teachers are excellent at reviewing their own schools' data.

I am reminded of the group of teachers I was working with (who got it, they really got it) who reviewed their data and proclaimed that, "We've got to stop feeding these kids free lunches. Those lunches are just killing their test scores."

It's a funny example, isn't it? Because the school that doesn't get information from parents about student learning needs, is that a bad place?

In particular, my school gets an appropriate amount of information from parents about student learning needs, which is, frankly, just a bit.

This wasn't the only question where the best answer for a school may not have been the one worth the most points.

And all that, before the problems with sample, etc, etc.

The opinions expressed in eduwonkette are strictly those of the author and do not reflect the opinions or endorsement of Editorial Projects in Education, or any of its publications.