
President Obama’s Agenda


Dear Deborah,

I will get back to you on another day about the strengths and dangers of a national curriculum.

Today I want to initiate a conversation with you about President Obama’s education program. We previously discussed Secretary Arne Duncan's policy views, which frankly sounded identical to those of President Bush's secretary of education, Margaret Spellings.

Now President Obama, in a speech on March 10 to the Hispanic Chamber of Commerce, has repeated the same views.

The president said that his administration would support whatever works, without regard to whether the ideas are liberal or conservative. He then laid out a vision that heaped goodies on both the liberal Broader, Bolder Agenda (early-childhood education) and the conservative Education Equality Project (more testing, tough accountability, charter schools, merit pay).

This is a politically astute trick. President Obama avoids choosing sides by giving both camps what they want. The left wants more funding: Done! The right wants choice, testing, and merit pay based on test scores: Done!

But let’s look at the vision of where American education is heading. The key here, I think, is the $250 million that the Obama administration will give to states to build longitudinal data systems. These data “warehouses” will collect and track every student’s data from pre-kindergarten through the end of college. Students’ test scores will be linked to individual teachers. Teachers who fail to get test score gains consistently will lose their jobs, while those who do get gains will be rewarded with bonuses or higher salaries. That is one obvious use of the data warehouses.

The assumption here is that the tests we have are excellent; that they are vertically aligned from grade to grade; and that they can safely and reliably be used as the basis for making high-stakes decisions for teachers and students. Many testing experts would challenge each of these assumptions.

Another piece of President Obama’s vision can be seen in his call to states to remove the cap on the number of charter schools that may be established. This one worries me. We both know that there are many excellent charter schools and many abysmal charter schools; states have been slow to close down the latter. But even when they do, there remains this question: Over the long term, what happens to the public school system when the most motivated students enroll in charter schools? What will be the state of public education, especially in our cities, a generation from now if the states take the president’s advice?

So President Obama would have us turn our public schools over to charter entrepreneurs. Charters were originally proposed as a way to deregulate education. We should all wonder: Is deregulation a cure for what ails American education? Or will American education find itself in the same dismal condition as our financial institutions a decade hence?



Deb, a Happy St. Patrick's Day to you and All.

St. Patrick perhaps is a good way to start the conversation. How many of us left grad school thinking of St. Patrick as a mythical figure who charmed snakes and perhaps little children?

Yet St. Patrick was a rare man of civility and love of learning in a time when very little else was civil or learned. Everyone here would certainly enjoy "How the Irish Saved Civilization," a story a far cry from much else that is told.

Cahill's claim that Patrick was "the first human being in the history of the world to speak out unequivocally against slavery" may have been a bit over the top. Yet the story woven here, of how Patrick and his missionary followers converted much of Europe to a common cultural norm, makes for a powerful education meme.

Note that I just said "cultural norm." Christianity has become, to many a self-proclaimed scholar, an ideology of the silly and unlearned. Driving from schools and texts any history of Judeo-Christian accomplishment has become a life goal for many an activist. They do so to the peril of cooperation.

In Patrick's time, Christianity was a way of moving beyond barbarism; beyond the petty and violent regional competitions. In "How The Irish," Cahill provides a wonderfully graphic description of the barbarian customs of Ireland and Europe after the collapse of Rome. (What boy wouldn't love this story?!)

Into this mayhem stepped Patrick and others; they began a second, incorporeal "Rome," again unifying--in a fashion long preceding the European Union--the disparate cultures and fiefdoms of the continent.

As I've mentioned before, in my 20+ years of education, not one person existed before 1492; certainly no one with so interesting a story as St. Patrick. Why?

Your thoughts on the President echo mine exactly. How shrewd he's been. I disagree with almost everything he campaigned and stands for. He has given every liberal everything they ever wanted for Christmas.

Still, he is willing to deny the education associations their most cherished Grail: the rejection of any form of competitive process. For that, I am willing to concede it's possible it's worth the cost.

I might just trade everything else I believe in, if he will just 1) measure whether kids can read, and 2) pay teachers the way engineers and knowledge workers are paid.

Nuts. If it's Tuesday, it must be Diane.

I'm sorry.

You have both been clear on what you don’t like about NCLB, the Bush education regime and so far the Arne Duncan/Obama program pronouncements. Could you specifically state what policy and actions the Obama administration should be taking other than sending more money to the states to spend as they see fit? Specifically:
1. Do you favor national K-12 math and science standards (fungible subject matter in any language) that are comparable to the world's highest standard (i.e. Singapore) or should states be able to continue to decide what math and science students need to learn to compete in a world economy?
2. Do you favor a local or national standardized high school exit exam or do you agree that states should continue to define their own levels of proficiency?
3. Do you favor higher compensation (cash, loan forgiveness, housing subsidies, tax breaks) for teachers willing to teach in inner-city ghettos or on Indian reservations?
4. Is the existing standard school year of 180 six-hour days enough time for students to learn what they need to become educated, or should the school year and/or day be lengthened, and if so, by how many days/hours?
From reading “Bridging Differences” I have a fairly good idea of what you are not in favor of but what you are in favor of remains somewhat murky to me. Probably my issue but some clarity would be greatly appreciated.


It was interesting a few weeks back when you were labelled a progressive and a Nazi (or a socialist) in the space of a few responses. I am similarly flummoxed by your (and others') characterization of BBA and EEP as liberal and conservative, respectively. It's a rare day when something I support is accused of being conservative.

But I don't know what these labels gain us--beyond the ability to know which side to be dismissive of. Personally I prefer Obama on the fence to some who have chosen sides. I don't really have a problem with the underlying theme of Bolder Broader. As Deb points out, many (but not all) of the countries that we might seek to emulate with regard to teaching and learning have a much more fully developed sense of social responsibility than we act on. Certainly all have more universal access to health care. Increased support for early childhood learning, improved child care opportunities, income guarantees for families with young children, tuition-free higher education--these are all things that can be expected to make a difference. And to the extent that they exist or have existed in the United States, they have made differences. Very likely not enough--we can virtually guarantee health care coverage for any child below a percentage above the poverty level, but complicate the application process and make the reimbursement process so complex that accessing coverage and then accessing care are both problematic. We provide free school lunches, but then make sure that everyone knows it's only the poor kids who get it. Sometimes we provide breakfast (though not nearly to the extent that we might if we chose to reach all eligible kids).

But, even if we did all of these things right--tore down every non-academic barrier that exists--would we still have an achievement gap between the top and the bottom that aligned easily with both race and income? Can we honestly say that the content within every building in MY neighborhood is equivalent to the content within the buildings in the surrounding wealthier suburbs, or even the popular magnet schools in the district? Is every kid greeted with an equal sense of entitlement to the fruits of an education? Does every kid meet equal understanding, challenge and expectation of good things?

My expectation suggests not. I recall a line from M*A*S*H. There are two rules in war. Rule number one: people die in wars. Rule number two: doctors don't get to change rule number one. So, as a very white, very middle class and highly educated parent of non-white kids, I have had to learn my own two rules. Rule number one: schools don't provide the same education to every kid. Rule number two: parents don't get to change rule number one.

That doesn't mean that doctors do their job any differently--or that parents do theirs any differently. But we develop a different sense of what needs to change. As long as there is a war--people die, despite the best efforts of doctors. As long as the structure of schools is based on inequity--more black and poor kids will fail, despite the best efforts of parents.

So, for my part, I have to forgive Barack his "turn off the TV" comments, and forgive myself for not getting up at 4:30 AM to tutor my children, and accept his acceptance of the points made by the Bolder Broader folks--in recognition that he is also willing to pay attention to the points made by the EEP folks.

M/M said: Rule number one: schools don't provide the same education to every kid. Rule number two: parents don't get to change rule number one.

I'm the father of a 12-year-old public school kid and the teacher of 150 talented urban kids. An urban school will never be on the same playing field as a suburban one. Using the generic term "schools" suggests that all schools are the same, that somehow certain schools get it and some are clueless. A suburban school will never have the same challenges as the urban. A mediocre teacher in an urban school will have an easy time in the suburbs, and probably be paid better.

One solution would be to nationalize the entire educational system. Have the same expectations for every school and move teachers and administrators wherever you need them. That way Joel Klein would be moved to Shreveport LA if the USDOE felt he could bring the schools up to acceptable levels there. Take all of the "great" teachers from Scarsdale and Grosse Pointe and have them prepare the kids in East New York for the Ivy League. Have all the money follow the student, and wherever the kid goes the money goes. Let's see how many cries of NIMBY there would be. Professional Ed people wouldn't like an administrator telling them where they can and can't work. What would it be like if EVERY educator had to sacrifice their career decisions for the good of the kids?

"The assumption here is that the tests we have are excellent; that they are vertically aligned from grade to grade; and that they can safely and reliably be used as the basis for making high-stakes decisions for teachers and students. Many testing experts would challenge each of these assumptions."

"Challenge" is putting it mildly. First, teachers and schools get saddled with the statistically impossible "Adequate yearly progress" mandate. Then teachers pay is tobe based on statistical manipulations on tests that are sensitive only to socioeconomic status differences not to instructional differences.

President Obama is getting very bad educational intelligence.

People can sign on to both the BBA and the EEP statements with no compunction because they are nothing more than pontificating rhetoric.

Meanwhile, directives are going out to state governments re the "educational "stimulus":
--Spend the money fast.
--Improve instruction (by going deeper into the failed educational policy of the Bush administration)
--Account for the expenditures categorically
--Don't establish any sustainable costs

What happened to "Change we can believe in"?


Maybe John Thompson was wrong. You and I could actually disagree on something.

I was pleasantly surprised by Obama's education agenda. The aspect that has me most excited is the $250 million for the data "warehouses." Finally teachers will be evaluated on some form of objective, quantifiable data instead of an administrator's (announced) walk-through.

It was almost comical to see certain teachers get their hair done and a new outfit when they knew the principal was coming in to evaluate them. They also spent extra time after school the day before the big visit making sure their dog and pony show was in order. All this was then tempered by the teacher's promise to take the class out for an extra recess if they were well behaved during the principal's special visit. The very next day the teacher shows up in sweats looking like she got hit by a truck on the way to school. And her “lessons” for that day consist of one worksheet after another. What a farce!

You stated, "Teachers who fail to get test score gains consistently will lose their jobs." It's probably my naiveté but I would have to believe most administrators would want to use these tests to improve teacher instruction. Most principals I know would attempt to council these marginal teachers into respectability. The only teachers that should lose their jobs under these circumstances are the ones too stubborn/lazy and do nothing to remediate their weaknesses.


I had intended to say something about the data warehouses before my thoughts took another direction. I note that not only will education data be appropriately stored and accessible (including links between students and their teachers), but the same thing is going on in health care. Health care providers who utilize electronic medical records will receive a higher Medicare reimbursement rate--with declining rates doled out to providers who remain paper-based after a number of years. Yet I haven't heard terrorized cries from doctors that this must mean that they will be paid based on their morbidity and mortality rates. Nor have I ever heard the medical community urge that we either avoid including test results in medical records until we are certain that they are all completely aligned and accurate (or that no decisions be made based on them).

But loosely throwing these kinds of statements and suggestions around in education leads to the kind of thinking that Dick exhibits. Exactly what, I wonder, is statistically impossible about AYP as it has been experienced to date? I have seen many, many inaccurate assumptions about what AYP means, so let me elaborate. The first AYP targets (with room for 50-state variation) were set quite low--about the 20th percentile of school passage rates. The difference between that number and 100% proficient (another number that many suggest is set quite low in at least some states) was divided over the years between inception and 2014, and increasing goals were set. Some states gave schools/districts a slow ramp-up period (allowing for the accumulation of improved outcomes as kids progressed through school). Others just set an equal increase each year; still others increased only every two or three years. That is the number to beat--for every school, for every district, for every subgroup. What is shocking and embarrassing is the number who couldn't get there from the beginning--as well as the volume of the outcry that this was just too hard for some of our kids. This doesn't even begin to account for things like the Safe Harbor provision (which allows a 10% decrease in the number of kids below proficient in place of actually meeting the goal), or averaging scores across years, or setting minimum numbers for subgroups so large that they would only appear at the district level (or not at all). I don't see which part of that is "statistically impossible."
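The linear ramp-up described above reduces to simple arithmetic. This is an illustrative sketch only: the 20% starting target, the 2002 start year, and the equal annual steps are assumptions for the example, not any particular state's actual schedule:

```python
# Hypothetical linear AYP ramp-up: equal annual steps from a low starting
# target to NCLB's 2014 goal of 100% proficient. Illustrative numbers only.
start_pct, end_pct = 20.0, 100.0       # assumed first target and 2014 goal
start_year, end_year = 2002, 2014      # assumed inception year and deadline

step = (end_pct - start_pct) / (end_year - start_year)
targets = {year: round(start_pct + step * (year - start_year), 1)
           for year in range(start_year, end_year + 1)}

print(targets[2002], targets[2008], targets[2014])  # 20.0 60.0 100.0
```

Under the Safe Harbor provision mentioned above, a school that misses its annual target can still pass by cutting its share of below-proficient students 10% from the prior year.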

As to teacher pay to be based upon "statistical manipulations on tests that are sensitive only to socioeconomic status differences not to instructional differences," well, I would just like to see the documentation on that one: the statistical manipulations, the tests, the proposal to base pay on it--and especially anything that shows that the tests (or the manipulations--not sure which were referred to) are ONLY sensitive to socio-economic status and not instructional differences (or, what I would be looking for which is learning outcomes).

What would happen if every teacher had to sacrifice their career decisions for the good of the children? Lots of things would happen. But one thing that would definitely happen is that, ceteris paribus, teaching would become a much less attractive profession (and people aren't exactly beating the doors down to become teachers as it is).

What is statistically impossible about AYP? In states that have set their standards and exams at low levels--nothing. But the proficiency thus achieved, by definition, is proficiency in name only. In states that have set reasonably high standards--lots. In what is admittedly a fairly informal (by scholarly standards) forum article in Science, Bryant et al. project that 99% of California elementary schools will end up in Program Improvement under the current version of NCLB.

I've been reading Bridging Differences for a while now, but this is my first post. One underlying issue that I don't see brought up as explicitly as I would like is some recognition of what risks are involved if/when particular reforms fail to achieve all that idealistic reformers hope. I certainly think that it is possible that "data warehouses" might improve public education. I've been around the block enough to think the same of nearly every reform proposed. My fear is not what happens if test score data works as intended; my fear is what happens if it doesn't. What happens when we come to trust data that we don't fully understand or that doesn't measure what we want? (See the recent NYTimes article on the inaccuracies of the bond ratings firms Standard and Poor's and Moody's to get the drift of the consequences.)

In California we adopted the Academic Performance Index, a single number to represent school quality. If one keeps the appropriate caveats in mind in judging this API number, it may not be a terrible idea. The problem is that it is nearly impossible to keep the appropriate caveats in mind. Once we create that one number, it becomes nearly impossible to pay attention to anything else.

Developing this theme of "what happens when things go wrong," one way to see the differences that Deb and Diane are trying to bridge is to see that Deb is worried about the ways that a national curriculum can go wrong, depriving communities of control of the education of their own children, and replacing those local judgments with those of a distant "expert." Diane sees the dangers of allowing locals to define education in ways that might lead to a romantic, anti-intellectual, let-kids-be-who-they-want-to-be race to the bottom. Deb also sees how the small school movement she helped champion can get perverted. They both see the dangers of a test-score-only way of looking at education. But they also have, I think, been willing to admit that the best versions of each other's ideals would result in quality schools.

Is what drives our view of needed education reform what we hope schools will become? Or are we driven by what we fear might be lost?

I read President Obama's speech and asked myself: why is it so uninteresting?

He has rhetorical flair. He has something to say. He speaks with conviction. What, then, is missing?

He seems to care mainly about global competitiveness, accountability, and equality, in that order. He does not emphasize other aspects of education: the subjects themselves, knowledge, understanding, insight, delight, humility, intellectual challenge, and culture.

Education is important because it gives you something you didn't have before, in the mind. He doesn't mention that. He says: “And I am calling on our nation’s Governors and state education chiefs to develop standards and assessments that don’t simply measure whether students can fill in a bubble on a test, but whether they possess 21st century skills like problem-solving and critical thinking, entrepreneurship and creativity.”

He should see some of these “assessments” of “21st century skills” (Intel’s, for instance)—awful checklists of student behaviors that the teacher monitors while circulating. In some ways the so-called “21st century skill” assessments are worse than multiple-choice tests, precisely because they purport to be superior.

President Obama does not consider that many school districts embrace methodologies hostile to subject matter—methodologies that deify process, that relegate the teacher to the role of “facilitator” or worse, that discourage teachers from actually teaching the subject, correcting the student, or using clear language. The “21st century skills” movement will only exacerbate this situation.

It comes as no surprise that he favors merit pay and falls for jargon. If he believes that the primary rewards of education are material, then material incentives will make sense to him. If he truly believes that “problem-solving and critical thinking, entrepreneurship and creativity” are skills of the 21st century and not of earlier times, then obfuscation does not bother him too much.

He says, “Let me be clear: if a teacher is given a chance but still does not improve, there is no excuse for that person to continue teaching.” But what are the philosophical premises and values behind such evaluations? We must examine these closely, lest a teacher be judged for qualities and activities extraneous to teaching. Obama says that our global competitors are "spending less time teaching things that don’t matter, and more time teaching things that do." But what does matter, according to Obama? Do literature and history matter? Or does he support the continued dilution of "social studies" and "literacy" with things like "information literacy" and "financial literacy"?

Yes, education is hard work. Yes, some of the rewards are material. Yes, the processes are important. But education also gives us something to carry in our minds, to toss around late at night, to keep us lucid and strong. We must not give that up.

Diana Senechal

"The assumption here is that the tests we have are excellent; that they are vertically aligned from grade to grade; and that they can safely and reliably be used as the basis for making high-stakes decisions for teachers and students. Many testing experts would challenge each of these assumptions."

Let's accept that the tests are quite good, even if not excellent. Let's accept that the tests are at least monotonic, if not perfectly vertically aligned, where a growth model is used. I think it will now be quite easy to find many testing experts who would NOT challenge them.

Now, can they safely and reliably be used as the basis for making high-stakes decisions for teachers and students?

I would argue yes. First, there are no high stakes for students in NCLB, with the exception of the high-school graduation exam, which can typically be taken many, many times. So, I say, it is safe and reliable enough for students. Second, for teachers, the reliability we care about is not the reliability of a single test; it is the aggregate reliability of multiple tests (language, math, plus possibly science and social science) for every student taught by the teacher. I think I will have no problem finding experts who will not challenge that -- it's bound to be very high after the aggregation.
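The intuition that reliability rises with aggregation corresponds to the standard Spearman-Brown prophecy formula. The sketch below assumes the aggregated measures are parallel (equally reliable, equally intercorrelated), an idealization that real subject tests and gain scores do not satisfy; the reliability values are illustrative, not measured:

```python
# Spearman-Brown prophecy formula: reliability of the average of k parallel
# measures, each with reliability r. The parallel-measures assumption is
# an idealization; real tests differ in reliability and correlation.
def spearman_brown(r: float, k: int) -> float:
    return k * r / (1 + (k - 1) * r)

# One measure with (assumed) reliability 0.5 vs. four aggregated measures:
print(round(spearman_brown(0.5, 1), 2))  # 0.5
print(round(spearman_brown(0.5, 4), 2))  # 0.8
```

So aggregation does raise reliability, but how much depends entirely on how well the parallel-measures assumption holds for the tests actually combined.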

We should keep this in perspective -- test results are not going to be the sole measure of a teacher's performance, but only a component. What's wrong with that? It strikes me as much better than relying 100% on a principal's evaluation.

The experts on merit pay are trying to tell us something, but no one is listening. Has anyone thought through some of the things being said and written?

In the Rockoff, Jacob, Kane, and Staiger paper on new NY City math teachers, they tried using a number of teacher variables to predict student performance. But they were very honest in describing the limited usefulness of standardized tests. "One problem with interpreting the relation between successful teaching and college entrance exam scores is that performance on standardized achievement tests is determined by a host of different factors: access to educational resources in childhood, parental investment in education, personal motivation and willingness to study hard, raw intelligence, etc." (page 6, "NBER Working Paper 14485", copyright 2008, Rockoff, Jacob, Kane, and Staiger). How many of those variables do they consider in value-added merit pay models when it comes to student tests? How can they be certain that some of those variables, outside the control of any teacher, are consistent enough from year-to-year to let a student serve as his or her own control? If teachers' own self-reported SAT scores can't predict their performance as teachers (see Table 3), can students' own NCLB test scores predict their performance in whatever they choose to do?

Charlie Barone's review of the Tennessee growth model sets a high standard for a good model. Consider what he says about using students' previous scores to create a prediction. He writes, "Tennessee shows that the correlation of the predicted scores with the actual scores is about .80 (R=.80). That means the percentage of variance in the actual scores is only about two-thirds (.8 squared = 64 percent); thus, one third of the variance in actual scores is not accounted for by the predictive model. While an R of .80 is quite respectable in the research world, it may not be adequate for making real world decisions." (page 7, Education Sector Reports, 2009).

What about the teachers' influence? We are told by merit pay proponents that the teacher is the single greatest factor in a student's success. If you have a predictive model in which 64% of the variance can be predicted using only the students' previous scores, using no teacher data at all, how can the teacher be the single biggest factor in a student's success? What part of the one-third leftover might be the teacher? Can anyone point to any study that answers that question? Can Barone point to a value-added model of teacher effects that captures more than 64% of the variance? Using his own standard for "adequate," can he point to a technical paper on a merit pay model that is, at least, adequate? If not, should we be using these measures for making decisions?
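The variance accounting in the two paragraphs above reduces to simple arithmetic on the quoted Tennessee figure (R = .80): whatever teachers explain must come out of the residual left over after prior scores alone:

```python
# Variance accounting for the quoted Tennessee growth-model figure.
R = 0.80                    # correlation of predicted with actual scores
explained = R ** 2          # variance captured by prior scores alone
residual = 1 - explained    # upper bound on teachers plus everything else

print(round(explained, 2), round(residual, 2))  # 0.64 0.36
```

That is the force of the question: if 64% of the variance is predictable with no teacher data at all, the teacher effect can be at most some unknown slice of the remaining 36%.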

In the other corner, we have Dan Willingham suggesting that teachers, for pragmatic reasons, might want to accept a somewhat unreliable merit pay model that misclassifies teachers -- to save the profession's image, appear more cooperative, and make up for past errors in the other direction. Dr. Willingham's strength as a researcher is in taking empirical research findings from cognitive psychology and applying the results to real world situations. What would a scientist be able to conclude about memory or learning if science had done as he suggests? What would have been the effect of using p less than .10 or p less than .20 for psychological research? Years and years of misguided research. Unreliable findings. A research field riddled with half-truths. But think of how many more professors could have earned tenure if research journals had accepted a more pragmatic threshold for reliability! What would be the cumulative effects of years of countless Type I errors on our understanding of teaching and learning?

Finally, we have Arne Duncan's merit pay strategy. Obama has pledged transparency. Duncan has pledged to use data to support and reward what works. Duncan says his model for merit pay success is based on his experience in Chicago, a program that changed Chicago public education forever. The Chicago work was funded by the American taxpayer via the Teacher Incentive Fund. Where are the data on the effects of the Chicago grant? Where is the report detailing the statistical model, including estimates of variance for all variables, teachers and students? Where are the correlations between test scores and peer evaluations? Where are the data that show that additional pay for anyone resulted in improved academic performance or did anything to close the achievement gap? Where did they draw the statistical line determining which teachers got how much money? Were rewarded teachers consistently excellent from year to year in the model?

What data convinced Arne Duncan? Given Duncan's reliance on data, those numbers must be readily available. Duncan is no longer in Chicago, but he is in charge of the agency that provided the grant money. Chicago was supposed to be collecting and using data and reporting to the federal government, so he can release whatever he wants right now. Shouldn't we get to see the data he found so convincing before we invest another couple hundred million dollars?

Arne Duncan has promised us data-driven decision-making and transparency. Is it too early to start asking for that?

"Over the long term, what happens to the public school system when the most motivated students enroll in charter schools?"

Isn't there a Boston study out there that compared two groups: one group that signed up but didn't make the charter lottery against one group that did make it into the charter via the lottery? The study showed that the charter kids outperformed the non-charter kids. Might that not invalidate your assumption?

On a different note, if states are slow to close charters - how does that compare to states that simply don't close horribly performing public schools? Which is worse?

I propose a little thought experiment. What if all public schools were charter schools? How would/could that work? What would be the regulatory environment that would make this possible?

If you can't imagine this--why not? What is it that has to be different for charters in order for them to work? And why?

I've always thought that some degree of choice, more than we have now, for both educators and parents (and students), would be a good thing. But the playing field needs to be level. If some schools face relatively little regulation, and can pick and choose their students, and others face enormous regulation and must take every student who shows up at their door--that's not a level playing field. And if you add differential funding to the mix--now it's seriously inequitable. But if someone can come up with a model that, for a given district, puts every school that's taking tax dollars in the same boat with regard to funding and regulation and requirements, and still includes room for different models and choices--I'd be all for it.

Big Shoulders,
Your post is absolutely brilliant. I have been writing a chapter about this issue and reading the literature and studies of value-added modeling, and it is clear that it is not ready for prime time. The claim that having a "great" teacher for three or four or five years in a row will close the achievement gap has never been demonstrated in real life. It is purely projection. And other economists have challenged it because it turns out that one-year gains fade; they do not accumulate without erosion. Lots of economists have punched holes in the validity of VAM, but the politicians are plunging ahead.

You ask "what would happen if all schools were charter schools?" Good question. We can only guess, but my guess is that we would see even more racial and social segregation than we currently have; in fact, this is what studies now show is the case. Charter schools are more segregated than their local district. Next, and more ominously, we would see an end to public education, since charters would be publicly-funded private schools, each with its own board or for-profit or nonprofit agency in charge.
I don't know of any country that has knowingly and willingly eliminated its public education system and substituted a privately managed school system for it. Do you?

The Boston study that you refer to had a little problem, and Deborah can comment on this, I hope. The study compared all the Pilot schools with the most successful charter schools, those that had a lottery. In other words, the failing charter schools were not part of the study! So, I am not sure what conclusion to draw from a study that looks only at the successful charters.

I don't think you understand the way that value-added modeling works. No one looks at the aggregate of all scores, as you suggest (reading, math, social science, science); VAA considers only the results of standardized tests for which there is longitudinal data, i.e., math and reading. There are lots of technical problems. I suggest that you read Dale Ballou's article ("Sizing Up Test Scores") in Education Next in Summer 2002, where he explains why VAA/VAM is not ready for making high-stakes decisions about teachers. One reason is that the tests themselves contain measurement error and random error. When you use those scores to make judgments about teachers, the statistical "noise" is even greater. He gives a very persuasive explanation, and I urge you to read his article. Ballou has co-authored studies with William Sanders, so he is not anti-VAA. He is just worried about making unfair judgments based on inadequate and possibly flawed information.

dickey45, I'm not sure it's fair to say that states are slow on closing charters. We have very little time and data on these; we're still working out many of the surrounding authorizer/evaluator issues as well.

On the topic of closing schools, I live in a district where 70% of the schools are planned for closing or have already closed.

Were we in New York or DC, these schools would all be brought into the national debate over education. As it is, no one outside the county cares a whit.

Why? Because it makes sense, financially and otherwise to close these old ladies and move onward. Is it painful to have to bus your kids 10-20 miles? Yes. If you want your kids to have music and art and speech services and team teaching and computers and air conditioning, do you need to accept that reality? Probably.

Creative destruction is how we grow. They do less of this in France; their economy and the street riots show it.

I disagree with you. I don't think we should be wantonly closing public schools that are often the only stable institution in their community. They are also training grounds for democracy. Every time a superintendent closes a school, it should be a demerit on the superintendent's report card. It just proves that he or she wasn't able to help the school improve.
When we destroy schools, we destroy communities. I think our society is rootless enough without contributing to even more alienation and loss of connection to our neighbors.
Creative destruction sounds good in theory, but how do you like our economy right now? Lots and lots of creative destruction, as icons and brand names fall into bankruptcy. Feels good? I don't think so.

Diane, Drat, I was just heading out to inspect the trails in one of my parks. It's 70 out, you know!

OK, I am about as hard on destruction as anyone; I spend much time trying to save "worthless" structures. My point, though, is that we need to be more specific. A closed school may be a loss to a community, or it may be a seed for better community.

Segueing... I like our economy right now just fine. Economies always develop new problems; in this case we had four:
1) the dishonesty of borrowers and local mortgage originators who outright lied about repayment ability;
2) the failure of banks to exercise due diligence in checking on the above;
3) the dishonesty of the managers at Moody's and the other ratings agencies, who deceived themselves into believing they could number-crunch a bad risk into a good risk;
4) the political greed of the House and Senate banking committees, who refused to listen to all who brought forward, in 2003-2008, these issues and possible solutions.

I don't, frankly, know how to project any lessons from those four failures onto the schools, except to say that our colleges need to look harder at honor codes (à la U.S.M.A.).

In the long run, I agree about public schools and communities. In fact, I would argue that we need to strengthen these bonds. However, an elementary school may not make a community. I think a high school has more impact there.

I've elsewhere told Deb I believe a high school should:

be able to field a varsity, reserve, and freshman football team (if 9th graders are there). There should not be so many students that many who want to play get left off. They should be large enough for a decent speech and debate organization, drama activities, a band with all the instruments filled, swimming activities, dabbling with robotics and Arabic or Mandarin or Farsi.

I don't think they should be much larger. These 5000 student high schools seem absurd to me.

How many high schools in a district? Around here it is one. One school board, one high school, 1-8 elementary buildings. It works. As a general rule, though, how many principals should a superintendent oversee? A large handful?
Testing, charters, and vouchers, then, are a way to break up the abominably big districts and schools (and staffs and central offices) which fail kids so badly in the big cities.

I don't know how else to do it under the edu-political establishment.

I like Obama as much as anyone else -- except for where his head is at re: public education.

Obama spent his earliest years at private religious schools in Indonesia. He spent his 5th to 12th grade years at a prestigious college prep school, then he was off to Occidental, Columbia, and Harvard.

Michelle Obama spent her earliest years in classes for gifted children in the Chicago Public Schools, then she was on to Whitney Young High School, an academic magnet school with highly selective enrollment. After that, she was on to Princeton and Harvard.

Even before Obama was president, this couple sent their kids to private schools. In Chicago, they used the University of Chicago Lab School which is currently $20,286/year for kids in grades 5 to 8. By the way, tuition for their daughters’ current school, the Sidwell Friends Lower School, is $28,442.

People who live their lives like this do not have the slightest idea about what is REALLY going on in today’s struggling urban public schools. Yet they think they “know” and also believe they "know" what needs to be done to “solve” the problems. It's very unfortunate how misdirected they can be.

Here’s my best metaphor for now; bear with me, or not.

This makes me think of a driver who is cruising along on a frontage road next to a completely jammed freeway who believes he knows exactly what the drivers in that traffic jam are experiencing. Not only that, when he goes past the accident that's causing the mess, he thinks all that is needed is to move those cars out of the way, and then everything will be fixed. He doesn’t see the finer issues, such as that the highway patrol, ambulances and tow trucks will all have to come from behind that big backup. In other words, he’s nearly completely out of touch.

I’d love to see a summit where all of us who KNOW the inner-depths of the urban school experience get to clue him in as to what it is all about.

An Oakland public school parent since 1993, with plenty of battle scars to prove it.

Here's something to read: http://www.outsidethebeltway.com/archives/michelle_obama_and_public_schools/


I may have overdone the terseness of my previous post. You are quite right that each student's value added is determined individually in each subject. But before it can be used for merit pay, those individual student gains need to be aggregated for each teacher. After all, the teacher will not be rewarded for the improvement of a single student, but of a whole class (or multiple classes). Further, the purpose generally is not to place a teacher on some absolute scale, but rather to compare him or her to other teachers in the same school.

Ballou, for example, mentions two sources of measurement uncertainty: (a) errors due to external factors beyond school control (family, type of subject, tutoring, etc.) and (b) the inherent imprecision (non-linearity) of vertical scales. Both of them mostly wash out if the gain is based on multiple students in one or more classes, and is used to compare each teacher against the rest rather than to give him or her some absolute grade. After all, we tend to have students with a broad range of demographics and achievement in each class.

Obviously teacher scoring will also need to be adjusted for subjects (how do you measure progress in music?) and for teachers without a sufficient longitudinal record. This, however, is no different from what we already do in any workplace, where different positions may have different criteria.

The point I am trying to make is that while value-added is not perfect, it is much better than the system we have now. VAM will add a relatively objective component to employee evaluation, supplementing currently completely subjective evaluations.

VAM opponents tend to show its imperfections and compare it with some perfect abstraction. The current reward system is anything but perfect.

Value-added experts have said exactly what NCLB tests are measuring, or not measuring.

“The fixed student contribution is often called ‘innate ability’ by economists and is akin to what psychologists consider general intelligence, or g. The more general term, ‘fixed student contribution,’ is used here because it is virtually impossible with education data sets to estimate anything like innate ability. No data sets include measures of student abilities at birth or, in their absence, sufficiently measure family and other environmental factors well enough to distinguish innate from environmental differences.” (Footnote 11 from Douglas Harris's WCER paper, 2008, which references Harris's own forthcoming chapter in the International Encyclopedia of Education)

So, no single measure or set of measures can separate genetics from environment, yet value-added experts treat three or more NCLB test scores together as representing "innate ability" "akin" to IQ? The only reason they don't call it "IQ" is that it is not a perfect IQ measure--because IQ can't be measured--except, apparently, with three test scores that, taken together, they often call "innate ability." Clear enough.

Does that mean that one NCLB test score probably represents achievement, two test scores together might represent achievement, but three test scores together represent something akin to IQ? What would it be about that third test that suddenly changes the nature of the previous tests? Economists can't believe that every NCLB test measures IQ because a classroom teacher can't have control over a student's IQ -- it's innate. In value-added systems that compare a prediction with an observed score, why is the observed score considered a measure of achievement due to the teacher and not thought of as the fourth IQ score due to the student? What makes that last test different? When does a test measure IQ and when does it measure achievement? Harris provides an honest summary of expert opinion: No one really knows. And that creates a lot of uncertainty when you try to make sense of value-added measures.

In value-added systems that compare the IQ-akin predicted score with an observed test score, linked to the teacher, how do you know that the observed test is measuring the teacher's skills, the student's IQ, or some other environmental factor that affected the student's performance on that test? You don't. How do you know that the only differences in predicted and observed scores, your teacher effects, weren't just due to measurement error in the observed score? You don't. When you get a wide spread of scores for teachers and it looks like a bell curve, how do you know that the scores aren't just normally-distributed random error? You don't. Do you see proof that the best teachers remain the best teachers or the worst, the worst, year after year? You don't. When the President of the United States of America says that, based on your research, we know that the teacher is the single most important factor in a student's success and we will be providing hundreds of millions of dollars to help you do your work, what do you say? You don't. Who would?
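To make the bell-curve point concrete, here is a toy simulation. Every number in it is hypothetical, chosen only for illustration: give every teacher an identical true effect of zero, add nothing but per-student measurement noise, and the class-average "teacher effects" still spread out into a tidy bell curve.

```python
import random
import statistics

random.seed(42)

N_TEACHERS = 500
CLASS_SIZE = 25
NOISE_SD = 0.30  # hypothetical per-student score noise, in grade-level units

# Every teacher has the SAME true effect (zero); any spread we see is pure noise.
measured_effects = [
    statistics.mean(random.gauss(0.0, NOISE_SD) for _ in range(CLASS_SIZE))
    for _ in range(N_TEACHERS)
]

spread = statistics.stdev(measured_effects)
print(f"SD of measured 'teacher effects' from noise alone: {spread:.3f}")
# The theoretical SD is NOISE_SD / sqrt(CLASS_SIZE) = 0.06: a bell curve
# of apparent winners and losers with no real differences underneath.
```

A ranking built from a distribution like this would hand out bonuses and pink slips, and it would be measuring nothing but luck.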

We should listen to the experts, even if it means having to read the footnotes. The experts are trying to tell us something. That's why the footnotes are there.


I am suggesting caution in using student test scores to make high-stakes decisions about their teachers. There are many reasons to be cautious. One, the tests are not perfect. Two, if a test is given mid-year, it is impossible to know which teacher gets credit or blame. Three, the measures in value-added models are not stable. Read Dan Goldhaber and Michael Hansen, "Assessing the Potential of Using Value-Added Estimates of Teacher Job Performance for Making Tenure Decisions." They show that teachers who were ranked in the bottom quintile may a couple of years later rank in the top quintile. A teacher may be "great" one year, not so great another year. Not because she is having a bad year, but because of other factors (class dynamics, one troublemaker, whatever).
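The instability is easy to reproduce in a toy model; all the numbers below are hypothetical, picked only to illustrate the mechanism. Give each teacher a modest persistent true effect, add classroom-level noise of comparable size each year, and count how many bottom-quintile teachers escape the bottom quintile the following year.

```python
import random

random.seed(1)

N = 500
TRUE_SD = 0.05   # hypothetical spread of persistent teacher effects
NOISE_SD = 0.06  # hypothetical class-average noise per year, comparable in size

true_effects = [random.gauss(0, TRUE_SD) for _ in range(N)]
year1 = [t + random.gauss(0, NOISE_SD) for t in true_effects]
year2 = [t + random.gauss(0, NOISE_SD) for t in true_effects]

def quintile(scores, i):
    """0 = bottom fifth, 4 = top fifth of this year's ranking."""
    rank = sorted(scores).index(scores[i])
    return rank * 5 // len(scores)

bottom_y1 = [i for i in range(N) if quintile(year1, i) == 0]
moved = sum(quintile(year2, i) != 0 for i in bottom_y1)
print(f"{moved} of {len(bottom_y1)} bottom-quintile teachers left the bottom quintile in year 2")
```

In runs like this, a large share of the "worst" teachers look fine a year later even though nothing about them changed, which is roughly the pattern the tenure-decision study reports.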
I suggest you read more.


I appreciate your politeness, but there is really not much need for it. The issue at hand won't be resolved by more reading--I have heard and read many of the researchers on this topic. The issue--at least at this stage of knowledge--seems to be that despite all its weaknesses and issues, VAM still seems more powerful than the alternatives. I draw on my long experience in the world outside education, where imperfect performance-based criteria have been used--mostly effectively, even if sometimes quite badly--for decades. The outside world also has to evaluate new entrants with short performance records, deal with a spectrum of jobs where productivity is harder to evaluate for some than for others, and support mentoring and teamwork in the workplace. Yet merit pay is still used there, mostly successfully. I remain unconvinced that teamwork and motivation are in some way uniquely fragile or different in education. Clearly you see it differently, and we will have to disagree. Until better research and better understanding of VAM, perhaps? :-)


The more vague the standards and curriculum, the more elusive the material on the test. The more elusive the test material, the more emphasis on test prep. The more emphasis on test prep, the less room for excellent instruction in the subject and other subjects. The less room for excellent instruction, the less accurate a value-added model can be.

A value-added model today would reward test prep, not good teaching. Given our current tests and standards (in many states), the two are far flung. I am skeptical about value-added models in the best of circumstances. Under current conditions they are not only problematic but absurd.

Diana Senechal

Big Shoulders:

As a big footnote reader, I would have appreciated a few on your post. Who are the "experts" who suggest that 3 NCLB tests (whatever that may mean) together add up to "innate ability akin to IQ." I haven't run into any "experts" who suggest that academic achievement on a criterion reference measure equates to "innate ability" no matter how many of these measures are added together.

The interesting thing about value-added measures is that when they were still largely theoretical, many teachers and teacher-defenders (or school defenders) seemed to be calling for them, claiming the weakness of a "single measure," in not being able to account for "growth" that a student experienced while in a particular building or classroom--the assumption always being something like, they were already broke when we got them, we just did the best that we could, but nobody can tell. There were wild anecdotal claims about how "the tests" were unable to account for a fifth grader who came in at the first grade level, advanced all the way to third (but still showed up as a failure on the fifth grade test).

Now, I am not big on throwing caution to the wind, or making decisions "wantonly" (whether it is closing buildings or hiring, firing, or promoting teachers or other employees). But the reality is that high-stakes decisions are made every day in every way in and outside of schools. Students are determined to be eligible (or not) for special education services, based on data from tests--including tests that actually purport to measure IQ. Students are accorded grades, which determine their passage, or not, into the next grade, and ultimately their eligibility for college (combined with, of all things, test scores). Schools receive financial support based on a panoply of data: property values, numbers of students, income level of student families, attendance and enrollment data, local or state demographics, geo-political divisions. Birth dates, of all things, have long been considered the primary data source for school readiness (or entitlement, depending on how one looks at things). Students are suspended, transferred, or expelled (or searched or questioned) based on circumstantial evidence, hearsay, all manner of things that we would not allow to determine the fate of an adult.

Think about some of the other data that feed high-stakes decisions that immediately affect the lives of students: food stamp eligibility (not only actual proxies of poverty, but a family's skill in providing documentation), access to health care--provided to some based only on the report of a birth date to an employer, to others provided (or denied) based on substantiation of the date on which an illness or condition presented, proof of income, or disability status.

There have been court cases based on the kinds of data used to rule-out potential employees from certain jobs (the relevance of height to firefighting, for instance) and whether they served as legitimate screens for the needs of the job, or were used as proxies for such things as gender.

It's not about the quality of the data. We are intensely comfortable with making decisions on limited data all the time. It's about the particular decisions that might be made, and who they will affect. To leap from developing data warehouses (with some quality) to an assumption that teachers will be fired based on the "single measure" of student achievement on a test is really quite an extension.

To take it further, to the conclusion that we ought not make any high stakes decisions until the data is perfect, is absurd.


After reading the Ballou, then the Goldhaber and Hansen pieces I see your point about value-added assessment for evaluating teachers.

However, while I believe VAA is still in its developmental stages, I also believe it can be tweaked in a relatively manageable timeframe to address its imperfections. We can bring it to a point where its validity will prove it to be vastly superior to the existing subjective evaluations being used in our schools today.

My post from above does not reflect an aberration in teacher behavior. I saw it happen time and time again. And this occurred in a reputable Massachusetts school district. It's been a farce for too long and will continue to be so until some quantifiable model replaces it.

Even with its imperfections, I'd still have to side with Ze'ev’s opinion from above. I’d favor VAA over our existing system. Our current system is much too subjective. I’d go so far as to say it’s borderline embarrassing. Anyone who knows anything about the operation of our schools also realizes this is one of the primary reasons it is so difficult to dismiss an incompetent teacher. There’s simply not enough quantifiable data to substantiate a principal’s written observations in court.

Let me reiterate my posture from above. I do not endorse VAA for dismissal purposes but strictly for improving instruction (unless the teacher in question refuses to alter their ways).

The folks who demanded education reform since '83 are also well aware of the multitude of problems with the present evaluation system and are the ones most demanding a change. They realize this is one of the remaining untouched avenues of education reform which needs to be confronted and significantly altered.

This is yet another example as to why the educational establishment has not been invited into the education reform dialogue. It's obvious what their views and opinions would be. Status quo, status quo, status quo. What’s wrong with the status quo? Plenty! The status quo in this instance is once again unacceptable practice.

Some things are worse than the status quo. This is an important lesson from history. Be careful what you wish for.


One last quick comment: Paul, you say "I do not endorse VAA for dismissal purposes but strictly for improving instruction (unless the teacher in question refuses to alter their ways)."

You have such faith in power relationships and dynamics in education. What are the odds that, once the info is there, it will NOT be used to make dismissal decisions? Close to zero. Think of the outcry if it were not.

Teachers, at least here in California, have so little discretion in how they respond to test data now that it is virtually useless to them in terms of helping them improve their teaching. If it could be used, by teachers, along with other sources of information, to help them improve their teaching--if there would be an appropriate degree of slack in the system, so that the system took into account the inherent uncertainty regarding causality in social systems in general and student learning in particular--value added assessment might be useful. But that's a lot of if's. I know that I as a teacher would be interested in seeing the results--and that they would be a rather crude indicator, not terribly useful in terms of helping me to know what I had done well and what I had not done well, what had worked and what had not, with my students that year. They might tell me THAT I need to improve, but not WHAT I need to improve. Or they might tell me I'm doing OK. Or they might just confuse me--every group of students is different, and what works with one group can totally bomb with another, so there will be times when a teacher thinks "Gee, I did the same things this year that I did last year, why didn't they work? And what do I do next year, with a whole new set of students?"

But regardless, if the numbers are there--the tests essentially produce numbers--managers will use them to make decisions, whether they are worth anything or not. (That's happening with worthless data where I teach now.) And principals can, and some will, manipulate things to give teachers they favor good odds of producing high numbers and others low odds. Who gets the low-scoring students? Who gets the behaviorally challenging students? Guess who gets to decide? Right now, with data aggregated at the school and district level, there's a big debate over the fact that charters can pick and choose students while traditional public schools cannot. VAA aggregated at the teacher level pushes these issues down to the teacher level. (This is already an issue within most schools--but using the VAA data for either dismissal or "merit" pay will seriously aggravate the dynamics around who gets which students.)

BTW, does anyone know when the conventional "expert" wisdom switched from "teaching makes no significant difference--all the variance is due to SES" to "teaching is the largest single factor" in explaining differential student learning? Back in the '70s and '80s, people kept telling me that no matter what I did as a teacher, or what the schools did, it pretty much made no difference; now it's all on teachers. What changed? I mean, if once we made no difference, and now we make a huge difference--that would be progress, surely. But somehow I doubt that teaching has changed all that much. So what gives?

"BTW, does anyone know when the conventional "expert" wisdom switched from "teaching makes no significant difference--all the variance is due to SES" to "teaching is the largest single factor" in explaining differential student learning? Back in the '70s and '80s, people kept telling me that no matter what I did as a teacher, or what the schools did, it pretty much made no difference; now it's all on teachers. What changed?"


If that's a serious question, here's a serious answer. http://www.coveringeducation.org/docs/keyDocs/EdWeek%202006%20Article%20on%20the%20Coleman%20Report%20and%20its%20legacy.doc

The "schools don't matter," factoid is generally attributed to the Coleman report--which didn't quite say that. What Coleman did was to use testing data as an indicator of quality (in addition to, or in contrast to, various input data). That was pretty revolutionary, and the results surprising. Even with something approaching "equal" inputs, there were racial and SES gaps in achievement--that increased over time. It did note that there were variations within as well as between schools--but I don't know that teacher quality was extensively looked at, although there was a correlation between teacher verbal ability and student achievement.

Perhaps a more accurate statement of conclusion might be (rather than schools don't make a difference) schools are not making a difference, or schools are not making enough difference. I don't know that anyone associated with the report or follow-up studies ever intended the result to be complacent acceptance of achievement gaps as inevitable.

I have read some more recent research (recall that the use of student outcomes was pretty cutting edge) that has looked at "teacher effect," and guess what, there is one! There are also various studies on many within-school variables that may in isolation or in combination have an impact on students. My pragmatic point of view is that it makes more sense for schools to concentrate on improving the things (teachers, curriculum, pedagogy, climate) that lie within their sphere of influence, than to put a lot of hand-wringing into those things unlikely to be changed in the short-term by their efforts (race, family SES).

Jean, the conventional wisdom changed because of William Sanders' value-added-assessment methodology. Sanders concluded that teachers make a huge difference, that a string of good teachers produces enormous gains, while a string of ineffective teachers dooms a kid for life. Other economists soon reached the same conclusion, and now the race is on to use VAA methods to find the good teachers and give them bonuses, and to find the "bad" teachers and fire them.


Your mention of Sanders' value-added conclusion about the effect of good teachers reminded me of something else.

IES just published Mathematica's study about the effectiveness of 4 elementary math programs (it will be discussed next Tuesday at the SEE forum). This seems to be an exceedingly well done curriculum study (the first in not-so-recent memory), and in a rather striking difference from other curriculum studies it finds large differences in effects among the programs. Those differences, if shown to be sustained and cumulative, are on the same order of magnitude as those Sanders found for good teachers.

We should still be careful as the report describes only the first year of the results, but I keep my fingers crossed. If this happens, we will find ourselves on the same side (again :-), where the focus is on strong curriculum rather than on labor practices.

It's difficult to sort out technical matters in a forum like this but Diane, Big Shoulders, Margo and others are doing a good job.

"Value Added Models" is a new slogan for very old statistical techniques. The assumptions and pitfalls involved are (or were) well-known.

The current applications fail the transparency test, which should be reason enough to be sceptical. Several of the "models" (which are nothing but statistical formulas) are proprietary, and few have been documented in sufficient detail to permit examination.

Think AIG.

There are empirical failures, however, that have been honestly reported.

One occurred in the Chicago Public Schools, just before Sec. Duncan's time there, in a study conducted by Tony Bryk and colleagues. Bryk is the author of the definitive textbook on the stats involved.

The study concludes:

"Embedded here are important lessons for school systems that seek to become more outcome-oriented. The “load bearing wall” in such reforms is the assessment and reporting system.

"Unless testing and indicator reporting systems are much more carefully crafted than is currently the case, high stakes accountability may misinform rather than inform and may even distort school practice rather than improve it.

"Unfortunately, extant standardized testing systems like the ITBS do not afford an accurate basis for assessing school productivity and how this might be changing over time. Entirely new testing and reporting systems are needed."

That statement is as timely today as it was in 1998.

It makes chasing other metaphors like "data warehouses" a pretty silly endeavor.

Richard Schutz,
Thanks for the good post. Chasing "data warehouses" is more than a metaphor. The stimulus bill contains $250 million for states to create such warehouses; nearly a dozen states already have them and are trying to use them for precisely the purposes you describe. We will not improve American education with this approach, though we may manage to make it worse.


I hope you are right about the curriculum studies. Of course, we know that teachers can soar with great curriculum or mess it up.



Bryk's research is now a decade old. This doesn't mean it should be trashed, but it is important to look at what has happened in the intervening years. Bryk et al. used the ITBS, a nationally normed test, for their study--which necessitated some gymnastics to arrive at content-referenced scores that would be comparable from year to year. Most states have now developed, and are using, content/criterion-referenced (standards-based) testing systems, rather than systems like the ITBS.

Student mobility is somewhat of an issue--however, those students for whom this is an issue are easily (and in my state are) removed from Value Added computations. He also discusses the insensitivity of pass/fail type accountability systems (above/below proficiency, or in the Chicago system at that time above/below the national norm) to movement in groups that remain above, or below, the line. This has also been responded to in most states by some kind of a performance index which averages student scores by levels in order to enhance sensitivity (and ensure that a broader range of students are of concern to the decision-making adults).

"Most states have now developed, and are using content/criterion referenced (standards-based) testing systems, rather than systems like the ITBS."

All of the state tests (that I'm aware of) are derived using Item Response Theory. However labeled, the measures are inherently insensitive to instructional differences.

"This has also been responded to in most states by some kind of a performance index which averages student scores by levels in order to enhance sensitivity."

Only in the dreams of those who are running the derivatives through the computer. Is this "enhanced sensitivity" transparent, or is it stored in "data warehouses"?

The assumptions that these statistical operations don't come close to meeting haven't changed over the years. What has changed is that the assumptions have been more widely ignored and new rhetoric has been invented.


"All of the state tests (that I'm aware of) are derived using Item Response Theory. However labeled, the measures are inherently insensitive to instructional differences."

Agreeing with the first part, I strongly disagree with the "inherently insensitive" part. I read your SSRN paper and Popham's argument in PDK on this issue, and I observe:

- There is nothing inherent about a test made of IRT items that forces it to be insensitive to instruction. Popham certainly doesn't claim it. Instead, he suggests--without any supporting evidence--that this MAY BE the case across the nation. He then goes on to suggest how we can make sure that those criterion-referenced and IRT-based tests are indeed instruction sensitive. Clearly, had they been "inherently" insensitive, they would be beyond redemption. It also happens that California has been effectively following his suggested procedures for almost a decade. I can't speak for other states, but I suspect that California is not completely alone.

- In your SSRN paper you seem to imply that because IRT relies on "invariant latent trait", it makes it inherently IQ-like and SES-sensitive. I think you were insensitive to the precise technical meaning of that term in the context of IRT.

- Finally, if you (and Popham) were correct, then we would not have poor schools that beat the odds, and SES would be king. Yet we have KIPP schools, and quite a few others, that do beat the SES odds. And their success shows strongly on the same, supposedly "instruction insensitive," tests. How can that be?

I am with Margo/Mom here. Bryk's research is old and inapplicable after 7 years of NCLB.

"Only in the dreams of those who are running the derivatives through the computer. Is this "enhanced sensitivity" transparent or is it stored in "data warehouses."?"

It is a very simple and transparent formula--explained on the state report card. Kids who take no test are given a score of zero. Beyond that, there are five levels (proficient plus two above and two below). Proficient scores a "1," the highest a "1.2," and the lowest a ".3." The resulting 100-point scale treats 100 as acceptable. My own local district chose to focus on improvement based on this scale, rather than the absolute percentage of students proficient, in order to move up the state's rating system. The advantage is that there is a bigger bang for the buck in focusing on movement for all kids, rather than just trying to get the "bubble kids" over the proficiency mark. Bryk has several nifty graphs that demonstrate this kind of movement and why simply looking at numbers proficient doesn't capture growth of this nature.

I know that California also uses such a system--I haven't fully surveyed all states, but it is not complicated--requiring only that there be several, rather than one, level of cut scores.
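The index Margo/mom describes is concrete enough to sketch in a few lines of Python. This is a hypothetical illustration, not any state's actual formula: only the endpoint weights (no test = 0, lowest level = .3, proficient = 1, highest = 1.2) appear in the comment above, so the level names and the two intermediate weights here are placeholders.

```python
# Illustrative weights for the five performance levels plus "no test".
# Only the endpoints (0, 0.3, 1.0, 1.2) come from the comment; the
# intermediate values and level names are guesses for the sketch.
WEIGHTS = {
    "no_test": 0.0,
    "below_basic": 0.3,
    "basic": 0.6,
    "proficient": 1.0,
    "advanced": 1.1,
    "accelerated": 1.2,
}

def performance_index(levels):
    """Average the students' level weights and scale to a 100-point index."""
    if not levels:
        raise ValueError("no students")
    return 100 * sum(WEIGHTS[lv] for lv in levels) / len(levels)

# A school with every tested student exactly proficient scores 100, and
# movement anywhere in the distribution moves the index -- unlike a
# plain percent-proficient figure, which only sees the proficiency bar.
print(performance_index(["proficient"] * 10))
```

The point of the design is visible in the last comment: a student moving from the lowest level to the next one up raises the index even though neither score crosses the proficiency cut.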

When people talk about "sensitivity" and "transparency" they aren't referring to formulas that operate on scale scores on ungrounded tests.

If that were the situation, the financial derivatives that led to the economic meltdown would be "sensitive" and "transparent."

The educational derivatives share the same characteristics. The toxicity just hasn't been noticed yet.

I don't want to get "off thread" in talking about technical matters, but I'd be glad to dialog privately with anyone who cares to.

Click on my name at SSRN site for my email.

The matters are extremely important, but very few people have any interest in them.

"When people talk about "sensitivity" and "transparency" they aren't referring to formulas that operate on scale scores on ungrounded tests."


I may be just a mom from the Midwest, and not a statistician, psychometrician, or even in the classroom, but I recognize when I am being patted on the head and told not to worry my pretty little self over things that I don't understand.

In my understanding, sensitivity means that something responds when something else changes (as in, the indicator responds to improvement, not just whether a student is above or below the bar). Transparent means that rather than the proprietary (AND difficult to understand) Sanders formula, the formula is publicly available, not hidden in some black box (or, as you suggest, "data warehouse")--and in this case, transparency is enhanced by the formula being easy for a layman to understand (and replicate if s/he so chooses). It is based on (derived from) tests that do have considerable peer review and psychometric "stuff" behind them (also publicly available in tech reports found on any state dept of ed website). The results "speak" to the people (and here I would include stakeholders such as myself, as well as teachers and a multitude of public school employees) who are concerned on all sides of the accountability equation.

In my experience, all of the technical "stuff" only comes up when teachers feel that life as they have always known it is threatened. And it comes up only in terms of pointing out why the tests/scores/ratings/rankings are not valid/reliable/suited to the use to which they are being put. When improved measures respond to each weakness, the chorus only re-forms.

I have some fears about what we are teaching our children from this. Not only have teachers relied heavily on highly imperfect measures for centuries in evaluating students (and appear to be very eager to return to such a happy state of affairs), but many of the self-serving cries come from folks who understand even less than I do all of the statistical implications of what they are talking about.

Hey, Margo. You must have quit reading my last post after that first sentence. I agree with a lot of your sentiments and invited further dialog.

However...I question that "transparency is enhanced by the formula being easy for a layman to understand (and replicate if s/he so chooses)." Have you actually tried explaining the formula to a layman, or anyone else, and found that they found it easy to understand and felt that they could apply it if they so chose?

The standardized achievement tests do "have considerable peer review and psychometric "stuff" behind them (also publicly available in tech reports found on any state dept of ed website)."
But this doesn't change the fact that they are ungrounded statistical scales, or that "proficiency" on the measures means nothing more than being above an arbitrarily-set cut score.

My contention is that neither the tests nor the "value added models" are fit for use. I've tried to explain this contention and to suggest alternatives, but this is not the place to go into that.

All I've tried to do here, is to add to Diane's contention that President Obama's agenda has "issues," as teachers say about kids they are having difficulty teaching.


I read with interest your paper "Why Standardized Achievement Tests Are Sensitive to Socioeconomic Status Rather Than Instruction and What To Do About It." I agree with some of your points but am not convinced that the standardized tests measure innate ability over instruction.

I have scored ELA and math tests. As you note, the reading passages are contrived. This, in my view, makes them rather difficult to read, even if the concepts and vocabulary are simple. A student with high ability (or an advanced reader) may struggle to make more sense of them than is there.

Sometimes the passages have errors. One passage had two characters with the same first name and different last names. The passage did not clarify who the second one was supposed to be. I have a feeling that they changed the last name (to meet cultural representation guidelines) but forgot to do it in both instances.

The constructed response items do not favor the thoughtful student. They favor the student who follows directions precisely and literally. We had responses full of logical and grammatical errors, and the rubric let them pass. Many sentences contained strings of "because" clauses. Sometimes students would use conjunctions like "although" without presenting a contradiction or contrast. Sometimes they would bring in examples from the text that contradicted the point they were trying to make. All this passed because they were fulfilling the charge of the task.

I saw a few highly thoughtful responses that received lower scores because they did not follow the directions completely. Some even pointed out the problems with the reading passages.

Most constructed response questions had a graphic organizer, even when there was no need for one. That kind of task does not favor the student who does a lot of work in the mind and does not need to chart out the simple things.

This leads me to believe that the tests are indeed sensitive to instruction--but what sort of instruction, and in what subject? If you teach students to follow directions exactly, use graphic organizers even when they are not needed, and employ every "strategy" in sight for comprehending a contrived text, you might well bring test scores up.

That is not the sort of instruction that should predominate in the schools. We should teach the real stuff. Grammar. Literature. Word structure and etymology. Essay writing. Logical thought.

Diana Senechal

Correction to my first paragraph: I meant "ability" in quotes, not innate ability. I understand your point that so-called "ability" tests are actually sensitive to racial/socioeconomic differences. I withhold judgment on that particular point. I mainly question your assertion that achievement tests are essentially "ability" tests, for the reasons I gave above.

Being a teacher myself, I am obviously concerned with the apparent attitude that many Americans seem to have about the teaching field in general. There seems to be a great amount of hostility out there among people who think that teaching kids is an easy job, that teachers have it made, get their summers off, aren't doing their jobs, etc. I'm not going to attempt to try and defend the field of teaching here, but I DO have to say that one of the things I see missing in a lot of these opinions about what should be done about the poor public schools (and many of us, even teachers, regardless of differing political leanings, agree that the public schools are not all that they should be) is the lack of mention of learning attitudes on the part of students and their parents.

I've been teaching 20 years, and while I do not consider myself to be the end all, be all in the field of pedagogy, I have to say that I have met many a parent and many a child who was not the end all, be all as far as being willing to LEARN either. Not to mention this in any discussion of how to improve our schools is an important, and I suspect, a politically correct and convenient, omission as well. It seems that many of our "experts" in education, whether they are degreed and published, or merely of the "armchair" variety, conveniently leave out the oft observed by myself and any other teacher who has their feet planted in reality and not up in the clouds, (my opinion) fact that some kids just don't WANT to learn! At least, they don't want to learn what we are being told to TEACH them! And it doesn't MATTER what methods we use. I have seen a great variety of teaching styles implemented by teachers I have worked with, but with the kids I am talking about here, it simply DOES NOT MATTER!!! They, and quite often their parents, are hostile to the fact that we are even ATTEMPTING to teach them the material we are told to teach them by our superiors AT ALL! Yet, if these kids end up getting bad GRADES on their report cards, their parents are the FIRST ones up at the school complaining about the teacher.

I have noticed this phenomenon from the start of my career, and it persists to this day. It seems that it just does not compute with these folks that if I have twelve to twenty grades in my gradebook in a six week period, and seven of those grades are zeros because the child didn't DO the work (often despite repeated attempts and notices from me to them and/or their parents that this is happening), they CAN'T logically expect to get a passing grade! Then they expect ME to give "extra credit" work during the last six weeks or so in order that their child may get "passed on" to the next grade level. Many teachers actually bow to this pressure (and the attendant pressure from principals to pass the kid on), and do this. Then later on down the line, the kid graduates and has learned nothing, either academically or about society's expectations of achievement and effort in the real world, and once again, it is the teaching field that gets ALL of the blame.

Now I'm not talking about kids who CAN'T do the required work here; I am talking about all of those (and there are a LOT at my school) who simply WON'T! In my school district, a child may "redo" any failed work taken for a grade, thereby earning a passing grade of 70 on the "redone" work. It is a "second chance." Many kids I see will not even do THAT!!!! Then their parents tell me, "Oh, I JUST don't know WHAT TO DO!!" "We have taken EVERYTHING away from him/her and he/she has been grounded for a month!!" Sometimes this may be true, sometimes not, but the point is, if THEY can't correct the behavior of their own child, then WHO CAN??!!

Education in our country seems to be an issue where we TALK a good line, and even throw a LOT of money at the problems hoping that they will then proceed to just go away, but we then give a nod and a wink when our kids themselves don't TRY, like hating school is to BE EXPECTED by children. And I'm NOT talking about a MILD dislike here; I mean absolutely HATING and DETESTING school, the TEACHERS, and everything that goes along with it!!! And these kids are NOT SHY about letting the teachers know just what they think of us!!!

Now I remember being a kid, and believe me, I was not the best or most cooperative of students either, but in my case, being at school was better than being at home, where my parents were constantly arguing and taking out a bad marriage on me, their only child. But overall, I took school as a joke; one to be laughed at while goofing with my friends, all the while getting the minimum amount of work done in order to get by and get passed on. The kids I am talking about aren't joking. They are simply NOT going to cooperate, and they DARE you to do anything about it, which, quite often, administration WON'T.

Now perhaps I am just an old, "burned out" teacher who should retire, but unfortunately, I STILL happen to think that learning to read and comprehend, learning to write where OTHERS can comprehend what you are writing, learning basic math, science, history, and geographical skills and information, etc. are important even today in our modern, technological society. Maybe it's just ME here! Would anybody care to comment?

Diana Senechal says: "This leads me to believe that the tests are indeed sensitive to instruction--but what sort of instruction, and in what subject? If you teach students to follow directions exactly, use graphic organizers even when they are not needed, and employ every "strategy" in sight for comprehending a contrived text, you might well bring test scores up."

I agree. This is the "teaching to the test" that accounts for nudges in test score "gain." But that's not the kind of test sensitivity or instruction we're looking for.

It's hard to say what "ability tests" measure. The SAT started out as the Scholastic Aptitude Test, but it's been diminished to the acronym--like IBM, GE, GM--it's whatever you want it to be.

Comment on Shill's post: No question that such kids are in the system. However, when they entered K, they weren't that way. If a child has not been taught to read by grade 3, current schooling is largely incapable of teaching them--the curriculum in all subjects presumes they can. But there is no record of what was or was not taught early on, so the kids and their parents take the hit.

It's never too late to learn the rudiments of reading and math. But the prevailing instructional product/protocols and testing practices are not "fit for use."

This puts teachers like you between the rock and the hard place.

Personally, I don't think you should have to put up with kids like this, but I'm not your school principal. If I were I'd join you in a teacher-parent(s)-student conference and have a talk about "the instructional facts of life." If it's not possible to get a parent and kid to "try hard" the student is an "instructionally dead kid walking."

President Obama gave kids and parents marching orders to assume responsibility as a patriotic duty. If they won't listen to the President, they're unlikely to listen to you.

The problem is consistency.

Imagine you had a criterion-referenced test with five performance levels: Below Basic, Basic, Proficient, Advanced, and Accelerated. You had a school with 250 students in each grade. You had students' 4th grade and 5th grade math and reading scores, the same students from one year to the next. Reviewing the data, you saw that something strange had happened.

In math, between 4th and 5th grade, 4% of the students who were Below Basic moved all the way to Accelerated, from the very bottom to the very top in one year. Ten percent of the Below Basic students moved all the way to Advanced. That could indicate a real "Beating the Odds" effect; in one year, about 14% of the very worst performing students not only reached Proficient, but went above and beyond.

On the reading test, the results were even more surprising. Of the students who were Below Basic in 4th grade, 12% jumped to the top, Accelerated, and another 14% made it to Advanced. That means that 26% of the students who were at the very bottom of the score distribution moved all of the way beyond Proficient -- in one year. Remember: these were the same 250 students from 4th grade to 5th grade. That is really beating the odds.

But then you notice that the opposite also happened to the best students. Of the students in the highest category in 4th grade math, 2% fell to Below Basic in 5th grade and 8% fell to just Basic. About 10% of the very best students fell to below Proficient in just one year. Reading scores were worse. Of those who were in the highest category in 4th grade reading, 8% fell into the lowest category the next year and 14% more were just Basic. That means 22% of the students in the very top category in reading in 4th grade were not even Proficient in 5th grade. What happened in just one year?

What about the students who had been in the middle? For math, for the students who had been in the middle category, Proficient, in 4th grade math, they were all over in 5th grade math: 16% Below, 24% Basic, 26% Proficient, 18% Advanced, and 16% Accelerated. Reading was just as strange with those in the middle in 4th grade spread out over the categories in 5th grade: 26%, 24%, 16%, 22%, and 12%.

Would anyone call that a reliable criterion-referenced test? Those were the same students from one year to the next. We expect some change, but those results are ridiculous with up shooting down, down shooting up, and the middle scattering high and low. You would probably conclude that there had been a serious error with the test scoring process.
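For concreteness, the middle-category scatter described above can be laid out as rows of a transition table, using the percentages quoted in the hypothetical. With five categories, pure chance would put about 20% in each destination, so a consistent measure should concentrate far more than that on the diagonal:

```python
# 5th-grade destinations of students who were "Proficient" in 4th grade,
# as percentages: [Below Basic, Basic, Proficient, Advanced, Accelerated].
# Figures are the ones given in the text above.
math_row = [16, 24, 26, 18, 16]
reading_row = [26, 24, 16, 22, 12]

for subject, row in [("math", math_row), ("reading", reading_row)]:
    stayed = row[2]  # percent who remained in the Proficient category
    print(f"{subject}: {stayed}% stayed Proficient (chance level ~20%)")
```

In reading, staying put is actually *less* likely than chance would predict, which is exactly the scoring-error impression the paragraph above describes.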

Of course, those unusual patterns are not from a real criterion-referenced test. Those are real, reported "teacher effects" calculated using a 3-year value-added metric. It is not 4th and 5th grade, but pre- and post- tenure, the year before and after tenure, adjacent years. The proficiency levels are quintiles of performance. There really were 250 teachers and the calculations were done using the most sophisticated statistical techniques -- all the new technology that makes those calculations possible. And the results really were that chaotic. See for yourself. The data are directly from tables 1 and 2, Goldhaber and Hansen, CALDER POLICY BRIEF, November 2008, "Assessing the Potential of Using Value-Added Estimates of Teacher Job Performance for Making Tenure Decisions."

This is the kind of result being reported by experts who are in favor of value-added. Goldhaber and Hansen cite some of their research that shows that teacher effects are very large. But they admit they don't know why. "Thus, our decomposition suggests that changes in teacher quality within a teacher over time are, like teacher quality itself, almost entirely attributable to unobservable factors." (pg. 3) "Nevertheless, pre-tenure estimates of teacher job performance clearly do predict estimated post-tenure performance in both subjects, and would therefore seem to be a reasonable metric to use as a factor in making substantive teacher selection decisions." (pg. 7)

How could such seemingly random shifts in levels be called a clear prediction? Correlations. Variance. Statistics most people don't understand. But most people are smart enough to realize that, if those value-added quintiles were performance levels on a criterion-referenced test, parents would be mad as hell. Students don't go from the worst to the best to the worst from one year to the next.

But the fact that value-added can't identify which teachers will be excellent from one year to the next doesn't impede the pundits from thinking about assigning students to those unidentifiable teachers.

In the NY Times, Nicholas Kristof repeats the well-worn theory of teacher effects. "The reform camp is driven partly by research suggesting that great teachers are far more important to student learning than class size, school resources or anything else. One study suggests that if black kids could get teachers from the profession’s most effective quartile for four years in a row, the achievement gap would disappear." ("Education's Ground Zero." NYT, published March 21, 2009).

In the Goldhaber and Hansen study, only 44% of top quintile reading teachers stayed in the top quintile the next year. Only 42% of math teachers in the top quintile stayed in the top quintile the next year. So, even if you had assigned some students to a top quintile teacher using this year's value-added scores, more than half the time that teacher wouldn't still be a top quintile teacher by the start of the next school year. That would make assigning four top quintile teachers, one after the next after the next after the next, virtually impossible.
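The "four great teachers in a row" arithmetic is easy to check. The sketch below averages the study's 44% (reading) and 42% (math) persistence figures and treats each year's assignment as independent, which is a simplifying assumption on my part:

```python
# Year-to-year probability that a teacher identified as top-quintile is
# still top-quintile in the year they actually teach the student,
# averaging the 44% reading and 42% math figures quoted above.
persistence = (0.44 + 0.42) / 2

# Each of the four assignments is made from the PRIOR year's value-added
# scores, so each assigned teacher has only ~43% odds of still being
# top-quintile when class starts. Treating the four years as
# independent (an assumption), the chance all four pan out:
p_four = persistence ** 4
print(round(p_four, 3))  # about 0.034, i.e. roughly a 3% chance
```

On those odds, the Kristof scenario of four consecutive top-tier teachers would materialize for only a handful of students even if districts assigned teachers by value-added score on purpose.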

Goldhaber and Hansen are to be commended for their transparency. The year by year quintile table should be required for every district participating in the merit pay experiments conducted at taxpayer expense. Those tables make clear what a lot of statistical mumbo jumbo has a way of hiding. Anyone who can read a table can see that either teachers are completely inconsistent in their skill from year to year or judgments based on this metric are highly unreliable. Either way, this is a fatal flaw in the use of these data.

Yep. President Obama's advisers are telling him to chase error variance. They don't know what error variance is, but they've swallowed the rhetoric of "value added models." Tragic.

Good reading. All of it!

Margo/mom! You don't need to rest your arguments on just being a dumb midwestern mom! You're in the same camp--re smartness and scholarliness--as everyone else on this blog.

It's inconceivable to me that we should not recognize that we pass on every advantage we can to our own children. But we "can't" do so equally because we don't have equal advantages. Ditto for IQ scores--which are, in fact--as someone noted--not substantially different from so-called "achievement" tests. A test is created out of a pool of items which have been pre-tested to ensure that the right kids get the right answers right.

But my greater concern is that if we rest education on so-called objective short answer or multiple choice answers, we are skewing the climate of education itself in a particular direction. We are undermining the role of judgment and overemphasizing the role of right factual answers. Of course good judgment rests on knowledge and reason and experience, but the synthesis of these is greater than any one alone. What's impressive or depressive about standardized tests is that while right-answer pedagogy is a short-cut way of trying to raise scores, the best test-takers are those adept at making quick intuitive judgments based on a lifetime of experience, only a small part of which rests with schooling itself. The logic of this, for me, has been to consider how to use the school day to best drive learning during the nonschool day/year. To create settings which lead kids to thinking, seeing, and experiencing the other 4/5 of their waking hours in more powerful ways.

That's why early so-called academic schooling is so interesting to me--how we rest it on the least productive, efficient and powerful tool that children are all born with and have used remarkably well in the years before they enter our doors. Boredom and fear of appearing stupid are amazingly powerful ways to undermine young children.

It's not innate vs. schooled--yes, it's partially "innate"--which includes our predispositions, styles of learning, pace, etc., etc.--and partially experienced: the way life has responded to us, judged us, and valued us. And yes, these are all partially social class and a bunch of other nonschool characteristics. They overlap, thus playing a large role in outcomes. Although of course, there are plenty of exceptions at both ends of the SES scale.

There was no way as a young girl that I wasn't influenced to see myself as a devoted fan while my brother saw himself as a player. We sat there together at Yankee Stadium, equally enthralled, but physically and mentally engaged in a different activity. Still, there were those exceptional women who defied the odds and became pro ball players during WWII. But to presume that demonstrates that sexism outside the school was irrelevant, and that it all rested on the phys ed teacher, would be clearly missing the impact of the broader reality.

Teacher training should include blogs like this. Thanks!


I REALLY enjoy this blog and the comments by readers that follow. I'd like to build on what Dick Schutz said about "teaching to the test" and about how if your child hasn't been "taught to read by 3rd grade, he/she won't be taught in school." I'd like to apply those same thoughts to the idea of "college and career readiness," a topic that the Obama administration is throwing around and connecting to the Stimulus Package.

Our schools put incredible effort into teaching and then testing literacy and numeracy. But no one teaches how to apply for college, how to get financial aid and grants, etc. Instead we leave it up to students to mosey down to the guidance office, thumb through brochures, and figure it out with the help of their parents. And yet, we are supposed to "make the U.S. the world leader in college graduates by 2020." -- WHAT!? HOW!? What if we actually TAUGHT students how to apply for college, what a B.S. was, why you would need a graduate degree, how to apply for financial aid, what colleges were looking for -- can you imagine how many would apply and go to college!

And to the point about teaching to the test, I say go for it in this instance. Ask students if they know how to apply, what B.S. stands for, etc., then put them through a class and ask them again. Of course they'll improve, and GOOD. There is a company called Envictus Corporation out of Washington D.C. (http://www.envictus.com/products_model_results.html) which has created a course just like this and is already seeing jumps of 10-30% in the numbers of high school kids going to college after taking the "college and career ready" course.
