May 2009 Archives

May 25, 2009

Five Good Assumptions about School Change

Education Week founder and former editor Ron Wolk did us all a big service a month ago when he wrote this op-ed criticizing what he termed “Five Faulty Assumptions” of the pivotal report, “A Nation at Risk.” Wolk pointed out the flaws in each assumption, and his piece should be read and re-read, especially by those empowered to make education policy.

Here in my little corner, I want to build on his critique, and offer some alternative assumptions. So let’s see if we can take these five faulty assumptions, and replace them with sound ones.

(Faulty) Assumption One: The best way to improve student performance and close achievement gaps is to establish rigorous content standards and a core curriculum for all schools—preferably on a national basis.

New Assumption One: The best way to improve student performance and close the achievement gap is to turn each school into a powerful community of learners, where a stable core of teachers model collaboration and creative problem-solving as they improve instruction as a team. This school community extends beyond the walls to include the parents, families, and businesses in the area, so that education is supported by everyone, and learning is connected to the aspirations of the community.

We need to replace the rhetoric about preparing every child for college with a reality that gives more of them a chance to attend and succeed there. There needs to be greater access to scholarships, and an elimination of financial barriers that currently keep most working class students out of the best schools.

(Faulty) Assumption Two: Standardized-test scores are an accurate measure of student learning and should be used to determine promotion and graduation.

New Assumption Two:
The most powerful assessment is that which is done in the context of learning, within the classroom. There is a valid role for standardized tests, to provide an external yardstick, providing all of us with a reality check on how our students compare. But classroom-based formative assessment, connected to ambitious authentic learning, can provide students and teachers with valuable information needed to grow. Schools should be challenged to create projects and assignments that demonstrate this learning to the public, and open the school’s walls so learning is visible.

(Faulty) Assumption Three:
We need to put highly qualified teachers in every classroom to assure educational excellence.

New Assumption Three:
We need to retain and develop the capacity of the best teachers, and transform them into leaders of strong collaborative communities, where the best practices are developed and shared. The schools in our most troubled districts have huge turnover rates, and programs that emphasize recruiting smart people into these schools miss the point. Smart people figure out very quickly that these are incredibly tough places to feel effective – and they leave. We need to boost pay, and honor the expertise of those who are successful in these settings. They will show us the way.

(Faulty) Assumption Four: The United States should require all students to take algebra in the 8th grade and higher-order math in high school in order to increase the number of scientists and engineers in this country and thus make us more competitive in the global economy.

New Assumption Four:
More mathematicians and scientists will serve our nation well. But so will more historians, more artists, more writers, more carpenters, more auto mechanics and more musicians. Our schools should offer students opportunities to develop in the areas where they are gifted, and encourage the pursuit of needed occupations through scholarships for advanced study. Forcing all students into Algebra whether they are ready or not will lead to another generation of kids who associate math with difficulty and failure.

(Faulty) Assumption Five: The student-dropout rate can be reduced by ending social promotion, funding dropout-prevention programs, and raising the mandatory attendance age.

New Assumption Five:
Students are voting with their feet, and in our toughest schools far too many are leaving. Teachers at schools with high dropout rates should be empowered to collaborate with one another, and with student leaders, to develop innovative programs to transform the schools into places more strongly connected to students’ lives. Students need to feel a direct connection between their education and their future, and that needs to begin before middle school and continue through graduation. Mentors, role models and community connections can bring students an awareness of how a solid education can help their families in the future. Students should be aware of the many pathways to success, from community college and four-year universities, to on-the-job training and entrepreneurship.

What do you think? Any faulty assumptions you would like to challenge? Any new assumptions you would like to offer?

May 21, 2009

Rothstein Interview Part 4: How About Performance Pay?

Last week I posted Part 1 and Part 2 of a four-part interview with author Richard Rothstein. Monday I posted Part 3, and today I post the fourth and final segment, focused on the trouble with performance pay and some fresh ideas for building accountability for our schools.

6. There is much discussion of providing financial incentives for teachers who improve student achievement. Is this a wise strategy?

We should be cautious about this strategy because we do not yet (and may never) know how to measure accurately an individual teacher's contribution. Teachers know that in some years they get “good” classes, and in others more difficult ones, even with similar student demographic characteristics. Variation in the cognitive ability of students in different classes in the same grade and in the same school is also often deliberate. Good principals assign students to classes by matching students' and teachers' strengths and weaknesses. Tracking, where students are assigned to classes based on their prior performance, continues to characterize many American schools. Pay-for-performance schemes require comparing the performance of teachers facing similar challenges. Because teachers in the same grade and in the same school rarely face similar challenges, pay-for-performance schemes are unlikely to distinguish superior teachers with sufficient accuracy. These schemes may simply reward teachers who, from luck of the draw or from pupil assignment policy, happen to get classes for a year or more in which posting gains is easier.

Pay-for-performance proposals typically want to base merit pay on math or reading scores. But a teacher who is particularly effective in math instruction may not also be unusually effective in reading. Paying teachers for math or reading scores will, if it works, also give them incentives to ignore curricular areas for which they are not rewarded. Pay-for-performance will, therefore, accentuate the curricular distortions we have already experienced under No Child Left Behind.

There is some evidence from psychology that when people are intrinsically motivated to succeed, and are then given financial rewards for success, these rewards can undermine the intrinsic motivation. Many young people go into teaching with a commitment to children and a belief in education's importance. They want and demand adequate compensation, but the best teachers may be those who, in addition, have a deep commitment to the norms of the profession. We should do more to investigate whether we will undermine that commitment with “pay-for-performance,” before implementing such schemes.

7. If we can agree that the current accountability system is flawed, what system should we be advocating in its place?

I cannot specify a detailed alternative, but experts more qualified than I should begin now to develop one. It will require considerable experimentation to design a constructive accountability policy.

In Grading Education: Getting Accountability Right, I and my co-authors (Rebecca Jacobsen and Tamara Wilder) set forth some proposals to consider. One is expanding the National Assessment of Educational Progress (NAEP) to provide state-by-state comparative information on a wider variety of curricular areas including other academic subjects, like history and the sciences.

A chapter of the book recounts the little-known history of early NAEP, when the federal government reported on behavioral characteristics of a representative sample of American students. NAEP provided information on whether students could work cooperatively, had good health, dietary, and exercise habits, were learning to participate constructively in our civic life, had appreciation and knowledge of the arts and music. A return to this NAEP model could give states the knowledge they need to infer whether their public schools were performing better or worse than those in other states, not only in math and reading but in many more of the curricular areas that comprise a balanced education.

The book also suggests learning from the experience of other nations that have been debating how to hold schools accountable for better performance. Grading Education provides a more detailed description of the English inspectorate system that uses test scores but also sends trained professional experts into schools to evaluate the quality of teaching, as well as students’ behavior and the development of character traits emphasized in the curriculum.

Recently, a committee of the Broader, Bolder Approach to Education (BBA) campaign met to develop principles for a new American accountability system. I was privileged to serve on that committee, and in some respects, its recommendations overlap with those of Grading Education. The principal BBA recommendation is that states hold schools accountable by conducting inspections using trained professional evaluators, able to judge the quality of educational delivery and outcome. Test scores should not be abandoned, but should not be the sole measure of school effectiveness. As of this writing, committee members are polishing their report; it should appear soon on the BBA website. Any readers who wish to receive a copy of the report once it is posted should send a note to boldapproach@epi.org with “send accountability report” in the subject line.

What do you think? Do you think performance pay might undermine teachers' intrinsic motivation? Should educators be looking for alternative means of accountability?

May 18, 2009

Rothstein Interview Part 3: Obama Faces Tough Questions

Last week I posted Part 1 and Part 2 of a four-part interview with author Richard Rothstein. Today I am posting Part 3, focusing on tough questions President Obama must face if he is to live up to his goals of improving educational outcomes.

3. You quote President Obama as being critical of the way NCLB has narrowed the curriculum to focus on tested subjects. Are there indications that steps are being taken to reverse this emphasis?

During the election campaign, President Obama said that NCLB “has become so reliant on a standardized test model that…subjects like history and social studies have gotten pushed aside. Arts and music time is no longer there. So the child is not having the well-rounded educational experience I benefited from and most in my generation benefited from.” We must change NCLB, he said, “so that the assessment is one that takes into account all the factors that go into a good education.”

To date, neither the president nor Secretary of Education Arne Duncan has indicated how the administration plans to apply this insight. It is more difficult than it looks. Any accountability system that uses incentives to improve only some of the goals of education will inevitably undermine the well-rounded education that President Obama supports. Rational educators will de-emphasize curricular areas for which they are not held accountable, to increase emphasis on those for which they are rewarded or punished. Thus far, most Washington discussion about "fixing" NCLB has stressed improving how we assess math and reading – for example, by using gain rather than level scores. However, even if math and reading assessment were improved, holding schools accountable only for math and reading and not for “all the factors that go into a good education,” will necessarily result in continuing to “push aside” arts and music, history and social studies, science, and other curricular areas.

4. There seems to be a consensus in Washington that NCLB can be fixed by making schools accountable for gains in test scores, rather than absolute targets. Will this be a meaningful change?

You are right; this consensus does seem to have captured Washington, but it is ill-considered.

There is a conspicuous conflict between a desire to measure schools by their gains, and a continuing belief that all students can eventually reach the same proficiency point. NCLB requires that students with varied backgrounds and disadvantages must pass standardized tests by 2014. Although the Washington consensus seems to acknowledge that the date should be pushed back a little, no date makes sense if schools are to be evaluated by their gains. Advocates of using gain scores for NCLB accountability do not seem to have abandoned the idea that all students should reach the same level, but maintaining both standards simultaneously is ludicrous.

Should schools with more disadvantaged children, where present scores are lower, be expected to make faster gains than schools where present scores are higher? You might think so, because they have more opportunity to make progress. But perhaps students with more skills can apply them more effectively to learn even more. If so, then schools with fewer disadvantaged children, where present scores are higher, would make faster gains and the score gap will increase. Policymakers who advocate using gain scores to measure school effectiveness have not yet explained how they think such expectations can be adjusted.

Experts do worry about several technical impediments to using gain scores for NCLB accountability. One is that most states do not yet have data systems that can link a student’s test scores in successive years; another is that gain scores must be based on two successive annual tests, compounding the unreliability of each; another is that school evaluations based on gain scores must ignore large numbers of students who switch schools at the beginning of or during a school year; yet another is that schools, even after controlling for demographic factors, may not enroll representative collections of these demographic groups.
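The second impediment, that a gain score compounds the unreliability of two tests, can be illustrated with a minimal sketch. It assumes (this is not stated in the interview) that each annual score is a true score plus independent random error; the function name and the 10-point error figure are hypothetical and chosen only for illustration.

```python
import math


def gain_score_error_sd(error_sd_year1: float, error_sd_year2: float) -> float:
    """Standard deviation of the measurement error in a gain score (year 2 minus year 1).

    Assumes the errors of the two tests are independent, so their variances add.
    """
    return math.sqrt(error_sd_year1 ** 2 + error_sd_year2 ** 2)


# Hypothetical figure: each annual test has a measurement-error SD of 10 scale points.
single_test_sd = 10.0
gain_sd = gain_score_error_sd(single_test_sd, single_test_sd)

print(f"Error SD of one test:   {single_test_sd:.1f}")
print(f"Error SD of gain score: {gain_sd:.1f}")
```

Under these assumptions, the error in the gain is about 1.4 times the error in either test taken alone.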

But these technical discussions avoid the more serious issue I discussed above - an accountability system based on easily measured subjects will inevitably result in narrowing the curriculum, because educators held accountable for math and reading will rationally redirect their effort away from other areas. Holding schools accountable for math and reading gains instead of math and reading levels will do nothing to solve this problem.

Nor can we solve it by testing in other subjects. Some don't lend themselves to standardized testing. Whether schools are teaching students to work cooperatively, to exhibit good habits of civic participation, to resolve conflicts nonviolently, to appreciate the arts and music, or to develop healthy exercise and other lifestyle habits, cannot satisfactorily be assessed by either gain or target scores on paper-and-pencil tests alone.

Reform of NCLB’s accountability design requires more than improving our technical capacity to evaluate the teaching of math and reading. It requires development of systems, requiring qualitative judgment along with testing, that give schools incentives to deliver a balanced curriculum.

5. Can you explain Campbell’s Law? Why do you think the crafters of NCLB ignored this principle when designing their program?

The great methodologist, Donald T. Campbell, studied President Richard Nixon’s “war on crime” in the 1970s and concluded that “the more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” Campbell found that when police departments were held accountable for reducing crime rates, reductions were achieved by manipulating statistics, not by better policing.

Even before Campbell’s study, and certainly afterwards, social scientists have observed that institutions other than schools are invariably corrupted by accountability only for narrow, quantitative performance measures. Grading Education illustrates this with examples from many public and private policy fields. For example, workforce training agencies, held accountable for placing unemployed workers in jobs, reduced educational programs leading to high quality, long-term careers, instead emphasizing placement of large numbers of workers in short-term, unskilled jobs that boosted agency success statistics. Cardiac surgeons, held accountable for patient survival rates, demonstrated superior performance by refusing to operate on sicker patients. U.S. News and World Report evaluates the quality of colleges by the percentage of applicants who are accepted; colleges have reduced their percentages (and boosted ratings) by waiving application fees for unqualified high school graduates. The narrowing of curriculum under NCLB, the “teaching to the test,” the opportunistic focus of instruction on students who could boost “adequate yearly progress” statistics rather than on students who most need attention, all have been foreshadowed by similar experiences in other fields.

I really don’t know why the Bush Administration, Congress, and policy advocates who crafted NCLB ignored this overwhelming body of experience, especially because, even in the private sector, institutional accountability rarely relies on simple quantitative indicators. Instead, qualitative evaluation in the private sector is commonplace.

What do you think? Do you see the narrowing of curriculum Richard Rothstein describes? Has our emphasis on test scores distorted our educational system? What do you think of the steps President Obama has taken thus far?

May 13, 2009

Rothstein Interview, Part 2: International Comparisons Miss the Mark

Earlier this week I posted Part 1 of a four-part interview with author Richard Rothstein. Today I am posting Part 2, which focuses on the dire warnings we have heard over the past few decades, echoed recently by President Obama, that the United States is in danger of falling behind other nations due to our poor educational system.

Question 2: It is often said that our students are falling behind those in other nations. Is this the case? What should we do about it?

American students perform less well in mathematics than students in many other industrialized nations and in East Asian nations. We do relatively better in elementary and worse in middle and secondary school. Explanations range from an American curriculum that is “a mile wide and an inch deep” (with superficial treatment of too many mathematical topics), to (as in Malcolm Gladwell’s recent book, Outliers) the fact that Asian languages have more literal words for numbers (“ten-two” rather than “twelve”) and that Asian rice cultivation inspired a stronger cultural belief in hard work than American wheat farming did.

Evidence of American inferiority in curricular areas other than math is skimpier. In reading and civics we do quite well on some comparisons.

Do we have an education-driven competitiveness crisis?

These international comparisons don't really matter. Our math and reading scores are apparently quite adequate for economic competitiveness (although the recent collapse of the speculative bubble may suggest that our biggest shortcoming is in the teaching of ethics, character and judgment).

Until the asset bubble burst last year, American productivity growth was extremely rapid, about 2.2 percent a year from 1989 to 2006, outpacing that of comparable industrial nations whose test scores in mathematics were higher. Some economists now wonder whether our productivity growth was superficial, including, as it does, financial sector gains attributable to the speculation. Yet even if we subtract the contribution of financial services to overall productivity growth both here and abroad, the United States still performed as well as comparable nations, both in absolute productivity and in its rate of growth.

A good education system is necessary for such growth. Well-educated innovators develop new technologies, and well-educated workers can utilize them. Yet our school system seems quite adequate for these purposes.

Most Americans have seen their incomes stagnate in recent decades, even when the economy was growing rapidly. Some commentators (for example, authors of the widely-publicized 2006 report, Tough Choices, Tough Times, issued by the National Center on Education and the Economy) have attempted to blame this income stagnation on the failures of our public schools. But as Lawrence Mishel and I (in an appendix to Grading Education) argued,

while adequate skills are an essential component of productivity growth, workforce skills cannot determine how the wealth created by national productivity is distributed.... American middle-class living standards are threatened not because workers lack competitive skills but because the richest among us have seized the fruits of productivity growth, denying fair shares to the working- and middle-class Americans, educated in American schools, who have created the additional national wealth.

Mishel and I also note that crisis warnings about internationally comparative test scores are not new. A quarter-century ago, the Nation at Risk report concluded that failing public schools were responsible for American firms' loss of market share to Japanese automobiles, German machine tools, and Korean steel. A 1990 report of the same National Center on Education and the Economy engaged in similar hand-wringing. Yet the American economy out-performed the economies of Europe and Asia in the 1990s; indeed Japanese auto manufacturers set up plants in the U.S. and found public high school graduates in the southern states - where test scores are typically lowest - to be appropriately skilled for Japanese manufacturing methods.

We already produce more college graduates than the American economy can absorb. This does not mean that we should stop increasing the rate of college graduation – there are other important reasons to educate a population, having to do with our civic and cultural life, than economic ones.

If we truly had a shortage of skills, simple economic theory would lead us to predict that young college graduates with the greatest skills would see rapid increases in wages. Yet college graduates' wages have been rising mostly because, before the bubble burst, wages in finance, sales, and administration were going up. Science, technology, engineering, and math wages were mostly stagnant, at best. This indicates no shortage of skills.

Partly, the sufficient skill supply is attributable to immigration of well-educated workers. This immigration will continue, and the entire nation will continue to benefit from an economic surplus of education.

The new McKinsey report
Last month, a report by the McKinsey consulting firm revived the complaint that poor-quality schools threaten our economic security. The report (with an encomium by Thomas Friedman in the New York Times) claimed that the poor achievement of low-income children was costing the nation between $400 billion and $670 billion a year in lost productivity, or 3 to 5 percent of our gross domestic product. The achievement gap, it said, puts the nation into the equivalent of a permanent economic recession and will "almost certainly act as a drag on overall U.S. economic performance in the years ahead."

The McKinsey report came to this conclusion by relying on a regression analysis by Stanford economist Eric Hanushek and colleagues, showing a high correlation between countries' test scores and their economic growth rates: a standard deviation increase in scores (about 33 percentile points for a country with average test scores) is associated with almost 2 percentage points of higher economic growth. McKinsey then took this percentage and multiplied it by the share of low-income and minority participants in the U.S. workforce, by the size of the achievement gap, and by the historical growth rate of the U.S. economy, to calculate its estimate of the hundreds of billions of dollars that the achievement gap costs.
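The link between a standard deviation and percentile points can be checked with a short sketch. It assumes (the interview does not say this) that country scores are roughly normally distributed; the function name is hypothetical. Under that assumption, the result lands close to the "about 33 percentile points" quoted above, though the real distribution of country scores may differ.

```python
from statistics import NormalDist


def percentile_after_shift(start_percentile: float, sd_shift: float) -> float:
    """Percentile reached after moving sd_shift standard deviations from start_percentile.

    Assumes scores follow a normal distribution.
    """
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    start_z = standard_normal.inv_cdf(start_percentile / 100)
    return 100 * standard_normal.cdf(start_z + sd_shift)


# A country at the median (50th percentile) that gains one standard deviation.
new_percentile = percentile_after_shift(50, 1.0)

print(f"New percentile: {new_percentile:.1f}")
print(f"Gain in percentile points: {new_percentile - 50:.1f}")
```

The same one-standard-deviation gain is worth fewer percentile points for a country that starts far from the median, which is why the figure is stated for a country with average scores.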

It is safe to guess that few, if any, of the journalists who promoted the McKinsey conclusions examined them carefully, or recalled what they had been taught about the dangers of assuming causation from a correlation. The Hanushek regression line relies on facts such as, for example, that South Korea has had high test scores and rapid economic growth while the Philippines has had low test scores and slow growth. But surely nobody can believe that if the Philippines could somehow raise its test scores, that country would then mimic South Korea's economic growth rates. Although well-educated workers were certainly necessary for South Korean economic success, the country also benefited from enormous American subsidies (Hyundai got started as a contractor for the U.S. military, using U.S. surplus military equipment), its steel industry was initially financed with war reparations from Japan, and the nation followed an industrial policy that prohibited imports, manipulated exchange rates, and provided free credit to favored industries. South Korea had a 30 percent savings rate, with consequent capital investment. The Philippines had none of these advantages or characteristics. Although this is an extreme comparison, every country on Hanushek's regression line has a unique story that includes more than its test scores.

Hanushek acknowledges that the United States – with low test scores and high economic growth - fits his regression line very poorly. But Hanushek dismisses the significance of this challenge to his theory by saying that American economic success was attributable to "generally less intrusion of government in the operation of the economy," and weak labor unions. But the regression line still proves, he implausibly claims, that our future economic security will require higher test scores.

This bottom line remains: the United States needs a well-educated population to grow and prosper. Education levels in the U.S. have improved over the nation's history, and our economy has taken advantage of its workforce education levels. There are many reasons to improve our education system – the quality of our cultural and civic life depends on it. But there is little reason to believe that the American economy has suffered from insufficient workforce skills, or is likely to do so in the future.

Do we have an education crisis?

The area where we fall most short is in the low percentage of students from disadvantaged families who graduate from high school and then college, prepared to compete for the most remunerative and technically skilled jobs that become available in our economy. This is not a problem of international competitiveness - unfortunately, we can compete just fine, using a mostly white and immigrant professional and technical class. It is a problem for our own identity as a nation, for the quality of our civic life, for the integrity and values of our future citizenry. This is the reason to improve our education system, not because of international test score comparisons.

An intriguing result of international testing is that students in some American states, particularly those with relatively few minority and economically disadvantaged students, perform as well as students in the higher scoring countries, even in math and science. The relatively poor achievement of American students overall is attributable, at least in part, to our greater socioeconomic inequality and shamefully high child poverty rate, compared with other advanced nations.

And that shamefully high rate of child poverty is destined to go much higher as unemployment continues to rise in the current recession. Estimates by Lawrence Mishel of the Economic Policy Institute are that the child poverty rate will rise from 18% to 27%, and the rate for black children will rise from 35% to 52%: yes, over half of all black children will likely soon be living in families with income below the poverty line. This will have a palpable impact on academic achievement – they and their families will be under greater stress; they will be more mobile, changing schools and teachers more often; their dreams of college will be dashed. Other industrialized countries have a stronger safety net to help vulnerable sub-populations weather the worldwide recession. The economic catastrophe we are now suffering will inevitably widen the achievement gap.

This, and not a false focus on international test comparisons, should be the crisis that grabs our attention.

Once we have recovered from the recession, we will not succeed in sharing our renewed prosperity with youth from disadvantaged families without doing a better job of preparing them to take advantage of what schools have to offer. Last year, a diverse and bi-partisan group of researchers, practitioners, and policymakers called for a "Broader, Bolder Approach to Education (BBA)" (www.boldapproach.org) that would combine school improvement with the social, economic, family and community supports that enhance achievement. In particular, BBA urges the nation and the states to narrow the achievement gap by implementing high-quality early childhood care and education for all disadvantaged children; by providing routine and preventive pediatric, dental, and optometric care for all disadvantaged children (in full service school-based health centers, for example); and by ensuring that disadvantaged children have access to enriched academic content, as well as opportunities for social and emotional skill-building in cultural, organizational and athletic experiences during out-of-school time (after-school, weekend, school-year vacation, and summer hours).

Richard Rothstein's latest book, Grading Education: Getting Accountability Right, is now available. Rothstein is also part of the new project A Broader, Bolder Approach to Education. Part Three, coming next Monday, will describe the effects NCLB has had on our curriculum, and what Campbell's Law tells us about the distortion of our goals.

What do you think? Should we focus more attention on childhood poverty and less on international comparisons?

May 10, 2009

Rothstein Interview Part 1: National Standards are a Quagmire

Former New York Times columnist Richard Rothstein has emerged as one of the nation’s sharpest critics of the current test-centered approach to education reform. Six weeks ago I posted a review of his recent book, Grading Education: Getting Accountability Right.
I thought it would be great to hear his comments on the debates raging over how to fix NCLB, and proposals such as national standards. Here is part one of a four-part interview:

1. In Chapter 4 you describe how a student who scores as proficient in 8th grade math in Montana could go a few miles across state lines to Wyoming and be far below proficient. Because of such embarrassments, many education policymakers now advocate requiring states to adhere to higher, common standards. Would such a reform correct the problem?

The widespread call for higher common (or national) standards has little relevance to the problem it pretends to address – that states manipulate passing points on their tests under the pressure of NCLB’s accountability requirements.

Although proficiency data from various states are not comparable, we already have adequate means of comparing student performance in Montana and Wyoming and in every other state, at least in math and reading in the elementary grades. We can do so by using results from the National Assessment of Educational Progress (NAEP), a federal test given to a representative sample of students in every state.

Advocates of national standards typically confuse three things:
* standards
* test coverage (or alignment), and
* cut points (proficiency, or passing scores).
Standards are descriptions of the knowledge and skill that teachers should cover in each grade. Tests reflect whether students have gained that knowledge and skill. Cut points are the number of questions on such tests that students must answer correctly to pass, or to meet accountability targets.
In practice, standards, tests, and cut points are often designed with little regard for each other.

Alignment of standards and tests
State test questions should cover a representative body of the knowledge and skill that state standards say should have been learned. Because a curriculum covers a large span of knowledge and skill, any test of one hour or so must select only a small portion of a year's standards to assess.

Typically, states select only the simplest standards for tests used for accountability purposes. State officials claim that their tests are "aligned" with state standards because each question on a test covers something found in the standards. But when these questions cover only the simplest skills in the standards, students who do well on the tests may still not have learned a representative selection of what the standards say they should have been taught.

Making matters worse, many states have standards so comprehensive that they could not possibly be delivered in a year-long curriculum. These standards are "high," but have little relationship to reality. Because a national standards-setting process would likely be controlled by elected officials and policy advocates, not classroom educators, efforts to establish high common standards will likely have even more fanciful results.
Many states have high standards and easy tests. Establishing high common standards will do nothing to solve this problem.

Cut points
Once a test has been adopted, NCLB requires states to establish a cut point, or passing score. With tests assessing the same underlying knowledge and skills, a state can have a high passing score, showing a small proportion of students “proficient,” or a low passing score, showing a high proportion of students “proficient.” A state can have higher standards and a low passing score, or lower standards and a high passing score.

Now let’s return to your Montana-Wyoming question. Even if we had high common standards, Montana could have a test that sampled an easier portion of the common curriculum, and Wyoming could have a test that sampled a more difficult portion of the common curriculum. Or, with a common curriculum and comparable alignment, Montana could establish a low passing score on its test and Wyoming could establish a high one. A large share of Montana's students and a small share of Wyoming's would then be deemed proficient.
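To make the cut-point arithmetic concrete, here is a minimal Python sketch. The scores and both passing scores are invented for illustration and are not real Montana or Wyoming data; the point is only that one identical set of student results can produce very different "proficiency" rates depending on where the passing line is drawn.

```python
# Hypothetical scores (questions answered correctly out of 50) for one
# group of students. These numbers are invented for illustration.
scores = [12, 18, 22, 25, 27, 30, 33, 36, 41, 46]

def percent_proficient(scores, cut_point):
    """Return the percentage of students scoring at or above the cut point."""
    passing = sum(1 for s in scores if s >= cut_point)
    return 100 * passing / len(scores)

low_cut = 20   # hypothetical low passing score
high_cut = 35  # hypothetical high passing score

low_rate = percent_proficient(scores, low_cut)
high_rate = percent_proficient(scores, high_cut)

# Same students, same answers, different share labeled "proficient".
print(f"Low cut point:  {low_rate:.0f}% proficient")
print(f"High cut point: {high_rate:.0f}% proficient")
```

With these made-up numbers the low cut point labels 80% of students proficient and the high cut point labels 30% proficient, although nothing about the students' achievement has changed.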

If we want students in Montana and Wyoming with the same achievement to have the same chance of passing accountability tests, we need a national test, with questions drawn from the full grade-level curriculum, with a single passing point – not national standards alone. Establishing a national test, however, is widely regarded as politically impossible. President Clinton proposed voluntary national tests and even this was shot down. There is today a new attempt, led by the Council of Chief State School Officers and the National Governors Association, to create national standards. The success of this effort is uncertain; even less certain is whether, if states voluntarily adopted common standards, voluntary national tests would follow.

Already, the test-based accountability coalition is splintering on this issue. Former Secretary of Education Margaret Spellings, for example, recently denounced the call for higher, common standards because it will interfere with NCLB's goal of closing the achievement gap (when defined as achieving a low level of proficiency) by 2014. She's right, if national standards lead to requiring higher cut scores on a more difficult test.

Can state tests be "equated"?
Alternatively, we could require Montana and Wyoming to establish passing points on their respective, very different tests that reflect a similar achievement of knowledge and skill. If one state, for example, had a relatively easy test, NCLB could require a larger number of correct answers for passing; if another state had a relatively harder test, NCLB could require a smaller number of correct answers for passing. Precision in this exercise would not be possible, but it is technically feasible to determine what roughly equivalent passing points should be.

But it is hard to imagine how this could be accomplished in practice. It would take considerable time and expertise to create such definitions – a sample of students would have to take both state tests, or a new common test, and their scores on each test would have to be compared – and when a state changed its test, the effort would have to be repeated. To equate the tests of many states would be more complex, and the processes would have to be repeated frequently because states must change their tests frequently to make the precise questions unpredictable and minimize “teaching to the test.” (Many states now change their test questions in minor ways, but don't change the portion of the curriculum the test covers. Such changes do only a little to avoid teaching-to-the-test corruption, but would still require new equating studies to determine if passing points were similar.)
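As a rough illustration of what such an equating study involves, the sketch below assumes a small sample of students who took both an easier test A and a harder test B. All scores are invented, and the matching rule (pick the test B passing score whose pass rate is closest to test A's) is a simplified stand-in for the more careful statistical methods a real study would use.

```python
# Hypothetical paired results: each tuple is one student's score on
# (easier test A, harder test B). Invented numbers for illustration.
paired = [(48, 35), (45, 30), (44, 31), (40, 26), (38, 24),
          (35, 22), (33, 19), (30, 18), (27, 14), (22, 10)]

def pass_rate(scores, cut):
    """Fraction of students scoring at or above the cut."""
    return sum(1 for s in scores if s >= cut) / len(scores)

def equivalent_cut(paired, cut_a):
    """Find the test B cut whose pass rate best matches test A's at cut_a."""
    a_scores = [a for a, _ in paired]
    b_scores = [b for _, b in paired]
    target = pass_rate(a_scores, cut_a)
    candidates = sorted(set(b_scores))
    return min(candidates, key=lambda c: abs(pass_rate(b_scores, c) - target))

cut_b = equivalent_cut(paired, cut_a=35)
print(f"A passing score of 35 on test A roughly matches {cut_b} on test B")
```

Here the easier test needs the larger number of correct answers (35) and the harder test the smaller number (22) to pass the same share of this sample. Whenever either state changed its test, the paired sample and the calculation would have to be redone.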

To the extent that tests in different states included questions that represented different aspects of a common curriculum, equating such tests would be impossible.

The Northwest Evaluation Association (NWEA) has a common test administered in some (but not most) states, and it has used that test to compare the passing rates on states' own tests. But because states change their tests, and passing points, frequently, an NWEA report can have only a very short shelf life.

The bubble
States' educational performances can differ, even if similar percentages of students were to pass identical tests. When NCLB holds schools accountable for getting students past an arbitrary proficiency point, some states and school districts can (and do) tell teachers to focus inordinate attention on students who perform at a level just below the cut point, to push those students, typically referred to as "on the bubble," over the passing line. Teachers who pay extra attention to bubble students necessarily spend less time instructing children who are far below or already above the passing level. States where this takes place can have higher passing rates with lower overall performance.
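The bubble effect can be shown with a small Python sketch. The two score lists below are invented, as is the passing score of 30; they simply show how a state that concentrates on students just below the line can post a higher passing rate while its average performance is lower.

```python
# Hypothetical scores for two states on an identical test.
# All numbers are invented for illustration.
CUT_POINT = 30

# State A: instruction spread across all students.
state_a = [10, 20, 28, 29, 35, 45, 50, 55]
# State B: extra attention to "bubble" students just below the cut,
# with less progress among the lowest and highest performers.
state_b = [8, 16, 30, 31, 33, 40, 44, 46]

def pass_rate(scores, cut):
    """Percentage of students scoring at or above the cut."""
    return 100 * sum(1 for s in scores if s >= cut) / len(scores)

def average(scores):
    """Mean score across all students."""
    return sum(scores) / len(scores)

for name, scores in (("State A", state_a), ("State B", state_b)):
    print(f"{name}: {pass_rate(scores, CUT_POINT):.0f}% passing, "
          f"average score {average(scores):.1f}")
```

In this made-up case State B passes 75% of its students against State A's 50%, yet its average score is 31.0 against State A's 34.0.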

We already know how students compare across the nation.

We already have almost all the information we need to determine how student performance in math and reading in one state compares to that in another. The National Assessment of Educational Progress (NAEP) gives a common test to a sample of students in every state, in 4th and 8th grade, every other year. Several different test booklets are used; this makes it possible to sample a broader swath of the curriculum than would be possible if all students were given the same test. Because teachers do not know far in advance whether their students will be among those sampled, “teaching to the test” is less present for NAEP than for state tests. The underlying framework of NAEP (i.e., the implicit curriculum that NAEP assesses) is, in effect, the common standards that many people say we now need.

NAEP reports not only the average scores of students in each state, but also the distributions – for example, how students in the bottom quartile of performers in each state compare. NAEP reports the average scores of race and ethnic groups within each state, the average scores of boys and girls, and the average scores of children from low-income families. With all this information, and without explicit national standards or tests, we can easily compare the performance of students from the various states, and make inferences about the quality of each state's educational and youth development systems.

Thus, from NAEP, we already know that the performance of students in Montana and Wyoming is almost identical. On state tests, 64% of 8th graders in Montana were deemed (in 2003) to be “proficient” in mathematics for NCLB’s accountability purposes, compared to only 11% in Wyoming. But NAEP also established its own common passing score, and reported that 35% of 8th graders in Montana were NAEP-proficient in math in 2003, compared to 32% in Wyoming.

Decisions about how many NAEP questions a proficient 8th grader should answer correctly are just as arbitrary as decisions about how many must be answered correctly on the Wyoming or Montana tests. There is no basis for saying that the NAEP proficiency definition is better or worse than the Montana or Wyoming definitions. But it doesn't matter. NAEP's arbitrary definition (and actual scale scores) gives us all the information we need to determine how student achievement in Montana compares to student achievement in Wyoming; national standards can add nothing to what we already know in this respect.

Why can't NAEP be the national test?
As I mentioned earlier, NAEP is now given only to a small sample of students, but one large enough to reveal statistically reliable generalizations about the various states. Teachers do not know far in advance that their schools will be selected for NAEP, and so have no incentive to corrupt the test by preparing students for test questions rather than teaching the underlying curriculum. And because each test-taker answers only some questions in the overall assessment, NAEP can cover a fuller sample of the curriculum than if all questions were crammed into each test-taker's allotted time.
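The idea of splitting an assessment across several booklets can be sketched in a few lines of Python. The question pool, the number of booklets, and the student names are all hypothetical, and NAEP's actual booklet design is more elaborate than this; the sketch only shows how each student can answer a few questions while the sample as a whole covers the full pool.

```python
# Hypothetical pool of 12 questions spanning the curriculum.
item_pool = [f"Q{i}" for i in range(1, 13)]

def build_booklets(items, n_booklets):
    """Deal the questions into n_booklets non-overlapping booklets."""
    return [items[i::n_booklets] for i in range(n_booklets)]

booklets = build_booklets(item_pool, 3)

# Hypothetical sample of students, each handed one booklet in rotation.
students = [f"student_{i}" for i in range(6)]
assignment = {s: booklets[i % len(booklets)] for i, s in enumerate(students)}

items_per_student = {len(b) for b in assignment.values()}
covered = set().union(*assignment.values())

print(f"Questions answered by each student: {sorted(items_per_student)}")
print(f"Questions covered across the sample: {len(covered)} of {len(item_pool)}")
```

Each student sees only 4 of the 12 questions, so no individual score on the whole assessment exists, which is the trade-off described in the next paragraph.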

Sampling students and the curriculum means that NAEP can report no individual student scores. It is not a national test.

A very dangerous proposal is to make NAEP a national test by giving it to every student nationwide. This would corrupt NAEP in the same way that state tests have been corrupted under NCLB. Knowing in advance that their students would have to take the test, teachers could prepare students for it, independent of teaching the underlying curriculum. Giving all NAEP test takers identical questions would permit educators to predict which aspects of the broad curriculum would more likely be tested, creating incentives to stress these aspects and overlook others.

Most states have reported dramatic gains in state test scores under NCLB. But these gains have not been duplicated in state NAEP results. Partly this is because, unlike state tests, NAEP's framework (the implicit curriculum implied by NAEP questions) is not so disproportionately skewed toward the easiest skills. Also, because teachers are not so familiar with NAEP that they can predict particular types of questions, answers are a more accurate reflection of what students truly know and can do. These characteristics will be lost if NAEP becomes an individual student-level national test. We would then no longer have an independent monitor of the performance of American students, or an accurate way to compare students in the various states.

National standards are a quagmire

Establishing unnecessary common standards leads to a quagmire we will soon regret. The late 1980s and early 1990s saw a similar belief that national standards would improve American education. In math, the National Council of Teachers of Mathematics (NCTM) promulgated standards that were fiercely defended by some and attacked by others. Some states adopted them while others did not. Since then, math performance of elementary school students has climbed substantially. Indeed, math scores of black elementary school students on the NAEP have increased so much that they are now as high as whites' in 1982. In other words, if white students’ scores had remained stagnant, the black-white gap would have been eliminated. Because NAEP has only recently been given to large enough samples to generate accurate state-level (as opposed to national) results, we can't say whether the improvement was greater in states that adhered to the NCTM standards. Although there may now be more agreement about math instruction than 20 years ago, any attempt to re-introduce national mathematics standards could set off another round of “math wars.”

A fierce fight also developed over proposed national American history standards. Disputes between those stressing facts about political and economic leaders, those stressing the experiences of workers, women, and minorities, or those wanting students to interpret original source documents, persist today. A new attempt to establish national history standards will set off a similar war.

We have a recent example of how national standards can be politicized. Under No Child Left Behind, “Reading First” funds were used to establish implicit national reading standards requiring an excessively mechanistic curriculum. Corrupt administration of these funds by the Bush administration may have helped to discredit this approach, but an effort now to make this national curriculum explicit will set off unproductive battles between its advocates and those favoring “whole language” or “balanced” teaching.

And do we really want Congress debating whether evolution is only a theory? Proponents of national standards warn that without them, some states will adopt such an approach. Skeptics about national standards (like me) worry that with them, all states may be required to do so.

This ends part one of a four-part interview. Part Two will address the idea that the US is falling behind other nations in the race to succeed, and will be posted on Thursday, May 14th.

Richard Rothstein is part of the new project A Broader, Bolder Approach to Education.

What do you think of these ideas? What is your opinion of the push for "tougher" national standards?


May 04, 2009

NCLB Fails its Big Test: The Achievement Gap Unchanged

This week some educational bombshells have exploded, and we need to take some time to examine their implications.

But first, a bit of my own history, to provide some context for my perspective. I chose to teach in Oakland because I had experienced the civil rights movement as a child. In 1968, as a fifth grader in Berkeley, I was reassigned under the city’s voluntary desegregation program to a South Berkeley school that had been predominantly African American. My parents were deeply committed to social justice, and I emerged from high school active in the civil rights struggles of that era – fighting the Bakke decision that undermined affirmative action.

When I got my teaching credential in 1987, I knew I wanted to teach science where I was needed, and went to Oakland. The concept that education is a civil right is not new to me, or to many teachers of my generation, who entered the profession for that very reason.

When I first encountered No Child Left Behind, I was worried. I did not believe annual standardized tests were the best way to measure student learning, and I feared the law would actually deprive those who performed poorly of the best teaching possible – that which is creative and responsive to their interests and aspirations. My fears were borne out. I saw the school where I worked, which had a wonderful staff and diverse student population, hammered year after year because we could not manage to get all six subgroups to rise simultaneously.

But the law gathered some important defenders – especially in the civil rights community. The Mexican American Legal Defense and Education Fund (MALDEF), and the National Association for the Advancement of Colored People (NAACP) both joined with Senator Ted Kennedy, Congressman George Miller and President George W. Bush to promote the law. They believed that the law would force schools to improve, and that provisions that highlighted the performance of racial and economic subgroups would finally close the achievement gap between whites and disadvantaged minorities.

This week we got some shattering news.
The National Assessment of Educational Progress (NAEP), a series of tests that provides us with our most accurate barometer of student performance, revealed that the achievement gap has not, in fact, been narrowed under the last eight years of NCLB. The New York Times article noted that:

Although Black and Hispanic elementary, middle and high school students all scored much higher on the federal test than they did three decades ago, most of those gains were not made in recent years, but during the desegregation efforts of the 1970s and 1980s.

This week from the Civil Rights Project at UCLA came another blow. A report was released which concludes that NCLB has done more harm than good. Their description states:


The report finds that NCLB is failing on three fronts. First, there is little evidence that high stakes accountability under NCLB works. It has not improved student achievement and the sanctions have had limited effects in producing real improvement. The law also is not very good at accurately identifying schools needing improvement and far outstrips the ability of states to intervene effectively in the schools it sanctions. Third, the law has failed to connect in a meaningful way to the educators who must implement it -- they do not see the accountability goals as realistic and consider the sanctions to be misguided and counterproductive for improving schools.

The most important finding is the damage the NCLB is doing to our educational system. Under NCLB, the system "works" when education systems operate within only a basic skills framework and with low test rigor. The cost to our nation is revealed in an educational system stuck in low-level intellectual work.

While President Obama has pledged that reform of NCLB will move us away from an emphasis on standardized tests, Secretary Duncan has joined the chorus calling for national standards and investment in vast data systems. This makes me think national tests and even a national curriculum might not be too far behind. I fear that these reforms do not move us away from standardized tests. They may make those tests more efficient and pervasive.

It is instructive that it was the real structural changes of the 1970s and 80s – the school desegregation and affirmative action programs I participated in as a fifth grader, and defended as an activist in my teens and twenties – that had significant effects on the achievement gap. There was a democratic ideal, the concept that when children learned together we would learn to work together, and that this would help build a stronger society. Over the past decade, however, even as we have implemented NCLB, our schools have become MORE segregated.

The authors of the Civil Rights Project report provide a powerful prescription for change in their conclusion.

Schools cannot be improved against the better judgment, and without the enthusiastic participation, of those charged with making the improvements. While this commitment cannot be coerced through sanctions, it can be motivated through guidance and mild and positive pressure that mobilize internal ideals and standards of competence and care. For educators, such standards need to be developed through professional socialization in teacher-preparation programs and sustained by way of good instructional supervision, learning communities at school sites, professional networks – and the soft power of accountability systems that are redesigned to inspire educators. Accountability systems inspire educators when they connect to broader educational values and give the stronger teachers enough flexibility to model best practices. Soft accountability is powerfully augmented when parents are mobilized to support their children’s achievement and press for high-quality schools. We submit that after about fifteen years of state and federal sanctions-driven accountability that has yielded relatively little, it is time to try a new approach. The hard work of broader-based movements, nourished by government and civic action, will have to replace legal-administrative enforcement and mandates as the centerpiece of such an equity agenda.

So what do you think? Has NCLB done more harm than good? What approach should we take to tackle the unmet challenge of closing the achievement gap?


Views expressed in this blog are strictly those of the author and do not reflect the endorsement of Education Week or Editorial Projects in Education, which take no editorial positions.
