Why We Need National Testing


Dear Deb,

You make some good points about the distinction between norm-referenced tests and criterion-referenced tests, but I disagree with your characterization of the latter.

The problem with norm-referenced tests, I think, is that you really never show much progress. If it is a test of fourth grade, half the children will be above the norm, and half are supposed to be below. It may be useful to know what the norm is, but it is misleading. I recall that for many years, the New York City Board of Education reported norm-referenced scores, and the newspaper headlines would scream that half the students in a given grade were “below grade level.” Since the norm was established to be sure that half were “below grade level,” such a result was predictable. And the public and news media never understood that the test was designed to get that result.

The promise of criterion-referenced tests is that the test-makers presumably determine in advance that students “ought” to know certain things and be able to do certain operations in a given grade. I hope I am not doing a terrible injustice to the field of psychometrics by my explanation, but I suggest that a good criterion-referenced test would be akin to a test to get a driver’s license, whether it is a written test or a performance test. The applicant must get a certain score on the test or they don’t get a driver’s license; the scores are criterion-referenced, not norm-referenced. It is possible that everyone might get a driver’s license, if all the applicants know and can do whatever is expected by the people who made up the test of state laws and driving operations. And it is equally possible that everyone might fail the test. If we want safe roads and qualified drivers getting licenses, then we should want a criterion-referenced test, not one that is norm-referenced.

Think of the same question in terms of a test of what people weigh. If everyone is grossly overweight, then the “norm” is to be overweight. But if health experts set a certain range of desirable weight for, say, a woman of 40 who is 5’6”, then that is the optimum weight, regardless of what the norm is.

You describe these determinations about what students should know and be able to do as “politically” determined, because they are based on expert judgment, including the judgment of teachers of students in a particular grade. The NAEP standards are based on expert judgments, and when last I participated as a member of the National Assessment Governing Board, the process of setting standards was managed by the American Institutes for Research in a very professional manner.

Knowing that the standard-setting was done by professionals and involved the judgment of nonpartisan people, I get uncomfortable to see this process described as “political.” Calling it “political” suggests that some politicians rigged it to make it too hard or too easy. I don’t accept that because I have seen the process and seen that it is insulated from political influence.

Now, having defended the process, I’ll pass along a bit of hearsay that will stoke your fire as a critic of standardized testing. I recently attended a social event at which I met a long-time employee of the New York State Education Department. I had known this person off and on over the years but not awfully well. When we got into a discussion of the state test scores, she lowered her voice and said, in words to this effect, “When the scores come in, they are ‘adjusted.’ If they are too low, they are raised. They are anything that state officials want them to be. Then they are released. It’s a no-brainer to get high scores.” When we turned to the subject of graduation rates, she confided that it was easy to make them high for small schools: “Just make sure that the high-scoring kids, even in the poorest neighborhoods, go to the small schools, and the remainder are assigned to the large schools. Another no-brainer.”

This conversation reinforced my view that we need national standards and national testing, and that the tests should be conducted by officials with no reputational stake in the results. Until we have national testing, we will continue to have this bizarre situation where the states are reporting remarkable progress while NAEP scores remain flat.

I don’t think that scores on a national test should be a single measure of student progress. I think such scores are important as indicators, but should be used in combination with (as you suggest) grades on written work and examinations conducted by teachers.

As long as we continue to depend on state and local officials to grade themselves, we will live in a constant condition of grade inflation.

As to curriculum, I don’t think we can have a common civic culture, a common democratic culture without some shared knowledge, shared discussions, shared poems, and shared history. I know it is hard, but it is not impossible to agree on what should be shared and to recognize that the shared part of the curriculum need not consume more than about 40 percent of each subject, in history, literature, math, the arts, and science. Certainly in math and science, we do not expect every teacher to make it up as he or she goes along. Every discipline has a recognized body of knowledge (I know that term makes some people cringe, but not me!), and that body of knowledge changes over time, sometimes slowly, sometimes rapidly. It would be a shame if a student were to spend a year in a science class with a teacher who made it up as he went along, with no reference to what anyone else in the field has learned.

There is such a thing as “standing on the shoulders of giants,” and this is what a good education enables one to do, or so it seems to me. Because if you stand on the shoulders of giants, as the saying goes, you can see a lot further.



So with national testing (I'm assuming criterion based), wouldn't there be a need for a national curriculum? If so, sign me up. I'll volunteer!

Dear Diane,

In the "Brookings Papers on Education Policy 2001" that you edited is a chapter by Mark Reckase. Reckase offered an eloquent, convincing defense of the soundness of the revised standard-setting procedures for NAEP ("The Controversy over the National Assessment Governing Board Standards", pp. 231-265). But in his conclusion, he warned:

"Because of the nature of NAEP, it does not support the kinds of inferences that some seem to want to make. NAEP is not a high-stakes testing program. It cannot tell how students will perform when they are highly motivated and well prepared for the task at hand. At best, it tells how students perform when they are not trying hard on material that may not match the course work they are taking. It is a general indicator of performance, not a focused indicator of the highest possible level of performance.... The challenge is for users to understand the context and not make more of the information than can be supported by the structure of the testing program." (p. 253).

Given that you were the editor of this volume and seem to agree with Reckase in regard to the soundness of the standard setting, do you agree with his concerns about interpretation of NAEP?

If you do agree, what does this say about the conversations that began today with the release of NAEP for 2007?

If you do agree, how would the national standards and testing you are suggesting overcome the weaknesses he cites? Even if we had national standards and national testing, the tests would not reflect the course work students are taking unless we mandate a sequenced national curriculum.

Even if we had a national mandated curriculum, how would we overcome the weaknesses of a NAEP-like test as a general, not a focused indicator of performance? Or the motivation issue? If we augmented the national test with grades and other measures as you suggest, how will we know which is more valid if the two (or more) measures disagree?

And there we are tomorrow: Back again where we are today and none the wiser.

I wasn't asked the question but I would imagine that you would take the NAEP at face value, maybe make it part of the grade (some schools already do that with our state testing results) and use other materials (essays, papers, homework, quizzes) as part of the final grade also. Never will you get a perfect representation of a student from one test (say, the NAEP test), but at least you get an idea of sorts. Going in the dark is horrible, just horrible. Why any teacher, parent, administrator, or ed college mucky muck would not want something to help make comparisons is beyond me.


As you know, from an academic point of view, those educational systems that use external exams have, on average, a one-year advantage at 8th grade over systems that do not.

But in addition to the substantial increase in student performance, one of the most overlooked benefits of having national standards and external exams is the change in relationship between teacher and students.

With an explicitly delineated external goal, both the students and the teachers have clarity about what success is. Most importantly, in external exam systems the teacher is seen by the student as an advocate, not as a judge. In 1990 the Association of Secondary Teachers of Ireland cited this reason for staunchly rejecting the call to replace exit exams with teacher-awarded grades.

Great educational experiences are critically dependent upon the relationship between teacher and student. Teacher-awarded evaluations erode the foundational trust and respect needed for learning. With external evaluation, the teacher-student relationship can thrive with a common shared goal: learning enough to perform well on an external exam.

It is the shared experience between teacher and student that Deborah so rightly cites as the purpose of schooling. And yet in our system of teacher-awarded grades, the student-teacher relationship always contains an adversarial element. Is an adversarial relationship between teacher and students necessary for a quality education?

With a quality system of external evaluations, what are grades good for?

Erin Johnson


Agree completely with you on this issue of national standards and national assessments.

Too many states are currently administering "feel good" tests, relative to the federal tests administered by the National Assessment of Educational Progress, commonly referred to as the "nation's report card."

In 2005 Tennessee tested its eighth-grade students in math and found 87 percent of students performed at or above proficient, while the NAEP test indicated only 21 percent of Tennessee's eighth graders were proficient in math. In Mississippi, 89 percent of fourth graders performed at or above proficient on the state reading test, while only 18 percent demonstrated proficiency on the federal test. In Alabama, 83 percent of fourth-grade students scored at or above proficient on the state's reading test, while only 22 percent were proficient on the NAEP test. In Georgia, 83 percent of eighth graders scored at or above proficient on the state reading test, compared with just 24 percent on the federal test.

Oklahoma, North Carolina, West Virginia, Nebraska, Colorado, Idaho, Virginia, and Texas were also found guilty as charged in the area of "truth in advertising" where their determinations of proficient didn't seem to match what the NAEP test indicated.

This is a crock!

In 2002 it may have been acceptable to allow each state to develop and score its own tests and determine its own level for proficient, but it's quite clear that experiment failed miserably.

National standards and corresponding assessments should not be developed by the DOE in Washington but could instead be produced by a committee of the fifty state departments of education. Great expectations and the firm belief that all children can learn would be key to the success of this process.


I agree with many of your ideas including your viewpoint regarding national standardized tests.

Why can’t we create national standardized tests for students at each grade level? We have tests for almost everything else. For example, we have tests to become a doctor, lawyer, CPA, contractor, bus driver, police officer, firefighter, pilot, realtor, trash collector, as well as tests to get into the military, to get into college, to get a driver’s license, etc. Why is it so hard to create standardized tests for each grade level?

Would you want a doctor to treat you who couldn't pass his/her medical boards? Would you hire a lawyer who couldn't pass the bar exam? Would you ride on a bus with a bus driver who couldn't pass the driver's test? Would you let your children ride on that bus? So why do we make so many excuses for students who cannot pass grade-level standardized tests?

Do professional tests/exams measure all aspects and/or the entire worth of a person? Of course not, but we still require someone to pass an exam in order to proceed to the next level. Should we get rid of all professional tests, since they don’t measure the entire knowledge base and/or potential success of a person? If not, then why do so many people argue against standardized tests in the schools?

If someone doesn't pass a test what should he/she do? Work harder? Try something else? Get a tutor? Study more? Use a different measuring stick? Educators tend to use a different measuring stick if they don’t receive the results that they want/expect. I don’t trust teachers to decide who should be promoted to the next grade without national standardized tests. Why? Because of the results. In the past as well as currently, teachers promote too many unqualified students and that is why the high school exit exam was created. It is also why we are having problems in so many classrooms and why the dropout rate is so high.

If an eighth grader cannot read, write or do basic math, then should that student be in the eighth grade? Should the student be promoted to the ninth grade if he/she still can’t read or write at the end of the school year? If so, based on what?

What lesson are we teaching students when we allow them to be promoted to the next grade level when they have flunked the work at the current grade level? Will students assume this is how the world works? When that student becomes an adult and is hired to do a job that he/she is incapable of doing, will the adult assume that he/she should keep his/her job/paycheck even though he/she can’t do the work?

Why is education the antithesis of everything we want the students to become and/or be? Why shouldn’t schools act more like sports, businesses, and the real world? Isn’t that what we are preparing the students to become a part of?

Why is it so hard to create national standardized tests for each grade level? E.D. Hirsch created a series of books regarding what a child needs to know at each grade level. Why can’t educators use Hirsch's books as a starting point to create tests at each grade level? Accelerated Reader created reading levels for readers, so why can’t we create national standardized tests for each grade level?

When we don’t hold students accountable and when there are no consequences, then the learning process is harmed and society is harmed as well. It is hard to have and hold onto a democracy when people can’t spell democracy and don’t even know the meaning of the word democracy.

Dear Diane,

I'm chiming in to say that I really appreciate your last two paragraphs. As long as that curriculum is put together in a moderate/minimalist fashion, it can be done. Will people still disagree? Yes. Few things we touch approach perfection. But the social and political need for some commonality in history, philosophy, literature, arts, etc. demands it.

- TL

I disagree, Diane.

The purpose of schooling should not be to make a government engineered "super-citizen." The purpose of schooling should be to nurture each individual to be his/her personal best.

This type of outcome requires freedom. Give kids the freedom to study topics of their own choosing, place them in internships/volunteer projects with adult responsibilities at an early age, and allow each child to develop at his/her own pace.

We went to war in Iraq because a school-educated citizenry was used to following orders and followed its government blindly. This is what national standards will produce more of.

What we need is freedom. In order to be self-sufficient, independent adults, young people need experience making their own decisions, experiencing failure, and learning from their mistakes. This is a REAL education.

It would be nice if the world worked as you describe, with experts setting the standards. But in the world of schools, where I taught K-12 for 30 years, the experts may have a good grasp of the content but not of learning. In addition, my background is mathematics, and we have the math wars going on, so what you get on criterion-referenced tests depends on which set of experts you choose. In language, it's the whole language vs. phonics wars. When all a test tells you is whether students got the correct answer or not, and never tries to find out why they answered as they did, it serves no useful purpose except to punish and reward.

