
What kind of testing is best?


Deb,

You will not be surprised to learn that I agree with you about the value of a road test for licensing future drivers. If you can't actually operate a car with safety and confidence, then you should not be licensed to drive, no matter how well you score on the written exam.

As it happens, many of the studies that are taught in school are not comparable to driving a car. Many of them involve not only "habits of mind" but the acquisition of skills and knowledge that cannot be evaluated in any way that is akin to a road test. How should we test a student's ability to read? We might have them read out loud. That is a good thing for a teacher to do with regularity. For testing purposes, though, it would be very time-consuming, and for an entire class it might take days to listen to each student read selections of varying levels of difficulty. Or we might give the students a test of reading comprehension in which they read an essay or poem or story and then answer questions to demonstrate that they understand what they have read. The latter method, in the eyes of most school officials, is preferable because it takes less time and less money to administer and turns out to be a reliable indicator of student reading ability. It also makes it possible to compare student performance and to gauge whether they are making progress in relation to what students of their age typically know.

I think the same argument could be made for assessing students' knowledge of mathematics, science, and other subjects. I prefer to see students writing research papers in history, to be sure. Most fill-in-the-bubble style history questions are extremely superficial. And yet, superficial as they are, such questions too (if they aren't too stupid, too superficial, too vapid) can quickly identify students who really don't have a clue about whatever history they studied and they can be designed to show different levels of difficulty and knowledge. For example, the latest NAEP test of U.S. history has a 12th grade question that shows a map of the continental U.S. around 1800, on which a dotted line traces a route. Students are asked to identify whether "the expedition whose route is shown was undertaken to explore the: a) lands taken in the Mexican War; b) lands taken from England in the War of 1812; c) Louisiana Purchase; or d) Gadsden Purchase." Students need to be able to look at that map and know that they are looking at the expedition exploring the Louisiana Purchase. There is no way to fake it, other than a lucky guess.

I am all in favor of exhibitions, research papers, and other means of demonstrating what students know and can do. These ways of assessing give teachers an in-depth look at what students have learned. These are the right tools for the individual teacher and for Sizer-style schools. More power to them and to you.

But you acknowledge that these are not the right tools for a district or a state or a nation that is trying to see how well students in fourth grade or eighth grade or twelfth grade are doing. You suggest that large-scale standardized testing should be sample-based, like NAEP. This would assure that there are no "stakes" for any individual student. Quite honestly, I don't know what the right answer is. I know that there are testing experts and education economists who argue that having stakes is very important, that they create incentives for higher performance, that one reason NAEP 12th grade results are so poor is that students know the test has no stakes. Al Shanker often told audiences that his students would ask him, "Will it be on the test?" If he said no, they didn't bother to learn what he was teaching; if he said yes, they were very attentive.

Maybe our readers will weigh in and help me on this one. Or we should call in some psychometricians.

Diane


16 Comments

For me, the problem with standardized tests is not that they don't provide teachers and other educators with useful information about what students have learned -- they do. But they're not a complete picture -- most educators want students taking a history course not only to acquire enough factual knowledge to pass a multiple choice test, but to be able to write and speak insightfully about what they have learned.

But in a resources-pressed world, what is tested becomes what is taught, particularly if the test results have large consequences in accountability frameworks.

If states decided that giving road tests for driver's licenses was too expensive and required only a written test, students would spend lots of time studying the rule book and very little time on the road. I don't think the overall effect on driver education would be good.

If all that really "counts" for schools is how well their students do on multiple choice tests, that's what schools will focus on. And I don't think the overall effect on the teaching of history will be good.


Diane,

The tests that we currently administer to our children have caused a narrowing of the curriculum and the extreme cases of “test prep” driven classrooms. By limiting our assessments to “reading” and “math”, time within the classroom necessarily squeezes out everything else (science, history, art, music, etc…).

While this narrowing of the curriculum is not beneficial for our children, it does strongly suggest that external measures can dramatically affect what is happening in our classrooms.

For a quality education, it is essential that those external measures reflect the full breadth and depth of our values, in that reading and math are most useful in understanding history, science and literature, and that art, music and athletics enhance our children’s lives and education.

Now an argument can be made that all tests are inadequate and do not capture “true” learning. This is in fact true. There is no single test that will capture the abilities, strengths and virtues of an individual. I would suggest a less grandiose purpose for a test: measuring how well that individual learned very specific, well-defined skills, knowledge and abilities.

We do have a type of testing that captures this highly articulated mode of learning well. The AP exams measure, and their outcomes correlate well with, the integrated skills, knowledge and abilities needed for college. This approach is successful at predicting college success because each course syllabus is well defined, reflects specific, articulated and clear goals and, most importantly, correlates well with what the student will see on the final test. This approach could and should be readily adopted for all core high school courses and perhaps middle and elementary school as well.

By articulating specific course requirements and testing externally, clarity is provided both to the student and the teacher about what learning is important and how to succeed.

Now specific course requirements are not a proxy for life and will never capture what an individual will do over the course of their lifetime. But we are obligated to state to our children what we mean by an “educated citizen.” I do not think that in our current world a little reading and a little math is enough; our children need more. That responsibility should not be placed on the shoulders of our teachers alone, but on us all.

As much as we like to blame testing for all the ills of education, truly the blame is on ourselves for not clearly articulating what children need to know to be successful in life.

Rachel and Erin,
I agree about the limitations and defects of multiple-choice tests. They should not be the only measure used in any subject. History teachers should assign research papers; students should be expected to read books--real books--and write about what they have learned. English teachers should expect students to read outstanding literature and write about the ideas and characters they meet. Writing is a form of thinking, and it requires enormous effort to assemble one's thoughts about the ideas and events of history. This sort of assessment is very valuable for teachers and should be part of students' grades. But this sort of assessment seldom has any role in large accountability programs used by districts, states, and the nation.
I agree with Erin that end-of-course exams are valuable. To rail against all testing doesn't get one very far. For better or worse, testing is part of life. Our job is to figure out how to make it work with our aims in education, not against them.
Diane Ravitch

Diane,

I concur that writing is enormously beneficial in developing clarity of thought and understanding about a particular subject. Assessing the ability to synthesize content knowledge with writing skills is an area in which multiple choice tests are largely inadequate.

But if I were to take my own children’s experience with writing instruction, I would say that your vision of a teacher using writing to stretch a student’s understanding of the material is largely lacking in our schools. Too often writing is treated as a process skill devoid of content. (When my very creative second daughter was in 2nd grade, her end of the year paragraph was penalized because she had written more than the required 8 sentences that were in the rubric for a standard paragraph.)

Additionally, because teachers too often focus on evaluation (by assigning a grade), there is little to no input about how to improve the writing and the thought processes involved. Giving poor grades does not improve writing; it only cements the idea in the student’s head that he/she can’t write.

I completely agree about reading real books, and by real books I mean the ones that discuss difficult, complex and interesting ideas. It is fairly easy to assess using multiple choice tests the key ideas contained within a particular book. As long as the student reads the book, discusses and writes about the ideas, an end of course assessment about the details of the particular book would be simple. What is difficult is getting agreement about what books/ideas our children should be discussing.

What multiple choice exams can do well is test the thoughts and ideas that have already been formulated by the student. A test is not the time to stretch and develop new ideas but to see what the student already thinks about and knows.

Erin

Diane,

As long as we understand the limits of multiple-choice exams and don't use them as the sole source of final judgments, I don't have much concern about them. But when test scores are used as the be-all and end-all, we get into all sorts of problems. I think that as a society, we've confused quantitative judgments of education with independent judgments of education. Independence is the critical ingredient in accountability--having someone with an informed view who looks at evidence of what children have achieved but who is outside the local environment (however defined). Or several somebodies. Or all of us. In Accountability Frankenstein, I've suggested we could use grand juries to perform much of that job when it comes to looking at inequalities. They can look at test scores, but they'd have the power to subpoena school officials and ask them questions under oath about expectations and curriculum. That's a plausible local solution, I think.

At the state or national level? I've occasionally proposed a thought experiment for Florida: give every school a scanner and a computer with a special program that will select five children at random every day. The work of those five children that day gets scanned (after covering the names, of course), and it all gets uploaded to a state server that accumulates a random sample of student work, work that shows in a detailed way what students are doing in a class and what the taught curriculum is, as well as a snapshot of student performance. For a variety of reasons, it's unworkable (could you imagine the hundreds of computers all working smoothly without millions of dollars in maintenance every year?), but most people I tell this to have a glazed look about them. "How would you summarize it?" they say. "But you can look at the work!" I point out. "But you can't quantify it." Ah... the confusion between quantification and independence. So what do you think?

I suspect Deborah would disagree somewhat with my argument in favor of independent judgments. When students didn't complain (yet) that The Power of Their Ideas was too old, I had to point them to her description of accountability at CPESS: a group of teachers and community members who looked at the portfolios and asked the hard questions about whether the school's standards were appropriate and whether the students were meeting them. I've told a few hundred students my guess that CPESS probably selected some community members who were well-known and politically connected for this rubber-meets-the-road accountability, just to protect the school if it were ever threatened. (Deborah, was I right?) That's a local and a politically conscious form of accountability. Legitimate, I think, but I suspect it doesn't meet your concerns about broader scales.

Sherman,

Right and wrong--mostly both. We assessed kids with reviewers who were insiders and outsiders whose judgment seemed worthy of respect. We assessed the school with a panel that included some politically well-connected people whose word would carry weight, but we also included high school and college "peers" who could make comparative judgments that we would respect. Their judgment was of the school's standards, not an individual child's. I think both forms of judgment are critical.

Nothing I have said about paper-and-pencil short answer testing is meant to prevent teachers or schools from using such tools when in their judgment it will provide information useful to them in teaching their kids. Note: their judgment and their kids. But if we want it for State purposes--legitimate enough--we do not have to test them all, and as long as the testing rules remain the same for all we will solve Shanker's problem (the student's lack of motivation to do well)--and we'll be able to do it cheaper and in greater depth if we want to pay for that as well.

If what we want are grown-ups capable of exercising judgment on matters of life-and-death importance, let's start by allowing teachers, students, schools and parents to make a few important decisions about schooling--including how they choose to utilize psychometrics (the art of testing).

"The tests that we currently administer to our children have caused a narrowing of the curriculum and the extreme cases of “test prep” driven classrooms. By limiting our assessments to “reading” and “math”, time within the classroom necessarily squeezes out everything else (science, history, art, music, etc…)."

One does not necessarily preclude the other.

Reading can be done across the curriculum. For example, during reading, I have my students read Prentice Hall's wonderful Explorer series in science and history.

Here's a suggestion from former US Secretary of Education Rod Paige for instantly eliminating the achievement gap. Stop the NCLB testing in all US public schools. That's right. If we stop the testing, the achievement gap will magically disappear. No, it won’t! If we pretend for a long enough period of time there's no problem or we ignore a known problem, then maybe, just maybe, it will go away. No, it won't! The folks from Fair Test in Cambridge refuse to accept the idea that testing has identified significant differences in achievement of cohorts of students. The achievement gap has turned out to be the civil rights issue of the 21st century...because it's been documented. The first step in solving a problem is acknowledging a problem exists - and it does. Our poor/minority students are performing below their White and Asian peers in schools from coast to coast. We now know this because we’ve tested for it. Because we now have this information we can finally start addressing it.

Deb, I really don't consider myself a "testing maniac," but I believe testing is the most expedient, objective and quantitative method to determine whether learning has occurred. From tests I can apply the appropriate remediation where needed. So-called authentic assessments like portfolios are too easily compromised.

The interesting point that Diane elides is the premise that testing is necessary because we cannot trust the judgments of teachers.

"I am all in favor of exhibitions, research papers, and other means of demonstrating what students know and can do. These ways of assessing give teachers an in-depth look at what students have learned. These are the right tools for the individual teacher and for Sizer-style schools. But you acknowledge that these are not the right tools for a district or a state or a nation that is trying to see how well students in fourth grade or eighth grade or twelfth grade are doing."

These are only the wrong tools if we don't believe that we have accurate standards by which teachers can judge and report competence. The rise of the testing industry (and with billions of dollars being spent on test development on the one hand and test preparation--often by the same companies--on the other, it is truly an industry) coincides nicely with the delegitimation of teacher expertise.

"[I]n the eyes of most school officials, [the test] is preferable because it takes less time and less money to administer"...but students crave the chance to demonstrate real learning, and learn from the experience, not be slaves to efficiency. Who are our schools serving in this scenario?


"The interesting point that Diane elides is the premise that testing is necessary because we cannot trust the judgments of teachers {unions?}."

Nick, if the shoe (of the teachers' unions) fits, they're going to be stuck wearing it. They are anti-charters, anti-competition, anti-choice, anti-testing, anti-accountability, anti-merit pay, anti-NCLB, etc., etc.

The questionable judgment of teachers has also been the rationale behind their failure to be included in the education reform dialogue in many states. State legislatures and the business community are all too familiar with the suggested prescriptions for reform of professional educators and have essentially decided to ignore them. Teachers and teacher unions had their way in the operation of our schools for a long time and the results were nothing short of an embarrassment.

I don't intend to belittle your ideas but, "...students crave the chance to demonstrate real learning, and learn from the experience, not be slaves to efficiency." Can you document this statement or is this the opinion of another teacher?

Paul, I'm sorry you seem to have issues with unions (though that wasn't my point at all--thanks for expressing your anger about that tangent). Before I go further, what would "count" as "documentation" of my statement to make it true (since "the opinion of another teacher" implies that it is insufficient)? Do you have "documentation" that children crave to be slaves to efficiency? Would this documentation come from "the business community" (a phrase I've understood as little as "the gay community" or "the Christian community"--as if they're all members of some Borg-like entity)? If teachers aren't worth paying attention to (as implied in your "opinion of another teacher" and your lumping of all teachers together) what would you replace them with? Given that you seem to believe that teachers should be interested in "charters, competition, choice, testing, accountability, merit pay, NCLB", you seem to value competition above all else. Does school exist just as a sorting machine? If so, what does "all children can learn" (from NCLB and other political platitudes) mean? Does it mean that these standards are so low, that any child, no matter what other problems they may have in their lives, is capable of doing this at the same time? Since it seems obvious that to you it doesn't mean that teachers are so talented that they can accomplish this for everyone, no matter the circumstances, without any more resources.

Comments are now closed for this post.