The Role of Testing to Support Deeper Learning
This post is by Linda Darling-Hammond, the Charles E. Ducommun professor of education at the Stanford University graduate school of education and director of the Stanford Center for Opportunity Policy in Education (SCOPE).
For more than a decade, Congress has been unable to reauthorize the increasingly out-of-date No Child Left Behind Act (the version of the historic Elementary and Secondary Education Act (ESEA) passed in 2002). The law sought to guide school reform by testing, with a requirement that targets and sanctions be set so that 100 percent of students would score "proficient" on state tests by 2014. However, 2014 has come and gone, and that unattainable goal is unmet. Meanwhile, nearly every public school in the United States has been deemed "in need of improvement" or failing, and experienced some form of intervention, under the law's test-and-punish enforcement strategy.
Despite the intensive focus on testing tied to consequences, achievement gains have slowed since the 1990s on the National Assessment of Educational Progress, and achievement gaps have remained stubbornly large. Between 2000 and 2012, U.S. scores on the international PISA tests, which measure higher-order thinking skills and applications of knowledge, declined in math, reading, and science. The No Child Left Behind (NCLB) strategy has clearly not worked to prepare our students for the challenging world they are entering.
Remarkably, with new leadership, Congress finally seems primed to update ESEA. In recent weeks, one of the hottest debates has been about the law's testing requirements: in particular, how often federally mandated testing should occur. Some argue that annual testing is critical to ensure that schools know how each child is doing in order to address learning needs, as well as to hold schools and districts accountable for the progress of all students. Others argue that annual state testing, especially when tied to high stakes for students, educators, and schools, has led to test prep, over-testing, and a narrowing of the curriculum to the topics and formats of required tests, with a focus on lower-level skills at the expense of more important forms of learning.
This debate thus far is missing the most important questions, however: What is the quality of the tests and what can they tell us? And what kinds of assessments should be used, when, how, and for what purposes if we want high-quality learning to occur?
As policymakers debate the role of testing in ESEA, it will be vital that these discussions envision the end-game we're aiming for: classrooms that involve all students in meaningful, engaging learning that prepares them for college and careers in our complex modern world. Students need to learn how to be critical thinkers, problem solvers, collaborators, and life-long learners. They also need to be supported by teaching that addresses their specific learning needs.
Of course parents and teachers need information every year about student learning and progress. And states regularly need information about how schools and districts are doing in attaining important curriculum objectives and closing achievement gaps.
Unfortunately, state tests--especially as they have been shaped by federal mandates in recent years--are often poor tools for achieving these goals. For reasons of cost and time, they tend to measure those things that can be most easily assessed with multiple-choice or short-answer questions. By federal law, they are restricted to measuring grade-level standards, so they cannot measure achievement or growth for the large share of students who are above or below grade level. (This also means that teachers often feel compelled or are directed by their districts to teach only the grade-level standards, even when some students could move ahead in their learning and others need instruction that will help them catch up to their peers.)
Most state tests offer only a single numerical score to describe a student's learning in a given area--rather than rich descriptions of what a student knows and can do in various domains and with respect to a range of skills. And because competitive comparisons and sanctions have been emphasized by the law and subsequent waivers issued by the U.S. Department of Education, the tests must be given in the same time window at the end of the year under highly restricted conditions, rather than allowing students to take them throughout the year so that teachers can adapt instruction as they move along.
In short, we are stuck with a set of ideas about the high-stakes functions of state testing that are holding back much more productive approaches. Although states developed innovative assessments of student performance during the 1990s--including computer-adaptive technology-based tests that students could take anytime, portfolios of written work, and performance assessments measuring research, investigation, collaboration, and communication skills--all of these advances were ended by NCLB.
While other countries have been moving ahead with innovative assessments, NCLB pushed U.S. testing back to the 1950s when multiple-choice Scantron tests were considered a modern technology. While our children are bubbling in answers at the end of each year to questions with five pre-determined choices, young people in Singapore are conducting collaborative projects that are part of the examination system; students in Australia are designing and completing science investigations; children in England and New Zealand are evaluated on a set of authentic reading, writing, speaking, and listening tasks that provide extensive information about their developing literacy skills; and those in Hong Kong are demonstrating their understanding of physics problems in hands-on tasks as well as extended essays.
In these other countries, externally-administered tests are less frequent (usually occurring once or twice before high school, plus a set of examinations at the end of high school), but much deeper than in the U.S. Meanwhile, school-based assessment--often guided by common curriculum frameworks, assessments, or tasks from the state or national government--is continuous, and may be used as part of the reporting and accountability system, as well as for teaching decisions and reports to parents.
Rather than treating tests as black boxes to use as hammers for rewards and sanctions, these countries understand that assessments of, as, and for learning should encourage valuable learning and provide rich information for teaching. They also understand that no single test can support the rich learning experiences schools need to create or supply the diagnostic information that teachers need to be effective with students.
These goals require a system of assessments rather than reliance on a single state test. Furthermore, these assessments need to perform different important functions rather than drilling students on the same limited items they will encounter on the most constrained of the tests at the year's end. This is what occurred under NCLB when local assessments were reduced to test-prep clones designed to boost scores on the end-of-year test rather than to offer better information for learning and teaching. As a result, for all the money spent on testing, we failed to move the needle on higher-order learning.
Instead, states and districts should be encouraged to support students through a set of assessments that include richly descriptive diagnostic tools (for example, the individually-administered Developmental Reading Assessment or Qualitative Reading Inventory used by many schools to provide extensive information about student decoding, comprehension, inferential, and other reading skills); English language proficiency assessments as well as assessments of proficiency in other world languages; competency-based assessments measuring student progress throughout the year; and hands-on extended performance assessments that complement higher quality sit-down tests.
Although locally administered, most of these tools have been designed and validated by assessment developers working with teachers at the national level or across jurisdictions and can be reliably scored and comparably interpreted across settings. Because they are often integrated into the classroom, they become part of the teaching and learning experience, rather than adding to the over-testing many students experience today. And in a system of assessments, some tools may use matrix sampling or be offered less often in order to allow more expansive tasks while reducing testing time.
With the right mix of assessments, students can be engaged in exciting learning that will prepare them for their futures without being over-tested. Teachers can have data that informs them about how students are learning as well as what they know. Districts and states can have data about how different groups of students are doing in different areas of the curriculum, so they can invest wisely in curriculum development, professional learning, and instructional supports.
The new assessments from the Smarter Balanced Assessment Consortium and the Partnership for Assessment of Readiness for College and Careers show promise as part of a broader system of assessment and instruction. Although still constrained to end-of-year formats and grade-level standards, the new assessments include some open-ended performance tasks and, in the case of Smarter Balanced, a set of formative instructional supports and interim assessment tools that can be used during the year. States participating in these consortia (about 30) could do even more to modernize these tests and to embed them in productive systems if ESEA would let them.
To achieve this vision, several things must change in ESEA:
- Assessments should be reported and used for information and improvement rather than for sanctions and punishments. Then educators and students can pay attention to the full range of learning they need to accomplish, rather than drilling all year on a single, limited test that necessarily narrows the curriculum.
- Federal law should no longer prescribe technical features of tests in ways that prevent innovation and change.
- States should be encouraged to create integrated systems of state- and locally-administered assessments that provide information for the multiple purposes they need to serve. They should be allowed to distinguish between tests used annually to gauge student progress and those used for large-scale reporting purposes, so that the end result is a high-quality, cost-effective system that provides more useful information for guiding improvement.
- ESEA should encourage accountability systems that rely on data dashboards offering multiple measures of student success as well as of students' opportunities to learn. That will encourage states and districts to close the gaps in students' access to high-quality curriculum offerings as a means to improving outcomes.
- These systems should gauge student learning by measures that capture success in ways that extend beyond tests, such as successful completion of challenging coursework, like Advanced Placement, International Baccalaureate, Early College courses, and Linked Learning courses of study, and rigorous portfolios that assess coursework and performance, like those offered by the National Academy Foundation and the New York Performance Standards Consortium--all of which are more predictive of college and career success than a sit-down test offered on a single day.
Testing, in any form, should never be the be-all and end-all of an accountability system, but it can have an important role. Allowing states to create intelligent systems of assessment will, in the long run, better support student learning than the one-size-fits-all model we've struggled with for the last decade.