Tests now being designed for the common standards are likely to gauge deeper levels of learning and have a major impact on classroom instruction, according to a study of the common assessments released today.
UCLA's National Center for Research on Evaluation, Standards & Student Testing, or CRESST, analyzed the work done so far by the two consortia of states designing the tests. The center concludes that the assessments hold a lot of promise for improving teacher practice and student learning. But its report also cautions that the test-making projects face key financial, technical, and political challenges that could affect their success.
With the "essential relationship between what is assessed and what is taught" in mind, co-authors Joan Herman and Robert Linn sought to explore the extent to which the common assessments will gauge "deeper learning." Their study was funded by the William and Flora Hewlett Foundation (which also provides support for Education Week's coverage of deeper learning).
In examining the potential rigor of the coming tests, Herman and Linn were guided by Norman Webb's "depth of knowledge" classification system, which assigns four levels to learning, from Level 1, which features basic comprehension and recall of facts and terms, to Level 4, which involves extended analysis, investigation, or synthesis. Herman and Linn examined the work so far of the Partnership for Assessment of Readiness for College and Careers, or PARCC, and the Smarter Balanced Assessment Consortium for signs that they would demand the kinds of learning at Levels 3 and 4 of the so-called "DOK" framework.
The researchers found reason for optimism that the assessments will demand those skills. They singled out, in particular, the lengthier, more complex performance tasks being crafted by the two groups, saying they seemed likely to assess skills at DOK Level 4.
"It appears that the consortia are moving testing practice forward substantially in representing deeper learning, but the nature of available data make it difficult to determine precisely the extent of the change," since the tests are still in the design phase, the study says.
Herman and Linn noted a RAND study from last year that examined released items from 17 states reputed to have challenging exams. It found "depth of knowledge" levels overwhelmingly at 1 and 2 in mathematics, with English/language arts items a bit more rigorous. While the unfinished work of the two consortia can't be directly compared with existing state tests, they said, the two groups still appear to be on track to create tests that are more rigorous than what most states currently administer.
Important questions remain, however, about how well the two consortia's plans will be realized, the study says. Among them:
• Maintaining their performance tasks in the face of pressure from states concerned about cost and time. As we reported earlier, Smarter Balanced has already had to grapple with this pressure from some of its members.
• Making automated scoring possible for constructed-response items and performance tasks. Without it, current $20-per-student projected costs for the summative tests could soar.
• Ensuring the comparability of the with-accommodations and without-accommodations versions of the tests.
• Managing the "shock to the public and to teachers' instructional practice" that the tests' increased intellectual rigor is likely to bring.