A technical review panel set up by the U.S. Department of Education is urging both common core assessment consortia to pay better attention to ensuring that their tests are accessible to students with disabilities and those whose native language is not English.
That is one of the sterner findings in the panel's first appraisal of the work PARCC and Smarter Balanced have done so far. The review panel, created in March, issued its reports on July 3. You can read them on a special page of the department's website.
It's important to know that this panel is not charged with reviewing everything the two consortia are doing. Its examination focused on how test development is going and what research is being planned to ensure that the tests are valid for their intended uses. (The department's more wide-ranging oversight includes annual performance reports and ongoing conversations with the consortia. You can review that process on the department's website as well.)
Each consortium was graded in three areas: assessment development, accessibility and accommodations, and research and planning. Using the panelists' recommendations, the department gave each consortium's work in each area one of four ratings: "generally on track"; "some aspects on track, others need additional focus"; "needs attention"; or "needs urgent action." Here's how the two groups came out:
Accessibility and accommodations:
• PARCC: "Needs attention"
• Smarter Balanced: "Needs attention"
Assessment development:
• PARCC: "Generally on track"
• Smarter Balanced: "Some aspects on track"
Research and planning:
• PARCC: "Some aspects on track"
• Smarter Balanced: "Some aspects on track"
No chunk of the consortia's work earned the worst rating, "needs urgent action."
On accommodations and accessibility, the panelists urged PARCC to "carefully research the way items work for the full range of English-learners and students with disabilities," and to "expand its accessibility work into all phases of item development," including expanding its training for item writers and reviewers, according to its report on PARCC.
PARCC responded (highlights of each consortium's responses to reviewers' comments are included in the report) that it will incorporate that training for item writers into the second phase of its item-development work. And it said it will study how items function for students with disabilities and those learning English "based on field test results" (field tests are scheduled for spring 2014) and make adjustments accordingly.
The panel commended PARCC, however, for its draft policies on accessibility and accommodations, for its commitment to universal design in crafting its tests, and for "evidence of attention to accessibility" in the way it wrote its descriptions of what students should be able to demonstrate to show mastery at various levels of the test.
When it examined the work of Smarter Balanced (SBAC) on accessibility and accommodations, the technical review panel was pleased with how the consortium incorporated universal design and attention to bias and sensitivity issues into its process of writing test items. It also commended the group for building accessibility features into the test-delivery system.
But it recommended that Smarter Balanced step up its use of experts in the testing of students with disabilities and English-learners "in all aspects of the item and test development process." The panel wants SBAC to add cognitive labs, item tryouts and other research studies that include greater numbers of English-learners and students with disabilities.
Smarter Balanced responded that experts have guided many aspects of its work. And it noted that it conducted cognitive labs and small-scale trials before the spring 2013 pilot tests. It plans to do subgroup analyses of those pilot tests, as it will with the 2014 field tests, according to the report.
Nonetheless, the technical review panel cautioned SBAC to "focus additional attention on ensuring that the assessment system is accessible for all students." To do that, it should "increase and improve training for item developers to focus on accessibility," and do sufficient research to ensure that English-learners of various proficiency levels, and students with disabilities, can access the content of the tests.
In the area of assessment development, PARCC came in for compliments on items and text passages, with the panel finding them "generally [of] high quality and aligned to the standards." The consortium was also praised for its use of evidence-centered design. So far, the test seems likely to be able to provide "clear, valid information for students, including students scoring well or poorly," the report said. But it cautions the consortium to confirm through research that the test can "discriminate" well (that is, provide valid feedback for students at all points of the scoring spectrum).
The panel identified a potentially troublesome area for PARCC, though: maintaining item quality while pushing hard to finish all its items on time.
"Reviewers recommended that the consortium ensure that rapid acceleration in item development necessary to meet the consortium's item development goals does not lower item quality," the report said.
PARCC responded, according to the report, that it has "implemented item quality measures to ensure it continues to produce high-quality items through the tight timelines."
Reviewing Smarter Balanced's assessment development work, the review panel commended it for developing 5,000 items for its spring pilot tests, and "thoughtfully attempt[ing] to incorporate lessons" from that first phase into future phases of item development. But it suggested that training for item writers and reviewers could be improved "to enhance the quality and consistency of item development." It also urged the consortium to make sure that 500 "archetype items"—used for guiding and training item writers—"are sufficient to address the breadth and depth of the content standards."
Turning to the consortia's work in research and planning, the review panel "strongly suggested" that PARCC do some near-term research to make sure that "items and tasks work as intended; that students understand what they are being asked to do; and that test design, items, and scoring rubrics provide strong discrimination for students who score at high and low levels as well as students who achieve mid-range scores." It also suggested that PARCC find out, as part of those studies, whether students involved in the studies have received common-core instruction on the material being tested.
For Smarter Balanced, the panel suggested that the consortium consult its technical-advisory committee about the "content coverage" of its test blueprints. Specifically, it raised the question of whether it was using enough literary and informational texts to adequately cover "the full range of content" in the common standards. Smarter Balanced responded that it had already increased the number of texts it is using.
The panel also suggested that SBAC validate its plans to, among other things, combine the scores from the computer-adaptive portion of the test with those from the performance tasks into one scale score.