Response: 'The Grading System We Need to Have'

Two years ago, Charlie Herzog asked:

Should we continue to assign students grades in the traditional manner (percentages & letters), or should we move towards a system based on levels of mastery?

I published a very popular response, Several Kinds of Grading Systems (with responses from Thomas R. Guskey, Susan M. Brookhart, and Bill Ivey, along with my own thoughts), but it was only a "one-parter."  Since it's a topic we all have to deal with, I decided that it was important enough to publish a "Part Two."

Rick Wormeli, the well-known educator, author, and speaker, has agreed to provide the primary response today.  In addition, several readers contributed their own thoughts.

Response From Rick Wormeli

Rick Wormeli is a long-time teacher, consultant, and writer living in Herndon, VA. He can be reached at rwormeli@cox.net.

From time to time, we all stare into the silent dark of night or the miles of rush hour traffic caramelizing in front of us and wonder who we are and whether or not we are perceiving everything that's important. In a rushing, ever-shortening-attention-span world, we feel the need to stop and take stock. On shifting sands, it's reassuring to have some kind of metric telling us that we're going to be okay.

Just as Mark Twain [supposedly] once declared that teaching is like trying to hold 35 corks underwater at once, students and parents feel the same way as they figure out the dual games of succeeding in school and growing up. Each year, students face different teachers, subjects, policies, and hormones. They are catapulted by new-found strengths, strangled by familiar weaknesses. Students and their parents use grades as navigation buoys in this uncertain harbor. No matter our age, we will always want some kind of reporting mechanism for how we're doing as learners.

The key, of course, is to create a grading and reporting system that is helpful to all stake-holders: students, parents, teachers, and the larger community. A reporting system built merely to sort humans in order to provide sports eligibility or grant scholarships is destined to be abused and unhelpful in the long term. Grades are first and foremost communication; they are information, nothing more. The moment we make them something more, we corrupt their constructive use.


Stan Williams and Emily Rinkema (at @CVULearns on Twitter) published a blog post in February 2014 comparing a grade to the colored dot posted at our intended destination on a GPS system. I loved the comparison, so let me add a bit here:

When we want to join a friend for dinner at a new restaurant or find a parking space near the sports stadium for the big game, we insert the address into our GPS and start driving. We eventually arrive, and the GPS announces, "You have arrived at your destination." The colored dot is the grade: it's pure information, a statement of fact for where we are at the given moment in relation to our goal. It is nothing more than this. It is not a reward, affirmation, validation, or compensation.

Our reward for arriving at our destination will come in the form of finding the restaurant and joining our friend for dinner and spirited conversation. We will be affirmed for our diligent navigation by parking near the sports stadium but in a place that affords us easy egress after the game. Students with high scores are affirmed by being allowed to pursue activities that make them happy, and by being granted more autonomy for the next unit of study for having demonstrated so much maturity in the current one.

The grade is NOT the reward, nor can it ever be considered such. Once a grade becomes a bartering tool, its power to inform stake-holders and be used to make instructional decisions or document progress accurately is impugned. Any reporting system that sanctions grades to be used as reward or bribery is ripe for abuse. It's corrosive to the teaching-learning dynamo.

Beyond this initial emphasis on shifting grades from compensation to communication, there are seven non-negotiable elements of successful grading and reporting systems, making for a completely arbitrary but nicely framed, "Grading System Eight." Briefly, let's look at each one:

1. Teacher Utility - If time-constrained teachers can't use the grading system easily, it won't matter how wonderful it is. If changing to a new grading system takes time, that's fine; we understand change takes effort. Once there, though, it has to be manageable to those using it daily. This means it cannot be unduly reductive, boiling every breath a student exhales down to a number on an esoteric scale, and using complicated math gymnastics to justify what would have been just as readily determined by a teacher's informed analysis and professional judgment.

2. Transparency - It must be absolutely clear to everyone at each entry point into the system. Performance levels must be plainly defined at every level. While we may use generalized markers to label different levels such as, "Exemplary," "Proficient," and, "Developing," they must be accompanied by clear exemplars and specific evidence that constitute performance at each level. Anyone moving a cursor over an assessment online should see a pop-up text box listing the evaluative criteria and specific comments about the student's performance.

3. Evidence-based - Grades must be reports of students' performance against standards, nothing else. Many teachers don't have training in calibrating evidence of standards with one another, however, and they are required to, "get two grades into the gradebook every week," so they incorporate multiple, non-academic factors into something that is supposed to be an academic report.

To make this clear and useful, then, successful grading systems report non-academic factors (effort, attendance, behavior, homework, work habits, maturation) separately from academic standards. You read this clearly: homework does not count in the academic report, but it has its own reporting. Note that when we report these elements separately, maturation and academic competence improve over systems that aggregate them into one score. See the work of Guskey, Marzano, O'Connor, and Reeves for more on this.

Teachers new to such a system spend significant time (two to three years for one year's worth of curriculum) unwrapping standards in terms of evidence the community will accept for each level of performance. Larry Ainsworth's books are particularly helpful in this regard.

When we calibrate evidence, we can claim our grades have integrity: they mean what they say. Today, about 40% of high school students entering college have to re-take high school courses in college because the grades were false reports of competence. Sound grading systems have integrity because they are based on evidence of standards at the end of learning's journey, not the routes students walked to get there, the inconsistent weather of their teachers' emotional state, or their uninformed, coercive classroom management techniques.

Finally, gradebooks are set up according to standards or learner outcomes, not tests/quizzes/projects/homework. Along one axis is a list of the standards taught that marking period, and along the other are the vehicles used by students to manifest evidence of those standards. Alternatively, a gradebook can list a standard and, when a user clicks it, open a separate window containing all the assessments that students submitted to demonstrate evidence of that particular standard.
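For readers who build or configure electronic gradebooks, here is a minimal sketch of what "organized by standard" could look like in Python. The standards, assessment names, and the function names `record` and `evidence_for` are invented for illustration; they are not taken from any particular gradebook product.

```python
from collections import defaultdict

# Hypothetical gradebook: each standard maps to the evidence filed under it.
gradebook = defaultdict(list)

def record(standard, assessment, level):
    """File one piece of evidence under the standard it demonstrates."""
    gradebook[standard].append((assessment, level))

def evidence_for(standard):
    """Return every assessment submitted as evidence of one standard."""
    return list(gradebook.get(standard, []))

# One assessment can supply evidence for more than one standard.
record("Cites textual evidence", "Ancient Egypt essay", "Proficient")
record("Cites textual evidence", "Document analysis quiz", "Developing")
record("Compares primary sources", "Ancient Egypt essay", "Exemplary")

for assessment, level in evidence_for("Cites textual evidence"):
    print(assessment, "-", level)
```

The point of the layout is that the standard is the lookup key, so "clicking" a standard simply returns the list of assessments stored under it.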



4. Feedback-Focused - Students can learn without grades, but they can't learn without timely, descriptive feedback. Unfortunately, many teachers aren't trained in providing effective feedback, so they see it as merely a sidebar to instruction. They think they have to stop teaching in order to provide feedback, not realizing that experiencing descriptive feedback is an overt act of direct learning. Effective grading and reporting systems are mindful of this powerful element, and they are set up to facilitate timely, descriptive feedback to students within the learning cycle, keeping judgment and evaluation only for the end of that cycle.

This means that gradebooks have major real estate dedicated to formative assessments and their associated feedback, separate from the real estate dedicated to summative judgments of students' final performance regarding standards, or a similar system to delineate between the two.

5. Disaggregated - The less curriculum aggregated into one grade or symbol, the more useful it is. We can't pare everything down to minutiae, of course, nor is this even desirable, but we can take larger topics like, "Reading," and break them down into three to five sub-categories. We provide individual scores/grades on each one, and there is no one overall reading grade. It's a profile of reading skill development at this one point in time. Effective grading systems avoid grades such as, "Ancient Egypt Test: 87%," or, "Final Term Paper: C+." At the top of tests, the standards being assessed are recorded with their separate scores.

6. Mode, Not Mean - Standards-based grading means we no longer hide behind simple math to determine a student's final grade. Effective grading systems do not average scores because averaging distorts the accuracy of the final grade: "Here's the report of what you know and can do today, plus all that you used to not know and be able to do, all rolled into one inaccurate judgment." We don't accept this in our own evaluations, so we shouldn't force it upon the next generation.

The most accurate assessment comes from a large sample size and refers to the most recent, consistent level of performance. Grades based on this premise have higher correlations with outside-the-classroom testing. In order to reduce the skewing influence of a single-sitting, snapshot-in-time sample of student work, effective grading systems require multiple pieces of evidence over time.

This means that effective grading systems require teachers to check students' performance regarding important standards repeatedly through the year, putting previous curriculum on subsequent tests. This is the testimony for both the grade and the teacher - what students carry forward, not what they demonstrated while immersed in the unit of study.

Finally, effective grading systems eject percentages from their wheelhouse. Percentages seem credible at first because they are mathematically determined, but that credibility is misplaced. Percentages promote the 100-point scale, which is ill-suited to reporting human progress against specific evidence of standards in meaningful ways. An aggregate percentage is murky, obfuscating a student's highs and lows and removing the revelatory nature of good assessment. It undermines instructional decision-making, as Tom Guskey made clear in, "The Case Against Percentage Grades," (Educational Leadership, September 2013): "...[W]ith more levels [in a grading scale], more students are likely to be misclassified in terms of their performance on a particular assessment."

Smaller scales have higher inter-rater reliability: a B on one student's assessment for one teacher in one class represents the same level of evidence and learning as a B placed on another student's assessment with another teacher in another class. In smaller, rubric-size scales, we can connect reporting symbols to specific descriptors of evidence. We end the bickering over semantics and students grousing over one-tenth of a percentage point in order to win an unhealthy grading derby.
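To make the "Mode, Not Mean" contrast concrete, here is a small Python sketch. The scores are invented, and the rule used for "most recent, consistent level" (the most frequent score among the last three pieces of evidence) is only one possible reading of that phrase, assumed here for illustration.

```python
from statistics import mean, mode

# Hypothetical scores on a 4-point rubric for one standard, oldest first.
scores = [1, 2, 2, 3, 3, 3]

def recent_consistent_level(scores, window=3):
    """Most frequent level among the last `window` scores (an assumed rule)."""
    return mode(scores[-window:])

average = mean(scores)                    # blends early struggles into the grade
recent = recent_consistent_level(scores)  # reports where the student is now

print(f"average: {average:.2f}, most recent consistent level: {recent}")
```

With these made-up numbers, the average lands between 2 and 3 even though the student's last three pieces of evidence are all at level 3, which is the distortion that averaging introduces.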

7. Constructive Response to Failure - Most teachers who study how the mind best learns accept the need for constructive responses to late work and re-doing assessments and assignments without hesitation. It's the schools in which teachers are highly trained in their subject areas but not in how the mind develops that push against these ideas the most.

Additionally, teachers vested in today's focus on resilience, grit, and similar career- and college-preparatory skills also see the value in students re-doing work until they get it right. They realize that it's the recovery from failure that teaches responsibility and the lesson's content, not the label for failure. Grades are terrible teachers. Somehow we fell into the trap of thinking that holding K-12 students accountable for post-certification, singular-deadline performance taught them self-discipline and respect for deadlines in their pre-certification learning with material they do not find uniformly meaningful and in which they have wildly varying degrees of skill development.

Officials designing effective grading systems recognize that holding students' reports of competency to a specific calendar date removes hope and only perpetuates the ineffective factory model of schooling. Successful gradebooks, therefore, are cumulative for the year. They have a flexible capacity for re-iterative learning, including changing scores/grades easily when new evidence is presented, even over the course of a summer. Previous digressions are not woven into the report of how you perform today. There are none of the spurious re-do grading methods such as giving half a point for each problem a student corrects, averaging the new score with the former one, or allowing only a score as high as a 70 on a re-do.

Zeroes are infinitely revisable within the school year, so most effective grading systems just call them what they really are: "No evidence reported yet," or, "Incomplete." There is every expectation here that the student will achieve and mature, even if his undeveloped self is immature or irresponsible. We will not let his immaturity dictate his learning.
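The arithmetic behind the re-do policies mentioned above can be sketched in a few lines of Python. The scores and function names are hypothetical, and a 100-point scale is used only because the policies being criticized are usually stated that way.

```python
def averaged(original, redo):
    """Spurious policy: average the new score with the former one."""
    return (original + redo) / 2

def capped(original, redo, cap=70):
    """Spurious policy: a re-do can earn no more than the cap."""
    return min(redo, cap)

def latest_evidence(original, redo):
    """Policy described here: the new evidence replaces the old report."""
    return redo

# Hypothetical student: 40 on the first attempt, 90 after re-learning.
original, redo = 40, 90
print(averaged(original, redo))         # 65.0
print(capped(original, redo))           # 70
print(latest_evidence(original, redo))  # 90
```

Only the last function reports what the student can do today; the other two fold the earlier failure back into the grade.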

When people say that they are against standards-based grading, I have to ask, "Then against what do you grade, current weather patterns? The flip of a coin?" When pushed to clarify, cynics agree that we should grade against the curriculum standards. They just don't like what that means for heretofore accepted (but unexamined) grading practices that must be expunged in order to minimize our hypocrisies. They discover that standards-based grading systems are far more challenging for students - they really have to learn the material, not just sort of learn it - and that they are far more useful to everyone involved, resulting in far more competence and hope. No shifting sands here; we're on solid footing.



Responses From Readers

David B. Cohen:

David, a well-known educator and teacher/leader, left a lengthy comment and explanation of his grading system.  It offers a very helpful perspective, but it's a little too long to include it in its entirety here.  I don't want to shortchange it by doing edits, so I would very strongly encourage you to go directly to his full comment.

Brent Logan:

What are the best types of grading practices? It's an interesting question. I think it makes a lot of sense to grade the way adults grade each other:

Above expectations
Below expectations or needs work

Depending on the subject matter, the students, and the teacher, I would expect most students to get a successful grade. A below expectations or needs work grade would mean that the student hasn't yet mastered the subject matter.

Dang Ren Bo:

I'm for burning the whole thing down. We've got objectives. Why do we need class grades? Homework grades? That's all silly. Students should ideally be able to move along and master whatever objectives they can without needing to worry about seat time or (necessarily) completing certain assigned work. If they can show proficiency in all the on-level objectives before we start a unit designed to help with those, those students need to be working on the next level or even (preferably) areas of weakness. Ultimately, students should get exactly the prompts they need at the moment they need them.

We'll never get to this point with the 1:30 teacher-to-student ratio, but technology with teachers as "project managers" may provide a way for us to get as close as possible to a "just in time" education model.

A number of readers contributed their comments via Twitter.  I've used Storify to collect them:

Thanks to Rick and to readers for their contributions!

Please feel free to leave a comment with your reactions to the topic or to anything that has been said in this post.

