
Let the Spin Begin


Suppose that your fourth-grader takes a state test that shows that she understands the associative property of multiplication, can multiply two-digit numbers by two-digit numbers, and can find the perimeter of a polygon by adding up the length of the sides. A year later, as a fifth-grader, she takes a test that shows that she can compare fractions and decimals using <, > or =; identify the factors of a given number; simplify fractions to their lowest terms; and knows that the sum of the interior angles of a quadrilateral is 360 degrees—but she cannot yet create algebraic or geometric patterns using concrete objects or visual drawings (e.g., rotate and shade geometric shapes). Would you say that your child had lost ground in proficiency, or actually gone backward?

Jim Liebman would. Liebman, the Columbia University law professor on leave as Chief Accountability Officer at the New York City Department of Education, is quoted and paraphrased in an article by Jim Dwyer in Saturday’s New York Times on the F grade that P.S. 8 in Brooklyn Heights will receive in this year’s School Progress Reports—a grade that many are finding hard to believe, given that 80% of the students tested in the school are judged proficient in math, and two-thirds are judged proficient in English Language Arts. Doubly embarrassing, in that Chancellor Joel Klein and Mayor Mike Bloomberg have publicly declared the school to be successful and worthy of emulation.

So the spinmeisters are out, and the spin here is justifying the grade of F by arguing that the children in P.S. 8 are going backward. “You drop them off at the beginning of the year, and on average, by the end of the year, your child lost ground in proficiency,” Dwyer quotes Liebman as saying. “Where was the child last year, and where is the child this year?” Liebman asked. “You’re comparing them to themselves.”

A gentle reminder to Mr. Liebman, who was hired in January 2006: the state math and ELA tests that children take, and that are the primary basis for assigning these lovely letter grades, are not vertically equated. (See skoolboy's testing primer here.) This means that there is no basis for comparing performance on the fourth-grade test with performance on the fifth-grade test. For each test, there is a subjective judgment about what level of performance constitutes proficiency, but the tests are independent. There is no basis for claiming that children are going backward; there’s no justification for claiming that a child “lost ground in proficiency,” since proficiency doesn’t exist in the abstract, but rather in grade-specific skills; and the children are not being compared to themselves. Rather, their location in the distribution of children’s performance in one year is being compared to their location in the distribution of children’s performance the following year.
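To make that last point concrete, here is a minimal sketch, with purely hypothetical scores, of the comparison actually being made: each child's percentile rank within one year's test distribution is set against her percentile rank within the next year's distribution, because the two tests share no common scale.

```python
# A minimal sketch of comparing a child's location in each year's
# distribution. All scores are hypothetical; the two tests share no
# common scale, so only the within-year ranks are comparable.
def percentile_rank(score, cohort_scores):
    """Fraction of the cohort scoring at or below this score."""
    return sum(s <= score for s in cohort_scores) / len(cohort_scores)

grade4_cohort = [580, 610, 625, 640, 655, 670, 690]  # fourth-grade test
grade5_cohort = [600, 615, 630, 645, 660, 680, 700]  # fifth-grade test

# One child: 655 on the grade-4 test, then 630 on the grade-5 test.
print(percentile_rank(655, grade4_cohort))  # ~0.71 of the grade-4 cohort
print(percentile_rank(630, grade5_cohort))  # ~0.43 of the grade-5 cohort
```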

Perhaps Jim Liebman simply misspoke, as perhaps did Chancellor Joel Klein when he referred to statistical significance as “playing something of a game.” Such missteps might arise from the tremendous pressure to justify a particular high-stakes evaluation of a school when there are multiple sources of information about school performance that point in different directions—NCLB status, achievement levels, gains, school quality reviews, not to mention the public pronouncements of Liebman’s boss, and his boss’s boss.

There’s nothing wrong, in skoolboy’s view, with looking at students’ achievement growth as one of several criteria for judging how well a school is doing relative to other schools. But I would never think of using year-to-year changes in proficiency levels on just two tests as the primary basis for evaluating a school’s performance. And neither would most people who study testing and assessment for a living.



Crazy world. When will NYC (and other school districts) recognize that the measures they are treating as absolute are hardly reliable?

What will it take for Bloomberg/Klein to realize that these faulty tests are not the ultimate marketplace?

Who is "judging" the tests?

Most remarkable to me is the unwillingness of both the Press Secretary, David Cantor, and Jim Liebman to take a second look.

I thought the Progress Reports made use of the demographic composition of the tested grades, not the demographic composition of the whole school. In NYC, where there is often rapid demographic change, this could make a big difference, and apparently it did in the case of PS 8. From Elissa Gootman's NY Times article:

Also contributing to the F grade was P.S. 8’s rapid change in population. A quarter of the students now qualify for free lunch, compared with 98 percent in 2002, and more than half the students are white or Asian-American, up from 11 percent in 2002. Most of these changes are happening among the youngest children, before tests begin in the third grade.

Eighty-nine percent of last year’s prekindergarteners at P.S. 8 were white, for example, as were 60 percent of kindergarteners, 49 percent of first graders and 54 percent of second graders. The test-taking grades — 3, 4 and 5 — were 27 percent, 31 percent and 19 percent white, respectively.

It boggles my mind that schools are being graded using this faulty measurement system. Not everything can be shown through a test. And clearly, as skoolboy explains, these tests are not designed to be used solely to focus on growth.

Also, here's the ironic part. Last year I visited PS 8 and PS 307, a school ten minutes away that serves mostly students who live in the housing development across the street. There is a new apartment complex next to PS 307, but middle-class parents have been assured that their children can attend PS 8. As eduwonkette's numbers show, white parents flock to PS 8. These parents clearly make decisions with more information than simply the NYC grading system. But if the data misrepresent what is going on at PS 8, they probably misrepresent what is going on at other schools as well.

What really is this grading system doing besides diverting attention from what really matters?

Check out the webcast for the Aspen Institute's Summit on Education. Some good discussions.

I guess there are two potential solutions:

1. Devise better ways of measuring schools (can anybody say national tests?)

2. Stop trying to measure and grade schools

From Jim Liebman:

Skoolboy doesn’t appear to understand how the DOE’s Progress Reports work.

First, the Progress Reports are the only accountability system in the country that actually gives credit to teachers for the very condition he mentions: If a student is halfway between “basic understanding” (Level 2) and “proficient” (Level 3) according to state learning standards in third grade, and again in fourth grade, that means that the student actually made a year’s worth of progress. She attained more than a basic understanding not only of the skills we expect third graders to master but also, a year later, of the skills we expect fourth graders to master. Even if that student has not yet become “proficient,” she still made a year’s worth of progress by mastering the same proportion of the fourth-grade curriculum as she did of the third-grade curriculum. The Progress Reports recognize this by giving schools and teachers credit for the percentage of their students who achieve a year of progress – in contrast to other accountability systems, which only give them credit for students who are “proficient.” (The Progress Reports, by the way, also give schools credit when students move beyond proficient to advanced – again unlike other accountability systems nationwide).
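Concretely, here is a minimal sketch of that year-over-year credit, under one plausible reading of the description above: each score is expressed as a position on the 1.00-to-4.50 rating continuum discussed later in this thread, and a student who holds or improves that position is counted as having made a year of progress. All numbers are hypothetical.

```python
# A minimal sketch, assuming each score is already expressed on the DOE's
# 1.00-4.50 proficiency-rating continuum (hypothetical numbers throughout).
def made_year_of_progress(last_year, this_year):
    """Credit a student who holds or improves her within-level position."""
    return this_year >= last_year

students = [(2.50, 2.50), (2.50, 2.10), (3.00, 3.40)]  # (grade 3, grade 4)
credited = sum(made_year_of_progress(a, b) for a, b in students)
print(f"{100 * credited / len(students):.0f}% made a year of progress")  # 67%
```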

But suppose that the student who is halfway between basic and proficient in the third grade is just barely at basic in fourth grade. That doesn’t mean this student learned nothing in fourth grade. But it does mean that she mastered fewer of the skills we expect fourth graders to master than was true of her mastery of skills as a third grader. Faced with this problem, skoolboy essentially says “not to worry. After all, the student learned something!” In fact, we should worry a lot if students master less and less of the curriculum in successive years, even if they are mastering some of it.

Consider this: If an elementary or middle school enables its students on average to finish 8th grade at just “proficient” in literacy and math (a proficiency rating of 3.00), those students will have only a 55% probability of graduating high school with a Regents diploma in four years. By contrast, if another school down the street enables its students to finish 8th grade halfway between “proficient” and “advanced” (a proficiency rating of 3.50), those students have over an 80% probability of graduating college-ready in four years. Which school would you want your child to attend? It is exactly these kinds of gains that our Progress Reports hold schools accountable for achieving.

And this is the problem that the Progress Reports discovered at PS 8 in Brooklyn. Compared to schools elsewhere in Brooklyn and New York City with students at the same entering learning levels, PS 8 enabled its students on average to master about one-third less of the state learning standards in a year. Three years of those kinds of results, and the students will have fallen a full proficiency level behind their peers who had the luck to attend other schools. Again, should the City’s families – especially those with children who start school less well-prepared – be content with schools that leave their students with those kinds of prospects? Skoolboy says we should be just as happy with a school from which students matriculate with a 3.00 – and a 55% probability of graduating ready to go to college – as we are with a school from which students matriculate with a 3.50 and an 80%-plus probability of succeeding in high school.

Skoolboy may believe that New York State’s learning standards aren’t valid: his post seems to say that we shouldn’t care if a child can’t create algebraic or geometric patterns using concrete objects or visual drawings at the point in a child’s schooling when teachers across the state (who created the state’s learning standards) have concluded that this skill is one children should master. But he hasn’t explained why he knows better than the educators who created those standards, or what standards he would prefer, or how long it is okay for schools to let students delay learning what educators statewide believe they should learn.

As for vertical scaling, the state’s tests are vertically connected (or “vertically moderated”), and our own analyses reveal that they indeed are highly vertically predictive. A student who leaves 8th grade with a 2.00 is less likely to graduate on time, college ready, than a student with a 2.10, who is less likely to do so than a student with a 2.20, who is less likely to do so than a student with a 2.30 – and so on all the way to the very top of the scale at 4.50. We know this based on analyses of millions of observations of the performance of New York City children over many years. Skoolboy would have us wait and do more studies, achieving pure statistical perfection before acting on what we already know. Again, if it were your child whose school was not enabling her to progress from year to year – whose school was allowing her to master less each year than the year before – you wouldn’t be willing to await statistical perfection.

Also, eduwonkette suggests in her comment that PS 8’s grade is a result of its having a demographically and economically mixed population in its older grades and a largely white and middle-class population in its lower grades. The post suggests that it is unfair to use the entire student body as a benchmark of the school’s performance and that instead the DOE should only have used the school’s upper-grade population to define the school’s peer group. In fact, PS 8 did so poorly in the amount of longitudinal progress achieved by its mixed upper-grade population of students (“progress” that was in many cases negative on average) that the school would receive an F no matter how its peer group is calculated. PS 8 helped its students make less longitudinal progress than virtually any other school in the City. No matter which schools it is compared to – and compared to all schools – it performed very poorly, especially with its upper-grade population and most especially with its struggling students. That is why it received an F.

There’s evidence this system is working. Check out the recent statistical study, “Short Run Impacts of Accountability on School Quality,” http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1261682, by Columbia University researchers Jonah Rockoff and Lesley Turner, which found that:

“Despite the fact that [NYC] schools had only a few months to respond to the release of accountability grades, we find that receipt of a low grade significantly increased student achievement in both [math and ELA], with larger effects in math. We find no evidence that accountability grades were related to the percentage of students tested, implying that accountability systems can cause real changes in school quality that increase student achievement over even a short time horizon.”

David, You are missing the bigger picture. Skoolboy is not nit-picking your assessment. He is saying that evaluating schools with this type of measure is wrong. There is substantial evidence to support Skoolboy's assertion.

If NYC had any inclination toward actually improving its students' education, it would not be focused on evaluating schools but on improving instruction, curricula, and the very standards and assessments that it is relying on so heavily.

Quality school systems around the world that are substantially more effective at educating their children are focused on the student not on setting up some convoluted, misleading way of "grading" schools.

By the way, remind me again how these "grades" are supposed to improve instruction or anything else that actually affects student learning?


The grades improve instruction the same way a bad grade for a student can improve his or her homework completion. That is, if mom or dad is going to ground me because I got an F, I'll work harder next month, re-evaluate how I'm approaching the class, or go ask the teacher for more help.

Cantor's arguments show that these grades ARE, in fact, meaningful, and that evidence shows that the data they are using do, in fact, predict life outcomes. This is significant, and I hope that skoolboy will be able to avoid doing what most people do in these situations, which is dig in their heels and look for more evidence to support their own position, rather than acknowledging the wisdom in the other's viewpoint.

NYC's report cards are not the only system in the country to use a value-added approach. Ohio has recently adopted one as well. The initial results surprised some, embarrassed others, and confirmed a few. I think the problem is that the results are most useful, but least reliable, the closer one gets to the classroom/teacher/student level. Ohio reports an average for each school, along with a grade-level content-area breakdown. Some schools (or districts) that show a year's worth of growth are actually all over the map when it comes to above- and below-average growth. So a kid who comes in "behind" (as is frequently charged, and as the growth measures were supposed to address, allowing schools to take credit for progress even though the kids might never "catch up") may in fact achieve less than a year's growth for successive years, but still be balanced out by other kids (perhaps in other grades) who do exceed a year's growth. Or success in reading could cancel out deficits in math.
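A small numeric sketch of that masking effect, with entirely hypothetical growth figures: a school can post exactly a year's average growth even though every child in one grade falls short of a year.

```python
# Hypothetical growth (in years of progress) for two grades in one school.
grade_4_growth = [0.6, 0.7, 0.8]  # every child below a year's growth
grade_5_growth = [1.3, 1.4, 1.2]  # every child above a year's growth

all_growth = grade_4_growth + grade_5_growth
print(sum(all_growth) / len(all_growth))  # 1.0: a "year's worth" on average
```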

Certainly there is value in being able to say to schools that are not providing adequate growth and also achieving below proficiency that they have work to do, and there is confirmation for those schools achieving at proficiency and also providing adequate growth. But for the vast middle, it is merely helpful information in pinpointing where the areas of weakness lie.

skoolboy's original point wasn't that a value-added system is bad, it was that the NYS Tests do not provide a valid value-added measure because of the way in which they're constructed (re-read the part in bold above).

Margo/mom hit it on the head, writing:

"I think the problem is that the results are most useful, but least reliable, the closer they get to the classroom/teacher/student level that one gets."

Of course, we interpret that differently. I conclude that we should move away from data-driven accountability and toward data-driven decision-making.

In my experience, the two are mostly incompatible.

John, Exactly right. Good quality data on how to improve can be very helpful for improving learning. And coming up with that data is rather difficult (but doable).

Summative "grades" on the school is only useful as a case for closing the school. Perhaps that is what NYC is trying to do; Provide extensive documentation for closure.

But how in the world would some summative grade ever provide a path towards improvement?

Socrates, In your analogy, who is playing the role of "parent" for the schools that are receiving D's or F's? That is, who is responsible for ensuring that the schools become better?

Certainly not the district administration. They are more obsessed with documenting artifacts than improving student learning.

Quality school systems (internationally) do not use these summative measures to "hold the schools accountable".

Quality school system administrators take the responsibility for improving student learning upon themselves, not passing the buck onto the backs of overworked teachers.

Where are the school improvement initiatives?

Hi everyone,

I just want to comment on the Rockoff paper that Cantor/Liebman use as evidence that this system is working. There were exactly 34 school days between the day the grades were publicly released and the ELA test. Do we really believe that learning was improved in D and F schools in a meaningful way in such a short time frame? Did anyone even bother to audit for cheating?

Thank you, representatives of the DOE, for posting a comment. I hope you’ll be willing to continue to do so as eduwonkette and I begin to analyze the school progress report data you’ve just released. I want to focus on what I view as the fundamental difference between the DOE and me at the heart of my post: the distinction between continuous and threshold measures of academic performance. (eduwonkette is exploring this distinction in her dissertation research on the distributional consequences of accountability systems in education and medicine.)

The Grade 3-8 Mathematics and ELA Tests are designed to measure the extent to which students are achieving the New York State Learning Standards in these domains. Scores on these tests are reported in several ways. Raw scores are converted into continuous scale scores that are intended to be interval-level (which means that the 40-point difference between a score of 630 and 590 represents the same difference in the underlying level of subject-matter achievement as a difference between 720 and 680.) Then, based on a standard-setting process, cut scores are established to classify students into four mutually-exclusive proficiency categories: Level I (not meeting learning standards), Level II (partially meeting learning standards), Level III (meeting learning standards), and Level IV (meeting learning standards with distinction). These four proficiency levels are not interval-level scales—the distance in achievement between Level IV and Level III is not the same as the distance in achievement between Level II and Level I. Rather, the proficiency levels are threshold measures. If a student achieves the threshold score for Level III, the student is judged to have met the learning standards (which corresponds to the NCLB category of “proficient” that all students are expected to achieve by the year 2014.) There’s no such thing as “very nearly meeting learning standards”—the standard setting process considers just these four proficiency categories. The specific skills that are to be mastered to be classified in each proficiency category differ from grade to grade.
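To make the distinction concrete, here is a minimal sketch, with hypothetical cut scores (the real ones vary by grade, subject, and year), of how an interval-level scale score gets collapsed into an ordered set of threshold categories:

```python
# Hypothetical cut scores for illustration only; the real values differ
# by grade, subject, and year.
LEVEL_CUTS = [("IV", 720), ("III", 650), ("II", 600)]

def performance_level(scale_score):
    """Collapse an interval-level scale score into an ordered threshold level."""
    for level, cut in LEVEL_CUTS:
        if scale_score >= cut:
            return level
    return "I"

# 601 and 649 are both Level II, while 649 and 651 straddle a threshold,
# even though the second pair differs by only two scale-score points.
print(performance_level(601), performance_level(649))  # II II
print(performance_level(649), performance_level(651))  # II III
```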

When I described fourth-grade math skills such as multiplying two-digit numbers by two-digit numbers, and fifth-grade math skills such as simplifying fractions to their lowest terms, I took these skills right from the New York State Learning Standards for these grades. In my example of a fifth-grader who could not yet create algebraic or geometric patterns using concrete objects or visual drawings (e.g., rotate and shade geometric shapes)—another fifth-grade mathematics learning standard in New York—I was seeking to create a hypothetical contrast between a Level III fourth-grader who had met the learning standards (i.e., was proficient), but who in fifth grade was at Level II, having partially met the learning standards (but not proficient). I thought then, and I continue to think now, that it was inappropriate to describe a student who had learned quite a bit during that year beyond where she started as “going backward” or “having lost ground in proficiency.” The skills that represent proficiency in fifth grade are different from those that represent proficiency in fourth grade, and there is no way to use scores on the respective tests, which are not vertically equated, to look across the grades.

The DOE response gives me an opportunity to further explicate our difference of opinion regarding threshold and continuous measures. Readers will note that it makes reference to proficiency ratings of 3.50, 2.10, and 4.50 (“the very top of the scale”). What are these ratings? They are not based on the standards-setting process used to construct the proficiency levels reported by the state. The fine print of the DOE’s guide to the school progress reports distinguishes between Performance Levels ranging from 1 to 4—what the technical reports on the Grades 3-8 Math and ELA Tests released by New York state refer to as “Proficiency Levels”—and “Proficiency Ratings.” “For purposes of the Progress Report,” the DOE guide states, “the scale scores awarded by the State on State mathematics and ELA exams are assigned a Proficiency Rating on a continuum from 1.00 to 4.50. A Proficiency Rating of 1.00 corresponds to the lowest score a student in Performance Level 1 can attain. A Rating of 1.99 corresponds to the highest score a student can attain and still be at Performance Level 1. A Proficiency Rating of 2.50 corresponds to the midpoint between Performance Level 2 and Performance Level 3 and Ratings between 3.00 and 4.00 reflect scale scores between the State cut-off scores for Performance Levels 3 and 4. Students who exceed the cut-off score for Performance Level 4 are assigned Proficiency Ratings from 4.01 to 4.50; a Rating of 4.50 corresponds to the highest score that can be attained on the test.”
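To see what that fine print implies, here is a minimal sketch of the mapping as I read it. The cut scores are hypothetical (the real ones vary by grade, subject, and year), and I have smoothed the guide's 1.99/2.00 seams into a single piecewise-linear interpolation:

```python
# Hypothetical scale-score boundaries: [test minimum, Level 2 cut,
# Level 3 cut, Level 4 cut, test maximum]. Assumed for illustration only.
CUTS = [475, 600, 650, 720, 780]
# Proficiency Ratings assigned at each boundary, per the guide's fine print.
RATINGS = [1.00, 2.00, 3.00, 4.00, 4.50]

def proficiency_rating(scale_score):
    """Linearly interpolate a scale score onto the 1.00-4.50 rating continuum."""
    s = max(CUTS[0], min(scale_score, CUTS[-1]))  # clamp to the test's range
    for lo, hi, r_lo, r_hi in zip(CUTS, CUTS[1:], RATINGS, RATINGS[1:]):
        if s <= hi:
            return r_lo + (r_hi - r_lo) * (s - lo) / (hi - lo)

print(proficiency_rating(625))  # midway between the L2 and L3 cuts -> 2.5
```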

Why the DOE chose to do this is a mystery. The scale scores which are reported for individual students are interval-level, which means that calculating an average scale score is a meaningful thing to do. But these proficiency ratings are not interval-level, because the Performance Levels are not equally spaced. I know of no way to justify assigning a value of 2.50 for the score that is halfway between Performance Level II and Performance Level III (it’s probably no accident that the state uses Roman numerals rather than the Arabic numerals 1, 2, 3 and 4—it discourages the temptation to treat them as countable, rather than an ordered set of thresholds for proficiency.) Frankly, it’s meaningless. If the DOE wants to give partial credit for growth that doesn’t reach the next threshold for proficiency, they should use scale scores; that’s exactly what they’re designed to do.
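A quick numeric illustration, reusing the hypothetical cut scores and the proficiency_rating function from the sketch above: two pairs of students with the same mean scale score end up with different mean proficiency ratings, because the rating stretches and compresses the scale differently inside each band.

```python
# Uses proficiency_rating and the hypothetical CUTS defined above. Both
# pairs have a mean scale score of 650, yet their mean Proficiency Ratings
# differ, because the interpolation bands have unequal widths.
pair_a = [600, 700]
pair_b = [640, 660]

mean_a = sum(proficiency_rating(s) for s in pair_a) / len(pair_a)
mean_b = sum(proficiency_rating(s) for s in pair_b) / len(pair_b)
print(round(mean_a, 2), round(mean_b, 2))  # 2.86 vs. 2.97
```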

Now, I’m going to try to take the high road here and not overreact to the DOE’s efforts to put words in my mouth—“skoolboy says we should be just as happy with a school from which students matriculate with a 3.00 – and a 55% probability of graduating ready to go to college – as we are with a school from which students matriculate with a 3.50 and an 80%-plus probability of succeeding in high school,” “skoolboy may believe that New York State’s learning standards aren’t valid,” “skoolboy would have us wait and do more studies, achieving pure statistical perfection” and so forth. I think it’s clear that I said nothing of the sort. I certainly said nothing about a 3.50 proficiency rating, which I think is something meaningless that the DOE made up. Moreover, I defy anyone to show how what I wrote suggests that I believe that New York State’s learning standards aren’t valid. (And what the heck is “pure statistical perfection”?)

Skoolboy, Greatly appreciate you keeping the NYC office honest regarding their statistics. Documenting statistical artifacts is in no one's best interest.

But really, why are they doing this anyway? What is the point? Without an improvement plan, what is the use of evaluating schools?

The NYC DOE is mistaking the tail (tests) for the dog (education).

Assessments are not an end unto themselves. Assessments are great for evaluating whether plans defined a priori met their goals.

Where are the plans for improving schools? What goals are the NYC DOE setting?

Just hoping that schools will get better with these "accountability" systems is rather foolhardy. This approach to improvement has never worked at any time or in any place for complex systems (such as education).

Even businesses develop plans first.

Professor Pallas relies heavily on the design of the New York State Learning standards and tests for measuring mastery of them. He cautions against deviating from or going beyond that system. Three comments:
First, in Professor Pallas' view, "it [i]s inappropriate to describe" a student who was "a Level III fourth-grader who had met the learning standards (i.e., was proficient), but who in fifth grade was at Level II, having partially met the learning standards (but [was] not proficient) . . . as 'going backward' or [as] 'having lost ground in proficiency.'" In this belief, it is Professor Pallas who is deviating from the New York State accountability system. Under that system, a student who is Level III in one year and Level II the next year has indeed lost ground. In third grade that student factors favorably into the school's accountability evaluation for both State and Federal (NCLB) purposes. In fourth grade, the student factors unfavorably into the school's state and federal accountability evaluation. Why? Because the student's learning level was where it was expected to be as defined by the State Learning Standards in third grade, but is not where it is expected to be as defined by those same standards in the fourth grade. Put colloquially, it is because the student has fallen back from proficiency to something less favorable to the student and his future.
Second, in departing from the goals and approach of the New York State Learning Standards and accountability system, Professor Pallas acknowledges the point made in our earlier post -- he takes a position different from that of the broader community -- in this case, the broader community as reflected by the New York State Education Department in exercising its role under state and federal law to define the learning that is expected to take place in core subjects in each grade in schools in the state, and to hold schools accountable for whether or not they bring their students to the point where expected learning has taken place. For Professor Pallas, the fact that a student had mastered the material expected of a third grader but has failed to do so as a fourth grader is not a matter worth taking into consideration in holding accountable the student's school, because the student may have "learned quite a bit during that year."
Professor Pallas's opinion makes me wonder whether he awards an A to every one of his Teachers College students who demonstrates that she "learned quite a bit" during a year in his class, though she did NOT master important parts of the materials and key concepts that he presented. I wonder whether he would retain an office assistant he hires who knows "quite a bit" about how to type a letter for him and answer his phone but fails to keep his calendar up to date or prepare materials for his classes in the manner that he requests. We do think it is not only appropriate but necessary to draw distinctions between learning "quite a bit" and learning, in addition, what is necessary to succeed in high school and beyond. Consider a student who has gotten enough items correct on his or her ELA and math exams in 8th grade to earn the lowest scale score that qualifies as "proficient." This student surely has "learned quite a bit" in nine years of schooling. But based on the actual outcomes of many hundreds of thousands of New York City students, we know that the probability that this student will graduate high school four years later (i.e., on time) with a Regents (i.e., college ready) Diploma is only 55%. A student who instead emerges from eighth grade having scored what we call a 3.50 -- meaning that the student attained the scale score that represents half as many additional correct answers as is necessary to move from a Level III to a Level IV under the New York State system -- has over an 80% probability of graduating on time with a college-ready diploma. We believe that we owe it to our students and their parents to hold our schools accountable in a serious, consequential way for the difference between a 55% and an 80% chance of a future in which the door-opening possibilities of a college education are made available. We believe that it simply won't do to treat both schools -- the one that leaves its matriculants with only a 55% chance of a college-enabled future and the one that leaves its matriculants with an 80% chance of a college-enhanced future -- as the same, because both have helped their students to learn "quite a bit."
Which brings me to the third point. The New York State system goes only so far. It draws distinctions -- ones to which Professor Pallas, unlike New York State, would attribute no consequence or effect -- between Level I, II, III, and IV students. It also uses scale scores that enable one to identify whether a student has just barely passed from Level II to Level III or, instead, is just shy of reaching Level IV (though is still denominated a Level III). Although that system does not itself make finer distinctions, the question is whether the data it generates permit us to do so. After careful analysis we have determined that they do. Why? Because we know that a student who is one-tenth of the way from Level II to Level III as I have defined that concept above has a higher probability of graduating high school on time and college ready than does a student who is right at Level II; a student who is two-tenths of the way from Level II to Level III has a still higher probability of graduating high school on time and college-ready; and so on from the very bottom to the top of the scale. At some points the smooth and continuous curve that this analysis generates is less steep (at the two tails, below 2.5 and above 3.7 or so) and at places it is quite steep (between 2.5 and 3.7). But it is always true that mastering more information and skills as reflected by test outcomes, not only across but also within the State's four proficiency levels, creates a higher probability of succeeding thereafter in high school. Because the New York City Department of Education, as a matter of policy, puts a high premium on giving students the best chance of an on-time, college-ready graduation, the data generated by the New York State accountability system provide us with a powerful basis for evaluating our schools -- based not only on whether students in these schools reach Level II vs. III but also on where those students' results are located between those thresholds. And based not only on whether those students learn "quite a bit," but also on whether they learn enough to prepare them to succeed in high school and college. For us, learning quite a bit isn't enough. Learning what it takes to master state learning standards and, in addition, to succeed in high school is what is required.
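Purely as an illustration of the shape described above (not the DOE's actual model, which is estimated from its own longitudinal data), a logistic curve passed through the two quoted points, 55% at a rating of 3.00 and 80% at 3.50, is flatter in the tails and steepest in the middle of the scale:

```python
import math

# Illustrative only: a logistic curve fitted through the two probabilities
# quoted above (rating 3.00 -> 55%, rating 3.50 -> 80%). The DOE's actual
# curve is estimated from longitudinal student data, not reproduced here.
K, X0 = 2.371, 2.915  # slope and midpoint solved from the two quoted points

def p_grad(rating):
    """Hypothetical probability of an on-time, college-ready graduation."""
    return 1 / (1 + math.exp(-K * (rating - X0)))

for r in (2.0, 2.5, 3.0, 3.5, 3.7, 4.0):
    print(f"rating {r:.2f}: {p_grad(r):.0%}")  # 10%, 27%, 55%, 80%, 87%, 93%
```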


In answer to your question about my analogy, the DOE certainly is the parent, despite your concern that they're more concerned with enforcement than teaching schools how to teach (of course, if they told the schools how to teach, plenty of people would be screaming about oppressive dictatorial Klein imperatives). It wasn't my parents who taught me Calculus, but when I got a bad grade on my first quarter report card, it sure was them who threatened to take away the car. Thus, my achievement in Calc went up. Sort of like how the achievement of last year's F schools has gone up.

Socrates, The parent/grade analogy only goes so far. You were apparently used to doing well, and "just working harder" seemed to help you.

The vast majority of kids do not use grades as motivators. Certainly, if they did, there would be a steady trend upward from bad grades to good ones. Not so. Largely, grades become a self-fulfilling prophecy.

The "grades" for schools are only good for making the case to shut them down. There is nothing within the grades that provides a path towards improvement.

David Cantor, You apparently put more thought into this convoluted school assessment system than into any ideas regarding how to improve NYC students' education.

A quality school system starts with goals, develops plans to implement them, and then (and only then) develops assessments to determine whether those goals were met. You are confusing the tail (tests) for the dog (education).

Instead of attacking skoolboy, how about listening/incorporating his concerns and coming up with a real improvement plan for NYC schools?

No, the reason many kids don't get better when they get a bad grade is because of learned helplessness. But teachers are adults and should be able to use the feedback provided by grades (and data) to improve their instruction. In most cases, teachers can improve if they work harder, just as I could've improved my grades by working harder when I was young.


It is not working harder that our teachers need, but working smarter. And working smarter means teaching in a better way.

Our teachers work the longest classroom hours of any peer nation. They have tremendous work ethic.

What they don't have is system support for improving. To improve requires better curricula, better teaching techniques, and better standards/assessments.

Given the lack of system support, our teachers do an amazing job. Given the way our system is set up, there is no room for improvement. Our teachers are doing the best that they can under the circumstances.

This is not to suggest that schools can't be improved. They can be, and quite dramatically. But they won't be improved by just insisting that teachers are lazy and need to work harder.

The problem with the "accountability" model is that *nothing* improves without a plan. Just saying, "Well, schools will (magically) come up with a good plan" completely underestimates the coordination necessary to enable schools to educate well.

One aspect that is *never* discussed is how poorly the state standards are organized and developed. Schools are trying to match standards that are ill thought out, that try to do too much and do it poorly, and that are mostly at complete odds with a quality education.

And so our assessments are based largely upon horrible standards.

If you want an example of state school systems developing horrible standards, consider the state of Washington. It has just gone through a revamping of its math standards because of the lackluster performance of its students on the math section of the WASL. But in that revamping, the state came up with "new standards" that were remarkably like the old ones. And in comparing those standards to available curricula, it found that the ones that aligned best were (surprise, surprise) the same ones already being used in schools. Singapore math, which internationally has produced the best-educated students as measured by TIMSS, was considered completely unacceptable because it did not align with the "new standards," and so, despite its demonstrated high quality, it is not being recommended for use.

So how can this be? If our standards were so great, wouldn't they align with the best material used around the world? Why is it that standards developers are never held "accountable" for the mess they fob off onto the schools?

So who is responsible for fixing the standards, or even knowing that they are bad/poorly written?

The problem with our schools is that we have no system for improvement. Accountability alone will never suffice.

Curricula matter. Teachers matter. And having assessments that align with long term goals and classroom instruction is *essential* towards enabling students to learn well.

Education is not a marketplace. It is a rather artificial ideal that we aspire to attain. So someone has to define what success is. If that success is defined by the standards/assessments that we have in place today, then it is no wonder that we are failing in educating our children to high levels seen around the world.

In business, the marketplace serves as the "invisible hand". Why do you think that these horrible tests based upon ill-thought out standards can best serve as education's invisible hand?
