
Through the lens of social science, eduwonkette takes a serious, if sometimes irreverent, look at some of the most contentious education policy debates. (Find eduwonkette's complete archives prior to Jan. 6, 2008 here.)


September 30, 2008

Vanity Fair

Rest assured that this blog will not run out of troubling things to write about anytime soon.

[Image: NYC-New-Clothes.jpg]

No Child Left Behind: Looking Back, Looking Forward

I'm knee deep in old NCLB documents, and ran across the Department of Education's NCLB song. NCLB represented not only a major shift in federal education policy, but an embrace of policy/PR boosterism that's enough to make all of us giggle (Remember Armstrong Williams?). Back from 2002, here are the NCLB lyrics:

We're here to thank our president,
For signing this great bill,
That's right! Yeah,
Research shows we know the way,
It's time we showed the will!
No matter how catchy the ditty, a song can't carry a fundamentally flawed law. That's where Tom Toch and Doug Harris come in. They've penned a thoughtful commentary in this week's Ed Week about the future of NCLB (Salvaging Accountability). It's an important one, because it recognizes that NCLB conflates the school's contribution to student learning with what students bring to the school to begin with. Essentially the argument is that:

1) "It’s critical in any accountability system that the metrics used to judge performance reflect accurately the contributions of those being judged."

2) "As a measure of school performance, however, [the NCLB] snapshot strategy is flawed. Because student populations vary greatly from school to school, and because family income, parental education, and a host of other non-school-related factors have a major influence on students’ learning, some schools have to improve student achievement a lot more than others to get their students up to state standards. The federal law is unforgiving of such schools. As a result, it gives an unfair advantage to schools with students from privileged backgrounds, and it fails to measure what matters most: how much students learn during the school year."

3) The Department of Education's Growth Model Pilot offers little improvement over the current rating system because it relies on a projection model (i.e., are students on target to be proficient within a three-year window?) rather than a true growth model.

4) The new NCLB should dump the projection model, and focus its sanctions on schools that are both low in terms of their growth, and low in terms of their proficiency. And there's no reason to wait for reauthorization - this could all happen via regulations.

No commentary can do it all, so here are some issues to ponder for their next round. The goal of Toch and Harris' proposed system is to make measurement of school performance a more fair and effective enterprise. Why not take the leap and dump 100% proficiency altogether? That way, we could narrowly tailor our sanctions to schools that are low-performing compared to the schools we already have.

And if we're going to go full throttle on value-added models, we can't just punt the measurement problems. For example, Toch and Harris write, "value-added calculations have larger margins of error than NCLB’s proficiency ratings, but because they measure what’s most important in judging schools—student learning gains—their statistical shortcomings are more than worth tolerating."

A poorly designed growth model is no better than the poorly designed proficiency model that we have now, and no one knows this better than New Yorkers. Value-added systems that have literally no relationship between two years' value-added measures are still bad public policy. In short, beware the silver bullet.

September 29, 2008

Guest blogger Betsy Gotbaum on: The Future of Mayoral Control

Betsy Gotbaum is the Public Advocate for the City of New York. The Public Advocate is an independently elected citywide official who serves as a public ombudswoman.

Six years ago, Mayor Michael Bloomberg accomplished what those before him could not: he gained control of New York City public schools, a fragmented, famously troubled bureaucracy that now has about 1.1 million students, 80,000 teachers, 1,450 schools and a budget that, at more than $21 billion, is larger than that of several states. When the New York State Legislature authorized mayoral control in 2002, it added a sunset provision, which takes effect next June. At that time, the Legislature will decide whether mayoral control should continue.

I believe that it should.

I also believe that the law should be amended in certain important ways. Last year, Catherine Nolan, chair of the education committee in the New York State Assembly, asked me to appoint a School Governance Commission to assess mayoral control. Over the course of a year, this independent Commission heard testimony from more than 100 individuals representing broad and diverse constituencies, hosted parent forums, and held public hearings. It also commissioned eight academic papers from experts on mayoral control of schools, which in turn shed light on how the process has worked in other cities. (These papers are to be published as an edited volume, When Mayors Take Charge: School Governance in the City, by Brookings Institution Press.)

In its final report, the Commission recommended that mayoral control be maintained. It also recommended changes to ensure greater public accountability as well as meaningful input from parents and the community.

For some time, it's been clear that we need better oversight of city Department of Education (DOE) finances. We also need better oversight of certain DOE-produced data. I enthusiastically endorse the idea that the city’s Independent Budget Office serve as an outside evaluator to monitor and assess such DOE data as test scores and graduation rates. And, since the DOE spends billions of tax dollars, it must follow the same procurement procedures as other city agencies, including bidding protocols created and monitored by the city comptroller. Since 2003, we have seen the DOE give away more than $300 million by skirting the competitive bidding process. The era of no-bid contracts must end.

The DOE has ignored parents, community leaders and others who have a valid stake in the ways and means of educating New York kids. Virtually shut out of the decision-making process, these stakeholders have been unable to provide meaningful input about issues that directly affect their children’s education.

This regrettable DOE attitude must change. While overall mayoral control should continue, it should be flexible enough to include a certain amount of decentralized authority. This is needed to address such local problems as enrollment, school transfers and school bus routes. Also, given the immense size of the school system and the DOE bureaucracy, parents desperately need someone who's knowledgeable, effective, and locally based to consult when problems or questions arise. Toward this end, the local geographic school districts that were created decades ago should be re-established. They should include superintendents with adequate staff and explicit oversight over principals in a given district.

The Commission recommended that Community District Education Councils (CDECs) should continue as well, though a process should be developed to give them meaningful input into decisions about budgets, general education practices and the opening and closing of schools. I support this recommendation, which provides a valuable tool for local involvement. I also believe that eligibility criteria for CDEC membership should be expanded.

To maintain mayoral control, the mayor must continue to appoint the majority of members of the Panel for Education Policy, a 13-member group that, among other things, reviews education policies proposed by the Chancellor. However, members should be appointed for fixed terms, which would ensure their independence. As it is now, they can be fired at will, and they have been, for disagreeing with the Mayor and Chancellor. I also believe that Panel members should select their own chairperson; currently the Chancellor serves as chair of the PEP. Further, the Panel should be composed of members with relevant backgrounds and a stake in the education system.

While the Commission sets down the groundwork for stronger community and parent participation in school governance, much more needs to be done. It's difficult to legislate greater opportunities for community input, but the state's Contracts for Excellence model for parental involvement, cited in the report, is a good start. It's also an approach with which the State Senate has experience.

I’m opposed to one Commission recommendation, that the Panel for Education Policy be involved in collective bargaining agreements. Third-party approval would undermine collective bargaining by empowering an entity that is not involved in the process.

Some may believe that the Commission's final report does not adequately assess the current governance arrangement under Mayor Bloomberg and Chancellor Joel Klein. I understand this criticism. The purpose of this effort, however, was to develop a map for the future, regardless of who is mayor or chancellor. I think the Commission has done that.

Some may disagree with the findings, but there's no question that, by and large, they reflect the views that stakeholders expressed throughout this process. Passions and tensions run high when debating this issue, but the debate must take place. The Commission has established a framework in which this debate can and should continue. I look forward to the discussion that lies ahead, and I'm confident that, through an open and deliberative process, school governance in New York City can be improved.

I encourage you to read the Commission report.

September 28, 2008

This Week's COWAbunga Award

This week's COWAbunga Award, aka the "comment of the week award," goes to Citizen X, who let loose on Roland Fryer's experiments that pay kids for their test scores:
The soul-crushing aspect of Fryer's theoretical framework....is that it lets the curriculum and the teacher and the school entirely off the hook. It's not a matter of creating learning experiences that connect with a child, her culture, her community; or creating curriculum that intrigues her or teaching that respects her and her family; or about creating schools where families and communities can find support and education and develop skills of active citizenship....No, it's a much more cynical view on students living in poverty. They don't care, they are only motivated by material objects that they don't have, they have to be bribed into "learning" (or at least learning to get a better score on a bubble sheet)....RRRRR! Get me out of here.
COWAbunga, indeed.

September 26, 2008

Cool People You Should Know: Sean Reardon

We know that the average African-American student lags behind the average white student. But until recently, we did not have a clear portrait of the differences between black and white high-achievers in elementary school - a critical pipeline issue in shaping inequality in access to the most coveted colleges, graduate schools, and jobs. Thanks to Sean Reardon, a Stanford sociologist of education who studies school segregation and the sources of racial/ethnic achievement gaps, we've come a long way.

How does the progress of initially high-achieving black and white students compare as they progress from kindergarten through 5th grade? It turns out that high-achieving black students fall back significantly more than low-achieving black students. For students who start school at the 84th percentile, black-white gaps grow twice as fast as for students who start school at the 16th percentile. (See Reardon's paper here.)

The question, then, is why. Reardon suggests a few possible mechanisms, each of which deserves more attention in future research. The first possibility is an outgrowth of racial segregation. The average black high achiever attends school with lower-achieving students than the average white high achiever. If teachers teach to the middle, high-achieving black kids may lose out compared to their white peers. A second possibility is that teachers treat black and white low-achieving students similarly, but differentiate treatment among high-achieving black and white students. (This seems less plausible to me, but perhaps you have thoughts here.) A third possibility is that the home environments of high-achieving black and white students diverge more than the home environments of low-achieving black and white students.

Kudos to Reardon for putting this issue on the map, and may a thousand dissertations bloom.

September 25, 2008

Thursday Link Love

1) My Kingdom for a Parking Space: On top of everything else, NYC teachers like Mimi are without parking. As always, she has some funny and insightful things to say about it:
Sometimes it feels as if the forces in the universe are aligning to make this job as difficult as possible, just to see if I have the balls to stick with it. Other times, it feels as if teachers (as people) are the absolute last priority on everyone's list...that we will just suck it up and deal with ridiculous situations "for the kids."

If one more person tells me to do it "for the kids", I might throw a kid at them. Seriously. Stop playing on our good intentions and altruistic dedication to the future and treat us like the professionals you so desperately claim you want us to be. It just seems at times as if this job teeters on the brink of being inhumane.
2) Quiz Show: Celia Oyler puts together her second New York City Progress Report Quiz.

3) Dream big, Harvard: It would be a shame if this $44 million R&D effort in education spent most of its brainpower studying incentives.

September 24, 2008

Could a Monkey Do a Better Job of Predicting Which Schools Show Student Progress in English Skills than the New York City Department of Education?


eduwonkette and I have been blogging about the School Progress Reports released last week by the New York City Department of Education. We’ve shown that, although the performance and environment scores of schools were pretty consistent from last year to this year, the student progress scores were virtually unrelated—knowing a school’s progress score from last year didn’t predict which schools would demonstrate a lot of progress this year. This, we argued, demonstrated that the progress part of the School Progress Report—representing 60% of the letter grade each school received—wasn’t really telling us which schools consistently are promoting student progress, but rather was mostly random error.

The problem was particularly acute in the domain of English Language Arts (ELA). The stability in the student progress scores from 2007 to 2008 was so low that it led skoolboy to wonder if a monkey could actually do a better job predicting which schools show progress in students’ ELA performance in 2008 than relying on the DOE’s 2007 student progress score. The particular measure I examined was the percentage of students in the school making at least one year of progress on the ELA test from last year to this year. (As we've noted in earlier posts, the calculation of this measure changed slightly from 2007 to 2008.)

In the interest of full disclosure, skoolboy didn’t actually rent a monkey to pick the schools. Animals scare him, and he wouldn’t have been able to record the picks while hiding under his bed. What I did instead was use a random number generator to assign each school to the top or bottom half of the distribution of schools on last year’s peer and citywide measures of the percentage of students making a year of progress in English Language Arts.

The DOE got credit for a correct prediction if it correctly predicted that a school would be in the top half of this year’s schools, based on the school being in the top half on the DOE’s 2007 measure, or correctly predicted that a school would be in the bottom half of this year’s schools, based on the school being in the bottom half last year. The monkey got credit for a correct prediction if the randomly-selected location of a school as being in the top half of the 2007 distribution correctly predicted that a school would be in the top half of this year’s schools, or the random pick of being in the bottom half of last year’s distribution correctly predicted that a school would be in the bottom half of this year’s schools. These predictions were done separately for the 570 elementary schools, 128 K-8 schools, and 289 middle schools which received overall letter grades last year and this year.
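For readers who want to try the contest themselves, here is a minimal sketch of the scoring rule just described. It is not the code behind the numbers in this post: the scores below are synthetic random draws standing in for the real Progress Report data, the function names are made up for illustration, and the monkey is modeled as a fair coin flip per school, which is one reading of "random number generator."

```python
import random


def top_half_flags(scores):
    """Return True for each school whose score is above the median."""
    ordered = sorted(scores)
    n = len(ordered)
    median = (ordered[n // 2] + ordered[(n - 1) // 2]) / 2
    return [s > median for s in scores]


def hit_rate(predicted, actual):
    """Share of schools whose predicted half matches the actual half."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


def run_contest(scores_2007, scores_2008, rng):
    """Score the DOE (last year's half) and the monkey (coin flip)."""
    actual = top_half_flags(scores_2008)
    doe_pick = top_half_flags(scores_2007)
    monkey_pick = [rng.random() < 0.5 for _ in scores_2007]
    return hit_rate(doe_pick, actual), hit_rate(monkey_pick, actual)


# Synthetic stand-in data (an assumption): 570 schools whose two years of
# progress scores are unrelated random draws, not real DOE figures.
rng = random.Random(2008)
n_schools = 570
scores_2007 = [rng.gauss(0, 1) for _ in range(n_schools)]
scores_2008 = [rng.gauss(0, 1) for _ in range(n_schools)]

doe_rate, monkey_rate = run_contest(scores_2007, scores_2008, rng)
print(f"DOE hit rate:    {doe_rate:.0%}")
print(f"Monkey hit rate: {monkey_rate:.0%}")
```

When the two years of scores are unrelated, as in this synthetic setup, both hit rates should hover around 50%, and which side comes out ahead in any one run is luck.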

Round 1. We begin with the peer horizon score for the 570 elementary schools. The DOE’s peer horizon progress score from last year correctly predicted the progress status of 46% of the elementary schools this year. The monkey correctly predicted the status of 51% of this year’s schools.

Score: Monkey 1, DOE 0.

Round 2. We next turn to the citywide horizon score for the 570 elementary schools. The DOE’s citywide horizon progress score from last year correctly predicted the progress status of 47% of the elementary schools this year. The monkey correctly predicted the status of 52% of this year’s schools.

Score: Monkey 2, DOE 0.

Round 3. In this round, we examine the peer horizon scores for the 128 K-8 schools. The DOE’s peer horizon progress score from last year correctly predicted the progress status of 45% of the K-8 schools this year. The monkey correctly predicted the status of 55% of this year’s schools.

Score: Monkey 3, DOE 0.

Round 4. Next, we look at the citywide horizon progress scores for the 128 K-8 schools. The DOE’s citywide horizon progress score from last year correctly predicted the progress status of 43% of the K-8 schools this year. The monkey correctly predicted the status of 47% of this year’s schools.

Score: Monkey 4, DOE 0.

Round 5. The final stage of the competition examines the 289 middle schools. The DOE’s peer horizon progress score from last year correctly predicted the progress status of 40% of the middle schools this year. The monkey correctly predicted the status of 50% of this year’s middle schools.

Score: Monkey 5, DOE 0.

Round 6. The last round looks at the citywide horizon progress scores for the middle schools. The DOE’s citywide horizon progress scores from last year correctly predicted the progress status of 45% of this year’s middle schools. The monkey correctly predicted the status of 49% of this year’s middle schools.

Score: Monkey 6, DOE 0.

skoolboy will forgo the cheap jokes about how a monkey could do a better job of managing New York City’s accountability system than the people currently in charge. On the whole, they’re smart, hard-working people, and ridiculing them is not likely to persuade them to change their behavior (as satisfying as it may be at particular moments). But the system that they have designed and implemented is profoundly flawed, as this comical example illustrates, and it needs to change. eduwonkette and I are going to keep hammering on this point, because it has such important consequences for students and for schools.

And besides: I bet the DOE would beat the monkey in predicting school progress scores in math. (But it wouldn’t be a rout.)

September 23, 2008

Happy Anniversary!


Today marks the one-year anniversary of eduwonkette's bold entry into blogging about education. A lot has happened here over the past year, across 487 different posts, and thousands and thousands of comments. (Heck, back then, eduwonk and eduwonkette were BFF.)

eduwonkette has tackled a remarkably diverse set of education policy issues: teacher quality, No Child Left Behind, gender differences in academic performance, myths about small schools, New York City's School Progress Reports, the "it's being done/no excuses" argument, the achievement gap and "acting white", value-added assessment, choice, incentives, unions ... the list goes on and on. And she's done it all with great style and wit, first with and now without the mask.

Today is an opportunity to revisit the principles that brought her to the blogging world:

Are you tired of listening to the usual suspects on education policy? So am I. Education policy debates are dominated by a small number of very loud voices. In these debates, ideological claims, rather than research, data, the experience of educators, and common sense, are wielded as weapons. What are some of the problems I see with these debates?


A selective reading of educational research: The loudest outlets pick and choose which studies are relevant, often leading to a skewed view of what we know and don’t know about how to improve schools.

An inattention to the costs and benefits of policies: Policy solutions are endorsed as if they have no downside. But we know that all actions have positive and negative consequences. The education policy debate would benefit from such an acknowledgement.

A fundamental disrespect for the knowledge of teachers and principals who work in public schools: Too often, teachers and administrators are dismissed as “self-interested” or “protecting the status quo” when they question what policymakers wreak on their classrooms and schools. In no other profession are we willing to discount the opinions of those closest to the work at hand. Education should be no different.

Rather than stepping into this ideological boxing ring, this blog takes a different approach.

And so she has. Happy anniversary, eduwonkette!

What Does Educational Testing Really Tell Us? An Interview with Daniel Koretz

Daniel Koretz, a professor who teaches educational measurement at the Harvard Graduate School of Education, generously agreed to field a few questions about educational testing. He is the author of Measuring Up: What Educational Testing Really Tells Us.

EW: What are the three most common misconceptions about educational testing that Measuring Up hopes to debunk?

DK: There are so many that it is hard to choose, but given the importance of NCLB and other test-based accountability systems, I'd choose these:
* That test scores alone are sufficient to evaluate a teacher, a school, or an educational program.

* That you can trust the often very large gains in scores we are seeing on tests used to hold students accountable.

* That alignment is a cure-all - that more alignment is always better, and that alignment is enough to take care of problems like inflated scores.
EW: I'm intrigued by your third point about alignment. For example, we often hear that because state testing systems are directed towards a particular set of standards, we should primarily be concerned with student outcomes on tests aligned with those standards. This is the common refrain about a "test worth teaching to." What's missing from this argument?

DK: Up to a point, alignment is a clearly good thing: we want clarity about goals, and we want both instruction and assessment to focus on the goals deemed most important.

However, there are two flies in the ointment. The first is that the achievement tests we are concerned with, no matter how well aligned, are small samples from large domains of performance. That means that most of the domain, including much of the content and skills relevant to the standards, is necessarily omitted from the test. As I explain in Measuring Up, this is analogous to a political poll or any other survey, and it is not a big problem under low-stakes conditions. Under high-stakes conditions, however, there is a strong incentive to focus on the sampled content at the expense of the omitted material, which causes score inflation. Aligned tests are not exempt. Score inflation does not require that the test include poorly aligned content. Even if the test is right on target, inflation will occur if the accountability program leads people to deemphasize other material that is also important for the conclusions based on scores. And to make this concrete: some of the most serious examples of score inflation in the research literature were found in Kentucky's KIRIS system, which was a standards-based testing program.

The second problem is predictability. To prepare students in a way that inflates scores, you have to know something about the test that is coming this year, not just the ones you have seen in the past. The content, format, style, or scoring of the test has to be somewhat predictable. And, of course, it usually is, as anyone who has looked at tests and test preparation materials should know. Carried too far, alignment actually makes this problem worse, by focusing attention on the particular way that knowledge and skills are presented in a given set of standards. Think about 'power standards,' 'eligible standards,' and 'grade level expectations,' all of which can be labels for narrowing in on the specifics of how a set of skills appears on one state's particular assessment.

Why is this bad? Because many of those specifics are not relevant to the students' broader competence and long-term well-being. Scores on a test are a means to an end, not properly an end in themselves. Education should provide students knowledge and skills that they can use in later study and in the real world. Employers and university faculty will not do students the favor of recasting problems to align with the details of the state tests with which they are familiar. As Audrey Qualls said some years ago: real gains in achievement require that students can perform well when confronted with "unfamiliar particulars." Improving performance on the familiar but not the unfamiliar is score inflation.

EW: What are the implications of score inflation for both measuring and attenuating achievement gaps? Because schools serving disadvantaged students face more pressure to increase test scores via the mechanisms you describe, I worry that true achievement gaps may be unchanged - or even growing - while they appear to be closing based on high-stakes measures.

DK: I share your worry. I have long suspected that on average, inflation will be more severe in low-achieving schools, including those serving disadvantaged students. In most systems, including NCLB, these schools have to make the most rapid gains, but they also face unusually serious barriers to doing so. And in some cases, the size of the gains they are required to make exceeds by quite a margin what we know how to produce by legitimate means. This will increase the incentive to take short cuts, including those that will inflate scores. This would be ironic, given that one of the primary rationales for NCLB is to improve equity. Unfortunately, while we have a lot of anecdotal evidence suggesting that this is the case, we have very few serious empirical studies of this. We do have some, such as the RAND study that showed convincingly that the "Texas miracle" in the early 1990s, supposedly including a rapid narrowing of the achievement gap, was largely an illusion. Two of my students are currently working with me on a study of this in one large district, but we are months away from releasing a reviewed paper, and it is only one district.

I have argued for years that one of the most glaring faults of our current educational accountability systems is that we do not sufficiently evaluate their effects, instead trusting - evidence to the contrary - that any increase in scores is enough to let us declare success. We should be doing more evaluation not only because it is needed for the improvement of policy, but also because we have an ethical obligation to the children upon whom we are experimenting. Nowhere is this failure more important than in the case of disadvantaged students, who most need the help of education reform.

Inflation is not the only reason why we are not getting a clear picture of changes in the achievement gap. The other is our insistence on standards-based reporting. As I explain in Measuring Up, relying so much on this form of reporting has been a serious mistake for a number of reasons. One reason is that if one wants to compare change in two groups that start out at different levels - poor and wealthy kids, African American and white kids, whatever - changes in the percents above a standard will always give you the wrong answer. This particular statistic confuses the amount of progress a group makes with the proportion of the group clustered around that particular standard, and the latter has to be different for high- and low-scoring groups. I and others have shown that this distortion is a mathematical certainty, but perhaps most telling is a paper by Bob Linn that shows that if you ask whether the achievement gap has been closing, NAEP will give you different answers - very different answers - depending on whether you use changes in scale scores, changes in percent above Basic, or changes in percent above Proficient. This is not because the relative progress has been different at different levels of performance; it is simply an artifact of using percents above standards. This is only one of many problems with standards-based reporting, but in my opinion, it is by itself sufficient reason to return to other forms of reporting.
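The percent-above-a-standard distortion is easy to see with a toy calculation. The sketch below is only an illustration under stated assumptions, not an analysis of NAEP or any real data: both groups are assumed to have normally distributed scores with the same spread, the group means and cut scores are invented, and both groups are given exactly the same scale-score gain.

```python
from statistics import NormalDist


def pct_above(mean, cut, sd=1.0):
    """Percent of a normally distributed group scoring above the cut."""
    return 100 * (1 - NormalDist(mean, sd).cdf(cut))


def change_in_pct_above(mean_before, gain, cut):
    """Change in percent above the cut after the group mean rises by `gain`."""
    return pct_above(mean_before + gain, cut) - pct_above(mean_before, cut)


# Hypothetical values: two groups one standard deviation apart,
# each improving by the same 0.2 standard deviations.
GAIN = 0.2
HIGH_MEAN, LOW_MEAN = 0.0, -1.0

results = {}
for label, cut in [("lower cut score", -1.0), ("higher cut score", 0.0)]:
    high = change_in_pct_above(HIGH_MEAN, GAIN, cut)
    low = change_in_pct_above(LOW_MEAN, GAIN, cut)
    results[label] = (high, low)
    print(f"{label}: high-scoring group +{high:.1f} points, "
          f"low-scoring group +{low:.1f} points")
```

With these made-up numbers, the lower cut score shows the low-scoring group gaining about 7.9 percentage points against about 4.4 for the high-scoring group, so the gap appears to narrow. The higher cut score shows roughly 5.3 against 7.9, so the gap appears to widen. The scale-score gap never changed; only the location of the standard did.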

September 22, 2008

Come on Feel the Noise!

Last week, New Yorkers scratched their heads and tried to make sense of the Progress Report results. What does it mean, for example, when 77% of schools that received an F last year jump to an A or a B? Michael Bloomberg has a resolute answer to this question, “Not a single school failed again....The fact of the matter is it’s working.”

Last week, skoolboy and I took to our computers with the newly released data. Of particular concern is the progress measure, which makes up 60% of a school’s grade. Both skoolboy and Dan Koretz have already identified serious flaws in DOE’s test progress model. Even in the absence of these problems, we know that all models of year-to-year growth must contend with measurement error present in two different tests.

What the heck is measurement error? Bear with us for two paragraphs, because this is critical to understanding the central problem with the Progress Reports. A test score is just a proxy for students' underlying skills and competencies. If you give a student a test, the test score represents the combination of her "true" level of skills plus measurement error. This error may be a function of idiosyncratic factors like not eating breakfast (which might hurt your score), having the good fortune of having studied the material that happens to be on the test (which would increase your score over your true level of skill), or a dog barking during the test (which might decrease the scores of all students in a classroom). A "gain score" represents the difference between two test scores, both of which are measured with error, so they provide noisy estimates.

If measurement error were constant, then it would just cancel out when we difference the two scores. But we know that measurement error is likely to be random – the two errors do not just cancel out. Another kind of error stems from sampling variation, which I have discussed here before. In short, the more measurement error (or “noise”) in the results, the harder it is to detect the “signal” that represents a school’s actual contribution to growth in student learning.
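For those who like to see this in symbols, here is a short worked version of the last two paragraphs. The notation is ours, not the DOE's: X is an observed score, T the "true" skill level, e the measurement error, and G the gain score. The variance line assumes the two errors are independent of each other and share the same variance, which is a simplification.

```latex
\begin{align*}
X_t &= T_t + e_t, \quad t = 1, 2 \\
G = X_2 - X_1 &= (T_2 - T_1) + (e_2 - e_1) \\
e_2 - e_1 &= 0 && \text{if the error is constant } (e_1 = e_2) \\
\operatorname{Var}(e_2 - e_1) &= 2\sigma_e^2 && \text{if the errors are independent with variance } \sigma_e^2
\end{align*}
```

Under those assumptions, the error in a gain score has twice the variance of the error in either test alone, while the true gain is usually much smaller than the true score level. That combination is why gain scores tend to be noisier than the scores they are built from.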

In what follows, we demonstrate that there is almost no relationship between NYC schools' progress scores in 2007 and 2008. The progress measure, it appears, is a fruitless exercise in measuring error rather than the value that schools themselves add to students. If we believe that the Progress Reports are in the business of cleanly identifying schools that consistently produce more or less progress, this finding is rather troublesome.

First, some sunnier results: Below, we provide scatterplots of the relationship between the overall environment and performance-level scores in 2007 and 2008 for the 566 elementary schools that received overall grades in both years. In both cases, last year’s score is a strong predictor of this year’s score. To quantify the extent to which two variables move together, we can use a measure called a correlation coefficient. A correlation of 0 implies that the variables have no linear relationship, while a correlation of 1 represents a perfect positive relationship. We find that the correlation is .82 for the performance score and .75 for the environment score. This is exactly what we would expect – schools’ performance and climates do not change wildly from year to year.

Environment%20and%20Performance%20Plots.jpg
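For readers who want to see how a correlation coefficient is computed, here is a small sketch of the standard Pearson formula in plain Python. The five pairs of school scores are invented for illustration; they are not taken from the Progress Report data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five schools, invented for illustration only.
scores_2007 = [40.0, 55.0, 60.0, 72.0, 90.0]
scores_2008 = [45.0, 50.0, 66.0, 70.0, 88.0]
print(round(pearson_r(scores_2007, scores_2008), 2))
```

A value near 1 means schools keep roughly the same rank order from one year to the next; a value near 0 means last year's score tells you almost nothing about this year's.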

But the relationship between the 2007 and 2008 progress scores is quite different – the correlation is -.02. In other words, there is almost no relationship! This is precisely what we would expect to see if the growth measures were primarily capturing measurement error. (These correlations are still low, but slightly larger, for K-8 and middle schools - the correlations were .11 and .15, respectively.)

Progress%20Plot.jpg

We are left with three possible explanations:
1) The poorly constructed progress measure is simply measuring noise.

2) The DOE somewhat tweaked the progress measure for this year, so the results are not comparable.

3) The receipt of and publicity around last year’s progress measures fundamentally changed how New York City’s elementary schools do business, so that schools that were more successful in raising student achievement in 2007 suddenly became less so, and schools that were less successful in raising student achievement in 2007 suddenly became more so.
New Yorkers are left with three courses of action:
* If explanation 1 is correct, we should ignore these report cards altogether because they are primarily (60%) measuring error.

* If explanation 2 is correct, we should not compare schools' grades in 2007 with their grades in 2008, because they are measuring fundamentally different dimensions of school performance. In this case, the collective hysteria that ensued in NYC schools last week about why grades are up or down was all for naught.

* And if explanation 3 is correct, eduwonkette and skoolboy should shut up and get out of the way of the silent revolution that has transformed public schooling in New York City.
Thanks to skoolboy’s masterful analysis of the data, we present evidence below the fold to suggest that the likely culprit is measurement error. The evidence is not conclusive, because every single element of the progress measure—and there are 16 of them in this year’s student progress measure—changed slightly from last year to this year. The strategy that we pursue below is to compare those elements of the progress measure that were used in both years - for example, the percentage of students making at least one year of progress, or the average change in proficiency scores. Again, we stress that these measures were not identical across years, but one would expect them to be moderately related. Needless to say, that is not what we found. We think it extremely unlikely, given these analyses described in detail below, that this is simply due to a tweaking of the progress report measures.

And what of the third explanation—a fundamental overhaul in the effectiveness of New York City’s elementary and middle schools over the past year that reshuffled the effective and ineffective schools? Magical transformations that shift schools from low to high-progress, or vice versa, are the fabled stuff of Hollywood movies, not reality. Real school change, unfortunately, is not an overnight affair.

Where does this leave NYC parents, teachers, and principals, all of whom are trying to make sense of what these measures mean? Bottom line: It's impossible to know what your A or your F means, because these grades are dominated by random error. Let's hope that the DOE heads back to the drawing board rather than continuing to defend the indefensible.

A key measure in both last year’s and this year’s student progress measure is the percentage of students making at least one year of progress in ELA and in Math, where a year of progress is defined as attaining the same or higher proficiency rating in 2008 in the subject as the student received in 2007, with a minimum proficiency rating of 2.00 in 2008. Three changes to this are new this year: (a) if a student scored at Level IV in both 2007 and 2008, that student is counted as making one year of progress, even if the proficiency rating declined from 2007 to 2008; (b) all students who were designated Special Education in 2007 receive a +0.2 addition to their 2007 proficiency rating before calculating whether a year of progress was achieved; and (c) any middle school student earning an 85 or higher on the Math A or Integrated Algebra Regents exam is automatically classified as making one year of progress in Math.

For elementary schools, the correlation between the peer horizon score for the percentage of students making at least one year of progress in ELA in 2007 and in 2008 is -.10, and the correlation for the citywide horizon score over the two years is -.09. There is essentially no stability over time in which elementary schools were successful in advancing their students a year in ELA achievement. The story is even more surprising at the K-8 and middle school levels; the K-8 peer horizon correlation is -.15, and citywide horizon correlation is -.16, whereas the middle school peer horizon correlation is -.24, and citywide horizon correlation is .01.

The stability in a school’s ability to advance its students a year of progress in Math in 2007 and 2008 is a bit higher, especially at the middle school level. For elementary schools, the correlation of the peer horizon score in 2007 and 2008 is .09, and for the citywide horizon score it’s .16. Among K-8 schools, the peer horizon score correlates -.03, and the citywide horizon score correlates .11. The greatest stability is seen at the middle school level, where the over-time correlation for the Math peer horizon score is .33, and for the citywide horizon score is .32.

We did the same kind of over-time calculation for the average change in proficiency scores from 2007 to 2008, which also involved the Special Education adjustment in 2008. Five of the six correlations for the average change in ELA proficiency, which range from -.16 to -.37, are negative and statistically significant. What this means is that the schools that were judged to be more effective in raising students’ ELA proficiency in the 2007 report card were significantly less successful in producing ELA gains in 2008 than the schools that were less effective in 2007.

At best, there is no correlation over time in the DOE’s reports of which schools are good at inducing growth in ELA achievement. At worst, the DOE’s system finds that the schools that were better than average in 2007 were actually worse than average in 2008.
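One mechanism that could contribute to negative correlations of this kind, offered as a sketch rather than as a diagnosis of the DOE's data: two consecutive gain scores share the middle year's test. If that middle score is high by chance, the first gain looks larger and the second gain looks smaller. The simulation below assumes schools whose true performance never changes and whose observed scores differ from it only by independent random error. All names and numbers are invented for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def consecutive_gain_correlation(n_schools=5000, error_sd=1.0, seed=1):
    """Correlate (year2 - year1) with (year3 - year2) when only noise changes."""
    rng = random.Random(seed)
    gains_first, gains_second = [], []
    for _ in range(n_schools):
        true_score = rng.gauss(50.0, 10.0)  # fixed true performance
        # Three observed scores: the same true score plus fresh random error.
        year1, year2, year3 = (
            true_score + rng.gauss(0.0, error_sd) for _ in range(3)
        )
        gains_first.append(year2 - year1)
        gains_second.append(year3 - year2)
    return pearson_r(gains_first, gains_second)


print(round(consecutive_gain_correlation(), 2))
```

Under these assumptions the two gains have covariance equal to minus the error variance and each has variance equal to twice the error variance, so the theoretical correlation is -0.5. Real schools do differ in true growth, which would pull the figure toward zero or above.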

September 19, 2008

COWAbunga Award!

cowabunga-award.jpg
This week's COWAbunga Award goes to DoubleDown, who provided his take on NYC's Progress Reports. Readers outside of New York, listen closely - this system could very well become a model for the nation. Here's an excerpt:
You have to feel sorry for the brain trust working on [the Progress Reports]. Up until this week, if you told the American public that you had hired a group of really smart economists to develop some complicated statistical models that explain the entire universe, many people would have been very impressed. But that was before those really smart economists crashed the entire global economy.

(...) To be blunt, the results make no...sense to anyone. Schools shouldn't be jumping all over the place in terms of progress scores or achievement. That is simply not reality. A school that moved up should be proudly proclaiming how they spun gold from wet straw, but the principal says nothing changed. What strange statistical model is spitting out this nonsense?....recoding statistical noise into a letter grade does not an accountability system make.

Now that the results are out and the validity in question for a second year, critics are being told: "Statistics don't lie." But ordinary people know that companies shouldn't be allowed to borrow 30 times more than what they own so they can gamble the money on other stocks. You don't need a Ph.D. in economics to know what that smells like. New York's "Progress Reports" have the same foul stench. Maybe everyone will feel differently about it in 10 or 20 years when we have forgotten the damage done, but right now something about value-added models seems just a bit too...risky.

September 18, 2008

GothamSchools Geeks Out on Sampling Error!

Philissa Cramer totally geeks out over at GothamSchools, and posts a great figure showing that smaller schools were more likely to experience wild swings in their school grades. Head over and check it out.
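The pattern follows from basic sampling arithmetic. As a rough sketch, assume each student's result is an independent draw with the same spread; then the standard error of a school's average shrinks with the square root of the number of students tested. The spread used below is a made-up number, not an NYC figure.

```python
import math


def standard_error_of_mean(student_sd, n_students):
    """Standard error of a school's average, assuming independent students."""
    if n_students <= 0:
        raise ValueError("n_students must be positive")
    return student_sd / math.sqrt(n_students)


# Hypothetical spread of student-level results; not an NYC figure.
STUDENT_SD = 0.5
for n in (50, 200, 800):
    print(n, round(standard_error_of_mean(STUDENT_SD, n), 3))
```

Quadrupling the number of students only halves the standard error, so a school with 50 tested students has an average that bounces around four times as much as a school with 800.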

September 17, 2008

Between a Political Rock and a Statistical Hard Place

Some days, skoolboy feels bad for the hard-working folks in the New York City Department of Education. They’re caught between a political rock and a statistical hard place. The political rock is the New York State accountability system, which complies with No Child Left Behind’s requirements to test students annually in grades 3-8 in Mathematics and English Language Arts, and to classify students, based on their test scores, as either Not Meeting Learning Standards (Level I), Partially Meeting Learning Standards (Level II), Meeting Learning Standards (Level III), or Meeting Learning Standards with Distinction (Level IV), and then aggregate the performance of students, and subgroups of students, to assess the school’s progress toward the goal of 100% proficiency for all students by the year 2014. The mechanism for this is a series of grade-specific exams, with a broad (but arbitrary, as Dan Koretz explains in Measuring Up) standard-setting process that defines the scores on the exam that correspond to the four proficiency levels. Whatever a student’s scale score on the exam, he or she is classified into a particular proficiency level.

The statistical hard place is that the proficiency levels are only part of the story. The NYC DOE has found that the scale scores matter, such that a student whose scale score is halfway between the cutoffs for Level II and Level III, and therefore whose proficiency level is Level II, has a higher probability of graduating from high school on time than a student whose scale score is right at the cutoff for Level II. The scale scores have predictive validity—that is, they predict educational outcomes that we think of as important—but they don’t have the political currency of the proficiency levels specified by the state and the federal government.

There’s no evidence, to skoolboy’s knowledge, that achieving a proficiency level on NCLB-style exams has any predictive validity over and above the scale score on which it is based, and I’ll wager that it doesn’t. (Another regression discontinuity design study waiting to happen.)

Whether or not the state/NCLB proficiency levels matter, the NYC DOE is stuck. They have to pay homage to the state standards, even though their internal evidence shows that partial progress—“learning quite a bit,” in skoolboy’s terms—really does matter for students’ futures, and therefore is something that schools should be held accountable for.

And I don’t disagree. I would be comfortable (though not ecstatic) with school progress reports that used changes in scale scores to quantify how much students had learned from one year to the next, under two conditions: (a) if the exams were vertically linked, and (b) if the uncertainty in the estimates of school-level effects on the average change were taken into account. Neither of these conditions is met in the current New York City School Progress Reports.

Navigating the political rock and the statistical hard place is definitely a challenge, both rhetorically and in the construction of the School Progress Reports. Rhetorically, the DOE is obliged to argue that a student who is Level III in fourth grade and Level II in fifth grade has lost ground—that student has fallen off of the sharp Level III cliff—because the state and federal accountability metrics treat this as a sharp discontinuity. But as a practical matter, the student may not have fallen off a cliff; rather, she may be just a little bit lower on a gradual hill in fifth grade than we’d like, but still higher on the hill than she was in fourth grade--and the DOE’s internal analyses document that anyone who is higher on the hill is better off than someone lower.

What’s the DOE to do? Well, it could continue to escalate the rhetoric directed toward its critics. (I note with alarm that the DOE went from calling me by my blogging name “skoolboy” on Monday to calling me “Professor Pallas of Teachers College” on Wednesday—whose proclivity to giving A’s to all of his students will come as a surprise to many of them—what’s next? Examining my teeth?) Or it could speak honestly and openly about the challenge of incorporating political and technical realities into the School Progress Reports. I think readers know which path skoolboy recommends.

Guest Blogger Daniel Koretz on New York City's Progress Reports

Koretz.jpg
Daniel Koretz is a professor who teaches educational measurement at the Harvard Graduate School of Education. He is the author of Measuring Up: What Educational Testing Really Tells Us. Below, he weighs in on the NYC Progress Reports that were released yesterday.

eduwonkette: One of the key points of your book is that test scores alone are insufficient to evaluate a teacher, a school, or an educational program. Yesterday, the New York City Department of Education released its Progress Reports, which grade each school on an A-F scale. 60 percent of the grade is based on year-to-year growth and 25 percent is based on proficiency, so 85 percent of the grade is based on test scores. Do you have any advice to New Yorkers about how to use - or not to use - this information to make sense of how their schools are doing?

Koretz: This is a more complicated question in New York City than in many places because of the complexity of the Progress Reports. So let’s break this into two parts: first, what should people make of scores, including the scores New York released a few weeks ago, and second, what additional considerations should New Yorkers keep in mind in interpreting the Progress Reports?

In the ideal world, where tests are used appropriately, I give parents and others the same warning that people in the testing field have been offering (to little avail) for more than half a century: test scores give you a valuable but limited picture of how kids in a school perform. There are many important aspects of schooling that we do not measure with achievement tests, and even for the domains we do measure—say, mathematics—we test only part of what matters. And test scores only describe performance; they don’t explain it. Decades of research has repeatedly confirmed that many factors other than school quality, such as parental education, affect achievement and test scores. Therefore, schools can be either considerably better or considerably worse than their scores, taken alone, would suggest.

However, there is another complication: when educators are under intense pressure to raise scores, high scores and big increases in scores become suspect. Scores can become seriously inflated—that is, they can increase substantially more than actual student learning. This remains controversial in the education policy world, but it should not be, because the evidence is clear, and similar corruption of accountability measures has been found in a wide variety of different economic and policy areas (so widely that it goes by the name of “Campbell’s Law”). High scores or big gains can indicate either good news or inflation, and in the absence of other data, it is often not possible to distinguish one from the other. As you know, this was a big issue in New York City this year, in part because some of the gains, such as the increase in the proportion at Levels 3-4 in 8th grade math, were remarkably large.

New York City is a special case. It is always necessary to reduce the array of data from a test to some sort of indicators, and NYC has developed its own, called the Progress Reports, which assign schools one of five grades, A through F. My advice to New Yorkers is to pay attention to the information that goes into creating the Progress Reports but to ignore the letter grades and to push for improvements to the evaluation system.

The method for creating Progress Reports is baroque, and it is hard to pick which issues to highlight in a short space. The biggest problems, in my opinion, lie in the estimation of student progress, which constitutes 60% of the grade. The basic idea is that a student’s performance on this year’s test is compared to her performance in the previous grade, and the school gets credit for the change. It sounds simple and logical, but the devil is in the details. (For a non-technical overview of the issues in using value-added models to evaluate teachers and schools, see “A Measured Approach”.)

To keep this reasonably brief, I’ll focus on three problems. First, the tests are not appropriate for this purpose. skoolboy made reference to part of this problem in a posting on your blog. To be used this way, tests in adjacent grades should be constructed in specific ways, and the results have to be placed on a single scale (a process called vertical linking). Otherwise, one has no way of knowing whether, for example, a student who gets the same score in grades 4 and 5 improved, lost ground, or treaded water. The tests used in New York were not constructed for this purpose, and the scale that NYC has layered on top of the system for this purpose is not up to the task.

And that points to the second problem, which again skoolboy noted: the entire system hinges on the assumption that one unit of progress by student A means the same amount of improvement in learning as one unit by student B. This is what is called technically an interval scale, meaning that a given interval or difference means the same thing at any level. Temperature is an interval scale: the change from 40 to 50 degrees signifies the same increase in energy as the change from 150 to 160. There is no reason to believe that the scale used in the Progress Reports is even a reasonable approximation to an interval scale. It starts with the performance standards, which are themselves arbitrary divisions and cannot be assumed to be equal distances apart. The NYC system assigns to these standards new scores that nonetheless assume that the standards are equidistant—so, for example, a school gets the same credit for moving a student from Level 1 to Level 2 as for moving a student from Level 2 to Level 3. Moreover, the NYC system assumes that a student who maintains the same level on this scale has made “a year’s worth of progress.” That assumption is also unwarranted, because standards are set separately by grade, and there is no reason to believe that a given standard, say, Level 3, means a comparable level of performance in adjacent grades. (There is in fact some evidence to the contrary.)

The result is that there is no reason at all to trust that two equally effective schools, one serving higher achieving students than another, will get similar Progress Report grades. Moreover, even within a school, two students who are in fact making identical progress may seem quite different by the city’s measure. There may be reasons for policymakers to give more credit for progress with some students than for progress with others, but if one does that, one no longer has a straightforward, comparable measure of student progress.

And finally, there is the problem of error. People working on value-added models have warned for years that the results from a single year are highly error-prone, particularly for small groups. That seems to be exactly what the NYC results show: far more instability from one year to the next than could credibly reflect true changes in performance. Mayor Bloomberg was quoted in the New York Times on September 17 as saying, “Not a single school failed again. That’s exactly the reason to have grades…It’s working.” This optimistic interpretation does not seem warranted to me. The graph below shows the 2008 letter grades of all schools that received a grade of F in 2007. It strains credulity to believe that if these schools were really “failing” last year, three-fourths of them improved so markedly in a mere 12 months that they deserve grades of A or B. (The proportion of 2007 A schools that remained As was much higher, about 57 percent, but that was partly because grades overall increased sharply.) This instability is sampling error and measurement error at work. It does not make sense for parents to choose schools, or for policymakers to praise or berate schools, for a rating that is so strongly influenced by error.

We should give NYC its due. The Progress Reports are commendable in two respects: considering non-test measures of school climate, and trying to focus on growth. Unfortunately, the former get very little weight, and the growth measures are not yet ready for prime time.

2008 Letter Grades of Schools that Received an F Grade in 2007

NYC%20F%20schools.png

NYC Progress Report Chutes and Ladders!

A week ago, skoolboy encouraged readers to predict schools' upward and downward grade mobility. Here's how that shook out. When 26% of elementary and middle schools that received Fs last year - 9 schools - climb from an F to an A, it does make you wonder what exactly it is that we are measuring. Likewise, 26 schools cascaded from As or Bs to Ds or Fs. Readers, stare into the table and tell me what you see...

grade%202007%20and%202007.jpg

September 16, 2008

In NYC, More F Schools than A Schools in Good Standing with NCLB

Some of you have asked what fraction of NYC schools receiving each Progress Report grade are in good standing with NCLB. As a refresher, NCLB labels schools in need of improvement based on overall proficiency. NYC's system is based 60% on year-to-year growth, 25% on proficiency, 5% on attendance, and 10% on surveys.
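To see how heavily the progress category dominates, here is a sketch of a weighted composite using only the four weights stated above. It assumes the category scores sit on a common 0-100 scale, which is a simplification; the DOE's actual scoring, with its peer and citywide horizons, is not reproduced here, and both example schools are hypothetical.

```python
# Weights as stated in the post. The common 0-100 scale for category
# scores is an assumption made for illustration.
WEIGHTS = {"progress": 0.60, "performance": 0.25, "attendance": 0.05, "surveys": 0.10}


def composite_score(category_scores):
    """Weighted average of category scores using the stated weights."""
    missing = set(WEIGHTS) - set(category_scores)
    if missing:
        raise KeyError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)


# Two hypothetical schools, identical except for their progress scores.
school_a = {"progress": 80.0, "performance": 60.0, "attendance": 90.0, "surveys": 70.0}
school_b = dict(school_a, progress=40.0)
print(composite_score(school_a), composite_score(school_b))
```

In this toy setup a 40-point difference in the progress category moves the composite by 24 points, while the same difference in proficiency would move it by only 10.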

Given these differences, perhaps you won't be surprised to find that a higher fraction of F schools are in good NCLB standing than are A schools:

* 74% of A schools are in good standing with NCLB

* 67% of B schools are in good standing with NCLB

* 69% of C schools are in good standing with NCLB

* 48% of D schools are in good standing with NCLB

* 89% of F schools are in good standing with NCLB

What if we just look at the "performance grade", aka the proficiency grade, that each school received, and see how that maps on to NCLB good standing? Recall that this year, schools also were given separate grades for the performance, progress, and environment categories. I guess the peculiar results below are a function of the fact that schools are being compared to peer groups, but here's what I've got:

* 86% of A schools (graded on proficiency) are in good standing with NCLB

* 60% of B schools are in good standing with NCLB

* 60% of C schools are in good standing with NCLB

* 51% of D schools are in good standing with NCLB

* 75% of F schools are in good standing with NCLB

All Progress Reports, All the Time

The new NYC Progress Reports are out, and I'm busy analyzing the data now. Have ideas about what I should look at? Leave a comment below.

Irreconcilable Differences: Why NYC’s Surveys Provide a Misleading Portrait of School Quality

eduwonkette-NYC.jpg
My heart went out to Charlie Gibson last week, as he stared into those doe eyes that will not blink and realized that he could not wrangle a single straight answer out of Miss Wasilla.

So I can only imagine how analysts at the NYC Department of Education felt when they sat down to analyze the data from student, parent, and teacher surveys this year. It turns out that you get as much valid and reliable information out of these surveys as Gibson managed to pull out of Sarah Palin.

The problem is a very simple – and very predictable – one. Survey responses constitute 10% of the Progress Report Grades, and schools face very real consequences if they receive a poor grade. We expect that, faced with such pressures, the adults who fully understand these consequences – parents and teachers – will provide a rosier picture of the school than truly exists.

If all schools did this equally, the inflation of survey responses would not be a problem; we could still rank schools by their perceptions of safety, engagement, or what have you. We would not have a clean measure of how safe a school is overall, but we would know how safe it was relative to other schools – a central objective of the grading system.

Alas, schools face different incentives to inflate their survey responses. If you’re a teacher filling out a survey in an F school, you know that your school could very well be closed if its grade doesn’t improve. Compared to a teacher filling out a survey in an A school, you’re more likely to put on a happy face.

One way to get at this problem is to compare changes in the teacher responses to the survey with changes in the student responses. We know that students and teachers don’t see eye-to-eye about school conditions, so we don’t expect them to provide comparable assessments of the school in any given year. But if teachers report improvement at a rate that far outpaces the improvement reported by the students, and this happens more in D and F schools than A and B schools, we have pretty good evidence that teachers have inflated their responses.

To get a handle on survey inflation, I did a basic calculation for each of the 4 survey domains: safety, communication, academic expectations, and engagement. Using the example of safety, I calculated:

(2008 Teacher Survey Score for Safety – 2007 Teacher Survey Score for Safety) –
(2008 Student Survey Score for Safety – 2007 Student Survey Score for Safety)


At schools that have positive scores on this measure, teachers report a pace of improvement that outpaces the improvement that students report. Kids are often the best check on us wily adults, and it turns out that they function as a first-rate BS detector in this case. I should also note that students may be pressured to inflate their scores, so if anything, the difference between the teacher and student changes is a lower bound measure of survey inflation.
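The calculation above can be sketched in a few lines. The function and the dictionary keys are hypothetical names chosen for this illustration; the survey scores in the usage example are invented.

```python
def survey_inflation(teacher_2007, teacher_2008, student_2007, student_2008):
    """Change in the teacher score minus change in the student score."""
    teacher_change = teacher_2008 - teacher_2007
    student_change = student_2008 - student_2007
    return teacher_change - student_change


def mean_inflation_by_grade(schools):
    """Average the measure within each Progress Report letter grade.

    `schools` is a list of dicts with hypothetical keys: 'grade',
    'teacher_2007', 'teacher_2008', 'student_2007', 'student_2008'.
    """
    totals, counts = {}, {}
    for s in schools:
        value = survey_inflation(
            s["teacher_2007"], s["teacher_2008"], s["student_2007"], s["student_2008"]
        )
        totals[s["grade"]] = totals.get(s["grade"], 0.0) + value
        counts[s["grade"]] = counts.get(s["grade"], 0) + 1
    return {grade: totals[grade] / counts[grade] for grade in totals}


# Invented example: teachers report a 1.5-point rise, students a 0.5-point rise.
print(survey_inflation(6.0, 7.5, 6.0, 6.5))
```

A positive result means teachers reported more improvement than students did; a result near zero means the two groups saw about the same change.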

The first graph below reports the average of these differences for the safety measure for high schools receiving A to F grades. At A schools, students and teachers saw improvement happening equally – there is almost no difference between the change in teacher scores and the change in student scores. At F schools, there are tremendous differences between the rate of improvement reported by teachers and students.

hs%20safety.jpg

The teacher-student discrepancy exists for every measure on the survey. Next, let’s look at the engagement measure for high schools.

hs%20engagement.jpg

Bottom line: survey inflation exists across the board, but is worst at D and F schools. If you’d like figures for the other domains or school levels, feel free to email me. The irony, of course, is that instead of having better information about how things are going in NYC schools, incorporating the surveys in the grading scheme has fundamentally corrupted this measure.

September 14, 2008

Let the Spin Begin

top.gif

Suppose that your fourth-grader takes a state test that shows that she understands the associative property of multiplication, can multiply two-digit numbers by two-digit numbers, and can find the perimeter of a polygon by adding up the length of the sides. A year later, as a fifth-grader, she takes a test that shows that she can compare fractions and decimals using <, > or =; identify the factors of a given number; simplify fractions to their lowest terms; and knows that the sum of the interior angles of a quadrilateral is 360 degrees—but she cannot yet create algebraic or geometric patterns using concrete objects or visual drawings (e.g., rotate and shade geometric shapes). Would you say that your child had lost ground in proficiency, or actually gone backward?

Jim Liebman would. Liebman, the Columbia University law professor on leave as Chief Accountability Officer at the New York City Department of Education, is quoted and paraphrased in an article by Jim Dwyer in Saturday’s New York Times on the F grade that P.S. 8 in Brooklyn Heights will receive in this year’s School Progress Reports—a grade that many are finding hard to believe, given that 80% of the students tested in the school are judged proficient in math, and two-thirds are judged proficient in English Language Arts. Doubly embarrassing, in that Chancellor Joel Klein and Mayor Mike Bloomberg have publicly declared the school to be successful and worthy of emulation.

So the spinmeisters are out, and the spin here is justifying the grade of F by arguing that the children in P.S. 8 are going backward. “You drop them off at the beginning of the year, and on average, by the end of the year, your child lost ground in proficiency,” Dwyer quotes Liebman as saying. “Where was the child last year, and where is the child this year?” Liebman asked. “You’re comparing them to themselves.”

A gentle reminder to Mr. Liebman, who was hired in January 2006: the state math and ELA tests that children take, which are the primary basis for assigning these lovely letter grades, are not vertically equated. (See skoolboy's testing primer here.) This means that there is no basis for comparing performance on the fourth-grade test with performance on the fifth-grade test. For each test, there is a subjective judgment about what level of performance constitutes proficiency, but the tests are independent. There is no basis for claiming that children are going backward; there’s no justification for claiming that a child “lost ground in proficiency,” since proficiency doesn’t exist in the abstract, but rather in grade-specific skills; and the children are not being compared to themselves, but rather their location in the distribution of children’s performance in one year is being compared to their location in the distribution of children’s performance the following year.

Perhaps Jim Liebman simply misspoke, as perhaps did Chancellor Joel Klein when he referred to statistical significance as “playing something of a game.” Such missteps might arise from the tremendous pressure to justify a particular high-stakes evaluation of a school when there are multiple sources of information about school performance that point in different directions—NCLB status, achievement levels, gains, school quality reviews, not to mention the public pronouncements of Liebman’s boss, and his boss’s boss.

There’s nothing wrong, in skoolboy’s view, in looking at students’ achievement growth as one of several criteria for judging how well a school is doing in relation to other schools. But I would never think of using year-to-year changes in proficiency levels on just two tests as the primary basis for evaluating a school’s performance. And neither would most people who study testing and assessment for a living.

September 12, 2008

Cool People You Should Know: Doug Downey

Doug-Downey.jpg
To many observers of public education, there is no doubt about which schools are failing - it's the schools with low rates of students passing state tests, stupid!

Of course, this assumes that students' achievement is a direct measure of school quality. "Yet we know that this assumption is wrong....It follows that a valid system of school evaluation must separate school effects from nonschool effects on children's achievement and learning" writes Doug Downey, a cool Ohio State sociologist of education you should know, in his recent paper (in collaboration with Paul von Hippel and Melanie Hughes), "Are 'Failing' Schools Really Failing?"

Analyzing data from the Early Childhood Longitudinal Study - Kindergarten Cohort, a national sample of 21,000 kindergarteners who were then followed through 5th grade, Downey and colleagues set out to isolate the effects of schools on student learning. The ECLS data are uniquely suited for this task because the study evaluated students in the fall and spring of kindergarten, and again in the fall and spring of first grade. It turns out that summers - a time when students are only affected by non-school influences - are the key to teasing apart school and nonschool factors.

Downey and colleagues look at schools' effectiveness in four different ways. First, they examine NCLB's method - overall test score levels. They then turn to 12-month learning rates; think growth models, which measure test score growth, for example, between a test given in April 2007 and a test given in April 2008. They contrast those rates with 9-month learning rates; imagine a test given in September, and then again in May. Finally, they introduce a measure called impact, which is the difference between the school year and summer learning rate.
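To make the four measures concrete, here is a minimal sketch, assuming a school is tested at the end of kindergarten, the start of first grade, and the end of first grade, with a 3-month summer and a 9-month school year. The function names and every score are invented for illustration; none of this is ECLS data or the authors' actual estimation procedure, which is considerably more involved.

```python
# Hypothetical illustration: the scores and month counts below are invented,
# and this is not the estimation method used in the paper.

def learning_rate(start_score, end_score, months):
    """Average test-score points gained per month between two test dates."""
    return (end_score - start_score) / months


def school_measures(spring_k, fall_1, spring_1,
                    school_months=9.0, summer_months=3.0):
    """Return the four measures described above for one school.

    spring_k  : mean score at the end of kindergarten
    fall_1    : mean score at the start of first grade
    spring_1  : mean score at the end of first grade
    """
    summer_rate = learning_rate(spring_k, fall_1, summer_months)
    nine_month_rate = learning_rate(fall_1, spring_1, school_months)
    twelve_month_rate = learning_rate(
        spring_k, spring_1, school_months + summer_months
    )
    return {
        "achievement": spring_1,                  # test score level
        "twelve_month_rate": twelve_month_rate,   # spring-to-spring growth
        "nine_month_rate": nine_month_rate,       # fall-to-spring growth
        "impact": nine_month_rate - summer_rate,  # school year minus summer
    }


# Two made-up schools with identical school-year learning rates.
advantaged = school_measures(spring_k=50, fall_1=53, spring_1=71)
disadvantaged = school_measures(spring_k=30, fall_1=27, spring_1=45)

for name, result in [("advantaged", advantaged),
                     ("disadvantaged", disadvantaged)]:
    print(name, result)
```

In this toy example the second school looks worse on achievement and on 12-month growth, ties on 9-month growth, and comes out ahead on impact, because its students lose ground over the summer while the first school's students keep gaining.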

"Impact" is attractive because it doesn't require us to measure and statistically control for all of the different aspects of children's nonschool environments that may affect school success, as do cardiac surgery report cards. It captures what we need to know about students' out-of-school environments without bogging us down in the methodological and political problems associated with introducing these controls. And it helps us adjust for "soft" factors like innate student motivation, which are difficult to measure and control for. Moreover, it holds schools harmless for what happens to their students over the summer, which currently serves as a confounding factor in growth models.

What percent of schools performing in the bottom 20% on overall achievement are actually in the bottom 20% on measures of impact and learning? Less than half! High-achieving schools are concentrated in more affluent communities, but "high impact" schools exist across the socioeconomic spectrum. And the opposite is true, too: there are plenty of schools with good test scores that are skating by simply because they had advantaged kids to begin with.

What does this all mean for NCLB? Downey and colleagues put it like this:
Our results raise serious concerns about the current methods that are used to hold schools accountable for their students' achievement levels. Because achievement-based evaluation is biased against schools that serve the disadvantaged, evaluating schools on the basis of achievement may actually undermine the NCLB goal of reducing racial/ethnic and socioeconomic gaps in performance. If schools that serve the disadvantaged are evaluated on a biased scale, their teachers and administrators may respond like workers in other industries when they are evaluated unfairly - with frustration, reduced effort, and attrition. Under a fair system, a school's chances of receiving a high mark should not depend on the kinds of students the school happens to serve.
Crystal clear, creative thinking is the distinguishing feature of Downey's work - see, for example, his paper on school effects on child obesity, or his paper asking if schools are "the great equalizer."

Wonks can rest a little easier tonight with the knowledge that Downey's now turned his attention to NCLB.

Schools Restructuring under NCLB: Blow ‘em up Good?

This morning, the Center for Education Policy in Washington, DC is issuing the latest in a series of state-level reports on the fate of schools restructuring under NCLB policy. Today’s report, authored by Brenda Neuman-Sheldon (a one-time student of skoolboy’s, but I hear that she’s back on solid food), examines restructuring schools in Maryland. In 2007-08, Maryland had 38 schools in restructuring planning, a huge increase over the four schools the preceding year, and 64 schools in restructuring implementation, a 7% decline from the preceding school year. The restructuring schools are concentrated in a small number of Maryland’s 24 school districts, with 61% of the restructuring schools in Baltimore City, and an additional 30% in Prince George’s County, which adjoins Washington, DC. This concentration has stretched the capacity of the state and these districts to support restructuring planning and implementation. Prince George’s County, for example, soared from one school in restructuring planning in 2006-07 to 21 in 2007-08.

Neuman-Sheldon identifies a major shift in the form that restructuring schools in Maryland is taking. Whereas 58% of the schools in restructuring implementation in 2007-08 relied primarily on the appointment of a school “turnaround specialist” as the engine of restructuring (already a decline from the 73% using this option in 2005-06), all of the schools in restructuring planning that had submitted a plan at the time the report was written were proposing some form of “zero-based staffing”—i.e., replacing most or all of the staff in the school or asking all staff to reapply for their positions. It’s the neutron bomb theory of school reform!

But is it a good theory? That remains to be seen. What mechanism will bring highly-qualified teachers to these failing schools? Where will the tenured teachers who leave the schools go? In schools that replace only some of their staff, how will decisions about who stays and who leaves be made?

Beyond these logistical questions, though, lies another fundamental challenge: will changing the staffing—including the principals, who, Neuman-Sheldon reports, are often surprised to learn that when they select zero-based staffing as an option, they’re placing their own jobs on the line—fundamentally alter the context for teaching and learning in the school, when other powerful forces shaping teaching and learning aren’t changing at all?

September 11, 2008

COWAbunga Award!

This week's COWAbunga Award goes to two comments that explain why medicine and education have followed very different paths when it comes to accountability. The first comment is from eiela, a teacher librarian:
I think the reason we don't want to inject the idea that student achievement is based partly on what [students] come to school with (parent support, poverty rates, etc.) into the NCLB debate is because it comes too close to admitting that our public education system doesn't help everyone equally. And that education does give everyone the same advantages is one of our cherished public ideals....We don't want to admit that there are problems that are too big for education as it exists right now to fix.

I've often wished that if I am going to be held so accountable for student performance that we had a boarding school system, so I could make sure my students had a quiet place to do homework, a good dinner and breakfast, etc....I like the idea of value-added assessments; we get value-added scores for each classroom teacher in my state. I wish that NCLB took those scores into account....I know one year, our value-added scores were great, yet we still didn't make AYP because our students were so far behind to begin with. It's very demoralizing to be labeled in the news as a failing school when you've made so much progress based on where the students started.
The second winner is Erin Johnson, who, in a series of comments, made compelling arguments about the differences in the evidence bases for educational and medical practice. Read them all here, and here's a tasty morsel from one of them:
The development of medicine and education was not random. Both were a function of very specific decisions made by key opinion leaders and laws passed both on the state and federal levels....We take for granted the scientific, evidentiary basis of our medical system, but it was not pre-ordained to be so.

(Un) Heartbreaking Links of Staggering Genius

1) Once Upon A School: Dave Eggers, author of A Heartbreaking Work of Staggering Genius, was awarded the TED Prize to help him fulfill this wish:
I wish that you -- you personally and every creative individual and organization you know -- will find a way to directly engage with a public school in your area, and that you'll then tell the story of how you got involved, so that within a year we have 1,000 examples of innovative public-private partnerships.
Now there's a website called "Once Upon A School" that's tracking project ideas for engaging in local schools. If you're looking for a little shot of edu-Red Bull this week, check them out.

2) The Inalienable Right to Catch and Transmit Potentially Fatal Infections?: Over at Dangerously Irrelevant, Scott McLeod asks about the delicate balance between parents' individual rights to deny vaccinations and social responsibility to protect other kids. Is it going to take a highly publicized outbreak of bizarre diseases for folks to wake up on this public health no-brainer?

3) Heckman on Our Minds: Public School Insights, a blog dedicated to sharing what's working in public schools, posts about Jim Heckman, the Nobel Prize-winning economist who has been advising the Obama campaign about early childhood ed issues. Also in early childhood ed news, Steve Barnett, via the National Institute for Early Education Research, releases a new report, Preschool Education and its Lasting Effects.

4) And Speaking of Geniuses: Fall semester means an explosion of skinny jeans on campus, as well as the return of Debbie Meier and Diane Ravitch to the blogosphere - just in time to kick around ideas for the next president. Take a look at Meier's advice to the next president, as well as Ravitch's insights on the changing politics of education.

5) Charles Murray - The Best Proof that You Don't Get Wiser with Age?: Stay tuned for my review of the book, but in the meantime, Karin Chenoweth hands the good sir's tushie to him in this trenchant critique.

6) Harold and Maude Strike Back: Michele McNeil, who has been doing a bang-up job covering the election, reports on the McCain sex ed ad. And I thought the true meaning of nuts had been thoroughly fleshed out over the last eight years.

7) Core Carnival Knowledge: Big props to Robert Pondiscio, who put together a fabulous Carnival of Education this week.

September 10, 2008

Obama-Biden on the New Report Cards

skoolboy doesn’t fancy himself a particularly political creature, although some readers would likely argue that I’m kidding myself, in that blogging is an inherently political activity. In any event, I haven’t chosen to do a close analysis of the positions or proposed policies of the finalists in our Presidential derby. I’ll make a brief exception today, not to make political hay, but rather to try to illuminate an enduring sociological challenge.

Yesterday, Barack Obama issued a new plan for school reform, emphasizing choice and innovation, investments in technology, enhanced college readiness, incentives for improved classroom teaching, and heightened responsibility from parents and from the federal government. The last piece of this agenda calls for the creation of quarterly parent report cards to support individual learning plans. Press reports of this component of the Obama agenda conveyed the impression that such report cards would simply be a fancy repackaging of the periodic report cards that parents already receive itemizing how their children are doing in school. But the Obama plan has something more ambitious in mind, including “the concrete information [that parents] need to help improve their child’s performance each year and plan for post-high school education”:

  • Where their child is expected to perform at their grade level to be ready for high school graduation and post-high school education

  • Information about local afterschool, summer learning, tutoring, and/or mentoring programs that might provide additional assistance to students who have fallen behind and provide additional hands-on learning opportunities for students who excel in certain subject areas

  • Information about alternative public schooling options in the area that the student may be able to attend, and how those schools’ students are performing

  • Expected amount of savings a family should have for future college tuition and information about eligibility for federal and state tax credits, grants, and other financial assistance

Is more information inherently better than less information? No, skoolboy thinks, not if more information is overwhelming. This is a remarkably diverse set of objectives, and each of them would require at least a term paper’s worth of material to convey what’s important. Providing parents with the information necessary to enable them to choose between their child’s current school and alternatives? What’s the right metric here? Value-added models of school effects? I've seen highly-educated professionals struggle to understand them. Concrete information on how a child is expected to perform at the child’s grade level? You can find this on most state department of education websites, but it’s not something that can be summarized in a page or two.

The more serious problem, though, is the assumption that providing information in and of itself creates a logic for action. The available evidence calls into question both the inclination and the ability of parents to use information to make decisions regarding their children’s schooling. Moreover, these orientations and predispositions are linked to social class. skoolboy’s long-time colleague Annette Lareau, noted here as a cool person you should know, has written extensively about the differing childrearing and schooling practices of middle-class and working-class parents. Her analyses show that middle-class parents are predisposed to see family and school as connected, and to be proactive in seeking out and evaluating educational opportunities for their children. Working-class parents care just as much about their children’s education, but they see family and school as separate, and are less likely to intervene in what they view as the responsibility of the school.

Provision of this information, therefore, could have the unintended consequence of exacerbating social class differences in schooling. Middle-class parents may be better able to make sense of the information, and will be more prepared to act on it. Working-class parents may be overwhelmed by it, and will not necessarily know how to translate the information into concrete action steps. It wouldn’t be the first policy initiative to founder by assuming that everyone behaves like the middle class.

And finally: “quarterly”? Maybe that’s just rushed copyediting…

September 9, 2008

Lessons for No Child Left Behind from "No Cardiac Surgery Patient Left Behind"

New AYP numbers are out, folks. In California, only 48% of schools made AYP, and only 34% of middle schools did so. In Missouri, only about 40% of schools made AYP. Pick almost any state, and you'll see that there are soaring numbers of schools designated as "in need of improvement." With numbers like these, it's worth considering whether NCLB's measurement apparatus is accurately identifying "failing schools."

One way to get leverage on this question is to consider how other fields approach the issue of accountability. Doctor and hospital accountability for cardiac surgery - also the topic of a NYT commentary today - is instructive in this regard. Borrowing heavily from previous work, let me outline how state governments have approached doctor and hospital accountability in medicine. In subsequent posts this week, I'll write about the outcomes of medical accountability systems, as well as some of their unintended consequences.

Medicine makes use of what is known as “risk adjustment” to evaluate hospitals’ performance. Since the early 1990s, states have rated hospitals performing cardiac surgery in annual report cards. The idea is essentially the same as using test scores to evaluate schools’ performance. But rather than reporting hospitals’ raw mortality rates, states “risk adjust” these numbers to take patient severity into account. The idea is that hospitals caring for sicker patients should not be penalized because their patients were sicker to begin with.

In practice, what risk adjustment means is that mortality is predicted as a function of dozens of patient characteristics. These include a laundry list of medical conditions out of the hospital’s control that could affect a patient’s outcomes: the patient’s other health conditions, demographic factors, lifestyle choices (such as smoking), and disease severity. This prediction equation yields an “expected mortality rate”: the mortality rate that would be expected given the mix of patients treated at the hospital.

While the statistical methods vary from state to state, the crux of risk adjustment is a comparison of expected and observed mortality rates. In hospitals where the observed mortality rate exceeds the expected rate, patients fared worse than they should have. These “adjusted mortality rates” are then used to make apples-to-apples comparisons of hospital performance.
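Here is a minimal sketch of that expected-versus-observed comparison, assuming each patient already has a predicted probability of death from some risk model. The function names, patient risks, and death counts are all invented for illustration, and real state report cards use their own, more elaborate methods.

```python
# Hypothetical illustration: the per-patient risks and death counts are
# invented, and no particular state's methodology is reproduced here.

def expected_mortality_rate(predicted_risks):
    """Mean predicted probability of death across a hospital's patients."""
    return sum(predicted_risks) / len(predicted_risks)


def observed_mortality_rate(deaths, patients):
    """Share of the hospital's patients who actually died."""
    return deaths / patients


def observed_to_expected(deaths, predicted_risks):
    """Ratio above 1 means patients fared worse than the model predicted."""
    observed = observed_mortality_rate(deaths, len(predicted_risks))
    return observed / expected_mortality_rate(predicted_risks)


# Hospital A treats many high-risk patients; Hospital B treats low-risk ones.
hospital_a_risks = [0.10] * 50 + [0.02] * 50   # expected rate of about 6%
hospital_b_risks = [0.02] * 100                # expected rate of about 2%

ratio_a = observed_to_expected(deaths=5, predicted_risks=hospital_a_risks)
ratio_b = observed_to_expected(deaths=3, predicted_risks=hospital_b_risks)

print(f"Hospital A: raw rate 5%, observed/expected = {ratio_a:.2f}")
print(f"Hospital B: raw rate 3%, observed/expected = {ratio_b:.2f}")
```

Hospital A has the higher raw mortality rate but a ratio below 1, while Hospital B has the lower raw rate and a ratio above 1. That reversal is the whole point of adjusting for who walks in the door.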

Accountability systems in medicine go even further to reduce the chance that a good hospital is unfairly labeled. Hospitals vary widely in size, for example, and in small hospitals a few aberrant cases can significantly distort the mortality rate. So, in addition to the adjusted mortality rate, confidence intervals are reported to illustrate the uncertainty that stems from these differences in size. Only when these confidence intervals are taken into account are performance comparisons made between hospitals.
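To see why size matters, here is a rough sketch that puts a 95% interval around a raw mortality rate for a small and a large hospital. It uses the Wilson score interval for a proportion, which is one standard choice and an assumption on my part; the report cards described above attach intervals to adjusted rates using their own methods, and the hospital counts below are invented.

```python
import math

# Hypothetical illustration: a generic Wilson score interval on a raw
# proportion, not the interval any state actually publishes.

def wilson_interval(events, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion events/n."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Both hospitals have the same 5% observed mortality rate.
small_low, small_high = wilson_interval(events=2, n=40)
large_low, large_high = wilson_interval(events=50, n=1000)

print(f"Small hospital (40 cases):   {small_low:.3f} to {small_high:.3f}")
print(f"Large hospital (1000 cases): {large_low:.3f} to {large_high:.3f}")
```

The two point estimates are identical, but the small hospital's interval is several times wider, so a handful of unlucky cases could move it a long way in either direction.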

Contrast this approach with that used by the New York City Department of Education's progress reports, where "point estimates" are used to array schools on an A-F continuum with no regard for measurement error. Readers know well that your friendly neighborhood "statistical nut" has no beef with the use of sophisticated statistical methods to compare schools. But I would just ask that we have some humility about what these methods can and cannot do. (Sidenote: The only winners when we ignore these issues are educational researchers, who can then write regression discontinuity papers using these data. Thanks for the publications, Joel and Mike!)

And it's quite eye-opening to compare the language that state and federal governments use to explain their medical accountability systems with the rhetoric we hear in education. Consider this statement from the Department of Health and Human Services to explain the rationale behind risk adjustment:
The characteristics that Medicare patients bring with them when they arrive at a hospital with a heart attack or heart failure are not under the control of the hospital. However, some patient characteristics may make death more likely (increase the ‘risk’ of death), no matter where the patient is treated or how good the care is. … Therefore, when mortality rates are calculated for each hospital for a 12-month period, they are adjusted based on the unique mix of patients that hospital treated.
If you replace the word "hospital" with "school" above, you can imagine the reception this statement would receive in the educational accountability debate. Soft bigotry of low expectations, and you probably kill baby seals for fun, too.

Readers, why is the educational debate so different? Full disclosure: I will shamelessly appropriate your thoughts in my dissertation, which attempts to answer this question, and also establish the effects of each of these systems on race, gender, and socioeconomic inequalities in educational and health outcomes.

Grading skoolboy

What bloggers need, Michael Bloomberg prophesied last year, is a "wake-up call." Joel Klein agreed: "If you're not making progress, if your [posts] are not moving forward, then I don't think the [blog] is doing well." Jim Liebman couldn't have agreed more: "When you say, we’re going to hold you to the best that other [blogs] like you can do, all of a sudden, [there are] no more excuses."

skoolboy, as you all already know, is that pesky curvebreaker in your calculus class. An A+ for you, skoolboy, and a hearty thanks for relieving me from blogging for my conference/vacation.

And I'm a huge fan of skoolboy's report card contest - don't forget to enter! If you are lazy, you can just hit the diagonals (i.e. what percent of schools that received As will still receive As, what percent of schools that received Bs will still receive Bs, etc). As for prizes, I am still working on it. A pony? The right to choose Joel Klein's costume in this year's Halloween Parade?

Got ideas? Let me know.

September 7, 2008

Predicting the Near Future*

Sometime soon, with great fanfare, the New York City Department of Education will release this year’s School Progress Reports. (Word on the street is that schools already know their grades.) The School Progress Reports, for better or worse, are the centerpiece of the NYC accountability system. (skoolboy thinks for worse, but more on that later.)

The DOE has made a number of changes to the Progress Reports for this second iteration, and I think that eduwonkette had something to do with that (as did other critics and analysts outside of the Tweed inner circle.) We can expect to see separate letter grades for the three major dimensions on which the Progress Reports are based: school environment (including attendance, and parent, teacher and student surveys), student performance, and student progress. But the overall format appears to be unchanged: most of the grade is based on student progress on test scores, and such gains are not very reliable from one year to the next. There is, in skoolboy’s opinion, a false sense of precision conveyed by these letter grades, as they are based on components that are measured with error, but that measurement error is not reflected in how the grades are calculated. And I’m particularly annoyed at the misuse of social surveys for accountability purposes.

Nevertheless, the DOE is marching onward, and we’ll have this year’s grades to pore over in the near future. (And you can bet that eduwonkette will put on the green eyeshade for this, even though it clashes with her cape and mask.) How many schools will improve their grade from last year to this year? How many will fall? It’s time to make some predictions. What do you think, readers?

Here's a five-by-five table designed to show how this year’s grades are associated with last year’s grade. Each column represents last year’s grade, and each row represents a possible outcome for this year. The column percentages will add up to 100%. Try to fill in the blanks: What percentage of the schools that received A’s last year will receive an A this year? What percentage of A’s will decline to B’s? What fraction will fall further to C’s, D’s, and F’s? At the other end of the spectrum, what percentage of last year’s F’s will remain F’s? What percentage will climb out of the cellar to obtain a D? Will any make the leap from F to A?

crosstab.JPG

As a reminder, last year, about 23% of schools received an A; 38% received a B; 26% received a C; 8% received a D; and 4% (i.e., 53 schools) received an F.
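For readers who would rather fill in the table with a script, here is a minimal sketch of how the column percentages could be computed once both years' grades are in hand. The function names and the grade pairs are invented for illustration; they are not actual Progress Report data.

```python
from collections import Counter

# Hypothetical illustration: the grade pairs below are invented.
GRADES = ["A", "B", "C", "D", "F"]


def column_percentages(grade_pairs):
    """Build the five-by-five table from (last_year, this_year) pairs.

    Returns table[last_year][this_year] as a percentage, so each
    last-year column sums to 100 whenever it contains any schools.
    """
    counts = Counter(grade_pairs)
    table = {}
    for last in GRADES:
        column_total = sum(counts[(last, this)] for this in GRADES)
        table[last] = {
            this: (100.0 * counts[(last, this)] / column_total
                   if column_total else 0.0)
            for this in GRADES
        }
    return table


def diagonal(table):
    """Percent of schools in each column that kept the same grade."""
    return {grade: table[grade][grade] for grade in GRADES}


# One tuple per school: (last year's grade, this year's grade).
schools = [("A", "A"), ("A", "A"), ("A", "A"), ("A", "B"),
           ("B", "A"), ("B", "B"), ("C", "B"), ("F", "D"), ("F", "D")]

table = column_percentages(schools)
for this in GRADES:
    row = "  ".join(f"{table[last][this]:6.1f}" for last in GRADES)
    print(this, row)
print("Diagonal:", diagonal(table))
```

Each printed row is a possible grade this year and each column is last year's grade, matching the layout described above. The diagonal is the shortcut for lazy entrants: the share of schools whose grade did not change.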

A caveat: The DOE knows that the legitimacy of the School Progress Reports depends on the grades not being too volatile from year to year. If 75% of last year’s A’s became F’s this year, no one would take this scheme seriously. (And if schools that everyone views as exemplary or high-performing got middling grades, this too would call the scheme’s legitimacy into question. So don't expect Stuyvesant High School to get a C.) There may not be very much fluctuation from last year to this. You can be sure that the DOE has constructed this year’s scores so that there’s not too much instability from last year to this year.

But since we believe in incentives on this blog, the reader who comes closest to the actual association between last year and this year shall receive a prize to be selected by eduwonkette—and we know how creative she can be. Be sure to fill in all 25 blanks.

*Employees of Tweed Courthouse, KPMG Consulting, and the Parthenon Group are ineligible for this contest.

September 5, 2008

COWAbunga! Post-Convention Edition

No, there's no convention commentary here (or else skoolboy would have to shoot himself). This week’s “Comment of the Week Award,” also known as the COWAbunga Award, goes to NYC Educator, for a comment on yesterday’s Coffee Talk question about which big-city school district is the worst-managed. NYC Educator wrote:

I see the system in which I work on a daily basis, and I don't always see its reality reflected in the press--although they've made great strides over the last few years.
Really, when you're a teacher and you find blatantly preposterous statements in the NY Times, you have to wonder about the reporting from other cities. Who knows whether or not they're telling the truth, or whether they've sent anyone to find out what was really happening. Certainly it's easier to just ask City Hall what's going on and write whatever they tell you.

Big-city school districts are notorious for turning inward—transparency has never been their strong suit. A vigorous press is one of the ways that those in charge of these districts can be held to account for their responsibilities as public servants. This is one of the reasons why yesterday’s announcement that the New York Sun may be folding at the end of the month was so disappointing. skoolboy didn’t often agree with the editorial pages of the Sun, but I always felt better knowing that there was a venue for opinions different from mine to be aired and debated.

Even more importantly, though, the shutdown of the Sun would mean less daily beat reporting on New York City schools. eduwonkette has said repeatedly, and I agree wholeheartedly, that Sun reporter Elizabeth Green has been breaking important stories since she arrived on the scene last year, and it would be a shame if those of us with a stake in New York City schools were to be deprived of her investigative skills. (And yes, she wrote a feature on eduwonkette, and I’ve assisted her in a story or two, but the quality of her work speaks for itself.) Alexander Russo over at This Week in Education has also lamented the recent transitions of a number of well-regarded education writers to new positions that remove them from day-to-day beat reporting. Really, is it possible to have too much high-quality reporting on public education? Maybe … but we have a long way to go before that’s a serious question to consider.

In the meantime, the gap between the person-power devoted by school systems to transmitting messages about public schools to the public and the person-power available in an independent press to interpret these messages in a critical and thoughtful way for the public continues to widen. This, in skoolboy's view, does not serve the public interest.

skoolboy Throws Down the Class Size Gauntlet

Long-time followers of skoolboy (hi, Mom!) know that his first posts on eduwonkette’s blog were about class size. I argued for championing class size reduction as the right thing to do for children and for teachers—an argument grounded in the moral content of public schooling more so than in the technical consequences of class size reduction for standardized test scores.

Over the past year, I’ve observed a number of trends in the operation of big-city school districts. I’ll use New York City as my key example, because it’s my hometown, but the issues are sufficiently general to warrant posting here.

First, large districts are increasingly trying out innovative policies and practices for which there is little or no pre-existing research support. In New York City, the issuing of school report cards and conduct of school quality reviews are high-stakes evaluative practices for which there’s no prior evidence showing beneficial outcomes. In Washington, DC and New York City, school officials are offering incentives in the form of cash and cellphones to students in exchange for meeting academic performance targets. Some of these innovations have evaluations built into their design, whereas others do not.

Second, the arguments in support of these innovations often rely on claims that other innovations have not been successful. The best example is the juxtaposition of teacher quality and class size reduction. All kinds of policies regarding teachers—value-added assessment, merit pay, new recruitment strategies—are being justified on the grounds that teacher quality has much larger consequences for student achievement (read: test scores) than other policy choices, such as class size reduction.

Third, a lot of the claims about these effects take the form of “Research shows…”, which eduwonkette has derided as glib and poorly documented. There are, of course, important studies of both teacher quality effects and class size effects on student outcomes, but different studies yield different estimates of the magnitude of these effects. In part, this is because the impact of a particular innovative policy or practice is contingent on how the policy or practice is implemented and the features of the local organizational and institutional context for the new intervention. (We might expect, for example, that class size reduction would have different effects in classrooms with novice teachers than in classrooms with experienced teachers, or in classes that differ in the amount of prior student misbehavior.)

So when a policymaker confidently says that we should prefer innovations designed to influence teacher quality rather than class size reduction in a particular local setting—say, New York City—what’s the evidence for such a claim? Specifically, what does research tell us about the consequences of a well-designed class size reduction intervention in New York City?

The answer is, we don’t know—because there has never been a carefully-controlled study of class size reduction in New York City.

So at this point, skoolboy throws down the gauntlet: If we’re serious about data-driven decision-making, we should put our money where our mouth is, and demonstrate the relative effectiveness of class-size reduction and other policy initiatives. I call on the New York City Department of Education to carry out a well-designed study—ideally, a randomized experiment—of class size reduction in New York City public schools. View it as a small-scale pilot, as is true for some of the other initiatives, such as the student incentive plans, and look for some private funding (if it’s not feasible to draw on the operating budget). It will not be hard to pull together some of the leading researchers on class size to inform the design (and it wouldn’t kill anybody to have a couple of knowledgeable parents and teachers at the table too.) There's nearly a full year to get this off the ground for the start of the 2009-10 school year.

skoolboy is willing to live with the findings of a well-designed and well-implemented study of class size reduction in New York City, whether they support or refute claims about the efficacy of class size reduction. What I cannot support are claims that “research shows” that teacher quality is more important than class size reduction for student outcomes in New York City—or any other local education setting—in the absence of research that actually does show this.

September 4, 2008

Talk amongst Yourselves

skoolboy was having a spirited discussion with some of his students the other night, who have taught in school systems such as New York City, Detroit, LA, New Orleans, Washington, DC, Newark, Oakland, and elsewhere. The topic of the day: what's the worst-managed big-city school system--and why? Readers, what do you think? Discuss.

September 3, 2008

COWabungle

skoolboy has been worrying about how he was going to make this week's COWabunga award. There haven't been any comments to his posts! Hard to believe that such witty and incisive remarks would draw nary a "well done!" or "you're full of it, skoolboy!" Turns out that the website woes that Ed Week has endured the past few days include a disabling of the comment features here. The good people at Ed Week are now aware of this, and I look forward to hearing what readers have to say when the problem is resolved. It's not the first time that technology has kicked skoolboy in the butt, and I'm sure it won't be the last.

If you can't wait for the site to get fixed to get something off your chest, feel free to e-mail me at skoolboy2 (at) gmail.com.

The Chicago Boycott: Publicity Stunt or Principled Protest?

Yesterday, State Senator Rev. James Meeks engineered a boycott of the Chicago Public Schools, urging CPS students to travel with him to high-spending districts in Chicago’s suburban North Shore to try to register for school. The objective of the protest was to draw attention to inequalities in school funding in Illinois. Rev. Meeks sought to contrast the Chicago Public Schools, which annually spends a bit over $10,000 per student, with New Trier High School, which spends in the neighborhood of $18,000 per student. Publicity stunt, or principled protest?

Probably a bit of both, in skoolboy’s view. Illinois still relies heavily on local property taxes to fund its schools, and the variability in income and wealth across school districts means that different districts have differing capacities to raise money to support the schooling of the children who reside in them. State and federal funds are supposed to compensate for these inequalities, and they do, but not completely. The available evidence suggests that total per-pupil spending on students in the wealthiest 20% of school districts in Illinois is considerably higher than total per-pupil spending on students in the poorest 20% of school districts—a difference on the order of $2,500 per pupil per year.

The chart below shows these dynamics. skoolboy divided Illinois’ school districts into national deciles based on the median family income of the district in 2000. Districts with a median family income of $30,000 or lower were in the lowest decile, whereas those with a median family income of $66,000 or higher were in the highest decile. I looked at three different revenue streams: per-pupil local revenues; per-pupil state revenues; and per-pupil federal revenues. The sum of these three is reported as total per-pupil revenues. (I use revenues because they’re reported by source in the federal data, and expenditures are not. The data are also weighted by the number of students enrolled in each district, so smaller districts count less than larger ones. I also excluded districts in which the total per-pupil revenues exceeded $40,000 per year. The story is pretty much the same whether one looks at median family income or the percentage of children living in poverty within a district.)

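[Chart: Illinois per-pupil revenues (local, state, federal, and total) by national decile of district median family income]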

You can see just how strongly district median income and local per-pupil revenues are correlated in Illinois (r=.68). It’s also clear that state and federal funds flow disproportionately to lower-income districts. But when the three funding streams are added together, there is a moderate positive correlation (r=.38) between a district’s median family income and its total per-pupil revenue. Although federal and state revenues do help to close the gap between wealthy and poor districts in Illinois, the remaining inequalities in spending are not trivial.
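For readers who want to see how an enrollment-weighted correlation of this kind might be computed, here is a minimal Python sketch. It is not the analysis behind the chart: the district records are invented for illustration, and the names `District`, `weighted_mean`, and `weighted_corr` are hypothetical. The only details taken from the post are the three revenue streams, the weighting by enrollment, and the exclusion of districts whose total per-pupil revenues exceed $40,000.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class District:
    median_income: float   # median family income of the district
    enrollment: int        # number of students (used as the weight)
    local: float           # per-pupil local revenue
    state: float           # per-pupil state revenue
    federal: float         # per-pupil federal revenue

    @property
    def total(self) -> float:
        # Total per-pupil revenue is the sum of the three streams.
        return self.local + self.state + self.federal


def weighted_mean(values, weights):
    """Mean of `values`, with each value counted in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_corr(x, y, weights):
    """Pearson correlation between x and y using the given weights."""
    mean_x = weighted_mean(x, weights)
    mean_y = weighted_mean(y, weights)
    cov = weighted_mean([(a - mean_x) * (b - mean_y) for a, b in zip(x, y)], weights)
    var_x = weighted_mean([(a - mean_x) ** 2 for a in x], weights)
    var_y = weighted_mean([(b - mean_y) ** 2 for b in y], weights)
    return cov / sqrt(var_x * var_y)


# Made-up records, NOT real Illinois data.
districts = [
    District(28000, 5000, 3000.0, 6000.0, 1500.0),
    District(45000, 20000, 5000.0, 4500.0, 900.0),
    District(52000, 400000, 4800.0, 4200.0, 1100.0),
    District(70000, 3000, 11000.0, 1500.0, 300.0),
    District(95000, 4000, 16000.0, 1200.0, 250.0),
    District(60000, 150, 38000.0, 3000.0, 500.0),  # total above the cutoff
]

MAX_TOTAL = 40000  # exclusion threshold described in the post

# Drop districts whose total per-pupil revenue exceeds the cutoff.
kept = [d for d in districts if d.total <= MAX_TOTAL]

income = [d.median_income for d in kept]
weights = [d.enrollment for d in kept]

r_local = weighted_corr(income, [d.local for d in kept], weights)
r_total = weighted_corr(income, [d.total for d in kept], weights)

print(f"income vs. local revenue: r = {r_local:.2f}")
print(f"income vs. total revenue: r = {r_total:.2f}")
```

Weighting by enrollment means a very large district pulls the result toward its own values far more than a small one does, which is why the post notes that smaller districts count less than larger ones.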

Having said that, a comparison between the Chicago Public Schools and New Trier is fundamentally misleading. By skoolboy’s calculations, the average total per-pupil revenue in New Trier in 2006 was nearly $22,000, which is way, way above the average total per-pupil revenues for the 113 Illinois districts in the top national income decile ($11,400). Moreover, CPS is in the 5th median income decile, not one of the lowest, and its total per-pupil revenues are a tad above the average for the 87 Illinois districts in that decile.

Not all states show this pattern; some have been more successful in reducing the association between a school district’s total per-pupil spending and the characteristics of the students in that district. (For example, like Ken DeRosa, I find no correlation between per-pupil revenues and the percentage of children in poverty among Pennsylvania school districts. However, in Pennsylvania, as in Illinois, districts with higher median family incomes do spend more than those with lower median family incomes.) How schools are funded has a lot to do with the inequalities across districts, but funding formulae don’t change easily. Don't expect high-spending districts to be happy with policies that ask them either to spend less or to subsidize the spending on children in other districts.

September 2, 2008

A Brief Word on Nomenclature

Even though eduwonkette and skoolboy have been unmasked, skoolboy plans to continue to refer to himself in the third person. Why? If I did it at school, my students would laugh me out of the classroom. If I did it at home, my wife would kick my butt. So let me (er, skoolboy) have some fun, OK?

And for the record: both skoolboy and eduwonkette are lower case. Only proper nouns warrant capitalization, and it should be clear by now that skoolboy isn't very proper.

Back to (Home) School

It’s back to school! Today, more than one million schoolchildren will get up from the breakfast table, strap on a backpack, and trundle off to … the living room. Home schooling has been expanding rapidly over the course of this decade, according to data from the National Center for Education Statistics; homeschooled children represented approximately 2.2% of the student population in 2003. (The NCES definition of home schooling is children who are schooled at home instead of in a public or private school for at least part of their education, and whose part-time enrollment in public or private schools does not exceed 25 hours per week.) skoolboy hoped to be able to report some new evidence from the Parent and Family Involvement (PFI) module of the National Household Education Survey’s 2007 sample, but those data have not yet been released. Unfortunately, that means that the best available information is from 2003, the most recent wave that gathered information on the incidence of home schooling. Moreover, only 239 homeschooled children were included in the PFI module of the 2003 NHES, and thus our knowledge about their characteristics isn’t very precise.

There are a lot of misconceptions about home schooling, such as that homeschooled children lack normal social graces due to isolation from peers, or that they’re all very well prepared for college. skoolboy has seen no persuasive evidence of any problems of social adjustment among homeschooled children. The reality is that most homeschooled children and youth are not isolated from others; they often participate in homeschooling networks, may participate in extracurricular activities sponsored by public and private schools, and, for a significant fraction, are part of religious communities that provide opportunities for interaction with peers and adults. Homeschooled children and youth probably have fewer opportunities to interact with other youth with differing social characteristics than do students who attend public school; but you don't need to be a homeschooler to select yourself into settings where you engage almost exclusively with other people who are like you.

It’s challenging to assess the impact of home schooling on children who are home schooled, because families self-select into home schooling, and the kinds of families that choose to home school differ, on average, from those who do not. (And don’t hold your breath waiting for the definitive randomized experiment!) Homeschooled children are more likely to be white than Black or Hispanic; to be in a household with three or more children than one with fewer children; to live in a two-parent household with one parent in the labor force than in another configuration; and to have college-educated parents.

One of the most interesting features of home schooling, from skoolboy’s view, is its implications for defining teaching as a profession. For the most part, parents who home school their children are subject to very little oversight by the state. Contrast this with the rules for licensing teachers who teach in the public schools. Although eduwonkette pooh-poohs my “1950s” thinking about what defines a profession in the sociological sense, I think she would agree that the case for teaching as a profession is weakened when the state allows parents who have no formal training, and who are not accountable to other teachers for what they do, to teach.

In February, a California appeals court held that parents can be prosecuted for failing to ensure that either (a) their children attend a full-time public or private day school, or (b) their children are instructed by a tutor who holds a state credential for the child’s grade level. The case alarmed home schoolers and their supporters across the country. On rehearing, that same court ruled last month that “(1) California statutes permit home schooling as a species of private school education; and (2) the statutory permission to home school may constitutionally be overridden in order to protect the safety of a child who has been declared dependent.” The court made clear that it was not taking a stand on whether or not home schooling should be allowed, and blamed the California legislature for a lack of clear legislation on the issue. What counts as a threat to the safety of a dependent child is not inscribed in the law, but physical and sexual abuse (which were alleged in the California case) surely count; skoolboy’s guess is that mediocre instruction would not. (If it did, there’d be an awful lot of usual suspects to round up!)

The opinions expressed in eduwonkette are strictly those of the author and do not reflect the opinions or endorsement of Editorial Projects in Education, or any of its publications.
