Jack Hassard: NCTQ Assessment Study Flunks
Guest post by Jack Hassard, originally posted here.
In May, 2012, the National Council on Teacher Quality (NCTQ) issued a report entitled: What Teacher Education Programs Teach About K - 12 Assessment. Anthony Cody mentioned this study in a recent post entitled Payola Policy: NCTQ Prepares its Hit on Schools of Education.
The title intrigued me, so I went over to the NCTQ website, and read and studied the report which is about what education courses teach about assessment. This post is my review of the NCTQ study, and I hope after you finish reading the post you will realize how bogus reports like these are, given the quality of research that professors of education have been doing for decades. The study reviewed here would never have been published in a reputable journal of research in education, not only in the U.S., but in any other country in the world. I'll make it clear why I make this claim.
The National Council on Teacher Quality is a conservative think-tank that publishes reports on education that the council claims to be research studies in the field of education. The subhead for the group on their website is: A research and policy group working to ensure that every child has an effective teacher. The NCTQ has a staff of 18, an advisory group of 36 people, and a 13-member board of directors. The individuals on these various committees come from the corporate, educational, and consulting worlds. Some of the organizations represented include: Pearson Publishing, Teach Plus, KIPP Schools, the Hoover Foundation, American Enterprise Institute, Core Knowledge, Piton Foundation, Bill and Melinda Gates Foundation, Thomas Fordham Foundation, N.F.L. Players Association, B & D Consulting, Students First, Abell Foundation, Teach for America, New Schools Venture Fund, and others including a few universities and two public schools.
Many of these groups have worked very hard to denigrate teachers. They insist that the Common Core State Standards be adopted by all states, and they believe that teaching and learning should be data-driven and that student achievement data on high-stakes tests should be used to make decisions about student, teacher, and principal effectiveness, and school success.
According to Anthony Cody's post, the NCTQ was founded by the Thomas Fordham Institute, a conservative think-tank that publishes non-peer reviewed reports on education, and has an appalling opinion of teacher education institutions. And of course, the Thomas Fordham Foundation has membership on the NCTQ Board of Directors.
I've reviewed two reports previously published by the Thomas Fordham Institute. You can read my reviews of these reports here:
Framework of K-12 Science Education
In each report I found the methodology weak and the results misleading, and both reports were published as non-peer-reviewed research. The NCTQ study on assessment in teacher education uses the same methodology as the Fordham studies. Even with such poorly designed and unreliable data, think tanks get away with publishing their work in this fashion, and because of their financial resources and the identities of their funding agencies, they carry a good deal of clout. The Fordham Foundation and the NCTQ are two such organizations.
Is teacher education going to take a hit? Probably so. The NCTQ organization has the resources and the connections to make trouble for university teacher education programs. There is a movement to hold teacher education institutions accountable for the achievement test scores and gains that their graduates produce in their students once they begin teaching. As absurd as this sounds, the U.S. Secretary of Education is supportive of such an idea. Organizations such as NCTQ are on the accountability bandwagon, and carry weight at the policy level in education.
What Teacher Preparation Programs Teach About K-12 Assessment
This report was released in May, 2012, and according to the preface of the report, it provides information "on the preparation provided to teacher candidates from teacher training programs so that they can fully use assessment data to improve classroom instruction." The results reported in the final document were based on reading and analyzing 450 syllabi received from 98 institutions of higher education representing 180 teacher preparation programs.
Why This Study?: The First Disagreement
The purpose of the study was to find out what professors in teacher education are teaching their students about assessment so that when they begin teaching in the classroom they will be able to use assessment data to improve classroom instruction.
To rationalize their study, the NCTQ authors, Julie Greenberg and Kate Walsh, impress upon the reader the importance of assessment in today's schools, and the need for prospective teachers to know how to use assessment in their future classrooms. The authors say,
Effective instruction requires that teachers have a well grounded sense of student proficiency in order to make a daunting number of instructional decisions, such as making snap judgments in the midst of interactions with students, and planning lessons, be they for the next day, the next unit or the entire school year.
The purpose and rationale for this study was not based on previous research, or a review of the literature. The authors allotted less than one page to "previous research." Three references were cited. One of the references they cited is research done by Black and Wiliam, two of the leading assessment researchers in the field of education. The authors of the NCTQ study rejected the Black and Wiliam research, which is extensive, published in peer-reviewed journals, and highly cited, BECAUSE the NCTQ researchers said that the research was old (1998), and back then education research had weaker designs, and THEREFORE those studies are suspect. The researchers fail to tell the reader that Black and Wiliam are leading research proponents of formative assessment as a way to improve instruction and learning and have been publishing their research for decades. Even now. And if they were concerned that the studies were old (1998 or earlier), all they had to do was a Google search, or link to Dr. Black's or Dr. Wiliam's site for their research on assessment.
Greenberg and Walsh claim that education studies prior to 1998 used weaker designs. I did my Ph.D. work in the late 1960s in science education at Ohio State University, and let me tell you the research designs and methodologies that my colleagues in graduate school, and in the literature, used in their research were quite robust, not weak. The research in education is enormous, and it's a testament to the incompetence or bias of Greenberg and Walsh that they couldn't cite more than three studies.
The rationale for the NCTQ study is rooted in political and ideological beliefs about schooling rather than in previous research. For example they make this claim:
The evidence for the connection between using data to drive instruction and student performance is emerging, just as the practice of using data is emerging.
There is no previous research cited in their report that would support this claim, or help us see how their work is connected to other scholars. Instead they were cherry picking any research that would support their view, or downplaying or dismissing research that might have questioned their intentions.
The researchers were bent on showing that teacher educators weren't doing the job of teaching their students about assessment. And they undertook this task with the clarion call that there is new focus on "data driven instruction," and they cite examples of schools that prove that using data to drive instruction will reduce the achievement gap between low-income and high-income students. And sure enough, they cite two Broad Prize Winners, Charlotte-Mecklenburg Schools, NC, and Aldine Independent Schools, TX as examples. Teachers in these schools, according to Greenberg and Walsh, were trained in using data to drive instruction, and that is what led to such positive test results. And by the way, the Broad Foundation is a major funding source for NCTQ.
But here is the problem. Instead of trying to document or uncover what is being taught about assessment in teacher preparation programs, the researchers decided what they thought was important and then proceeded to compare what teacher preparation programs are doing with their own ideas. The researchers started with three categories of assessment that they thought ought to be included in teacher prep programs. Their three categories, which turned into their research questions, were as follows:
- How adequately does coursework address Assessment Literacy?
- How adequately does teacher preparation program coursework address Analytic Skills?
- How adequately does teacher preparation program coursework address Instructional Decision Making?
You might think this is legitimate. But it is not really helping with the inquiry. If the researchers were really interested in making a contribution to the field they would have approached the problem inductively. That is, they would have worked their way up from the syllabi to generalizations that they could make based on their observations of the syllabi.
The inductive method is a scientific method that educators have used for decades to make generalizations about educational phenomena, such as the types of questions that teachers ask during class. In this case data analysis would have been determined by multiple readings (of the syllabi) and interpretations of the raw data. Because the researchers would be looking for evidence of assessment in the syllabi, they would identify specific parts of the syllabi and label these parts to create categories (e.g. diagnostic methods, formative techniques, using computers to analyze test data, etc.). The point is that instead of starting with the three categories that the researchers at NCTQ thought should be taught in teacher preparation programs, they could have uncovered what the syllabi reveal about the teaching of assessment, and reported that data. There is more to say about this kind of research, such as teaching the researchers how to code, the use of computer programs to make the task easier, assessing the trustworthiness of the data, and reporting the findings.
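To make the inductive approach a little more concrete, here is a minimal sketch in Python of the bookkeeping involved. It assumes readers have already gone through the syllabi and labeled passages by hand; the syllabus ids, the excerpt list and the function name are all hypothetical and are not drawn from the NCTQ data. The code only tallies which categories emerged and in how many syllabi each appears.

```python
from collections import defaultdict

# Hypothetical coded excerpts: (syllabus id, label a reader assigned to a passage).
# The labels reuse the example categories mentioned above; the ids are invented.
coded_excerpts = [
    ("syllabus_01", "formative techniques"),
    ("syllabus_01", "diagnostic methods"),
    ("syllabus_02", "formative techniques"),
    ("syllabus_02", "formative techniques"),
    ("syllabus_03", "using computers to analyze test data"),
]


def syllabi_per_code(excerpts):
    """Map each emergent code to the set of syllabi in which it appears."""
    by_code = defaultdict(set)
    for syllabus_id, code in excerpts:
        by_code[code].add(syllabus_id)
    return dict(by_code)


categories = syllabi_per_code(coded_excerpts)

# Report the categories that came out of the reading, most widespread first.
for code, syllabi in sorted(categories.items(), key=lambda kv: (-len(kv[1]), kv[0])):
    print(f"{code}: {len(syllabi)} syllabi")
```

The categories here come out of the labeled passages rather than being fixed in advance, which is the difference between this approach and starting from three predetermined rubrics.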
According to the authors, and the opinions of their experts in the field, teacher education institutions have not figured out what knowledge a new teacher needs in order to enter a classroom with the ability to use data to improve classroom instruction. Their review of the literature certainly didn't lead them to this opinion. They have no basis for saying this, other than they can, and it supports the basis for their study.
The purpose of their study was to show that teacher preparation program coursework does not adequately prepare students to use assessment methods with K-12 students. Their study does not shed new light on the teaching of assessment in teacher prep, but it does shed light on how research can be biased from the start by asking questions based on your beliefs and ideologies, rather than research in the field.
The Study Sample: Found Wanting
According to the report, NCTQ obtained course syllabi from 180 teacher education programs in 98 institutions of higher education in 30 states. Using open records requests, the researchers used the course syllabi from the colleges in those states that responded first to their request. The "researchers" don't tell us if they actually contacted any of these institutions, tried to talk with any of the professors, or perhaps visited a few institutions so that they could interview not only professors, but students, and cooperating teachers with whom these institutions worked. None of this was done. Or at least it wasn't stated in their report. They got their data by requiring the institutions to hand over their course syllabi.
All of the data is embedded in the course syllabi they received. I don't know about you, but course syllabi vary from one course to another. Some professors create very detailed course syllabi, have well developed websites, and use course software such as Blackboard, textbooks, and online databases. All of these sources should have been examined if the NCTQ researchers wanted to get a full picture of these courses. This was not done.
They only looked at the paper they received. On the basis of this alone, the data that the researchers used for this report is incomplete. Syllabi are no doubt inconsistent in design and scope from one institution to the next. And relying solely on a paper syllabus does the research study an injustice, and makes the analysis and conclusions invalid.
The syllabi they selected had to have the word "assessment" in the course title, or it had to be a methods course, e.g. science methods. Other syllabi were thrown out, and not analyzed. Somehow, the researchers perused the course syllabi looking for "evidence" (or lack thereof) for assessment by reading the objectives, lectures (if they were included in the syllabi), assignments, textbooks and readings. Whether the researchers actually looked at the texts is unknown. They said they looked at the publishers' descriptions of the content of the required texts. And then they looked for "capstone projects," such as work samples or portfolios.
The sample that the researchers report in their study does NOT represent teacher preparation institutions in the U.S. It only represents the 98 institutions that responded to the open records request of NCTQ. Their "finding" cannot be generalized beyond the sample they studied. I don't trust the sample that they are basing their findings on. For one thing, there didn't seem to be an open two-way exchange between the NCTQ and the universities cited in the report. How do we know if the syllabi the researchers received are a true record of the course syllabi for these teacher prep institutions?
It's possible that NCTQ is making decisions for some universities based on one syllabus, and for others using multiple syllabi. We have no idea, however, because the researchers did not report this in their report. The universities in the study have been shortchanged, and even worse have been lumped together in a report that paints a negative picture of teacher preparation programs.
If you take a look at examples of teacher education programs, you'll find that if they are graduate level teacher preparation programs leading to a master's degree and certification, there are at least 10 courses that should be examined to evaluate the coursework. At the undergraduate level, there are as many as 19 courses that should be evaluated. The researchers at NCTQ failed to give a real picture of a university's teacher prep program.
The researchers overlaid three rubrics on the course syllabi to find out to what extent professors were teaching (1) assessment literacy (2) analytic skills and (3) instructional decision making. Assessment literacy meant searching the syllabi for key words including diagnostic, formative and summative. Analytic skills meant looking for key words such as dissect, describe or display data from assessment. Instructional decision-making meant looking for evidence that teacher educators helped their students use assessment data to drive instruction.
The rubrics were very simple using a Likert measuring scale from "0" to "4." A "0" meant there was no evidence, while a "4" meant the criterion was met to a high degree. For example to evaluate the syllabi for assessment literacy, the scale used was as follows (you can view all of the rubrics here):
0-There is no or almost no instruction or practice on the various types of assessment (inadequate)
1-Instruction on the various types of assessment is very limited and there is no or almost no practice (slightly adequate)
2-Case 1: The scope of Instruction on the various types of assessment is not comprehensive and practice is very limited to adequate. OR Case 2: The scope of instruction on the various types of assessment is comprehensive, but practice is very limited or limited.
3-The scope of instruction on various types of assessment is comprehensive and there is adequate practice.
4-The scope of instruction on the various types of assessment is comprehensive, including concepts such as "validity" and "reliability," and there is adequate practice (adequate)
The researchers rated each syllabus on three criteria and judged each criterion as inadequate (0) to adequate (4) using the 0 - 4 point scale. They were then able to average scores on the syllabi from each teacher education program. Presumably either the two researchers did the actual rating, or they hired raters. Whether they did or not, the researchers failed to provide data on inter-rater reliability. We have to question the trustworthiness of the data.
As mentioned above, NCTQ started with a biased set of questions, and used these questions to analyze the syllabi of the teacher prep coursework. At face value, their findings only reflect their own biases and their own assumptions about how and what teacher prep courses should include about assessment.
In this study, 455 courses were evaluated, anywhere from one to six courses per institution. The only average mentioned was that 2.5 courses per program reference assessment. This statistic is difficult to believe given our knowledge of teacher education courses. If they looked at methods courses, the chances are very high that assessment was included in these courses. I don't know if these researchers examined course syllabi for internships or student teaching, but all of these experiences would have included assessment strategies as part of the experience. So you have to wonder about the validity of their data.
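For readers who wonder what reporting inter-rater reliability would involve, here is a minimal sketch in Python of Cohen's kappa, one common agreement statistic for two raters scoring the same items. The two score lists are hypothetical, and the report does not say which statistic, if any, NCTQ computed. For an ordered 0 to 4 scale a weighted kappa is often preferred; the unweighted form is used here only to keep the example short.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the identical score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's own score frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[score] * freq_b[score] for score in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical score for every item
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (0 to 4) from two raters for the same eight syllabi.
scores_a = [0, 1, 2, 2, 3, 4, 1, 0]
scores_b = [0, 1, 2, 3, 3, 4, 2, 0]

kappa = cohens_kappa(scores_a, scores_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

A kappa of 1 means the raters agreed on every syllabus, and a value near 0 means they agreed no more often than chance would predict. Publishing a figure like this is what lets a reader judge whether the rubric scores can be trusted.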
Results: Did the Teacher Education Programs Reach the Bar?
The results of this study have to be examined cautiously and with reluctance. In my own opinion, the data that was collected in this study is inadequate to answer the questions posed in the study. Firstly, the institutions did not directly participate in the study. There is no evidence that there was any attempt to contact the deans of these colleges, or department heads to ask them to provide additional documentation on their teacher education courses. Nor is there evidence that the researchers made any attempt to seek out course websites that would have included more details and content of the courses.
It seems to me that the researchers wanted to limit the data, yet make sweeping statements about teacher education programs, and make recommendations based on such an inadequate study.
According to the researchers, "the bar to earn a passing rating in this study was set low." They said they did this to give institutions the benefit of the doubt. Actually, it is a way out for the researchers because they were dealing with very limited data, a few course syllabi from major institutions of higher education, and they were going to use this meager data to make decisions about how assessment is taught.
According to this study, only 3% of teacher preparation programs adequately teach the content of assessment in their courses. But actually all they can say is that in their opinion only 3% of the syllabi they received reflected this value.
The sample they used in their study was biased from the start. Why did these universities respond to the open records request? Why did universities refuse to respond to the open records request? Did the researchers treat the universities with any respect, and try and open up a dialog on teacher preparation content?
One More Thing
There are quality teacher education programs in the United States. Linda Darling-Hammond, in her book Powerful Teacher Education: Lessons from Exemplary Programs, documents seven highly successful teacher education programs, and discusses the way in which teacher education has changed to create more clinically based teacher education programs.
The researchers of the NCTQ study are stuck in a 19th-century model of teaching, and simply want to hold teacher education institutions accountable to the principles and practices that teacher education rocketed past years ago.
But at the same time, the NCTQ study cleverly uses percentages and numbers in such a way as to convince some that teacher education programs are inadequate, and need to be regulated in ways that satisfy their interests. If you look at their sources of funding, and the names of individuals who sit on their boards, you will see the conservative agenda in action in this organization.
My advice is to call them to task on this study. Tell them that their study in no way sheds any light on how assessment is taught in teacher education programs. The only light that is shed is on their own deficiencies as a research organization.
What do you think about the NCTQ study? Do you think their study is to be taken as a valuable contribution to the literature of teacher education?
Jack Hassard is a former high school science teacher and Professor Emeritus of Science Education, Georgia State University. While at Georgia State he was coordinator of science education, and was involved in the development of several science teacher education programs, including the design & implementation of TEEMS, a clinically based master's program for mathematics, science, and engineering majors. He was director of the Global Thinking Project, an Internet-based environmental program linking schools in Russia and the U.S.A. at first, and then in many countries around the world. He also conducted seminars around the country on science teaching, inquiry and technology for the Bureau of Education and Research and for school districts' staff development programs. He is author of more than 20 books including The Whole Cosmos Catalog of Science, Science Experiences, Adventures in Geology, and most recently The Art of Teaching Science, 2nd Edition, and Science as Inquiry, 2nd Edition. His blog is The Art of Teaching Science.