Open Education Science and Challenges for Evidence-Based Teaching
With my colleague Tim Van der Zee, I wrote an article called Open Education Science that outlines new pathways and best practices for education researchers--in particular about being more transparent with readers about how we plan our research, what research we actually conducted, and how that reality aligns or not with what we planned. In this post, I try to explain why we wrote it, and what it might mean for educators and policymakers trying to make good use of education research to improve teaching and learning.
In 2005, John Ioannidis published an article provocatively titled "Why Most Published Research Findings Are False." The article targeted medical research, but over the past thirteen years it has been profoundly influential across a variety of social sciences, embraced by researchers who want to keep improving the scientific process.
The core argument is just what the title suggests: for any given scientific article that depends on quantitative or statistical arguments, there is some probability that the claims of the article are not true. That's not a problem in and of itself; science is a dialogue among different claims and positions, and incorrect claims are an inevitable part of that dialogue. But Ioannidis made a compelling argument that for most scientific articles, the probability that a new article's claim was false was higher than the probability that it was true. Our scientific dialogues were not weeding out false claims, but supporting their proliferation.
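Ioannidis's core point can be sketched with a back-of-envelope calculation. The numbers below are my own illustrative assumptions, not figures from his paper: if only a small fraction of the hypotheses a field tests are actually true, and studies are underpowered, then even at the usual 5% significance threshold, most "significant" findings will be false.

```python
# Back-of-envelope version of Ioannidis's argument (illustrative numbers,
# not taken from the original paper): what fraction of statistically
# "significant" findings reflect a true hypothesis?
def ppv(prior_true, power, alpha):
    """Positive predictive value: P(hypothesis is true | significant result)."""
    true_pos = prior_true * power           # true hypotheses that reach significance
    false_pos = (1 - prior_true) * alpha    # false hypotheses that do anyway
    return true_pos / (true_pos + false_pos)

# Suppose only 10% of tested hypotheses are true, studies have 35% power,
# and the usual 5% significance threshold is used.
print(round(ppv(prior_true=0.10, power=0.35, alpha=0.05), 2))  # prints 0.44
```

Under these assumed conditions, fewer than half of the significant findings are true; the result is driven entirely by the base rate of true hypotheses and the study power, not by any misconduct.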
There are many causes of this phenomenon, but I'll explain two here. The first is that the editorial process of article selection in scientific journals is biased toward novelty. Editors and article reviewers tend to be particularly supportive of publishing articles with novel or surprising findings. Imagine that 100 researchers write studies: many of them are inconclusive, a small number are conclusive but have very expected results, and a tiny number have very unexpected results. That tiny number are potentially the most attractive to journal editors, because they have the possibility of changing the direction of a field and attracting a great deal of attention. Meanwhile, the authors of the inconclusive and less interesting studies often give up on trying to publish them, so they don't appear in print at all--the so-called "file-drawer problem." If all of those studies are on the same topic, you can see how the literature about that topic can quickly over-represent extreme findings and under-represent more banal ones. Novelty is an important part of science--we don't necessarily want thousands of studies of well-established phenomena that all say the same things--but there is good evidence that the unexpected studies that rise to the top of science publishing are not representative of all studies being conducted.
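The file-drawer dynamic is easy to see in a toy simulation (my own illustrative setup, not an analysis from the article): if many studies estimate the same small true effect, but only statistically significant estimates get published, the published literature will systematically exaggerate the effect.

```python
import random

# Toy simulation of the file-drawer problem: 1,000 labs study the same
# small true effect, but only "significant" results are published.
random.seed(0)
TRUE_EFFECT = 0.2   # the real effect size
SE = 0.15           # standard error of each study's estimate

estimates = [random.gauss(TRUE_EFFECT, SE) for _ in range(1000)]
published = [e for e in estimates if abs(e / SE) > 1.96]  # p < .05, two-sided

print(f"all studies: mean estimate = {sum(estimates) / len(estimates):.2f}")
print(f"published:   mean estimate = {sum(published) / len(published):.2f}")
```

Running this, the mean across all studies sits near the true effect of 0.2, while the mean of the "published" subset is noticeably larger--not because anyone cheated, but because crossing the significance threshold requires a large estimate.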
A second problem is that with a proliferation of statistical methods and approaches, there are lots of ways to analyze data, and it's often the case that if researchers try enough of these methods, they will find one that has results that are more favorable to publication than other results. This is somewhat akin to the old saw that "if you torture the data enough, it will confess," but the processes are probably even more subtle than that. Even well-meaning researchers trying to provide careful and skeptical examination of their data can wander through what Andrew Gelman calls the "Garden of Forking Paths" until they find a statistical path that leads to a result that colleagues will deem publishable. Again, trying different methods of analyzing data is a good thing and essential to science, but often people will publish only the most favorable analysis, and not the many, many others that were tried.
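A quick way to see why forking paths matter is a simplified model I'm assuming here (real analysis choices are correlated rather than independent, so this overstates the effect somewhat): treat each analysis of pure-noise data as an independent test with a 5% false-positive rate, and compute the chance that at least one comes up "significant."

```python
# If a researcher tries k different (roughly independent) analyses of
# pure-noise data, each with a 5% false-positive rate, the chance that at
# least one "works" grows quickly: P(>=1 hit) = 1 - 0.95^k.
for k in (1, 5, 10, 20):
    p_at_least_one = 1 - 0.95 ** k
    print(f"{k:2d} analyses -> P(at least one false positive) = {p_at_least_one:.2f}")
```

Even under this crude independence assumption, ten analyses give roughly a 40% chance that pure noise yields at least one publishable-looking result.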
For those who imagine a future of evidence-based teaching--where research provides reliable guidance that can inform how teachers are trained and how teachers improve their practice--this can be disheartening stuff. I don't mean to present it as a blanket condemnation of all research efforts, but it is a serious problem for social science and educational science. In 2015, a collaboration of researchers published an article reporting on the replication of 100 studies from psychological science; only one-third to one-half had findings similar to the original results, and the average magnitude of the effects in the replications was about half that of the original studies. It's hard to build evidence-based teaching on such a shaky empirical foundation.
So what can education researchers do to improve this state of affairs? For Tim and me, the answer is more sunlight: science improves through systematic skepticism, so researchers should be more open about what studies they have done and how they have done them. Scholars over the centuries have published mainly summaries of their research in articles and journals, because we printed those summaries on dead trees that are expensive to ship around the world; digital technologies make it much cheaper to share much more of our work.
An incredibly important starting point is being much more transparent with other researchers and with the public about the planning behind our research. We call for researchers to engage in a process called pre-registration: before actually conducting a study, researchers publicly write down what they plan to study and how. Imagine two experimental studies. In one, the researchers say, "A year ago, we published online a document that says how we were going to do an experiment. We did the experiment as defined, and we found this particular intervention improved learning." In the other, the researchers say, "A year ago, we published online a document that says how we were going to do an experiment. When we analyzed the results the way that we planned, we found no effect of our intervention. Using other statistical methods, we found that the intervention worked." Our argument is that both studies can be valuable contributions to knowledge, but the evidence from the first study should be considered much more robust than the evidence from the second.
In the 2000s, education policymakers tried to get practitioners to pay much more attention to whether a research study was able to establish a causal mechanism, or whether the results only showed correlations. Led by the Institute of Education Sciences, policymakers tried to explain to the public how to recognize when a study was a randomized controlled trial, and to be more skeptical of studies that were not experiments but made causal claims. Tim, I, and other folks--like the good people at the Society for the Improvement of Psychological Science--are now trying to get people to recognize that if hypotheses and analytic plans aren't pre-registered somewhere, those studies should be held in lower regard than similar studies where the plans are pre-registered.
One way to encourage researchers to pre-register their studies is to make it a requirement of publication. With Hunter Gehlbach and Casper Albers, I'm editing a special issue of AERA Open that accepts a new format of scholarly article called a Registered Report. In a Registered Report, authors submit a plan for a study for peer review, and the reviewers and editors accept or reject the study on the basis of 1) whether the questions are important, and 2) whether the study is well planned and well designed. The editors agree to accept any resulting article, whether its findings are really boring or really provocative, on the basis of the quality of the study's design, not the novelty (or lack thereof) of the findings. Our hope is that a science built on these kinds of studies will be more robust than the science we have now.
I don't have a tidy suggestion for how educators should respond to these kinds of changes. Social scientists have recognized some problems with how we do our work, and some of those problems are pretty technical and hard to explain to a lay audience. These problems have been with us for a while, and they raise questions about the quality of research that we've published over the last century. But there are a bunch of us who think we can do better, and we're trying some new things to make our science more open, which will make it easier to be systematically skeptical, which we think will lead to better science.
For teachers who try to follow and use research, I'd encourage you to add two new questions to the bank of questions you use to evaluate new research. 1) Have the researchers made their data and methods openly available for other researchers to scrutinize? Generally speaking, the more open people are with their data and methods, the more likely it is that others will be able to scrutinize and improve that research. 2) Have the researchers explained how what they did aligns with what they planned to do? Close alignment is best, changes can be OK, and I'd urge you to reserve the most skepticism for circumstances where you cannot determine to what extent researchers deviated from their plans.
And hopefully in the years ahead, greater transparency will lead to better educational research that provides better guidance to teachers and administrators. If you want to read the whole article that Tim and I wrote on Open Education Science, it's available for free through open access at AERA Open.