I met Anant Agarwal at edX the other day and introduced myself, and he asked me, "What research questions are you most excited about answering?"
It was a good question, asked with a genuine and infectious enthusiasm, and I didn't offer a great response, so I thought I'd write it out.
The first things I am doing aren't related to research questions: the tasks that have consumed most of my time during my first month involve laying the groundwork for a wide variety of research conducted by a wide variety of researchers. My first interest isn't answering my own questions; it's making sure that we're setting things up so that other edX researchers can answer theirs.
Step 1 is getting the data that comes out of edX into a usable shape for researchers. Some of this is minor cleaning (edX data dumps variously use "id," "user_id," and "student_id" to label the same identifier); some of it is major rescrubbing (if you want to hear me talk for an hour about something you don't care about, ask me about the inconsistent usage of "event" and "event_type" in edX tracking logs).
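To make the minor-cleaning case concrete, here is a minimal sketch in Python of collapsing those three labels into one. The canonical name "user_id", the function name, and the sample rows are my own assumptions for illustration, not anything edX defines, and real dumps would need a check that "id" really does mean the learner in each file.

```python
# The three labels mentioned above; treating them as one identifier is the
# premise of this sketch and should be verified file by file.
ID_ALIASES = {"id", "user_id", "student_id"}
CANONICAL_ID = "user_id"  # assumption: any single agreed-upon name would do


def normalize_record(record):
    """Return a copy of `record` with any alias of the learner id renamed."""
    cleaned = {}
    for key, value in record.items():
        name = key.strip()  # guards against stray whitespace in labels
        if name in ID_ALIASES:
            name = CANONICAL_ID
        # Refuse to silently merge two aliases that disagree with each other.
        if name in cleaned and cleaned[name] != value:
            raise ValueError(f"conflicting values for {name!r}")
        cleaned[name] = value
    return cleaned


# Made-up rows standing in for three differently labeled data dumps.
rows = [
    {"id": 17, "grade": 0.9},
    {"user_id": 17, "forum_posts": 3},
    {"student_id ": 17, "certificate": True},
]
normalized = [normalize_record(row) for row in rows]
```

After this pass every row carries the same key, so the files can be joined on it. Raising an error on conflicting aliases is a deliberate choice here: in a cleaning step, a loud failure seems safer than quietly picking a winner.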
So first we have to figure out the quirks of data coming off a ship built as it sails, and then we have to think about the ways in which we should help people make use of these files.
Then, we have to put the data into one or more usable forms. One colleague suggested that once we clean the data, we should define it with a schema in a series of related database files. Another colleague suggested we should transform the data to use Coursera's structures and fields, so that we can use the same analytic tools with both sources. Our colleagues at MIT are building a giant online pool of data, from which people can build custom buckets to draw up what they want. Trained as I was by secondary analysts, I'm interested in generating some large, standard, generalizable person-level and person-click datasets that could be widely disseminated among secondary analysts with very diverse interests. Maybe we'll do all of the above.
This is not the sort of work that will get written up in Science or Nature. But it's exciting to think that if we can remove some of the brambles here, we can clear the path for dozens of researchers who want to search along these paths.
Refining Survey Instruments and Procedures
Step 2 is gathering more information about our students, by developing some common survey instruments that can be used across HarvardX courses in different fields. I have a particular interest here in conducting survey research that looks carefully at both minimizing and characterizing response bias. There have been lots of surveys done on early xMOOCs, but response rates can be quite low, and it can be unclear to what extent survey respondents can be considered representative of any particular population. If we can develop a better understanding of who takes surveys, who gets turned off by surveys of different lengths, and whether or not we can increase the proportion of our students who complete surveys, then again, we can improve the ways in which everyone conducts their research. For all the data we have about what xMOOC students are clicking, we know precious little about what goes on in their heads.
So those aren't really juicy research questions to be answered ("But is our children really learning?"), but they are the kinds of yeoman tasks that consume my days, and being part of a community of folks working to build open source solutions to some of these problems makes the task all the more satisfying.
So when everything is all cleaned up, what questions do I want to ask?
Developing a Conceptual Framework of Outcomes
The place to start, in my mind, is to get much clearer about what kinds of measures we are using as outcomes, what those outcomes actually are, and what the appropriate circumstances are for measuring particular kinds of outcomes. Fundamentally, if something is working in an xMOOC, how will we know? What should we be looking at?
Right now, I'm thinking there are five types of outcome variables that we can use in evaluating xMOOCs: assessments of student learning and performance, persistence and completion, activity density and time on task, student self-reported learning and satisfaction, and subsequent performance. Each of these variables will be more or less useful for different kinds of courses, in different settings, and for making different sorts of inferences.
We can do all the experiments and examine all the predictors we want, but if we don't have a clear sense of what outcomes we are hoping to evaluate or improve, then we might get stuck in simplistic conversations about retention proportions or other metrics that might not be telling us what we really want to know.
Evaluating Opportunity and Access
As anyone who knows my work might be able to guess, once we square away the variables on the left side of the equation, I'll be very interested in evaluating socio-economic variables on the right side of the equation. The early evidence leaking out seems to be pretty clear on this point (MOOC participants are disproportionately people with college and advanced degrees), but I'm interested in doing a comprehensive review of economic diversity in HarvardX courses, and then examining the findings in light of my own theories of how expanding opportunity can exacerbate inequalities. All indications suggest that if we want xMOOCs to reduce inequalities, then we'll need to develop a set of design principles that allow us to target courses or supports to the learners we care most about serving.
Design Research in Participatory Learning for xMOOCs
My third interest is in design research, thinking about how we can expand our repertoire of practices on edX. How can we take the most interesting, innovative practices in online or residential education and bring them to life for HarvardX courses?
For instance, in professional education (law, business, education), case studies are a vital part of teaching in many courses and programs. What tools could let people collaboratively engage in cases online? Could some of these cases be the foundation of new social games or simulations? There is a wide range of teaching strategies practiced across Harvard, and the edX LMS will need to grow to accommodate them.
Another common refrain I'm hearing has to do with different aspects of social learning: wanting a community of learners to stay together after a course, or having people stay connected throughout a class in social spaces. Humans have built a space for social learning: it's called the Web. Especially among the humanists I talk with from HarvardX, there is a great deal of interest in doing the kinds of things that connectivist MOOCs have been doing well for a number of years. I'm interested in thinking about how we push the possibilities of the edX platform or how we might use the marketing and student information system components of edX to support learning environments that are not primarily built on the edX LMS. A lot of my career is spent looking longingly at those educators who play on the exciting edges of things and then thinking, "OK, how do we get everyone there?"
So. That's what I'm doing these days. If you have suggestions, let me know.