HarvardX Students from Around the World
Where in the world do HarvardX students come from?
One of my research colleagues at HarvardX, Sergiy Nesterko, has a background in data visualization, and he has been working for some time on developing a framework to present data from HarvardX courses in an intuitive, scalable and sustainable way for the benefit of instructors, researchers, and the public. One of his first efforts is a presentation of HarvardX registration by course and by country. He's done a beautiful job on the front end, and there is some sophisticated engineering under the hood, which (we hope) will ultimately lead to the point where the map updates itself as the courses progress.
One of the things that I most appreciate about Sergiy's work is how careful he's been to document all of the particularities of the sources of the data. Appended to the visualization is a technical document that describes some of the limitations of the data, like the high number of missing responses and the threat that these responses may not be "missing at random," as a registrant from Azerbaijan may be less likely to share an address than a registrant from Canada.
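A toy calculation makes the "missing at random" concern concrete. The sketch below is purely illustrative: the country labels, registrant counts, and response rates are all made-up assumptions, not HarvardX data, and the function name is hypothetical. It shows how two countries with identical true enrollment can look very different on a map when registrants in one country are less likely to report an address.

```python
# Hypothetical illustration only: none of these numbers come from HarvardX data.

def observed_shares(true_counts, response_rates):
    """Return each country's share among registrants who reported an address.

    true_counts:    country -> actual number of registrants (assumed)
    response_rates: country -> fraction of registrants who share an address (assumed)
    """
    reported = {c: true_counts[c] * response_rates[c] for c in true_counts}
    total = sum(reported.values())
    return {c: reported[c] / total for c in reported}


# Two made-up countries with the same true number of registrants.
true_counts = {"Country A": 1000, "Country B": 1000}

# Case 1: addresses are missing at the same rate everywhere.
# The observed shares still match the true 50/50 split.
same_rate = observed_shares(true_counts, {"Country A": 0.5, "Country B": 0.5})

# Case 2: registrants in Country B are much less likely to share an address.
# The observed shares now overstate Country A and understate Country B.
different_rate = observed_shares(true_counts, {"Country A": 0.8, "Country B": 0.2})

print(same_rate)
print(different_rate)
```

In the second case the map would suggest that Country A has far more registrants than Country B, even though the true counts are equal. That is the kind of distortion the technical document warns about.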
He gives a great interview with Anna Hashmi from the Crimson, where again he helps carefully interpret the findings:
"To better understand this interactive visualization [world map of enrollment], you should keep the population of the country in mind," said Nesterko. Brazil and Nigeria, the most populous countries in South America and Africa respectively, have the most HarvardX registrants on those two continents.
(Sadly, Anna's article on HarvardX isn't among the most-read articles in the Crimson, where "Fifteen Hottest Freshmen" holds the top spot.)
What Sergiy's careful work suggests is that even what seem like very simple concepts, like the nationality of edX users, are to some degree swaddled in assumptions and estimations. These assumptions and estimations are everywhere in analyses of xMOOC data, and researchers in this field should be extremely forthright about the processes of cleaning and organizing data.
The challenge that we have as researchers is giving people access to useful insights, while highlighting the most important of these limitations and making the full set of underlying assumptions available to the public. I think Sergiy got our work here at HarvardX off to a great start on those counts.