Scientific Data has published a description of an interesting dataset: "Pantheon 1.0, a manually verified dataset of globally famous biographies". The dataset adds to the quantitative data available for studying history, especially information about famous people and events.
Workflow diagram (image from the paper)
The manually cleaned and verified dataset includes a controlled vocabulary for occupations and popularity metrics (defined as the number of Wikipedia language editions that cover a person, adjusted by age and pageviews).
The dataset is available for download at Harvard Dataverse: http://dx.doi.org/10.7910/DVN/28201. Another entertaining part is the visualization interface at http://pantheon.media.mit.edu, which lets you explore the data and answer questions like "Where were globally known individuals in Math born?" (21% in France) or "Who are the globally known people born within present day by country?". It turns out that Russia produced a lot of politicians and writers, while the US gave us many actors, singers and musicians.
Globally known people born in the US (from http://pantheon.media.mit.edu/treemap/country_exports/US/all/-4000/2010/H15/pantheon)
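If you would rather poke at the downloaded file than the web interface, a question like the Math birthplace one can be answered with a few lines of Python. This is only a rough sketch: the column names (`name`, `birth_country`, `occupation`), the occupation labels and the helper `birthplace_shares` are my own assumptions for illustration, not the actual schema of the Dataverse files, and the five rows below are a tiny hand-typed sample rather than an extract of the dataset.

```python
import csv
import io
from collections import Counter

# Tiny hand-typed sample. The column names and occupation labels are
# assumptions for illustration, not the real Pantheon schema.
SAMPLE_CSV = """name,birth_country,occupation
Pierre de Fermat,France,MATHEMATICIAN
Pierre-Simon Laplace,France,MATHEMATICIAN
Carl Friedrich Gauss,Germany,MATHEMATICIAN
Leonhard Euler,Switzerland,MATHEMATICIAN
Leo Tolstoy,Russia,WRITER
"""


def birthplace_shares(rows, occupation):
    """Return {country: fraction of people} for one occupation (hypothetical helper)."""
    counts = Counter(
        row["birth_country"] for row in rows if row["occupation"] == occupation
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {country: n / total for country, n in counts.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shares = birthplace_shares(rows, "MATHEMATICIAN")

# Print countries from largest to smallest share.
for country, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{country}: {share:.0%}")
```

To run it on the real data, swap `io.StringIO(SAMPLE_CSV)` for an opened file and adjust the column names to whatever the downloaded file actually uses.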