Scientific Data has published a description of an interesting dataset: "Pantheon 1.0, a manually verified dataset of globally famous biographies". This data collection effort adds to the quantitative data available for studying history, especially information about famous people and events.
| Workflow diagram (Image from the paper) |
The manual cleaning and verification step includes a controlled vocabulary for occupations and popularity metrics (defined as the number of Wikipedia language editions, adjusted by age and page views).
The dataset is available for download from Harvard Dataverse at http://dx.doi.org/10.7910/DVN/28201. Another entertaining part is the visualization interface at http://pantheon.media.mit.edu, which lets you explore the data and answer questions like "Where were globally known individuals in Math born?" (21% in France) or "Who are the globally known people born within present day by country?". It turns out that Russia produced a lot of politicians and writers, while the US gave us many actors, singers and musicians.
| Globally known people born in the US (from http://pantheon.media.mit.edu/treemap/country_exports/US/all/-4000/2010/H15/pantheon) |
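Questions like the ones above boil down to filtering the records by domain and counting birthplaces. Here is a minimal sketch of that idea in Python. The column names (`name`, `birth_country`, `domain`), the helper `birthplace_shares`, and the rows themselves are made up for illustration; they are not taken from the actual Pantheon files, so check the downloaded data for the real field names.

```python
import csv
import io
from collections import Counter

# Toy rows with hypothetical column names; NOT real Pantheon data.
SAMPLE_CSV = """name,birth_country,domain
Person A,France,Math
Person B,France,Math
Person C,Germany,Math
Person D,United States,Arts
"""


def birthplace_shares(rows, domain):
    """Return {country: percentage of people in `domain` born there}."""
    counts = Counter(r["birth_country"] for r in rows if r["domain"] == domain)
    total = sum(counts.values())
    if total == 0:
        return {}  # no one in this domain
    # most_common() orders countries from largest to smallest count
    return {country: 100.0 * n / total for country, n in counts.most_common()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shares = birthplace_shares(rows, "Math")
for country, pct in shares.items():
    print(f"{country}: {pct:.1f}%")
```

To try this on the real dataset, you would replace the in-memory sample with the downloaded file and adjust the column names to whatever it actually uses.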
