Jul 7, 2015

The International Polar Year 2007-2008 (IPY-4) and the importance of data management

The International Polar Year (IPY) is an international collaboration that focuses on the polar regions, the Arctic and the Antarctic. The polar regions have many unique phenomena, but the cold, harsh environment makes them expensive to visit and study. It takes a large multi-country collaborative effort to put together expeditions, install equipment, and collect data. The first three IPYs occurred in 1882–1883, 1932–1933, and 1957–1958. The fourth IPY took place between March 2007 and March 2009.

The fourth IPY was dramatically different from the previous efforts (Mokrane and Parsons, 2014). A $1.2 billion effort with participants from more than 60 countries, it had an ambitious vision: to enable international sharing and reuse of multidisciplinary datasets and to keep the data discoverable, open, linked, useful, and safe (Parsons, Godøy et al., 2011). The enormous efforts to initiate, coordinate, improve, and sustain IPY data stewardship have seen both successes and failures, with some components of the IPY infrastructure struggling to exist and be useful (Lessons and legacies..., 2012).

A fair amount of IPY data is available via online portals such as the IPY data page at the National Snow and Ice Data Center (NSIDC) in the US, the NASA Global Change Master Directory (GCMD) IPY portal, or the Global Cryosphere Watch portal. Other services, such as the global IPY Data and Information System (IPYDIS) or the Discovery, Access, and Delivery of Data for IPY (DADDI), are broken. Most importantly, though, there is no way to track and access all the IPY data via a federated or centralized catalog. There is no good, consistent way for international polar data to “function locally and reach globally”, to use Mokrane and Parsons’ words.

The challenges of making heterogeneous data and metadata work together were exacerbated by the lack of focused international funding for planning post-IPY data archiving, differences in data policies, and researchers “hoarding” data (Lessons and legacies..., 2012; Carlson, 2011). Although many IPY projects adopted a free and open data-sharing policy, compliance with it and, ultimately, actual sharing were rather low. Additionally, the researchers in IPY-4 didn’t have access to data from the first IPY projects: some of the data were not available in digital form, while others were scattered or lost. The World Data Centers (WDCs) that were supposed to support the increasing IPY data streams lacked mechanisms for working with heterogeneous data, e.g., they couldn’t support social and ecological data.

Despite the difficulties, the IPY data management experience is crucial to the advancement of global data services and the norms of data sharing and re-use. As Mark Parsons, Secretary General of the Research Data Alliance and former Senior Associate Scientist and Lead Project Manager at NSIDC, put it,
“We were perhaps rather naive going in to IPY. Many of the organizers came from the geoscience background of the earlier IPYs and assumed data systems would exist that could handle IPY data. We weren’t prepared for the incredible diversity of IPY4 with data ranging from Indigenous knowledge to satellite remote sensing to genomic sequencing to cosmology. Although it is unclear what percentage of IPY data are available and much is surely lost, new data services were created and sustained, international coordination continues in sustained organisations, and we learned a lot about different disciplinary cultures and their attitudes to data sharing. The IPY Data Policy was aggressive and not fully honored, but it did drive changes in national policies towards more timely and open release of data. Most critically we saw a change in the conversation within polar science from whether to share to when to share and now how to share. We have a long way to go, but polar data are significantly more accessible than they were prior to IPY.”

Publications by Mark and others, some of which are listed below, are a good source for the lessons learned from IPY data stewardship efforts. One important lesson is that “[e]xperts in data management are critical members of any team attempting internationally coordinated science ...” (Lessons and legacies..., 2012).

Resources

Carlson D. 2011. A lesson in sharing. Nature 469: 293.

Lessons and legacies of the International Polar Year 2007-2008. 2012.

Mokrane M and MA Parsons. 2014. Learning from the International Polar Year to build the future of polar data management. Data Science Journal 13.

Parsons MA, Ø Godøy, E LeDrew, TF de Bruin, B Danis, S Tomlinson, and D Carlson. 2011. A conceptual framework for managing very diverse data for complex interdisciplinary science. Journal of Information Science 37 (6): 555-569.

Parsons MA, T de Bruin, S Tomlinson, H Campbell, Ø Godøy, J LeClert, and IPY Data Policy and Management SubCommittee. 2011. The state of polar data—the IPY experience. In Understanding Earth’s Polar Challenges: International Polar Year 2007-2008. Ed. Krupnik I, I Allison, R Bell, P Cutler, D Hik, J López-Martínez, V Rachold, E Sarukhanian, and C Summerhayes. Edmonton, Canada: CCI Press.