Showing posts with label data practices. Show all posts

Oct 19, 2015

Making progress in data sharing

A few useful tips on making progress in data sharing, from the blog post "Data Sharing: Access, Language, and Context All Matter":

  • To make the global data system less fragmented and disorganized, create data portals with good human-centered designs and support users with varying levels of expertise

  • JSON and XML are great, but humans read data too. These formats are critical to fueling innovation, but make sure CSVs are available as well

  • Responsible data use demands proper attention to metadata. Document your datasets, and don't ignore ReadMe files when re-using someone else's
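As a small illustration of the second tip, here is a minimal sketch of publishing a CSV copy alongside a JSON file. It assumes the JSON is an array of flat objects. The function name `json_records_to_csv` and the sample records are made up for illustration and do not come from the blog post.

```python
import csv
import io
import json


def json_records_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text (hypothetical helper)."""
    records = json.loads(json_text)

    # Collect column names in first-seen order so that no field is dropped
    # when some records carry extra keys.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)  # missing keys are written as empty cells
    return buffer.getvalue()


# Made-up sample data, for illustration only.
sample = (
    '[{"site": "A", "temp_c": 12.5},'
    ' {"site": "B", "temp_c": 9.1, "note": "sensor reset"}]'
)
print(json_records_to_csv(sample))
```

Nested JSON would need flattening first, and the column names would still need to be explained in a ReadMe, which is where the third tip comes in.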

Mar 5, 2015

DataONE collection of data stories

DataONE provides a collection of researchers' experiences of data sharing and management, including both success stories and cautionary tales.

This collection of data stories has been compiled by the DataONE project by interviewing scientists about their experiences of data sharing and management. The stories include experiences of data loss and recovery, and several deal with data sharing agreements or barriers (including cultural barriers).

Several of the stories are themed around ecology, e.g. oil spills, climate change, and forestation. Examples of stories include:

  • overcoming difficulties of data sharing when different formats, tools and storage methods are used

  • problems with missing metadata

  • recovering from mistaken data deletion

  • challenges of requesting data from other researchers

Feb 11, 2015

Institutional analysis of data practices

A short summary of a paper published in JASIST recently: Mayernik, M. S. (2015), Research data and metadata curation as institutional issues. J Assn Inf Sci Tec. doi:10.1002/asi.23425.

The paper begins by noting a mismatch between the findings of two studies of data practices in climate science. One of them (a report commissioned by the UK Research Information Network, RIN) described the level of data sharing in climate science as low, and the other (the book by Edwards, "A vast machine...") argued that data sharing was a strong and common norm in climate science. Which one is true? Or could it be that both studies are correct and climate science includes both high and low levels of data sharing?

Data practices are institutionalized within a number of social systems, including formal organizations (such as universities and research centers), rules and sanctions (such as funding agency requirements and professional guidelines), and the norms of modern Western science. The case study analysis in this paper is therefore grounded in an institutional framework with five characteristics: (a) norms and symbols, (b) intermediaries, (c) routines, (d) standards, and (e) material objects. Norms are largely associated with the norms of science (Merton and later work); symbols are logos and other visible signs of collective identity, but also terminological choices and metaphors. Intermediaries are individuals or collectives who connect resources and facilitate relationships. Routines are frequently repeated patterns of action and interaction, for example, meal or socializing routines. Standards are rules and specifications that define how information can be presented, organized, and transferred. Material objects are ... material objects.

The case studies are comparisons between data practices at the Center for Embedded Networked Sensing (CENS) and the Long Term Ecological Research (LTER) network, and between the University Corporation for Atmospheric Research (UCAR) and the National Center for Atmospheric Research (NCAR).

Although there are some interesting observations in these case studies, it seemed that the first, conceptual part of the paper was stronger than the second. The five characteristics of the institutional framework were applied rather narrowly, without revealing many interconnections and directionality. For example, the standards section focuses on metadata standards and their choice. Are there any other standards relevant to data practices? How does the choice of standards affect norms and what is the role of intermediaries in establishing routines and other aspects of data practices? Another much more important question is: Once we describe the variability of data practices within and across disciplines, what's next? What exactly is the role of each institutional carrier in data practices?