Jun 5, 2015

Data Sharing in Human Paleogenetics

A study published in PLOS ONE, “When Data Sharing Gets Close to 100%: What Human Paleogenetics Can Teach the Open Science Movement,” examined patterns of data sharing for polymorphisms in ancient mitochondrial DNA and in the Y and X chromosomes. The authors focused on data that can be considered derivative, i.e., derived from processing the raw data obtained via methods such as DNA purification and polymerase chain reaction. They retrieved 162 PubMed papers on ancient human DNA, containing a total of 207 datasets.

The authors classified the types of sharing (in the text, as files for download, as supplementary materials, or in an online database) and surveyed the papers’ authors about their choices in sharing human DNA data.

Of the 207 datasets, 202 (97.6%) were fully available and reusable. Five datasets that were initially withheld have since been published. At the same time, more than half of the datasets (57.7%) were shared in the body of the published article rather than via a database or in separate files.
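For readers who want to check the headline figure, here is a minimal sketch of the arithmetic in Python. The variable names are illustrative and not taken from the paper; only the two counts come from the study as summarized here.

```python
# Counts reported in the study summary above.
total_datasets = 207   # datasets found in the 162 papers
shared_datasets = 202  # datasets that were fully available and reusable

# Datasets that were not counted as fully available.
not_fully_available = total_datasets - shared_datasets

# Sharing rate as a percentage, rounded to one decimal place.
share_rate = round(100 * shared_datasets / total_datasets, 1)

print(not_fully_available, share_rate)
```

The rounded rate matches the 97.6% quoted above.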

Most of the 33 researchers who responded to the survey acknowledged the importance of making their studies open to scientific inquiry. Nearly all (97%) also agreed that data sharing should be a common practice in science. The authors of the PLOS paper conclude that a) awareness of the importance of openness in science may help achieve a high data sharing rate; b) the modality of sharing (i.e., how data is shared) plays an important role in sharing behavior; and c) openness to scientific scrutiny of data in human paleogenetics, coupled with the adoption of rigorous standards and cross-laboratory validation, has been crucial in establishing the field along with its scientific rigor and data reliability.