Below is a brief summary of a recent report to the ESIP Federation's Data Stewardship Committee that evaluated identifier schemes for Earth system science data and information (see also executive summary and links). The report seems to be a hands-on continuation of the 2011 paper "On the utility of identification schemes for digital earth science data: an assessment and recommendations" by Ruth Duerr and others (link).
The paper introduced four use cases:
- unique identification (identify a piece of data, no matter which copy)
- unique location (locate an authoritative copy)
- citable location (identify cited data)
- scientifically unique identification (tell whether two data instances hold the same information even if their formats differ)
and three assessment criteria:
- technical value (e.g., scalability, interoperability, security, compatibility, technological viability)
- user value (e.g., publishers' commitment, transparency)
- archive value (e.g., maintenance, cost, versatility)
The report took those use cases, expanded the assessment criteria, and used both to test implementations of nine identification schemes (DOI, ARK, UUID, XRI, OID, Handles, PURL, LSID, and URI/URN/URL) on two datasets: the Glacier Photo Collection from the National Snow and Ice Data Center (JPEG and TIFF images) and a numerical data set from NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) sensor.
The main findings:
- UUIDs are most appropriate as unique identifiers; any other use requires additional effort.
- DOIs, ARKs, and Handles are the most suitable unique locators; DOIs and ARKs also work as citable locators. Handles need a dedicated local server. ARKs are cheaper than the others, but DOIs are accepted by publishers.
- PURLs offer no means of creating opaque identifiers, and their API support for batch operations is poor.
- The remaining ID schemes are less suitable.
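To make the identifier-versus-locator distinction concrete, here is a minimal Python sketch. It is not taken from the report. The namespace `archive.example.org` and the identifier strings in the usage lines are placeholders, and the resolver prefixes are the public ones for DOI, Handle, and ARK (the ARK one assumes the N2T resolver). A UUID can be minted locally with no infrastructure, but it says nothing about where the data lives. The locator schemes only work because a resolver service stands behind them.

```python
import uuid

# Hypothetical namespace for an archive; "archive.example.org" is a placeholder.
ARCHIVE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "archive.example.org")

def random_id() -> str:
    """Opaque identifier with no embedded meaning or location (UUID version 4)."""
    return str(uuid.uuid4())

def name_based_id(local_name: str) -> str:
    """Deterministic identifier (UUID version 5): the same name always gives the same UUID."""
    return str(uuid.uuid5(ARCHIVE_NAMESPACE, local_name))

# Public resolver prefixes for the locator-style schemes.
RESOLVERS = {
    "doi": "https://doi.org/",
    "handle": "https://hdl.handle.net/",
    "ark": "https://n2t.net/",
}

def locator_url(scheme: str, identifier: str) -> str:
    """Build a resolvable URL; the resolver redirects to the current location."""
    return RESOLVERS[scheme] + identifier

# Usage with placeholder identifiers (not real registrations):
print(random_id())
print(name_based_id("glacier-photo-0001"))
print(locator_url("doi", "10.5072/example"))
print(locator_url("ark", "ark:/99999/fk4example"))
```

Using a UUID as a locator would require the archive to run its own lookup from UUID to location, which is one reading of the "additional effort" the first finding mentions.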
The overall conclusion seems to be that DOI and ARK are generally the better choices, but that a system needs to support multiple ID schemes. From the report I didn't quite get whether any of the ID schemes can support the fourth use case, scientifically unique identification. The paper argued that "none of the identifier schemes assessed here even minimally address this use case".
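To illustrate what the fourth use case asks for, here is a hypothetical Python sketch of a content-based fingerprint. It is my own illustration, not something from the report or the paper, and the function name `content_fingerprint` and the rounding precision are assumptions. The idea is to hash the values themselves in one canonical form, so that the same numbers read from different file formats give the same fingerprint.

```python
import hashlib
import struct

def content_fingerprint(values, decimals=6):
    """Hash numeric content only, ignoring how the values were serialized."""
    digest = hashlib.sha256()
    for v in values:
        # Round to a fixed precision; adding 0.0 turns -0.0 into 0.0.
        rounded = round(float(v), decimals) + 0.0
        # Pack every value in one canonical layout (big-endian 64-bit float).
        digest.update(struct.pack(">d", rounded))
    return digest.hexdigest()

# The same three values, once parsed from text and once from a binary float32 buffer.
from_text = [float(x) for x in "1.5,2.25,3.0".split(",")]
from_binary = list(struct.unpack("<3f", struct.pack("<3f", 1.5, 2.25, 3.0)))

print(content_fingerprint(from_text) == content_fingerprint(from_binary))
```

Real scientific equivalence is much harder than this. It would also have to deal with units, grids, ordering, missing values, and metadata, which may be why none of the assessed schemes address it.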