Instrument Identifiers = Better Science
Ted Habermann, Metadata Game Changers
Cite this blog as: Habermann, T. (2024). Instrument Identifiers = Better Science. Front Matter. https://doi.org/10.59350/ehsbf-gwe11
Measuring ocean temperatures accurately is a critical step towards understanding heat in the oceans. Measurements made using EXpendable BathyThermographs (XBTs) make up ~18% of available ocean temperature data, and these instruments introduce a positive bias into global temperature anomaly estimates, resulting in a larger apparent warming following their introduction during the late 1960s and early 1970s. Estimating and correcting these biases depends on knowing which XBTs were used to make the measurements, and knowing that depends on persistent identifiers (PIDs) for those instruments.
The Research Data Alliance PIDINST Working Group brought together experts from many domains to develop the concept of instrument identifiers into a collection of use cases and a conceptual metadata schema with several implementations, including one based on the DataCite Metadata Schema. The DataCite community took the next step by adding the resource type “Instrument” to Version 4.5 of the metadata schema. The figure shows how DataCite metadata for the instrument identifier can be combined with connected resources to help researchers understand the instrument, the measurements it makes, and the accuracy of those measurements.
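To make the idea of an instrument record with connected resources more concrete, here is a minimal Python sketch. It is an illustration only, loosely modeled on the JSON form of DataCite metadata, and not the model from the figure or the talk. The DOIs, the instrument name, and the helper functions `instrument_record` and `connected` are all hypothetical, and the choice of the relation types "IsDocumentedBy" and "Collects" is an assumption about how such links might be expressed.

```python
import json


def instrument_record(doi, name, related):
    """Build a minimal DataCite-style record for an instrument.

    `related` is a list of (relation_type, doi) pairs linking the
    instrument to connected resources such as manuals or datasets.
    """
    return {
        "doi": doi,
        "titles": [{"title": name}],
        # "Instrument" is the resource type added in schema Version 4.5.
        "types": {"resourceTypeGeneral": "Instrument"},
        "relatedIdentifiers": [
            {
                "relatedIdentifier": related_doi,
                "relatedIdentifierType": "DOI",
                "relationType": relation_type,
            }
            for relation_type, related_doi in related
        ],
    }


def connected(record, relation_type):
    """Return the identifiers connected to the record by one relation type."""
    return [
        item["relatedIdentifier"]
        for item in record["relatedIdentifiers"]
        if item["relationType"] == relation_type
    ]


# Hypothetical placeholder identifiers, not real DOIs.
record = instrument_record(
    doi="10.5555/example-xbt-probe",
    name="Example XBT probe (hypothetical)",
    related=[
        ("IsDocumentedBy", "10.5555/example-xbt-manual"),
        ("Collects", "10.5555/example-xbt-profiles"),
    ],
)

print(json.dumps(record, indent=2))
print(connected(record, "Collects"))
```

With a record like this, a researcher starting from a dataset could follow the links back to the instrument and on to its documentation, which is the kind of connection that makes bias corrections traceable.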
A talk from the last NISO Plus meeting discusses how instrument identifiers and this metadata model can help improve understanding of ocean temperatures and other observations. My instrument identifier talk starts at 13:50 in the recording of the session, titled “PID Innovations And Developments In Scholarly Infrastructure”. The session also includes talks by Matt Buys from DataCite, Shawn Ross from ARDC, and Amanda French from ROR.
Spoiler: most XBT metadata do not include instrument identifiers. See Cowley et al., 2013, https://doi.org/10.1175/JTECH-D-12-00127.1 for the story of unraveling the data without them.