Measuring Metadata

Evaluating documentation is critical for identifying good examples within a collection, for laying out a path forward, and for recording progress as the documentation is improved. Ultimately, whether users can use and trust your data is the final evaluation. There are quantitative and qualitative steps that can serve as signposts on the way to trustworthy data.

Metadata repositories typically work behind the scenes, driving data discovery portals, business processes and decisions, and publishing houses. Measuring the metadata helps make improvements visible. This radar plot compares CrossRef metadata completeness during two time periods. It shows the results of a concerted effort to increase the number of ORCIDs in this publication's metadata. ORCIDs are unique and persistent identifiers for people. Adding them to metadata helps those people get credit for all kinds of contributions to the research community.
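A measurement like this can be sketched in a few lines. The record structure below is a simplified, hypothetical stand-in for the contributor lists a metadata API such as CrossRef's returns; the field names are illustrative assumptions, not the actual schema.

```python
# Sketch: measure ORCID completeness in a batch of publication metadata.
# Record and field names are illustrative, not a real API schema.

def orcid_completeness(records):
    """Return the fraction of authors that carry an ORCID."""
    authors = [a for rec in records for a in rec.get("authors", [])]
    if not authors:
        return 0.0
    with_orcid = sum(1 for a in authors if a.get("ORCID"))
    return with_orcid / len(authors)

records = [
    {"DOI": "10.1000/example.1",
     "authors": [{"name": "A. Author", "ORCID": "0000-0002-1825-0097"},
                 {"name": "B. Author"}]},
    {"DOI": "10.1000/example.2",
     "authors": [{"name": "C. Author", "ORCID": "0000-0001-5109-3700"}]},
]

print(orcid_completeness(records))  # 2 of 3 authors have ORCIDs
```

Running the same measurement before and after an improvement campaign gives the two snapshots that a radar plot like this one compares.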

Continuous Improvement and Bright Spots

Universities with the most complete DataCite metadata are identified using a community convention for FAIR DataCite metadata: elements that support findability with text and identifiers, connections, and contacts.
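A completeness score against such a convention can be sketched as a simple checklist. The groups and field names below are illustrative assumptions, not the actual community convention referenced here.

```python
# Sketch: score metadata completeness against a checklist of element
# groups. Groups and field names are illustrative, not the real convention.

CHECKLIST = {
    "findable_text": ["titles", "descriptions", "subjects"],
    "identifiers": ["doi", "creator_orcids"],
    "connections": ["related_identifiers", "funding_references"],
    "contacts": ["contributors", "publisher"],
}

def completeness(record):
    """Fraction of checklist fields that are present and non-empty."""
    fields = [f for group in CHECKLIST.values() for f in group]
    present = sum(1 for f in fields if record.get(f))
    return present / len(fields)

record = {"titles": ["A Dataset"], "doi": "10.1000/xyz",
          "publisher": "Example University", "subjects": []}
print(round(completeness(record), 2))  # 3 of 9 fields present
```

Scoring every record a university contributes, then averaging, yields the per-organization numbers that make comparisons like this possible.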

makeImprovementVisible.png

Make Improvement Visible

Adding ORCIDs connects people to the PID Graph!

The tools we use for creating metadata influence the content we can create and the completeness of that content. Measuring metadata makes it possible to compare metadata before and after changes to those tools. In this case, changing a publishing platform led to increases in metadata content across the board!

RockefellerUniversityPress.jpg

Find systemic changes

Big improvements across the board

Large metadata infrastructure providers like CrossRef and DataCite provide access to metadata from many providers, some of which manage many repositories. It is important to be able to find Bright Spots in those groups: providers with great metadata that can serve as good examples for others that are starting improvement efforts. The physics publisher Stichting SciPost had the most complete metadata of almost 1700 CrossRef members, measured using the CrossRef Participation Reports. Great job!
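Finding Bright Spots amounts to ranking members by their completeness scores. A minimal sketch, using made-up scores rather than real Participation Report data:

```python
# Sketch: rank providers by metadata completeness to surface Bright Spots.
# Names and scores are invented for illustration only.

scores = {
    "Provider A": 0.41,
    "Provider B": 0.88,
    "Provider C": 0.63,
}

# Highest-scoring providers first; take the top two as Bright Spots.
bright_spots = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:2]
print(bright_spots)
```

The same sort, applied to per-check completeness across all members, surfaces the examples worth celebrating and emulating.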

StichtingSciPost+2.jpg

Identify Bright Spots

Good work deserves recognition!


Change is hard

Metadata improvement is hard when it involves organizational change. The Heath brothers provided great insight into organizational change in their book Switch. This talk applies some of their insights to metadata improvement.

Metadata Improvement

Helping scientists and data providers understand how to improve their documentation involves understanding their requirements and identifying specific steps towards satisfying them. It is also important to explain those steps with straightforward guidance, relevant community examples, and rewards for moving forward.

Resources

The presentation given at the AGU Fall Meeting by Ted Habermann focused on how repository re-curation could help repositories of all kinds respond to the guidance in the OSTP Public Access Memo published in August 2022.

Documentation Dialects

All scientific disciplines and communities understand that documenting their data makes it trustworthy and helps others use it with confidence. Many communities develop conventions for their documentation that enable sharing within the community but can make it more difficult to share outside the community. Sometimes these dialects are called "standards," but in the end they are all dialects of the same documentation language.

This schematic shows two metadata dialects, each of which has related recommendations (R1 - R6), with overlap among the elements supporting the discovery use case. This is not unusual, as discovery systems are very similar across domains while metadata that supports use, understanding, and trust can be more domain specific. Many dialects include the concept of core metadata for cross-domain content and extensions for community-specific elements.

RecommendationsAndDialects.png
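The overlap between dialects is what makes crosswalks possible. A minimal sketch, assuming two hypothetical dialects whose element names are invented for illustration, maps the shared discovery elements from one to the other:

```python
# Sketch: a crosswalk between two hypothetical metadata dialects for the
# shared discovery elements. Element names are illustrative, not real schemas.

CROSSWALK = {           # dialect A element -> dialect B element
    "title": "name",
    "abstract": "description",
    "keyword": "subject",
    "identifier": "id",
}

def translate(record_a):
    """Translate the discovery portion of a dialect-A record into dialect B."""
    return {CROSSWALK[k]: v for k, v in record_a.items() if k in CROSSWALK}

# 'provenance' is a community-specific extension with no discovery mapping,
# so it is dropped rather than translated.
rec = {"title": "Ocean Temperatures", "keyword": "oceanography",
       "provenance": "ship logs"}
print(translate(rec))
```

Elements outside the shared core, like the community-specific field above, are exactly where cross-domain sharing gets harder.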