DataCite Metadata Continues to Add Capabilities

Ted Habermann, Metadata Game Changers

During 2021, I chronicled how the DataCite metadata schema evolved between 2011 and 2019 to better address many of the FAIR principles (Habermann, 2021). That discussion covered schema versions up to V4.3, which was released in late 2019. I noted several trends:

1.    Increase in the number of resource types that could be described.

2.    Increase in the number of relationships between resources.

3.    Increasing structure in the metadata, i.e., metadata objects with properties instead of plain strings.

The metadata schema has continued to evolve since 2021, and these trends have continued with it. V4.4 added 13 new resource types to improve specificity for describing text resources, and V4.4 and V4.5 added resource types for Instruments, StudyRegistrations, Samples, Awards and Projects, along with appropriate relation types (IsCollectedBy and Collects). The trend towards increased structure, including identifiers wherever possible, also continued: the publisher element became an object with an identifier. This change is significant because publisher is a mandatory field included in all records, so in many repositories all records can now include RORs connecting them to the organization running the repository.
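As a rough illustration of what the structured publisher looks like in practice, the sketch below builds a publisher element that carries a ROR. It assumes the attribute names publisherIdentifier, publisherIdentifierScheme and schemeURI as I understand them from the V4.5 schema documentation; the repository name, the ROR value and the helper function name are hypothetical placeholders, so check the DataCite documentation before relying on any of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder values, for illustration only.
EXAMPLE_NAME = "Example Data Repository"
EXAMPLE_ROR = "https://ror.org/00example0"  # not a real ROR


def structured_publisher(name, ror_id):
    """Build a <publisher> element that carries a ROR identifier.

    Attribute names are assumed from the DataCite V4.5 schema documentation.
    """
    publisher = ET.Element("publisher")
    publisher.text = name
    publisher.set("publisherIdentifier", ror_id)
    publisher.set("publisherIdentifierScheme", "ROR")
    publisher.set("schemeURI", "https://ror.org/")
    return publisher


# Before V4.5 the publisher was a plain string with no identifier.
legacy = ET.Element("publisher")
legacy.text = EXAMPLE_NAME

# The same publisher, now connected to an organization through its ROR.
upgraded = structured_publisher(legacy.text, EXAMPLE_ROR)
print(ET.tostring(upgraded, encoding="unicode"))
```

Because publisher is mandatory, a repository that applies this kind of change once, at the point where its records are generated, could add the organizational connection to every record it produces.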

Metadata schema evolution reflects the progression of needs, ideas, and practices of the community that creates and uses the metadata. Figure 1 summarizes the evolution of DataCite metadata between 2011 and 2024 and highlights changes related to increasing “FAIRness”. Three types of changes are shown:

•       bold text shows changes related to mandatory fields,

•       italic text shows additions to several shared vocabularies,

•       plain text shows properties introduced in various schema versions.

Overall, these changes reflect a marked increase in the capability to describe data in a FAIR way, along with several other long-term trends.

Figure 1. Schematic of DataCite metadata evolution through time.

All the schema updates described here are documented in the appendices of the DataCite schema documentation, which is the authoritative source. The changes I summarize here focus on improving capabilities related to FAIR use cases, mostly identifying various kinds of resources and making connections between them. I hope that presenting these changes together increases awareness of the many steps DataCite has taken to help data and metadata be more FAIR, and that this awareness encourages the community of metadata creators and managers to take advantage of these improvements.

Of course, these improvements in DataCite metadata are only effective if they are adopted by metadata creators, and it is clear that adoption currently lags far behind metadata evolution. Stathis et al. (2023) outlined the schema changes with tips on using DataCite metadata effectively. Measuring metadata creates an important baseline for identifying good examples and fruitful improvement opportunities. If you are interested in improving your DataCite metadata, please contact us at Metadata Game Changers.

References

Habermann, T. (2021). DataCite Metadata: Evolving to FAIRness. Front Matter. https://doi.org/10.59350/ftzmw-k9q02

Stathis, K., Chen, X., Cousijn, H., & Puebla, I. (2023, November 8). The DataCite Metadata Schema Through Time. A Decade of Data: Celebrating 10 Years of the Research Data Alliance, Online. Zenodo. https://doi.org/10.5281/zenodo.10081013