Metadata Game Changers

Blog

Exploring metadata, communities, and new ideas.

May 02, 2024

Making the Invisible Visible: Celebrating the Year of Open Science

May 02, 2024/ Ted Habermann

Ted Habermann, Metadata Game Changers

Jamaica Jones, University of Pittsburgh

Cite this blog as Habermann, T. (2024). Making the Invisible Visible: Celebrating the Year of Open Science. Front Matter. https://doi.org/10.59350/77zs1-hz764

Metadata Game Changers and the INFORMATE Project presented some of our recent work at the culminating conference that showcased the outcomes, coalition-building efforts, and ongoing work stemming from the 2023 Year of Open Science (YOS). Our talk focused on the FAIRness of DataCite metadata in university repositories, the consistency of funder metadata, and comparisons between CHORUS data from the global research infrastructure and several NSF repositories. Some highlights are described here; the slides are available here, and a recording of the talk is available.

University Metadata FAIRness

DataCite metadata from 387 university repositories was evaluated using the MetaDIG FAIRness recommendations (Habermann, 2019). The results, shown in Figure 1, indicate that, even though the DataCite metadata schema includes many elements that can support all of the FAIR use cases (findability, accessibility, interoperability, and reuse), most universities do not take advantage of these capabilities in their metadata. Instead, they focus on providing the minimal metadata required to get DOIs quickly.

Figure 1. Metadata completeness for FAIR use cases in university repositories at DataCite. The Y-axis shows the number of repositories, and the X-axis shows the completeness level. Completeness is highest for the Text use case because of mandatory DataCite metadata fields, and it decreases for all other use cases that correspond to the A, I, and R in FAIR.
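As a rough illustration of what a completeness score for a use case can look like, here is a minimal Python sketch. It assumes records shaped like DataCite JSON attributes, and the element groupings in `USE_CASE_ELEMENTS` are hypothetical placeholders, not the element lists from the MetaDIG recommendations. The function names are also ours, chosen for this example.

```python
from typing import Any, Dict, List

# Hypothetical groupings for illustration only; the actual MetaDIG
# recommendations define their own element lists for each use case.
USE_CASE_ELEMENTS: Dict[str, List[str]] = {
    "Text": ["titles", "creators", "publisher", "publicationYear"],
    "Reuse": ["rightsList", "relatedIdentifiers", "fundingReferences"],
}


def is_populated(value: Any) -> bool:
    """Treat missing values, empty strings, and empty containers as absent."""
    return value not in (None, "", [], {})


def completeness(records: List[Dict[str, Any]], elements: List[str]) -> float:
    """Fraction of (record, element) pairs that carry a value."""
    if not records or not elements:
        return 0.0
    filled = sum(
        1 for record in records for element in elements
        if is_populated(record.get(element))
    )
    return filled / (len(records) * len(elements))


if __name__ == "__main__":
    # Two made-up records standing in for one repository's metadata.
    sample = [
        {"titles": [{"title": "A"}], "creators": [{"name": "X"}],
         "publisher": "U", "publicationYear": 2023, "rightsList": []},
        {"titles": [{"title": "B"}], "creators": [{"name": "Y"}],
         "publisher": "U", "publicationYear": 2024},
    ]
    for use_case, elements in USE_CASE_ELEMENTS.items():
        print(use_case, completeness(sample, elements))
```

A repository that only fills in the mandatory fields would score high on the first group and near zero on the second, which is the pattern described above.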

Consistency of Funder Metadata

Consistent metadata is critical within and across repositories to correctly identify connections between research objects of many kinds. With the goal of characterizing contributions from specific awards, we searched the free-text award metadata in CHORUS retrievals from Crossref to find NSF award numbers. We were successful in identifying award numbers for 93% of the awards. However, this leaves almost 32,000 free-text award descriptions for which award numbers could not be identified (Figure 2). Almost half of these have errors in the length of the provided award number, i.e., the number is shorter or longer than the required seven digits. In other cases, award titles or apparently random text are provided instead of identifiers.

Figure 2. CHORUS data includes over 430,000 text descriptions of award numbers. Actionable award numbers could be recognized in 93% of these cases, leaving nearly 32,000 without recognizable award numbers.
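The kind of search described above can be sketched with a regular expression that looks for a run of exactly seven digits. This is a minimal sketch, not the procedure we actually used: the category names and the `classify_award_text` function are hypothetical, and the "wrong_length" rule is a crude heuristic that would also flag unrelated numbers such as years.

```python
import re
from typing import Optional, Tuple

# A run of exactly seven digits, not embedded in a longer digit run.
SEVEN_DIGITS = re.compile(r"(?<!\d)\d{7}(?!\d)")
ANY_DIGITS = re.compile(r"\d+")


def classify_award_text(text: str) -> Tuple[str, Optional[str]]:
    """Sort a free-text award description into one of three rough categories."""
    match = SEVEN_DIGITS.search(text)
    if match:
        # A candidate award number of the required length was found.
        return ("award_number", match.group())
    if ANY_DIGITS.search(text):
        # Digits are present, but no run has exactly seven of them.
        return ("wrong_length", None)
    # No digits at all, e.g. an award title or other text.
    return ("no_number", None)


if __name__ == "__main__":
    examples = ["DMS-1234567", "NSF 123456", "12345678", "Ocean Mixing Study"]
    for example in examples:
        print(example, classify_award_text(example))
```

The lookbehind and lookahead matter here: without them, an eight-digit string would be reported as containing a valid seven-digit number.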

The Denominator

Understanding how well agency repositories are doing at capturing information on research contributions requires an estimate of the total number of contributions that are out there, i.e., the denominator. CHORUS data provides one estimate of this denominator, and comparing CHORUS with the NSF Public Access Repository (PAR) indicates that 36% of the articles included in CHORUS are also recorded in PAR (Figure 3). A second estimate of this denominator comes from the NSF Award database. In this case, 6% of the awards in the Award Database are included in PAR.

Figure 3. Comparisons of the NSF Public Access Repository (PAR) with several estimates of the denominator.
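The percentages above are coverage ratios: the share of items in the denominator that also appear in the repository. A minimal Python sketch follows, assuming items are matched by DOI after simple normalization. Matching by DOI, the `normalize_doi` helper, and the sample identifiers are assumptions made for illustration; they are not a description of how the comparison was actually done.

```python
from typing import Iterable


def normalize_doi(doi: str) -> str:
    """Lower-case a DOI and strip common resolver or scheme prefixes."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def coverage(repository_dois: Iterable[str], denominator_dois: Iterable[str]) -> float:
    """Fraction of denominator items that are also present in the repository."""
    denominator = {normalize_doi(d) for d in denominator_dois}
    if not denominator:
        return 0.0
    repository = {normalize_doi(d) for d in repository_dois}
    return len(repository & denominator) / len(denominator)


if __name__ == "__main__":
    # Made-up identifiers standing in for the two sources.
    denominator_sample = ["10.1000/A", "https://doi.org/10.1000/b",
                          "10.1000/c", "doi:10.1000/d"]
    repository_sample = ["10.1000/a", "10.1000/D", "10.1000/z"]
    print(f"{coverage(repository_sample, denominator_sample):.0%}")
```

The choice of denominator drives the result, which is why the article-based and award-based estimates above differ so much.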

Conclusion

This talk describes several challenges identified while exploring the global research infrastructure using CHORUS data for NSF and other agencies. Despite these challenges, the talk finishes by describing several successes: using existing metadata from institutional repositories to improve DataCite completeness, and re-curating affiliations and organizational identifiers into Dryad.

The recording of the talk is available here, and the slides are available here. Please let us know if you have questions, or comment below.

References

Habermann, T. (2019). MetaDIG recommendations for FAIR DataCite metadata. Front Matter. https://doi.org/10.59350/n31gm-kg364

informate, presentation, FAIR, metadata evaluation, funder metadata
