Discovery Through Links
/Metadata records are high-value information hubs that connect users to related resources in journals, books, and on the web.
Read MoreMetadata records are high-value information hubs that connect users to related resources in journals, books, and on the web.
Read MoreThe Doe family consists of mother Jane and her son John. Identifying people and relationships is an important function of metadata. The ISO TC211 metadata standards provide a comprehensive framework for many important metadata use cases. Could ISO metadata be used to describe the Doe family?
ISO metadata describes objects with types and properties. ISO types generally start with uppercase letters followed by an underscore (MD_, CI_, ...) and have identifiers (id and uuid). Properties are in lower camel case (e.g. contactInfo) and have references (xlink:href and uuidref) that link to objects with content for those properties.
In ISO metadata, Jane and John are objects that have types (CI_Individual) and identifiers (JaneDoe and JohnDoe) which are given as id properties. Jane has a son (the property) whose identifier is JohnDoe, indicated with the reference (xlink:href) to JohnDoe associated with the property of son. John has a mother (the property) named JaneDoe, indicated by the reference to Jane associated with the mother property.
Using identifiers and references simplifies metadata creation and maintenance, improves consistency, and facilitates reuse of people, organizations, or citations referenced using links (like web pages) rather than repetitive content. More about that later.
This blog is included in a series related to ISO TC211 metadata standards. The goal of the series is to improve understanding and to examine and compare how the standards are being used by various groups around the world. Some of the background content was originally available on the NOAA GEO-IDE wiki. It will be updated in this series.
The idea that the language we use to talk about things shapes the way we think or can think about those things has been around since the 1800’s and even has a name, the Sapir–Whorf hypothesis, proposed during 1954. It was Whorf who said, “Language is not simply a reporting device for experience but a defining framework for it.” Last year Lera Boroditsky discussed a similar idea from the stage at TEDWomen with some nice examples and data from multiple languages and cultures. I have been thinking and writing about a universal documentation language for some time and bring together a couple of those ideas here.
Some metadata terms emerged from my metadata evaluation and guidance work with many partners. I described the concept of “metadata dialects” and suggested that many metadata standards are more like dialects of a universal documentation language then they are like separate languages. Some have questioned whether a universal “documentation language” really exists. I admit that it is really a concept that I hope exists rather than a real language described in an unabridged dictionary somewhere.
More recently, I introduced this dialect nomenclature to the Metadata 2020 community of metadata experts that advocate richer, connected, reusable, and open metadata for all research outputs. The terms are slowly creeping into some Metadata 2020 discussions, hopefully helping to build and cross bridges between different communities that are committed to better metadata in all contexts.
Many datasets and products are documented using approaches and tools developed by data collectors to support their analysis and understanding. This documentation exists in notebooks, scientific papers, web pages, user guides, word processing documents, spreadsheets, data dictionaries, PDF’s, databases, custom binary and ASCII formats, and almost any other conceivable form, each with associated storage and preservation strategies. This custom, often unstructured, approach may work well for independent investigators or in the confines of a particular laboratory or community, but it makes it difficult for users outside of these small groups to discover, use, and understand the data without consulting with its creators.
Metadata are standard and structured documentation.
Metadata, in contrast to documentation, helps address discovery, use, and understanding by providing well-defined, structured content. This makes it possible for users to access and quickly understand many aspects of datasets that they have not collected. It also makes it possible to integrate information into discovery and analysis tools, and to provide consistent references from the metadata to external documentation.
Metadata standards provide standard element names and associated structures that can describe a wide variety of digital resources. The definitions and domain values are intended to be sufficiently generic to satisfy the metadata needs of various disciplines. These standards also include references to external documentation and well-defined mechanisms for adding structured information to address specific community needs.
Another important difference between documentation and metadata is the target audience. Documentation is targeted at humans and it relies heavily on our capability to make sense out of a variety of unstructured information. Metadata, on the other hand, is typically targeted at applications. Many of these applications facilitate searching metadata and displaying it in a way that facilitates data discovery by humans. As tools mature and, more importantly, the breadth of existing metadata increases, we will see more and more applications creating and using metadata to facilitate more sophisticated metadata and data driven discovery, comparisons between multiple datasets, and other analyses.
Of course, the audience is also very important when we create metadata. Humans like descriptions that help them understand the resources being described and citations to more, likely unstructured, information. Applications are generally much more demanding when it comes to consistency and completeness. It is important to consider both audiences when creating and improving metadata.
Note added: It is interesting to see that the word “documentation” has a much longer history than the word “metadata”. Metadata is really the new kid on the block.
I have worked in scientific data management for many years and enjoy working with organizations and communities that share data and knowledge. I am fluent in metadata standards and dialects used in scientific data management and publishing.
We are constantly working to help you change your metadata game. If you have any questions, suggestions, or crazy ideas, please send contact us or connect with us through the details below.
Ted Habermann
ted@metadatagamechangers.com
ORCID | LinkedIn | Twitter
Erin Robinson
erin@metadatagamechangers.com
ORCID | LinkedIn | Twitter
or use this form.
Search the site:
Powered by Squarespace.