Documentation Cluster Blog: ESIP Summer Meeting Metadata Improvement Lab

Submitted by scgordon on Tue, 2016-06-14 15:07

Complete documentation of scientific data is the surest way to facilitate discovery and reuse, particularly if you use a standardized metadata dialect. What is complete metadata, though? How can you be sure that what you include in the metadata is not only relevant to your organization’s work, but also understandable to your scientific community and beyond?

At the 2016 ESIP Summer Meeting, I will be demonstrating a process to analyze the content of metadata in many standard dialects with respect to more than 10 different recommendations of particular interest to the earth, space, and ecological scientific communities. You need only an internet-connected web browser or Microsoft Excel to participate in this hands-on activity.

I’d like to invite anyone to share metadata in one of the dialects below for use in this lab, even if you are unable to attend.

Known Metadata Dialects

ISO 19115-1 / ISO 19115-3

Directory Interchange Format (DIF9)

Directory Interchange Format (DIF10)

DataCite 3.1

Content Standard for Digital Geospatial Metadata (CSDGM)

Content Standard for Digital Geospatial Metadata (CSDGM) Biological Data Profile

ECHO

ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2

Metadata Object Description Schema (MODS)

Dryad

DataONE Dublin Core Extended v1.0

Mercury Metadata Standard

Attribute Convention for Data Discovery (ACDD)

Ecological Metadata Language (EML)

Many organizations, such as the OGC, FGDC, NASA, and LTER, publish metadata recommendations that provide documentation guidance. Organizations often develop their recommendations for a particular metadata dialect. However, the concepts being described are similar, and often exactly the same, across dialects. For example, many dialects use an element to hold the resource’s title. Because there are many such similarities, we can quantitatively report on a collection in any of many dialects against the concepts contained in many recommendations.

Metadata Recommendations	Originating Organization
CSW_Discovery	Open Geospatial Consortium
ISO-1_Discovery	International Organization for Standardization
DIF_Discovery	National Aeronautics and Space Administration
FGDC_Discovery	Federal Geographic Data Committee
DataCite_Discovery	DataCite
DCAT_Discovery	World Wide Web Consortium
ECHO_Discovery	National Aeronautics and Space Administration
ACDD_Discovery	University Corporation for Atmospheric Research / ESIP Documentation Cluster
UMM-Collection	National Aeronautics and Space Administration
UMM-Common	National Aeronautics and Space Administration
UMM-Granule	National Aeronautics and Space Administration
LTER_Completeness	Long Term Ecological Research Network
Dryad-Package	Dryad Digital Repository
Dryad-File	Dryad Digital Repository
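
To make the idea of shared concepts concrete, here is a minimal Python sketch of a concept crosswalk: one concept name mapped to the element that holds it in each dialect. The `CROSSWALK` table, the `concept_text` function, and the two sample records are hypothetical and heavily simplified (namespaces are omitted, and only two dialects and two concepts are shown). This is not the XSL used for the lab, just an illustration of the approach.

```python
import xml.etree.ElementTree as ET

# Hypothetical crosswalk: concept -> {dialect: element path relative to the record root}.
# Paths are simplified for illustration; real records usually carry namespaces.
CROSSWALK = {
    "Title": {
        "CSDGM": "idinfo/citation/citeinfo/title",
        "EML": "dataset/title",
    },
    "Abstract": {
        "CSDGM": "idinfo/descript/abstract",
        "EML": "dataset/abstract",
    },
}

# Tiny made-up sample records, one per dialect.
CSDGM_SAMPLE = (
    "<metadata><idinfo><citation><citeinfo>"
    "<title>Sea Ice Extent</title>"
    "</citeinfo></citation></idinfo></metadata>"
)
EML_SAMPLE = (
    "<eml><dataset><title>Lake Chemistry</title>"
    "<abstract><para>Weekly samples.</para></abstract>"
    "</dataset></eml>"
)


def concept_text(record_xml, dialect, concept):
    """Return the text of a concept in a record, or None if it is absent or empty."""
    path = CROSSWALK[concept].get(dialect)
    if path is None:
        return None  # the crosswalk has no mapping for this dialect
    node = ET.fromstring(record_xml).find(path)
    if node is None:
        return None  # the element is missing from this record
    text = "".join(node.itertext()).strip()
    return text or None


if __name__ == "__main__":
    for dialect, record in (("CSDGM", CSDGM_SAMPLE), ("EML", EML_SAMPLE)):
        for concept in CROSSWALK:
            print(dialect, concept, concept_text(record, dialect, concept))
```

Because the lookup is keyed by concept rather than by element name, the same report can be run over records in either dialect.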

Participants can use metadata collections from their own organizations as long as they are submitted in advance, but we will also provide some sample data if you just want to drop in. Please send 50 or more XML records from your collection. If you are also interested in dialect translation proofing, you may send samples containing up to four dialects.

I will use XSL prior to the lab to mine the collections for concept content. In the lab, you will learn how to use the resulting data and a recommendation’s concepts to assess the completeness of records in the collections, and how to use that information to drive an iterative process that improves the completeness of current records and sets up future metadata creation for success.
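
As a rough illustration of what assessing completeness can look like once the concept content has been mined, the Python sketch below scores a small collection against a recommendation. The `RECOMMENDATION` list, the function names, and the sample data are assumptions made up for this example; the lab itself works from XSL output in a browser or Excel, and may define completeness differently.

```python
from collections import Counter

# Hypothetical recommendation: the concepts each record is expected to contain.
RECOMMENDATION = ["Title", "Abstract", "Keyword", "Spatial Extent"]

# Made-up mined data: for each record, the set of concepts found in it.
RECORDS = [
    {"Title", "Abstract", "Keyword"},
    {"Title"},
    {"Title", "Abstract", "Keyword", "Spatial Extent"},
    {"Title", "Keyword"},
]


def concept_coverage(records, concepts):
    """Fraction of records in the collection that contain each concept."""
    counts = Counter()
    for found in records:
        counts.update(c for c in concepts if c in found)
    total = len(records)
    return {c: (counts[c] / total if total else 0.0) for c in concepts}


def record_completeness(found, concepts):
    """Fraction of the recommendation's concepts present in one record."""
    return sum(1 for c in concepts if c in found) / len(concepts)


if __name__ == "__main__":
    coverage = concept_coverage(RECORDS, RECOMMENDATION)
    # List the least-covered concepts first: these are candidates for improvement.
    for concept in sorted(coverage, key=coverage.get):
        print(f"{concept}: {coverage[concept]:.0%} of records")
    for i, found in enumerate(RECORDS, start=1):
        print(f"record {i}: {record_completeness(found, RECOMMENDATION):.0%} complete")
```

Re-running the same scoring after each round of edits gives a simple way to track whether a collection is actually getting more complete.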

I hope to see you there!

Comments

Table Formatting.

Permalink Submitted by scgordon on Tue, 2016-06-14 15:14
