The science community has long recognized the importance of citing data in published literature to encourage replication of experiments and verification of results. Authors who try to cite their data often find that publishers will not accept Internet addresses, which are viewed as transient references that the data provider frequently changes after the paper is published. Digital Object Identifiers (DOIs) and the DOI® System were created to avoid this problem by providing a unique and persistent identifier scheme and an online resolution service. DOIs and the Internet service provided by the DOI System have emerged as the scheme most acceptable to publishers. NASA’s Earth Science Data and Information System (ESDIS) Project, in cooperation with several Earth Observing System (EOS) instrument teams and data providers, has developed methods for assigning DOIs to EOS products. By assigning DOIs, we are making it easier and more compelling for authors and publishers to cite EOS data products.
DOIs are unique alphanumeric strings that consist of a prefix and a suffix. The prefix is assigned by a registration agency for the DOI System. The suffix must be unique but is otherwise free to be constructed by the publisher, in this case the NASA ESDIS Project. A strategy was needed for constructing DOI suffix names that correspond to each EOS product. Since the onset of the DOI System, publishers have developed conventions to suit their own purposes. These range from random generation to complex, formally controlled vocabularies. An overarching ESDIS goal has been for the DOI names to be attractive for researchers to use in publication applications. Keeping them short and simple is paramount. When adding meaning to the string, it is also important that the name refer only to the data and not to the publisher, so that the DOI can be accepted as persistent even if the data is moved to a new publisher.
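The prefix and suffix structure described above can be illustrated with a minimal Python sketch. It is not the ESDIS implementation. It assumes only the general DOI convention that the prefix begins with "10." and is separated from the suffix by the first "/", and that names resolve through the public https://doi.org/ service. The helper names `parse_doi` and `resolver_url` and the DOI `10.1234/EXAMPLE/PRODUCT01` are hypothetical and used purely for illustration.

```python
from typing import NamedTuple
from urllib.parse import quote


class Doi(NamedTuple):
    """A DOI name split into its two parts."""
    prefix: str  # assigned by a registration agency, e.g. "10.1234"
    suffix: str  # chosen by the publisher; must be unique under the prefix


def parse_doi(name: str) -> Doi:
    """Split a DOI name at the first '/' into prefix and suffix."""
    prefix, sep, suffix = name.strip().partition("/")
    if not sep or not suffix:
        raise ValueError(f"not a DOI name (missing suffix): {name!r}")
    if not prefix.startswith("10.") or len(prefix) <= 3:
        raise ValueError(f"not a DOI name (bad prefix): {name!r}")
    return Doi(prefix, suffix)


def resolver_url(doi: Doi) -> str:
    """Build the URL that asks the DOI resolution service to locate the object."""
    # Percent-encode characters that are unsafe in a URL path; keep '/' as-is.
    return "https://doi.org/" + doi.prefix + "/" + quote(doi.suffix, safe="/")


# Hypothetical DOI name, for illustration only.
example = parse_doi("10.1234/EXAMPLE/PRODUCT01")
print(example.prefix, example.suffix)
print(resolver_url(example))
```

Because the citation carries only the DOI name, and the resolver maps that name to the product's current location, the citation can remain valid if the data moves to a new publisher.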
Most users download EOS product files to their local facilities when they want to use the data for analysis or applications. Embedding DOIs in the file metadata gives users access to the DOI value long after the product has left the source data center. This enables users to find documentation about the product in the future, long after it has left the contextual environment of the data provider. Existing HDF and netCDF metadata structures have been adapted to accommodate the addition of DOIs. In addition, the associated EOSDIS core metadata will contain a product-specific attribute for DOIs.
Advances in computer science and the Internet have brought about a host of data identification schemes designed to solve problems inherent in developing advanced provenance models. Lessons from trying to use early satellite observations in climate studies today point to the importance of providing links in data archives to documentation and publications about the data. Data system engineers link data records to standard product documentation prepared at the time of the mission and archived with the data, but they will need to add links to the whole range of information needed to support future research and long-term climate studies. DOIs can serve this need if developers reference them when preparing technical data and reports, as well as when publishing research results.