In 2014 the National Center for Atmospheric Research (NCAR) Directorate created the Data Stewardship Engineering Team (DSET) to plan and implement the strategic vision of an integrated front door for data discovery and access across the organization, including all laboratories, the library, and UCAR Community Programs. The DSET is focused on improving the quality of users’ experiences in finding and using NCAR’s digital assets. This effort also supports new policies included in federal mandates, NSF requirements, and journal publication rules.
An initial survey with 97 respondents identified 68 persons responsible for more than 3 petabytes of data. An inventory, using the Data Asset Framework produced by the UK Digital Curation Centre as a starting point, identified asset types that included files and metadata, publications, images, and software (visualization, analysis, model codes). User story sessions with representatives from each lab identified and ranked desired features for a unified Scientific Data Management System (SDMS). A process that began with an organization-wide assessment of metadata by the HDF Group, followed by meetings with labs to identify key documentation concepts, culminated in the development of an NCAR metadata dialect that leverages the DataCite and ISO 19115 standards.
The tasks ahead are to build out an SDMS and populate it with rich, standardized metadata. Software packages have been prototyped and are currently being tested and reviewed by DSET members. Key challenges for the DSET include both technical and non-technical issues. First, asset management practices vary widely across the organization: there are differences in file format standards, technologies, and discipline-specific vocabularies. Second, metadata diversity is a real challenge: the types of metadata, the standards used, and the capacity to create new metadata all vary across the organization. Significant effort is required to develop tools to create new standard metadata across the organization, adapt and integrate current digital assets, and establish consistent data management practices going forward. To be successful, best practices must be infused into daily activities.
This poster will highlight the processes, lessons learned, and current status of the DSET effort at NCAR.