Progress in Open-World, Integrative, Collaborative Science Data Platforms.

Abstract: 

As collaborative, or network, science spreads into more Earth and space science fields, both the participants and their funders have expressed a very strong desire for highly functional data and information capabilities that a) are easy to use, b) are integrated in a variety of ways, c) leverage prior investments and keep pace with rapid technical change, and d) are not expensive or time-consuming to build or maintain. In response, and based on our accumulated experience over the last decade and the maturing of several key technical approaches, we have adapted, extended, and integrated several open source applications and frameworks that handle major portions of the functionality for these platforms. At minimum, these functions include an object-type repository, collaboration tools, an ability to identify and manage all key entities in the platform, and an integrated portal to manage diverse content and applications, with varied access levels and privacy options.

At a conceptual level, science networks (even small ones) deal with people and the many intellectual artifacts produced or consumed in research, organizational, and/or outreach activities, as well as the relations among them. Increasingly, these networks are modeled as knowledge networks, i.e., graphs with named and typed relations among the 'nodes'. Nodes can be people, organizations, datasets, events, presentations, publications, videos, meetings, reports, groups, and more. In this heterogeneous ecosystem, it is also important to use a set of common informatics approaches to co-design and co-evolve the needed science data platforms based on what real people want to use them for.
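
As a minimal illustration of the knowledge-network idea described above (a graph whose nodes carry a type and whose relations carry a name), the following Python sketch may help. It is not the platform's actual data model: the class names, identifiers, node types, and relation names (`Node`, `KnowledgeNetwork`, `authorOf`, and so on) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """A typed node, e.g. a person, dataset, or publication (types are illustrative)."""
    node_id: str
    node_type: str
    label: str


@dataclass
class KnowledgeNetwork:
    """A small graph with named relations among typed nodes."""
    nodes: dict = field(default_factory=dict)
    # Each edge is a (subject_id, relation_name, object_id) triple.
    edges: set = field(default_factory=set)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def relate(self, subject_id: str, relation: str, object_id: str) -> None:
        # Only allow relations between nodes that are already registered.
        if subject_id not in self.nodes or object_id not in self.nodes:
            raise KeyError("both endpoints must be registered nodes")
        self.edges.add((subject_id, relation, object_id))

    def related(self, subject_id: str, relation: str) -> list:
        """Return the sorted labels of nodes reached from subject_id via relation."""
        return sorted(
            self.nodes[o].label
            for s, r, o in self.edges
            if s == subject_id and r == relation
        )


# Hypothetical example content; none of these identifiers come from a real system.
net = KnowledgeNetwork()
net.add_node(Node("p1", "Person", "A. Researcher"))
net.add_node(Node("d1", "Dataset", "Example dataset"))
net.add_node(Node("pub1", "Publication", "Example paper"))
net.relate("p1", "authorOf", "pub1")
net.relate("p1", "contributorTo", "d1")
net.relate("pub1", "cites", "d1")

print(net.related("p1", "authorOf"))
```

Keeping the relation name on every edge is what distinguishes this from a plain untyped graph: the same pair of nodes can be linked in several distinct, queryable ways.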

In this contribution, we present our methods and results for information modeling, and for adapting, integrating, and evolving a networked data science and information architecture based on several open source technologies: Drupal, VIVO, the Comprehensive Knowledge Archive Network (CKAN), and the Global Handle System (GHS). In particular, we present the instantiation of this data platform for the Deep Carbon Observatory, including key functional and non-functional attributes and how the smart mediation among the components is modeled and managed, and we discuss its general applicability.

Author(s): 

Name: Peter Fox
Organization(s): TWC
Email: [email protected]