Development of cyberinfrastructure to facilitate collaboration and knowledge sharing for marine Integrated Ecosystem Assessments


We present an approach that helps scientists collaborate on multidisciplinary research by providing a broad set of software tools for data science and by making their research outputs reproducible. The approach centers on a web application, the IPython Notebook, which lets scientists work with diverse, heterogeneous data and information sources, share the source code used to generate data products and their associated metadata, and save and track workflow provenance. A key feature of IPython (Interactive Python) is that metadata can be generated while data are accessed and processed, and then embedded in the Notebook. We are currently developing functionality to collect the provenance generated at each run of a workflow and to store this metadata in the JSON-LD (JSON for Linking Data) standard format. This makes it possible to record the provenance of derived data products and to trace them back to their original sources and to the processing that generated them.
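As an illustration of the kind of record such a workflow run might produce, the following minimal sketch builds a JSON-LD provenance document for one derived data product. It is not the implementation described in this abstract: the use of the W3C PROV-O vocabulary, the function names `build_provenance` and `trace_sources`, and all `urn:example:` identifiers are assumptions made for the example.

```python
import json
from datetime import datetime, timezone

# Assumption: provenance terms are taken from the W3C PROV-O vocabulary.
PROV_CONTEXT = {
    "prov": "http://www.w3.org/ns/prov#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def build_provenance(output_id, source_ids, activity_id, started, ended):
    """Return a JSON-LD dict describing how one data product was derived.

    All identifiers are hypothetical IRIs supplied by the caller.
    """
    sources = [{"@id": source_id} for source_id in source_ids]
    return {
        "@context": PROV_CONTEXT,
        "@id": output_id,
        "@type": "prov:Entity",
        # Direct link from the product back to its original sources.
        "prov:wasDerivedFrom": sources,
        # The workflow run (activity) that generated the product.
        "prov:wasGeneratedBy": {
            "@id": activity_id,
            "@type": "prov:Activity",
            "prov:used": sources,
            "prov:startedAtTime": {
                "@type": "xsd:dateTime",
                "@value": started.isoformat(),
            },
            "prov:endedAtTime": {
                "@type": "xsd:dateTime",
                "@value": ended.isoformat(),
            },
        },
    }


def trace_sources(record):
    """List the identifiers of the sources a product was derived from."""
    return [ref["@id"] for ref in record["prov:wasDerivedFrom"]]


# Hypothetical example: one workflow run deriving a product from two inputs.
record = build_provenance(
    output_id="urn:example:product:sst-anomaly",
    source_ids=["urn:example:data:sst-raw", "urn:example:data:climatology"],
    activity_id="urn:example:run:0001",
    started=datetime(2014, 1, 1, 12, 0, tzinfo=timezone.utc),
    ended=datetime(2014, 1, 1, 12, 5, tzinfo=timezone.utc),
)
print(json.dumps(record, indent=2))
```

Because the record is plain JSON, it could be written next to the data product or kept alongside the Notebook; where it is stored in practice is left open here.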

Creative Commons License:
Creative Commons Attribution 3.0 License