Provenance of Earth Science Datasets – How Deep Should One Go?


Transparency and reproducibility are essential to the credibility of scientific research. This fundamental tenet has been emphasized for centuries and has received increased attention in recent years. The Office of Management and Budget (2002) addressed reproducibility and other aspects of the quality and utility of information from federal agencies. Specific guidelines from NASA (2002) are derived from the OMB guidance. According to these guidelines, “NASA requires a higher standard of quality for information that is considered influential. Influential scientific, financial, or statistical information is defined as NASA information that, when disseminated, will have or does have clear and substantial impact on important public policies or important private sector decisions.” For information to be compliant, “the information must be transparent and reproducible to the greatest possible extent.”

We present how the principles of transparency and reproducibility have been applied to NASA data supporting the Third National Climate Assessment (NCA3). How deeply the provenance of data behind a conclusion in NCA3 needs to be traced depends on how the data were used (e.g., qualitatively or quantitatively). Provided that the information is diligently maintained in the agency archives, it is possible to trace from a figure in the publication back through the datasets, specific files, algorithm versions, instruments used for data collection, and satellites, as well as the individuals and organizations involved in each step. Such a traceback supports transparency and reproducibility.
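The traceback described above can be pictured as a walk over a graph of "derived from" links. The following is a minimal sketch only, not the system used for NCA3: the identifiers, the `DERIVED_FROM` mapping, and the `trace` function with its `max_depth` parameter are all hypothetical placeholders chosen for illustration.

```python
from collections import deque

# Hypothetical provenance links: each key was derived from, or produced by,
# the items listed for it. All identifiers are illustrative placeholders,
# not real NCA3 or NASA records.
DERIVED_FROM = {
    "figure:example": ["dataset:example"],
    "dataset:example": ["file:example-001", "algorithm:example-v1"],
    "file:example-001": ["instrument:example"],
    "instrument:example": ["satellite:example"],
}


def trace(start, links, max_depth=None):
    """Breadth-first walk from `start`, returning (node, depth) pairs.

    `max_depth` limits how far back the trace goes; None means no limit.
    """
    seen = {start}
    order = [(start, 0)]
    queue = deque(order)
    while queue:
        node, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand beyond the requested depth
        for parent in links.get(node, []):
            if parent not in seen:
                seen.add(parent)
                order.append((parent, depth + 1))
                queue.append((parent, depth + 1))
    return order


if __name__ == "__main__":
    for node, depth in trace("figure:example", DERIVED_FROM):
        print("  " * depth + node)
```

Calling `trace` with a small `max_depth` stops at the dataset level, while leaving it unset follows the chain back to the satellite. Which depth suits which kind of use is a policy choice that this sketch does not decide.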

(This poster was presented at the December 2015 AGU meeting and would be useful to present at the ESIP meeting as well.)

The full list of authors is: Gerald Manipon (1), Hampapuram Ramapriyan (2), Steve Aulenbach (3), Brian Duggan (4), Justin Goldstein (5), Hook Hua (1), Dexter Tan (6), Curt Tilmes (7), Brian Wilson (1), Robert Wolfe (5), Stephan Zednik (8).

Affiliations: (1) Jet Propulsion Laboratory, California Institute of Technology; (2) Science Systems and Applications, Inc.; (3) US Geological Survey; (4) PromptWorks; (5) US Global Change Research Program; (6) Raytheon Company, Pasadena; (7) NASA Goddard Space Flight Center; (8) Rensselaer Polytechnic Institute.