NASA EOSDIS - Architecting for the Future
In 2006, NASA undertook a concerted effort to modernize and improve EOSDIS, the system of systems built to support the varied and, in many cases, voluminous data and information sets that began flowing into the system in the late 1990s. With an external advisory panel’s guidance and a resultant set of Vision for 2015 Tenets, the systems were updated to lower the cost of operations and maintenance, improve IT currency, and improve responsiveness to the research communities that EOSDIS primarily serves.
At present, each of the twelve Distributed Active Archive Centers (DAACs) continues to serve its individual user community very well. However, the types of services available vary greatly from DAAC to DAAC. Some user communities still feel that Earth science data are difficult to find, understand, and use. This is particularly true for non-experts in remote sensing.
The purpose of this session is to explore ideas for enhancing the capabilities of EOSDIS as a whole. The focus will be on identifying technologies, standards, and best practices to not only improve data accessibility and usability for our traditional user communities, but also to lower the barriers for interdisciplinary users, modelers, applied science users, application users, and decision makers.
Note on session notes: ;-)
I am only attempting to capture info that veers from the content covered in meeting slides. In other words, this section will attempt to capture questions and comments during the session.
Intro by Jeff Walter
Q: Has the extension of the tenets, discussed at the DAACs last year, been implemented?
A: Has been updated and incorporated.
Handoff to Andy...
Q: What fraction of EOS data is inventoried in ECHO?
A: All EOSDIS data should be in GCMD and ECHO (2900 collections). The GCMD ID is available in ECHO and vice versa, and they match up. Tyler: everything in the IDN is also in the GCMD.
Q: Has anybody built any RESTful interfaces into GCMD?
A: Ongoing process, some groups have expressed interest.
Q: How current is ECHO?
A: Depends on ingest and on the dataset. Some data are near real-time; some arrive as a periodic push (not necessarily monthly, but at some arbitrary interval).
Q: For a given record, what's the difference between the Langley record and the ECHO record?
A: Not exactly the same schema, but pretty close.
Q: Definition of ordering?
A: Staged with notification.
Q: Can I do the same search on a partner site as on the ECHO site?
A: Yes, this is the bread and butter of ECHO.
Q: Federated vs. Centralized commentary, what are benefits and drawbacks? Mandate was simplicity for data discovery and open search.
Comment: At the time ECHO was conceived, a replication strategy was not feasible.
Comment: Data index volume: 1.5 TB, metadata archive is 4.5 TB. More about transactions than volume though.
Handoff to Kevin Murphy...
Q: Why do people look at browse images?
A: Many reasons. Is my area of interest available? Initial context: is the scene cloud free?
Q: Name of usability company hired? (Usability eval done across NASA web sites)
A: Blink. Had some interesting observations, e.g. how many times can you say "data" on one page?
BREAK occurred here... 3:00 PM
Post break, Kevin Murphy resumes...
Q (Peter, during the WorldView demo): Can I go and download that scene as an HDF file?
A: That's what we're working on.
Clarification: "Browse" is part of the discovery process. Tommy thinks that, even more, it's the initial analysis.
Q: Why not utilize WCPS?
A: Need to evaluate protocols (OPeNDAP, etc.) for performance.
Q: Context seems to be MODIS-centric. Extensible to JPSS etc?
A: That is the reason for using WMTS. Can envision this model used across agencies: ESA, JAXA, etc.
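Assuming the protocol named in the answer above is the OGC Web Map Tile Service (WMTS), a tile request is just a set of standard key-value parameters, which is part of why the same client could talk to servers at different agencies. The sketch below builds such a GetTile URL. The endpoint, layer name, and tile matrix set name are hypothetical placeholders, not values from the session or from any real server.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def wmts_get_tile_url(endpoint, layer, tile_matrix_set, tile_matrix,
                      row, col, image_format="image/jpeg", style="default"):
    """Build a WMTS 1.0.0 key-value-pair GetTile request URL."""
    params = {
        "SERVICE": "WMTS",
        "REQUEST": "GetTile",
        "VERSION": "1.0.0",
        "LAYER": layer,
        "STYLE": style,
        "TILEMATRIXSET": tile_matrix_set,  # named tiling scheme advertised by the server
        "TILEMATRIX": tile_matrix,         # zoom level within that scheme
        "TILEROW": row,
        "TILECOL": col,
        "FORMAT": image_format,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and identifiers, for illustration only.
url = wmts_get_tile_url(
    endpoint="https://example.invalid/wmts",
    layer="hypothetical_modis_layer",
    tile_matrix_set="hypothetical_250m",
    tile_matrix=2,
    row=1,
    col=3,
)
query = parse_qs(urlparse(url).query)
print(url)
```

Only the layer and tile matrix set identifiers change from one server or instrument to another; real values would come from a server's GetCapabilities document.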
Q: Has there been discussion with users as to how their workflow will be affected by new system?
A: Metadata study 2.0 - yes, a study is being done on how to add value through the metadata chain.
Confusion over the Centralized Metadata Architecture slide. (There are errors in the slide arrows.)
Much commentary over how metadata search, integrity, etc. will be handled.
Q: Metadata DB schema considerations?
A: TBD - Need to choose what is the authoritative information to store and provide.
Q: Will it be easy for various datacenters to agree on format and info?
A (Frank Lindsay): NO ;-)
Q: What architectures are being considered?
A: TBD (Tommy - problem is solvable - maybe Hadoop given the metadata volume)
Frew comments on flaws in the original EOSDIS architecture, which was based on centralization; accessibility needs to be emphasized here.
Comments from others on concerns of scalability. Not a new problem - ref Netflix, Amazon, etc.
Various commentary on data access, value/use of FTP, security.