NASA EOSDIS Evolving Technologies Discussion

Abstract/Agenda: 

NASA's Earth Observing System Data and Information System (EOSDIS) continues its work on a number of different projects, systems, and initiatives. These efforts are aimed at increasing discovery and use of EOSDIS data, improving the quality of the holdings, and lowering the bar to data use and participation in Earth Science Research.  This session will introduce new capabilities and offer a forum for discussion on:

  • EOSDIS' Next Generation Application Platform (NGAP) and its support for Cloud Deployment
  • The Common Metadata Repository (CMR) and Unified Metadata Model (UMM) updates
  • New support for Virtual Products, DIF interoperability, and APIs to simplify web application development
  • Approaches to improving metadata quality and authoring through CMR's new Metadata Management Tool
  • Updates and roadmaps for Earthdata, the Earthdata Code Collaborative, and Earthdata Search
  • Updates on the EOSDIS Open Source Governance plan

 

Notes: 


1. CMR+UMM Overview

  • EOSDIS Metadata challenges

    • lots of legacy systems

    • lots of archived metadata

    • ISO 19115 is verbose

    • varying levels of metadata, consistency, and completeness

  • Unified Metadata Model (UMM): science metadata -> UMM-Collection (GCMD DIF, ECHO10, ESDIS, ISO 19115)

  • Interactive data discovery, e.g. efficient spatial search

    • Common Metadata Repository (CMR): unified, high-quality, and reliable; response times under 1 s; supports huge numbers of geospatial items and scale; fault tolerant

    • ---> meets the performance and scalability needs

    • 95% of granule searches complete in less than 700 ms

    • web friendly (lightweight JSON)
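
As a rough illustration of the "spatial search" and "lightweight JSON" points above, the following sketch builds a granule search URL with a bounding box and pulls titles out of a parsed JSON response. The endpoint, the parameter names (`short_name`, `bounding_box`, `page_size`), and the `feed`/`entry` response layout are assumptions taken from the public CMR search API rather than from this session, and `MOD09GA` is only an example collection name.

```python
from urllib.parse import urlencode

# Assumed public CMR search endpoint; it is not stated in these notes.
CMR_GRANULE_SEARCH = "https://cmr.earthdata.nasa.gov/search/granules.json"


def build_granule_query(short_name, bbox, page_size=10):
    """Return a granule-search URL for one collection and a bounding box.

    bbox is (west, south, east, north) in decimal degrees.
    """
    west, south, east, north = bbox
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitude out of range")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitude out of range")
    params = {
        "short_name": short_name,
        "bounding_box": f"{west},{south},{east},{north}",
        "page_size": page_size,
    }
    return f"{CMR_GRANULE_SEARCH}?{urlencode(params)}"


def entry_titles(response_json):
    """Extract granule titles from an already-parsed JSON response.

    Assumes an Atom-style layout: {"feed": {"entry": [{"title": ...}, ...]}}.
    """
    entries = response_json.get("feed", {}).get("entry", [])
    return [entry.get("title", "") for entry in entries]


if __name__ == "__main__":
    # Example only: prints the URL; fetching it is left to the caller.
    print(build_granule_query("MOD09GA", (-10, 40, 5, 55)))
```

The URL can be fetched with any HTTP client; the sketch deliberately stops short of the network call so that it runs offline.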

2. The (Long Awaited) ECHO and GCMD Merge

  • Inconsistent results:

    • different records for the same collection

    • merge data

  • Further CMR benefits:

    • IDN data becomes visible

    • tagging capability

    • hierarchical facet responses

    • sub-second search

    • translation services and revision history services

    • validation with KMS keywords, additional curation

    • more flexible JSON query language

  • How to achieve them:

    • System integration: stop requirements paralysis; prioritize; show progress weekly

    • Data integration: one by one; custom analysis; monthly tag-ups; flexible; prioritize -> make sure nothing is missing
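
The "different records for the same collection" problem and the "make sure nothing is missing" goal can be pictured with a small reconciliation sketch. This is not the procedure the CMR team used; the record fields (`short_name`, `version`, `title`, `keywords`) and the matching key are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Fields used to build the matching key; differences in them are not conflicts.
KEY_FIELDS = ("short_name", "version")


def reconcile(echo_records, gcmd_records):
    """Group records from both catalogs by (short_name, version).

    Returns (merged, conflicts): merged maps each key to one combined record,
    conflicts lists (key, field) pairs where the two sources disagree.
    """
    by_key = defaultdict(dict)
    for source, records in (("ECHO", echo_records), ("GCMD", gcmd_records)):
        for rec in records:
            key = (rec["short_name"].strip().upper(), str(rec["version"]))
            by_key[key][source] = rec

    merged, conflicts = {}, []
    for key, sources in by_key.items():
        combined = {}
        for source in ("ECHO", "GCMD"):
            for field, value in sources.get(source, {}).items():
                disagrees = field in combined and combined[field] != value
                if disagrees and field not in KEY_FIELDS:
                    conflicts.append((key, field))
                # The first source seen wins; disagreements are only flagged
                # so that a person can review them one by one.
                combined.setdefault(field, value)
        merged[key] = combined
    return merged, conflicts
```

Records that exist in only one catalog still appear in `merged`, which is one way to check that nothing was dropped during integration.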

 

3. CMR ECHO Transition and client collaboration (presented by the software developer)

  • ECHO & CMR: moving some of the core capabilities from ECHO technologies to the more modern approaches used in CMR

  • Why and how that happened

  • ECHO prehistory: monolithic application with an old Java API (not good) -> more separate services with a REST API for ingesting, searching, and ordering data -> CMR and ECHO together: subset searching, smaller, additive

  • Limitations: cost (old systems) + complexity

  • Holding us back

    • reliability: degraded service events (DSE) since CMR went live

    • ECHO code (largest cause)

  • Transition goals:

    • Incremental Rollout: capability by capability

    • Minimize Impact (API issues): no change for users

    • Safety Net

  • Transition plan

    • Each capability: design, implement, and integrate the CMR components; live switch-over. The old API will retire in September 2016.

    • Benefits: reduced cost; improved reliability and performance; new capabilities are easier to add (users can help improve the APIs); consistent APIs and features.

  • Data stored as immutable revisions: history tracking; make a copy of the database, then age off the old one; compare the differences

  • Need your help:

    • APIs Transition:

    • Feedback

  • API versioning strategies

    • API stability vs. API improvements

    • strategies

      • client collaboration

      • URI path: popular, but the URI should stay stable; granularity problem

      • Query parameter or custom header: easy to use, but not RESTful; granularity problem

      • Content negotiation: RESTful; URLs stay the same; request and response are versioned separately, but harder to define

      • Content negotiation via URL extension: easy for clients to specify, but doesn't handle request content

      • May use a combination

  • For client collaboration: forum, CMR designs in the wiki, bug reports, email list

  • Questions:

    • How many people use these APIs? -> do not have the number, but will track it

    • How to get the users’ feedback?

    • How to reduce cost? -> hardware, reliable performance and easier maintenance, virtual machines
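
To make the content-negotiation strategy from the API versioning list above more concrete, here is a minimal sketch of choosing an API version from an HTTP `Accept` header. The vendor media type, the `version` parameter, and the list of supported versions are all hypothetical; the notes do not say which scheme CMR adopted, and this sketch ignores `q` weights for brevity.

```python
import re

# Hypothetical values, for illustration only.
MEDIA_TYPE = "application/vnd.example.metadata+json"
SUPPORTED_VERSIONS = ("1.0", "1.1", "2.0")
DEFAULT_VERSION = "2.0"


def negotiate_version(accept_header):
    """Pick an API version from an Accept header.

    Returns the version string, or None when the client asks for a version
    that is not supported (a server would typically answer 406 Not Acceptable).
    Clients that do not name the vendor media type get the default version.
    """
    if not accept_header:
        return DEFAULT_VERSION
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() != MEDIA_TYPE:
            continue
        match = re.search(r'version\s*=\s*"?([\d.]+)"?', params)
        if match is None:
            return DEFAULT_VERSION
        version = match.group(1)
        return version if version in SUPPORTED_VERSIONS else None
    return DEFAULT_VERSION
```

For example, `negotiate_version("application/vnd.example.metadata+json; version=1.1")` selects version 1.1 while the URL stays the same. The response format is chosen this way; the version of a request body would be carried separately in `Content-Type`.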

 

4. Earthdata Search and Global OpenSearch

  • How to discover the local data

    • go to the source, but that presents problems -> tools provide fast access, but some data are not available -> CWICSmart + IDN OpenSearch + CWIC

    • Earthdata Search (https://search.earthdata.nasa.gov)

      • smart

    • pros: agent global coverage

    • cons: no granule-level search; few search query terms; probably not as fast; direct download not always available

5. NGAP: Compliance as a Service in the Public Cloud

  • Traditional application hosting vs. cloud

    • faster, resources procured via API calls

    • pay for what you use

    • scaling is easy and automatic

    • unlimited VMs

    • hosts change frequently: IP addresses change and storage is cleared

    • shared trust model

  • NGAP in cloud

  • NGAP applications: 12-factor (http://12factor.net)

  • Components: API nodes, router, Red Hat AMI

  • Deployment: build slug, start instances, fetch and start slug from S3, update routers

  • NGAP services: system maintenance, login

  • Questions:

    • Environment? -> the model is published on Git

    • Automatic deployment model, such as for databases? ->

    • Docker containers? -> scheduler algorithms for containers and programs

    • How is it different from NASA Earth Exchange? -> that is for modeling and HPC studies
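
Because hosts change frequently and local storage is cleared, a 12-factor application cannot depend on configuration files baked into a particular machine. A minimal sketch of the 12-factor "Config" principle (settings read from environment variables at startup) might look like the following. The variable names `DATABASE_URL`, `PORT`, and `LOG_LEVEL` are hypothetical and are not NGAP's actual settings.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    port: int
    log_level: str


def load_settings(environ=os.environ):
    """Build application settings from environment variables only.

    The variable names are illustrative; a real platform defines its own.
    """
    try:
        database_url = environ["DATABASE_URL"]
    except KeyError:
        raise RuntimeError("DATABASE_URL must be set by the platform") from None
    return Settings(
        database_url=database_url,
        port=int(environ.get("PORT", "8080")),
        log_level=environ.get("LOG_LEVEL", "INFO").upper(),
    )


if __name__ == "__main__":
    # Example with an explicit mapping instead of the real environment.
    print(load_settings({"DATABASE_URL": "postgres://db.internal/app"}))
```

Passing the mapping in as a parameter keeps the function easy to test and means the same slug can run unchanged on any instance the platform starts.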

 

Citation:
Pilone, D.; NASA EOSDIS Evolving Technologies Discussion; Winter Meeting 2016. ESIP Commons, October 2015