Granule Discovery and Services

Abstract/Agenda: 

Web services for dataset (collection-level) discovery are well established, and many data centers offer them. Many groups are now moving beyond directory or collection searches to support discovery of individual granules, often through CSW, OpenSearch, or other services. Well-known examples include those established within ECHO, NODC, and PO.DAAC. These efforts are mature enough to share lessons learned, which is the goal of this session. Also solicited are conceptual ideas and architectures that show how metadata can be leveraged as part of a granule search, for example, using quality information to apply quality flags to the search results.

Notes: 

Ed Armstrong

Goal – build on lessons learned in granule discovery. A forum to share experiences in building tools and to clarify what still needs to be done.

4 Talks

1.       NOAA NODC

2.       NASA Goddard DISC

3.       NASA JPL

4.       NASA Goddard

From Collection to Granule in Data Discovery - Ken Casey & Yuanjie Li – NOAA/NODC

Ken Casey

·         OAIS-RM (Open Archival Information System Reference Model)

o   Archival Information Package (AIP) – accessions

o   Archival Information Collection (AIC) – haven’t really talked much about this – it is an aggregation of one or more AIPs

o   Sometimes the searchable granule is an accession, a file, or part of a file

·         Functionally,

o   Collections are the minimum citable unit (ex. Journal article)

o   Granules are the minimum searchable unit (ex. Word in article)

§  Trying to manage 2 step discovery process and manage data…

o   Not all collections are enabled for granule-level discovery because they have few granules

·         GHRSST

o   Each collection can have 100s of accessions

o   Users want data over time and space… they don’t care about the archival process

Yuanjie Li

·         NODC has template for ISO 19115-2

o   Ex. Multiple data access links

·         Automatic flow uses collection and accession level to make sure information is preserved

·         Geoportal – includes various granular discovery options – CSW, Open Search, JSON, RSS

o   CWIC Start portal

·         Collection is linked to the granule level discovery through the REST URL

·         Recently set up the html view of metadata

·         Ken – not rocket science: implement ISO 19115-2 at accession and collection level – this is a “best” practice

·         Solr has been implemented in 1.2.4

·         Q – Walter – spatial resolution of the metadata: inside the bounding boxes there would be 100s or 1000s of data. When you select a small area, can you get false positives? How do you deal with this?
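
Walter's question can be made concrete with a small sketch. It assumes, hypothetically, that a granule's metadata stores only a bounding box while the actual observations follow a narrow diagonal track. The names `Box`, `bounding_box`, and the sample coordinates are illustrative only and are not taken from the NODC system; the antimeridian is ignored for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in degrees: west, south, east, north."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "Box") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (self.east < other.west or other.east < self.west
                    or self.north < other.south or other.north < self.south)

    def contains(self, lon: float, lat: float) -> bool:
        return (self.west <= lon <= self.east
                and self.south <= lat <= self.north)


def bounding_box(points):
    """Smallest Box enclosing a list of (lon, lat) points."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return Box(min(lons), min(lats), max(lons), max(lats))


# Hypothetical granule: observations along a diagonal track.
track = [(float(i), float(i)) for i in range(11)]
granule_box = bounding_box(track)

# A small query region in a corner of the box that the track never crosses.
query = Box(7.0, 1.0, 9.0, 3.0)

# Metadata-level test: does the coarse box overlap the query?
matches_metadata = granule_box.intersects(query)
# Data-level test: does any actual observation fall inside the query?
matches_data = any(query.contains(lon, lat) for lon, lat in track)

print(matches_metadata)  # True
print(matches_data)      # False
```

The granule is returned by a box-based search even though none of its observations lie in the requested area, which is the false positive raised in the question. Finer spatial descriptions in the metadata (for example, several smaller boxes or a footprint polygon) would reduce such cases.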

 

Use Cases for Granule-level Open Search – Chris Lynnes – NASA Goddard

·         Why is it so difficult to illuminate the “dark data” (those data outside of large data centers)?

·         2 step data discovery

o   Start with dataset discovery – looking for dataset results

§  For every dataset, generate an OpenSearch description document

·         NOAA-NODC – is counting on the Identifier

o   OpenSearch data document allows search at different data centers

o   Space-Time Query

·         Granule-level part was really useful

·         Simple Subset Wizard

o   DAAC had different services, but all had machine level data

o   Most search and access interface

·         Giovanni

o   Creates visual response

o   (sorry PC crashed and lost 5 min of notes)

·         OpenSearch Requires

o   Critically – time, bbox, and unique ID

o   Giovanni – needs an OPeNDAP link

§  Use a client to pull it in in NetCDF

·         Keyword tagged granules (future?)

o   Then – can search for things like “Hurricane Sandy”, track a moving object, …

·         Q (Ken) – OpenSearch description document: do you need one per dataset or collection, and how is it generated?

o   Programmatically

o   ?transfer or static – either

o   Only use url string

·         Q (Ken) – Is there a standard for how to generate a subset?

o   ESIP discovery cluster has documented standards

·         Q (Jeff D.) tagging – is it in the metadata or where?

o   Really want to do everything in the Solr database – support space, time, and keywords

o   Need to modify XML

o   Ted – ISO 19115-2 has a mechanism for events shared between granules… a standard way of describing events

·         Q (Walt) – when you initially look, if you don’t have the highest-resolution spatial description of the granules, you get false positives

o   Need to have better spatial control
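
The granule-level OpenSearch requirements noted above (time, bounding box, and a unique ID) can be sketched as filling in a URL template of the kind an OpenSearch description document carries. This is a minimal sketch under stated assumptions: the host `example.org`, the `datasetId` parameter name, and the dataset identifier are hypothetical, while `{geo:box}`, `{time:start}`, and `{time:end}` follow the OpenSearch Geo and Time extensions, where a trailing `?` marks a parameter as optional.

```python
import re
from urllib.parse import quote

# Hypothetical URL template, as might appear in an OpenSearch
# description document for one dataset.
TEMPLATE = ("https://example.org/opensearch/granules"
            "?datasetId={datasetId}&bbox={geo:box?}"
            "&start={time:start?}&end={time:end?}")

# Matches {name} and {name?}; group 2 is "?" when the parameter is optional.
_PARAM = re.compile(r"\{([^{}?]+)(\?)?\}")


def fill_template(template, values):
    """Substitute placeholders; unset optional ones become empty strings."""
    def replace(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            # Keep commas and colons readable in bbox and ISO 8601 times.
            return quote(str(values[name]), safe=",:")
        if optional:
            return ""
        raise KeyError(f"missing required parameter: {name}")
    return _PARAM.sub(replace, template)


url = fill_template(TEMPLATE, {
    "datasetId": "EXAMPLE_SST_L4",           # hypothetical unique ID
    "geo:box": "-80.0,25.0,-60.0,45.0",      # west,south,east,north
    "time:start": "2012-10-22T00:00:00Z",
    "time:end": "2012-10-31T23:59:59Z",
})
print(url)
```

Because only a URL string is needed, a client can be pointed at a different data center by swapping the template while the space-time query values stay the same.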

 

Web Services for Earth Science Data – Ed Armstrong

·         PO.DAAC and Webification

·         PO.DAAC Portal

o   Available web services

§  Have to access via url

o   Have an API for users to run the web services

o   Require datasetID and output format – provide url to click and execute

o   Can image and extract the data on the fly

·         Webification

o   Goal – make data easy to use in the “web” way

o   W10n

o   Able to address and access at attribute level

o   Can subset by value, not just by index

o   Can apply a quality filter (example uses CF conventions)

o   Chain these, ex. quality filter, wind screen, and bounding box

·         Q (Alek) – more subset index can be introduced, how?

o   This is what the w10n protocol supports

o   With OPeNDAP need to know what the subset means before… don’t need to here

·         Q (Ted) – quality is a variable? – yes

·         Q (Katie) – what are the valid variables and how to invoke?  - go to http://w10n.org

·         http://podaac-w10n.jpl.nasa.gov/

·         Q (Ken) – are there other groups implementing w10n servers?

o   At least one other place at JPL and the Amazon hackathon
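
The difference between subsetting by index and by value, and the chaining of filters mentioned above, can be illustrated with a small in-memory sketch. It does not use the w10n protocol or its URL syntax; the variable names, the sample values, and the convention that quality 0 means "good" are all assumptions made for illustration, and the "wind screen" is read here as a threshold on wind speed.

```python
# Hypothetical granule: parallel arrays, one entry per observation.
granule = {
    "lat":        [10.0, 12.0, 14.0, 16.0, 18.0],
    "lon":        [-50.0, -48.0, -46.0, -44.0, -42.0],
    "wind_speed": [3.0, 12.0, 7.5, 20.0, 5.0],
    "quality":    [0, 0, 2, 0, 1],   # assumed convention: 0 = good
}


def by_index(start, stop):
    """Keep observations whose position is in [start, stop)."""
    return lambda g, i: start <= i < stop


def by_value(name, low, high):
    """Keep observations whose variable `name` lies in [low, high]."""
    return lambda g, i: low <= g[name][i] <= high


def subset(g, *predicates):
    """Apply every predicate in turn; keep observations passing all of them."""
    n = len(next(iter(g.values())))
    keep = [i for i in range(n) if all(p(g, i) for p in predicates)]
    return {name: [vals[i] for i in keep] for name, vals in g.items()}


# Index-based subset: the caller must already know which positions matter.
first_two = subset(granule, by_index(0, 2))

# Value-based chain: quality filter, wind screen, then bounding box.
result = subset(
    granule,
    by_value("quality", 0, 0),
    by_value("wind_speed", 0.0, 15.0),
    by_value("lat", 9.0, 13.0),
    by_value("lon", -51.0, -47.0),
)
print(result["wind_speed"])  # [3.0, 12.0]
```

With value-based predicates the caller states what is wanted (good quality, moderate wind, inside a box) without first working out which array positions satisfy it, which is the contrast with OPeNDAP drawn in the discussion.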

 

Parameter Visualizations as Metadata to Facilitate Discovery - Matt Cechini

·         EOSDIS – has systems, services, and data

o   GIBS is imagery as a service

o   Eventually get to data as a service

·         GIBS

o   Operating Prototype

o   Have ~100 products- will have all MODIS soon

·         Problems

o   Amalgamation of projections and representations, coverage, low res.

·         Do users need full resolution granule imagery?

o   Dealt with low res for a long time – is this fine?

·         Do users need visualizations of every parameter? – do you need it for a quick look

·         Should images include annotations (graticules, legend)?

·         Should quality layers be visualized (quality mask)?

·         How do you request granule imagery?  There is no standard to request granule (right?)

·         How do you represent metadata for a granule image and its service?

o   Can also have an ISO metadata record that applied to an image – target could be the image (Ted)

·         Focus is on how to get users to a granule… 

Citation:
Armstrong, E.; Granule Discovery and Services; Winter Meeting 2014. ESIP Commons, November 2013

Comments


E. Armstrong will present on "PO.DAAC web services for granule discovery and applications" and/or "Thoughts on advanced use cases for granule utility"

Chris Lynnes will present: "Eating Our Own Dog Food: Applications Use of Granule Discovery".

Yuanjie Li will present "From Collection to Granule in Data Discovery"

Matt Cechini will present on "Parameter Visualizations as Metadata to Facilitate Discovery"