NOAA Data Management Dashboard

Abstract/Agenda:

The NOAA Data Management Dashboard is an effort to improve NOAA data interoperability by leveraging best practices and standards. Reports that give an overview of data, data services and metadata that are available through NOAA data catalogs will be created. Requests to spatial services such as OGC, OPeNDAP, ISO metadata are returned into the NOAA Google Cloud Unified Messaging Service.

Notes:

NOAA – Data Management Dashboard (Matt Austin)

This is still a work in progress
Visual tool for NOAA data management standards
We all need to work together
Need to review mark-up before doing any design
Can show number of records that satisfy Documentation directive
- Sum of all data management plans
- Need to look at how much it is being used
- These metrics can be shown on home page
- By having observing system – would save time in office & have better results
- Have to define what a dataset is
- Can have drill down options – look at how metrics change with training
Need a consistent way to implement standards – like what Documentation cluster is looking at
- Find ways to aggregate metadata to answer questions
What high level decision makers need – prioritize questions
Sources – can use web accessible
This can improve interoperability of NOAA data
Can organize dashboard at multiple levels – end user usually wants things topically
- Have not fully defined collection vs. granule
http://sites.google.com/a/noaa.gov/noaa-hppc-shared-hosting-project/ (need permissions)
- NOAA has switched to Google
- Number of metadata records in NOAA Geoportal
- See changes in number of records – more records does not always mean better
- Google spreadsheet can update in real time – or can use a database
Q – do you have to use Geoportal – no, but NOAA is
- Need to focus on standards so that it is the same for everyone
Q – are these available anywhere else
- Can share them (will get screen shots)
Q this is just a concept
- Google doc – shows what development is finished
Ted has more

Dashboard Displays for Classifying Metadata Collections (Ted Habermann)

A dashboard in your car – metrics about the system
- Without an engine – dashboard is not interesting
- Interested in how to build in NOAA, communities/engines worth reporting on
- One target audience – CIO or management – if nothing changes on dashboard – it is a problem
- How can we get the community to participate
Work on set of data sets – related to project or other – rows in table are collection of records from NOAA
- Uses a rubric to measure completeness (0-41) – provides: average score, std. dev., total, counts above 20 and 25
- When started with rubric – thought it was for single records – but found interesting information
- One set has the same min, max, and average and some have small std dev
  - Includes link to histogram of score – see different characteristics between data groups
- IOOS – metadata generated automatically
  - Have more at 45
  - Q – counts seem low – these are specifically related to sensor operation records or THREDDS records
- Mean and std dev – recognize records with different characteristics
- If look at table of contents for records – can look at individual record rubric – which is navigable and can see the fields and links to the wiki to improve the metadata
  - At bottom – other reports – e.g. broken URLs, broken xlinks, validation errors, unique contacts
  - Created contact reports – to see who to contact if have problems
  - Looking at common content/aggregation is a powerful tool
- Consistency checker
  - Takes all of the records (usually around 100) – aggregate into single xml file – then look for repeated content
  - Connections, people, citations, services (can sort by fields in each tab) – provides a count for each
  - This provides a means to look for inconsistencies that occur in record
  - Q – (Matt) – seems like the kind of thing that can be automated so it does not burden busy people who need to make the changes
    - Not sure of a tool that will be able to figure out a typo/formatting problem
    - Ted would make the change in Oxygen (grouped metadata) – find and replace, then it will save it back to the original
    - (Matt) – if you provide to the researcher – may not care about problem
  - Dashboard is a tool to help people creating the metadata
- Q – as a search engine – if doing textual then some of these may not come up because of typos
  - exactly
- sometimes need to look at code for fix to the problem – others are human errors that need to be looked at – machine generated vs hand hewn records
- if change from rank to time – see same variability
  - to the manager – the older records are finished – so translation from FGDC to ISO
    - for older records – want to translate without major problems (Pandora’s box) – if it ain't broke, don't fix it
    - with newer records – want the data manager to work with the ISO standards
    - helps develop strategies for moving forward – strategic tool for guiding the direction and identifying the approaches to help evolve the organization
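The two quantitative pieces described above (the rubric summary per collection and the consistency checker that counts repeated content in an aggregated XML file) can be sketched roughly as follows. This is only an illustration, not the tooling shown in the session: the function names, the un-namespaced `contact` tag, the sample scores and the sample contact strings are all hypothetical, and the population standard deviation is an assumption since the notes do not say which form was used.

```python
from collections import Counter
from statistics import mean, pstdev
import xml.etree.ElementTree as ET


def summarize_scores(scores, thresholds=(20, 25)):
    """Summarize rubric scores for one collection of metadata records."""
    summary = {
        "count": len(scores),
        "min": min(scores),
        "max": max(scores),
        "mean": mean(scores),
        # Assumption: population standard deviation over the collection.
        "std_dev": pstdev(scores),
    }
    for t in thresholds:
        # "Counts above 20 and 25" is read here as strictly greater than.
        summary[f"above_{t}"] = sum(1 for s in scores if s > t)
    return summary


def count_repeated_content(xml_text, tag):
    """Count each distinct text value of `tag` in one aggregated XML document."""
    root = ET.fromstring(xml_text)
    values = (
        el.text.strip() for el in root.iter(tag) if el.text and el.text.strip()
    )
    return Counter(values)


def likely_variants(counts):
    """Group values that differ only in letter case or whitespace."""
    groups = {}
    for value in counts:
        key = " ".join(value.lower().split())
        groups.setdefault(key, []).append(value)
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]


if __name__ == "__main__":
    # Hypothetical rubric scores for a five-record collection.
    print(summarize_scores([12, 22, 26, 30, 18]))

    # Hypothetical aggregate; real ISO records use namespaced elements.
    aggregate = """<records>
      <record><contact>Example Data Center</contact></record>
      <record><contact>Example Data Center</contact></record>
      <record><contact>example  data center</contact></record>
    </records>"""
    counts = count_repeated_content(aggregate, "contact")
    print(counts.most_common())
    print(likely_variants(counts))
```

A check like `likely_variants` only catches case and spacing differences; as noted in the discussion, genuine typos would still need a person to review the counts.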
Q Eric (Kodak) – are you intending to define dashboard for end user – or widgets that any user can customize
- Customize is good – want to start with “My” – for a specific collection defined by string or set of options
- Not meant to change settings – it sits and the results change
Q – (Eric) like that you can relate to dashboard – use can depend on user and context
- Makes it better if users can customize
- Identify what users are interested in
- Then you can query to see who is using what
- In junked car – still fuel and odometers (fuel = explosion)… some are universal – might be really hard for all audiences
ACTION - If there are tools that might be useful – please let Ted or Matt know
- EXTJS
Really have to think about data model – the limitations of the metadata – if can build data model to think about potential uses – right now for data stewards – but can think of how others use it
Q (Don Collins – NODC) – wants to see what NOAA is asking about – Matt described what might get measured, quantified, reported – Ted's focus was to quantify/target to improve data for NOAA – wants to know where metadata is good or not/need improvements OR is this some measurement to see if system is working – have not solidified internal concept
- Remember Bruce Carin’s talk -… this is a coffee house
Ted testing – Matt is seeing what documentation needs to be done
- Less about – what question for who MORE how can you answer any question for any reason
- Reason for doing this in ESIP – because want to share outside of NOAA – reach mass market audience
Heather Hinkle (USGS) – survey wide data management program (starting) – metrics will happen in ~ phase 2 – show growth/standards – may want sooner – metrics are tricky – need to be answering the right question – show both quality and quantity
Want to start metrics early – usually will create metadata and then “later” improve it
Ted – also trying to bring to data.gov – looking for data stewards
(Eric) – difference between metric and the interpretation – give number, but don’t interpret – give guidelines to interpret
- Challenge #1 – exposing metadata
- Challenge #2 – finding the problems
- Challenge #3 – fix the problems in the metadata (people don’t want to fix)
- Ted – other thing is building relationship with people that are doing these things