Earth Science Data Analytics 201

Abstract/Agenda: 

The continuum of ever-evolving data management systems presents continuous challenges to the enhancement of knowledge and facilitation of science. To overcome these challenges, it is essential to understand and develop methods that enable data relationships to be examined and information to be manipulated.

 

Come join the Earth Science Data Analytics (ESDA) cluster in our quest to decipher the different types of data analytics, generate definitions (in terms of Earth science), collect use cases, and eventually analyze tools unique to the different types of data analytics. The ultimate goal is to perform gap analysis and provide recommendations to the community. Are you interested in helping to guide the future of information analysis?

 

Notes: 

 

Getting back to the ESDA Home Page: http://wiki.esipfed.org/index.php/Earth_Science_Data_Analytics

 

Thanks to all in attendance.  After a well-attended Earth Science Data Analytics (ESDA) 'introduction' session (http://commons.esipfed.org/node/2724), the ESDA 201 session identified actions in support of the ESDA cluster goals.  In identifying tasks, several cluster participants signed up to contribute their time toward accomplishing them.  After reading the following notes, we hope you are moved to donate some time to this interesting topic.  If so, please contact Steve Kempler ([email protected],gov).  No inputs are too big or too small.

 

Session Agenda:

 

- Plan out the next 6(+) months of ESIP ESDA work

-- Generate a library of ESDA Use Cases

-- Complete the ESDA Use Case/Tools/Techniques, etc. matrix

- Publish Findings (?)
- Data Analytics Theme…
- Open Discussion
 
The session began with a short presentation summarizing the work and ideas developed thus far.  This includes:
- Data Analytics Definition:  The process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations and other useful information.
- ESDA Cluster goal:  To facilitate gleaning knowledge about Earth from all available data and information
- What's the Big Deal About Big Data:  What's new is the need to develop and implement technology and techniques to efficiently analyze data and information in order to extract knowledge.
- It is our job (Information Technologists) to facilitate Data Analytics through our understanding and implementation of supportive information technologies, in close coordination with the specific data analysis needs of the science community:
-- Data Preparation – Making heterogeneous data able to ‘play’ together
-- Data Reduction – Smartly removing data that do not fit research criteria
-- Data Analysis – Applying techniques/methods to derive results

 

Tools/Services: Those for Preparation are fairly generic; those for Reduction, and especially Analysis, are highly research-specific (and thus difficult for us to address without science domain expertise).
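To make the three stages above concrete, here is a minimal sketch in Python.  It is an illustration only, not something presented at the session: the station records, field names, the 30 degree threshold, and the function names are all hypothetical, and a plain mean stands in for whatever analysis method a real research question would call for.

```python
from statistics import mean

# Hypothetical records from two sources that report temperature in different units.
station_a = [{"site": "A1", "temp_c": 14.2}, {"site": "A2", "temp_c": 31.5}]
station_b = [{"site": "B1", "temp_f": 59.0}, {"site": "B2", "temp_f": 41.0}]


def prepare(records_c, records_f):
    """Data Preparation: put heterogeneous records on one common schema (Celsius)."""
    unified = [{"site": r["site"], "temp_c": r["temp_c"]} for r in records_c]
    for r in records_f:
        unified.append({"site": r["site"], "temp_c": (r["temp_f"] - 32.0) * 5.0 / 9.0})
    return unified


def reduce_records(records, max_temp_c):
    """Data Reduction: drop records that do not fit the (assumed) research criteria."""
    return [r for r in records if r["temp_c"] <= max_temp_c]


def analyze(records):
    """Data Analysis: apply a method (here, a plain mean) to derive a result."""
    return mean(r["temp_c"] for r in records)


prepared = prepare(station_a, station_b)
reduced = reduce_records(prepared, max_temp_c=30.0)
result = analyze(reduced)
print(f"{len(reduced)} of {len(prepared)} records kept; mean = {result:.1f} C")
```

Note how the preparation step (unit conversion onto a shared schema) would look much the same for any study, while the reduction threshold and the analysis method are choices only the researcher can make.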

 

Generate a library of ESDA Use Cases
Now that we are gaining a decent understanding of the scope of ESDA, we need to better understand the use of data analytics by Earth science data and information users.  Thus, our next step is to assemble a large, comprehensive library of use cases in order to: analyze data analytics usage patterns, commonalities, methods, technologies, etc.; identify who uses data analytics (do we see specific patterns per user?); and see whether specific techniques/tools/frameworks keep showing up.  The larger the library, the better, to develop an authoritative set of findings.
Given a strawman set of information to be acquired per use case (see presentation provided below), the group tried identifying the desired information for an exemplary use case, provided by Laura Carriere.  Not surprisingly, this was not a straightforward task:  Discussion included the definition of terms, whether the requested information was actually applicable to the use case, and the uniqueness of use cases, some of which may be better served by other information.  Most revealing was the difficulty of clearly associating use cases with one or more of the data analytics types that we have been using to define data analytics.
After the meeting and follow-up discussions, we concluded that, as we analyze ESDA use cases, we may find that the 5 data analytics types (Descriptive, Diagnostic, Discoveritive, Predictive, Prescriptive), admittedly defined in the business world, do not specifically apply to Earth science data analytics.
This conclusion is supported by the difficulty in assigning data analytics types to currently known use cases.  It was suggested that our types could instead be result-oriented.  More analysis is needed.
Additional discussion included the types of use cases that we should go after.  Besides the obvious, these include users of airborne data and of data provided by citizen science and social media.
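As a rough illustration of how a use case library could be examined for recurring tools and analytics types, here is a small Python sketch.  Everything in it is assumed for the sake of the example: the record fields are placeholders rather than the strawman information set from the presentation, and the use cases and tool names are invented.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UseCase:
    # All field names are hypothetical placeholders, not the cluster's agreed template.
    title: str
    user_type: str
    analytics_types: list = field(default_factory=list)
    tools: list = field(default_factory=list)


def tally(use_cases, attribute):
    """Count how often each value of a list-valued attribute appears across use cases."""
    counts = Counter()
    for uc in use_cases:
        counts.update(getattr(uc, attribute))
    return counts


# Invented entries, purely for illustration.
library = [
    UseCase("Example use case 1", "researcher", ["Descriptive"], ["tool_x", "tool_y"]),
    UseCase("Example use case 2", "modeler", ["Predictive", "Descriptive"], ["tool_x"]),
    UseCase("Example use case 3", "researcher", [], ["tool_z"]),
]

tool_counts = tally(library, "tools")
type_counts = tally(library, "analytics_types")
# Use cases that could not be assigned any analytics type.
untyped = [uc.title for uc in library if not uc.analytics_types]

print(tool_counts.most_common(1))
print(type_counts.most_common(1))
print(untyped)
```

Keeping the untyped use cases visible, rather than forcing a type onto them, would give the cluster a direct measure of how well the five business-world types fit as the library grows.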

 

Publish Findings

Introduced for future consideration:  If we do a good job on the use case library, can draw conclusions on how best to serve data analytics to the user community, and feel we can provide guidance for technologists... well, maybe we have a paper to publish.

 

Data Analytics Theme

It appears that the theme for the Summer ESIP Meeting may very well be focused on Earth Science Data Analytics.  So... guess which cluster will be stepping up to make this happen.  Discussion included:

- What should be the focus:  Taking from 'What's the Big Deal About Big Data', one suggestion was:  'What enables us to further the use of data analytics techniques and methods'.  Or:  'How do we enable developers'.  Or: 'Societal Benefits'.  Or: 'Adding Value of Data Analytics to Tools'

- It was pointed out that holding the meeting on the west coast may make it easier to get data analytics experts, many of whom are located there, to discuss their experience and expertise in utilizing data analytics.

 

 

Actions: 

Steve - Set up a Google Doc:  Provide a set of definitions for needed use case information, for Cluster discussion.  Study the NIST use case library for ideas (see: http://bigdatawg.nist.gov/usecases.php).

 

See:  https://docs.google.com/spreadsheets/d/108glVB8Rni8M47e5G1_oZ6g1q5AOxzdT...

 

Active Participants - Provide feedback in Google Doc.  Further discuss at next telecon.

 

All Other Participants - E-mail Steve ([email protected]) if you wish to be more active and provide feedback in the Google Doc.

 

Steve - Look at how themes were reflected in past ESIP Meeting agendas.  Report back to group.

 

All - For the next telecon, prepare to discuss the ESDA theme for the next ESIP Meeting:  Focus of the theme, whom we can invite as speakers, and which other clusters have some relation to the usage of data analytics.

 

Attachments/Presentations: 
File: ESDA 201.pptx (1.89 MB)
Citation:
Kempler, S.; Mathews, T.; Earth Science Data Analytics 201; Winter Meeting 2015. ESIP Commons, October 2014