Earth Science Data Analytics Tools, Techniques and More
The Earth Science Data Analytics (ESDA) Cluster has made great strides in understanding the utilization of data analytics in Earth science, an area virtually untouched in the literature. In achieving its goal to support advancing science research that increasingly includes very large volumes of heterogeneous data, the ESDA Cluster is in the process of categorizing existing tools and techniques utilized in the data preparation, reduction, and analysis stages of Earth science data analytics. This session will provide a ‘student of Data Science’ point of view, showcasing the usage and usability of Data Analytics. This will set the stage to address a more detailed ESDA categorization, and begin the discussion on how best to perform the gap analysis between data analytics research needs and the tools/techniques available.
Approximately 50 people attended the ESIP ESDA Cluster session, including ~20 Data Scientists and 10 scientists (both counted by show of hands). This is a welcome increase from past sessions. Also, ~25 people were first-time learners. Two presentations, with discussion, were given:
1. Overview of ESDA to date (Steve Kempler):
Highlights and History
- Earth Science Data Analytics Definition “ratified” by ESIP
- List of Tools, Techniques in Earth Science Data Analytics presented
- Lots of discussion about what is a tool, technique, and how best to showcase this sort of information
- From the Case Study perspective (which was attempted first)
- List of Tools and Techniques, mapped to the three stages of data analytics (data preparation, reduction, and analysis)
2. Graduate Student Presentation (Lindsay Barbieri):
Lessons from an Earth Science / Data Science Grad Student
- Interdisciplinary Earth Science Research happening
- Showcase of the types of learning that happened in a 1-year “Data Science” Course
- Lessons learned in attempting to apply that learning to Earth Science disciplines
- Important to understand how data analytics can be scaled to different types of research problems
- Important to be able to clearly understand the data... and how to use the data
Discussion Points:
- Given the large amount of material the cluster has researched and analyzed in a broad sense, there is a consensus that we should move forward in specific areas. That is, dig deeper into an implementation of a tool or technique to prototype performing data analytics on particular research problems.
- This might be facilitated by engaging specific scientists and their research
- Next Steps:
- Validate our work
- Prototype
- From Ethan:
- Have a sandbox for using the tools
- List DA projects that people are working on
- List potential problems that DA could be applied to
- List challenges to doing more DA
- List potential solutions to the challenges
- From Shea:
- Move from tabular data to data visualization
- Potential new co-chairs?
- Next steps… to be determined!