Where Open Source Meets Geospatial
We're soliciting talks from the geospatial community with a particular focus on interactive, highly data-intensive systems and on open source solutions. The session will include participation and leadership from the chairs of the NASA Earth Science Data System Working Groups (ESDSWG) Open Source and Geospatial (GIS) groups as we work to highlight important geospatial and open source solutions that can apply to your data system. We'll recommend technologies that make sense and identify where they plug into your architectures. We'll try to cover it all: entire stacks, virtual machines, libraries, toolkits, and data formats. Come for the talks and for the discussion!
8:30 - NOAA Open Source Geospatial (Micah Wengren & Matthew Austin)
This presentation will review some of the ways NOAA is leveraging open source software to support geospatial data publishing, access, and discovery. It will discuss a few projects that the presenters have been involved with over the last several years to address the increasing expectation of NOAA customers to obtain geospatial data products through reliable, standards-based services in a timely manner. It will also describe some ways NOAA has contributed to the open source geospatial community, both through procurements that further the development of open source tools and through direct contributions.
9:00 - Clear Skies: Turning Massive NASA/USGS Data into a Pixel Perfect World Atlas (Chris Herwig & Charlie Loyd)
In spring 2013, the satellite team at Mapbox collected and processed 6 terapixels of NASA/USGS open MODIS imagery into Cloudless Atlas -- a global, cloud-free mosaic. The next iteration of Cloudless Atlas, currently under way, leverages the USGS/NASA Landsat archive to add even more cloud-free zoom levels to our global imagery mosaic.
We'll speak from our perspective as a bulk consumer of U.S. Government open data endpoints, and discuss how we use commercial cloud computing services to manage the task at every stage -- from downloading and storage, through processing, and into production. In a matter of weeks, the first iteration of our cloudless base map project produced a higher quality product than was possible with conventional methods or existing commercial imagery offerings, all enabled by open NASA data. We'll talk about our experience with Landsat data availability, processing, and the strengths and weaknesses of Landsat data from our perspective as commercial open data consumers.
9:30 - Comparing Open Source Cloud Computing Solutions for Geosciences (Chaowei "Phil" Yang)
Many organizations are starting to adopt cloud computing to make better use of computing resources, taking advantage of its scalability, cost reduction, and ease of access. Many private or community cloud computing platforms are being built using open-source cloud solutions. However, little has been done to systematically compare and evaluate the features and performance of open-source solutions in supporting the geosciences. This paper provides a comprehensive study of three open-source cloud solutions: OpenNebula, Eucalyptus, and CloudStack. We compared a variety of features, capabilities, technologies, and performance characteristics, including: (1) general features and supported services for cloud resource creation and management, (2) advanced capabilities for networking and security, and (3) the performance of the cloud solutions in provisioning and operating cloud resources, as well as the performance of virtual machines initiated and managed by the cloud solutions in supporting selected geoscience applications. Our study found that: (1) there are no significant differences in the central processing unit (CPU), memory, and I/O performance of virtual machines created and managed by the different solutions, (2) OpenNebula has the fastest internal network, while both Eucalyptus and CloudStack have better virtual machine isolation and security strategies, (3) CloudStack has the fastest operations in handling virtual machines, images, snapshots, volumes, and networking, followed by OpenNebula, and (4) the selected cloud computing solutions are capable of supporting concurrent intensive web applications, computing-intensive applications, and small-scale model simulations without intensive data communication.