Earth Science Community Testbeds: Landscape & Collaboration
An Earth science testbed can be a platform for conducting evaluation, verification, validation and integration of data and science technologies, capabilities and services. Many organizations have developed or are developing testbeds. This session will feature a combination of presentations and discussions led by various Earth science testbed developers. The session is designed to inform the ESIP community about the current testbed landscape and progress, and to promote potential collaborations among these testbed activities.
- EarthCube Integration and Test Environment (ECITE) - Law/Keiser
- Earth Science data access and analytics in OGC Testbeds - Percivall
- NSF XSEDE/Jetstream - Pierce/Fischer
First Presentation: EarthCube
Key to note: this was not an overview of EarthCube.
There are a lot of possible projects and synergies with ESIP.
Overall, EarthCube seems to be early in its process and is still determining what it really wants to “get” out of the testbed.
- Is it capturing information from projects?
- Is it demonstrating the integration of information between projects?
- Perhaps both?
There are working groups, and they are building use cases. Use cases = what’s important for EarthCube
Still early on -- but there is flexibility and a lot of potential overlap, and ability to sign on to participate.
In particular, there is a lot to discuss with ESIP - synergies of resources and ideas for evaluation approaches.
EarthCube Update Summary:
ECITE prototyping on schedule
Initial use cases in the pipeline
But the bottom line is: Let’s not reinvent the wheel here - let's collaborate / share
QUESTIONS / DISCUSSION:
EarthCube could take advantage of ESIP Testbed
ESIP could take advantage of EarthCube platform
Real Needs: The glue that connects projects together… everything is distributed and not everyone will use the same computational resources. But EarthCube could help with integration and interoperability
- Technically (i.e., how things plug into each other), or in terms of knowledge?
- Will EarthCube provide either the technical ability or simply the knowledge and sharing… or both? -- Unclear! What does EC really want to get out of this?
What about cyber security?
- Has not been expressed as a big need. They do address it, but perhaps not to the level needed, particularly when dealing with commercial cloud. Data is not sensitive but...
Second Presentation: Earth Science Data Access and Analytics: OGC (Open Geospatial Consortium) Testbeds
OGC is a partner of ESIP
- global forum to advance geospatial information
- innovation through prototyping
- interoperability program
- advancement of technology “maturation levels”
Testbed 11 had four groups, one of which was CDI (the focus of this talk, though there are other threads)
- Climate Data Initiative (CDI) -- increase the interoperability of climate data
Urban Climate Resilience with Coastal Inundation: Climate and Human Security
- Sea level rise is hard to deny - people take this seriously, so it’s relatively easy to focus on
- Climate models are complicated. How do we give information to first responders in real time? The aim was to simplify and put some knobs on complicated models
Goal is to connect science with first responders.
Retrospective: 11 going on 12 testbeds. 80+ initiatives.
Retrospective paper: published recently.
“Join testbed 12” --
WCS and OPeNDAP -- you’re already doing things, so participate and get involved in the broader activities that are going on! Collaboration?
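For readers less familiar with WCS, the kind of request that such services answer can be sketched as below. This is a minimal illustration of a WCS 2.0.1 key-value-pair GetCoverage URL, not anything shown in the talk: the endpoint `https://example.org/wcs`, the coverage name `sea_level_rise`, and the axis labels `Lat`/`Long` are all hypothetical, and real servers advertise their own coverage IDs and axis names.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def wcs_get_coverage_url(endpoint, coverage_id, lat_range, lon_range, fmt="image/tiff"):
    """Build a WCS 2.0.1 key-value-pair GetCoverage request URL."""
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
        # One "subset" parameter per axis; axis labels are assumed here
        # and differ between servers.
        ("subset", f"Lat({lat_range[0]},{lat_range[1]})"),
        ("subset", f"Long({lon_range[0]},{lon_range[1]})"),
        ("format", fmt),
    ]
    # A list of tuples lets the "subset" key appear more than once.
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and coverage, for illustration only.
url = wcs_get_coverage_url(
    "https://example.org/wcs", "sea_level_rise", (25.0, 31.0), (-88.0, -80.0)
)
print(url)
print(parse_qs(urlsplit(url).query)["subset"])
```

The request is only constructed here, not sent, so the sketch runs without network access.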
QUESTIONS AND COLLABORATIONS:
How do we leverage ESIP testbed projects into OGC testbeds?
- GEOSS pilot about air quality… from ESIP (Erin led) -- could we do that again?
ESIP testbed: Framework for evaluating technologies - Can we solicit ESIP evaluators?
- Or the other way: incubation aspect - possible small seed funding for ESIP members?
- Keeping eyes and ears peeled -- opening doors between ESIP and OGC?
OGC and ESIP Collaborations:
- No ESIP Testbed session this week (ESIP session) -- AIST is about evaluation, not matchmaking between projects.
- However, could make sure the incoming products and services person from ESIP is on the mailing list for OGC -- make sure there is communication.
- When ESIP members are proposing projects and see a way to connect with OGC, they could build it into the proposal… and make that communication known.
“Let’s not wait until it comes out,” says OGC. A lot of processes could happen now:
- World Bank in March
- Communication could happen now and again at the World Bank in March, so coordination can happen early on.
What Might Be Possible for Collaborations:
- Right now (ESIP says) there are 2 RFPs per year for ESIP Testbeds… but there is fluidity, so perhaps there is an exciting opportunity here? TEX project (technology Earth exchange) -- but actually we could do a testbed call, and we could link the two if there is interest. People could propose to participate; as long as there are connections and collaborations, the how-to is flexible and can be made to work.
- Seems like there are a LOT of reasons why there should be close coordination between the ESIP Federation and OGC -- we’re leveraging the ESIP Federation (a separate brain trust) - people are addressing some of the same things, but differently… a good way to come together? A pilot? Climate data? -- in support of the President’s CDI… would make joint sponsors excited. They could look at the Federation. Makes it exciting for people to participate… through these communications and projects. Reenergize... agenda item. Sounds like quick successes are possible.
- Annie’s idea: an RFP special project call that could be “anytime” - if it collaborates with other OGC/EarthCube… maybe what that looks like is that someone from OGC does a presentation… here’s the “nutshell” on Testbed 13, and then ESIP coordinates folks that would be interested in coordinating on behalf of ESIP… maybe we have a “hey, you’re representing ESIP and your member organization” -- getting people on the hook… work needs to happen, put a title on it. Testbed... start and end date.
- Action: Two committees (ESIP? -- who is Ethan? IT and?? Interop?)
Presentations Three and Four: Jetstream and XSEDE
First production cloud for science and engineering research across all areas of activity supported by the NSF.
Want to expand the science into other areas (not just biology and medical)
Who Might Use It / What It Is:
- People that need “a few cores today not a thousand next week”
- Specific toolkit/workflow/data to manipulate
- Doing Hadoop at a modest scale.
- Want to give scientists (Earth Science!) the VM with the tools to do it… and find people that want to do similar things:
- When you go out in the field.. want to make it so that you can do work effectively in the field.
- Using the VMs remotely out in the hinterlands on an iPad and a phone network -- it is possible!!
- twin systems IU and TACC and U of Arizona.
- simplicity via web browser
- really easy to do.
- search via images or tags. images = toolsets?
- make it easy for people to find your areas of interest and also make it easy to create your own.
- Portal account is free.
Goal: March 2016 leave early operations and go into production
As a user or someone who creates VMs: do you have any control?
- Yes… and if you want to push the limits, do it! Integration of Wrangler and Jetstream -- good opportunity to get in early :)
- Five years of NSF support now, with hope for another five.
Do they support container architectures? Docker?
- ...not yet… but they are going to start looking at it.
Licensing: Example: ArcGIS?
- Worst you can do is ask.
APIs: breakdown of Troposphere; the API is amazingly broad and complex but all open source on GitHub… and between the iPlant and OpenStack APIs… there is quite a bit you can do with it…
- infrastructure at your service
- Allocatable brains
- Machines that are not traditional “supercomputers” that may be of interest especially to ESIP
- umbrella to things like: SDSC Comet: long tail computing
Basic idea of XSEDE: go there, write an allocation, get access and burn through it… and if you like it, write an allocation for “Millions of XUs” (currency of XSEDE is XU) - basically you can get a year’s worth of supercomputer time by writing an allocation proposal.
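As a rough illustration of the arithmetic behind an allocation request, the sketch below estimates how many units a set of planned runs would consume. It assumes, purely for illustration, that one unit is charged per core-hour; actual charge rates vary by machine and were not given in the session. The function name and the example numbers are hypothetical.

```python
def estimate_units(cores, hours_per_run, runs, units_per_core_hour=1.0):
    """Estimate allocation units needed for a planned set of runs.

    units_per_core_hour is an assumed charge rate, not an official figure.
    """
    if cores <= 0 or hours_per_run <= 0 or runs <= 0:
        raise ValueError("cores, hours_per_run and runs must be positive")
    return cores * hours_per_run * runs * units_per_core_hour


# A small trial request versus a larger follow-up request (made-up numbers).
trial = estimate_units(cores=16, hours_per_run=4, runs=25)
follow_up = estimate_units(cores=1024, hours_per_run=12, runs=100)
print(f"trial: {trial:,.0f} units, follow-up: {follow_up:,.0f} units")
```

With these made-up numbers the follow-up request lands just over a million units, which is the scale the “Millions of XUs” remark refers to.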
Wrangler: a very different machine - data-centric computing rather than traditional supercomputing