NASA Mini-Summit for Open Source Software and Science
This session will host a series of Open Source for NASA Science Data Systems talks, focusing on:
* Deployments
* Licensing, Attribution and Redistribution issues in Open Source Software
* Sharing Software Between Centers: Success Stories
* Short Lightning Talks
* How NASA open source software can help the ESDIS project
* Legal/IP issues in Open Source Software
We will solicit four to six 30-minute talks from leaders in the NASA Earth Science Data Systems community and from leaders in the broader open source community that NASA leverages (e.g., NetCDF/UCAR, the HDF Group, GDAL, PostGIS). In addition, we will hold two focused discussion sessions with attendees to digest and learn from the talks. Community feedback will be an important aspect of the workshop, as will sharing "tribal knowledge" from those in the open source community.
Detailed Agenda:
Presenters
8:30 - Introduction (Chris Mattmann)
8:40 - NASA ESDIS DRAG Open Source Effort - Andrew Mitchell/NASA - 25 mins + 5 mins questions
9:10 - ESRI Geo Portal Open Source Platform - Christine White/ESRI - 25 mins + 5 mins questions
9:40 - Discussion Topic: Process versus Project - should agencies develop an open source policy or use successful projects as a model?
10:00 - Break #1
10:30 - HDF Group - open source data management tools - Mike Folk/HDF Group - 25 mins + 5 mins questions
11:00 - NOAA Open Source Success stories (ESRI Geo Portal) - Yuanjie Li/NOAA - 25 mins + 5 mins questions
11:30 - TBD Speaker - 25 mins + 5 mins questions
12:00 - Adjourn and Further discussions
We expect strong synergy between the Geospatial Cluster and this workshop, including participation and attendance.
NASA Mini-Summit for Open Source
What does Open Source Mean for HDF?
Mike Folk
What is HDF:
- a data model
- open file format
- open source software
Two versions:
HDF4 & HDF5
- Both are relevant as far as the talk goes
History:
- U of Illinois, NCSA 1988-2006
- Non profit since 2006
- The best examples of open source projects have the same roots as we do
Mission
- managing large, complex data sets
- providing services for users
- sustainability over the long term
  - long-term access and usability of data
  - left the university to ensure this
HDF Communities
- Academia, government, commercial
- Large and complex data
What we do:
- EOS and JPSS, support in an enterprise way, whatever they need
- General maintenance, QA and Support
Open technology development
- they are a single maintainer open source company
Intellectual Property
- transferred to the HDF Group, with a royalty on commercial profits
- they have a BSD license (Berkeley Software Distribution)
  - people can use the software however they want to
Benefits of OSS, as they relate to HDF
- trying before adopting
- if it almost works, you can modify it to make it work
  - usually better to go to them for help
- development activities are public
- freedom to develop tools that make HDF more usable
- Long-term access
  - The code is open; if the company stops, the code is still available
Aspects of OSS we’re less sure about, as they relate to HDF
- unpaid contributors can do much core work
  - has not been their experience
  - not software that a lot of people are interested in working on
  - they have paid staff, which has worked better for them so far
- given enough eyeballs, all bugs are shallow
  - has not really worked for them
- frequent development cycles are good
  - when you have people writing applications against a particular API
    - no documentation
    - critical API
    - change the API
  - this can be detrimental to communities
    - need to be slow and work with the user community
- OSS is easy to use
  - Need to qualify this
- OSS is low cost
  - There are a lot of expenses
- It is easy to run a community-based OSS project
  - Managing one takes real investment
  - Not simple
- OSS business models
  - All completely different
Comments/Questions:
- If they moved to a community model, maybe they would get more community involvement
  - For example, in documentation or other non-coding areas
- Contradiction: only the HDF Group understands the complexity, so if the HDF Group went away, no one would understand it
  - Maintaining a knowledge core of developers and supporters will help
  - If the core is not there, at least you have the source code
Ken Casey: NOAA Open Source
- NOAA National Oceanographic Data Center
- Real world demonstration of OSS
- NODC
  - Data services
  - Have done usability tests (will present on Thursday)
- NODC
  - Holdings range from space data to numerical models; heterogeneous, with a lot in HDF and similar formats
- Scientific Stewardship
  - Acquire
  - Archive: understandability over the long term
  - Access
  - Add Value
NODC Archive Paradigm
- Human and machine interfaces
  - Google, data.gov, GOS, OAS, Geoportal Server Web App
- Machine to Machine
  - CSW, Geoportal, REST API, OpenSearch
  - Metadata: FGDC, ISO, ad hoc, minimal standardized metadata
- Geoportal Highlights (additional capabilities)
  - Multiple service links
  - Temporal search
  - Browse by keywords
  - Data visualization
- Cart functions
  - Working with ESRI developers
- Other Improvements
  - Ocean Locator
  - Spatial search
  - Discover by Browse
  - Thumbnail graphics
  - Users are not just scientists; they range from schoolchildren to scientists
Machine to Machine
- OpenSearch
- ArcMap
- Geoportal Server REST URL
  - Allows users/community to do something with it
Geoportal Server Knowledge Sharing
- working to make sure partners are aware
- knowledge and community
Open Source
- Force multipliers exist
- Choose open source and standards first
- Pay external experts to develop features
- ESRI
- OPeNDAP, Inc
Comments/Questions:
- How does what you are doing connect with the Fed-wide platform?
  - Still in progress
  - Will be working out the connections
  - Connections should be easier than in the past
- Takeaway from the talk: start with what is already available as OSS and, if it doesn't work, develop what you need. The commenter sees some good in starting internally, for example with some of the long-term archive parts of NASA, which present a lot of value, and does not believe that open source should be the starting point.
  - Ken's view is that for the most general functions, such as searching a catalog, open source makes sense
  - They have broken one of their most custom systems down from a monolith into more loosely coupled components, so that they can apply some open source pieces to the older systems
Discussion
Concerns:
- Licensing
- Redistribution
- Open source “help desk” syndrome versus community
  - Need to nurture the environment: people have to actually want to create this environment or learn
  - Meritocracy: have various structures, established wait-times, and established social interactions
  - Cross-fertilization
- Fear of messing up code:
  - community models
  - start small, with people you know and trust
- Trust is at the heart of the role: unless there is trust, people won’t come to them for data or give them data
  - hard to quantify, but in favor of open source