ToolMatch Implementation Party

Abstract/Agenda: 

ToolMatch (http://wiki.esipfed.org/index.php/ToolMatch) is a community-built set of semantic web applications that matches datasets with the tools that work on them. Simply by contributing information about datasets and the tools that work with them, we will be able to present data users with a comprehensive list of useful, appropriate tools. Work on ToolMatch is currently underway within the Semantic Web cluster. At the ESIP ToolMatch Implementation Party, we will be populating the system with datasets, tools and the relationships between them.

All Potential Contributors Are Welcome!

There is no minimum skill level: in the first 15 minutes, we will teach you everything you need to know to contribute.

Notes: 

ToolMatch party

Overview

Finding tools for data

  • finding out which tools work for a given data set
  • customer service survey says that we should provide more tools for data
  • there is simply no central (or distributed) place you can go to find this out

Supposed to be a very simple tool with a simple purpose: “show me which tools work with this type of data”

  • we know there’s a need out there
  • low learning curve: anyone should be able to participate at any time
  • will never be done, there are always new datasets/new data/new

Turtle: a language for describing relationships

  • basic form is the triple
  • subject predicate object
  • analogous to English

Turtle syntax:

Panoply            isCompatibleWith        AIRS_Standard_Retrievals .
                   drawsMapsOf
Subject            predicate               object

Add additional statements to show what’s what

- is-an-entity-of-type
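The statements above can be sketched in a few lines of Python that hold each triple as a (subject, predicate, object) tuple and print it in Turtle form. This is only an illustration: the `tm` prefix, the `http://example.org/toolmatch#` namespace, and the class names `Tool` and `DataCollection` are assumptions made for the example, not the names ToolMatch actually uses. In Turtle, the keyword `a` is shorthand for "is an entity of type".

```python
# Hypothetical prefix and namespace, chosen only for this sketch.
PREFIX = "tm"
NAMESPACE = "http://example.org/toolmatch#"

# Each statement is a (subject, predicate, object) triple.
triples = [
    ("Panoply", "isCompatibleWith", "AIRS_Standard_Retrievals"),
    # "a" is Turtle shorthand for rdf:type ("is an entity of type").
    # Tool and DataCollection are hypothetical class names.
    ("Panoply", "a", "Tool"),
    ("AIRS_Standard_Retrievals", "a", "DataCollection"),
]


def term(name):
    """Return a prefixed name, leaving the Turtle keyword 'a' untouched."""
    return name if name == "a" else f"{PREFIX}:{name}"


def to_turtle(statements):
    """Render the triples as Turtle text, one statement per line."""
    lines = [f"@prefix {PREFIX}: <{NAMESPACE}> .", ""]
    for subject, predicate, obj in statements:
        lines.append(f"{term(subject)} {term(predicate)} {term(obj)} .")
    return "\n".join(lines)


turtle_text = to_turtle(triples)
print(turtle_text)
```

Each printed statement ends with a period, just as the Panoply example does.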

Identify a data collection:

  • want to be really specific and use official identifiers: a URI or IRI
  • or a DOI
  • or other identifiers, as long as they're official

Identify tools

  • identifiers for tools are not as mature as those for datasets

Carving out your own namespace

  • we have a URL on the wiki which will be the base for the namespace
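To make the namespace idea concrete, here is a minimal sketch of how a short prefixed name expands to a full identifier by appending the local name to the namespace base. The base `http://example.org/toolmatch#` and the prefix `tm` are placeholders invented for this example; the real base is the wiki URL mentioned above.

```python
# Placeholder prefix-to-base mapping; not the actual ToolMatch namespace.
NAMESPACES = {"tm": "http://example.org/toolmatch#"}


def expand(prefixed_name, namespaces=NAMESPACES):
    """Expand a name such as 'tm:Panoply' into a full IRI."""
    prefix, _, local_name = prefixed_name.partition(":")
    if prefix not in namespaces or not local_name:
        raise ValueError(f"cannot expand {prefixed_name!r}")
    return namespaces[prefix] + local_name


print(expand("tm:Panoply"))
```

Because every contributor's names hang off a base they control, two groups can each describe a tool called "Viewer" without their identifiers colliding.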

Be more expressive:

isCompatibleWith

  • visualizes-maps
  • reformats
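One way to read the list above is that the more expressive predicates are refinements of isCompatibleWith: a tool that draws maps of a collection is also compatible with it. The sketch below assumes that reading, which the notes do not state outright. The spelling `visualizesMaps` and the tool name `ExampleReformatter` are hypothetical.

```python
# Assumption: these predicates are more specific forms of isCompatibleWith.
SPECIFIC_PREDICATES = {"visualizesMaps", "reformats"}


def with_implied_compatibility(triples):
    """Add an isCompatibleWith triple for every more specific statement."""
    result = set(triples)
    for subject, predicate, obj in triples:
        if predicate in SPECIFIC_PREDICATES:
            result.add((subject, "isCompatibleWith", obj))
    return result


stated = {
    ("Panoply", "visualizesMaps", "AIRS_Standard_Retrievals"),
    # ExampleReformatter is a made-up tool name for illustration.
    ("ExampleReformatter", "reformats", "AIRS_Standard_Retrievals"),
}
inferred = with_implied_compatibility(stated)
for triple in sorted(inferred):
    print(triple)
```

With this approach a contributor only writes the most specific statement they know, and the general compatibility statement follows from it.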

Dydra:

  • this is where the triples are stored

Questions:

Versioning:

  • we can handle it iteratively as the project evolves

Advanced Issues:

  • there are a number of more advanced issues, such as homonyms, that will be worked out with technology experts

Will people solve differently placed instances?

  • yes, technology experts will help

Camel Case

  • two types of Camel Case,
  • questions of the

Additional notes and observations:

This session focused on the work being done to create a tool that helps you figure out which tools work with different data sets. It uses Turtle, a language for describing relationships as triples (subject, predicate, object). Turtle is a standard format that can be ingested into off-the-shelf databases called triple stores. There was also a discussion about camel case, a naming convention in which significant words are capitalized instead of being separated by underscores. It is a syntax convention only.
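As a small illustration of the camel case convention, the sketch below converts underscore- or hyphen-separated names into camel case. The notes mention two types of camel case without naming them; this example assumes they are the usual upper form (first word capitalized, often used for types) and lower form (first word lowercase, often used for predicates). The function name `to_camel_case` is made up for the example.

```python
def to_camel_case(name, upper=True):
    """Join the words of a name, capitalizing each significant word.

    With upper=False the first word keeps a lowercase initial letter.
    """
    words = [word for word in name.replace("-", "_").split("_") if word]
    if not words:
        return ""
    cased = [word[0].upper() + word[1:] for word in words]
    if not upper:
        cased[0] = cased[0][0].lower() + cased[0][1:]
    return "".join(cased)


print(to_camel_case("data_collection"))
print(to_camel_case("is_compatible_with", upper=False))
print(to_camel_case("visualizes-maps", upper=False))
```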

During this workshop, we used Panoply to find data sets, then created triples in Turtle based on the relationships shown. Panoply is a great tool for working with netCDF or other gridded data and a good place to start.

More information about the tool and the agenda for the session:

http://wiki.esipfed.org/index.php/ToolMatch_Talkoot

Examples of triples created by the groups:

http://twc.titanpad.com/392

http://twc.titanpad.com/393

http://twc.titanpad.com/394

Identifier: 
doi:10.7269/P3F769GS
Citation:
Lynnes, C.; ToolMatch Implementation Party; Summer Meeting 2012. ESIP Commons, June 2012

Comments

There was a lot of good feedback from this session. After some prodding, we were able to get people to enumerate triples describing the compatibility of tools and data collections. There was also good feedback about the ToolMatch concept: people caught on quickly that enumerating specific tools that work with specific data collections would not scale (or was not the most logical way to collect this data). Rather, most realized that it makes more sense to map tools to data access methods or data formats, and then map individual data collections to those same access methods or data formats.
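The mapping idea in this comment can be sketched as a simple join: each tool lists the formats it reads, each collection lists the formats it is offered in, and compatibility is derived wherever the two overlap. With N tools and M collections, contributors then maintain roughly N + M statements instead of up to N × M. All names below except Panoply and netCDF are invented for illustration, and the format assignments are sample data, not verified facts about any real collection.

```python
# Illustrative data only; ExampleHdfTool and both collections are made up.
tool_formats = {
    "Panoply": {"netCDF"},
    "ExampleHdfTool": {"HDF"},
}
collection_formats = {
    "ExampleGridCollection": {"netCDF"},
    "ExampleSwathCollection": {"HDF"},
}


def compatible_pairs(tools, collections):
    """Derive isCompatibleWith triples from shared data formats."""
    pairs = set()
    for tool, readable in tools.items():
        for collection, offered in collections.items():
            if readable & offered:  # at least one format in common
                pairs.add((tool, "isCompatibleWith", collection))
    return pairs


derived = compatible_pairs(tool_formats, collection_formats)
for triple in sorted(derived):
    print(triple)
```

Adding a new collection then takes one statement about its format, and every tool that reads that format picks it up automatically.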