ToolMatch Implementation Party

Abstract/Agenda:

ToolMatch (http://wiki.esipfed.org/index.php/ToolMatch) is a community-built set of semantic web applications that match datasets with the tools that work on them. Simply by contributing information about datasets and the tools that work with them, we will be able to present data users with a comprehensive list of useful, appropriate tools. Work on ToolMatch is currently underway within the Semantic Web cluster. At the ESIP ToolMatch Implementation Party, we will be populating the system with datasets, tools and the relationships between them.

All Potential Contributors Are Welcome!

There is no minimum skill level: we will teach you everything you need to know to contribute in the first 15 minutes.

Notes:

ToolMatch party

Overview

Finding tools for data

finding out which tools work for a given data set
the customer service survey says that we should provide more tools for data
there is simply no central (or distributed) place you can go to find them

Supposed to be a very simple tool with a simple purpose: “show me which tools work with this type of data”

we know there’s a need out there
low learning curve: anyone should be able to participate at any time
will never be done, there are always new datasets/new data/new

Turtle: a language for describing relationships

basic form is the triple
subject predicate object
analogous to English

Turtle syntax:

Panoply isCompatibleWith AIRS_Standard_Retrievals .

drawsMapsOf

Subject predicate object
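To make the subject/predicate/object shape concrete, here is a minimal Python sketch that holds a triple and prints it in the simplified form used in these notes. The `Triple` class and `to_statement` method are hypothetical names made up for illustration, and pairing `drawsMapsOf` with the AIRS collection is only an example. Note that strict Turtle expects IRIs or prefixed names rather than bare words, so the output is the shorthand from the session, not something a parser would necessarily accept.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object (hypothetical helper)."""
    subject: str
    predicate: str
    obj: str

    def to_statement(self) -> str:
        # Simplified note-style form: three terms, then a terminating period.
        return f"{self.subject} {self.predicate} {self.obj} ."


# The statement from the notes, plus an illustrative, more specific one.
compat = Triple("Panoply", "isCompatibleWith", "AIRS_Standard_Retrievals")
maps = Triple("Panoply", "drawsMapsOf", "AIRS_Standard_Retrievals")

print(compat.to_statement())
print(maps.to_statement())
```

Reading each printed line left to right gives the same order as a short English sentence, which is the analogy made above.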

Add additional statements to show what’s what

- is-an-entity-of-type

Identify a data collection:

want to be really specific and use official identifiers: a URI or IRI,
or a DOI,
or other identifiers, as long as they're official
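As a rough sketch of what "official identifier" can look like in practice, the Python below turns a `doi:` string into a resolvable IRI through the doi.org resolver and wraps it in angle brackets, which is how Turtle writes a full IRI. The function names `doi_to_iri` and `as_turtle_term` are hypothetical, and the DOI used is simply the one printed on this session page, borrowed as a sample string rather than as a real data collection.

```python
def doi_to_iri(doi: str) -> str:
    """Turn 'doi:10.xxxx/yyy' or a bare DOI into an https IRI (sketch)."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + doi


def as_turtle_term(identifier: str) -> str:
    """Wrap an identifier in angle brackets, the Turtle form for a full IRI."""
    if identifier.lower().startswith("doi:"):
        identifier = doi_to_iri(identifier)
    return f"<{identifier}>"


# Sample string only: this is the session page's DOI, not a dataset's.
print(as_turtle_term("doi:10.7269/P3F769GS"))
```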

Identify tools

identifiers for tools are not as mature as those for datasets

Carving out your own namespace

we have a URL on the wiki that will be the base for the namespace
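One way to picture a namespace is as a prefix that expands short names into full IRIs. The Python sketch below assumes a made-up `tm` prefix and an assumed base built from the ToolMatch wiki URL; the notes do not record the exact namespace IRI, so both are placeholders. `prefix_declaration` and `expand` are hypothetical helper names.

```python
# Assumption: the real namespace IRI is not given in the notes; this base
# is derived from the wiki URL purely for illustration.
PREFIXES = {"tm": "http://wiki.esipfed.org/index.php/ToolMatch/"}


def prefix_declaration(prefix: str) -> str:
    """Build the Turtle @prefix line for a known prefix."""
    return f"@prefix {prefix}: <{PREFIXES[prefix]}> ."


def expand(name: str) -> str:
    """Expand a prefixed name such as 'tm:Panoply' into a full IRI."""
    prefix, _, local = name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return PREFIXES[prefix] + local


print(prefix_declaration("tm"))
print(expand("tm:Panoply"))
```

Keeping everything under one base means two groups that both write `tm:Panoply` end up referring to the same thing.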

Be more expressive:

isCompatibleWith

visualizes-maps
reformats

Dydra:

this is where the triples are stored

Questions:

Versioning:

we can handle it iteratively as the project proceeds

Advanced Issues:

a bunch of more advanced issues, such as homonyms, that will be worked out with technology experts

Will people solve differently placed instances?

yes, technology experts will help

Camel Case

two types of Camel Case,
questions of the

Additional notes and observations:

This session focused on the work being done to create a tool that helps you figure out which tools to use with different data sets. It uses Turtle, a language for describing relationships. Turtle is a standard format for ingesting data into off-the-shelf databases called triple stores (a triple is subject, predicate, object). There was also a discussion about camel case, in which significant words are capitalized instead of being separated by underscores. It is a syntax convention.
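For anyone unsure what the camel case convention does to a name, here is a small Python sketch that converts underscore- or hyphen-separated words. It covers the two common variants, lower and upper camel case; the notes do not record which two types were actually discussed, so that pairing is an assumption, and `to_camel_case` is a hypothetical helper.

```python
def to_camel_case(name: str, upper_first: bool = False) -> str:
    """Join words split on '_' or '-' into camel case (sketch).

    Note: str.capitalize() lowercases the rest of each word, so an
    acronym such as 'AIRS' would come out as 'Airs'.
    """
    words = [w for w in name.replace("-", "_").split("_") if w]
    if not words:
        return ""
    first = words[0].capitalize() if upper_first else words[0].lower()
    return first + "".join(w.capitalize() for w in words[1:])


print(to_camel_case("is_compatible_with"))
print(to_camel_case("visualizes-maps"))
print(to_camel_case("draws_maps_of", upper_first=True))
```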

During this workshop, we used Panoply to find data sets, then created triples in Turtle based on the relationships shown. Panoply is a great tool for working with netCDF or other gridded data and a good place to start.

More information about the tool and the agenda for the session:

http://wiki.esipfed.org/index.php/ToolMatch_Talkoot

Examples of triples created by the groups:

http://twc.titanpad.com/392

http://twc.titanpad.com/393

http://twc.titanpad.com/394

Identifier:

doi:10.7269/P3F769GS

Citation:

Lynnes, C.; ToolMatch Implementation Party; Summer Meeting 2012. ESIP Commons, June 2012

Submitted by superadmin on 2012-06-29 15:55.

Comments

Session Notes

Submitted by Rozele on Thu, 2012-07-19 08:49

There was a lot of good feedback from this session. After some prodding, we were able to get people to enumerate triples regarding the compatibility of tools and data collections. People also responded well to the ToolMatch concept, and caught on quickly that enumerating specific tools that work with specific data collections would not scale (or was not the most logical way of going about collecting this data). Rather, most realized that it makes more sense to map tools to data access methods or data formats, and then map individual data collections to those same access methods or data formats.
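The indirect mapping described here can be sketched in a few lines of Python: tools are linked to formats, collections are linked to formats, and compatibility falls out wherever the two share a format. Apart from Panoply and netCDF, every name in the tables below is a made-up placeholder, and `tools_for` is a hypothetical helper rather than anything ToolMatch is known to implement.

```python
# Tool -> formats it can read. Only Panoply/netCDF comes from the session;
# the other entries are placeholders.
TOOL_FORMATS = {
    "Panoply": {"netCDF"},
    "ExampleTool": {"ExampleFormat"},
}

# Collection -> formats it is offered in (all placeholder data).
COLLECTION_FORMATS = {
    "ExampleCollectionA": {"netCDF"},
    "ExampleCollectionB": {"ExampleFormat", "netCDF"},
}


def tools_for(collection: str) -> list[str]:
    """List tools that share at least one format with the collection."""
    formats = COLLECTION_FORMATS.get(collection, set())
    return sorted(tool for tool, fmts in TOOL_FORMATS.items() if fmts & formats)


print(tools_for("ExampleCollectionA"))
print(tools_for("ExampleCollectionB"))
```

With this layout, adding a new collection takes one entry naming its formats instead of one statement per compatible tool, which is the scaling point the participants raised.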