Do you want your datasets to show up on the first screen of search engine results? Of course you do! This hands-on session is for people who would like to learn more about the extension to the schema.org vocabulary for datasets and data catalogs. We will begin with the basics, introducing and then highlighting the Dataset and DataCatalog classes. Many data providers generate dataset landing pages on the fly by pulling information from a relational database, perhaps enhanced with content from a semantic knowledge base. The original structured data is often obscured when the dataset landing pages are formatted in HTML, but recent vocabularies provided by schema.org can help restore the structure through a shared markup vocabulary. Proper use of the vocabulary, encoded in microdata, RDFa, or JSON-LD, exposes structured and parsable information in content published as HTML, and the marked-up pages are recognized by search engines.
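As a rough illustration of the idea above, the following Python sketch builds a Dataset description as JSON-LD from a database-style record and embeds it in a generated landing page. It assumes the vocabulary in question is schema.org; the record fields, function names, and example.org URL are hypothetical and only stand in for whatever a provider's own database holds.

```python
import json
from html import escape


def dataset_jsonld(record):
    """Map a database-style record (a plain dict here) to Dataset JSON-LD."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": record["title"],
        "description": record["abstract"],
        "url": record["landing_page"],
        "keywords": record["keywords"],
    }


def landing_page(record):
    """Render a minimal HTML landing page that carries the JSON-LD markup."""
    doc = dataset_jsonld(record)
    # json.dumps does not escape "</", so guard against closing the script tag early.
    payload = json.dumps(doc, indent=2).replace("</", "<\\/")
    return (
        "<html><head>\n"
        f"<title>{escape(record['title'])}</title>\n"
        f'<script type="application/ld+json">\n{payload}\n</script>\n'
        "</head><body>\n"
        f"<h1>{escape(record['title'])}</h1>\n"
        f"<p>{escape(record['abstract'])}</p>\n"
        "</body></html>"
    )


# Hypothetical record, standing in for a row pulled from a relational database.
example_record = {
    "title": "Hypothetical sea surface temperature survey",
    "abstract": "Illustrative record used only to demonstrate the markup.",
    "landing_page": "https://example.org/datasets/1234",
    "keywords": ["oceanography", "temperature"],
}

print(landing_page(example_record))
```

The human-readable HTML and the machine-readable JSON-LD come from the same record, so the structure held in the database is not lost when the page is published.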

Several ESIP partners have successfully implemented schema.org Dataset markup on their dataset landing pages and will guide participants through the process, including a review of the tools available from Google Webmaster Tools for validating and testing search results. Participants in this session are encouraged to bring a sample of their dataset landing pages (e.g. the HTML source code for a representative dataset landing page) and receive guidance on how to add Dataset markup.

Related information and examples:

Co-conveners: Doug Fils and Adam Shepherd


Doug Fils

Adam Shepherd



Code example (jsbin)


People want a search box, but discovery only occurs when people know where to look. Current practices don't do a good job of enabling data discovery.

Motivation: the community has expressed a need to be able to access data

Connections are lost in the code, portability and sustainability decrease, and the information becomes dark knowledge

What can we do about it?

APIs: difficult to use and not a good way to drive discovery

Linked open data.  Doesn't necessarily drive discovery.

Current search is not enough

Current tools, why the hate? They place machine-readable information into XML

The resolution: make the web representation of your data smarter

What is involved? schema.org Dataset is the only "science data" focused vocabulary at this time.

-extend your own vocabulary

-no guarantee that others will use it

-there is a formal extension pattern (see FAQ)

schema.org is not a magic bullet; there are other ways. It's not focused on geospatial data

It is in active development, new developments in the works

There are tools, e.g. Apache Any23
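To give a feel for what such tools do, here is a minimal Python sketch that pulls JSON-LD blocks out of an HTML page using only the standard library. It is not how Any23 itself works and it ignores microdata and RDFa; the class name, function name, and sample page are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            # Parse the collected text once the script element closes.
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html_text):
    """Return every JSON-LD document embedded in the given HTML text."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.documents


# Hypothetical landing page used only for illustration.
sample_html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Dataset", "name": "Hypothetical sample dataset"}
</script>
</head><body><h1>Hypothetical sample dataset</h1></body></html>"""

for doc in extract_jsonld(sample_html):
    print(doc["@type"], "-", doc["name"])
```

Running something like this against your own landing page is a quick way to check that the markup is present and parses as JSON before trying a full validator.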

Live demo:

Crafting a scientific knowledge graph


Virtuoso, Cayley (graph database from Google)

Bring forth the affordances:


need to be used to be sustainable.....

Live examples:

There needs to be a "data" tab in search engines, but for that to happen datasets need to be marked up with schema.org

If we mark up our dataset with information about tools, uses for the data, then new tools will be built that leverage that information

Use DataCatalog if you have multiple datasets in a group
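A minimal Python sketch of that grouping, assuming schema.org's DataCatalog type and its `dataset` property; the catalog name, dataset names, and example.org URLs are hypothetical.

```python
import json


def data_catalog_jsonld(name, url, datasets):
    """Describe a group of datasets as one DataCatalog in JSON-LD."""
    return {
        "@context": "https://schema.org",
        "@type": "DataCatalog",
        "name": name,
        "url": url,
        # Each member is a brief Dataset entry pointing at its own landing page.
        "dataset": [
            {"@type": "Dataset", "name": d["name"], "url": d["url"]}
            for d in datasets
        ],
    }


# Hypothetical catalog contents, for illustration only.
catalog = data_catalog_jsonld(
    "Hypothetical cruise data catalog",
    "https://example.org/catalog",
    [
        {"name": "Hypothetical CTD profiles", "url": "https://example.org/datasets/1"},
        {"name": "Hypothetical nutrient samples", "url": "https://example.org/datasets/2"},
    ],
)

print(json.dumps(catalog, indent=2))
```

The catalog entry can stay short because each dataset's own landing page carries its full Dataset markup.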

few examples, see notes for addresses....

tips for implementation....

Need to put notes in ESIP system...

How Adam and Doug implemented....ask for help if you need it.

Q: Is code viewable in pages

A: yes



Fils, D.; Shepherd, A.; Hack-A-Thon; Summer Meeting 2014. ESIP Commons, June 2014