Ontology Design Pattern-driven Linked Data Publishing


In recent years, Linked (Open) Data has emerged as a prominent framework for publishing structured data on the Web, and it has been adopted in various domains, including the geosciences. Linked Data allows data from different sources to be interlinked using HTTP Uniform Resource Identifiers (URIs) and to be machine-processable in a standard way via the Resource Description Framework (RDF). Interoperability and integration across different datasets are achieved by using a vocabulary that is agreed upon by the community or standardized by some governance body. Such a vocabulary is often specified in an ontology, which formalizes the semantics of the vocabulary terms being used. The challenge is that many ontologies, including domain ontologies, are too complicated, too restrictive, and difficult to use and understand. This leads many linked data publishers to avoid ontologies and simply use a less formal vocabulary. Although this keeps linked data publishing relatively simple, the resulting datasets have only low-quality metadata, making them harder to understand, interoperate with, and integrate. In this tutorial lecture, we shall introduce a modular ontology architecture based on so-called ontology design patterns, which are sufficiently flexible, easier to understand, and less restrictive, while allowing the linked datasets to be equipped with sufficiently high-quality metadata, enabling interoperability and easier integration across semantically heterogeneous datasets. We will demonstrate how such an ontology architecture works in a data integration setting, catering to multiple perspectives from different data providers as well as accommodating existing vocabularies that are already employed by the community.


Adila Krisnadhi, Data Semantics Lab @ Wright State University

-Semantic Technology
    -linked data: a set of triples whose elements are URIs/IRIs or literals; together the triples form a graph, and each triple expresses a link
    -simplicity leads to popularity
    -state of linked data: close to 100 billion triples
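
To make the "triples form a graph" point concrete, here is a minimal sketch in plain Python. The `http://example.org/` URIs, the property names, and the helper functions `outgoing` and `reachable` are all hypothetical and chosen only for illustration; a real dataset would normally be handled with an RDF library rather than bare tuples.

```python
# Hypothetical namespace; none of these URIs come from the talk.
EX = "http://example.org/"

# Each triple is (subject, predicate, object). Subjects and predicates are
# URIs; an object is either a URI or a literal (written here in quotes).
triples = {
    (EX + "station/42", EX + "vocab#locatedIn", EX + "region/ohio"),
    (EX + "station/42", EX + "vocab#name", '"Dayton Gauge"'),
    (EX + "region/ohio", EX + "vocab#partOf", EX + "country/us"),
}


def outgoing(subject, graph):
    """Return the sorted (predicate, object) pairs that leave one node."""
    return sorted((p, o) for s, p, o in graph if s == subject)


def reachable(start, graph):
    """Follow links from `start` and collect every node that can be reached."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for s, _p, o in graph:
            if s == node and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


if __name__ == "__main__":
    for predicate, obj in outgoing(EX + "station/42", triples):
        print(predicate, "->", obj)
    print(reachable(EX + "station/42", triples))
```

Because the object of one triple can be the subject of another, following links from the station reaches the country even though no single triple connects the two directly.
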
-What are the principles behind the publishing of linked data?
    -use web identifiers; ensure that URIs are web-resolvable so human AND machine can obtain further information about the things they represent
    -link to data from other parties as much as possible
    -you must decide how to prep vocabulary to describe/link your data
        -create URIs for your data and vocabulary
    -only mint a URI for X if X comes from your own local database/source (if it’s instance data) or if there’s no known URI for X (if it’s a vocab term)
        -don’t if X originates from an external source that you don’t maintain
        -don’t if you like the existing definition and it fits your current AND future needs
        -you must maintain any URIs that you mint
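
The minting rule above can be read as a small decision procedure. The sketch below is one interpretation of these notes, not a rule stated by the speaker: the function name `should_mint_uri` and its parameters are hypothetical, and the handling of a known vocabulary term that does not fit your needs (mint your own) is an assumption.

```python
def should_mint_uri(kind, is_local=False, known_uri=None, existing_fits=False):
    """Decide whether to mint a new URI for a thing X.

    kind          -- "instance" for instance data, "vocab" for a vocabulary term
    is_local      -- X comes from your own database/source (instance data only)
    known_uri     -- an existing URI for X, if one is known (vocab terms only)
    existing_fits -- the existing definition fits current AND future needs
    """
    if kind == "instance":
        # Mint only for data you own; never for data from an external
        # source that you do not maintain.
        return is_local
    if kind == "vocab":
        # Reuse an existing term when one is known and its definition fits.
        # Assumption: a known term that does not fit is treated like no term.
        return known_uri is None or not existing_fits
    raise ValueError("kind must be 'instance' or 'vocab'")


if __name__ == "__main__":
    print(should_mint_uri("instance", is_local=True))
    print(should_mint_uri("vocab", known_uri="http://example.org/v#River",
                          existing_fits=True))
```

Whatever the procedure returns, the last point in the list still applies: every URI you mint is one you have to keep maintaining.
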
-Hash vs slash URI: www.w3.org/wiki/HashVsSlash
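
One practical difference between the two styles is what an HTTP client actually requests. The fragment after `#` is not sent to the server, so all hash URIs in a vocabulary resolve to the same document, while each slash URI is requested separately. The sketch below illustrates this with the standard library; the `example.org` URIs and the helper `document_to_fetch` are hypothetical.

```python
from urllib.parse import urldefrag


def document_to_fetch(uri):
    """Return the URL an HTTP client requests for `uri` (fragment removed)."""
    url, _fragment = urldefrag(uri)
    return url


# Hypothetical vocabulary terms in both styles.
hash_terms = ["http://example.org/vocab#Station", "http://example.org/vocab#River"]
slash_terms = ["http://example.org/vocab/Station", "http://example.org/vocab/River"]

# Hash URIs collapse to one document; slash URIs stay distinct.
hash_docs = {document_to_fetch(u) for u in hash_terms}
slash_docs = {document_to_fetch(u) for u in slash_terms}

if __name__ == "__main__":
    print(len(hash_docs), "document(s) for the hash terms")
    print(len(slash_docs), "document(s) for the slash terms")
```

This is why hash URIs tend to suit small vocabularies served as a single file, while slash URIs let a server answer for each term individually.
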
-naming convention for URIs
    -use of dash and/or underscore, etc.
-every lookup of a URI should return something
-easing URI persistence: use permanent redirection through a PURL service (see www.purlz.org)
-"I have been told to reuse other ontologies"
    -yes, but don’t start here--start by defining your own and then align with existing ontologies later
-ontology principles:
    -small >>> large
    -modular >>> monolithic (easier to use as building blocks, highly extendible, easily understandable)
    -be aware of multiple perspectives: strike a balance between fostering interoperability and allowing semantic heterogeneity
    -add human-readable annotations
-ontology design pattern: ODP

Ontology Design Pattern-driven Linked Data Publishing; 2016 ESIP Summer Meeting. ESIP Commons, April 2016