A bit of discussion on URIs in the T-Box has been popping up lately on the W3C’s Semantic Web mailing list.  At issue in a thread on “Best Practice for Renaming OWL Vocabulary Elements” is whether T-Box URIs should be somewhat meaningful names, like foo:fatherOf, or meaningless names, like foo:R9814.

There are pros and cons for both sides.  Meaningful T-Box URIs are easier to use when making ontologies and writing queries.  However, it’s very difficult to change a URI after you’ve created it (you may later become dissatisfied with an unforeseen connotation of your URI).  Moreover, you have to accept that the language you’re using (like English or Tagalog) may not be understood by a user.

Meaningless T-Box URIs have the pros and cons reversed – harder for creating ontologies and writing queries, easier for lifecycle management and (in theory) buy-in from non-native speakers.  To sweeten the deal for these meaningless URIs, advocates point out that tools can be written to correct for the difficulties in ontology and query authoring.
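To make the "tools can fix it" argument concrete, here is a minimal sketch of what such a tool might do: let the author write a query with readable labels and expand them to opaque URIs before the query is run. Everything here is an assumption for illustration: the `label:` pseudo-prefix, the `expand_labels` helper, and the lookup table are hypothetical (only foo:R9814 comes from the example above), and a real tool would presumably read labels from the ontology itself (rdfs:label, say) rather than from a hard-coded dict.

```python
import re

# Hypothetical lookup table: readable label -> opaque T-Box URI.
# foo:R9814 is the example from the post; foo:C0042 is made up.
LABELS = {
    "fatherOf": "foo:R9814",
    "Person": "foo:C0042",
}

def expand_labels(query, labels=LABELS):
    """Replace each label:NAME token with the opaque URI registered for NAME."""
    def lookup(match):
        name = match.group(1)
        if name not in labels:
            raise KeyError(f"no URI registered for label {name!r}")
        return labels[name]
    # 'label:' is an invented pseudo-prefix, not part of SPARQL.
    return re.sub(r"\blabel:(\w+)", lookup, query)

readable = "SELECT ?child WHERE { ?dad a label:Person . ?dad label:fatherOf ?child }"
print(expand_labels(readable))
```

The catch is visible even in this toy: the readable names have not gone away, they have moved into a side table that someone still has to maintain, translate, and keep in sync with the ontology.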

This all brings to mind a division of labor in the ontology/semantic web community, which you might call A-ontology and T-ontology (tracking the distinction between the A-Box and T-Box in description logics).

A-ontology is focused on analyzing data, leveraging the implicit semantics found in datasets.  Ontologies are a tool for declaring semantics so that:

  • It’s clear what assumptions and interpretations have been made in the analysis
  • A reasoner can read and process the additional semantics
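As a rough illustration of both points, the sketch below declares one piece of semantics (that foo:fatherOf is a sub-property of a hypothetical foo:parentOf) and lets a toy reasoner apply it to some data. The triples, the `infer` function, and every name other than foo:fatherOf are assumptions made up for the example; a real analysis would hand this job to an actual OWL or RDFS reasoner.

```python
# Declared semantics (T-Box): property -> its super-property.
# foo:parentOf is hypothetical; foo:fatherOf is the example from the post.
SUBPROPERTY_OF = {"foo:fatherOf": "foo:parentOf"}

# Made-up data (A-Box) as (subject, predicate, object) triples.
DATA = {("ex:abe", "foo:fatherOf", "ex:ben")}

def infer(triples, subproperty_of):
    """Add (s, q, o) whenever (s, p, o) holds and p is declared a sub-property of q."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triples appear, so chains are followed
        changed = False
        for s, p, o in list(inferred):
            parent = subproperty_of.get(p)
            if parent is not None and (s, parent, o) not in inferred:
                inferred.add((s, parent, o))
                changed = True
    return inferred

for triple in sorted(infer(DATA, SUBPROPERTY_OF)):
    print(triple)
```

The assumption behind the extra triple is written down in one place where a reader can inspect it, and it is in a form a program can act on.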

There’s no community effort to re-use the ontology going on, so these ontologies are narrowly purpose-driven.  Not to say the semantics comes cheap – a data analysis is ill-served by a rush to charts and graphs.

T-ontology is a bit different, primarily focused on sharing re-usable terminology.  The ontology is not a tool, but rather the product of T-ontology work.  Communities of interest and stakeholders factor into the development of the ontology, since they and the people they represent are going to be using the fruits of this labor.

These two kinds of ontology work intersect in practice.  A-ontology will often use data that’s been published with semantics governed by the product of T-ontology.  If significant results are found when performing A-ontology work, the analysts may consider publication of their results, meaning a re-consideration of terminology and alignment with community standards.

A realization of the Linked Data dream of a globally-distributed data space is an undoubtedly good thing.  If meaningless T-Box URIs help this dream along, then we just need to be sure we’re not crimping the style of A-ontology.  If tools have to be written, then they need to fit the workflow of A-ontology before changes to RDF or SPARQL are made (and most modern data analysis takes place at the command line with scripts on the side – GUIs and faceted browsers won’t find a large audience here).

As things currently stand (with RDF 2004 and SPARQL 1.1), meaningless URIs would overburden A-ontology efforts.  It’s hard to imagine how I’d productively use the rrdf library (or rdflib+scipy or seabass+incanter or any ontology and data analysis library combination) if I had to deal with meaningless URIs in the T-Box.