2
votes

I am new to RDF/ontologies and the way to work in this domain is a bit unclear to me. Currently I am troubled about triple validation. I still think in the relational way where I first create a schema and then in order to insert data I need to follow that structure.

In order to insert triples I use Jena and then use the generated String to execute an Insert command. So here are my questions:

  • Is there a way to validate that the RDF triples I generate (currently using Jena) actually follow the structure of my ontology?

  • Or, is there a way to be informed by the RDF store (currently Virtuoso) when I execute an Insert that violates the ontology? I did some searching and found: "How to Import Ontology into Virtuoso?". Does this mean that my triples will be checked/validated against my ontology?

  • I have also found the "RDF and OWL workflow" question. There it says that

    1. after having created my ontology I should
    2. export the ontology as RDF in order to
    3. import it into an RDF store. Does importing my ontology into Virtuoso (as described in: "How to Import Ontology into Virtuoso?") mean that steps 1 and 2 are complete and I am now executing step 3?

2 Answers

4
votes

A) This kind of validation is not part of the RDF model, so most triplestores/APIs do not support it. I would recommend that you let go of this mindset, but to answer your question: yes, there are tools that allow you to do this kind of thing. One example is the Pellet OWL reasoner, which has a constraint validation mode, but I'm sure there are others. You can of course also implement your own validation: by implementing some sort of parser listener that checks incoming triples, by doing after-the-fact checking on your triplestore with queries, or by using an RDF-OO mapping solution such as AliBaba or Empire.
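To make the "check incoming triples yourself" option a bit more concrete, here is a minimal sketch in plain Python (no Jena, no Pellet). Everything in it is made up for illustration: the `ex:` names, the `SCHEMA` and `TYPES` tables, and the functions `check_triple` and `validate` are all hypothetical. Note that it treats domain and range as constraints, which is a closed-world reading; under RDFS semantics, domain and range are used to infer types, not to reject data.

```python
# Hypothetical mini-schema: predicate -> (expected subject class, expected object class).
# None of these names come from a real ontology.
SCHEMA = {
    "ex:worksFor": ("ex:Person", "ex:Company"),
    "ex:knows": ("ex:Person", "ex:Person"),
}

# Known type assertions (resource -> class), also invented for this example.
TYPES = {
    "ex:alice": "ex:Person",
    "ex:bob": "ex:Person",
    "ex:acme": "ex:Company",
}


def check_triple(triple, schema=SCHEMA, types=TYPES):
    """Return a list of problems for one (subject, predicate, object) triple."""
    s, p, o = triple
    if p not in schema:
        return [f"unknown predicate {p}"]
    domain, range_ = schema[p]
    problems = []
    if types.get(s) != domain:
        problems.append(f"{s} is not a {domain}")
    if types.get(o) != range_:
        problems.append(f"{o} is not a {range_}")
    return problems


def validate(triples):
    """Split triples into accepted ones and (triple, problems) rejections."""
    accepted, rejected = [], []
    for t in triples:
        problems = check_triple(t)
        if problems:
            rejected.append((t, problems))
        else:
            accepted.append(t)
    return accepted, rejected
```

You would call `validate(...)` on the triples you are about to send and only build the Insert from the accepted ones. A real setup would more likely lean on a reasoner or a validation library than on hand-written tables like these.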

B) I am not sufficiently familiar with Virtuoso to be 100% sure, but I suspect that it does not validate inserts against the schema. As I said, this is an unusual thing to do in the RDF world.

C) (Updated): yes, if you are importing your ontology into Virtuoso, then you are indeed loading it into a triplestore, so that's all 3 steps taken care of.

2
votes

RDF graphs follow the "Open World" model, which is radically different from the "Closed World" model of SQL tables. In SQL, you can only fill in the cells of the tables as pre-defined by your schema. In RDF, "anyone can say anything about anything, at any time." This gives much freedom, and much power, but does need some learning to be taken advantage of.

You might think of each triple as corresponding to a single cell in a SQL table: entity, attribute, value (or subject, predicate, object) roughly match up to primary key, column, value. In the SQL world, it's best if every cell in a table is populated, and an empty cell may itself be intended to carry a meaning. In the RDF world, sparse data (with many empty cells) tends to be the rule, and those empty cells don't have meaning beyond "we don't have a value for that."
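As a rough illustration of that correspondence, the sketch below turns table rows into triples in plain Python. The function name `rows_to_triples` and the sample rows are hypothetical; the only point is that an empty cell produces no triple at all.

```python
def rows_to_triples(rows, key_column):
    """Turn table rows (dicts) into (subject, predicate, object) triples.

    Cells holding None are skipped: no triple is emitted for them.
    """
    triples = []
    for row in rows:
        subject = row[key_column]  # primary key -> subject
        for column, value in row.items():
            if column == key_column or value is None:
                continue
            triples.append((subject, column, value))  # column -> predicate
    return triples


rows = [
    {"id": "alice", "name": "Alice", "email": "alice@example.org"},
    {"id": "bob", "name": "Bob", "email": None},  # empty cell
]
triples = rows_to_triples(rows, key_column="id")
```

Here `bob` simply ends up with one triple fewer than `alice`. Nothing in the resulting data says that Bob has no email, only that none is recorded.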

New tools for checking whether a given dataset conforms to a "shape" (which might be an ontology) have come from the W3C in SHACL and related projects. These tools do not restrict what data might be input, but rather check whether the data you're working with fits the shape you intend.
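The "check, don't restrict" idea can be sketched in a few lines of plain Python. To be clear, this is not SHACL syntax and not any SHACL library's API: the `SHAPE` dictionary, the `ex:` names, and `conformance_report` are all invented for illustration, and the only constraint modelled is a minimum count per property.

```python
from collections import defaultdict

# Hypothetical "shape": every ex:Person must have at least one ex:name.
SHAPE = {"target_class": "ex:Person", "min_count": {"ex:name": 1}}


def conformance_report(triples, shape=SHAPE):
    """Return (conforms, violations) for the data; the data is never modified."""
    counts = defaultdict(lambda: defaultdict(int))
    targets = []
    for s, p, o in triples:
        counts[s][p] += 1
        # Collect the nodes the shape applies to, in order of first appearance.
        if p == "rdf:type" and o == shape["target_class"] and s not in targets:
            targets.append(s)
    violations = []
    for node in targets:
        for prop, minimum in shape["min_count"].items():
            found = counts[node][prop]
            if found < minimum:
                violations.append((node, prop, found, minimum))
    return (not violations, violations)
```

Properties the shape does not mention are ignored rather than rejected, so the data stays as open as before; the report only tells you where it falls short of the shape you had in mind.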

Virtuoso does not yet have built-in support for SHACL and related tools, but this is on the To-Do list. That said, external SHACL validation tools can be brought to bear on data in/from Virtuoso.

(ObDisclaimer: OpenLink Software produces Virtuoso, and employs me.)