1
votes

This is a follow-up to my question "Compare models for identity, but with variables? Construct with minus?". Either I forgot what I learned then, or I didn't learn as much as I had thought.

I have triples like this:

prefix : <http://example.com/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

:rose :color :red .
:violet :color :blue .
:rose a :flower .
:flower rdfs:subClassOf :plant .
:dogs :love :trucks .

I want to discover any triples in my triplestore that satisfy none of these rules (a triple is acceptable if it satisfies at least one):

  1. take :rose as the subject
  2. take :rose's parent class as the subject
  3. use any of the predicates that are used in any triple with :rose as the subject
  4. take, as their object, the object of any triple with :rose as the subject.

Thus, in this case, the exception-discovery query (either select or construct) should only return

:dogs :love :trucks .

This query shows what should be in the triplestore:

PREFIX : <http://example.com/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
construct where {
    :rose ?p ?o  .
    :rose a ?c .
    ?c rdfs:subClassOf ?super .
    ?s ?p ?x1 .
    ?x2 ?x3 ?o
}


+---+---------+-----------------+---------+
|   | subject |    predicate    | object  |
+---+---------+-----------------+---------+
| A | :flower | rdfs:subClassOf | :plant  |
| B | :rose   | :color          | :red    |
| C | :rose   | rdf:type        | :flower |
| D | :violet | :color          | :blue   |
+---+---------+-----------------+---------+

Is there a way to subtract that pattern out of everything in the triplestore, { ?s ?p ?o }, even though I'm using variable names other than ?s, ?p and ?o in the construct statement?

I have seen this post with strategies for comparing RDF, but I'd like to do it with standard SPARQL.

To tie it together with my earlier post: the query at the end of this post erroneously reports that several desired triples violate the rules set out at the top of the message.

+---+---------+-----------------+---------+
|   | subject |    predicate    | object  |
+---+---------+-----------------+---------+
| E | :dogs   | :love           | :trucks |
| A | :flower | rdfs:subClassOf | :plant  |
| D | :violet | :color          | :blue   |
+---+---------+-----------------+---------+

Triple E is indeed undesired. But triple A is desired because it has :rose's class as its subject (rule 2), and triple D is desired because its predicate is also used in a triple with :rose as the subject (rule 3).

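PREFIX  :     <http://example.com/>
PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>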
CONSTRUCT 
  { 
    ?s ?p ?o .
  }
WHERE
  { SELECT  ?s ?p ?o
    WHERE
      { { ?s  ?p  ?o }
        MINUS
          { :rose  ?p               ?o ;
                   rdf:type         ?c .
            ?c     rdfs:subClassOf  ?super .
            ?s     ?p               ?x1 .
            ?x2    ?x3              ?o
          }
      }
  }
1
Can you clarify a bit? "if my triplestore contains any triples that violate these rules: 1. aren't about :rose 2 ..." Is the rule that triples should be about rose (and by this, do you mean "rose is the subject" or "rose is the subject or object (or predicate)"), or that they shouldn't? It's usually not too hard to write SPARQL queries that find violations of some rules, but it needs to be clear what the rules are. - Joshua Taylor
Thanks. I was working on that vague section when you posted! Is it better now? I have more examples of undesired behavior at the bottom now, too. - Mark Miller
The problem is that the MINUS forms (as usual) a set of solution sequences and not a set of triples. And indeed it won't work to subtract it especially since you don't have shared variables, but here it's more critical that you're not subtracting triples inside a SELECT query. - UninformedUser
"I want to any triples in my triplestore that don't satisfy…"? Maybe you want to delete those? - TallTed
@TallTed thanks. no, I just want to know what they are for now. - Mark Miller

1 Answer

4
votes

If I understand this correctly, you want to allow four types of triples in your data. If a triple (s,p,o) is in your data, it should satisfy at least one of the following criteria:

  1. s = rose (About rose)
  2. p = rdfs:subClassOf, and data contains (rose,a,s) (About a type)
  3. The data also contains (rose,p,x) (Shared predicate)
  4. The data also contains (rose,q,o) (Shared object)

It's easy enough to write a pattern for each one of those. You just need to find each triple (s,p,o) and filter out the ones that match none of those criteria. I think you can do it like this:

prefix : <http://example.com/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

select ?s ?p ?o {
  ?s ?p ?o 
  filter not exists {
          { values ?s { :rose } }                       #-- (1)
    union { values ?p { rdfs:subClassOf } :rose a ?s }  #-- (2)
    union { :rose ?p ?x }                               #-- (3)
    union { :rose ?x ?o }                               #-- (4)
  }
}
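
Since the question asked specifically about MINUS and mentioned that a construct would also be fine, the same four criteria can probably be written that way too. The following is only a sketch, assuming the sample data from the question and the standard RDF Schema namespace for rdfs:. MINUS removes a solution only when the right-hand pattern shares at least one variable with it and agrees on every shared variable, so each MINUS block below deliberately reuses ?s, ?p or ?o from the outer triple pattern. The variable names ?anyObject and ?anyPredicate are arbitrary placeholders.

```sparql
PREFIX :     <http://example.com/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

CONSTRUCT { ?s ?p ?o }
WHERE {
  # Start from every triple in the store.
  ?s ?p ?o .

  # (1) About rose: drop solutions whose subject is :rose.
  MINUS { VALUES ?s { :rose } }

  # (2) About a type: drop rdfs:subClassOf triples whose subject is a class of :rose.
  MINUS { VALUES ?p { rdfs:subClassOf } :rose a ?s }

  # (3) Shared predicate: drop solutions whose predicate :rose also uses.
  MINUS { :rose ?p ?anyObject }

  # (4) Shared object: drop solutions whose object also appears on a :rose triple.
  MINUS { :rose ?anyPredicate ?o }
}
```

On the five sample triples, the only solution that none of the four MINUS blocks removes is the one for :dogs :love :trucks, which matches what the question asks for. The MINUS attempt in the question behaves differently because it puts all the conditions into a single pattern, so a triple is removed only if it matches every condition at once rather than any one of them.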