1
votes

After playing with MarkLogic, I realized that results from triples can be obtained in several ways, for example by using either XQuery or SPARQL exclusively. So my question is: are there any advantages to using SPARQL over XQuery? Is there some indexing going on that makes SPARQL much faster than an XQuery search for the same semantic query?

For instance, suppose we want to retrieve all semantic documents with the predicate "/like".

SPARQL

SELECT *
WHERE {
  ?s </like> ?o
}

XQuery

cts:search(fn:doc(), cts:element-query(xs:QName("sem:predicate"), "/like"))

So, is there any difference in efficiency between these two?


2 Answers

3
votes

Yes, there are definitely differences. Which of XQuery or SPARQL is more efficient, however, depends entirely on the problem you are trying to solve. XQuery is best at querying and processing document data, while SPARQL lets you reason easily over RDF data.

It is true that RDF data is serialized as XML in MarkLogic, and you can run a full-text search on it, and even put range indexes on it if you like. But RDF data is already indexed in the triple index, which gives you more accurate results than the full-text search above.

Also note that SPARQL allows you to follow predicate paths, which involves a lot of joining. That will be much more efficient if done via SPARQL than via XQuery, because it is mostly resolved via the triple index. Imagine a SPARQL query like this one:

PREFIX pers: <http://my.persons/>
PREFIX topic: <http://my.topics/>
PREFIX pred: <http://my.predicates/>
SELECT DISTINCT *
WHERE {
  ?person pred:likes topic:Chocolate;
          pred:friendOf+ ?friend.
  FILTER( ?friend = (pers:WhiteSolstice) )
  FILTER( ?friend != ?person )
}

It tries to find all direct and indirect friends that like chocolate. I wouldn't write something like that in XQuery.

Then again, there are other things that are easy in XQuery and practically impossible in SPARQL. Sometimes the most efficient approach is to combine the two: call sem:sparql from inside XQuery, and use the results to direct further processing in XQuery. It also sometimes comes down to what shape your data is in.
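As a rough illustration of combining the two, here is a minimal sketch that runs the "/like" pattern from the question through sem:sparql and then post-processes each solution in XQuery. The relative IRI </like> is taken from the question, and the <like>, <subject> and <object> wrapper elements are made up purely for the example; it also assumes the triple index is enabled on the database.

```xquery
xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics"
  at "/MarkLogic/semantics.xqy";

(: Step 1: let the triple index resolve the pattern via SPARQL. :)
let $rows := sem:sparql('
  SELECT ?s ?o
  WHERE { ?s </like> ?o }
')

(: Step 2: shape each solution in XQuery.
   Each row is a map keyed by variable name (without the "?"). :)
for $row in $rows
let $s := map:get($row, "s")
let $o := map:get($row, "o")
return
  (: Hypothetical output format, chosen only for illustration. :)
  <like>
    <subject>{fn:string($s)}</subject>
    <object>{fn:string($o)}</object>
  </like>
```

From here the bound values could just as well feed a cts:search or any other document-oriented processing, which is where XQuery is the more natural fit.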

HTH!

3
votes

A little nuance here: search is about searching for documents. Unless you have one triple per document, fetching just the matching triples out of a bunch in a document will involve pulling the whole document from disk (although it may be in cache). SPARQL is about selecting triple data from the triple indexes, which may involve less disk I/O. Certainly, if you are doing anything other than a simple fetch of a simple triple pattern, you're going to need the understanding of relationships that SPARQL gives you.
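To make the contrast concrete, the triple index can also be read directly from XQuery with cts:triples, without fetching the containing documents. This is only a sketch: it reuses the "/like" predicate from the question and assumes the triple index is enabled.

```xquery
xquery version "1.0-ml";

(: Read matching triples straight from the triple index.
   Empty sequences act as wildcards for subject and object. :)
for $t in cts:triples((), sem:iri("/like"), ())
return
  fn:concat(
    sem:triple-subject($t), " likes ", sem:triple-object($t)
  )
```

The cts:search in the question, by comparison, returns whole documents, so the matching sem:triple elements still have to be picked out of each one afterwards.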