I have an XML file from which I am trying to extract the titles of the book and journal elements whose matching product (matched by IDREF against ID) has no value in its comments element.

So the XML file I have looks something like this:

<bookshop>
    <product ID="185.3.16">
        <price currency="AU$">56.85</price>
        <comments>Best sell</comments>
    </product>
    <product ID="163.24.12">
        <price currency="NZ$">28.6</price>
        <comments />
    </product>
    <product ID="332.17.25">
        <price currency="US$">19.95</price>
        <comments></comments>
    </product>
    <book IDREF="163.24.12">
        <title>Core Java</title>
    </book>
    <book IDREF="185.3.16">
        <title>C++ Development</title>
    </book>
    <journal IDREF="332.17.25">
        <title>Mathematics and Computing</title>
    </journal>
</bookshop>

And the output I'm trying to achieve is:

<title>Core Java</title>
<title>Mathematics and Computing</title>

The book C++ Development is excluded because its IDREF matches the ID of a product whose comments child does have a value.

I've tried to write an XQuery script that finds every product element with a distinct ID attribute whose comments child is empty, but I'm having trouble both finding the products without comments and turning those results into something I can return. (I can't really attempt the return part until I can resolve the products, so I haven't made much headway there.) Here is what I've tried so far:

for $id in distinct-values(//product/@ID)
where count(//product/comments/text()) > 0
return ...?

Any help would be much appreciated.

Thanks in advance!

1 Answer

"I am trying to extract the titles of the elements books and journals where the matching product by ID and IDREF has no value in the comments"

Here is one possible XPath expression that returns what you described in the question:

/bookshop/*[@IDREF = /bookshop/product[normalize-space(comments) = '']/@ID]/title

If you specifically want to select only book or journal elements, add the predicate [self::book | self::journal] after /*.
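
Spelled out, and assuming the same document structure as in the question, the restricted expression would look roughly like this:

```xquery
(: Keep only book and journal children of bookshop, then keep those whose
   IDREF equals the ID of a product with an empty or whitespace-only comments :)
/bookshop/*[self::book | self::journal]
           [@IDREF = /bookshop/product[normalize-space(comments) = '']/@ID]
/title
```

For the sample data it selects the same two titles, since the product elements have no IDREF attribute and were never matched in the first place.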

xpathtester demo

Output:

<title>Core Java</title>

<title>Mathematics and Computing</title>

UPDATE:

The limitation isn't clear (that is, which kinds of XPath expression you can't use). Assuming you are not allowed to use an XPath predicate ([]), it can be replaced with a where clause as follows:

declare variable $emptyProductIDs := /bookshop/product[not(comments/text())]/@ID;

for $element in /bookshop/*
where $element/@IDREF = $emptyProductIDs
return $element/title
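
Note that the variable declaration above still uses a predicate to pick the products. If predicates are ruled out entirely, the same join can probably be written with nested FLWOR expressions. This is only a sketch: it assumes the XML document is the context item, and that an empty or whitespace-only comments element counts as "no value" (hence normalize-space, as in the XPath earlier, rather than not(comments/text())). The variable names $product and $item are my own.

```xquery
(: Outer loop: products whose comments element is empty or whitespace-only :)
for $product in /bookshop/product
where normalize-space($product/comments) = ''
return
  (: Inner loop: books and journals whose IDREF points at that product :)
  for $item in (/bookshop/book, /bookshop/journal)
  where $item/@IDREF = $product/@ID
  return $item/title
```

With the sample document this should return the titles in product order, which here is Core Java followed by Mathematics and Computing.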