122
votes

Suppose I have the following XML:

<book category="CLASSICS">
  <title lang="it">Purgatorio</title>
  <author>Dante Alighieri</author>
  <year>1308</year>
  <price>30.00</price>
</book>

<book category="CLASSICS">
  <title lang="it">Inferno</title>
  <author>Dante Alighieri</author>
  <year>1308</year>
  <price>30.00</price>
</book>

<book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

<book category="WEB">
  <title lang="en">XQuery Kick Start</title>
  <author>James McGovern</author>
  <author>Per Bothner</author>
  <author>Kurt Cagle</author>
  <author>James Linn</author>
  <author>Vaidyanathan Nagarajan</author>
  <year>2003</year>
  <price>49.99</price>
</book>

<book category="WEB">
  <title lang="en">Learning XML</title>
  <author>Erik T. Ray</author>
  <year>2003</year>
  <price>39.95</price>
</book>

I would like to do an xpath that gets back all book nodes that have a title node with a language attribute of "it".

My attempt looked something like this:

//book[title[@lang='it']]

But that didn't work. I expect to get back the nodes:

<book category="CLASSICS">
  <title lang="it">Purgatorio</title>
  <author>Dante Alighieri</author>
  <year>1308</year>
  <price>30.00</price>
</book>

<book category="CLASSICS">
  <title lang="it">Inferno</title>
  <author>Dante Alighieri</author>
  <year>1308</year>
  <price>30.00</price>
</book>

Any hints?

5
What XPath implementation is this?Pavel Minaev

5 Answers

185
votes

Try

//book[title/@lang = 'it']

This reads:

  • get all book elements
    • that have at least one title
      • which has an attribute lang
        • with a value of "it"

You may find this helpful — it's an article entitled "XPath in Five Paragraphs" by Ronald Bourret.

But in all honesty, //book[title[@lang='it']] and the above should be equivalent, unless your XPath engine has "issues." So it could be something in the code or sample XML that you're not showing us -- for example, your sample is an XML fragment. Could it be that the root element has a namespace, and you aren't counting for that in your query? And you only told us that it didn't work, but you didn't tell us what results you did get.

58
votes

Years later, but a useful option would be to utilize XPath Axes (https://www.w3schools.com/xml/xpath_axes.asp). More specifically, you are looking to use the descendants axes.

I believe this example would do the trick:

//book[descendant::title[@lang='it']]

This allows you to select all book elements that contain a child title element (regardless of how deep it is nested) containing language attribute value equal to 'it'.

I cannot say for sure whether or not this answer is relevant to the year 2009 as I am not 100% certain that the XPath Axes existed at that time. What I can confirm is that they do exist today and I have found them to be extremely useful in XPath navigation and I am sure you will as well.

14
votes
//book[title[@lang='it']]

is actually equivalent to

 //book[title/@lang = 'it']

I tried it using vtd-xml, both expressions spit out the same result... what xpath processing engine did you use? I guess it has conformance issue Below is the code

import com.ximpleware.*;
public class test1 {
  public static void main(String[] s) throws Exception{
      VTDGen vg = new VTDGen();
      if (vg.parseFile("c:/books.xml", true)){
          VTDNav vn = vg.getNav();
          AutoPilot ap = new AutoPilot(vn);
          ap.selectXPath("//book[title[@lang='it']]");
                  //ap.selectXPath("//book[title/@lang='it']");

          int i;
          while((i=ap.evalXPath())!=-1){
              System.out.println("index ==>"+i);
          }
          /*if (vn.endsWith(i, "< test")){
             System.out.println(" good ");  
          }else
              System.out.println(" bad ");*/

      }
  }
}
4
votes

I would think your own suggestion is correct, however the xml is not quite valid. If you are running the //book[title[@lang='it']] on <root>[Your"XML"Here]</root> then the free online xPath testers such as one here will find the expected result.

2
votes

Try to use this xPath expression:

//book/title[@lang='it']/..

That should give you all book nodes in "it" lang