xpath: getting text node value without using text()

Question

I am wondering what the purpose of text() is in XPath. Suppose I have an XML document

<book category="COOKING">
  <title lang="en">Everyday Italian</title>
  <author>Giada De Laurentiis</author>
  <year>2005</year>
  <price>30.00</price>
</book>

<book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <author>J. K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

And I need to find the prices of the books. I can use either /bookstore/book/price[text()] or /bookstore/book/price

Both give me the same results. So why use text()?

JLRishe · Accepted Answer · 2015-03-10T18:19:45

There is no reason to use text() in this particular case, and text() is often overused among XPath newbies.

There are valid use cases for the text() node test and they involve times when one wants to specifically target a text node.

For example, suppose that some of the books had blank prices and you wanted to get only the non-blank ones:

<book category="COOKING">
  <title lang="en">Everyday Italian</title>
  <price>30.00</price>
</book>

<book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <price></price>
</book>

<book category="CHILDREN">
  <title lang="en">Narnia</title>
  <price>29.99</price>
</book>

/bookstore/book/price would return three elements, while /bookstore/book/price[text()] would return two.
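
To check those counts yourself, here is a minimal sketch using the JDK's built-in javax.xml.xpath API. The <bookstore> root element is an assumption (the sample above does not show one, but the paths require it), and the class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PricePredicateDemo {
    // The three-book sample, wrapped in an assumed <bookstore> root element.
    static final String XML =
        "<bookstore>"
        + "<book category=\"COOKING\"><title lang=\"en\">Everyday Italian</title>"
        + "<price>30.00</price></book>"
        + "<book category=\"CHILDREN\"><title lang=\"en\">Harry Potter</title>"
        + "<price></price></book>"
        + "<book category=\"CHILDREN\"><title lang=\"en\">Narnia</title>"
        + "<price>29.99</price></book>"
        + "</bookstore>";

    // Hypothetical helper: number of nodes the expression selects in XML.
    static int count(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(XML)),
                XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Every price element, including the empty one.
        System.out.println(count("/bookstore/book/price"));         // 3
        // Only price elements that have at least one text node child.
        System.out.println(count("/bookstore/book/price[text()]")); // 2
    }
}
```

Keep in mind that a price containing only whitespace would still have a text node, so if you want to exclude those as well, price[normalize-space()] is the stricter test.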

Or there may be times when you want to get just an element's text nodes, and not its entire content:

<book category="CHILDREN">
  Harry Potter
  <author>J. K. Rowling</author>
  <price>29.99</price>
</book>

In this case, /bookstore/book would produce an element whose string value, once whitespace is normalized, is Harry Potter J. K. Rowling 29.99, whereas /bookstore/book/text() would produce a set of three text nodes: the first contains Harry Potter (plus surrounding whitespace), and the other two contain only whitespace.
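
The difference between an element's string value and its own text nodes can be seen with a short sketch along the same lines, again using javax.xml.xpath from the JDK. As before, the <bookstore> root is an assumption and the class and helper names are only illustrative. normalize-space() is used so the printed values are not cluttered by indentation.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MixedContentDemo {
    // The mixed-content book, wrapped in an assumed <bookstore> root element.
    static final String XML =
        "<bookstore><book category=\"CHILDREN\">\n"
        + "  Harry Potter\n"
        + "  <author>J. K. Rowling</author>\n"
        + "  <price>29.99</price>\n"
        + "</book></bookstore>";

    // Hypothetical helper: evaluate the expression and return its string result.
    static String asString(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: number of nodes the expression selects.
    static int countNodes(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(XML)),
                XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // String value of the element: all descendant text, concatenated.
        System.out.println(asString("normalize-space(/bookstore/book)"));
        // Harry Potter J. K. Rowling 29.99

        // Only the text nodes that are direct children of <book>.
        System.out.println(countNodes("/bookstore/book/text()")); // 3
        System.out.println(asString("normalize-space(/bookstore/book/text()[1])"));
        // Harry Potter
    }
}
```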

As Michael Kay points out in the comments, using text() can be useful when dealing with mixed content (where text nodes are side-by-side with elements as in the second example above). There should be very few cases where you need to use text() with non-mixed content.
