58
votes

I'm trying to extract an element with a particular innertext from a parsed XML document. I know that I can select an element that has a child with a particular innertext using //myparent[mychild='foo'], but I actually just want to select the "mychild" element in this example.

<myparent>
  <mychild>
    foo
  </mychild>
</myparent>

What would be the XPath query for "foo" that would return the "mychild" node?


4 Answers

91
votes

Have you tried this?

//myparent/mychild[text() = 'foo']

Alternatively, you can use . (the abbreviation for self::node()), which tests the element's full string value:

//myparent/mychild[. = 'foo']
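
Both expressions are exact string comparisons, so they only match when the text is precisely foo with no surrounding whitespace. Here is a minimal sketch of that behaviour, assuming Python 3.7+ and the standard-library xml.etree.ElementTree, which supports only a small XPath subset (it accepts [.='foo'] but not text()). The COMPACT and INDENTED documents and the exact_matches helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the question's document.
COMPACT = "<myparent><mychild>foo</mychild></myparent>"
INDENTED = """<myparent>
  <mychild>
    foo
  </mychild>
</myparent>"""


def exact_matches(xml_text):
    """Return <mychild> elements whose full text content is exactly 'foo'."""
    root = ET.fromstring(xml_text)  # root is the <myparent> element
    # ElementTree's XPath subset: [.='foo'] compares the complete text
    # content of the element, whitespace included.
    return root.findall(".//mychild[.='foo']")


compact_hits = exact_matches(COMPACT)
indented_hits = exact_matches(INDENTED)

print(len(compact_hits))   # 1: the text is exactly "foo"
print(len(indented_hits))  # 0: the text is "\n    foo\n  "
```

If the document is indented as in the question, a whitespace-tolerant test such as normalize-space() is probably the safer choice.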

5
votes

Matt said it, but the full solution:

//myparent[mychild='foo']/mychild
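
One thing to watch for: this form filters the parent, then returns all of its mychild children, not only the one containing foo. A small sketch of the difference, assuming Python 3.7+ with the standard-library xml.etree.ElementTree (a limited XPath subset) and a made-up document with two children:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one parent with a matching and a non-matching child.
DOC = (
    "<root><myparent>"
    "<mychild>foo</mychild>"
    "<mychild>bar</mychild>"
    "</myparent></root>"
)
root = ET.fromstring(DOC)

# Predicate on the parent: picks every <myparent> that has a 'foo' child,
# then steps down to ALL of its <mychild> children.
via_parent = root.findall(".//myparent[mychild='foo']/mychild")

# Predicate on the child itself: picks only the child whose text is 'foo'.
direct = root.findall(".//myparent/mychild[.='foo']")

via_parent_texts = [e.text for e in via_parent]
direct_texts = [e.text for e in direct]

print(via_parent_texts)  # ['foo', 'bar']
print(direct_texts)      # ['foo']
```

When each parent has a single mychild, the two queries return the same node.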

3
votes

You might consider using the contains function, which returns true or false depending on whether the text was found, like so:

//mychild[contains(text(),'foo')]

See the XSLT, XPath, and XQuery Functions reference for details.
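
Note that contains() is a substring test, so it would also match text such as food. Also, in XPath 1.0, contains(text(), 'foo') only looks at the first text node child, whereas contains(., 'foo') looks at the whole string value. The standard-library xml.etree.ElementTree has no contains(), so the following is only a rough Python emulation of those two rules, not an XPath evaluation; the document and the first_text_node helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the first child has 'foo' only inside a nested <b>.
DOC = (
    "<myparent>"
    "<mychild>bar<b>foo</b></mychild>"
    "<mychild> foo </mychild>"
    "</myparent>"
)
root = ET.fromstring(DOC)
children = root.findall("mychild")


def first_text_node(elem):
    """Emulate XPath 1.0 string(text()): the first text node child, or ''."""
    if elem.text:
        return elem.text
    # Text that follows a child element is stored as that child's tail.
    for child in elem:
        if child.tail:
            return child.tail
    return ""


# Rough equivalent of contains(text(), 'foo') under XPath 1.0 rules.
by_first_text = ["foo" in first_text_node(c) for c in children]

# Rough equivalent of contains(., 'foo'): all descendant text concatenated.
by_string_value = ["foo" in "".join(c.itertext()) for c in children]

print(by_first_text)    # [False, True]
print(by_string_value)  # [True, True]
```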

0
votes

As per the XML:

<myparent>
  <mychild>
    foo
  </mychild>
</myparent>

The <mychild> element with the text foo is within its parent <myparent> tag, and the text contains leading and trailing whitespace characters.

So to select the <mychild> element you can use either of the following solutions:

  • Using normalize-space():

    //myparent/mychild[normalize-space()='foo']
    
  • Using contains():

    //myparent/mychild[contains(., 'foo')]
    

Ignoring the parent <myparent> tag you can also use:

  • Using normalize-space():

    //mychild[normalize-space()='foo']
    
  • Using contains():

    //mychild[contains(., 'foo')]
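
To make the normalize-space() behaviour concrete: it strips leading and trailing whitespace and collapses internal runs of whitespace to a single space before the comparison. The standard-library xml.etree.ElementTree does not implement normalize-space(), so the sketch below emulates it in plain Python and filters the elements by hand; the normalize_space helper is hypothetical, not a library function.

```python
import re
import xml.etree.ElementTree as ET

DOC = """<myparent>
  <mychild>
    foo
  </mychild>
</myparent>"""


def normalize_space(text):
    """Emulate XPath normalize-space() for a plain string.

    XPath treats space, tab, CR and LF as whitespace, so only those
    characters are stripped and collapsed here.
    """
    return " ".join(re.findall(r"[^ \t\r\n]+", text))


root = ET.fromstring(DOC)

# Hand-written stand-in for //mychild[normalize-space()='foo']:
# take each <mychild>, normalise its full text content, compare exactly.
matches = [
    elem
    for elem in root.iter("mychild")
    if normalize_space("".join(elem.itertext())) == "foo"
]

print(len(matches))  # 1
```

With a full XPath 1.0 engine the expressions above can be used directly and this filtering step is unnecessary.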