39
votes

I have a div element:

<div>
   This is some text
   <h1>This is a title</h1>
   <div>Some other content</div>
</div>

What XPath expression should I use to get only the div's own content, without its child elements h1 and div?

//div[not(h1)&not(div)]

Something like that? I cannot figure it out.

4
Good question, +1. See my answer for three XPath expressions, one of which probably gives you the "content" of the div element, a term you have not defined. - Dimitre Novatchev

4 Answers

61
votes

To get the string value of div use:

string(/div)

This is the concatenation of all text nodes that are descendants of the (top) div element.

To select all text node descendants of div use:

/div//text()

To get only the text nodes that are direct children of div use:

/div/text()

Finally, get the first (and hopefully only) non-whitespace-only text node child of div:

/div/text()[normalize-space()][1]
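
To see what each of these expressions returns for the markup in the question, here is a rough sketch in Python. It does not run XPath at all: it walks the DOM with the standard library's xml.dom.minidom and mimics the four expressions by hand. The helper name descendant_text_nodes is made up for illustration, and str.strip() only approximates normalize-space().

```python
from xml.dom import minidom

XML = """<div>
   This is some text
   <h1>This is a title</h1>
   <div>Some other content</div>
</div>"""

root = minidom.parseString(XML).documentElement


def descendant_text_nodes(node):
    """Yield the data of every text node below `node`, in document order."""
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            yield child.data
        else:
            yield from descendant_text_nodes(child)


# Mimics /div//text(): every text node below the root div
all_text = list(descendant_text_nodes(root))

# Mimics string(/div): the concatenation of those text nodes
string_value = "".join(all_text)

# Mimics /div/text(): only the text nodes that are direct children
child_text = [n.data for n in root.childNodes if n.nodeType == n.TEXT_NODE]

# Mimics /div/text()[normalize-space()][1]: the first direct child
# text node that is not whitespace-only
first_non_blank = next(t for t in child_text if t.strip())

print(repr(first_non_blank))
```

Note that /div/text() yields three text nodes here, two of which contain only the whitespace between the child elements. That is why the last expression filters with normalize-space().
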
5
votes

What XPath expression should I use to get only the div's own content, without its child elements h1 and div?

Use this XPath expression:

/div/node()[not(self::h1|self::div)]

It selects every child node of the root div element except the h1 and div elements.
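
As a rough illustration of which nodes survive that filter, the sketch below reproduces the same selection by hand with Python's standard xml.dom.minidom. It is not an XPath evaluation, and the names EXCLUDED and kept are only for this example.

```python
from xml.dom import minidom

XML = """<div>
   This is some text
   <h1>This is a title</h1>
   <div>Some other content</div>
</div>"""

root = minidom.parseString(XML).documentElement

# Element names to drop, mirroring not(self::h1|self::div)
EXCLUDED = {"h1", "div"}

# Keep every child node of the root div that is not an h1 or div element
kept = [
    node for node in root.childNodes
    if not (node.nodeType == node.ELEMENT_NODE and node.tagName in EXCLUDED)
]

for node in kept:
    print(node.nodeType == node.TEXT_NODE, repr(node.data))
```

For the markup in the question, only text nodes remain, including the whitespace-only ones between the child elements. Unlike /div/text(), this expression would also keep comments and any child elements other than h1 and div.
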

5
votes

An expression like ./text() retrieves only the text that sits directly inside the root element, not the text of its child elements.

Regards, Nitin

1
votes

You can use this XPath expression:

./div[1]/text()[1]

To test, I use this online tester: http://xpather.com/