0
votes

Given this test HTML:

<html>
<body>
  abc
  <b id="b_1">def</b>
  ghi
  <b id="b_2">jkl</b>
  <b id="b_3">mno</b><b id="b_4">qrs</b>
</body>
</html>

Question: How can I select all b elements whose first preceding-sibling node is a non-empty text node?

So, in the example above, I wish to select elements b_1 and b_2.

Element b_3 has a first preceding sibling node of type text, but it contains only whitespace.

Element b_4 has an element node as its first preceding sibling node.


I've tried the following, but they both fail in at least one respect:

  • preceding-sibling::*[1] selects the nearest preceding element node, ignoring the desired text nodes.
  • preceding-sibling::text()[1] selects the nearest preceding text node, skipping any element nodes in between.

2 Answers

2
votes

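This XPath,

//b[preceding-sibling::node()[1][self::text()][normalize-space()]]

will select all b elements whose immediately preceding sibling is a text node that contains more than whitespace. Using normalize-space() rather than .!='' matters here, because the text node before b_3 holds only a line break and indentation, which is not the empty string. For the test HTML, the result is: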

<b id="b_1">def</b>
<b id="b_2">jkl</b>

as requested.
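
The condition can be cross-checked without an XPath engine, since preceding-sibling::node()[1] corresponds to a node's previous sibling in the DOM. Below is a minimal sketch using Python's standard-library xml.dom.minidom. It assumes the test HTML is well-formed enough to parse as XML and that whitespace-only text counts as empty; the function name is hypothetical.

```python
from xml.dom import Node, minidom

# The test HTML from the question, parsed here as XML (an assumption).
HTML = """<html>
<body>
  abc
  <b id="b_1">def</b>
  ghi
  <b id="b_2">jkl</b>
  <b id="b_3">mno</b><b id="b_4">qrs</b>
</body>
</html>"""


def b_after_nonblank_text(markup):
    """Return ids of b elements whose immediately preceding sibling
    is a text node containing more than whitespace."""
    doc = minidom.parseString(markup)
    ids = []
    for b in doc.getElementsByTagName("b"):
        prev = b.previousSibling  # plays the role of preceding-sibling::node()[1]
        if (
            prev is not None
            and prev.nodeType == Node.TEXT_NODE       # like [self::text()]
            and prev.data.strip(" \t\r\n")            # like [normalize-space()]
        ):
            ids.append(b.getAttribute("id"))
    return ids


selected = b_after_nonblank_text(HTML)
print(selected)
```

For the test HTML this should report b_1 and b_2 only: b_3 is preceded by whitespace-only text, and b_4 by an element.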

-1
votes

This one should work:

//b[normalize-space(./preceding-sibling::text()[1])]

The normalize-space function checks that the text node contains something other than whitespace, because the preceding text could be empty or just a \n.
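
One caveat worth checking: preceding-sibling::text()[1] picks the nearest preceding text node even when elements sit in between, so this expression may also match a b whose immediate preceding sibling is an element. The sketch below mimics that axis with standard-library DOM traversal on a hypothetical input. SAMPLE, the id b_5, and the helper name are assumptions for illustration, not part of the question.

```python
from xml.dom import Node, minidom

# Hypothetical input: text, then an element, then the b element.
SAMPLE = '<p>abc<i>x</i><b id="b_5">def</b></p>'


def nearest_preceding_text(node):
    """Mimic preceding-sibling::text()[1]: walk back over the siblings,
    skipping anything that is not a text node."""
    sib = node.previousSibling
    while sib is not None and sib.nodeType != Node.TEXT_NODE:
        sib = sib.previousSibling
    return sib


b = minidom.parseString(SAMPLE).getElementsByTagName("b")[0]
text = nearest_preceding_text(b)

# Would normalize-space(preceding-sibling::text()[1]) be truthy?
matches_text_axis = text is not None and bool(text.data.strip(" \t\r\n"))
# Is the immediately preceding sibling actually a text node?
immediate_is_text = b.previousSibling.nodeType == Node.TEXT_NODE

print(matches_text_axis, immediate_is_text)
```

Here the text-axis check succeeds although the immediate preceding sibling of b_5 is the i element, which is the case the question wants to exclude.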