148
votes

There is an HTML file (whose contents I do not control) that has several input elements all with the same fixed id attribute of "search_query". The contents of the file can change, but I know that I always want to get the second input element with the id attribute "search_query".

I need an XPath expression to do this. I tried //input[@id="search_query"][2] but that does not work. Here is an example XML string where this query failed:

<div>
  <form>
    <input id="search_query" />
  </form>
</div>

<div>
  <form>
    <input id="search_query" />
  </form>
</div>

<div>
  <form>
    <input id="search_query" />
  </form>
</div>

Keep in mind that the above is merely an example. The real HTML can be quite different, and the input elements can appear anywhere, with no consistent document structure. The only guarantee is that there will always be at least two input elements with an id attribute of "search_query".

What is the correct XPath expression?

Good question, +1. See my answer for a complete explanation of the problem and for the wanted solution. - Dimitre Novatchev
Minor point: you should never have more than one element with a given ID (and so the HTML in the question is actually invalid). In practice, browsers will let you do it anyway, but if you do you're missing out on the only benefit of using IDs, which is that they signal "I'm unique" (whereas classes are designed to be used for non-unique signifiers). - machineghost

2 Answers

277
votes

This is a FAQ:

//somexpression[$N]

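means "Find every node selected by //somexpression that is the $Nth such node among the children of its own parent". In the example from the question, each input is the only input child of its form, so [2] selects nothing.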

What you want is:

(//input[@id="search_query"])[2]

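Remember: The [] operator has higher precedence (priority) than the // abbreviation, so without the parentheses [2] is applied to the last step only, not to the whole node-set selected by the path.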
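To make the difference concrete, here is a minimal Python sketch that emulates both readings on the question's sample. It is an illustration under assumptions, not a full XPath engine: the standard-library ElementTree module implements only a limited XPath subset, so the grouping is emulated with plain list indexing. The `<root>` wrapper and the `name` attributes are hypothetical additions, there only to make the sample one document and to tell the inputs apart.

```python
import xml.etree.ElementTree as ET

# The question's sample, wrapped in a hypothetical <root> element so that it
# parses as a single document. The name attributes are added for illustration.
SAMPLE = """
<root>
  <div><form><input id="search_query" name="first" /></form></div>
  <div><form><input id="search_query" name="second" /></form></div>
  <div><form><input id="search_query" name="third" /></form></div>
</root>
"""


def is_match(elem):
    """True for <input id="search_query"> elements."""
    return elem.tag == "input" and elem.get("id") == "search_query"


root = ET.fromstring(SAMPLE)

# Emulates //input[@id="search_query"][2]: the position is counted
# separately among the matching children of each parent.
per_parent = []
for parent in root.iter():
    matching_children = [child for child in parent if is_match(child)]
    if len(matching_children) >= 2:
        per_parent.append(matching_children[1])

# Emulates (//input[@id="search_query"])[2]: collect every match in
# document order first, then take the second one.
all_matches = root.findall(".//input[@id='search_query']")
second = all_matches[1]

print(len(per_parent))     # 0
print(second.get("name"))  # second
```

No parent holds two matching inputs, so the per-parent reading comes back empty, while indexing the complete result list returns the second input in the document.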

24
votes

This seems to work:

/descendant::input[@id="search_query"][2]

I got this from "XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition" by Michael Kay.

There is also a note in the "Abbreviated Syntax" section of the XML Path Language specification http://www.w3.org/TR/xpath/#path-abbrev that provided a clue.
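The note in question contrasts //para[1] with /descendant::para[1]: with the descendant axis, a single step visits every descendant in document order, so the position is counted across the whole document. The Python sketch below emulates that with the standard library on hypothetical markup (not taken from the question) in which two of the inputs share a parent. In that situation the // form does return a node, just not the second one in the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one input on its own, then two inputs that share a
# parent. The name attributes only label the elements for this illustration.
DOC = """
<root>
  <form><input id="search_query" name="a" /></form>
  <div>
    <input id="search_query" name="b" />
    <input id="search_query" name="c" />
  </div>
</root>
"""


def is_match(elem):
    """True for <input id="search_query"> elements."""
    return elem.tag == "input" and elem.get("id") == "search_query"


root = ET.fromstring(DOC)

# Emulates /descendant::input[@id="search_query"][2]: one walk over all
# descendants in document order, so positions span the whole document.
descendants = [elem for elem in root.iter("input") if is_match(elem)]
by_descendant_axis = descendants[1]

# Emulates //input[@id="search_query"][2]: positions restart under each parent.
by_double_slash = []
for parent in root.iter():
    matching = [child for child in parent if is_match(child)]
    if len(matching) >= 2:
        by_double_slash.append(matching[1])

print(by_descendant_axis.get("name"))              # b
print([e.get("name") for e in by_double_slash])    # ['c']
```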