2
votes

The XPath //a[contains(@class, 'storylink')]/@* extracts all attributes of the matching anchor tags. The anchor tags in my XML don't have a title attribute, which usually holds the link's text. Is there a way to select both the href and the text content of an anchor with XPath 1.0?

Add a full example of the source content (a full XML fragment) and the exact expected output. - David Ennis
@DavidEnnis E.g. http://news.ycombinator.com. I'm trying to extract the title along with the href. - Shanthakumar
Sorry, I meant in your sample, not for someone to have to fish around in the source of a web page. Expand your question with example input and expected output. - David Ennis

2 Answers

2
votes

If you want to select both @href and text() in a single XPath selection, you can use the union operator |.

With XPath 1.0, this is probably the best you can do:

//a[contains(@class, 'storylink')]/@href | //a[contains(@class, 'storylink')]/text()

With XPath 2.0 (or higher) you could avoid repeating the anchor selection criteria:

//a[contains(@class, 'storylink')]/(@href,text())

0
votes

Just select the a element itself using //a[contains(@class, 'storylink')] and then get the required attributes and/or text content using JavaScript methods on the returned element node.

You could indeed select both the attributes and the text using XPath, but if they're all jumbled up in a single query result then it's a hassle separating them again.
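
To illustrate the "select the element, then read what you need" idea, here is a minimal sketch in Python rather than JavaScript, using only the standard library's xml.etree.ElementTree. The sample markup, the example.com URLs, and the extract_links helper are all hypothetical, since the question did not include a fragment. ElementTree's XPath subset has no contains(), so the class check is done in Python as a plain substring test, which mirrors contains(@class, 'storylink').

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; the original question did not provide one.
SAMPLE = """
<div>
  <a class="storylink" href="https://example.com/one">First story</a>
  <a class="storylink external" href="https://example.com/two">Second story</a>
  <a class="morelink" href="/news?p=2">More</a>
</div>
"""


def extract_links(xml_text, css_class="storylink"):
    """Return (href, text) pairs for anchors whose class contains css_class."""
    root = ET.fromstring(xml_text)
    links = []
    for a in root.iter("a"):
        # Substring test, equivalent in spirit to contains(@class, '...').
        if css_class in a.get("class", ""):
            # Each href stays paired with its own text, so nothing
            # needs to be separated again afterwards.
            links.append((a.get("href"), (a.text or "").strip()))
    return links


if __name__ == "__main__":
    for href, text in extract_links(SAMPLE):
        print(href, text)
```

Because each anchor is handled on its own, the href and text never end up interleaved in one flat result list, which is the hassle described above.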