5
votes

I'm trying to construct an XPath expression that selects nodes whose attribute XXX has a value like TP*, where the star stands for a number. Suppose I have this XML file:

<tagA attA="VAL1">text</tagA>
<tagB attB="VAL333">text</tagB>
<tagA attA="VAL2">text</tagA>
<tagA attA="V2">text</tagA>

So the XPath expression should return all tagA elements whose attA attribute has a value matching the pattern VAL*. I tried the following, but it is not working:

//tagA[@attA[matches('VAL\d')]]

3 Answers

7
votes

If you need an XPath 1.0 solution, try the following:

//tagA[boolean(number(substring-after(@attA, "VAL"))) or number(substring-after(@attA, "VAL")) = 0]

If @attA cannot be "VAL0", then just

//tagA[boolean(number(substring-after(@attA, "VAL")))]
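To see how the two XPath 1.0 expressions behave, here is a minimal sketch that runs them with the JDK's built-in javax.xml.xpath. The class name `XPath1Demo`, the `select` helper, the `<root>` wrapper element, and the extra `VAL0` element are assumptions added for illustration; they are not part of the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPath1Demo {
    // Sample data from the question, wrapped in a hypothetical <root>
    // and extended with a VAL0 element to show the zero case.
    static final String XML = "<root>"
            + "<tagA attA=\"VAL1\">text</tagA>"
            + "<tagB attB=\"VAL333\">text</tagB>"
            + "<tagA attA=\"VAL2\">text</tagA>"
            + "<tagA attA=\"VAL0\">text</tagA>"
            + "<tagA attA=\"V2\">text</tagA>"
            + "</root>";

    // Accepts any numeric suffix, including 0.
    static final String WITH_ZERO =
            "//tagA[boolean(number(substring-after(@attA, 'VAL')))"
            + " or number(substring-after(@attA, 'VAL')) = 0]";

    // Shorter form: number('0') converts to false, so VAL0 is skipped.
    static final String WITHOUT_ZERO =
            "//tagA[boolean(number(substring-after(@attA, 'VAL')))]";

    // Evaluates an XPath 1.0 expression and returns the attA values
    // of the selected elements in document order.
    static List<String> select(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(((Element) nodes.item(i)).getAttribute("attA"));
            }
            return values;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(WITH_ZERO));    // [VAL1, VAL2, VAL0]
        System.out.println(select(WITHOUT_ZERO)); // [VAL1, VAL2]
    }
}
```

"V2" is rejected in both cases because substring-after() returns an empty string when "VAL" is absent, and number('') is NaN.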

2
votes

matches() requires XPath 2.0, but javax.xml.xpath in Java 8 supports only XPath 1.0.

Furthermore, the first argument of matches() is the string to match against, so you'd want:

//tagA[@attA[matches(., 'VAL\d')]]

This looks for "VAL" followed by a single digit anywhere in the value of @attA. See the regex in @jschnasse's answer if you wish to match the entire string with multiple or optional digit suffixes (XPath 2.0), or Andersson's answer for an XPath 1.0 solution.
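The difference between an unanchored and an anchored pattern can be sketched with java.util.regex, used here only as a stand-in for the XPath 2.0 regex engine (the two dialects are not identical, but they are assumed to agree for patterns this simple). `Matcher.find()` mirrors the "match anywhere" behavior of matches(); the class name `MatchAnchoringDemo`, the helper `containsMatch`, and the extra test values "XVAL2Y" and "VAL" are hypothetical.

```java
import java.util.regex.Pattern;

class MatchAnchoringDemo {
    // True if the regex matches any substring of value,
    // which is how XPath 2.0 matches() treats an unanchored pattern.
    static boolean containsMatch(String regex, String value) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        String[] values = {"VAL1", "VAL333", "XVAL2Y", "VAL", "V2"};
        for (String v : values) {
            System.out.println(v
                    + " unanchored=" + containsMatch("VAL\\d", v)
                    + " anchored=" + containsMatch("^VAL\\d*$", v));
        }
    }
}
```

With these inputs, "XVAL2Y" passes only the unanchored pattern, while a bare "VAL" passes only the anchored one because \d* also accepts zero digits; use \d+ if at least one digit is required.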

1
votes

Add a quantifier (*, +, ...) to your \d. Try:

'^VAL\d*$'

As @kjhughes has pointed out, this will not work with standard Java, because even Java 11 does not support XPath 2.0. You can, however, use Saxon if you need XPath 2.0 support.

Saxon example (a variant of this answer, which uses javax.xml):

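// Imports for the snippet below (Saxon s9api; JUnit 4 is assumed for @Test).
// The field and methods that follow belong inside a test class.
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.PrintStream;
import java.util.function.Consumer;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.XdmItem;
import net.sf.saxon.s9api.XdmNode;
import net.sf.saxon.s9api.XdmValue;
import org.junit.Test;

// false = do not require a licensed Saxon edition
Processor processor = new Processor(false);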

@Test
public void xpathWithSaxon() {
    String xml = "<root><tagA attA=\"VAL1\">text</tagA>\n" + "<tagB attB=\"VAL333\">text</tagB>\n"
                    + "<tagA attA=\"VAL2\">text</tagA>\n" + "<tagA attA=\"V2\">text</tagA>\n" + "</root>";
    try (InputStream in = new ByteArrayInputStream(xml.getBytes("utf-8"));) {
        processFilteredXmlWith(in, "//root/tagA[matches(@attA,'^VAL\\d*$')]", (node) -> {
            printItem(node, System.out);
        });
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

private void printItem(XdmItem node, PrintStream out) {
    out.println(node);
}

public void processFilteredXmlWith(InputStream in, String xpath, Consumer<XdmItem> process) {
    XdmNode doc = readXmlWith(in);
    XdmValue list = filterNodesByXPathWith(doc, xpath);
    list.forEach((node) -> {
        process.accept(node);
    });

}

private XdmNode readXmlWith(InputStream xmlin) {
    try {
        return processor.newDocumentBuilder().build(new StreamSource(xmlin));
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

private XdmValue filterNodesByXPathWith(XdmNode doc, String xpathExpr) {
    try {
        return processor.newXPathCompiler().evaluate(xpathExpr, doc);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

Prints:

<tagA attA="VAL1">text</tagA>

<tagA attA="VAL2">text</tagA>