This simplified code works correctly:
class FatcaNamespaceDemo { // class name is illustrative; the original wrapper line was not included
static stringXML = '''<?xml version='1.0' encoding='UTF-8'?>
<ftc:FATCA_OECD xsi:schemaLocation='urn:oecd:ties:fatca:v1 FatcaXML_v1.1.xsd' version='1.1' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:sfa='urn:oecd:ties:stffatcatypes:v1' xmlns:ftc='urn:oecd:ties:fatca:v1'>
<ftc:MessageSpec>Spec</ftc:MessageSpec>
</ftc:FATCA_OECD>'''

    static void main(args) {
        // NAMESPACE-UNAWARE PARSING: the xmlns:* declarations are reported as ordinary attributes
        def rep = new XmlParser(false, false).parseText(stringXML)
        def attrMap = rep.attributes()
        attrMap.each { k, v ->
            println "$k, $v"
        }
        // NAMESPACE-AWARE PARSING: element names are resolved against their namespace URIs
        rep = new XmlParser().parseText(stringXML)
        def ftc = new groovy.xml.Namespace(attrMap['xmlns:ftc'])
        println rep[ftc.MessageSpec].text()
    }
}
It produces the following correct output:
xsi:schemaLocation, urn:oecd:ties:fatca:v1 FatcaXML_v1.1.xsd
version, 1.1
xmlns:xsi, http://www.w3.org/2001/XMLSchema-instance
xmlns:sfa, urn:oecd:ties:stffatcatypes:v1
xmlns:ftc, urn:oecd:ties:fatca:v1
Spec
The problem is that my existing, fairly extensive code already uses namespace-aware parsing, and I would like to keep it that way.
To get the output above, I would therefore have to use both namespace-unaware and namespace-aware parsing, as in the code above.
Do you know how to produce the same result without parsing the whole .xml twice (the .xml is quite large), for example by extracting just the root element of the .xml and then using namespace-aware parsing?
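To make the "extract just the root element" idea concrete, here is a minimal sketch of what I have in mind. It is not tested against the real file. It assumes the JDK's StAX API (javax.xml.stream) is acceptable, and rootNamespaces is a hypothetical helper name of my own. The helper reads only up to the first start tag, collects the namespace declarations found there, and stops; the full document is then parsed once with the namespace-aware XmlParser.

```groovy
import javax.xml.stream.XMLInputFactory
import javax.xml.stream.XMLStreamConstants
import groovy.xml.Namespace
// Groovy 3+; on Groovy 2.x drop this import (XmlParser is groovy.util.XmlParser, imported by default)
import groovy.xml.XmlParser

// Hypothetical helper: returns prefix -> URI for the namespaces declared on the root element.
// Only the prolog and the root start tag are consumed, not the rest of the document.
Map<String, String> rootNamespaces(Reader input) {
    def reader = XMLInputFactory.newInstance().createXMLStreamReader(input)
    try {
        // Skip the prolog (comments, DTD, ...) until the root start tag is reached
        while (reader.next() != XMLStreamConstants.START_ELEMENT) { }
        def result = [:]
        for (int i = 0; i < reader.namespaceCount; i++) {
            // The default namespace has a null prefix; store it under ''
            result[reader.getNamespacePrefix(i) ?: ''] = reader.getNamespaceURI(i)
        }
        return result
    } finally {
        reader.close()
    }
}

def stringXML = '''<?xml version='1.0' encoding='UTF-8'?>
<ftc:FATCA_OECD xsi:schemaLocation='urn:oecd:ties:fatca:v1 FatcaXML_v1.1.xsd' version='1.1' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:sfa='urn:oecd:ties:stffatcatypes:v1' xmlns:ftc='urn:oecd:ties:fatca:v1'>
<ftc:MessageSpec>Spec</ftc:MessageSpec>
</ftc:FATCA_OECD>'''

// Cheap pass: only the root start tag is read
def ns = rootNamespaces(new StringReader(stringXML))
println ns.ftc   // urn:oecd:ties:fatca:v1

// Single full, namespace-aware parse, as in the existing code
def rep = new XmlParser().parseText(stringXML)
def ftc = new Namespace(ns.ftc)
println rep[ftc.MessageSpec].text()   // Spec
```

For the real file, a Reader or InputStream opened on the file would presumably replace the StringReader. The file is then opened twice, but the first pass stops right after the root start tag. Note that this sketch returns only the namespace declarations; ordinary root attributes such as version would have to be read separately, e.g. with reader.getAttributeValue.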