0
votes

How can I parse the following XML file using Groovy?

<Person>
    <name>a</name>
    <age>1</age>
</person>
<account>
    <number>4242</number>
    <bank>S</bank>
</account>
<account>
    <number>4242</number>
    <bank>T</bank>
</account>
<Person>
    <name>b</name>
    <age>1</age>
</person>
<account>
    <number>4242</number>
    <bank>S</bank>
</account>
<account>
    <number>4242</number>
    <bank>T</bank>
</account>

In this case a person can have multiple accounts. How can I parse this XML?

For example, for person a, I want to find all of that person's bank accounts using Groovy.

2
Is your question "please write the code to parse this XML" or "how do I determine which account belongs to which user, because only the element order tells me this" - Mark O'Connor
Did you try something? Also, I think this XML is missing a root element - Will

2 Answers

2
votes

Mark is right: first you need a well-formed XML document. Then you can use XmlSlurper (or XmlParser); both return objects that you can navigate with GPath expressions.

Here I've added <root> and <accounts> tags, and nested each person's accounts inside the <person> element, to make the document well-formed:

 def doc = """
    <root>
        <person>
            <name>a</name>
            <age>1</age>
            <accounts>
                <account>
                    <number>4242</number>
                    <bank>S</bank>
                </account>
                <account>
                    <number>5252</number>
                    <bank>T</bank>
                </account>
            </accounts>
        </person>
        <person>    
            <name>b</name>
            <age>1</age>
            <accounts>
                <account>
                    <number>4242</number>
                    <bank>S</bank>
                </account>
                <account>
                    <number>4242</number>
                    <bank>T</bank>        
                </account>
            </accounts>
        </person>
    </root>
    """

then it's pretty easy to parse it with GPath:

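// On Groovy 3+ add: import groovy.xml.XmlSlurper
// parseText returns the root element (a GPathResult), not a parser
def root = new XmlSlurper().parseText(doc)
root.person.findAll { p -> p.name == 'a' }
    .accounts.account.number.each { v -> println "Account number[$v]" }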

which renders:

Account number[4242]
Account number[5252]
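
Since XmlParser is mentioned as an alternative, here is a minimal sketch of the same lookup with it. It assumes Groovy 3 or later (where the class lives in groovy.xml) and uses a shortened copy of the document above; the variable names personA and numbers are only illustrative. The main difference is that XmlParser returns Node objects, so the name comparison goes through text() explicitly.

```groovy
import groovy.xml.XmlParser  // assumption: Groovy 3 or later

def doc = '''
<root>
    <person>
        <name>a</name>
        <accounts>
            <account><number>4242</number><bank>S</bank></account>
            <account><number>5252</number><bank>T</bank></account>
        </accounts>
    </person>
    <person>
        <name>b</name>
        <accounts>
            <account><number>4242</number><bank>S</bank></account>
        </accounts>
    </person>
</root>
'''

def root = new XmlParser().parseText(doc)

// Find the <person> whose <name> text is 'a'
def personA = root.person.find { it.name.text() == 'a' }

// Collect the text of every <number> under that person's <accounts>
def numbers = personA.accounts.account.number*.text()

numbers.each { println "Account number[$it]" }
```
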
0
votes

This is pretty ugly, but if there is no way to change the XML you receive, you can iterate over it using old-school index manipulation and rely on element order to tell which accounts belong to which person.

The XML and some fixing:

def xml = '''
<Person>
    <name>a</name>
    <age>1</age>
</person>
<account>
    <number>4242</number>
    <bank>S</bank>
</account>
<account>
    <number>4242</number>
    <bank>T</bank>
</account>
<Person>
    <name>b</name>
    <age>1</age>
</person>
<account>
    <number>4242</number>
    <bank>S</bank>
</account>
<account>
    <number>4242</number>
    <bank>T</bank>
</account>'''

fixedXml = "<root>${xml.replaceAll('<Person>', '<person>')}</root>"

And the processing:

// On Groovy 3+ add: import groovy.xml.XmlParser
def root = new XmlParser().parseText(fixedXml)

// Map each person's name to the list of account nodes that follow it
def people = [:]
def children = root.children()

for (int i = 0; i < children.size(); i++) {
  def child = children[i]

  if (child.name() == 'person') {
    def accounts = []
    // Key by the person's name so that people['a'] works below
    people[child.name.text()] = accounts

    // Consume every <account> that directly follows this <person>
    while (children[i + 1]?.name() == 'account') {
      accounts << children[++i]
    }
  }
}

assert people.size() == 2
assert people.values()*.size() == [2, 2]
assert people['a'][0].bank.text() == "S"
assert people['b'][1].number.text() == "4242"

If the XML is worse than that, consider using a lenient parser such as NekoHTML to clean it up first.
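
The order-based grouping in this answer can also be written as a single pass that remembers the most recently seen person, which avoids the manual index bookkeeping. This is only a sketch under the same assumption (each account belongs to the nearest preceding person); it assumes Groovy 3 or later for the import, uses a shortened copy of the input, and the names accountsByName and current are made up for the example.

```groovy
import groovy.xml.XmlParser  // assumption: Groovy 3 or later

def xml = '''
<Person><name>a</name><age>1</age></person>
<account><number>4242</number><bank>S</bank></account>
<account><number>4242</number><bank>T</bank></account>
<Person><name>b</name><age>1</age></person>
<account><number>4242</number><bank>S</bank></account>
'''

// Same repair as in the answer: normalise the opening tag and add a root element
def fixedXml = "<root>${xml.replaceAll('<Person>', '<person>')}</root>"
def root = new XmlParser().parseText(fixedXml)

def accountsByName = [:]  // person name -> list of account nodes
def current = null        // name of the most recently seen person

root.children().each { node ->
    if (node.name() == 'person') {
        // Start a new group for this person
        current = node.name.text()
        accountsByName[current] = []
    } else if (node.name() == 'account' && current != null) {
        // Attach the account to the nearest preceding person
        accountsByName[current] << node
    }
}

accountsByName.each { name, accounts ->
    println "$name: ${accounts.collect { it.bank.text() }}"
}
```

Accounts that appear before any person are skipped here; whether that is the right behaviour depends on what the real feed guarantees.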