I started a project to process XML input and ran into a problem with the Unmarshal function. I'm trying to unmarshal the XML below into a struct. There are multiple 'values' elements, which is fine, but within a 'values' element there can be either a single 'value' element containing plain text, or multiple 'value' elements that each contain a 'key' and a 'value' element. I'm trying to capture this in a Go struct, but no luck so far.

<lines>
<values>
    <key>Key_1</key>
    <value>Value_1</value>
</values>
<values>
    <key>Key_2</key>
    <value>
        <key>NestedKey_1</key>
        <value>NestedValue_1</value>
    </value>
    <value>
        <key>NestedKey_2</key>
        <value>NestedValue_2</value>
    </value>
</values>
</lines>

Go struct:

type Value struct {
    Key string `xml:"key"`
    Value string `xml:"value"`
}

type Values struct {
    Key string `xml:"key"`
    Value []Value `xml:"value"`
}

type Lines struct {
    Values []Values `xml:"values"`
}

When I print the unmarshal output, I do see Key_1 and Key_2 and the nested key-value pairs, but I don't see Value_1. That is expected, since Value_1 is a plain string and not an array of key-value pairs. Does anyone have an idea how to work around this?

Playground: http://play.golang.org/p/I4U0lhPt5U

"anyone an idea how to work around this?" - Have a sensible XML schema. In your snippet there are two 'values' nodes and they don't share a common schema. How else should it be processed? The first is a values node, but it does not have a key and an array of type Value. It just has a value. Is that even valid? - evanmcdonnal

1 Answer

Two things to look at:

  • A struct type representing the contents of an XML element might have a special field "XMLName" of type xml.Name.

    This can be used to have types named differently from XML elements.

  • The name of an XML element specified for a field of your struct type via the "xml" tag might specify nesting of elements using the > notation, like in

    Foo string `xml:"a>b>c"`
    

    This can be used to skip unneeded intermediate (enclosing) elements and directly extract their child elements of interest.

All of this is explained in the package documentation for the Unmarshal() function.
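As a rough illustration of the two features listed above, here is a minimal sketch applied to the second 'values' element from the question. The type name Entry, the helper parseEntry and the constant entryXML are made up for this example and do not come from the question.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Entry is a hypothetical type name. The XMLName field ties it to the
// <values> element even though the Go type is named differently.
type Entry struct {
	XMLName xml.Name `xml:"values"`
	Key     string   `xml:"key"`
	// "value>key" descends through each <value> child and collects its <key>,
	// skipping the intermediate element.
	NestedKeys []string `xml:"value>key"`
}

// entryXML is sample input modeled on the question's second <values> element.
const entryXML = `<values>
    <key>Key_2</key>
    <value>
        <key>NestedKey_1</key>
        <value>NestedValue_1</value>
    </value>
    <value>
        <key>NestedKey_2</key>
        <value>NestedValue_2</value>
    </value>
</values>`

// parseEntry is a hypothetical helper wrapping xml.Unmarshal.
func parseEntry(data []byte) (Entry, error) {
	var e Entry
	err := xml.Unmarshal(data, &e)
	return e, err
}

func main() {
	e, err := parseEntry([]byte(entryXML))
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Println(e.XMLName.Local, e.Key, e.NestedKeys)
}
```

Note that this only shows how the two features behave; on its own it does not recover the plain-text Value_1 from the first 'values' element.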

In principle, these capabilities should allow you to deal with your problem.
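If they turn out not to be enough, one possible workaround (an assumption on my part, not something stated above) is the ",chardata" tag option of encoding/xml, which stores an element's own text in a field. The sketch below adds a Text field to the question's Value struct so that both shapes of 'value' land in the same type; the helper parseLines and the constant sample are hypothetical names.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Value covers both shapes of <value>: plain text or a nested key/value pair.
type Value struct {
	// Text receives the element's own character data, e.g. "Value_1".
	// For the nested shape it only holds the whitespace between child tags.
	Text  string `xml:",chardata"`
	Key   string `xml:"key"`
	Value string `xml:"value"`
}

type Values struct {
	Key   string  `xml:"key"`
	Value []Value `xml:"value"`
}

type Lines struct {
	Values []Values `xml:"values"`
}

// sample is the XML from the question, with the closing </lines> tag added.
const sample = `<lines>
<values>
    <key>Key_1</key>
    <value>Value_1</value>
</values>
<values>
    <key>Key_2</key>
    <value>
        <key>NestedKey_1</key>
        <value>NestedValue_1</value>
    </value>
    <value>
        <key>NestedKey_2</key>
        <value>NestedValue_2</value>
    </value>
</values>
</lines>`

// parseLines is a hypothetical helper: it unmarshals the document and trims
// the whitespace-only chardata left over from nested <value> elements.
func parseLines(data []byte) (Lines, error) {
	var l Lines
	if err := xml.Unmarshal(data, &l); err != nil {
		return l, err
	}
	for i := range l.Values {
		for j := range l.Values[i].Value {
			v := &l.Values[i].Value[j]
			v.Text = strings.TrimSpace(v.Text)
		}
	}
	return l, nil
}

func main() {
	l, err := parseLines([]byte(sample))
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	for _, vs := range l.Values {
		for _, v := range vs.Value {
			if v.Text != "" {
				// Plain-text shape: <value>Value_1</value>
				fmt.Printf("%s = %s\n", vs.Key, v.Text)
			} else {
				// Nested shape: <value><key>..</key><value>..</value></value>
				fmt.Printf("%s / %s = %s\n", vs.Key, v.Key, v.Value)
			}
		}
	}
}
```

After trimming, a non-empty Text means the 'value' element held plain text, and an empty Text means the Key and Value fields carry the nested pair.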

If you find that your case is truly pathological and hard to handle with plain unmarshaling, you might try approaching it with XPath.