I'm using xmlstarlet to make some modifications to an XML file (let's call it test.xml), but I'm running into issues with my update statement. (Note: I'm very new to xmlstarlet as well!)

Here is an example of the xml I'm working with:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<LIST>
    <STUFF>
        <xSTUFF>
            <ITEM>
                <ITEM_DATA>
                    <ITEM_ATTRIBUTE>Item_ID</ITEM_ATTRIBUTE>
                    <ATTRIBUTE_DATA>X-123</ATTRIBUTE_DATA>
                </ITEM_DATA>
                <ITEM_DATA>
                    <ITEM_ATTRIBUTE>Color</ITEM_ATTRIBUTE>
                    <ATTRIBUTE_DATA>Purple</ATTRIBUTE_DATA>
                </ITEM_DATA>
                <ITEM_DATA>
                    <ITEM_ATTRIBUTE>Weight</ITEM_ATTRIBUTE>
                    <ATTRIBUTE_DATA>5</ATTRIBUTE_DATA>
                </ITEM_DATA>
                <INSTOCK>No</INSTOCK>
                <LOCATION></LOCATION>
                <PRICE></PRICE>
                <ONSALE></ONSALE>
                <DISCOUNT></DISCOUNT>
            </ITEM>
            <ITEM>
                <ITEM_DATA>
                    <ITEM_ATTRIBUTE>Item_ID</ITEM_ATTRIBUTE>
                    <ATTRIBUTE_DATA>X-124</ATTRIBUTE_DATA>
                </ITEM_DATA>
                <ITEM_DATA>
                    <ITEM_ATTRIBUTE>Color</ITEM_ATTRIBUTE>
                    <ATTRIBUTE_DATA>Red</ATTRIBUTE_DATA>
                </ITEM_DATA>
                <ITEM_DATA>
                    <ITEM_ATTRIBUTE>Weight</ITEM_ATTRIBUTE>
                    <ATTRIBUTE_DATA>3</ATTRIBUTE_DATA>
                </ITEM_DATA>
                <INSTOCK>Yes</INSTOCK>
                <LOCATION>IsleA</LOCATION>
                <PRICE>2.99</PRICE>
                <ONSALE>No</ONSALE>
                <DISCOUNT>No</DISCOUNT>
            </ITEM>
        </xSTUFF>
    </STUFF>
</LIST>

There are multiple items, each with unique item IDs. I'm trying to update the INSTOCK, LOCATION, PRICE, and sometimes the ONSALE and DISCOUNT fields for a given item ID. Using one of those as an example, I'm trying the following:

xmlstarlet ed --inplace -u "//LIST/STUFF/xSTUFF/ITEM/ITEM_DATA[ATTRIBUTE_DATA='X-123']/../INSTOCK" -v Yes test.xml

This appears to work, but the empty elements in the matching ITEM come out differently, so my output file ends up looking like this (note the LOCATION, PRICE, ONSALE, and DISCOUNT elements):

EDIT: I originally thought the start tags were being stripped, but the empty elements were actually rewritten as self-closing tags. Results below updated. Thanks, Daniel Haley.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<LIST>
    <STUFF>
        <xSTUFF>
            <ITEM>
                <ITEM_DATA>
                    <ITEM_ATTRIBUTE>Item_ID</ITEM_ATTRIBUTE>
                    <ATTRIBUTE_DATA>X-123</ATTRIBUTE_DATA>
                </ITEM_DATA>
                <ITEM_DATA>
                    <ITEM_ATTRIBUTE>Color</ITEM_ATTRIBUTE>
                    <ATTRIBUTE_DATA>Purple</ATTRIBUTE_DATA>
                </ITEM_DATA>
                <ITEM_DATA>
                    <ITEM_ATTRIBUTE>Weight</ITEM_ATTRIBUTE>
                    <ATTRIBUTE_DATA>5</ATTRIBUTE_DATA>
                </ITEM_DATA>
                <INSTOCK>Yes</INSTOCK>
                <LOCATION/>
                <PRICE/>
                <ONSALE/>
                <DISCOUNT/>
            </ITEM>
            <ITEM>
                <ITEM_DATA>
                    <ITEM_ATTRIBUTE>Item_ID</ITEM_ATTRIBUTE>
                    <ATTRIBUTE_DATA>X-124</ATTRIBUTE_DATA>
                </ITEM_DATA>
                <ITEM_DATA>
                    <ITEM_ATTRIBUTE>Color</ITEM_ATTRIBUTE>
                    <ATTRIBUTE_DATA>Red</ATTRIBUTE_DATA>
                </ITEM_DATA>
                <ITEM_DATA>
                    <ITEM_ATTRIBUTE>Weight</ITEM_ATTRIBUTE>
                    <ATTRIBUTE_DATA>3</ATTRIBUTE_DATA>
                </ITEM_DATA>
                <INSTOCK>Yes</INSTOCK>
                <LOCATION>IsleA</LOCATION>
                <PRICE>2.99</PRICE>
                <ONSALE>No</ONSALE>
                <DISCOUNT>No</DISCOUNT>
            </ITEM>
        </xSTUFF>
    </STUFF>
</LIST>

I'm guessing I'm missing something simple since I'm completely new to xmlstarlet, so any help is greatly appreciated!

When I try to reproduce your issue, I get something different. Instead of missing start tags, I get self-closing tags, like <LOCATION/>. This is the equivalent of <LOCATION></LOCATION>. (See stackoverflow.com/q/21022367/317052.) I don't think you can control that part of the serialization in xmlstarlet. - Daniel Haley
You could use XSLT with the tr command and set the output method to HTML to keep both the start and end tags, but then you lose your XML declaration (<?xml ...?>), and indent (pretty print) doesn't seem to work. (Tested with xmlstarlet 1.6.1 on Windows.) Let me know if you'd like an example and I'll add an answer. - Daniel Haley
Oh wow... thanks for pointing that out, Daniel. You're absolutely correct: I'm getting self-closing tags and just completely missed it. If you wouldn't mind, I'd certainly like to see your XSLT example as well. I'm not constrained to xmlstarlet, and the resulting XML file is going to be used by another homegrown app that I'm unfamiliar with, so I'm unsure how it will handle those tags. It'll likely be OK, but having a plan B never hurt. Thanks again. - junkyman
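
The equivalence between the two spellings mentioned in the comments can be checked with any XML parser. Here is a minimal sketch using Python's standard library (the two literal strings are made up for illustration, not taken from test.xml). Both forms should parse to an element with no text and no children:

```python
import xml.etree.ElementTree as ET

# Two spellings of the same empty element (illustrative input only).
self_closed = ET.fromstring("<ITEM><LOCATION/></ITEM>")
start_end = ET.fromstring("<ITEM><LOCATION></LOCATION></ITEM>")

for root in (self_closed, start_end):
    loc = root.find("LOCATION")
    # Print the tag, its text, and its number of child elements.
    print(loc.tag, repr(loc.text), len(loc))
```

A consumer that uses a real XML parser should therefore not care which form it receives. Only tools that match tags as raw text are likely to notice the difference.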

1 Answer

To prevent the empty elements from being self-closed, you could use XSLT with xmlstarlet's tr command and set the output method to HTML.

XSLT 1.0

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" method="html"/>
  <xsl:strip-space elements="*"/>

  <xsl:param name="id"/>
  <xsl:param name="newValue"/>

  <xsl:template match="@*|node()" name="ident">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="INSTOCK">
    <xsl:choose>
      <xsl:when test="../ITEM_DATA/ATTRIBUTE_DATA=$id">
        <xsl:copy>
          <xsl:value-of select="$newValue"/>
        </xsl:copy>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="ident"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

Command Line

xml tr test.xsl -s id="X-123" -s newValue="Yes" input.xml

Output

<LIST><STUFF><xSTUFF><ITEM><ITEM_DATA><ITEM_ATTRIBUTE>Item_ID</ITEM_ATTRIBUTE><ATTRIBUTE_DATA>X-123</ATTRIBUTE_DATA></ITEM_DATA><ITEM_DATA><ITEM_ATTRIBUTE>Color</ITEM_ATTRIBUTE><ATTRIBUTE_DATA>Purple</ATTRIBUTE_DATA></ITEM_DATA><ITEM_DATA><ITEM_ATTRIBUTE>Weight</ITEM_ATTRIBUTE><ATTRIBUTE_DATA>5</ATTRIBUTE_DATA></ITEM_DATA><INSTOCK>Yes</INSTOCK><LOCATION></LOCATION><PRICE></PRICE><ONSALE></ONSALE><DISCOUNT></DISCOUNT></ITEM><ITEM><ITEM_DATA><ITEM_ATTRIBUTE>Item_ID</ITEM_ATTRIBUTE><ATTRIBUTE_DATA>X-124</ATTRIBUTE_DATA></ITEM_DATA><ITEM_DATA><ITEM_ATTRIBUTE>Color</ITEM_ATTRIBUTE><ATTRIBUTE_DATA>Red</ATTRIBUTE_DATA></ITEM_DATA><ITEM_DATA><ITEM_ATTRIBUTE>Weight</ITEM_ATTRIBUTE><ATTRIBUTE_DATA>3</ATTRIBUTE_DATA></ITEM_DATA><INSTOCK>Yes</INSTOCK><LOCATION>IsleA</LOCATION><PRICE>2.99</PRICE><ONSALE>No</ONSALE><DISCOUNT>No</DISCOUNT></ITEM></xSTUFF></STUFF></LIST>

Notice that you don't get the XML declaration (<?xml ...?>), and even though indent="yes" is set on xsl:output, the XML ends up all on one line.

The XML is still well-formed, though, because the XML declaration is optional in XML 1.0 documents.
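
As a quick sanity check of that claim, here is a minimal Python sketch (standard library only). The `output` string is a shortened, hand-written stand-in for the single-line result above, not the full output. It has no XML declaration and no line breaks, and it still parses:

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the single-line output: no XML declaration, no indentation.
output = (
    "<LIST><STUFF><xSTUFF><ITEM>"
    "<INSTOCK>Yes</INSTOCK><LOCATION></LOCATION>"
    "</ITEM></xSTUFF></STUFF></LIST>"
)

# fromstring raises xml.etree.ElementTree.ParseError if the input is not well-formed.
root = ET.fromstring(output)
instock = root.findtext("STUFF/xSTUFF/ITEM/INSTOCK")
print(root.tag, instock)
```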

Another option is to use XSLT 2.0/3.0 with a different processor. That way you can use the output method xhtml.

Here's an example of using XSLT 3.0 with the Java version of Saxon-HE 9.8 (free/open source) from the command line...

XSLT 3.0

<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xhtml" standalone="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:param name="id" required="yes"/>
  <xsl:param name="newValue" required="yes"/>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="ITEM[ITEM_DATA/ATTRIBUTE_DATA=$id]/INSTOCK">
    <xsl:copy>
      <xsl:value-of select="$newValue"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

Command Line

java -cp "C:/apps/saxon/saxon9he.jar" net.sf.saxon.Transform -s:"input.xml" -xsl:"test.xsl" id="X-123" newValue="Yes"

Output

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><LIST>
   <STUFF>
      <xSTUFF>
         <ITEM>
            <ITEM_DATA>
               <ITEM_ATTRIBUTE>Item_ID</ITEM_ATTRIBUTE>
               <ATTRIBUTE_DATA>X-123</ATTRIBUTE_DATA>
            </ITEM_DATA>
            <ITEM_DATA>
               <ITEM_ATTRIBUTE>Color</ITEM_ATTRIBUTE>
               <ATTRIBUTE_DATA>Purple</ATTRIBUTE_DATA>
            </ITEM_DATA>
            <ITEM_DATA>
               <ITEM_ATTRIBUTE>Weight</ITEM_ATTRIBUTE>
               <ATTRIBUTE_DATA>5</ATTRIBUTE_DATA>
            </ITEM_DATA>
            <INSTOCK>Yes</INSTOCK>
            <LOCATION></LOCATION>
            <PRICE></PRICE>
            <ONSALE></ONSALE>
            <DISCOUNT></DISCOUNT>
         </ITEM>
         <ITEM>
            <ITEM_DATA>
               <ITEM_ATTRIBUTE>Item_ID</ITEM_ATTRIBUTE>
               <ATTRIBUTE_DATA>X-124</ATTRIBUTE_DATA>
            </ITEM_DATA>
            <ITEM_DATA>
               <ITEM_ATTRIBUTE>Color</ITEM_ATTRIBUTE>
               <ATTRIBUTE_DATA>Red</ATTRIBUTE_DATA>
            </ITEM_DATA>
            <ITEM_DATA>
               <ITEM_ATTRIBUTE>Weight</ITEM_ATTRIBUTE>
               <ATTRIBUTE_DATA>3</ATTRIBUTE_DATA>
            </ITEM_DATA>
            <INSTOCK>Yes</INSTOCK>
            <LOCATION>IsleA</LOCATION>
            <PRICE>2.99</PRICE>
            <ONSALE>No</ONSALE>
            <DISCOUNT>No</DISCOUNT>
         </ITEM>
      </xSTUFF>
   </STUFF>
</LIST>

Notice there's no line break between the XML declaration and the <LIST> start tag. If this is an issue (it shouldn't be), you can add the following template to the XSLT.

<xsl:template match="/">
  <xsl:text>&#xA;</xsl:text>
  <xsl:apply-templates/>
</xsl:template>

Also, if you end up being able to use your current output, you can simplify the XPath in your xmlstarlet command a little by moving the predicate onto ITEM, which removes the need for the /.. step:

/LIST/STUFF/xSTUFF/ITEM[ITEM_DATA/ATTRIBUTE_DATA='X-123']/INSTOCK
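
Since xmlstarlet is not a hard requirement here, another possible plan B is a short script. The following is only a sketch, assuming Python 3.4 or later and its standard `xml.etree.ElementTree` module. `SAMPLE` is a trimmed stand-in for test.xml, and `update_item` is a hypothetical helper name, not part of any library. It updates several fields of one item in a single pass and serializes with `short_empty_elements=False`, so empty elements keep both their start and end tags:

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for test.xml (illustrative only).
SAMPLE = """<LIST><STUFF><xSTUFF>
<ITEM>
<ITEM_DATA><ITEM_ATTRIBUTE>Item_ID</ITEM_ATTRIBUTE><ATTRIBUTE_DATA>X-123</ATTRIBUTE_DATA></ITEM_DATA>
<INSTOCK>No</INSTOCK><LOCATION></LOCATION><PRICE></PRICE>
</ITEM>
</xSTUFF></STUFF></LIST>"""


def update_item(root, item_id, new_values):
    """Set child element text on every ITEM that has an ITEM_DATA/ATTRIBUTE_DATA equal to item_id."""
    for item in root.iter("ITEM"):
        ids = [d.findtext("ATTRIBUTE_DATA") for d in item.findall("ITEM_DATA")]
        if item_id in ids:
            for tag, value in new_values.items():
                child = item.find(tag)
                if child is not None:  # skip fields the item does not have
                    child.text = value


root = ET.fromstring(SAMPLE)
update_item(root, "X-123", {"INSTOCK": "Yes", "LOCATION": "IsleA"})

# short_empty_elements=False writes <PRICE></PRICE> instead of <PRICE />.
result = ET.tostring(root, encoding="unicode", short_empty_elements=False)
print(result)
```

To write a file with an XML declaration, `ET.ElementTree(root).write("test.xml", encoding="UTF-8", xml_declaration=True, short_empty_elements=False)` should work. As far as I know, the declaration it writes does not include standalone="yes", so add that yourself if the consuming app depends on it.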