I am transforming XML files that do not contain every element in the schema. What technique should I use in my XSLT file so that the resulting XML contains all elements specified in the schema?

I know XSLT 2.0 could probably solve this problem easily, but I am stuck with XSLT 1.0.

Paul


Here is an example of what I am trying to do:

Schema.xml

<SAN>
    <STACKMEMBERS>
        <STACKMEMBER>
            <A/>
            <B/>
            <ETHERNETSWITCHES>
                <ETHERNETSWITCH>
                    <C/>
                    <D/>
                    <E/>
                </ETHERNETSWITCH>
            </ETHERNETSWITCHES>
        </STACKMEMBER>
    </STACKMEMBERS>
</SAN>

DataFile.xml

<SAN>
    <STACKMEMBERS>
        <STACKMEMBER>
            <A>1111</A>
            <ETHERNETSWITCHES>
                <ETHERNETSWITCH>
                    <D>3333</D>
                    <E>4444</E>
                </ETHERNETSWITCH>
                <ETHERNETSWITCH>
                    <D>5555</D>
                    <E>6666</E>
                </ETHERNETSWITCH>
            </ETHERNETSWITCHES>
        </STACKMEMBER>
        <STACKMEMBER>
            <A>2222</A>
            <ETHERNETSWITCHES>
                <ETHERNETSWITCH>
                    <D>7777</D>
                    <E>8888</E>
                </ETHERNETSWITCH>
                <ETHERNETSWITCH>
                    <D>9999</D>
                    <E>1010</E>
                </ETHERNETSWITCH>
            </ETHERNETSWITCHES>
        </STACKMEMBER>
    </STACKMEMBERS>
</SAN>

Desired Output:

<SAN>
    <STACKMEMBERS>
        <STACKMEMBER>
            <A>1111</A>
            <B />
            <ETHERNETSWITCHES>
                <ETHERNETSWITCH>
                    <C />
                    <D>3333</D>
                    <E>4444</E>
                </ETHERNETSWITCH>
                <ETHERNETSWITCH>
                    <C />
                    <D>5555</D>
                    <E>6666</E>
                </ETHERNETSWITCH>
            </ETHERNETSWITCHES>
        </STACKMEMBER>
        <STACKMEMBER>
            <A>2222</A>
            <B />
            <ETHERNETSWITCHES>
                <ETHERNETSWITCH>
                    <C />
                    <D>7777</D>
                    <E>8888</E>
                </ETHERNETSWITCH>
                <ETHERNETSWITCH>
                    <C />
                    <D>9999</D>
                    <E>1010</E>
                </ETHERNETSWITCH>
            </ETHERNETSWITCHES>
        </STACKMEMBER>
    </STACKMEMBERS>
</SAN>

My current XSLT file is...

<?xml version="1.0" encoding="windows-1252"?>

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:variable name="schemaFile" select="document('Schema.xml')"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="*">
        <SAN>
            <STACKMEMBERS>
                <xsl:for-each select="/SAN/STACKMEMBERS/STACKMEMBER">
                    <xsl:copy-of select="."/>
                    <xsl:copy-of select="$schemaFile/SAN/STACKMEMBERS/STACKMEMBER"/>
                </xsl:for-each>
            </STACKMEMBERS>
        </SAN>
    </xsl:template>

</xsl:stylesheet>

...and it is giving me the wrong results:

<SAN>
    <STACKMEMBERS>
        <STACKMEMBER>
            <A>1111</A>
            <ETHERNETSWITCHES>
                <ETHERNETSWITCH>
                    <D>3333</D>
                    <E>4444</E>
                </ETHERNETSWITCH>
                <ETHERNETSWITCH>
                    <D>5555</D>
                    <E>6666</E>
                </ETHERNETSWITCH>
            </ETHERNETSWITCHES>
        </STACKMEMBER>
        <STACKMEMBER>
            <A />
            <B />
            <ETHERNETSWITCHES>
                <ETHERNETSWITCH>
                    <C />
                    <D />
                    <E />
                </ETHERNETSWITCH>
            </ETHERNETSWITCHES>
        </STACKMEMBER>
        <STACKMEMBER>
            <A>2222</A>
            <ETHERNETSWITCHES>
                <ETHERNETSWITCH>
                    <D>7777</D>
                    <E>8888</E>
                </ETHERNETSWITCH>
                <ETHERNETSWITCH>
                    <D>9999</D>
                    <E>1010</E>
                </ETHERNETSWITCH>
            </ETHERNETSWITCHES>
        </STACKMEMBER>
        <STACKMEMBER>
            <A />
            <B />
            <ETHERNETSWITCHES>
                <ETHERNETSWITCH>
                    <C />
                    <D />
                    <E />
                </ETHERNETSWITCH>
            </ETHERNETSWITCHES>
        </STACKMEMBER>
    </STACKMEMBERS>
</SAN>

Note that elements B and C are defined in the schema but not present in the data file. The desired output keeps everything from the data file and adds the elements that appear only in the schema.

It would be nice to have multiple criteria for a selection (like select=(one, two)). I feel I am close here but need one little push to get the desired output.

Paul


Here is the complete XSLT file, revised with Michael Kay's template, that is not quite working:

<?xml version="1.0" encoding="windows-1252"?>

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:variable name="Instance" select="/"/>
    <xsl:variable name="Schema" select="document('schema.xml')"/>

    <xsl:template match="*">
        <xsl:copy>
            <xsl:variable name="E" select="."/>
            <xsl:variable name="S" select="$Schema//*[name(.)=name($E)]"/>
            <xsl:for-each select="$S/*">
                <xsl:variable name="SC" select="."/>
                <xsl:variable name="EC" select="$E/*[name(.)=name($SC)]"/>
                <xsl:choose>
                    <xsl:when test="$EC">
                        <xsl:apply-templates select="$EC"/>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:copy-of select="$SC"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

I wrote a template that should return the element's XPath. Then I modified the above template to look up the elements by XPath instead of element name. The result is a stack overflow and I don't know where I've gone wrong.

Here is the XSLT code that is not working. Any help would be much appreciated.

Paul

<?xml version="1.0"?>

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2011-03-16T10:53:27">

    <xsl:output indent="yes"/>
        <xsl:strip-space elements="*"/>

        <xsl:variable name="Data" select="/"/>
        <xsl:variable name="Schema" select="document('MainDataSource.xml')"/>

    <xsl:template match="*[not(*)]">
        <xsl:copy-of select="."/>
    </xsl:template>

        <xsl:template name="GetXPath">
                <xsl:param name="element"/>
                <xsl:if test="not(*)">
                        <xsl:apply-templates select="ancestor-or-self::*" mode="path"/>
                </xsl:if>
        </xsl:template>

        <xsl:template match="*" mode="path">
                <xsl:value-of select="concat('/',name())"/>
        </xsl:template>

        <xsl:template match="*">
                <xsl:copy>
                        <xsl:variable name="DataElement" select="."/>

                        <xsl:variable name="DataElementXPath">
                                <xsl:call-template name="GetXPath">
                                        <xsl:with-param name="element" select="$DataElement"/>
                                </xsl:call-template>
                        </xsl:variable>

                        <xsl:variable name="SchemaElement" select="$Schema/*[$DataElementXPath]"/>

                        <xsl:for-each select="$SchemaElement/*">
                                <xsl:variable name="SchemaChild" select="."/>

                                <xsl:variable name="SchemaChildXPath">
                                        <xsl:call-template name="GetXPath">
                                                <xsl:with-param name="element" select="$SchemaChild"/>
                                        </xsl:call-template>
                                </xsl:variable>

                                <xsl:variable name="DataChild" select="$Data/*[$SchemaChildXPath]"/>

                                <xsl:choose>
                                        <xsl:when test="$DataChild">
                                                <xsl:apply-templates select="$DataChild"/>
                                        </xsl:when>
                                        <xsl:otherwise>
                                                <xsl:copy-of select="$SchemaChild"/>
                                        </xsl:otherwise>
                                </xsl:choose>
                        </xsl:for-each>
                </xsl:copy>
        </xsl:template>

</xsl:stylesheet>
(Off topic, I did a double-take: I'm just reading John le Carré's "A Delicate Truth", in which the main character masquerades as Paul Anderson.) Michael Kay

2 Answers

0
votes

No easy answer. Just write a transformation that outputs valid XML.

There are mapping tools around, e.g. from Altova, that attempt to automate the task of creating a stylesheet that transforms from schema A to schema B. I've never found them very useful, but they might work for you.
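
For the sample documents in the question, a hand-written transformation of this kind might look like the following sketch. It hard-codes the child order from the question's Schema.xml (A, B, ETHERNETSWITCHES and C, D, E), so it is an illustration under that assumption and has to be kept in step with the schema by hand.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy anything not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- STACKMEMBER: emit A, B, ETHERNETSWITCHES in schema order,
       adding an empty A or B when the data file has none -->
  <xsl:template match="STACKMEMBER">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="A"/>
      <xsl:if test="not(A)"><A/></xsl:if>
      <xsl:apply-templates select="B"/>
      <xsl:if test="not(B)"><B/></xsl:if>
      <xsl:apply-templates select="ETHERNETSWITCHES"/>
    </xsl:copy>
  </xsl:template>

  <!-- ETHERNETSWITCH: emit C, D, E in schema order,
       adding an empty element for each one that is missing -->
  <xsl:template match="ETHERNETSWITCH">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="C"/>
      <xsl:if test="not(C)"><C/></xsl:if>
      <xsl:apply-templates select="D"/>
      <xsl:if test="not(D)"><D/></xsl:if>
      <xsl:apply-templates select="E"/>
      <xsl:if test="not(E)"><E/></xsl:if>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

This version does not read Schema.xml at all, and it does not add a missing ETHERNETSWITCHES container. Every new element in the schema means another template or another `xsl:if`.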

0
votes

Right. Well, it's nice to know that your schema is written in a proprietary schema language of your own invention. I suppose we have to guess the semantics, but they seem straightforward enough. As it seems this schema language can only express sequence, and not choice or iteration, the problem might be rather simpler than it would be with a more conventional schema language.

It seems to me that the rule you want to apply is:

To process an element E named N in the instance, find the element S named N in the schema. For each child SC of that schema element, if E has a child EC with the same name, process EC, otherwise copy SC.

That translates to this:

<xsl:variable name="Instance" select="/"/>
<xsl:variable name="Schema" select="document('schema.xml')"/>
<xsl:template match="*">
  <xsl:copy>
    <xsl:variable name="E" select="."/>
    <xsl:variable name="S" select="$Schema//*[name(.)=name($E)]"/>
    <xsl:for-each select="$S/*">
      <xsl:variable name="SC" select="."/>
      <xsl:variable name="EC" select="$E/*[name(.)=name($SC)]"/>
      <xsl:choose>
        <xsl:when test="$EC">
          <xsl:apply-templates select="$EC"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:copy-of select="$SC"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:copy>
</xsl:template>

Not tested.
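
Two caveats with the name-based lookup: it assumes each element name occurs only once in the schema, and because the template only iterates over the schema's children, leaf elements from the data would come out without their text. The path-based attempt in the question cannot work in XSLT 1.0 either, since a string held in a variable is never evaluated as a path. `$Data/*[$SchemaChildXPath]` merely tests a result tree fragment, which always converts to true, so the document element is selected and processed again without end. A possible alternative, sketched below and likewise not run against a real processor, walks the schema and the data in parallel and passes the matching schema element down as a parameter. The mode name `fill` and the parameter name `S` are arbitrary choices for this sketch, and the file name `Schema.xml` is the one used in the question.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:variable name="Schema" select="document('Schema.xml')"/>

  <!-- Start: pair the data's document element with the schema's -->
  <xsl:template match="/*">
    <xsl:apply-templates select="." mode="fill">
      <xsl:with-param name="S" select="$Schema/*"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- E is a data element, S is the schema element at the same path -->
  <xsl:template match="*" mode="fill">
    <xsl:param name="S"/>
    <xsl:variable name="E" select="."/>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:choose>
        <!-- Schema element has children: emit them in schema order -->
        <xsl:when test="$S/*">
          <xsl:for-each select="$S/*">
            <xsl:variable name="SC" select="."/>
            <xsl:variable name="EC" select="$E/*[name() = name($SC)]"/>
            <xsl:choose>
              <xsl:when test="$EC">
                <!-- Present in the data: recurse with the paired schema child -->
                <xsl:apply-templates select="$EC" mode="fill">
                  <xsl:with-param name="S" select="$SC"/>
                </xsl:apply-templates>
              </xsl:when>
              <xsl:otherwise>
                <!-- Missing in the data: copy the empty schema element -->
                <xsl:copy-of select="$SC"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:for-each>
        </xsl:when>
        <!-- Schema leaf: keep the data element's own content -->
        <xsl:otherwise>
          <xsl:copy-of select="node()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because the schema element travels down with the recursion, two elements with the same name at different paths are kept apart without building any path strings. Elements in the data that the schema does not list are dropped, and text placed directly inside an element that has schema children is not copied.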