5
votes

I have a large collection of XML files which I need to transform using XSLT. The problem is that many of these files were hand-written by different people and they do not use consistent names to refer to the schemas. For example, one file might use:

xmlns:itemType="http://example.com/ItemType/XSD"

where another might use the prefix "it" instead of "itemType":

xmlns:it="http://example.com/ItemType/XSD"

If that's not bad enough, there are several files which use two or three synonyms for the same thing!

<?xml version="1.0"?>
<Document
    xmlns:it="http://example.com/ItemType/XSD"
    xmlns:itemType="http://example.com/ItemType/XSD"
    xmlns:ItemType="http://example.com/ItemType/XSD"
    ...

(there's clearly been a lot of cutting and pasting going on)

Now, because the pattern matching in the XSLT file appears to work on the namespace prefix (as opposed to the schema it relates to) the pattern only matches one of the variants. So if I write something like:

    <xsl:template match="SomeNode[@xsi:type='itemType:SomeType']">
        ...
    </xsl:template>

Then it only matches a subset of the cases that I want it to.

Question 1: Is there any way to get the XSLT to match all the variants?

Question 2: Is there any way to remove the duplicates so all the output files use consistent naming?

I naïvely tried using "namespace-alias" but I guess I've misunderstood what that does because I can't get it to do anything at all - either match all the variants or affect the output XML.

<xsl:stylesheet
    version="1.0"
    ...
    xmlns:it="http://example.com/ItemType/XSD"
    xmlns:itemType="http://example.com/ItemType/XSD"
    xmlns:ItemType="http://example.com/ItemType/XSD"
    ...

    <xsl:output method="xml" indent="yes"/>
    <xsl:namespace-alias stylesheet-prefix="it" result-prefix="ItemType"/>
    <xsl:namespace-alias stylesheet-prefix="itemType" result-prefix="ItemType"/>
2
Which XSLT processor are you using? XSLT is supposed to use the URI for matching, not the textual prefix. - Jim Garrison
xsltproc --version reports: "Using libxml 20626, libxslt 10117 and libexslt 813; xsltproc was compiled against libxml 20626, libxslt 10117 and libexslt 813; libxslt 10117 was compiled against libxml 20626; libexslt 813 was compiled against libxml 20626". That's the Linux build, obviously, but we also have a Windows target which behaves the same way, I believe. - Andrew
Hmm, this is very confusing indeed. It looks like it sometimes does match variants but sometimes doesn't. I'll try to isolate what's causing the different behaviour. If it does use the URI instead of the prefix, then that would obviate the first question. In which case, do you know of any answer to the follow-up question? - Andrew
Okay, I updated the question to reflect an instance which definitely doesn't match. It may be because the usage of "itemType" is a bit more complicated in that example. - Andrew
Check my answer for an explanation and XSLT 1.0 solution. - user357812

2 Answers

1
votes

Attribute values and text nodes are not cast to QName unless you explicitly say so, and that is only possible in XSLT/XPath 2.0.

In XSLT/XPath 1.0 you must do this "manually":

<xsl:template match="SomeNode">
    <xsl:variable name="vPrefix" select="substring-before(@xsi:type,':')"/>
    <xsl:variable name="vNCName" 
           select="translate(substring-after(@xsi:type,$vPrefix),':','')"/>
    <xsl:if test="namespace::*[
                     name()=$vPrefix
                  ] = 'http://example.com/ItemType/XSD'
                     and
                  $vNCName = 'SomeType'">
        <!-- Content Template -->
    </xsl:if>
</xsl:template>

Edit: All in one pattern (less readable, maybe):

<xsl:template match="SomeNode[
                        namespace::*[
                           name()=substring-before(../@xsi:type,':')
                        ] = 'http://example.com/ItemType/XSD'
                           and
                        substring(
                          concat(':',@xsi:type),
                          string-length(@xsi:type) - 7
                        ) = ':SomeType'
                     ]">
    <!-- Content Template -->
</xsl:template>
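
For trying either template out, a small input along the following lines may help. It is only an illustrative sketch: the element names come from the question, but the document itself is hypothetical. Both SomeNode elements carry the same type in namespace terms, written with two different prefixes, so the prefix lookup on the namespace axis should resolve each of them to the same URI.

```xml
<?xml version="1.0"?>
<!-- Hypothetical test input: two prefixes bound to the same namespace URI -->
<Document
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:it="http://example.com/ItemType/XSD"
    xmlns:itemType="http://example.com/ItemType/XSD">
  <!-- Prefix "it" resolves to http://example.com/ItemType/XSD -->
  <SomeNode xsi:type="it:SomeType"/>
  <!-- Prefix "itemType" resolves to the same URI -->
  <SomeNode xsi:type="itemType:SomeType"/>
</Document>
```

Note that the stylesheet itself also needs the xsi prefix declared (bound to http://www.w3.org/2001/XMLSchema-instance) for @xsi:type to be usable in the patterns.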
1
votes

In XSLT 2.0 (whether or not you use schema-awareness) you can write the predicate as [@xsi:type=xs:QName('it:SomeType')] where "it" is the prefix declared in the stylesheet for this namespace. It doesn't have to be the same as the prefix used in the source document.
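
As a rough sketch of how this could look in a complete stylesheet, the variant below spells the prefix resolution out explicitly with resolve-QName() and QName() instead of relying on the comparison to convert the attribute value. It assumes an XSLT 2.0 processor and an untyped (non-validated) source document. If a source file uses a prefix in xsi:type that is not declared in scope, resolve-QName() raises an error, so treat this as a starting point only.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

  <!-- resolve-QName() expands the prefix in @xsi:type using the namespaces
       in scope on the source element, so "it:", "itemType:" and "ItemType:"
       compare equal whenever they are bound to the same URI. -->
  <xsl:template match="SomeNode[resolve-QName(@xsi:type, .) =
                         QName('http://example.com/ItemType/XSD', 'SomeType')]">
    <!-- Content Template -->
  </xsl:template>

</xsl:stylesheet>
```

Because QName values are compared by namespace URI and local name, the prefix chosen in the source document plays no part in the match.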

Of course matching of element and attribute names (as distinct from QName-valued content) uses namespace URIs rather than prefixes in both XSLT 1.0 and XSLT 2.0.
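
To illustrate the point about element names, here is a minimal XSLT 1.0 sketch. The element name Item and the stylesheet prefix x are hypothetical, chosen only to show that the stylesheet's prefix need not appear anywhere in the source documents.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:x="http://example.com/ItemType/XSD">

  <!-- "x" is bound to the same URI that the sources bind to it, itemType
       and ItemType, so this pattern matches <it:Item>, <itemType:Item>
       and <ItemType:Item> alike. -->
  <xsl:template match="x:Item">
    <!-- Content Template -->
  </xsl:template>

</xsl:stylesheet>
```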