1
votes

I had requirement of transforming below input XML to desired output XML format. With help of this forum, I got the solution as below:

Input XML

<?xml version="1.0"?>
<dataset  xmlns="http://developer.cognos.com/schemas/xmldata/1/"  xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
<metadata>
    <item name="Employee Id" />
    <item name="Employee Name" />
    <item name="Department Name" />
</metadata>
<data>
    <row>
      <value>1</value>
      <value Salutation="Dr." >John</value>
      <value>Finance</value>
    </row>
    <row>
      <value>2</value>
      <value Salutation="Mr." >Peter</value>
      <value>Admin</value>
    </row>
</data>
</dataset>

XSLT Transformation

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:c="http://developer.cognos.com/schemas/xmldata/1/" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="vNames" select="/*/c:metadata/*/@name" />
<xsl:template match="/*/c:data">
    <dataset>
        <xsl:apply-templates/>
    </dataset>
</xsl:template>
<xsl:template match="c:row">
    <row>
        <xsl:apply-templates/>
    </row>
</xsl:template>
<xsl:template match="c:row/*">
    <xsl:variable name="vPos" select="position()"/>
    <xsl:element name="{translate($vNames[$vPos], ' ', '_')}">
        <xsl:apply-templates select="@*"/>
        <xsl:apply-templates/>
    </xsl:element>
</xsl:template>
<xsl:template match="@*">
    <xsl:attribute name="{name()}">
        <xsl:value-of select="." />
    </xsl:attribute>
</xsl:template>
</xsl:stylesheet>

Desired Output XML

<?xml version="1.0" encoding="UTF-16"?>
<dataset xmlns:c="http://developer.cognos.com/schemas/xmldata/1/">
<row>
    <Employee_Id>1</Employee_Id>
    <Employee_Name Salutation="Dr.">John</Employee_Name>
    <Department_Name>Finance</Department_Name>
</row>
<row>
    <Employee_Id>2</Employee_Id>
    <Employee_Name Salutation="Mr.">Peter</Employee_Name>
    <Department_Name>Admin</Department_Name>
</row>

However, I came across a special scenario which breaks this solution. The attribute values in Input XML can start with Number, Special Character or Space.

New Input XML

<?xml version="1.0"?>
<dataset  xmlns="http://developer.cognos.com/schemas/xmldata/1/"  xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
<metadata>
    <item name="1Employee Id" />
    <item name=" Employee Name" />
    <item name="$Department Name" />
</metadata>
<data>
<row>
    <value>1</value>
    <value Salutation="Dr." >John</value>
    <value>Finance</value>
</row>
<row>
    <value>2</value>
    <value Salutation="Mr." >Peter</value>
    <value>Admin</value>
</row>
</data>
</dataset>

Since name attribute values are translated to Element names, above transformation fails as Element names cannot start with Number or Space. In this case I would like to replace those characters with some valid characters for Element name say _ or C_ to get the same desired Output XML.

Kindly let me know how to handle this scenario.

Any help would be highly appreciated.

Thanks in advance.

Regards

2
Will you have Number or Special Characters within the attribute values, or just at the start? In the case of numbers, if an attribute value ended with a number, would you still want it appearing in the outputed element name?Tim C
If you can't be sure your column names will be valid XML names then you'd be safer opting for an output format like <c n="X">value</c> rather than <X>value</X>. That would probably be less verbose for long column names as you don't have the name repeated in both the opening and closing tag.Ian Roberts
@IanRoberts, Thanks for the suggestion. The output XML has to be consumed by another application, which expects it to be in this format.Fun2Learn

2 Answers

1
votes

Here is a complete solution that doesn't use any hard-coded special characters:

<xsl:stylesheet version="1.0"
 xmlns:c="http://developer.cognos.com/schemas/xmldata/1/"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:strip-space elements="*"/>

    <xsl:variable name="vNames" select="/*/c:metadata/*/@name" />

    <xsl:variable name="vAlpha" select=
    "concat('ABCDEFGHIJKLMNOPQRSTUVWXYZ',
            'abcdefghijklmnopqrstuvwxyz')"/>

    <xsl:template match="/*/c:data">
        <dataset>
            <xsl:apply-templates/>
        </dataset>
    </xsl:template>
    <xsl:template match="c:row">
        <row>
            <xsl:apply-templates/>
        </row>
    </xsl:template>

    <xsl:template match="c:row/*">
        <xsl:variable name="vPos" select="position()"/>
        <xsl:variable name="vChar1" select=
         "translate(substring($vNames[$vPos],1,1),
                    translate(substring($vNames[$vPos],1,1), $vAlpha, ''),
                    '_')"/>
        <xsl:element name=
               "{$vChar1}{translate(substring($vNames[$vPos],2), ' ', '_')}">
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates/>
        </xsl:element>
    </xsl:template>

    <xsl:template match="@*">
        <xsl:attribute name="{name()}">
            <xsl:value-of select="." />
        </xsl:attribute>
    </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the provided XML document:

<dataset  xmlns="http://developer.cognos.com/schemas/xmldata/1/"  xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
    <metadata>
        <item name="1Employee Id" />
        <item name=" Employee Name" />
        <item name="$Department Name" />
    </metadata>
    <data>
        <row>
            <value>1</value>
            <value Salutation="Dr." >John</value>
            <value>Finance</value>
        </row>
        <row>
            <value>2</value>
            <value Salutation="Mr." >Peter</value>
            <value>Admin</value>
        </row>
    </data>
</dataset>

the wanted, correct result is produced:

<dataset xmlns:c="http://developer.cognos.com/schemas/xmldata/1/">
   <row>
      <_Employee_Id>1</_Employee_Id>
      <_Employee_Name Salutation="Dr.">John</_Employee_Name>
      <_Department_Name>Finance</_Department_Name>
   </row>
   <row>
      <_Employee_Id>2</_Employee_Id>
      <_Employee_Name Salutation="Mr.">Peter</_Employee_Name>
      <_Department_Name>Admin</_Department_Name>
   </row>
</dataset>

Explanation:

  1. We can identify all characters that don't belong to a given set of characters by using the double-translate method, first shown by Michael Kay.

  2. If inside the names thare are other illegal characters in addition to spaces, the same double translate technique can be used to replace any such (unknown in advance) character with a desired legal character.

1
votes

It's very easy. Just extend the range of your translate() call to cover the problem characters (characters that you use in the attribute, but that XML forbids for element names).

For example, change...

<xsl:element name="{translate($vNames[$vPos], ' ', '_')}">

...to...

<xsl:element name="{translate($vNames[$vPos], ' 1$', '___')}">

Update

In response to the OP's clarification, here is how to translate just the first character. Suppose our tentative element name with unsafe first character is $vName. For illustration, let us say that we are just looking at $ as the only possible unsafe first character. What we do is strip off the first character, translate it, and then add it back on, like this...

<xsl:element name="{concat( translate( substring( $vName, 1, 1), '$', '_'),
                            substring( $vName, 2))}" />

...or...

<xsl:element name="{translate( substring( $vName, 1, 1), '$', '_')}{substring( $vName, 2)}" />

Note

If you are able to upgrade to XSLT 2.0, then it becomes easier with regex...

<xsl:element name="{replace( $vName, '^[$]', '_')}" />