I'm trying to split the comma-separated tag list below into individual elements. The element and attribute names in the XML source will always be the same. I'm using XSLT 1.0, so I was hoping for a 1.0 solution. Based on this similar example, I thought the following XSL could work:

<xsl:stylesheet  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output indent="yes"/>
      <xsl:template match="pbcoreCollection">
        <pbcoreCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
             <pbcoreDescriptionDocument>
                 <xsl:call-template name="tokenize">
                 <xsl:with-param name="text" select="instantiationannotation"/>
                 <xsl:with-param name="elemName" select="'instantitionAnnotation'"/>
                </xsl:call-template>
            </pbcoreDescriptionDocument>
        </pbcoreCollection>
    </xsl:template> 
    <xsl:template name="tokenize">
        <xsl:param name="text"/>
        <xsl:param name="elemName"/>
        <xsl:param name="sep" select="', '"/>
        <xsl:choose>
            <xsl:when test="contains($text, $sep)">
                <xsl:element name="{$elemName}">
                    <xsl:value-of select="substring-before($text, $sep)"/>
                </xsl:element>
                <!-- recursive call -->
                <xsl:call-template name="tokenize">
                    <xsl:with-param name="text" select="substring-after($text, $sep)" />
                    <xsl:with-param name="elemName" select="$elemName" />
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:element name="{$elemName}">
                    <xsl:value-of select="$text"/>
                </xsl:element>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

But it yields this result:

<?xml version="1.0"?>
<pbcoreCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <pbcoreDescriptionDocument>
    <instantitionAnnotation/>
  </pbcoreDescriptionDocument>
</pbcoreCollection>

My original XML:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pbcoreCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">  
   <pbcoreDescriptionDocument>
        <pbcoreInstantiation>
            <instantiationAnnotation annotationType="CMS tag">congress, guns, gun_policy, catholic_schools, social_media, </instantiationAnnotation>
        </pbcoreInstantiation>
   </pbcoreDescriptionDocument>
</pbcoreCollection>

I would like the output to look like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pbcoreCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">  
   <pbcoreDescriptionDocument>
        <pbcoreInstantiation>
            <instantiationAnnotation annotationType="CMS tag">congress</instantiationAnnotation>
            <instantiationAnnotation annotationType="CMS tag">guns</instantiationAnnotation>
            <instantiationAnnotation annotationType="CMS tag">gun_policy</instantiationAnnotation>
            <instantiationAnnotation annotationType="CMS tag">catholic_schools</instantiationAnnotation>
            <instantiationAnnotation annotationType="CMS tag">social_media</instantiationAnnotation>
        </pbcoreInstantiation>
     </pbcoreDescriptionDocument>
</pbcoreCollection>
A similar question was answered well here. - Buck Doyle
What version of XSLT can you use? 1.0, 2.0, 3.0? It's a bit complicated but not impossible with XSLT 1.0, and quite trivial in 2.0 or 3.0. - helderdarocha
1. "I've seen some similar questions/answers, but nothing exactly like what I'm trying to do." Perhaps not exactly, but awfully close. 2. An example is not a good definition of a problem: will the element and attribute names in the XML source always be the same? If yes, the answer can be simplified. - michael.hor257k
I'll take a look at that example, I hadn't seen it before, thanks. In this case, yes, the element and attribute names in the XML source will always be the same. - jackpass

1 Answer

In this case, yes, the element and attribute names in the XML source will always be the same.

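Your attempt outputs a single empty element because select="instantiationannotation" selects nothing: XPath names are case-sensitive, and instantiationAnnotation is not a child of the context node pbcoreCollection but sits further down, under pbcoreInstantiation. The tokenize template therefore receives an empty string. With fixed element and attribute names, you could simplify to: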

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/">
    <pbcoreCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">  
       <pbcoreDescriptionDocument>
            <pbcoreInstantiation>
                <xsl:call-template name="tokenize">
                    <xsl:with-param name="text" select="pbcoreCollection/pbcoreDescriptionDocument/pbcoreInstantiation/instantiationAnnotation"/>
                </xsl:call-template>
            </pbcoreInstantiation>
       </pbcoreDescriptionDocument>
    </pbcoreCollection>
</xsl:template>

<xsl:template name="tokenize">
    <xsl:param name="text"/>
    <xsl:param name="sep" select="', '"/>
    <xsl:choose>
        <xsl:when test="contains($text, $sep)">
            <instantiationAnnotation annotationType="CMS tag">
                <xsl:value-of select="substring-before($text, $sep)"/>
            </instantiationAnnotation>
            <!-- recursive call -->
            <xsl:call-template name="tokenize">
                <xsl:with-param name="text" select="substring-after($text, $sep)" />
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <instantiationAnnotation annotationType="CMS tag">
                <xsl:value-of select="$text"/>
            </instantiationAnnotation>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

</xsl:stylesheet>

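Note: your example input ends with a trailing ", " separator. As written, the template above would then emit one extra, empty instantiationAnnotation element at the end, so this needs a bit more work if your real input looks like that.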
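
One way to handle the trailing separator, as a minimal sketch rather than a definitive fix: split on a bare comma, trim each token with normalize-space(), and only write an element when the token is non-empty. The switch from ", " to "," as the separator and the $token variable are my own assumptions, not something taken from your stylesheet; the rest follows the simplified version above.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/">
    <pbcoreCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <pbcoreDescriptionDocument>
            <pbcoreInstantiation>
                <xsl:call-template name="tokenize">
                    <xsl:with-param name="text" select="pbcoreCollection/pbcoreDescriptionDocument/pbcoreInstantiation/instantiationAnnotation"/>
                </xsl:call-template>
            </pbcoreInstantiation>
        </pbcoreDescriptionDocument>
    </pbcoreCollection>
</xsl:template>

<xsl:template name="tokenize">
    <xsl:param name="text"/>
    <!-- assumption: split on a bare comma and trim whitespace per token -->
    <xsl:param name="sep" select="','"/>
    <!-- appending $sep guarantees substring-before() finds a separator,
         so the last token is handled by the same expression -->
    <xsl:variable name="token" select="normalize-space(substring-before(concat($text, $sep), $sep))"/>
    <!-- skip empty tokens, e.g. the one after a trailing separator -->
    <xsl:if test="$token">
        <instantiationAnnotation annotationType="CMS tag">
            <xsl:value-of select="$token"/>
        </instantiationAnnotation>
    </xsl:if>
    <!-- recursive call while there is still text after a separator -->
    <xsl:if test="contains($text, $sep)">
        <xsl:call-template name="tokenize">
            <xsl:with-param name="text" select="substring-after($text, $sep)"/>
        </xsl:call-template>
    </xsl:if>
</xsl:template>

</xsl:stylesheet>
```

Applied to your sample input, this should produce five instantiationAnnotation elements (congress, guns, gun_policy, catholic_schools, social_media) and no empty one at the end. It also silently drops empty entries in the middle of the list, such as "a, , b"; if you need to keep those, remove the xsl:if around the element.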