I'm trying to get the unique values of a custom-generated sequence of strings in XSLT. The problem is a little unusual: I first have to split each string on '.' and chop off the last segment. That part works fine, but getting the unique values with the XSLT 2.0 distinct-values function does not.

Given some input

<TreeNumberList>
  <TreeNumber>A01.001.001</TreeNumber>  
  <TreeNumber>A01.001.002</TreeNumber>  
  <TreeNumber>A01.001.003</TreeNumber>
  <TreeNumber>A01.002.111</TreeNumber>
</TreeNumberList>

The desired output would be an iterable sequence of

A01.001, A01.002

So far I have the following functions

<xsl:function name="func:strip-last">
    <xsl:param name="str"></xsl:param>
    <xsl:value-of select="substring($str, 1, string-length($str) - 1)"></xsl:value-of>
</xsl:function>

<xsl:function name="func:parent-of">
    <xsl:param name="nodes"></xsl:param>
    <xsl:variable name="output">
    <xsl:for-each select="$nodes">
        <xsl:variable name="parent">
            <xsl:for-each select="tokenize(., '\.')">
                <xsl:if test="position() != last()">
                    <xsl:value-of select="."></xsl:value-of>
                    <xsl:text>.</xsl:text>
                </xsl:if>
            </xsl:for-each>
        </xsl:variable>
        <tmp><xsl:value-of select="func:strip-last($parent)"></xsl:value-of></tmp>
    </xsl:for-each>    
    </xsl:variable>
    <xsl:sequence select="distinct-values($output/*)"></xsl:sequence>
</xsl:function>

But this does not return a set of distinct values; instead it returns a sequence of all the values involved. The eventual code will be a bit more involved, since the TreeNumbers themselves will not be unique, but a name retrieved through a key lookup will be. (For those who recognize the markup, it's part of the MeSH XML.)

I've also tried using a keyed index or group-by for uniqueness, but those did not play well with the document fragments.

2 Answers

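The following expression, evaluated with TreeNumberList as the context node, gives the set of distinct values as a sequence of strings. The pattern removes only the last dot and the segment after it:

distinct-values(TreeNumber/replace(., '\.[^.]*$', ''))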
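
For context, here is a minimal sketch of how a one-expression approach like this might sit inside a complete stylesheet. It assumes the input shown in the question, with TreeNumberList as the document element, and uses the pattern '\.[^.]*$' so that only the final dot-separated segment is removed. The results and item element names are only for illustration.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>

    <!-- Matching TreeNumberList makes it the context node,
         so the relative path TreeNumber resolves to its children. -->
    <xsl:template match="/TreeNumberList">
        <results>
            <!-- '\.[^.]*$' matches the last dot plus everything after it;
                 replace() strips that, distinct-values() removes duplicates. -->
            <xsl:for-each select="distinct-values(TreeNumber/replace(., '\.[^.]*$', ''))">
                <item>
                    <xsl:value-of select="."/>
                </item>
            </xsl:for-each>
        </results>
    </xsl:template>

</xsl:stylesheet>
```

With the sample input this should produce one item for A01.001 and one for A01.002. Note that the order of the values returned by distinct-values is implementation-dependent.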

Do you have to have functions? Could you just create a variable that has the sequence?

Example...

XML Input

<TreeNumberList>
    <TreeNumber>A01.001.001</TreeNumber>  
    <TreeNumber>A01.001.002</TreeNumber>  
    <TreeNumber>A01.001.003</TreeNumber>
    <TreeNumber>A01.002.111</TreeNumber>
</TreeNumberList>

XSLT 2.0

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:variable name="vAllTreeNumbers">
        <xsl:for-each select="/*/TreeNumber">
            <xsl:analyze-string select="." regex="(.*)\.[^.]*$">
                <xsl:matching-substring>
                    <xsl:value-of select="concat(regex-group(1),' ')"/>
                </xsl:matching-substring>
            </xsl:analyze-string>
        </xsl:for-each>
    </xsl:variable>
    <xsl:variable name="vUniqueTreeNumbers" select="distinct-values(tokenize(normalize-space($vAllTreeNumbers),' '))"/>

    <xsl:template match="/">
        <results>
            <!--This demonstrates that each value is part of a sequence.-->
            <xsl:for-each select="$vUniqueTreeNumbers">
                <item>
                    <xsl:value-of select="."/>
                </item>
            </xsl:for-each>
        </results>
    </xsl:template>

</xsl:stylesheet>

XML Output (just to demonstrate the sequence)

<results>
   <item>A01.001</item>
   <item>A01.002</item>
</results>

Here's another option that returns the same results...

XSLT 2.0

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:local="local"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:function name="local:getUniqueTreeNumbers">
        <xsl:param name="pNodes"/>
        <xsl:variable name="vAllTreeNumbers">
            <xsl:for-each select="$pNodes">
                <xsl:analyze-string select="." regex="(.*)\.[^.]*$">
                    <xsl:matching-substring>
                        <xsl:value-of select="concat(regex-group(1),' ')"/>
                    </xsl:matching-substring>
                </xsl:analyze-string>
            </xsl:for-each>
        </xsl:variable>
        <xsl:sequence select="distinct-values(tokenize(normalize-space($vAllTreeNumbers),' '))"></xsl:sequence>
    </xsl:function>

    <xsl:template match="/">
        <results>
            <!--This demonstrates that each value is part of an iterable sequence.-->
            <xsl:for-each select="local:getUniqueTreeNumbers(*/TreeNumber)">
                <item>
                    <xsl:value-of select="."/>
                </item>
            </xsl:for-each>         
        </results>
    </xsl:template>

</xsl:stylesheet>
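
If a function is wanted but the intermediate space-joined string is not, a more compact variant is possible. This is only a sketch under the same assumptions about the input. The function name local:parent-numbers is made up for this example, and the as attributes are optional type declarations added for clarity.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:local="local"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all">
    <xsl:output indent="yes"/>

    <!-- Hypothetical helper: strips the last dot-separated segment from
         each node's string value and returns the distinct results. -->
    <xsl:function name="local:parent-numbers" as="xs:string*">
        <xsl:param name="pNodes" as="node()*"/>
        <xsl:sequence select="distinct-values(
            for $n in $pNodes return replace($n, '\.[^.]*$', ''))"/>
    </xsl:function>

    <xsl:template match="/">
        <results>
            <xsl:for-each select="local:parent-numbers(*/TreeNumber)">
                <item>
                    <xsl:value-of select="."/>
                </item>
            </xsl:for-each>
        </results>
    </xsl:template>

</xsl:stylesheet>
```

Because the values never pass through a temporary tree or a joined string, this version does not depend on the values being free of spaces. A value that contains no dot is returned unchanged.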