I have a CSV file whose lines come in two main forms:

Case #1:

"surname, givenName, id#"

Case #2:

"organizationName,, id#"

I'm using a tokenize function to break the file into document nodes, one per line (splitting on the carriage return / line feed characters):

<xsl:template match="/">
  <!-- tokenize on line endings -->
  <xsl:for-each select="str:tokenize(.,'&#13;&#10;')">
    <document>
      <xsl:apply-templates select="." mode="new-document" />
    </document>
  </xsl:for-each>
</xsl:template>

That gives me this:

<document>"Don Jackson,,19001"</document>
<document>"Frederick Guitars,,ed55555,,,O"</document>
<document>"Frederick Guitars,,ed11111,,,O"</document>
<document>"A WILLIAMS,JONES THOMPSON,141212"</document>
<document>"A RANJI,ALENA,741152"</document>

Now I need to create content nodes within each document node, but the names of the content nodes depend on the structure of the line. If the second field is empty (that is, the first comma is immediately followed by another one, giving ',,'), then the first content node should be named "Organization". Otherwise, the first content node should be named "surname" and the second "givenName". In both cases the third field becomes an "ID_num" node.

It seems an xsl:choose should work here but I'm not sure how to implement it. Can someone provide some advice?

Thanks

By the way, str:tokenize stands in for an application-specific function that provides the same functionality. I'm not sure 'str:tokenize' is written correctly here; I just changed the name of the function that the application uses. - rally_point

1 Answer

I mimicked the way you get your data with a small input file, and the stylesheet below shows how to do the test you are asking about to distinguish an organization from a person: split each line on commas and check whether the second part is empty. Note that the test data does not appear to present surnames and given names correctly.

t:\ftemp>type rally.xml 
<all>
<document>"Don Jackson,,19001"</document>
<document>"Frederick Guitars,,ed55555,,,O"</document>
<document>"Frederick Guitars,,ed11111,,,O"</document>
<document>"A WILLIAMS,JONES THOMPSON,141212"</document>
<document>"A RANJI,ALENA,741152"</document>
</all>
t:\ftemp>call xslt2 rally.xml rally.xsl 
<?xml version="1.0" encoding="UTF-8"?>
<document>
   <Organization>Don Jackson</Organization>
   <ID_num>19001</ID_num>
</document>
<document>
   <Organization>Frederick Guitars</Organization>
   <ID_num>ed55555</ID_num>
</document>
<document>
   <Organization>Frederick Guitars</Organization>
   <ID_num>ed11111</ID_num>
</document>
<document>
   <surname>A WILLIAMS</surname>
   <givenName>JONES THOMPSON</givenName>
   <ID_num>141212</ID_num>
</document>
<document>
   <surname>A RANJI</surname>
   <givenName>ALENA</givenName>
   <ID_num>741152</ID_num>
</document>

t:\ftemp>type rally.xsl 
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xsd"
                version="2.0">

<xsl:output indent="yes"/>

<xsl:template match="/">
  <xsl:for-each select="all/document/string(.)">
      <document>
        <!--old: <xsl:apply-templates select="." mode="new-document" /> -->
        <!--new:-->
        <xsl:variable name="parts" as="xsd:string*"
              select="tokenize(replace(.,'^&#x22;(.*)&#x22;$','$1'),',')"/>
        <xsl:choose>
          <xsl:when test="$parts[2]=''">
            <Organization><xsl:value-of select="$parts[1]"/></Organization>
            <ID_num><xsl:value-of select="$parts[3]"/></ID_num>
          </xsl:when>
          <xsl:otherwise>
            <surname><xsl:value-of select="$parts[1]"/></surname>
            <givenName><xsl:value-of select="$parts[2]"/></givenName>
            <ID_num><xsl:value-of select="$parts[3]"/></ID_num>
          </xsl:otherwise>
        </xsl:choose>
      </document>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

t:\ftemp>rem Done! 

Edited to include the id number element.
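
Since the question mentions an application-specific tokenize function, the processor may only support XSLT 1.0, where tokenize() and replace() are not available. Here is a rough sketch of the same test using only the XPath 1.0 functions translate(), substring-before() and substring-after(). It assumes the same all/document input as rally.xml above. The documents wrapper element is something I made up so the result has a single root, and translate() removes every double quote in the line, not only the enclosing pair, so treat it as a starting point rather than a drop-in replacement.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output indent="yes"/>

<xsl:template match="/">
  <!-- hypothetical wrapper element, added only for this sketch -->
  <documents>
    <xsl:apply-templates select="all/document"/>
  </documents>
</xsl:template>

<xsl:template match="document">
  <!-- remove the double quotes around the line -->
  <xsl:variable name="line" select="translate(., '&quot;', '')"/>
  <!-- first field: text before the first comma -->
  <xsl:variable name="first" select="substring-before($line, ',')"/>
  <!-- everything after the first comma -->
  <xsl:variable name="rest" select="substring-after($line, ',')"/>
  <!-- second field: empty when the line contains ',,' -->
  <xsl:variable name="second" select="substring-before($rest, ',')"/>
  <!-- third field: up to the next comma, or to the end of the line;
       the appended comma guarantees substring-before finds a delimiter -->
  <xsl:variable name="third"
                select="substring-before(concat(substring-after($rest, ','), ','), ',')"/>
  <document>
    <xsl:choose>
      <xsl:when test="$second = ''">
        <Organization><xsl:value-of select="$first"/></Organization>
      </xsl:when>
      <xsl:otherwise>
        <surname><xsl:value-of select="$first"/></surname>
        <givenName><xsl:value-of select="$second"/></givenName>
      </xsl:otherwise>
    </xsl:choose>
    <!-- the ID element is emitted in both cases -->
    <ID_num><xsl:value-of select="$third"/></ID_num>
  </document>
</xsl:template>

</xsl:stylesheet>

The xsl:choose only decides the name elements; ID_num sits outside it because it is produced either way. A line with fewer than two commas would leave the second field empty and so be treated as an organization, which you may want to guard against.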