1
votes

I'm working on an XSLT stylesheet to transform XML (TEI) documents to HTML. The goal is to create divs that can be styled to display as fixed columns.

In the documents, column beginnings are indicated by two empty elements (milestone and cb). "milestone" indicates that the number of columns in the text flow is now equal to its n attribute. "cb" marks the beginning of a column, and its n attribute indicates its position in the left-to-right sequence. The "cb" tags are not always siblings.

Sample XML:

<p>
  <milestone unit="column" n="2"/>
  <cb n="1"/>
  M. Dudley
  <lb/>
  H. E. Ernshimer
  <lb/>
  M. M. Cash
  <lb/>
  John Wheatly
  <lb/>
  Jno W. Cash
  <lb/>
  <cb n="2"/>
  R. L. Wilson
  <lb/>
  R. B. Ratliff L.C.C.
  <lb/>
  G. D Watkins Clk
  <lb/>
  A. C. Mayes
  <lb/>
  <pb/>
</p>
<p>
   <note place="left margin">Jury 1863 Nov.</note>
   <lb/>
   <cb n="1"/>
   D C Mitchenssson
   <lb/>
   A. W. Forde, Tm P
   <lb/>
   L S Thomson
   <lb/>
   Louis Martin
   <hi rend="sup">c</hi>
   Casslin
   <lb/>
   E. M. Stevens
   <lb />
   <cb n="2"/>
   O Ross Baker Clk Caldwell County Court
   <lb/>
   N. Jones
   <lb/>
   S. W. M
   <milestone unit="column" n="1"/>
   <pb/>
   <lb/>
   John Garrett
</p>

The desired result is below: one div per column, with a class that carries the n attribute of the preceding milestone:

<div class="column 2">
    M. Dudley<br />
    H. E. Ernshimer<br />
    M. M. Cash<br />
    John Wheatly<br />
    Jno W. Cash<br />
    ...
</div>
<div class="column 2">
    R. L. Wilson<br />
    R. B. Ratliff L.C.C.<br />
    G. D Watkins Clk<br />
    A. C. Mayes<br />
    Jas Crenshaw<br />
</div>

How can I grab everything between each pair of cb tags, and wrap the content in a containing div? Everything I've tried results in a series of nested divs.

2
What do you mean by 'The "cb" tags are not always siblings'? Are they not always direct siblings? - Stefan Hegny

2 Answers

0
votes

How can I grab everything between each pair of cb tags

I don't see that you have a pair of cb tags bracketing the contents of a column - only a leading cb element at the top.

IIUC, you want to do something like this:

XSLT 1.0

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- index every text node by the id of the nearest cb that precedes it among its siblings -->
<xsl:key name="txt-by-col" match="text()" use="generate-id(preceding-sibling::cb[1])" />

<xsl:template match="/">
    <root>
        <!-- one div per cb; the class carries the n of the nearest preceding milestone -->
        <xsl:for-each select="//cb">
            <div class="column {preceding::milestone[1]/@n}">
                <!-- emit each text node indexed under this cb, followed by a line break -->
                <xsl:for-each select="key('txt-by-col', generate-id())">
                    <xsl:value-of select="." />
                    <br/>
                </xsl:for-each>
            </div>
        </xsl:for-each>
    </root>
</xsl:template>

</xsl:stylesheet>

Note that this assumes all the text nodes of a column are siblings of the leading cb element.
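If that sibling assumption does not hold, one possible variation is to key on the preceding:: axis instead of preceding-sibling::. The following is only a sketch under several assumptions: the elements are in no namespace (as in the sample), lb should become br, and a column ends at the next cb or milestone, whichever comes first. The key name nodes-by-col and the mode name col are made up for illustration.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Index text nodes and lb elements, at any depth, under the nearest
       preceding cb. Nodes whose nearest preceding marker is a milestone
       rather than a cb are left out of the index. -->
  <xsl:key name="nodes-by-col"
           match="node()[self::text() or self::lb]
                        [preceding::*[self::cb or self::milestone][1][self::cb]]"
           use="generate-id(preceding::cb[1])"/>

  <xsl:template match="/">
    <!-- one div per cb, in document order -->
    <xsl:for-each select="//cb">
      <div class="column {preceding::milestone[1]/@n}">
        <xsl:apply-templates select="key('nodes-by-col', generate-id())" mode="col"/>
      </div>
    </xsl:for-each>
  </xsl:template>

  <!-- lb becomes a line break; text nodes fall through to the built-in
       template, which copies them as-is -->
  <xsl:template match="lb" mode="col">
    <br/>
  </xsl:template>
</xsl:stylesheet>
```

Two caveats with this sketch: inline markup such as hi is flattened to its text, and anything that sits between the end of one column and the next cb (for example the marginal note in the second p of the sample) is attributed to the preceding column, so it may need an extra filter for real documents.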

0
votes

I came up with a workable solution. It may not be elegant, but it works for my purposes. I'm posting it here in case it is useful to someone else in the future.

<!-- add a white space in empty milestone so it doesn't wrap around other elements -->
<xsl:template match="tei:milestone">
  <milestone n="{@n}">
    <xsl:text> </xsl:text>
  </milestone>
</xsl:template>

<!-- add a white space in empty cb so it doesn't wrap around other elements -->
<xsl:template match="tei:cb">
  <cb n="{@n}">
    <xsl:text> </xsl:text>
  </cb>
</xsl:template>

<!-- wrap content following cb elements in a div, with a class indicating the number of columns in the preceding milestone n attribute (if milestone n=2, then div class=column1of2 or div class=column2of2) -->
<xsl:template match="tei:p[tei:cb]">
    <!-- to print text before the first milestone -->
    <xsl:apply-templates select="node()[not(preceding::tei:milestone)]" />
    <xsl:for-each select="tei:cb">
      <xsl:variable name="count" select="position()" />
      <div>
         <xsl:variable name="numberofcolumns" select="preceding::tei:milestone[1]/@n" />
         <xsl:variable name="n" select="@n" />
         <xsl:attribute name="class">
           <xsl:text>column</xsl:text>
           <xsl:value-of select="$n" />
           <xsl:text>of</xsl:text>
           <xsl:value-of select="$numberofcolumns" />
         </xsl:attribute>
         <xsl:apply-templates select="following-sibling::node()[preceding-sibling::tei:cb[1][@n=$n] and count(preceding-sibling::tei:cb)=$count and preceding::tei:milestone[1][@n>1] and not(self::tei:milestone)]" />
       </div>
     </xsl:for-each>
 </xsl:template>

This outputs:

<milestone n="2"> </milestone>
<div class="column1of2">
</div>
<div class="column2of2">
</div>
<div class="column1of2">
</div>
<div class="column2of2">
</div>

Now that I have seen the answer from @michael.hor257k, I will simplify this code using his approach.
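For anyone curious what such a simplification might look like, here is a rough sketch rather than a finished stylesheet. It assumes the documents use the standard TEI namespace (http://www.tei-c.org/ns/1.0), that the cb elements of a paragraph are direct children of that p, and that a milestone with n="1" ends the column region. The key name nodes-by-cb and the lb, hi and pb templates are illustrative additions, not part of the code above.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">
  <xsl:output method="html" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Index each child node of a p under the nearest preceding sibling cb,
       skipping the markers themselves and anything that follows a
       single-column milestone -->
  <xsl:key name="nodes-by-cb"
           match="tei:p/node()[not(self::tei:cb or self::tei:milestone)]
                              [preceding::tei:milestone[1]/@n &gt; 1]"
           use="generate-id(preceding-sibling::tei:cb[1])"/>

  <xsl:template match="tei:p[tei:cb]">
    <!-- content before the first cb of this paragraph -->
    <xsl:apply-templates
        select="tei:cb[1]/preceding-sibling::node()[not(self::tei:milestone)]"/>
    <!-- one div per column -->
    <xsl:for-each select="tei:cb">
      <div class="column{@n}of{preceding::tei:milestone[1]/@n}">
        <xsl:apply-templates select="key('nodes-by-cb', generate-id())"/>
      </div>
    </xsl:for-each>
    <!-- content after a milestone that switches back to one column -->
    <xsl:apply-templates
        select="tei:cb[last()]/following-sibling::node()
                [not(self::tei:milestone)]
                [preceding::tei:milestone[1]/@n = 1]"/>
  </xsl:template>

  <xsl:template match="tei:lb"><br/></xsl:template>
  <xsl:template match="tei:hi[@rend='sup']"><sup><xsl:apply-templates/></sup></xsl:template>
  <xsl:template match="tei:pb"/>
</xsl:stylesheet>
```

Because the key does the grouping, the long predicate on following-sibling::node() and the position counter are no longer needed. A cb with no preceding milestone would produce an empty div here, so that case would need separate handling.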