Here is my requirement. My sample input document looks like this (I have added blank lines for clarity):

<body>
       <p name="h-title" other="main">Introduction</p>
       <p name="h-titledesc " other="other-desc">XSLT and XQuery</p>


       <p name=""> XSLT is used to write stylesheets.</p>
    <p name="section-title" other=" other-section">XSLT</p>
    <p name="section-desc" other=" other-sectionsdesc">XSLT</p>
    <p name=""> Some text.</p>
    <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>
    <p name="h1-title" other=" other-h1">XSLT</p>
    <p name=""> Some text.</p>
       <p name="h2-title " name="other-h2">XQuery</p>
       <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>
       <p name="h3-title" name="other-h3">XQuery and stylesheets</p>
       <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>


    <p name="section-title" other=" other-section">XSLT</p>
    <p name="section-desc" other=" other-sectionsdesc">XSLT</p>
    <p name=""> Some text.</p>
    <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>
    <p name="h1-title" other=" other-h1">XSLT</p>
    <p name=""> Some text.</p>
       <p name="h2-title " name="other-h2">XQuery</p>
       <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>
       <p name="h3-title" name="other-h3">XQuery and stylesheets</p>
       <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>


       <p name ="summary-title">this is summary</p>
       <p name="summary-desc " other="other-summarydesc">the summary</p>
    </body>

My desired output is this:

<body>
       <p name="h-title" other="main">Introduction</p>
       <p name="h-titledesc " other="other-desc">XSLT and XQuery</p>


       <p name=""> XSLT is used to write stylesheets.</p>

    <body-contents>
        <p name="section-title" other=" other-section">XSLT</p>
        <p name="section-desc" other=" other-sectionsdesc">XSLT</p>
        <p name=""> Some text.</p>
        <p name="">
              <p1 name="bold"> XQuery is used to query XML databases.</p1>
           </p>
        <h1>
        <p name="h1-title" other=" other-h1">XSLT</p>
        <p name=""> Some text.</p>
        <h2>
           <p name="h2-title " name="other-h2">XQuery</p>
           <p name="">
              <p1 name="bold"> XQuery is used to query XML databases.</p1>
           </p>
        <h3>
           <p name="h3-title" name="other-h3">XQuery and stylesheets</p>
           <p name="">
              <p1 name="bold"> XQuery is used to query XML databases.</p1>
           </p>
        </h3>
    </h2>
    </h1>
    </body-contents>
    <body-contents>
        <p name="section-title" other=" other-section">XSLT</p>
        <p name="section-desc" other=" other-sectionsdesc">XSLT</p>
        <p name=""> Some text.</p>
        <p name="">
              <p1 name="bold"> XQuery is used to query XML databases.</p1>
           </p>
        <h1>
        <p name="h1-title" other=" other-h1">XSLT</p>
        <p name=""> Some text.</p>
        <h2>
           <p name="h2-title " name="other-h2">XQuery</p>
           <p name="">
              <p1 name="bold"> XQuery is used to query XML databases.</p1>
           </p>
        <h3>
           <p name="h3-title" name="other-h3">XQuery and stylesheets</p>
           <p name="">
              <p1 name="bold"> XQuery is used to query XML databases.</p1>
           </p>
        </h3>
    </h2>
    </h1>

    </body-contents>
    <body-contents>
           <p name ="summary-title">this is summary</p>
           <p name="summary-desc " other="other-summarydesc">the summary</p>
    </body-contents>
    </body>

Please help me solve this problem.

{OPTIONAL There are restrictions:

  • h1, h2, h3 come sequentially (that is, h3 does not come between h1 and h2)
  • lines with name="section-title" should come before name="section-desc"
  • h1, h2, h3, etc. should come after section-desc.

I solved the problem of h1, h2, h3, etc. here. I know this is very hard. Any help is appreciated.

The transformation should not happen if these rules are violated. }

I think it would be easier if you laid out the rules and "restrictions" in a more complete and organized way. What should happen if the input violates the restrictions? More importantly, try to make the transformation rules explicit instead of asking us to infer them, which would result in varying interpretations. - LarsH
Thank you @LarsH for pointing that out. I have organized it. There are 3 restrictions, as I have put in the question. - Setinger
Thank you for the clearer organization of the restrictions. I fixed the bullet formatting. What I was trying to say was, can you state the transformation rules? E.g. there seems to be a rule that every time we have a <p name="section-title">, we should wrap it and the following <p> elements in a <body-contents> element (up to the next section-title <p>). What other rules are there? Also, you mention that you have solved the problem of h1, h2, h3... so what part have you not solved? More in following comment... - LarsH
Setinger, this long thread reflects the fact that the question is difficult to understand. With all due respect, I would recommend that you pay extra attention to writing simple, well-defined and understandable questions, or people may stop reading them altogether. - Dimitre Novatchev
I am sorry, Dimitre. I will be clearer in my next questions. Luckily Martin gave an answer to this. - Setinger

1 Answer

Here is my adaptation of the previously posted stylesheet. It simply does an additional for-each-group with group-starting-with before calling the recursive function that groups the levels. I realize that this is much the same suggestion as in an earlier post you made, but so far it is not clear to me why that suggestion does not work for you.

So here is the stylesheet:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.com/mf"
  exclude-result-prefixes="xs mf">

<xsl:param name="prefix" as="xs:string" select="'h'"/>
<xsl:param name="suffix" as="xs:string" select="'-title'"/>

<xsl:output method="xml" version="1.0" indent="yes"/>

<xsl:function name="mf:group" as="node()*">
  <xsl:param name="items" as="node()*"/>
  <xsl:param name="level" as="xs:integer"/>
  <xsl:for-each-group select="$items" group-starting-with="p[@name = concat($prefix, $level, $suffix)]">
    <xsl:choose>
      <xsl:when test="not(self::p[@name = concat($prefix, $level, $suffix)])">
        <xsl:apply-templates select="current-group()"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:element name="h{$level}">
          <xsl:apply-templates select="."/>
          <xsl:sequence select="mf:group(current-group() except ., $level + 1)"/>
        </xsl:element>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each-group>
</xsl:function>

<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="body">
  <xsl:copy>
    <xsl:for-each-group select="*" group-starting-with="p[@name = 'section-title' or @name = 'summary-title']">
      <xsl:choose>
        <xsl:when test="not(self::p[@name = 'section-title' or @name = 'summary-title'])">
          <xsl:apply-templates select="current-group()"/>
        </xsl:when>
        <xsl:otherwise>
          <body-contents>
             <xsl:sequence select="mf:group(current-group(), 1)"/>
          </body-contents>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

When I apply that stylesheet with Saxon 9.4 to the corrected input

<body>
       <p name="h-title" other="main">Introduction</p>
       <p name="h-titledesc " other="other-desc">XSLT and XQuery</p>


       <p name=""> XSLT is used to write stylesheets.</p>
    <p name="section-title" other=" other-section">XSLT</p>
    <p name="section-desc" other=" other-sectionsdesc">XSLT</p>
    <p name=""> Some text.</p>
    <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>
    <p name="h1-title" other=" other-h1">XSLT</p>
    <p name=""> Some text.</p>
       <p name="h2-title" other="other-h2">XQuery</p>
       <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>
       <p name="h3-title" other="other-h3">XQuery and stylesheets</p>
       <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>


    <p name="section-title" other=" other-section">XSLT</p>
    <p name="section-desc" other=" other-sectionsdesc">XSLT</p>
    <p name=""> Some text.</p>
    <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>
    <p name="h1-title" other=" other-h1">XSLT</p>
    <p name=""> Some text.</p>
       <p name="h2-title" other="other-h2">XQuery</p>
       <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>
       <p name="h3-title" other="other-h3">XQuery and stylesheets</p>
       <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>


       <p name ="summary-title">this is summary</p>
       <p name="summary-desc " other="other-summarydesc">the summary</p>
    </body>

I get the output

<body>
   <p name="h-title" other="main">Introduction</p>
   <p name="h-titledesc " other="other-desc">XSLT and XQuery</p>
   <p name=""> XSLT is used to write stylesheets.</p>
   <body-contents>
      <p name="section-title" other=" other-section">XSLT</p>
      <p name="section-desc" other=" other-sectionsdesc">XSLT</p>
      <p name=""> Some text.</p>
      <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>
      <h1>
         <p name="h1-title" other=" other-h1">XSLT</p>
         <p name=""> Some text.</p>
         <h2>
            <p name="h2-title" other="other-h2">XQuery</p>
            <p name="">
               <p1 name="bold"> XQuery is used to query XML databases.</p1>
            </p>
            <h3>
               <p name="h3-title" other="other-h3">XQuery and stylesheets</p>
               <p name="">
                  <p1 name="bold"> XQuery is used to query XML databases.</p1>
               </p>
            </h3>
         </h2>
      </h1>
   </body-contents>
   <body-contents>
      <p name="section-title" other=" other-section">XSLT</p>
      <p name="section-desc" other=" other-sectionsdesc">XSLT</p>
      <p name=""> Some text.</p>
      <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
       </p>
      <h1>
         <p name="h1-title" other=" other-h1">XSLT</p>
         <p name=""> Some text.</p>
         <h2>
            <p name="h2-title" other="other-h2">XQuery</p>
            <p name="">
               <p1 name="bold"> XQuery is used to query XML databases.</p1>
            </p>
            <h3>
               <p name="h3-title" other="other-h3">XQuery and stylesheets</p>
               <p name="">
                  <p1 name="bold"> XQuery is used to query XML databases.</p1>
               </p>
            </h3>
         </h2>
      </h1>
   </body-contents>
   <body-contents>
      <p name="summary-title">this is summary</p>
      <p name="summary-desc " other="other-summarydesc">the summary</p>
   </body-contents>
</body>

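Next time you provide an input sample, please make sure it is well-formed. So far you have always posted markup like <p name="h2-title " name="other-h2">XQuery</p>, and an element cannot have two attributes with the same name. The corrected input above also has no trailing space in name="h2-title"; the stylesheet compares @name against the exact string, so "h2-title " would not match.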
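To show what the additional group-starting-with step contributes by itself, here is a minimal sketch that is separate from the full stylesheet above. It only does the outer grouping: every p named section-title or summary-title starts a new group, and each such group is wrapped in body-contents. The nested h1/h2/h3 grouping done by mf:group is deliberately left out, so this is an illustration of the technique under that simplifying assumption, not a replacement for the stylesheet above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

<!-- identity template: copy everything that is not handled elsewhere -->
<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="body">
  <xsl:copy>
    <!-- a new group starts at every p that opens a section or the summary -->
    <xsl:for-each-group select="*" group-starting-with="p[@name = 'section-title' or @name = 'summary-title']">
      <xsl:choose>
        <!-- inside for-each-group the context item is the first item of the group;
             if it is not a starter p, this is the leading group before the
             first section, which is copied without a wrapper -->
        <xsl:when test="not(self::p[@name = 'section-title' or @name = 'summary-title'])">
          <xsl:apply-templates select="current-group()"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- every other group becomes one body-contents element -->
          <body-contents>
            <xsl:apply-templates select="current-group()"/>
          </body-contents>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

With the corrected input, this should leave the three introductory p elements unwrapped and produce three body-contents elements (two sections and the summary), with the h1-title, h2-title and h3-title paragraphs still flat inside them. Replacing the apply-templates call in the xsl:otherwise branch with the call to mf:group(current-group(), 1) is what adds the nested levels.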