My scenario: we have to check for a div tag inside the CDATA of the body element. If a div is present, we have to insert the text from node2 into that div.

This is my input XML:

<?xml version="1.0" encoding="utf-8"?>
<root>
<node1>abc</node1>
<node2> needs to replace inside cdata div</node2>
<body> <![CDATA[
            <p>some text some textabcabcabcabc</p>

               <div class="marginBottom_4px">
               </div>
            <p>some text some textabcabc</P>

            ]]>
</body>
</root>

The output XML would be:

<?xml version="1.0" encoding="utf-8"?>
<div class="marginBottom_10px">
abc
</div>
<div class="marginBottom_5px">

  <p>some text some textabcabcabcabc</p>

  <div class="marginBottom_4px">

   needs to replace inside cdata div
  </div>
  <p>some text some textabcabc</P>
</div>

My transform is:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
    <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <xsl:value-of disable-output-escaping="yes" select ="$firstnode"/>
    <xsl:text disable-output-escaping="yes"><![CDATA[   <div class="marginBottom_5px">
   ]]>
  </xsl:text>
    <xsl:value-of disable-output-escaping ="yes" select="root/body"/>
    <xsl:text disable-output-escaping="yes"><![CDATA[
      </div>
     ]]>
    </xsl:text>
  </xsl:template>

  <xsl:variable name="firstnode">
    <xsl:text disable-output-escaping="yes"><![CDATA[
       <div class="marginBottom_10px">
     ]]>
     </xsl:text>
    <xsl:value-of disable-output-escaping ="yes" select="root/node1"/>
    <xsl:text disable-output-escaping="yes"><![CDATA[
      </div>
     ]]>
   </xsl:text>
  </xsl:variable>
</xsl:stylesheet>

I am able to produce the output, but my real XML is much more complex, like the one below:

<?xml version="1.0" encoding="utf-8" ?>
<ComplexXML>
  <environment>
   couple of nodes..
 </environment>
  <document>
    nodes
  </document>

<element cd="dsjdhfjk"  input="abc.xml" mode="" >
   <cd position="1">
     <attributes>
        <type>dummy text</type>
        <title>dummy text</title>
     </attributes>
  <content>
   <node2>
        <![CDATA[
          needs to replace inside cdata div
          ]]>
    </node2>
     <body>
        <![CDATA[
          <p>Lorem Ipsum is simply dummy text of the printing and typesetting industry. 
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s,
 when an unknown printer took a galley of type and scrambled it to make a type 
specimen book </p>

             <div class="marginBottom_4px">
               </div>
               <p>Lorem Ipsum is simply dummy text of her including versions of Lorem Ipsum. </p>
           ]]>
      </body>
      <abt >
        <![CDATA[ 

               text from  abt node
               ]]>
      </abt>
    </content>
   </cd>
  </element>
</ComplexXML>

In the above XML I have to check the abt node. If the abt node contains data, the output should look like this:

 <?xml version="1.0" encoding="UTF-8"?>
      <div>
          text from  abt node

        <div class="marginBottom_5px"> 
                <p>Lorem Ipsum is simply dummy text of the printing and typesetting industry. 
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s,
 when an unknown printer took a galley of type and scrambled it to make a type 
specimen book </p>

                   **<div class="marginBottom_4px">
                    </div>** I need to remove this div tag and place the node2 content here.

     <p>Lorem Ipsum is simply dummy text of her including versions of Lorem Ipsum. </p>

        </div>
</div>

Sorry to bother you. I am very new to XSLT and still learning. Can you please guide me?

Why are you trying so hard to produce CDATA? - Mads Hansen
It is not clear how the rest of the content of <body> should be processed. Do you want to preserve the content of both <p> elements, transform the text, etc.? Please clarify or update the desired output. - Mads Hansen
The rest of the body node should be processed; I forgot to include it in the output. The content of both <p> elements should stay the same. - Blossom
You probably do realize that the contents of a CDATA section are just text and not XML? What you need is an XML parser -- AFAIK there is no XML parser written in pure XSLT. Either you use an extension function, or you wait until XSLT 3.0, where there will be a parse-xml() function. - Dimitre Novatchev

1 Answer

The following XSLT 1.0 stylesheet produces what I believe is the desired output, based on the example output and the comments. It relies on the text inside the CDATA section of the input document being well-formed, and it uses disable-output-escaping, an optional serialization feature that not every processor supports, to write that text out as markup:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
    >
    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="node1">
        <div class="marginBottom_10px">
            <xsl:apply-templates/>
        </div>
    </xsl:template>    

   <xsl:template match="node2" />     

   <xsl:template match="body">
       <div class="marginBottom_5px">
           <xsl:apply-templates/>
       </div>
   </xsl:template>

    <xsl:template match="body/text()[contains(.,'&lt;div') and contains(.,'&lt;/div>')]">
      <xsl:value-of 
          disable-output-escaping="yes" 
          select="substring-before(.,'&lt;/div')" />
      <xsl:value-of select="../../node2"/>  
      <xsl:value-of 
          disable-output-escaping="yes"
          select="substring-after(.,substring-before(.,'&lt;/div'))" /> 
   </xsl:template>
</xsl:stylesheet>

When applied to the first example input, it produces:

<?xml version="1.0" encoding="UTF-8"?>
    <div class="marginBottom_10px">abc</div>

    <div class="marginBottom_5px"> 
            <p>some text some textabcabcabcabc</p>

               <div class="marginBottom_4px">
                needs to replace inside cdata div</div>
            <p>some text some textabcabc</P>


    </div>

Note: the output is not well-formed. There is no document element. You would likely want to create a template for the <root> to create a <div> or other containing element.
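
One way to supply that document element is a template for <root> added to the stylesheet above. This is only a sketch: the unclassed outer <div> is an assumed choice of wrapper, not something the question specifies.

```xml
<!-- Assumed addition to the stylesheet above: wrap everything generated
     for the children of <root> in a single <div>, so that the result
     has exactly one document element. -->
<xsl:template match="root">
    <div>
        <xsl:apply-templates/>
    </div>
</xsl:template>
```

With this template in place, the two generated <div> elements become children of the outer <div>. The result as a whole is still only well-formed if the text copied out of the CDATA section is well-formed too.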


This version handles the other input format and generates what I believe the desired output is:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
    >
    <xsl:output method="xml" indent="yes"/>

    <!--rely on built-in templates, 
        which apply-templates to child nodes of elements
        and get value-of for text() -->

    <xsl:template match="content">
        <xsl:choose>
            <!-- if <abt> has a value, do the following -->
            <xsl:when test="normalize-space(abt)">
                <div>
                    <!-- apply templates to <abt>, 
                        built-in template will copy text to output-->
                    <xsl:apply-templates select="abt"/> 
                    <!-- apply templates to <body>, template defined below will handle it -->
                    <xsl:apply-templates select="body"/>
                </div>
            </xsl:when>
            <xsl:otherwise>
                <!--process child nodes -->
                <xsl:apply-templates />
            </xsl:otherwise>
        </xsl:choose>

    </xsl:template>

    <xsl:template match="node1">
        <div class="marginBottom_10px">
            <xsl:apply-templates/>
        </div>
    </xsl:template>    

    <!--empty template ensures that no content produced when templates applied to <node2>-->
    <xsl:template match="node2" />     

    <xsl:template match="body">
        <div class="marginBottom_5px">
            <xsl:apply-templates/>
        </div>
    </xsl:template>

    <!--Template for handling body/text() when <abt> does not have a value-->
    <xsl:template match="*[not(normalize-space(abt))]/body/text()[contains(.,'&lt;div') and contains(.,'&lt;/div>')]">
        <!--get the value of content preceding "</div"-->
        <xsl:value-of 
            disable-output-escaping="yes" 
            select="substring-before(.,'&lt;/div')" />
        <!--get the value of <node2> -->
        <xsl:value-of select="../../node2"/>  
        <!--get the value of content starting at "</div" -->
        <xsl:value-of 
            disable-output-escaping="yes"
            select="substring-after(.,substring-before(.,'&lt;/div'))" /> 
    </xsl:template>

    <!--Template for handling body/text() when <abt> does have a value -->
    <xsl:template match="*[normalize-space(abt)]/body/text()[contains(.,'&lt;div') and contains(.,'&lt;/div>')]">
        <!--get the value preceding "<div" -->
        <xsl:value-of 
            disable-output-escaping="yes" 
            select="substring-before(.,'&lt;div')" />
        <!--output the value of <node2> in place of the removed <div> -->
        <xsl:value-of select="../../node2"/>
        <!--get the value following "</div>" -->
        <xsl:value-of 
            disable-output-escaping="yes"
            select="substring-after(.,'&lt;/div>')" /> 
    </xsl:template>
</xsl:stylesheet>
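
If an XSLT 3.0 processor is available, the parser mentioned in the comments can replace the string slicing. The following is a minimal sketch, not a drop-in replacement: it assumes the second input format, uses the standard parse-xml-fragment() function to turn the CDATA text into real nodes, and only covers the "insert node2 into the div" case. The mode name "fragment" and the parameter name "replacement" are made up for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <!-- In the (assumed) "fragment" mode, copy every node unless a more
         specific template matches it. -->
    <xsl:mode name="fragment" on-no-match="shallow-copy"/>

    <!-- Start at <content> so that text from <environment>, <document>
         and <attributes> never reaches the output. -->
    <xsl:template match="/">
        <xsl:apply-templates select="//content"/>
    </xsl:template>

    <xsl:template match="content">
        <div class="marginBottom_5px">
            <!-- Parse the CDATA text of <body> into nodes and process them;
                 the node2 text travels along as a tunnel parameter. -->
            <xsl:apply-templates select="parse-xml-fragment(body)" mode="fragment">
                <xsl:with-param name="replacement"
                                select="normalize-space(node2)" tunnel="yes"/>
            </xsl:apply-templates>
        </div>
    </xsl:template>

    <!-- A <div> found inside the parsed text keeps its attributes and
         receives the node2 text as its content. -->
    <xsl:template match="div" mode="fragment">
        <xsl:param name="replacement" tunnel="yes"/>
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:value-of select="$replacement"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

Because the text is really parsed, no disable-output-escaping is needed, but the CDATA content must be a well-formed fragment. The mismatched <p> ... </P> pair in the first sample input would make parse-xml-fragment() raise an error.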