Given an input XML document like this:

<?xml version="1.0" encoding="utf-8"?>
<root>
<title> This contains an 'embedded' HTML document </title>
<document>
<html>
<head><title>HTML DOC</title></head>
<body>
Hello World
</body>
</html>
</document>
</root>

How can I extract that 'inner' HTML document, render it as CDATA, and include it in my output document?

The output document will be an HTML document that contains a text box showing the elements as text (that is, it will display the 'source view' of the inner document).

I have tried this:

<xsl:template match="document">
<xsl:value-of select="*"/>
</xsl:template>

But this only renders the Text Nodes.

I have tried this:

<xsl:template match="document">
<![CDATA[
<xsl:value-of select="*"/>
]]>
</xsl:template>

But this escapes the actual XSLT and I get:

&lt;xsl:value-of select="*"/&gt;

I have tried this:

<xsl:output method="xml" indent="yes" cdata-section-elements="document"/>
[...]
<xsl:template match="document">
<document>
<xsl:value-of select="*"/>
</document>
</xsl:template>

This does insert a CDATA section, but the output still contains just text (stripped elements):

<?xml version="1.0" encoding="UTF-8"?>
<html>
   <head>
      <title>My doc</title>
   </head>
   <body>
      <h1>Title: This contains an 'embedded' HTML document </h1>
      <document><![CDATA[
                                                HTML DOC

                                                                Hello World

                                ]]></document>
   </body>
</html>
Can you show your expected output please? (Sean B. Durkin)

1 Answer

There are two confusions you need to clear up here.

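First, you probably want xsl:copy-of rather than xsl:value-of. The latter returns the string value of an element (the concatenation of its descendant text nodes), which is why your output contains only text; the former returns a copy of the element itself.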
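
To illustrate the difference, here is a minimal sketch of a stylesheet that uses xsl:copy-of in the document template. It assumes the input's outer element is named root (the sample shows only its closing tag) and it ignores the title. Note that this copies the inner HTML as real markup, not as text, so on its own it does not yet give a 'source view':

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: the outer element is <root>. Only <document> is processed;
       the title is skipped in this sketch. -->
  <xsl:template match="/root">
    <xsl:apply-templates select="document"/>
  </xsl:template>

  <xsl:template match="document">
    <document>
      <!-- copy-of keeps the element nodes; value-of would flatten them to text -->
      <xsl:copy-of select="*"/>
    </document>
  </xsl:template>

</xsl:stylesheet>
```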

Second, the cdata-section-elements attribute on xsl:output affects the serialization of text nodes, but not of elements and attributes. One way to get what you want would be to serialize the HTML yourself, along the lines of the following (not tested):

<xsl:template match="document//*">
  <xsl:value-of select="concat('&lt;', name())"/>
  <!--* attributes are left as an exercise for the reader ... *-->
  <xsl:text>&gt;</xsl:text>
  <xsl:apply-templates/>
  <xsl:value-of select="concat('&lt;/', name(), '>')"/>
</xsl:template>
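
For completeness, the hand-serialization idea could be fleshed out roughly as follows. This is an untested sketch under several assumptions: the outer element is named root, the HTML wrapper mirrors the output shown in the question, and the attribute template is my own guess at the "exercise" (it does not re-escape quotes, '<' or '&' in attribute values or text). The match pattern is written as document//* because XSLT 1.0 patterns permit only the child and attribute axes, plus the // operator.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Text nodes directly inside <document> are written as CDATA sections. -->
  <xsl:output method="xml" indent="yes" cdata-section-elements="document"/>

  <!-- Assumption: the outer element is <root>; wrapper modeled on the question. -->
  <xsl:template match="/root">
    <html>
      <head><title>My doc</title></head>
      <body>
        <h1>Title: <xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="document"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="document">
    <document>
      <xsl:apply-templates/>
    </document>
  </xsl:template>

  <!-- Write each embedded element as text: start tag, content, end tag. -->
  <xsl:template match="document//*">
    <xsl:value-of select="concat('&lt;', name())"/>
    <xsl:apply-templates select="@*"/>
    <xsl:text>&gt;</xsl:text>
    <xsl:apply-templates/>
    <xsl:value-of select="concat('&lt;/', name(), '&gt;')"/>
  </xsl:template>

  <!-- Write each attribute as name="value" (values are not re-escaped). -->
  <xsl:template match="document//@*">
    <xsl:value-of select="concat(' ', name(), '=&quot;', ., '&quot;')"/>
  </xsl:template>

</xsl:stylesheet>
```

Because every tag is emitted with xsl:value-of or xsl:text, the content of the output document element consists only of text nodes, which is what cdata-section-elements acts on.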

But the quicker way would be something like the following solution (squeamish readers, stop reading now), pointed out to me by my friend Tommie Usdin. Drop the cdata-section-elements attribute from xsl:output and replace your template for the document element with:

<xsl:template match="document">
  <document>
    <xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
    <xsl:copy-of select="./html"/>
    <xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
  </document>
</xsl:template>