108
votes

I noticed that when using SimpleXMLElement on a document that contains those CDATA tags, the content is always NULL. How do I fix this?

Also, sorry for spamming about XML here. I have been trying to get an XML based script to work for several hours now...

<content><![CDATA[Hello, world!]]></content>

I tried the first hit on Google if you search for "SimpleXMLElement cdata", but that didn't work.

5
How are you trying to access the node value? And, is SimpleXML a requirement?allnightgrocery
I tried every other function (xml2array and all) that I could find on the web and SimpleXML seems to be the only one that gives GOOD results, except for the CDATA not working.Angelo
We do a lot of XML parsing at work using DOMDocument (php.net/manual/en/class.domdocument.php). It works just fine in handling CDATA. Give that a short or post a little more code for us to see how you're working with SimpleXML.allnightgrocery

5 Answers

193
votes

You're probably not accessing it correctly. You can output it directly or cast it as a string. (in this example, the casting is superfluous, as echo automatically does it anyway)

$content = simplexml_load_string(
    '<content><![CDATA[Hello, world!]]></content>'
);
echo (string) $content;

// or with parent element:

$foo = simplexml_load_string(
    '<foo><content><![CDATA[Hello, world!]]></content></foo>'
);
echo (string) $foo->content;

You might have better luck with LIBXML_NOCDATA:

$content = simplexml_load_string(
    '<content><![CDATA[Hello, world!]]></content>'
    , null
    , LIBXML_NOCDATA
);
57
votes

The LIBXML_NOCDATA is optional third parameter of simplexml_load_file() function. This returns the XML object with all the CDATA data converted into strings.

$xml = simplexml_load_file($this->filename, 'SimpleXMLElement', LIBXML_NOCDATA);
echo "<pre>";
print_r($xml);
echo "</pre>";


Fix CDATA in SimpleXML

14
votes

This did the trick for me:

echo trim($entry->title);
11
votes

This is working perfect for me.

$content = simplexml_load_string(
    $raw_xml
    , null
    , LIBXML_NOCDATA
);
1
votes

When to use LIBXML_NOCDATA ?

I add the issue when transforming XML to JSON.

$xml = simplexml_load_string("<foo><content><![CDATA[Hello, world!]]></content></foo>");
echo json_encode($xml, true); 
/* prints
   {
     "content": {}
   }
 */

When accessing the SimpleXMLElement object, It gets the CDATA :

$xml = simplexml_load_string("<foo><content><![CDATA[Hello, world!]]></content></foo>");
echo $xml->content; 
/* prints
   Hello, world!
*/

I makes sense to use LIBXML_NOCDATA because json_encode don't access the SimpleXMLElement to trigger the string casting feature, I'm guessing a __toString() equivalent.

$xml = simplexml_load_string("<foo><content><![CDATA[Hello, world!]]></content></foo>", null, LIBXML_NOCDATA);
echo json_encode($xml);
/*
 {
   "content": "Hello, world!"
 }
*/