This is the same problem as a question asked here in 2012 that was never answered:

Using regex to remove empty paragraph tags <p> </p> (standard str_replace on "space" not working)

When I press Enter in TinyMCE, it appears to insert empty paragraph tags like this:

<p> </p>

I wanted to remove them before saving the data to a MySQL table.

So I tried a simple fix:

$post_content = str_replace('<p> </p>', '', $content_from_mce);

And also:

$post_content = str_replace('<p>&nbsp;</p>', '', $content_from_mce);

However, neither works (i.e. they don't replace the apparently empty paragraph tags).

If I do this:

$foo = utf8_encode($post_content);

And then check $foo, it shows as:

<p>Â </p>

So really it's not an empty paragraph tag, but I can't work out how to delete these blocks of text.

I've also tried these versions (one at a time, in separate runs):

$post_content = str_replace('<p>Â </p>','',$post_content);
$post_content = preg_replace('~<p>\s*<\/p>~i','',$post_content);
$post_content = preg_replace('#<p>&nbsp;</p>#i','<p></p>', $post_content);
$post_content=str_replace("/<p> <\/p>/","",$post_content);

But none of them work.

1 Answer

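I figured it out. I'm using HTMLPurifier to make sure the content posted from TinyMCE is safe.

After $post_content has been through HTMLPurifier, it contains that funny character between the paragraph tags. It appears that HTMLPurifier converts the &nbsp; entity into a literal non-breaking space character, so the string '<p>&nbsp;</p>' is no longer there to match.

Therefore, if I do the replace before putting $post_content through HTMLPurifier, it works okay: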

$config = HTMLPurifier_Config::createDefault();
$purifier = new HTMLPurifier($config);

// get contents of "post_content" field
$post_content = $_POST['post_content'];

// remove blank paragraph lines
$post_content = str_replace('<p>&nbsp;</p>','',$post_content);

// now put $post_content through HTMLPurifier 
$post_content = $purifier->purify($post_content);
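
If the cleanup has to happen after purification instead, one possible approach is a Unicode-aware regular expression that treats the literal non-breaking space (U+00A0), the &nbsp; entity, and ordinary whitespace as "empty". This is only a sketch, assuming the content is valid UTF-8 and the paragraph tags carry no attributes. The function name remove_empty_paragraphs is made up for illustration and is not part of HTMLPurifier or TinyMCE.

```php
<?php
// Hypothetical helper (not from the original post): strips <p> elements
// whose only content is whitespace, a literal U+00A0, or &nbsp; entities.
function remove_empty_paragraphs(string $html): string
{
    // The "u" modifier makes PCRE treat the pattern and subject as UTF-8,
    // so \x{00A0} matches the two-byte non-breaking space (0xC2 0xA0).
    $result = preg_replace('~<p>(?:\s|\x{00A0}|&nbsp;)*</p>~iu', '', $html);

    // preg_replace() returns null on failure (e.g. invalid UTF-8 input);
    // in that case fall back to the unmodified string.
    return $result ?? $html;
}
```

Called as $post_content = remove_empty_paragraphs($post_content); it can run either before or after $purifier->purify(), because it covers both the entity and the decoded character.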