3
votes

This is a problem I've had for a long time: I currently accept a full HTML page from the user as input and want to filter and clean it. The problem with HTML Purifier is that it removes the html, head, and body tags, as well as the styles in the head. I've googled, looked at the forums, and tried implementing what was suggested there, with no luck. Can someone help?

What I want: to keep the HTML, HEAD, STYLE, and BODY tags.

What I have done:

    $config->set('HTML.DefinitionID', 'test');
    $config->set('HTML.DefinitionRev', 1);
    $config->set('HTML.AllowedElements', array('html', 'head', 'body', 'style', 'div', 'p'));

    if ($def = $config->maybeGetRawHTMLDefinition()) {
        $def->addElement('html', 'Block', 'Inline', 'Common', array());
        $def->addElement('head', 'Block', 'Inline', 'Common', array());
        $def->addElement('style', 'Block', 'Inline', 'Common', array());
        $def->addElement('body', 'Block', 'Inline', 'Common', array());
    }
3
You basically need to change the whitelist to allow more stuff. Have you read htmlpurifier.org/docs#toclink1? - deceze♦
The purifier strips out several things but you don't say what you want to strip with it and what you expect the outcome to be. Please clarify your question and show us what you have tried. - Jay Blanchard
Added; above is the current approach I used. - Nadi Hassan Hassan

3 Answers

0
votes

Why not use strip_tags? It supports a list of allowed tags.

http://www.php.net/manual/en/function.strip-tags.php
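
A minimal sketch of what that suggestion might look like, assuming the tag list from the question. The `$dirty` string is a made-up input. Note that strip_tags only removes tags that are not on the list: attributes on allowed tags (such as `onclick`) and the text inside removed tags are left in place, so on its own it is not a replacement for a sanitizer.

```php
<?php
// Hypothetical input standing in for a user-submitted page.
$dirty = '<html><head><style>p { color: red; }</style></head>'
       . '<body><p onclick="alert(1)">Hello <b>world</b></p></body></html>';

// Tags to keep, written as one string of opening tags.
$allowed = '<html><head><style><body><div><p>';

$clean = strip_tags($dirty, $allowed);

// <b> is removed (its text stays); the onclick attribute on <p> is NOT removed.
echo $clean, "\n";
```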

0
votes

You need to set:

$config->set('Core.ConvertDocumentToFragment', false);

For whatever reason, Core.ConvertDocumentToFragment defaults to true, even though the documentation states that "for most inputs, this processing is not necessary".

I was bitten by this too. All I got from the error collector was the cryptic message "Removed document metadata tags", which in turn is a translation from the internal message "Lexer: Extracted body".
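
A rough sketch of how this setting might be tried together with the error collector, so the removal messages can be inspected. The Composer autoload path and the `$dirty` input are assumptions; `Core.CollectErrors` is the directive that enables the collector. Whether html, head, and body actually survive still depends on the HTML definition in use, so no output is claimed here.

```php
<?php
// Assumption: HTML Purifier was installed with Composer. For a standalone
// install, require the library's HTMLPurifier.auto.php instead.
require_once __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();
// Stop the lexer from reducing a full document to the contents of <body>.
$config->set('Core.ConvertDocumentToFragment', false);
// Record what the purifier removes so it can be inspected afterwards.
$config->set('Core.CollectErrors', true);

$purifier = new HTMLPurifier($config);

// Hypothetical input standing in for the user's page.
$dirty = '<html><head><style>p { color: red; }</style></head>'
       . '<body><p onclick="evil()">Hello</p></body></html>';

$clean = $purifier->purify($dirty);
echo $clean, "\n";

// Print the collected messages as an HTML list.
echo $purifier->context->get('ErrorCollector')->getHTMLFormatted($config), "\n";
```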

0
votes

End result: HTMLPurifier does not natively allow full HTML document parsing. Either extend it or find a pass-through.
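
One possible shape for such a pass-through, sketched under stated assumptions: `purifyFullDocument` is a hypothetical helper (not part of HTML Purifier) that uses PHP's DOM extension to pull out the `<style>` text and the body fragment, cleans only the fragment with whatever callable it is given, and rebuilds a minimal document. The CSS is passed through essentially unfiltered, which would need its own check in real use. In practice the callable would be `[$purifier, 'purify']`; strip_tags stands in here only to keep the example self-contained.

```php
<?php
// Hypothetical helper: cleans only the <body> fragment and rebuilds a
// minimal document around it. $purifyFragment is any callable that
// sanitizes an HTML fragment (assumed to be supplied by the caller).
function purifyFullDocument(string $html, callable $purifyFragment): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate malformed markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Gather the text of every <style> element in the document.
    $css = '';
    foreach ($doc->getElementsByTagName('style') as $style) {
        $css .= $style->textContent . "\n";
    }
    // Crude guard so the CSS cannot close the <style> element early.
    // The CSS is otherwise NOT sanitized here.
    $css = str_ireplace('</style', '', $css);

    // Serialize the children of <body> back into an HTML fragment.
    $fragment = '';
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $fragment .= $doc->saveHTML($child);
        }
    }

    return "<html>\n<head>\n<style>\n" . $css . "</style>\n</head>\n<body>\n"
        . $purifyFragment($fragment) . "\n</body>\n</html>";
}

// Made-up page and a stand-in cleaner for demonstration only.
$page = '<html><head><style>p{color:red}</style></head>'
      . '<body><p>Hi <b>x</b></p></body></html>';
$result = purifyFullDocument($page, static function (string $fragment): string {
    return strip_tags($fragment, '<p><div>');
});
echo $result, "\n";
```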