3
votes

I've been updating a members page system, and one of the requirements is to allow bold, underline, italic, font colour and links on certain fields, but not font size or font style, all through a WYSIWYG editor. This was originally done with a textarea and some minimal HTML filtering, i.e. removing <script> with a preg_replace(). Crazy and definitely unsafe, I know.
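To illustrate why that kind of filtering is unsafe, here is a minimal sketch. The regex is an assumption (the original filter is not shown), and `naive_filter` is a hypothetical name.

```php
<?php
// Hypothetical stand-in for the old filter: strip <script>...</script> once.
function naive_filter(string $input): string
{
    return preg_replace('~<script.*?</script>~is', '', $input);
}

// Removing the inner pair splices the outer fragments into a new <script> tag.
$nested = '<scr<script></script>ipt>alert(1)</script>';
echo naive_filter($nested), "\n"; // prints <script>alert(1)</script>

// Event-handler attributes never contain "<script" and pass through untouched.
$handler = '<img src=x onerror=alert(1)>';
echo naive_filter($handler), "\n"; // prints <img src=x onerror=alert(1)>
```

Both inputs survive a single pass, which is why blacklisting individual tags tends to fail.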

My first revision was to use TinyMCE and disallow certain tags within TinyMCE. The only problem is that I obviously cannot rely on TinyMCE as any sort of validator, and I have discovered that securing HTML input against XSS is an absolute minefield. I've spent the last hour or so reading up on certain practices, and it seems it's going to be near-impossible to allow certain HTML tags/attributes without messing up the current profiles and, furthermore, without also allowing other customizations such as font size through inline styles. For example, I need to allow font colours with span tags, but allowing the style attribute will also allow any piece of CSS.

I have now dabbled with the idea of using BBCode with a WYSIWYG editor, as this would allow us to safely apply htmlspecialchars() to the output and then be fully in control of any HTML being generated, with a BBCode parser for [b], [u], [i], and [color] tags and nl2br() for line breaks.
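A rough sketch of that idea follows. The function name `render_bbcode`, the accepted colour formats and the [url] rule are my own assumptions, not an existing library.

```php
<?php
// Hypothetical helper: escape first, then emit only HTML we generate ourselves.
function render_bbcode(string $input): string
{
    // Escape everything so no user-supplied markup or quote survives.
    $html = htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    // Simple paired tags map to fixed HTML elements.
    $simple = ['b' => 'strong', 'i' => 'em', 'u' => 'u'];
    foreach ($simple as $bb => $tag) {
        $html = preg_replace(
            '~\[' . $bb . '\](.*?)\[/' . $bb . '\]~is',
            '<' . $tag . '>$1</' . $tag . '>',
            $html
        );
    }

    // [color=...] accepts only a hex value or a plain colour name (assumed policy).
    $html = preg_replace(
        '~\[color=(#[0-9a-f]{3}(?:[0-9a-f]{3})?|[a-z]{1,20})\](.*?)\[/color\]~is',
        '<span style="color:$1">$2</span>',
        $html
    );

    // [url=...] accepts only http/https, so javascript: URLs never match.
    $html = preg_replace(
        '~\[url=(https?://[^\s\[\]]+)\](.*?)\[/url\]~is',
        '<a href="$1" rel="nofollow">$2</a>',
        $html
    );

    return nl2br($html);
}

echo render_bbcode('[b]Hello[/b] <script>alert(1)</script>');
```

Anything that does not match a rule exactly stays as escaped text, so a value like [color=red;font-size:99px] is shown literally rather than turned into CSS. Badly nested tags can still produce badly nested (but not executable) HTML, which a real parser would need to handle.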

The only problem is I'll have to code something to convert the current HTML setup to BBCode.

My main question is: are the aforementioned steps with BBCode going to be enough to protect against XSS attacks? Or is there a more graceful/obvious method of HTML security I can use?

2 Answers

4
votes

You could use the HTML Purifier library. It's heavyweight, but it allows rules like "only allow color settings in style attributes". It's thoroughly tested and actively developed.
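A configuration along these lines might be a starting point. It assumes the library was installed with Composer (package ezyang/htmlpurifier); the tag list is only a guess at what the question needs, so check the directive names against the HTML Purifier configuration docs.

```php
<?php
// Assumption: HTML Purifier is available through Composer's autoloader.
require_once __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();

// Allow-list of elements and attributes (assumed set for this use case).
$config->set('HTML.Allowed', 'b,strong,i,em,u,br,p,a[href],span[style]');

// Inside style attributes, keep only the color property.
$config->set('CSS.AllowedProperties', ['color']);

// Restrict link schemes so javascript: URLs are rejected.
$config->set('URI.AllowedSchemes', ['http' => true, 'https' => true]);

$purifier = new HTMLPurifier($config);

$dirty = '<span style="color:red;font-size:40px" onclick="x()">hi</span>'
       . '<script>alert(1)</script>';
echo $purifier->purify($dirty);
```

With this setup the colour should be kept, while the font-size declaration, the onclick handler and the script element should be removed.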

1
votes

Store the content as HTML, and use a proper HTML parser (such as DOMDocument) to neutralise dangerous tags. For example, apply htmlspecialchars() to just the dangerous tags instead of blindly applying it to the entire content.
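Something like the following sketch, where the allow-list and the function names are assumptions. It keeps a few formatting tags, strips all of their attributes, and turns every other element into visible escaped text.

```php
<?php
// Hypothetical allow-list; adjust to the tags the profile fields need.
const ALLOWED_TAGS = ['b', 'strong', 'i', 'em', 'u', 'br', 'p'];

function clean_node(DOMDocument $doc, DOMNode $node): void
{
    // Copy the child list first because the loop mutates it.
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($child instanceof DOMElement) {
            if (!in_array(strtolower($child->nodeName), ALLOWED_TAGS, true)) {
                // Disallowed tag: swap in its markup as plain text,
                // which saveHTML() escapes when the result is serialised.
                $text = $doc->createTextNode($doc->saveHTML($child));
                $node->replaceChild($text, $child);
                continue;
            }
            // Allowed tag: drop every attribute (onclick, style, ...).
            while ($child->attributes->length > 0) {
                $child->removeAttributeNode($child->attributes->item(0));
            }
            clean_node($doc, $child);
        } elseif (!($child instanceof DOMText)) {
            $node->removeChild($child); // comments, processing instructions
        }
    }
}

function sanitize_html(string $input): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence parse warnings
    // The XML declaration is a common trick to make libxml read UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><html><body>' . $input . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    clean_node($doc, $body);

    // Serialise only what ended up inside <body>.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo sanitize_html('<b onclick="x()">hi</b> <script>alert(1)</script>');
```

Note that DOMDocument uses libxml's HTML 4 style parser, which does not always build the same tree a browser would, so treat this as an illustration of the approach rather than a vetted sanitizer.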