7
votes

As far as I can tell, TinyMCE does its own escaping of metacharacters, so running htmlspecialchars() on its output only clutters the page: tags such as <p> are displayed as literal text instead of being rendered by the browser. However, it's easy to turn off JavaScript and submit malicious code, which will then run when another user with JavaScript turned on views the content.

So I need proper server-side validation, but how exactly can I do this, considering the thousands of XSS techniques out there? Is there an efficient approach that works with TinyMCE, such as using htmlspecialchars() together with TinyMCE?

So far I've made a whitelist of allowed HTML tags and replaced any javascript: and similar :void strings within the content, to try to protect against inline JavaScript such as onClick="javascript:void(alert("XSS"));", but I feel this is not enough.
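To illustrate why replacing javascript: strings feels insufficient, here is a minimal sketch in JavaScript. The function names naiveStrip and hasAllowedScheme, the sample payloads, and the list of allowed schemes are all hypothetical and only meant to show the idea: a blacklist replacement can be sidestepped, whereas checking a parsed URL against a short list of allowed schemes is harder to fool.

```javascript
// Hypothetical blacklist filter: removes every literal "javascript:" in one pass.
function naiveStrip(input) {
  return input.replace(/javascript:/gi, "");
}

// Nested payload: deleting the inner match reassembles the scheme.
const nested = "jajavascript:vascript:alert(1)";
// Entity-encoded payload: contains no literal "javascript:", but a browser
// decodes &#x61; to "a" when it reads the attribute value.
const encoded = "jav&#x61;script:alert(1)";

console.log(naiveStrip(nested));  // "javascript:alert(1)"
console.log(naiveStrip(encoded)); // unchanged, still dangerous once decoded

// Allowlist alternative (assumed scheme list): parse the already-decoded
// attribute value and accept it only if its scheme is explicitly permitted.
const ALLOWED_SCHEMES = new Set(["http:", "https:", "mailto:"]);

function hasAllowedScheme(value) {
  try {
    // The base URL lets relative links parse; example.invalid is a placeholder.
    const url = new URL(value, "https://example.invalid/");
    return ALLOWED_SCHEMES.has(url.protocol);
  } catch (e) {
    return false; // unparseable values are rejected
  }
}

console.log(hasAllowedScheme("javascript:alert(1)")); // false
console.log(hasAllowedScheme("/img/photo.png"));      // true
```

The check assumes the value has already been extracted and entity-decoded by a real HTML parser; applying it to raw markup would miss the encoded case shown above.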

Any advice on the subject would be much appreciated, but remember that certain content needs to be displayed properly in the output; this is why I use TinyMCE in the first place. I only need protection against XSS.

Also, while on the subject: how can I protect myself against CSS-based XSS such as style="background-image: url(XSS here);"?
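One way to approach the style question is to treat inline CSS the same way as tags: keep only declarations that appear on an allowlist and whose values use a very restricted character set. The sketch below is in JavaScript; the function name filterStyle and the property list are assumptions for illustration, not part of TinyMCE or any library.

```javascript
// Assumed allowlist of CSS properties; adjust it to what the editor emits.
const ALLOWED_PROPS = new Set([
  "color", "text-align", "font-weight", "font-style", "text-decoration",
]);

// Values may contain only letters, digits, '#', '%', '.', ',', spaces and '-'.
// Parentheses, quotes, colons and backslashes are excluded, which rules out
// url(...), expression(...) and CSS escape sequences.
const SAFE_VALUE = /^[a-z0-9#%.,\s-]+$/i;

function filterStyle(style) {
  return style
    .split(";")
    .map((decl) => {
      const i = decl.indexOf(":");
      if (i === -1) return null; // not a "property: value" pair
      return {
        prop: decl.slice(0, i).trim().toLowerCase(),
        value: decl.slice(i + 1).trim(),
      };
    })
    .filter((d) => d && ALLOWED_PROPS.has(d.prop) && SAFE_VALUE.test(d.value))
    .map((d) => `${d.prop}: ${d.value}`)
    .join("; ");
}

console.log(filterStyle("color: red; background-image: url(javascript:alert(1))"));
// "color: red"
```

As with URL attributes, this assumes the style value was pulled out of the markup by a proper HTML parser and decoded first; anything the filter does not recognize is dropped rather than repaired.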


2 Answers

2
votes

HTMLPurifier is one solution for PHP: http://hp.jpsband.org/

0
votes

For .NET: http://msdn.microsoft.com/en-us/security/aa973814.aspx

I also fight fire with fire by using:

$(".userpost").children().off();

This removes the jQuery event handlers bound to the children of .userpost elements, which prevents users from exploiting your existing JavaScript. One of the biggest annoyances with Microsoft's library is that it adds "x_" in front of all class names. That's fine until someone edits an existing entry and it adds another x_ in front. I just get rid of the x_ altogether with a regex, since the above code makes the "x_" prefix pointless.

This removes the "x_" prefix from up to 3 classes in VB.NET:

Regex.Replace(myHtml, "(<\w+\b[^>]*?\b)(class="")x[_]([a-zA-Z]*)( )?(?:x[_])?([a-zA-Z]*)?( )?(?:x[_])?([^""]*"")", "$1$2$3$4$5$6$7")

I'm sure there's a cleaner way to do it.
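One possibly cleaner approach is to match the whole class attribute once and clean each class name inside a replacement callback, so the number of classes no longer matters. The sketch below is in JavaScript rather than VB.NET, and the function name stripClassPrefix is hypothetical; in .NET the same idea should carry over through the Regex.Replace overload that takes a MatchEvaluator.

```javascript
// Strips any leading "x_" prefixes (including repeated ones such as "x_x_")
// from every class name in double-quoted class attributes.
function stripClassPrefix(html) {
  const classAttr = /(<\w+\b[^>]*?\bclass=")([^"]*)(")/g;
  return html.replace(classAttr, (match, open, classes, close) => {
    const cleaned = classes
      .split(/\s+/)
      .map((name) => name.replace(/^(?:x_)+/, "")) // drop one or more prefixes
      .filter(Boolean)                              // discard names left empty
      .join(" ");
    return open + cleaned + close;
  });
}

console.log(stripClassPrefix('<p class="x_intro x_x_lead note">Hi</p>'));
// '<p class="intro lead note">Hi</p>'
```

Like the original pattern, this only handles double-quoted class attributes and is a text-level cleanup, not a substitute for the sanitizer itself.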