I'm currently working on an email template building website in PHP (LAMP to be specific) that allows users to paste in their HTML email code and then send it off to their customers.
Obviously with handling this kind of data I need to implement some kind of XSS security. I've scowled the net for weeks trying to find solutions to this and found very few good methods but they don't really work for full HTML documents (which is what I'd be dealing with).
These are the solutions I found and why they don't work for me:
HTMLPurifier:
I think this is the obvious choice for most because it's got the best security and is up to date with industry standards. Although it's main use is supposed to be for HTML fragements/small snippets, I thought I'd give it a go.
The first issue I ran into was that the head tags (and anything inside them) was getting stripped and removed. The head is quite essential in HTML emails so I had to find a way around this...unfortunately, the only fix I could find was to seperate the head from the rest of the email and run each part seperately though HTMLPurifier.
I've yet to try this because it seems very hacky but it seems to be the only way to achieve what I'm after. I'm also not sure on how well HTMLPurifier is at finding XSS in CSS. On top of all that, it doesn't do well in terms of performance with it being such a large library.
HTMLawed:
HTMLawed seemed to be another great option but a few things swayed me from using it.
A) Compared to HTMLPurifier, this seems to be less secure. HTMLawed has several documented security issues at the moment. It's also not widely used yet which is more worrying (only used by about 10 registered companies).
B) It's released under the GPL/GPU License, which effectively means I can't use it on my website unless I'm willing to let people use my service for free.
C) From what I've seen of people talking about it, it seems to strip a lot of tags unless it's heavily configured. I can't have much say here because I've not tried it but that also raises security concerns for me - what if I miss something? what if I can't configure it to keep the elements I want? etc.
These are my questions to you:
- Are there any better alternatives to the ones listed above?
- Is it possible to code this myself or is that too ambitious and too insecure?
- How do the larger email companies tackle this issue (mailchimp, activecampaign, sendinblue, etc.)?