0
votes

I'm adding some XSS protection to the website I'm working on. The platform is Zend Framework 2, so I'm using Zend\Escaper. From the Zend documentation I learned that:

Zend\Escaper is meant to be used only for escaping data that is to be output, and as such should not be misused for filtering input data. For such tasks, the Zend\Filter component, HTMLPurifier or PHP's Filter component should be used instead.

But what are the risks if I escape the data before inserting it into the database? Am I so wrong to do that? Please explain, as I'm fairly new to this topic. Thanks.


2 Answers

1
votes

If you encode data before storing it, you will have to decode it again before you can do anything sensible with it, and then encode it once more for output. That's why I wouldn't do it.

Let's say you have an international application and you store the escaped value of a form field. Any non-ASCII characters in it may be turned into HTML entities. So what if you have to quantify the content of that field, like counting its characters? You will always have to unescape the content before counting it, and then re-escape it afterwards. A lot of work for nothing gained.
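To make the counting problem concrete, here is a minimal sketch using plain PHP functions rather than Zend\Escaper. The sample value is made up, and it assumes the mbstring extension is available:

```php
<?php
// Hypothetical form value with one non-ASCII character (u with umlaut).
$raw = "M\u{00FC}ller";

// What would land in the database if you escaped before storing.
$stored = htmlentities($raw, ENT_QUOTES, "UTF-8");

// Length of the real text versus length of the entity-encoded text.
$rawLength    = mb_strlen($raw, "UTF-8");
$storedLength = mb_strlen($stored, "UTF-8");

// To get the real length back, the escaping has to be undone first.
$decodedLength = mb_strlen(html_entity_decode($stored, ENT_QUOTES, "UTF-8"), "UTF-8");

echo $stored, "\n";                                         // M&uuml;ller
echo "$rawLength $storedLength $decodedLength\n";          // 6 11 6
```

The stored value is five characters longer than what the user typed, so any length check, truncation or column size based on it would be off.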

The same applies to search operations in your database. You will have to escape the search phrase in exactly the same way as the stored input, or the database won't understand what you are looking for.
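A rough sketch of the search problem, with an array standing in for the database table. The `findRows` helper and the sample data are hypothetical and only illustrate the mismatch:

```php
<?php
// Rows as they would look if escaped before storing: "Tom &amp; Jerry".
$storedEscaped = [htmlspecialchars("Tom & Jerry", ENT_QUOTES, "UTF-8")];

// Hypothetical stand-in for a LIKE '%needle%' query.
function findRows(array $rows, string $needle): array
{
    return array_values(array_filter($rows, function ($row) use ($needle) {
        return strpos($row, $needle) !== false;
    }));
}

// Searching with what the user actually typed finds nothing.
$plainHits = findRows($storedEscaped, "Tom & Jerry");

// Only a search phrase escaped the same way as the stored data matches.
$escapedHits = findRows(
    $storedEscaped,
    htmlspecialchars("Tom & Jerry", ENT_QUOTES, "UTF-8")
);

echo count($plainHits), " ", count($escapedHits), "\n"; // 0 1
```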

I'd use one character set throughout the application and database (I prefer UTF-8; beware of the MySQL connection...) and only escape content on output. That way I can do whatever I like with the data and am on the safe side on output. Escaping is done automatically in my view layer, so I don't have to think about it every time I handle data, and I can't forget it.
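As an illustration of "escaping happens in the view layer", here is a toy sketch. The `e` and `render` functions and the `{{name}}` placeholder syntax are invented for this example and are not how Zend\View does it; the point is only that every value passes through the escaper on its way out:

```php
<?php
// Escape a value for an HTML body context.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, "UTF-8");
}

// Hypothetical mini view layer: every variable is escaped automatically.
function render(string $template, array $vars): string
{
    $replacements = [];
    foreach ($vars as $name => $value) {
        $replacements['{{' . $name . '}}'] = e((string) $value);
    }
    return strtr($template, $replacements);
}

// The raw, unescaped value is what the rest of the application works with.
$comment = '<script>alert(1)</script>';

$html = render('<p>{{comment}}</p>', ['comment' => $comment]);
echo $html, "\n"; // <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```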

That doesn't stop me from filtering and sanitizing the input. Nor does it stop me from escaping data for the database with the appropriate mechanisms, such as mysqli_real_escape_string or similar, or from using prepared statements!
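Something like the following shows the two kinds of escaping side by side. It is only a sketch: it assumes the pdo_sqlite extension and uses an in-memory SQLite database and a made-up `comments` table instead of MySQL.

```php
<?php
// In-memory database, assumed here so the example is self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)');

// Hostile input: SQL metacharacters and HTML markup.
$input = "Robert'); DROP TABLE comments;-- <b>hi</b>";

// Database escaping: the prepared statement keeps the value out of the SQL text.
$insert = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
$insert->execute([':body' => $input]);

// The value comes back byte for byte as it was entered.
$stored = $pdo->query('SELECT body FROM comments')->fetchColumn();

// HTML escaping happens only now, at output time.
$safeHtml = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
echo $safeHtml, "\n";
```

The database holds the original text, so counting and searching still work, and the HTML escaping is applied once, when the value is printed.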

But that's just my opinion, others might think otherwise!

1
votes

"Output" here refers to the web page. A form field (an HTML tag) is an INPUT (from the web page); any text you render is an OUTPUT (to the web page). You need to ensure that no output to the web page contains dangerous characters that could be used to build XSS attack vectors.

This said, if you have DANGEROUS_INPUT_X given by the user and then

$NOT_DANGEROUS_ANYMORE = ZED.HtmlPurifier(DANGEROUS_INPUT_X)
DBSave($NOT_DANGEROUS_ANYMORE)

and somewhere else

$OUTPUT = DBLoad($NOT_DANGEROUS_ANYMORE)
echo $OUTPUT

you should be fine, as long as you do not apply any additional encoding/decoding to this output. It will be displayed exactly as it was saved, and what was saved was already safe.
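To see why an additional encoding step hurts, here is a small sketch with a made-up value. It uses htmlspecialchars to stand in for "data that was already made safe before saving" and then encodes it a second time on output:

```php
<?php
// Hypothetical user input.
$input = 'Tom & "Jerry"';

// Encoded once, e.g. before saving.
$once = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Encoded again on output: the entities themselves get escaped.
$twice = htmlspecialchars($once, ENT_QUOTES, 'UTF-8');

echo $once, "\n";  // Tom &amp; &quot;Jerry&quot;
echo $twice, "\n"; // Tom &amp;amp; &amp;quot;Jerry&amp;quot;
```

A browser would render the second string as the literal text `Tom &amp; &quot;Jerry&quot;`, which is not dangerous but is clearly not what the user typed.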

I would suggest focusing on output encoding rather than validation: HtmlPurifier cleans the HTML, whereas you can accept any kind of bad characters as long as you ensure your output is encoded in the page.

The OWASP cheat sheet at https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet has some general rules. Here is the PHP example:

echo htmlspecialchars($DANGEROUS_INPUT_X_NOW_OUTPUT, ENT_QUOTES, "UTF-8");

Remember to set the character set, and use the same one consistently throughout your pages/scripts/binaries and in the database as well.
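One way a character set mismatch can bite, sketched with a made-up value and assuming the mbstring extension: if ISO-8859-1 bytes reach htmlspecialchars while you tell it the data is UTF-8, and no ENT_SUBSTITUTE or ENT_IGNORE flag is passed, the function returns an empty string.

```php
<?php
// "Müller" encoded as ISO-8859-1: the byte 0xFC is not valid UTF-8 here.
$latin1 = "M\xFCller";

// Declared charset does not match the data: the result is an empty string.
$wrong = htmlspecialchars($latin1, ENT_QUOTES, 'UTF-8');

// Convert to the application's charset first, then escape.
$utf8  = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
$right = htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8');

var_dump($wrong === '');                // bool(true)
var_dump($right === "M\u{00FC}ller");   // bool(true)
```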