Sanitizing data against MySQL injection and XSS - am I doing it right with PDO and HTML Purifier?

Question

I am still working on securing my web app. I decided to use the PDO library to prevent MySQL injection and HTML Purifier to prevent XSS attacks. Because all the data that comes from input goes to the database, I take the following steps:

Get data from the input field.
Start PDO and prepare the query.
Bind each POST variable to the query, sanitizing it with HTML Purifier.
Execute the query (save to the database).

In code it looks like this:

// start htmlpurifier
require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
$purifier = new HTMLPurifier($config);

// start pdo
$pdo = new PDO('mysql:host=host;dbname=dbname', 'login', 'pass');
$pdo -> setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// prepare and bind
$stmt = $pdo -> prepare('INSERT INTO `table` (`field1`) VALUES ( :field1 )');
// purify data and bind it.
$stmt -> bindValue(':field1',   $purifier->purify($_POST['field1']), PDO::PARAM_INT); 
// execute (save to database)
$stmt -> execute();

Here are the questions:

Is that all I have to do to prevent XSS and MySQL injection? I am aware that I can't be 100% sure, but will it work fine in most cases, and is it enough?
Should I sanitize the data again when fetching it from the DB and sending it to the browser, or is filtering before saving enough?
I read on a wiki that it's smart to turn off magic_quotes. Of course, magic quotes adding unnecessary slashes can be annoying, but if I don't care about those slashes, isn't turning it off just losing another line of defense?

Update:

Please note that the code I have written here is just an example. There are many inputs, and the real query to the DB is much more complicated. Unfortunately, I can't agree with you that if the PDO type of a variable is int, I do not have to filter it against XSS attacks. Correct me if I am wrong:

If the input should be an integer, and it is, then it's OK and I can put it in the DB. But remember that any input can be tampered with, and we have to expect the worst. If everything is all right, then fine, but if a malicious user submits XSS code, then I have multiple lines of defense:

client side defense - check that it is a numeric value. Easy to bypass, but it can stop total newbies.
server side - XSS injection test (with HTML Purifier or, e.g., htmlspecialchars)
db side - if somebody somehow submits malicious code that gets past the XSS protection, then the database will return an error because the value should be an integer, not any other type.

I guess it does no harm, and it can do a lot of good. Of course, we lose some time computing all this, but I guess we have to weigh performance against security and decide which matters more. My app is going to be used by 2-3 users at a time, which is not many, and security is much more important to me than performance.

Fortunately, my whole site is in UTF-8, so I do not expect any problems with encoding.

While searching the net I came across many opinions about addslashes(), stripslashes(), htmlspecialchars(), htmlentities()... and I've chosen HTML Purifier and PDO. Everyone says they are the best defenses against XSS and MySQL injection threats. If you have a different opinion, please share.

kijin · Accepted Answer · 2012-04-01T03:05:49

As for SQL injection, yes, you can be 100% sure if you always use prepared statements. As for XSS, you must also make sure that all your pages are UTF-8. HTML Purifier sanitizes data with the assumption that it's encoded in UTF-8, so there may be unexpected problems if you put that data in a page with a different encoding. Every page should have a <meta> tag that specifies the encoding as UTF-8.
Nope, you don't need to sanitize the data after you grab it from the DB, provided that you already sanitized it and you're not adding any user-submitted stuff to it.
If you always use prepared statements, magic quotes is nothing but a nuisance. It does not provide any additional lines of defense because prepared statements are bulletproof.
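
To illustrate the point about prepared statements, here is a minimal sketch. It is not the asker's real schema: the `comments` table and `body` column are made up, and an in-memory SQLite database stands in for MySQL so the example runs without a server (this assumes the pdo_sqlite extension is available). The same prepare/bind/execute calls apply with a MySQL DSN.

```php
<?php
// Sketch only: in-memory SQLite stands in for MySQL (assumes pdo_sqlite).
// The table and column names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (body TEXT)');

// A classic injection attempt, as it might arrive in $_POST.
$input = "x'); DROP TABLE comments; --";

// The placeholder keeps the value separate from the SQL text.
$stmt = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
$stmt->bindValue(':body', $input, PDO::PARAM_STR);
$stmt->execute();

// The table still exists and holds the input verbatim, as plain data.
$stored = $pdo->query('SELECT body FROM comments')->fetchColumn();
var_dump($stored === $input);
```

The bound value is never parsed as SQL, so no extra escaping layer (such as magic quotes) is needed on top of it. Note that this only covers values; table or column names cannot be bound as parameters.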

Now, here's a question for you. PDO::PARAM_INT will turn $field1 into an integer. An integer cannot be used in an SQL injection attack. Why are you passing it through HTML Purifier if it's just an integer?

HTML Purifier slows down everything, so you should only use it on fields where you want to allow HTML. If it's an integer, just do intval($var) to destroy anything that isn't a number. If it's a string that shouldn't contain HTML anyway, just do htmlspecialchars($var, ENT_COMPAT, 'UTF-8') to destroy all HTML. Both of these are much more efficient and equally secure if you don't need to allow HTML. Every field should be sanitized, but each field should be sanitized according to what it's supposed to contain.
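
As a rough sketch of sanitizing each field according to its type, the snippet below uses only the two built-in functions named above. The variable names and sample inputs are hypothetical.

```php
<?php
// Hypothetical raw inputs, as they might arrive in $_POST.
$rawId    = '42abc';
$rawTitle = '<script>alert(1)</script> My "title"';

// Integer field: intval() keeps the leading digits and drops the rest.
$id = intval($rawId);

// Plain-text field: htmlspecialchars() turns markup into harmless entities.
$title = htmlspecialchars($rawTitle, ENT_COMPAT, 'UTF-8');

var_dump($id);
echo $title, "\n";
```

Neither call needs HTML Purifier to be loaded, which is where the performance difference comes from.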

Response to your additions:

I didn't mean to imply that if a variable should contain an integer, then it need not be sanitized. Sorry if my comment came across as suggesting that. What I was trying to say is that if a variable should contain an integer, it should not be sanitized with HTML Purifier. Instead, it should be validated/sanitized with a different function, such as intval() or ctype_digit(). HTML Purifier will not only use unnecessary resources in this case, but it also can't guarantee that the variable will contain an integer afterwards. intval() guarantees that the result will be an integer, and the result is equally secure because nobody can use an integer to carry out an XSS or SQL injection attack.
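
The difference between the two functions can be sketched as follows. `parseId()` is a hypothetical helper written for this illustration, not part of PHP or PDO; `ctype_digit()` assumes the ctype extension, which is enabled by default.

```php
<?php
// Hypothetical helper: returns the integer if $raw consists only of digits,
// or null so the caller can reject the request.
function parseId(string $raw): ?int
{
    // ctype_digit() is true only for non-empty strings made of 0-9.
    return ctype_digit($raw) ? (int) $raw : null;
}

var_dump(intval('12abc'));   // sanitizing: silently keeps the leading digits
var_dump(parseId('12abc'));  // validating: rejects the whole value
var_dump(parseId('12'));
```

Which one fits depends on whether you would rather coerce bad input into some integer or refuse it outright. Either way, the value that reaches bindValue() with PDO::PARAM_INT is a real integer.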

Similarly, if the variable should not contain any HTML in the first place, like the title of a question, you should use htmlspecialchars() or htmlentities(). HTML Purifier should only be used if you want your users to enter HTML (using a WYSIWYG editor, for example). So I didn't mean to suggest that some kinds of inputs don't need sanitization. My view is that inputs should be sanitized using different functions depending on what you want them to contain. There is no single solution that works on all types of inputs. It's perfectly possible to write a secure website without using HTML Purifier if you only ever accept plain-text comments.

"Client-side defense" is not a line of defense, it's just a convenience.

I'm also getting the nagging feeling that you're lumping XSS and SQL injection together when they are completely separate attack vectors. "XSS injection"? What's that?

You'll probably also want to add some validation to your code in addition to sanitization. Sanitization ensures that the data is safe. Validation ensures that the data is not only safe but also correct.
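
A small sketch of validation on top of sanitization, using PHP's built-in filter_var(). The rule itself (a quantity between 1 and 100) and the function name `validQuantity()` are invented for the example.

```php
<?php
// Hypothetical rule: a quantity must be a whole number between 1 and 100.
function validQuantity(string $raw): ?int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => 100],
    ]);
    // filter_var() returns false when the value fails validation.
    return $value === false ? null : $value;
}

var_dump(validQuantity('7'));
var_dump(validQuantity('-5'));   // a safe integer, but not a correct quantity
var_dump(validQuantity('7abc'));
```

Here -5 would be perfectly safe to store, but it is still wrong for the field, which is the distinction between the two checks.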
