3
votes

I'm looking at how to properly escape data that comes from the outside world before it is used for application control, storage, logic, and so on.

With the magic quotes directive deprecated in PHP 5.3.0 and removed in PHP 5.4.0, this becomes more pressing for anyone looking to upgrade and use the new language features while maintaining legacy code (don't we love it...).

However, one thing I haven't seen much discussion of is theory and best practice for what to do once you have protected your data. For example, should it be stored with or without slashes? I personally think keeping escaped data in the DB is a bad move, but I'd like to hear discussion and, preferably, read some case studies.

Some links from the PHP manual just for reference:

PHP Manual - mysql_real_escape_string

PHP Manual - htmlspecialchars

etc etc.

Any tips?

5
The fact that you said "for example, to store with or without slashes" leads me to believe you may have a flawed concept of proper escaping. If you're doing escaping correctly, strings sent to the database that need slashes will have them in the query, but the slashes will not actually be stored in the database. If you see slashes in the database, the data was improperly escaped. - longneck
Explain more please, longneck: will MySQL remove escapes before inserting? Is there a page about this? But you're dead right. I think I overlooked this at some point in the past, and am now trying to catch up. - dmp
mysql_real_escape_string does not leave slashes in the stored data; it only makes a string safe for use in an SQL query. - William
If you want to check whether the server has magic quotes on and, if so, remove the slashes, look at the get_magic_quotes_gpc() and stripslashes() functions. - William

5 Answers

6
votes

Take a look at prepared statements. I know that in MySQL this works very well and is a secure way of getting data into your database. It has a few performance benefits too.

http://dev.mysql.com/tech-resources/articles/4.1/prepared-statements.html
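
As a rough illustration of the idea, here is a minimal sketch using PDO. It assumes the pdo_sqlite extension and uses an in-memory SQLite database as a stand-in for MySQL so that it runs without a server; the table and column names are made up for the example. Note that the value comes back exactly as it was sent, with no slashes added.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL; with MySQL the DSN and credentials would differ.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table, used only for this sketch.
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, author TEXT, body TEXT)');

// Untrusted input containing quotes and SQL syntax.
$author = "O'Reilly";
$body   = "Robert'); DROP TABLE comments;--";

// The values are bound to placeholders, never concatenated into the SQL text.
$insert = $pdo->prepare('INSERT INTO comments (author, body) VALUES (:author, :body)');
$insert->execute([':author' => $author, ':body' => $body]);

// Read the row back: the stored value is the raw input, unescaped.
$select = $pdo->prepare('SELECT body FROM comments WHERE author = :author');
$select->execute([':author' => $author]);
$storedBody = $select->fetchColumn();
```

Because the driver handles the values separately from the statement, there is nothing to strip or unescape when reading the data back.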

I have some more resources if you are interested.

Hope this is what you are looking for, tc.

Edit:

One thing I can add is using filters in combination with prepared statements. For example, to sanitize a string value you use FILTER_SANITIZE_STRING (deprecated as of PHP 8.1), and for an email address you use FILTER_SANITIZE_EMAIL.

This saves some amount of code and works very well. You can always check the data using your own methods afterwards, but there are a lot of filters you can use.
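
A small sketch of what this might look like with filter_var(). The input values are hypothetical stand-ins for request data, and the validation filters (FILTER_VALIDATE_EMAIL, FILTER_VALIDATE_INT) are an addition to the sanitizing filters mentioned above, shown only to illustrate how the two kinds can be combined.

```php
<?php
// Hypothetical inputs, standing in for something like $_POST['email'] and $_POST['age'].
$rawEmail = ' john.doe@example.com ';
$badEmail = 'not-an-email';

// FILTER_SANITIZE_EMAIL removes characters that are not allowed in an email address (here, the spaces).
$cleanEmail = filter_var($rawEmail, FILTER_SANITIZE_EMAIL);

// FILTER_VALIDATE_EMAIL returns the value if it looks like a valid address, or false otherwise.
$validEmail   = filter_var($cleanEmail, FILTER_VALIDATE_EMAIL);
$invalidEmail = filter_var($badEmail, FILTER_VALIDATE_EMAIL);

// FILTER_VALIDATE_INT converts a numeric string to an int and can enforce a range.
$age = filter_var('42', FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 0, 'max_range' => 150],
]);
```

Sanitizing changes the value while validating only accepts or rejects it, so it is worth deciding per field which behaviour you actually want.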

2
votes
  • Use correct method of escaping data when running queries: mysql_real_escape_string, prepared queries, etc...

  • Store data in database unaltered

  • Use correct method of escaping data on output: htmlspecialchars, etc..
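
To illustrate the last two points, a minimal sketch: the stored value stays untouched, and escaping happens only at the moment it is written into HTML. The sample string is made up for the example.

```php
<?php
// Hypothetical value as it would come back from the database: raw and unaltered.
$stored = '<script>alert("xss")</script> & co';

// Escape only at output time, for the HTML context.
$safeHtml = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

echo $safeHtml, PHP_EOL;
// prints: &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt; &amp; co
```

The same raw value could later be sent to a different context (JSON, a plain-text email) with that context's own encoding, which is the main argument for not storing it pre-escaped.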

2
votes

For database work, check parameterized queries and prepared statements. PDO and mysqli are good for that.

htmlspecialchars is the right tool for displaying text in HTML documents.

And, as you mentioned PHP 5.3, you have access to the filter functions, which are a must-use when handling user data.

1
votes

For database inserts the solution is to use bind variables.

In general, any time you find yourself escaping anything (argument to a shell command, db command piece, user-supplied html, etc.), it indicates that you're not using the right function call (e.g., using system when you could use a multi-arg form of exec), or that your framework is deficient. The standard approach to working in a deficient framework is to enhance it so that you can return to not thinking about quoting.
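
In PHP, one way to get a "multi-arg" call that avoids shell quoting is to pass the command to proc_open() as an array, which is supported since PHP 7.4 and starts the program directly without a shell. The sketch below is only an illustration: runWithoutShell is a hypothetical helper name, and it assumes a POSIX system with a printf program on the PATH.

```php
<?php
// Hypothetical helper: run a program with an argument list, bypassing the shell (PHP 7.4+).
function runWithoutShell(array $argv): string
{
    $descriptors = [1 => ['pipe', 'w']]; // capture stdout only
    $process = proc_open($argv, $descriptors, $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException('Could not start process');
    }
    $output = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    proc_close($process);
    return $output;
}

// Untrusted text full of shell metacharacters; no escapeshellarg() is needed
// because no shell ever parses it.
$untrusted = "hello; echo injected 'quoted'";

// Assumption: a POSIX printf binary is available; '%s' prints the argument verbatim.
$output = runWithoutShell(['printf', '%s', $untrusted]);
```

Each array element reaches the program as exactly one argument, so the question of how many levels of quoting to apply never comes up.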

Thinking about levels of escaping and levels of quoting can be fun, but if you really enjoy that go play with Tcl in your spare time. For real work, you shouldn't be thinking about quoting unless you're designing a library for other people to use, in which case you should quote properly and let your users avoid thinking about quoting. (And you should document very carefully exactly what kind of quoting you do and don't do)

0
votes

It's simple. ALL incoming data should be run through mysql_real_escape_string() before inserting it into the database. If you know something needs to be an integer, for example, cast it to an integer before inserting it. Remember this is just to stop SQL injection. XSS and data validation are different.

If you want something to be an email, you obviously need to validate that before you insert it into the database.

htmlentities() sanitizes data, meaning it modifies the data. I think you should always store raw data in the database and, when you grab that data, choose how you want to sanitize it then.

I like to use the following function as a "wrapper" for the mysql_real_escape_string() function.

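// Quote and escape a value for direct use inside an SQL statement.
function someFunction( $value )
{
    // Real integers and floats cannot carry SQL syntax, so they pass through as-is.
    if ( is_int( $value ) || is_float( $value ) ) {
        return $value;
    }
    // Anything else is cast to a string, escaped, and wrapped in single quotes.
    return "'" . mysql_real_escape_string( (string) $value ) . "'";
}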

If the value is a float or an integer, there is no point in running mysql_real_escape_string(). I cast the value to a string before passing it to mysql_real_escape_string() because sometimes the value might not be a string.

An example of the value not being a string:

http://localhost/test.php?hello[]=test

Inside test.php, you run mysql_real_escape_string() on $_GET['hello'], expecting hello to be a string. Since the person set the value to an array, this triggers a PHP warning or notice because hello is not a string.
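
One possible way to guard against this, sketched under the assumption that a non-string value should simply be treated as missing. getStringParam is a hypothetical helper name, and the arrays below stand in for $_GET.

```php
<?php
// Hypothetical helper: return the parameter only if it is a string, otherwise null.
function getStringParam(array $source, string $key): ?string
{
    $value = $source[$key] ?? null;
    return is_string($value) ? $value : null;
}

// Stand-ins for $_GET in the two cases.
$normal   = ['hello' => 'test'];    // test.php?hello=test
$tampered = ['hello' => ['test']];  // test.php?hello[]=test

$fromNormal   = getStringParam($normal, 'hello');   // the string 'test'
$fromTampered = getStringParam($tampered, 'hello'); // null, the array is rejected
```

Rejecting the unexpected type up front means the later escaping or binding step only ever sees strings.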