28
votes

I'm creating a web application (in Django), which needs to allow users to upload files (specifically images, which are later displayed for other users). I'm trying to understand the best way to store these uploaded files.

From related questions, I saw some people suggested giving the file a server-generated unqiue id, then creating a DB table which maps ids to original filenames.

Is this the best approach to storing user-uploaded files, from a security, efficiency or any other standpoint? What kind of information should I be storing about each file?

Are there any other best-practices involved with accepting user-uploaded files? (Other than making sure they're really images and checking their size, obviously)?

Edit: A little more info about what I need. I'm talking specifically about image files that users need to upload and embed in content they create. Imagine it like a StackOverflow answer (or a blog post): someone uploads a picture, which has to be stored and displayed whenever anyone else sees the answer.

Thanks,
Edan
Note: There are several related questions, but I haven't found one which asks for a comparison of ways to store user-uploaded files.

2
hi edan, i am came to the same problem. I am going to create application where users can upload images/pdfs. Then they can view/download files. I am thinking of file system storage. Just want to know which way did you prefer from your experience?Pragnesh Patel
The direction I'm planning on going is: 1. A database to hold "meta" info (original file name, etc.). 2. Physical files on the file system, with a name which is the index into the db of the meta info. There can be more than one physical file for each db entry, by the way, corresponding to different resolutions, for example. E.g., I'll have files: 254_high_res.jpg, 254_los_res.jpg, where 254 is the index in the database of the meta info for file 254.Edan Maor
thanks edan. i am going to store images with the radon guid generated names as file name. and for different resolution i am going to use handler. dotnetslackers.com/articles/aspnet/…Pragnesh Patel

2 Answers

3
votes

Your question is too broad to really be helpful; best approach will depend on your specific requirements. Nonetheless...

Programmers are constantly tempted to plonk files into the database. Resist. It just adds a layer of complexity to everything you try to do with them thereafter.

For my experience, whilst using a hashkey as the local filename was my preference, it didn't really work out because our files weren't restricted to images: the non-images need a filename to serve back to users, and the uploaders don't particularly like having their files radically renamed since it makes it impossible for them to know what file is what.

As for images there is some non-trivial work to do in rescaling to various sizes/thumbnails.

2
votes

That is a big question.

Related to your picture use-case this is the domain of a picture-server which usually is a complete separate part of application. They handle lifecycle and resizing of images (one picture is stored/resized to different sizes). As far as I have seen such was never implemented by a BLOB column from SQL, but by straightforward normal file on disk.

Also have a look at a (infrastructure example from facebook)

Generally the product has to align to your concrete requirements (file-size, number-of files, load). In most cases you really don't want to build all that stuff from scratch... Though if your requirements are lower end (just a few users uploading files rarely) you could just save these files on disk and save the path as reference in your other data (e.g. column in RDBMS).