FACEPALM UPDATE: It turns out I had overlooked the fact that I was using an older fork of S3BotoStorage from https://github.com/gtaylor/django-athumb as my default storage (even though I had django-storages installed). The current version of django-storages doesn't suffer from this problem. The problem was that the content-type headers were unicode when they hit boto, and boto escapes unicode header values using urllib.quote_plus before sending them on to AWS. This isn't really boto's fault, since headers have to be converted to non-unicode strings somehow per HTTP. For a more in-depth analysis see https://github.com/boto/boto/issues/1669 .
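To make the escaping concrete, here is a minimal sketch of what quote_plus does to a content type. It assumes Python 3, where the function lives in urllib.parse (at the time I was on Python 2, where it is urllib.quote_plus); the variable names are only for illustration, and this is not boto's actual code path.

```python
from urllib.parse import quote_plus  # Python 3 location of Python 2's urllib.quote_plus

content_type = "application/pdf"

# quote_plus treats no characters as "safe" by default, so "/" is percent-encoded.
escaped = quote_plus(content_type)
print(escaped)  # application%2Fpdf

# Passing "/" as a safe character leaves the media type intact.
unescaped = quote_plus(content_type, safe="/")
print(unescaped)  # application/pdf
```

The first result is exactly the value that showed up in the S3 metadata.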
Original Question
I am using django-storages' S3BotoStorage in conjunction with a FileField to upload files to Amazon S3. Here's my field:
downloadable_file = FileField(max_length=255, upload_to="widgets/filedownloads", verbose_name="file")
In settings:
DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
Everything works as far as the uploading/downloading goes.
However, the files are getting stored in my bucket with an incorrect content type. When I look at the metadata for the files in my AWS S3 console, the Content-Type of the file shows up as "application%2Fpdf" instead of "application/pdf", which is what it should be.
In case you say it shouldn't matter: it does. Google Chrome's built-in PDF reader hangs on PDFs served with an invalid content type, and a client brought this to my attention.
Here's an example of a file uploaded through django-storages/boto. If you're using Chrome's built-in PDF reader, I assume it hangs, like it does for me and the customer who reported this. If you're using a non-Chrome browser or the Adobe plugin, or you download the file to disk, you'll probably be fine.
If I manually change the Content-Type metadata via the AWS console to 'application/pdf' (one of the standard choices it provides), then it's fine.
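For files that were already uploaded, the correct value can be recovered from the escaped one rather than typed in by hand. The following is only a sketch in Python 3: repair_content_type is a hypothetical helper name, and writing the repaired value back to S3 is not shown here.

```python
from urllib.parse import unquote_plus


def repair_content_type(value):
    """Undo quote_plus-style escaping if the value looks escaped (hypothetical helper)."""
    # A valid media type contains a literal "/". If there is none but "%2F" is
    # present, assume the whole value was percent-encoded and decode it.
    if "/" not in value and "%2f" in value.lower():
        return unquote_plus(value)
    # Otherwise leave it alone, so a correct value such as "image/svg+xml"
    # does not have its "+" turned into a space.
    return value


print(repair_content_type("application%2Fpdf"))  # application/pdf
```

The guard matters because unquote_plus converts "+" to a space, which would damage content types that are already correct.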
I assume this is a bug somewhere in the way boto constructs the AWS policy document to upload the file, since I'm not doing anything outside of standard usage here. However, I've stepped through the boto code and can't find where it actually does the escaping.
Can someone either suggest a workaround, or guide me to the offending code in boto so I can patch it and submit a pull request?
boto==2.9.5
django-storages==1.1.8