
I'm using the boto (2.2.1) backend for django-storages (1.1.4) to upload files to an S3 bucket. It works fine for images, but when I try to upload movie files (small mov, small avi) or an mp3, I get a Broken pipe error.

This is weird.

Digging into the Django traceback, I see the following exception:

boto.https_connection.InvalidCertificateException

Which kind of fits the experience I've been having using Cyberduck to inspect the bucket directly: sometimes it complains that I'm getting a mismatch between the cert for *.s3.amazonaws.com and the domain *.s3-external-3.amazonaws.com

Indeed, the bucket logging shows that I'm being served an HTTP 307 temporary redirect. Is AWS perhaps sending some content types one way and others another, and boto/something can't quite keep up with that? Uploads of movies do seem to hit S3 twice, whereas images hit it once, so it may well be that boto is coping with the 307 fine (the closed tickets for 307 support in boto are a couple of years old) and something else is up.

But what? I've gone from a pleasantly productive day to a dead end, and it's extremely frustrating.

Any suggestions for what may be up and/or what to try to work around this?

(Note that this fails with the boto S3 backend or the simple S3 backend - it's just that the boto one gives me what looks like a more specific error)

Interestingly, if I do it all manually via the shell, using boto.S3Connection and boto.Key etc, the movie file goes up without a hitch (so at least I've got a long-winded workaround). - Steve Jalim
What size are the files? Perhaps it's using a different method to send them (streaming vs all-in-one) and that's hitting an API mismatch? - Joe
The png is about 100kb, the mp3 around 400kb and the mov 360kb. Not big at all... - Steve Jalim
Try it with a 400k png, if only to rule out file size. - Joe
And try renaming the .mp3 and .mov extensions to .png and see if they upload. You may be hitting some AWS security filtering. - RickyA

1 Answer

I'm writing this as an answer because it is too long to fit in a comment. It doesn't really answer your question but perhaps it will help you get to an answer.

The 307 redirect you are getting is happening because the bucket is in eu-west-1 but you are hitting the standard s3.amazonaws.com endpoint. S3 uses some DNS magic and HTTP redirects to route traffic from the generic S3 endpoint to the correct regional endpoint.

To accomplish this, most S3 clients use a "subdomain" referencing scheme that prepends the bucket name to the hostname in the request. So, if you are trying to access a bucket named foofoofoo-bar, the Host header in the request would, by default in boto, be foofoofoo-bar.s3.amazonaws.com and then, using the DNS magic and HTTP redirects, S3 would end up getting your request to the right place. That should all happen automatically in boto.
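
As a rough illustration of the subdomain scheme described above, here is a minimal sketch that builds the Host header and request URL from a bucket name. The helper names `subdomain_host` and `subdomain_url` and the key `movies/clip.mov` are hypothetical, made up for this example; this is not boto's API, just the string manipulation the scheme implies.

```python
from urllib.parse import quote

# Generic (non-regional) S3 endpoint mentioned in the answer.
GENERIC_ENDPOINT = "s3.amazonaws.com"


def subdomain_host(bucket, endpoint=GENERIC_ENDPOINT):
    """Prepend the bucket name to the endpoint to form the Host header."""
    return "{0}.{1}".format(bucket, endpoint)


def subdomain_url(bucket, key, endpoint=GENERIC_ENDPOINT):
    """Build an HTTPS URL for a key using the subdomain scheme."""
    # quote() leaves "/" alone by default, so key prefixes survive.
    return "https://{0}/{1}".format(subdomain_host(bucket, endpoint), quote(key))


print(subdomain_host("foofoofoo-bar"))
# foofoofoo-bar.s3.amazonaws.com
print(subdomain_url("foofoofoo-bar", "movies/clip.mov"))
# https://foofoofoo-bar.s3.amazonaws.com/movies/clip.mov
```

The bucket name ends up as the leftmost label of the hostname, which is why the characters allowed in it matter for DNS and for certificate checks.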

This approach can cause a problem if your bucket name includes a "." because then the Host header might be foofoofoo.bar.s3.amazonaws.com and, since the wildcard SSL cert on the S3 endpoint is only good for one level of subdomain, the period in the bucket name then causes the SSL cert verification to fail.
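
To make the "one level of subdomain" point concrete, here is a simplified sketch of single-label wildcard matching. It is an assumption-laden approximation of what a TLS client checks, not the code boto or Python's ssl module actually runs, and `wildcard_matches` is a hypothetical helper name.

```python
def wildcard_matches(pattern, hostname):
    """Return True if hostname matches pattern, where '*' covers exactly one DNS label."""
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().split(".")
    # A wildcard never spans a dot, so the label counts must be equal.
    if len(pattern_labels) != len(host_labels):
        return False
    for pattern_label, host_label in zip(pattern_labels, host_labels):
        if pattern_label == "*":
            # The wildcard needs a non-empty label to stand in for.
            if not host_label:
                return False
            continue
        if pattern_label != host_label:
            return False
    return True


CERT_PATTERN = "*.s3.amazonaws.com"

# Bucket name without a dot: one extra label, so the wildcard covers it.
print(wildcard_matches(CERT_PATTERN, "foofoofoo-bar.s3.amazonaws.com"))  # True
# Bucket name with a dot: two extra labels, so the match fails.
print(wildcard_matches(CERT_PATTERN, "foofoofoo.bar.s3.amazonaws.com"))  # False
# A different endpoint name, like the one Cyberduck complained about.
print(wildcard_matches(CERT_PATTERN, "foofoofoo-bar.s3-external-3.amazonaws.com"))  # False
```

The third case is only meant to show that a cert for *.s3.amazonaws.com would not cover a host under s3-external-3.amazonaws.com either; whether that is what happens in this particular upload is not something the sketch can establish.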

That's why I asked about "." in your bucket name, but apparently that is not the problem. Is there any way you could provide more context from the logs? I would like to see what's happening prior to the cert validation error.