0
votes

We have an AWS S3 bucket that hosts our dynamic images, which are fetched by web and mobile apps over HTTPS in different sizes (url/width x height/image_name, e.g. http://test.s3.com/200x300/image.png).

For this we did two things:

1- Realtime resizing: I have a redirection rule in my S3 bucket that redirects 404 errors for non-existent image sizes to an API Gateway endpoint that calls a Lambda function. The Lambda function fetches the original image, resizes it, and places it in a folder in the bucket matching the requested size.

We followed the steps in this article: https://aws.amazon.com/blogs/compute/resize-images-on-the-fly-with-amazon-s3-aws-lambda-and-amazon-api-gateway/

2- HTTPS: I created a CloudFront distribution with an SSL certificate; its origin is the S3 static website endpoint.

Problem: Requesting an image from S3 using the CloudFront HTTPS domain always causes a 404 error, which gets redirected by my redirection rule to the API Gateway, even if this specific image size already exists.

I tried to debug this issue with no luck. I examined the requests and from what I see things should work normally.

I'd appreciate a hint on what to do to better debug this issue (and what kind of logs I need to provide here).

Thanks

Sary

1
The example seems to be a little bit incomplete (as implied in the note, in the first paragraph). Something I don't see mentioned is that your CloudFront distribution's Cache Behavior's Minimum TTL and Default TTL would both need to be set to a custom value of 0. The standard values are 0 and 86400 but this would result in CloudFront continuing to redirect to API Gateway for up to 24 hours after an image is resized. – Michael - sqlbot
@Michael-sqlbot I had the setting to use the origin's cache control. I'll try setting the minimum, default and maximum TTLs to 0. But why would not setting them to 0 cause S3 to return a 404 for a resource that exists, if it is requested using CloudFront's domain? – SaryAssad
Setting Default TTL to non-zero (which is what happens when you "use origin cache headers") would cause CloudFront to cache the redirect for 24 hours (86400 seconds). Maximum TTL doesn't need to be 0, just the other two. This may not be the entire issue -- there may be more to it -- but that does seem significant. – Michael - sqlbot
@Michael-sqlbot WOW it worked. You're a hero. The Internet is a miracle. Please post it as an answer so that I accept it. – SaryAssad

1 Answer

1
votes

This solution relies on S3 generating HTTP redirects for missing objects, to redirect the browser to API Gateway to resize the object... and save it at the original URL.

The problem is two-fold:

  • S3-generated redirects don't include any Cache-Control headers, and
  • CloudFront's default behavior when Cache-Control is absent in a response is to cache the response internally for the value of a timer called Default TTL, which by default is set to 86400 seconds (24 hours).

The problem this causes is that CloudFront will remember the original redirect and send the browser to it, again and again, even though the object is now present.
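
To make the interaction concrete, here is a rough Python sketch of how the TTL decision plays out. It is a simplified model and not CloudFront's actual implementation: the function name `effective_ttl` is made up for illustration, and it ignores the Expires header and other details.

```python
import re
from typing import Optional


def effective_ttl(cache_control: Optional[str],
                  min_ttl: int, default_ttl: int, max_ttl: int) -> int:
    """Approximate how many seconds CloudFront would cache a response.

    Simplified, hypothetical model: only Cache-Control is considered.
    """
    if cache_control is None:
        # Origin sent no caching headers (the case for S3-generated
        # redirects), so Default TTL applies.
        return default_ttl

    directives = [d.strip().lower() for d in cache_control.split(",")]

    # Directives that forbid caching fall back to Minimum TTL.
    if any(d in ("no-cache", "no-store", "private") for d in directives):
        return min_ttl

    # s-maxage takes precedence over max-age; the value is clamped
    # between Minimum TTL and Maximum TTL.
    for name in ("s-maxage", "max-age"):
        for d in directives:
            match = re.fullmatch(rf"{name}=(\d+)", d)
            if match:
                return max(min_ttl, min(int(match.group(1)), max_ttl))

    # Cache-Control present but without a usable lifetime.
    return default_ttl


# Out-of-the-box timers: the header-less redirect is kept for 24 hours.
print(effective_ttl(None, min_ttl=0, default_ttl=86400, max_ttl=31536000))
# Default TTL set to 0: the redirect is not reused.
print(effective_ttl(None, min_ttl=0, default_ttl=0, max_ttl=31536000))
```

Note that responses which do carry a Cache-Control max-age are still cached with Default TTL at 0, so the resized images themselves can remain cacheable if they are stored with such a header.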

Selecting Customize instead of Use Origin Cache Headers for "Object caching" and then setting Default TTL to 0 (all in the CloudFront Cache Behavior settings) will resolve the issue, because it configures CloudFront not to cache responses where the origin didn't include any relevant Cache-Control headers.
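
If you would rather script the change than click through the console, something along these lines should work with boto3. Treat it as a sketch under assumptions: it assumes the distribution still uses the legacy MinTTL/DefaultTTL fields on its default cache behavior (rather than a cache policy), it only touches the default behavior, and the helper names are hypothetical.

```python
import copy


def with_zero_default_ttl(distribution_config: dict) -> dict:
    """Return a copy of a DistributionConfig with Minimum and Default TTL at 0.

    Assumes the legacy TTL fields are used on the default cache behavior.
    """
    config = copy.deepcopy(distribution_config)
    behavior = config["DefaultCacheBehavior"]
    behavior["MinTTL"] = 0
    behavior["DefaultTTL"] = 0
    return config


def apply_to_distribution(distribution_id: str) -> None:
    """Fetch the current config, zero the TTLs, and push the update."""
    import boto3  # imported here so the helper above has no AWS dependency

    client = boto3.client("cloudfront")
    current = client.get_distribution_config(Id=distribution_id)
    new_config = with_zero_default_ttl(current["DistributionConfig"])
    # CloudFront requires the ETag of the config being replaced.
    client.update_distribution(
        Id=distribution_id,
        IfMatch=current["ETag"],
        DistributionConfig=new_config,
    )
```

Calling `apply_to_distribution` with your own distribution ID would apply the change; the update then takes a while to propagate to the edge locations.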

For more background:

"What is Cloudfront Minimum TTL for?" explains the Minimum/Default/Maximum TTL timers and how/when they apply.

"Setting 'Object Caching' on CloudFront" explains the confusing UI labeling of these options, which is likely a holdover from a time before all three timers were configurable.