I'm serving my JavaScript app (SPA) by uploading it to S3 and configuring the bucket as the origin of a CloudFront distribution. The default root object is index.html, served with Cache-Control: max-age=0, no-cache, and it points to fingerprinted js/css assets. My domain name, let's say SomeMusicPlatform.com, has a CNAME entry in Route53 pointing to the distribution URL. This works great and everything is cached well.
Now I want to serve a prerendered HTML version for purposes of bots and social network crawlers. I have set up a server that responds with a pre-rendered version of the JavaScript app (SPA) at the domain prerendered.SomeMusicPlatform.com.
What I'm trying to do, in a Lambda function, is detect the user agent, identify bots, and serve them the prerendered version from my custom server (and not the JavaScript contents from S3 that I normally serve to regular browsers).
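To make the detection step concrete, here is a minimal sketch of what I mean by identifying bots. The crawler list and the names `BOT_PATTERN`, `is_bot` and `user_agent_of` are my own placeholders, not anything CloudFront provides, and the list is far from exhaustive. Note that, as far as I understand, CloudFront replaces the User-Agent header with "Amazon CloudFront" on origin requests unless the header is forwarded, so the raw value may only be visible in a viewer-request trigger.

```python
import re

# Hypothetical, non-exhaustive list of crawler tokens (an assumption, extend as needed).
BOT_PATTERN = re.compile(
    r"googlebot|bingbot|yandex|baiduspider|duckduckbot|"
    r"facebookexternalhit|twitterbot|linkedinbot|slackbot|whatsapp",
    re.IGNORECASE,
)


def is_bot(user_agent: str) -> bool:
    """Return True when the User-Agent string matches a known crawler token."""
    return bool(BOT_PATTERN.search(user_agent or ""))


def user_agent_of(request: dict) -> str:
    """Read User-Agent from a CloudFront request object.

    CloudFront events key headers by lowercase name, each holding a list
    of {"key": ..., "value": ...} entries.
    """
    values = request.get("headers", {}).get("user-agent", [])
    return values[0]["value"] if values else ""
```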
I thought I could achieve this with a Lambda@Edge function modeled on the AWS example "Using an Origin-Request Trigger to Change From an Amazon S3 Origin to a Custom Origin". The function switches the origin to my custom prerender server when it identifies a crawler bot from the request headers (or, in the testing phase, when a prerendered=true query parameter is present).
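Roughly, the origin-request function for the testing phase looks like the sketch below. It follows the shape of the AWS example; the port, protocol, timeout values and the helper name `wants_prerender` are assumptions for illustration. It also assumes the cache behavior forwards the query string, since otherwise the querystring field would presumably be empty by the time the origin-request trigger runs.

```python
from urllib.parse import parse_qs

# Prerender server from the question; the origin settings below are assumed values.
PRERENDER_DOMAIN = "prerendered.SomeMusicPlatform.com"


def wants_prerender(request: dict) -> bool:
    """True when the request carries prerendered=true in its query string."""
    params = parse_qs(request.get("querystring", ""))
    return params.get("prerendered", [""])[0] == "true"


def handler(event, context):
    """Origin-request trigger: swap the S3 origin for the custom prerender origin."""
    request = event["Records"][0]["cf"]["request"]
    if wants_prerender(request):
        request["origin"] = {
            "custom": {
                "domainName": PRERENDER_DOMAIN,
                "port": 443,
                "protocol": "https",
                "path": "",
                "sslProtocols": ["TLSv1.2"],
                "readTimeout": 5,
                "keepaliveTimeout": 5,
                "customHeaders": {},
            }
        }
        # The Host header has to match the new origin's domain name.
        request["headers"]["host"] = [{"key": "Host", "value": PRERENDER_DOMAIN}]
    return request
```

Requests without the parameter are returned unchanged, so they keep going to the S3 origin.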
The problem is that the Lambda@Edge function on the Origin-Request trigger never runs, because CloudFront still has the Default Root Object index.html cached and returns the content from the edge cache. I get X-Cache: RefreshHit from cloudfront for both SomeMusicPlatform.com/?prerendered=true and SomeMusicPlatform.com, even though the Default Root Object (index.html) has Cache-Control: max-age=0, no-cache.
How can I keep the well-cached serving and low latency of my JavaScript SPA with CloudFront and add serving content from my custom prerender server just for crawler bots?
[…] User-Agent: Browser or User-Agent: Bot. Then change the origin behavior based on detecting one of those two values in the Origin Request trigger. Two triggers, but optimal caching. Thoughts? - Michael - sqlbot
[…] prerendered=true parameter that dictates which origin should be used, either S3 or custom (my prerender server). The issue is that I always get a cached website no matter if I use the prerendered param or not. So, I go to mywebsite.com and get the content from S3. Then I visit mywebsite.com/?prerendered=true and get a cached hit from S3 when it should get it from the custom origin. At this point if I make an invalidation, mywebsite.com/?prerendered=true will get the content from my custom origin. - Matic Jurglič
[…] mywebsite.com (without the parameter), cached content from the custom origin will be returned, when it should use the cached content from S3 origin. How to make this work so that it switches to the correct origin depending on the parameter (and later, on the UA)? - Matic Jurglič
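The two-trigger idea from the first comment could be sketched as follows. This is one reading of that suggestion, not a confirmed solution: a viewer-request function collapses every User-Agent to either "Browser" or "Bot", and an origin-request function switches origins on that value. It assumes the User-Agent header is forwarded by the cache behavior (so it becomes part of the cache key with only two possible values). The regex, handler names and origin settings are hypothetical.

```python
import re

# Hypothetical heuristic for crawler detection; a real list would need tuning.
BOT_RE = re.compile(r"bot|crawler|spider|facebookexternalhit|slurp", re.IGNORECASE)
PRERENDER_DOMAIN = "prerendered.SomeMusicPlatform.com"


def _user_agent(request: dict) -> str:
    """Return the first User-Agent value, or an empty string if absent."""
    return request["headers"].get("user-agent", [{"value": ""}])[0]["value"]


def viewer_request_handler(event, context):
    """Viewer-request trigger: normalize User-Agent to 'Bot' or 'Browser'."""
    request = event["Records"][0]["cf"]["request"]
    label = "Bot" if BOT_RE.search(_user_agent(request)) else "Browser"
    request["headers"]["user-agent"] = [{"key": "User-Agent", "value": label}]
    return request


def origin_request_handler(event, context):
    """Origin-request trigger: route normalized 'Bot' traffic to the prerender server."""
    request = event["Records"][0]["cf"]["request"]
    if _user_agent(request) == "Bot":
        request["origin"] = {
            "custom": {
                "domainName": PRERENDER_DOMAIN,
                "port": 443,            # assumed HTTPS origin
                "protocol": "https",
                "path": "",
                "sslProtocols": ["TLSv1.2"],
                "readTimeout": 5,
                "keepaliveTimeout": 5,
                "customHeaders": {},
            }
        }
        request["headers"]["host"] = [{"key": "Host", "value": PRERENDER_DOMAIN}]
    return request
```

With only two distinct header values, CloudFront would keep at most two cached variants per URL, which is presumably what "optimal caching" refers to.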