9
votes

I need someone experienced to advise me on whether this is a Google CDN bug or whether I am missing something. I discovered this problem about 4 months ago and tried to contact their support, but they were so rude that I don't even want to talk about it here. They acknowledged it, or at least told me they would pass the problem on to the back-end team, but after that they deleted the issue in the tracker and they no longer respond to my emails. That is the main reason I am asking here.

Problem

Google CDN randomly fails to serve gzip content to end users, so instead of ~70KB they download 500KB files. I cannot reproduce this problem directly against my origin, but I can reproduce it very easily through Google CDN.

Here is an example request to the CDN:

Request:

Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate, sdch, br
Accept-Language:en-US,en;q=0.8,bg;q=0.6,hr;q=0.4,mk;q=0.2,sr;q=0.2
Cache-Control:no-cache
Connection:keep-alive
Cookie: example
Host: example.com
Pragma:no-cache
Upgrade-Insecure-Requests:1
User-Agent:Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36

Response:

Accept-Ranges:bytes
Age:58422
Alt-Svc:clear
Cache-Control:public, max-age=604800
Content-Length:550158
Content-Type:text/css
Date:Tue, 04 Apr 2017 03:45:53 GMT
Expires:Tue, 11 Apr 2017 03:45:53 GMT
Last-Modified:Sun, 19 Mar 2017 01:50:22 GMT
Server:LiteSpeed
Via:1.1 google

As you can see, my request has an Accept-Encoding header that includes gzip, but the response is not gzipped: there is no Content-Encoding header, and Content-Length is 550158 bytes instead of about 70KB. Also note the Age header: that item has been cached on the CDN for 58422 seconds!

Here is the same request from another machine (US):

Request:

:authority: xxx
:method:GET
:path:/wp-content/themes/365/style.css
:scheme:https
accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
accept-encoding:gzip, deflate, sdch, br
accept-language:en-US,en;q=0.8
cache-control:no-cache
cookie: xxx
pragma:no-cache
upgrade-insecure-requests:1
user-agent:Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36

Response:

accept-ranges:bytes
age:58106
alt-svc:clear
cache-control:public, max-age=604800
content-encoding:gzip
content-length:72146
content-type:text/css
date:Tue, 04 Apr 2017 03:49:28 GMT
expires:Tue, 11 Apr 2017 03:49:28 GMT
last-modified:Sun, 19 Mar 2017 01:50:22 GMT
server:LiteSpeed
status:200
vary:Accept-Encoding
via:1.1 google

As you can see, I got gzipped content when requesting from my other server.
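To make the difference between the two responses easier to spot, here is a minimal Python sketch that parses the header dumps and lists the headers present only in the compressed response. The dumps are trimmed copies of the two responses above, and the function names `parse_headers` and `missing_headers` are just illustrative.

```python
def parse_headers(dump: str) -> dict[str, str]:
    """Parse a DevTools-style 'Name:value' dump into a dict with lower-cased names."""
    headers = {}
    for line in dump.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so values such as dates stay intact.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def missing_headers(bad: dict[str, str], good: dict[str, str]) -> list[str]:
    """Return header names that appear in `good` but not in `bad`."""
    return sorted(set(good) - set(bad))


# Trimmed copies of the two responses shown above.
UNCOMPRESSED = """
Cache-Control:public, max-age=604800
Content-Length:550158
Content-Type:text/css
Via:1.1 google
"""

COMPRESSED = """
cache-control:public, max-age=604800
content-encoding:gzip
content-length:72146
content-type:text/css
vary:Accept-Encoding
via:1.1 google
"""

print(missing_headers(parse_headers(UNCOMPRESSED), parse_headers(COMPRESSED)))
```

With these two dumps, the headers that the uncompressed response lacks are `content-encoding` and `vary`.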

I have tons of HAR files and also videos that prove this bug, but let's keep it simple. Google CDN logs are available in the GCP dashboard; here is what they look like.

enter image description here

Even if all of my visitors did not support gzip, what about Googlebot?

enter image description here

I also analyzed my server logs and found that about 99% of the responses for that file have the gzip size; only a few requests were not gzipped. That is logical, since some visitors, or I should say robots, requested the file without an Accept-Encoding: gzip header.

Temporary workaround

If I purge the CDN cache, the problem disappears for the next few minutes or hours, and then it comes back. The problem also does not happen all the time, only at random. I have a system that parses the CDN logs and shows me graphs, which is actually how I discovered this bug.

enter image description here

Whenever I see the bandwidth in the chart increase (to double the normal level), I log into the Google dashboard and check the logs, and I find those 500KB entries for about 50% of the requests for that file. It's also easy to reproduce the bug in a browser: I just log in to my servers, request the file, and get random results.

I would be so happy if the problem were in my origin, since I could solve it in 1 minute, but I think it is a Google CDN bug. I would be grateful if someone who knows CDN technology well, or someone from Google Cloud, could assist me.

EDIT:

As I said, this bug happens in random time frames. Here is a video I recorded just now that shows a 'NO BUG TIME FRAME'. As you can see, every response is compressed.

NO BUG TIME FRAME CDN VIDEO

EDIT2:

Here is a graph that shows the number of gzip and non-gzip responses for a single .css URL test.

stacking lines

EDIT3:

In the first graph the lines are stacked; here is the same graph without stacking. As you can see, in some hours nearly 100% of the responses are not gzipped.

not stacking lines

EDIT4:

Here are my parsed origin logs for the same CSS file.

1060 requests were served with a response size below 100KB, with response codes 200, 304 and 206. 32 requests were served with a response size above 100KB, with response codes 200 and 206.

origin server

EDIT5:

Analyzing the logs for 1-7 April, here are some additional stats for a single .css URL:

19803 CDN requests were served with > 100KB (not gzip)

41004 CDN requests were served with < 100KB (gzip)

29 Cache Fill from origin with > 100KB (not gzip)

924 Cache Fill from origin with < 100KB (gzip)

423 Cache-To-Cache fill with > 100KB (not gzip)

2295 Cache-To-Cache fill with < 100KB (gzip)

I'm surprised how effective Cache-To-Cache fill is, amazing.
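To put those counts in proportion, here is a small Python sketch that computes the share of uncompressed responses in each category. The numbers are the ones listed above; the helper name `uncompressed_share` is just illustrative.

```python
# (uncompressed, compressed) request counts from the 1-7 April logs above.
counts = {
    "CDN responses": (19803, 41004),
    "Cache fill from origin": (29, 924),
    "Cache-to-cache fill": (423, 2295),
}


def uncompressed_share(uncompressed: int, compressed: int) -> float:
    """Percentage of requests in a category that were served uncompressed."""
    return 100.0 * uncompressed / (uncompressed + compressed)


for label, (uncompressed, compressed) in counts.items():
    share = uncompressed_share(uncompressed, compressed)
    print(f"{label}: {share:.1f}% uncompressed")
```

Only about 3% of the cache fills from origin were uncompressed, yet roughly a third of the CDN responses were, so a handful of uncompressed fills ends up being served to many clients.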

SOLUTION

There is no bug in the origin, nor in Google CDN. The problem occurs when Google CDN receives a cacheable response without 'Vary: Accept-Encoding', which happens when the request did not send 'Accept-Encoding: gzip'. Google CDN stores that uncompressed response, and it overwrites all the stored compressed cache entries. So the next time a user requests a file, for example a .css file, Google CDN reasons like this:

  1. I received this file from the origin and it does not vary by anything.
  2. Send the uncompressed response.
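The behaviour described above can be reproduced with a toy cache in Python. This is only a sketch of the mechanism as I understand it, not Cloud CDN's actual implementation: `ToyCache`, `origin`, `fixed_origin` and `fetch` are made-up names, and request header names are assumed to be lower-case. The sizes are the Content-Length values from the two responses shown earlier.

```python
class ToyCache:
    """Toy model: one entry per URL, with variants keyed by the Vary'd request headers."""

    def __init__(self):
        self._entries = {}  # url -> (vary_names, {variant_key: response})

    @staticmethod
    def _key(vary_names, request_headers):
        return tuple(request_headers.get(name, "") for name in vary_names)

    def lookup(self, url, request_headers):
        entry = self._entries.get(url)
        if entry is None:
            return None
        vary_names, variants = entry
        return variants.get(self._key(vary_names, request_headers))

    def store(self, url, request_headers, response):
        vary = response["headers"].get("vary", "")
        vary_names = tuple(n.strip().lower() for n in vary.split(",") if n.strip())
        entry = self._entries.get(url)
        if entry is None or entry[0] != vary_names:
            # The Vary header changed (or vanished): drop the old variants.
            entry = (vary_names, {})
            self._entries[url] = entry
        entry[1][self._key(vary_names, request_headers)] = response


def origin(request_headers):
    """Toy origin that omits Vary on uncompressed responses, like the servers I tested."""
    if "gzip" in request_headers.get("accept-encoding", ""):
        return {"headers": {"content-encoding": "gzip", "vary": "Accept-Encoding"},
                "size": 72146}
    return {"headers": {}, "size": 550158}


def fixed_origin(request_headers):
    """Same origin, but always sending Vary: Accept-Encoding."""
    response = origin(request_headers)
    response["headers"]["vary"] = "Accept-Encoding"
    return response


def fetch(cache, url, request_headers, origin_fn):
    """Serve from cache if possible, otherwise fill the cache from the origin."""
    hit = cache.lookup(url, request_headers)
    if hit is not None:
        return hit
    response = origin_fn(request_headers)
    cache.store(url, request_headers, response)
    return response


browser = {"accept-encoding": "gzip, deflate, sdch, br"}
robot = {}  # a client that sends no Accept-Encoding header

for origin_fn in (origin, fixed_origin):
    cache = ToyCache()
    fetch(cache, "/style.css", browser, origin_fn)  # compressed variant is cached
    fetch(cache, "/style.css", robot, origin_fn)    # robot triggers an uncompressed fill
    size = fetch(cache, "/style.css", browser, origin_fn)["size"]
    print(origin_fn.__name__, "-> browser receives", size, "bytes")
```

In this model, a single request without Accept-Encoding is enough to make later gzip-capable clients receive the 550158-byte response, while the origin that always sends Vary keeps the two variants apart.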

Be aware that web servers are typically not configured to send a 'Vary: Accept-Encoding' header in response to requests that do not have an 'Accept-Encoding: gzip' header. I tested this on LiteSpeed, Apache, Nginx and Cloudflare Nginx.

I highly recommend that the Google team update the documentation about this. There is a statement about 'Vary headers', but nobody will get the point regarding this issue: neither I, nor Google first-level support (I also had 20 days of communication on the Google issue tracker with two Google support people), nor Stack Overflow, nor anyone else was able to identify the problem.

Additionally, the documentation says:

In addition to the request URI, Cloud CDN respects any Vary headers that instances include in responses.

But it says nothing about what happens when a response does not have a 'Vary' header.

This is how I fixed it:

<FilesMatch '\.(js|css|xml|gz|html|txt|xsd|xsl|svg|svgz)$'>
    Header merge Vary Accept-Encoding
</FilesMatch>

1 Answers

4
votes

Google Cloud CDN neither compresses nor decompresses responses from your origin. Instead, it respects the origin server's Vary: Accept-Encoding response header and caches separate variants based on the client's Accept-Encoding request header. Clients that support gzip compression should get one variant while clients that don't should get another.

The problem is that the example uncompressed response you provided is missing the Vary: Accept-Encoding header:

Accept-Ranges:bytes
Age:58422
Alt-Svc:clear
Cache-Control:public, max-age=604800
Content-Length:550158
Content-Type:text/css
Date:Tue, 04 Apr 2017 03:45:53 GMT
Expires:Tue, 11 Apr 2017 03:45:53 GMT
Last-Modified:Sun, 19 Mar 2017 01:50:22 GMT
Server:LiteSpeed
Via:1.1 google

The above response instructs Cloud CDN to use the uncompressed variant for all clients, regardless of whether they support gzip compression. Once a response without a Vary: Accept-Encoding header winds up in the cache, Cloud CDN will use that cached response for all clients. The fix is for the origin server to include a Vary: Accept-Encoding header in its responses.

Can you share the details of how you enabled gzip compression? It appears that sometimes your origin server fails to include the Vary: Accept-Encoding header in its responses. Perhaps it doesn't include that header when it thinks the client doesn't support gzip compression?
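One way to check this is to probe the origin twice, once with and once without Accept-Encoding: gzip, and look at the Vary header on each response. Below is a minimal Python sketch of that check. `vary_problems` and `fake_origin` are hypothetical names, and `fake_origin` is only a stand-in that mimics the suspected behavior; for a real check you would replace it with a function that sends the request directly to your origin (bypassing the CDN) and returns the response headers.

```python
def vary_problems(fetch_headers):
    """Return descriptions of responses that lack 'Vary: Accept-Encoding'.

    fetch_headers is a hypothetical callable: it takes a dict of request
    headers and returns a dict of the origin's response headers.
    """
    probes = {
        "with Accept-Encoding: gzip": {"Accept-Encoding": "gzip"},
        "without Accept-Encoding": {},
    }
    problems = []
    for label, request_headers in probes.items():
        # Header names are case-insensitive, so normalise them first.
        response = {k.lower(): v for k, v in fetch_headers(request_headers).items()}
        vary = [token.strip().lower() for token in response.get("vary", "").split(",")]
        if "accept-encoding" not in vary:
            problems.append(f"response {label} has no Vary: Accept-Encoding")
    return problems


def fake_origin(request_headers):
    """Stand-in origin: sends Vary only when it actually compresses the response."""
    if "gzip" in request_headers.get("Accept-Encoding", ""):
        return {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
    return {}


for problem in vary_problems(fake_origin):
    print(problem)
```

If the check reports a problem for the request without Accept-Encoding, the origin matches the pattern described above, and adding the Vary header to all responses for those files should resolve it.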