
Today the index pages of two of my domains (nine domains altogether) were redirected to an Amazon page. All other pages worked fine. The websites are custom coded.

My first thought was that the websites had been hacked, but I didn't find a single file modified within the last 24 hours. I went through the other possible causes and found nothing.

The last unknown was Varnish, installed a couple of weeks ago. Sure enough, after restarting Varnish and clearing the cache, the redirection stopped...

So the question is: can the Varnish cache be modified from outside?

I'm not a Varnish expert, since it has been on my server for only a short time, and I'm aware that my config file is probably a mess, but any suggestions are appreciated.

Thank you, derek

UPDATE: Thank you for the answer.

Once the cache is refreshed and the redirection is removed, other domains are affected in the same way the next day. Purging the single URL '/' removes the redirection until the next time. I set up a script that checks the page status to get the exact time when it occurs. I got the time but cannot find much in the logs. There are no Varnish commands in syslog.

It now happens on two physical VPS servers running exactly the same source code.

Below are a few lines from varnishncsa. The HEAD requests come from my script: the first one returns status 200 and the last one is redirected (302) to Amazon.

1.2.3.4 - - [11/Jun/2016:22:40:23 -0400] "HEAD http://www.domain.com/ HTTP/1.1" 200 0 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.15.3 zlib/1.2.3 libidn/1.18 libssh2/1.4.2"
107.170.81.129 - - [11/Jun/2016:22:40:29 -0400] "GET http://www.domain.ca/search/?catid=1&sub_catid=22&sub_sub_catid=34 HTTP/1.1" 200 5908 "http://www.domain.com/categories/sitemap/" "Mozilla/5.0 (compatible; spbot/5.0.2; +http://OpenLinkProfiler.org/bot )"
100.43.81.151 - - [11/Jun/2016:22:40:39 -0400] "GET http://www.domain.com/ HTTP/1.1" 302 205 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
100.43.91.12 - - [11/Jun/2016:22:40:39 -0400] "GET http://www.domain.com/robots.txt HTTP/1.1" 302 205 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
100.43.81.151 - - [11/Jun/2016:22:40:39 -0400] "GET http://domain.com/robots.txt HTTP/1.1" 301 0 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
100.43.81.151 - - [11/Jun/2016:22:40:39 -0400] "GET http://domain.com/robots.txt HTTP/1.1" 301 0 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
100.43.81.151 - - [11/Jun/2016:22:40:41 -0400] "GET http://www.domain.com/ HTTP/1.1" 200 4046 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
100.43.91.12 - - [11/Jun/2016:22:40:41 -0400] "GET http://domain.com/ HTTP/1.1" 301 0 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
100.43.81.151 - - [11/Jun/2016:22:40:41 -0400] "GET http://domain.com/ HTTP/1.1" 301 0 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
68.180.228.126 - - [11/Jun/2016:22:40:48 -0400] "GET http://www.domain.ca/profile/Faro HTTP/1.1" 200 7060 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
104.193.88.243 - - [11/Jun/2016:22:40:55 -0400] "GET http://www.domain.uk/search/?catid=377&sub_catid=448&sub_sub_catid=461 HTTP/1.1" 200 33613 "-" "Mozilla/5.0 (Windows NT 5.1; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"
117.78.13.18 - - [11/Jun/2016:22:41:13 -0400] "GET http://www.domain.com/robots.txt HTTP/1.0" 200 405 "-" "nutch-1.4/Nutch-1.4"
1.2.3.4 - - [11/Jun/2016:22:41:23 -0400] "HEAD http://www.domain.com/ HTTP/1.1" 302 0 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.15.3 zlib/1.2.3 libidn/1.18 libssh2/1.4.2"

And here are the headers when the redirection occurs:

Request URL: http://www.example.com/
Request method: GET
Remote address: 1.2.3.4:80
Status code: 302 Found
Version: HTTP/1.1


Request headers:
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive


Response headers:

Age: 37681
Cache-Control: public
Connection: keep-alive
Content-Length: 205
Content-Type: text/html; charset=iso-8859-1
Date: Sun, 12 Jun 2016 02:40:41 GMT
Location: http://www.amazon.com
Server: Apache
Via: 1.1 varnish-v4
X-Varnish: 1249239 1443890



Request URL: http://www.amazon.com/
Request method: GET
Remote address: 54.239.25.200:80
Status code: 301 Moved Permanently
Version: HTTP/1.1


Request headers:

Host: www.amazon.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

Response headers:
Content-Encoding: gzip
Content-Type: text/html; charset=ISO-8859-1
Date: Sun, 12 Jun 2016 13:08:43 GMT
Location: https://www.amazon.com/179-0743706-1316952
P3P: policyref="https://www.amazon.com/w3c/p3p.xml",CP="CAO DSP LAW CUR ADM IVAo IVDo CONo OTPo OUR DELi PUBi OTRi BUS PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA 

All domains have been online for about two years without any problems, and Varnish was installed two weeks ago.

For now I have been forced to pass the index page, and if no solution is found I will try downgrading Varnish to see if that helps.
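For reference, passing the index page could look roughly like the fragment below. This is only a sketch of the stopgap, not the exact rule in use. It assumes it is added to default.vcl after the existing vcl_recv (Varnish concatenates same-named subroutines in the order they appear), so the device detection and the non-www redirect still run first.

```vcl
# Stopgap sketch: bypass the cache for the index page only.
# Assumes this fragment is placed after the existing vcl_recv in default.vcl.
sub vcl_recv {
    if (req.url == "/") {
        # pass = fetch from the backend on every request, store nothing
        return (pass);
    }
}
```

This hides the symptom at the cost of sending every index request to the backend; it does not explain where the cached redirect comes from.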

Beyond that, I have no clue where to start, or what to look for and where.

Below is my default.vcl file:

vcl 4.0;

# Default backend definition. Set this to point to your content server.
backend default {
    .host = "2.3.4.5";
    .port = "8080";
    .first_byte_timeout = 300s;
    .connect_timeout = 5s;
    .between_bytes_timeout = 60s;
}

acl allowed_ip {
    # Access Control List used to warm up the cache
    "1.2.3.0/22";   
    "2.3.4.5";
}


sub vcl_recv {

    # Do not cache 
    if ( req.url ~ "^/sitemap-(index|ads|profiles|static)\.xml")
    { return( pass ); }


    # Do not allow external access
    if (req.url ~ "^/(crone_job|sitemap_generator)\.php" && !client.ip ~ allowed_ip) {
        set req.url = "/";
    }


    # Detect device and redirect to proper site
    if ( (req.http.host ~ "www\.domain\.(ca|com|uk)" ||
        req.http.host ~ "^domain\.(ca|com|uk)" ) &&
        !(req.url ~ "\.(jpg|jpeg|png|gif|bmp|mp4|ogv|webm|m4a|ogg|doc|docx|xls|xlsx|pps|ppt|pptx|txt|rtf|csv|xml|pdf|zip|odf|ods)$" )) {

        call device_detection;
    }

    # Redirect non-www domain to www
    if (req.http.host ~ "^domain\.(ca|com|uk)$") {
       return (synth (750, ""));
    }

    # Only deal with "normal" types
      if (req.method != "GET" &&
          req.method != "HEAD" &&
          req.method != "PUT" &&
          req.method != "POST" &&
          req.method != "TRACE" &&
          req.method != "OPTIONS" &&
          req.method != "PATCH" &&
          req.method != "DELETE") {
       # /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
      }

      # Only cache GET or HEAD requests. This makes sure the POST requests are always passed.
      if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
      }

      # First remove the Google Analytics added parameters, useless for our backend
      if (req.url ~ "(\?|&)(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=") {
          set req.url = regsuball(req.url, "&(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)", "");
          set req.url = regsuball(req.url, "\?(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)", "?");
          set req.url = regsub(req.url, "\?&", "?");
          set req.url = regsub(req.url, "\?$", "");
      }


      # Remove the "has_js" cookie
      set req.http.Cookie = regsuball(req.http.Cookie, "has_js=[^;]+(; )?", "");

      # Remove any Google Analytics based cookies
      set req.http.Cookie = regsuball(req.http.Cookie, "__utm.=[^;]+(; )?", "");
      set req.http.Cookie = regsuball(req.http.Cookie, "_ga=[^;]+(; )?", "");
      set req.http.Cookie = regsuball(req.http.Cookie, "_gat=[^;]+(; )?", "");
      set req.http.Cookie = regsuball(req.http.Cookie, "utmctr=[^;]+(; )?", "");
      set req.http.Cookie = regsuball(req.http.Cookie, "utmcmd.=[^;]+(; )?", "");
      set req.http.Cookie = regsuball(req.http.Cookie, "utmccn.=[^;]+(; )?", "");


      if (req.http.Cookie ~ "user_name=" || req.http.Cookie == "registeredDevice") {
          set req.http.Cookie = ";" + req.http.Cookie;
          set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
          set req.http.Cookie = regsuball(req.http.Cookie, ";(PHPSESSID|user_name|registeredDevice)=", "; \1=");
          set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
          set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

         if (req.http.Cookie == "") {
              unset req.http.Cookie;
          }
      }



      # Are there cookies left with only spaces or that are empty?
      if (req.http.cookie ~ "^\s*$") {
          unset req.http.cookie;
      }


      # Remove all cookies for static files
      if (req.url ~ "^[^?]*\.(css|jpeg|jpg|js|txt|ico)(\?.*)?$"){
        unset req.http.Cookie;
         return (hash);
      }

      if (req.url ~ "^/image.php." || 
          req.url ~ "publication.php" ||
          req.url ~ "google_map.php" ) {
        unset req.http.Cookie;
      }

      # Send Surrogate-Capability headers to announce ESI support to backend
      set req.http.Surrogate-Capability = "key=ESI/1.0";



    # if (req.http.Authorization || req.method == "POST") {
    if ( req.method == "POST") {
        return (pass);
    }

    # Normalizing namespace
    if (req.http.host ~ "(?i)^(www.)?domain.ca") {
        set req.http.host = "www.domain.ca"; } 

    if (req.http.host ~ "(?i)^(www.)?domain.com") {
        set req.http.host = "www.domain.com"; }

    if (req.http.host ~ "(?i)^(www.)?domain.uk") {
        set req.http.host = "www.domain.uk"; }



#   the script varnish-cache-warmup.sh must always refresh the cache
    if (client.ip ~ allowed_ip && req.http.Cache-Control ~ "no-cache") {
        set req.hash_always_miss = true;
    }
}

sub vcl_backend_response {

    if(
       bereq.url == "/" ||
       bereq.url == "/about-us/" ||
       bereq.url == "/contact/" ||
       bereq.url == "/blog/" ||
       bereq.url == "/categories/sitemap/" ||
       bereq.url == "/help/"  
                                    ){

       # cache, ignoring any cache headers
       set beresp.ttl = 24h;


       unset beresp.http.Pragma;
       unset beresp.http.Set-Cookie;
       set beresp.http.Cache-Control = "public"; # max-age=0; s-maxage=1800";
       unset beresp.http.Expires;
       set bereq.http.Cookie = regsuball(bereq.http.Cookie, "PHPSESSID=[^;]+(; )?", "");
       unset bereq.http.Cookie;

     }


    if (beresp.http.Surrogate-Control ~ "ESI/1.0") {
        unset beresp.http.Surrogate-Control;
        set beresp.do_esi = true;
    }


    # Enable cache for all static files
    if (bereq.url ~ "^[^?]*\.(css|jpeg|jpg|js|txt|ico)(\?.*)?$") {  
       unset beresp.http.set-cookie;
    }

    if (bereq.url ~ "^/image.php.") {
       unset beresp.http.set-cookie;
    } 


    # Varnish 4 fully supports Streaming, so use streaming here to avoid locking.
    if (bereq.url ~ "^[^?]*\.(7z|avi|bz2|flac|flv|gz|mka|mkv|mov|mp3|mp4|mpeg|mpg|ogg|ogm|opus|rar|tar|tgz|tbz|txz|wav|webm|xz|zip)(\?.*)?$") {
       unset beresp.http.set-cookie;
       set beresp.do_stream = true;  # Check memory usage it'll grow in fetch_chunksize blocks (128k by default) if the backend doesn't send a Content-Length header, so only enable it for big objects
       set beresp.do_gzip   = false;   # Don't try to compress it for storage
    }


    # Set 2min cache if unset for static files
    if (beresp.ttl <= 0s || beresp.http.Set-Cookie || beresp.http.Vary == "*") {
        set beresp.ttl = 120s; # Important, you shouldn't rely on this, SET YOUR HEADERS in the backend
        set beresp.uncacheable = true;
       return (deliver);
    }

    # Don't cache 50x responses
    if (beresp.status == 500 || beresp.status == 502 || beresp.status == 503 || beresp.status == 504 || beresp.status == 403) {
        return (abandon);
    }
    # Allow stale content, in case the backend goes down.
    # make Varnish keep all objects for 6 hours beyond their TTL
        set beresp.grace = 6h;


    return (deliver);
}

sub vcl_deliver {


}

sub vcl_synth {

   # Redirect non-www domain to www
   if (resp.status == 750) {
    set resp.status = 301;
    set resp.http.Location = "http://www." + req.http.host + req.url;
    return(deliver);
   }

   # Redirect to mobile site
   if (resp.status == 751) {
    set resp.status =301;
    set req.http.host = regsub(req.http.host, "^www\.","");
    set resp.http.Location = "http://m." + req.http.host + req.url;
    return(deliver);
   }

}

sub device_detection {

      set req.http.X-Device = "pc";
      if (req.http.User-Agent ~ "iP(hone|od)" || 
              req.http.User-Agent ~ "Android" || 
              req.http.User-Agent ~ "Symbian" || 
              req.http.User-Agent ~ "^BlackBerry" || 
              req.http.User-Agent ~ "^SonyEricsson" || 
              req.http.User-Agent ~ "^Nokia" || 
              req.http.User-Agent ~ "^SAMSUNG" || 
              req.http.User-Agent ~ "^LG" || 
              req.http.User-Agent ~ "webOS") 
          { set req.http.X-Device = "mobile"; } 

          if (req.http.User-Agent ~ "^PalmSource")
          { set req.http.X-Device = "mobile"; }

      if (req.http.User-Agent ~ "Build/FROYO" || 
              req.http.User-Agent ~ "XOOM" ) {
        set req.http.X-Device = "pc";
      }

      if (req.http.X-Device == "mobile") {

              return (synth(751, ""));
     }
}
Mystery solved. There was a rule in my .htaccess redirecting Yandex IP addresses to Amazon. So when my index page was not cached and Yandex came for a visit, the request was redirected to Amazon and that redirect was cached by Varnish. Moving the rules from .htaccess to Varnish solved the problem. Anyway, thanks for the answer.
dewu
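The mechanism can be seen in the VCL above: vcl_backend_response sets a 24h TTL for "/" and a few other URLs regardless of the response status, so a backend 302 meant for one client gets stored and served to everyone. The fragment below is a rough sketch of the two possible fixes, not the actual rules that were moved. The ACL name, the 752 status code, and the choice of the two addresses (taken from the YandexBot lines in the varnishncsa excerpt) are illustrative assumptions. It assumes the fragments are placed above the existing subroutines in default.vcl so they run first.

```vcl
# Hypothetical ACL: placeholder addresses copied from the log excerpt.
acl redirected_clients {
    "100.43.81.151";
    "100.43.91.12";
}

sub vcl_recv {
    # Decide per-client redirects in Varnish, before the cache lookup.
    # synth() responses are built per request and are not stored in the cache.
    if (client.ip ~ redirected_clients) {
        return (synth(752, ""));
    }
}

sub vcl_synth {
    # 752 is an arbitrary internal marker, like 750 and 751 in the config above.
    if (resp.status == 752) {
        set resp.status = 302;
        set resp.http.Location = "http://www.amazon.com";
        return (deliver);
    }
}

sub vcl_backend_response {
    # Guard for the URLs that get the forced 24h TTL: if the backend did not
    # answer 200, do not let the response be cached for a day.
    if (bereq.url ~ "^/(about-us/|contact/|blog/|categories/sitemap/|help/)?$" &&
        beresp.status != 200) {
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;   # hit-for-pass for two minutes
        return (deliver);
    }
}
```

Either fix on its own should prevent the poisoned cache entry. The guard in vcl_backend_response is the more general one, since it also covers any other client-specific redirect the backend might issue for those URLs.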

1 Answer


Varnish is just software like any other, so guarantees are hard to make.

If you judge by earlier incidents, Varnish has a very good security history and appears to be mostly safe.

As far as your VCL goes, there is nothing in there that allows for the behaviour you describe. In fact it would be very hard to introduce something like this on the Varnish level, as Varnish doesn't normally support rewriting/changing response bodies.