0
votes

I hope you can help

I have an image server that generates images on the fly. I'm using Varnish to cache the generated images.

I need to record how many requests (per image) Varnish receives, as well as whether each one was a hit or a miss (a pass gets marked as a miss). Currently, I write access logs with hit/miss to a file; a cron job then processes this access log and writes the data to my DB...

What I would like to do instead is:

Have Varnish make a request to my backend notifying it of a cache hit (and, if possible, the response size in bytes). My backend could then save this data...

Is this at all possible and if so how?


In case anybody is interested:

  • 2 Varnish instances, each with 1 (Java + Tomcat) backend.
  • The service manipulates and generates each image according to the requirements given in the request...

The figures below are per day:

  • Over 35 million page views where each page has at least 3 images in it.
  • Varnish gets around 3+ million requests for images (images are also cached by the browser).
  • Varnish has an 87% hit rate.
  • Response times for a hit are a few microseconds.
  • Response times for a miss are 50ms to 1000ms depending on the size of the image (both source and output)
2

2 Answers

2
votes

The best way of doing this is to have a helper process that tails varnishlog output and does the HTTP calls when needed.

You can do this by logging the necessary data with std.log() in vcl_deliver, so the helper process gets all the data it needs. Use obj.hits > 0 to check if this was a cache hit.
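As a rough illustration of what such a helper could look like, here is a minimal Python sketch that tallies hits and misses per image from log lines. The record format (`imgstat url=... hit=...`) is purely an assumption: it stands for whatever string you choose to pass to std.log() in vcl_deliver, and the function name `tally` is hypothetical too.

```python
import re
from collections import defaultdict

# Hypothetical record format, assumed to be what std.log() writes in vcl_deliver:
#   imgstat url=/img/a.jpg hit=1
RECORD = re.compile(r"imgstat url=(?P<url>\S+) hit=(?P<hit>[01])")


def tally(lines):
    """Count hits and misses per URL from an iterable of log lines."""
    counts = defaultdict(lambda: {"hit": 0, "miss": 0})
    for line in lines:
        match = RECORD.search(line)
        if match is None:
            continue  # not one of our records, skip it
        key = "hit" if match.group("hit") == "1" else "miss"
        counts[match.group("url")][key] += 1
    return dict(counts)
```

Passing `sys.stdin` as `lines` would let the script consume piped varnishlog output. Note that `tally` only returns once the input ends, so a long-running helper would need to report its counts periodically instead.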

If you really, really need to do it inline (which will slow down all your cache hits badly), you can use libvmod-curl:

https://github.com/varnish/libvmod-curl
1
votes

If you are going to send a request to a stats server from within your VCL, I would try to incorporate some type of aggregate request, where you send it every 100 (or whatever) requests instead of on every single incoming request.
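To make the batching idea concrete, here is a small Python sketch. It shows the aggregation on the helper-process side rather than inside VCL, which is a swap on my part. The class name `HitBatcher` and the `send` callable are hypothetical; `send` stands for whatever delivers one batch to your backend (an HTTP POST, a DB write, and so on).

```python
from collections import Counter


class HitBatcher:
    """Buffer per-image counts and deliver them in one call every `batch_size` events."""

    def __init__(self, send, batch_size=100):
        self.send = send              # hypothetical callable that delivers one batch
        self.batch_size = batch_size
        self.pending = Counter()      # url -> count since the last flush
        self.seen = 0                 # events recorded since the last flush

    def record(self, url):
        """Register one event and flush once the batch is full."""
        self.pending[url] += 1
        self.seen += 1
        if self.seen >= self.batch_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered (if anything) and reset the buffer."""
        if self.pending:
            self.send(dict(self.pending))
        self.pending.clear()
        self.seen = 0
```

With a batch size of 100, the backend sees one request per 100 events instead of 100 separate ones. Calling `flush()` on shutdown (or on a timer) avoids losing a partially filled batch.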

Like the other answer, I would recommend using varnishncsa (or varnishlog) with a process that tails the log file. There could be some delay with that method, but if that is acceptable, I would consider post-processing the Varnish log when logrotate runs. That way you have a full day's worth of data and can churn through it, producing whatever report you need.