It appears to be a measurement of how long the server takes to generate its response, from the ELB's perspective, without regard to how long the ELB might then need to return that response to the client.
I came to this conclusion by reviewing my own logs in one of my applications, which uses ELB in front of another load balancer, HAProxy, which in turn is in front of the actual application servers. (This may seem redundant, but it gives us several advantages over using only ELB or only HAProxy.)
Here's the setup I'm referring to:
ELB -->>-- EC2+HAProxy -->>-- EC2+Nginx (multiple instances)
HAProxy logs several time metrics on each request, including one called Tr:

> Tr: server response time (HTTP mode only). It's the time elapsed between the moment the TCP connection was established to the server and the moment the server sent its complete response headers. It purely shows its request processing time, without the network overhead due to the data transmission.
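For reference, these timers appear in HAProxy's default HTTP log format as a single slash-separated field (Tq/Tw/Tc/Tr/Tt). Here's a minimal sketch of pulling Tr out of a log line -- this assumes the classic default format, so you'd need to adjust the pattern for a custom log-format:

```python
import re

# The timer field in HAProxy's default HTTP log format: Tq/Tw/Tc/Tr/Tt.
# The fourth value is Tr, the server response time in milliseconds.
# Values can be -1 when a phase was aborted, hence the optional minus sign.
TIMERS = re.compile(r" (-?\d+)/(-?\d+)/(-?\d+)/(-?\d+)/(-?\d+) ")

def extract_tr(log_line):
    """Return Tr (in ms) from an HAProxy HTTP log line, or None if absent."""
    m = TIMERS.search(log_line)
    return int(m.group(4)) if m else None

# Example line adapted from the HAProxy documentation; Tr here is 69 ms.
line = ('haproxy[14389]: 10.0.1.2:33317 [06/Feb/2009:12:14:14.655] http-in '
        'static/srv1 10/0/30/69/109 200 2750 - - ---- 1/1/1/1/0 0/0 '
        '"GET /index.html HTTP/1.1"')
print(extract_tr(line))  # -> 69
```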
Now, stick with me for an explanation of why all this detail about HAProxy's behavior is relevant to ELB and the Latency metric.
Although HAProxy logs a number of other timers covering how long the proxy spends waiting for various events on each request/response, Tr is the single timer in my HAProxy logs that corresponds, minute by minute, to the values CloudWatch logs for the ELB's Latency metric, give or take a millisecond or two; the other timers vary wildly. So I would suggest that this ELB metric is likewise logging the response time of your application server, unrelated to any additional time that might be required to deliver the response back to the client.
Given HAProxy's definition of the timer in question, it seems very unlikely that HAProxy and ELB would agree so consistently unless ELB's timer is measuring something very similar to what HAProxy is measuring, since both systems are literally observing the performance of the same exact app servers on the same exact requests.
If your application server doesn't benchmark itself and log timers of its own performance, you may want to consider adding them, since (in my observation) high values for the Latency metric do suggest that your application may have a responsiveness issue that is unrelated to client connection quality.
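As one way to add such a timer, here's a minimal sketch for a Python WSGI application -- the middleware name and log format are my own invention, and this measures only the time until the app returns its response, roughly analogous to what Tr and the Latency metric capture:

```python
import logging
import time

logging.basicConfig(level=logging.INFO)

def timing_middleware(app):
    """Wrap a WSGI app and log how long each request takes to produce its
    response, measured from inside the application itself."""
    def wrapper(environ, start_response):
        start = time.monotonic()
        try:
            return app(environ, start_response)
        finally:
            elapsed_ms = (time.monotonic() - start) * 1000
            logging.info("%s %s took %.1f ms",
                         environ.get("REQUEST_METHOD"),
                         environ.get("PATH_INFO"),
                         elapsed_ms)
    return wrapper
```

Comparing these in-app numbers against the Latency metric tells you whether a spike originates in your code or somewhere in front of it.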