3
votes

Stackdriver uptime check alerts are slow to arrive when my site goes down

We use Cloudflare as our CDN provider. We use Stackdriver to test site availability from outside, with the check interval set to 1 minute. We use Scalyr to collect and analyse our logs, and it sends us an alert if it finds certain predefined errors in them. What we have experienced is that when the site is down (users cannot access it), Scalyr always alerts us first, and Stackdriver's alert arrives 3-5 minutes later.

Because Scalyr works from log details, its alerts are not always about the site being down, so we need to rely on Stackdriver for that. I tried limiting the check locations to just the US, but that did not help at all; we still get the alert about 5-6 minutes after the site goes down.

Is there anything else I can do to improve Stackdriver alerting latency?

2
Same problem for me. It takes even longer for email notifications to arrive :/ - Marcin

2 Answers

0
votes

Can you post a screenshot of your Stackdriver Uptime Check to confirm the probe frequency is set to 1 minute?

For the alert policy associated with the uptime check, try editing the alert condition and changing the duration value to "most recent value".

(Screenshot: Stackdriver alert policy, condition duration set to "most recent value")
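If you manage the policy through the API rather than the console, the same change can be sketched roughly as below. This is only an illustration, not your actual policy: the field names follow the Monitoring API v3 AlertPolicy resource as I understand it, the check ID `my-site-check` and the helper `build_uptime_alert_policy` are hypothetical, and I am assuming that "most recent value" in the console corresponds to a condition `duration` of `0s`. The aggregation settings are likewise assumptions you should compare against your existing policy.

```python
import json


def build_uptime_alert_policy(check_id: str, duration_seconds: int = 0) -> dict:
    """Build an AlertPolicy request body for a failing uptime check.

    duration_seconds is how long the condition must hold before the
    alert fires; 0 is assumed to match "most recent value" in the UI.
    """
    # Select the pass/fail time series of one uptime check (hypothetical ID).
    metric_filter = (
        'metric.type="monitoring.googleapis.com/uptime_check/check_passed" '
        'AND resource.type="uptime_url" '
        f'AND metric.label.check_id="{check_id}"'
    )
    return {
        "displayName": f"Uptime failure: {check_id}",
        "combiner": "OR",
        "conditions": [
            {
                "displayName": "Uptime check failing",
                "conditionThreshold": {
                    "filter": metric_filter,
                    # Fire when more than one location reports a failure.
                    "comparison": "COMPARISON_GT",
                    "thresholdValue": 1,
                    # A longer duration adds directly to alerting latency.
                    "duration": f"{duration_seconds}s",
                    "aggregations": [
                        {
                            "alignmentPeriod": "60s",
                            "perSeriesAligner": "ALIGN_NEXT_OLDER",
                            "crossSeriesReducer": "REDUCE_COUNT_FALSE",
                            "groupByFields": ["resource.label.*"],
                        }
                    ],
                },
            }
        ],
    }


policy = build_uptime_alert_policy("my-site-check")
print(json.dumps(policy, indent=2))
```

The point of the sketch is the `duration` field: if your current condition uses something like `300s`, the failure has to persist for that long before a notification is sent, which would be consistent with the 5-6 minute delay you describe.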

0
votes

Stackdriver slowness is by design: https://github.com/googleapis/nodejs-logging-winston/issues/515

Expect delays of 4-10 minutes between a logging event and being able to see the statement in Stackdriver. If you need responsive log indexing, migrate to another service.