0
votes

I use Application Insights "Availability" feature to check a web site availability and send an alert if it is down. Now Application Insights sends an alert every 5 minutes, even the "alert failure time window" is 15 minutes. Test frequency is 5 minutes.

So I get an alert after 5 minutes, then after 10 minutes, then after 15 minutes! I get 3 alerts while I need only 1 one alert after 15 minutes. It looks like a bug for me.

How to prevent Application Insights Availability feature to send alerts every 5 minutes? enter image description here

1

1 Answers

1
votes

The email (notification) is sent the moment alert condition is satisfied. It doesn't wait for alert failure time window.

Example: for alerting rule to send notification if 3 locations out of 5 turn red, and 3 locations turning red within the first second => notification will be sent during the same second. It will not wait for 5 (or 15) minutes.

This is by design with the goal to reduce TTD (time to detect).

There are two ways to handle noise:

  1. Configure retries (test will retry 2 times during red => green state switch)
  2. Increase the number of locations to trigger alert (for instance, 14 out of 16)

Either way - only one notification is supposed to be sent, not every 5/15 minutes. Multiple notifications suggest either some bug in tracking current state of an alert (bug in a product) or an Application which intermittently fails (so, alerting rule constantly changes its states green => red => green => ..., as a result email is sent during every transition). Do you get alert every 5 minutes when tests are red all the time?

Alert failure time window defines what failed location means. 5 min test interval and 5 min alert failure means that 1 last result defines whether location failed or not. 5 min test interval and 15 min alert failure means that 3 last results define whether location failed or not. So, if one of those 3 test runs failed then location is considered as failed (even though 2 results after it might have been successes).

Increasing alert failure time window makes alerting rule more aggressive (and noisy for intermittently failing apps).