
How frequently does Traffic Manager monitor endpoints? It is clearly not event driven: in my observations, when an endpoint goes down it takes anywhere from 30 seconds to 2.5 minutes for Traffic Manager to pick up the new status. Can this frequency be configured? I cannot see any setting for it.

Is there a relationship between the Traffic Manager monitoring interval and the DNS TTL?

This may look like a general question, but my real issue is that I experience service downtime in a failover scenario (failover of the primary). I understand the effect of the TTL: until the client's DNS cache expires, clients keep calling the cached endpoint. I have spent a lot of time on this and have now narrowed it down to a specific question.

The issue is that Traffic Manager takes a while to pick up an endpoint's status after the endpoint is stopped or started. I need a logical explanation for this delay, and I could not find any Azure reference that explains it.

Traffic manager settings

[Two screenshots of the Traffic Manager settings]

I need to understand this delay so that I can plan for the downtime.


1 Answer


I have run into the same issue. The link below explains the monitoring behaviour:

Traffic Manager Monitoring

The monitoring system performs a GET, but does not receive a response in 10 seconds or less. It then performs three more tries at 30 second intervals. This means that at most, it takes approximately 1.5 minutes for the monitoring system to detect when a service becomes unavailable. If one of the tries is successful, then the number of tries is reset. Although not shown in the diagram, if the 200 OK message(s) come back more than 10 seconds after the GET, the monitoring system will still count this as a failed check.

This explains the delay of 30 seconds to a couple of minutes that you observed.

Basically, going by those details, the maximum delay would be about 1.5 minutes plus the DNS TTL.
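To put rough numbers on this for planning, here is a small Python sketch of the behaviour described in the quote above. It is only a model built from that description, not Traffic Manager's actual implementation: the function names are made up for illustration, the 10-second timeout, 30-second interval and three retries are taken from the quoted text, and the 30-second TTL in the usage lines is an assumed example value (use whatever TTL your profile is configured with).

```python
# Values taken from the quoted monitoring description.
PROBE_TIMEOUT_S = 10             # a reply slower than this counts as a failed check
PROBE_INTERVAL_S = 30            # spacing between consecutive checks
RETRIES_AFTER_FIRST_FAILURE = 3  # "three more tries" after the first failed GET


def seconds_until_marked_down(response_times):
    """Model the probe loop (hypothetical helper, not an Azure API).

    response_times: one entry per probe, either the latency in seconds
    or None when no response arrives.
    Returns the seconds between the first failed probe of the final
    failure streak and the probe that marks the endpoint down, or None
    if the endpoint is never marked down.
    """
    consecutive_failures = 0
    first_failure_at = None
    for i, latency in enumerate(response_times):
        now = i * PROBE_INTERVAL_S
        failed = latency is None or latency > PROBE_TIMEOUT_S
        if not failed:
            # Any successful check resets the retry counter.
            consecutive_failures = 0
            first_failure_at = None
            continue
        if consecutive_failures == 0:
            first_failure_at = now
        consecutive_failures += 1
        if consecutive_failures == 1 + RETRIES_AFTER_FIRST_FAILURE:
            return now - first_failure_at
    return None


def worst_case_failover_seconds(ttl_s):
    """Detection time plus the time clients may keep a cached DNS answer."""
    detection = RETRIES_AFTER_FIRST_FAILURE * PROBE_INTERVAL_S
    return detection + ttl_s


# Endpoint stops responding entirely: four failed checks in a row.
print(seconds_until_marked_down([None, None, None, None]))  # 90

# Assumed example TTL of 30 seconds.
print(worst_case_failover_seconds(ttl_s=30))  # 120
```

The model ignores the final 10-second timeout and the time between the endpoint actually failing and the next scheduled check, so treat the result as an approximation in line with the "approximately 1.5 minutes" in the quote.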