3
votes

I need help finding a good pattern for a custom Application Insights metric.

Environment

I have a custom Windows Service running on multiple Azure VMs. I can successfully add events to my monitoring instance on Azure.

Goal

I want to create a custom metric that lets me monitor, per instance, whether my Windows services are running and responding. Ideally it would behave like the response timeout in the website metrics. Each service instance has a custom machine-related identifier, like:

TelemetryClient telemetry = new TelemetryClient();
telemetry.Context.Device.Id = FingerPrint.Instance;

Now I want to create an alert that fires if one of my service instances (identified by Context.Device.Id) is not running or not responding.

Question

How can I achieve this? Is it even possible or useful to monitor multiple instances of one service type inside a single Application Insights resource? Or must I create a separate Application Insights resource per instance? Can anybody help me?

Response to Paul's answer

"Track Metric: Use TrackMetric to send metrics that are not attached to particular events. For example, you could monitor a queue length at regular intervals."

If I do that, what happens if my server restarts (for an update or something similar) and my service doesn't start up again? The service then sends no TrackMetric call to Application Insights at all, so no alert is raised, because the value never drops below 1, even though the service is not running.

Regards Steffen

3 Answers

4
votes

I found a good working solution, with only a few simple steps.

1) Implement an HttpListener instance on a service port (for example 8181) that returns the simple text response "200: OK"

2) Add a matching endpoint to the Azure VM instance

3) Create a default web test on "myVM.cloudapp.net:8181" with checkup of response text
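
Step 1 could look roughly like the following. This is only a minimal sketch: the class name HealthEndpoint, the method name RunAsync, and the prefix "http://+:8181/" are illustrative assumptions, not part of the original service. On Windows, listening on a "+" prefix typically needs either administrator rights or a URL reservation made with netsh.

```csharp
using System;
using System.Net;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper: answers every HTTP request with the plain text "200: OK".
public static class HealthEndpoint
{
    // The default prefix is an assumption; use the port your VM endpoint maps to.
    public static async Task RunAsync(string prefix = "http://+:8181/")
    {
        using (var listener = new HttpListener())
        {
            listener.Prefixes.Add(prefix);
            listener.Start();

            while (listener.IsListening)
            {
                // Wait for the next request (for example from the web test).
                HttpListenerContext context = await listener.GetContextAsync();

                byte[] body = Encoding.UTF8.GetBytes("200: OK");
                context.Response.StatusCode = 200;
                context.Response.ContentType = "text/plain";
                context.Response.ContentLength64 = body.Length;
                await context.Response.OutputStream.WriteAsync(body, 0, body.Length);
                context.Response.Close();
            }
        }
    }
}
```

Starting this from the service's OnStart (without awaiting it) keeps the listener alive as long as the service process runs, so the web test fails as soon as the service stops or hangs.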

Works great so far and matches all my needs! :)

2
votes

Per the documentation on Azure portal:

https://azure.microsoft.com/en-us/documentation/articles/app-insights-api-custom-events-metrics/#track-metric

"Track Metric: Use TrackMetric to send metrics that are not attached to particular events. For example, you could monitor a queue length at regular intervals.

Metrics are displayed as statistical charts in metric explorer, but unlike events, you can't search for individual occurrences in diagnostic search.

Metric values should be >= 0 to be correctly displayed."

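The C# code looks like this:

// Requires: using System.Threading; using Microsoft.ApplicationInsights;
// "queue" is assumed to be a field defined elsewhere in the service.
private void Run() {
 var appInsights = new TelemetryClient();
 while (true) {
  // Wait one minute, then report the current queue length.
  Thread.Sleep(60000);
  appInsights.TrackMetric("Queue", queue.Length);
 }
}
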
2
votes

I don't think there is currently a good way to accomplish this. What you're actually looking for is a way to detect a "stale heartbeat." For example, if your service was sending up an event "Service Health is okay", you'd want an alert that you haven't received one of those events in a certain amount of time. There aren't any date/time conditional operators in AI's alert system.
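
To make the "stale heartbeat" idea concrete, here is a minimal sketch of the check an external watchdog would have to perform. It is not an Application Insights feature, and the class HeartbeatMonitor and its members are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the Application Insights SDK.
public sealed class HeartbeatMonitor
{
    private readonly Dictionary<string, DateTimeOffset> _lastSeen =
        new Dictionary<string, DateTimeOffset>();
    private readonly object _gate = new object();
    private readonly TimeSpan _timeout;

    public HeartbeatMonitor(TimeSpan timeout)
    {
        _timeout = timeout;
    }

    // Call whenever a heartbeat from an instance arrives.
    public void Record(string instanceId, DateTimeOffset receivedAt)
    {
        lock (_gate)
        {
            _lastSeen[instanceId] = receivedAt;
        }
    }

    // Returns the ids whose latest heartbeat is older than the timeout,
    // sorted so the result is deterministic.
    public IReadOnlyList<string> FindStale(DateTimeOffset now)
    {
        lock (_gate)
        {
            return _lastSeen
                .Where(entry => now - entry.Value > _timeout)
                .Select(entry => entry.Key)
                .OrderBy(id => id, StringComparer.Ordinal)
                .ToList();
        }
    }
}
```

Note that this only knows about instances that have reported at least once; an instance that never started would have to be registered up front, which is exactly the gap a plain metric threshold cannot cover.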

Microsoft might explain that this scenario is not intended to be satisfied by AI, as this is more the responsibility of a "health checking" system, like SCOM or Operational Insights or something else entirely.

I agree this is something that needs a solution, and using AI for it would be wonderful (I've already attempted to accomplish the same thing with no luck); I just think "they" will say it's not a scenario in the realm of responsibility for AI.