In my case scenario, Flink is sending the metrics to Datadog. Datadog Host map is as shown below { I have no Idea why is showing me latency here }
Flink metrics are sent to localhost. The issue here is that when
flink-conf.yaml
file configuration is as follows
# adding metrics
metrics.reporters: stsd , dghttp
metrics.reporter.stsd.class: org.apache.flink.metrics.statsd.StatsDReporter
metrics.reporter.stsd.host: localhost
metrics.reporter.stsd.port: 8125
# for datadog
metrics.reporter.dghttp.class: org.apache.flink.metrics.datadog.DatadogHttpReporter
metrics.reporter.dghttp.apikey: xxx
metrics.reporter.dghttp.tags: host:localhost, job_id : jobA , tm_id : task1 , operator_name : operator1
metrics.scope.operator: numRecordsIn
metrics.scope.operator : numRecordsInPerSecond
metrics.scope.operator : numRecordsOut
metrics.scope.operator : numRecordsOutPerSecond
metrics.scope.operator : latency
The issue is that Datadog is showing 163 metrics which I don't understand, which I will explain in a while
I don't understand the metrics format in datadog as it shows me metrics something like this
Now as shown in above Image
- Latency is expressed in time
- Number of events per second is event /sec
- count is some value
So my question is that which metric is this?
Also, the execution plan of my job is something like this
How do I relate the metrics in Datadog with execution plan operators in Flink?
I have read in Flink API 1.3.2 that I can use tags, I have tried to use them in flink-conf.yaml file but I don't have complete Idea what sense they make here.
My ultimate goal is to find operator latency, number of records out and in /second at each operator in this case