I am setting up monitoring for Kafka consumers and brokers. Monitoring the server metrics seems fairly trivial, but I am confused by the Kafka consumer metrics, specifically lag.

I initially obtained the consumer lag per partition of a topic by running the kafka-consumer-groups.sh --describe --group script programmatically. There is also the __consumer_offsets topic, which I believe reveals the lag as well. But I was told this lag value is not accurate and that I should obtain it from the JMX metrics on the consumer host instead. Can someone verify whether this is correct, and why? Basically, I want to know the most reliable way to find the correct lag for a consumer.

This is what I am told I should be retrieving: kafka.consumer:type=consumer-fetch-manager-metrics,client-id={client-id} Attribute: records-lag-max

The problem is that I am not sure how to access the consumer client server if not given the port or is there a default port for this? Also, do all Kafka consumer clients register JMX metrics?

Thanks

1 Answer

Well, not all consumers will be Java clients, but for those that are, the JMX metrics exposed by the client will be the most up to date, since the consumer itself knows what data it has consumed and when. Offset commits can be synchronous (blocking) or asynchronous, and a consumed offset only ends up in the offsets topic once it has been committed.

But if you want to monitor all consumer groups of all clients, you can use the offsets topic and a tool like LinkedIn's Burrow to easily get a REST API over that information. Telegraf also has a plugin for exporting that data into a metric store.
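To make the difference concrete, here is a rough sketch of the arithmetic behind commit-based lag, i.e. the kind of number you get from the describe script or from anything that reads committed offsets. It is not Burrow's or Kafka's actual code. The function name, the topic "orders" and all offset values are made up for illustration.

```python
from typing import Dict, Optional, Tuple

TopicPartition = Tuple[str, int]  # (topic name, partition number)


def partition_lag(
    log_end_offsets: Dict[TopicPartition, int],
    committed_offsets: Dict[TopicPartition, Optional[int]],
) -> Dict[TopicPartition, Optional[int]]:
    """Lag per partition = log end offset - last committed offset."""
    lag: Dict[TopicPartition, Optional[int]] = {}
    for tp, end in log_end_offsets.items():
        committed = committed_offsets.get(tp)
        # No commit yet for this partition: the lag is unknown, not zero.
        lag[tp] = None if committed is None else max(end - committed, 0)
    return lag


# Hypothetical sample values for a topic "orders" with three partitions.
end_offsets = {("orders", 0): 1500, ("orders", 1): 980, ("orders", 2): 40}
committed = {("orders", 0): 1420, ("orders", 1): 980, ("orders", 2): None}

lags = partition_lag(end_offsets, committed)
known = [v for v in lags.values() if v is not None]
max_lag = max(known) if known else None

print(lags)
print(max_lag)
```

Because this only sees committed offsets, it trails the consumer's real position by whatever has been read but not yet committed. That is the reason the client-side JMX value can differ from it.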

Regarding "how to access the consumer client server if not given the port or is there a default port for this":

You would enable JMX on the Java clients the same way as on the brokers and any other JVM process. There is no default port; it is something you choose when you start the JVM.
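As a minimal sketch of what "enable JMX" means in practice, the helper below assembles the standard JVM system properties for remote JMX and prepends them to a launch command. The port 9999, the jar name my-consumer.jar and the helper name are placeholders, and disabling authentication and SSL assumes a trusted network.

```python
from typing import List


def jmx_flags(port: int, hostname: str = "localhost") -> List[str]:
    """Standard JVM system properties that turn on remote JMX."""
    return [
        "-Dcom.sun.management.jmxremote",
        f"-Dcom.sun.management.jmxremote.port={port}",
        # Pin the RMI port as well, so only one port has to be opened.
        f"-Dcom.sun.management.jmxremote.rmi.port={port}",
        # Assumption: trusted network. Enable both of these in production.
        "-Dcom.sun.management.jmxremote.authenticate=false",
        "-Dcom.sun.management.jmxremote.ssl=false",
        f"-Djava.rmi.server.hostname={hostname}",
    ]


# Hypothetical consumer application jar and port.
command = ["java", *jmx_flags(9999), "-jar", "my-consumer.jar"]
print(" ".join(command))
```

With the consumer started this way, a JMX client such as JConsole should be able to connect to that host and port and read records-lag-max from the kafka.consumer MBean mentioned in the question.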