I'm currently running kafka 0.10.0.1 and the corresponding docs for the two values in question are as follows:
heartbeat.interval.ms - The expected time between heartbeats to the consumer coordinator when using Kafka's group management facilities. Heartbeats are used to ensure that the consumer's session stays active and to facilitate rebalancing when new consumers join or leave the group. The value must be set lower than session.timeout.ms, but typically should be set no higher than 1/3 of that value. It can be adjusted even lower to control the expected time for normal rebalances.
session.timeout.ms - The timeout used to detect failures when using Kafka's group management facilities. When a consumer's heartbeat is not received within the session timeout, the broker will mark the consumer as failed and rebalance the group. Since heartbeats are sent only when poll() is invoked, a higher session timeout allows more time for message processing in the consumer's poll loop at the cost of a longer time to detect hard failures. See also max.poll.records for another option to control the processing time in the poll loop.
It isn't clear to me why the docs recommend setting heartbeat.interval.ms
to 1/3 of session.timeout.ms
. Does it not make sense to have these values be the same since the heartbeat is only sent when poll()
is invoked, and thus when processing of the current records is done?