1
votes

We have a topic with 5 partitions, and we choose the partition from a checksum of the key. In some cases no key resolves to partition 3, so no offsets are committed for it. Once the configured offset retention period has passed, the consumer's current offset for that partition shows as unknown. To fix this, we thought we would set both log and offset retention at the topic level. In the topic config I can see retention.ms for log retention, but I could not find a corresponding setting for offset retention. Can someone please help?

Edit: bin/kafka-topics.sh --zookeeper XXX --alter --topic XXXX --config retention.ms=86400000

The above sets the log retention time for a specific topic. But how can we specify the offset retention in the same command?


3 Answers

2
votes

Committed consumer offsets for all consumers and all topics are stored in a single internal "__consumer_offsets" topic. Therefore you cannot control offset retention individually per topic, I'm afraid.

NB. I see this can be problematic for the case when there are no messages for prolonged periods of time on one of your topics' partitions.

I found the following ticket that can be of help: https://issues.apache.org/jira/browse/KAFKA-3806

The first comment suggests committing offsets even when the consumer is making no progress (no new messages are arriving for a given partition), to avoid this exact problem:

you would want to keep committing the offsets even though they are not changing
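
A minimal sketch of that idea, assuming a Python client whose consumer exposes `assignment()`, `position(tp)` and a blocking `commit(offsets)`, as kafka-python's `KafkaConsumer` does. The helper name `recommit_positions` and the `make_offset` parameter are hypothetical; `make_offset` is passed in because the constructor of the client's offset-and-metadata value can differ between releases.

```python
def recommit_positions(consumer, make_offset):
    """Commit the current position of every assigned partition, changed or not.

    consumer    : object exposing assignment(), position(tp) and commit(offsets)
                  (assumption: modelled on kafka-python's KafkaConsumer).
    make_offset : callable turning an int offset into the value the client
                  expects in the commit mapping (hypothetical parameter).
    Returns the {partition: offset} mapping that was committed.
    """
    positions = {}
    for tp in consumer.assignment():
        offset = consumer.position(tp)
        if offset is not None:  # skip partitions with no known position yet
            positions[tp] = offset

    if positions:
        # Re-committing an unchanged offset still refreshes its timestamp
        # in __consumer_offsets, which is what keeps it from expiring.
        consumer.commit({tp: make_offset(off) for tp, off in positions.items()})
    return positions
```

Calling this periodically from the same thread that polls, at an interval well inside the broker's offset retention window, should keep the commit for an idle partition such as partition 3 from aging out.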

0
votes

I think you're looking for log.retention.bytes.

However, having no data at all in a partition within the retention period is something you should fix, either by decreasing the number of partitions or by using another algorithm to derive the partition from the key.
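
To see whether the key-to-partition mapping is what leaves a partition empty, the expected keys can be run through it offline. This is only a sketch: it assumes the partition is CRC32 of the key modulo the partition count (the question says just "checksum", so the real function may differ), and `partition_for` and `empty_partitions` are hypothetical helper names.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 5  # the topic in the question has 5 partitions


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition: CRC32 of the UTF-8 key, modulo the partition count.

    Assumption: stands in for the asker's checksum-based partitioner.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def empty_partitions(keys, num_partitions: int = NUM_PARTITIONS):
    """Return the sorted list of partitions that none of the given keys map to."""
    hits = Counter(partition_for(k, num_partitions) for k in keys)
    return [p for p in range(num_partitions) if hits[p] == 0]
```

Feeding `empty_partitions` the real set of keys shows which partitions will never receive data. With fewer distinct keys than partitions, at least one partition is always empty, whatever the checksum.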

0
votes

You can configure offset retention in the broker's server.properties using the parameter "offsets.retention.minutes". The default value is 1440 (24 hours) in releases before Kafka 2.0 and 10080 (7 days) from 2.0 onwards.

Offset retention is broker-wide, so you cannot set it at the individual topic level.
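
Note that the two settings use different units: the topic-level retention.ms from the question is in milliseconds, while offsets.retention.minutes is in minutes. A small sketch of the arithmetic (the helper `retention_settings` is a hypothetical name, not part of Kafka):

```python
MS_PER_MINUTE = 60 * 1000
MINUTES_PER_DAY = 24 * 60


def retention_settings(days: int) -> dict:
    """Express the same retention window in the units each setting expects."""
    minutes = days * MINUTES_PER_DAY
    return {
        "offsets.retention.minutes": minutes,    # broker-wide, server.properties
        "retention.ms": minutes * MS_PER_MINUTE,  # per topic, log retention
    }


print(retention_settings(1))
# {'offsets.retention.minutes': 1440, 'retention.ms': 86400000}
print(retention_settings(7))
# {'offsets.retention.minutes': 10080, 'retention.ms': 604800000}
```

So the retention.ms=86400000 in the question and an offsets.retention.minutes of 1440 both describe a 24-hour window.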