I have a problem producing messages to a Kafka topic.

When producing a few messages to a Kafka topic, the send intermittently fails with an exception:

org.apache.kafka.common.errors.TimeoutException: Expiring 6 record(s) for some-topic-1: 30056 ms has passed since batch creation plus linger time

The problem seems to be different from the others I have already found here. It happens under really low load. I'm pretty sure I'm not exceeding the buffer size (default value). linger.ms is set to 0 (the default). request.timeout.ms is also left at its default, which is 30000.

My actual question is: is there anything that could hold up the sending of a Kafka message for 30 seconds between the call to send() and the actual send? I'm looking for anything that could make a ProducerBatch live longer than 30 seconds under very low load (or with no load at all).

I use a managed Kafka service from an external provider, so I asked them about the broker status and they said everything is OK. By the way, it happens on three different Kafka instances. The Kafka client version doesn't matter either: it happens with both 0.11.0.0 and 2.0.1.

Comments:

Have a look at the chart here; it contains all the settings that may affect the timeout. - Paizo
Possible duplicate of stackoverflow.com/q/46750420 - hongsy

1 Answer

I've just looked at the Kafka documentation for version 2.0.X and I would say that you're facing this error because of the default settings and the low message ingestion load.

Let's try and dissect your error message:

org.apache.kafka.common.errors.TimeoutException: Expiring 6 record(s) for some-topic-1: 30056 ms has passed since batch creation plus linger time

  1. The first clue in the error is 30056 ms has passed. The request.timeout.ms configuration is at play here. Its default value is 30000, i.e. 30 seconds. In these client versions the producer also applies this timeout to batches that are still waiting in its buffer, as explained in the next point.
  2. The second clue is since batch creation. When you invoke the send() method, the message is buffered on the producer's side in a batch for its partition, and the 30-second clock from the first point starts when that batch is created. batch.size has a default value of 16384 bytes, i.e. 16 KB, so when the ingestion load is low (as it is in your case) the batch does not fill up. It seems that the Expiring 6 record(s) in your message don't come close to 16 KB. If the batch is still sitting in the buffer after 30 seconds, you get this error.
  3. Lastly, the error mentions plus linger time. This is the linger.ms configuration, which is useful when the ingestion load is high and the producer wants to limit the number of requests it sends to the Kafka broker(s). It is the time the producer waits for more records to join a batch before sending it; a batch that reaches batch.size is sent without waiting for it. The default value is 0, so in your case the producer adds no extra wait and the linger time contributes nothing to the 30 seconds in the error.
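
To make the arithmetic behind the error message concrete, here is a small sketch in plain Java. It is not Kafka's code: the names BatchExpirySketch, waitedMs and isExpired are hypothetical, and the rule it encodes (an unsent batch is expired once the time since its creation plus linger.ms exceeds request.timeout.ms) is my reading of the error text for the client versions in the question, so treat it as an assumption.

```java
// Simplified model of the timing check suggested by the error message.
// Hypothetical names; this is not the Kafka client's implementation.
final class BatchExpirySketch {

    // Milliseconds elapsed since "batch creation plus linger time".
    static long waitedMs(long createdMs, long lingerMs, long nowMs) {
        return nowMs - (createdMs + lingerMs);
    }

    // Assumed rule: an unsent batch expires once that wait exceeds request.timeout.ms.
    static boolean isExpired(long createdMs, long lingerMs, long requestTimeoutMs, long nowMs) {
        return waitedMs(createdMs, lingerMs, nowMs) > requestTimeoutMs;
    }

    public static void main(String[] args) {
        long createdMs = 0L;              // batch created at t = 0
        long lingerMs = 0L;               // linger.ms default
        long requestTimeoutMs = 30_000L;  // request.timeout.ms default
        long nowMs = 30_056L;             // moment the batch is checked

        System.out.println("waited " + waitedMs(createdMs, lingerMs, nowMs) + " ms");
        System.out.println("expired: " + isExpired(createdMs, lingerMs, requestTimeoutMs, nowMs));
    }
}
```

With the numbers from the question, the wait is 30056 ms, which is just over the 30000 ms limit, so the batch counts as expired under this model.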

Now, to answer your question: you could increase linger.ms to let the producer hold messages longer before it sends a batch. If batches need more time before they expire, you'll need to increase request.timeout.ms. You could also try a combination of both.

Along similar lines, to fix this error you could increase request.timeout.ms, reduce batch.size, or maybe both.
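
As a rough illustration of those adjustments, the sketch below builds the producer settings with java.util.Properties. The keys are the real producer configuration names; the values (60000, 5, 8192) and the names ProducerTuningSketch and tunedProducerProps are placeholders I picked for the example, not recommendations. The broker address and the serializers are left out and would have to be added before handing the properties to a KafkaProducer.

```java
import java.util.Properties;

// Hypothetical helper that collects the settings discussed above.
final class ProducerTuningSketch {

    static Properties tunedProducerProps() {
        Properties props = new Properties();
        // Give batches more time before they expire (default is 30000).
        props.setProperty("request.timeout.ms", "60000");
        // Let the producer wait briefly for more records per batch (default is 0).
        props.setProperty("linger.ms", "5");
        // Use a smaller batch size in bytes (default is 16384).
        props.setProperty("batch.size", "8192");
        return props;
    }

    public static void main(String[] args) {
        Properties props = tunedProducerProps();
        // Print the overrides so they can be compared with the defaults.
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + "=" + props.getProperty(key));
        }
    }
}
```

Changing one setting at a time makes it easier to tell which of them, if any, actually affects the timeout.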

Hope this helps!