1
votes

I have an ArrayList containing 80 to 100 records, and I am trying to stream and send each individual record (the POJO, not the entire list) to a Kafka topic (Event Hub). A cron job is scheduled to run every hour to send these records (POJOs) to the event hub.

I am able to see messages being sent to the event hub, but after 3 to 4 successful runs I get the following exception (within a run, several messages are sent successfully and several fail with the exception below):

    Expiring 14 record(s) for eventhubname: 30125  ms has passed since batch creation plus linger time

Following is the producer config used:

    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.ACKS_CONFIG, "1");
    props.put(ProducerConfig.RETRIES_CONFIG, "3");

Message retention period - 7, partitions - 6. I am using Spring Kafka (2.2.3) to send the events; the method where the Kafka send is written is marked @Async:

    @Async
    protected void send() {
        kafkatemplate.send(record);
    }
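
For reference, here is a minimal self-contained sketch of such an async send path, with a callback attached so failures from the background send surface in the logs instead of being swallowed by the @Async executor (the class, field, and topic names are illustrative assumptions, not the original code):

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.kafka.core.KafkaTemplate;
    import org.springframework.kafka.support.SendResult;
    import org.springframework.scheduling.annotation.Async;
    import org.springframework.util.concurrent.ListenableFuture;

    public class RecordSender {

        private static final Logger log = LoggerFactory.getLogger(RecordSender.class);

        private final KafkaTemplate<String, String> kafkaTemplate;

        public RecordSender(KafkaTemplate<String, String> kafkaTemplate) {
            this.kafkaTemplate = kafkaTemplate;
        }

        @Async
        public void send(String payload) {
            // "my-eventhub-topic" is a placeholder topic name
            ListenableFuture<SendResult<String, String>> future =
                    kafkaTemplate.send("my-eventhub-topic", payload);
            // Log both outcomes so async failures (e.g. TimeoutException) are visible
            future.addCallback(
                    result -> log.debug("Sent to partition {} at offset {}",
                            result.getRecordMetadata().partition(),
                            result.getRecordMetadata().offset()),
                    ex -> log.error("Send failed", ex));
        }
    }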

Expected - no exception to be thrown from Kafka. Actual - org.apache.kafka.common.errors.TimeoutException is being thrown.

The error is saying you've not yet filled the batch size of the producer (the records aren't sent immediately). You could either reduce the batch size in the producer configs or periodically flush the producer on your own. – OneCricketeer
Many thanks for the reply @cricket_007. What size would you recommend, as the default size is 16384? – Prakash_se7en
Are your 80-100 records in total larger than 1.6 MB? – OneCricketeer
It will be close to 150-200 KB @cricket_007. – Prakash_se7en
Oops, I meant 1.6 KB above. Okay, so on the low end, 150000/16384 is about 9 total batches by default, with some remainder. You'll need to adjust the value such that you won't have data remaining in an un-sent batch. – OneCricketeer
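
For reference, a sketch of the "flush it yourself" option from the comments, assuming an hourly scheduled job like the one described in the question (class, method, and topic names are illustrative):

    import java.util.Collections;
    import java.util.List;
    import org.springframework.kafka.core.KafkaTemplate;
    import org.springframework.scheduling.annotation.Scheduled;

    public class HourlyPublisher {

        private final KafkaTemplate<String, String> kafkaTemplate;

        public HourlyPublisher(KafkaTemplate<String, String> kafkaTemplate) {
            this.kafkaTemplate = kafkaTemplate;
        }

        @Scheduled(cron = "0 0 * * * *") // top of every hour, as in the question
        public void publish() {
            List<String> records = loadRecords(); // hypothetical source of the 80-100 POJOs
            records.forEach(r -> kafkaTemplate.send("my-eventhub-topic", r));
            // Force out any partially filled batches now, instead of leaving
            // them in the accumulator until batch.size or linger.ms is hit.
            kafkaTemplate.flush();
        }

        private List<String> loadRecords() {
            return Collections.emptyList(); // placeholder for the real data source
        }
    }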

2 Answers

5
votes

Prakash - we have seen a number of issues where spiky producer patterns hit batch timeouts.

The problem here is that the producer has two TCP connections that can go idle for > 4 mins - at that point, Azure load balancers close out the idle connections. The Kafka client is unaware that the connections have been closed so it attempts to send a batch on a dead connection, which times out, at which point retry kicks in.

  • Set connections.max.idle.ms to < 4 minutes – this allows the Kafka client's network client layer to gracefully handle connection close for the producer's message-sending TCP connection
  • Set metadata.max.age.ms to < 4 minutes – this is effectively a keep-alive for the producer's metadata TCP connection (see the config sketch after this list)
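
For example, added to the producer config from the question (the 3-minute value below is just an illustration; anything under the 4-minute load-balancer idle timeout should work):

    // Keep both producer TCP connections from idling past Azure's ~4 minute
    // load balancer timeout (values are in milliseconds).
    props.put(ProducerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG, "180000"); // 3 minutes
    props.put(ProducerConfig.METADATA_MAX_AGE_CONFIG, "180000");        // 3 minutes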

Feel free to reach out to the EH product team on GitHub; we are fairly good about responding to issues: https://github.com/Azure/azure-event-hubs-for-kafka

0
votes

This exception indicates that you are queueing records at a faster rate than they can be sent. Once a record is added to a batch, there is a time limit for sending that batch, to ensure it is sent within a specified duration. This is controlled by the producer configuration parameter request.timeout.ms. If a batch has been queued longer than the timeout limit, the exception is thrown, and the records in that batch are removed from the send queue.
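
For illustration, a sketch of raising that limit in the question's producer config (60 seconds is an arbitrary example; the default is 30000 ms, and the right value depends on your throughput):

    // Allow queued batches more time before they are expired (default: 30000 ms).
    props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, "60000");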

Please check the below post about a similar issue; it might help:

Kafka producer TimeoutException: Expiring 1 record(s)

You can also check when-does-the-apache-kafka-client-throw-a-batch-expired-exception/34794261#34794261 for more details about the batch-expired exception.

Also, implement a proper retry policy.
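
For example, something along these lines in the producer config (values are illustrative, not recommendations):

    props.put(ProducerConfig.RETRIES_CONFIG, "5");                        // retry transient failures
    props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, "1000");            // wait 1 s between retries
    props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "1"); // preserve ordering across retries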

Note this does not account for any network issues on the sender side; with network issues, you will not be able to send to the event hub at all.

Hope it helps.