I am creating a stream where the source (producer) produces around 12 million records in about 8 minutes. The transformer (consumer) starts consuming them fine, but around 4 minutes in, the following appears in the app's log, and it stops receiving anything past this point:
2018-07-11 21:59:18,811 24043857 [kafka-coordinator-heartbeat-thread | cdSomeApp] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-2, groupId=cdSomeApp] Marking the coordinator 10.16.17.59:9092 (id: 2147483644 rack: null) dead
2018-07-11 21:59:18,815 24043861 [cdSomeApp.cd-source.container-0-C-1] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-2, groupId=cdSomeApp] Discovered group coordinator 10.16.17.59:9092 (id: 2147483644 rack: null)
2018-07-11 21:59:18,815 24043861 [cdSomeApp.cd-source.container-0-C-1] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-2, groupId=cdSomeApp] Marking the coordinator 10.16.17.59:9092 (id: 2147483644 rack: null) dead
2018-07-11 21:59:18,930 24043976 [cdSomeApp.cd-source.container-0-C-1] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-2, groupId=cdSomeApp] Discovered group coordinator 10.16.17.59:9092 (id: 2147483644 rack: null)
2018-07-11 21:59:18,933 24043979 [cdSomeApp.cd-source.container-0-C-1] ERROR o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-2, groupId=cdSomeApp] Offset commit failed on partition cdSomeApp.cd-source-0 at offset 140802810: The coordinator is not aware of this member.
2018-07-11 21:59:18,937 24043983 [cdSomeApp.cd-source.container-0-C-1] ERROR o.s.k.listener.LoggingErrorHandler - Error while processing: null
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:787)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:735)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:814)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:794)
at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:507)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:353)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:268)
From what I can see, the default Kafka configuration values should work fine here, but if anybody knows better, please advise.
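For context, the exception message itself suggests the two knobs involved: the consumer took longer than max.poll.interval.ms (default 300000 ms = 5 minutes) between poll() calls, so the group rebalanced and the commit was rejected. A minimal sketch of what tuning those properties might look like, assuming a plain Properties-based consumer configuration (the values here are illustrative starting points I picked, not settings verified against this workload):

```java
import java.util.Properties;

public class ConsumerTuning {

    // Hypothetical tuning for a consumer whose per-batch processing is
    // slow enough to exceed max.poll.interval.ms. The keys are standard
    // Kafka consumer property names; the values are guesses to adjust.
    static Properties tunedConsumerProps() {
        Properties props = new Properties();
        // Fetch fewer records per poll() (default is 500) so each batch
        // finishes well within the poll interval.
        props.put("max.poll.records", "100");
        // Allow more time between poll() calls before the consumer is
        // considered failed and the group rebalances (default 300000 ms).
        props.put("max.poll.interval.ms", "600000");
        return props;
    }

    public static void main(String[] args) {
        Properties props = tunedConsumerProps();
        System.out.println("max.poll.records=" + props.getProperty("max.poll.records"));
        System.out.println("max.poll.interval.ms=" + props.getProperty("max.poll.interval.ms"));
    }
}
```

Either lever works on its own: a smaller max.poll.records shortens each processing cycle, while a larger max.poll.interval.ms tolerates longer cycles. Which one (or both) is right depends on how long your transformer actually takes per record.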
Thanks!