
I have a Kafka Streams application built with the 1.0.0 Kafka Streams API. I have a single Kafka 0.10.2.0 broker and a single topic with a single partition. All configuration parameters are the same except the producer's request.timeout.ms, which I set to 5 minutes to fix the issue described in "Kafka Streams program is throwing exceptions when producing".

In my stream application, I read events from Kafka, process them, and forward them to another topic in the same Kafka cluster.

After collecting statistics, I observed that processing accounts for only 5% of the time; the remaining 95% is spent reading and writing.

Even though I have tens of millions of events in Kafka, a poll sometimes returns only a single-digit number of records and sometimes returns thousands of records.

Similarly, context.forward sometimes takes more time to send fewer records to Kafka and sometimes takes less time to send more records.

I tried to improve read performance by increasing the max.poll.records and poll.ms values, but with no luck.

How can I improve the read and forward performance? How do Kafka's poll and forward work? Which parameters would help improve performance?
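For context, here is a minimal sketch of how the polling-related settings discussed above can be supplied to a Kafka Streams application. The application id and broker address are placeholders; in Streams, settings for the embedded consumer are commonly passed with a `consumer.` prefix:

```java
import java.util.Properties;

public class PollingConfigSketch {
    public static Properties streamsProps() {
        Properties props = new Properties();
        props.put("application.id", "my-streams-app");    // placeholder
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        // How long a stream thread blocks waiting for records in one poll
        props.put("poll.ms", "1000");
        // Upper bound on records returned by a single consumer poll;
        // the "consumer." prefix routes the setting to the embedded consumer
        props.put("consumer.max.poll.records", "10000");
        // A larger fetch.min.bytes makes the broker return more data per
        // fetch, at a latency cost bounded by fetch.max.wait.ms
        props.put("consumer.fetch.min.bytes", "65536");
        return props;
    }
}
```

These properties would then be passed to the KafkaStreams constructor; whether larger fetches actually help depends on record sizes and the network.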

Following are a few important producer config parameters in my application:

acks = 1
batch.size = 16384
buffer.memory = 33554432
compression.type = none
connections.max.idle.ms = 540000
enable.idempotence = false
linger.ms = 100
max.block.ms = 60000
max.in.flight.requests.per.connection = 5
max.request.size = 1048576
metadata.max.age.ms = 300000
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 240000
retries = 10
retry.backoff.ms = 100
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
transaction.timeout.ms = 60000
transactional.id = null

Following are a few important consumer config parameters in my application:

auto.commit.interval.ms = 5000
auto.offset.reset = earliest
check.crcs = true
connections.max.idle.ms = 540000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
heartbeat.interval.ms = 3000
internal.leave.group.on.close = false
isolation.level = read_uncommitted
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 2147483647
max.poll.records = 10000
metadata.max.age.ms = 300000
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 305000
retry.backoff.ms = 100
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000

Following are a few important stream config parameters in my application:

application.server =
buffered.records.per.partition = 1000
cache.max.bytes.buffering = 10485760
commit.interval.ms = 30000
connections.max.idle.ms = 540000
key.serde = null
metadata.max.age.ms = 300000
num.standby.replicas = 0
num.stream.threads = 1
poll.ms = 1000
processing.guarantee = at_least_once
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
replication.factor = 1
request.timeout.ms = 40000
retry.backoff.ms = 100
rocksdb.config.setter = null
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
state.cleanup.delay.ms = 600000
timestamp.extractor = null
value.serde = null
windowstore.changelog.additional.retention.ms = 86400000
zookeeper.connect =
Comment from simPod: Hi, have you found anything interesting?

2 Answers


You can introduce parallelism into your operation by controlling the record key and increasing the number of partitions of the topic.

Increasing the partition count increases the number of stream tasks that can run in parallel. You can then exploit this by increasing the number of threads in the Kafka Streams application.
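A sketch of those two levers, with placeholder topic and application names: first add partitions to the topic on the broker side, then raise num.stream.threads so a single instance runs more tasks concurrently. Kafka Streams creates one task per input partition, so threads beyond the partition count would sit idle:

```java
import java.util.Properties;

public class StreamsParallelismSketch {
    // Partitions must first be added on the broker side, e.g. with the
    // kafka-topics tool:
    //   kafka-topics.sh --alter --topic my-input-topic --partitions 4
    public static Properties props(int threads) {
        Properties p = new Properties();
        p.put("application.id", "my-streams-app");    // placeholder
        p.put("bootstrap.servers", "localhost:9092"); // placeholder
        // With 4 partitions, 4 threads lets each stream task run on its
        // own thread within one application instance
        p.put("num.stream.threads", Integer.toString(threads));
        return p;
    }
}
```

Note that repartitioning changes key-to-partition assignment, so per-key ordering across the resize is not preserved.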


You can create multiple Kafka consumers in different threads and assign them all to the same consumer group. They will consume messages in parallel without losing messages.
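A threading skeleton for that idea, with placeholder names. Each thread must create its own KafkaConsumer instance, since consumers are not thread-safe; the actual poll loop is left as a comment because it needs a running broker:

```java
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConsumerGroupSketch {
    public static Properties consumerProps() {
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092"); // placeholder
        p.put("group.id", "my-consumer-group");       // same group for every thread
        p.put("enable.auto.commit", "false");
        return p;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        for (int i = 0; i < 3; i++) {
            pool.submit(() -> {
                // One KafkaConsumer per thread, all sharing group.id:
                //   KafkaConsumer<String, String> c = new KafkaConsumer<>(consumerProps());
                //   c.subscribe(Collections.singletonList("my-input-topic"));
                //   while (true) { process(c.poll(100)); c.commitSync(); }
                // The group coordinator assigns each consumer a disjoint
                // set of partitions, so records are not processed twice.
            });
        }
        pool.shutdown();
    }
}
```

Keep in mind that parallelism is capped by the partition count: with the single-partition topic from the question, a second consumer in the same group would simply sit idle.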

How do you send messages? With Kafka you can send messages in a fire-and-forget way, which improves throughput:

producer.send(record);

The acks parameter controls how many partition replicas must receive the record before the producer can consider the write successful.

If you set acks=0, the producer will not wait for a reply from the broker before assuming the message was sent successfully. Because the producer does not wait for any response from the server, it can send messages as fast as the network will support, so this setting can be used to achieve very high throughput.
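A producer configuration sketch along those lines (the broker address is a placeholder, the serializer class names are the standard Kafka ones; the fire-and-forget send itself is shown as a comment since it needs a running broker):

```java
import java.util.Properties;

public class FireAndForgetSketch {
    public static Properties producerProps() {
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092"); // placeholder
        p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // acks=0: do not wait for any broker acknowledgement
        p.put("acks", "0");
        // Let the producer fill batches for up to 100 ms and compress them,
        // trading a little latency for better throughput
        p.put("linger.ms", "100");
        p.put("compression.type", "lz4");
        return p;
    }

    // Fire-and-forget usage (no get() on the returned future, no callback):
    //   Producer<String, String> producer = new KafkaProducer<>(producerProps());
    //   producer.send(record);
}
```

With acks=0 you also give up delivery guarantees: a failed write is silently lost, so this setting only fits workloads that can tolerate data loss.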