We have a Kafka consumer which reads messages, does some processing, and then publishes the results to another Kafka topic using the script below.
Producer config:
{
"bootstrap.servers": "localhost:9092"
}
I haven't set any other configuration properties such as queue.buffering.max.messages, queue.buffering.max.ms, or batch.num.messages, so I am assuming they all take their default values:
queue.buffering.max.messages : 100000
queue.buffering.max.ms : 0
batch.num.messages : 10000
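For reference, making those assumed defaults explicit in the config would look roughly like this (a sketch; I am assuming confluent-kafka-python / librdkafka here, and the values simply mirror the defaults listed above):

from confluent_kafka import Producer

# Sketch: same config as above, with the assumed librdkafka defaults written out explicitly.
p = Producer({
    "bootstrap.servers": "localhost:9092",
    "queue.buffering.max.messages": 100000,  # max messages held in the internal producer queue
    "queue.buffering.max.ms": 0,             # how long to wait for more messages before sending a batch
    "batch.num.messages": 10000,             # max messages per batch sent to the broker
})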
My understanding: when either queue.buffering.max.ms elapses or batch.num.messages messages have accumulated, the internal queue is flushed and the messages are published to Kafka by a background thread. Since queue.buffering.max.ms is 0 in my configuration, every message should be sent out as soon as I call produce(). Correct me if I am wrong.
My producer snippet:
def send(topic, message):
    p.produce(topic, json.dumps(message), callback=delivery_callback)
    p.flush()
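For context, delivery_callback handles the per-message delivery report; a minimal version looks something like this (a sketch, our actual callback may differ):

def delivery_callback(err, msg):
    # Called once per message from poll()/flush() with the delivery result.
    if err is not None:
        print("Delivery failed: {}".format(err))
    else:
        print("Delivered to {} [{}] at offset {}".format(msg.topic(), msg.partition(), msg.offset()))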
From this post I understand that calling flush() after every message effectively makes this a synchronous producer. With the send() function above, it takes roughly 45 ms to publish each message to Kafka.
If I change the send() function above to:
def send(topic, message):
    p.produce(topic, json.dumps(message), callback=delivery_callback)
    p.poll(0)
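With that version I would rely on librdkafka's internal batching and only call flush() once at shutdown, roughly like this (a sketch; consume_and_transform() and "output-topic" are placeholders for our actual consumer loop and output topic):

for message in consume_and_transform():  # placeholder for the consumer/processing loop
    send("output-topic", message)         # send() as above: produce() + poll(0), no per-message flush
p.flush()                                 # at shutdown, block until all buffered messages are delivered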
will this improve performance? Can you clarify my understanding?
Thanks