I found this article while working out my producer throughput. In it, Jay Kreps reports 421,823 records/sec for a single producer thread with 3x synchronous replication. His records are 100 bytes each, and he uses 6 partitions across 6 brokers. He also uses a callback-based send, which, as I understand it, means he can guarantee the ordering of messages.
I am using Kafka as a service with a single broker, 6 partitions, and 1x replication. I send records of roughly the same size and get 23 records/sec. Unlike Jay, I'm using Schema Registry for Avro serialization. I have tried every style of sending that the Kafka Producer API provides:
- calling `.get` on the future
- sending messages with a callback
- sending messages without a callback
I am not even remotely close to the number given above. I want to guarantee the order of messages so I would like to have at least a callback passed along with the record.
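To put the two numbers side by side, here is a rough back-of-the-envelope calculation. It is only a sketch: the class and method names are made up for illustration, and it assumes my records are 100 bytes like the ones in the article (mine are only roughly that size).

```java
public class ThroughputGap {
    // Average time budget per record, in milliseconds, at a given send rate.
    static double millisPerRecord(double recordsPerSec) {
        return 1000.0 / recordsPerSec;
    }

    // Payload throughput in bytes per second for a given rate and record size.
    static double bytesPerSec(double recordsPerSec, int recordBytes) {
        return recordsPerSec * recordBytes;
    }

    public static void main(String[] args) {
        double benchmarkRate = 421_823; // records/sec reported in the article
        double myRate = 23;             // records/sec I observe
        int recordBytes = 100;          // assumption: my records are ~100 bytes too

        System.out.println("benchmark ms/record: " + millisPerRecord(benchmarkRate));
        System.out.println("benchmark bytes/sec: " + bytesPerSec(benchmarkRate, recordBytes));
        System.out.println("mine ms/record:      " + millisPerRecord(myRate));
        System.out.println("mine bytes/sec:      " + bytesPerSec(myRate, recordBytes));
        System.out.println("ratio:               " + benchmarkRate / myRate);
    }
}
```

At 23 records/sec I am spending roughly 43 ms on every record, which moves only about 2.3 KB/sec of payload. That looks to me more like one network round trip per record than like batched sends, but that is a guess on my part.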
I am aware that chasing his benchmark will be difficult and that's not my goal. I just feel like there's something fundamental that I am missing. Can I ask for some suggestions? I will provide as much additional context as is necessary.
`kafka-producer-perf-test` script, what do you get? – OneCricketeer