In Kafka, the lowest unit of sending is a record (a KV pair).
Kafka producer attempts to send records in batches in-order to optimize data transmission. So a single push from producer to the cluster -- to the broker leader to be precise -- could contain multiple records.
Moreover, batching always applies only to a given partition. Records produced to different partitions cannot be batched together, though they could form multiple batches.
There are a few parameters which influence the batching behaviour, as described in the documentation:
buffer.memory -
The total bytes of memory the producer can use to buffer records
waiting to be sent to the server. If records are sent faster than they
can be delivered to the server the producer will block for
max.block.ms after which it will throw an exception.
batch.size -
The producer will attempt to batch records together into fewer
requests whenever multiple records are being sent to the same
partition. This helps performance on both the client and the server.
This configuration controls the default batch size in bytes. No
attempt will be made to batch records larger than this size.
Requests sent to brokers will contain multiple batches, one for each
partition with data available to be sent.
linger.ms -
The producer groups together any records that arrive in between
request transmissions into a single batched request. Normally this
occurs only under load when records arrive faster than they can be
sent out. However in some circumstances the client may want to reduce
the number of requests even under moderate load. This setting
accomplishes this by adding a small amount of artificial delay—that
is, rather than immediately sending out a record the producer will
wait for up to the given delay to allow other records to be sent so
that the sends can be batched together. This can be thought of as
analogous to Nagle's algorithm in TCP. This setting gives the upper
bound on the delay for batching: once we get batch.size worth of
records for a partition it will be sent immediately regardless of this
setting, however if we have fewer than this many bytes accumulated for
this partition we will 'linger' for the specified time waiting for
more records to show up. This setting defaults to 0 (i.e. no delay).
Setting linger.ms=5, for example, would have the effect of reducing
the number of requests sent but would add up to 5ms of latency to
records sent in the absence of load.
So from above documentation, you could understand - linger.ms
is an artificial delay to wait if there are not enough bytes to transmit, but if producer accumulates enough bytes before linger.ms
is elapsed, then the request is sent anyway.
On top of that, batching is also influenced by max.request.size
max.request.size -
The maximum size of a request in bytes. This setting will limit the
number of record batches the producer will send in a single request to
avoid sending huge requests. This is also effectively a cap on the
maximum record batch size. Note that the server has its own cap on
record batch size which may be different from this.