
We are considering using Kafka as a critical messaging middleware.
However, the message durability guarantee looks optimistic in Kafka's replication design:

For better performance, each follower sends an acknowledgment after the message is written to memory. So, for each committed message, we guarantee that the message is stored in multiple replicas in memory. However, there is no guarantee that any replica has persisted the committed message to disk.

In the worst case, if the whole cluster goes down at the same time before acknowledged messages are flushed to disk, some data may be lost. Is it possible to avoid this case?


2 Answers


There are several broker configurations that adjust the frequency of log flushes. You can decrease the interval at which the flush scheduler thread checks whether a flush is necessary (log.flush.scheduler.interval.ms), and you can decrease the number of messages needed to trigger a flush (log.flush.interval.messages).
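
For instance, a minimal sketch of those settings in server.properties (the values are illustrative, not recommendations):

    # server.properties -- illustrative values, tune for your workload
    # Force a flush after every 10,000 messages per log.
    log.flush.interval.messages=10000
    # Have the flush scheduler check every second whether a flush is due.
    log.flush.scheduler.interval.ms=1000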

That said, you would rarely need to worry about this case if you are able to replicate across different data centers.


I don't think it is possible to guarantee that an acknowledged message will never be lost. However, we can reduce the probability of loss by taking the measures listed below (a producer-side sketch follows the list):

  1. Increase the replication factor of the topic.

  2. In the producer configuration, set acks=all.

  3. Keep min.insync.replicas high.

For example, with a replication factor of 5, min.insync.replicas=4, and acks=all, a message will not be acknowledged until at least 4 replicas have received it (though not necessarily persisted it to disk!).

The higher these numbers, the less likely it is that your message will be lost.
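
A minimal producer sketch in Java along these lines (the broker address, topic name, and serializers are placeholder assumptions; the topic is assumed to have been created with replication factor 5 and min.insync.replicas=4):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class DurableProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Placeholder broker address for this sketch.
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            // Do not treat a send as successful until all in-sync replicas have it.
            props.put(ProducerConfig.ACKS_CONFIG, "all");
            // Retry transient failures instead of silently dropping the record.
            props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // "events" is a placeholder topic, assumed to have
                // replication factor 5 and min.insync.replicas=4.
                producer.send(new ProducerRecord<>("events", "key", "value"),
                        (metadata, exception) -> {
                            if (exception != null) {
                                // Not enough in-sync replicas acknowledged the write.
                                exception.printStackTrace();
                            }
                        });
            }
        }
    }

With this combination, if fewer than 4 replicas are in sync, the send callback receives an exception (e.g. NotEnoughReplicasException) rather than a false success.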