I have a process that needs to dispatch data, via Kafka, to a Flink cluster of 3 nodes. As far as I can predict, there will be two topics in total. All messages will be timestamped, and message order must be preserved.

I don't understand the mechanism behind message partitioning (the key). If all I want is a simple message dispatcher as described above, does the message partition matter? If so, what should I base the key on?

1 Answer

It's not clear what ordering you need (by some UUID, by the server producing the data, by some other event type, etc.), but Kafka only guarantees message ordering within a single partition of a topic, not across partitions. Any Kafka consumer, including Flink, reads each of its assigned partitions in order.

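When you specify a key, the producer's default partitioner hashes the serialized key with Murmur2 and takes the result modulo the number of partitions, so all records with the same key land in the same partition. With a null key, records are spread across the partitions: round-robin in older clients and, as far as I know, in sticky per-batch fashion since Kafka 2.4.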
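To make the key-to-partition mapping concrete, here is a minimal Python sketch of keyed partitioning. It is meant to mirror the Murmur2 variant used by the Java client as I understand it, not to replace the real partitioner. The function names and the `sensor-*` keys are hypothetical, chosen only for illustration.

```python
def murmur2(data: bytes) -> int:
    """32-bit Murmur2 hash, returned as an unsigned int.

    Assumption: the seed and constants follow the Kafka Java client.
    """
    mask = 0xFFFFFFFF
    m = 0x5BD1E995
    r = 24
    length = len(data)
    h = (0x9747B28C ^ length) & mask

    # Mix the input four bytes at a time (little-endian words).
    for i in range(length // 4):
        k = int.from_bytes(data[i * 4:i * 4 + 4], "little")
        k = (k * m) & mask
        k ^= k >> r
        k = (k * m) & mask
        h = (h * m) & mask
        h ^= k

    # Mix in the remaining 1 to 3 bytes, if any.
    tail = length - (length % 4)
    rem = length % 4
    if rem == 3:
        h ^= data[tail + 2] << 16
    if rem >= 2:
        h ^= data[tail + 1] << 8
    if rem >= 1:
        h ^= data[tail]
        h = (h * m) & mask

    h ^= h >> 13
    h = (h * m) & mask
    h ^= h >> 15
    return h


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a serialized key to a partition: positive hash modulo partition count."""
    return (murmur2(key) & 0x7FFFFFFF) % num_partitions


if __name__ == "__main__":
    # Hypothetical keys: one per data source whose events must stay in order.
    for key in (b"sensor-1", b"sensor-2", b"sensor-1"):
        print(key.decode(), "->", partition_for(key, 3))
```

The same key always maps to the same partition as long as the partition count does not change, so a reasonable choice of key is whatever entity needs its messages kept in order (for example, the ID of the producing server). If you need one total order across every message in a topic, the only way to get it is a single partition.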

The Flink producer overrides this behavior, last time I checked; see FLINK-9610. If you are using Flink only as a consumer, you don't need to worry about this.

Kafka timestamps messages by default: unless you supply a timestamp yourself, the producer stamps each record with its creation time.