According to the Apache Flink documentation, the keyBy transformation logically partitions a stream into disjoint partitions, and all records with the same key are assigned to the same partition.
Is keyBy a 100% logical transformation? Doesn't it also involve physical data partitioning to distribute records across the cluster nodes? If so, how can it guarantee that all records with the same key are assigned to the same partition?
For instance, assume we are consuming a distributed data stream from an Apache Kafka cluster of n nodes, and the Apache Flink cluster running our streaming job consists of m nodes. When keyBy is applied to the incoming stream, how does it guarantee logical data partitioning? Or does it also perform physical data partitioning across the cluster nodes?
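To make the scenario concrete, here is a minimal sketch of the kind of job I mean, using the Kafka connector's KafkaSource API. The broker address, topic name, group id, and the choice of the first CSV field as the key are all placeholders, not part of any real setup:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KeyByExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical broker address and topic, just for illustration.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka-broker:9092")
                .setTopics("events")
                .setGroupId("flink-keyby-example")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> lines =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source");

        // keyBy extracts a key from each record; all records with the same
        // key are routed to the same parallel subtask of the downstream
        // operator, regardless of which Kafka partition or Flink node
        // the record arrived on.
        lines
            .keyBy(line -> line.split(",")[0])   // key = first CSV field
            .map(value -> "keyed: " + value)
            .print();

        env.execute("keyBy example");
    }
}
```

In this sketch, the records from the n Kafka nodes arrive at whichever Flink source subtasks read the corresponding Kafka partitions, and my question is what keyBy does with them after that point.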
It seems I am conflating logical and physical data partitioning.