4
votes

If, when creating a Kinesis data stream, I set the number of shards to, say, 10, and every time I put a record I assign it a random partition key like this:

    var putRecord = new PutRecord
    {
        Data = data ?? new byte[0],
        StreamName = stream,
        PartitionKey = GetRandomPartitionKey()
    };

How does Kinesis decide which shard a record goes to, and what happens if the number of unique partition keys is greater than the number of shards?


1 Answer

6
votes

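Hashing and hash key ranges.

The partition key is hashed with MD5 into a 128-bit integer. Each shard owns a contiguous range of that hash key space, and the record goes to the shard whose range contains the hash. This way, a given partition key always lands on the same shard for as long as the shards stay the same. If the stream is resharded (shards are split or merged), the ranges change, so the allocation can be different. Having more unique partition keys than shards is not a problem: many keys simply fall into each shard's range, and random keys spread records roughly evenly across the shards.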
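For illustration, the Kinesis documentation describes the mapping as an MD5 hash of the partition key, taken as a 128-bit integer, matched against each shard's hash key range. Below is a minimal sketch of that idea, not the service's actual code. It assumes the hash key space is split evenly across the shards and that the digest is read as a big-endian unsigned integer; the function names `even_shard_ranges`, `hash_key`, and `shard_for` are made up for this example. A real stream reports each shard's own range (`StartingHashKey` / `EndingHashKey`), which can be uneven after resharding.

```python
import hashlib

HASH_SPACE = 2 ** 128  # an MD5 digest is 128 bits wide


def even_shard_ranges(shard_count):
    """Split the hash key space into contiguous ranges.

    Assumption: an even split. Real shards may own uneven ranges.
    """
    size = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * size - 1
        ranges.append((start, end))
    return ranges


def hash_key(partition_key):
    """Map a partition key to a 128-bit integer via MD5 (big-endian assumed)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key, ranges):
    """Return the index of the shard whose range contains the key's hash."""
    h = hash_key(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= h <= end:
            return index
    raise ValueError("hash falls outside every shard range")


ranges = even_shard_ranges(10)
# The same key always resolves to the same shard while the ranges are unchanged.
first = shard_for("user-42", ranges)
second = shard_for("user-42", ranges)
```

Calling `shard_for` with thousands of distinct random keys against the same ten ranges still returns only indexes 0 through 9, which is why the number of unique keys can exceed the number of shards.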

Hashing a key to choose a bucket is a common technique used in many systems. For example, the internal storage of Python dictionaries hashes each key to pick the slot that holds the key/value pair.