Why shouldn't I give all my DynamoDB items the same partition key value?

Question

There are plenty of resources that recommend using high-cardinality attributes as partition keys. My question is, what will happen if I instead do the exact opposite of this and give all of my items the same partition key value (differentiating only by sort key), allowing me to query over the entire table?

Will this cause performance and/or hot partition issues? Do hot partitions even matter with adaptive capacity if they aren't reaching 3000 RCUs/1000 WCUs? Even then, what if queries are evenly distributed among my sort key?

Consensus seems to be that we shouldn't do this, but my question is: Why not?

jellycsc · Accepted Answer · 2020-11-21T02:26:28

The recommendations and best practices are there to help you get the most out of DynamoDB. Typically, people use DynamoDB to store massive, high-velocity data that runs into scalability problems in a traditional RDBMS.

If you are talking about a small amount of data where the aggregate access rate doesn't exceed 3000 RCUs/1000 WCUs, that's not enough load to hit any pain point in DynamoDB. In fact, you could probably achieve the same level of performance with a traditional RDBMS. However, as soon as your app becomes popular, or even if it just hits a traffic spike lasting 5 minutes, the amount of data and the request rate increase quickly, and you will feel the pain. That's why following best practices usually pays off: it future-proofs your design.
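To make that ceiling concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the per-partition limits quoted above (3000 RCUs/1000 WCUs) and, as a deliberate simplification, that each distinct partition key value gets at most one partition's worth of capacity (it ignores large item collections being split by sort key). The function and constant names are made up for illustration.

```python
# Per-partition throughput limits quoted in this answer.
MAX_RCU_PER_PARTITION = 3000
MAX_WCU_PER_PARTITION = 1000


def best_case_ceiling(distinct_partition_keys: int) -> dict:
    """Best-case throughput if traffic is spread evenly over the key values.

    Simplifying assumption: every partition key value contributes at most
    one partition's worth of capacity. Real tables can pack many key
    values onto the same partition, so the true ceiling may be lower.
    """
    if distinct_partition_keys < 1:
        raise ValueError("need at least one partition key value")
    return {
        "rcu": distinct_partition_keys * MAX_RCU_PER_PARTITION,
        "wcu": distinct_partition_keys * MAX_WCU_PER_PARTITION,
    }


def is_throttled(required_wcu: int, distinct_partition_keys: int) -> bool:
    """True if the write load exceeds the best-case write ceiling."""
    return required_wcu > best_case_ceiling(distinct_partition_keys)["wcu"]


if __name__ == "__main__":
    # One shared key value: a spike to 5000 WCUs cannot be absorbed.
    print(is_throttled(5000, distinct_partition_keys=1))
    # The same spike spread over 100 key values fits under the ceiling.
    print(is_throttled(5000, distinct_partition_keys=100))
```

Under this model, a single shared key value tops out at one partition's limits no matter how much capacity the table as a whole has.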

"Even then, what if queries are evenly distributed among my sort key?"

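DynamoDB splits a partition by sort key only once the item collection grows bigger than 10 GB. [ref] Below that size, items sharing a partition key generally sit on the same partition, so even evenly distributed sort key access all lands in one place. So it's likely that you will still have the hot partition problem.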

Don't get me wrong. There are use cases that require using the same partition key, such as modeling one-to-many and many-to-many relations in your data. These are valid use cases, since the data is relational by nature and that's the only way to model it efficiently in DynamoDB. However, if you choose to do the exact opposite of what the documentation suggests, your scalability will be limited and you will not be able to take full advantage of DynamoDB.
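As an illustration of the valid case, a one-to-many relation is commonly modeled by giving a parent and its children the same partition key and telling them apart by sort key. The sketch below fakes this with a plain Python list rather than a real table. The `PK`/`SK` attribute names, the `CUSTOMER#`/`ORDER#` prefixes, and the `query` helper are all made up for illustration and are not DynamoDB API calls.

```python
# In-memory stand-in for a table; each dict is one item.
ITEMS = [
    {"PK": "CUSTOMER#1", "SK": "PROFILE", "name": "Alice"},
    {"PK": "CUSTOMER#1", "SK": "ORDER#2020-11-01", "total": 30},
    {"PK": "CUSTOMER#1", "SK": "ORDER#2020-11-20", "total": 12},
    {"PK": "CUSTOMER#2", "SK": "PROFILE", "name": "Bob"},
    {"PK": "CUSTOMER#2", "SK": "ORDER#2020-10-05", "total": 99},
]


def query(items, pk, sk_prefix=""):
    """Mimic a query: exact match on partition key, prefix match on sort key.

    Results are returned ordered by sort key.
    """
    matches = [
        item for item in items
        if item["PK"] == pk and item["SK"].startswith(sk_prefix)
    ]
    return sorted(matches, key=lambda item: item["SK"])


if __name__ == "__main__":
    # Fetch every order of one customer with a single lookup.
    orders = query(ITEMS, "CUSTOMER#1", sk_prefix="ORDER#")
    print([order["SK"] for order in orders])
```

Note that the partition key here still takes many distinct values (one per customer), so load spreads across customers. That is quite different from putting every item in the table under a single value.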
