I want to understand exactly what will improve my performance if I go with the following partitioning strategy.
Let's say I have a table for songs and I want to use the artist as the partition key. This table is going to grow gradually. Today I have 25 artists with 5 songs each (125 rows in total), but over time I foresee 500 artists with 5 songs per artist (2,500 rows in total). I want to make the artist ID the partition key because in CQL it is necessary to specify the partition key in the WHERE clause, and in my UI the artist ID is the unique value from which I can show those 5 songs.
Also, what if I start with 2 Cassandra nodes today and eventually grow to 4 nodes, and later to 10 nodes? Can I keep the same partition key as the cluster grows?
Here is my table structure:
ArtistId (partition key) | SongId | Song
--------------------------------------------
1 | 1 | abc
1 | 2 | cde
1 | 3 | fgh
2 | 4 | ijk
2 | 5 | lmn
1 | 6 | opq
1 | 7 | rst
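
In CQL, I imagine the table definition would look roughly like the sketch below. The column types, and the use of songid as a clustering column so that each song is its own row within an artist's partition, are my assumptions and not something I have settled on:

```cql
-- Sketch only: column types are assumed, not final.
CREATE TABLE songs (
    artistid int,    -- partition key: all songs of one artist share a partition
    songid   int,    -- assumed clustering column: one row per song
    song     text,
    PRIMARY KEY ((artistid), songid)
);
```

With this layout, all rows for one artist would be read from a single partition.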
The query I would run is:
select * from songs where artistid = 1;
– Hitesh