Understanding the relationship between primary key and partitioning in Cassandra

Question

I am new to Cassandra and have a few novice-level questions about the primary key.

Is the primary key supposed to be unique per record? (My guess would be not.) To elaborate, suppose my table looks like this:

    CREATE TABLE user_action (
        user_id int,
        action text,
        date_of_action date,
        PRIMARY KEY (user_id)
    );

I am guessing I can have multiple rows with the same user_id.

If the primary key is not unique per record, can a primary key be split across many partitions?
Can a partition have multiple primary keys?
Is the primary key itself used to pick the partition, or is the hashCode of the primary key used to pick a partition?
Is it fair to think of a partition as a file?

LetsNoSQL · Accepted Answer · 2020-09-20T05:51:18

The primary key and the partition key are the same in some cases, but not always; it depends on how many columns make up the primary key. Data is distributed across the Cassandra cluster based on the partition key, and each partition key value identifies exactly one partition in the cluster. I am not explaining every scenario and concept here, but you should go through the documentation, and I am sure you will understand things very quickly after reading the links below.

https://www.datastax.com/blog/2016/02/most-important-thing-know-cassandra-data-modeling-primary-key

https://docs.datastax.com/en/dse/5.1/cql/cql/cql_using/useCompoundPrimaryKeyConcept.html
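
To make the distinction concrete, here is a minimal CQL sketch. The table name `user_action_by_day` and the choice of `date_of_action` as a clustering column are assumptions made for illustration, not something from the question. With `PRIMARY KEY (user_id)` as in the question, the primary key and the partition key are the same column, so each `user_id` holds at most one row and a second insert with the same `user_id` overwrites the first. Adding a clustering column lets one partition hold many rows:

```
-- Hypothetical variant of the question's table.
-- Partition key: user_id. Clustering column: date_of_action.
-- Assumes a keyspace has already been selected with USE.
CREATE TABLE user_action_by_day (
    user_id int,
    date_of_action date,
    action text,
    PRIMARY KEY ((user_id), date_of_action)
);

-- Same partition (user_id = 1), different primary keys: two separate rows.
INSERT INTO user_action_by_day (user_id, date_of_action, action)
VALUES (1, '2020-09-19', 'login');
INSERT INTO user_action_by_day (user_id, date_of_action, action)
VALUES (1, '2020-09-20', 'logout');

-- Same full primary key as the first insert: this overwrites that row
-- instead of creating a third one.
INSERT INTO user_action_by_day (user_id, date_of_action, action)
VALUES (1, '2020-09-19', 'purchase');

-- token() returns the hash the partitioner computes from the partition key;
-- that token, not the raw key value, decides which nodes own the partition.
SELECT token(user_id), user_id, date_of_action, action
FROM user_action_by_day
WHERE user_id = 1;
```

The final query should return two rows that share the same token, since both belong to the partition for `user_id = 1`. A full primary key therefore never spans partitions, while one partition can contain many primary keys.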
