I understand the difference between a Cassandra partition key, composite key, and clustering key, but I can't find enough information on how partitions are handled in Cassandra.
In Cassandra, a range of partition keys is stored on a node, like a partition/shard. Is my understanding correct?
Does each partition key have its own file (at the OS level) in the database? If so, won't reads be slower?
If each partition key does not have its own file, how is it handled?

1 Answer

Data is stored in Cassandra in wide rows called partitions. Each partition has a partition key that identifies it. To distribute data across the cluster, Cassandra uses a partitioner, which computes a hash (a token) of the partition key. Each node owns a range of token values, and a partition is placed on the node whose range contains its token. The default partitioner in Cassandra is Murmur3Partitioner.
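
To make the token idea concrete, here is a rough Python sketch of how a hashed partition key could be mapped to a node on a token ring. It is only an illustration, not Cassandra's code: `token_for`, `TokenRing`, the node names, and the token values are all made up, and the hash is a truncated MD5 used as a stand-in because Murmur3 is not in the Python standard library. It also ignores replication and virtual nodes.

```python
import bisect
import hashlib
from typing import Dict


def token_for(partition_key: str) -> int:
    """Hash a partition key to a signed 64-bit token.

    Stand-in hash (truncated MD5), NOT the Murmur3 hash that
    Murmur3Partitioner uses; only the general idea is the same.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


class TokenRing:
    """Hypothetical ring: each node owns the range (previous token, its token]."""

    def __init__(self, node_tokens: Dict[str, int]) -> None:
        # Keep (token, node) pairs sorted by token so we can binary-search.
        self._ring = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._ring]

    def owner_of_token(self, token: int) -> str:
        # First node whose token is >= the key's token; wrap to the
        # first node if the key's token is past the last node's token.
        idx = bisect.bisect_left(self._tokens, token)
        if idx == len(self._tokens):
            idx = 0
        return self._ring[idx][1]

    def owner(self, partition_key: str) -> str:
        return self.owner_of_token(token_for(partition_key))


# Assumed three-node cluster with evenly spaced, made-up tokens.
ring = TokenRing({"node-a": -2**62, "node-b": 0, "node-c": 2**62})
print(ring.owner("user:42"))  # always the same node for the same key
```

So a node holds a range of tokens (hashed keys), not a range of the raw partition key values, and the same key always hashes to the same node.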

At the OS level, the data is stored in SSTable files. There is no separate file per partition key: an SSTable holds many partitions, and a single partition can be spread across many SSTables. That is why you also need compaction, the process that consolidates SSTables so your partitions are not spread across too many of them. Reducing the number of SSTables a partition is spread across also improves read time. It's worth noting that SSTables are immutable.
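
As a toy model of why spreading hurts reads and what compaction buys you, the Python sketch below treats each SSTable as an in-memory dict. This is an assumption-heavy simplification: the `SSTable` type, `read_partition`, and `compact` are hypothetical names, cells are reconciled by "newest timestamp wins", and deletes (tombstones), the memtable, and the on-disk format are left out.

```python
from typing import Dict, List, Tuple

# Toy SSTable: partition key -> column -> (write timestamp, value).
# Real SSTables are immutable files on disk; this is only a model.
SSTable = Dict[str, Dict[str, Tuple[int, str]]]


def read_partition(sstables: List[SSTable], key: str) -> Dict[str, str]:
    """Read one partition by checking every SSTable that may contain it."""
    merged: Dict[str, Tuple[int, str]] = {}
    for table in sstables:
        for column, (ts, value) in table.get(key, {}).items():
            # Keep the cell with the newest write timestamp.
            if column not in merged or ts > merged[column][0]:
                merged[column] = (ts, value)
    return {column: value for column, (_, value) in merged.items()}


def compact(sstables: List[SSTable]) -> SSTable:
    """Merge several SSTables into one new table; inputs are not modified."""
    result: SSTable = {}
    for table in sstables:
        for key, columns in table.items():
            target = result.setdefault(key, {})
            for column, cell in columns.items():
                if column not in target or cell[0] > target[column][0]:
                    target[column] = cell
    return result


# The partition "user:1" is spread across two SSTables.
older: SSTable = {"user:1": {"name": (1, "Ann"), "city": (1, "Oslo")}}
newer: SSTable = {"user:1": {"city": (2, "Bergen")},
                  "user:2": {"name": (2, "Bob")}}

print(read_partition([older, newer], "user:1"))  # has to look at 2 tables
compacted = compact([older, newer])
print(read_partition([compacted], "user:1"))     # same data from 1 table
```

Before compaction the read has to touch both tables and reconcile them; afterwards the same result comes from a single table, which is the read-time improvement mentioned above.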

I suggest reading this, especially "How Cassandra reads and writes data".