Does Cassandra Store Columns from Composite Keys on Different Nodes

Question

I'm reading documentation on the Datastax site at http://www.datastax.com/documentation/cassandra/1.2/cassandra/cql_reference/create_table_r.html and I see: "When you use a composite partition key, Cassandra treats the columns in nested parentheses as partition keys and stores columns of a row on more than one node. "

The example given is:

    CREATE TABLE Cats (
      block_id uuid,
      breed text,
      color text,
      short_hair boolean,
      PRIMARY KEY ((block_id, breed), color, short_hair)
    );

I understand how the clustering columns (in this case, color and short_hair) work with regard to how they are actually stored on disk as contiguous "columns" for the given row. What I don't understand is the line "...stores columns of a row on more than one node". Is this right?

For a given block_id and breed, doesn't this composite key just make a partition key similar to "block_id + breed", in which case the columns/clusters would be in the same row, whose physical location is determined by the partition key (block_id + breed) ?

Or is there some kind of splitting in this row going on because the primary key is based on two fields?

EDIT: I think Richard's answer below is probably right, but I've also come across this in the Datastax documentation for 1.2, which reinforces the first quote I posted:

"composite partition key - Stores columns of a row on more than one node using partition keys declared in nested parentheses of the PRIMARY KEY definition of a table."

Why would it use the plural, partition key*s*? The fields that make up the composite key together form the only row key, as far as I know, and they are all used to make that key.

Then they say the columns of a row can be split, which to me means a single row (with a given partition key) could have its columns split up across different nodes, which would mean the fields of the composite key are being handled separately.

Still a little confused on the Datastax documentation and whether it's actually right.

Richard · Accepted Answer · 2013-07-15T09:04:01

I think what it means is that rows with the same block_id can be stored on different nodes. As you say, the partition key is like "block_id + breed", so rows with the same block_id but a different breed will in general be stored on different nodes. But rows with the same block_id and breed will always be stored on the same node.

Basically, the nodes that store a partition are found by a function of the partition key only. Whether it is composite or not, nothing else can join together or split rows.
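To make that concrete, here is a toy sketch in Python of the idea, not Cassandra's actual partitioner. The four-node ring, the node names, and the helpers `toy_token` and `node_for` are all made up for illustration, and MD5 over a naively joined string is only a stand-in for the real token function. The point it shows is that the whole composite partition key is hashed as one unit, and the clustering columns never enter the calculation.

```python
import hashlib
from bisect import bisect_left

# Hypothetical four-node ring (an assumption for this sketch).
# Each node owns the token range that ends at its own token.
RING_SIZE = 2 ** 32
NODE_TOKENS = [(RING_SIZE // 4) * (i + 1) - 1 for i in range(4)]
NODE_NAMES = ["node-a", "node-b", "node-c", "node-d"]


def toy_token(block_id, breed):
    """Hash the entire composite partition key into one token.

    MD5 on a joined string is a stand-in, not Cassandra's real hashing.
    """
    key = f"{block_id}:{breed}".encode("utf-8")
    return int.from_bytes(hashlib.md5(key).digest()[:4], "big")


def node_for(block_id, breed):
    """Pick the owning node from the partition key alone.

    color and short_hair (the clustering columns) are not parameters,
    so they cannot influence which node a row lands on.
    """
    token = toy_token(block_id, breed)
    return NODE_NAMES[bisect_left(NODE_TOKENS, token)]


if __name__ == "__main__":
    rows = [
        ("block-1", "siamese", "white", True),
        ("block-1", "siamese", "black", False),  # same partition as above
        ("block-1", "persian", "white", False),  # same block_id, new partition
    ]
    for block_id, breed, color, short_hair in rows:
        print(block_id, breed, color, short_hair, "->", node_for(block_id, breed))
```

The first two rows share (block_id, breed), so they always map to the same node in this sketch. The third row has the same block_id but a different breed, so it gets its own token and may or may not land on a different node.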
