I'm reading the DataStax documentation at http://www.datastax.com/documentation/cassandra/1.2/cassandra/cql_reference/create_table_r.html, which says: "When you use a composite partition key, Cassandra treats the columns in nested parentheses as partition keys and stores columns of a row on more than one node."
The example given is:

    CREATE TABLE Cats (
      block_id uuid,
      breed text,
      color text,
      short_hair boolean,
      PRIMARY KEY ((block_id, breed), color, short_hair)
    );
I understand how the clustering columns (in this case, color and short_hair) work, in that they are stored on disk as contiguous "columns" within the given row. What I don't understand is the phrase "...stores columns of a row on more than one node". Is this right?
For a given block_id and breed, doesn't this composite key just produce a single partition key, something like "block_id + breed"? In that case the clustering columns would all be in the same row, whose physical location is determined by that partition key (block_id + breed).
Or is there some kind of splitting in this row going on because the primary key is based on two fields?
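To make the first interpretation concrete, here is a minimal Python sketch of what I assume happens: both partition key components are serialized into one key, that key is hashed to one token, and the token picks one node. Everything in it is an illustrative assumption rather than Cassandra's actual code: the helper names, the four-node ring, the use of plain strings for block_id, the length-prefixed serialization, and MD5 as a stand-in for whichever partitioner the cluster is configured with.

```python
import hashlib
import struct
from bisect import bisect_left

# Hypothetical four-node ring with evenly spaced tokens over a 128-bit space.
RING_SIZE = 2 ** 128
NODE_NAMES = ["node0", "node1", "node2", "node3"]
NODE_TOKENS = [i * RING_SIZE // len(NODE_NAMES) for i in range(len(NODE_NAMES))]


def serialize_partition_key(*components: str) -> bytes:
    """Join all partition key components into ONE byte string.

    Each component is length-prefixed so ("ab", "c") and ("a", "bc")
    do not collide. This layout is an assumption for illustration.
    """
    out = b""
    for component in components:
        raw = component.encode("utf-8")
        out += struct.pack(">H", len(raw)) + raw + b"\x00"
    return out


def token_for(key: bytes) -> int:
    """Hash the whole serialized key to a single token (MD5 as a stand-in)."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")


def node_for(token: int) -> str:
    """Pick the first node whose token is >= the key's token, wrapping around."""
    index = bisect_left(NODE_TOKENS, token) % len(NODE_NAMES)
    return NODE_NAMES[index]


def place(row: dict) -> str:
    """Only block_id and breed (the partition key) decide the node.

    color and short_hair are clustering columns and are never hashed here.
    """
    key = serialize_partition_key(row["block_id"], row["breed"])
    return node_for(token_for(key))


if __name__ == "__main__":
    black_cat = {"block_id": "b1", "breed": "siamese", "color": "black", "short_hair": True}
    white_cat = {"block_id": "b1", "breed": "siamese", "color": "white", "short_hair": False}
    # Same (block_id, breed) means same token, hence same node.
    print(place(black_cat), place(white_cat))
```

If this model is right, color and short_hair never influence placement, so every cell for a given (block_id, breed) pair stays on the same node, and only rows with a different block_id or breed can land elsewhere.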
EDIT: I think Richard's answer below is probably right, but I've also come across this in the DataStax documentation for 1.2, which reinforces the first quote I posted:
"composite partition key - Stores columns of a row on more than one node using partition keys declared in nested parentheses of the PRIMARY KEY definition of a table."
Why would it use the plural, partition key*s*? As far as I know, the fields that make up the composite key together form the one and only row key, and all of them are used to build that key.
The documentation then says the columns of a row can be split, which to me means that a single row (with a given partition key) could have its columns spread across different nodes. That would imply the fields of the composite key are being handled separately.
I'm still a little confused by the DataStax documentation and whether it's actually right.