6
votes

I am hoping someone can clear my confusion on the difference between a row and partition in Cassandra. I thought a row would be a set of columns(like in a SQL DB), as specified in the schema, distributed across nodes by partition keys and ordered by the clustering key within each partition.

But then I ran into this tutorial: https://academy.datastax.com/demos/getting-started-time-series-data-modeling

Under "Time series Pattern 1", it states:

Since each column is dynamic, our row will grow as needed to accommodate the data.

Why would a row grow? I can see a partition growing but why a row? The picture in that example also makes no sense to me -- I imagine the partition as being a set of rows each having a (WeatherStation|event) columns, where WeatherStationID would be same repeated value for each row in a partition.

I also tried looking at this tutorial: http://www.slideshare.net/yukim/cql3-in-depth, slide 15. enter image description here

My reading is that this shows a single partition with two rows. It seems to me that no matter how much new data you insert, the partition will grow but not the row (short of running "alter table" of course)?

1

1 Answers

4
votes

You are correct. In the article that you posted he talks of partitions, not rows.

The two terms are still used very loosely to mean each other in many situations. Back in the thrift days, the correct term was row but that changed with the new CQL.

On a unrelated note in Thrift you are able grow a row(not partition) since every row had their own Schema. You can find more information on that here: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows