0
votes

I've noticed that changing a regular Cassandra column to a clustering column can significantly reduce the size of the table in some circumstances.

For this example table:

id     UUID        K
time   TIMESTAMP   C
state  TINYINT    (C)
value  DOUBLE

The size of 100000 rows is estimated at 3.9 MB if state is an ordinary column, or 2.4 MB if state is a clustering column (estimated using the method in DataStax course DS220).

If you look at how the data is physically stored it isn't hard to see why this difference exists. In the former case there are two internal cells per timestamp - one for state and one for value. In the latter case value is incorporated into the cell key so there is just one cell per timestamp, and the timestamp (part of the cell key) is stored only once.

The second clustering column does not create any new restrictions on what can be queried. SELECT * FROM table WHERE id=? AND time>=? AND time<? is still fine.

It seems like a win-win situation. Are there any downsides, in particular, performance-wise?

(All I can think of is that if state is a regular column then it can be omitted from an INSERT and the state internal cell will never be created. I imagine if state is a regular column and usually omitted then the table will be very slightly smaller than if state is a clustering column.)


Additional comments It's worth noting that in the definition above you can't filter by state without an equality filter on time, making it not very useful for filtering state. And if you put the state column above time to resolve this then yes you can filter by state and time inequality, but if you want all states (IN clause) then the rows are returned ordered by state first, then time, which again is not very useful.

2

2 Answers

1
votes

I would think the main difference here is that if it's a clustering column it must be provided with INSERTs as it's part of the primary key. Also, as it's part of the primary key, you can't update it either, which could be problematic for some tables. If you don't have any concerns about either of those two, I don't see any reason why you couldn't add it.

1
votes

1) You create a row per state. Your data model would have to realize and understand that. You could potentially create two rows with the different states for the same id, time, which the original model disallows.

2) If you delete, you'll either need to specify state or you'll be creating Range Tombstones (range deletes, because you're deleting all rows for a given id and time, but it may be a range of states). Range tombstones are especially expensive (on the read path) in 2.1, and aren't properly accounted for in TombstoneOverwhelming exception handlers until a fairly recent version of Cassandra, so avoiding range tombstones is usually a good idea, unless you actually need them.