Does Cassandra store only the affected columns when updating a record or does it store all columns every time it is updated?

Question

If it stores only the affected columns:

Does that mean, unlike Mongo or an RDBMS, whether we retrieve every column or some columns will have a big performance impact in Cassandra? (I am not talking about transfer time over the network, as that affects all of the above.)
Does that mean during compaction, it cannot just stop when it finds the latest row for a primary key, it has to go through the full set in SSTables? (I understand there will be optimisations, as a previously compacted SSTable will have at most one occurrence of a row.)

Aaron · Accepted Answer · 2020-06-16T12:59:15

Please ask only one question per question.

That is entirely up to you. If you write one column value, it'll persist just that one. If you write them all, they will all persist, even if they are the same as the current value.
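To make that concrete, here is a minimal Python sketch of the behaviour described above. It is a toy model, not Cassandra's storage engine: the `make_write` helper, the `log` list, and the sample column names are all made up for illustration. Each write records only the columns named in the statement, each with its own timestamp.

```python
from typing import Any, Dict, List, Tuple

# Toy model (an assumption, not Cassandra internals):
# one write is a dict of column name -> (timestamp, value).
Cell = Tuple[int, Any]
Write = Dict[str, Cell]


def make_write(timestamp: int, **columns: Any) -> Write:
    """Record only the columns named in the statement, each with its timestamp."""
    return {name: (timestamp, value) for name, value in columns.items()}


# Hypothetical log of writes for a single primary key.
log: List[Write] = []

# The initial insert supplies all three columns.
log.append(make_write(1, name="Ann", email="ann@example.com", city="Oslo"))

# An update that names only `city` persists only that one cell.
log.append(make_write(2, city="Bergen"))

# An update that rewrites every column persists every cell again,
# even though `name` and `email` have not changed.
log.append(make_write(3, name="Ann", email="ann@example.com", city="Bergen"))

cells_per_write = [len(w) for w in log]
print(cells_per_write)
```

So the amount written per update depends on which columns the statement names, not on which values actually differ from what is already stored.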

whether we retrieve every column or some columns will have a big performance impact

This is definitely the case. Queries for columns whose values are small, or that have not been overwritten or deleted, will be much faster than queries for large values or for columns that have been updated or deleted many times.

during compaction, it cannot just stop when it finds the latest row for a primary key, it has to go through the full set in SSTables?

Yes. And not just during compaction, but read queries will also check multiple SSTable files.
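As a rough illustration of why a read cannot stop at the first SSTable that contains the key, here is a small Python sketch. It assumes a simplified model in which each SSTable holds a partial row (only the cells that were written) and the newest timestamp wins per column. The `read_row` function and the sample data are hypothetical, and the sketch ignores tombstones and the optimisations a real node uses to skip SSTables.

```python
from typing import Any, Dict, List, Tuple

Cell = Tuple[int, Any]      # (write timestamp, value)
Row = Dict[str, Cell]       # column name -> cell
SSTable = Dict[str, Row]    # primary key -> partial row (toy model)


def read_row(sstables: List[SSTable], key: str) -> Dict[str, Any]:
    """Merge the fragments of one row, keeping the newest cell per column."""
    merged: Row = {}
    for table in sstables:  # every table may hold a fragment of the row
        for column, cell in table.get(key, {}).items():
            if column not in merged or cell[0] > merged[column][0]:
                merged[column] = cell
    return {column: value for column, (_, value) in merged.items()}


# The older table holds the original insert; the newer one holds only
# the updated `city` cell.
older = {"user1": {"name": (1, "Ann"), "city": (1, "Oslo")}}
newer = {"user1": {"city": (2, "Bergen")}}

result = read_row([newer, older], "user1")
print(result)
```

If the read stopped at `newer`, it would return `city` but lose `name`, which only exists in `older`. The same reasoning applies to compaction: the latest fragment of a row is not necessarily the complete row.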
