As I understand it, in relational databases, adding a new column requires the data to be reallocated (see "ALTER TABLE without locking the table?") so that each row stays contiguous on disk.

I would like to understand how this is achieved in wide-column stores such as Cassandra, which are sparse and can handle many dynamic column insertions (http://www.datastax.com/dev/blog/thrift-to-cql3, "Dynamic Column family").

Thanks!

1 Answer

In Cassandra, adding a column is a bit like adding a row in a relational database. You can even delete a column for a specific row:

delete first_name from user where user_id='abcd';

In CQL, ALTER TABLE doesn't modify any rows. In short, it only modifies the schema dictionary that describes the tables (look at the tables prefixed with schema_ in the system keyspace; in Cassandra 3.0 and later they live in the system_schema keyspace). This changes only CQL parsing (the new column is now recognized) and interpretation (the meaning of select * from user changes).
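To make the "schema only" point concrete, here is a toy Python model. It is a sketch for illustration, not Cassandra's storage code: the class ToyTable and all of its method names are hypothetical. Each row is a sparse map holding only the cells that were actually written, so adding a column touches the schema and nothing else.

```python
class ToyTable:
    """Toy model of a sparse wide-column table (illustration only)."""

    def __init__(self, columns):
        self.schema = set(columns)  # stands in for the schema dictionary
        self.rows = {}              # row key -> {column: value}, written cells only

    def add_column(self, name):
        # Analogue of ALTER TABLE ... ADD: only the schema changes.
        self.schema.add(name)

    def insert(self, key, **cells):
        unknown = set(cells) - self.schema
        if unknown:
            raise ValueError(f"unknown columns: {sorted(unknown)}")
        self.rows.setdefault(key, {}).update(cells)

    def delete_cell(self, key, column):
        # Analogue of DELETE column FROM table WHERE key = ...
        # (Cassandra would write a tombstone; the toy just drops the cell.)
        self.rows.get(key, {}).pop(column, None)

    def select_all(self, key):
        # Columns with no stored cell read back as None (null).
        stored = self.rows.get(key, {})
        return {col: stored.get(col) for col in sorted(self.schema)}


users = ToyTable(["user_id", "first_name"])
users.insert("abcd", user_id="abcd", first_name="Ann")
before = dict(users.rows["abcd"])
users.add_column("email")           # no existing row is rewritten
after = dict(users.rows["abcd"])
result = users.select_all("abcd")   # the new column simply reads as None
```

Because rows are not fixed-width records, there is nothing to reallocate: the stored cells in before and after are identical, and only the interpretation of the row changes.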

When you drop a column, its data no longer appears in query results, yet it is still present in the SSTables. The data will be removed (and the space freed) during a future compaction, as happens with tombstones.
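The drop-then-compact behaviour can be sketched as a small Python model. This is a simplification, not Cassandra's implementation: read_row and compact are hypothetical helper names, and a plain list of (key, cells) pairs stands in for an immutable SSTable.

```python
def read_row(sstable, schema, key):
    """Return only the cells whose column is still in the schema."""
    for row_key, cells in sstable:
        if row_key == key:
            return {c: v for c, v in cells.items() if c in schema}
    return None


def compact(sstable, schema):
    """Build a new 'SSTable' that physically omits dropped columns."""
    return [(k, {c: v for c, v in cells.items() if c in schema})
            for k, cells in sstable]


schema = {"user_id", "first_name", "email"}
sstable = [("abcd", {"user_id": "abcd", "first_name": "Ann", "email": "a@x.org"})]

schema.discard("email")                      # analogue of ALTER TABLE ... DROP
visible = read_row(sstable, schema, "abcd")  # email is filtered out at read time
still_on_disk = "email" in sstable[0][1]     # but the cell is still stored
compacted = compact(sstable, schema)         # compaction rewrites the data without it
```

The drop itself is cheap because it only edits the schema; the cost of reclaiming space is deferred to the rewrite that compaction performs anyway.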