With Cassandra it is possible to specify the clustering order of a table on a particular column.
CREATE TABLE myTable (
    user_id INT,
    message TEXT,
    modified DATE,
    PRIMARY KEY ((user_id), modified)
)
WITH CLUSTERING ORDER BY (modified DESC);
Note: In this example, there is intentionally one message per user_id.
Given this table, my understanding is that queries will perform better when recent data is requested.
However, if one were to make updates to the "modified" column, does it add extra overhead on the server to "re-order" the data, and is that overhead significant compared with the gain in query performance?
In other words, given this table, would it perform better if the "CLUSTERING ORDER BY (modified DESC)" was dropped?
UPDATE: I fixed the invalid CQL by adding modified to the primary key; the original questions still stand.
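To make the read pattern in question concrete, here is a minimal sketch of two queries against the table above. The user_id value 42 is a made-up example. With the DESC clustering order, the newest row of a partition is stored first, so a plain LIMIT returns it; reading oldest-first is still allowed but has to be requested explicitly.

```cql
-- Newest row for the user: follows the table's native (DESC) order.
SELECT modified, message FROM myTable WHERE user_id = 42 LIMIT 1;

-- Oldest row for the user: asks Cassandra to read the partition in reverse.
SELECT modified, message FROM myTable WHERE user_id = 42 ORDER BY modified ASC LIMIT 1;
```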
"modified" is not defined as the clustering key, so you can't define a clustering order on it. To fix this, the primary key should be defined as PRIMARY KEY (user_id, modified). For more information regarding the composite key and the characteristics of the clustering key, see stackoverflow.com/questions/24949676/… – Carlos Monroy Nieblas
Once the primary key includes "modified", you won't be able to update that record (as explained in stackoverflow.com/questions/27075596/…). Cassandra is an append-only database engine: any update to a record adds a new record with a different timestamp, and a select considers only the record with the latest timestamp. This means that there is no "re-order" operation ever. – Carlos Monroy Nieblas
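As an illustration of the point above: because modified is part of the primary key, it cannot appear in the SET clause of an UPDATE. "Changing" it amounts to deleting the old row and inserting a new one. The sketch below assumes made-up values (user 42, the two dates, and the message text) and groups both statements in a single-partition batch so they are applied together.

```cql
-- Hypothetical values; both statements target the same partition (user_id = 42).
BEGIN BATCH
    -- Writes a tombstone for the old row; nothing is rewritten in place.
    DELETE FROM myTable WHERE user_id = 42 AND modified = '2015-03-01';
    -- Appends the row under its new clustering key value.
    INSERT INTO myTable (user_id, message, modified) VALUES (42, 'hello', '2015-03-02');
APPLY BATCH;
```

Both statements are ordinary appended writes. Rows are kept sorted by the clustering key as they are written and later merged during compaction, so the DESC order should not add a separate "re-order" step for existing data.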