Put primary keys in cassandra for updating record

1

votes

I have a table in Cassandra. My task: I want to select records by time range (so timestamp must be part of the primary key, and I don't want to use ALLOW FILTERING). When I insert a record whose provider_id and filename already exist in the table, the existing record should be updated.

CREATE TABLE test (
    provider_id uuid,
    name text,
    filename text,
    timestamp timestamp,
    is_deleted boolean,
    PRIMARY KEY (provider_id, filename, timestamp)
)

cassandra cql

Do you want to select or update? – xmas79

3

votes

You can't update a primary key column; writing with a different key value inserts another record.
That's how Cassandra works.
You have to select the timestamp by provider_id and filename, then delete the row by provider_id, filename and timestamp, and reinsert it with the new timestamp.
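
A minimal sketch of that sequence in CQL, assuming the table from the question with provider_id declared as a uuid column. The UUID, file name and timestamp values are made-up placeholders.

```cql
-- 1. Look up the current timestamp for this provider_id and filename.
SELECT timestamp FROM test
WHERE provider_id = 123e4567-e89b-12d3-a456-426614174000
AND filename = 'test.txt';

-- 2. Delete the old row and write its replacement in one batch.
--    The timestamp in the DELETE is the value returned by step 1.
BEGIN BATCH
  DELETE FROM test
  WHERE provider_id = 123e4567-e89b-12d3-a456-426614174000
  AND filename = 'test.txt'
  AND timestamp = '2016-01-01 00:00:00+0200';

  INSERT INTO test (provider_id, filename, timestamp, name, is_deleted)
  VALUES (123e4567-e89b-12d3-a456-426614174000, 'test.txt',
          '2016-08-13 23:59:00+0200', 'test-new', false);
APPLY BATCH;
```

Both statements in the batch target the same partition, so they are applied together. The read in step 1 is still a separate request, so a concurrent writer could slip in between the two steps.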

1

votes

If you want to select items by time range, you should use a clustering column. Your CREATE statement should be:

CREATE TABLE test (
    provider_id UUID,
    name text,
    filename text,
    timestamp timestamp,
    is_deleted boolean,
    PRIMARY KEY ((provider_id, filename), timestamp)
)

Now provider_id + filename is your partition key, and timestamp your clustering column.

The composite partition key consists of provider_id and filename. The clustering column, timestamp, determines the order in which rows are stored within a partition. Generally, Cassandra will store rows having the same provider_id but a different filename on different nodes, and rows having the same provider_id and filename on the same node.

This means that you can now query your data like this:

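-- provider_id is a UUID, so the literal below is only a placeholder value
SELECT * FROM test
WHERE provider_id = 123e4567-e89b-12d3-a456-426614174000
AND filename = 'test.txt'
AND timestamp >= '2016-01-01 00:00:00+0200' AND timestamp <= '2016-08-13 23:59:00+0200';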
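
Rows within a partition come back in clustering order, which is ascending by timestamp unless declared otherwise. If newest-first results are wanted (an assumption, the question does not say), the order can be declared on the table. A sketch, with test_newest_first as a hypothetical table name:

```cql
-- Same columns and keys as the table above; only the clustering order differs.
CREATE TABLE test_newest_first (
    provider_id uuid,
    name text,
    filename text,
    timestamp timestamp,
    is_deleted boolean,
    PRIMARY KEY ((provider_id, filename), timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);
```

The same time range query then returns the most recent rows first without an ORDER BY clause.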

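And for a possible update (an UPDATE must identify the row by its full primary key, so the timestamp has to be an exact value rather than a range):

-- provider_id is a UUID, so the literal below is only a placeholder value
UPDATE test
SET name = 'test-new'
WHERE provider_id = 123e4567-e89b-12d3-a456-426614174000
AND filename = 'test.txt'
AND timestamp = '2016-01-01 00:00:00+0200';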

