2
votes

I have table with node_id, node_name and data. My requirement is to getByID and getByName. So I have made id and name as the primary keys. But I also need to sometimes update the name as well.

I know Cassandra does not allow updating primary keys and having non primary key in the WHERE clause.

How can I achieve this?

I did consider deleting the record first, and then inserting again with the same id and new name. But I read that this would create tombstones and affect the performance.

2

2 Answers

1
votes

Use only node_id as the primary key. To implement getByName create a materialized view. materialized views in cassandra

-1
votes
    create table users_by_id_name(
        id int,
        createdOn bigint, -- timestamp in millisec
        name text,
        age int, 
        primary key (id,name,createdOn)
    )WITH CLUSTERING ORDER BY ( name DESC, createdOn DESC);

Use above table definition to insert users. Insert query --

insert into users_by_id_name (id,createdOn,name,age) values (1,100,'darthvedar',28);

to update the user, insert the row again with same user id and updated name and createdOn value.

insert into users_by_id_name (id,createdOn,name,age) values (1,200,'obi-wan-kenobi',28);

while selecting the user use below query --

select by user id -

select * from users_by_id_name where id=1 limit 1;

Select user by name -

select * from users_by_id_name where name='obi-wan-kenobi' ALLOW FILTERING;

Other way is to use secondary index on user name. Think, user name is not going to change too frequently, so secondary index is also one good option.

Edit after comments -

If you have very frequent updates on user name, it would be better to use two different tables.

   create table users_by_id(
    id int,
    name text,
    age int, 
    primary key (id)
);
   create table users_by_name(
    id int,
    name text,
    age int, 
    primary key (name)
);

While inserting , insert in both the tables using batch statement.

Hope this will help.