Cassandra schema table suggestions

Question

I need a suggestion on designing a Cassandra table schema. I have created a table like this:

CREATE TABLE sams.events (
    addedtime timestamp,
    hostname text,
    appname text,
    eventtime timestamp,
    PRIMARY KEY (addedtime, hostname)
) WITH CLUSTERING ORDER BY (hostname ASC);

Now these are my requirements:

1) I should be able to run range queries on addedtime, for example from date x to date y

2) I should be able to query by appname and order the rows in ascending order using addedtime

How can I achieve this? I am open to changing the table schema.

Additional information: I have created a Cassandra cluster with 2 data centers of 3 nodes each.

Aravind Chamakura · Accepted Answer · 2015-08-23T06:09:31

You mentioned you have only 2 apps. How many hostnames do you have? Is that number equal to or greater than the number of nodes in the cluster? If so, you can try the following schema, which should give you an even spread of data.

CREATE TABLE mykeyspace.events (
    appname text,
    hostname text,
    addedtime timeuuid,
    eventtime timeuuid,
    PRIMARY KEY ((appname, hostname), addedtime)
);

insert into events (appname, hostname, addedtime, eventtime) values ('app1', 'host1', now(), now());
insert into events (appname, hostname, addedtime, eventtime) values ('app1', 'host1', now(), now());
insert into events (appname, hostname, addedtime, eventtime) values ('app1', 'host2', now(), now());
insert into events (appname, hostname, addedtime, eventtime) values ('app1', 'host3', now(), now());
insert into events (appname, hostname, addedtime, eventtime) values ('app1', 'host4', now(), now());

Query 1: Range query by addedtime (this assumes the number of hostnames is small; otherwise the IN clause becomes very large)

select * from events where appname = 'app1' and hostname in ('host1','host2') and addedtime > maxTimeuuid('2015-08-23 00:46:00-0500') and addedtime < minTimeuuid('2015-08-23 00:49:19-0500') ;

Query 2: By appname (again assuming the number of hostnames is small)

select appname,hostname,dateOf(addedtime) from events where appname = 'app1' and hostname in ('host1','host2');

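NOTE: With an IN clause on the partition key, you cannot rely on the query itself to return the combined result ordered by addedtime. Rows are sorted by addedtime only within each (appname, hostname) partition, so ordering across hosts has to be done on the client side.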
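
One way to get a single ascending list is to query each host separately and merge the results in the application. The following is only a rough sketch in Python, not driver code: the function name merge_by_addedtime and the sample rows are made up for illustration, and addedtime is shown as a plain datetime rather than a timeuuid. It assumes each per-host list is already in ascending addedtime order, which is what the clustering column gives you within a partition.

```python
import heapq
from datetime import datetime
from operator import itemgetter


def merge_by_addedtime(per_host_rows):
    """Merge per-partition row lists into one list ascending by addedtime.

    Each inner list must already be sorted ascending by 'addedtime'
    (hypothetical row shape: a dict with 'hostname' and 'addedtime').
    """
    return list(heapq.merge(*per_host_rows, key=itemgetter("addedtime")))


# Made-up sample rows standing in for the results of one query per host.
host1_rows = [
    {"hostname": "host1", "addedtime": datetime(2015, 8, 23, 0, 46, 10)},
    {"hostname": "host1", "addedtime": datetime(2015, 8, 23, 0, 48, 0)},
]
host2_rows = [
    {"hostname": "host2", "addedtime": datetime(2015, 8, 23, 0, 47, 5)},
]

merged = merge_by_addedtime([host1_rows, host2_rows])
for row in merged:
    print(row["addedtime"].isoformat(), row["hostname"])
```

Because heapq.merge only looks at the head of each input, this avoids re-sorting everything, but it does require that every per-host list is sorted the same way.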
