Below is the CQL for three tables. All three have nearly the same column structure but differ in how the PRIMARY KEY is defined.

tab1: no composite partition key (key1 alone is the partition key)

CREATE TABLE tab1
(
key1 text,
key2 text,
key3 text,
key4 text,
data1 text,
data2 text,
data3 int,
PRIMARY KEY(key1,key2,key3,key4));

tab2: (key1, key2) forms a composite partition key

CREATE TABLE tab2
(
key1 text,
key2 text,
key3 text,
key4 int,
data1 text,
data2 text,
data3 text,
PRIMARY KEY((key1,key2),key3,key4));

tab3: (key1, key2, key3) forms a composite partition key

CREATE TABLE tab3
(
key1 text,
key2 text,
key3 text,
key4 int,
data1 text,
data2 text,
data3 text,
PRIMARY KEY((key1,key2,key3),key4));

When querying, the values of key1, key2, and key3 are known, and key4 is specified as a range. A sample CQL query:

select data1,data2,data3 from tab3 where key1='value1' and key2='value2' and key3='value3' and key4 > 1000 and key4 < 1000000 ;

A key4 range may match some 50,000 records.

Which table design is better?

Which design has better read/write performance?

tab3 is the best design for reads. Write performance doesn't change much between the designs; it depends a lot on the consistency level you use. The best way to find out is to load sample data and test for your use case. The size of the data matters for any kind of performance. - Ananth
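
A minimal cqlsh sketch of such a test, assuming the tab3 table from the question already exists in the current keyspace. The inserted values and the QUORUM consistency level are made up for illustration:

```cql
-- cqlsh commands: choose a consistency level and turn on request tracing
CONSISTENCY QUORUM;
TRACING ON;

-- Load a sample row (repeat with realistic volumes for a meaningful test)
INSERT INTO tab3 (key1, key2, key3, key4, data1, data2, data3)
VALUES ('value1', 'value2', 'value3', 5000, 'a', 'b', 'c');

-- Run the range query from the question and inspect the trace
SELECT data1, data2, data3
FROM tab3
WHERE key1 = 'value1' AND key2 = 'value2' AND key3 = 'value3'
  AND key4 > 1000 AND key4 < 1000000;
```

cqlsh prints a trace after each traced statement, which gives a rough per-request timing. Repeating the same steps against tab1 and tab2 with the same data gives a basis for comparison.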

1 Answer

To support range queries over key4, it needs to be a clustering column, which it is in all three tables. tab1, however, uses key1 alone as the partition key, so every row sharing a key1 value lands in a single partition, which makes it the weakest of the three. Since you're always specifying an exact value for key3, there's no need to make it a clustering column, so tab3 is a better choice than tab2. Leaving key3 in the partition key will spread your data more evenly around the cluster.
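
As a rough sketch of the trade-off between tab2 and tab3, here are the queries each design can serve. The literal values are placeholders, and the last query is an illustration that does not appear in the question:

```cql
-- tab3: key1, key2 and key3 together form the partition key, so all three
-- are given with equality; key4 is the only clustering column and can be
-- restricted by a range.
SELECT data1, data2, data3
FROM tab3
WHERE key1 = 'value1' AND key2 = 'value2' AND key3 = 'value3'
  AND key4 > 1000 AND key4 < 1000000;

-- tab2: the same query is valid; key3 is now an equality restriction on
-- the first clustering column, followed by the range on key4.
SELECT data1, data2, data3
FROM tab2
WHERE key1 = 'value1' AND key2 = 'value2' AND key3 = 'value3'
  AND key4 > 1000 AND key4 < 1000000;

-- tab2 only: because key3 is a clustering column here, it can itself be
-- given as a range within one (key1, key2) partition.
SELECT key3, key4, data1
FROM tab2
WHERE key1 = 'value1' AND key2 = 'value2'
  AND key3 >= 'a' AND key3 < 'n';
```

tab3 would not accept the last query as written, since key3 is part of its partition key. If that kind of query is never needed, tab3 gives smaller partitions that are spread over more nodes.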