1
votes

my aim is to get the msgAddDate based on below query :

select max(msgAddDate)

from sampletable

where reportid = 1 and objectType = 'loan' and msgProcessed = 1;

Design 1 :

here the reportid, objectType and msgProcessed may not be unique. To add the uniqueness I have added msgAddDate and msgProcessedDate (an additional unique value).

I use this design because I don't perform range query.

Create table sampletable ( reportid INT,

objectType TEXT,

msgAddDate TIMESTAMP,

msgProcessed INT,

msgProcessedDate TIMESTAMP,

PRIMARY KEY ((reportid ,msgProcessed,objectType,msgAddDate,msgProcessedDate));

Design 2 :

create table sampletable (

reportid INT,

objectType TEXT,

msgAddDate TIMESTAMP,

msgProcessed INT,

msgProcessedDate TIMESTAMP,

PRIMARY KEY ((reportid ,msgProcessed,objectType),msgAddDate, msgProcessedDate)) );

Please advice which one to use and what will be the pros and cons between two based on performance.

1

1 Answers

1
votes

Design 2 is the one you want.

In Design 1, the whole primary key is the partition key. Which means you need to provide all the attributes (which are: reportid, msgProcessed, objectType, msgAddDate, msgProcessedDate) to be able to query your data with a SELECT statement (which wouldn't be useful as you would not retrieve any additional attributes than the one you already provided in the WHERE statemenent)

In Design 2, your partition key is reportid ,msgProcessed,objectType which are the three attributes you want to query by. Great. msgAddDate is the first clustering column, which will be automatically sorted for you. So you don't even need to run a max since it is sorted. All you need to do is use LIMIT 1:

SELECT msgAddDate FROM sampletable WHERE reportid = 1 and objectType = 'loan' and msgProcessed = 1 LIMIT 1;

Of course, make sure to define a DESC sorted order on msgAddDate (I think by default it is ascending...)

Hope it helps!