I have a table like:
CREATE TABLE mydata (
my_key TEXT,
log_time TIMESTAMP,
value1 DOUBLE,
PRIMARY KEY (my_key, log_time)
);
We get timeseries data where we insert 100K values for 100K keys (1 value for each key each minute) so new values are added every minute.
I would like to run the following range query:
Select * from mydata where my_key='1' and (log_time >='starttime' and log_time <='endtime') and (value1 > 100 and value1 < 200)
With current table design, it is possible as value1 is not part of the partition/clustering key. However we can add indexing on value1 to make it work or make it part of key.
My questions are:
Is it a good idea to add value1 as part of key?
Is it a good idea to add indexing on value1 as the insertions are too frequent (1 minute)?
If both are not good options, how can we model this requirement for range queries?