Regarding Cassandra Read Performance

Question

I am working on a sensor data(timeseries). Number of columns in a table is 3000.
for eg: nodeid,timestamp,sen1,sen2,.....sen-n. nodeid and timestamp are primary key with clustering order by timestamp.
Number of records are 10000.

When a SELECT query for single column(SELECT timestamp,sen1 FROM <table>) requ is requsted through cassandra datastax java driver 3.0 it is replies in 15 sec; i.e If I want read all the tags, one a tag at a time for all 3000 tags requires 3000*15 sec = 12 to 13 hours aproximately. It is on single node cluster with 16GB RAM.
I allocated 10GB for JVM. Still response time is not changed. I used LevelCompactionStragy at the time of table creation.

Hardware: Intel Core i7 and Normal Hard disk not SSD,8GB RAM How to reduce that read or query time on the single node cluster?

Can you provide hardware specs ? Nb of CPU cores ? Spinning hard disk or SSD ? — doanduyhai
one node cluster is a bad idea. It is better to have a SQL solution. If you want to really test Cassandra, try to add more nodes to the cluster and think a bit more about the architecture. — Whitefret

jasonstack jasonstack · Accepted Answer · 2016-04-23T12:53:09

Obviously, there is problem with data modelling. IMO, a table with 3000 columns is bad. if your use case is like "SELECT timestamp,sen1 FROM ", then you should model it as " Primary Key(Timestamp, SensorId) ".

"SELECT timestamp,sen1" in your model, cassandra will still read all other column values from disk into memory.

I am not sure what is 'nodeId' in your case.. I hope it's not cassandra node id..

Regarding Cassandra Read Performance

2 Answers