We are in the process of researching a move to Cassandra (2.0.10) and are testing write and read performance.
On reads we are seeing what looks like low throughput: about 14 MB/s on average.
Our current testing environment is a single node: Xeon E5-1620 @ 3.7 GHz, 32 GB of RAM, Windows 7. The Cassandra heap is set to 8 GB with the default concurrent reads and writes, and the key cache size is set to 400 MB. The data sits on a local RAID 10 array that sustains an average of 300 MB/s of sequential reads at 64 KB and larger block sizes.
We are storing hourly sensor data with the following model:
CREATE TABLE IF NOT EXISTS sensor_data_by_day (
    sensor_id   int,
    date        text,
    event_time  timestamp,
    load        float,
    PRIMARY KEY ((sensor_id, date), event_time)
);
Reads are by sensor, date, and a range of event times.
The current data set is two years' worth of data for 100K sensors, about 30 GB on disk.
Data is inserted by numerous threads (so the inserts are not sorted by event time, if that matters).
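For context, each write is a single-row insert along these lines (the literal values here are made up):

INSERT INTO sensor_data_by_day (sensor_id, date, event_time, load)
VALUES (17, '2014-02-02', '2014-02-02 13:00:00', 0.82);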
Reading back a day's worth of data takes about 2 minutes, at a throughput of 14 MB/s. Reads go through the java-cassandra-connector using a prepared statement:

SELECT event_time, load FROM sensor_data_by_day WHERE sensor_id = ? AND date IN ('2014-02-02') AND event_time >= ? AND event_time < ?
We create one connection and submit tasks (100K queries, one per sensor) to an executor service with a pool of 100 threads. Reading when the data is already in the cache takes about 7 s.
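The read loop looks roughly like this (a sketch, not our exact code; the contact point, keyspace name, and date handling are placeholders):

import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class ReadBenchmark {
    public static void main(String[] args) throws Exception {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        final Session session = cluster.connect("sensors"); // keyspace name is a placeholder

        final PreparedStatement ps = session.prepare(
            "SELECT event_time, load FROM sensor_data_by_day "
            + "WHERE sensor_id = ? AND date IN ('2014-02-02') "
            + "AND event_time >= ? AND event_time < ?");

        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        final Date from = fmt.parse("2014-02-02");
        final Date to = fmt.parse("2014-02-03");

        // 100K queries, one per sensor, on a pool of 100 threads.
        ExecutorService pool = Executors.newFixedThreadPool(100);
        for (int i = 0; i < 100000; i++) {
            final int sensorId = i;
            pool.submit(new Runnable() {
                public void run() {
                    ResultSet rs = session.execute(ps.bind(sensorId, from, to));
                    for (Row row : rs) {
                        // Consume the row; real code hands it off for processing.
                        row.getDate("event_time");
                        row.getFloat("load");
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        cluster.close();
    }
}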
It's probably not a client problem: we reran the test with the data on an SSD and the total time went down from 2 minutes to 10 s (~170 MB/s), which is understandably better given it's an SSD.
The read performance looks like a block-read-size issue, which could explain the low throughput if Cassandra were reading in 4 KB blocks. I read that the default is 256 KB, but I couldn't find the setting anywhere to confirm it. Or is this perhaps a random I/O issue?
Is this the kind of read performance you should expect from Cassandra on mechanical disks, or is it perhaps a modeling problem?
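If the read block size is the culprit, my understanding is that the per-table compression options are the knob to look at (assuming chunk_length_kb is the relevant setting; this is what I plan to try, not something I have verified):

-- In cqlsh: show the current compression options (chunk_length_kb defaults to 64 as far as I know).
DESCRIBE TABLE sensor_data_by_day;

-- Try a larger chunk size; existing SSTables need a rewrite for it to take effect,
-- e.g. via: nodetool upgradesstables -a <keyspace> sensor_data_by_day
ALTER TABLE sensor_data_by_day
  WITH compression = {'sstable_compression': 'LZ4Compressor', 'chunk_length_kb': '256'};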
Output of cfhistograms:
SSTables per Read
1 sstables: 844726
2 sstables: 90
Write Latency (microseconds)
No Data
Read Latency (microseconds)
5 us: 418
6 us: 15252
7 us: 12884
8 us: 15447
10 us: 34211
12 us: 48972
14 us: 48421
17 us: 56641
20 us: 12484
24 us: 8325
29 us: 6602
35 us: 4953
42 us: 5427
50 us: 3610
60 us: 1784
72 us: 2414
86 us: 11208
103 us: 38395
124 us: 82050
149 us: 64840
179 us: 40161
215 us: 30891
258 us: 17691
310 us: 8787
372 us: 4171
446 us: 2305
535 us: 1588
642 us: 1187
770 us: 913
924 us: 811
1109 us: 716
1331 us: 602
1597 us: 513
1916 us: 513
2299 us: 516
2759 us: 595
3311 us: 776
3973 us: 1086
4768 us: 1502
5722 us: 2212
6866 us: 3264
8239 us: 4852
9887 us: 7586
11864 us: 11429
14237 us: 17236
17084 us: 22285
20501 us: 26163
24601 us: 26799
29521 us: 24311
35425 us: 22101
42510 us: 19420
51012 us: 16497
61214 us: 13830
73457 us: 11356
88148 us: 8749
105778 us: 6243
126934 us: 4406
152321 us: 2751
182785 us: 1754
219342 us: 977
263210 us: 497
315852 us: 233
379022 us: 109
454826 us: 60
545791 us: 21
654949 us: 10
785939 us: 2
943127 us: 0
1131752 us: 1
Partition Size (bytes)
179 bytes: 151874
215 bytes: 0
258 bytes: 0
310 bytes: 0
372 bytes: 5071
446 bytes: 0
535 bytes: 4170
642 bytes: 3724
770 bytes: 3454
924 bytes: 3416
1109 bytes: 3489
1331 bytes: 9179
1597 bytes: 11616
1916 bytes: 12435
2299 bytes: 19038
2759 bytes: 20653
3311 bytes: 10245454
3973 bytes: 25121333
Cell Count per Partition
4 cells: 151874
5 cells: 0
6 cells: 0
7 cells: 0
8 cells: 5071
10 cells: 0
12 cells: 4170
14 cells: 0
17 cells: 3724
20 cells: 3454
24 cells: 3416
29 cells: 3489
35 cells: 3870
42 cells: 9982
50 cells: 13521
60 cells: 20108
72 cells: 16678
86 cells: 51646
103 cells: 35323903
Comments:

The IN operator really isn't optimized for performance. You would probably do better with date = instead of date IN. – Aaron

Try TRACING for your query as well (datastax.com/documentation/cql/3.1/cql/cql_reference/…). – Mikhail Stepura
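Following up on the comments, the suggested change would look like this in cqlsh (same predicates, equality instead of IN; sensor_id 17 and the timestamps are made-up example values):

-- Enable tracing first to see where the time goes on a single query.
TRACING ON;

SELECT event_time, load FROM sensor_data_by_day
 WHERE sensor_id = 17 AND date = '2014-02-02'
   AND event_time >= '2014-02-02 00:00:00' AND event_time < '2014-02-03 00:00:00';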