
The YCSB Endpoint benchmark would have you believe that Cassandra is the golden child of NoSQL databases. However, recreating the results on our own boxes (8 cores with hyperthreading, 60 GB memory, two 500 GB SSDs), we are seeing dismal read throughput for workload B (read-mostly: 95% read, 5% update).

The cassandra.yaml settings are exactly the same as the Endpoint settings, apart from the different IP addresses and our disk configuration (one SSD for data, one for the commit log). While their throughput is ~38,000 operations per second, ours is ~16,000 regardless of the thread count or number of client nodes: one worker node with 256 threads reports ~16,000 ops/sec, while 4 nodes each report ~4,000 ops/sec.

I've set the readahead value to 8 KB on the SSD data drive. The custom workload file is included below.
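
For reference, a minimal sketch of setting an 8 KB readahead (the device name /dev/sdb is only an example; blockdev counts in 512-byte sectors, so 16 sectors = 8 KB):

sudo blockdev --setra 16 /dev/sdb   # 16 x 512 B = 8 KB readahead on the data SSD
sudo blockdev --getra /dev/sdb      # verify; should print 16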

Analyzing disk I/O and CPU usage with iostat, read throughput is consistently ~200,000 KB/s, which suggests the YCSB cluster throughput should be higher (records are only 100 bytes). Roughly 25-30% of CPU time is spent in %iowait, and 10-25% in user time.
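
For anyone reproducing this, an iostat invocation along these lines shows the relevant figures (exact column names vary by sysstat version):

iostat -x 5   # extended per-device stats every 5 s; rkB/s gives read throughput, %iowait is in the avg-cpu line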

top and nload show no obvious bottleneck either (<50% memory usage, and only 10-50 Mbit/s on a 10 Gb/s link).

# The name of the workload class to use
workload=com.yahoo.ycsb.workloads.CoreWorkload

# There is no default setting for recordcount but it is
# required to be set.
# The number of records in the table to be inserted in
# the load phase or the number of records already in the
# table before the run phase.
recordcount=2000000000

# There is no default setting for operationcount but it is
# required to be set.
# The number of operations to use during the run phase.
operationcount=9000000

# The offset of the first insertion
insertstart=0
insertcount=500000000

core_workload_insertion_retry_limit=10
core_workload_insertion_retry_interval=1

# The number of fields in a record
fieldcount=10

# The size of each field (in bytes)
fieldlength=10

# Should read all fields
readallfields=true

# Should write all fields on update
writeallfields=false

fieldlengthdistribution=constant

readproportion=0.95

updateproportion=0.05

insertproportion=0

readmodifywriteproportion=0

scanproportion=0

maxscanlength=1000

scanlengthdistribution=uniform

insertorder=hashed

requestdistribution=zipfian
hotspotdatafraction=0.2

hotspotopnfraction=0.8
table=usertable

measurementtype=histogram

histogram.buckets=1000
timeseries.granularity=1000
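
For completeness, the run phase is invoked with something along these lines (the workload file name, host list, and binding name are examples; use whichever Cassandra binding your YCSB build ships):

bin/ycsb run cassandra-cql -P workloads/workload_custom -p hosts=10.0.0.1,10.0.0.2,10.0.0.3 -threads 256 -s
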
Throughput is not scaling at all with respect to the number of clients. With 1 client, we get a set amount of ops/sec; running 2 YCSB clients in parallel, each gets throughput / 2, and with 4 clients each gets throughput / 4. I have set concurrent reads to 1024, as well as core and max connections in the YCSB client to 1024. nload shows no change in network traffic when adding clients (I'd expect 2x the traffic going from 1 client to 2). Traffic is only ~20 Mbit/s, so the network is definitely not the bottleneck. - Rdesmond
Compacting SSTables had no effect on performance. - Rdesmond
You should be more specific about what you are asking. - cabad

1 Answer


The key was increasing native_transport_max_threads in the cassandra.yaml file.

Along with the increased settings from the comments (more connections in the YCSB client, plus higher concurrent reads/writes in Cassandra), throughput jumped to ~80,000 ops/sec.
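
For anyone hitting the same wall, the relevant cassandra.yaml knobs look like this (the values are illustrative rather than a recommendation; the stock yaml ships native_transport_max_threads commented out):

# cassandra.yaml
native_transport_max_threads: 256    # raising this beyond the stock default was the key change; 256 is an example value
concurrent_reads: 1024               # the "concurrent reads" setting mentioned in the comments above
concurrent_writes: 1024              # illustrative; tune for your own hardware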