
I want to benchmark my Cassandra clusters with 1, 2, 3 and 4 nodes, so I ran the cassandra-stress tool on one of the nodes. The benchmark shows strange results, see the graph below: with few threads, the one-node cluster achieves more ops/sec than the 2-, 3- and 4-node clusters.

My results (x-axis = threads, y-axis = ops/sec, one data series per cluster size: 1, 2, 3 and 4 nodes):
[Graph: ops/sec versus thread count for each cluster size]

Compared to the results from this benchmark site, my results do not seem correct.

My question now is: am I using the tool correctly if I run the following command on one machine of the cluster?

cassandra-stress write

I also tried this without any effect:

cassandra-stress write -node ip1,ip2,...

See also my other question here. Thank you!

-- EDIT: Solution by Jim --
Run the cassandra-stress tool from other EC2 instances outside the C* cluster, but in the same LAN (so you can work with the internal IPs, 10.x.x.x). I launched 1-, 2- and 4-node clusters with 4 separate benchmark client nodes. Each client got one of the following commands:

First writing:

cassandra-stress write n=1000000 cl=one -mode native cql3 -schema keyspace="keyspace1" -pop seq=1..1000000 -node ip1,ip2,ip3,ip4
cassandra-stress write n=1000000 cl=one -mode native cql3 -schema keyspace="keyspace1" -pop seq=1000001..2000000 -node ip1,ip2,ip3,ip4
cassandra-stress write n=1000000 cl=one -mode native cql3 -schema keyspace="keyspace1" -pop seq=2000001..3000000 -node ip1,ip2,ip3,ip4
cassandra-stress write n=1000000 cl=one -mode native cql3 -schema keyspace="keyspace1" -pop seq=3000001..4000000 -node ip1,ip2,ip3,ip4

Then reading this data back with the read command:

cassandra-stress read n=1000000 cl=one -mode native cql3 -schema keyspace="keyspace1" -pop seq=1..1000000 -node ip1,ip2,ip3,ip4
cassandra-stress read n=1000000 cl=one -mode native cql3 -schema keyspace="keyspace1" -pop seq=1000001..2000000 -node ip1,ip2,ip3,ip4
cassandra-stress read n=1000000 cl=one -mode native cql3 -schema keyspace="keyspace1" -pop seq=2000001..3000000 -node ip1,ip2,ip3,ip4
cassandra-stress read n=1000000 cl=one -mode native cql3 -schema keyspace="keyspace1" -pop seq=3000001..4000000 -node ip1,ip2,ip3,ip4
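
The commands above all follow one pattern: each client gets its own non-overlapping slice of the key range via -pop seq=. As a rough sketch of that pattern, the helper below only prints the command lines and does not run anything. The function name gen_stress_cmds and its argument order are my own invention for illustration, not part of cassandra-stress; the flags are copied from the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of cassandra-stress): print one command
# per client machine, each with a distinct, non-overlapping key range.
# Usage: gen_stress_cmds OP CLIENTS ROWS_PER_CLIENT NODE_LIST
gen_stress_cmds() {
    op=$1; clients=$2; rows=$3; nodes=$4
    i=0
    while [ "$i" -lt "$clients" ]; do
        start=$(( i * rows + 1 ))        # first key for client i
        end=$(( (i + 1) * rows ))        # last key for client i
        echo "cassandra-stress $op n=$rows cl=one -mode native cql3 -schema keyspace=\"keyspace1\" -pop seq=$start..$end -node $nodes"
        i=$(( i + 1 ))
    done
}

# Reproduces the four write commands listed above.
gen_stress_cmds write 4 1000000 ip1,ip2,ip3,ip4
```

Calling it with read instead of write prints the matching read commands; each printed line is meant to be run on a different client machine.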

Here are the results of the read test:

1 Node cluster: 149,000 ops/sec
2 Node cluster: 348,000 ops/sec
4 Node cluster: 480,000 ops/sec
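
To put these figures in perspective, here is a small sketch that computes the throughput of each cluster relative to the 1-node cluster. It takes the three numbers above at face value; the speedup function is just a hypothetical name for a one-line awk division.

```shell
#!/bin/sh
# Hypothetical helper: ratio of a throughput to a baseline, 2 decimals.
# Usage: speedup BASELINE_OPS OPS
speedup() {
    awk -v base="$1" -v ops="$2" 'BEGIN { printf "%.2f\n", ops / base }'
}

# ops/sec figures copied from the read results above
echo "2 nodes vs 1 node: $(speedup 149000 348000)x"   # prints 2.34x
echo "4 nodes vs 1 node: $(speedup 149000 480000)x"   # prints 3.22x
```

So throughput now rises with cluster size, unlike in my first attempt.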



Thank you, Jim!

Are these final numbers an aggregate you made of the results from all the stress tests? And thanks, I'm dealing with just such an issue now. - Totoro

1 Answer


If you are only running cassandra-stress on one node, then I think this would be the expected result. A single client machine cannot generate enough load to saturate a four-node cluster, so the client itself becomes the bottleneck.

Also, if you are running cassandra-stress on one of the Cassandra nodes, that node is doubly loaded because it runs both Cassandra and the stress client. This puts extra strain on that machine's CPU and network connection.

To get a true picture of your cluster throughput, you should run stress from multiple machines outside the cluster (but on the same LAN).