I am working on a single-node Cassandra setup. The machine has a 4-core CPU and 8 GB of RAM. The properties of the column family I am using are:

Keyspace: keyspace1:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
    Options: [datacenter1:1]
  Column Families:
    ColumnFamily: colfamily (Super)
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type/org.apache.cassandra.db.marshal.BytesType
      Row cache size / save period in seconds / keys to save : 100000.0/0/all
      Row Cache Provider: org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider
      Key cache size / save period in seconds: 200000.0/14400
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: true
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy

I tried inserting 1 million rows into the column family. Write throughput is around 2,500 per second and read throughput is around 380 per second.

How can I improve both read and write throughput?

How many threads are you using to run through your example? - zznate
@zznate: There is only one thread running for the example. - sravan_kumar
That is about right for one thread, then. You can use the stress tool in the Apache source distribution for some easy verification of performance: github.com/apache/cassandra/tree/trunk/tools/stress - zznate
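
To illustrate the point about thread count: a single blocking client can never exceed 1 / (per-call latency), so measured throughput often reflects the test harness rather than the server. Below is a minimal sketch of a multi-threaded load loop. `fake_write` is a hypothetical stand-in that just sleeps, not a Cassandra client call, and its 0.4 ms latency is an assumption chosen to match roughly 2,500 calls per second; swap in your own client call to measure a real node.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_write(latency_s=0.0004):
    """Stand-in for one blocking client call (hypothetical, not a Cassandra API)."""
    time.sleep(latency_s)
    return 1


def run_load(op, total_ops, threads):
    """Run `op` total_ops times across `threads` workers.

    Returns (completed_ops, ops_per_second).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        completed = sum(pool.map(lambda _: op(), range(total_ops)))
    elapsed = time.perf_counter() - start
    return completed, completed / elapsed


if __name__ == "__main__":
    # Compare the same workload at different client concurrency levels.
    for n in (1, 4, 16):
        done, rate = run_load(fake_write, 500, n)
        print(f"{n:>2} threads: {done} ops, {rate:,.0f} ops/sec")
```

With a real client in place of `fake_write`, throughput should rise with the thread count until the server or the disk becomes the bottleneck.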

1 Answer

A rate of 380 reads per second suggests that you are reading from the hard drive with a low cache hit rate, or that the OS is swapping. Check the Cassandra statistics to see cache usage:

./nodetool -host <IP> cfstats
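
As a rough way to see why the cache hit rate matters so much, here is a toy model of read throughput for blocking readers. The latencies (0.4 ms for a cached read, 8 ms for a disk seek) are assumptions for illustration, not measurements from this system, and the model ignores the OS page cache and queuing effects.

```python
def reads_per_sec(hit_rate, cache_ms, disk_ms, threads=1):
    """Estimated throughput of `threads` blocking readers.

    hit_rate is the fraction of reads served from cache, in [0, 1].
    """
    avg_ms = hit_rate * cache_ms + (1.0 - hit_rate) * disk_ms
    return threads * 1000.0 / avg_ms


# Assumed latencies (illustrative only): 0.4 ms cached, 8 ms from disk.
for h in (0.0, 0.5, 0.9, 1.0):
    print(f"hit rate {h:.0%}: {reads_per_sec(h, 0.4, 8.0):.0f} reads/sec")
```

Under these assumptions, a single reader that always goes to disk manages only 125 reads per second, so a figure in the low hundreds is consistent with many reads missing the cache.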

You have enabled both the row cache and the key cache. The row cache holds the whole row in RAM, that is, all columns for a given row key, so in this case you can disable the key cache. Make sure, however, that you have enough free RAM to handle row caching.
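
A back-of-the-envelope estimate of the row cache footprint can help here. The sketch below uses the row cache size of 100,000 rows from the column family definition; the average row sizes are assumptions (measure your own), and per-entry overhead is not counted.

```python
def row_cache_bytes(rows_cached, avg_row_bytes):
    """Payload size of a row cache holding `rows_cached` full rows."""
    return rows_cached * avg_row_bytes


ROWS = 100_000  # row cache size from the column family definition

for avg_kib in (1, 10, 50):  # assumed average row sizes, not measured
    gib = row_cache_bytes(ROWS, avg_kib * 1024) / 2**30
    print(f"avg row {avg_kib:>2} KiB -> ~{gib:.2f} GiB of cached row data")
```

At an assumed 50 KiB per row, the cached data alone comes to more than half of the 8 GB available on this machine, which is the kind of situation that leads to swapping.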

If you are running Cassandra with the off-heap cache (the default from 1.x), the row cache may have grown very large and the OS may have started swapping, which degrades performance. Check swap usage.
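
One way to check swap usage on Linux is to read `/proc/meminfo`. The helper below is a minimal sketch that assumes a Linux host; the function name `swap_used_kb` is made up for this example. Tools such as `free` or `vmstat` report the same information.

```python
import os


def swap_used_kb(meminfo_text):
    """Return swap in use (kB) from /proc/meminfo-style text.

    Returns None if the SwapTotal/SwapFree fields are missing or malformed.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and rest.split():
            fields[name.strip()] = rest.split()[0]
    try:
        return int(fields["SwapTotal"]) - int(fields["SwapFree"])
    except (KeyError, ValueError):
        return None


# Only attempt the read where /proc/meminfo exists (Linux).
if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print("swap used (kB):", swap_used_kb(f.read()))
```

A value that keeps growing while the read test runs suggests the machine is short on RAM and the cache sizes should be reduced.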