I'm trying to insert records using Hector and from time to time I get this error:
me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be enough replicas present to handle consistency level.
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:59)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:264)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeBatch(AbstractColumnFamilyTemplate.java:115)
at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeIfNotBatched(AbstractColumnFamilyTemplate.java:163)
at me.prettyprint.cassandra.service.template.ColumnFamilyTemplate.update(ColumnFamilyTemplate.java:69)
at ustocassandra.USToCassandraHector.consumer(USToCassandraHector.java:271)
at ustocassandra.USToCassandraHector.access$100(USToCassandraHector.java:41)
at ustocassandra.USToCassandraHector$2.run(USToCassandraHector.java:71)
at java.lang.Thread.run(Thread.java:724)
Caused by: UnavailableException()
at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:104)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258)
... 9 more
I know the usual explanation is that not enough nodes are up, but that's not the case here. All my nodes are up:
./nodetool ring
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: DC1
==========
Address         Rack  Status  State   Load       Owns    Token
                                                         4611686018427388000
172.16.217.222  RAC1  Up      Normal  353.36 MB  25.00%  -9223372036854775808
172.16.217.223  RAC2  Up      Normal  180.84 MB  25.00%  -4611686018427388000
172.16.217.224  RAC3  Up      Normal  260.34 MB  25.00%  -2
172.16.217.225  RAC4  Up      Normal  222.71 MB  25.00%  4611686018427388000
I'm inserting records with 20 threads (maybe I should use fewer? From what I know, that would produce an Overloaded error, not Unavailable). I'm using a write consistency level of ONE, with AutoDiscoveryAtStartup and LeastActiveBalancingPolicy. The replication factor is 2. My client setup is sketched below.
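For reference, the setup looks roughly like this (a simplified sketch: the seed address comes from the ring above, "us" is the keyspace from my logs, but the cluster name and the exact calls are illustrative, not my verbatim production code):

import me.prettyprint.cassandra.connection.LeastActiveBalancingPolicy;
import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.HConsistencyLevel;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;

public class HectorSetup {
    static Keyspace connect() {
        // One seed node; the remaining nodes are auto-discovered.
        CassandraHostConfigurator conf =
                new CassandraHostConfigurator("172.16.217.222:9160");
        conf.setAutoDiscoverHosts(true);          // find the other nodes
        conf.setRunAutoDiscoveryAtStartup(true);  // ...when the client starts
        conf.setLoadBalancingPolicy(new LeastActiveBalancingPolicy());

        Cluster cluster = HFactory.getOrCreateCluster("my-cluster", conf);

        // All writes go out at consistency level ONE.
        ConfigurableConsistencyLevel ccl = new ConfigurableConsistencyLevel();
        ccl.setDefaultWriteConsistencyLevel(HConsistencyLevel.ONE);

        return HFactory.createKeyspace("us", cluster, ccl);
    }
}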
I'm using Cassandra 1.2.8 (I tried with 2.0 and it's the same).
The error doesn't occur from the beginning. I usually manage to insert about 2 million records before getting it. My code retries when an error occurs (roughly as sketched below). After a few dozen retries, the insert usually succeeds, and then everything works fine for a few million more inserts before the error appears again and the cycle repeats.
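The retry logic is essentially the following (a minimal sketch; MAX_RETRIES, the sleep interval, and the key/column types are illustrative, not my exact values):

import me.prettyprint.cassandra.service.template.ColumnFamilyTemplate;
import me.prettyprint.cassandra.service.template.ColumnFamilyUpdater;
import me.prettyprint.hector.api.exceptions.HectorException;

public class RetryingWriter {
    private static final int MAX_RETRIES = 100;  // illustrative budget

    // Retry the update until it succeeds or the retry budget runs out.
    static void updateWithRetry(ColumnFamilyTemplate<String, String> template,
                                ColumnFamilyUpdater<String, String> updater)
            throws InterruptedException {
        for (int attempt = 1; ; attempt++) {
            try {
                template.update(updater);
                return;  // success
            } catch (HectorException e) {
                if (attempt >= MAX_RETRIES) {
                    throw e;  // give up; surface the failure to the caller
                }
                Thread.sleep(100);  // brief pause before the next attempt
            }
        }
    }
}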
Could it be because I set gc_grace = 60? Then again, I don't get the error every 60 seconds, so I don't think that's the reason.
Could you give me some suggestions about what might be causing this error and what I should do?
EDIT:
'nodetool tpstats' shows that some messages have been dropped:
Message type        Dropped
RANGE_SLICE               0
READ_REPAIR               0
BINARY                    0
READ                      0
MUTATION                 11
_TRACE                    0
And I see the following warnings in the log file:
WARN [ScheduledTasks:1] 2013-09-30 09:20:16,633 GCInspector.java (line 136) Heap is 0.853986836999536 full. You may need to reduce memtable and/or cache sizes. Cassandra is now reducing cache sizes to free up memory. Adjust reduce_cache_sizes_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
WARN [ScheduledTasks:1] 2013-09-30 09:20:16,634 AutoSavingCache.java (line 185) Reducing KeyCache capacity from 1073741824 to 724 to reduce memory pressure
WARN [ScheduledTasks:1] 2013-09-30 09:20:16,634 GCInspector.java (line 142) Heap is 0.853986836999536 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
WARN [ScheduledTasks:1] 2013-09-30 09:20:16,634 StorageService.java (line 3618) Flushing CFS(Keyspace='us', ColumnFamily='my_cf') to relieve memory pressure
These warnings appear at the exact time Hector throws the Unavailable exception, so it's probably a memory-related problem. I guess I'll try what the warnings suggest: reducing the memtable size. The relevant cassandra.yaml settings are shown below.
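For reference, these are the cassandra.yaml knobs the warnings point at (Cassandra 1.2.x; the threshold values below are the shipped defaults, while the memtable figure is just an example since the setting is commented out by default and falls back to one third of the heap):

# Ceiling on memory used by memtables before flushing is forced.
# Commented out by default, meaning one third of the heap.
memtable_total_space_in_mb: 1024   # example value, not a recommendation

# Fraction of the heap at which the largest memtables get flushed.
flush_largest_memtables_at: 0.75

# Fraction of the heap at which cache capacities are reduced,
# and the fraction of their current size they are reduced to.
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6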