I am using Cassandra 2.0.8 and I have a CQL3 table defined like this:
CREATE TABLE search_scf_tdr (
    fieldname text,
    fieldvalue text,
    scalability int,
    timestamptdr bigint,
    tdrkeys set<blob>,
    PRIMARY KEY ((fieldname, fieldvalue, scalability), timestamptdr)
);
I use a replication factor of 2 per DC for this keyspace. I insert into this table by adding items to the tdrkeys collection one at a time, using an update like this (the bind markers stand for the new element and the primary key values):
UPDATE search_scf_tdr SET tdrkeys = tdrkeys + ? WHERE fieldname = ? AND fieldvalue = ? AND scalability = ? AND timestamptdr = ?;
Each element in tdrkeys is 84 bytes (fixed size).
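Roughly, the writes go through the Java driver like this (a simplified sketch, not my exact production code; the class and method names are made up for illustration):

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;
import java.nio.ByteBuffer;
import java.util.Collections;

// Sketch: append one 84-byte element to the tdrkeys set of a single row
// using a prepared statement (one UPDATE per element).
public class TdrKeyWriter {
    private final Session session;
    private final PreparedStatement addTdrKey;

    public TdrKeyWriter(Session session) {
        this.session = session;
        this.addTdrKey = session.prepare(
            "UPDATE search_scf_tdr SET tdrkeys = tdrkeys + ? " +
            "WHERE fieldname = ? AND fieldvalue = ? AND scalability = ? AND timestamptdr = ?");
    }

    public void addElement(ByteBuffer tdrKey, // the 84-byte element
                           String fieldName, String fieldValue,
                           int scalability, long timestampTdr) {
        session.execute(addTdrKey.bind(
            Collections.singleton(tdrKey),    // single-element set to append
            fieldName, fieldValue, scalability, timestampTdr));
    }
}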
When querying this table I retrieve about 160 rows at once (using a range on timestamptdr, an IN list on scalability, and fixed values for fieldname and fieldvalue). Each row contains a few thousand elements in its tdrkeys collection.
I have a cluster of 42 nodes split across two data centers. Separate application servers, using the DataStax Java driver 2.0.9.2, run a total of 24 threads in each data center that call this query with consistency level ONE (doing many other things with the result between queries):
SELECT tdrkeys FROM search_scf_tdr WHERE fieldname='timestamp' and fieldvalue='' and scalability IN (0,1,2,3,4,5,6,7,8,9,10) and timestamptdr >= begin and timestamptdr < end;
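On the client side the read looks roughly like this (again a simplified sketch with made-up names, not the real application; each of the 24 threads per DC runs something equivalent):

import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import java.nio.ByteBuffer;
import java.util.Set;

// Sketch: run the range query at consistency ONE and walk the tdrkeys set
// of each of the ~160 rows that come back.
public class TdrKeyReader {
    private final Session session;
    private final PreparedStatement selectTdrKeys;

    public TdrKeyReader(Session session) {
        this.session = session;
        this.selectTdrKeys = session.prepare(
            "SELECT tdrkeys FROM search_scf_tdr " +
            "WHERE fieldname = 'timestamp' AND fieldvalue = '' " +
            "AND scalability IN (0,1,2,3,4,5,6,7,8,9,10) " +
            "AND timestamptdr >= ? AND timestamptdr < ?");
        this.selectTdrKeys.setConsistencyLevel(ConsistencyLevel.ONE);
    }

    public void process(long begin, long end) {
        ResultSet rs = session.execute(selectTdrKeys.bind(begin, end));
        for (Row row : rs) {
            Set<ByteBuffer> keys = row.getSet("tdrkeys", ByteBuffer.class);
            // ... the real work on each 84-byte element happens here ...
        }
    }
}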
Each Cassandra node has 8 GB of Java heap and 16 GB of physical memory. We have tuned the cassandra.yaml file and the JVM parameters as much as we can, but we are still getting out-of-memory problems.
The heap dumps we get on out-of-memory errors show more than 6 GB of the heap held by threads (between 200 and 300), each holding many instances of org.apache.cassandra.io.sstable.IndexHelper$IndexInfo, each of which contains two HeapByteBuffers with 84 bytes of data.
Cassandra system.log shows errors like this:
ERROR [Thread-388] 2015-05-18 12:11:10,147 CassandraDaemon.java (line 199) Exception in thread Thread[Thread-388,5,main]
java.lang.OutOfMemoryError: Java heap space
ERROR [ReadStage:321] 2015-05-18 12:11:10,147 CassandraDaemon.java (line 199) Exception in thread Thread[ReadStage:321,5,main]
java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
at org.apache.cassandra.io.util.MappedFileDataInput.readBytes(MappedFileDataInput.java:146)
at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:371)
at org.apache.cassandra.io.sstable.IndexHelper$IndexInfo.deserialize(IndexHelper.java:187)
at org.apache.cassandra.db.RowIndexEntry$Serializer.deserialize(RowIndexEntry.java:122)
at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:970)
at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:871)
at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:41)
at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:167)
at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:250)
at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1547)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1376)
at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:327)
at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:60)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)