I am using Cassandra 2.0.8 and I have a CQL3 table defined like this:

CREATE TABLE search_scf_tdr (
  fieldname text,
  fieldvalue text,
  scalability int,
  timestamptdr bigint,
  tdrkeys set<blob>,
  PRIMARY KEY ((fieldname, fieldvalue, scalability), timestamptdr)
)

I use a replication factor of 2 per DC for this keyspace. I insert into this table by adding items to the tdrkeys collection one at a time, using an update like this:

UPDATE search_scf_tdr SET tdrkeys = tdrkeys + {<new value>} WHERE <all primary key fields>;

Each element in tdrkeys is 84 bytes (fixed size).

When querying this table I retrieve about 160 rows at once (using ranges on timestamptdr and scalability, and fixed values for fieldname and fieldvalue). Each row contains a few thousand elements in its tdrkeys collection.

I have a cluster of 42 nodes split across two data centers. Separate servers using the DataStax Java driver 2.0.9.2 run a total of 24 threads in each data center. Each thread calls this query with consistency level ONE (and does many other things with the result between queries):

SELECT tdrkeys FROM search_scf_tdr WHERE fieldname='timestamp' and fieldvalue='' and scalability IN (0,1,2,3,4,5,6,7,8,9,10) and timestamptdr >= begin and timestamptdr < end;

Each Cassandra node has 8 GB of Java heap and 16 GB of physical memory. We have tuned the cassandra.yaml file and the JVM parameters as much as we can, but we are still getting out-of-memory errors.

The heap dumps taken on these errors show more than 6 GB of the heap held by threads (between 200 and 300 of them). These threads hold many instances of org.apache.cassandra.io.sstable.IndexHelper$IndexInfo, each containing 2 HeapByteBuffer objects with 84 bytes of data.

Cassandra system.log shows errors like this:

ERROR [Thread-388] 2015-05-18 12:11:10,147 CassandraDaemon.java (line 199) Exception in thread Thread[Thread-388,5,main]
java.lang.OutOfMemoryError: Java heap space
ERROR [ReadStage:321] 2015-05-18 12:11:10,147 CassandraDaemon.java (line 199) Exception in thread Thread[ReadStage:321,5,main]
java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
    at org.apache.cassandra.io.util.MappedFileDataInput.readBytes(MappedFileDataInput.java:146)
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:371)
    at org.apache.cassandra.io.sstable.IndexHelper$IndexInfo.deserialize(IndexHelper.java:187)
    at org.apache.cassandra.db.RowIndexEntry$Serializer.deserialize(RowIndexEntry.java:122)
    at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:970)
    at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:871)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:41)
    at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:167)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:250)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1547)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1376)
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:327)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
    at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:60)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

1 Answer

You are using an IN clause that spans multiple partitions, since scalability is part of the partition key. This causes Cassandra to coordinate the query across multiple nodes. For more details, see, for example, this.

The solution would be to run a separate query for each value of scalability and merge the results on the client, or, if possible, to take scalability out of the partition key, i.e. PRIMARY KEY ((fieldname, fieldvalue), scalability, timestamptdr).
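
As a rough illustration of the first option, here is a minimal Java sketch of the client-side fan-out and merge. It is not driver code: PerPartitionReader, readAll and the queryOnePartition callback are hypothetical names introduced for this example. In a real client the callback would presumably bind one scalability value into a prepared single-partition SELECT (scalability = ? instead of scalability IN (...)), execute it through the driver's Session, and return the tdrkeys column of each row. The sketch uses Java 8 syntax for brevity.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.IntFunction;

final class PerPartitionReader {

    /**
     * Runs one single-partition query per scalability value and concatenates
     * the returned rows, so the caller gets the same rows the IN query would
     * have produced (row order across partitions is not guaranteed to match).
     *
     * queryOnePartition is a hypothetical callback (an assumption of this
     * sketch): given a scalability value, it returns the tdrkeys set of every
     * row in that partition whose timestamptdr falls in the wanted range.
     */
    static List<Set<ByteBuffer>> readAll(
            int[] scalabilities,
            IntFunction<List<Set<ByteBuffer>>> queryOnePartition) {
        List<Set<ByteBuffer>> rows = new ArrayList<>();
        for (int scalability : scalabilities) {
            // Each call touches exactly one partition.
            rows.addAll(queryOnePartition.apply(scalability));
        }
        return rows;
    }
}
```

Calling readAll(new int[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, callback) would then replace the single IN query. The loop is sequential to keep the sketch short; issuing the per-partition queries asynchronously is a possible refinement.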