2
votes

We are using Apache Cassandra 3.0.7, and of late we see that 90% of memory is in use on almost all nodes, even though disk is hardly used. We have a cluster of 5 nodes, each with 15 GB memory, 4 cores, and a 200 GB SSD.

We have tried all kinds of configuration changes, through both the YAML and table-level properties, but none seems to help. Memory usage constantly increases, almost in direct proportion to the amount of data.

Since our application is write-intensive, we are okay with reduced read performance but would like to use as little memory as possible. To do this, our idea was to disable every cache we can and avoid keeping anything unnecessary in memory. But nothing so far seems to help.

Here's our yaml: http://pastebin.com/yeRGcHRt

and here's our table configuration:

CREATE KEYSPACE if not exists test_ks
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};

CREATE TABLE if not exists test_ks.test_cf (
  id bigint PRIMARY KEY,
  key_str text,
  value1 int,
  value2 int,
  update_ts bigint
) WITH bloom_filter_fp_chance = 1
  AND comment = ''
  AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
  AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
  AND crc_check_chance = 1.0
  AND dclocal_read_repair_chance = 0.1
  AND default_time_to_live = 0
  AND gc_grace_seconds = 864000
  AND max_index_interval = 10240
  AND memtable_flush_period_in_ms = 3600000
  AND min_index_interval = 10240
  AND read_repair_chance = 0.0
  AND speculative_retry = '99PERCENTILE'
  AND caching = {'keys': 'NONE', 'rows_per_partition': 'NONE'};

We have seen that most of the consumption is off-heap; heap memory is capped at 4.5 GB. So out of a total of 14 GB on a node, only 4.5 GB is consumed by the heap.

Has anyone tried such a configuration before? Please let us know whether disabling caches would help in this situation and, if so, how we can disable caching completely. Looking forward to your help.

3 Answers

1
votes

We are experiencing a similar problem. After upgrading from Cassandra 2.x to 3.11.0, Cassandra is using <2GB on-heap and >10GB off-heap, on a use case that didn't have any problems before. This results in the (Windows) machine staying pegged at 99.5% memory usage continually. Heap memory is similarly capped at 2GB.

Most caching values are left to the defaults; in particular the row cache is disabled.

EDIT: I have a better answer. It appears (still testing) that the slowness in our case was because Windows' page file was not disabled. Cassandra recommends disabling the swap file on Linux or the page file on Windows. It also outputs a warning on startup if a swap or page file is detected.

Cassandra's off-heap memory, at least on Windows, is mostly due to memory-mapped IO of files, which is apparently (from reading the Cassandra issue tracker) significantly faster. However, if a swap/page file is enabled, the mmapped files force other data out of physical memory, and swapping that data to disk causes a huge slowdown. Disabling the page file on Windows appears, in our testing, to mitigate this significantly. Cassandra still uses lots of memory for mmapped files, but since no memory is being swapped to disk, some combination of Cassandra and the OS properly frees up the mmapped files so that other processes can run smoothly. I used this tool to confirm the presence of mmapped files on Windows.
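For readers on Linux, a rough equivalent of the two checks above (is swap configured, and is the Cassandra process holding memory-mapped SSTable files) might look like the sketch below. It assumes the standard `/proc/meminfo` and `/proc/<pid>/maps` formats; the function names `swap_total_kb` and `mapped_sstable_segments` are made up for illustration, and matching on a `.db` suffix is only an approximation of "SSTable component file".

```shell
#!/bin/sh
# Hypothetical helpers; not part of Cassandra or any standard tool.

# Print the configured swap size in kB from a meminfo-format file
# (defaults to /proc/meminfo). A value of 0 means no swap is configured.
swap_total_kb() {
    awk '/^SwapTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Count mapped regions whose backing file ends in ".db" in a maps-format
# file. Pass /proc/<pid>/maps of the Cassandra process. One file can be
# mapped in several segments, so this counts mappings, not distinct files.
mapped_sstable_segments() {
    grep -c -E '\.db$' "$1"
}

# Example usage on a live node (replace <pid> with the Cassandra process id):
#   swap_total_kb
#   mapped_sstable_segments /proc/<pid>/maps
```

A non-zero mapping count together with a non-zero swap size would be consistent with the situation described in this answer, although it does not prove that swapping is the cause of a slowdown.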

0
votes

To reduce memory usage, try setting the following parameters in cassandra-env.sh to the values you want: MAX_HEAP_SIZE and HEAP_NEWSIZE.
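As a minimal sketch, the fragment below shows what those two lines might look like in cassandra-env.sh. The values are illustrative assumptions, not recommendations: 4G is simply close to the heap cap mentioned in the question, and 400M follows the rule of thumb of roughly 100 MB of new generation per CPU core for a 4-core node.

```shell
# Fragment for conf/cassandra-env.sh (values are assumptions for illustration).
# With the default CMS collector the script expects both variables to be
# set together or left unset together.
MAX_HEAP_SIZE="4G"     # upper bound for the JVM heap
HEAP_NEWSIZE="400M"    # young generation size, ~100 MB per core on 4 cores
```

Note that these settings bound only the JVM heap. In the question the heap is already capped and most of the memory is reported as off-heap, so this alone may not change the overall picture.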

0
votes

Try setting -XX:MaxDirectMemorySize. It limits the use of off-heap memory.
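One way to pass the flag, sketched here under the assumption that JVM options are appended through `JVM_OPTS` in cassandra-env.sh, is shown below. The 2G value is an arbitrary placeholder.

```shell
# Fragment for conf/cassandra-env.sh (the 2G limit is an assumed example value).
# ${JVM_OPTS:-} keeps the line safe even if JVM_OPTS has not been set yet.
JVM_OPTS="${JVM_OPTS:-} -XX:MaxDirectMemorySize=2G"
```

As far as I know, this flag caps memory allocated through NIO direct buffers only. Memory-mapped files do not count against it, so if the off-heap usage comes mostly from mmapped SSTables, the effect may be smaller than expected.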