9 votes

I have 6 nodes (1 Solr node and 5 Spark nodes) using DataStax Enterprise. My cluster runs on servers similar to Amazon EC2, with EBS storage. Each node has 3 EBS volumes, which are combined into one logical data disk using LVM. In OpsCenter the same node frequently becomes unresponsive, which leads to connect timeouts in my data system. My data size is around 400 GB with 3 replicas. I have 20 streaming jobs with a batch interval of one minute. Here is my error message:

/var/log/cassandra/output.log:WARN 13:44:31,868 Not marking nodes down due to local pause of 53690474502 > 5000000000
/var/log/cassandra/system.log:WARN [GossipTasks:1] 2016-09-25 16:40:34,944 FailureDetector.java:258 - Not marking nodes down due to local pause of 64532052919 > 5000000000 
/var/log/cassandra/system.log:WARN [GossipTasks:1] 2016-09-25 16:59:12,023 FailureDetector.java:258 - Not marking nodes down due to local pause of 66027485893 > 5000000000 
/var/log/cassandra/system.log:WARN [GossipTasks:1] 2016-09-26 13:44:31,868 FailureDetector.java:258 - Not marking nodes down due to local pause of 53690474502 > 5000000000
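
If I read the units right (the pause values appear to be nanoseconds), that means local pauses of roughly 53-66 seconds against a 5 second threshold, e.g. 53690474502 ns ≈ 53.7 s versus 5000000000 ns = 5 s.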

EDIT:

Here are my more specific configurations. I would like to know whether I am doing something wrong, and if so, how I can find out in detail what it is and how to fix it.

Our heap is set to:

MAX_HEAP_SIZE="16G"
HEAP_NEWSIZE="4G"

current heap:

[root@iZ11xsiompxZ ~]# jstat -gc 11399
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT
 0.0   196608.0  0.0   196608.0 6717440.0 2015232.0 43417600.0 23029174.0 69604.0 68678.2  0.0    0.0     1041  131.437   0      0.000  131.437
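
If I read the KB-based jstat columns correctly, that is an old generation of roughly 41 GB with about 22 GB used, 1041 young collections totalling about 131 s (roughly 126 ms each on average), and no full GCs so far.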
[root@iZ11xsiompxZ ~]# jmap -heap 11399
Attaching to process ID 11399, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.102-b14

using thread-local object allocation.
Garbage-First (G1) GC with 23 thread(s)

Heap Configuration:

   MinHeapFreeRatio         = 40
   MaxHeapFreeRatio         = 70
   MaxHeapSize              = 51539607552 (49152.0MB)
   NewSize                  = 1363144 (1.2999954223632812MB)
   MaxNewSize               = 30920409088 (29488.0MB)
   OldSize                  = 5452592 (5.1999969482421875MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 16777216 (16.0MB)

Heap Usage:

G1 Heap:
   regions  = 3072
   capacity = 51539607552 (49152.0MB)
   used     = 29923661848 (28537.427757263184MB)
   free     = 21615945704 (20614.572242736816MB)
   58.059545404588185% used
G1 Young Generation:
Eden Space:
   regions  = 366
   capacity = 6878658560 (6560.0MB)
   used     = 6140461056 (5856.0MB)
   free     = 738197504 (704.0MB)
   89.26829268292683% used
Survivor Space:
   regions  = 12
   capacity = 201326592 (192.0MB)
   used     = 201326592 (192.0MB)
   free     = 0 (0.0MB)
   100.0% used
G1 Old Generation:
   regions  = 1443
   capacity = 44459622400 (42400.0MB)
   used     = 23581874200 (22489.427757263184MB)
   free     = 20877748200 (19910.572242736816MB)
   53.04110320109241% used

40076 interned Strings occupying 7467880 bytes.

I don't know why this happens. Thanks a lot.

(This is mostly based on some fuzzy memory, but) I think that node has GC problems. AFAIK the logic surrounding the Not marking nodes down due to local pause message is an attempt to distinguish between not getting updates from the other nodes because they are really down/unresponsive, and the node itself being subject to heavy GC or otherwise unresponsive. – Eugen Constantin Dinca
@EugenConstantinDinca Thanks for the reply, I agree with your reasoning; I just don't know what I am doing wrong and how I can track down the problem/solution in more detail. I added some more details to the question; it would be great if you could take a look. – peter

1 Answer

4 votes

The message you see, Not marking nodes down due to local pause, is due to the JVM pausing. Although you're doing some good things here by posting JVM information, a good place to start is often just looking at /var/log/cassandra/system.log, for example checking for things such as ERROR and WARN. Also check the length and frequency of GC events by grepping for GCInspector.
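
For example (a minimal sketch; the log path is assumed to be the package default you used above):

grep -c 'ERROR' /var/log/cassandra/system.log
grep 'WARN' /var/log/cassandra/system.log | tail -n 50
# length and frequency of GC pauses reported by Cassandra itself
grep 'GCInspector' /var/log/cassandra/system.log | tail -n 50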

Tools such as nodetool tpstats are your friend here, for seeing whether you have backed-up or dropped mutations, blocked flush writers, and so on.
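
Something along these lines (just a sketch of what to look at, not an exhaustive check):

nodetool tpstats
# look at the Pending / Blocked / "All time blocked" columns for pools such as
# MutationStage and the memtable flush writers, and at the dropped message
# counts (e.g. MUTATION) at the bottom of the output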

Docs here have some good things to check for: https://docs.datastax.com/en/landing_page/doc/landing_page/troubleshooting/cassandra/cassandraTrblTOC.html

Also check that your nodes have the recommended production settings; this is something that is often overlooked:

http://docs.datastax.com/en/landing_page/doc/landing_page/recommendedSettingsLinux.html
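
A few of the things from that page can be checked quickly from a shell (a rough sketch; the exact recommendations depend on your OS and DSE version):

swapon --summary                                 # swap should generally be off
cat /sys/kernel/mm/transparent_hugepage/defrag   # THP defrag is usually disabled
ulimit -l -n -u                                  # memlock / open files / nproc limits
blockdev --report                                # read-ahead on the data devices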

Also, one thing to note is that Cassandra is rather I/O-sensitive, and "normal" EBS might not be fast enough for what you need here. Throw Solr into the mix and you can see a lot of I/O contention when a Cassandra compaction and a Lucene merge go for the disk at the same time.
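
A couple of quick ways to see whether the disks are the bottleneck (again only a sketch; iostat comes from the sysstat package):

iostat -x 5                  # high %util and await on the EBS/LVM devices point to i/o contention
nodetool compactionstats     # a growing backlog of pending compactions is a classic symptom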