I'm trying to understand whether the behaviour of the Flink JobManager during a ZooKeeper upgrade is expected.
I'm running Flink 1.11.2 on Kubernetes with ZooKeeper server 3.5.4-beta. While I upgrade ZooKeeper, it is down for 20 seconds. I'd expect either the Flink job to restart or a few warnings in the logs during those 20 seconds. Instead, the whole Flink JVM crashes (and the pod restarts afterwards).
I expected Flink to retry ZooKeeper requests internally, so I'm surprised that it crashes. Is this expected, or is it a bug?
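To make the retry expectation concrete, here is a rough sketch of how long the client might keep retrying before giving up. It is based on assumptions I have not verified against the Flink source: that Flink builds its Curator client with an exponential backoff retry policy, that the sleep before retry n (0-based) is the base wait multiplied by a random factor between 1 and 2^(n+1) - 1, and that the defaults are a 5000 ms base wait (`high-availability.zookeeper.client.retry-wait`) and 3 attempts (`high-availability.zookeeper.client.max-retry-attempts`). The class and method names are made up for illustration.

```java
// Sketch only: models the sleep bounds of an exponential backoff retry policy.
// The formula and the default values used in main() are assumptions, not taken from my cluster.
public class RetryBudget {

    // Shortest possible sleep before any retry: the random multiplier is at least 1.
    static long minSleepMs(long baseSleepMs, int retryCount) {
        return baseSleepMs;
    }

    // Longest possible sleep before retry number retryCount (0-based):
    // the multiplier is at most 2^(retryCount + 1) - 1.
    static long maxSleepMs(long baseSleepMs, int retryCount) {
        return baseSleepMs * ((1L << (retryCount + 1)) - 1);
    }

    // Lower bound on the total time spent sleeping across all retries.
    static long minTotalSleepMs(long baseSleepMs, int maxRetries) {
        long total = 0;
        for (int i = 0; i < maxRetries; i++) {
            total += minSleepMs(baseSleepMs, i);
        }
        return total;
    }

    // Upper bound on the total time spent sleeping across all retries.
    static long maxTotalSleepMs(long baseSleepMs, int maxRetries) {
        long total = 0;
        for (int i = 0; i < maxRetries; i++) {
            total += maxSleepMs(baseSleepMs, i);
        }
        return total;
    }

    public static void main(String[] args) {
        long baseSleepMs = 5000; // assumed default retry wait
        int maxRetries = 3;      // assumed default number of retry attempts
        System.out.println("min=" + minTotalSleepMs(baseSleepMs, maxRetries)
                + " ms, max=" + maxTotalSleepMs(baseSleepMs, maxRetries) + " ms");
        // prints "min=15000 ms, max=55000 ms"
    }
}
```

If these assumptions hold, the retry window could be as short as 15 seconds, which is less than my 20 seconds of ZooKeeper downtime. Raising the retry wait or the number of attempts would presumably widen it, but I'd like to confirm whether that is the intended fix.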
From the logs (the excerpt starts partway through a stack trace):
org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
[09-Feb-2021 11:30:00.197 UTC] INFO org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] - Opening socket connection to server zdzk.servicexxx/192.168.190.92:2181
[09-Feb-2021 11:30:00.197 UTC] INFO org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] - Socket connection established to zdzk.servicexxx/192.168.190.92:2181, initiating session
[09-Feb-2021 11:30:00.198 UTC] WARN org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] - Session 0x3012b0057140004 for server zdzk.servicexxx/192.168.190.92:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_192]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_192]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_192]
at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_192]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:1.8.0_192]
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
[09-Feb-2021 11:30:02.294 UTC] INFO org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] - Opening socket connection to server zdzk.servicexxx/192.168.190.92:2181
[09-Feb-2021 11:30:02.295 UTC] INFO org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] - Socket connection established to zdzk.servicexxx/192.168.190.92:2181, initiating session
[09-Feb-2021 11:30:02.295 UTC] WARN org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] - Session 0x3012b0057140004 for server zdzk.servicexxx/192.168.190.92:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_192]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_192]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_192]
at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_192]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:1.8.0_192]
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
[09-Feb-2021 11:30:03.841 UTC] INFO org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] - Opening socket connection to server zdzk.servicexxx/192.168.190.92:2181
[09-Feb-2021 11:30:03.842 UTC] INFO org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] - Socket connection established to zdzk.servicexxx/192.168.190.92:2181, initiating session
[09-Feb-2021 11:30:03.842 UTC] WARN org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] - Session 0x3012b0057140004 for server zdzk.servicexxx/192.168.190.92:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_192]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_192]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_192]
at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_192]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:1.8.0_192]
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
[09-Feb-2021 11:30:04.175 UTC] ERROR org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl [] - Background operation retry gave up
org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException.create(KeeperException.java:102) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:862) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:990) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_192]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_192]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_192]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_192]
[09-Feb-2021 11:30:04.176 UTC] ERROR org.apache.flink.runtime.webmonitor.retriever.impl.RpcGatewayRetriever [] - Received error from LeaderRetrievalService.
org.apache.flink.util.FlinkException: Unhandled error in ZooKeeperLeaderRetrievalService:Background operation retry gave up
at org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService.unhandledError(ZooKeeperLeaderRetrievalService.java:208) [flink-dist_2.11-1.11.2.jar:1.11.2]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:713) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:709) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.shaded.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:92) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.logError(CuratorFrameworkImpl.java:708) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:874) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:990) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_192]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_192]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_192]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_192]
Caused by: org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException.create(KeeperException.java:102) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:862) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
... 10 more
[09-Feb-2021 11:30:04.178 UTC] ERROR org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl [] - Leader Election Service encountered a fatal error.
org.apache.flink.util.FlinkException: Unhandled error in ZooKeeperLeaderElectionService: Background operation retry gave up
at org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService.unhandledError(ZooKeeperLeaderElectionService.java:430) [flink-dist_2.11-1.11.2.jar:1.11.2]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:713) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:709) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.shaded.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:92) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.logError(CuratorFrameworkImpl.java:708) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:874) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:990) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_192]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_192]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_192]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_192]
Caused by: org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException.create(KeeperException.java:102) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:862) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
... 10 more
[09-Feb-2021 11:30:04.179 UTC] ERROR org.apache.flink.runtime.webmonitor.retriever.impl.RpcGatewayRetriever [] - Received error from LeaderRetrievalService.
org.apache.flink.util.FlinkException: Unhandled error in ZooKeeperLeaderRetrievalService:Background operation retry gave up
at org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService.unhandledError(ZooKeeperLeaderRetrievalService.java:208) [flink-dist_2.11-1.11.2.jar:1.11.2]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:713) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:709) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.shaded.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:92) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.logError(CuratorFrameworkImpl.java:708) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:874) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:990) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_192]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_192]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_192]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_192]
Caused by: org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException.create(KeeperException.java:102) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:862) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
... 10 more
[09-Feb-2021 11:30:04.180 UTC] ERROR org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Fatal error occurred in ResourceManager.
org.apache.flink.runtime.resourcemanager.exceptions.ResourceManagerException: Received an error from the LeaderElectionService.
at org.apache.flink.runtime.resourcemanager.ResourceManager.handleError(ResourceManager.java:1053) [flink-dist_2.11-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService.unhandledError(ZooKeeperLeaderElectionService.java:430) [flink-dist_2.11-1.11.2.jar:1.11.2]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:713) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:709) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.shaded.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:92) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.logError(CuratorFrameworkImpl.java:708) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:874) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:990) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_192]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_192]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_192]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_192]
Caused by: org.apache.flink.util.FlinkException: Unhandled error in ZooKeeperLeaderElectionService: Background operation retry gave up
... 18 more
Caused by: org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException.create(KeeperException.java:102) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:862) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
... 10 more
[09-Feb-2021 11:30:04.181 UTC] ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Fatal error occurred in the cluster entrypoint.
org.apache.flink.runtime.resourcemanager.exceptions.ResourceManagerException: Received an error from the LeaderElectionService.
at org.apache.flink.runtime.resourcemanager.ResourceManager.handleError(ResourceManager.java:1053) [flink-dist_2.11-1.11.2.jar:1.11.2]
at org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService.unhandledError(ZooKeeperLeaderElectionService.java:430) [flink-dist_2.11-1.11.2.jar:1.11.2]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:713) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:709) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.shaded.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:92) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.logError(CuratorFrameworkImpl.java:708) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:874) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:990) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346) [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_192]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_192]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_192]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_192]
Caused by: org.apache.flink.util.FlinkException: Unhandled error in ZooKeeperLeaderElectionService: Background operation retry gave up
... 18 more
Caused by: org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException.create(KeeperException.java:102) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:862) ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]
... 10 more
[09-Feb-2021 11:30:04.196 UTC] INFO org.apache.flink.runtime.blob.BlobServer [] - Stopped BLOB server at 0.0.0.0:6124