
I'm trying to upgrade our Flink cluster from 1.4.2 to 1.7.2.

When I bring up the cluster, the task managers fail to connect to the job manager with the following error:

2019-03-14 10:34:41,551 WARN  akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@cluster:22671] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@cluster:22671]] Caused by: [cluster: Name or service not known]

The connection works if I add the following line to the /etc/hosts file:

x.x.x.x job-manager-address.com cluster

Why does Flink 1.7.2 use the word cluster as the job manager's host name when connecting? Flink 1.4.2 used the job manager's actual address there.


1 Answer


The jobmanager.sh script was being invoked with a second argument called cluster.

${Flink_HOME}/bin/jobmanager.sh start cluster

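Prior to 1.5, the script expected an execution mode (local or cluster) as its second argument, but this is no longer the case. As far as I can tell, newer versions treat the argument after start as the job manager's host name, which would explain why the task managers tried to reach akka.tcp://flink@cluster. Invoking the script without the second argument solved the issue.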

${Flink_HOME}/bin/jobmanager.sh start
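
If the start command is buried in deployment scripts that still have to work across both versions, a small guard can drop the old mode argument before calling jobmanager.sh. This is only a sketch: strip_legacy_mode is a hypothetical helper name, not part of Flink, and it assumes that no real job manager host is literally named local or cluster.

```shell
# Hypothetical helper (not part of Flink): remove the pre-1.5 execution
# mode ("local" or "cluster") from a jobmanager.sh argument list.
# Assumption: no real host is literally named "local" or "cluster".
strip_legacy_mode() {
  action="$1"
  shift
  # Only the first argument after the action can be the old mode.
  if [ "$#" -gt 0 ]; then
    case "$1" in
      local|cluster) shift ;;
    esac
  fi
  # Print the action followed by whatever arguments remain.
  echo "$action" "$@"
}

# Possible use in a startup script (left unquoted on purpose so the
# printed words are split back into separate arguments):
#   "${Flink_HOME}/bin/jobmanager.sh" $(strip_legacy_mode start cluster)
```

Any arguments after the mode are passed through unchanged, so an old-style call that also supplied a host keeps that host in the new position.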