I'm seeing very strange behavior when trying to set up a new Kubernetes cluster in AWS.
Whenever I run kube-up.sh with its default config, it works perfectly: the cluster and all its related components are set up in less than 10 minutes.
The problem occurs when I set "kube-aws-zone" to us-east-1e (the same zone as my current VPC) instead of us-west-2a (the default). The installation process gets stuck in a loop with the following message:
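A sketch of the override described above, assuming the setting is passed through the KUBE_AWS_ZONE environment variable and that KUBERNETES_PROVIDER=aws selects the AWS scripts (both names are assumptions, not taken from the post). The kube-up.sh call is left commented out so the snippet only shows the configuration:

```shell
# Assumed variable names: kube-up.sh is expected to read these from the environment.
export KUBERNETES_PROVIDER=aws
export KUBE_AWS_ZONE=us-east-1e   # overrides the default zone, us-west-2a

# Then, from the Kubernetes release directory (path is an assumption):
# cluster/kube-up.sh
```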
Waiting 3 minutes for cluster to settle
..................Re-running salt highstate
sudo: unable to resolve host ip-172-20-0-9
Waiting for cluster initialization.
This will continually check to see if the API for kubernetes is reachable.
This might loop forever if there was some uncaught error during start up.
I dug into the minions a bit and found these errors in /var/log/salt/minion:
2015-10-01 14:52:54,912 [salt.loaded.int.module.cmdmod][ERROR ] Command 'runlevel /run/utmp' failed with return code: 1
2015-10-01 14:52:54,913 [salt.loaded.int.module.cmdmod][ERROR ] output: Too many arguments.
2015-10-01 14:53:00,902 [salt.state ][ERROR ] The named service kubelet is not available
2015-10-01 14:53:03,078 [salt.state ][ERROR ] The named service kube-proxy is not available
2015-10-01 14:53:16,677 [salt.state ][ERROR ] An exception occurred in this state: Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/state.py", line 1533, in call
    **cdata['kwargs'])
  File "/usr/lib/python2.7/dist-packages/salt/states/sysctl.py", line 56, in present
    configured = salt'sysctl.show'
  File "/usr/lib/python2.7/dist-packages/salt/modules/linux_sysctl.py", line 86, in show
    for line in salt.utils.fopen(config_file_path):
  File "/usr/lib/python2.7/dist-packages/salt/utils/__init__.py", line 1065, in fopen
    fhandle = open(*args, **kwargs)
IOError: [Errno 2] No such file or directory: '/etc/sysctl.d/99-salt.conf'
2015-10-01 14:53:16,707 [salt.loaded.int.module.cmdmod][ERROR ] Command 'runlevel /run/utmp' failed with return code: 1
2015-10-01 14:53:16,708 [salt.loaded.int.module.cmdmod][ERROR ] output: Too many arguments.
2015-10-01 14:53:16,719 [salt.loaded.int.module.cmdmod][ERROR ] Command 'service docker status' failed with return code: 3
2015-10-01 14:53:16,719 [salt.loaded.int.module.cmdmod][ERROR ] output: * docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: exit-code) since Thu 2015-10-01 14:53:16 UTC; 262ms ago
     Docs: http://docs.docker.com
  Process: 15285 ExecStart=/usr/bin/docker -d -H fd:// $DOCKER_OPTS (code=exited, status=1/FAILURE)
 Main PID: 15285 (code=exited, status=1/FAILURE)
Oct 01 14:53:16 ip-172-20-0-90 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Oct 01 14:53:16 ip-172-20-0-90 systemd[1]: Unit docker.service entered failed state.
Oct 01 14:53:16 ip-172-20-0-90 systemd[1]: docker.service failed.
2015-10-01 14:53:20,259 [salt.state ][ERROR ] The named service kubelet is not available
2015-10-01 14:53:20,687 [salt.state ][ERROR ] The named service kube-proxy is not available
I've tried removing and re-creating the IAM roles, as suggested for a similar issue, but had no luck.
I'd appreciate any assistance. Thanks.