After restarting the 3 masters in my DC/OS cluster, the DC/OS dashboard shows 0 connected nodes. However, the DC/OS CLI still lists all 6 of my agent nodes:

$ dcos node
  HOSTNAME        IP                         ID
172.16.1.20  172.16.1.20  a7af5134-baa2-45f3-892e-5e578cc00b4d-S7
172.16.1.21  172.16.1.21  a7af5134-baa2-45f3-892e-5e578cc00b4d-S12
172.16.1.22  172.16.1.22  a7af5134-baa2-45f3-892e-5e578cc00b4d-S8
172.16.1.23  172.16.1.23  a7af5134-baa2-45f3-892e-5e578cc00b4d-S6
172.16.1.24  172.16.1.24  a7af5134-baa2-45f3-892e-5e578cc00b4d-S11
172.16.1.25  172.16.1.25  a7af5134-baa2-45f3-892e-5e578cc00b4d-S10

I am still able to schedule tasks in Marathon, both from the DC/OS CLI and from the Marathon GUI, and they are properly scheduled and executed on the agents. Also, in the Mesos interface on :5050 I can see all of the agents on the Slaves page.

I have restarted both the agent nodes and the master nodes. I have also rerun the DC/OS GUI installer and run the preflight check, which of course fails with an "already installed" error.

Is there a way to re-register a node with the DC/OS GUI short of uninstalling and reinstalling that node?

1 Answer

For anyone running into this: my problem was related to our corporate proxy. To get the Universe working in my cluster, I had to add proxy settings to /opt/mesosphere/environment. I then restarted dcos-cosmos.service and life was good. However, after a server restart, dcos-history-service.service also came up with the new environment and was unable to resolve my local names through our proxy server. To fix it, I added a NO_PROXY entry to /opt/mesosphere/environment, and the DC/OS dashboard is happy again.
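
For illustration, the relevant part of /opt/mesosphere/environment might end up looking roughly like the sketch below. The proxy host and port are placeholders, and the NO_PROXY entries are my assumptions about what counts as "local" in a cluster like this one (loopback, the internal .mesos domain, and the agent IPs from the question); use your own master and agent addresses.

```shell
# /opt/mesosphere/environment (fragment, hypothetical values)

# Proxy settings added so the Universe is reachable through the corporate proxy.
# proxy.example.com:3128 is a placeholder, not a real host.
HTTP_PROXY=http://proxy.example.com:3128
HTTPS_PROXY=http://proxy.example.com:3128

# Hosts that must NOT go through the proxy, so that cluster-internal services
# such as dcos-history-service can reach the local nodes directly.
# This list is an assumption; add your master IPs and any other local names.
NO_PROXY=localhost,127.0.0.1,.mesos,172.16.1.20,172.16.1.21,172.16.1.22,172.16.1.23,172.16.1.24,172.16.1.25
```

The services only pick up this file when they start, so after editing it you would restart the affected unit on each master (for example with systemctl restart dcos-history-service.service) or reboot. Not every client understands CIDR ranges or wildcards in NO_PROXY, which is why the sketch lists hosts one by one.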