2
votes

I have to manage a cluster of ~600 Ubuntu (16.04-20.04) servers using SaltStack 3002.

I decided on a multi-master setup for load distribution and fault tolerance. salt-syndic did not seem like the right choice for me. Instead, I wanted each salt-minion to pick a master at random from a list at minion start. So my config looks as follows (excerpts):

master:

auto_accept: True
master_sign_pubkey: True
master_use_pubkey_signature: True

minion:

master:
  - saltmaster001
  - saltmaster002
  - saltmaster003

verify_master_pubkey_sign: True
retry_dns: 0
master_type: failover
random_master: True

(three salt masters as you can see). I basically followed this tutorial: https://docs.saltstack.com/en/latest/topics/tutorials/multimaster_pki.html

Now, it doesn't work well, for various reasons:

salt 'tnscass*' test.ping
tnscass011.mo-mobile-prod.ams2.cloud:
    True
tnscass010.mo-mobile-prod.ams2.cloud:
    True
tnscass004.mo-mobile-prod.ams2.cloud:
    True
tnscass005.mo-mobile-prod.ams2.cloud:
    Minion did not return. [Not connected]
tnscass003.mo-mobile-prod.ams2.cloud:
    Minion did not return. [Not connected]
tnscass007.mo-mobile-prod.ams2.cloud:
    Minion did not return. [Not connected]

Salt commands issued on a master only reach the minions that happen to be connected to that master, not minions connected to any other master. In the example above, a different set of minions would return True if you ran the same command on a different master.

So the only option left is to use salt-call on a particular minion, which is not very useful. And even that does not work well, e.g.:

root@minion:~# salt-call state.apply
[WARNING ] Master ip address changed from 10.48.40.93 to 10.48.42.32
[WARNING ] Master ip address changed from 10.48.42.32 to 10.48.42.35

So the minion decides to switch to another master, and the salt-call takes ages. The rules that determine under which conditions a minion decides to switch are not explained (at least I couldn't find anything). Is it the load on the master? The number of connected minions?
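For context, this is the kind of minion-side failover tuning I have been experimenting with; the values below are illustrative guesses, not my production config:

```yaml
# /etc/salt/minion (excerpt) - illustrative values
master_type: failover
random_master: True
# Seconds between checks that the current master is still alive;
# if a check fails, the minion fails over to the next master.
master_alive_interval: 30
# Attempts to connect to a master before moving on to the next one
# in the list (-1 = retry forever).
master_tries: 2
```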

Another problem is the salt mines. I'm using code as follows:

salt.saltutil.runner('mine.get', tgt='role:mopsbrokeraggr', fun='network.get_hostname', tgt_type='grain')

Unfortunately, the mine values differ badly from minion to minion, so the mines are unusable as well.

I should mention that my masters are big machines with 16 cores and 128GB RAM, so this is not a matter of resource shortage.

To me, the scenario described in https://docs.saltstack.com/en/latest/topics/tutorials/multimaster_pki.html simply does not work.

  • Can anybody tell me how to create a proper setup with three salt masters for load distribution?
  • Is salt-syndic actually the better approach?
  • Can salt-syndic assign minions to masters randomly, based on load, or by some other criterion?
  • What is the purpose of the mentioned tutorial? Or have I just overlooked something?

1 Answer

0
votes

There are a couple of statements worth noticing in the documentation about this method. Quoting from the link in the question:

The first master that accepts the minion, is used by the minion. If the master does not yet know the minion, that counts as accepted and the minion stays on that master.

Then

A test.version on the master the minion is currently connected to should be run to test connectivity.

So this indicates that the minion is connected to one master at a time, which means only that master can run test.version on that minion (and no other master can).

One of the primary objectives of your question can be met with a different method of multi-master setup: https://docs.saltproject.io/en/latest/topics/tutorials/multimaster.html

In a nutshell, you configure more than one master with the same PKI keypair. In the explanation below I have a multi-master setup with two servers, and I copy the following files from my first/primary master to the second one:

/etc/salt/pki/master/master.pub
/etc/salt/pki/master/master.pem

Then configure salt-minion for multiple masters in /etc/salt/minion:

master:
  - master1
  - master2

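A restart is needed on each minion so the new master list takes effect, after which pending keys can be accepted on each master (skip salt-key -A if auto_accept is enabled). A minimal sketch:

```shell
# On every minion: reconnect using the updated /etc/salt/minion
systemctl restart salt-minion

# On each master: accept all pending minion keys non-interactively
salt-key -A -y
```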
Once the respective services have been restarted, you can check that all minions are available on both masters with salt-key -L:

# salt-key -L
Accepted Keys:
Denied Keys:
Unaccepted Keys:
minion1
minion2
minion3
...
Rejected Keys:

Once all the minions' keys are accepted on both masters, we can run salt '*' test.version from either master and reach all minions.

The link referenced above also covers how to keep the file_roots, pillar_roots, minion keys, and configuration consistent between the masters.
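One common approach to keeping the state tree consistent (an assumption on my part, not something from the linked tutorial) is to serve it from the same git repository on both masters via gitfs:

```yaml
# /etc/salt/master (excerpt) - illustrative; the repo URL is a placeholder
fileserver_backend:
  - gitfs
  - roots        # keep the local file_roots as a fallback
gitfs_remotes:
  - https://git.example.com/salt/states.git
```

With identical master configs and the same remote, both masters serve the same states without manual syncing.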