0 votes

How do I handle the removal of hosts from an Ansible Inventory Group, either because the host is completely unavailable or because it is repurposed?

Let me give a brief example with three Ansible roles. They don't represent what is actually done, but they help to explain my question:

  • A webserver role installs a certain webserver application, copies the HTML content to the target hosts and opens ports 80 and/or 443 in the firewall (e.g. using the ufw module).
  • A kubernetes role joins a host to an existing Kubernetes cluster. In order for the networking to work, it opens firewall ports on both the affected host and all other cluster members.
  • A nodeexporter role installs the Prometheus Node Exporter and opens its port in the firewall to allow incoming connections from a certain Prometheus server.

The main playbook applies the roles to hosts in corresponding Inventory Groups.
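
For illustration, such a playbook might look roughly like this (the role and group names are taken from the example above; everything else is a simplification):

```yaml
# site.yml - minimal sketch of the main playbook described above
- hosts: webserver
  roles:
    - webserver

- hosts: kubernetes
  roles:
    - kubernetes

- hosts: nodeexporter
  roles:
    - nodeexporter
```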

Now consider 2 scenarios.

  1. A host is a member of the nodeexporter and webserver groups. The host needs to be reused for another purpose and is therefore removed from the webserver Inventory Group. This still leaves ports 80/443 open.
  2. A physical host is a member of the nodeexporter and kubernetes Inventory Groups. The machine is defective and is removed from the system entirely. This leaves the firewall rules for its old IP in place on all of the remaining Kubernetes nodes.

The way I have been writing my roles is to add/ensure things like firewall ports. The nodeexporter role serves as an example of why I can't simply flush the firewall. So how can I ensure a proper state when a host leaves a group, both on the host itself and on other affected hosts, as in the Kubernetes example?

My current workaround for the Kubernetes case is to maintain an auto-generated file on each host that contains the list of allowed IP addresses. On each playbook run this list is matched against the IPs that are actually granted. However, this approach doesn't work very well for things like installed software packages, which can become stale after a host has left a group.
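
A stripped-down sketch of that idea (the group name, the port and the state-file path below are just placeholders, not my real setup):

```yaml
# Sketch: regenerate the allow-list from the current group membership and
# revoke rules for IPs that are no longer part of the group.
- name: Read the previously granted IPs (empty on the first run)
  ansible.builtin.slurp:
    src: /etc/ansible-state/kubernetes_allowed_ips
  register: previous_ips_raw
  failed_when: false

- name: Build the current and previous IP lists
  ansible.builtin.set_fact:
    # assumes ansible_host is set for every member of the kubernetes group
    current_ips: "{{ groups['kubernetes'] | map('extract', hostvars, 'ansible_host') | list }}"
    previous_ips: "{{ (previous_ips_raw.content | default('') | b64decode).splitlines() }}"

- name: Revoke rules for hosts that left the group
  community.general.ufw:
    rule: allow
    port: "6443"        # placeholder port
    proto: tcp
    from_ip: "{{ item }}"
    delete: true
  loop: "{{ previous_ips | difference(current_ips) }}"

- name: Grant rules for the current group members
  community.general.ufw:
    rule: allow
    port: "6443"
    proto: tcp
    from_ip: "{{ item }}"
  loop: "{{ current_ips }}"

- name: Persist the new allow-list as the reference for the next run
  ansible.builtin.copy:
    dest: /etc/ansible-state/kubernetes_allowed_ips
    content: |
      {% for ip in current_ips %}
      {{ ip }}
      {% endfor %}
```

As said, this diff-and-revoke pattern works for firewall rules, but it is much harder to apply to things like installed packages.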

Is there a better way to do this?


2 Answers

1 vote

Is there a better way to do this?

The typical way to handle this is to redeploy your server from scratch when re-purposing it for another role. This ensures that your server starts from a known state and avoids problems caused by stale packages, firewall rules or other system configuration that is not appropriate for the new role.

If you're already using Ansible for configuration management, this ought to be a simple process:

  • Provision server with a basic OS using some automated installation mechanism
  • Use your playbooks to do everything else
0 votes

I have used two ways to address this behaviour in my previous projects:

  • Use a dynamic inventory: you can programmatically assign your hosts to groups according to your own rules (host name, OS, location, environment variables, ...). This works very well as long as there are not too many machines and your inventory code is fast (a sketch follows after this list).
  • Use a host-centric approach: on each host you install Ansible and use ansible-pull instead of ansible-playbook, so the host clones the playbook repository and executes it against localhost only. The playbook can use information about the host (host name, OS, location, environment variables, ...) to condition its execution. I prefer this approach because provisioning can run automatically at each startup (the first time to install, subsequent times to upgrade). The downside: you need Ansible on all VMs.
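
A minimal sketch of the first variant, using Ansible's built-in constructed inventory plugin (the server_role variable and the group names are assumptions; it has to be combined with a static host list or another inventory source that defines the hosts):

```yaml
# inventory/grouping.yml - sketch of dynamic grouping based on host variables
plugin: ansible.builtin.constructed
strict: false
keyed_groups:
  # put every host into a group named after its "server_role" variable,
  # e.g. server_role=webserver -> group "webserver"
  - key: server_role
    separator: ""
groups:
  # conditional membership based on another (assumed) variable
  nodeexporter: run_node_exporter | default(false)
```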

To address the second case, I rebuild a complete configuration file which becomes the new reference to apply, and this has to be done every time a machine is added or removed. When using Ansible, you have to think in terms of the "final state": the new configuration file is the new state and must replace the old one:

  • With a dynamic inventory, it's easy to do: generate all the new firewall rules and apply them, where needed, on each host (see the sketch after this list).
  • With a host-centric approach it's more complicated, because each host is not aware of the others. There are solutions, but they are too complex to describe in this answer.
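
As an illustration of the dynamic inventory variant: the complete peer list is rendered from the current group membership on every run, so entries for removed hosts disappear automatically (file path and handler name are assumptions):

```yaml
# Sketch of the "final state" approach: the rendered file *is* the new reference.
- name: Render the full list of allowed Kubernetes peers
  ansible.builtin.copy:
    dest: /etc/firewall/kubernetes-peers.list
    content: |
      {% for host in groups['kubernetes'] %}
      {{ hostvars[host].ansible_host }}
      {% endfor %}
  notify: rebuild firewall rules   # hypothetical handler that reapplies the rules from the file
```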