I set up a Kubernetes cluster with a single master node and two worker nodes using kubeadm, and I am trying to figure out how to recover from node failure.
When a worker node fails, recovery is straightforward: I create a new worker node from scratch, run kubeadm join, and everything's fine.
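For reference, the worker recovery described above can be sketched roughly as follows, assuming the control plane is still healthy. The function names are hypothetical labels for the steps, and the endpoint, token, hash, and node name are placeholders to be filled in. Deleting the stale Node object is an assumed cleanup step, not something stated above.

```shell
# Sketch only: function names are hypothetical; arguments are placeholders.

# On the control-plane node: create a fresh bootstrap token and print the
# matching join command (bootstrap tokens expire, so an old one may not work).
print_join_command() {
  kubeadm token create --print-join-command
}

# On the replacement worker: join the cluster with the values printed above.
# Arguments: <host:port of the API server> <token> <sha256:CA cert hash>
join_worker() {
  kubeadm join "$1" --token "$2" --discovery-token-ca-cert-hash "$3"
}

# On the control-plane node: remove the Node object left behind by the
# failed worker (assumed cleanup step).
remove_dead_node() {
  kubectl delete node "$1"
}
```

On the new worker this would be called as `join_worker <host>:6443 <token> sha256:<hash>`, using the values printed by `print_join_command`.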
However, I cannot figure out how to recover from master node failure without interrupting the deployments running on the worker nodes. Do I need to back up and restore the original certificates, or can I just run kubeadm init to create a new master from scratch? If I create a new master, how do I join the existing worker nodes to it?