
I am doing the Getting Started with AWS EKS demo on my machine. I created an EKS cluster and worker nodes, attached the nodes to the cluster, and deployed an nginx service on them. On the first attempt the demo was successful and I was able to access the load balancer URL with the nginx service behind it. Then, while playing around with the setup, both of my nodes (say node1 and node2) got deleted with the command below:

kubectl delete node <node-name>
node "ip-***-***-***-**.ap-south-1.compute.internal" deleted

I spent quite some time trying to recover from this. I found that the load balancer URL is still ACTIVE and the two corresponding EC2 instances (the worker nodes) are running fine. However, the command below gives this result:

PS C:\k8s> kubectl get nodes
No resources found.
PS C:\k8s>

I tried to repeat step 3 of the Getting Started guide, but that only ended up recreating the same worker nodes.

When I try to create pods again on the same EC2 instances (worker nodes), the pod STATUS stays Pending:

PS C:\k8s> kubectl create -f .\aws-pod-nginx.yaml
deployment.apps/nginx created
PS C:\k8s> kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
nginx-76b782ee75-n6nwv   0/1     Pending   0          38s
nginx-76b78dee75-rcf6d   0/1     Pending   0          38s

When I describe a pod, the error is as below:

Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  52s (x5 over 4m11s)  default-scheduler  no nodes available to schedule pods

My two EC2 instances (worker nodes) are running. I tried to register them with the ELB manually, but their status in the load balancer shows 'OutOfService'.

I would like the command below to list working nodes that can be reached through the ELB, but right now it returns 'No resources found.':

kubectl get nodes

Why are you trying to access k8s nodes from an ELB? From my understanding that is not how k8s works. The client (you) would request the k8s service endpoint and leave the routing to k8s. In situations where you need direct access, use the EC2 public DNS. In the worst case, clean everything up and start from step 1 of the walkthrough. – David J Eddy
I was using the k8s ELB that is launched as part of the k8s demo; I did not create it separately, it was created as a k8s Service from a .yml file. This ELB uses the EC2 instances as its backend (that routing is done automatically by k8s). From my post above, what I have understood is that pods are recreated automatically if any of them gets terminated. What I want to understand is: if I delete a node through the CLI, is there a way (a command or manual step) to recover/restore that node? – Jagdish0886
k8s is a declarative system. Meaning that if the configuration declares two nodes and one of the nodes goes away, the k8s master should relaunch a node to match the desired declared state of two nodes, i.e. the node should be replaced automatically by k8s. – David J Eddy
That is what I am wondering about. I was doing the EKS Getting Started demo, which gave me a master and two nodes hosted on EC2. Even though one of the nodes got terminated, why didn't the master relaunch it? – Jagdish0886
That is the better question. :) – David J Eddy

1 Answer


You say you deleted the nodes with the kubectl delete node <node-name> command. I don't think you wanted to do that. You deleted the nodes from Kubernetes, but the two EC2 instances are still running. Kubernetes can no longer schedule pods on the EC2 instances that were removed from the cluster. Re-attaching those instances is very difficult: you would need SSH or SSM Session Manager access to log into them and run the commands to rejoin the cluster.
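
For completeness, if you did want to try the manual route: on an EKS-optimized AMI the kubelet is already configured by the image's bootstrap, so restarting it is often enough for the node to re-register. This is only a rough sketch, assuming you have SSH access to the instance; it is not part of the EKS walkthrough:

# On the worker node: restart the kubelet so it re-registers with the API server
sudo systemctl restart kubelet

# Back on your workstation: check whether the node came back
kubectl get nodes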

It would actually be far easier to just delete the old EC2 instances and create new ones. If you followed the AWS EKS documentation to create the cluster, then an ASG (Auto Scaling Group, also called a Node Group) was created, and that ASG launched the EC2 instances. The ASG lets you scale the number of EC2 instances in the cluster up and down. To check whether your EC2 instances were created by an ASG, open the EC2 Instances page in the AWS Console, select one of the instances that was in your cluster, and open the Tags tab. If the instance was created by an ASG, you will see a tag named aws:autoscaling:groupName.
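
If you prefer the CLI over the console, you can list the tags with the AWS CLI as well; the instance ID below is just a placeholder for one of your worker instances:

# Look for the aws:autoscaling:groupName tag in the output
aws ec2 describe-tags --filters "Name=resource-id,Values=i-0123456789abcdef0"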

If the EC2 instance was created by an ASG, you can simply terminate the instance and the ASG will create a new one to replace it. When the new instance comes up, its UserData contains a cloud-init script that joins it to the Kubernetes cluster. Do this for every node you removed with the kubectl delete node command.
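
As a sketch, terminating an instance from the CLI looks like this (again, the instance ID is a placeholder); the ASG should notice the missing instance and launch a replacement to get back to its desired capacity:

# Terminate one of the old worker instances; the ASG replaces it automatically
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0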

When the new EC2 instances join the cluster, you will see them with the kubectl get nodes command. At that point Kubernetes will be able to schedule pods to run on those instances.
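
Assuming the replacement instances bootstrap correctly, you can watch them register and then confirm that the Pending nginx pods get scheduled, for example:

# Watch the new nodes appear and go Ready
kubectl get nodes --watch

# The Pending pods should now be scheduled onto the new nodes
kubectl get pods -o wide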