More than specified number of pods created on deleting a pod

Question

I have deployed Hashicorp's Vault in my Kubernetes clusters on AWS using the Helm chart for it.

The number of replicas in the deployment is specified as 3.

Of these 3 pods, 1 was ready (1/1) while the other two replicas were not ready (0/1). I killed the ready pod, expecting Kubernetes to deploy one new pod to replace it, but it deployed two new pods.

Now I have two ready pods and two not-ready pods. When I delete one of these pods, Kubernetes recreates only one pod, so I have 4 pods instead of 3 for my Vault deployment. What could be the reason behind this, and how can it be prevented?

I used this chart to deploy: github.com/kubernetes/charts/tree/master/incubator/vault. I used it as is, only adding the S3 storage part and setting vault.dev=false. — Uddhav Bhosle
It sounds like you’re not interested in the actual solution to this (how do I get Vault running successfully?) but in an explanation for the discrepancy (i.e., why 4 instead of 3?). For that, it would be helpful to paste the output of 'kubectl get pods -l app=vault' as well as 'kubectl describe deploy -l app=vault'. — erstaples

erstaples · Accepted Answer · 2018-07-07T19:40:06

Your deployment is not working because the S3 storage backend does not support HA (high availability). You’ll need HashiCorp’s Consul, AWS’s DynamoDB, or a different backend provider for that. Change the number of replicas to 1 if you're sticking with the S3 backend provider.

As for why you're seeing 4 pods instead of 3, you'd need to provide more details. Paste the output of kubectl get pods -l app=vault as well as kubectl describe deploy -l app=vault and I will update this answer.

I can only offer speculation, for what it's worth. With Deployment objects, there's a maxSurge property that allows rolling updates to scale up beyond the desired number of replicas. It defaults to 25%, rounded up, which in your case (3 replicas) would be 1 additional pod.

Max Surge

.spec.strategy.rollingUpdate.maxSurge is an optional field that specifies the maximum number of Pods that can be created over the desired number of Pods. The value can be an absolute number (for example, 5) or a percentage of desired Pods (for example, 10%). The value cannot be 0 if MaxUnavailable is 0. The absolute number is calculated from the percentage by rounding up. The default value is 25%.

For example, when this value is set to 30%, the new ReplicaSet can be scaled up immediately when the rolling update starts, such that the total number of old and new Pods does not exceed 130% of desired Pods. Once old Pods have been killed, the new ReplicaSet can be scaled up further, ensuring that the total number of Pods running at any time during the update is at most 130% of desired Pods.

It's possible that deleting the one Running (1/1) pod, along with the NotReady state of the other pods, put your Deployment into a state of "rolling update" or something along those lines, which allowed your deployment to scale up to its maxSurge setting.
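
To make the arithmetic concrete, here is a minimal sketch of the rounding rule quoted above. It is not Kubernetes' actual code, and the function name max_surge_pods is hypothetical; it only shows how a 25% surge on 3 replicas comes out to 1 extra pod, for a ceiling of 4.

```python
def max_surge_pods(desired_replicas: int, max_surge: str = "25%") -> int:
    """Extra pods allowed above the desired count (hypothetical helper)."""
    if max_surge.endswith("%"):
        percent = int(max_surge[:-1])
        # Percentages become an absolute number by rounding up
        # (integer ceiling division avoids floating-point surprises).
        return (desired_replicas * percent + 99) // 100
    # Otherwise the value is already an absolute number of pods.
    return int(max_surge)


desired = 3
surge = max_surge_pods(desired)  # default of 25%
print(f"surge={surge}, ceiling={desired + surge}")
```

With the 30% example from the quoted documentation and 10 desired pods, the same rule gives 3 extra pods, which matches the "130% of desired Pods" ceiling.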
