Kubernetes pod scheduling with resource quotas

Question

I have read about k8s resource management, but things are still not very clear to me. Let's say we have 2 k8s nodes, each with 22 MB of memory. Let's say Pod A has a request of 10 MB and a limit of 15 MB (but its actual usage is 5 MB), so this pod is scheduled on node 1. Node 1 has 22 MB of memory; 5 MB is used by Pod A, and another 17 MB is available if Pod A needs more memory. Pod B has a request of 10 MB and a limit of 15 MB (basically the same as Pod A), so this pod is scheduled on node 2.

So both nodes have 5 MB in use out of 22 MB. If Pod C has a request of 5 MB and a limit of 10 MB, will this pod be scheduled on either of the nodes? If yes, what would happen if Pod C needs 10 MB of memory and the other pod needs 15 MB?

What would happen if Pod C has a request of 13 MB and a limit of 15 MB? In this case 13 (request of Pod C) + 10 (request of Pod A) will be 23, which is more than 22.

Does k8s try to make sure that the sum of requests of all pods < available memory && the sum of limits of all pods < available memory?

Assuming that your node has 22 MB of RAM, that none of it is used by other components, and that you spawn a Pod with a request of 10 MB and then a Pod with a request of 15 MB, the second Pod will stay in the Pending state, as Kubernetes can't guarantee that 15 MB of RAM will be available (even if the first Pod uses no resources). I see that you've reviewed the K8s documentation about it, but have you seen the docs about kube-scheduler? -- Dawid Kruk

Dawid Kruk · Accepted Answer · 2021-03-15T11:09:45

Answering the questions from the post:

Let's say Pod A has a request of 10 MB and a limit of 15 MB (but its actual usage is 5 MB), so this pod is scheduled on node 1. Node 1 has 22 MB of memory; 5 MB is used by Pod A, and another 17 MB is available if Pod A needs more memory. Pod B has a request of 10 MB and a limit of 15 MB (basically the same as Pod A), so this pod is scheduled on node 2.

This is not really a question, but I think this part needs some perspective on how Pods are scheduled onto nodes. The component responsible for deciding where a Pod should be scheduled is kube-scheduler. The situation you describe could indeed occur:

Pod A, req: 10M, limit: 15M -> Node 1, mem: 22MB
Pod B, req: 10M, limit: 15M -> Node 2, mem: 22MB

Citing the official documentation:

Node selection in kube-scheduler

kube-scheduler selects a node for the pod in a 2-step operation:

Filtering

Scoring

The filtering step finds the set of Nodes where it's feasible to schedule the Pod. For example, the PodFitsResources filter checks whether a candidate Node has enough available resource to meet a Pod's specific resource requests. After this step, the node list contains any suitable Nodes; often, there will be more than one. If the list is empty, that Pod isn't (yet) schedulable.

In the scoring step, the scheduler ranks the remaining nodes to choose the most suitable Pod placement. The scheduler assigns a score to each Node that survived filtering, basing this score on the active scoring rules.

Finally, kube-scheduler assigns the Pod to the Node with the highest ranking. If there is more than one node with equal scores, kube-scheduler selects one of these at random.

-- Kubernetes.io: Docs: Concepts: Scheduling eviction: Kube scheduler: Implementation
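The two-step selection quoted above can be sketched roughly as follows. This is only an illustration and not kube-scheduler's code: the names `Node`, `filter_nodes`, `score` and `schedule` are hypothetical, memory is the only resource considered, and the scoring rule (prefer the node left with the most unreserved memory) is an assumption chosen for simplicity.

```python
import random
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    allocatable_mb: int      # memory the node offers to Pods
    requested_mb: int = 0    # sum of requests of Pods already placed here


def filter_nodes(nodes, request_mb):
    """Filtering: keep only nodes whose unreserved memory covers the request."""
    return [n for n in nodes if n.requested_mb + request_mb <= n.allocatable_mb]


def score(node, request_mb):
    """Scoring (assumed rule): unreserved memory left after placing the Pod."""
    return node.allocatable_mb - node.requested_mb - request_mb


def schedule(nodes, request_mb, rng=random):
    """Return the chosen node name, or None if the Pod would stay Pending."""
    feasible = filter_nodes(nodes, request_mb)
    if not feasible:
        return None
    best = max(score(n, request_mb) for n in feasible)
    winners = [n for n in feasible if score(n, request_mb) == best]
    chosen = rng.choice(winners)         # equal scores: pick one at random
    chosen.requested_mb += request_mb    # reserve the request on that node
    return chosen.name
```

With two empty 22 MB nodes, calling `schedule(nodes, 10)` twice places Pod A and Pod B on different nodes under this scoring rule, which matches the situation described in the question.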

So both nodes have 5 MB in use out of 22 MB. If Pod C has a request of 5 MB and a limit of 10 MB, will this pod be scheduled on either of the nodes? If yes, what would happen if Pod C needs 10 MB of memory and the other pod needs 15 MB?

In this particular example I would focus much more on the requests than on the actual usage. Assuming no other factor prevents scheduling, Pod C should be placed on one of the nodes (whichever kube-scheduler chooses). Why is that?

resources.limits will not block the scheduling of the Pod (a limit can be higher than the node's memory)
resources.requests can block the scheduling of the Pod (the requests of all Pods on a node, including the new one, cannot exceed the node's memory)
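To make the difference between the two fields concrete, here is a small sketch using the numbers from the question. The helper names `fits` and `limits_overcommitted` and the dictionary layout are made up for illustration, and the node's full 22 MB is assumed to be available to Pods.

```python
def fits(node_allocatable_mb, pods, new_pod):
    """True when the new Pod's request fits; limits are not consulted at all."""
    reserved = sum(p["request_mb"] for p in pods)
    return reserved + new_pod["request_mb"] <= node_allocatable_mb


def limits_overcommitted(node_allocatable_mb, pods):
    """True when the summed limits exceed what the node can actually provide."""
    return sum(p["limit_mb"] for p in pods) > node_allocatable_mb


pod_a = {"name": "pod-a", "request_mb": 10, "limit_mb": 15}
pod_c = {"name": "pod-c", "request_mb": 5, "limit_mb": 10}

print(fits(22, [pod_a], pod_c))                   # requests: 10 + 5 <= 22
print(limits_overcommitted(22, [pod_a, pod_c]))   # limits: 15 + 10 > 22
```

Both checks come out true: Pod C is schedulable next to Pod A, yet the node is overcommitted on limits. If both Pods later grow toward their limits, the node can run short of memory, and at that point containers may be OOM-killed or Pods evicted by the kubelet; the scheduler does not prevent this up front.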

I encourage you to check the following articles for more background:

Sysdig.com: Blog: Kubernetes limits requests
Cloud.google.com: Blog: Products: Containers Kubernetes: Kubernetes best practices (this is a GKE blog post, but it should give the baseline idea; see the section "The lifecycle of a Kubernetes Pod")

What would happen if Pod C has a request of 13mb and a limit of 15mb? In this case 13(request of pod C) + 10(request of pod A) will be 23(more than 22)?

In that example the Pod will not be scheduled, as the sum of requests > memory on either node (assuming no Pod priority). The Pod will stay in the Pending state.
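The arithmetic for this case, written out as a short check. It assumes each node's full 22 MB is allocatable to Pods, which is a simplification; the variable names are illustrative only.

```python
NODE_ALLOCATABLE_MB = 22

# Requests already reserved on each node (Pod A on node1, Pod B on node2).
reserved_mb = {"node1": 10, "node2": 10}

pod_c_request_mb = 13

# Memory that can still be requested on each node.
headroom_mb = {name: NODE_ALLOCATABLE_MB - used for name, used in reserved_mb.items()}

# Nodes that survive the request check for Pod C.
candidates = [name for name, free in headroom_mb.items() if pod_c_request_mb <= free]

phase = "Pending" if not candidates else "Schedulable"
print(headroom_mb, candidates, phase)
```

Each node has 12 MB of headroom, the candidate list is empty, and so Pod C remains Pending even though actual usage on both nodes is only 5 MB.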
