
I'm trying to define a Job with a resource limit. Here is the basic pod definition it wraps:

apiVersion: v1
kind: Pod
metadata:
    name: busybox
    labels:
        app: busybox
spec:
    containers:
        - name: busybox
          image: busybox
          args: [/bin/sh, -c, 'sleep 600']
          resources:
            limits:
                memory: "1Gi"
                cpu: "1"
            requests:
                cpu: "0.1"
                memory: "400Mi"
    nodeSelector:
        gitlab: "true"
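
For reference, the actual workload is a Job wrapping this same pod template. A minimal sketch of what that looks like (the Job name and restartPolicy choice here are illustrative, not my exact manifest):

apiVersion: batch/v1
kind: Job
metadata:
    name: busybox-job            # illustrative name
spec:
    template:
        metadata:
            labels:
                app: busybox
        spec:
            restartPolicy: Never # a Job's pods must use Never or OnFailure
            containers:
                - name: busybox
                  image: busybox
                  args: [/bin/sh, -c, 'sleep 600']
                  resources:
                      limits:
                          memory: "1Gi"
                          cpu: "1"
                      requests:
                          cpu: "0.1"
                          memory: "400Mi"
            nodeSelector:
                gitlab: "true"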

Declaring the job with a memory request of "300Mi" works, but with a request of "400Mi" it fails to schedule with:

52m 29s 183 default-scheduler Warning FailedScheduling No nodes are available that match all of the following predicates:: Insufficient memory (3), MatchNodeSelector (2).

Running kubectl describe node <relevant node> shows me the following under resources:

Capacity:
 cpu:     2
 memory:  2052872Ki
 pods:    110
Allocatable:
 cpu:     2
 memory:  1950472Ki
 pods:    110

Further down, past the system info, I get the following:

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  150m (7%)     1 (50%)     12Mi (0%)        128Mi (6%)
Events:       <none>

Now, why does this job refuse to schedule with a request of 400Mi but schedule fine with 300Mi? The arithmetic doesn't explain it: the node has 1950472Ki (roughly 1904Mi) allocatable, and the pods already on it request only 12Mi, so a 400Mi request should fit with well over 1Gi to spare.

I only have a single container (busybox, for demo purposes).

There is no resource quota for the namespace.

Restarting the apiserver appears to allow pods to schedule, but only for a while.


1 Answer


The correct answer turned out to be:

Because I was using too old a version of etcd.

Specifically, upgrading etcd to version 3.1.7 or later solved the problem. That also fits the apiserver-restart workaround above: restarting presumably gave the scheduler a fresh view of cluster state, which then drifted stale again over time.