For some reason, Kubernetes 1.6.2 does not trigger autoscaling on Google Container Engine. I have a someservice Deployment with the following resources and rolling-update strategy:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: someservice
  labels:
    layer: backend
spec:
  minReadySeconds: 160
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 100%
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        name: someservice
        layer: backend
    spec:
      containers:
      - name: someservice
        image: eu.gcr.io/XXXXXX/someservice:v1
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 2
            memory: 20Gi
          requests:
            cpu: 400m
            memory: 18Gi
<.....>
After changing the image version, the replacement pod cannot be scheduled. Note that because the strategy is maxSurge: 100% with maxUnavailable: 0, the rollout has to place a complete second pod (another 400m CPU and 18Gi of memory in requests) before the old one is torn down.
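For reference, the rollout was triggered with something along these lines (the exact command and tag here are illustrative; updating the manifest and running kubectl apply has the same effect):
$ kubectl -n dev set image deployment/someservice someservice=eu.gcr.io/XXXXXX/someservice:v2
The new pod then sits in Pending indefinitely: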
$ kubectl -n dev get pods -l name=someservice
NAME READY STATUS RESTARTS AGE
someservice-2595684989-h8c5d 0/1 Pending 0 42m
someservice-804061866-f2trc 1/1 Running 0 1h
$ kubectl -n dev describe pod someservice-2595684989-h8c5d
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
43m 43m 4 default-scheduler Warning FailedScheduling No nodes are available that match all of the following predicates:: Insufficient cpu (4), Insufficient memory (3).
43m 42m 6 default-scheduler Warning FailedScheduling No nodes are available that match all of the following predicates:: Insufficient cpu (3), Insufficient memory (3).
41m 41m 2 default-scheduler Warning FailedScheduling No nodes are available that match all of the following predicates:: Insufficient cpu (2), Insufficient memory (3).
40m 36s 136 default-scheduler Warning FailedScheduling No nodes are available that match all of the following predicates:: Insufficient cpu (1), Insufficient memory (3).
43m 2s 243 cluster-autoscaler Normal NotTriggerScaleUp pod didn't trigger scale-up (it wouldn't fit if a new node is added)
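The NotTriggerScaleUp reason can also be cross-checked against the autoscaler's status ConfigMap, assuming the GKE-managed cluster-autoscaler publishes one the way the upstream project does on 1.6:
$ kubectl -n kube-system get configmap cluster-autoscaler-status -o yaml
That ConfigMap reports per-node-group health and recent scale-up decisions.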
My node pool is set to autoscale with min: 2 and max: 5, and the machines in it (n1-highmem-8, 8 vCPUs / 52GB) are large enough to accommodate this service, as the quick arithmetic below shows.
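Back-of-the-envelope (exact allocatable capacity is somewhat lower once kubelet/system reservations are subtracted, so these are approximations):
pod requests:   cpu 400m, memory 18Gi (~19.3GB)
n1-highmem-8:   8 vCPUs (8000m), 52GB (~48.4Gi) of memory
A freshly added, empty node would therefore have far more than 400m CPU and 18Gi of memory free for this pod. But somehow nothing happens: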
$ kubectl get nodes
NAME STATUS AGE VERSION
gke-dev-default-pool-efca0068-4qq1 Ready 2d v1.6.2
gke-dev-default-pool-efca0068-597s Ready 2d v1.6.2
gke-dev-default-pool-efca0068-6srl Ready 2d v1.6.2
gke-dev-default-pool-efca0068-hb1z Ready 2d v1.6.2
$ kubectl describe nodes | grep -A 4 'Allocated resources'
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
7060m (88%) 15510m (193%) 39238591744 (71%) 48582818048 (88%)
--
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
6330m (79%) 22200m (277%) 48930Mi (93%) 66344Mi (126%)
--
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
7360m (92%) 13200m (165%) 49046Mi (93%) 44518Mi (85%)
--
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
7988m (99%) 11538m (144%) 32967256Ki (61%) 21690968Ki (40%)
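This matches the scheduler events above: three of the four nodes do not have 18Gi of unrequested memory left, and the fourth (at 99% CPU requests) does not have 400m of CPU free, so the surge pod fits nowhere on the existing nodes. That is exactly the situation a scale-up is supposed to resolve.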
$ gcloud container node-pools describe default-pool --cluster=dev
autoscaling:
  enabled: true
  maxNodeCount: 5
  minNodeCount: 2
config:
  diskSizeGb: 100
  imageType: COS
  machineType: n1-highmem-8
  oauthScopes:
  - https://www.googleapis.com/auth/compute
  - https://www.googleapis.com/auth/datastore
  - https://www.googleapis.com/auth/devstorage.read_only
  - https://www.googleapis.com/auth/devstorage.read_write
  - https://www.googleapis.com/auth/service.management.readonly
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/sqlservice
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/monitoring
  serviceAccount: default
initialNodeCount: 2
instanceGroupUrls:
- https://www.googleapis.com/compute/v1/projects/XXXXXX/zones/europe-west1-b/instanceGroupManagers/gke-dev-default-pool-efca0068-grp
management:
  autoRepair: true
name: default-pool
selfLink: https://container.googleapis.com/v1/projects/XXXXXX/zones/europe-west1-b/clusters/dev/nodePools/default-pool
status: RUNNING
version: 1.6.2
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.2", GitCommit:"477efc3cbe6a7effca06bd1452fa356e2201e1ee", GitTreeState:"clean", BuildDate:"2017-04-19T20:33:11Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.2", GitCommit:"477efc3cbe6a7effca06bd1452fa356e2201e1ee", GitTreeState:"clean", BuildDate:"2017-04-19T20:22:08Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}