I am experimenting with GKE cluster upgrades in a 6-node test cluster (two node pools of 3 nodes each) before I try it on our staging or production cluster. When the cluster only ran a 12-replica nginx deployment, the nginx ingress controller and cert-manager (as a Helm chart), the upgrade took 10 minutes per node pool (3 nodes). I was very satisfied.
I decided to try again with something that looks more like our setup. I removed the nginx deployment and added 2 Node.js deployments plus the following Helm charts: mongodb-0.4.27, mcrouter-0.1.0 (as a StatefulSet), redis-ha-2.0.0, and my own www-redirect-0.0.1 chart (a simple nginx that does redirects).
The problem seems to be with mcrouter. Once a node starts draining, its status changes to Ready,SchedulingDisabled (which seems normal), but the following pods remain:
- mcrouter-memcached-0
- fluentd-gcp-v2.0.9-4f87t
- kube-proxy-gke-test-upgrade-cluster-default-pool-74f8edac-wblf
I do not know why those two kube-system pods remain, but the mcrouter pod is mine and it does not go away quickly enough. If I wait long enough (1 hour+) it eventually works, and I am not sure why. The current node pool (of 3 nodes) started upgrading 2 hours 46 minutes ago; 2 nodes are upgraded and the 3rd is still upgrading, but nothing is moving... I presume it will complete in the next 1-2 hours...
I tried to run the drain command with --ignore-daemonsets --force, but it told me the node was already drained.
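For reference, the drain attempt described above would look roughly like the sketch below. The node name is an assumption inferred from the kube-proxy pod name in the list above, and the `KUBECTL` dry-run wrapper and the function names are only illustrative, not part of any tool.

```shell
#!/bin/sh
# Node name inferred from the kube-proxy pod name listed above (assumption).
NODE="${NODE:-gke-test-upgrade-cluster-default-pool-74f8edac-wblf}"
# Dry-run by default: the commands are only printed.
# Set KUBECTL=kubectl to run them against the current context.
KUBECTL="${KUBECTL:-echo kubectl}"

# Show every pod still scheduled on the node, in all namespaces.
list_node_pods() {
  $KUBECTL get pods --all-namespaces -o wide --field-selector "spec.nodeName=$NODE"
}

# Evict everything except DaemonSet-managed pods; --force also removes
# pods that are not managed by a controller.
drain_node() {
  $KUBECTL drain "$NODE" --ignore-daemonsets --force
}

list_node_pods
drain_node
```

With the default dry-run setting this only prints the two kubectl command lines, which makes it easy to check the node name before running anything for real.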
I tried to delete the pods, but they just come back and the upgrade does not move any faster.
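A few checks that might help narrow down why the pod is not being evicted are sketched below. This is not a confirmed diagnosis: whether a PodDisruptionBudget exists for this release is a guess, the `default` namespace is an assumption (the namespace is not stated in the post), and the `KUBECTL` dry-run wrapper and function names are hypothetical helpers.

```shell
#!/bin/sh
# Pod name taken from the list above; the namespace is an assumption.
POD="${POD:-mcrouter-memcached-0}"
NAMESPACE="${NAMESPACE:-default}"
# Dry-run by default: the commands are only printed.
# Set KUBECTL=kubectl to run them against the current context.
KUBECTL="${KUBECTL:-echo kubectl}"

# Describe the pod; the Events section at the end of the output may
# mention failed evictions or scheduling problems.
describe_pod() {
  $KUBECTL describe pod "$POD" --namespace "$NAMESPACE"
}

# List PodDisruptionBudgets (if any) that could be limiting evictions.
list_pdbs() {
  $KUBECTL get poddisruptionbudgets --all-namespaces
}

# Recent events in the namespace, oldest first.
list_events() {
  $KUBECTL get events --namespace "$NAMESPACE" --sort-by=.metadata.creationTimestamp
}

describe_pod
list_pdbs
list_events
```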
Any thoughts?
Update #1
The mcrouter helm chart was installed like this:
helm install stable/mcrouter --name mcrouter --set controller=statefulset
The StatefulSet it created for the mcrouter part is:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  labels:
    app: mcrouter-mcrouter
    chart: mcrouter-0.1.0
    heritage: Tiller
    release: mcrouter
  name: mcrouter-mcrouter
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: mcrouter-mcrouter
      chart: mcrouter-0.1.0
      heritage: Tiller
      release: mcrouter
  serviceName: mcrouter-mcrouter
  template:
    metadata:
      labels:
        app: mcrouter-mcrouter
        chart: mcrouter-0.1.0
        heritage: Tiller
        release: mcrouter
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: mcrouter-mcrouter
                release: mcrouter
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -p 5000
        - --config-file=/etc/mcrouter/config.json
        command:
        - mcrouter
        image: jphalip/mcrouter:0.36.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: mcrouter-port
          timeoutSeconds: 5
        name: mcrouter-mcrouter
        ports:
        - containerPort: 5000
          name: mcrouter-port
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: mcrouter-port
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 256m
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 128Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/mcrouter
          name: config
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: mcrouter-mcrouter
        name: config
  updateStrategy:
    type: OnDelete
And here is the memcached StatefulSet:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  labels:
    app: mcrouter-memcached
    chart: memcached-1.2.1
    heritage: Tiller
    release: mcrouter
  name: mcrouter-memcached
spec:
  podManagementPolicy: OrderedReady
  replicas: 5
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: mcrouter-memcached
      chart: memcached-1.2.1
      heritage: Tiller
      release: mcrouter
  serviceName: mcrouter-memcached
  template:
    metadata:
      labels:
        app: mcrouter-memcached
        chart: memcached-1.2.1
        heritage: Tiller
        release: mcrouter
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: mcrouter-memcached
                release: mcrouter
            topologyKey: kubernetes.io/hostname
      containers:
      - command:
        - memcached
        - -m 64
        - -o
        - modern
        - -v
        image: memcached:1.4.36-alpine
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: memcache
          timeoutSeconds: 5
        name: mcrouter-memcached
        ports:
        - containerPort: 11211
          name: memcache
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: memcache
          timeoutSeconds: 1
        resources:
          requests:
            cpu: 50m
            memory: 64Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
  updateStrategy:
    type: OnDelete
status:
  replicas: 0
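Manifests like the two above can be dumped straight from the cluster, and since memcached runs 5 replicas with a required per-hostname anti-affinity on a 6-node cluster, it may also be worth looking at where those replicas currently sit. The sketch below assumes the release lives in the current namespace; the `KUBECTL` dry-run wrapper and the function names are hypothetical helpers, while the label values come from the manifests above.

```shell
#!/bin/sh
# Release name as used in the helm install command above.
RELEASE="${RELEASE:-mcrouter}"
# Dry-run by default: the commands are only printed.
# Set KUBECTL=kubectl to run them against the current context.
KUBECTL="${KUBECTL:-echo kubectl}"

# Dump every StatefulSet carrying the release label as YAML.
dump_statefulsets() {
  $KUBECTL get statefulsets --selector "release=$RELEASE" --output yaml
}

# Show which node each memcached replica is scheduled on.
show_memcached_placement() {
  $KUBECTL get pods --selector "app=$RELEASE-memcached" --output wide
}

dump_statefulsets
show_memcached_placement
```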
mcrouter-memcached-0 statefulset deployment? – Anton Kostenko