Environment
I am running a Cloud Composer cluster (composer-1.6.0-airflow-1.10.1) with 3 nodes, using the default GKE YAML files provided when the Composer environment was created. We have 3 Celery worker pods running 4 worker processes each (celery.worker_concurrency).
The problem
I have noticed that two of the Celery worker pods are scheduled on the same cluster node (call it node A), while the third pod is on node B. Node C runs only some supporting pods, and its CPU and memory utilisation is marginal.
Previously, we used 10 worker processes per worker, which led to all three worker pods being scheduled on the same node, causing pods to be evicted every few minutes as the node ran out of memory.
I would expect each pod to be scheduled on a separate node for the best resource utilisation.
GKE Master version - 1.11.10-gke.5
Total size - 3 nodes
Node spec:
Image type - Container-Optimised OS (cos)
Machine type - n1-standard-1
Boot disk type - Standard persistent disk
Boot disk size (per node) - 100 GB
Pre-emptible nodes - Disabled
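The expectation of one worker pod per node can also be expressed directly to the Kubernetes scheduler with pod anti-affinity. A minimal sketch follows; the pod label (run: airflow-worker) is an assumption, not copied from a real Composer manifest, and Composer may revert manual edits to the workloads it manages:

```yaml
# Fragment of a pod template spec; the worker pod label is assumed.
affinity:
  podAntiAffinity:
    # "required" means the scheduler refuses to co-locate two matching
    # pods on the same node, rather than merely preferring not to.
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            run: airflow-worker        # assumed worker pod label
        topologyKey: kubernetes.io/hostname  # one matching pod per node
```

With only 3 nodes and 3 worker replicas, a required anti-affinity rule leaves a worker Pending if any node becomes unschedulable, so preferredDuringSchedulingIgnoredDuringExecution may be the safer variant.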
Workaround
By default, Cloud Composer does not set a requested memory value on the worker pods. Setting a memory request large enough that two worker pods cannot fit on the same node mostly fixes the problem. In my case I set the memory request to 1.5Gi: an n1-standard-1 node has 3.75 GB of memory, so after system overhead only one such pod fits per node.
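As a sketch, the request goes on the worker container in the Deployment's pod template. The deployment and container names below are assumptions (verify with kubectl get deployments in the environment's namespace), and Composer may overwrite manual edits on upgrade:

```yaml
# Fragment of the worker Deployment's pod template; names are assumed.
spec:
  containers:
    - name: airflow-worker
      resources:
        requests:
          memory: 1.5Gi  # two such pods exceed a 3.75 GB node's allocatable memory
```

The scheduler sums requests, not actual usage, so this works even though the workers' real consumption fluctuates: once one worker claims 1.5Gi of a node's allocatable memory, a second 1.5Gi request no longer fits and the pod is placed elsewhere.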