GKE Pod not scheduled in different namespace

Question

I want to create a monstache deployment in my existing namespace "test-namespace". When I deploy it in the "default" namespace it works, but when I deploy it in "test-namespace" the pod is not scheduled.

kubectl get pods -n test-namespace monstache-74466dc7-5tnrr -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
monstache-74466dc7-5tnrr   0/1     Pending   0          57m   <none>   <none>   <none>           <none>

and:

kubectl describe pods -n test-namespace monstache-74466dc7-5tnrr
Name:           monstache-74466dc7-5tnrr
Namespace:      test-namespace
Priority:       0
Node:           <none>
Labels:         app=monstache
                pod-template-hash=74466dc7
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/monstache-74466dc7
Containers:
  monstache:
    Image:      rwynn/monstache:latest
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/monstache
      -f
      /app/monstache.test.config.toml
    Environment:
      MONSTACHE_DIRECT_READ_NS:    xxx.XXX
      MONSTACHE_CHANGE_STREAM_NS:  xxx.XXX
      MONSTACHE_MONGO_URL:         mongodb://xxx?replicaSet=rs0
      MONSTACHE_ES_USER:           elastic
      MONSTACHE_ES_PASS:           XXX
    Mounts:
      /app from monstache-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qmcwm (ro)
Volumes:
  monstache-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      monstache-config
    Optional:  false
  default-token-qmcwm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-qmcwm
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

and:

kubectl get events -n test-namespace
LAST SEEN   TYPE     REASON              OBJECT                          MESSAGE
55m         Normal   SuccessfulCreate    replicaset/monstache-74466dc7   Created pod: monstache-74466dc7-snrdb
55m         Normal   SuccessfulCreate    replicaset/monstache-74466dc7   Created pod: monstache-74466dc7-5tnrr
55m         Normal   ScalingReplicaSet   deployment/monstache            Scaled up replica set monstache-74466dc7 to 1
55m         Normal   ScalingReplicaSet   deployment/monstache            Scaled up replica set monstache-74466dc7 to 1

This is my monstache deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: monstache
  namespace: test-namespace
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: monstache
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: monstache
    spec:
      containers:
      - command:
        - /bin/monstache
        - -f
        - /app/monstache.test.config.toml
        env:
        - name: MONSTACHE_DIRECT_READ_NS
          value: xxx.xxx
        - name: MONSTACHE_CHANGE_STREAM_NS
          value: xxx.xxx
        - name: MONSTACHE_MONGO_URL
          value: mongodb://mongodb-service:27017/xxx?replicaSet=rs0
        - name: MONSTACHE_ES_USER
          value: elastic
        - name: MONSTACHE_ES_PASS
          value: XXXX
        image: rwynn/monstache:latest
        imagePullPolicy: Always
        name: monstache
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /app
          name: monstache-config
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: monstache-config
        name: monstache-config
---
apiVersion: v1
data:
  monstache.test.config.toml: |
    resume = true

    gzip = true

    elasticsearch-urls = ["https://elasticsearch:9200"]

    elasticsearch-max-conns = 10

    elasticsearch-max-seconds = 1

    elasticsearch-max-docs = 1

    #namespace-regex = '*'

    verbose = false

    enable-http-server = true

    elasticsearch-validate-pem-file = false

    [[mapping]]
    namespace = "XXX.XXX"
    index = "XXX"
kind: ConfigMap
metadata:
  name: monstache-config
  namespace: test-namespace

A few more things to know:

I already have pods scheduled in that namespace
I tried to delete the deployment and re-create it.
I even created a new node pool and tried to schedule the deployment there; that also didn't work.
I checked for a pod count limit and a pod quota, and neither is being hit.
I have 12 namespaces in that GKE cluster
I have a total of 113 pods in that GKE cluster
I have some successfully scheduled monstache deployments in other namespaces in that cluster.
It happens in the 2 most recent namespaces I've created.

Any clues?

What are your master and node pool versions? Are you using preemptible nodes? There are two possibilities. 1) The pod CIDR may not match the node's alias IP range. If so, it can be fixed by deleting the node that has the mismatched pod CIDR and alias IP. 2) Describe the node and check whether there is enough ephemeral storage; look for an 'Insufficient ephemeral-storage' message. Hope it helps — Milad Tabrizi
Hello, could you recreate (delete and create) the monstache deployment? After that you should be able to see why it's in the Pending state with kubectl describe pod monstache-xxx (by default, events are cleared after one hour). Alternatively, you could check the Stackdriver logs for more information. — Dawid Kruk
@Idan for reproduction purposes, please add all of the steps and YAML definitions that you followed to deploy monstache. Also, have you tried to deploy it on a new cluster (in default and in test-namespace)? Could you please check whether Stackdriver logged any information about this Pending state? — Dawid Kruk
Firstly, when you change the namespace of your Deployment, you also need to change the connection strings. If your Pod resides in a different namespace than the Service it tries to connect to, it needs to use the full FQDN, like: mongodb.NAMESPACE.svc.cluster.local. Secondly, please check that you are not blocked by a ResourceQuota: stackoverflow.com/a/63516629/12257134. — Dawid Kruk
@Idan I've used the definitions (ConfigMap and Deployment) you posted in the question and I couldn't replicate the issue you're having. My Pod was in the Running state and then went into CrashLoopBackOff (lack of connections), whereas in your case the Pod stays in the Pending state. To pinpoint the issue you will need to add the logs from either kubectl describe or Stackdriver. Stackdriver should store the logs of all GKE events. You can read more about it here: cloud.google.com/stackdriver/docs/solutions/gke — Dawid Kruk
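
The checks suggested in the comments above can be sketched as a few shell helpers. This is a minimal sketch, not the commenters' exact commands: the function names (recent_events, show_quota, node_pod_cidrs, svc_fqdn) are hypothetical, the namespace and Service names are passed in as arguments, and the default cluster domain "cluster.local" is assumed.

```shell
#!/bin/sh
# Hypothetical helper names; only the kubectl subcommands and flags are real.

# List events in a namespace, sorted by last-seen time. Per the comment
# above, events are cleared after about an hour by default, so run this
# soon after re-creating the deployment.
recent_events() {
  kubectl get events -n "$1" --sort-by=.lastTimestamp
}

# Show ResourceQuota objects and their current usage in a namespace.
show_quota() {
  kubectl describe resourcequota -n "$1"
}

# Print each node with its assigned pod CIDR, for comparison with the
# node's alias IP range.
node_pod_cidrs() {
  kubectl get nodes -o custom-columns=NAME:.metadata.name,POD_CIDR:.spec.podCIDR
}

# Build the cluster-internal DNS name of a Service in another namespace.
# Arguments: service name, namespace. Assumes the domain "cluster.local".
svc_fqdn() {
  printf '%s.%s.svc.cluster.local\n' "$1" "$2"
}
```

For example, `svc_fqdn mongodb-service default` prints `mongodb-service.default.svc.cluster.local`; whether the MongoDB Service actually lives in the default namespace is an assumption based on the deployment working there.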

Idan Idan · Accepted Answer · 2020-12-17T12:35:24

0 votes

It was a bug. Re-deploying it with GKE version 1.17.14-gke.1200 solved the problem.
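
To check which GKE version the control plane is running, and to move it to the version named above, something like the following could be used. This is a sketch under assumptions: the function names are hypothetical, the cluster name and zone are placeholders supplied by the caller, and the answer does not say whether the cluster was upgraded in place or re-created.

```shell
#!/bin/sh
# Hypothetical helper names; cluster name and zone are caller-supplied.

# Print the control plane (master) version of a cluster.
# Arguments: cluster name, zone.
master_version() {
  gcloud container clusters describe "$1" --zone "$2" \
    --format='value(currentMasterVersion)'
}

# Upgrade the control plane of a cluster to a given version.
# Arguments: cluster name, zone, target version.
upgrade_master() {
  gcloud container clusters upgrade "$1" --zone "$2" \
    --master --cluster-version "$3"
}
```

A call would look like `upgrade_master CLUSTER_NAME ZONE 1.17.14-gke.1200`, with CLUSTER_NAME and ZONE replaced by real values. Node pools are upgraded separately from the control plane.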
