
I am trying to deploy a Node.js application (Dockerized, in Artifact Registry) into a GCP Kubernetes (GKE) cluster, and then put the service/deployment behind an Ingress so that our static frontend can talk to this application cross-origin (CORS).

I am able to get the service to work without an Ingress (just a standard service/deployment), but the frontend cannot talk to it because of CORS errors. After researching, I learned that I should create an Ingress to control the traffic for this scenario.

I have verified the app is running, both by looking at the GKE Workloads logs (the app has started) and by entering the cluster (via a busybox pod) and curling the GKE service, which returns the expected responses. So I have determined the issue is restricted to the load balancer traffic not being routed correctly, or being denied for some reason. (Screenshot: curl from pod to GKE service.)
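
For reference, the in-cluster check was along these lines (a minimal sketch; busybox ships wget rather than curl, and the /health path matches the BackendConfig below):

# Throwaway pod that hits the service via its cluster DNS name
kubectl run curl-test --rm -it --image=busybox:1.35 --restart=Never -- \
    wget -qO- http://adcloud-api.default.svc.cluster.local/health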

The app is configured to run on port 80 everywhere (both in the Docker image/app, and in the service's port/targetPort).

I have created the firewall rules, both for the node port itself and for the health checks, as explained in the GCP documentation. (Screenshot: firewall rules.)
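
For reference, the health-check rule can be created along these lines (a sketch; the rule name is illustrative, the network is assumed to be default, and 130.211.0.0/22 plus 35.191.0.0/16 are the health-check source ranges from the GCP docs):

# Allow Google's health checkers to reach the app port and the nodePort
gcloud compute firewall-rules create allow-gke-health-checks \
    --network=default \
    --direction=INGRESS \
    --allow=tcp:80,tcp:32001 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16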

The steps I have taken are, roughly:

  1. Create a new GKE cluster (with HTTP load balancing enabled, though I'm not sure this is necessary because the ingress definition below automatically creates its own load balancer)

  2. Then I applied this deployment + service + ingress configuration with kubectl apply -f deployment.yaml:

# Main api deployment
kind: Deployment
apiVersion: apps/v1
metadata:
    name: adcloud-api
spec:
    selector:
        matchLabels:
            app: adcloud-api
    replicas: 1
    template:
        metadata:
            labels:
                app: adcloud-api
        spec:

            containers:
                - name: adcloud-api
                  image: gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1
                  imagePullPolicy: IfNotPresent
                  ports:
                  - containerPort: 80
                    protocol: TCP
                  resources:
                      requests:
                          memory: "32Mi"
                          cpu: "100m"
                      limits:
                          memory: "128Mi"
                          cpu: "250m"
---
# Service for the above deployment
kind: Service
apiVersion: v1
metadata:
    name: adcloud-api
    annotations:
        cloud.google.com/backend-config: '{"ports": {"80":"adcloud-api-backendconfig"}, "default": "adcloud-api-backendconfig"}'
spec:
    # type: LoadBalancer
    type: NodePort
    selector:
        app: adcloud-api
    ports:
        - protocol: TCP
          port: 80
          targetPort: 80
          nodePort: 32001
---
kind: BackendConfig
apiVersion: cloud.google.com/v1
metadata:
    name: adcloud-api-backendconfig
spec:
    healthCheck:
        # timeoutSec: 10
        # checkIntervalSec: 30
        requestPath: /health
        port: 80
        type: HTTP
---
# Ingress for the above service
kind: Ingress
apiVersion: networking.k8s.io/v1
metadata:
    name: api-ingress
    annotations:
        kubernetes.io/ingress.class: "gce"
        gce.ingress.kubernetes.io/enable-cors: "true"
        gce.ingress.kubernetes.io/cors-allow-credentials: "true"
        gce.ingress.kubernetes.io/cors-allow-methods: "GET, POST, PUT, PATCH, DELETE, OPTIONS"
        gce.ingress.kubernetes.io/cors-allow-origin: "*"
spec:
    rules:
        - http:
              paths:
                  - path: /*
                    pathType: ImplementationSpecific
                    backend:
                        service:
                            name: adcloud-api
                            port:
                                number: 80

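Side note: my understanding is that, when no BackendConfig health check is attached, GKE infers the load balancer health check from the container's readinessProbe, so it may also help to define one matching the /health endpoint. A minimal sketch of what that would look like on the container above (the timings are assumptions):

# Added under the Deployment's container spec (indentation as in the manifest above)
readinessProbe:
    httpGet:
        path: /health
        port: 80
    initialDelaySeconds: 5
    periodSeconds: 10
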
I have a general health check defined on port 80. (Screenshot: health check.)

The backend services defined for the created load balancer show one healthy, and one unhealthy because its health checks are "timing out". (Screenshot: load balancer backends, 1 unhealthy.)

You can see this backend service is targeting nodePort 32001. Is this correct? I have the app's Dockerfile exposing only port 80, and have port 80 defined everywhere else (i.e. in the health checks). Should the backend service here also be using port 80, or should it be using the nodePort 32001? Is there some internal proxy handling that?
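
(For context, my understanding is that with instance-group backends the load balancer sends traffic and health checks to the nodePort on each node, and kube-proxy forwards that to the pod's port 80. With container-native load balancing the load balancer targets pod IPs directly instead; on a VPC-native cluster that is enabled with a NEG annotation on the Service, roughly like this:)

kind: Service
apiVersion: v1
metadata:
    name: adcloud-api
    annotations:
        # With this annotation the load balancer health-checks the pods directly
        cloud.google.com/neg: '{"ingress": true}'
# (rest of the Service spec unchanged)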

The instance group members in the instance group for this "unhealthy" load balancer backend show "resource does not exist" for the VM instances... (Screenshot: instance group VM instances don't exist?)

However, the GCE VM Instances page shows that these instances are there. (Screenshot: GCE VM instances.)
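
To cross-check from the CLI, the instance group membership can be listed like this (a sketch; INSTANCE_GROUP_NAME and ZONE are placeholders to copy from the load balancer's backend details):

gcloud compute instance-groups list-instances INSTANCE_GROUP_NAME --zone=ZONE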

The output of kubectl describe ingress is:

$ kubectl describe ingress
Name:             api-ingress
Namespace:        default
Address:          35.227.241.142
Default backend:  default-http-backend:80 (10.0.17.5:8080)
Rules:
  Host        Path  Backends
  ----        ----  --------
  *
              /*   adcloud-api:80 (10.0.17.6:80)
Annotations:  gce.ingress.kubernetes.io/cors-allow-credentials: true
              gce.ingress.kubernetes.io/cors-allow-methods: GET, POST, PUT, PATCH, DELETE, OPTIONS
              gce.ingress.kubernetes.io/cors-allow-origin: *
              gce.ingress.kubernetes.io/enable-cors: true
              ingress.kubernetes.io/backends: {"k8s-be-32001--8950a5f3e6292d2e":"Unknown","k8s-be-32722--8950a5f3e6292d2e":"Unknown"}
              ingress.kubernetes.io/forwarding-rule: k8s2-fr-nto0o9ht-default-api-ingress-vo6louo7
              ingress.kubernetes.io/target-proxy: k8s2-tp-nto0o9ht-default-api-ingress-vo6louo7
              ingress.kubernetes.io/url-map: k8s2-um-nto0o9ht-default-api-ingress-vo6louo7
              kubernetes.io/ingress.class: gce
Events:
  Type    Reason     Age                From                     Message
  ----    ------     ----               ----                     -------
  Normal  Sync       35m                loadbalancer-controller  UrlMap "k8s2-um-nto0o9ht-default-api-ingress-vo6louo7" created
  Normal  Sync       35m                loadbalancer-controller  TargetProxy "k8s2-tp-nto0o9ht-default-api-ingress-vo6louo7" created
  Normal  Sync       34m                loadbalancer-controller  ForwardingRule "k8s2-fr-nto0o9ht-default-api-ingress-vo6louo7" created
  Normal  IPChanged  34m                loadbalancer-controller  IP is now 35.227.241.142
  Normal  Sync       12m (x5 over 35m)  loadbalancer-controller  Scheduled for sync

The output of kubectl describe pods is:

$ kubectl describe pods
Name:         adcloud-api-5cb96bb47d-tmrd8
Namespace:    default
Priority:     0
Node:         gke-adcloud-cluster-default-pool-6f91d4e7-z7ks/10.128.15.216
Start Time:   Wed, 27 Oct 2021 00:51:50 -0400
Labels:       app=adcloud-api
              pod-template-hash=5cb96bb47d
Annotations:  <none>
Status:       Running
IP:           10.0.17.6
IPs:
  IP:           10.0.17.6
Controlled By:  ReplicaSet/adcloud-api-5cb96bb47d
Containers:
  adcloud-api:
    Container ID:   containerd://ebd1b857c541b8fdc52dcc6e44d4617d9558bd7e16783a5a016d5bdd1cce7370
    Image:          gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1
    Image ID:       gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image@sha256:932e59850992e37854242cac9e70ca65eb52863f63c795d892c84671dca4ba68
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 27 Oct 2021 02:19:48 -0400
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     250m
      memory:  128Mi
    Requests:
      cpu:        100m
      memory:     32Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-4xcjx (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-4xcjx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-4xcjx
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason        Age                From             Message
  ----     ------        ----               ----             -------
  Warning  FailedMount   53m                kubelet          MountVolume.SetUp failed for volume "default-token-4xcjx" : failed to sync secret cache: timed out waiting for the condition
  Normal   Pulling       53m                kubelet          Pulling image "gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1"
  Normal   Pulled        52m                kubelet          Successfully pulled image "gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1" in 52.167675207s
  Normal   Created       52m                kubelet          Created container adcloud-api
  Normal   Started       52m                kubelet          Started container adcloud-api
  Normal   Pulling       34m                kubelet          Pulling image "gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1"
  Normal   Pulled        34m                kubelet          Successfully pulled image "gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1" in 29.499249423s
  Normal   Started       34m                kubelet          Started container adcloud-api
  Normal   Created       34m                kubelet          Created container adcloud-api
  Warning  NodeNotReady  16m (x7 over 95m)  node-controller  Node is not ready
  Normal   Pulling       14m                kubelet          Pulling image "gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1"
  Normal   Pulled        13m                kubelet          Successfully pulled image "gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1" in 27.814932495s
  Normal   Created       13m                kubelet          Created container adcloud-api
  Normal   Started       13m                kubelet          Started container adcloud-api

I don't really understand what is causing the one backend service to be unhealthy, or why it can't be reached by the health check.

You can see the ingress can't reach the backend services here. (Screenshot: ingress backend services not reachable.)

Are these unreachable because the health check is failing, or is the health check failing because they are unreachable? Does anyone have any suggestions on what else might be going on here? Do I need to do any network configuration beyond the deployment definition file above? Should the health checks run on the open app port (i.e. 80), or the ephemeral nodePort (i.e. 32001)?
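
One way to inspect the backend health from the CLI (the backend service name is taken from the ingress.kubernetes.io/backends annotation above; --global matches an external HTTP(S) load balancer):

gcloud compute backend-services get-health k8s-be-32001--8950a5f3e6292d2e --global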

Hitting the ingress's external IP results in a 502 because of the unreachable backend. (Screenshot: 502 Service Error.)

I have been at it for a few days and would appreciate any help at this point!

Comments:

Have you tried uncommenting timeoutSec and checkIntervalSec? Also, try changing your targetPort to 8080. – Sergiusz

Thanks. If I change the targetPort to 8080, this would mean I would need to ensure my application is starting on port 8080, correct? Currently it's all set up for port 80. Why would changing to 8080 make a difference? I will give it a shot. – Ryan Weiss

1 Answer


It is common practice to have a different port and targetPort. Take a look at the example Services in the documentation.
I also noticed that you commented out the LoadBalancer type and changed the Service type to NodePort. Make sure the settings across all of your YAML files are aligned.
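
For example, a Service can expose port 80 to the cluster while routing to a container that listens on 8080. A minimal sketch (the names are illustrative, not from your manifests):

kind: Service
apiVersion: v1
metadata:
    name: example-api
spec:
    type: NodePort
    selector:
        app: example-api
    ports:
        - protocol: TCP
          port: 80          # port the Service exposes
          targetPort: 8080  # port the container actually listens on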

If all the settings are correct and the issue persists, enable GKE logging and contact Google Support, or report the problem via the Public Issue Tracker.