I am trying to deploy a Node.js application (Dockerized, with the image in Artifact Registry) to a GCP Kubernetes (GKE) cluster, and then put the service/deployment behind an Ingress so that our static frontend can talk to this application cross-origin.
I can get the service working without an Ingress (just a standard Service/Deployment), but the frontend cannot talk to it because of CORS errors. After researching, I learned that I should create an Ingress to control the traffic for this scenario.
I have verified the app is running: the GKE Workloads logs show the app has started, and from inside the cluster (a busybox pod) I can curl the GKE Service and get the expected responses. So I have concluded the issue is restricted to the load balancer traffic not being routed correctly, or being denied for some reason.
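For reference, the in-cluster check looked roughly like this (a sketch; the Service DNS name assumes the default namespace, and /health is the path from the BackendConfig below):

```shell
# Spin up a throwaway busybox pod and hit the Service from inside the cluster.
# busybox ships wget rather than curl; service name and namespace match the
# manifests below.
kubectl run curl-test --rm -it --image=busybox:1.35 --restart=Never -- \
  wget -qO- http://adcloud-api.default.svc.cluster.local/health
```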
The app is configured to run on port 80 everywhere (in the Docker image/app itself, and as the Service's port/targetPort).
I have opened the firewalls, both for the nodePort itself and for the health checks, as explained in the GCP documentation.
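Concretely, the health-check firewall rule was created along these lines (a sketch; the rule name and network are placeholders, and 130.211.0.0/22 / 35.191.0.0/16 are the health-check source ranges from the GCP docs):

```shell
# Allow Google's health-check probers to reach the nodePort on the nodes.
# Rule name and network are placeholders; adjust --rules to your nodePort.
gcloud compute firewall-rules create allow-gke-health-checks \
  --network=default \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:32001 \
  --source-ranges=130.211.0.0/22,35.191.0.0/16
```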
Roughly, the steps I have taken are:
Create a new GKE cluster (with HTTP load balancing enabled, though I'm not sure this is necessary because the ingress definition below automatically creates its own load balancer)
Then I applied this deployment + service + ingress configuration with `kubectl apply -f deployment.yaml`:
```yaml
# Main api deployment
kind: Deployment
apiVersion: apps/v1
metadata:
  name: adcloud-api
spec:
  selector:
    matchLabels:
      app: adcloud-api
  replicas: 1
  template:
    metadata:
      labels:
        app: adcloud-api
    spec:
      containers:
        - name: adcloud-api
          image: gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
              protocol: TCP
          resources:
            requests:
              memory: "32Mi"
              cpu: "100m"
            limits:
              memory: "128Mi"
              cpu: "250m"
---
# Service for the above deployment
kind: Service
apiVersion: v1
metadata:
  name: adcloud-api
  annotations:
    cloud.google.com/backend-config: '{"ports": {"80":"adcloud-api-backendconfig"}, "default": "adcloud-api-backendconfig"}'
spec:
  # type: LoadBalancer
  type: NodePort
  selector:
    app: adcloud-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
      nodePort: 32001
---
kind: BackendConfig
apiVersion: cloud.google.com/v1
metadata:
  name: adcloud-api-backendconfig
spec:
  healthCheck:
    # timeoutSec: 10
    # checkIntervalSec: 30
    requestPath: /health
    port: 80
    type: HTTP
---
# Ingress for the above service
kind: Ingress
apiVersion: networking.k8s.io/v1
metadata:
  name: api-ingress
  annotations:
    kubernetes.io/ingress.class: "gce"
    gce.ingress.kubernetes.io/enable-cors: "true"
    gce.ingress.kubernetes.io/cors-allow-credentials: "true"
    gce.ingress.kubernetes.io/cors-allow-methods: "GET, POST, PUT, PATCH, DELETE, OPTIONS"
    gce.ingress.kubernetes.io/cors-allow-origin: "*"
spec:
  rules:
    - http:
        paths:
          - path: /*
            pathType: ImplementationSpecific
            backend:
              service:
                name: adcloud-api
                port:
                  number: 80
```
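One thing I have been wondering about: my understanding is that GKE Ingress can also derive its health check from a readinessProbe on the container; a probe matching the /health path would look roughly like this (a sketch only — this is not currently in my deployment):

```yaml
# Hypothetical readinessProbe on the adcloud-api container, matching the
# BackendConfig's /health path; not part of my current deployment.
readinessProbe:
  httpGet:
    path: /health
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 10
```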
I have a general health check defined on port 80 (request path /health).
The backend services defined for the created load balancer show one healthy, and one unhealthy because its health checks are "timing out".
The unhealthy backend service is targeting nodePort 32001. Is this correct? The app's Dockerfile exposes only port 80, and port 80 is defined everywhere else (i.e. in the health checks). Should the backend service also use port 80, or should it use the nodePort 32001? Is there some internal proxy handling that?
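To test my own understanding (that kube-proxy forwards nodePort traffic on every node to the pod's port 80), I would expect something like this to work from a VM inside the same VPC (a sketch; it grabs the first node's internal IP):

```shell
# From a VM in the same VPC: the health endpoint should answer on the
# nodePort of any node if routing and firewall rules are correct.
NODE_IP=$(kubectl get nodes \
  -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
curl -i "http://${NODE_IP}:32001/health"
```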
The instance group backing this "unhealthy" load balancer backend shows "resource does not exist" for its member VM instances...
However, the GCE VM Instances page shows that these instances do exist.
The output of `kubectl describe ingress` is:

```
$ kubectl describe ingress
Name:             api-ingress
Namespace:        default
Address:          35.227.241.142
Default backend:  default-http-backend:80 (10.0.17.5:8080)
Rules:
  Host        Path  Backends
  ----        ----  --------
  *
              /*    adcloud-api:80 (10.0.17.6:80)
Annotations:  gce.ingress.kubernetes.io/cors-allow-credentials: true
              gce.ingress.kubernetes.io/cors-allow-methods: GET, POST, PUT, PATCH, DELETE, OPTIONS
              gce.ingress.kubernetes.io/cors-allow-origin: *
              gce.ingress.kubernetes.io/enable-cors: true
              ingress.kubernetes.io/backends: {"k8s-be-32001--8950a5f3e6292d2e":"Unknown","k8s-be-32722--8950a5f3e6292d2e":"Unknown"}
              ingress.kubernetes.io/forwarding-rule: k8s2-fr-nto0o9ht-default-api-ingress-vo6louo7
              ingress.kubernetes.io/target-proxy: k8s2-tp-nto0o9ht-default-api-ingress-vo6louo7
              ingress.kubernetes.io/url-map: k8s2-um-nto0o9ht-default-api-ingress-vo6louo7
              kubernetes.io/ingress.class: gce
Events:
  Type    Reason     Age                From                     Message
  ----    ------     ----               ----                     -------
  Normal  Sync       35m                loadbalancer-controller  UrlMap "k8s2-um-nto0o9ht-default-api-ingress-vo6louo7" created
  Normal  Sync       35m                loadbalancer-controller  TargetProxy "k8s2-tp-nto0o9ht-default-api-ingress-vo6louo7" created
  Normal  Sync       34m                loadbalancer-controller  ForwardingRule "k8s2-fr-nto0o9ht-default-api-ingress-vo6louo7" created
  Normal  IPChanged  34m                loadbalancer-controller  IP is now 35.227.241.142
  Normal  Sync       12m (x5 over 35m)  loadbalancer-controller  Scheduled for sync
```
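To dig into the "Unknown" backends, the load balancer's own view of backend health can be queried directly (a sketch; the backend-service name here is taken from the `ingress.kubernetes.io/backends` annotation above):

```shell
# List the backend services the ingress controller created, then ask the
# load balancer for its health view of the suspect one.
gcloud compute backend-services list
gcloud compute backend-services get-health k8s-be-32001--8950a5f3e6292d2e --global
```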
The output of `kubectl describe pods` is:

```
$ kubectl describe pods
Name:         adcloud-api-5cb96bb47d-tmrd8
Namespace:    default
Priority:     0
Node:         gke-adcloud-cluster-default-pool-6f91d4e7-z7ks/10.128.15.216
Start Time:   Wed, 27 Oct 2021 00:51:50 -0400
Labels:       app=adcloud-api
              pod-template-hash=5cb96bb47d
Annotations:  <none>
Status:       Running
IP:           10.0.17.6
IPs:
  IP:  10.0.17.6
Controlled By:  ReplicaSet/adcloud-api-5cb96bb47d
Containers:
  adcloud-api:
    Container ID:   containerd://ebd1b857c541b8fdc52dcc6e44d4617d9558bd7e16783a5a016d5bdd1cce7370
    Image:          gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1
    Image ID:       gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image@sha256:932e59850992e37854242cac9e70ca65eb52863f63c795d892c84671dca4ba68
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 27 Oct 2021 02:19:48 -0400
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     250m
      memory:  128Mi
    Requests:
      cpu:        100m
      memory:     32Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-4xcjx (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-4xcjx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-4xcjx
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason        Age                From             Message
  ----     ------        ----               ----             -------
  Warning  FailedMount   53m                kubelet          MountVolume.SetUp failed for volume "default-token-4xcjx" : failed to sync secret cache: timed out waiting for the condition
  Normal   Pulling       53m                kubelet          Pulling image "gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1"
  Normal   Pulled        52m                kubelet          Successfully pulled image "gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1" in 52.167675207s
  Normal   Created       52m                kubelet          Created container adcloud-api
  Normal   Started       52m                kubelet          Started container adcloud-api
  Normal   Pulling       34m                kubelet          Pulling image "gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1"
  Normal   Pulled        34m                kubelet          Successfully pulled image "gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1" in 29.499249423s
  Normal   Started       34m                kubelet          Started container adcloud-api
  Normal   Created       34m                kubelet          Created container adcloud-api
  Warning  NodeNotReady  16m (x7 over 95m)  node-controller  Node is not ready
  Normal   Pulling       14m                kubelet          Pulling image "gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1"
  Normal   Pulled        13m                kubelet          Successfully pulled image "gcr.io/ad-cloud-328718/adcloud/adcloud-web-api-image:v1" in 27.814932495s
  Normal   Created       13m                kubelet          Created container adcloud-api
  Normal   Started       13m                kubelet          Started container adcloud-api
```
I don't really know what is causing the single backend service to be unhealthy, or why it can't be reached for the health check.
The Ingress likewise reports its backend services as "Unknown" (see the `ingress.kubernetes.io/backends` annotation above).
Are the backends unreachable because the health check is failing, or is the health check failing because they are unreachable? Does anyone have suggestions on what else might be going on here? Do I need any network configuration beyond the manifests above? Should the health checks run on the open app port (i.e. 80), or on the nodePort (i.e. 32001)?
The ingress's external IP returns a 502 because of the unreachable backend.
I have been at it for a few days and would appreciate any help at this point!