I'm trying to apply gRPC load balancing with Ingress on GCP, and for this I referenced this example. The example shows gRPC load balancing working in two ways (one with an Envoy sidecar, the other with an HTTP mux handling both gRPC and the HTTP health check in the same Pod). However, the Envoy proxy example doesn't work for me.
What confuses me is that the Pods are running and report healthy; I confirmed this with kubectl describe and kubectl logs.
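The log check is per container; for reference, a minimal version of what I ran (container names are from the Deployment below):

$ kubectl logs fe-deployment-757ffcbd57-4w446 -c fe-envoy
$ kubectl logs fe-deployment-757ffcbd57-4w446 -c fe-container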
$ kubectl get pods
NAME                             READY   STATUS    RESTARTS   AGE
fe-deployment-757ffcbd57-4w446   2/2     Running   0          4m22s
fe-deployment-757ffcbd57-xrrm9   2/2     Running   0          4m22s
$ kubectl describe pod fe-deployment-757ffcbd57-4w446
Name:               fe-deployment-757ffcbd57-4w446
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc/10.128.0.64
Start Time:         Thu, 26 Sep 2019 16:15:18 +0900
Labels:             app=fe
                    pod-template-hash=757ffcbd57
Annotations:        kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container fe-envoy; cpu request for container fe-container
Status:             Running
IP:                 10.56.1.29
Controlled By:      ReplicaSet/fe-deployment-757ffcbd57
Containers:
  fe-envoy:
    Container ID:   docker://b4789909494f7eeb8d3af66cb59168e009c582d412d8ca683a7f435559989421
    Image:          envoyproxy/envoy:latest
    Image ID:       docker-pullable://envoyproxy/envoy@sha256:9ef9c4fd6189fdb903929dc5aa0492a51d6783777de65e567382ac7d9a28106b
    Port:           8080/TCP
    Host Port:      0/TCP
    Command:
      /usr/local/bin/envoy
    Args:
      -c
      /data/config/envoy.yaml
    State:          Running
      Started:      Thu, 26 Sep 2019 16:15:19 +0900
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
    Liveness:     http-get https://:fe/_ah/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:    http-get https://:fe/_ah/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /data/certs from certs-volume (rw)
      /data/config from envoy-config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-c7nqc (ro)
  fe-container:
    Container ID:   docker://a533224d3ea8b5e4d5e268a616d73762b37df69f434342459f35caa8fac32dab
    Image:          salrashid123/grpc_only_backend
    Image ID:       docker-pullable://salrashid123/grpc_only_backend@sha256:ebfac594116445dd67aff7c9e7a619d73222b60947e46ef65ee6d918db3e1f4b
    Port:           50051/TCP
    Host Port:      0/TCP
    Command:
      /grpc_server
    Args:
      --grpcport
      :50051
      --insecure
    State:          Running
      Started:      Thu, 26 Sep 2019 16:15:20 +0900
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-c7nqc (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  certs-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  fe-secret
    Optional:    false
  envoy-config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      envoy-configmap
    Optional:  false
  default-token-c7nqc:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-c7nqc
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From                                                          Message
  ----     ------     ----                   ----                                                          -------
  Normal   Scheduled  4m25s                  default-scheduler                                             Successfully assigned default/fe-deployment-757ffcbd57-4w446 to gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc
  Normal   Pulled     4m25s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Container image "envoyproxy/envoy:latest" already present on machine
  Normal   Created    4m24s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Created container
  Normal   Started    4m24s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Started container
  Normal   Pulling    4m24s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  pulling image "salrashid123/grpc_only_backend"
  Normal   Pulled     4m24s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Successfully pulled image "salrashid123/grpc_only_backend"
  Normal   Created    4m24s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Created container
  Normal   Started    4m23s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Started container
  Warning  Unhealthy  4m10s (x2 over 4m20s)  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Readiness probe failed: HTTP probe failed with statuscode: 503
  Warning  Unhealthy  4m9s (x2 over 4m19s)   kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-l7vc  Liveness probe failed: HTTP probe failed with statuscode: 503
$ kubectl describe pod fe-deployment-757ffcbd57-xrrm9
Name:               fe-deployment-757ffcbd57-xrrm9
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9/10.128.0.22
Start Time:         Thu, 26 Sep 2019 16:15:18 +0900
Labels:             app=fe
                    pod-template-hash=757ffcbd57
Annotations:        kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container fe-envoy; cpu request for container fe-container
Status:             Running
IP:                 10.56.0.23
Controlled By:      ReplicaSet/fe-deployment-757ffcbd57
Containers:
  fe-envoy:
    Container ID:   docker://255dd6cab1e681e30ccfe158f7d72540576788dbf6be60b703982a7ecbb310b1
    Image:          envoyproxy/envoy:latest
    Image ID:       docker-pullable://envoyproxy/envoy@sha256:9ef9c4fd6189fdb903929dc5aa0492a51d6783777de65e567382ac7d9a28106b
    Port:           8080/TCP
    Host Port:      0/TCP
    Command:
      /usr/local/bin/envoy
    Args:
      -c
      /data/config/envoy.yaml
    State:          Running
      Started:      Thu, 26 Sep 2019 16:15:19 +0900
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
    Liveness:     http-get https://:fe/_ah/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:    http-get https://:fe/_ah/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /data/certs from certs-volume (rw)
      /data/config from envoy-config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-c7nqc (ro)
  fe-container:
    Container ID:   docker://f6a0246129cc89da846c473daaa1c1770d2b5419b6015098b0d4f35782b0a9da
    Image:          salrashid123/grpc_only_backend
    Image ID:       docker-pullable://salrashid123/grpc_only_backend@sha256:ebfac594116445dd67aff7c9e7a619d73222b60947e46ef65ee6d918db3e1f4b
    Port:           50051/TCP
    Host Port:      0/TCP
    Command:
      /grpc_server
    Args:
      --grpcport
      :50051
      --insecure
    State:          Running
      Started:      Thu, 26 Sep 2019 16:15:20 +0900
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-c7nqc (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  certs-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  fe-secret
    Optional:    false
  envoy-config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      envoy-configmap
    Optional:  false
  default-token-c7nqc:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-c7nqc
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                   From                                                          Message
  ----     ------     ----                  ----                                                          -------
  Normal   Scheduled  5m8s                  default-scheduler                                             Successfully assigned default/fe-deployment-757ffcbd57-xrrm9 to gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9
  Normal   Pulled     5m8s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Container image "envoyproxy/envoy:latest" already present on machine
  Normal   Created    5m7s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Created container
  Normal   Started    5m7s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Started container
  Normal   Pulling    5m7s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  pulling image "salrashid123/grpc_only_backend"
  Normal   Pulled     5m7s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Successfully pulled image "salrashid123/grpc_only_backend"
  Normal   Created    5m7s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Created container
  Normal   Started    5m6s                  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Started container
  Warning  Unhealthy  4m53s (x2 over 5m3s)  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Readiness probe failed: HTTP probe failed with statuscode: 503
  Warning  Unhealthy  4m52s (x2 over 5m2s)  kubelet, gke-ingress-grpc-loadbal-default-pool-92d3aed5-52l9  Liveness probe failed: HTTP probe failed with statuscode: 503
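The Unhealthy events above only appear right after startup; both Pods end up Ready. To reproduce what the kubelet probe sees, you can call the Envoy health endpoint directly; a minimal check, assuming a self-signed cert (hence curl -k):

$ kubectl port-forward pod/fe-deployment-757ffcbd57-4w446 8080:8080
# in a second terminal:
$ curl -vk https://localhost:8080/_ah/health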
$ kubectl get services
NAME             TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)           AGE
fe-srv-ingress   NodePort       10.123.5.165   <none>         8080:30816/TCP    6m43s
fe-srv-lb        LoadBalancer   10.123.15.36   35.224.69.60   50051:30592/TCP   6m42s
kubernetes       ClusterIP      10.123.0.1     <none>         443/TCP           2d2h
$ kubectl describe service fe-srv-ingress
Name:                     fe-srv-ingress
Namespace:                default
Labels:                   type=fe-srv
Annotations:              cloud.google.com/neg: {"ingress": true}
                          cloud.google.com/neg-status:
                            {"network_endpoint_groups":{"8080":"k8s1-963b7b91-default-fe-srv-ingress-8080-e459b0d2"},"zones":["us-central1-a"]}
                          kubectl.kubernetes.io/last-applied-configuration:
                            {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"cloud.google.com/neg":"{\"ingress\": true}","service.alpha.kubernetes.io/a...
                          service.alpha.kubernetes.io/app-protocols: {"fe":"HTTP2"}
Selector:                 app=fe
Type:                     NodePort
IP:                       10.123.5.165
Port:                     fe  8080/TCP
TargetPort:               8080/TCP
NodePort:                 fe  30816/TCP
Endpoints:                10.56.0.23:8080,10.56.1.29:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type    Reason  Age    From            Message
  ----    ------  ----   ----            -------
  Normal  Create  6m47s  neg-controller  Created NEG "k8s1-963b7b91-default-fe-srv-ingress-8080-e459b0d2" for default/fe-srv-ingress-8080/8080 in "us-central1-a".
  Normal  Attach  6m40s  neg-controller  Attach 2 network endpoint(s) (NEG "k8s1-963b7b91-default-fe-srv-ingress-8080-e459b0d2" in zone "us-central1-a")
However, the NEG reports the endpoints as unhealthy (so the Ingress also reports the backend as unhealthy). I couldn't find what's causing this. Does anyone know how to solve it?
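The unhealthy state can also be inspected on the GCP side; a sketch using the NEG name and zone from the neg-status annotation above (the backend service name is a placeholder here, look it up with the list command first):

$ gcloud compute network-endpoint-groups list-network-endpoints \
    k8s1-963b7b91-default-fe-srv-ingress-8080-e459b0d2 --zone us-central1-a
$ gcloud compute backend-services list
$ gcloud compute backend-services get-health <ingress-backend-service> --global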
Test environment:
- GKE, 1.13.7-gke.8 (VPC enabled)
- Default HTTP(S) load balancer on Ingress
The YAML files I used (the same as in the example mentioned above):
envoy-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: envoy-configmap
  labels:
    app: fe
data:
  config: |-
    ---
    admin:
      access_log_path: /dev/null
      address:
        socket_address:
          address: 127.0.0.1
          port_value: 9000
    node:
      cluster: service_greeter
      id: test-id
    static_resources:
      listeners:
      - name: listener_0
        address:
          socket_address: { address: 0.0.0.0, port_value: 8080 }
        filter_chains:
        - filters:
          - name: envoy.http_connection_manager
            config:
              stat_prefix: ingress_http
              codec_type: AUTO
              route_config:
                name: local_route
                virtual_hosts:
                - name: local_service
                  domains: ["*"]
                  routes:
                  - match:
                      path: "/echo.EchoServer/SayHello"
                    route: { cluster: local_grpc_endpoint }
              http_filters:
              - name: envoy.lua
                config:
                  inline_code: |
                    package.path = "/etc/envoy/lua/?.lua;/usr/share/lua/5.1/nginx/?.lua;/etc/envoy/lua/" .. package.path
                    function envoy_on_request(request_handle)
                      if request_handle:headers():get(":path") == "/_ah/health" then
                        local headers, body = request_handle:httpCall(
                          "local_admin",
                          {
                            [":method"] = "GET",
                            [":path"] = "/clusters",
                            [":authority"] = "local_admin"
                          },"", 50)
                        str = "local_grpc_endpoint::127.0.0.1:50051::health_flags::healthy"
                        if string.match(body, str) then
                          request_handle:respond({[":status"] = "200"},"ok")
                        else
                          request_handle:logWarn("Envoy healthcheck failed")
                          request_handle:respond({[":status"] = "503"},"unavailable")
                        end
                      end
                    end
              - name: envoy.router
                typed_config: {}
          tls_context:
            common_tls_context:
              tls_certificates:
              - certificate_chain:
                  filename: "/data/certs/tls.crt"
                private_key:
                  filename: "/data/certs/tls.key"
      clusters:
      - name: local_grpc_endpoint
        connect_timeout: 0.05s
        type: STATIC
        http2_protocol_options: {}
        lb_policy: ROUND_ROBIN
        common_lb_config:
          healthy_panic_threshold:
            value: 50.0
        health_checks:
        - timeout: 1s
          interval: 5s
          interval_jitter: 1s
          no_traffic_interval: 5s
          unhealthy_threshold: 1
          healthy_threshold: 3
          grpc_health_check:
            service_name: "echo.EchoServer"
            authority: "server.domain.com"
        hosts:
        - socket_address:
            address: 127.0.0.1
            port_value: 50051
      - name: local_admin
        connect_timeout: 0.05s
        type: STATIC
        lb_policy: ROUND_ROBIN
        hosts:
        - socket_address:
            address: 127.0.0.1
            port_value: 9000
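The Lua filter above implements the /_ah/health endpoint by fetching Envoy's admin /clusters page and searching for the literal string local_grpc_endpoint::127.0.0.1:50051::health_flags::healthy. Since the admin interface only listens on 127.0.0.1:9000, you can check what that page actually contains via port-forward (a sketch; run curl in a second terminal):

$ kubectl port-forward pod/fe-deployment-757ffcbd57-4w446 9000:9000
$ curl -s http://localhost:9000/clusters | grep local_grpc_endpoint | grep health_flags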
fe-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: fe-deployment
  labels:
    app: fe
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: fe
    spec:
      containers:
      - name: fe-envoy
        image: envoyproxy/envoy:latest
        imagePullPolicy: IfNotPresent
        livenessProbe:
          httpGet:
            path: /_ah/health
            scheme: HTTPS
            port: fe
        readinessProbe:
          httpGet:
            path: /_ah/health
            scheme: HTTPS
            port: fe
        ports:
        - name: fe
          containerPort: 8080
          protocol: TCP
        command: ["/usr/local/bin/envoy"]
        args: ["-c", "/data/config/envoy.yaml"]
        volumeMounts:
        - name: certs-volume
          mountPath: /data/certs
        - name: envoy-config-volume
          mountPath: /data/config
      - name: fe-container
        # Runs a gRPC server (secure or insecure) on the port given by --grpcport
        # (:50051). Port 50051 is also EXPOSEd in the Dockerfile.
        image: salrashid123/grpc_only_backend
        imagePullPolicy: Always
        ports:
        - containerPort: 50051
          protocol: TCP
        command: ["/grpc_server"]
        args: ["--grpcport", ":50051", "--insecure"]
      volumes:
      - name: certs-volume
        secret:
          secretName: fe-secret
      - name: envoy-config-volume
        configMap:
          name: envoy-configmap
          items:
          - key: config
            path: envoy.yaml
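Note that when the in-pod gRPC health check fails, the Lua filter logs "Envoy healthcheck failed" before returning 503, so the fe-envoy container logs show whether the probe failures come from that branch:

$ kubectl logs fe-deployment-757ffcbd57-4w446 -c fe-envoy | grep -i healthcheck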
fe-srv-ingress.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: fe-srv-ingress
  labels:
    type: fe-srv
  annotations:
    service.alpha.kubernetes.io/app-protocols: '{"fe":"HTTP2"}'
    cloud.google.com/neg: '{"ingress": true}'
spec:
  type: NodePort
  ports:
  - name: fe
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: fe
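The neg-status annotation seen in the Service description earlier is added by the NEG controller after it picks up this Service; to pull just that annotation (jsonpath, with the dots in the annotation key escaped):

$ kubectl get service fe-srv-ingress -o jsonpath='{.metadata.annotations.cloud\.google\.com/neg-status}'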
fe-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: fe-ingress
  annotations:
    kubernetes.io/ingress.allow-http: "false"
spec:
  tls:
  - hosts:
    - server.domain.com
    secretName: fe-secret
  rules:
  - host: server.domain.com
    http:
      paths:
      - path: /echo.EchoServer/*
        backend:
          serviceName: fe-srv-ingress
          servicePort: 8080
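For completeness, on GKE the Ingress itself also reports per-backend health once the load balancer is provisioned (this is where I see the backend marked unhealthy); the state shows up in the ingress.kubernetes.io/backends annotation:

$ kubectl describe ingress fe-ingress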