Our application runs on Google Kubernetes Engine and pulls messages from a Google Cloud Pub/Sub subscription. One pod runs while idle, and horizontal pod autoscaling is configured to scale up to 10 pods based on CPU usage. The subscription is mostly empty; when a batch job kicks in, it writes to the Pub/Sub topic. Autoscaling works well: it scales up to 10 pods almost immediately (within 30 seconds) once there are outstanding messages in the subscription. The issue is that only a few pods pull messages from the subscription, while the rest sit idle even though messages remain in the subscription.
Pub/Sub Client settings are:
MaxExtension: 600
MaxOutstandingMessages: 100 (also tried with 25)
Synchronous: true (also tried with false)
Google Cloud Pub/Sub Subscription Settings:
Pull-based
Ack Deadline is 600 seconds
Once the batch job kicks in, it writes 20k messages to the Pub/Sub topic. The application can process 2 messages/sec on average.
The application is written in Go, and we're using version v0.44.1 of the cloud.google.com/go package.
Do you know why the pods are sitting and not pulling messages even though there's a backlog in the Cloud Pub/Sub subscription?
– code muncher