There is an issue with my company's Pub/Sub. Some of our messages are stuck and the oldest unacked message age is increasing over time.

1-day charts:

[chart image] [chart image]

When I open Metrics Explorer and select "Expired ack deadlines count", this is the one-week chart:

[chart image]

I decided to find out why these messages are stuck, but when I ran the pull command (below), I got a "Listed 0 items" response, so I cannot inspect them.

Is there a way to figure out why some of the messages are shown as unacknowledged?

Also, the Unacked message count has shown the same number of messages (around 2k) for the whole month, even though new messages are published every day.

Here are the parameters we use for this subscription: [image of subscription settings]

I tried to fix this error by setting the deadline to 600 seconds, but it didn't help.

Additionally, we use the Node.js Pub/Sub client library to handle the messages.

You say "this" subscription. Are there other subscriptions for the topic? Messages may be being held pending another subscription's pulls. - DazWilkin
@DazWilkin thanks for the comment. There is only one subscription for the topic, the one that I described. - Michal Moravik
Then I recommend you contact Cloud Support and have an engineer investigate - DazWilkin
Alright, thanks for your time - Michal Moravik
@DazWilkin The existence of multiple subscriptions does not result in messages being held pending delivery to another subscription. This can happen if there are multiple subscribers on the same subscription. In other words, if some messages are already outstanding to a subscriber, those messages are not eligible for redelivery until the ack deadline has passed. - Kamal Aboul-Hosn

1 Answer

The most common causes of messages not being able to be pulled are:

  1. The subscriber client already received the messages and "forgot" about them, perhaps due to an exception being thrown and not handled. In this case, the message will continue to be leased by the client until the deadline passes. The client libraries all extend the lease automatically until the maxExtension time is reached. If these are messages that are always forgotten, then it could be that they are redelivered to the subscriber and forgotten again, resulting in them not being pullable via the gcloud command-line tool or UI.
  2. There could be a rogue subscriber. It could be that another subscriber is running somewhere for the same subscription and is "stealing" these messages. Sometimes this can be a test job or something that was used early on to see if the subscription works as expected and wasn't turned down.
  3. You could be falling into the case of a large backlog of small messages. This should be fixed in more recent versions of the client library (v2.3.0 of the Node client has the fix).
  4. The gcloud pubsub subscriptions pull command and the UI are not guaranteed to return messages, even if some are available to pull. Sometimes rerunning the command several times in quick succession helps to pull messages.
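
To illustrate point 4, here is a minimal sketch of repeating a pull a few times before concluding that nothing is available. The names pullUntilNonEmpty and pullOnce are hypothetical and not part of any library; pullOnce stands for whatever single pull you already have (for example, a wrapper around a synchronous pull request) and is assumed to resolve to an array of messages.

```javascript
// Sketch only: pullUntilNonEmpty and pullOnce are hypothetical names.
// pullOnce is assumed to be an async function that performs ONE pull
// and resolves to an array of messages (possibly empty).
async function pullUntilNonEmpty(pullOnce, maxAttempts = 10) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const messages = await pullOnce();
    if (messages.length > 0) {
      // Stop at the first pull that returns something.
      return { attempt, messages };
    }
  }
  // Every attempt came back empty.
  return { attempt: maxAttempts, messages: [] };
}
```

An empty result after several attempts is still not proof that the backlog is empty; it only makes an unlucky single pull less likely to be the explanation.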

The fact that you see expired ack deadlines likely points to 1, 2, or 3, so it is worth checking for those things. Otherwise, you should open a support case so the engineers can look more specifically at the backlog and determine where the messages are.
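
One way to check for cause 1 is to make sure the message handler can never "forget" a message. The sketch below is an assumption about how your handler might be structured, not your actual code: wrapHandler and processMessage are hypothetical names, and the message object is assumed to expose ack() and nack(), as messages delivered by the Node.js client do.

```javascript
// Sketch only: wrapHandler and processMessage are hypothetical names.
// `message` is assumed to expose ack(), nack() and an id, as messages
// delivered by the Node.js Pub/Sub client do.
function wrapHandler(processMessage, log = console.error) {
  return async function onMessage(message) {
    try {
      await processMessage(message);
      message.ack(); // processed: remove it from the backlog
    } catch (err) {
      // Log instead of letting the exception escape unhandled, then
      // release the lease so the message does not stay leased silently.
      log(`Processing failed for message ${message.id}:`, err);
      message.nack();
    }
  };
}

// Usage (assumes an existing `subscription` object from @google-cloud/pubsub):
// subscription.on('message', wrapHandler(async (m) => { /* your work */ }));
// subscription.on('error', (err) => console.error('Subscriber error:', err));
```

With this shape, a message that always fails keeps being redelivered, but every failure shows up in the logs, which makes it much easier to identify the roughly 2k messages that never get acknowledged.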