1 vote

I have a Dataflow job which I am trying to 'drain'. The explanation of the drain option says:

Dataflow will cease all data ingestion, but will attempt to finish processing any remaining buffered data. Pipeline resources will be maintained until buffered data has finished processing and any pending output has finished writing.

But data ingestion does not seem to stop: the "Elements added" count is still increasing, and the job hasn't stopped for over an hour now. Is this expected behavior? I am using a Pub/Sub source, if that helps.

EDIT: Here is the job ID - 2017-10-30_19_59_30-14251132252018661885

1
That potentially sounds like a bug. Please include a Dataflow job ID so a Dataflow engineer can help debug this. - jkff
@jkff I have added the job ID to the question. :-) - Kakaji
Thanks. Is your job stuck in a loop of retrying some work that keeps failing? In that case drain won't work, you'll need to update the pipeline with non-failing code first, or cancel it. - jkff
@jkff I am deliberately failing my job to see if I can obtain the failed data again after I fix the job. It was suggested that drain is the way to go (stackoverflow.com/questions/46721532/…). There is no loop inside the code; it's a simple JSON-decoding pipeline. As I mentioned, "Elements added" keeps increasing in the very first PubsubIO.Read step even after I press drain. That step does not contain any code I wrote; it's a simple PubsubIO.readStrings().fromSubscription() call. Thanks! - Kakaji
Does the Dataflow UI show any errors or exceptions in the logs? The retry loop is in Dataflow: streaming runner treats all errors as transient and retries them forever to avoid discarding data. - jkff

1 Answer

1 vote

As mentioned in the comments by @jkff, a failing job cannot be drained: the streaming runner treats all errors as transient and retries the failing work indefinitely, so the buffered data never finishes processing. The correct way to handle a failing Dataflow job is to fix the code and update the running job with the --update option. This prevents any data loss.
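For an Apache Beam Java pipeline launched through Maven, the fix-and-update cycle looks roughly like this. This is a sketch, not the asker's exact command: com.example.MyPipeline, my-gcp-project, and my-streaming-job are placeholder names; --jobName must match the name of the job that is currently running.

```shell
# After fixing the failing code, relaunch the SAME pipeline with --update.
# The job name must match the running job; Dataflow then swaps in the new
# code in place without discarding buffered data.
# (Main class, project, and job name below are placeholders.)
mvn compile exec:java \
  -Dexec.mainClass=com.example.MyPipeline \
  -Dexec.args="--runner=DataflowRunner \
               --project=my-gcp-project \
               --region=us-central1 \
               --jobName=my-streaming-job \
               --update"
```

If the fix changes the pipeline's transform names or structure, Dataflow may reject the update unless you also supply a --transformNameMapping relating old transform names to new ones.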