I'm using Cloud Dataflow to import data from Pub/Sub messages into BigQuery tables. Since the messages can be routed to different tables, I'm using DynamicDestinations.
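For context, here is a stripped-down sketch of how the write step is wired up. The subscription, project, dataset, field names, and schema below are placeholders, and the real job decodes Avro payloads rather than mapping strings straight to TableRows, but the DynamicDestinations / streaming-insert setup has the same shape:

```java
import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import java.util.Arrays;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.DynamicDestinations;
import org.apache.beam.sdk.io.gcp.bigquery.TableDestination;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptor;
import org.apache.beam.sdk.values.ValueInSingleWindow;

public class PubSubToBigQuery {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Read messages from Pub/Sub; in the real job these are Avro payloads that
    // get decoded, here they are mapped straight to TableRows to keep the sketch short.
    PCollection<TableRow> rows =
        p.apply("Read from Pub/Sub",
                PubsubIO.readStrings().fromSubscription("projects/my-project/subscriptions/my-sub"))
         .apply("To TableRow",
                MapElements.into(TypeDescriptor.of(TableRow.class))
                    .via(json -> new TableRow().set("event_type", "placeholder").set("payload", json)));

    rows.apply("Write Avros to BigQuery Table",
        BigQueryIO.<TableRow>write()
            .to(new DynamicDestinations<TableRow, String>() {
              @Override
              public String getDestination(ValueInSingleWindow<TableRow> element) {
                // Route each element by a field in the row ("event_type" is a placeholder name).
                return (String) element.getValue().get("event_type");
              }

              @Override
              public TableDestination getTable(String destination) {
                return new TableDestination(
                    "my-project:my_dataset." + destination, "Events of type " + destination);
              }

              @Override
              public TableSchema getSchema(String destination) {
                return new TableSchema().setFields(Arrays.asList(
                    new TableFieldSchema().setName("event_type").setType("STRING"),
                    new TableFieldSchema().setName("payload").setType("STRING")));
              }
            })
            .withFormatFunction(row -> row)
            .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
            .withWriteDisposition(WriteDisposition.WRITE_APPEND));

    p.run();
  }
}
```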
I've recently noticed that the job started consuming all of its resources, and messages stating that the process is stuck began to appear:
Processing stuck in step Write Avros to BigQuery Table/StreamingInserts/StreamingWriteTables/StreamingWrite for at least 26h45m00s without outputting or completing in state finish
  at sun.misc.Unsafe.park(Native Method)
  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
  at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
  at java.util.concurrent.FutureTask.get(FutureTask.java:191)
  at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:765)
  at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:829)
  at org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:131)
  at org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:103)
  at org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn$DoFnInvoker.invokeFinishBundle(Unknown Source)
Currently, cancelling the pipeline and restarting it temporarily resolves the problem, but I can't pinpoint why the process is getting stuck.
The pipeline uses beam-runners-google-cloud-dataflow-java version 2.8.0 and google-cloud-bigquery version 1.56.0.