I have a Cloud Dataflow job that's stuck in the initiation phase, before running any application logic. I tested this by adding a log output statement inside the processElement
step, but it's not appearing in the logs, so it seems that step is never reached.
All I can see in the logs is the following message, which appears every minute:
logger: Starting supervisor: /etc/supervisor/supervisord_watcher.sh: line 36: /proc//oom_score_adj: Permission denied
And these, which loop every few seconds:
VM is healthy? true.
http: TLS handshake error from 172.17.0.1:38335: EOF
Job is in state JOB_STATE_RUNNING, will check again in 30 seconds.
The job ID is 2015-09-14_06_30_22-15275884222662398973, though I have two additional jobs (2015-09-14_05_59_30-11021392791304643671 and 2015-09-14_06_08_41-3621035073455045662) that I started this morning and which have the same problem.
Any ideas on what might be causing this?
Comments:

… output()? If the data is coming in from the input to the DoFn this shouldn't be a problem (since it happens on the worker, after construction of the job). Or is the data coming from a field in a DoFn or somehow being serialized to the worker some other way? – Ben Chambers

… output() about 300 times. – Darren Olivier

… DoFn, but within the DoFn there's a for-each loop that runs through a simple array data structure of +-300 elements and outputs a result for most of them. So the DoFn itself outputs +-300 TableRow instances. – Darren Olivier