1 vote

Recently I've been getting this error when running Dataflow jobs written in Python. The thing is, it used to work and no code has changed, so I'm thinking it has something to do with the environment.

Error syncing pod d557f64660a131e09d2acb9478fad42f (""), skipping: failed to "StartContainer" for "python" with CrashLoopBackOff: "Back-off 20s restarting failed container=python pod=dataflow-)

Can anyone help me with this?

2 comments
What is your SDK version? – Héctor Neri
I'm using Google Cloud Dataflow SDK for Python 2.5.0 – Sunil Karsan

2 Answers

3 votes

In my case, I was using Apache Beam SDK version 2.9.0 and had the same problem.

I used setup.py, and the “install_requires” field was filled dynamically by loading the contents of a requirements.txt file. That is fine if you're using DirectRunner, but DataflowRunner is sensitive to dependencies on local files, so abandoning that technique and hard-coding the dependencies from requirements.txt into “install_requires” solved the issue for me, as in the sketch below.
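A minimal sketch of that change, assuming a typical setuptools-based setup.py (the package name and pinned versions here are placeholders, not the actual dependencies):

    # setup.py -- sketch only; names and versions are placeholders.
    import setuptools

    # Before: install_requires was built at import time from a local file, e.g.:
    #   with open("requirements.txt") as f:
    #       install_requires = f.read().splitlines()
    # DataflowRunner re-runs setup.py on the workers, where that local
    # requirements.txt may not be present.

    # After: hard-code the pinned dependencies directly.
    setuptools.setup(
        name="my-dataflow-pipeline",        # placeholder
        version="0.0.1",
        packages=setuptools.find_packages(),
        install_requires=[
            "apache-beam[gcp]==2.9.0",      # SDK version per this answer; adjust to yours
            # "some-other-dependency==1.2.3",  # placeholder for further pins
        ],
    )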

If you're stuck on this, try to investigate your dependencies and minimize them as much as you can. Please refer to the Managing Python Pipeline Dependencies documentation topic for help. Avoid complex or nested code structures and dependencies on the local filesystem.
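For context, here is a rough sketch of how a pipeline usually points the runner at those dependencies, via Beam's setup_file or requirements_file pipeline options (the project, region, and bucket below are placeholders):

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions

    # Placeholders: substitute your own project, region, and bucket.
    options = PipelineOptions(
        runner="DataflowRunner",
        project="my-gcp-project",
        region="us-central1",
        temp_location="gs://my-bucket/temp",
    )

    # Point the runner at setup.py so install_requires is installed on workers;
    # alternatively, stage a plain requirements.txt with requirements_file.
    options.view_as(SetupOptions).setup_file = "./setup.py"
    # options.view_as(SetupOptions).requirements_file = "requirements.txt"

    # Trivial pipeline just to make the snippet runnable end to end.
    with beam.Pipeline(options=options) as p:
        _ = p | beam.Create(["hello"]) | beam.Map(print)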

1 vote

Neri, thanks for the pointer about the SDK version. I noticed that my requirements file was pinning an older version of the SDK, 2.4.0. I've now changed everything to 2.6.0 and the job is no longer stuck.
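For what it's worth, the fix amounts to bumping the pinned version in the requirements file, roughly as below (assuming the google-cloud-dataflow distribution of the SDK; use the apache-beam[gcp] package name instead if that's what your project depends on):

    # requirements.txt -- sketch; pick the package name that matches your project
    # before:
    # google-cloud-dataflow==2.4.0
    # after:
    google-cloud-dataflow==2.6.0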