Summary: Some local packages work and some don't.
My Beam application's structure:
-setup.py
-app/__init__.py
-app/main.py
-package1/__init__.py
-package1/one.py
-package2/__init__.py
-package2/two.py
-package3/__init__.py
-package3/three.py
In main.py:
from package1 import one
from package2 import two
from package3 import three
In setup.py:
import setuptools

setuptools.setup(
    name='beam',
    version='1.0',
    install_requires=[
        'apache-beam[gcp]',
        'google-cloud==0.34.0',
        'google-cloud-bigquery==0.25.0',
        'requests==2.19.1',
        'google-cloud-storage==1.12.0',
    ],
    packages=setuptools.find_packages(),
)
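As a sanity check on the packaging side, here is a minimal sketch that lists the directories under the project root containing an __init__.py file. It only approximates what setuptools.find_packages() discovers and uses the standard library alone; the helper name list_packages is hypothetical, not part of setuptools or Beam.

```python
import os


def list_packages(root="."):
    """Return dotted names of directories under root that contain __init__.py.

    Hypothetical helper: a rough stand-in for setuptools.find_packages().
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        if rel == ".":
            continue  # the project root itself is not a package
        if "__init__.py" not in filenames:
            dirnames[:] = []  # do not descend: nothing below can be a package
            continue
        found.append(rel.replace(os.sep, "."))
    return sorted(found)


if __name__ == "__main__":
    # Run from the directory that holds setup.py.
    print(list_packages("."))
```

If package3 is missing from the printed list while package1 and package2 appear, the difference is probably in the directory layout rather than in the module contents.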
When running with python -m app.main:
With DirectRunner (running locally), there is no problem.
With DataflowRunner (submitting to Google Dataflow), I get this error:
apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 642, in do_work
    work_executor.execute()
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 156, in execute
    op.start()
  File "apache_beam/runners/worker/operations.py", line 344, in apache_beam.runners.worker.operations.DoOperation.start
    def start(self):
  File "apache_beam/runners/worker/operations.py", line 345, in apache_beam.runners.worker.operations.DoOperation.start
    with self.scoped_start_state:
  File "apache_beam/runners/worker/operations.py", line 350, in apache_beam.runners.worker.operations.DoOperation.start
    pickler.loads(self.spec.serialized_fn))
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 244, in loads
    return dill.loads(s)
  File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 316, in loads
    return load(file, ignore)
  File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 304, in load
    obj = pik.load()
  File "/usr/lib/python2.7/pickle.py", line 864, in load
    dispatch[key](self)
  File "/usr/lib/python2.7/pickle.py", line 1096, in load_global
    klass = self.find_class(module, name)
  File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 465, in find_class
    return StockUnpickler.find_class(self, module, name)
  File "/usr/lib/python2.7/pickle.py", line 1130, in find_class
    __import__(module)
ImportError: No module named three
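The traceback ends inside the unpickler's find_class, which imports a module by name. The following minimal sketch reproduces that kind of failure locally. It assumes the standard pickle module behaves like dill for a module-level function (the traceback shows dill delegating to the stock unpickler), and the module name three_demo and its transform function are made up for illustration.

```python
import importlib
import os
import pickle
import shutil
import sys
import tempfile


def unpickle_without_module():
    """Pickle a function, hide its module, and return the resulting error text."""
    tmp = tempfile.mkdtemp()
    try:
        # Hypothetical stand-in for package3/three.py.
        with open(os.path.join(tmp, "three_demo.py"), "w") as f:
            f.write("def transform(x):\n    return x + 1\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        module = importlib.import_module("three_demo")

        # The payload stores only the module and function names, not the code.
        payload = pickle.dumps(module.transform)

        # Simulate a worker on which the module was never installed.
        sys.path.remove(tmp)
        del sys.modules["three_demo"]
        importlib.invalidate_caches()

        try:
            pickle.loads(payload)
        except ImportError as exc:
            return str(exc)
        return None
    finally:
        shutil.rmtree(tmp, ignore_errors=True)


if __name__ == "__main__":
    print(unpickle_without_module())
```

The point of the sketch is that the serialized function carries only a reference to its module, so unpickling succeeds only where that module can be imported under the same name.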
This is "a bit" frustrating because I have double- and triple-checked for any difference between those packages, and they are the same: the same __init__.py file (empty, with no weird or hidden characters) and the same type of structure in the *.py files. But for some reason, package3 just doesn't want to cooperate.
Does anyone have a solution for this problem?
Thank you.