I am trying to test my Dataflow pipeline on the DataflowRunner. The job always gets stuck at around 1 hr 1 min with the message "The Dataflow appears to be stuck." Digging through the Stackdriver logs for the job, I found this error: Failed to install packages: failed to install workflow: exit status 1. Other Stack Overflow posts suggest this can happen when the pinned pip packages are not compatible with each other. Whatever the cause, the worker startup always fails.
This is my current setup.py. Can someone please help me understand what I am missing? The job ID is 2018-02-09_08_22_34-6196858167817670597.
setup.py
from setuptools import setup, find_packages
requires = [
'numpy==1.14.0',
'google-cloud-storage==1.7.0',
'pandas==0.22.0',
'sqlalchemy-vertica[pyodbc,turbodbc,vertica-python]==0.2.5',
'sqlalchemy==1.2.2',
'apache_beam[gcp]==2.2.0',
'google-cloud-dataflow==2.2.0'
]
setup(
name="dataflow_pipeline_dependencies",
version="1.0.0",
description="Beam pipeline for flattening ism data",
packages=find_packages(),
install_requires=requires
)
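In case it helps with the compatibility question, below is a minimal local check, not something from the job itself: it just asks pip to install the package above into the current environment and reports the exit status, roughly mirroring what the worker does at startup. It assumes setup.py is in the current working directory and that the local interpreter matches the Python version the Dataflow workers use with Beam 2.2.0.

import subprocess
import sys

# Try installing this package (and all of its pinned requirements) into the
# current environment and capture pip's exit status.
exit_code = subprocess.call([sys.executable, "-m", "pip", "install", "."])

if exit_code != 0:
    print("pip could not install the pinned requirements (exit status %d)" % exit_code)
else:
    print("pip resolved and installed all pinned requirements locally")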