I created an Apache Beam pipeline on Python 2.7 that runs on Google Dataflow. The pipeline works well when I deploy it locally from my laptop, and I now want to deploy it via Cloud Build. This is my cloudbuild.yaml file:
steps:
- name: "docker.io/library/python:2.7"
  args: ["pip", "install", "-t", "/workspace/lib", "-r", "requirements.txt"]
- name: "docker.io/library/python:2.7"
  args: ["python2", "tests.py"]
  env: ["PYTHONPATH=/workspace/lib"]
When the build is triggered, the first step installs all the requirements successfully, but as soon as tests.py tries to import apache_beam in the second step, I receive the following error:
File "tests.py", line 3, in <module>
import apache_beam as beam
File "/workspace/lib/apache_beam/__init__.py", line 97, in <module>
from apache_beam import coders
File "/workspace/lib/apache_beam/coders/__init__.py", line 19, in <module>
from apache_beam.coders.coders import *
File "/workspace/lib/apache_beam/coders/coders.py", line 29, in <module>
import google.protobuf.wrappers_pb2
ImportError: No module named google.protobuf.wrappers_pb2
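
The failure can be reproduced without going through Beam's import chain at all. The snippet below is only a hypothetical diagnostic (it is not part of my build): it mimics the second build step by putting /workspace/lib on the path and attempting the same import that coders.py performs:

import sys

# Mimic env: ["PYTHONPATH=/workspace/lib"] from the second build step.
sys.path.insert(0, "/workspace/lib")

# The exact import that apache_beam/coders/coders.py attempts (line 29 in the
# traceback above); running this inside the python:2.7 container after the
# pip install step should raise the same ImportError.
import google.protobuf.wrappers_pb2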
In requirements.txt I have, among others, the following:
apache-beam==2.16.0
protobuf==3.11.0
Note: All the necessary requirements are listed in requirements.txt, since I can deploy the pipeline locally.
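
For completeness, this is roughly what the top of tests.py looks like. The test body below is only a placeholder sketch (the real test logic is longer and omitted here); the relevant part is the apache_beam import that triggers the traceback above:

import unittest

import apache_beam as beam  # the import that fails on Cloud Build
from apache_beam.testing.test_pipeline import TestPipeline


class PipelineSmokeTest(unittest.TestCase):
    # Placeholder: the real assertions are omitted.
    def test_pipeline_builds(self):
        with TestPipeline() as p:
            _ = p | beam.Create(["a", "b", "c"])


if __name__ == "__main__":
    unittest.main()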