Apache Beam Python SDK upgrade to 2.11.0 issue.
I am upgrading the SDK from 2.4.0 to 2.11.0 using requirements.txt, which lists the following dependencies:
apache_beam==2.11.0
google-cloud-dataflow==2.4.0
httplib2==0.11.3
google-cloud==0.27.0
google-cloud-storage==1.3.0
workflow
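One likely culprit (my assumption, not stated anywhere in the logs below): google-cloud-dataflow==2.4.0 historically pinned apache-beam[gcp]==2.4.0, which cannot be satisfied alongside apache_beam==2.11.0, so pip on the workers would fail during startup. A minimal sketch to surface such a conflict locally, assuming the pinned packages are installed in the current environment:

    # A local check (a sketch, not from the original post). pkg_resources
    # raises VersionConflict when installed pins disagree, e.g. if
    # google-cloud-dataflow==2.4.0 pins apache-beam==2.4.0 while
    # apache_beam==2.11.0 is also requested.
    import pkg_resources

    try:
        pkg_resources.require(
            ['apache-beam==2.11.0', 'google-cloud-dataflow==2.4.0'])
        print('No conflict detected in this environment.')
    except pkg_resources.VersionConflict as err:
        print('Version conflict:', err)
    except pkg_resources.DistributionNotFound as err:
        print('Package not installed locally:', err)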
We use this requirements.txt file to manage the dependencies of the Beam pipeline. There are two VM instances on Google Compute Engine, one master and one worker; both install all the packages listed in requirements.txt.
The jobs are run with the DataflowRunner. When launching the code manually, the command looks like:
python code.py --project --setupFilePath --requirementFilePath --workerMachineType n1-standard-8 --runner DataflowRunner
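For reference, a minimal sketch of an equivalent launch from Python; the project ID, bucket, and file paths are placeholders, and Beam's built-in option names are --setup_file, --requirements_file, and --worker_machine_type (if code.py maps the camelCase flags above onto these, that mapping is assumed here):

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # Placeholder values; Beam's built-in flags are snake_case.
    options = PipelineOptions([
        '--runner=DataflowRunner',
        '--project=my-project-id',               # placeholder
        '--temp_location=gs://my-bucket/tmp',    # placeholder
        '--setup_file=./setup.py',
        '--requirements_file=./requirements.txt',
        '--worker_machine_type=n1-standard-8',
    ])

    with beam.Pipeline(options=options) as p:
        _ = p | beam.Create(['probe']) | beam.Map(print)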
The job does not upgrade to SDK 2.11.0; instead it fails. Error message in the Stackdriver logs:
2019-03-26 19:02:02.000 IST
Failed to install packages: failed to install requirements: exit status 1
{
  insertId: "27857323862365974846:1225647:0:438995"
  jsonPayload: {
    line: "boot.go:144"
    message: "Failed to install packages: failed to install requirements: exit status 1"
  }
  labels: {
    compute.googleapis.com/resource_id: "278567544395974846"
    compute.googleapis.com/resource_name: "icf-20190334132038-03260625-b9fa-harness-gtml"
    compute.googleapis.com/resource_type: "instance"
    dataflow.googleapis.com/job_id: "2019-03-26_06_25_16-6068768320191854196"
    dataflow.googleapis.com/job_name: "icf-20190326132038"
    dataflow.googleapis.com/region: "global"
  }
  logName: "projects/project-id/logs/dataflow.googleapis.com%2Fworker-startup"
  receiveTimestamp: "2019-03-26T13:32:07.627920858Z"
  resource: {
    labels: {
      job_id: "2019-03-26_06_25_16-6068768320191854196"
      job_name: "icf-20190326132038"
      project_id: "project-id"
      region: "global"
      step_id: ""
    }
    type: "dataflow_step"
  }
  severity: "CRITICAL"
  timestamp: "2019-03-26T13:32:02Z"
}
Note: when pip install apache-beam==2.11.0 is run manually on both the worker and the master, the code runs.
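A small diagnostic sketch (names are mine, not from the job above) that logs the SDK version actually present on the workers, to confirm whether the requirements.txt install or the manual pip install took effect; the output appears in the same Stackdriver worker logs:

    import logging
    import apache_beam as beam

    def report_sdk_version(element):
        # Import inside the function so the version lookup happens on the
        # worker that executes the step, not on the submitting machine.
        import apache_beam as beam_on_worker
        logging.info('worker apache-beam version: %s',
                     beam_on_worker.__version__)
        return element

    # Usage inside the pipeline:
    #   p | beam.Create(['probe']) | beam.Map(report_sdk_version)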