Our Cloud Dataflow job reads from BigQuery, does some preprocessing, and then writes back to BigQuery. Unfortunately, it failed after several hours while reading from BigQuery, with the following error message:
```
raise exceptions.HttpError.FromResponse(response)
apitools.base.py.exceptions.HttpNotFoundError: HttpError accessing : response: <{'x-guploader-uploadid': 'AEnB2UpgIuanY0AawrT7fRC_VW3aRfWSdrrTwT_TqQx1fPAAAUohVoL-8Z8Zw_aYUQcSMNqKIh5R2TulvgHHsoxLWo2gl6wUEA', 'content-type': 'text/html; charset=UTF-8', 'date': 'Tue, 19 Nov 2019 15:28:07 GMT', 'vary': 'Origin, X-Origin', 'expires': 'Tue, 19 Nov 2019 15:28:07 GMT', 'cache-control': 'private, max-age=0', 'content-length': '142', 'server': 'UploadServer', 'status': '404'}>, content No such object: --project--/beam/temp--job-name---191119-084402.1574153042.687677/11710707918635668555/000000000009.avro>
```
Before this error, the logs contained a large number of similar entries.
Does anyone have an idea what might cause the Dataflow job to fail? When we run the same job on a small subset of the data, it completes without any problems. For context, a simplified version of the pipeline is sketched below.
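As far as I can tell, the missing object in the error is one of the temporary Avro files that the BigQuery export step writes under the pipeline's GCS temp location before the workers read them. Our pipeline is structured roughly like this minimal sketch; the project, bucket, dataset, table names, region, and the preprocess function are placeholders, not our actual code:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def preprocess(row):
    """Placeholder for our actual preprocessing logic."""
    return row


# All names below (project, bucket, dataset, tables, region) are placeholders.
options = PipelineOptions(
    runner='DataflowRunner',
    project='--project--',
    temp_location='gs://--bucket--/beam/temp',  # the temporary Avro export files land here
    region='europe-west1',
)

with beam.Pipeline(options=options) as p:
    (p
     # Reading from BigQuery triggers an export of the source data to
     # temporary files under temp_location, which the workers then consume.
     | 'ReadFromBQ' >> beam.io.Read(beam.io.BigQuerySource(
         query='SELECT * FROM `--project--.dataset.input_table`',
         use_standard_sql=True))
     | 'Preprocess' >> beam.Map(preprocess)
     | 'WriteToBQ' >> beam.io.WriteToBigQuery(
         '--project--:dataset.output_table',
         write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
         create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED))
```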
