I have a requirement to extract data from a BigQuery table using Dataflow and write it to a GCS bucket.
The Dataflow pipeline is built with Apache Beam (Java). It extracts from BigQuery and writes to GCS perfectly the first time.
But when a second Dataflow job is spun up to extract data from the same table after the first pipeline completes successfully, it does not extract any data from BigQuery. The only error I can see in the Stackdriver log is:
> "Request failed with code 409, performed 0 retries due to IOExceptions, performed 0 retries due to unsuccessful status codes, HTTP framework says request can be retried, (caller responsible for retrying): https://www.googleapis.com/bigquery/v2/projects/dataflow-begining/jobs"
The sample code I have used for extraction is:
    pipeline.apply("Extract from BQ",
        BigQueryIO.readTableRows().fromQuery("SELECT * FROM bq_test.employee"));
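For context, the full pipeline looks roughly like the sketch below. The class name, the output prefix `gs://my-bucket/employee`, and the row-to-string formatting are placeholders, not the exact production code:

    import com.google.api.services.bigquery.model.TableRow;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.TextIO;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
    import org.apache.beam.sdk.options.PipelineOptions;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.MapElements;
    import org.apache.beam.sdk.values.TypeDescriptors;

    public class BqToGcs {
      public static void main(String[] args) {
        PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
        Pipeline pipeline = Pipeline.create(options);

        pipeline
            // Read every row of the table via a legacy-SQL query
            .apply("Extract from BQ",
                BigQueryIO.readTableRows().fromQuery("SELECT * FROM bq_test.employee"))
            // Serialize each TableRow to a string for the text sink
            .apply("Format rows",
                MapElements.into(TypeDescriptors.strings())
                    .via((TableRow row) -> row.toString()))
            // gs://my-bucket/employee is a placeholder output prefix
            .apply("Write to GCS", TextIO.write().to("gs://my-bucket/employee"));

        pipeline.run();
      }
    }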
Any help is appreciated.
Instead of `fromQuery(SQL)` with `SELECT *`, use `from(TableReference)`. – Graham Polley
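For reference, the change suggested in this comment would replace the read step above with something roughly like the following sketch; the project id is guessed from the URL in the error message and may need adjusting:

    import com.google.api.services.bigquery.model.TableReference;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;

    // Point the read directly at the table instead of issuing a query.
    // "dataflow-begining" is taken from the error URL above; adjust if needed.
    TableReference table = new TableReference()
        .setProjectId("dataflow-begining")
        .setDatasetId("bq_test")
        .setTableId("employee");

    pipeline.apply("Extract from BQ", BigQueryIO.readTableRows().from(table));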