2 votes

I'm trying to create and export a stream of synthetic data using Dataflow, Pub/Sub, and BigQuery. I followed the synthetic data generation instructions using the following schema:

{
    "id": "{{uuid()}}",
    "test_value": {{integer(1,50)}}
}
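For reference, each published message should be a concrete record of that shape; a minimal Python sketch of an equivalent record, assuming the schema's uuid() and integer(1,50) template functions map to the standard-library calls below:

import json
import random
import uuid

# Build one record of the same shape the generator emits (assumed mapping
# of the schema's uuid() and integer(1,50) template functions).
record = {
    "id": str(uuid.uuid4()),
    "test_value": random.randint(1, 50),
}
print(json.dumps(record))  # e.g. {"id": "3f0c...", "test_value": 17}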

The schema is in a file gs://my-folder/my-schema.json. The stream seems to be running correctly - I can export from the corresponding Pub/Sub topic to a GCS bucket using the "Export to Cloud Storage" template. When I try to use the "Export to BigQuery" template, I keep getting this error:

Request failed with code 400, performed 0 retries due to IOExceptions, performed 0 retries due to unsuccessful status codes, HTTP framework says request can be retried, (caller responsible for retrying): https://bigquery.googleapis.com/bigquery/v2/projects/<my-project>/datasets/<my-dataset>/tables/<my-table>/insertAll.

Before starting the export job, I created an empty table <my-project>:<my-dataset>.<my-table> with fields that match the JSON schema above:

id          STRING  NULLABLE    
test_value  INTEGER NULLABLE    
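For reference, this is roughly how that empty table can be created with the google-cloud-bigquery Python client; the project, dataset, and table identifiers below are placeholders for the real <my-project>:<my-dataset>.<my-table>:

from google.cloud import bigquery

client = bigquery.Client()

# Schema matching the generated JSON: a string id and an integer test_value.
schema = [
    bigquery.SchemaField("id", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("test_value", "INTEGER", mode="NULLABLE"),
]

# Placeholder identifiers; substitute the real project, dataset, and table.
table = bigquery.Table("my-project.my-dataset.my_table", schema=schema)
client.create_table(table)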

I have outputTableSpec set to <my-project>:<my-dataset>.<my-table>.

1
You're using the legacy BQ table format. Did you try the standard format, project.dataset.table? – guillaume blaquiere
If I try that, the UI shows the error Value must be of the form: ".+:.+\..+" and doesn't allow me to run the job. – zack
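(For what it's worth, that pattern can be checked directly; a minimal Python sketch of the validation the UI appears to apply:)

import re

# The template UI seems to validate outputTableSpec against this pattern,
# which requires the legacy project:dataset.table form with a colon.
table_spec = re.compile(r".+:.+\..+")
print(bool(table_spec.fullmatch("my-project:my-dataset.my-table")))  # True
print(bool(table_spec.fullmatch("my-project.my-dataset.my-table")))  # False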

1 Answer

4 votes

If the BQ table is given in the legacy form project:dataset.table, the table name itself cannot contain any hyphens (hyphens in the project ID are fine). I was using my-project:test.stream-data-102720 when I got the code 400 error. Creating a new table, my-project:test.stream_data_102720, and re-running the job with the new table spec fixed the problem.
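As a quick sanity check outside Dataflow, the same streaming insert path can be exercised with the google-cloud-bigquery Python client; a sketch, assuming the underscore table name from above:

from google.cloud import bigquery

client = bigquery.Client()

# insert_rows_json calls the same tabledata.insertAll endpoint that the
# export template was failing on; an empty error list means success.
errors = client.insert_rows_json(
    "my-project.test.stream_data_102720",
    [{"id": "manual-test", "test_value": 1}],
)
print(errors or "streaming insert OK")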