
I am trying to export a table from BigQuery to Google Cloud Storage using the following command from the console:

bq --location=<hidden> extract --destination_format CSV --compression GZIP --field_delimiter "|" --print_header=true <project>:<dataset>.<table> gs://<airflow_bucket>/data/zip/20200706_<hidden_name>.gzip

I get the following error :

BigQuery error in extract operation: An internal error occurred and the request could not be completed.

Here is some information about the said table

Table ID    <HIDDEN> 
Table size  6.18 GB
Number of rows  25,854,282
Created 18.06.2020, 15:26:10 
Table expiration     Never 
Last modified 14.07.2020, 17:35:25 
Data location    EU

What I'm trying to do here is extract this table to Google Cloud Storage. Since the table is larger than 1 GB, the export gets split into multiple fragments. I want to assemble all those fragments into one archive in a Google Cloud Storage bucket.

What is happening here? How do I fix this?

Note: I've hidden the actual names and locations of the table and other information with placeholders such as <hidden> or <airflow_bucket>.


2 Answers


I found out the reason behind this. The documentation gives the following syntax for `bq extract`:

> bq --location=location extract \
> --destination_format format \
> --compression compression_type \
> --field_delimiter delimiter \
> --print_header=boolean \
> project_id:dataset.table \
> gs://bucket/filename.ext

I removed --location=<bq_table_location> and it works in principle. Except that, because the table is larger than 1 GB, I had to add a wildcard to the destination URI, and I end up with multiple compressed files.
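For reference, here is a sketch of the working command with the `--location` flag removed and a `*` wildcard in the destination URI. The project, dataset, table, and bucket names are placeholders, not the hidden originals:

```shell
# Without --location, BigQuery infers the location from the dataset.
# The "*" wildcard is required for exports larger than 1 GB; BigQuery
# replaces it with a shard number and writes multiple files.
bq extract \
  --destination_format CSV \
  --compression GZIP \
  --field_delimiter "|" \
  --print_header=true \
  my_project:my_dataset.my_table \
  "gs://my_bucket/data/zip/20200706_export_*.csv.gz"
```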


According to the public documentation, you are getting the error due to the 1 GB per-file export limit.

Currently it is not possible to accomplish what you want without an additional step: either concatenate the files on Cloud Storage, or run a batch job, for example on Dataflow.
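The Cloud Storage route can be sketched with `gsutil compose`, which merges up to 32 source objects into one composite object per call. This works for this case because a concatenation of gzip streams is itself a valid gzip stream, so the merged object can still be decompressed as one file. The bucket and object names below are placeholders:

```shell
# Merge the exported shards into a single object (max 32 sources per call).
# Concatenated gzip files remain a valid gzip file, so the composite
# object decompresses as one archive.
gsutil compose \
  "gs://my_bucket/data/zip/20200706_export_000000000000.csv.gz" \
  "gs://my_bucket/data/zip/20200706_export_000000000001.csv.gz" \
  "gs://my_bucket/data/zip/20200706_merged.csv.gz"
```

If there are more than 32 shards, the compose step has to be repeated hierarchically (compose groups of 32, then compose the results).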

There are some Google-provided batch templates that export data from BigQuery to GCS, but none with the CSV format, so you would need to modify some code in order to do it on Dataflow.