1 vote

I have two BigQuery tables, each bigger than 1 GB.

To export them to Cloud Storage, I used the export method documented here:

https://googlecloudplatform.github.io/google-cloud-php/#/docs/google-cloud/v0.39.2/bigquery/table?method=export

$destinationObject = $storage->bucket('myBucket')->object('tableOutput_*');
$job = $table->export($destinationObject);

I used a wildcard in the destination object name.
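For context, here is a fuller version of the export call. This is a minimal sketch based on the google-cloud-php v0.39.2 API linked above; the project, dataset, table, and bucket names are placeholders, not the real ones:

<?php
require 'vendor/autoload.php';

use Google\Cloud\ServiceBuilder;

// Build BigQuery and Storage clients for the same project.
$cloud = new ServiceBuilder(['projectId' => 'myProject']);
$bigQuery = $cloud->bigQuery();
$storage = $cloud->storage();

// The table to export and the wildcard destination in Cloud Storage.
$table = $bigQuery->dataset('myDataset')->table('myTable');
$destinationObject = $storage->bucket('myBucket')->object('tableOutput_*');

// Start the extract job; BigQuery decides how many shards to write
// and how large each shard is.
$job = $table->export($destinationObject);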

The strange thing is that one BigQuery table is exported to about 60 files, each of them only 3 to 4 MB in size.

The other table is exported to only 3 files, each of them close to 1 GB (around 900 MB).

The code is the same in both cases. The only difference is that the table exported to 3 files is written into a subfolder, while the one exported to 60 files is written one level above that subfolder.

My question is: how does BigQuery decide whether a table gets broken into dozens of small files or into just a few big files (as long as each file is less than 1 GB)?

Thanks!

At a guess, I would imagine it's determined by how fragmented the table is. But maybe a Googler, like @felipehoffa, can shed some more light. However, this shouldn't be a problem. Is it? – Graham Polley

Well, it might be a problem. I want to break it into smaller files. When I process the 1 GB files, for example uploading them to Elasticsearch, I get timeout issues. – searain

AFAIK, you don't have any control over how many files get exported or how big they are. – Graham Polley

1 Answer

2 votes

BigQuery makes no guarantees on the sizes of the exported files, and there is currently no way to adjust this.
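Since the shard sizes cannot be controlled on the BigQuery side, one workaround for the Elasticsearch timeout mentioned in the comments is to re-split the exported files before indexing them. Below is a minimal sketch in plain PHP, assuming the export format is newline-delimited JSON; the file name and the 100,000-line chunk size are illustrative assumptions only:

<?php
// Re-split one large exported file into smaller chunks of $linesPerChunk lines.
$source = 'tableOutput_000000000000.json'; // hypothetical exported file name
$linesPerChunk = 100000;

$in = fopen($source, 'r');
$chunkIndex = 0;
$lineCount = 0;
$out = null;

while (($line = fgets($in)) !== false) {
    // Open a new chunk file every $linesPerChunk lines.
    if ($lineCount % $linesPerChunk === 0) {
        if ($out !== null) {
            fclose($out);
        }
        $out = fopen(sprintf('chunk_%03d.json', $chunkIndex++), 'w');
    }
    fwrite($out, $line);
    $lineCount++;
}

if ($out !== null) {
    fclose($out);
}
fclose($in);

Each resulting chunk_NNN.json file can then be uploaded to Elasticsearch separately, which keeps each request well under the size that caused the timeouts.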