
We've started migrating to partitioned tables in BigQuery, and we've noticed that copying a partitioned table takes considerably longer than copying a non-partitioned one, in every case. I'm sure there's a good reason for this, e.g. the underlying architecture of BigQuery and partitioned tables.

For example (copying within the same project and to the same dataset):

Non-partitioned table:

  • Size: 15GB, 87M rows
  • Copy time: 3 seconds
  • Job id: bquijob_64e11150_15b373c714a

Partitioned table:

  • Size: 15GB, 87M rows (same table as above, but partitioned)
  • Copy time: 16 minutes
  • Job id: bquijob_6bae14c3_15b373e623d

Is there a trick/workaround to speed up copying partitioned tables in BigQuery?

Just wondering: how many partitions does the table have? (Mental math, don't take it seriously: 365 * 3 seconds ~= 16 minutes.) - Felipe Hoffa
It contains 622 partitions. - Graham Polley
@FelipeHoffa I would have expected the partitions to be copied in parallel, not in linear time. - Pentium10
That's why I asked you not to take my math seriously, but the answer to the question was still an interesting data point to have. - Felipe Hoffa
@FelipeHoffa - so, is BQ performing 622 sequential load jobs under the hood? - Graham Polley
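
For anyone who wants to redo the mental math from the comments with the real numbers, here is a rough sketch. It uses only the figures quoted in this thread (16 minutes, 622 partitions, 3 seconds). The helper name `seconds_per_partition` is made up for illustration, and the idea that the time is spread evenly over sequentially copied partitions is an assumption, not something confirmed about how BigQuery works.

```python
# Back-of-the-envelope check using only figures quoted in this thread.
# Assumption (not confirmed): the copy time is spread evenly across
# partitions that are handled one after another.

def seconds_per_partition(total_seconds: float, partitions: int) -> float:
    """Average copy time per partition under the sequential assumption."""
    return total_seconds / partitions

partitioned_copy_s = 16 * 60      # 16 minutes, from the question
partition_count = 622             # from the comments
non_partitioned_copy_s = 3        # 3 seconds, from the question

per_partition = seconds_per_partition(partitioned_copy_s, partition_count)
slowdown = partitioned_copy_s / non_partitioned_copy_s

print(f"{per_partition:.2f} s per partition")                    # 1.54 s per partition
print(f"{slowdown:.0f}x slower than the non-partitioned copy")   # 320x slower
```

Roughly 1.5 seconds per partition is at least in the same ballpark as a fixed per-partition overhead, but it doesn't prove the copy is sequential.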

1 Answer


I've been told through our enterprise support channel that (paraphrasing) it's working as designed and there's nothing that can be done to speed up the copy of partitioned tables.

I've raised a feature request in an attempt to get this changed, but I doubt it's going to be a top priority!

Anyway, for posterity, here's the feature request if anyone wants to track it:

https://issuetracker.google.com/issues/37012156