3
votes

I noticed some strange behavior on Google Compute Engine when using BigQuery from VM instances.

I have a Java process that streams data into BigQuery.

I expected better performance when the BigQuery dataset and the VM instances are in the same region, but my tests showed unexpected behavior.

CASE1: VM in us-central1-a, dataset location US. Average BigQuery insertion response time: 150 milliseconds

CASE2: VM in europe-west1-c, dataset location US. Average BigQuery insertion response time: 700 milliseconds

CASE3: VM in us-central1-a, dataset location EU. Average BigQuery insertion response time: 1200 milliseconds

CASE4: VM in europe-west1-c, dataset location EU. Average BigQuery insertion response time: 1700 milliseconds

I can understand the drop in performance in CASE2 and CASE3, but what about CASE4?

The test shows that if the BigQuery dataset location is "EU", performance decreases even when the VM is in europe-west1-c.

My conclusion is: never use BigQuery in the EU (except, of course, when there are requirements on the location of the data)!

Is anything wrong with my reasoning?

1
Could you please provide your project ID, dataset ID, and table ID so we can take a look at what's going on? Our server-side statistics show much lower latency than 1700 ms. It doesn't seem normal... - Cheng Miezianko
I could give you the IDs in private. Is that possible? Then we could continue the conversation here. Is that OK for you? - Andrea Zonzin
Sure~ my email: [email protected] Thanks! - Cheng Miezianko
I sent you my IDs. Thanks in advance! - Andrea Zonzin

1 Answer

1
votes

Thanks for reporting.

It looks like the latency mentioned in the post includes both tables.get() and tabledata.insertAll(). The latency difference is mostly caused by tables.get().
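To see how much each call contributes, the two steps can be timed separately instead of as one combined measurement. The following is only a minimal Java sketch, not the poster's code: `fetchTableMetadata` and `insertRows` are hypothetical stand-ins for the real tables.get() and tabledata.insertAll() calls, and the sleeps merely simulate network latency.

```java
import java.util.function.Supplier;

public class LatencySplit {

    /** Holds the value returned by a call and how long the call took. */
    public static final class Timed<T> {
        public final T value;
        public final long elapsedNanos;

        Timed(T value, long elapsedNanos) {
            this.value = value;
            this.elapsedNanos = elapsedNanos;
        }

        public double elapsedMillis() {
            return elapsedNanos / 1_000_000.0;
        }
    }

    /** Runs the call once and measures its wall-clock duration. */
    public static <T> Timed<T> time(Supplier<T> call) {
        long start = System.nanoTime();
        T value = call.get();
        return new Timed<>(value, System.nanoTime() - start);
    }

    // Hypothetical stand-in for the metadata call (tables.get()).
    public static String fetchTableMetadata() {
        pause(20);
        return "schema";
    }

    // Hypothetical stand-in for the streaming insert (tabledata.insertAll()).
    public static int insertRows(String schema) {
        pause(5);
        return 1; // number of rows "inserted"
    }

    // Simulates network latency; replace the stand-ins with real API calls.
    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Timed<String> meta = time(LatencySplit::fetchTableMetadata);
        Timed<Integer> insert = time(() -> insertRows(meta.value));
        System.out.printf("tables.get: %.1f ms, insertAll: %.1f ms%n",
                meta.elapsedMillis(), insert.elapsedMillis());
    }
}
```

Reporting the two durations side by side makes it clear whether the metadata call or the insert itself dominates the total.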

We are aware that calling metadata-related APIs (e.g. tables.get) is slower from the EU than from the US. This is caused by some existing infrastructure limitations, and unfortunately there is no short-term fix for it. However, we are actively working on some backend changes to minimize this latency difference in the long term.

A few things you might consider to mitigate this: