1 vote

BigQuery is fast at processing large sets of data; however, retrieving large results from BigQuery is not fast at all.

For example, I ran a query that returned 211,136 rows over three HTTP requests, taking just over 12 seconds in total.
The query itself was returned from cache, so no time was spent executing the query. The host server is an Amazon m4.xlarge running in US-East (Virginia).

In production I've seen this process take ~90 seconds when returning ~1M rows. Obviously some of this could be down to network traffic... but it seems too slow for that to be the only cause (those 211,136 rows were only ~1.7 MB).
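
For context, the retrieval is the standard sequential paging loop. A minimal sketch of that pattern (Python, google-cloud-bigquery client; the client and query here are illustrative placeholders, not the exact production code):

```python
from google.cloud import bigquery

client = bigquery.Client()
job = client.query("SELECT * FROM `my_dataset.my_table`")  # placeholder query

# Each page of results is fetched with its own HTTP request, one after
# another, which is where the retrieval time goes for large result sets.
for page in job.result().pages:
    print(f"fetched {page.num_items} rows in this request")
```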

Has anyone else encountered such slow speeds when retrieving results, and found a resolution?


Update: Reran the test on a VM inside Google Cloud with very similar results, ruling out network issues between Google and AWS.

could you please provide the job id? - xuejian
@xuejian job_BAp8OdilQEzUV7x6HNeEzVh2lo8 - NPSF3000
Sorry, forgot to mention: project id is also needed. - xuejian
never mind, I figured it out. Will do some investigation then. - xuejian
@xuejian as per the update, I've ruled out Google <--> Amazon network issues by running the test inside Google Cloud with similar results. - NPSF3000

2 Answers

1 vote

Our SLO on this API is 32 seconds, and a call taking 12 seconds is normal. 90 seconds sounds too long; it must be hitting some of our system's tail latency.

I understand that it is embarrassingly slow. There are multiple reasons for it, and we are working on improving the latency of this API. By the end of Q1 next year, we should be able to roll out a change that will cut tabledata.list time in half (by upgrading the API frontend to our new One Platform technology). Given more resources, we would also make jobs.getQueryResults faster.

1 vote

Concurrent Requests using TableData.List

It's not great, but there is a resolution.

Make a query, and set the max rows to 1000. If there is no page token, simply return the results.

If there is a page token, then disregard the results*, and use the TableData.List API. However, rather than simply sending one request at a time, send a request for every 10,000 records* in the result. To do this one can use the 'MaxResults' and 'StartIndex' fields. (Note that even these smaller pages may be broken into multiple requests*, so paging logic is still needed.)
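
For illustration, here is a minimal sketch of this scheme using the Python google-cloud-bigquery client. The library choice, worker count, and function names are mine (the original approach used the raw API); the 1,000-row cutoff and 10,000-row chunks are as described above, with the first-page/page-token check approximated by a row count:

```python
from concurrent.futures import ThreadPoolExecutor
from google.cloud import bigquery

CHUNK = 10_000   # rows per TableData.List request, as described above
WORKERS = 8      # concurrency level; an assumption, tune for your quota

client = bigquery.Client()

def fetch_all(sql):
    job = client.query(sql)
    job.result()  # wait for the query to finish
    dest = client.get_table(job.destination)  # results land in a table

    if dest.num_rows <= 1000:
        # Small result: a single request is fine.
        return list(client.list_rows(dest))

    # Large result: issue one TableData.List call per CHUNK of rows, in
    # parallel, via start_index/max_results. list_rows() pages internally,
    # which covers a chunk being split across multiple responses.
    def fetch(start):
        return list(client.list_rows(dest, start_index=start, max_results=CHUNK))

    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        chunks = list(pool.map(fetch, range(0, dest.num_rows, CHUNK)))
    return [row for chunk in chunks for row in chunk]
```

Here `list_rows()` wraps TableData.List, so `start_index` and `max_results` correspond to the StartIndex and MaxResults fields mentioned above.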

This concurrency (and smaller pages) leads to significant reductions in retrieval times. It's not as good as BigQuery simply streaming all results, but it's enough to start realizing the gains from using BigQuery.


Potential Pitfalls: Keep an eye on the request count, as with larger result sets there could be 100 req/s throttling. It's also worth noting that there's no guarantee of ordering, so using the StartIndex field as pseudo-paging may not always return correct results*.
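
If the 100 req/s figure is a concern, one crude mitigation (a sketch only, not confirmed best practice, in keeping with the caveats here) is to pace how quickly requests are started:

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_STARTS_PER_SEC = 50  # assumed safety margin under the 100 req/s figure

def run_paced(fn, args_list, workers=8):
    # Crude pacing: pause after each batch of submissions so that, with a
    # modest worker count, request starts stay well under the quota.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = []
        for i, args in enumerate(args_list):
            if i and i % MAX_STARTS_PER_SEC == 0:
                time.sleep(1)
            futures.append(pool.submit(fn, args))
        return [f.result() for f in futures]
```

For example, `run_paced(fetch, list(range(0, dest.num_rows, CHUNK)))` could stand in for the `pool.map` call in the sketch above.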

* Anything with a single asterisk is still an educated guess, not confirmed as true/best practice.