0
votes

We can create a table partitioned by Partitioning Field. Doc Reference here.

I created a table visitors that contains 2 fields:

  • visitor: STRING
  • part: TIMESTAMP <-- this is the Partitioning Field

When I query the table:

SELECT * FROM mydataset.visitors WHERE part >= "2018-03-14 09:00:00" AND part < "2018-03-15 18:00:00"

Questions:

  • Is the BigQuery engine smart enough to scan only the 2 partitions 2018-03-14 and 2018-03-15, then find all rows that satisfy the hour range?
  • Or does BigQUery do a full scan of all partitions?

Thanks

1
The whole point of partitioned tables, is that BigQuery only has to scan the partitions referenced in your where clause. You can easily see this by removing the where clause and you'll see the amount of data to be processed increase (in the UI). To answer your question, yes, it's smart enough ;-) - Graham Polley
Thanks, @GrahamPolley! That answers my question :) - Kewei Shang
@GrahamPolley May I ask you to post the comment as an answer? - Xiaoxia Lin
Ok, I will do that. Seems more like a comment though ;-) - Graham Polley

1 Answers

2
votes

The whole point of partitioned tables, is that BigQuery only has to scan the partitions referenced in your where clause. You can easily see this by removing the where clause and you'll see the amount of data to be processed increase (in the UI). To answer your question, yes, it's smart enough ;-)