0
votes

This may sound crazy, but I want to implement something like having a view with a partition.

Background:

I had a table with a date partition on a column which is really huge in size. We are running data ingestion to this table at every 2mins interval. All the data loads are append-only. Ever load will insert 10k+ rows. After some time, we encountered the partition limitation issue.

message: "Quota exceeded: Your table exceeded quota for Number of partition modifications to a column partitioned table. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors"

Root cause:(from GCP support team)

The root cause under the hood was that due to your partitioned tables have pretty granular partition for instance by minutes, hours or date, when the loaded data cover a wide range of partition period, the number of partition get modified will be high and above 4000. As per internal documentation, it was suggested the user who ran into this issue to consider making a less granular partition for instance change a date/hour/minute based partitioned table to a week based partitioned table. Alternatively split the load to multiple and hence limit the data range to cover less number of partitions that would be affected. This is the best recommendation I could have now.

So I'm planning to keep this table as un-partitioned and create a view(we need a view for eliminating the duplicates) and it should have parition. Is this possible? or any other alternate solution for this?

1

1 Answers

1
votes

You can't partition a view, it's not physically materialized. Partitioning on day can be limiting with the 4000 limit, would year work? then you can use an integer partition:

create or replace table BI.test 
PARTITION BY RANGE_BUCKET(Year, GENERATE_ARRAY(2000, 3000, 1)) as
select 2000 as Year, 1 as value
union all
select 2001 as Year, 1 as value
union all
select 2002 as Year, 1 as value

Alternatively, I've used month (YYYYMM) or week (YYYYWW) to integer partition by which gets you around 40 years:

RANGE_BUCKET(monthasintegerfield, GENERATE_ARRAY(201612, 205712, 1))