6 votes

I'm trying to use Google BigQuery to select data from tables with a date wildcard. I'd like to be able to use the TABLE_DATE_RANGE function, but I need to query over a large date range (> 1 year). Right now my query works for a year's worth of data, but anything longer than that fails with:

Error: TABLE_DATE_RANGE error: too many days

#Fails
SELECT 
  system_id, sample_date, e_pv_array
FROM 
  (TABLE_DATE_RANGE(workspace.intervaldata, 
                    TIMESTAMP('2009-03-01'), 
                    TIMESTAMP('2010-03-04')))
WHERE
  system_id = 20006
  and e_pv_array is not null;

#Works
SELECT 
  system_id, sample_date, e_pv_array
FROM 
  (TABLE_DATE_RANGE(workspace.intervaldata, 
                    TIMESTAMP('2009-03-01'), 
                    TIMESTAMP('2010-03-03')))
WHERE
  system_id = 20006
  and e_pv_array is not null;

Is this just a limitation of BigQuery, or is there a way to use table wildcards with date ranges greater than 1 year?

1 Answer

5 votes

I've experienced problems with TABLE_DATE_RANGE() as well. Here's a possible workaround, assuming your tables are named workspace.intervaldata20090303, etc.:

SELECT 
  system_id, sample_date, e_pv_array
FROM 
  TABLE_QUERY(workspace, 
              "integer(regexp_extract(table_id, r'intervaldata([0-9]+)')) BETWEEN 20090301 and 20100304")
WHERE
  system_id = 20006
  and e_pv_array is not null;

I have a similar scenario (date-partitioned tables) and am able to run queries over 2 years' worth of data using TABLE_QUERY() without errors.
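
As a rough sketch, a two-year version of the query above could look like this with your table names (untested against your dataset; the 20110301 end date is just illustrative, and it assumes the same intervaldataYYYYMMDD suffix convention):

SELECT 
  system_id, sample_date, e_pv_array
FROM 
  TABLE_QUERY(workspace, 
              "integer(regexp_extract(table_id, r'intervaldata([0-9]+)')) BETWEEN 20090301 and 20110301")
WHERE
  system_id = 20006
  and e_pv_array is not null;

Since TABLE_QUERY() just matches table names (the regex pulls the YYYYMMDD suffix out of table_id and the integer comparison keeps the tables in range), the span you cover isn't capped the way TABLE_DATE_RANGE() caps the number of days.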