1
votes

I have created a BQ table with time partitioned (day/ timestamp) and also clustering data using region and location field.

I have created a daily scheduled query using the query that I ran to create the initial records. The scheduled query was set to append data to the previously created partitioned and clustered table. However, scheduled query failed to run with a message below:

Incompatible table partitioning specification. Destination table exists with partitioning specification interval(type:DAY,field:timestamp) clustering(location,region), but transfer target partitioning specification is interval(type:DAY,field:timestamp). Please retry after updating either the destination table or the transfer partitioning specification.

The simplified version of the query is below:

CREATE TEMP FUNCTION
  isYesterday(recordTimestamp TIMESTAMP) AS ( DATE(recordTimestamp) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) );

WITH
  daily AS (
  SELECT
    TIMESTAMP_TRUNC(timestamp,day) day,
    location,
    direction,
    region,
    MAX(metric1) metric1,
    MAX(metric2) metric2
  FROM (
    SELECT
      location,
      timestamp,
        direction,
        CASE
        WHEN area>0 AND area<20 THEN "region1"
        WHEN area>10
      AND area<20 THEN "region2"
        WHEN area>30 AND area<40 THEN "Region3"
        WHEN area=61 THEN "Region4"
        WHEN area=65 THEN "Region5"
        WHEN area=76 THEN "Region6"
        WHEN area>89 AND area<100 THEN "Region7"
    END
      AS region,
      
      SUM(terminalcount0) metric1,
      SUM(terminalcount1) metric2
    FROM
      `statistic`
    WHERE
  isYesterday(timestamp)
--      DATE(timestamp) BETWEEN '2020-07-01'
--      AND DATE_SUB(CURRENT_DATE(),INTERVAL 1 day)
        GROUP BY
      location,
      timestamp,
      direction,
      region )
  GROUP BY
    day,
    location,
    direction,
    region)
SELECT
  day as timestamp,
  location,
  direction,
  SUM(metric1) metric1,
  SUM(metric2) metric2
FROM
  daily
GROUP BY
  day,
  location,
  direction,
  region
  
  

How do I fix this so that scheduled query can run and save the results to the destination table regularly? Or is it really the scheduled query to the existing clustered table not supported by GCP?

1
can you share the query? - Felipe Hoffa
Are you sure that is sheduled query and not BigQuery transfer? - guillaume blaquiere
Hi, yes I am sure it is scheduled query and not BigQuery transfer. The query is scheduled to run daily. I have no problem in running other queries where the destination tables have no partition or cluster. Is this a limitation of GCP? I have added the simplified and shareable query in the question above as can't add it in the reply here due to limited characters allowed. I couldn't add exact query due to proprietary restriction . - busheriff
Anyone can help on this? - busheriff

1 Answers

1
votes

There's an answer for this problem here: 61896744

You can workaround this problem by adding a view with your select query:

SELECT field_a, field_b, datetime, date
FROM large_table
WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 WEEK)

And creating a scheduled query like this:

INSERT INTO `my_clustered_table` (field_a, field_b, ...)
SELECT *
FROM `my_view`