0
votes

I am trying to analyse firebase analytics data in BigQuery. I need to update a table in BigQuery using StandardSQL.

I have to update order_flag in table cart where key = 'item_id' by joining it to another table order.

Below is the query:

#standardSQL
UPDATE `dataset.cart` c
SET c.order_flag = true
WHERE (SELECT value.string_value 
       FROM UNNEST(c.event_dim.params) 
       WHERE key = 'item_id') IN
      (SELECT
           (SELECT value.string_value 
            FROM UNNEST(o.event_dim.params) 
            WHERE key = 'item_id')
       FROM `dataset.order` o
       WHERE (SELECT key FROM UNNEST(o.event_dim.params) 
              WHERE key = 'item_id') = 
             (SELECT value.string_value FROM UNNEST(c.event_dim.params) 
              WHERE key = 'item_id'))

But I am getting the error:

Error: Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN.

How to do an efficient join in this scenario?

1

1 Answers

2
votes

Your query looks a bit strange because it has IN clause with correlated subquery (subquery uses both a and c tables).

Which is kind of antipattern and usually indicates mistake in query. Because normally IN clause subquery is NOT correlated across tables.

EXISTS clause usually requires correlation in subquery, but not IN. This would work most likely:

UPDATE
  `dataset.cart` c
SET
  c.order_flag=TRUE
WHERE
  (
  SELECT
    value.string_value
  FROM
    UNNEST(c.event_dim.params)
  WHERE
    key = 'item_id') IN (
  SELECT
    (
    SELECT
      value.string_value
    FROM
      UNNEST(o.event_dim.params)
    WHERE
      key = 'item_id')
  FROM
    `dataset.order` o
)

If you decide to switch to EXISTS then I would recommend storing

(SELECT
      value.string_value
    FROM
      UNNEST(o.event_dim.params)
    WHERE
      key = 'item_id')

into separate column to keep things simple and easy to optimize for query optimizer.