1
votes

I'm trying to join array elements in BigQuery but I am getting the following error message: Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN.

Imagine I have two mapping tables:

CREATE OR REPLACE TABLE `test.field_id_name` (
  id STRING,
  name STRING
) AS (
  SELECT * FROM UNNEST(
     [STRUCT("s1", "string1"),
     STRUCT("s2", "string2"),
     STRUCT("s3", "string3")]
  )
)
CREATE OR REPLACE TABLE `test.field_values` (
  id STRING,
  name STRING
) AS (
  SELECT * FROM UNNEST(
     [STRUCT("v1", "val1"),
     STRUCT("v2", "val2"),
     STRUCT("v3", "val3")]
  )
)

And I have the following as input:

CREATE OR REPLACE TABLE `test.input` AS
  SELECT [
    STRUCT<id STRING, value ARRAY<STRING>>("s1", ["v1"]),
    STRUCT("s2", ["v1"]),
    STRUCT("s3", ["v1"])
  ] records
  UNION ALL
  SELECT [
    STRUCT("s1", ["v1", "v2"]),
    STRUCT("s2", ["v1", "v2"]),
    STRUCT("s3", ["v1", "v2"])
  ]
  UNION ALL
  SELECT [
    STRUCT("s1", ["v1", "v2", "v3"]),
    STRUCT("s2", ["v1", "v2", "v3"]),
    STRUCT("s3", ["v1", "v2", "v3"])
  ]

I am trying to produce this output:

SELECT [
  STRUCT<id_mapped STRING, value_mapped ARRAY<STRING>>("string1", ["val1"]),
  STRUCT("string2", ["val1"]),
  STRUCT("string3", ["val1"])
] records
UNION ALL
SELECT [
  STRUCT("string1", ["val1", "val2"]),
  STRUCT("string2", ["val1", "val2"]),
  STRUCT("string3", ["val1", "val2"])
]
UNION ALL
SELECT [
  STRUCT("string1", ["val1", "val2", "val3"]),
  STRUCT("string2", ["val1", "val2", "val3"]),
  STRUCT("string3", ["val1", "val2", "val3"])
]

However the following query is failing with the correlated subqueries error.

SELECT
  ARRAY(
    SELECT
      STRUCT(fin.name, ARRAY(SELECT fv.name FROM UNNEST(value) v JOIN test.field_values fv ON (v = fv.id)))
    FROM UNNEST(records) r
    JOIN test.field_id_name fin ON (fin.id = r.id)
  )
FROM test.input
1
please present sample data and expected result. - Mikhail Berlyant
Hi @MikhailBerlyant - added sample data with desired transformation - Josh Laird
looks very messy to me :o( I cannot even match sample data with query provided! Can yo please rewrite your question such that we can clearly see 1) input data sample 2) expected output sample matching presented input 3) code you have so far and what he problem you have with it - Mikhail Berlyant
Appreciate the feedback. I have modified the question, hope that's clearer! - Josh Laird
from quick glance - looks much better now. will look more closer later when have time (unless someone will already answer it) :o) - Mikhail Berlyant

1 Answers

1
votes

Below is for BigQuery Standard SQL

#standardSQL
SELECT ARRAY_AGG(STRUCT(id AS id_mapped, val AS value_mapped)) AS records
FROM (
  SELECT fin.name AS id, ARRAY_AGG(fv.name) AS val, FORMAT('%t', t) id1, FORMAT('%t', RECORD) id2
  FROM `test.input` t,
  UNNEST(records) record,
  UNNEST(value) val
  JOIN `test.field_id_name` fin ON record.id = fin.id
  JOIN `test.field_values` fv ON val = fv.id
  GROUP BY id, id1, id2
)
GROUP BY id1   

If to apply to sample data from your question - returns exact output you expecting