1
votes

I have a BigQuery table with two nested levels of repeated field hierarchies. I need to do self join (join the table with itself) on a leaf field in the inner level.

Usage of FLATTEN clause only flattens one level and I couldn't figure out how to do this. In theory I need to write nested FLATTEN but I couldn't make this work.

Any help would be appreciated.

Example:

Given the following table structure:

a1, integer a2, record (repeated) a2.b1, integer a2.b2, record (repeated) a2.b2.c1, integer

How do I write a query which does a self join (join each) on a2.b2.c1 on both sides.

1

1 Answers

1
votes

Nested flatten -- that is flatten of a subquery -- should work. Note it requires a plethora of parentheses. Given the schema:

{nested_repeated_f: [ {inner_nested_repeated_f: [ {string_f}]}]}

The following query will work:

SELECT t1.f1 FROM (
  SELECT nested_repeated_f.inner_nested_repeated_f.string_f as f1 
  FROM (FLATTEN((
    SELECT nested_repeated_f.inner_nested_repeated_f.string_f
    FROM 
      (FLATTEN(lotsOdata.nested002, nested_repeated_f.inner_nested_repeated_f))
    ), nested_repeated_f))) as t1    
JOIN (
  SELECT nested_repeated_f.inner_nested_repeated_f.string_f as f2
  FROM (FLATTEN((
    SELECT nested_repeated_f.inner_nested_repeated_f.string_f
    FROM 
      (FLATTEN(lotsOdata.nested002, nested_repeated_f.inner_nested_repeated_f))
    ), nested_repeated_f))) as t2
on t1.f1 = t2.f2