0
votes

I have a situation to create nested-repeated table by joining 2 normal tables in BigQuery. Here is the context,

I have the input table data contains looks like this: In the below data, v_no is one col, and crit is another col which contains more data in it. cirt col should be splitted for each 6 chars (I dit it in my query below).

v_no crit
v1  EA1   TEMP    MR   
v2  DRA   KM

Another input table contains the data like this:

c_code  o_code
TEMP    720
MR  D45
EA1 E88
KM  282
DRA 121

Now the need is I need to join these 2 tables and create a nested-repeated table. The expected output should be like this.

v1  EA1  E88
    TEMP 720
    MR   D45
v2  DRA  121
    KM   282

I did for some part in creating nested table from table1 by just splitting each 6 chars. Then after I am not sure how to join with other table.

CREATE TABLE
  `project.db.table1`
CLUSTER BY
  v_no AS
SELECT
  v_no,
  ARRAY_AGG(STRUCT(crit)) as codes
FROM
  `project.db.table2`,unnest(regexp_extract_all(crit, r'.{1,6}')) crit 
GROUP BY
  v_no

Can anyone please suggest how to achieve it in single query in BigQuery? Generic Query please :-)

2

2 Answers

2
votes

Use below

select v_no, 
  array(
    select as struct c_code, o_code 
    from unnest(regexp_extract_all(crit, r'.{1,6}')) code
    join `project.dataset.table2`
    on trim(code) = c_code
  ) codes
from `project.dataset.table1`     

If applied to sample data in your question - output is

enter image description here

Data used for test - as per samples in question:

with `project.dataset.table1` as (
  select 'v1' v_no, 'EA1   TEMP    MR' crit union all
  select 'v2', 'DRA   KM'
), `project.dataset.table2` as (
  select 'TEMP' c_code, '720' o_code union all
  select 'MR', 'D45' union all
  select 'EA1', 'E88' union all
  select 'KM', '282' union all
  select 'DRA', '121' 
)
0
votes

After some deep dive, I understand that v_no are not in sorted order after splitting by each 6 chars. So, I have constructed a query to achieve the need.

Here is my Query:

CREATE TABLE `table_tgt`
CLUSTER BY v_no AS
WITH mytable AS (SELECT v_no,TRIM(crit) AS crit FROM `inp_table_1`, 
UNNEST(REGEXP_EXTRACT_ALL(crit, r'.{1,6}')) crit ORDER BY v_no )
SELECT v_no, ARRAY_AGG(STRUCT(crit, o_code)) AS critrec
FROM mytable LEFT JOIN `inp_table_2`
ON TRIM(crit) = TRIM(c_code)
GROUP BY v_no

Hope this helps others too.