I have some variant structure in a source table that I need to read, lookup and get into another variant structure into a target table.
-- Source table: clicks
create table clicks
(payload variant
);
-- set data in clicks
insert into clicks
select
object_construct(
'language', 'en-us',
'browser', 'chrome',
'color', '35',
'event_ids', '190,195'
)
union
object_construct(
'language', 'en-us',
'browser', 'chrome',
'color', '32',
'event_ids', '201,203,190,195'
)
union
select
object_construct(
'language', 'en-us',
'browser', 'mozilla',
'color', '38',
'event_ids', '188,190,202,203,195,176'
);
-- Target table: clicks_processed
create table clicks_processed
(payload variant,
click_events variant )
Lookup table: click_event
create table click_event
(
key varchar,
event_id varchar,
desc varchar
)
-- set up data for click_event
insert into click_event (key,event_id,desc)
select 'event201','201','pic_click'
union
select 'event202','202','vid_view'
union
select 'event203','203','download'
Requirement: Insert records into clicks_processed table.
Payload column of clicks_processed table is the same payload from clicks table.
click_events column of clicks_processed table: event_ids key has to be pulled out of the clicks.payload and joined with lookup table "click_event" to form a variant that will be inserted to click_events column of clicks_processed
Desired output: clicks_processed (only matching event_id from click_event get populated under click_events)
Note: The first record has null for click_events because none of the event_ids match from the click_event lookup table (still need the record)
Current code I've come up with:
select max(payload),
object_agg(e.key,object_construct(
'code', e.event_id,
'desc', e.desc
)) as click_events
from clicks c, table(flatten(input => split(c.payload:event_ids,','))) as f
inner join click_event e on e.event_id = f.value
group by seq
This works fine if all the records have matching event_ids from that of click_event otherwise not (first record in the example above), a left join with click_event also didn't help.
Also, not sure if this is the best piece of code, especially worried about usage of max(payload) and group by. In real scenario, the payload is huge with about 500 keys and the event_ids can hold around 30-50 events which means the flatten also blows up as many records.
Thanks!
