0
votes

I have some variant structure in a source table that I need to read, lookup and get into another variant structure into a target table.

-- Source table: clicks

create table clicks
(payload variant
);

-- set data in clicks

insert into clicks
select 
object_construct(
  'language', 'en-us',
  'browser', 'chrome',  
  'color', '35',
  'event_ids', '190,195'
)
union
object_construct(
  'language', 'en-us',
  'browser', 'chrome',  
  'color', '32',
  'event_ids', '201,203,190,195'
)
union
select 
object_construct(
  'language', 'en-us',
  'browser', 'mozilla',  
  'color', '38',
  'event_ids', '188,190,202,203,195,176'
);

-- Target table: clicks_processed

create table clicks_processed
(payload variant,
 click_events variant )

Lookup table: click_event

create table click_event
(
  key varchar,
  event_id varchar,
  desc varchar  
)

-- set up data for click_event

insert into click_event (key,event_id,desc)
select 'event201','201','pic_click'
union
select 'event202','202','vid_view'
union
select 'event203','203','download'

Requirement: Insert records into clicks_processed table.

Payload column of clicks_processed table is the same payload from clicks table.

click_events column of clicks_processed table: event_ids key has to be pulled out of the clicks.payload and joined with lookup table "click_event" to form a variant that will be inserted to click_events column of clicks_processed

Desired output: clicks_processed (only matching event_id from click_event get populated under click_events)

data of clicks_processed

Note: The first record has null for click_events because none of the event_ids match from the click_event lookup table (still need the record)

Current code I've come up with:

select max(payload),
object_agg(e.key,object_construct(
'code', e.event_id,
'desc', e.desc
)) as click_events
from clicks c, table(flatten(input => split(c.payload:event_ids,','))) as f
inner join click_event e on e.event_id = f.value
group by seq

This works fine if all the records have matching event_ids from that of click_event otherwise not (first record in the example above), a left join with click_event also didn't help.

Also, not sure if this is the best piece of code, especially worried about usage of max(payload) and group by. In real scenario, the payload is huge with about 500 keys and the event_ids can hold around 30-50 events which means the flatten also blows up as many records.

Thanks!

1

1 Answers

0
votes
select any_value(payload),
       object_agg(e.key,object_construct(
            'code', e.event_id,
            'desc', e.desc
            )) as click_events
from (
  select f.seq, f.value::varchar as value, c.payload
  from clicks c, 
  lateral flatten(input => split(c.payload:event_ids,',')) as f
  ) a
left join click_event e
  on a.value = e.event_id
group by a.seq;

Flatten the array and then join back to the click_event as a left join. This gives you an empty array as opposed to a NULL, though.

Also, leverage any_value() function over the max() function, since that value will always be the same across the seq value.