1
votes

Is there any way to convert below LEAD function into HIVE QL format??

NVL(LEAD(START_DT) OVER (PARTITION BY EV_ID,AR_EV_RLTNSHP_TYPE_CD ORDER BY START_DT)-1,'2099-12-31') AS DERIVED_END_DT

PFB the error:

FAILED: ParseException line 1:1599 missing ) at 'OVER' near '(' in subquery source line 1:1603 missing FROM at '(' near '(' in subquery source line 1:1604 cannot recognize input near 'PARTITION' 'BY' 'EV_ID' in subquery source

1

1 Answers

1
votes

It is complicated in HiveSQL but you can do it with a left join and aggregation:

select t.ev_id, t.ar_ev_rltnshp_type_cd, t.start_date,
       coalesce(min(tnext.start_dt) - 1, '2099-12-31') as derived_end_dt
from table t left join
     table tnext
     on t.ev_id = tnext.ev_id and t.ar_ev_rltnshp_type_cd = tnext.ar_ev_rltnshp_type_cd and
        tnext.start_date > t.start_date
group by t.ev_id, t.ar_ev_rltnshp_type_cd, t.start_date;

This makes certain assumptions about start_date being unique within a given group, but it will probably work for your purposes.