2
votes

I have dataset in hive table like below

date           col1      col2  col3
2016-02-01      A         X     5 
2016-02-03      A         X     5
2016-02-04      A         X     5
2016-03-01      A         X     6
2016-04-01      A         X     5
2016-04-02      A         Y     5
2016-04-03      A         Y     5

I have to do selective grouping on col 1 and col2 where in one group is formed whenever col3 value changes , for example col3 value is changed from 5 to 6 in row4 , I have to take the date column and get the min and max values out of it. the output should be like this .

col1 col2 col3    minDate           maxDate
A     X    5     2016-02-01         2016-02-04
A     X    6     2016-03-01         2016-03-01
A     X    5     2016-04-01         2016-04-01
A     Y    5     2016-04-02         2016-04-03  

I am sure simple group by on col1 and col2 can not work. Can anyone please suggest a way to achieve this please ?

1

1 Answers

2
votes
select      col1,col2,col3
           ,min(date) as minDate
           ,max(date) as maxDate

from       (select      *
                       ,row_number () over 
                        (
                            partition by    col1,col2
                            order by        date
                        ) as rn_part_1_2

                       ,row_number () over 
                        (
                            partition by    col1,col2,col3
                            order by        date
                        ) as rn_part_1_2_3                            

            from        mytable
            ) t

group by    col1,col2,col3
           ,rn_part_1_2 - rn_part_1_2_3

order by    col1,col2
           ,minDate
;           

+------+------+------+------------+------------+
| col1 | col2 | col3 |  mindate   |  maxdate   |
+------+------+------+------------+------------+
| A    | X    |    5 | 2016-02-01 | 2016-02-04 |
| A    | X    |    6 | 2016-03-01 | 2016-03-01 |
| A    | X    |    5 | 2016-04-01 | 2016-04-01 |
| A    | Y    |    5 | 2016-04-02 | 2016-04-03 |
+------+------+------+------------+------------+