0
votes

Honestly, I don't know how simply I can describe the question on the title line instead of showing an example.

I have a hive table which contains two columns: ID and date

 ID      Date
 31    01-01-2017
 31    01-02-2017
 31    01-03-2017
123    01-01-2017
123    01-01-2017
...

In this table, I would like to include another column which is hour such as below

 ID      Date        Hour
 31    01-01-2017      00
 31    01-01-2017      01
 31    01-01-2017      02
 31    01-01-2017      03
 31    01-01-2017      04
...
 31    01-01-2017      23
 31    01-02-2017      00
 31    01-02-2017      01
...

Basically, for every row, I would like add an hour column of values from 00 to 23. Can this be achieved using hive? Thank you so much.

1
Is it internal or external table? Do you want that column to be a static or dynamic calculated column? - Bala
It is an internal table. Not sure the difference between static and dynamic - user4279562
For your requirement there is no straight and small HQL/SQL that can do the job, but there is a work around for it, I’m hoping your third field is dynamic as I see in the example. If you are comfortable creating an other temporary table to populate the main table I can provide the solution for it, let me know.. - roh

1 Answers

1
votes

You could create a temporary table which contains entries from 0 to 23 and do a cross join with the table you have. Or you can leverage on the CTE function a CTE table with entries from 0 to 23 and then do a cross join with it.

An example:

with temp as (
select 0 hour union all
select 1 hour union all
select 2 hour union all
select 3 hour union all
select 4 hour union all
select 5 hour union all
select 6 hour union all
select 7 hour union all
select 8 hour union all
select 9 hour union all
select 10 hour union all
select 11 hour union all
select 12 hour union all
select 13 hour union all
select 14 hour union all
select 15 hour union all
select 16 hour union all
select 17 hour union all
select 18 hour union all
select 19 hour union all
select 20 hour union all
select 21 hour union all
select 22 hour union all
select 23 hour
)
select * from table join temp

You can also insert the result into a table to persist the result. Hope it helps