0
votes

I'm trying to insert data from a csv file into a hive table where date in csv files are in formats like 'mm/dd/yyyy','mm-dd-yyyy' which i have to insert in a hive table in a column whose data-type is date i.e 'yyyy-mm-dd'.

Firstly i tried to load data from csv file inside a table whose date is stored with data-type as string. Then, i tried to insert the same data into a new table whose data-type is date but i was only able to load dates inside table of one format while other format stored as NULL.

create table sample1(order_id int, order_dt string);
load data local inpath "\home\cloudera\data.txt" into table sample1;

create table sample2(order_id int, order_dt date);   
insert into table sample2 select order_id, 
to_date(from_unixtime(unix_timestamp(order_dt,'mm/dd/yyyy'),'yyyy-mm-dd')) 
from sample1; 

csv file:

order_id order_date

1 10/27/2016

2 10/27/2018

3 11/23/2016

4 09-23-2013

5 08-20-2010

6 05-13-2017

7 02/15/2009

output after:

select * from sample2;

1 2016-10-27

2 2018-10-27

3 2016-11-23

4 NULL

5 NULL

6 NULL

7 2009-02-15

Here, I'm getting NULL for other date format.

1

1 Answers

0
votes

You can use COALESCE function here. if first to_date function returns null then it will execute the 2nd argument, you can pass any number of patterns this way.

insert into table sample2 
select 
    order_id, 
    COALESCE(to_date(from_unixtime(unix_timestamp(order_dt,'mm/dd/yyyy'),'yyyy-mm-dd')),to_date(from_unixtime(unix_timestamp(order_dt,'mm-dd-yyyy'),'yyyy-mm-dd'))) 
from sample1;