So, I have a table whose data is partitioned by date (dt) and stored in S3, where the partitions look like this:
dt=2019-03-22/
dt=2019-03-23/
dt=2019-03-24/
and so on. What I want to do is change the partitioning from this pattern to sub-partitions like this:
year=2019/month=03/day=22/
year=2019/month=03/day=23/
year=2019/month=03/day=24/
But I don't want to alter the original table, so I created an external table pointing to a different S3 location that will hold the new partition layout. I tried creating that table (with the same schema as the original one) like this:
CREATE EXTERNAL TABLE `test_partition_new`(
`order_id` string,
`outlet_code` string,
.
.
.
.
`business_date` string,
.
.
.
.
)
PARTITIONED BY (
`year` string,
`month` string,
`day` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
's3://data-test/test_partition/db.new_partition/'
TBLPROPERTIES (
'orc.compress'='SNAPPY'
)
which partitions by year, month and day respectively. From what I understand, I should now insert the data from the original table into this new one. How should I insert it so that the partition values are derived from the column 'business_date', which contains values like '2019-03-20'? Is there a function that can split this column into three columns containing the year, month and day?
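To make the question concrete, this is roughly the kind of statement I have in mind (only a sketch, not tested). I'm assuming Hive-style dynamic partitioning is available and that substr() can slice the 'business_date' string; 'test_partition_original' is just a placeholder for my actual source table name.

-- Enable dynamic partitioning so year/month/day are taken from the SELECT output
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE test_partition_new PARTITION (year, month, day)
SELECT
  order_id,
  outlet_code,
  -- ... remaining columns in the same order as the table definition ...
  business_date,
  substr(business_date, 1, 4) AS year,   -- '2019'
  substr(business_date, 6, 2) AS month,  -- '03'
  substr(business_date, 9, 2) AS day     -- '20'
FROM test_partition_original;            -- placeholder for the original table

Would something like this work, or is there a better built-in way to split the date?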