Creating a partitioned hive table from a non partitioned table

Question

I have a Hive table which was created by joining data from multiple tables. The data for this resides in a folder which has multiple files ("0001_1" , "0001_2", ... and so on). I need to create a partitioned table based on a date field in this table called pt_dt (either by altering this table or creating a new one). Is there a way to do this?

I've tried creating a new table and inserting into it (below) which did not work

create external table table2 (acct_id bigint, eval_dt string)
partitioned by (pt_dt string);
insert into table2
partition (pt_dt) 
select acct_id, eval_dt, pt_dt
from jmx948_variable_summary;

This throws the error

"FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask MapReduce Jobs Launched: Stage-Stage-1: Map: 189 Cumulative CPU: 401.68 sec HDFS Read: 0 HDFS Write: 0 FAIL Total MapReduce CPU Time Spent: 6 minutes 41 seconds 680 msec"

veemo veemo · Accepted Answer · 2015-08-17T21:01:09

Was able to figure it out after some trial & error.

Enable dynamic partitioning in Hive:

SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

Create schema for partitioned table:

CREATE TABLE table1 (id STRING, info STRING)
PARTITIONED BY ( tdate STRING);

Insert into partitioned table :

FROM table2 t2
INSERT OVERWRITE TABLE table1 PARTITION(tdate)
SELECT t2.id, t2.info, t2.tdate
DISTRIBUTE BY tdate;

Creating a partitioned hive table from a non partitioned table

2 Answers