We have a Hive table that is partitioned by date. It is currently stored in SequenceFile format, and I want to convert it to a Parquet table.

Is it possible for new partitions to use the Parquet SerDe while older partitions stay in SequenceFile format, so that I don't need to backfill the table?

Why not make a separate table? CREATE TABLE t2 LIKE t STORED AS PARQUET? - OneCricketeer
@cricket_007 But then I would need to backfill it by converting the SequenceFiles to Parquet files (for 2-3 years of history data). It would also have a different table name, which could break the pipeline (though that could be fixed in several ways). - rajnish
You cannot mix serdes. It's a table level setting, not partition level - OneCricketeer

1 Answer

  1. Create an empty external table with the default SerDe (LazySimpleSerDe) and the default storage format (TEXTFILE).

  2. Add a partition.

  3. Alter that partition to set its file format (or its SerDe).

Hive LanguageManual DDL

-- Empty external table with the default SerDe and file format (TEXTFILE)
CREATE EXTERNAL TABLE test (ip STRING, localTime STRING)
PARTITIONED BY (partition__hive__ STRING)
LOCATION '/tmp/table/empty';

-- Partition p_0: files under this location are read as Parquet
ALTER TABLE test ADD PARTITION (partition__hive__='p_0') LOCATION 'hdfs://hdfsTest/hive/table/test/2018/11/21/08';
ALTER TABLE test PARTITION (partition__hive__='p_0') SET FILEFORMAT PARQUET;

-- Partition p_1: files under this location are read with the JSON SerDe
ALTER TABLE test ADD PARTITION (partition__hive__='p_1') LOCATION 'hdfs://hdfsTest/hive/table/test/2018/11/21/09';
ALTER TABLE test PARTITION (partition__hive__='p_1') SET SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
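
Applied to the table in the question, the same idea might look like the sketch below. The table name `events`, the partition column `dt`, and the date values are hypothetical, since the question does not give them. It also assumes that changing the table-level file format only affects partitions added afterwards, while existing partitions keep the storage metadata they were created with. That is worth confirming on your Hive version before relying on it.

```sql
-- Hypothetical names: events (table), dt (partition column).
-- Assumption: existing partitions keep their own SequenceFile metadata.

-- Change the table-level default so partitions added from now on use Parquet.
ALTER TABLE events SET FILEFORMAT PARQUET;

-- A partition added after the change should pick up the new default.
ALTER TABLE events ADD PARTITION (dt='2018-11-22');

-- Compare an old and a new partition: check InputFormat, OutputFormat
-- and SerDe Library in the output of each statement.
DESCRIBE FORMATTED events PARTITION (dt='2018-11-21');
DESCRIBE FORMATTED events PARTITION (dt='2018-11-22');
```

If the old partition still reports the SequenceFile input format and the new one reports Parquet, no backfill is needed and the table name stays the same. If an individual partition shows the wrong format, it can be corrected with the per-partition `SET FILEFORMAT` statement shown in the answer.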