I'm trying to generate some Parquet files with Hive. To accomplish this, I loaded a regular Hive table from some .tbl files using this command in Hive:

CREATE TABLE REGION ( R_REGIONKEY BIGINT, R_NAME STRING, R_COMMENT STRING)

ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE location '/tmp/tpch-generate';

After this, I just execute these two lines:

create table parquet_region LIKE region STORED AS PARQUET;

insert into parquet_region select * from region;

But when I check the output generated in HDFS, I don't find any .parquet file. Instead I find files with names like 0000_0 to 0000_21, and the sum of their sizes is much bigger than the original .tbl file.

What am I doing wrong?

1 Answer

The INSERT statement doesn't give the files it writes an extension, but these are the Parquet files.

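You can use DESCRIBE FORMATTED <table> to show the table's storage information. For a Parquet table, the SerDe Library, InputFormat and OutputFormat entries in the output point to Hive's Parquet classes.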

hive> DESCRIBE FORMATTED <table_name>
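
If you want to check the files themselves rather than the table metadata, here is a minimal sketch in Python. It relies on the fact that a Parquet file starts and ends with the 4-byte marker PAR1, and it assumes you have first copied one of the output files to local disk (for example with hdfs dfs -get). The function name is_parquet_file is just a name picked for this illustration, not part of Hive or of any Parquet library.

```python
import os

PARQUET_MAGIC = b"PAR1"  # 4-byte marker at the start and end of a Parquet file


def is_parquet_file(path):
    """Return True if the local file starts and ends with the Parquet magic bytes.

    Hypothetical helper for illustration; it only inspects the magic bytes
    and does not validate the footer or the data pages.
    """
    size = os.path.getsize(path)
    # Smallest possible layout: leading magic + 4-byte footer length + trailing magic.
    if size < 12:
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

Calling is_parquet_file("0000_0") on a copied output file should return True if Hive really wrote Parquet, while a plain delimited text file such as the original .tbl should return False.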

Additional note: you can also create a new table from a source table with the query below:

CREATE TABLE new_test STORED AS PARQUET AS SELECT * FROM source_table;

It creates the new table in Parquet format and copies the structure as well as the data.