I ran a Sqoop import from MySQL and got a CSV file. Its contents are as follows:

1,KM,Skypark,null,2017-02-21 14:40:49.0,null
2,KM,null,null,2017-02-21 14:40:49.0,null
3,HD,null,null,2017-02-21 14:40:49.0,null
4,AB,SD,USA,2017-02-21 14:40:49.0,null
5,ABa,SaD,US,2017-02-21 14:40:49.0,null
6,DF,SDF,SF,2017-02-21 14:40:49.0,null
7,DF,SDF,SF,2017-02-21 14:41:44.0,null
8,DF,SDF,SF,2017-02-21 14:44:55.0,null
9,DF,SDF,SF,2017-02-21 14:47:59.0,null

I then ran the same Sqoop import as a Parquet file and got a file with a .parquet extension.

I want to create a table from the Parquet file. I tried the following, but it gave me a different, weird error.

create external table test(id int, name string, address string, nation string, date string) row format delimited fields terminated by ',' stored as parquet;

load data inpath '/user/XXXXX/test' into table test;

How do I get the Parquet table to give me exactly the same result as the CSV table?

Say I get incremental data in the same folder where I stored the previous data, with records for IDs 10 and 11. When I load the data from the folder into the Parquet table, the incremental data appears first, followed by the initial data.

That is, the table looks like this:

10 ..............
11 ..............
 1 ..............
 2 ..............

I want the initial records to come first and the incremental data last.

How can we achieve that?

1 Answer

You don't need to specify the following clause when you create Parquet tables:

row format delimited fields terminated by ','

Specifying stored as parquet is enough. Point the external table at the import folder with location, so the files there are read in place:

create external table test(id int, name string, address string, nation string, date string) stored as parquet location '/user/XXXXX/test';
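
On the ordering part of the question: as far as I know, Hive does not guarantee any particular row order for a plain select, so the order you see depends on how the files in the folder happen to be read. If you need the initial records first and the incremental ones last, sort explicitly when you query. A minimal sketch, assuming the table test and the id column from the statement above:

```sql
-- A plain SELECT has no guaranteed row order, so sort on id explicitly.
-- Assumes ids increase with each incremental import (1..9, then 10, 11).
SELECT *
FROM test
ORDER BY id;
```

Note that ORDER BY in Hive performs a global sort, which can be slow on large tables, so it is usually best applied only in the queries that actually need ordered output.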