I have a CSV file whose fields are wrapped in a text qualifier (double quotes). I want to load the data into HDFS using Pig, Hive, or HBase without the text qualifiers. Please help.

My file (input.CSV):
"Id","Name"
"1","Raju"
"2","Anitha"
"3","Rakesh"

I want output like:

Id,Name
1,Raju
2,Anitha
3,Rakesh

1 Answer

Try this Pig script.

Suppose your input file is named input.csv.

1. First, copy the input file to HDFS using the hadoop fs -copyFromLocal command.
2. Then run the Pig script below.

PigScript:
HDFS mode:

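-- Read each line whole; declare the type so REGEX_EXTRACT_ALL receives a chararray.
A = LOAD 'hdfs://<hostname>:<port>/user/test/input.csv' AS (line:chararray);
-- Capture the text inside each pair of quotes. id stays chararray because the header value "Id" is not a valid int.
B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT_ALL(line,'"([^"]*)","([^"]*)"')) AS (id:chararray,name:chararray);
STORE B INTO '/user/test/output' USING PigStorage(',');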

Local mode:

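-- Same logic with local paths (run with: pig -x local). id stays chararray so the header row is kept.
A = LOAD 'input.csv' AS (line:chararray);
B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT_ALL(line,'"([^"]*)","([^"]*)"')) AS (id:chararray,name:chararray);
STORE B INTO 'output' USING PigStorage(',');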

Output:

Id,Name
1,Raju
2,Anitha
3,Rakesh
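
If the regex feels heavy for a file this simple, a rough alternative is to split on commas at load time and strip the quotes from each field with the built-in REPLACE function. This is only a sketch, and it assumes that no field contains a comma or an embedded quote; the paths and the aliases A and B are the same illustrative ones used above.

```pig
-- Split each line on commas; every field still carries its surrounding quotes.
A = LOAD 'input.csv' USING PigStorage(',') AS (id:chararray, name:chararray);
-- Remove every double quote from each field (assumes quotes appear only as qualifiers).
B = FOREACH A GENERATE REPLACE(id, '"', '') AS id, REPLACE(name, '"', '') AS name;
STORE B INTO 'output' USING PigStorage(',');
```

Both fields are kept as chararray so the header row passes through unchanged. If a quoted field can contain commas, this approach will split it in the wrong place, and the regex version is the safer choice.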