I have a comma-separated values (CSV) file.
Data example:
1001,Laptop,beautify,laptop amazing price,<HTML>XYZ</HTML>,1345
1002,Camera,Best Mega Pixel,<HTML>ABC</HTML>,4567
1003,TV,Best Price,<HTML>DEF</HTML>,8791
Each record should have only 5 columns: id, Device, Description, HTML Code, Identifier.
For a few of the records there is an extra comma (,) inside the Description column.
For example, the first record in the sample data above has an extra comma inside its Description value [beautify,laptop amazing price], which I want to eliminate.
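To make the intended result concrete, here is a minimal Python sketch (not Pig) of the repair I am after. It assumes the first two fields (id, Device) and the last two fields (HTML Code, Identifier) never contain a comma, so anything in between belongs to Description. The helper name `repair_row` and the choice to replace the extra comma with a space are my own assumptions for illustration.

```python
def repair_row(line, n_cols=5):
    """Split one CSV line and fold surplus commas back into Description.

    Assumption: only the third column (Description) can contain extra commas;
    id and Device come first, HTML Code and Identifier come last.
    """
    parts = line.rstrip("\n").split(",")
    if len(parts) < n_cols:
        raise ValueError("too few fields: %r" % line)

    head = parts[:2]                  # id, Device
    tail = parts[len(parts) - 2:]     # HTML Code, Identifier
    middle = parts[2:len(parts) - 2]  # one or more pieces of Description

    # Join the Description pieces with a space instead of the stray comma.
    description = " ".join(piece.strip() for piece in middle)
    return head + [description] + tail


if __name__ == "__main__":
    sample = [
        "1001,Laptop,beautify,laptop amazing price,<HTML>XYZ</HTML>,1345",
        "1002,Camera,Best Mega Pixel,<HTML>ABC</HTML>,4567",
    ]
    for row in sample:
        print(repair_row(row))
```

With this rule the first record becomes five fields, with Description equal to "beautify laptop amazing price", while records that already have five fields are left unchanged.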
While loading data into PIG 5:
INFILE1 = LOAD 'file1.csv' USING PigStorage(',') AS (id, Device, Description, HTML_Code, Identifier);
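This creates a data issue: PigStorage(',') splits on every comma, so in the records with the extra comma the values after Description shift into the wrong columns.
Could you please suggest how to handle this in a Pig script?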