
I have a file in my local HDFS whose fields are delimited by ':::'.

However, when I run the following command

A = load '/user/vishal/WordCount/hw3data/c0001' using PigStorage(':::') as (a, b, c);

it gives me the following error:

ERROR 1200: could not instantiate 'PigStorage' with arguments '[:::]'

What exactly could be the issue?

Thanks

I think Pig does not allow a multi-character delimiter. You would have to change the file's delimiter to a single character yourself, for example with the Unix tr command, and then try the load again. - Balaswamy Vaddeman

1 Answer


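PigStorage supports only a single-character delimiter, which is why it cannot be instantiated with ':::'.
You can either follow Donald's answer or, if you don't want to write a custom loader, have a look at MyRegExLoader from piggybank. It takes a regular expression and maps each capture group to one field. In your case it would look something like this: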

REGISTER '/my_pig_home/contrib/piggybank/java/piggybank.jar';
A = LOAD '/user/vishal/WordCount/hw3data/c0001' 
  USING org.apache.pig.piggybank.storage.MyRegExLoader(
    '([^\\:]+):::([^\\:]+):::([^\\:]+)') 
      as (a:chararray, b:chararray, c:chararray);
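
If registering piggybank is not an option, another approach that may work is to load each line as a single string and split it yourself. The following is only a sketch: it assumes a Pig release that ships the built-in TextLoader and STRSPLIT functions, and that every line has exactly three fields. The relation names raw and A are arbitrary.

```pig
-- Load every line as one chararray field; no delimiter handling happens here.
raw = LOAD '/user/vishal/WordCount/hw3data/c0001'
  USING TextLoader() AS (line:chararray);

-- STRSPLIT treats ':::' as a regular expression and returns a tuple;
-- the limit of 3 caps the result at three pieces (assumed field count).
-- FLATTEN turns that tuple into top-level fields.
A = FOREACH raw GENERATE
  FLATTEN(STRSPLIT(line, ':::', 3))
    AS (a:chararray, b:chararray, c:chararray);
```

Unlike the regex loader, this does not discard lines that fail to match a pattern: a line with fewer than three fields produces a shorter tuple, so you may want to filter or validate A before relying on b and c.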