0
votes

I am using the below command to load the file, when I try to dump or illustrate the loaded data, it fails with the below error. I have checked the sanity of the data, each line contains correct number of delimiters, but when the field is empty the delimiter immediately follows , I tried loading the below single sample line. It does not work.

hs_2_inr = LOAD 'hs_2_inr.dat' USING PigStorage('^') as ( year:chararray, country:chararray, s_no:chararray, hs_8:chararray, hs_8_desc:chararray, prevyr_inr:chararray, curyr_inr:chararray, growth:chararray, dummy:chararray);

Here is the sample data

1997^BOTSWANA^1.^10063001^*RICE PARBOILED^^2.43^^

Below is the exception

2013-06-30 21:02:23,015 [main] ERROR org.apache.pig.pen.AugmentBaseDataVisitor - No (valid) input data found!
java.lang.RuntimeException: No (valid) input data found!
    at org.apache.pig.pen.AugmentBaseDataVisitor.visit(AugmentBaseDataVisitor.java:583)
    at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:229)
    at org.apache.pig.pen.util.PreOrderDepthFirstWalker.depthFirst(PreOrderDepthFirstWalker.java:82)
    at org.apache.pig.pen.util.PreOrderDepthFirstWalker.depthFirst(PreOrderDepthFirstWalker.java:84)
    at org.apache.pig.pen.util.PreOrderDepthFirstWalker.walk(PreOrderDepthFirstWalker.java:66)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:180)
    at org.apache.pig.PigServer.getExamples(PigServer.java:1180)
    at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:739)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:626)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:323)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:538)
    at org.apache.pig.Main.main(Main.java:157)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
2013-06-30 21:02:23,016 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Encountered IOException. Exception 

so how do I load a file with empty fields in pig?

1
Just found that ILLUSTRATE is the culprit, if i run ILLUSTRATE first, it breaks DUMP as well. Whereas, running DUMP without running ILLUSTRATE works fine. Probably a bug?vumaasha

1 Answers

3
votes

Your code works fine. As you mentioned in your comment, ILLUSTRATE was your problem. Per the docs, ILLUSTRATE wasn't being maintained for a while. Do not rely on it. You shouldn't need it in any non-diagnostic code anyway. Use DESCRIBE instead.

In newer Pig versions, the warning on ILLUSTRATE seems to have gone away, so it may be safe again, but I'd still rely more heavily on DESCRIBE to avoid a source of potential issues. In Pig 0.10, which I'm on, ILLUSTRATE still gave me the same error you received.