I get an error when I try to load or store data from Apache Pig in any format other than CSV. Here is my Pig script:
REGISTER /usr/local/Cellar/pig/0.15.0/libexec/*.jar
REGISTER /usr/local/Cellar/pig/0.15.0/libexec/lib/*.jar
REGISTER /usr/local/Cellar/hbase/1.1.2/libexec/lib/*.jar
REGISTER /usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/*.jar
REGISTER /usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/*.jar
REGISTER /usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/*.jar
REGISTER /usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/lib/*.jar
REGISTER /usr/local/Cellar/pig/0.15.0/libexec/lib/piggybank.jar
--generated_data = LOAD 'tableHive' USING org.apache.hive.hcatalog.pig.HCatLoader(',') AS (level:chararray, score:INT, attraction:chararray);
generated_data = LOAD 'CSVResults/ireland.csv' USING PigStorage(',') AS (level:chararray, score:INT, attraction:chararray);
DUMP generated_data;
fiveRating = FILTER generated_data BY (float)score>4;
level6 = FILTER fiveRating BY (float)level>5;
groupedbylevel = group level6 by attraction;
countAttractions = FOREACH groupedbylevel {
level6Attractions = CROSS level6.level;
generate group, COUNT(level6Attractions) AS listBylevel6;
};
orderlist = ORDER countAttractions BY listBylevel6 DESC;
limitorder = LIMIT orderlist 20;
STORE limitorder into 'Level6AttractionsIreland-limited2' using PigStorage(',');
STORE countAttractions into 'hbase://Level6AttractionsIreland' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('Ireland:level Ireland:score Ireland:attraction');
STORE countAttractions INTO 'Level6AttractionsIreland' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(',');
And here is the error from the Pig log file:
Pig Stack Trace
---------------
ERROR 2999: Unexpected internal error. java.io.IOException: java.lang.reflect.InvocationTargetException
java.lang.RuntimeException: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:211)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.getOutputFormat(HBaseStorage.java:928)
at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:69)
at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.newplan.logical.relational.LogicalPlan.validate(LogicalPlan.java:212)
at org.apache.pig.PigServer$Graph.compile(PigServer.java:1767)
at org.apache.pig.PigServer$Graph.access$300(PigServer.java:1443)
at org.apache.pig.PigServer.execute(PigServer.java:1356)
at org.apache.pig.PigServer.executeBatch(PigServer.java:415)
at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:631)
at org.apache.pig.Main.main(Main.java:177)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:459)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:436)
at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:317)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:198)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:160)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:206)
... 25 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:457)
... 30 more
Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:481)
at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:907)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:701)
... 35 more
Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 41 more
================================================================================
As you can see, I've tried to register every jar file that might be relevant. I've also stripped the script down to just the load and store commands to check whether the rest of the code is causing the problem, but I get the same result. I am very new to Pig, so apologies if this is a silly mistake. I have searched for answers elsewhere, but nothing has worked for me so far. I am on a Mac with Hadoop, HBase and Hive installed locally, and I run the script with 'pig -x local test.pig' in the terminal. Any advice would be great, thanks!
Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace... Does that mean anything to you? - OneCricketeer
[...] /usr/local/Cellar). You should run code from the VM, where all the environment variables and paths are correctly set up. - OneCricketeer
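One way to follow up on the NoClassDefFoundError hint is to check whether any of the registered jars actually contains the class named in the trace. The snippet below is only a rough sketch: the helper name find_class_in_jars is made up for illustration, the glob patterns are copied from the REGISTER lines in the question, and the second class name (org.apache.htrace.Trace) is an assumption about where the class might live in a different htrace release, not something the log confirms.

```python
import glob
import zipfile


def find_class_in_jars(class_name, jar_globs):
    """Return the jar paths (sorted per pattern) that contain class_name.

    class_name is a dotted Java name such as "org.cloudera.htrace.Trace".
    jar_globs is a list of glob patterns pointing at jar files.
    """
    # A class a.b.C is stored inside a jar as the entry "a/b/C.class".
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for pattern in jar_globs:
        for jar_path in sorted(glob.glob(pattern)):
            try:
                with zipfile.ZipFile(jar_path) as jar:
                    if entry in jar.namelist():
                        hits.append(jar_path)
            except (zipfile.BadZipFile, OSError):
                # Skip files that are unreadable or not valid zip archives.
                continue
    return hits


if __name__ == "__main__":
    # Patterns taken from the REGISTER statements in the question.
    jar_globs = [
        "/usr/local/Cellar/pig/0.15.0/libexec/*.jar",
        "/usr/local/Cellar/pig/0.15.0/libexec/lib/*.jar",
        "/usr/local/Cellar/hbase/1.1.2/libexec/lib/*.jar",
    ]
    # The first name comes from the stack trace; the second is an assumption.
    for name in ("org.cloudera.htrace.Trace", "org.apache.htrace.Trace"):
        hits = find_class_in_jars(name, jar_globs)
        print(name, "->", hits if hits else "not found")
```

If the first class is not found in any jar, that would be consistent with the ClassNotFoundException at the bottom of the trace. It would also suggest that the HBase client classes Pig is loading expect a different htrace jar than the ones registered.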