We recently upgraded the cluster to use Hadoop 2.0.0-cdh4.4.0.
After the change, we needed to reinstall pig, which used to work absolutely fine. After the installation as described here, the simplest HBase job does not get created.
raw_protobuffer = LOAD 'hbase://data_table' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('external_data:downloaded', '-limit=1 -gte=0 -lte=1') AS (data:bytearray);
Which fails with the magical:
Failed Jobs: JobId Alias Feature Message Outputs N/A raw_protobuffer MAP_ONLY Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: For input string: "4f8:0:a111::add:9898" at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288) at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063) at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080) at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:319) at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.startReadyJobs(JobControl.java:239) at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:270) at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:160) at java.lang.Thread.run(Thread.java:744) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257) Caused by: java.lang.NumberFormatException: For input string: "4f8:0:a111::add:9898" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:492) at java.lang.Integer.parseInt(Integer.java:527) at com.sun.jndi.dns.DnsClient.(DnsClient.java:122) at com.sun.jndi.dns.Resolver.(Resolver.java:61) at com.sun.jndi.dns.DnsContext.getResolver(DnsContext.java:570) at com.sun.jndi.dns.DnsContext.c_getAttributes(DnsContext.java:430) at com.sun.jndi.toolkit.ctx.ComponentDirContext.p_getAttributes(ComponentDirContext.java:231) at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.getAttributes(PartialCompositeDirContext.java:139) at com.sun.jndi.toolkit.url.GenericURLDirContext.getAttributes(GenericURLDirContext.java:103) at javax.naming.directory.InitialDirContext.getAttributes(InitialDirContext.java:142) at org.apache.hadoop.net.DNS.reverseDns(DNS.java:85) at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.reverseDNS(TableInputFormatBase.java:219) at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:184) at org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat.getSplits(HBaseTableInputFormat.java:87) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274) ... 16 more
We suspected permissions to tmp folder but they seem to be fine (i.e. the job directory gets created with the pig runner (!) being its owner). Any suggestion what we might have missed would be much appreciated.