I am trying to use org.apache.pig.piggybank.storage.MultiStorage from the piggybank.jar archive. I downloaded the Pig trunk and built piggybank.jar by following the instructions here. However, I get the error below when I use the MultiStorage class:

Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected

From what I read here, this looks like a version incompatibility between the Hadoop version the piggybank build targets and the Hadoop version I am running. However, I have not been able to fix it. I would really appreciate any help (I have already spent an inordinate amount of time on this).

pig.hadoop.version: 2.0.0-cdh4.1.0

> hadoop version

Hadoop 2.0.0-cdh4.1.0
Subversion file:///data/1/jenkins/workspace/generic-package-ubuntu64-10-04/CDH4.1.0-Packaging-Hadoop-2012-09-29_10-56-25/hadoop-2.0.0+541-1.cdh4.1.0.p0.27~lucid/src/hadoop-common-project/hadoop-common -r 5c0a0bddbc2aaff30a8624b5980cd4a2e1b68d18
Compiled by jenkins on Sat Sep 29 11:26:31 PDT 2012
From source with checksum 95f5c7f30b4030f1f327758e7b2bd61f

2 Answers

Although I have not been able to figure out how to build a compatible piggybank.jar, I found a compatible piggybank.jar under /usr/lib/pig/.

I faced a similar issue when I used piggybank 0.13 with Hadoop 2.4.0.2.1.5.0-695. However, it worked when I used the piggybank jar from the location you mentioned, /usr/lib/pig.

One additional observation: the piggybank jar in /usr/lib/pig is quite old and lacks XPath and other newer functions. I believe the newer piggybank jar depends on a later Hadoop version.
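
For anyone debugging the same error: as far as I know, org.apache.hadoop.mapreduce.TaskAttemptContext is a concrete class in Hadoop 1.x and an interface in Hadoop 2.x, so a jar compiled against one fails at runtime on the other with exactly this message. Below is a minimal sketch for checking which flavour is on your classpath. The class name HadoopApiCheck and the helper describe are made up for illustration, and the sketch assumes you run it with the same Hadoop jars on the classpath that Pig uses.

```java
public class HadoopApiCheck {

    // Hypothetical helper: reports whether the named type resolves to an
    // interface or a class on the current classpath, without initializing it.
    static String describe(String className) {
        try {
            Class<?> c = Class.forName(
                    className, false, HadoopApiCheck.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            // Hadoop jars are not on the classpath at all.
            return "not found";
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.hadoop.mapreduce.TaskAttemptContext";
        // "interface" suggests a Hadoop 2.x style API, "class" a Hadoop 1.x one.
        System.out.println(name + " -> " + describe(name));
    }
}
```

If this prints "interface" while the piggybank jar you built was compiled against a Hadoop 1.x API, that mismatch would explain the error, and rebuilding piggybank against the cluster's Hadoop version (or using the jar shipped under /usr/lib/pig) should avoid it.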