Mapreduce job to HBase throws IOException: Pass a Delete or a Put

Question

I am trying to output to a HBase table directly from my Mapper while using Hadoop2.4.0 with HBase0.94.18 on EMR.

I am getting a nasty IOException: Pass a Delete or a Put when executing the code below.

public class TestHBase {
  static class ImportMapper 
            extends Mapper<MyKey, MyValue, ImmutableBytesWritable, Writable> {
    private byte[] family = Bytes.toBytes("f");

    @Override
    public void map(MyKey key, MyValue value, Context context) {
      MyItem item = //do some stuff with key/value and create item
      byte[] rowKey = Bytes.toBytes(item.getKey());
      Put put = new Put(rowKey);
      for (String attr : Arrays.asList("a1", "a2", "a3")) {
        byte[] qualifier = Bytes.toBytes(attr);
        put.add(family, qualifier, Bytes.toBytes(item.get(attr)));
      }
      context.write(new ImmutableBytesWritable(rowKey), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    String input = args[0];
    String table = "table";
    Job job = Job.getInstance(conf, "stuff");

    job.setJarByClass(ImportMapper.class);
    job.setInputFormatClass(SequenceFileInputFormat.class);
    FileInputFormat.setInputDirRecursive(job, true);
    FileInputFormat.addInputPath(job, new Path(input));

    TableMapReduceUtil.initTableReducerJob(
            table,                  // output table
            null,                   // reducer class
            job);
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Does anyone know what I am doing wrong?

Stacktrace

Error: java.io.IOException: Pass a Delete or a Put at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:125) at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:84) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:646) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112) at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143

In your provided code: context.write(new ImmutableBytesWritable(rowKey), put); its outside of the map method. Please fix it first because it doesn't match what the traceback is showing... — Rubén Moraleda
Thanks for pointing that out Ruben, it was a copy/paste mistake — Marsellus Wallace

Tejus Prasad Tejus Prasad · Accepted Answer · 2015-02-19T16:16:02

It would be better if you can show the full stack trace, so that i can help you solve it easily. I've not executed your code. As far as i've seen your code, this could be the issue
job.setNumReduceTasks(0);

Mapper will be expecting your put object to write directly to Apache HBase. You can increase the setNumReduceTasks OR If you see the API you can find its default value and comment it.

Mapreduce job to HBase throws IOException: Pass a Delete or a Put

2 Answers