
I have a simple map/reduce job that scans one HBase table and modifies another. The Hadoop job seems to complete successfully, but when I check the target HBase table, the entry is not there.

Here is the Hadoop program:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class HBaseInsertTest extends Configured implements Tool {

    public int run(String[] args) throws Exception {
        String table = "duplicates";

        Scan scan = new Scan();

        Job job = new Job(getConf(), "HBaseInsertTest");

        TableMapReduceUtil.initTableMapperJob(table, scan, Mapper.class, /* mapper output key = */null,
            /* mapper output value= */null, job);
        TableMapReduceUtil.initTableReducerJob(/* output table = */ "tablecopy", /* reducer class = */ null, job);


        // Note that these are the default.

        return job.waitForCompletion(true) ? 0 : 1;
    }

    private static class Mapper extends TableMapper<ImmutableBytesWritable, Put> {
        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
        }

        @Override
        public void map(ImmutableBytesWritable row, Result columns, Context context) throws IOException {
            long id = 1260018L;

            try {
                Put put = new Put(Bytes.toBytes(id));
                put.add(Bytes.toBytes("mapping"), Bytes.toBytes("foo"), Bytes.toBytes("bar"));
                context.write(row, put);
            } catch (InterruptedException e) {
                // Surface the interruption through the declared exception type.
                throw new IOException(e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration config = HBaseConfiguration.create();
        int res = ToolRunner.run(config, new HBaseInsertTest(), args);
        System.exit(res);
    }
}

From HBase shell:

hbase(main):008:0> get 'tablecopy', '1260018', 'mapping'
COLUMN                          CELL                                                                                    
0 row(s) in 0.0100 seconds

I've simplified the program a lot to try to isolate the problem. I'm also relatively new to both Hadoop and HBase. I did verify that mapping is a column family that exists in the tablecopy table.

Maybe there is no output? Try printing out row and put before context.write. - Hari Menon
There is output. Switching to string keys fixes the problem. - kfox

I think the problem is the query. You ran:

hbase(main):008:0> get 'tablecopy', '1260018', 'mapping'

Instead, you should have queried for this:

hbase(main):008:0> get 'tablecopy', 1260018, 'mapping'

Because of the quotation marks, HBase treated the key as a string, whereas the job wrote the row key as the bytes of a long (Bytes.toBytes(id)). Also, if you had run a simple client program to retrieve this key from HBase, it would have returned the values correctly, provided the row was already present.
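To make the key mismatch concrete, here is a minimal sketch in plain Java with no HBase dependency. The class and method names are made up for illustration. It assumes, as I understand it, that Bytes.toBytes(long) produces 8 big-endian bytes and that Bytes.toBytes(String) produces the UTF-8 bytes of the text, and uses ByteBuffer and String.getBytes as stand-ins for them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class RowKeyEncoding {

    // Stand-in (assumption) for Bytes.toBytes(long): 8 bytes, big-endian.
    public static byte[] longKey(long id) {
        return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
    }

    // Stand-in (assumption) for Bytes.toBytes(String): UTF-8 bytes of the text.
    public static byte[] stringKey(String id) {
        return id.getBytes(StandardCharsets.UTF_8);
    }

    // Renders every byte of the key as a \xNN escape.
    public static String toEscaped(byte[] key) {
        StringBuilder sb = new StringBuilder();
        for (byte b : key) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Key as written by the mapper: 8 bytes.
        System.out.println(toEscaped(longKey(1260018L)));
        // prints \x00\x00\x00\x00\x00\x13\x39\xF2

        // Key as typed in quotes in the shell: 7 ASCII digits.
        System.out.println(toEscaped(stringKey("1260018")));
        // prints \x31\x32\x36\x30\x30\x31\x38
    }
}
```

The two byte arrays differ, so a get on the quoted string cannot match the row the job wrote. I have not verified how the shell encodes an unquoted number; a double-quoted key with \x escapes, such as get 'tablecopy', "\x00\x00\x00\x00\x00\x13\x39\xF2", 'mapping', is the form I would expect to address the binary key.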


Your problem lies in the lack of a reducer. You need to create a class extending TableReducer that takes Puts as input and uses context.write(ImmutableBytesWritable key, Put put) to write each Put to the target table.

I'm imagining it looking something like this:

public static class MyReducer extends TableReducer<ImmutableBytesWritable, Put, ImmutableBytesWritable> {

  @Override
  public void reduce(ImmutableBytesWritable key, Iterable<Put> values, Context context)
      throws IOException, InterruptedException {
    // Forward every Put unchanged to the job's output.
    for (Put record : values) {
      context.write(key, record);
    }
  }
}

Then, you modify your table reducer initializer to be: TableMapReduceUtil.initTableReducerJob("tablecopy", MyReducer.class, job);

That should do it. Another option would be to continue having no reducer and open an HTable object in the mapper and write the put through it directly like this:

HTable table = new HTable(context.getConfiguration(), "output_table_name");
Put myPut = ...;
table.put(myPut);
table.close();

Hope this helps!