
I have written a MapReduce program that needs to read data from a particular column family of an HBase table.

For example, the data in HBase table looks like:

Row    Column+Cell
1      column=Name:FName, timestamp=..., value=ABC
1      column=Name:LName, timestamp=..., value=XYZ

Now I need to append FName and LName and store the result in another column, FullName, under the same column family. In the map phase I extract the two values, append them, and send the result to the reducer.

In the reducer I just take the key/value pair and try to add the FullName to the table.

My reducer implementation looks like this:

public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
  Put put = new Put(Bytes.toBytes(key.toString()));
  put.add(Bytes.toBytes("Name"), Bytes.toBytes("FullName"), Bytes.toBytes(values.toString()));
  context.write(null, put);
}

When I check the FullName column in the HBase table, the value is not "ABCXYZ"; instead I get org.apache.hadoop.mapreduce.task.ReduceContextImpl$ValueIterable.

Kindly let me know how to resolve this issue.


1 Answer


The values argument of the reduce function is an Iterable, not a single value, because reduce is normally used to combine multiple values that share the same key. In your program there is only one value per key, and you can get it with values.iterator().next(). Without that call, you invoke toString() on the Iterable object itself, which just prints its class name.
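The difference can be seen with a plain-Java sketch (no HBase or Hadoop needed; the anonymous Iterable here stands in for the reducer's values argument, which likewise does not override toString()):

```java
import java.util.Iterator;
import java.util.List;

public class IterableDemo {
    public static void main(String[] args) {
        // Stand-in for the reducer's "values" argument.
        Iterable<String> values = new Iterable<String>() {
            public Iterator<String> iterator() {
                return List.of("ABCXYZ").iterator();
            }
        };

        // Bug: toString() on the Iterable itself prints something like
        // "IterableDemo$1@1b6d3586", not the contained value.
        System.out.println(values.toString());

        // Fix: pull the single element out of the iterator.
        System.out.println(values.iterator().next()); // prints "ABCXYZ"
    }
}
```

In your reducer that means building the Put from values.iterator().next().toString() rather than values.toString().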

By the way, since you don't actually need to combine multiple values per key, you could configure Hadoop to run the job entirely without reducers (a map-only job) and write the Put directly from the mapper.
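A map-only driver might look like the following sketch. It assumes the standard HBase TableMapReduceUtil helpers; the table name "mytable" and the mapper class FullNameMapper are placeholders for your own names, and FullNameMapper is assumed to build and emit the Put itself:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;

public class FullNameJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "fullname-map-only");
        job.setJarByClass(FullNameJob.class);

        Scan scan = new Scan();
        scan.addFamily(Bytes.toBytes("Name")); // read only the Name column family

        // The mapper reads rows from the table and emits the Put itself.
        // FullNameMapper is a placeholder for your mapper class.
        TableMapReduceUtil.initTableMapperJob("mytable", scan,
                FullNameMapper.class, ImmutableBytesWritable.class, Put.class, job);

        // Passing null as the reducer class still wires up TableOutputFormat
        // so the mapper's Puts are written back to the table.
        TableMapReduceUtil.initTableReducerJob("mytable", null, job);

        // Zero reducers: map output goes straight to the output format.
        job.setNumReduceTasks(0);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

This is a configuration sketch rather than a drop-in program; it will only compile alongside your actual mapper class and the HBase client libraries on the classpath.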