I'm using an HBase table to store events, and I want to update a request event with its response event's output. Both events are stored in the same HBase table, in two different rows.

Here's the dilemma I'm running into. I want to use a MapReduce job that takes in all response rows and updates the request rows with the status of the matching response row. The response and the request both have the same user ID, but the rows are indexed by a correlation ID. The format of the rowkey is (event_corrID_userID). The correlation ID may have changed between the request and the response, but the userID will always be the same.
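To make the rowkey layout concrete, here is a minimal sketch in plain Java (no HBase dependency) of how a key of the form event_corrID_userID could be split, and how an exclusive stop row for a prefix scan could be derived. The class and method names (`RowKeySketch`, `eventOf`, `userIdOf`, `stopRowFor`) and the sample key are hypothetical, and the sketch assumes that neither the event name nor the userID contains an underscore.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: illustrates the rowkey layout only.
final class RowKeySketch {

    // Event type: assumed to be everything before the first '_'.
    static String eventOf(String rowKey) {
        int idx = rowKey.indexOf('_');
        if (idx <= 0) {
            throw new IllegalArgumentException("no event in row key: " + rowKey);
        }
        return rowKey.substring(0, idx);
    }

    // userID: assumed to be everything after the last '_'.
    static String userIdOf(String rowKey) {
        int idx = rowKey.lastIndexOf('_');
        if (idx < 0 || idx == rowKey.length() - 1) {
            throw new IllegalArgumentException("no userID in row key: " + rowKey);
        }
        return rowKey.substring(idx + 1);
    }

    // Exclusive stop row for a prefix scan: drop trailing 0xFF bytes,
    // then increment the last remaining byte. An empty result means
    // "no upper bound".
    static byte[] stopRowFor(byte[] prefix) {
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        String key = "response_c123_u42"; // made-up sample key
        System.out.println(eventOf(key) + " / " + userIdOf(key));
        byte[] stop = stopRowFor("response_".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(stop, StandardCharsets.UTF_8));
    }
}
```

Incrementing the last byte of the prefix gives a tighter upper bound than bumping an earlier character, since it excludes any keys that share only the leading letters of the prefix.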

That's my whole situation. How can I search other rows of the same table during a MapReduce job? Here's what I have so far:

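import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;

public class MapReducer {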
    public static void main(String[] args){
        Configuration config = HBaseConfiguration.create();
        try{
            String startRow = "response_";
            String endRow = "responsf_";
            Job job = new Job(config, "TestAuditingResponse");
            job.setJarByClass(MapReducer.class);
            Scan scan = new Scan(Bytes.toBytes(startRow), Bytes.toBytes(endRow));
            scan.setCaching(500);
            scan.setCacheBlocks(false);

            TableMapReduceUtil.initTableMapperJob(
                    "test",
                    scan,
                    mapper.class,
                    null,
                    null,
                    job);
            TableMapReduceUtil.initTableReducerJob(
                    "test",
                    null,
                    job);
            job.setNumReduceTasks(0);

            boolean b = job.waitForCompletion(true);
            if(!b){
                throw new IOException("ERROR WITH JOB");
            }
        } catch(IOException e){
            e.printStackTrace();
        } catch(ClassNotFoundException e){
            e.printStackTrace();
        } catch(InterruptedException e){
            e.printStackTrace();
        }
    }
    public static class mapper extends TableMapper<ImmutableBytesWritable, Put> {
        public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException, InterruptedException {
            //TODO find row to put new value into
        }
    }

}

Does anyone know how I can do this? Or is there a better/faster way to update a table based on other rows in the same table, in a distributed and easily runnable way?

1 Answer

It seems that you want to 'join' the table with itself. You could check this new feature.
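As a rough illustration of what such a self-join by userID amounts to, here is a minimal in-memory sketch in plain Java. It is not the feature referred to above and does not use the HBase API; the names `SelfJoinSketch`, `joinByUserId` and `userIdOf` are hypothetical, and it assumes the userID is the part of the rowkey after the last underscore. In a MapReduce job, the same grouping would typically be done by emitting the userID as the map output key and pairing rows in a reducer, which would mean not setting the number of reduce tasks to 0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory illustration of joining requests and responses by userID.
final class SelfJoinSketch {

    // userID: assumed to be everything after the last '_' in the rowkey.
    static String userIdOf(String rowKey) {
        return rowKey.substring(rowKey.lastIndexOf('_') + 1);
    }

    // Returns, for each request rowkey, the status of a response that has
    // the same userID. Requests without a matching response are omitted.
    // If several responses share a userID, which one wins is unspecified here.
    static Map<String, String> joinByUserId(List<String> requestRowKeys,
                                            Map<String, String> responseStatusByRowKey) {
        // Index the request rowkeys by userID.
        Map<String, List<String>> requestsByUser = new HashMap<>();
        for (String requestKey : requestRowKeys) {
            requestsByUser
                    .computeIfAbsent(userIdOf(requestKey), k -> new ArrayList<>())
                    .add(requestKey);
        }

        // For each response, look up the requests of the same user.
        Map<String, String> updates = new HashMap<>();
        for (Map.Entry<String, String> response : responseStatusByRowKey.entrySet()) {
            List<String> matches = requestsByUser.get(userIdOf(response.getKey()));
            if (matches == null) {
                continue; // no request for this user
            }
            for (String requestKey : matches) {
                updates.put(requestKey, response.getValue());
            }
        }
        return updates;
    }
}
```

Each entry of the returned map would correspond to one Put against a request row in the real table.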