I want to query HBase to fetch the records for a provided list of row keys. Since mappers in MapReduce run in parallel, I want to use them for this.
The input list will contain roughly 100,000 row keys, and I have created a custom InputFormat for the mapper that gives each mapper a list of 1,000 row keys to query against the HBase table. Some of the queried row keys may not be present in the table; I want to return only the records that are present.
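To make the batching concrete, here is a minimal sketch of the kind of split described above, in plain Java with no Hadoop or HBase dependencies. The class name `RowKeyBatcher` and the method `partition` are hypothetical names made up for illustration; this is not the actual custom InputFormat, only the chunking logic it would need.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of HBase or Hadoop.
final class RowKeyBatcher {
    private RowKeyBatcher() {}

    // Splits rowKeys into consecutive batches of at most batchSize keys each.
    // The last batch may be smaller if the list size is not a multiple of batchSize.
    static List<List<String>> partition(List<String> rowKeys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < rowKeys.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rowKeys.size());
            // Copy the sublist so each batch is independent of the input list.
            batches.add(new ArrayList<>(rowKeys.subList(start, end)));
        }
        return batches;
    }
}
```

With about 100,000 keys and a batch size of 1,000, this yields about 100 batches, one per mapper. Each mapper would then look up only the keys in its own batch and keep the ones that exist.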
I have seen various examples, and in all of them an HBase table scan is performed to fetch a range of row keys, with the range specified by startingRowKey and endingRowKey. I want to query only for the provided list of row keys, not for a range.
How can I do this with MapReduce? Any help is welcome!