
I have a MapReduce job in which each mapper needs random access to another HBase table, many times over. How efficient is such a large number of random reads against HBase, given that the mappers run concurrently and so issue those reads concurrently?

Thanks a lot!


1 Answer


HBase is efficient at random access. However, depending on how large the table is and how many times you perform that I/O, you may want to consider alternatives. If the random-access table is small enough, load it into memory in each mapper (override setup to do that). If the random-access table is large, consider running an additional MapReduce job to prepare it for the main job, so that the main job goes over both tables, or a single unified table, rather than doing a lookup per record.
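
To make the in-memory option concrete, here is a rough sketch of the pattern in plain Java. It is not real Hadoop or HBase code: `LookupMapperSketch` and its `setup` and `map` methods are hypothetical stand-ins for a `Mapper` subclass and its `setup(Context)` and `map(...)` methods, and the `Map` passed to `setup` stands in for rows you would read from the small HBase table once per mapper. The tab-separated output format is also just an assumption for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a mapper that joins records against a small lookup table. */
class LookupMapperSketch {
    // In-memory copy of the small table, keyed by row key.
    private final Map<String, String> lookup = new HashMap<>();

    // Plays the role of Mapper.setup(): runs once per mapper, before any records.
    // In a real job the rows would come from a scan of the small HBase table.
    void setup(Map<String, String> smallTableRows) {
        lookup.putAll(smallTableRows);
    }

    // Plays the role of Mapper.map(): a local hash lookup replaces a per-record read.
    // Returns null when the key has no match in the lookup table.
    String map(String recordKey, String recordValue) {
        String joined = lookup.get(recordKey);
        if (joined == null) {
            return null;
        }
        return recordKey + "\t" + recordValue + "\t" + joined;
    }
}
```

The point of the design is that the cost of reading the small table is paid once per mapper instead of once per input record. It only works if the table fits comfortably in each mapper's heap.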