6
votes

How does HBase performs a lookup and retrieves a record ? e.g., what is the equivalent in HBase for the RDBMS's B-Trees?

[EDIT]

I understand how HBase resolves the -ROOT-, and .META. tables to find out which region holds the data. But how is the local lookup performed?

To better illustrated, here's an example:

  1. I am starting a search (get or scan) for record with key 77.
  2. HBase client figures that the key is contains in the 50-100 region, held by RegionServer X.
  3. HBase client contacts RegionServer X to get the data.

How does RegionServer X finds out the location of record 77 ?

Does the RegionServer uses some kind of lookup table (e.g. like the RDBMS's B-Trees ?) for the keys of a region ? Or does it need to read all contents of the StoreFiles, for records from 50 to 77 ?

1

1 Answers

6
votes

TL;DR: it looks like HBase (like BigTable), uses as a structure similar to a B+ tree to do the lookup. So the row-key is the primary index (and the only index of any sort in HBase by default.)

Long answer: From this Cloudera blog post about HBase write path, it looks like HBase operates the following way:

Each HBase table is hosted and managed by sets of servers which fall into three categories:

  • One active master server
  • One or more backup master servers
  • Many region servers

Region servers contribute to handling the HBase tables. Because HBase tables can be large, they are broken up into partitions called regions. Each region server handles one or more of these regions.

There's some more detail in the next paragraph:

Since the row key is sorted, it is easy to determine which region server manages which key. ... Each row key belongs to a specific region which is served by a region server. So based on the put or delete’s key, an HBase client can locate a proper region server. At first, it locates the address of the region server hosting the -ROOT- region from the ZooKeeper quorum. From the root region server, the client finds out the location of the region server hosting the -META- region. From the meta region server, then we finally locate the actual region server which serves the requested region. This is a three-step process, so the region location is cached to avoid this expensive series of operations.

From another Cloudera blog post, it looks like the exact format used to save the HBase on the file system keeps changing, but the above mechanism for row key lookups should be more or less consistent.

This mechanism is very, very similar to Google BigTable's lookup (you will find the details in Section 5.1 starting at the end of page 4 on the PDF), which uses a three-level heirarchy to query the row-key location: Chubby -> Root tablet -> METADATA tablets -> actual tablet

UPDATE: to answer the question about lookups inside a Region Server itself: I don't know for sure, but since the row keys are sorted, and HBase knows the start and end keys, I suspect it uses a binary search or interpolation search, both of which are really fast - log(n) and log(log(n)) respectively. I don't think HBase would ever need to scan rows from the start row key to the one that it needs to find, since searches on sorted keys is well-known problem that has several efficient solutions.