0
votes

I am trying to understand the HBase architecture, specifically how the logical data model maps to physical data storage. I am a little confused about HFile creation. If we have a column family with 2 columns, does HBase create 2 HFiles or just one?

Below is the diagram I was looking at; its example shows the logical-to-physical mapping for each cf:col. Please help me clear up this confusion.

https://mapr.com/blog/hbase-and-mapr-db-designed-distribution-scale-and-speed/assets/blogimages/Logical-vs-physical-storage.png

1 Answer

1
votes

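HFiles are created on a column-family basis, not per column. Within a region, each column family has its own store, so cf1:a and cf1:b are written to the same HFiles if they are in the same region, while cf2:a goes to different ones. A store can hold more than one HFile at a time, because every MemStore flush writes a new file until compaction merges them, but no HFile ever mixes column families.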

On the diagram, Address:street and Address:city are both part of the Address column family, so their data ends up in the same HFile.

The same is true for the MemStore: there is a separate MemStore instance for each column family in each region, so a RegionServer hosts one MemStore per column family for every region it serves.
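
To make the grouping concrete, here is a toy Python model of it. This is not HBase code and does not use any HBase API: the function name `flush_to_hfiles`, the region name `region-1`, and the cell values are all made up for illustration. It only shows that cells are bucketed by (region, column family) rather than by individual column.

```python
from collections import defaultdict


def flush_to_hfiles(cells, region="region-1"):
    """Toy model: bucket cells by (region, column family).

    cells: iterable of (row_key, "family:qualifier", value) tuples.
    Returns a dict mapping (region, family) to a sorted list of
    (row_key, qualifier, value) tuples, standing in for one flushed file.
    """
    stores = defaultdict(list)
    for row_key, column, value in cells:
        # The column family is the part before the first colon.
        family, qualifier = column.split(":", 1)
        stores[(region, family)].append((row_key, qualifier, value))
    # Sort each bucket by row key, then qualifier (simplified ordering).
    return {key: sorted(kvs) for key, kvs in stores.items()}


# Hypothetical sample data using the column names from the answer.
cells = [
    ("row1", "cf1:a", "v1"),
    ("row1", "cf1:b", "v2"),
    ("row1", "cf2:a", "v3"),
]

hfiles = flush_to_hfiles(cells)
for (region, family), kvs in sorted(hfiles.items()):
    print(region, family, kvs)
```

Running it gives two buckets: one for cf1 that holds both the a and b columns, and a separate one for cf2. Adding more columns to cf1 would not add buckets; adding a new column family or a second region would.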