
I am currently learning HBase, and I am having trouble understanding it when I compare it with an RDBMS.

  1. How is HBase column-oriented, given that we insert data with a row ID and column families?

For instance, if I have two employee records, I insert the first under row1 with values for all the columns (cf:id, name, salary), and the second under row2 with values for the same columns.

Here, too, we insert data row by row, just as in an RDBMS. So why is HBase called column-oriented?

Your help is really appreciated.

Thanks, Venkata


1 Answer


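In an RDBMS you have a fixed schema, which means that every row has the same columns. In HBase that is not the case: only the column families are fixed when the table is defined, and each row can have a different set (and number) of columns within them. That is why it is considered column-oriented (more precisely, column-family-oriented) storage.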

For example, you could have a table like this:

row1key, cf1:c1, cf1:c2, cf1:c5, cf2:col1, cf2:col5
row2key, cf1:c2, cf1:c3, cf2:col1, cf2:col7, cf2:col8

As you can see, these two rows both hold values in two column families (cf1 and cf2), but each row has a different set of columns. A relational table cannot do this directly: you would have to anticipate every potential column and declare it in advance, and each row would then carry a NULL in every column for which it has no value.
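
To make the contrast concrete, here is a minimal Python sketch that models the two rows above as nested dictionaries and then pads them out to a fixed schema. It is only a toy in-memory model, not the HBase client API; the names `sparse_table` and `to_fixed_schema` and the cell values are made up for illustration.

```python
# Toy model (an assumption for illustration, not HBase itself):
# row key -> {"family:qualifier": value}. Only cells that exist are stored.
sparse_table = {
    "row1key": {"cf1:c1": "a", "cf1:c2": "b", "cf1:c5": "c",
                "cf2:col1": "d", "cf2:col5": "e"},
    "row2key": {"cf1:c2": "f", "cf1:c3": "g",
                "cf2:col1": "h", "cf2:col7": "i", "cf2:col8": "j"},
}


def to_fixed_schema(table):
    """Emulate a relational table: every row gets every column, padded with None."""
    all_columns = sorted({col for cells in table.values() for col in cells})
    rows = {key: [cells.get(col) for col in all_columns]
            for key, cells in table.items()}
    return all_columns, rows


columns, fixed_rows = to_fixed_schema(sparse_table)

# Count how many cells each representation has to account for.
stored_cells = sum(len(cells) for cells in sparse_table.values())
fixed_cells = len(columns) * len(fixed_rows)
null_cells = fixed_cells - stored_cells

print(f"sparse model stores {stored_cells} cells")
print(f"fixed schema needs {fixed_cells} cells, {null_cells} of them NULL")
```

With only two rows the padding is small, but it grows with every new column that appears in any row, which is the cost the fixed schema imposes.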

For your example you could have records like this:

employee1, id1, name1, salary1
employee2, id2, salary2
employee3, id3, name3
employee4, id4

And these would all be valid records.
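
The same idea can be sketched for the employee records. Again this is a plain-Python toy model rather than real HBase code; it assumes a single column family named `cf` with qualifiers `id`, `name`, and `salary`, and the helpers `get_cell` and `columns_of` are hypothetical.

```python
# Toy model of the four employee rows: row key -> {"cf:qualifier": value}.
# A missing column is simply absent; nothing is stored for it.
employees = {
    "employee1": {"cf:id": "id1", "cf:name": "name1", "cf:salary": "salary1"},
    "employee2": {"cf:id": "id2", "cf:salary": "salary2"},
    "employee3": {"cf:id": "id3", "cf:name": "name3"},
    "employee4": {"cf:id": "id4"},
}


def get_cell(table, row_key, column):
    """Return the cell value, or None when the row has no such cell."""
    return table.get(row_key, {}).get(column)


def columns_of(table, row_key):
    """Return the sorted list of columns that actually exist for a row."""
    return sorted(table.get(row_key, {}))


for key in employees:
    print(key, columns_of(employees, key))
```

Looking up `get_cell(employees, "employee2", "cf:name")` yields `None` because that cell was never written, not because a NULL was stored for it.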