0
votes

Most documentation and articles on HBase suggest that a schema should have no more than 2 or 3 column families. How does the number of column families affect HBase performance? Why is having too many column families considered bad schema design?

When does it make sense to create multiple tables as opposed to multiple column families to store data?

I have read the explanation here, but don't fully understand it.


1 Answer

1
votes

Actually, the real question is: why would you need multiple column families? Column families are not meant to organize data according to business considerations; they exist to address technical constraints. HBase stores each column family in its own set of files, so a read that names only one family does not have to touch the data of the others. For instance, you might have one column family that stores all field values and another that stores binary objects (PDFs, images) that are accessed only occasionally. Whether you need one HBase table or several depends on your use cases, but if the rowkeys are the same, a single table should be sufficient.
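
To make the "fields vs. binary objects" example concrete, here is a rough toy model in plain Java. It is not the HBase client API: `ToyTable`, `scanFamily`, and the family names `d` (field values) and `b` (binary objects) are all made up for illustration. The only thing it tries to show is that when each family has its own store, a scan of the small family never reads the large blobs, even though both share the same rowkey.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only (hypothetical names); this is NOT the HBase client API. */
public class ColumnFamilySketch {

    /** One store per column family: family -> (rowkey -> (qualifier -> value)). */
    static final class ToyTable {
        private final Map<String, TreeMap<String, Map<String, byte[]>>> stores = new HashMap<>();
        private long bytesRead = 0;

        ToyTable(String... families) {
            for (String family : families) {
                stores.put(family, new TreeMap<>());
            }
        }

        private TreeMap<String, Map<String, byte[]>> store(String family) {
            TreeMap<String, Map<String, byte[]>> store = stores.get(family);
            if (store == null) {
                throw new IllegalArgumentException("unknown family: " + family);
            }
            return store;
        }

        /** Writes one cell into the store that belongs to the given family. */
        void put(String rowKey, String family, String qualifier, byte[] value) {
            store(family).computeIfAbsent(rowKey, k -> new HashMap<>()).put(qualifier, value);
        }

        /** Walks only the named family's store and returns the number of rows seen. */
        int scanFamily(String family) {
            int rows = 0;
            for (Map<String, byte[]> cells : store(family).values()) {
                rows++;
                for (byte[] value : cells.values()) {
                    bytesRead += value.length; // count what this scan had to read
                }
            }
            return rows;
        }

        long bytesRead() {
            return bytesRead;
        }
    }

    public static void main(String[] args) {
        ToyTable table = new ToyTable("d", "b");
        // Same rowkey, two families: a small field value and a large blob.
        table.put("doc-001", "d", "title", "Invoice".getBytes(StandardCharsets.UTF_8));
        table.put("doc-001", "b", "pdf", new byte[1_000_000]);

        int rows = table.scanFamily("d");
        System.out.println("rows scanned in 'd': " + rows);
        System.out.println("bytes read: " + table.bytesRead());
    }
}
```

In this sketch the scan of `d` only counts the bytes of the title, because the blob lives in a separate store. That is the kind of technical constraint (different access patterns on the same rowkey) that justifies a second column family, as opposed to splitting families along business categories.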