Because HBase tables are sparse, HBase stores with every cell not only the value but also all the information required to identify that cell (often called the Key, not to be confused with the RowKey). The Key looks as follows:
RowKey-ColumnFamily-ColumnQualifier-Timestamp
All of this information is stored for every entry. That is why short names are recommended for Column Families and Column Qualifiers: every extra byte in a name is repeated in every cell.
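To make the overhead concrete, here is a rough Python sketch of how I understand a single cell to be serialized. It assumes the classic KeyValue layout described in the HBase reference guide (4-byte key length, 4-byte value length, 2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte key type). The function name `encode_cell`, the sample row key, and the sample names are made up for illustration, and block encodings or compression may change what actually ends up on disk.

```python
import struct

# Assumption: 4 is the type code for a Put in the classic KeyValue format.
PUT = 4

def encode_cell(row: bytes, family: bytes, qualifier: bytes,
                timestamp: int, value: bytes, key_type: int = PUT) -> bytes:
    """Serialize one cell roughly the way a KeyValue is laid out (illustrative only)."""
    key = (
        struct.pack(">H", len(row)) + row          # 2-byte row length + row key
        + struct.pack(">B", len(family)) + family  # 1-byte family length + family name
        + qualifier                                # qualifier fills the rest of the key
        + struct.pack(">q", timestamp)             # 8-byte timestamp
        + struct.pack(">B", key_type)              # 1-byte key type
    )
    # 4-byte key length and 4-byte value length precede key and value.
    return struct.pack(">ii", len(key), len(value)) + key + value

# Same cell, once with a one-letter family name and once with a longer one.
short = encode_cell(b"user#0001", b"d", b"n", 1700000000000, b"42")
long_ = encode_cell(b"user#0001", b"profile", b"n", 1700000000000, b"42")

print(len(short), len(long_), len(long_) - len(short))
```

Under these assumptions the two-byte value is dwarfed by the key, and the longer family name costs its extra bytes again in every single cell, which is exactly the overhead I am asking about.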
My question: why does the ColumnFamily need to be stored for every entry? As I understand it, every Store File belongs to exactly one Column Family. Wouldn't it be enough to store the Column Family name once per Store File? This would reduce overhead, allow arbitrary Column Family names, and still let us identify the Column Family for every entry. What am I missing here?