Problem: I need to insert user IDs into HBase every hour of every day (e.g. 2201201711 represents the data for 22 Jan 2017, 11 AM). What should the table design be if I want to fetch all user IDs for a particular hour on a given date, or for a date and time range?
What I have done so far: I am keeping user IDs as row keys and creating a new column at run time in the same column family. File data:

user id | date time
1 | 2201201711
2 | 2201201711
3 | 2201201711
My HBase row keys would be 1, 2 and 3, and a new column named 2201201711 would be created.
I know I can go with a composite key built from the date, hour and user ID, but I want to understand what benefit it provides in terms of performance.
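To make the composite-key option concrete, here is a minimal sketch in plain Python (no HBase client involved) of how such a key might be built. It relies on the fact that HBase orders row keys lexicographically by bytes, so the DDMMYYYYHH value is rearranged to YYYYMMDDHH to keep rows in chronological order. The function names, the `|` separator and the 10-digit zero padding are my own assumptions for illustration, not part of any HBase API.

```python
def to_sortable_hour(stamp: str) -> str:
    """Rearrange a DDMMYYYYHH stamp (e.g. '2201201711') into YYYYMMDDHH."""
    day, month, year, hour = stamp[0:2], stamp[2:4], stamp[4:8], stamp[8:10]
    return year + month + day + hour


def composite_row_key(stamp: str, user_id: int) -> str:
    """Hypothetical row key layout: '<YYYYMMDDHH>|<zero-padded user id>'."""
    # Zero padding keeps user ids in numeric order under a byte-wise sort.
    return f"{to_sortable_hour(stamp)}|{user_id:010d}"


if __name__ == "__main__":
    # The three rows from the sample file, all in the same hour.
    for uid in (1, 2, 3):
        print(composite_row_key("2201201711", uid))
```

With the raw DDMMYYYYHH layout, 1 Feb 2017 (`0102201700`) would sort before 22 Jan 2017 (`2201201711`), so a date range could not be expressed as one contiguous span of keys. The rearranged layout avoids that.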
What is the performance difference between selecting a whole column (without any filter) and looking up rows using composite row keys?
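To illustrate what I am trying to compare, here is a toy simulation in plain Python, not real HBase. It assumes that selecting one column without a start/stop row has to visit every row in the table, while a composite key lets the read be limited to a contiguous key range. The data set (1000 users, each active in exactly one hour of 22 Jan 2017), the key layout and the function names are all hypothetical.

```python
from bisect import bisect_left


def full_scan_for_column(rows, qualifier):
    """Design A: row key = user id. Every row is visited to check the column."""
    examined, hits = 0, []
    for row_key, qualifiers in rows.items():
        examined += 1
        if qualifier in qualifiers:
            hits.append(row_key)
    return hits, examined


def range_scan(sorted_keys, start, stop):
    """Design B: row key = '<YYYYMMDDHH>|<user id>'. Only [start, stop) is read."""
    lo = bisect_left(sorted_keys, start)
    hi = bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi], hi - lo


# Hypothetical data: user u is active in hour (u % 24) on 22 Jan 2017.
users = range(1000)
design_a = {u: {f"2201201{7}{u % 24:02d}"} for u in users}  # qualifier DDMMYYYYHH
design_b = sorted(f"20170122{u % 24:02d}|{u:010d}" for u in users)

hits_a, examined_a = full_scan_for_column(design_a, "2201201711")
# '}' is the byte right after '|', so this stop key closes the 11 AM prefix.
hits_b, examined_b = range_scan(design_b, "2017012211|", "2017012211}")

print(len(hits_a), examined_a)
print(len(hits_b), examined_b)
```

Both designs return the same users for 11 AM, but in this sketch the first examines every row while the second touches only the rows in the requested hour. A multi-hour range would be the same call with a wider start and stop key. Whether real HBase behaves proportionally is exactly what I would like confirmed.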