I have an HBase table whose row key has the form <<timestamp>>_<<user_id>>,
where the timestamp is formatted as yyyyMMddHHmm. I need to query user details within a given time range.
For example: "201602021310_user1"
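// Requires org.apache.hadoop.hbase.client.* and org.apache.hadoop.hbase.util.Bytes
HTable table = new HTable(conf, tableName);
Scan s = new Scan();
// The start row is inclusive, the stop row is exclusive
s.setStartRow(Bytes.toBytes("20160202"));
s.setStopRow(Bytes.toBytes("20160303"));
List<Result> rs = new ArrayList<Result>();
ResultScanner ss = table.getScanner(s);
try {
    for (Result r : ss) {
        rs.add(r);
    }
} finally {
    ss.close(); // release the scanner's server-side resources
}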
As I understand it, this scan works because HBase stores rows in lexicographic order of the row key. However, since the key starts with a timestamp, new writes all land in the same region, which causes region server hot spotting. To avoid hot spotting, I am considering the following (comments welcome):
- Use a hashed prefix in the row key. If I do that, I suspect my range scan will no longer work the way I want.
- Then use a filter such as the fuzzy row filter. But I couldn't find a way to do range querying with it.
As far as I understand, the most I can do with a fuzzy filter is to apply one pattern per month or per day and merge the results:
201602??_??????
+20160301_??????
+20160302_??????
+20160303_??????
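To make the "?" notation concrete, here is a minimal sketch of how such a pattern could be turned into the key/mask byte pair that, as far as I know, HBase's FuzzyRowFilter expects (mask byte 0 = position is fixed, 1 = position may be anything). The class FuzzyPattern and its matchesPrefixOf check are hypothetical helpers for illustration only and use no HBase classes. Since the matching is position-based, the pattern has to line up with the real key layout.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: converts a "?" pattern into a key/mask pair.
class FuzzyPattern {
    final byte[] key;   // fixed bytes; wildcard positions are set to 0
    final byte[] mask;  // 0 = byte must match, 1 = byte may be anything

    FuzzyPattern(String pattern) {
        byte[] raw = pattern.getBytes(StandardCharsets.US_ASCII);
        key = new byte[raw.length];
        mask = new byte[raw.length];
        for (int i = 0; i < raw.length; i++) {
            boolean wildcard = raw[i] == '?';
            key[i] = wildcard ? (byte) 0 : raw[i];
            mask[i] = (byte) (wildcard ? 1 : 0);
        }
    }

    // Local illustration of the intended semantics, not HBase's implementation:
    // every fixed position of the pattern must equal the row key byte.
    boolean matchesPrefixOf(String rowKey) {
        byte[] row = rowKey.getBytes(StandardCharsets.US_ASCII);
        if (row.length < key.length) {
            return false;
        }
        for (int i = 0; i < key.length; i++) {
            if (mask[i] == 0 && row[i] != key[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        FuzzyPattern february = new FuzzyPattern("201602??_??????");
        System.out.println(february.matchesPrefixOf("20160215_user01"));
        System.out.println(february.matchesPrefixOf("20160315_user01"));
    }
}
```

The key and mask arrays would then be wrapped in a Pair and passed to the filter, one pair per pattern in the list above.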
What is the best approach to achieve this, that is, to eliminate hot spotting while still supporting range queries?
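For reference, the hashed-prefix idea from the first bullet could look roughly like the sketch below. It assumes a small fixed number of salt buckets (BUCKETS = 4 is an arbitrary, hypothetical value) and derives the bucket from the user id; the class SaltedKeys and its methods are illustrative names, not HBase API. It shows why a single range scan is no longer enough: the time range has to be repeated once per bucket and the results merged.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a salted row key layout: <bucket>_<timestamp>_<user_id>
class SaltedKeys {
    // Assumed bucket count; a real value would depend on the cluster.
    static final int BUCKETS = 4;

    // Two-digit bucket prefix followed by the separator, e.g. "03_".
    static String prefix(int bucket) {
        return (bucket < 10 ? "0" : "") + bucket + "_";
    }

    // The bucket is derived from the user id, so writes for one timestamp
    // are spread over all buckets instead of hitting a single region.
    static String saltedKey(String timestamp, String userId) {
        int bucket = Math.floorMod(userId.hashCode(), BUCKETS);
        return prefix(bucket) + timestamp + "_" + userId;
    }

    // One [start, stop) pair per bucket; each pair would back its own Scan.
    static List<String[]> scanRanges(String start, String stop) {
        List<String[]> ranges = new ArrayList<>();
        for (int b = 0; b < BUCKETS; b++) {
            ranges.add(new String[] { prefix(b) + start, prefix(b) + stop });
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(saltedKey("201602021310", "user1"));
        for (String[] r : scanRanges("20160202", "20160303")) {
            System.out.println(r[0] + " .. " + r[1]);
        }
    }
}
```

Each range keeps the lexicographic ordering inside its bucket, so the original start/stop logic still applies per bucket; the cost is BUCKETS scans instead of one.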