We need to scan an HBase table for rows that have a certain value in a column whose qualifier matches a certain pattern.

We're setting up a filter like this:

new FilterList(MUST_PASS_ALL,
    new FamilyFilter(EQUAL, new BinaryComparator(bytes(someFamily))),
    new QualifierFilter(EQUAL, new RegexStringComparator(qualifierRegex)),
    new ValueFilter(EQUAL, new SubstringComparator(detailValue)))

When executed in a Scan, this matches exactly the columns and values we are looking for. However, the scanner returns results containing only the matching columns/values, and we need the full row with all of its columns.
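To make the behaviour concrete, here is a small plain-Java model of what we see. It does not use the HBase API at all: the class, the method names and the sample data are hypothetical, and the matching is simplified (regex `find()` on the qualifier, case-sensitive `contains()` on the value). `cellLevel` mimics what the scan currently returns, and `fullRows` shows the result we actually want, obtained here by re-reading every row whose key survived the first pass. Against a real table that second pass would presumably be a Get per matching row key, which is only one possible workaround and means extra round trips.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

class CellFilterModel {
    // Hypothetical stand-in for one column family: row key -> (qualifier -> value).
    static Map<String, Map<String, String>> sampleTable() {
        Map<String, Map<String, String>> table = new TreeMap<>();
        Map<String, String> row1 = new TreeMap<>();
        row1.put("detail_1", "foo-match");
        row1.put("name", "alice");
        Map<String, String> row2 = new TreeMap<>();
        row2.put("detail_1", "other");
        row2.put("name", "bob");
        table.put("row1", row1);
        table.put("row2", row2);
        return table;
    }

    // Models the current result: every cell is tested on its own,
    // so only the cells that pass both tests are returned.
    static Map<String, Map<String, String>> cellLevel(
            Map<String, Map<String, String>> table, Pattern qualifierRegex, String substring) {
        Map<String, Map<String, String>> out = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> row : table.entrySet()) {
            Map<String, String> kept = new TreeMap<>();
            for (Map.Entry<String, String> cell : row.getValue().entrySet()) {
                boolean qualifierOk = qualifierRegex.matcher(cell.getKey()).find();
                boolean valueOk = cell.getValue().contains(substring);
                if (qualifierOk && valueOk) {
                    kept.put(cell.getKey(), cell.getValue());
                }
            }
            if (!kept.isEmpty()) {
                out.put(row.getKey(), kept);
            }
        }
        return out;
    }

    // Models the desired result: use the first pass only to find row keys,
    // then read each matching row again with all of its columns.
    static Map<String, Map<String, String>> fullRows(
            Map<String, Map<String, String>> table, Pattern qualifierRegex, String substring) {
        Map<String, Map<String, String>> out = new TreeMap<>();
        for (String rowKey : cellLevel(table, qualifierRegex, substring).keySet()) {
            out.put(rowKey, table.get(rowKey));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> table = sampleTable();
        Pattern regex = Pattern.compile("^detail_");
        System.out.println("cell-level: " + cellLevel(table, regex, "match"));
        System.out.println("full rows:  " + fullRows(table, regex, "match"));
    }
}
```

In this model the first call yields only the `detail_1` cell of `row1`, while the second also carries the `name` column, which is the difference we are trying to get out of a single Scan.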

We've tried many combinations with SkipFilter (the only stock HBase filter that seems to affect a whole row based on another filter), but couldn't get the result we need.

Of course we could write a custom filter for our case, but we'd like to avoid pushing "deploy a jar to all region servers and restart the HBase cluster" instructions to the production ops team.

1
Any luck on this? It seems so obvious, but it does not seem to be possible in HBase: a compound filter (must have this OR must have that) that still returns columns NOT mentioned in the compound filter itself. Much like: select * from table where (a < 10 AND b = 10) OR c = 'a'. HBase will only ever return columns a, b and c in this case, regardless of whether you have included other columns with scan.addFamily(d). - DataHacker

1 Answer