How do I handle regex filters on binary row keys and column qualifiers in Bigtable?

Question

In my current Bigtable design, all my row keys, column qualifiers and values are binary values. I'm using the Go client, and simply casting []byte keys to string lets me write data (seemingly) without issue.

However, this poses some issues when using Bigtable APIs that involve regexes on keys/values, such as the filters bigtable.ColumnFilter, bigtable.ValueFilter and bigtable.RowKeyFilter in the Go client library.

I'm looking for recommendations or best practices for these questions:

How can I escape the binary values to safely use these filters without Bigtable accidentally interpreting a byte as a regex character? E.g. to plainly match a column qualifier byte-by-byte using bigtable.ColumnFilter.
By extension, what hoops do I need to jump through to use regexes safely over these binary values? E.g., I want to use bigtable.RowKeyFilter to match row keys that start and end with specific bytes (I'm aware that will have bad performance).

For context, this is a simplified version of my schema:

Row keys: [16 byte UUID][8 byte big-endian uint][8 byte big-endian uint]
Column qualifiers: [8 byte big-endian uint][8 byte big-endian uint]
Values: [8 byte big-endian uint]

Thank you!

Methkal Khalawi · Accepted Answer · 2020-01-24T12:46:45

To escape all regex metacharacters, use regexp.QuoteMeta.
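
As a minimal sketch of that advice: regexp.QuoteMeta scans its input byte by byte and only puts a backslash in front of ASCII metacharacters, so bytes outside the ASCII range pass through untouched. The helper name literalPattern and the sample values are made up for illustration; the resulting string is what you would pass to bigtable.ColumnFilter, bigtable.RowKeyFilter or bigtable.ValueFilter.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"regexp"
)

// literalPattern (hypothetical helper) returns an RE2 pattern intended to
// match raw byte for byte. Only regex metacharacters get a backslash;
// every other byte, including non-UTF-8 ones, is copied unchanged.
func literalPattern(raw []byte) string {
	return regexp.QuoteMeta(string(raw))
}

func main() {
	// Sample qualifier in the question's layout: two big-endian uint64s.
	// The values are chosen so the low bytes are '.' (0x2E) and '*' (0x2A),
	// which would be misread as regex operators if left unescaped.
	qualifier := make([]byte, 16)
	binary.BigEndian.PutUint64(qualifier[0:8], 0x2E)
	binary.BigEndian.PutUint64(qualifier[8:16], 0x2A)

	pattern := literalPattern(qualifier)
	fmt.Printf("raw:     %q\n", qualifier)
	fmt.Printf("pattern: %q\n", pattern)
}
```

This relies on the filter having to match the whole qualifier, which is my understanding of how Bigtable's regex filters behave; it is worth confirming against your own data before depending on it.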

The regexp package can also help you build regexes over binary values. However, keep in mind this caveat from its documentation:

All characters are UTF-8-encoded code points.
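
For the start-and-end case in the question, one possible sketch is to quote the two byte sequences separately and join them with a wildcard. This assumes Bigtable's RE2 dialect accepts `\C` (match any single byte), which the Bigtable filter documentation recommends over `.` for binary keys because `.` does not match a newline byte. The helper name prefixSuffixPattern and the sample bytes are hypothetical. Note that Go's own regexp package rejects both `\C` and patterns containing invalid UTF-8, so this string is meant to be evaluated server-side only, not compiled locally.

```go
package main

import (
	"fmt"
	"regexp"
)

// prefixSuffixPattern (hypothetical helper) builds an RE2 pattern for keys
// that start with prefix and end with suffix. Each part is escaped on its
// own so the wildcard in the middle stays an operator.
func prefixSuffixPattern(prefix, suffix []byte) string {
	// `\C*` is assumed to mean "any run of bytes" on the Bigtable side.
	return regexp.QuoteMeta(string(prefix)) + `\C*` + regexp.QuoteMeta(string(suffix))
}

func main() {
	// Sample bytes: 0x28 is '(' and 0x24 is '$', both regex metacharacters;
	// 0x0A is a newline, which `.` would fail to match.
	prefix := []byte{0xDE, 0x28}
	suffix := []byte{0x0A, 0x24}

	pattern := prefixSuffixPattern(prefix, suffix)
	fmt.Printf("pattern: %q\n", pattern)
}
```

The pattern string would then go to bigtable.RowKeyFilter. As the question already notes, a filter like this cannot narrow the scan by key range, so expect it to be slow on large tables.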
