
In my current Bigtable design, all my row keys, column qualifiers and values are binary. I'm using the Go client, and simply casting []byte keys to string lets me write data (seemingly) without issue.

However, this poses some issues when using Bigtable APIs that involve regexes on keys/values, such as the filters bigtable.ColumnFilter, bigtable.ValueFilter and bigtable.RowKeyFilter in the Go client library.

I'm looking for recommendations or best practices for these questions:

  • How can I escape the binary values to safely use these filters without Bigtable accidentally interpreting a byte as a regex character? E.g. to plainly match a column qualifier byte-by-byte using bigtable.ColumnFilter.
  • By extension, what hoops do I need to jump through to use regexes safely over these binary values? E.g., I want to use bigtable.RowKeyFilter to match row keys that start and end with specific bytes (I'm aware that this will perform badly).

For context, this is a simplified version of my schema:

  • Row keys: [16 byte UUID][8 byte big-endian uint][8 byte big-endian uint]
  • Column qualifiers: [8 byte big-endian uint][8 byte big-endian uint]
  • Values: [8 byte big-endian uint]

Thank you!

1 Answer

To escape all regex metacharacters, use regexp.QuoteMeta.
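Here is a minimal sketch of what that could look like for a binary column qualifier. The helper name literalPattern and the sample bytes are made up for illustration, and it assumes Bigtable compares the pattern against the raw bytes of the qualifier. regexp.QuoteMeta only backslash-escapes the ASCII metacharacters, so every other byte (including 0x00 and bytes above 0x7F) passes through unchanged.

```go
package main

import (
	"fmt"
	"regexp"
)

// literalPattern (hypothetical helper) returns a regex pattern intended to
// match raw byte-for-byte. Bytes that happen to be regex metacharacters,
// such as 0x2E ('.') or 0x2A ('*'), get a backslash in front of them.
func literalPattern(raw []byte) string {
	return regexp.QuoteMeta(string(raw))
}

func main() {
	// Made-up qualifier that contains '.', '*' and '[' as data bytes.
	qualifier := []byte{0x00, 0x2E, 0x2A, 0x5B, 0xFF, 0x01, 0x02, 0x03}

	pattern := literalPattern(qualifier)
	fmt.Printf("%q\n", pattern)

	// The pattern would then be handed to the client library, e.g.
	// bigtable.ColumnFilter(pattern), which takes the pattern as a string.
}
```

The same helper should work for bigtable.RowKeyFilter and bigtable.ValueFilter, since they also take a pattern string.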

The regexp package can also help you build regexes over binary values safely. However, keep in mind this note from its documentation:

All characters are UTF-8-encoded code points.
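For the "starts and ends with specific bytes" case, a rough sketch could look like the following. The helper name prefixSuffixPattern is made up, and two things are assumptions you should check against the Bigtable filter documentation: that the filter has to match the whole row key, and that the RE2 escape `\C` (any single byte) is what you want as the wildcard, since `.` does not match a newline byte (0x0A), which a binary key may contain. Go's own regexp package does not support `\C`, so this pattern is only meant to be sent to Bigtable, not compiled locally.

```go
package main

import (
	"fmt"
	"regexp"
)

// prefixSuffixPattern (hypothetical helper) builds a pattern for keys that
// start with prefix and end with suffix. Both parts are escaped with
// regexp.QuoteMeta so their bytes are matched literally; `\C*` in the middle
// stands for "any run of bytes" in RE2 syntax.
func prefixSuffixPattern(prefix, suffix []byte) string {
	return regexp.QuoteMeta(string(prefix)) + `\C*` + regexp.QuoteMeta(string(suffix))
}

func main() {
	// Made-up bytes: 0x2E is '.', 0x24 is '$', both regex metacharacters.
	prefix := []byte{0x2E, 0x01}
	suffix := []byte{0xFF, 0x24}

	pattern := prefixSuffixPattern(prefix, suffix)
	fmt.Printf("%q\n", pattern)

	// The pattern would then be passed to bigtable.RowKeyFilter(pattern).
}
```

If you only need a prefix, a row range or prefix read avoids the regex entirely and is cheaper than a filter that has to look at every row.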