Querying Bigtable prefix from Bigquery

Question

Using BigQuery's Bigtable external data source, I'd like to find rowkeys in Bigtable that start with a certain value. This is a very fast operation in cbt or other Bigtable-native tools, but seems slow in BigQuery.

SELECT rowkey from blah.blah where rowkey LIKE '123%' group by rowkey

It seems to be processing a lot of data, perhaps even the whole Bigtable table, to find results, rather than using Bigtable's native prefix operation which is very fast.

Am I using the most efficient query in BigQuery for this?

Related question: Are you using #standardSQL? That might have fixed everything — Felipe Hoffa
It does fix everything! Processes far less data with far faster query result. Thank you. — user01380121
@FelipeHoffa I'm able to query rowkey with #standardSQL but can't figure out how to query cell values in Bigtable - is there a good example? SELECT default.column.cell.value works in legacy SQL but returns a Syntax error: Unexpected keyword DEFAULT in #standardSQL — user01380121

Felipe Hoffa Felipe Hoffa · Accepted Answer · 2018-07-02T21:45:35

Instead of:

WHERE rowkey LIKE '123%'

try:

WHERE rowkey > '123'

The connector between BigQuery and Bigtable could be smarter about this (internal discussion started), but in the meantime this should fix the issue.

Querying Bigtable prefix from Bigquery

1 Answers