
I'm testing out the read_rows operation in the ruby sdk for bigtable.

https://github.com/googleapis/google-cloud-ruby/blob/master/google-cloud-bigtable/lib/google/cloud/bigtable/read_operations.rb#L174

I noticed that if I pass in a timestamp filter, it returns only the cells (columns) that were updated or created within that time range. Instead, I would like the filter to return a row's entire contents if any part of that row was updated within the specified range. Is that possible, and if so, how could I achieve it?

For example: I have a row with 'token' updated at time 3000, 'id' updated at time 3000, 'name' updated at time 3000, and 'token' updated again at time 7000.

I would like to query with a timestamp filter of 6000 to 8000 and have all cells returned, but only the most recent version of each: 'token' at time 7000, 'id' at time 3000, and 'name' at time 3000.


1 Answer


To do something like this you'll need to use a Condition filter, which comes with some gotchas: these filters tend to be slow (they involve backtracking within the row on the backend), and the predicate is not evaluated atomically with the output, so a concurrent write can land between the check and the read.
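With that caveat, a minimal sketch of the Condition approach looks like the following. The instance and table names are placeholders, and the 6000..8000 range is taken straight from the question for illustration; real Bigtable timestamps are microseconds since the epoch (millisecond granularity by default), so actual values will be much larger. The predicate is the timestamp-range filter, and on a match the whole row is emitted with only the latest version of each column:

```ruby
require "google/cloud/bigtable"

# Hypothetical instance/table names; requires credentials and a live instance.
bigtable = Google::Cloud::Bigtable.new
table = bigtable.table("my-instance", "my-table")

# Predicate: does this row have any cell written in [6000, 8000)?
# (from is inclusive, to is exclusive.)
predicate = Google::Cloud::Bigtable::RowFilter.timestamp_range(from: 6000, to: 8000)

# If the predicate matches, emit the entire row, keeping only the most
# recent version of each column.
filter = Google::Cloud::Bigtable::RowFilter
         .condition(predicate)
         .on_match(Google::Cloud::Bigtable::RowFilter.cells_per_column(1))

table.read_rows(filter: filter).each do |row|
  # For the example data this yields 'token'@7000, 'id'@3000, 'name'@3000.
  puts row.key
end
```

Note that rows whose predicate does not match are dropped entirely here because no `otherwise` filter is set on the condition.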

Depending on your requirements, it may be better to do this in two phases. First, do a scan using the timestamp filter to see which rows match. Then, as the row keys are streamed back, read the matching rows in full and confirm client-side that the most recent update still satisfies the time range.
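The confirmation step of that second phase can be sketched in plain Ruby. The phase-1 filter in the comment and the `updated_in_range?` helper are hypothetical, and the cell hashes below stand in for the gem's `Cell` objects (which expose `qualifier`, `timestamp`, and `value`):

```ruby
# Phase 1 (against the service, not shown): scan with something like
#   RowFilter.chain.timestamp_range(from: 6000, to: 8000)
#            .cells_per_row(1).strip_value
# and collect the matching row keys. Phase 2: re-read those keys in full
# with cells_per_column(1), then confirm client-side that each row still
# has a cell in the range before using it.
def updated_in_range?(cells_by_family, from_us, to_us)
  # from_us inclusive, to_us exclusive, matching timestamp_range semantics.
  cells_by_family.values.flatten.any? do |cell|
    cell[:timestamp] >= from_us && cell[:timestamp] < to_us
  end
end

# The row from the question, with times treated as microseconds:
row_cells = {
  "cf" => [
    { qualifier: "token", timestamp: 7000 },
    { qualifier: "id",    timestamp: 3000 },
    { qualifier: "name",  timestamp: 3000 }
  ]
}

updated_in_range?(row_cells, 6000, 8000) # => true ('token' at 7000 matches)
```

Rows that fail the re-check were modified between the two phases and can simply be skipped, which gives you a consistent result at the cost of a second read.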