2
votes

We have created a table in Cloud Bigtable with two column families: one with a maximum of 30 versions and the other with a maximum of 1 version. However, when we query the table, we get multiple versions of the columns in the family whose maximum number of versions is set to 1.

Table create statement:

create 'myTable', {NAME => 'cf1', VERSIONS => '30'}, {NAME => 'cf2', VERSIONS => '1'}


Describe 'myTable':

{NAME => 'cf2', BLOOMFILTER => 'ROW', VERSIONS => '**1**', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '**30**', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

How does Bigtable garbage collection work? How frequently does it delete older versions? Or are we missing something when creating the table?

1 Answer

6
votes

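From the Bigtable docs: Deletion of values happens opportunistically in the background, so you might still be able to read the data for several days after it has expired.

So your table definition is fine. The extra versions in cf2 simply have not been garbage-collected yet. If you need to see only the latest version in the meantime, limit the number of versions in the read request itself rather than relying on garbage collection.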

Link to docs

Even more detailed explanation
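
To make the behaviour concrete, here is a small toy model in Python. It is only a sketch and not the Bigtable or HBase client API: the ToyTable class, its methods, and the row and column names are all made up for illustration. It shows why an unfiltered read can still return two versions from cf2 until the background cleanup has run, and how capping versions at read time gives the expected result straight away.

```python
from collections import defaultdict

# Toy model only: this is NOT the Bigtable or HBase client API.
# Each column keeps a list of (timestamp, value) cells, newest first.


class ToyTable:
    def __init__(self, max_versions):
        # max_versions maps column family -> configured version limit
        self.max_versions = max_versions
        # (row, family, qualifier) -> [(timestamp, value), ...]
        self.cells = defaultdict(list)

    def put(self, row, family, qualifier, ts, value):
        key = (row, family, qualifier)
        self.cells[key].append((ts, value))
        self.cells[key].sort(reverse=True)  # newest first
        # Nothing is deleted here: cleanup is deferred to garbage_collect().

    def garbage_collect(self):
        # Stands in for the background process, which runs on its own schedule.
        for (_row, family, _qualifier), versions in self.cells.items():
            del versions[self.max_versions[family]:]

    def read(self, row, family, qualifier, versions=None):
        # versions=None returns everything still stored; an integer caps the
        # result at read time, like a "latest N versions" read filter.
        stored = self.cells[(row, family, qualifier)]
        return list(stored) if versions is None else stored[:versions]


table = ToyTable({"cf1": 30, "cf2": 1})
table.put("row1", "cf2", "col", ts=1, value="old")
table.put("row1", "cf2", "col", ts=2, value="new")

# Before cleanup: both versions are still readable despite the limit of 1.
unfiltered = table.read("row1", "cf2", "col")

# Capping versions in the read itself returns only the newest cell.
filtered = table.read("row1", "cf2", "col", versions=1)

# Once the background cleanup has run, the old version is gone for good.
table.garbage_collect()
after_gc = table.read("row1", "cf2", "col")
```

With the HBase shell used in the question, the analogous read-time cap is the VERSIONS => 1 option on get or scan.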