0
votes

I currently have multiple datasets within a project and I would like to know if there's a way to know at one go what are the tables that are not updated in the last 90 days which means it falls under the long term storage bucket in bigquery. Is there some sql syntax for this?

Thank you

1
thanks, I'm actually looking for more elaborate results, since I have for example 50 datasets with 20 tables each, I would have to check each dataset by 1 by 1. If there will be any way to check all tables in all datasets at once to know their size and when it was last modified - Justine

1 Answers

1
votes

Not SQL but a possible solution is using the bqtool. Describing a table will return the number of Long Term Storage bytes (numLongTermBytes):

$ bq show --format=prettyjson dataset.table
...
  "kind": "bigquery#table", 
  "lastModifiedTime": "1534845362446", 
  "location": "US", 
  "numBytes": "56", 
  "numLongTermBytes": "56", 
  "numRows": "3", 
... 

Therefore, you can extend it to list all tables in each dataset and get the desired information from each. Based from this I did a quick example:

#!/bin/bash

project=PROJECT_ID
dataset=DATASET_NAME
max_results=100

# get list of tables
tables=$(bq ls --max_results $max_results "$project:$dataset" | awk '{print $1}' | tail -n +3)

# get LTS bytes for each table
for table in $tables
do
    printf '%-35s %-50s\n' "$table" "$(bq show --format prettyjson $project:$dataset.$table | grep numLongTermBytes)"
done

and you'll get an output similar to:

Dfp                                   "numLongTermBytes": "0",                        
SO_55506947                           "numLongTermBytes": "144",                      
SO_55506947_bis                       "numLongTermBytes": "144",                      
a                                     "numLongTermBytes": "7",                        
a1                                    "numLongTermBytes": "399",                      
aaa                                   "numLongTermBytes": "8",                        
adaptive                              "numLongTermBytes": "1085",                     
adaptive_view                         "numLongTermBytes": "0",                        
audience_segment_map_exp_test         "numLongTermBytes": "300",                      
b                                     "numLongTermBytes": "7", 
...