15
votes

Google BigQuery - bq command enable you to create, load, query and alter table.

I did not find any documentation regarding dropping table, will be happy to know how to do it.

I found the bq tool much easier to implement instead of writing python interface for each command.

Thanks.

5

5 Answers

39
votes

found it :

bq rm -f -t data_set.table_name

-t for table, -f for force, -r remove all tables in the named dataset

great tool.

15
votes

Is there a way to bulk delete multiple tables? – activelearner

In bash, you can do something like:

for i in $(bq ls -n 9999 my_dataset | grep keyword | awk '{print $1}'); do bq rm -ft my_dataset.$i; done;

Explanation:

  • bq ls -n 9999 my_dataset - list up to 9999 tables in my dataset
  • | grep keyword - pipe the results of the previous command into grep, search for a keyword that your tables have in common
  • | awk '{print $1}' - pipe the results of the previous command into awk and print only the first column
  • Wrap all that into a for loop
  • do bq rm -ft my_dataset.$i; done; - remove each table from your dataset

I would highly recommend running the commands to list out the tables you want to delete before you add the 'do bq rm'. This way you can ensure you are only deleting the tables you actually want to delete.

UPDATE: The argument -ft now returns an error and should be simply -f to force the deletion, without a prompt:

for i in $(bq ls -n 9999 my_dataset | grep keyword | awk '{print $1}'); do bq rm -f my_dataset.$i; done;
3
votes

You can use Python code (on Jupyter Notebook) for the same purpose:

bigquery_client  = bigquery.Client() #Create a BigQuery service object
dataset_id='Name of your dataset'
table_id='Table to be deleted'
table_ref = bigquery_client.dataset(dataset_id).table(table_id)
bigquery_client.delete_table(table_ref)  # API request
print('Table {}:{} deleted.'.format(dataset_id, table_id))

if you want to delete complete dataset:

If dataset contains tables as well. And we want to delete dataset containing tables in one go the command is:

!bq rm -f -r serene-boulder-203404:Temp1   # It will remove complete data set along with the tables in it

If your dataset is empty then you can use the following command as well: To use the following command make sure that you have deleted all the tables in that dataset otherwise, it will generate an error (dataset is still in use).

#Now remove an empty dataset using bq command from Python
!bq rm -f dataset_id
print("dataset deleted successfully !!!")
0
votes

I used the command line for loop to delete a month of table data, but this is reliant on your table naming:

for %d in (01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31) DO bq rm -f -t dataset.tablename_201701%d

0
votes

Expanding on the excellent answer from @james, I simply needed to remove all tables in a dataset but not actually remove the dataset itself. Hence the grep part was unnecessary for me however I still needed to get rid of the

table_id
------------------

header that bq returns when listing tables, for that I used sed to remove those first two lines:

for i in $(bq ls -n 9999 my_dataset | sed "1,2 d" | awk '{print $1}'); do bq rm -f my_dataset.$i; done;

perhaps there's a bq option to not return that header but if there is, I don't know it.