0 votes

I have a very big GCS bucket (several TB), with several subdirectories, each holding a couple of terabytes of data.

I want to delete some of those folders.

I tried to use gsutil from a Cloud Shell, but it is taking ages.

For reference, here is the command I'm using:

gsutil -m rm -r "gs://BUCKET_NAME/FOLDER"
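One variation I considered (I'm not sure it would actually help, since the bottleneck may be server-side) is raising gsutil's parallelism with -o overrides; the process and thread counts below are guesses, not tuned values:

gsutil -m -o "GSUtil:parallel_process_count=8" -o "GSUtil:parallel_thread_count=16" rm -r "gs://BUCKET_NAME/FOLDER"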

I was looking at this question and thought maybe I could use that, but it seems like it can't filter by folder name, and I can't filter on anything else because the folders have mixed content.

So far, my last resort would be to wait until the folders I want to delete are "old" and set a lifecycle rule accordingly, but that could take too long.
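For reference, the lifecycle approach would look something like this (a sketch; the 30-day age is arbitrary), applied with gsutil lifecycle set lifecycle.json gs://BUCKET_NAME:

{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 30}
    }
  ]
}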

Are there any other ways to make this faster?


1 Answer

1 vote

It's just going to take a long time; you have to issue a DELETE request for each object with the prefix FOLDER/.
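To see what that means in practice, here is roughly the same work spelled out with gsutil itself (a sketch; BUCKET_NAME and FOLDER are the placeholders from your command): list every object under the prefix with the ** wildcard, then feed the URLs to rm via -I, which reads them from stdin.

gsutil ls "gs://BUCKET_NAME/FOLDER/**" | gsutil -m rm -I

Either way it is one DELETE request per object, so the runtime scales with the number of objects, not with their total size.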

GCS doesn't have the concept of "folders". Object names can share a common prefix, but they're all in a flat namespace. For example, if you have these three objects:

  • /a/b/c/1.txt
  • /a/b/c/2.txt
  • /a/b/c/3.txt

...then you don't actually have folders named a, b, or c. Once you deleted those three objects, the "folders" (i.e. the prefix that they shared) would no longer appear when you listed objects in your bucket.
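You can watch this happen with a small experiment (a sketch; my-bucket is a made-up bucket name):

# Before deleting, a listing shows the shared prefix as a "folder":
$ gsutil ls gs://my-bucket/
gs://my-bucket/a/

# Delete the three objects:
$ gsutil -m rm gs://my-bucket/a/b/c/*.txt

# Now the listing is empty; there was never a real folder to remove:
$ gsutil ls gs://my-bucket/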

See the docs for more details:

https://cloud.google.com/storage/docs/gsutil/addlhelp/HowSubdirectoriesWork