Given a config file containing a newline-delimited list of folders in Google Cloud Storage (I cannot use the complete list of directories because it is too large), as follows:

gs://databucket/path/to/dir/441738
gs://databucket/path/to/dir/441739
gs://databucket/path/to/dir/441740

How can one use gsutil inside a bash script to recursively rsync the files, while deleting files present in the destination folder that don't exist in the bucket?

I have tried using the following in a bash script:

cat ${1} | gsutil -m rsync -r -d ${2}

after which I receive error code 126.

Here ${1} refers to the aforementioned config file and ${2} to the destination folder into which each folder listed in the config file is to be rsynced. The same approach works with gsutil cp, as in the following command, but rsync suits my needs more efficiently:

cat ${1} | gsutil -m cp -R -I ${2}

How might one accomplish this? Thanks

1 Answer

Unlike cp, gsutil rsync has no -I flag, so it cannot read its list of sources from stdin.

So you have to use a different method than the one you used with cp.

If you want to synchronize multiple folders in one go, write a batch script that has one rsync command per line, like below:

gsutil -m rsync -r -d gs://databucket/path/to/dir/441738 *destination_folder1*
gsutil -m rsync -r -d gs://databucket/path/to/dir/441739 *destination_folder2*
gsutil -m rsync -r -d gs://databucket/path/to/dir/441740 *destination_folder3*

Then run the script file you wrote.

This method is a bit bothersome, but it gives the same result you want.
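
If writing one line per folder by hand is too tedious, a loop over the config file can issue the same commands. The following is only a minimal sketch, not a tested solution for your setup. It assumes each source folder should be mirrored into its own subdirectory of the destination, named after the last path component (for example 441738). I made that choice because running rsync -d from several sources into one shared folder would delete the files copied by the previous source. The function name sync_folders and the variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run one "gsutil rsync" per line of a config file.
# Assumption: each source folder gets its own subdirectory under the
# destination, named after the last path component of the gs:// URL.

sync_folders() {
  local config_file="$1"   # file with one gs:// folder per line
  local dest_root="$2"     # local directory that receives the subdirectories
  local src name failed=0

  # "|| [ -n "$src" ]" also handles a last line without a trailing newline.
  while IFS= read -r src || [ -n "$src" ]; do
    [ -z "$src" ] && continue      # skip blank lines
    src="${src%/}"                 # drop a trailing slash, if any
    name="${src##*/}"              # last path component, e.g. 441738
    mkdir -p "$dest_root/$name"    # create the destination directory first
    # Redirect stdin so gsutil cannot consume the rest of the config file.
    gsutil -m rsync -r -d "$src" "$dest_root/$name" < /dev/null || failed=1
  done < "$config_file"

  return "$failed"
}

# Run only when both arguments are supplied on the command line.
if [ "$#" -eq 2 ]; then
  sync_folders "$1" "$2"
fi
```

Saved as, say, sync.sh (a hypothetical name), it would be called as "bash sync.sh config.txt /path/to/destination". Since -d deletes files, it may be worth adding the -n flag to the rsync command for a dry run first.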