24
votes

I want to sync a local directory to a bucket in Google Cloud Storage. I want to copy the local files that do not exist remotely, skipping files that already exist both remotely and locally. Is it possible to do this with gsutil? I can't seem to find a "sync" option or a "do not overwrite" option for gsutil. Is it possible to script this?

I am on Linux (Ubuntu 12.04).

3
gsutil help cp and the online docs are a bit lengthy, but they fully document gsutil cp -n, which can achieve what you want - Nino Filiu

3 Answers

31
votes

gsutil now supports the no-clobber flag (-n) on the cp command. Update gsutil to the latest version (using gsutil update) and then pass the -n flag when performing a copy.

This flag will skip files that already exist at the destination.
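For example, a minimal sketch (the bucket name and local directory below are placeholders, not from the question):

```shell
# Recursively copy local_dir to the bucket, skipping objects that already
# exist at the destination (-n = no-clobber); -m parallelizes the transfer.
gsutil -m cp -n -r local_dir gs://your-bucket-name
```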

15
votes

You need to add the -n flag to the command, as documented officially by Google Cloud Platform:

-n: No-clobber. When specified, existing files or objects at the destination will not be overwritten. Any items that are skipped by this option will be reported as being skipped. This option will perform an additional GET request to check if an item exists before attempting to upload the data. This will save retransmitting data, but the additional HTTP requests may make small object transfers slower and more expensive.

Example (using parallel transfers with -m):

gsutil -m cp -n -a public-read -R large_folder gs://bucket_name

10
votes

Using gsutil rsync, you can copy missing or modified files/objects:

gsutil -m rsync -r <local_folderpath> gs://<bucket_id>/<cloud_folderpath>

In addition, if you use the -d option, files/objects in your bucket that are no longer present locally will also be deleted.
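Since -d deletes remote objects, it is worth noting that rsync's -n flag means "dry run" (unlike cp, where -n means no-clobber), so you can preview a destructive sync first. A sketch with placeholder paths:

```shell
# Preview what would change without modifying anything (-n = dry run for
# rsync), including which remote objects -d would delete.
gsutil -m rsync -n -d -r local_folder gs://your-bucket-name/backup

# Once the preview looks right, run the real sync by dropping -n.
gsutil -m rsync -d -r local_folder gs://your-bucket-name/backup
```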

Another option is to enable Object Versioning on the bucket: your local data will still replace the files/objects in your bucket, but you can always go back to a previous version.