0
votes

I want to copy multiple .gz files from one GCS bucket to another. The file names have the prefix 'Logs_' and a date suffix like '20160909', so the full names are Logs_20160909.gz, Logs_20160908.gz, etc. I want to copy every file starting with Logs_ from one GCS bucket to another. To do this I am using the wildcard character *, as in Logs_*.gz, in the copy operation below:

Storage.Objects.Copy request =
            storageService
                .objects()
                .copy("source_bucket", "Logs_*.gz", "destination_bucket", ".", content);

Above I am passing "." as the destination object name because all the files have to be copied to destination_bucket, so I can't specify a single file name there. Unfortunately, this code doesn't work; it fails with an error saying the file doesn't exist. I am not sure what change is required here. A link to Java documentation or a piece of code would be helpful. Thanks!

1 Answer

1
votes

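While the gsutil command-line utility happily supports wildcards, the GCS APIs themselves are lower level and do not. The storage.objects.copy method must be given exactly one source object and exactly one destination object, so "Logs_*.gz" is treated as a literal object name, which is why the request fails with "file doesn't exist".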

I recommend one of the following:

  • Use a small script invoking gsutil, for example: gsutil cp "gs://source_bucket/Logs_*.gz" gs://destination_bucket/ , or
  • Make a storage.objects.list call to get the names of all matching source objects, then iterate over them, calling copy for each, or
  • If you're dealing with more than, say, 10 TB or so of gzip files, consider using Google's Cloud Storage Transfer Service to copy the files.
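
The list-then-copy option above can be sketched roughly as follows. This is only an illustration of the loop, not working client code: ObjectStore, InMemoryStore, and copyMatching are hypothetical names made up for this example so that it runs without credentials or the Google client library. With the Java client from the question, listNames would presumably be backed by storageService.objects().list(bucket) with setPrefix("Logs_") (following the next-page token until it is exhausted), and copy by the same objects().copy(...) call you already have, passing the real object name as both source and destination.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WildcardCopy {

    // Hypothetical abstraction over the two operations this approach needs.
    // It is NOT part of any Google library; a real implementation would wrap
    // the objects().list(...) and objects().copy(...) requests.
    interface ObjectStore {
        List<String> listNames(String bucket, String prefix);
        void copy(String srcBucket, String srcName, String dstBucket, String dstName);
    }

    // Emulates "Logs_*.gz": list by prefix, filter by suffix on the client,
    // then issue one copy per matching object, keeping the same object name.
    static List<String> copyMatching(ObjectStore store, String srcBucket,
                                     String dstBucket, String prefix, String suffix) {
        List<String> copied = new ArrayList<>();
        for (String name : store.listNames(srcBucket, prefix)) {
            if (name.endsWith(suffix)) {
                store.copy(srcBucket, name, dstBucket, name);
                copied.add(name);
            }
        }
        return copied;
    }

    // In-memory stand-in for GCS so the sketch can run locally.
    static class InMemoryStore implements ObjectStore {
        private final Map<String, Map<String, String>> buckets = new TreeMap<>();

        void put(String bucket, String name, String data) {
            buckets.computeIfAbsent(bucket, b -> new TreeMap<>()).put(name, data);
        }

        boolean exists(String bucket, String name) {
            Map<String, String> objects = buckets.getOrDefault(bucket, Collections.emptyMap());
            return objects.containsKey(name);
        }

        @Override
        public List<String> listNames(String bucket, String prefix) {
            List<String> names = new ArrayList<>();
            Map<String, String> objects = buckets.getOrDefault(bucket, Collections.emptyMap());
            for (String name : objects.keySet()) {
                if (name.startsWith(prefix)) {
                    names.add(name);
                }
            }
            return names;
        }

        @Override
        public void copy(String srcBucket, String srcName, String dstBucket, String dstName) {
            // Like the real API, a copy needs one exact source object name.
            if (!exists(srcBucket, srcName)) {
                throw new IllegalArgumentException("No such object: " + srcName);
            }
            put(dstBucket, dstName, buckets.get(srcBucket).get(srcName));
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        store.put("source_bucket", "Logs_20160908.gz", "a");
        store.put("source_bucket", "Logs_20160909.gz", "b");
        store.put("source_bucket", "Other_20160909.gz", "c");

        List<String> copied =
            copyMatching(store, "source_bucket", "destination_bucket", "Logs_", ".gz");
        System.out.println(copied); // prints [Logs_20160908.gz, Logs_20160909.gz]
    }
}
```

Listing by prefix keeps the result set small, and the suffix check is done on the client because the prefix alone would also match names such as Logs_notes.txt.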