0
votes

When I execute this command ...

$ cat sending.csv | gsutil -m cp -I gs://my-bucket/

I get output like this ...

Copying file://000000000077.jpg [Content-Type=image/jpeg]...                    
CommandException: No URLs matched:                                              
Copying file://000000000086.jpg [Content-Type=image/jpeg]...

...
Copying file://000000002536.jpg [Content-Type=image/jpeg]... ETA 00:00:00       
| [261/261 files][ 41.1 MiB/ 41.1 MiB] 100% Done   3.7 MiB/s ETA 00:00:00       
Operation completed over 261 objects/41.1 MiB.                                   
CommandException: 1 file/object could not be transferred.

I need to know which file failed to transfer, but I don't see an easy way of getting this information.

The files immediately before and after the error message were both transferred successfully:

$ gsutil ls gs://my-bucket/000000000077.jpg
gs://my-bucket/000000000077.jpg
$ gsutil ls gs://my-bucket/000000000086.jpg
gs://my-bucket/000000000086.jpg

and there was no file between them in the sending.csv file.

$ cat sending.csv | nl | head
...
     5  000000000077.jpg
     6  000000000086.jpg

I tried passing the -D option to gsutil, but there's too much output to quickly find individual files that failed to copy.

I did the following to compare the list of files to be sent with the list actually stored in the bucket:

gsutil ls gs://my-bucket/*.jpg | sort | sed 's!.*/!!' > sent.csv
diff sending.csv sent.csv

but no differences were found.
I'd like to know which file gsutil thinks it failed to transfer.

2 Answers

1
votes

You can use gsutil cp -L cp.log ..., which writes a manifest log recording the result of each transfer. The log format is described in the gsutil cp documentation.
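As a rough sketch of how such a log could be filtered, the helper below prints the source of every row whose Result column is not OK. It assumes the manifest is a CSV with a header row containing Source as the first column and a column named Result, and it splits on commas naively, so it would misread a source path that itself contains a comma. The function name failed_transfers is hypothetical. Whether a "No URLs matched" error is recorded in the manifest at all is not something this sketch can confirm.

```shell
#!/bin/sh
# failed_transfers LOGFILE
# Print the first column (Source) of every manifest row whose Result
# column is not "OK". The Result column is located by its header name,
# which is an assumption about the manifest layout.
failed_transfers() {
    awk -F',' '
        # Header row: remember which column is named "Result".
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "Result") col = i; next }
        # Data rows: report the source when the result is anything but OK.
        col && NF >= col && $col != "OK" { print $1 }
    ' "$1"
}

# Intended usage, after running:
#   cat sending.csv | gsutil -m cp -L cp.log -I gs://my-bucket/
#   failed_transfers cp.log
```

Looking the column up by name rather than by position keeps the filter working if the column order differs from what is assumed here.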

Alternatively, if you just want to re-run the operation to copy the files that didn't get successfully transferred, you might consider instead using the gsutil rsync command.

1
votes

gsutil cp returns 0 if the operation was successful and a non-zero value otherwise. In a shell script you can check this exit status with the $? special parameter.

The downside of this approach is that each file is uploaded by a separate gsutil invocation, so the files are uploaded sequentially and you lose the parallelism of gsutil -m.

Supposing that sending.csv contains one file name per line:

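#!/bin/sh
# Upload each file listed in sending.csv with its own gsutil call
# and report the outcome based on gsutil's exit status.
while IFS= read -r line
do
   echo "$line"
   gsutil cp "$line" gs://my-bucket/
   if [ $? -eq 0 ]
   then
       echo "$line successfully uploaded"
   else
       echo "Failed to upload: $line"
   fi
done < sending.csv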
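If you want the failed names in a file rather than scattered through the output, a variation along these lines may help. This is only a sketch: upload_one, upload_list and the UPLOAD_CMD variable are hypothetical names introduced here, and blank lines in the list are skipped rather than passed to gsutil.

```shell
#!/bin/sh
# upload_one FILE
# Hypothetical wrapper around the copy command from the question.
upload_one() {
    gsutil cp "$1" gs://my-bucket/
}

# upload_list LIST FAILED_OUT
# Run the upload command once per non-empty line of LIST and write the
# names whose upload returned a non-zero status to FAILED_OUT.
# UPLOAD_CMD (hypothetical) can name a different command or function.
upload_list() {
    list=$1
    failed=$2
    : > "$failed"                      # start with an empty failure list
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue     # skip blank lines
        # stdin is redirected so the command cannot consume the list
        if ! "${UPLOAD_CMD:-upload_one}" "$name" < /dev/null; then
            printf '%s\n' "$name" >> "$failed"
        fi
    done < "$list"
}

# Intended usage:
#   upload_list sending.csv failed.txt
#   cat failed.txt
```

The uploads are still sequential, as with the loop above. The resulting failed.txt could be fed back into gsutil -m cp -I for a parallel retry.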