40
votes

I am having trouble downloading multiple files from AWS S3 buckets to my local machine.

I have all the filenames that I want to download, and I do not want any others. How can I do that? Is there any kind of loop in aws-cli that I can use to iterate?

There are a couple of hundred files I need to download, so it does not seem possible to use a single command that takes all the filenames as arguments.

7
You can look at aws s3api get-object if you're able to filter/query the list of your files. If you have the list in a file, you can read the file line by line and pipe each line to aws s3 cp s3://yourbucket/ - Frederic Henri

7 Answers

24
votes

Here is a bash script that reads all the filenames from a file, filename.txt, and downloads them one by one:

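#!/bin/bash
set -e
# Read one key per line; -r keeps backslashes literal
while IFS= read -r line
do
  # Quote the key so names with spaces still work
  aws s3 cp "s3://bucket-name/$line" dest-path/
done <filename.txt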
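
Because set -e aborts the loop at the first failed copy, a variant that keeps going and records the keys that failed may be more convenient for a few hundred files. This is only a sketch: the function name download_keys, its arguments, and the failed.txt log file are assumptions made for illustration, not part of the answer above.

```shell
#!/bin/bash
# Sketch: download every key listed in a file, continuing past failures.
# download_keys, its arguments and failed.txt are hypothetical names.
download_keys() {
  local list="$1" bucket="$2" dest="$3"
  local key failures=0
  : > failed.txt                      # start with an empty failure log
  # "|| [ -n "$key" ]" also handles a last line without a trailing newline
  while IFS= read -r key || [ -n "$key" ]; do
    key="${key%$'\r'}"                # tolerate Windows line endings
    [ -z "$key" ] && continue         # skip blank lines
    # </dev/null keeps the command from reading the rest of the list
    if ! aws s3 cp "s3://$bucket/$key" "$dest/" < /dev/null; then
      echo "$key" >> failed.txt
      failures=$((failures + 1))
    fi
  done < "$list"
  return "$((failures > 0))"          # 0 if all copies succeeded, 1 otherwise
}
```

It would be called as download_keys filename.txt bucket-name dest-path, and failed.txt can then be fed back in as the list for a retry.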

60
votes

You can also use the --recursive option, as described in the documentation for the cp command. It copies all objects under a specified prefix recursively.

Example:

aws s3 cp s3://folder1/folder2/folder3 . --recursive

will grab all files under folder1/folder2/folder3 (here folder1 is the bucket name) and copy them to the current local directory.

25
votes

You might want to use "sync" instead of "cp". The following will download/sync only the files with the ".txt" extension to your local folder:

aws s3 sync --exclude="*" --include="*.txt" s3://mybucket/mysubbucket .

21
votes

According to the documentation, you can use include and exclude filters with s3 cp as well. So you can do something like this:

aws s3 cp s3://bucket/folder/ . --recursive --exclude="*" --include="2017-12-20*"

Make sure you get the order of the exclude and include filters right: filters that appear later in the command take precedence over earlier ones, so swapping them changes the whole meaning.
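
Since later filters take precedence over earlier ones, one way to fetch an explicit list of files in a single command is to exclude everything first and then add one --include per filename. This is a sketch under assumptions: build_include_args and the INCLUDE_ARGS array are hypothetical names, the filenames are assumed to be relative to the given prefix and free of glob characters such as * ? [ ], and a very long list could exceed the shell's argument-length limit.

```shell
#!/bin/bash
# Sketch: turn a list of filenames into --exclude/--include filters.
# build_include_args and the INCLUDE_ARGS array are hypothetical names.
build_include_args() {
  local list="$1" name
  INCLUDE_ARGS=(--exclude "*")        # exclude everything first ...
  while IFS= read -r name || [ -n "$name" ]; do
    [ -z "$name" ] && continue        # skip blank lines
    INCLUDE_ARGS+=(--include "$name") # ... then re-include each listed file
  done < "$list"
}

# Usage (bucket, prefix and list file are placeholders):
#   build_include_args filename.txt
#   aws s3 cp s3://bucket/folder/ . --recursive "${INCLUDE_ARGS[@]}"
```

Keeping the filters in an array preserves names that contain spaces. As far as I know, the CLI still has to list everything under the prefix before filtering, so on a very large prefix a per-file loop may be faster.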

4
votes

Tried all the above. Not much joy. Finally, adapted @rajan's reply into a one-liner. Note that this one copies the matching local files up to the bucket:

for file in whatever*.txt; do aws s3 cp "$file" s3://somewhere/in/my/bucket/; done

0
votes

I wanted to read S3 object keys from a text file and download them to my machine in parallel.

I used this command:

cat <filename>.txt | parallel aws s3 cp {} <output_dir>

The contents of my text file looked like this:

s3://bucket-name/file1.wav
s3://bucket-name/file2.wav
s3://bucket-name/file3.wav

Please make sure you don't have an empty line at the end of your text file. You can learn more about GNU parallel in its documentation.
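
If GNU parallel is not installed, xargs -P can give a similar effect. The following is a sketch, assuming an xargs that supports -0 and -P (the GNU and BSD versions do); the function name parallel_download and the default of 8 concurrent jobs are assumptions for illustration. Empty lines are filtered out first, so a trailing blank line in the list does no harm here.

```shell
#!/bin/bash
# Sketch: download s3:// URIs listed one per line, several at a time.
# parallel_download and the default of 8 jobs are hypothetical choices.
parallel_download() {
  local list="$1" dest="$2" jobs="${3:-8}"
  mkdir -p "$dest"
  # grep drops empty lines; tr puts a NUL after each URI so that spaces
  # and quotes in keys survive; xargs runs up to $jobs copies at once.
  grep -v '^[[:space:]]*$' "$list" | tr '\n' '\0' |
    xargs -0 -P "$jobs" -I {} aws s3 cp {} "$dest/"
}
```

With the text file shown above, it would be called as parallel_download <filename>.txt <output_dir> 4 to run four copies at a time.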

-4
votes

I got the problem solved. It may be a little bit crude, but it works.

Using Python, I wrote one AWS download command per line into a single .sh file, then executed that file in the terminal.
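
The same idea can be sketched in bash instead of Python (a deliberate swap, since the answer does not show its script). Everything here is an assumption for illustration: the function name generate_download_script, its arguments, and the choice of aws s3 cp as the download command.

```shell
#!/bin/bash
# Sketch: write one "aws s3 cp" command per key into a script file.
# generate_download_script and its arguments are hypothetical names.
generate_download_script() {
  local list="$1" bucket="$2" dest="$3" out="$4" key
  {
    echo '#!/bin/bash'
    while IFS= read -r key || [ -n "$key" ]; do
      [ -z "$key" ] && continue       # skip blank lines
      # %q quotes each argument so spaces and special characters are safe
      printf 'aws s3 cp %q %q\n' "s3://$bucket/$key" "$dest/"
    done < "$list"
  } > "$out"
  chmod +x "$out"
}
```

For example, generate_download_script filename.txt bucket-name dest-path download.sh writes the commands, which can be reviewed before running ./download.sh.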