12
votes

At the following page

https://googlecloudplatform.github.io/google-cloud-python/latest/storage/blobs.html

there is a list of all the API calls that can be used with Python and Google Cloud Storage, but none of them appears to download an entire directory. Even the "official" samples on GitHub

https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/storage/cloud-client/snippets.py

do not include a related example.

Finally, downloading a directory with the same method used to download files gives the error:

Error:  [Errno 21] Is a directory:
4
What is it that you are asking? Also show us code. - Benedict K.

4 Answers

22
votes

You just have to first list all the files in a directory and then download them one by one:

from google.cloud import storage

bucket_name = 'your-bucket-name'
prefix = 'your-bucket-directory/'
dl_dir = 'your-local-directory/'

storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)
blobs = bucket.list_blobs(prefix=prefix)  # Get list of files
for blob in blobs:
    if blob.name.endswith('/'):
        continue  # Skip "folder" placeholder objects, which have no file content
    filename = blob.name.replace('/', '_')
    blob.download_to_filename(dl_dir + filename)  # Download

blob.name includes the entire directory structure plus the filename, so if you want the same file name as in the bucket, you might want to extract it first (instead of replacing / with _).
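
As a minimal sketch of that extraction (the helper name `local_filename` is made up for illustration, it is not part of the library): object names always use "/" as the separator, so `posixpath.basename` from the standard library can pull out the last component regardless of the local operating system.

```python
import posixpath


def local_filename(blob_name: str) -> str:
    """Return only the final path component of a Cloud Storage object name."""
    # Object names use "/" on every platform, so posixpath is used here
    # instead of os.path, which would look for "\\" on Windows.
    return posixpath.basename(blob_name)


print(local_filename('your-bucket-directory/reports/2020.csv'))
```

Note that flattening the names this way means two files with the same name in different subdirectories would overwrite each other locally, and a name ending in "/" yields an empty string.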

4
votes

If you want to keep the same directory structure without renaming, and also create the nested folders, here is a solution for Python 3.5+ based on @ksbg's answer:

from pathlib import Path
from google.cloud import storage

bucket_name = 'your-bucket-name'
prefix = 'your-bucket-directory/'
dl_dir = 'your-local-directory/'

storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)
blobs = bucket.list_blobs(prefix=prefix)  # Get list of files
for blob in blobs:
    if blob.name.endswith("/"):
        continue  # Skip "folder" placeholder objects
    # Recreate the blob's directory structure under dl_dir
    target = Path(dl_dir) / blob.name
    target.parent.mkdir(parents=True, exist_ok=True)
    blob.download_to_filename(str(target))
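
The loop above recreates the full object path locally, including the prefix itself. If you would rather drop the prefix and keep only the structure below it, one possible sketch is the helper below. The name `local_path_for` is hypothetical, and it assumes the prefix ends on a "/" boundary (otherwise `relative_to` raises `ValueError`).

```python
from pathlib import Path, PurePosixPath


def local_path_for(blob_name: str, prefix: str, dl_dir: str) -> Path:
    """Map an object name to a path under dl_dir, dropping the listing prefix."""
    # Object names are "/"-separated on every platform, so parse them as POSIX paths.
    relative = PurePosixPath(blob_name).relative_to(PurePosixPath(prefix))
    # Re-join the remaining components using the local OS separator.
    return Path(dl_dir).joinpath(*relative.parts)


print(local_path_for('your-bucket-directory/a/b.txt',
                     'your-bucket-directory/',
                     'your-local-directory/'))
```

Inside the download loop you could then call `target = local_path_for(blob.name, prefix, dl_dir)`, create `target.parent`, and pass `str(target)` to `blob.download_to_filename`.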
1
votes

Let's say we want to download FINALFOLDER from the storage path gs://TEST_BUCKET_NAME/FOLDER1/FOLDER2/FINALFOLDER. After downloading, the final path will look like D:\my_blob_data\FINALFOLDER.

from os import makedirs
from os.path import dirname, join
from google.cloud import storage

# if your environment was authenticated, the default config will be picked up
storage_client = storage.Client() # comment this line if you want to use service account

# uncomment the line below if you have a service account json
# storage_client = storage.Client.from_service_account_json('creds/sa.json')

bucket_name = 'TEST_BUCKET_NAME'
parent = 'FOLDER1/FOLDER2/'  # part of the object path to drop locally
prefix = parent + 'FINALFOLDER/'  # prefixes match from the start of the object name
dst_path = 'D:\\my_blob_data'

bucket = storage_client.bucket(bucket_name)
blobs = bucket.list_blobs(prefix=prefix)  # Get list of files
for blob in blobs:
    if blob.name.endswith('/'):
        continue  # skip "folder" placeholder objects
    # map FOLDER1/FOLDER2/FINALFOLDER/... to D:\my_blob_data\FINALFOLDER\...
    relative_parts = blob.name[len(parent):].split('/')
    dst_file_name = join(dst_path, *relative_parts)
    # create the destination directory if it does not exist
    makedirs(dirname(dst_file_name), exist_ok=True)
    # download the blob object
    blob.download_to_filename(dst_file_name)
-1
votes

See this link: https://medium.com/@sandeepsinh/multiple-file-download-form-google-cloud-storage-using-python-and-gcs-api-1dbcab23c44

1. Add your credentials JSON
2. List the bucket items
3. Download

import logging
import os
from google.cloud import storage

logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.DEBUG)
bucket_name = 'mybucket'
table_id = 'shakespeare'
storage_client = storage.Client.from_service_account_json('/google-cloud/keyfile/service_account.json')
# Local folder the files are downloaded into
folder = '/google-cloud/download/{}'.format(table_id)
delimiter = '/'
bucket = storage_client.get_bucket(bucket_name)
blobs = bucket.list_blobs(prefix=table_id, delimiter=delimiter)  # List all objects that satisfy the filter.

# Download the files to a destination
def download_to_local():
    logging.info('File download started... Wait for the job to complete.')
    # Create this folder locally if it does not exist
    if not os.path.exists(folder):
        os.makedirs(folder)
    # Iterate through the blobs, downloading each one with an API call
    for blob in blobs:
        logging.info('Blobs: {}'.format(blob.name))
        destination_uri = '{}/{}'.format(folder, blob.name)
        blob.download_to_filename(destination_uri)
        logging.info('Exported {} to {}'.format(blob.name, destination_uri))

if __name__ == '__main__':
    download_to_local()
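
The same list-then-download steps can also be wrapped in a function that receives the bucket as an argument, which makes the logic easy to try without network access. This is only a sketch: `download_prefix`, `FakeBlob` and `FakeBucket` are hypothetical names invented here, and the fakes mimic just the two calls the function relies on (`list_blobs(prefix=...)` and `download_to_filename`).

```python
import os
import tempfile


def download_prefix(bucket, prefix, dest_dir):
    """Download every object under prefix into dest_dir; return the local paths."""
    downloaded = []
    for blob in bucket.list_blobs(prefix=prefix):
        if blob.name.endswith('/'):
            continue  # "folder" placeholder object, nothing to download
        # Object names are "/"-separated; rebuild them with the local separator.
        target = os.path.join(dest_dir, *blob.name.split('/'))
        parent = os.path.dirname(target)
        if parent:
            os.makedirs(parent, exist_ok=True)
        blob.download_to_filename(target)
        downloaded.append(target)
    return downloaded


class FakeBlob:
    """Stand-in for a storage blob, holding its content in memory."""

    def __init__(self, name, data):
        self.name = name
        self._data = data

    def download_to_filename(self, filename):
        with open(filename, 'wb') as fh:
            fh.write(self._data)


class FakeBucket:
    """Stand-in for a storage bucket that filters objects by name prefix."""

    def __init__(self, objects):
        self._blobs = [FakeBlob(name, data) for name, data in objects.items()]

    def list_blobs(self, prefix=None):
        return iter(b for b in self._blobs if b.name.startswith(prefix or ''))


with tempfile.TemporaryDirectory() as tmp:
    fake = FakeBucket({'shakespeare/': b'', 'shakespeare/part-0.csv': b'a,b\n'})
    for path in download_prefix(fake, 'shakespeare/', tmp):
        print(os.path.relpath(path, tmp))
```

With the real client, the call would be along the lines of `download_prefix(storage_client.get_bucket(bucket_name), table_id, '/google-cloud/download')`, reusing the names from the answer above.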