I'm trying to read a .json file stored in a Google Cloud Storage bucket into a dict, from Python code running on a VM instance.

I tried reading the JSON file as a blob:

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket('bucket-id-here')
blob = bucket.get_blob('remote/path/to/file.json')
str_json = blob.download_as_string()

But I'm unable to decode str_json. Is my approach correct? If there is another approach, please let me know.

I need something like:

# Method to load JSON (result named `data` to avoid shadowing the built-in dict)
data = load_json(gcs_path='gs://bucket_name/filename.json')

2 Answers

4
votes

Here is an alternative way to do this using the official Cloud Storage library:

# Import the Google Cloud client library and JSON library
from google.cloud import storage
import json

# Instantiate a Google Cloud Storage client and specify required bucket and file
storage_client = storage.Client()
bucket = storage_client.get_bucket('bucket_name')
blob = bucket.blob('filename.json')

# Download the blob's contents (returned as bytes) and parse them with json.loads(),
# which accepts bytes as well as str on Python 3.6+
data = json.loads(blob.download_as_string())
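
The question asks for a load_json(gcs_path=...) helper, so here is a minimal sketch of how the steps above could be wrapped into one. The names split_gcs_path and load_json are hypothetical (not part of the library), and the sketch assumes a google-cloud-storage release recent enough to provide download_as_bytes(), the newer name for download_as_string().

```python
import json
from urllib.parse import urlparse


def split_gcs_path(gcs_path):
    """Split 'gs://bucket/path/to/file.json' into (bucket_name, blob_name)."""
    parsed = urlparse(gcs_path)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"Not a gs:// path: {gcs_path!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def load_json(gcs_path, client=None):
    """Hypothetical helper: download a JSON object from GCS and parse it."""
    # Imported here so split_gcs_path() stays usable without the library installed.
    from google.cloud import storage

    bucket_name, blob_name = split_gcs_path(gcs_path)
    client = client or storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    # The download returns bytes; json.loads() accepts bytes directly.
    return json.loads(blob.download_as_bytes())
```

With credentials configured, it would be called as data = load_json(gcs_path='gs://bucket_name/filename.json').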
0
votes

The GCS file system library, gcsfs, can also be used to read files from Google Cloud Storage.

# Reading gcs files with gcsfs
import gcsfs
import json

gcs_file_system = gcsfs.GCSFileSystem(project="gcp_project_name")
gcs_json_path = "gs://bucket_name/path/to/file.json"
with gcs_file_system.open(gcs_json_path) as f:
    json_dict = json.load(f)
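
As far as I know, open() here defaults to binary mode, so f yields bytes rather than str. That is fine for json.load(), which accepts binary file objects on Python 3.6+. A small sketch of that behavior, using io.BytesIO as a stand-in for the GCS file (the JSON payload is made up for illustration):

```python
import io
import json

# Stand-in for the binary file object that gcs_file_system.open(...) would return.
fake_gcs_file = io.BytesIO(b'{"name": "example", "count": 3}')

# json.load() reads the bytes and detects the UTF-8 encoding itself.
json_dict = json.load(fake_gcs_file)
```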

The same approach also works for images stored in GCS, for example with skimage:

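from skimage import io

# Placeholder path; replace with the actual image location
gcs_img_path = "gs://bucket_name/path/to/image.png"
with gcs_file_system.open(gcs_img_path) as f:
    img = io.imread(f)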