
I am using Spark on a Google Dataproc cluster. I have created a dictionary in a Jupyter notebook that I want to dump to my GCS bucket. However, the usual way of dumping to JSON using fopen() does not seem to work on GCP. So, how can I write my dictionary as a .json file to GCS? Or is there any other way to get the dictionary out?

It's funny: I could write a Spark DataFrame to GCS without any hassle, but apparently I can't write JSON to GCS unless I first have the file on my local system! Please help! Thank you.


1 Answer


The file in GCS is not on your local file system, which is why you cannot call "fopen" on it. You can either save to GCS directly with a GCS client (for example, this tutorial), or treat the GCS location as an HDFS destination (for example, saveAsTextFile("gs://...")).
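Both options could look roughly like the sketch below. It is only an illustration: the helper names, the bucket name "my-bucket" and the object paths are made up, and the first helper assumes the google-cloud-storage package is installed and the cluster's default credentials can write to the bucket.

```python
import json


def dict_to_gcs_with_client(bucket, blob_name, data):
    """Option 1: upload the dictionary as a single JSON object.

    `bucket` is assumed to be a google.cloud.storage Bucket, and
    `blob_name` is the object path inside it (hypothetical names).
    """
    payload = json.dumps(data)
    blob = bucket.blob(blob_name)
    # upload_from_string sends the text directly, so no local file is needed.
    blob.upload_from_string(payload, content_type="application/json")
    return payload


def dict_to_gcs_with_spark(sc, gcs_dir, data):
    """Option 2: let Spark write the JSON text to a gs:// location.

    `sc` is the SparkContext. Spark creates `gcs_dir` as a directory
    holding a part file, and fails if that directory already exists.
    """
    payload = json.dumps(data)
    # One partition, so the whole dictionary ends up in a single part file.
    sc.parallelize([payload], 1).saveAsTextFile(gcs_dir)
    return payload


# Example usage in the notebook (placeholder bucket and paths):
#
#   from google.cloud import storage
#   my_dict = {"a": 1, "b": [2, 3]}
#   bucket = storage.Client().bucket("my-bucket")
#   dict_to_gcs_with_client(bucket, "output/my_dict.json", my_dict)
#
#   dict_to_gcs_with_spark(sc, "gs://my-bucket/output/my_dict_json", my_dict)
```

The client route gives you one object with exactly the name you chose. The Spark route needs no extra library on Dataproc, but the result is a directory with a part file inside rather than a single .json file.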