I have a Spark job which is failing with the following error:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 34338.0 failed 4 times, most recent failure: Lost task 0.3 in stage 34338.0 (TID 61601, homeplus-cmp-transient-20190128165855-w-0.c.dh-homeplus-cmp-35920.internal, executor 80): java.io.IOException: Failed to rename FileStatus{path=gs://bucket/models/2018-01-30/model_0002002525030015/metadata/_temporary/0/_temporary/attempt_20190128173835_34338_m_000000_61601/part-00000; isDirectory=false; length=357; replication=3; blocksize=134217728; modification_time=1548697131902; access_time=1548697131902; owner=yarn; group=yarn; permission=rwx------; isSymlink=false} to gs://bucket/models/2018-01-30/model_0002002525030015/metadata/attempt_20190128173835_34338_m_000000_61601/attempt_20190128173835_34338_m_000000_61601/attempt_20190128173835_34338_m_000000_61601/attempt_20190128173835_34338_m_000000_61601/attempt_20190128173835_34338_m_000000_61601/attempt_20190128173835_34338_m_000000_61601/attempt_20190128173835_34338_m_000000_61601/part-00000
I'm unable to figure out which permission is missing. Since the Spark job was able to write the temporary files, I'm assuming write permissions on the bucket are already in place.
Comments:

…spark, hive or core properties during cluster creation. I just init the Hive metastore using cloud-sql-proxy. I'm running multiple jobs, but they write to different locations within the same bucket, something like gs://bucket/outputfolder/job1 and gs://bucket/outputfolder/job2. I'm seeing this error on every run. - kaysush

sudo sh -c 'echo "\nlog4j.logger.com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase=DEBUG" >> /etc/spark/conf/log4j.properties'

It will enable debug logs for the class that logs the exception that occurs during rename. You will be able to find these log messages in StackDriver using the GHFS.rename string. - Igor Dvorzhak

…Storage Legacy Owner role on the bucket. I added the Storage Admin role as well, and that seems to have solved the issue. Thanks. - kaysush
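Following Igor Dvorzhak's suggestion, one way to pull those debug entries out of StackDriver (Cloud Logging) is with gcloud logging read. This is only a sketch: the resource type and filter below are assumptions about how the Dataproc cluster ships its Spark logs, not something stated in the original post.

# Sketch: search Cloud Logging for GCS connector rename messages (filter values are assumptions)
gcloud logging read 'resource.type="cloud_dataproc_cluster" AND textPayload:"GHFS.rename"' --limit=20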
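For anyone hitting the same rename failure: the fix described in the last comment is to grant the Storage Admin role on the bucket to the identity the cluster VMs run as. A minimal sketch of that grant with gsutil is shown below; the service account email and bucket name are placeholders, not values from the original post.

# Sketch: grant roles/storage.admin on the bucket to the cluster's service account (placeholder values)
gsutil iam ch serviceAccount:cluster-sa@my-project.iam.gserviceaccount.com:roles/storage.admin gs://bucket

The same binding can also be added from the bucket's permissions page in the Cloud Console, which is what the comment above describes.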