I have a Spark application that runs as expected on a single node.
I am now using YARN to run it across multiple nodes, but it fails with a FileNotFoundException. I first changed the file path from a relative path to an absolute path, but the error persisted. I then read here that it may be necessary to prefix the path with file:// in case the default filesystem is HDFS. The file in question is a JSON file.
Despite using the absolute path and prefixing it with file://, the error persists:
16/11/10 10:19:56 INFO yarn.Client:
client token: N/A
diagnostics: User class threw exception: java.io.FileNotFoundException: file://absolute/dir/file.json (No such file or directory)
Why does this work correctly on one node but not in cluster mode with YARN?
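For reference, a minimal sketch of how such a load might look (the SparkSession setup and the exact path are assumptions, not taken from the question; it also uses the file:/// form with three slashes, since file:// alone is only the scheme plus an empty host):

import org.apache.spark.sql.SparkSession

object JsonLoadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("json-load-sketch")
      .getOrCreate()

    // Hypothetical local path: file:/// = "file" scheme + empty host + absolute path.
    // In YARN cluster mode this path is resolved on whichever node the driver
    // and executors happen to run, so the file must exist there as well.
    val df = spark.read.json("file:///absolute/dir/file.json")
    df.show()

    spark.stop()
  }
}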
I have also tried file://me@server/dir/file.json – LearningSlowly