I'm submitting a Spark job to a remote Spark cluster on YARN and passing a file to it with the --files option of spark-submit.
I want to read the submitted file as a DataFrame, but I'm not sure how to do that without first putting the file in HDFS:
spark-submit \
--class com.Employee \
--master yarn \
--files /User/employee.csv \
--jars SomeJar.jar
import org.apache.spark.sql.SparkSession
val spark: SparkSession = SparkSession.builder().getOrCreate() // create the Spark session
val df = spark.read.csv("/User/employee.csv")
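For context, here is a minimal sketch of one possible workaround, not a confirmed solution. It assumes the driver can see the shipped file locally, either under its bare name in the container working directory or at the path returned by SparkFiles.get; which of these holds may depend on the deploy mode and Spark version. It also assumes the file is small enough to read on the driver and that Spark is 2.2 or later, where spark.read.csv accepts a Dataset[String]. The object name EmployeeCsvSketch is hypothetical.

```scala
import java.nio.file.{Files, Paths}
import scala.io.Source

import org.apache.spark.SparkFiles
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical entry point, used only for illustration.
object EmployeeCsvSketch {
  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession.builder().getOrCreate()
    import spark.implicits._

    // Assumption: one of these locations exists on the driver.
    // --files keeps the base name of /User/employee.csv.
    val candidates = Seq("employee.csv", SparkFiles.get("employee.csv"))
    val localPath = candidates
      .find(p => Files.exists(Paths.get(p)))
      .getOrElse(sys.error(s"employee.csv not found in: ${candidates.mkString(", ")}"))

    // Read the whole file on the driver, so this only suits small files.
    val source = Source.fromFile(localPath, "UTF-8")
    val lines: List[String] =
      try source.getLines().toList
      finally source.close()

    // Parse the in-memory lines as CSV; no HDFS path is involved.
    val df: DataFrame = spark.read.csv(lines.toDS())
    df.show()

    spark.stop()
  }
}
```

Because the lines are read on the driver and then distributed as a Dataset, the executors never need to resolve a local path. The trade-off is that the whole file passes through driver memory.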