I get this exception in a Spark application submitted with spark-submit (Spark 2.4.0):
User class threw exception: org.apache.spark.sql.AnalysisException: Multiple sources found for parquet (org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat, org.apache.spark.sql.execution.datasources.parquet.DefaultSource), please specify the fully qualified class name.;
My application is:
import org.apache.spark.sql.SparkSession

// APP_NAME, warehouseLocation and query are defined elsewhere in the application
val sparkSession = SparkSession.builder()
  .appName(APP_NAME)
  .config("spark.sql.warehouse.dir", warehouseLocation)
  .enableHiveSupport()
  .getOrCreate()

sparkSession.sql(query)
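From the error text it looks like Spark wants the fully qualified class name instead of the "parquet" alias. A rough sketch of what I understand that to mean (hypothetical, with a made-up path; my real query only goes through sparkSession.sql):

// Hypothetical sketch, not part of my application: name the parquet source by its
// fully qualified class instead of the short "parquet" alias, as the exception asks.
// "/tmp/some/parquet/path" is a made-up placeholder.
val df = sparkSession.read
  .format("org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat")
  .load("/tmp/some/parquet/path")
df.show()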
I'm unable to figure out where this duplicate source for parquet is coming from.
Here is my spark-submit:
spark-submit-2.4.0 --master yarn-cluster \
  --files="/etc/hive/hive-site.xml" \
  --driver-class-path="/etc/hadoop/:/usr/lib/spark-packages/spark2.4.0/jars/:/usr/lib/spark-packages/spark2.4.0/lib/spark-assembly.jar:/usr/lib/hive/lib/"
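To try to narrow it down, this is a small diagnostic I could add to the application (a sketch of my own, assuming the sources are registered through the standard DataSourceRegister service files); it should print which classpath entries register data sources:

// Hypothetical diagnostic, not part of my application: list every classpath entry that
// registers a Spark data source, to see which jars contribute a parquet source.
import scala.collection.JavaConverters._

val serviceFile = "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister"
Thread.currentThread().getContextClassLoader
  .getResources(serviceFile)
  .asScala
  .foreach(url => println(url))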
Any suggestions?