
Py4JJavaError: An error occurred while calling o188.parquet. : java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found

I tried to add the missing hadoop-aws jar to the classpath with spark-submit, but the command failed. This is what I ran, followed by the error it produced:

!spark-submit --jars /content/hadoop-aws-2.7.1.jar

Exception in thread "main" java.lang.IllegalArgumentException: Missing application resource.


1 Answer


os.environ['PYSPARK_SUBMIT_ARGS'] = "--packages=org.apache.hadoop:hadoop-aws:2.7.3 pyspark-shell"
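
For context, the bare spark-submit call fails with "Missing application resource" because it was given no script to run. In a notebook the environment variable does the same job, with pyspark-shell acting as the application resource. Here is a minimal sketch of setting it up. The helper name build_submit_args is hypothetical, and the assumption that hadoop-aws 2.7.3 matches the Hadoop version bundled with your Spark install should be checked against your own setup.

```python
import os


def build_submit_args(packages):
    """Build a PYSPARK_SUBMIT_ARGS value for the given Maven coordinates.

    Hypothetical helper, not part of PySpark.
    """
    # "pyspark-shell" goes last; it serves as the application resource
    # that the bare `spark-submit --jars ...` call did not provide.
    return "--packages " + ",".join(packages) + " pyspark-shell"


# Assumption: 2.7.3 matches the Hadoop version your Spark build ships with.
HADOOP_AWS = "org.apache.hadoop:hadoop-aws:2.7.3"

# Set this before the first SparkSession/SparkContext is created;
# a JVM that is already running will not pick up the change.
os.environ["PYSPARK_SUBMIT_ARGS"] = build_submit_args([HADOOP_AWS])
print(os.environ["PYSPARK_SUBMIT_ARGS"])
```

If a Spark session already exists in the notebook, restarting the runtime before running this is probably necessary, since the packages are resolved when the JVM starts.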