I am trying to write a DataFrame to Cassandra using PySpark, but it is throwing an error:
py4j.protocol.Py4JJavaError: An error occurred while calling o74.save.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 6 in stage 3.0 failed 4 times, most recent failure: Lost task 6.3 in stage 3.0 (TID 24, ip-172-31-11-193.us-west-2.compute.internal, executor 1): java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
    at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsSupport$class.$init$(OutputMetricsUpdater.scala:107)
    at org.apache.spark.metrics.OutputMetricsUpdater$TaskMetricsUpdater.(OutputMetricsUpdater.scala:153)
    at org.apache.spark.metrics.OutputMetricsUpdater$.apply(OutputMetricsUpdater.scala:75)
    at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:209)
    at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:197)
    at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:183)
    at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
    at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:121)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Below is my code for the write:
DataFrame.write.format(
    "org.apache.spark.sql.cassandra"
).mode(
    'append'
).options(
    table="student1",
    keyspace="university"
).save()
I have added the following Spark Cassandra connector package to spark-defaults.conf:
spark.jars.packages datastax:spark-cassandra-connector:2.4.0-s_2.11
I am able to read data from Cassandra; the issue occurs only on write.
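For reference, here is a minimal sketch of the same setup expressed entirely in Python, in case it helps to reproduce the problem. The helper names `build_session`, `write_options`, and `write_to_cassandra` are hypothetical and only for illustration. It assumes the connector coordinate from my config can also be passed through the session builder (which I believe only takes effect if no JVM has been started yet), and it assumes a Cassandra host is already configured elsewhere.

```python
# Connector coordinate copied from spark-defaults.conf (groupId:artifactId:version).
CONNECTOR_PACKAGE = "datastax:spark-cassandra-connector:2.4.0-s_2.11"


def build_session(app_name="cassandra-write"):
    """Create a SparkSession that requests the connector package.

    pyspark is imported lazily so the rest of this module can be used
    without Spark installed. Assumption: no JVM is running yet, otherwise
    spark.jars.packages set here is likely ignored.
    """
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .config("spark.jars.packages", CONNECTOR_PACKAGE)
        .getOrCreate()
    )


def write_options(table, keyspace):
    """Return the options passed to the Cassandra data source."""
    return {"table": table, "keyspace": keyspace}


def write_to_cassandra(df, table, keyspace):
    """Append the rows of df to keyspace.table, same call chain as above."""
    (
        df.write
        .format("org.apache.spark.sql.cassandra")
        .mode("append")
        .options(**write_options(table, keyspace))
        .save()
    )
```

With this, the failing call would be `write_to_cassandra(df, "student1", "university")`, where `df` is a DataFrame created from the session returned by `build_session()`.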