I want to test a Spark SQL query against a DSE Cassandra table from Scala IDE. The query runs without problems when the jar file is executed with dse spark-submit, but it fails when run from Scala IDE with the following error:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Table or view not found: killr_video.videos; line 1 pos 14;
I suspect the Spark master is misconfigured, since I am running the master in local mode.
Here is the Spark session I create:
val spark = SparkSession
  .builder()
  .appName("CassandraSpark")
  .config("spark.cassandra.connection.host", "127.0.0.1")
  .config("spark.cassandra.connection.port", "9042")
  .enableHiveSupport()
  .master("local")
  .getOrCreate()
But I don't know what address to set as the master. I tried "spark://127.0.0.1:7077", which I found in the Web UI (localhost:7080) after starting Cassandra, but that still failed with the following error:
ERROR MapOutputTrackerMaster: Error communicating with MapOutputTracker
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
    at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:212)
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:222)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:190)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
    at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:100)
    at org.apache.spark.MapOutputTracker.sendTracker(MapOutputTracker.scala:110)
    at org.apache.spark.MapOutputTrackerMaster.stop(MapOutputTracker.scala:580)
    at org.apache.spark.SparkEnv.stop(SparkEnv.scala:84)
    at org.apache.spark.SparkContext$$anonfun$stop$11.apply$mcV$sp(SparkContext.scala:1797)
    at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1290)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1796)
    at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.dead(StandaloneSchedulerBackend.scala:142)
    at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint.markDead(StandaloneAppClient.scala:254)
    at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anon$2.run(StandaloneAppClient.scala:131)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
18/05/22 11:46:44 ERROR Utils: Uncaught exception in thread appclient-registration-retry-thread
org.apache.spark.SparkException: Error communicating with MapOutputTracker
    at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:104)
    at org.apache.spark.MapOutputTracker.sendTracker(MapOutputTracker.scala:110)
    at org.apache.spark.MapOutputTrackerMaster.stop(MapOutputTracker.scala:580)
    at org.apache.spark.SparkEnv.stop(SparkEnv.scala:84)
    at org.apache.spark.SparkContext$$anonfun$stop$11.apply$mcV$sp(SparkContext.scala:1797)
    at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1290)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1796)
    at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.dead(StandaloneSchedulerBackend.scala:142)
    at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint.markDead(StandaloneAppClient.scala:254)
    at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anon$2.run(StandaloneAppClient.scala:131)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
    at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:212)
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:222)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:190)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
    at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:100)
    ... 16 more
18/05/22 11:46:44 ERROR SparkContext: Error initializing SparkContext.
java.lang.NullPointerException
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:546)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2258)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:831)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:823)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:823)
18/05/22 11:46:44 INFO SparkContext: SparkContext already stopped.
Exception in thread "main" java.lang.NullPointerException
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:546)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2258)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:831)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:823)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:823)
What can I do to make this code work?
dse spark & execute your spark.sql query without creating the SparkSession instance? - Alex Ott
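
The comment above points at the dse spark shell, where a session named spark is already provided and the query can be issued directly. For a run from the IDE, one possible workaround is sketched below. It rests on several assumptions that the question does not confirm: that the open-source spark-cassandra-connector is on the IDE classpath, that the failing query is a plain SELECT on killr_video.videos, and that a local session does not see the Cassandra tables in its catalog. The object name CassandraSparkSketch, the helper cassandraOptions, and the temporary view name videos are hypothetical. The sketch loads the table explicitly through the connector's data source and registers it as a temporary view before querying it.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CassandraSparkSketch {

  // Hypothetical helper: the two options the Cassandra data source needs
  // in order to locate a table.
  def cassandraOptions(keyspace: String, table: String): Map[String, String] =
    Map("keyspace" -> keyspace, "table" -> table)

  def main(args: Array[String]): Unit = {
    // Same connection settings as in the question. enableHiveSupport() is
    // left out on the assumption that a local metastore would not list
    // the Cassandra tables anyway.
    val spark = SparkSession
      .builder()
      .appName("CassandraSpark")
      .config("spark.cassandra.connection.host", "127.0.0.1")
      .config("spark.cassandra.connection.port", "9042")
      .master("local[*]")
      .getOrCreate()

    try {
      // Load the table through the connector instead of relying on the catalog.
      val videos: DataFrame = spark.read
        .format("org.apache.spark.sql.cassandra")
        .options(cassandraOptions("killr_video", "videos"))
        .load()

      // Register a temporary view so Spark SQL can resolve the name.
      videos.createOrReplaceTempView("videos")

      // Assumed query; the original query text is not shown in the question.
      spark.sql("SELECT * FROM videos").show(10)
    } finally {
      spark.stop()
    }
  }
}
```

With this approach the SQL text refers to the temporary view (videos) rather than to killr_video.videos, so a query written for dse spark-submit would need that one name changed when run locally.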