I have been able to run a Spark application successfully in IntelliJ IDEA with the master set to local[*]. However, when I set the master to a separate Spark instance, an exception occurs.
The SparkPi app I am trying to execute is below:
    import scala.math.random

    import org.apache.spark.SparkContext
    import org.apache.spark.SparkConf

    /** Computes an approximation to pi */
    object SparkPi {
      def main(args: Array[String]) {
        val conf = new SparkConf()
          .setMaster("spark://tjvrlaptop:7077")
          .setAppName("Spark Pi")
          //.set("spark.scheduler.mode", "FIFO").set("spark.cores.max", "8")
        val spark = new SparkContext(conf)
        val slices = if (args.length > 0) args(0).toInt else 20
        val n = math.min(100000000L * slices, Int.MaxValue).toInt // avoid overflow
        for (j <- 1 to 1000000) {
          val count = spark.parallelize(1 until n, slices).map { i =>
            // sample a random point in the unit square and count it
            // if it falls inside the unit circle
            val x = random * 2 - 1
            val y = random * 2 - 1
            if (x * x + y * y < 1) 1 else 0
          }.reduce(_ + _)
          println("Pi is roughly " + 4.0 * count / n)
        }
        spark.stop()
      }
    }
Here are my build.sbt contents:
name := "SBTScalaSparkPi"
version := "1.0"
scalaVersion := "2.10.6"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.1"
Here are my plugins.sbt contents:
    logLevel := Level.Warn
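Since the master is a separate instance rather than local[*], I understand the compiled application classes may also need to be shipped to the executors. A minimal sketch of doing that with SparkConf.setJars (the object name and the jar path are assumptions; the path is based on the default `sbt package` output for the build above) would be:

    import org.apache.spark.{SparkConf, SparkContext}

    object SparkPiWithJars {
      def main(args: Array[String]) {
        val conf = new SparkConf()
          .setMaster("spark://tjvrlaptop:7077")
          .setAppName("Spark Pi")
          // Assumed path of the artifact produced by `sbt package` for the
          // build.sbt above; executors fetch this jar so they can
          // deserialize the application's closures.
          .setJars(Seq("target/scala-2.10/sbtscalasparkpi_2.10-1.0.jar"))
        val spark = new SparkContext(conf)
        // ... same computation as above ...
        spark.stop()
      }
    }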
I started the Spark master and a worker by running the following commands in separate command prompts on the same machine:
    spark-1.6.1-bin-hadoop2.6\bin>spark-class org.apache.spark.deploy.master.Master --host tjvrlaptop

    spark-1.6.1-bin-hadoop2.6\bin>spark-class org.apache.spark.deploy.worker.Worker spark://tjvrlaptop:7077
[The Master and Worker seem to be up and running without any issues][1]
[1]: http://i.stack.imgur.com/B3BDZ.png
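For reference, submitting the packaged jar from the command line instead of from the IDE would presumably look something like this (the jar name is an assumption based on sbt's default `package` output):

    spark-1.6.1-bin-hadoop2.6\bin>spark-submit --master spark://tjvrlaptop:7077 --class SparkPi target\scala-2.10\sbtscalasparkpi_2.10-1.0.jar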
Next, I tried to run the program in IntelliJ. It failed after a while with the following errors:
Command prompt where the master is running:
    16/03/27 14:44:33 INFO Master: Registering app Spark Pi
    16/03/27 14:44:33 INFO Master: Registered app Spark Pi with ID app-20160327144433-0000
    16/03/27 14:44:33 INFO Master: Launching executor app-20160327144433-0000/0 on worker worker-20160327140440-192.168.56.1-52701
    16/03/27 14:44:38 INFO Master: Received unregister request from application app-20160327144433-0000
    16/03/27 14:44:38 INFO Master: Removing app app-20160327144433-0000
    16/03/27 14:44:38 INFO Master: TJVRLAPTOP:55368 got disassociated, removing it.
    16/03/27 14:44:38 INFO Master: 192.168.56.1:55350 got disassociated, removing it.
    16/03/27 14:44:38 WARN Master: Got status update for unknown executor app-20160327144433-0000/0
Command prompt where the worker is running:
    16/03/27 14:44:34 INFO Worker: Asked to launch executor app-20160327144433-0000/0 for Spark Pi
    16/03/27 14:44:34 INFO SecurityManager: Changing view acls to: tjoha
    16/03/27 14:44:34 INFO SecurityManager: Changing modify acls to: tjoha
    16/03/27 14:44:34 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(tjoha); users with modify permissions: Set(tjoha)
    16/03/27 14:44:34 INFO ExecutorRunner: Launch command: "C:\Program Files\Java\jre1.8.0_77\bin\java" "-cp" "C:\Users\tjoha\Documents\spark-1.6.1-bin-hadoop2.6\bin\..\conf\;C:\Users\tjoha\Documents\spark-1.6.1-bin-hadoop2.6\bin\..\lib\spark-assembly-1.6.1-hadoop2.6.0.jar;C:\Users\tjoha\Documents\spark-1.6.1-bin-hadoop2.6\bin\..\lib\datanucleus-api-jdo-3.2.6.jar;C:\Users\tjoha\Documents\spark-1.6.1-bin-hadoop2.6\bin\..\lib\datanucleus-core-3.2.10.jar;C:\Users\tjoha\Documents\spark-1.6.1-bin-hadoop2.6\bin\..\lib\datanucleus-rdbms-3.2.9.jar" "-Xms1024M" "-Xmx1024M" "-Dspark.driver.port=55350" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@192.168.56.1:55350" "--executor-id" "0" "--hostname" "192.168.56.1" "--cores" "8" "--app-id" "app-20160327144433-0000" "--worker-url" "spark://Worker@192.168.56.1:52701"
    16/03/27 14:44:38 INFO Worker: Asked to kill executor app-20160327144433-0000/0
    16/03/27 14:44:38 INFO ExecutorRunner: Runner thread for executor app-20160327144433-0000/0 interrupted
    16/03/27 14:44:38 INFO ExecutorRunner: Killing process!
    16/03/27 14:44:38 INFO Worker: Executor app-20160327144433-0000/0 finished with state KILLED exitStatus 1
    16/03/27 14:44:38 INFO Worker: Cleaning up local directories for application app-20160327144433-0000
    16/03/27 14:44:38 INFO ExternalShuffleBlockResolver: Application app-20160327144433-0000 removed, cleanupLocalDirs = true
IntelliJ IDEA output:
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    16/03/27 15:06:04 INFO SparkContext: Running Spark version 1.6.1
    16/03/27 15:06:05 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    16/03/27 15:06:05 INFO SecurityManager: Changing view acls to: tjoha
    16/03/27 15:06:05 INFO SecurityManager: Changing modify acls to: tjoha
    16/03/27 15:06:05 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(tjoha); users with modify permissions: Set(tjoha)
    16/03/27 15:06:06 INFO Utils: Successfully started service 'sparkDriver' on port 56183.
    16/03/27 15:06:07 INFO Slf4jLogger: Slf4jLogger started
    16/03/27 15:06:07 INFO Remoting: Starting remoting
    16/03/27 15:06:07 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.56.1:56196]
    16/03/27 15:06:07 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 56196.
    16/03/27 15:06:07 INFO SparkEnv: Registering MapOutputTracker
    16/03/27 15:06:07 INFO SparkEnv: Registering BlockManagerMaster
    16/03/27 15:06:07 INFO DiskBlockManager: Created local directory at C:\Users\tjoha\AppData\Local\Temp\blockmgr-9623b0f9-81f5-4a10-bbc7-ba077d53a2e5
    16/03/27 15:06:07 INFO MemoryStore: MemoryStore started with capacity 2.4 GB
    16/03/27 15:06:07 INFO SparkEnv: Registering OutputCommitCoordinator
    16/03/27 15:06:07 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
    16/03/27 15:06:07 INFO Utils: Successfully started service 'SparkUI' on port 4041.
    16/03/27 15:06:07 INFO SparkUI: Started SparkUI at http://192.168.56.1:4041
    16/03/27 15:06:08 INFO AppClient$ClientEndpoint: Connecting to master spark://tjvrlaptop:7077...
    16/03/27 15:06:09 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20160327150608-0002
    16/03/27 15:06:09 INFO AppClient$ClientEndpoint: Executor added: app-20160327150608-0002/0 on worker-20160327150550-192.168.56.1-56057 (192.168.56.1:56057) with 8 cores
    16/03/27 15:06:09 INFO SparkDeploySchedulerBackend: Granted executor ID app-20160327150608-0002/0 on hostPort 192.168.56.1:56057 with 8 cores, 1024.0 MB RAM
    16/03/27 15:06:09 INFO AppClient$ClientEndpoint: Executor updated: app-20160327150608-0002/0 is now RUNNING
    16/03/27 15:06:09 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 56234.
    16/03/27 15:06:09 INFO NettyBlockTransferService: Server created on 56234
    16/03/27 15:06:09 INFO BlockManagerMaster: Trying to register BlockManager
    16/03/27 15:06:09 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.56.1:56234 with 2.4 GB RAM, BlockManagerId(driver, 192.168.56.1, 56234)
    16/03/27 15:06:09 INFO BlockManagerMaster: Registered BlockManager
    16/03/27 15:06:09 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
    16/03/27 15:06:10 INFO SparkContext: Starting job: reduce at SparkPi.scala:37
    16/03/27 15:06:10 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:37) with 20 output partitions
    16/03/27 15:06:10 INFO DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:37)
    16/03/27 15:06:10 INFO DAGScheduler: Parents of final stage: List()
    16/03/27 15:06:10 INFO DAGScheduler: Missing parents: List()
    16/03/27 15:06:10 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:33), which has no missing parents
    16/03/27 15:06:10 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1880.0 B, free 1880.0 B)
    16/03/27 15:06:10 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1212.0 B, free 3.0 KB)
    16/03/27 15:06:10 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.56.1:56234 (size: 1212.0 B, free: 2.4 GB)
    16/03/27 15:06:10 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
    16/03/27 15:06:10 INFO DAGScheduler: Submitting 20 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:33)
    16/03/27 15:06:10 INFO TaskSchedulerImpl: Adding task set 0.0 with 20 tasks
    16/03/27 15:06:14 INFO SparkDeploySchedulerBackend: Registered executor NettyRpcEndpointRef(null) (TJVRLAPTOP:56281) with ID 0
    16/03/27 15:06:14 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, TJVRLAPTOP, partition 0,PROCESS_LOCAL, 2078 bytes)
    16/03/27 15:06:14
...
    TJVRLAPTOP, partition 6,PROCESS_LOCAL, 2078 bytes)
    16/03/27 15:06:14 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, TJVRLAPTOP, partition 7,PROCESS_LOCAL, 2078 bytes)
    16/03/27 15:06:14 INFO BlockManagerMasterEndpoint: Registering block manager TJVRLAPTOP:56319 with 511.1 MB RAM, BlockManagerId(0, TJVRLAPTOP, 56319)
    16/03/27 15:06:15 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on TJVRLAPTOP:56319 (size: 1212.0 B, free: 511.1 MB)
    16/03/27 15:06:15 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, TJVRLAPTOP, partition 8,PROCESS_LOCAL, 2078 bytes)
    16/03/27 15:06:15 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, TJVRLAPTOP, partition 9,PROCESS_LOCAL, 2078 bytes)
    16/03/27 15:06:15 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 10, TJVRLAPTOP, partition 10,PROCESS_LOCAL, 2078 bytes)
    16/03/27 15:06:15 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 11, TJVRLAPTOP, partition 11,PROCESS_LOCAL, 2078 bytes)
    16/03/27 15:06:15
...
    java.lang.ClassNotFoundException: SparkPi$$anonfun$main$1$$anonfun$1
        at java.net.URLClassLoader.findClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Unknown Source)
        at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
        at java.io.ObjectInputStream.readNonProxyDesc(Unknown Source)
        at java.io.ObjectInputStream.readClassDesc(Unknown Source)
        at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
        at java.io.ObjectInputStream.readObject0(Unknown Source)
        at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
        at java.io.ObjectInputStream.readSerialData(Unknown Source)
        at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
        at java.io.ObjectInputStream.readObject0(Unknown Source)
        at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
        at java.io.ObjectInputStream.readSerialData(Unknown Source)
        at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
        at java.io.ObjectInputStream.readObject0(Unknown Source)
        at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
        at java.io.ObjectInputStream.readSerialData(Unknown Source)
        at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
        at java.io.ObjectInputStream.readObject0(Unknown Source)
        at java.io.ObjectInputStream.readObject(Unknown Source)
        at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
        at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
    16/03/27 15:06:15 INFO TaskSetManager: Lost task 5.0 in stage 0.0 (TID 5) on executor TJVRLAPTOP: java.lang.ClassNotFoundException (SparkPi$$anonfun$main$1$$anonfun$1) [duplicate 1]
    16/03/27 15:06:15 INFO TaskSetManager: Lost task 3.0 in stage 0.0 (TID 3) on executor TJVRLAPTOP: java.lang.ClassNotFoundException
...
    INFO TaskSetManager: Starting task 10.1 in stage 0.0 (TID 20, TJVRLAPTOP, partition 10,PROCESS_LOCAL, 2078 bytes)
    16/03/27 15:06:15
...
    TJVRLAPTOP, partition 3,PROCESS_LOCAL, 2078 bytes)
    16/03/27 15:06:15 INFO TaskSetManager: Lost task 4.0 in stage 0.0 (TID 4) on executor TJVRLAPTOP: java.lang.ClassNotFoundException (SparkPi$$anonfun$main$1$$anonfun$1) [duplicate 8]
    16/03/27 15:06:15 INFO TaskSetManager: Lost task 12.0 in stage 0.0 (TID 12) on executor TJVRLAPTOP: java.lang.ClassNotFoundException
...
    INFO TaskSetManager: Starting task 2.3 in stage 0.0 (TID 39, TJVRLAPTOP, partition 2,PROCESS_LOCAL, 2078 bytes)
    16/03/27 15:06:16 MapOutputTrackerMasterEndpoint stopped!
    16/03/27 15:06:16 WARN TransportChannelHandler: Exception in connection from TJVRLAPTOP/192.168.56.1:56281
    java.io.IOException: An existing connection was forcibly closed by the remote host
    16/03/27 15:06:17 INFO MemoryStore: MemoryStore cleared
    16/03/27 15:06:17 INFO BlockManager: BlockManager stopped
    16/03/27 15:06:17 INFO BlockManagerMaster: BlockManagerMaster stopped
    16/03/27 15:06:17 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
    16/03/27 15:06:17 INFO SparkContext: Successfully stopped SparkContext
    16/03/27 15:06:17 INFO ShutdownHookManager: Shutdown hook called
    16/03/27 15:06:17 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
    16/03/27 15:06:17 INFO ShutdownHookManager: Deleting directory C:\Users\tjoha\AppData\Local\Temp\spark-11f8184f-23fb-43be-91bb-113fb74aa8b9