2
votes

I have configured Eclipse for Scala, created a Maven project, and written a simple word-count Spark job on Windows. My Spark + Hadoop installation is on a Linux server. How can I launch my Spark code from Eclipse onto the Spark cluster (which is on Linux)?

Any suggestions?

4
Suggestion: use IntelliJ IDEA; personally I think it is the best IDE for Scala and Java. – Alberto Bonsanto
Yeah, but my question is how do I run my code on the cluster. Let's say I use IntelliJ IDEA, then how can I do it there? – Shashi
Where is your master? Do you use Mesos, YARN, or something else? – Alberto Bonsanto

4 Answers

2
votes

Actually, the answer is not as simple as you might expect.

I will make a few assumptions: first, that you use sbt; second, that you are working on a Linux-based machine; third, that you have two classes in your project, say RunMe and Globals; and last, that you want to set the configuration inside the program. Thus, somewhere in your runnable code you must have something like this:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object RunMe {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("mesos://master:5050") // if you use Mesos, and your network resolves the hostname "master" to its IP
      .setAppName("my-app")
      .set("spark.executor.memory", "10g")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc) // SQLContext is built on top of the SparkContext

    // your code comes here
  }
}

The steps you must follow are:

  • Compile the project from its root directory by running (a minimal sbt setup is sketched at the end of this answer):

    $ sbt assembly

  • Submit the job to the master node; this is the interesting part (assuming your project contains a target/scala/ directory, and inside it a .jar file that corresponds to the compiled project):

    $ spark-submit --class RunMe target/scala/app.jar

Notice that, because I assumed the project has two or more classes, you have to identify which class you want to run via --class. Furthermore, I expect the approaches for YARN and Mesos to be very similar.
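
For reference, here is a minimal sketch of the sbt setup that provides the sbt assembly task; the plugin and Spark versions below are assumptions, so adjust them to your environment:

// project/plugins.sbt -- the sbt-assembly plugin provides the assembly task
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3") // version is an assumption

// build.sbt -- mark Spark as "provided" so it is not bundled into the fat jar
name := "my-app"
scalaVersion := "2.10.6" // assumption: match the Scala version of your Spark build
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0" % "provided" // assumption: your Spark version
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.6.0" % "provided"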

0
votes

If you are developing the project on Windows and want to deploy it to a Linux environment, build an executable JAR file, copy it to the home directory of your Linux machine, and point your spark-submit script (on the terminal) at it. This is all possible thanks to the Java Virtual Machine. Let me know if you need more help.
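
A rough sketch of what that looks like from the terminal; the host name, user, paths, and class name here are placeholders, not values from the question:

$ scp target/scala-2.10/wordcount-assembly.jar user@linux-server:~/
$ ssh user@linux-server
$ spark-submit --class WordCount --master spark://master:7077 ~/wordcount-assembly.jar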

0
votes

To achieve what you want, you need:

First: build the jar (if you use Gradle: fatJar or shadowJar).

Second: in your code, when you create the SparkConf, specify the master address, spark.driver.host, and the corresponding jar location, something like:

SparkConf conf = new SparkConf()
    .setMaster("spark://SPARK-MASTER-ADDRESS:7077")
    .set("spark.driver.host", "IP address of your local machine")
    .setJars(new String[]{"path\\to\\your\\jar file.jar"})
    .setAppName("APP-NAME");

And third: just right-click and run from your IDE. That's it!
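
Since the question is about Scala, a rough Scala equivalent of the same configuration might look like this (the object name RunFromIde and all placeholder values are mine, not from the answer):

import org.apache.spark.{SparkConf, SparkContext}

object RunFromIde {
  def main(args: Array[String]): Unit = {
    // assumption: the same placeholder values as the Java snippet above
    val conf = new SparkConf()
      .setMaster("spark://SPARK-MASTER-ADDRESS:7077")
      .set("spark.driver.host", "IP address of your local machine")
      .setJars(Seq("path/to/your/jar-file.jar")) // the Scala API takes a Seq[String]
      .setAppName("APP-NAME")
    val sc = new SparkContext(conf)
    // your code comes here
  }
}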

-2
votes

What you are looking for is the master URL with which the SparkContext should be created.

You need to set your master to be the cluster you want to use.

I invite you to read the Spark Programming Guide or follow an introductory course to understand these basic concepts. Spark is not a tool you can start working with overnight; it takes some time.

http://spark.apache.org/docs/latest/programming-guide.html#initializing-spark
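
As that guide suggests, in practice you usually avoid hard-coding the master in the application and pass it to spark-submit instead; a minimal sketch of that pattern (the app name, class name, and jar path are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // no setMaster here: the master is supplied on the command line, e.g.
    //   spark-submit --master spark://master:7077 --class WordCount my-app.jar
    val conf = new SparkConf().setAppName("word-count")
    val sc = new SparkContext(conf)
    // your code comes here
  }
}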