We are trying to build a fat jar containing one small Scala source file and a ton of dependencies (a simple MapReduce-style example using Spark and Cassandra):
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import com.datastax.spark.connector._
import org.apache.spark.SparkConf

object VMProcessProject {

  def main(args: Array[String]) {
    val conf = new SparkConf()
      .set("spark.cassandra.connection.host", "127.0.0.1")
      .set("spark.executor.extraClassPath", "C:\\Users\\SNCUser\\dataquest\\ScalaProjects\\lib\\spark-cassandra-connector-assembly-1.3.0-M2-SNAPSHOT.jar")
    println("got config")

    val sc = new SparkContext("spark://US-L15-0027:7077", "test", conf)
    println("Got spark context")

    // read a Cassandra table as an RDD via the connector
    val rdd = sc.cassandraTable("test_ks", "test_col")
    println("Got RDDs")
    println(rdd.count())

    // map every row to 1 and sum, i.e. count the rows by hand
    val newRDD = rdd.map(x => 1)
    val count1 = newRDD.reduce((x, y) => x + y)
  }
}
We do not have a build.sbt file; instead we put the jars into a lib folder, keep the source files in src/main/scala, and run the project with sbt run. Our assembly.sbt file looks as follows:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")
When we run sbt assembly we get the following error message:
...
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: java heap space
at java.util.concurrent...
We're not sure how to change the JVM settings to increase the memory, since we are using sbt assembly to build the jar. Also, if there is something egregiously wrong with how we are writing the code or building the project, pointing that out would help us a lot too; there have been so many headaches trying to set up a basic Spark program!
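In case it clarifies what we are asking: we assume the OutOfMemoryError happens inside sbt's own JVM (since the assembly task runs in the sbt process), so we imagine the fix involves passing a larger -Xmx to the JVM that launches sbt, e.g. via the SBT_OPTS or JAVA_OPTS environment variables, or conf\sbtconfig.txt on Windows. Something like the following is what we have in mind, though the 2G value is a guess and we are not sure which of these the Windows launcher actually reads:

REM guess: raise the heap of the JVM that runs sbt (and therefore sbt assembly)
set SBT_OPTS=-Xmx2G
sbt assembly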