4
votes

Solution: I added --driver-memory 40G to the spark-submit command.


Question: My Spark cluster consists of 5 Ubuntu servers, each with 80G of memory and 24 cores. The word2vec training input is about 10G of news data. I submit the job in standalone mode like this:

spark-submit --name trainNewsdata --class Word2Vec.trainNewsData --master spark://master:7077 --executor-memory 70G --total-executor-cores 96 sogou.jar hdfs://master:9000/user/bd/newsdata/* hdfs://master:9000/user/bd/word2vecModel_newsdata

When I train the word2vec model in Spark, I get: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space. I don't know how to solve it, please help me :)

1
It may be different from that. - Lei Li
You do not have nearly enough details to determine it might be different. Try the diagnostics and solutions in the linked question, then say how it differs if it does. "It may be different" is at this point about as useful as "it may be cosmic rays". - Amadan
OK, I think I see it now: I added --driver-memory 40G to the spark-submit command. - Lei Li

1 Answer

3
votes

I added --driver-memory 40G to the spark-submit command, and that solved it.
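
For reference, here is a sketch of what the command from the question might look like with that flag added. The 40G value is the one reported above; every other argument is copied unchanged from the question. The "main" thread in the stack trace suggests the error came from the driver JVM rather than an executor, which would explain why raising --executor-memory alone did not help (the driver heap defaults to 1g unless set). The variable names DRIVER_MEMORY and CMD are only for illustration.

```shell
# Driver heap size; 40G is the value that worked for the asker.
# A smaller value may be enough, this was not tested here.
DRIVER_MEMORY="40G"

# Build the command piece by piece so the added flag is easy to spot.
CMD="spark-submit --name trainNewsdata --class Word2Vec.trainNewsData"
CMD="$CMD --master spark://master:7077"
CMD="$CMD --driver-memory $DRIVER_MEMORY"
CMD="$CMD --executor-memory 70G --total-executor-cores 96"
CMD="$CMD sogou.jar"
CMD="$CMD hdfs://master:9000/user/bd/newsdata/*"
CMD="$CMD hdfs://master:9000/user/bd/word2vecModel_newsdata"

# Printed for inspection rather than executed; paste the printed
# line into a terminal on the submitting machine to run it.
echo "$CMD"
```

Note that in client mode the driver JVM is already running by the time application code executes, so the driver heap has to be set on the spark-submit command line (or in the defaults file) rather than through SparkConf inside the application.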