
I'm running Spark 1.5.1 from Scala code and calling the ALS train method (MLlib). My job runs on Mesos executors. Since the data is large, I get the following error:

15/11/03 12:53:45 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, , PROCESS_LOCAL, 128730328 bytes)
[libprotobuf ERROR google/protobuf/io/coded_stream.cc:171] A protocol message was rejected because it was too big (more than 67108864 bytes). To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.

Any ideas on how to increase the limit?


1 Answer


It sounds like you are hitting the limit of "spark.kryoserializer.buffer.max": the rejected message is larger than 67108864 bytes, which is exactly the 64 MB default for that setting. Check whether Kryo serialization is in use. If it is, raise "spark.kryoserializer.buffer.max", which can be set as high as 2047m.
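For example, if you construct the SparkContext yourself, the setting can be applied before the context is created. This is a minimal sketch; the app name and the 1024m value are illustrative, not prescriptions:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Raise Kryo's maximum serialization buffer (default 64m, ceiling 2047m).
// This must be set before the SparkContext is created; changing it on a
// running context has no effect.
val conf = new SparkConf()
  .setAppName("ALSTraining") // hypothetical app name
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryoserializer.buffer.max", "1024m")

val sc = new SparkContext(conf)
```

The same property can also be passed at submit time, e.g. spark-submit --conf spark.kryoserializer.buffer.max=1024m, without touching the code.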

http://spark.apache.org/docs/1.5.1/configuration.html