
I'm running Spark 1.5.1 from Scala code and calling the ALS train method (MLlib). My job runs on Mesos executors. Since the data is large, I get the following error:

15/11/03 12:53:45 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, , PROCESS_LOCAL, 128730328 bytes)
[libprotobuf ERROR google/protobuf/io/coded_stream.cc:171] A protocol message was rejected because it was too big (more than 67108864 bytes). To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.

Any ideas on how to increase the limit?


1 Answer


It sounds like you are hitting the limit of "spark.kryoserializer.buffer.max": the rejected message is larger than 67108864 bytes, which is exactly the 64 MB default for that setting. Check whether Kryo serialization is in use. If it is, raise "spark.kryoserializer.buffer.max", which can be set as high as 2047m.
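For example, if you construct the SparkContext yourself, the setting can be applied before the context is created. This is a minimal sketch; the app name and the 1024m value are illustrative, not prescriptions:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Raise Kryo's maximum serialization buffer (default 64m, ceiling 2047m).
// This must be set before the SparkContext is created; changing it on a
// running context has no effect.
val conf = new SparkConf()
  .setAppName("ALSTraining") // hypothetical app name
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryoserializer.buffer.max", "1024m")

val sc = new SparkContext(conf)
```

The same property can also be passed at submit time, e.g. spark-submit --conf spark.kryoserializer.buffer.max=1024m, without touching the code.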

http://spark.apache.org/docs/1.5.1/configuration.html