
I am trying to run a Spark Kafka job, written in Java, that produces around 10K records per batch to a Kafka topic. It is a Spark batch job that reads 100 HDFS part files (1 million records in total) sequentially in a loop and produces each part file's 10K records as one batch. I am using the org.apache.kafka.clients.producer.KafkaProducer API.
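
Simplified, the producing loop looks roughly like this (an illustrative sketch only, not the exact code; props, listPartFiles, and readRecords are stand-ins for the real configuration and HDFS reading):

// Illustrative sketch: a producer is constructed for every part file / batch.
for (String partFile : listPartFiles()) {                 // hypothetical helper
    KafkaProducer<String, String> producer = new KafkaProducer<>(props);
    for (String record : readRecords(partFile)) {         // hypothetical helper
        producer.send(new ProducerRecord<>("my_topic", record));
    }
    producer.flush();
}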

I am getting the exception below:

org.apache.kafka.common.KafkaException: Failed to construct kafka producer
....
Caused by: org.apache.kafka.common.KafkaException: java.io.IOException: Too many open files
....
Caused by: java.io.IOException: Too many open files

Below are the configurations:

Cluster Resource availability:
---------------------------------
The cluster has more than 500 nodes, 150 terabytes of total memory, and more than 30K cores.

Spark Application configuration:
------------------------------------
--driver-memory: 24GB
--executor-cores: 5
--num-executors: 24
--executor-memory: 24GB

Topic Configuration:
--------------------
Partitions: 16
Replication: 3

Data size
----------
Each part file has 10K records
Total records: 1 million
Each batch produces 10K records

Please suggest some solutions, as this is a very critical issue.

Thanks in advance

Can you post the code of the Spark job? – Ehab Qadah

1 Answer


In Kafka, every topic is (optionally) split into many partitions. For each partition, the broker maintains several files (for the index and for the actual data).

kafka-topics --zookeeper localhost:2181 --describe --topic topic_name

will give you the number of partitions for topic topic_name. The default number of partitions per topic, num.partitions, is defined in /etc/kafka/server.properties.
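
For example (an illustrative excerpt; the value 16 just mirrors the topic configuration above):

# /etc/kafka/server.properties
num.partitions=16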

The total number of open files could be huge if the broker hosts many partitions and a particular partition has many log segment files.

You can see the current file descriptor limit by running:

ulimit -n

You can also check the number of open files using lsof:

lsof | wc -l
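
Note that ulimit -n is a per-process limit, so it is usually more telling to count the descriptors of the specific process, i.e. the broker or the Spark executor throwing the exception (<pid> is a placeholder):

lsof -p <pid> | wc -l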

To solve the issue, you either need to raise the open file descriptor limit:

ulimit -n <noOfFiles>
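
Note that ulimit -n only changes the limit for the current shell session. For a persistent change on a typical Linux system, you would set it in /etc/security/limits.conf (a sketch, assuming the broker runs as user kafka; 100000 is an example value):

kafka  soft  nofile  100000
kafka  hard  nofile  100000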

or reduce the number of open files (for example, by reducing the number of partitions per topic).
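
Also, since the exception here is thrown while constructing the producer on the Spark side, it is worth ruling out a producer leak in the job itself: constructing a new KafkaProducer for every part file without closing it leaks sockets and file descriptors on the executor. A minimal sketch of creating the producer once and reusing it for all batches (the broker address, topic name, and helper methods are placeholders, not taken from the actual job):

import java.util.Collections;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PartFileProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder broker address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        // Create the producer once; try-with-resources guarantees close(),
        // which releases its sockets and file descriptors.
        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            for (String partFile : listPartFiles()) {
                for (String record : readRecords(partFile)) {
                    producer.send(new ProducerRecord<>("topic_name", record));
                }
                producer.flush(); // one flush per 10K-record batch
            }
        }
    }

    // Hypothetical stand-ins for the HDFS part file reading logic.
    private static List<String> listPartFiles() { return Collections.emptyList(); }

    private static List<String> readRecords(String partFile) { return Collections.emptyList(); }
}

Since KafkaProducer is thread-safe, one shared producer instance per executor JVM is the usual pattern even when several tasks produce in parallel.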