2
votes

Has anyone had a similar problem with Kafka? I get this error: Too many open files. I don't know why. Here are some logs:

[2018-08-27 10:07:26,268] ERROR Error while deleting the clean shutdown file in dir /home/weihu/kafka/kafka/logs (kafka.server.LogDirFailureChannel)
java.nio.file.FileSystemException: /home/weihu/kafka/kafka/logs/BC_20180821_1_LOCATION-87/leader-epoch-checkpoint: Too many open files
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
        at java.nio.file.Files.newByteChannel(Files.java:361)
        at java.nio.file.Files.createFile(Files.java:632)
        at kafka.server.checkpoints.CheckpointFile.<init>(CheckpointFile.scala:45)
        at kafka.server.checkpoints.LeaderEpochCheckpointFile.<init>(LeaderEpochCheckpointFile.scala:62)
        at kafka.log.Log.initializeLeaderEpochCache(Log.scala:278)
        at kafka.log.Log.<init>(Log.scala:211)
        at kafka.log.Log$.apply(Log.scala:1748)
        at kafka.log.LogManager.loadLog(LogManager.scala:265)
        at kafka.log.LogManager.$anonfun$loadLogs$12(LogManager.scala:335)
        at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:62)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[2018-08-27 10:07:26,268] ERROR Error while deleting the clean shutdown file in dir /home/weihu/kafka/kafka/logs (kafka.server.LogDirFailureChannel)
java.nio.file.FileSystemException: /home/weihu/kafka/kafka/logs/BC_20180822_PARSE-136/leader-epoch-checkpoint: Too many open files
        ... (stack trace identical to the one above)
[2018-08-27 10:07:26,269] ERROR Error while deleting the clean shutdown file in dir /home/weihu/kafka/kafka/logs (kafka.server.LogDirFailureChannel)
java.nio.file.FileSystemException: /home/weihu/kafka/kafka/logs/BC_20180813_1_STATISTICS-402/leader-epoch-checkpoint: Too many open files
        ... (stack trace identical to the one above)
2
Would move this to the serverfault community – Baptiste Mille-Mathias

2 Answers

5
votes

In Kafka, every topic is (optionally) split into many partitions. For each partition, the broker maintains several files on disk (for the index and the actual data).

kafka-topics --zookeeper localhost:2181 --describe --topic topic_name

will show you the number of partitions for topic topic_name. The default number of partitions per topic, num.partitions, is defined in /etc/kafka/server.properties.
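
To check the broker's configured default, you can grep the config file (assuming the /etc/kafka/server.properties path mentioned above; it may differ in your installation):

grep '^num.partitions' /etc/kafka/server.properties   # Kafka's built-in default is 1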

The total number of open files can become very large if the broker hosts many partitions and each partition has many log segment files: the broker keeps the .log file plus index files open for every segment, so, for example, a broker with 1,000 partitions of 10 segments each already holds tens of thousands of file descriptors.

You can see the current (soft) limit on open file descriptors by running

ulimit -n
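
To see both the soft and the hard limit (the hard limit is the ceiling up to which a non-root user may raise the soft limit):

ulimit -Sn   # soft limit, the one processes actually hit
ulimit -Hn   # hard limit, the ceiling for ulimit -n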

You can also check the number of open files using lsof:

lsof | wc -l
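
Note that lsof without arguments counts open files for every process on the system. To count only the broker's descriptors, you can inspect its /proc entry (a sketch assuming the broker was started by the standard scripts, whose main class is kafka.Kafka):

KAFKA_PID=$(jps -l | awk '/kafka\.Kafka$/ {print $1}')   # find the broker's PID
ls /proc/"$KAFKA_PID"/fd | wc -l                         # count its open descriptors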

To solve the issue, you either need to raise the limit on open file descriptors:

ulimit -n <noOfFiles>

or somehow reduce the number of open files (for example, by reducing the number of partitions per topic). Note that ulimit -n only raises the soft limit for the current shell and the processes started from it, so it must be run before starting the broker.
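
For example (the limit value here is illustrative; it cannot exceed the hard limit shown by ulimit -Hn):

ulimit -n 100000                                     # raise the soft limit in this shell
bin/kafka-server-start.sh config/server.properties   # then start the broker from the same shell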

0
votes

On Linux distributions that use systemd, such as RHEL and CentOS, you need to add the Limit* lines shown in the [Service] block below to the systemd service file. Changing /etc/security/limits.conf is not sufficient, because systemd does not apply those PAM-based limits to the services it starts.

[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
LimitAS=infinity
LimitRSS=infinity
LimitCORE=infinity
LimitNOFILE=65536
ExecStart=/home/kafka/kafka/bin/zookeeper-server-start.sh /home/kafka/kafka/config/zookeeper.properties
ExecStop=/home/kafka/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
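
After editing the unit file, reload systemd and restart the service, then verify that the running process actually picked up LimitNOFILE (a sketch assuming the unit is installed as kafka.service; substitute your actual unit name):

sudo systemctl daemon-reload
sudo systemctl restart kafka
MAIN_PID=$(systemctl show -p MainPID kafka | cut -d= -f2)   # PID of the running service
grep 'open files' /proc/"$MAIN_PID"/limits                  # should show 65536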