I have a Kafka topic with one partition, and I'm trying to send messages to the broker. The source data is 1.5 TB in size. My broker has two directories for storing Kafka partitions:

/dev/sdc1       1.1T  567G  460G  56% /data_disk_0
/dev/sdd1       1.1T  1.1T     0 100% /data_disk_1

Each disk is 1.1 TB. Because my topic has only one partition, Kafka stores all of its messages on /dev/sdd1. Eventually that disk fills up completely, because the source is larger than the target disk. Can I spread the topic partition so that half of the data is stored on disk 0 and the other half on disk 1, without changing the number of partitions?

Please advise.

I couldn't find any Kafka configuration setting that would do this.

1 Answer

This isn't possible at the Kafka configuration level: the log for a single partition always lives entirely inside one log directory. You'd have to use RAID or a logical volume group (LVM) to pool the disks together into one volume.

The Kafka documentation says:

You can either RAID these drives together into a single volume or format and mount each drive as its own directory
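
To illustrate why the two separately mounted directories don't help here, below is a rough toy model in Python. It is not Kafka code: `LogDir`, `place_partition`, `append`, and `simulate` are hypothetical names, sizes are rounded to whole GB (1.1 TB is treated as 1100 GB), and the disks are assumed to start empty. It only assumes what the docs describe, namely that a new partition is placed in the directory that currently holds the fewest partitions and then grows there. Which of the two disks ends up as the "home" is incidental.

```python
from dataclasses import dataclass, field


@dataclass
class LogDir:
    """Toy stand-in for one Kafka log directory (hypothetical, not a Kafka class)."""
    path: str
    capacity_gb: int
    partitions: dict = field(default_factory=dict)  # partition name -> size in GB

    def used_gb(self):
        return sum(self.partitions.values())


def place_partition(log_dirs, name):
    """Put a new partition in the directory that holds the fewest partitions."""
    target = min(log_dirs, key=lambda d: len(d.partitions))
    target.partitions[name] = 0
    return target


def append(log_dir, name, size_gb):
    """Grow a partition; it can only grow inside its own directory."""
    if log_dir.used_gb() + size_gb > log_dir.capacity_gb:
        raise OSError(f"{log_dir.path} is full")
    log_dir.partitions[name] += size_gb


def simulate(total_gb, capacities_gb, chunk_gb=100):
    """Write total_gb into a single-partition topic; return (written, usage per dir)."""
    dirs = [LogDir(f"/data_disk_{i}", cap) for i, cap in enumerate(capacities_gb)]
    home = place_partition(dirs, "my-topic-0")
    written = 0
    try:
        for _ in range(total_gb // chunk_gb):
            append(home, "my-topic-0", chunk_gb)
            written += chunk_gb
    except OSError:
        pass  # the home directory filled up; the other directory is never used
    return written, [d.used_gb() for d in dirs]


# Two separately mounted 1.1 TB disks vs. one pooled 2.2 TB volume
separate = simulate(1500, [1100, 1100])
pooled = simulate(1500, [2200])
print(separate)
print(pooled)
```

In this model the two-directory layout stops accepting data once the partition's home disk is full, while the second disk stays untouched. The pooled volume, which is what RAID or LVM would present to Kafka as a single directory, takes the full 1.5 TB.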

If your data is this heavily skewed toward one disk (that is, toward certain partitions), you should check how your producers are partitioning the data, start persisting such a large topic somewhere else, or enable compaction or retention periods for the topic.