Recently, one of our five Kafka brokers was shut down uncleanly. Now that we are starting it up again, it logs a lot of warning messages about corrupted index files, and it is still starting up after more than 24 hours. This broker holds over 400 GB of data.
The rest of the brokers are up and running, but some partitions show -1 as their leader and the bad broker as the only ISR. None of the other replicas is being elected as the new leader, possibly because the bad broker is the only in-sync replica for those partitions.
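To make the symptom concrete, here is a minimal Python sketch that picks out the affected partitions from saved `kafka-topics.sh --describe` output. The line format encoded in the regex is an assumption that can differ between Kafka versions (some print `none` instead of `-1` for a missing leader), and the function name `leaderless_partitions` is purely illustrative.

```python
import re

# Assumed shape of a partition line in `kafka-topics.sh --describe` output, e.g.
#   Topic: events  Partition: 0  Leader: -1  Replicas: 3,1,2  Isr: 3
# Verify this against the output of your own Kafka version.
LINE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>\S+)\s+Replicas:\s*(?P<replicas>\S+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def leaderless_partitions(describe_output):
    """Return (topic, partition, isr_broker_ids) for partitions without a leader."""
    result = []
    for line in describe_output.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # topic summary lines and blank lines do not match
        # "-1" is what we see; "none" is handled in case the version prints that.
        if match.group("leader") in ("-1", "none"):
            isr = [int(b) for b in match.group("isr").split(",") if b]
            result.append((match.group("topic"), int(match.group("partition")), isr))
    return result
```

The input would be the text captured from `kafka-topics.sh --describe`. The tool also has an `--unavailable-partitions` option that restricts the output to partitions whose leader is not available, which may make the parsing step unnecessary.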
Broker Properties:
Replication Factor: 3
Min In Sync Replicas: 1
I am not sure how to handle this. Should I wait for the broker to repair everything on its own? Is it normal for recovery to take this long?
Is there anything else I can do? Please help.