We monitor a Sidekiq process with Monit. Once the process reaches around 2 GB of memory, we restart it. The start and stop programs are each defined with a timeout of 90 seconds, but the stop program fails after waiting out the full 90-second timeout.

This is a sample Monit configuration:

check process sidekiq
  with pidfile /pathtopidfile
  start program = "/bin/sh -c start sidekiq command" with timeout 90 seconds
  stop program = "stop sidekiq command" with timeout 90 seconds
  if totalmem is greater than 2GB for 3 cycles then restart
  ## I need a condition like this -> if "stop_program failed" then "do some action"
end

P.S. I don't know the correct syntax for catching a failed stop program in Monit. I checked the Monit blogs but could not find it.

Hello, I don't think such a feature exists in Monit. Usually the init/service/daemon scripts handle the timeout themselves and take the corresponding action. Aside from customizing your stop script to time out and take action itself, I cannot foresee any clean solution at the Monit level. - TheCodeKiller
@TheCodeKiller Thanks for your comment. I solved the problem the way you suggested: I customised the stop script to force-kill the Sidekiq process if it fails to stop within the timeout. - Karthy

1 Answer

I don't think Monit has any option to capture a failure of the stop or start program, so we have to handle those failure cases in the respective programs themselves. For example, if my stop program fails, I have to find out why it fails and take the corresponding action in the stop program itself.

My original problem was that the Sidekiq process was not being killed within the timeout, so the stop program failed. To resolve this, I changed the stop program so that if the Sidekiq process does not exit within the timeout, it hard-kills the process.
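
For illustration, here is a minimal sketch of that kind of stop program. The actual script is not shown in this thread, so this is only one way it might look, written in Python as an example. The function names, the 60-second default grace period, and the command-line arguments are all assumptions. The one real constraint is that the grace period should be shorter than Monit's 90-second stop timeout, so that the hard kill happens before Monit gives up.

```python
import os
import signal
import sys
import time


def signal_process(pid, sig):
    """Send sig to pid. Return False if no such process exists."""
    try:
        os.kill(pid, sig)  # sig 0 only checks that the process exists
    except ProcessLookupError:
        return False
    return True


def stop_with_fallback(pid, grace_seconds=60.0, poll_seconds=0.5):
    """Ask the process to exit with SIGTERM, then SIGKILL it if it is
    still alive after grace_seconds.

    Returns "not running", "stopped" (exited by itself) or "killed".
    """
    if not signal_process(pid, signal.SIGTERM):
        return "not running"

    # Poll until the process disappears or the grace period runs out.
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if not signal_process(pid, 0):
            return "stopped"
        time.sleep(poll_seconds)

    # Grace period is over: hard kill. If the process vanished in the
    # meantime, the signal fails and we report a normal stop.
    if signal_process(pid, signal.SIGKILL):
        return "killed"
    return "stopped"


# Hypothetical command-line use: stop_sidekiq.py PIDFILE [GRACE_SECONDS]
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as pidfile:
        target_pid = int(pidfile.read().strip())
    grace = float(sys.argv[2]) if len(sys.argv) > 2 else 60.0
    print(stop_with_fallback(target_pid, grace))
```

Saved under a hypothetical name such as `stop_sidekiq.py`, it could be wired in as the stop program, for example `stop program = "/usr/bin/python3 /path/to/stop_sidekiq.py /pathtopidfile 60" with timeout 90 seconds`. The interpreter and script paths there are placeholders, not values from the original setup.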