I am having a Rails app run with Nginx/Unicorn. Link to my nginx.conf and unicorn.rb
I am deploying via capistrano and initially the app working great. However, after a while (i.e ~ 10-20 minutes) having no request, then an incoming request will meet problem. (return error 500 something went wrong page).
Rails production.log show that it received request, but I didn't do anything else:
I, [2015-01-29T13:12:03.725959 #24176] INFO -- : Started GET "/" for xxx.18.170.101 at 2015-01-29 13:12:03 +0000
nginx's access.log
xxx.18.170.101 - - [29/Jan/2015:13:14:33 +0000] "GET / HTTP/1.1" 504 1477 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.91 Safari/537.36"
nginx' error.log
2015/01/29 13:14:33 [error] 23786#0: *20 upstream timed out (110: Connection timed out) while reading response header from upstream, client: xxx.18.170.101, server: , request: "GET / HTTP/1.1", upstream: "http://unix:/tmp/unicorn.my_app.sock/", host: "my_server"
unicorn's error.log show that it's timeout and kill the worker
D, [2015-01-29T13:01:45.386741 #22987] DEBUG -- : waiting 121.0s after suspend/hibernation
E, [2015-01-29T13:16:04.784404 #22987] ERROR -- : worker=0 PID:24176 timeout (241s > 240s), killing
E, [2015-01-29T13:16:04.794715 #22987] ERROR -- : reaped #<Process::Status: pid 24176 SIGKILL (signal 9)> worker=0
I, [2015-01-29T13:16:05.797775 #24377] INFO -- : worker=0 spawned pid=24377
I, [2015-01-29T13:16:05.827508 #24377] INFO -- : worker=0 ready
The strange is, after making few request, it will work back again: The normal production.log
I, [2015-01-29T13:34:39.054539 #24666] INFO -- : Started GET "/" for xx.18.170.101 at 2015-01-29 13:34:39 +0000
I, [2015-01-29T13:34:39.122488 #24666] INFO -- : Processing by ProductsController#index as HTML
....
And then if having no request for a while, the error will happen again.
Looking at timestamps and the difference between production.log of a success request and error request, I think there isn't any problem with nginx, unicorn setup. However, normally when a request come to the rails app, it will be processed processing by ...Controller
but after a while without request, the next request was stuck somewhere.
Any advice how to fix, or how debug this problem? I tried strace but the result is hard to figure out. The most consuming time is:
[pid 12649] 06:34:02 select(8, [7], NULL, NULL, {44, 88867}) = 0 (Timeout) <44.105805>