3
votes

I am having a Rails app run with Nginx/Unicorn. Link to my nginx.conf and unicorn.rb

I am deploying via capistrano and initially the app working great. However, after a while (i.e ~ 10-20 minutes) having no request, then an incoming request will meet problem. (return error 500 something went wrong page).

Rails production.log show that it received request, but I didn't do anything else:

I, [2015-01-29T13:12:03.725959 #24176]  INFO -- : Started GET "/" for xxx.18.170.101 at 2015-01-29 13:12:03 +0000

nginx's access.log

xxx.18.170.101 - - [29/Jan/2015:13:14:33 +0000] "GET / HTTP/1.1" 504 1477 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.91 Safari/537.36"

nginx' error.log

2015/01/29 13:14:33 [error] 23786#0: *20 upstream timed out (110: Connection timed out) while reading response header from upstream, client: xxx.18.170.101, server: , request: "GET / HTTP/1.1", upstream: "http://unix:/tmp/unicorn.my_app.sock/", host: "my_server"

unicorn's error.log show that it's timeout and kill the worker

D, [2015-01-29T13:01:45.386741 #22987] DEBUG -- : waiting 121.0s after suspend/hibernation
E, [2015-01-29T13:16:04.784404 #22987] ERROR -- : worker=0 PID:24176 timeout (241s > 240s), killing
E, [2015-01-29T13:16:04.794715 #22987] ERROR -- : reaped #<Process::Status: pid 24176 SIGKILL (signal 9)> worker=0
I, [2015-01-29T13:16:05.797775 #24377]  INFO -- : worker=0 spawned pid=24377
I, [2015-01-29T13:16:05.827508 #24377]  INFO -- : worker=0 ready

The strange is, after making few request, it will work back again: The normal production.log

I, [2015-01-29T13:34:39.054539 #24666]  INFO -- : Started GET "/" for xx.18.170.101 at 2015-01-29 13:34:39 +0000
I, [2015-01-29T13:34:39.122488 #24666]  INFO -- : Processing by ProductsController#index as HTML
....

And then if having no request for a while, the error will happen again.

Looking at timestamps and the difference between production.log of a success request and error request, I think there isn't any problem with nginx, unicorn setup. However, normally when a request come to the rails app, it will be processed processing by ...Controller but after a while without request, the next request was stuck somewhere.

Any advice how to fix, or how debug this problem? I tried strace but the result is hard to figure out. The most consuming time is:

[pid 12649] 06:34:02 select(8, [7], NULL, NULL, {44, 88867}) = 0 (Timeout) <44.105805>
1

1 Answers

0
votes

I'm not 100% sure about this answer, but I can't leave comments yet. So I can't tell you exactly why your request is taking so long but the important error message is the one where your unicorn worker is getting killed.

From your unicorn.rb: timeout 240

From your unicorn_error.log:

ERROR -- : worker=0 PID:24176 timeout (241s > 240s), killing

So that request is taking longer than 240 seconds. Try increasing your unicorn timeout so that the request can go through and then view your log files for what is taking so long. You are right about unicorn not being the problem because unless you are doing something you and the user expects to take a long time you shouldn't be taking +240s.