502 Bad gateway nginx with PHP-FPM under high load

Question

We are currently running PHP-FPM behind nginx on Amazon EC2. Every time the site comes under high load, it stops responding and returns 502 Bad Gateway to clients.

This is the log from php-fpm error.log

[25-Feb-2014 10:29:50] WARNING: [pool www] server reached pm.max_children setting (14), consider raising it

[25-Feb-2014 12:23:11] WARNING: [pool www] child 2029 exited with code 3 after 8736.088351 seconds from start

[25-Feb-2014 12:23:11] NOTICE: [pool www] child 4142 started

This is the log from nginx error.log

2014/02/25 14:14:30 [error] 2013#0: *51168 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.160.215, server: domain.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"

2014/02/25 14:24:15 [error] 2013#0: *51310 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.160.215, server: domain.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"

2014/02/25 14:40:21 [error] 2013#0: *51312 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.160.215, server: domain.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"

We have already applied the TCP/IP configuration suggested in this question: Error 502 in nginx + php5-fpm

We have also applied the fix from this question to our php-fpm config: 502 Gateway Errors under High Load (nginx/php-fpm)

This is the config we used in php-fpm.d/www.conf

listen = 127.0.0.1:9000
pm = dynamic
pm.max_children = 14
pm.start_servers = 7
pm.min_spare_servers = 7
pm.max_spare_servers = 14

and the config in nginx/conf.d/www.conf looks like this

fastcgi_buffers 256 16k;
fastcgi_buffer_size 32k;
fastcgi_connect_timeout 300;
fastcgi_send_timeout 300;
fastcgi_read_timeout 300;

Try increasing pm.max_children from 14 (e.g. to 64), and decrease the timeouts. — ziollek
@ziollek Is there any specific theory behind what number we should set max children to? — maeto
This link might be useful for you in terms of tuning your configuration if-not-true-then-false.com/2011/… — Mohammad AbuShady

datasage · Accepted Answer · 2014-02-25T16:37:44

With PHP-FPM, requests that need PHP processing are handed off from nginx to a php-fpm worker process, and the result is returned to nginx.

If too many requests arrive at once (which can happen if some of your requests take too long, or if your resources are not well matched to your load), php-fpm runs out of free workers and requests start to time out or be rejected. That is the 502 error you are seeing.
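As a rough illustration of how slow requests exhaust the pool, here is a minimal sketch based on Little's law (average concurrency = arrival rate × average request time). The function name `workers_needed` and the traffic figures are hypothetical, chosen for illustration only; they are not measurements from the question.

```python
import math


def workers_needed(requests_per_second: float, avg_request_seconds: float) -> int:
    """Estimate how many PHP workers are busy on average (Little's law)."""
    if requests_per_second < 0 or avg_request_seconds < 0:
        raise ValueError("inputs must be non-negative")
    return math.ceil(requests_per_second * avg_request_seconds)


# Hypothetical traffic figures (assumptions, not data from the question).
POOL_SIZE = 14  # pm.max_children from the question's www.conf

for rate, duration in [(20, 0.5), (20, 2.0), (50, 0.5)]:
    need = workers_needed(rate, duration)
    verdict = "fits within" if need <= POOL_SIZE else "exceeds"
    print(f"{rate} req/s at {duration}s each -> {need} workers ({verdict} {POOL_SIZE})")
```

The same request rate can fit comfortably or overflow the pool depending only on how long each request takes, which is why slow requests matter as much as raw traffic.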

[25-Feb-2014 10:29:50] WARNING: [pool www] server reached pm.max_children setting (14), consider raising it

You can increase this, but that may not be a solution in itself. You may well be reaching max children because of the time it takes to process a single request on your instance. If your CPU is maxing out when this happens, raising the limit is probably not going to help.
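To make the CPU point concrete, here is a simplified sketch. It assumes each request burns a fixed amount of CPU time and ignores I/O wait, memory limits, and other processes on the instance. The helper names and the numbers are hypothetical.

```python
def max_cpu_throughput(cores: int, cpu_seconds_per_request: float) -> float:
    """Upper bound on requests/second if every request costs this much CPU time."""
    if cores <= 0 or cpu_seconds_per_request <= 0:
        raise ValueError("cores and cpu_seconds_per_request must be positive")
    return cores / cpu_seconds_per_request


def more_children_can_help(requests_per_second: float, cores: int,
                           cpu_seconds_per_request: float) -> bool:
    """True if the CPU still has headroom at this arrival rate."""
    return requests_per_second < max_cpu_throughput(cores, cpu_seconds_per_request)


# Hypothetical instance: 2 cores, 0.25 s of CPU per request (assumed values).
print(max_cpu_throughput(2, 0.25))          # ceiling in requests per second
print(more_children_can_help(5, 2, 0.25))   # below the ceiling: workers may be waiting on I/O
print(more_children_can_help(20, 2, 0.25))  # above the ceiling: extra workers only queue for CPU
```

Under this model, extra children only help while the arrival rate stays below the CPU ceiling. Past that point, the options are faster requests (for example through caching) or more CPU.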

You may want to consider increasing your instance size as a short-term solution, or making code changes to take better advantage of caching.
