We are currently running PHP-FPM behind nginx on Amazon EC2. Under high load, the site stops responding and returns 502 Bad Gateway to clients every time.
This is the log from the php-fpm error.log:
[25-Feb-2014 10:29:50] WARNING: [pool www] server reached pm.max_children setting (14), consider raising it
[25-Feb-2014 12:23:11] WARNING: [pool www] child 2029 exited with code 3 after 8736.088351 seconds from start
[25-Feb-2014 12:23:11] NOTICE: [pool www] child 4142 started
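To see how often the pool actually hits its limit, the php-fpm log can be scanned for the max_children warning. Below is a minimal sketch in Python; the function name `find_max_children_hits` and the regex are illustrative assumptions based only on the log format shown above, not an official php-fpm tool.

```python
import re

# Matches the php-fpm warning emitted when a pool reaches pm.max_children.
# The pattern is an assumption derived from the log lines quoted above.
MAX_CHILDREN_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] WARNING: \[pool (?P<pool>[^\]]+)\] "
    r"server reached pm\.max_children setting \((?P<limit>\d+)\)"
)

def find_max_children_hits(lines):
    """Return a (timestamp, pool, limit) tuple for each max_children warning."""
    hits = []
    for line in lines:
        match = MAX_CHILDREN_RE.match(line.strip())
        if match:
            hits.append(
                (match.group("ts"), match.group("pool"), int(match.group("limit")))
            )
    return hits

# Sample input: the three php-fpm log lines quoted above.
sample = [
    "[25-Feb-2014 10:29:50] WARNING: [pool www] server reached pm.max_children setting (14), consider raising it",
    "[25-Feb-2014 12:23:11] WARNING: [pool www] child 2029 exited with code 3 after 8736.088351 seconds from start",
    "[25-Feb-2014 12:23:11] NOTICE: [pool www] child 4142 started",
]
hits = find_max_children_hits(sample)
print(hits)
```

On a real server the same function could be fed the lines of the actual error.log; the timestamps of the hits can then be compared with the times of the 502 errors in the nginx log.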
This is the log from the nginx error.log:
2014/02/25 14:14:30 [error] 2013#0: *51168 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.160.215, server: domain.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
2014/02/25 14:24:15 [error] 2013#0: *51310 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.160.215, server: domain.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
2014/02/25 14:40:21 [error] 2013#0: *51312 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.160.215, server: domain.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
We have already applied the TCP/IP tuning suggested in this question: Error 502 in nginx + php5-fpm
and we have also applied the php-fpm config fix from this one: 502 Gateway Errors under High Load (nginx/php-fpm)
This is the config we use in php-fpm.d/www.conf:
listen = 127.0.0.1:9000
pm = dynamic
pm.max_children = 14
pm.start_servers = 7
pm.min_spare_servers = 7
pm.max_spare_servers = 14
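Since the php-fpm log suggests raising pm.max_children, a common rule of thumb is to divide the memory that can be given to PHP by the average memory used per child process. The sketch below shows that arithmetic in Python; the function name and all the numbers in it are hypothetical placeholders, not measurements from our instance.

```python
def estimate_max_children(available_mb, avg_child_mb, reserve_mb=0):
    """Rough upper bound for pm.max_children.

    available_mb: memory on the machine, in MB (assumed value)
    avg_child_mb: average resident memory of one php-fpm child, in MB (assumed value)
    reserve_mb:   memory kept back for nginx, the OS, and other services
    """
    if avg_child_mb <= 0:
        raise ValueError("avg_child_mb must be positive")
    usable_mb = available_mb - reserve_mb
    # Floor division: a partial child does not fit; never return a negative count.
    return max(int(usable_mb // avg_child_mb), 0)

# Hypothetical example: 1024 MB total, 256 MB reserved, 64 MB per child.
estimate = estimate_max_children(1024, 64, reserve_mb=256)
print(estimate)
```

The result is only a ceiling; the real per-child memory would have to be measured on the server before changing pm.max_children.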
and the config in nginx/conf.d/www.conf looks like this:
fastcgi_buffers 256 16k;
fastcgi_buffer_size 32k;
fastcgi_connect_timeout 300;
fastcgi_send_timeout 300;
fastcgi_read_timeout 300;