16 votes

We've set up 3 servers:

  • Server A with Nginx + HAProxy to perform load balancing
  • Backend server B
  • Backend server C

Here is our /etc/haproxy/haproxy.cfg:

global
        log /dev/log   local0
        log 127.0.0.1   local1 notice
        maxconn 40096
        user haproxy
        group haproxy
        daemon

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        retries 3
        option redispatch
        maxconn 2000
        contimeout      50000
        clitimeout      50000
        srvtimeout      50000
        stats enable
        stats uri /lb?stats
        stats realm Haproxy\ Statistics
        stats auth admin:admin
listen statslb :5054 # choose different names for the 2 nodes
        mode http
        stats enable
        stats hide-version
        stats realm Haproxy\ Statistics
        stats uri /
        stats auth admin:admin

listen  Server-A 0.0.0.0:80    
        mode http
        balance roundrobin
        cookie JSESSIONID prefix
        option httpchk HEAD /check.txt HTTP/1.0
        server  Server-B <server.ip>:80 cookie app1inst2 check inter 1000 rise 2 fall 2
        server  Server-C <server.ip>:80 cookie app1inst2 check inter 1000 rise 2 fall 3

All three servers have plenty of RAM and CPU cores to handle the requests.

Random HTTP 503 errors appear while browsing: 503 Service Unavailable - No server is available to handle this request.

And also on the server's console:

Message from syslogd@server-a at Dec 21 18:27:20 ...
 haproxy[1650]: proxy Server-A has no server available!

Note that 90% of the time there are no errors; they happen randomly.

Did you ever find the answer? We have something similar. – jeesty
Found the answer... Please accept the answer. – Siten

7 Answers

29 votes

I had the same problem. After days of pulling my hair out, I found the cause.

I had two HAProxy instances running. One was a zombie that somehow never got killed, perhaps during an update or an HAProxy restart. I noticed this when refreshing the /haproxy stats page: the PID kept alternating between two different numbers, and the page for one of them showed absurd connection stats. To confirm, I ran

netstat -tulpn | grep 80

Or

sudo lsof -i:80

and saw two haproxy processes listening on port 80.

To fix the issue, I ran kill xxxx, where xxxx is the PID with the suspicious statistics.
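
As an extra sanity check (assuming a standard Linux ps), something like the following lists every haproxy process along with how long it has been running, which makes a leftover instance easy to spot:

# list PID, elapsed running time and command line of every haproxy process;
# the [h] trick stops grep from matching its own process
ps -eo pid,etime,cmd | grep '[h]aproxy'

An entry whose elapsed time predates your last restart is a good candidate for the zombie.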

9 votes

Adding my answer here for anyone else who encounters this exact same problem but finds that none of the solutions listed above apply. Please note that my answer does not apply to the original config listed above.

For anyone else who may have this problem: check whether you have mistakenly put the same "bind" line in multiple sections of your config. HAProxy does not check for this during startup, and I plan to submit it as a recommended validation check to the developers. In my case, the config had 3 different sections and I mistakenly put the same IP binding in two of them. It was about a 50/50 shot whether the correct or the incorrect section would be used, and even when the correct section was used, about half of the requests still got a 503. A quick way to scan for this is sketched below.
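
Until such a validation exists, a rough grep (assuming one bind directive per line and the default config path; adjust both to your setup) can surface duplicated bindings:

# strip leading whitespace from every bind line, then print any that occurs twice
grep -E '^[[:space:]]*bind' /etc/haproxy/haproxy.cfg | sed 's/^[[:space:]]*//' | sort | uniq -d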

1 vote

It is possible that your backend servers share a common resource that times out now and then, and that your health checks all fire at the same moment (thus pulling both backend servers out of rotation at the same time).

You can try using the HAProxy option spread-checks to randomize the health checks, as sketched below.
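
A minimal sketch of what that could look like: spread-checks is a global keyword taking a percentage (0 to 50) of jitter applied to the check interval; the 50 here is the maximum allowed, not a recommendation:

global
        # add up to 50% random jitter to each health-check interval so the
        # checks against B and C do not all fire at the same instant
        spread-checks 50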

1 vote

I had the same issue, caused by 2 HAProxy services running on the same Linux box, but with different names/PIDs/resources. Until I stopped the unwanted one, the required instance threw 503 errors randomly, roughly 1 time in 5.

I was trying to use a single Linux box to route multiple URLs, but this looks like a limitation of HAProxy, or of the HAProxy config file I have defined. A quick way to check for a second instance is shown below.
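
As in the accepted answer, a quick way to confirm this (assuming pgrep is available) is to list every haproxy process with its full command line; two entries pointing at different configs reveal the duplicate service:

# -a prints each matching PID followed by its full command line
pgrep -a haproxy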

0 votes

Hard to say without more details, but is it possible you are exceeding the configured maxconn for each backend? The stats UI shows these numbers for both the frontend and the individual backends.
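
If that turns out to be the cause, a per-server maxconn can be set on the server lines. A sketch against the original config; the value 3000 is an assumption, so size it to what servers B and C can actually handle:

        # example values only: allow up to 3000 concurrent connections per
        # server; excess requests wait in the backend queue instead of failing
        server  Server-B <server.ip>:80 cookie app1inst2 maxconn 3000 check
        server  Server-C <server.ip>:80 cookie app1inst2 maxconn 3000 check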

0 votes

I resolved my intermittent 503s with HAProxy by adding option http-server-close to the backend. It looks like uWSGI (which is upstream) does not handle keep-alive well. I'm not sure what's really behind the problem, but since adding this option I haven't seen a single 503.
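
A minimal sketch of where the option goes; the backend name and server address are placeholders rather than values from the question:

backend uwsgi_app
        # close the server-side connection after each response instead of
        # keeping it alive, sidestepping the upstream keep-alive issue
        option http-server-close
        server app1 127.0.0.1:8000 check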

0 votes

Don't use the same "bind" line in multiple sections of your haproxy.cfg. For example, this would be wrong:

frontend stats
        bind *:443 ssl crt /etc/ssl/certs/your.pem
frontend Main
        bind *:443 ssl crt /etc/ssl/certs/your.pem

Fix it like this:

frontend stats
        bind *:8443 ssl crt /etc/ssl/certs/your.pem
frontend Main
        bind *:443 ssl crt /etc/ssl/certs/your.pem