
Has anyone load-tested ZMQ sockets in a real-world scenario for the maximum number of concurrent users (not throughput) they can handle? It looks like ZeroMQ has some serious problems with FD limits.

The scenario is this: numerous web-server frameworks boast of handling millions of concurrent users. If ZeroMQ cannot handle more than FD_SETSIZE users at any point in time, that is a very serious restriction on scalability (FDs are not just process resources but also machine resources, so there is no point in spawning a new process on the same machine).

To verify, I am load testing ZMQ_STREAM to find out how many concurrent users it can sustain. It's a simple "hello-world" response server that just listens on ZMQ_STREAM and returns "hello world" for every request (in a strict receive-followed-by-send style).

While testing with JMeter (using users=1000), I hit the assertion zmq_assert (fds.size () <= FD_SETSIZE). What does this signify? That ZMQ is holding FD_SETSIZE FDs? As far as I can tell from the code below, each connection is opened and closed immediately, so I do not see how more than a few FDs could be open simultaneously at any point in time.

Question: If this is the case, how can any ZMQ-based app achieve a million concurrent user connections? (Apart from the obvious and meaningless solution of having 1000 machines each handling 1000 users, or increasing FD_SETSIZE to an insanely large number.)

If anyone knows how and why these FDs are used and how they get exhausted (and, more importantly, why other frameworks such as nginx and node.js do not have this problem), please shed some light.

The server code, if it matters, is below:

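#include <zmq.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <iostream>

int main(void)
{
    void *ctx = zmq_ctx_new();

    void *socket = zmq_socket(ctx, ZMQ_STREAM);
    int rc = zmq_bind(socket, "tcp://*:8080");
    assert(rc == 0);
    uint8_t id[256];
    char msg[4096];
    int nCount = 0;
    char http_response[] =
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        "\r\n"
        "Hello, World!";
    int nResponseLen = strlen(http_response);
    while (1) {
        // Frame 1 is the identity of the peer connection, frame 2 is the raw data.
        int id_size = zmq_recv(socket, id, sizeof(id), 0);
        int msg_size = zmq_recv(socket, msg, sizeof(msg) - 1, 0);
        if (id_size < 0 || msg_size < 0)
            continue;  // zmq_recv returns -1 on error
        // zmq_recv reports the full message length even when it truncates.
        if (msg_size > (int)sizeof(msg) - 1)
            msg_size = sizeof(msg) - 1;
        msg[msg_size] = '\0';
        std::cout << ++nCount << " -----\n";

        // Send the HTTP response to the peer named by the identity frame.
        zmq_send(socket, id, id_size, ZMQ_SNDMORE);
        zmq_send(socket, http_response, nResponseLen, ZMQ_SNDMORE);

        // Identity frame followed by a zero-length frame, intended to close the connection.
        zmq_send(socket, id, id_size, ZMQ_SNDMORE);
        zmq_send(socket, 0, 0, ZMQ_SNDMORE);
    }
    zmq_close(socket);
    zmq_ctx_destroy(ctx);
    return 0;
}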

Using JMeter, users=1000

The OS is Windows. Yes, I am aware that FD_SETSIZE can be increased as a workaround. But I am more interested in finding out the right way to achieve the maximum number of concurrent connections (with ZMQ and, if possible, how other frameworks have avoided this problem). If each connected client of a ZMQ socket demands a new FD, then we are back in the good old "one thread per client" style of resource-hungry server problems (slashdot, c10k, etc.). An FD may be cheaper than a thread, but it is still a scalability bottleneck. – Gopalakrishna Palem

1 Answer


What exactly do you mean when you say "each connection is opened and closed immediately"? You bind on a stream socket, which accepts incoming requests in the while loop, which runs perpetually and never closes anything. The call to zmq_close(socket); after the loop is never reached.

Even the last part of the message explicitly uses ZMQ_SNDMORE, which should keep the connection open waiting for more text, presumably to give a small number of clients lower overhead for repeated requests, I guess. It should probably be:

zmq_send(socket, 0, 0, 0);

I don't know which of these issues would release the resources to allow a larger number of clients, if either, but probably it's an abuse of ZMQ (or at least misguided) to try and write an HTTP server in it or try to make it scale to millions of concurrent peers/clients.

node.js and nginx are event-based concurrent I/O systems. They are significantly different architecturally from ZMQ, and they are made to solve different problems. Trying to make ZMQ into them is going about things the wrong way. What you probably want is to use node.js with socket.io, or, if you're using it for HTTP, just use its native http module.