9
votes

I'm writing a chat server for Acani, and I have some questions about scaling Node.js and WebSockets behind a load balancer.

  1. What exactly does it mean to load balance Node.js? Does that mean there will be n independent versions of my server application running, each on a separate server?

  2. To allow one client to broadcast a message to all the others, I store a set of all the webSocketConnections opened on the server. But, if I have n independent versions of my server application running, each on a separate server, then will I have n different sets of webSocketConnections?

  3. If the answers to 1 & 2 are affirmative, then how do I store a universal set of webSocketConnections (across all servers)? One way I think I could do this is use Redis Pub/Sub and just have every webSocketConnection subscribe to a channel on Redis.

  4. But, then, won't the single Redis server become the bottleneck? How would I then scale Redis? What does it even mean to scale Redis? Does that mean I have m independent versions of Redis running on different servers? Is that even possible?

  5. I heard Redis doesn't scale. Why would someone say that? What does that mean? If that's true, is there a better solution for pub/sub and/or for storing a list of all broadcast messages?

Note: If your answer is that Acani would never have to scale, even if each of the seven billion people (and growing) on Earth were to broadcast a message every second to everyone else on Earth, then please give a valid explanation.


2 Answers

6
votes

Here are a few answers to your questions:

  1. Load balancing Node.js means exactly what you describe, except that you don't necessarily need separate servers: you can run more than one process of your Node server on the same machine.

  2. Each server or process of your Node application will have its own connections. The default store for WebSocket libraries such as Socket.IO is MemoryStore, which keeps all connections in that machine's memory. To use Redis as the connection store, you need to switch to RedisStore.

  3. Redis Pub/Sub is a good way to achieve this.

  4. You are right: Redis doesn't scale at the moment, and having many processes and connections attached to a single Redis instance can turn it into a bottleneck.

  5. It is correct that Redis doesn't scale today, but according to the presentation linked below, cluster development is a top priority for Redis. Redis does have a cluster implementation; it just isn't stable yet (taken from http://redis.io/download):

Where's Redis Cluster?

Redis development is currently focused on Redis 2.6 that will bring you support for Lua scripting and many other improvements. This is our current priority, however the unstable branch already contains most of the fundamental parts of Redis Cluster. After the 2.6 release we'll focus our energies on turning the current Redis Cluster alpha in a beta product that users can start to seriously test. It is hard to make forecasts since we'll release Redis Cluster as stable only when we feel it is rock solid and useful for our customers, but we hope to have a reasonable beta for summer 2012, and to ship the first stable release before the end of 2012.

See the presentation here: http://redis.io/presentation/Redis_Cluster.pdf
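
To make points 2 and 3 concrete, here is a minimal sketch of the fan-out pattern. It assumes nothing about Acani's actual code: `InMemoryBroker`, `ChatProcess`, and `fakeConnection` are hypothetical names, and the in-memory broker only stands in for a Redis Pub/Sub channel so the example runs without any dependencies. Each "process" keeps its own set of connections and holds a single subscription to the shared channel.

```javascript
// Stand-in for a Redis Pub/Sub channel (hypothetical; a real deployment
// would use a Redis client library here instead).
class InMemoryBroker {
  constructor() {
    this.channels = new Map(); // channel name -> Set of subscriber callbacks
  }
  subscribe(channel, callback) {
    if (!this.channels.has(channel)) this.channels.set(channel, new Set());
    this.channels.get(channel).add(callback);
  }
  publish(channel, message) {
    for (const callback of this.channels.get(channel) ?? []) callback(message);
  }
}

// One instance models one Node.js process behind the load balancer.
class ChatProcess {
  constructor(name, broker) {
    this.name = name;
    this.broker = broker;
    this.connections = new Set(); // only the sockets opened on THIS process
    // A single subscription per process, not one per connected client.
    broker.subscribe("chat", (message) => this.deliverLocally(message));
  }
  addConnection(connection) {
    this.connections.add(connection);
  }
  // Called when a client connected to this process sends a chat message.
  broadcast(message) {
    this.broker.publish("chat", message);
  }
  deliverLocally(message) {
    for (const connection of this.connections) connection.send(message);
  }
}

// Fake connection that records what it was sent (stands in for a WebSocket).
function fakeConnection() {
  const received = [];
  return { received, send: (message) => received.push(message) };
}

const broker = new InMemoryBroker();
const processA = new ChatProcess("A", broker);
const processB = new ChatProcess("B", broker);

const alice = fakeConnection();
const bob = fakeConnection();
processA.addConnection(alice);
processB.addConnection(bob);

// Alice is connected to process A, Bob to process B; both get the message.
processA.broadcast("hello from alice");
console.log(alice.received, bob.received);
```

In this sketch the sender also receives its own message, because process A is subscribed to the same channel it publishes to. No process ever needs the universal set of connections; it only needs to reach the ones it owns.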

2
votes

2) Using Redis to store connections might not work: Redis stores data as strings, and if the connection object has circular references (e.g., in Engine.IO), you won't be able to serialise it.
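
A small sketch of that serialisation problem, using a made-up connection object rather than a real Engine.IO socket (the `server`, `connection`, and `owner` names are assumptions for illustration). It also shows one possible workaround: share only plain identifying data and keep the live object in a local lookup table on the process that owns it.

```javascript
// Hypothetical connection object with a circular reference, loosely
// modelled on how socket objects often point back to their server.
const server = { clients: [] };
const connection = { id: "conn-1", server };
server.clients.push(connection); // connection -> server -> clients -> connection

let serializationError = null;
try {
  JSON.stringify(connection);
} catch (error) {
  serializationError = error; // JSON.stringify rejects circular structures
}

// What can be shared instead: plain data that identifies the connection.
const shareable = JSON.stringify({ id: connection.id, owner: "A" });

// The live object stays in a local lookup table on the owning process.
const localConnections = new Map([[connection.id, connection]]);
const restored = localConnections.get(JSON.parse(shareable).id);

console.log(serializationError instanceof TypeError, restored === connection);
// prints: true true
```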

3) Creating a new Redis client for each connected client might not be a good approach, so avoid that trap if you can.

Consider using the ZMQ library for Node to have processes communicate with each other over TCP (or over IPC if they are clustered, as in a master-worker setup).