4 votes

I'm having a performance issue with the Node.js http module.

I'm writing a little Node script that runs a web server which sends data for a specified duration (in seconds). The purpose is to do speed tests between clients and the server, and local speed tests too. To do so I use a custom Readable that I pipe into the http response. That custom Readable stops sending data after the given duration. The data is a looped Buffer of arbitrary values.

The code is like this:

var http = require("http");
var Readable = require('stream').Readable;
var url = require('url');

var buff =  new Buffer(16384); // change this to bigger value for better local performances
buff.fill(0); // data is filled with 0s here, change to whatever

// this function creates a Readable that will provide data for <duration> seconds and then end.
function createTimedReadable(duration) {
  var EndAt = Date.now() + 1000*duration; // when to end
  var rs = new Readable();
  rs.EndAt = EndAt;
  rs._read = function () {
    if (Date.now() < rs.EndAt)
      rs.push(buff);
    else
      rs.push(null);
  }
  return(rs);
}

// each client request will call this function
function onHTTPrequest(request, response) {
  var who = request.connection.remoteAddress + ":" + request.connection.remotePort;
  console.log("Request received from " + who);
  var duration = 10; // (actually we get it from the url query part)
  // send the headers: binary data of unknown size
  response.writeHead(200,
    {
    'Content-Type'  : 'application/octet-stream',
    'Cache-Control' : 'no-cache, no-transform'
  });

 // link (pipe) the Readable to the response
  var timedReadable = createTimedReadable(duration);
  timedReadable.pipe(response);
}

// main - start the server
http.createServer(onHTTPrequest).listen(8888);
console.log("Server has started");

To call the server (which streams for 10 seconds):

wget -O /dev/null http://serverip:8888

The data is discarded client-side (/dev/null) because we don't want disk writes slowing things down and skewing the speed test results.

When doing "over the wire" tests (2 machines), the speed seems fine, but when doing local tests (same machine to same machine) the speed is wrong: very slow, and very dependent on the size of the Buffer variable (buff). Using a very big size for this buffer yields better performance. For instance, on the same test machine:

with a 16k buffer, we get 188 MB/s
with a 128k buffer, we get 487 MB/s
with a 1M buffer, we get 558 MB/s
with a reference code written in C using a 4K buffer, we get 626 MB/s

'Over the wire' we don't see that issue because we're using Gigabit Ethernet, so we can't go faster than ~110 MB/s, but with faster links (10 GbE for instance) we would be limited to 188 MB/s with a 16k buffer. And if we use a 4k buffer like the C code, the Node code doesn't even reach Gigabit speed; we only get 69 MB/s.

So there is a limitation somewhere, and I don't know where. I also don't understand why the buffer size impacts performance so much, since we're using a Readable loop. Basically, why is calling 'push' 8 times with 16k of data each time slower than calling it once with 128k (16k*8) of data?
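
For what it's worth, one variant I have in mind but haven't benchmarked yet (just a sketch) is to push several chunks per _read call, looping until push() returns false (internal buffer full) or the duration has elapsed:

rs._read = function () {
  // untested sketch: fill the internal buffer with several small chunks per _read call
  while (Date.now() < rs.EndAt) {
    if (!rs.push(buff))  // push() returns false when the internal buffer is full
      return;            // stop and wait for the next _read call
  }
  rs.push(null);         // duration elapsed: end the stream
}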

Or is there maybe another way to do efficient 'time-limited streams'?

Thanks.


2 Answers

0 votes

Overheads. There is definitely more overhead in transferring 8 blocks of 128 kB than 1 block of 1 MB. The block size only accounts for the payload, but each chunk that is written also carries its own overhead (framing, function calls, event-loop round-trips). If you factor in the overhead associated with each chunk, your results will be much closer to each other.

Secondly, regarding the Node vs. C comparison: it is likely that C will outperform Node, but what the test code actually does matters too. Did you write an HTTP server for the C test? Otherwise, just sending data blocks with C will only measure raw bandwidth, without the HTTP overhead.

0 votes

Let's simplify what Node is doing down to two things:

  1. Read from the readable stream
  2. Write to the socket

Since nothing else is done on the server, it is doing the same thing over and over again.

If you set the chunk size to 1 KiB, then for each 1 KiB of data we read from the stream once and write to the socket once.

If you set the chunk size to 128 bytes, then for each 1 KiB of data we read from the stream 8 times and write to the socket 8 times.

As user568109 said, there is more overhead per byte of data when you make the chunk size smaller.

It's like carrying shopping bags from the car park to your house: which is faster, carrying them all at once, or one at a time?
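
If you want to see this effect locally without the network in the way, here is a rough micro-benchmark sketch (assuming Node 4+ for the simplified stream constructors and Buffer.alloc; the sizes are just placeholders): it pushes the same total amount of data with two different chunk sizes into a throw-away Writable and prints the elapsed time for each.

var stream = require('stream');

function bench(chunkSize, totalBytes, label, done) {
  var chunk = Buffer.alloc(chunkSize);
  var sent = 0;
  var source = new stream.Readable({
    read: function () {
      if (sent >= totalBytes) return this.push(null);
      sent += chunkSize;
      this.push(chunk);
    }
  });
  var sink = new stream.Writable({
    write: function (data, enc, cb) { cb(); } // discard the data, like /dev/null
  });
  var start = Date.now();
  sink.on('finish', function () {
    console.log(label + ": " + (Date.now() - start) + " ms");
    done();
  });
  source.pipe(sink);
}

var total = 1024 * 1024 * 1024; // same 1 GiB total for each run
bench(16 * 1024, total, "16 KiB chunks", function () {
  bench(1024 * 1024, total, "1 MiB chunks", function () {});
});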

And C will always be faster than Node. Node contains many abstractions that C doesn't have, each with its own overhead: the JavaScript itself, plus the Readable/Writable stream machinery, whereas in C you just write to a socket through a file descriptor.

Technically, I suggest increasing the chunk size as much as your connections can take. Node itself will handle the connection limits and the backpressure: it won't read from the Readable faster than the writable socket can handle.
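
For example (a sketch only; the sizes below are placeholders to tune, not recommendations), you could enlarge both the source buffer and the Readable's internal buffer in your original code:

// sketch: bigger chunks, plus a bigger internal buffer for pipe() to work with
var buff = Buffer.alloc(256 * 1024);  // 256 KiB chunks instead of 16 KiB (use new Buffer() + fill(0) on old Node)

function createTimedReadable(duration) {
  var rs = new Readable({ highWaterMark: 1024 * 1024 }); // let the stream buffer up to ~1 MiB internally
  rs.EndAt = Date.now() + 1000 * duration;
  rs._read = function () {
    if (Date.now() < rs.EndAt)
      rs.push(buff);
    else
      rs.push(null);
  };
  return rs;
}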