338
votes

I'm running an Express.js application using Socket.io for a chat webapp and I get the following error randomly around 5 times during 24h. The node process is wrapped in forever and it restarts itself immediately.

The problem is that restarting Express kicks my users out of their rooms and nobody wants that.

The web server is proxied by HAProxy. There are no socket stability issues, just using websockets and flashsockets transports. I cannot reproduce this on purpose.

This is the error with Node v0.10.11:

    events.js:72
            throw er; // Unhandled 'error' event
                  ^
    Error: read ECONNRESET     // alternatively it's a 'write'
        at errnoException (net.js:900:11)
        at TCP.onread (net.js:555:19)
    error: Forever detected script exited with code: 8
    error: Forever restarting script for 2 time

EDIT (2013-07-22)

I added both a socket.io client error handler and an uncaught exception handler. It seems that the latter catches the error:

    process.on('uncaughtException', function (err) {
      console.error(err.stack);
      console.log("Node NOT Exiting...");
    });

So I suspect it's not a Socket.io issue but an HTTP request I make to another server, or a MySQL/Redis connection. The problem is that the error stack doesn't help me identify the offending code. Here is the log output:

    Error: read ECONNRESET
        at errnoException (net.js:900:11)
        at TCP.onread (net.js:555:19)

How do I know what causes this? How do I get more out of the error?

Ok, not very verbose but here's the stacktrace with Longjohn:

    Exception caught: Error ECONNRESET
    { [Error: read ECONNRESET]
      code: 'ECONNRESET',
      errno: 'ECONNRESET',
      syscall: 'read',
      __cached_trace__:
       [ { receiver: [Object],
           fun: [Function: errnoException],
           pos: 22930 },
         { receiver: [Object], fun: [Function: onread], pos: 14545 },
         {},
         { receiver: [Object],
           fun: [Function: fireErrorCallbacks],
           pos: 11672 },
         { receiver: [Object], fun: [Function], pos: 12329 },
         { receiver: [Object], fun: [Function: onread], pos: 14536 } ],
      __previous__:
       { [Error]
         id: 1061835,
         location: 'fireErrorCallbacks (net.js:439)',
         __location__: 'process.nextTick',
         __previous__: null,
         __trace_count__: 1,
         __cached_trace__: [ [Object], [Object], [Object] ] } }

Here I serve the flash socket policy file:

    net = require("net")
    net.createServer( (socket) =>
      socket.write("<?xml version=\"1.0\"?>\n")
      socket.write("<!DOCTYPE cross-domain-policy SYSTEM \"http://www.macromedia.com/xml/dtds/cross-domain-policy.dtd\">\n")
      socket.write("<cross-domain-policy>\n")
      socket.write("<allow-access-from domain=\"*\" to-ports=\"*\"/>\n")
      socket.write("</cross-domain-policy>\n")
      socket.end()
    ).listen(843)

Can this be the cause?

16
@GottZ maybe this can help (I spoke to someone working on Node.js): gist.github.com/samsonradu/1b0c6feb438f5a53e30e. I'll deploy the socket.error handler today and let you know. – Samson
@Gottz the socket.error handler doesn't help, but process.on('uncaughtException') catches the error. Here is the console.log of the error: { [Error: read ECONNRESET] code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' } – Samson
ECONNRESET could come from a network problem. As you know, it is impossible to catch all the exceptions when testing. Some will show up on your production server. You will have to make your server robust. You can handle the session deletion by using Redis as storage. It makes your sessions persist even after your Node server goes down. – user568109
Why is that related to session deletion? They are handled by Redis anyway. – Samson
You have at least one TCP socket listening that does not have the handler set. So now it's time to check where that one is :D – Moss

16 Answers

302
votes

You might have guessed it already: it's a connection error.

"ECONNRESET" means the other side of the TCP conversation abruptly closed its end of the connection. This is most probably due to one or more application protocol errors. You could look at the API server logs to see if it complains about something.

But since you are also looking for a way to inspect the error and potentially debug the problem, take a look at "How to debug a socket hang up error in NodeJS?", which was posted on Stack Overflow for a similar question.

Quick and dirty solution for development:

Use longjohn, you get long stack traces that will contain the async operations.

Clean and correct solution: Technically, in node, whenever you emit an 'error' event and no one listens to it, it will throw. To make it not throw, put a listener on it and handle it yourself. That way you can log the error with more information.
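A minimal sketch of that rule, using a bare EventEmitter as a stand-in for a socket. The helper name `emitReset` and the fake error are illustrative assumptions, not part of the original setup:

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: emits an ECONNRESET-style error on the given emitter
// and reports whether emit() threw.
function emitReset(emitter) {
  const err = new Error('read ECONNRESET');
  err.code = 'ECONNRESET';
  try {
    emitter.emit('error', err);
    return { threw: false };
  } catch (e) {
    return { threw: true, error: e };
  }
}

// No 'error' listener: emit() throws the error itself.
const unhandled = emitReset(new EventEmitter());

// With a listener: the error is delivered to the handler and nothing is thrown.
const seen = [];
const guarded = new EventEmitter();
guarded.on('error', (err) => seen.push(err.code));
const handled = emitReset(guarded);

console.log(unhandled.threw, handled.threw, seen);
```

On a real socket the listener would be attached the same way, with `socket.on('error', ...)`, before any read or write.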

To have one listener for a group of calls, you can use domains, which also catch other errors at runtime. Make sure each async operation related to http (server/client) runs in a different domain context from the other parts of the code; the domain will automatically listen to the error events and propagate them to its own handler. So you only listen to that handler and get the error data. You also get more information for free.
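As a rough sketch of the domain approach, assuming the `domain` core module is available in your Node version, an emitter created inside `d.run()` is bound to the domain, so its unhandled 'error' events end up in the domain's handler. The `fakeSocket` emitter below is a stand-in for a real HTTP or TCP call:

```javascript
const domain = require('domain'); // deprecated in current Node.js docs, but still shipped
const { EventEmitter } = require('events');

const caught = [];
const d = domain.create();

// One handler for every emitter created inside this domain.
d.on('error', (err) => {
  caught.push({ code: err.code, message: err.message });
});

d.run(() => {
  // Stand-in for an outgoing HTTP/TCP call. It is created inside d.run(),
  // so it is implicitly bound to the domain.
  const fakeSocket = new EventEmitter();
  const err = new Error('read ECONNRESET');
  err.code = 'ECONNRESET';
  fakeSocket.emit('error', err); // no listener on fakeSocket: routed to d
});

console.log(caught);
```

Using a separate domain per group of calls tells you which group the error came from, which is the extra information the bare stack trace lacks.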

EDIT (2013-07-22)

As I wrote above:

"ECONNRESET" means the other side of the TCP conversation abruptly closed its end of the connection. This is most probably due to one or more application protocol errors. You could look at the API server logs to see if it complains about something.

It could also be that, at random times, the other side is overloaded and simply kills the connection as a result. Whether that's the case depends on what exactly you're connecting to.

But one thing's for sure: you indeed have a read error on your TCP connection which causes the exception. You can see that by looking at the error code you posted in your edit, which confirms it.
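For illustration, the fields on the logged error object can be turned into a more readable message. This is only a sketch; `describeSocketError` is a hypothetical helper, and the sample object copies the fields shown in the question's log:

```javascript
// Hypothetical helper: summarizes the fields Node attaches to system errors.
function describeSocketError(err) {
  const direction = err.syscall === 'write' ? 'writing to' : 'reading from';
  if (err.code === 'ECONNRESET') {
    return 'peer reset the connection while ' + direction + ' the socket';
  }
  return (err.code || 'unknown error') + ' during ' + (err.syscall || 'unknown syscall');
}

// Same fields as in the question's log output.
const sample = { code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' };
const summary = describeSocketError(sample);
console.log(summary);
```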

46
votes

A simple tcp server I had for serving the flash policy file was causing this. I can now catch the error using a handler:

# serving the flash policy file
net = require("net")

net.createServer((socket) =>
  # just added
  socket.on("error", (err) =>
    console.log("Caught flash policy server socket error: ")
    console.log(err.stack)
  )

  socket.write("<?xml version=\"1.0\"?>\n")
  socket.write("<!DOCTYPE cross-domain-policy SYSTEM \"http://www.macromedia.com/xml/dtds/cross-domain-policy.dtd\">\n")
  socket.write("<cross-domain-policy>\n")
  socket.write("<allow-access-from domain=\"*\" to-ports=\"*\"/>\n")
  socket.write("</cross-domain-policy>\n")
  socket.end()
).listen(843)
32
votes

I had a similar problem where apps started erroring out after an upgrade of Node. I believe this can be traced back to Node release v0.9.10 this item:

  • net: don't suppress ECONNRESET (Ben Noordhuis)

Previous versions wouldn't error out on interruptions from the client. A break in the connection from the client throws the error ECONNRESET in Node. I believe this is intended functionality for Node, so the fix (at least for me) was to handle the error, which I believe you did in your uncaughtException handler, although I handle it in the net.Socket 'error' handler.

You can demonstrate this:

Make a simple socket server and get Node v0.9.9 and v0.9.10.

require('net')
    .createServer(function (socket) {
        // do nothing
    })
    .listen(21, function () {
        console.log('Socket ON');
    });

Start it up using v0.9.9 and then attempt to FTP to this server. I'm using FTP and port 21 only because I'm on Windows and have an FTP client, but no telnet client handy.

Then from the client side, just break the connection. (I'm just doing Ctrl-C)

You should see NO ERROR when using Node v0.9.9, and ERROR when using Node v0.9.10 and up.

In production, I use v0.10.something and it still gives the error. Again, I think this is intended and the solution is to handle the error in your code.
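Handling it per socket could look roughly like this. It is a sketch, not the exact code from my app: the `onConnection` name and the `resetCount` counter are made up for illustration, and the server is deliberately not started here.

```javascript
const net = require('net');

const resetCount = { value: 0 };

// Connection handler: attach the 'error' listener before doing anything else.
function onConnection(socket) {
  socket.on('error', (err) => {
    if (err.code === 'ECONNRESET') {
      resetCount.value += 1; // the client went away; nothing else to do here
      return;
    }
    console.error('socket error:', err);
  });
}

// Not listening yet; call server.listen(port) to start accepting connections.
const server = net.createServer(onConnection);
```

With this in place, breaking the connection from the client only bumps the counter instead of taking the process down.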

18
votes

Had the same problem today. After some research I found the very useful --abort-on-uncaught-exception Node.js option. Not only does it provide a much more verbose and useful error stack trace, it also saves a core file when the application crashes, allowing further debugging.

15
votes

I was facing the same issue but I mitigated it by placing:

server.timeout = 0;

before server.listen. server is an HTTP server here. The default timeout is 2 minutes as per the API documentation.
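In context, that looks roughly like the sketch below. The request handler is a placeholder, and newer Node releases may already default to no timeout, so check the documentation for your version:

```javascript
const http = require('http');

// Placeholder handler; the real application would go here.
const server = http.createServer((req, res) => {
  res.end('ok');
});

// 0 disables the inactivity timeout on sockets accepted by this server.
server.timeout = 0;
```

Call `server.listen(...)` afterwards as usual.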

13
votes

I also got the ECONNRESET error during development. I solved it by not using nodemon to start my server: simply starting it with "node server.js" fixed my problem.

It's weird, but it worked for me, now I never see the ECONNRESET error again.

10
votes

Another possible (but rare) case is server-to-server communication where server.maxConnections has been set to a very low value.

In node's core lib net.js it will call clientHandle.close() which will also cause error ECONNRESET:

if (self.maxConnections && self._connections >= self.maxConnections) {
  clientHandle.close(); // causes ECONNRESET on the other end
  return;
}
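For reference, this is where that limit would be set on the server side. A sketch only: the value 2 and the greeting are arbitrary, and the server is not started here.

```javascript
const net = require('net');

const server = net.createServer((socket) => {
  socket.on('error', (err) => console.error('socket error:', err.code));
  socket.end('hello\n');
});

// Illustrative value: connections beyond this count are closed on accept,
// which the connecting side can observe as ECONNRESET.
server.maxConnections = 2;
```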
10
votes

Yes, your serving of the policy file can definitely cause the crash.

To reproduce it, just add a delay to your code:

net.createServer( function(socket) 
{
    for (i=0; i<1000000000; i++) ;
    socket.write("<?xml version=\"1.0\"?>\n");
…

… and use telnet to connect to the port. If you disconnect telnet before the delay has expired, you'll get a crash (uncaught exception) when the failed socket.write emits an 'error' event that nothing handles.

To avoid the crash here, just add an error handler before reading/writing the socket:

net.createServer(function(socket)
{
    for (var i=0; i<1000000000; i++) ;  // artificial delay
    // register the handler before any read/write on the socket
    socket.on('error', function(error) { console.error("error", error); });
    socket.write("<?xml version=\"1.0\"?>\n");
…

When you try the above disconnect, you'll just get a log message instead of a crash.

And when you're done, remember to remove the delay.

6
votes

I had this error too and was able to solve it after days of debugging and analysis:

my solution

For me, VirtualBox (for Docker) was the problem. I had port forwarding configured on my VM and the error only occurred on the forwarded port.

general conclusions

The following observations may save you the days of work I had to invest:

  • For me the problem only occurred on connections from localhost to localhost on one port. -> check whether changing any of these solves the problem.
  • For me the problem only occurred on my machine -> let someone else try it.
  • For me the problem only occurred after a while and couldn't be reproduced reliably.
  • My problem couldn't be inspected with any of Node's or Express's (debug) tools. -> don't waste time on this.

-> figure out whether something is messing with your network (settings), like VMs, firewalls, etc.; this is probably the cause of the problem.

4
votes

I resolved this problem by:

  • Turning my wifi/ethernet connection off and on again.
  • Running npm update in the terminal to update npm.
  • Logging out of the session and logging in again.

After that I tried the same npm command again, and this time it worked. I didn't expect it to be that simple.

I am using CENTOS 7

4
votes

I just figured this out, at least in my use case.

I was getting ECONNRESET. It turned out that the way my client was set up, it was hitting the server with an API call a ton of times really quickly -- and it only needed to hit the endpoint once.

When I fixed that, the error was gone.

3
votes

I solved the problem by simply connecting to a different network. That is one of the possible problems.

As discussed above, ECONNRESET means that the other side of the TCP conversation abruptly closed its end of the connection.

Your internet connection might be blocking you from connecting to some servers. In my case, I was trying to connect to mLab (a cloud database service that hosts MongoDB databases), and my ISP was blocking it.

2
votes

ECONNRESET occurs when the server side closes the TCP connection and your request to the server is not fulfilled. The server responds with a message saying that the connection you are referring to is invalid.

Why would the server treat the connection as invalid?

Suppose you have enabled a keep-alive connection between client and server, and the keep-alive timeout is configured to 15 seconds. This means that if the connection is idle for 15 seconds, the server will send a connection close request. So after 15 seconds, the server tells the client to close the connection. BUT, while the server is sending this request, the client may be sending a new request that is already in flight to the server. Since this connection is now invalid, the server will reject it with an ECONNRESET error. So the problem shows up when requests are infrequent enough for the connection to sit idle until the timeout. Disabling keep-alive avoids it.
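If the Node process is the client in this exchange, disabling keep-alive could be sketched as below. The helper `upstreamOptions` is a hypothetical name, and any host or path you pass in is your own:

```javascript
const http = require('http');

// Agent that opens a fresh connection per request instead of reusing idle ones.
const noKeepAliveAgent = new http.Agent({ keepAlive: false });

// Hypothetical helper: builds request options for an upstream call.
function upstreamOptions(host, path) {
  return { host: host, path: path, method: 'GET', agent: noKeepAliveAgent };
}
```

The returned object can be passed as the first argument to `http.request` or `http.get`; remember to attach an 'error' listener to the request as well.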

1
votes

I had the same issue and it appears that the Node.js version was the problem.

Using nvm (which lets you install several versions of Node.js and quickly switch between them), I installed the previous version of Node.js (10.14.2) and everything was OK.

It is not a "clean" solution, but it can serve you temporarily.

0
votes

Node.js sockets use non-blocking I/O. Consider using non-blocking I/O on the other end of the connection as well. For instance, if you use a blocking Java socket with Node, it will only work for a few seconds, after which the error occurs. Mitigate this by implementing a non-blocking connection, e.g. a SocketChannel with a Selector.

-1
votes

Try adding these options to socket.io:

const options = { transports: ['websocket'], pingTimeout: 3000, pingInterval: 5000 };

I hope this helps!