6
votes

I'm noticing an intermittent issue with our Memcached session handler. The error that occurs is:

Unknown: Failed to write session data (memcache). Please verify that the current setting of session.save_path is correct.

Notes:

  • It seems to be an intermittent issue that occurs 5 or 6 times a day to various users.
  • Memcached is not on localhost, i.e. it's on a different server than the web server.
  • I'm using the Memcache extension (as opposed to the MemcacheD extension).
  • I'm using the tcp prefix. If you look at this question, you'll see that the "fix" was to add a tcp:// prefix if you're using the Memcache extension.

My php.ini settings:

session.save_handler = memcache
session.save_path = "tcp://64.233.191.255:11211"

Note that I've also used:

session.save_path = "tcp://64.233.191.255:11211?persistent=1&weight=1&timeout=1&retry_interval=15"

But it doesn't seem to matter.

Checked the memcached.log file, where I found the following error:

Failed to write, and not due to blocking: Connection reset by peer.

Note: This particular error occurs at least once, at the same time (01:07AM), every day. It will then occur sporadically throughout the day.

1
Have you checked the log files of the memcache server at the times when the errors appeared in the web server's logs? It looks like a connection problem, maybe caused by a load peak? - hek2mgl
I would go for network issues. Some cron jobs on memcache machine? Temporary high network load? - Paweł Spychalski
@PawełSpychalski Yep. There are cron jobs on the machine. We have a DB backup running at midnight. However, the machine has a lot of cores and it regularly sees loads of 4.00+. 1:09AM would be a very low peak. - Wayne Whitty
@WayneWhitty Still, something is happening at 1:09AM that is causing those network problems. Are you sure none of the crons is restarting memcache or doing something nasty? - Paweł Spychalski
What is the timeout setting for your server at the TCP level? Maybe it's not the memcache server itself but a script that runs longer than expected due to slow DB responses and times out. For example, you wait for a DB reply and write it to memcache afterwards, but the connection has already timed out by that point. - MatthiasLaug

1 Answer

2
votes

Maybe you're running out of file handles? Perhaps the backups make your machine swap, resulting in slower responses, which means more concurrent connections to the memcached process and ends in a stampeding herd.
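
One way to check this theory is to sample memcached's own counters around the time the error fires. The sketch below is only an illustration, not a tested diagnostic. It assumes the server answers the plain-text `stats` and `stats settings` commands, and the helper names (`parse_stats`, `fetch_stats`, `connection_pressure`) and the 90% threshold are made up for this example. A `curr_connections` value close to `maxconns`, or a growing `listen_disabled_num`, would point at connection exhaustion.

```python
import socket


def parse_stats(raw: str) -> dict:
    """Turn the text reply of a memcached stats command into a dict of strings."""
    stats = {}
    for line in raw.splitlines():
        parts = line.strip().split(" ", 2)
        # Each data line looks like: STAT <name> <value>
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host: str, port: int = 11211, command: str = "stats",
                timeout: float = 2.0) -> dict:
    """Send one stats command over the text protocol and parse the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii") + b"\r\n")
        buf = b""
        # The reply is terminated by a line containing only END.
        while not buf.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            buf += chunk
    return parse_stats(buf.decode("ascii", errors="replace"))


def connection_pressure(stats: dict, settings: dict) -> dict:
    """Compare open connections against the configured limit (hypothetical helper)."""
    curr = int(stats.get("curr_connections", 0))
    limit = int(settings.get("maxconns", 0))
    return {
        "curr_connections": curr,
        "maxconns": limit,
        # Number of times memcached stopped accepting because it hit the limit.
        "listen_disabled_num": int(stats.get("listen_disabled_num", 0)),
        # 0.9 is an arbitrary threshold chosen for this sketch.
        "near_limit": limit > 0 and curr >= 0.9 * limit,
    }
```

Run from cron a minute either side of 01:07, something like `connection_pressure(fetch_stats(host), fetch_stats(host, command="stats settings"))` would show whether the connection count spikes while the backup is running. The open file limit of the memcached process itself is a separate check at the OS level.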