
I am having a problem with Redis replication that I cannot figure out. The master keeps hitting the client-output-buffer-limit.

Master Config:

# redis-cli -p 6380 config get client-output-buffer-limit
1) "client-output-buffer-limit"
2) "normal 0 0 0 slave 536870912 536870912 0 pubsub 33554432 8388608 60"

Master Log:

Client id=3014598 addr={{MASTER}} fd=6 name= age=217 idle=217 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=3723 oll=4806 omem=581952061 events=rw cmd=psync scheduled to be closed ASAP for overcoming of output buffer limits.

Master Info:

# redis-cli -p 6380 info
role:master
connected_slaves:1
slave0:ip={{SLAVE_IP}},port=6380,state=wait_bgsave,offset=0,lag=0  //  stays on wait_bgsave

Slave Info:

role:slave
master_host:{{MASTER_IP}}
master_port:6380
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:1

Redis version: 3.0.2

Database size: ~21 GB. The master is an EC2 instance with 30 GB of RAM; the slave is an EC2 instance with 60 GB of RAM.

The slave connects, but the master always times out no matter how high I set the buffers. The master almost always reports the slave in the wait_bgsave state.

Can anybody provide any insight into why this might be happening?

The problem might be in creating the RDB dump file for replication: the master might not have enough RAM available to create it. You can check by issuing BGSAVE from redis-cli and seeing whether it succeeds. As a workaround, you can try the new diskless replication, which might need less RAM (though I never found any docs saying so): redis.io/topics/replication#diskless-replication - Liviu Costea
It does create the RDB file successfully (I was able to reload the master database from the RDB.) I will try out diskless replication tonight and see if that does something. Thanks for the info! - Daniel
@LiviuCostea You are absolutely right, diskless replication did it! Thank you so much. If you post your comment as an answer I will accept it. I really want to know why disk-backed replication did not work, though: the master server has ~10 GB of free RAM and the slave has 60 GB free. Very strange. - Daniel
Added as an answer, with some additional info regarding client-output-buffer-limit, because it doesn't look like an out-of-RAM issue. - Liviu Costea

1 Answer


The problem might be in creating the RDB dump file for replication. The master might not have enough RAM available to create the file, or the slave might not be reading the incoming data fast enough, so it gets disconnected.
You can check the first possibility by issuing BGSAVE from redis-cli and seeing whether it succeeds, and the second by checking the client-output-buffer-limit setting in redis.conf. As a workaround, you can try the new diskless replication, which might need less RAM (though I never found any docs saying so).
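
As a rough check of the buffer-limit possibility, the numbers already posted in the question can be compared directly. The sketch below is only an illustration: the variable names are made up, the log line is abbreviated to a few of its fields, and the two values are copied from the question's CONFIG GET output (the slave class hard limit) and from the omem field of the master's log line.

```shell
#!/bin/sh
# Hard limit for the "slave" class, copied from the CONFIG GET output
# in the question ("slave 536870912 536870912 0").
hard_limit=536870912

# Abbreviated copy of the master's log line from the question; only a
# few fields are kept here, the rest are omitted for brevity.
log_line='id=3014598 age=217 flags=S obl=3723 oll=4806 omem=581952061 events=rw cmd=psync'

# Extract the number that follows "omem=".
omem=$(printf '%s\n' "$log_line" | sed -n 's/.*omem=\([0-9][0-9]*\).*/\1/p')

# How far the slave's output buffer is past the hard limit, in bytes.
over=$((omem - hard_limit))

if [ "$omem" -gt "$hard_limit" ]; then
    echo "omem=$omem exceeds the hard limit $hard_limit by $over bytes"
else
    echo "omem=$omem is within the hard limit $hard_limit"
fi
```

With these values the buffer is past the 512 MB hard limit, which is consistent with the master closing the slave connection while the slave is still in wait_bgsave. This does not prove the cause, but it points at the output buffer rather than at RAM. The settings involved are `client-output-buffer-limit slave <hard> <soft> <seconds>` (all zeros disables the limit for that class) and `repl-diskless-sync yes` for diskless replication; both can go in redis.conf or be applied with CONFIG SET.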