I am an Amazon RDS customer and am experiencing daily amazon RDS write latency spikes, corresponding roughly to the backup window. I will also see spikes at the end of a snapshot (case in point: running a snapshot takes appx 1 hour, and in the final 5 minutes, write latency spikes). I am running a multi-AZ m1.large deployment.
Is there anyone on Stack who can explain how Amazon RDS backup is actually working? I've read the Amazon RDS docs, and as far as I can tell, Amazon RDS is not behaving according to spec. Specifically, these backup/snapshot operations should be hitting my replica, and therefore not causing any downtime/performance hit, or so I thought.
I can distill my problem into six questions:
- What is technically happening during a snapshot and a backup, and how are they different? (If you answer this question, please tell me if you are able to empirically confirm your answer, or are simply quoting me documentation).
- Is a spike in write latency to be expected during the backup window on a multi-AZ deployment?
- Is a spike in write latency to be expected at the end of a snapshot on a multi-AZ deployment?
- Would my write latency spike be even higher if I was not multi-AZ ?
- Architecturally, would I be able to avoid these write latency spikes if I rolled my own database running on two m1.large EC2 instances?
- Are there any configurations I can use that would avoid these write latency spikes while still hosting my DB with RDS, or am I effectively at the mercy of Amazon?
Bonus Question: where and how do you host your mysql database?
I can say that I have been generally happy with RDS except for these daily write latency issues. I love the built-in database monitoring and it was fairly simple to setup and get going.
Thanks!