2
votes

I am using aws aurora mysql cluster version 5.7.mysql_aurora.2.04.2, instance size r5.xlarge (32GB memory) Problem I have is that over a period of few days primary instance memory starts depleting and free memory dip to as low to failover cluster or restart instance.

here is my parameter group parameter: I have so far left innodb_buffer_pool_size to default which is 75% of total memory query_cache_size = 0 query_cache_type = 0 I have not looked into any other parameter yet, just left them with default settings

In the peak time there are approx. 400-500 connections but active connections are only 4-5 . I did check with performance schema memory stats and found that connections are not really consuming much memory here.

here is my recent server status

 =====================================
 2021-06-02 00:21:02 0x2ac656ac3700 INNODB MONITOR OUTPUT
 =====================================
 Per second averages calculated from the last 49 seconds
 -----------------
 BACKGROUND THREAD
 -----------------
 srv_master_thread loops: 64605 srv_active, 0 srv_shutdown, 39 srv_idle
 srv_master_thread log flush and writes: 0
 ----------
 SEMAPHORES
 ----------
 OS WAIT ARRAY INFO: reservation count 482279
 OS WAIT ARRAY INFO: signal count 456598
 RW-shared spins 0, rounds 602358, OS waits 242442
 RW-excl spins 0, rounds 481661, OS waits 9408
 RW-sx spins 59854, rounds 1519404, OS waits 38433
 Spin rounds per wait: 602358.00 RW-shared, 481661.00 RW-excl, 25.39 RW-sx
 
 ------------
 TRANSACTIONS
 ------------
 Trx id counter 20693553688
 Purge done for trx's n:o < 20693552345 undo n:o < 0 state: running but idle
 History list length 2519
 --------
 FILE I/O
 --------
 I/O thread 0 state: waiting for i/o request (insert buffer thread)
 I/O thread 1 state: waiting for i/o request (log thread)
 I/O thread 2 state: waiting for i/o request (read thread)
 I/O thread 3 state: waiting for i/o request (read thread)
 I/O thread 4 state: waiting for i/o request (read thread)
 I/O thread 5 state: waiting for i/o request (read thread)
 I/O thread 6 state: waiting for i/o request (write thread)
 I/O thread 7 state: waiting for i/o request (write thread)
 I/O thread 8 state: waiting for i/o request (write thread)
 I/O thread 9 state: waiting for i/o request (write thread)
 Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
  ibuf aio reads:, log i/o's:, sync i/o's:
 Pending flushes (fsync) log: 0; buffer pool: 0
 0 OS file reads, 0 OS file writes, 0 OS fsyncs
 0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
 -------------------------------------
 INSERT BUFFER AND ADAPTIVE HASH INDEX
 -------------------------------------
 Ibuf: size 1, free list len 0, seg size 2, 0 merges
 merged operations:
  insert 0, delete mark 0, delete 0
 discarded operations:
  insert 0, delete mark 0, delete 0
 Hash table size 5174291, node heap has 0 buffer(s)
 Hash table size 5174291, node heap has 0 buffer(s)
 Hash table size 5174291, node heap has 0 buffer(s)
 Hash table size 5174291, node heap has 0 buffer(s)
 Hash table size 5174291, node heap has 0 buffer(s)
 Hash table size 5174291, node heap has 0 buffer(s)
 Hash table size 5174291, node heap has 0 buffer(s)
 Hash table size 5174291, node heap has 0 buffer(s)
 0.00 hash searches/s, 9043.71 non-hash searches/s
 ---
 LOG
 ---
 Log sequence number 8204
 Log flushed up to   0
 Pages flushed up to 8204
 Last checkpoint at  8192
 0 pending log flushes, 0 pending chkp writes
 0 log i/o's done, 0.00 log i/o's/second
 ----------------------
 BUFFER POOL AND MEMORY
 ----------------------
 Total large memory allocated 0
 Dictionary memory allocated 281737
 Buffer pool size   1276548
 Free buffers       61908
 Database pages     1214640
 Old database pages 448316
 Modified db pages  20
 Pending reads      0
 Pending writes: LRU 0, flush list 0, single page 0
 Pages made young 1544709, not young 4583067
 32.45 youngs/s, 84.47 non-youngs/s
 Pages read 966252, created 1919667, written 194
 18.90 reads/s, 98.47 creates/s, 0.00 writes/s
 Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 1 / 1000
 Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
 LRU len: 1214640, unzip_LRU len: 0
 I/O sum[3080]:cur[0], unzip sum[0]:cur[0]
 ----------------------
 INDIVIDUAL BUFFER POOL INFO
 ----------------------
 ---BUFFER POOL 0
 Buffer pool size   319137
 Free buffers       12682
 Database pages     306455
 Old database pages 113129
 Modified db pages  0
 Pending reads      0
 Pending writes: LRU 0, flush list 0, single page 0
 Pages made young 389940, not young 1141638
 8.41 youngs/s, 17.20 non-youngs/s
 Pages read 242260, created 482146, written 0
 5.29 reads/s, 20.26 creates/s, 0.00 writes/s
 Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 1 / 1000
 Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
 LRU len: 306455, unzip_LRU len: 0
 I/O sum[770]:cur[0], unzip sum[0]:cur[0]
 ---BUFFER POOL 1
 Buffer pool size   319137
 Free buffers       17337
 Database pages     301800
 Old database pages 111386
 Modified db pages  0
 Pending reads      0
 Pending writes: LRU 0, flush list 0, single page 0
 Pages made young 381508, not young 1142265
 7.69 youngs/s, 26.55 non-youngs/s
 Pages read 241892, created 480398, written 0
 4.55 reads/s, 27.35 creates/s, 0.00 writes/s
 Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 2 / 1000
 Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
 LRU len: 301800, unzip_LRU len: 0
 I/O sum[770]:cur[0], unzip sum[0]:cur[0]
 ---BUFFER POOL 2
 Buffer pool size   319137
 Free buffers       14628
 Database pages     304509
 Old database pages 112387
 Modified db pages  0
 Pending reads      0
 Pending writes: LRU 0, flush list 0, single page 0
 Pages made young 389540, not young 1154116
 7.88 youngs/s, 25.08 non-youngs/s
 Pages read 242468, created 480735, written 0
 4.69 reads/s, 26.55 creates/s, 0.00 writes/s
 Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 2 / 1000
 Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
 LRU len: 304509, unzip_LRU len: 0
 I/O sum[770]:cur[0], unzip sum[0]:cur[0]
 ---BUFFER POOL 3
 Buffer pool size   319137
 Free buffers       17261
 Database pages     301876
 Old database pages 111414
 Modified db pages  20
 Pending reads      0
 Pending writes: LRU 0, flush list 0, single page 0
 Pages made young 383721, not young 1145048
 8.47 youngs/s, 15.63 non-youngs/s
 Pages read 239632, created 476388, written 194
 4.37 reads/s, 24.31 creates/s, 0.00 writes/s
 Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
 Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
 LRU len: 301876, unzip_LRU len: 0
 I/O sum[770]:cur[0], unzip sum[0]:cur[0]
 --------------
 ROW OPERATIONS
 --------------
 0 queries inside InnoDB, 0 queries in queue
 0 read views open inside InnoDB
 Process ID=16917, Main thread ID=47055544436480, state: sleeping
 Number of rows inserted 4663043, updated 3954657, deleted 4438202, read 15553254
 407.46 inserts/s, 247.73 updates/s, 283.69 deletes/s, 918.53 reads/s
 ----------------------------
 END OF INNODB MONITOR OUTPUT
 ============================

Please find last 3 days memory/connections/cpu utilization

memory utilization connections utilizationcpu utilization any help how to fix/investigate this issue is appreciated.

1
Additional information request. Are you loading data? # cores, any SSD or NVME devices on MySQL Host server? Post on pastebin.com and share the links. From your SSH login root, Text results of: B) SHOW GLOBAL STATUS; after minimum 24 hours UPTIME C) SHOW GLOBAL VARIABLES; D) SHOW FULL PROCESSLIST; AND Optional very helpful information, if available includes - htop OR top for most active apps, ulimit -a for a Linux/Unix list of limits, iostat -xm 5 3 for IOPS by device and core/cpu count, for server workload tuning analysis to provide suggestions.Wilson Hauck
@WilsonHauck Thanks for reply. It is on AWS RDS, I will not have access to underline host. I can still pull global variables but I am afraid that there will be many limitations to hear about what we can change what we can't. Recently I have reduced buffer pool size to 50% of total memory to give some room to memory so that it does not hit bottom low quickly.Sukh
50%- Good. If you need more RAM to spare, lower max_connections to 300 and cut wait_timeout in half. Those might keep you out of trouble until we can figure out what is wrong.Rick James
As for your concern about getting the requested metrics -- the outcome may be some recommendation other than changing some of them.Rick James
@Sukh However you got to the point you could run the Server Status you posted in your question, from that same position, you should be able to do the SHOW GLOBAL's and SHOW FULL PROCESSLIST; requested above. copy results to clipboard, then to text file and post on pastebin.com, share the links. AWS may be kind enough to get the OS reports for you in TEXT form, if not, we can still provide positive suggestions to improve your world.Wilson Hauck

1 Answers

0
votes

After adjusting buffer pool to 50% of total memory in aurora mysql instance, i could create more room in freeable memory which has certainly reduce the failovers caused because of over memory consumption.

however some % of memory depletion is still happening, it could be expected behavior when you have open so many connections in database

We are also trying to leverage RDS proxy for connections pooling to RDS clusters which may help in better connections management and reduce further consumption of memory.