1
votes

I have a small Postgres development database running on Amazon RDS, and I'm running K8s. As far as I can tell, there is barely any traffic. I want to enable change capture, I've enabled rds.logical_replication, started a Debezium instance, and the topics appear in Kafka, and all seems fine.

After a few hours, the free disk space starts tanking: enter image description here

It started to consume disk at a constant rate, and eat up all of the 20Gb available within 24 hours. Stopping Debezium doesn't do anything. The way I got my disk space back was by:

select pg_drop_replication_slot('services_debezium')

and:

vacuum full

Then, after a few minutes, as you can see in the graph, disk space is reclaimed.

Any tips? I would love to see what is it what's actually filling up the space, but I don't think I can. Nothing seems to happen on the Debezium side (no ominous logs), and the Postgres logs don't show anything special either. Or is there some external event that triggers the start of this?

3
did you finally found the cause/workaround?Kokizzu
As far as I understood, there is an 'invisible' AWS database on the instance that does stuff, which shares the WAL and has quite a bit of activity. So if change capture isn't progressing, either because there is no activity on your database or some other cause, it will eat disk space pretty quickly. Setting up a heartbeat 'heartbeat.interval.ms' helps when your database has very little activity.Frank Lee

3 Answers

2
votes

The replication slot is the problem. It marks a position in the WAL, and PostgreSQL won't delete any WAL segments newer than that. Those files are in the pg_wal subdirectory of the data directory.

Dropping the replication slot and running CHECKPOINT will delete the files and free space.

The cause of the problem must be misconfiguration of Debrezium: it does not consume changes and move the replication slot ahead. Fix that problem and you are good.

1
votes

You need to periodically generate some movement in your database (perform an update on any record for example).

Debezium provides a feature called heartbeat to perform this type of operation.

Heartbeat can be configured in the connector as follows:

"heartbeat.interval.ms" : "300000", "heartbeat.action.query": "update my_table SET date_column = now();"

You can find more information in the official documentation:

https://debezium.io/documentation/reference/connectors/postgresql.html#postgresql-wal-disk-space

0
votes

Ok, I think I figured it out. There is another 'hidden' database on Amazon RDS, that has changes, but changes that I didn't make and I can's see, so Debezium can't pick them up either. If change my monitored database, it will show that change and in the process flush the buffer and reclaim that space. So the very lack of changes was the reason it filled up. Don't know if there is a pretty solution for this, but at least I can work with this.