3
votes

Postgresql has the functionality of Vacuum for recollecting the space occupied by dead tuples. Auto vacuum is on by default and runs according to the configuration setting.

When I check the output of pg_stat_all_tables i.e. last_vacuum and last_autovacuum, autovacuum was never run for most of the tables in the database which have enough number of dead tuples(more than 1K). We also get a time window of 2-3 hours when these tables are used rarely.

Below are autovacuum settings for my database enter image description here

below is the output of pg_stat_all_tables enter image description here

I want to ask that is it a good idea to depend only on auto vacuum? Are there any special setting required for autovacuum to function properly?
Should we set up a manual vacuum? Should we use both in parallel or just turn off autovacuum and use manual vacuum only?

2
Well, it's like asking whether you should use a manual or automatic transmission in cars. Average user will definitely perform better with automatic. Pro will use manual to squeeze these few last drops of speed. Now, if the autovacuum doesn't run and you are sure that it should then post your configuration and all the necessary data and we will look at it.freakish
Pls. move this thread to dba.stackexchange.com. It is not related to programming.kometen

2 Answers

3
votes

You should definitely use autovacuum.

Are there any autovacuum processes running currently?

Does a manual VACUUM on such a table succeed?

Set log_autovacuum_min_duration = 0 to get information about autovacuum processing in the logs.

If system activity is too high, autovacuum may not be able to keep up. In this case it is advisable to configure autovacuum to be more aggressive, e.g. by setting autovacuum_vacuum_cost_limit = 1000.

2
votes

https://www.postgresql.org/docs/current/static/routine-vacuuming.html

PostgreSQL databases require periodic maintenance known as vacuuming. For many installations, it is sufficient to let vacuuming be performed by the autovacuum daemon, which is described in Section 24.1.6. You might need to adjust the autovacuuming parameters described there to obtain best results for your situation. Some database administrators will want to supplement or replace the daemon's activities with manually-managed VACUUM commands, which typically are executed according to a schedule by cron or Task Scheduler scripts.

vacuum creates significant IO, asjust https://www.postgresql.org/docs/current/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST to fit your needs.

Also you can set autovacuum settings per table, to be more "custom" https://www.postgresql.org/docs/current/static/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS

the above will give you the idea why your 1K dead tuples might be not enough for autovacuum and how to change it.

manual VACUUM is a perfect solution for one time run, while to run the system I'd definitely rely on autovacuum daemon