69
votes

I am currently using pg_dump piped to gzip piped to split. The problem with this approach is that all of the output files change every time, so a checksum-based backup always copies all of the data.
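
For reference, the pipeline in question looks roughly like this (the database name and output prefix are placeholders):

    # dump, compress, and split into fixed-size chunks
    pg_dump mydb | gzip | split -b 1G - /backups/mydb.sql.gz.

    # restore by reassembling the chunks
    cat /backups/mydb.sql.gz.* | gunzip | psql mydb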

Are there any other good ways to perform an incremental backup of a PostgreSQL database, where a full database can be restored from the backup data?

For instance, it would help if pg_dump produced its output in a strictly deterministic order, so that changes appear only at the end of the dump, or something similar.

3
Have you found a solution? I have the same requirement, i.e. incremental backup in PostgreSQL. I have gone through many articles and websites but could not find a clear-cut way to do an incremental backup in PostgreSQL. Does PostgreSQL support incremental backup natively, without third-party tools like pg_rman? Please help me with this. Thanks. – Suniel

3 Answers

69
votes

Update: Check out Barman for an easier way to set up WAL archiving for backup.

You can use PostgreSQL's continuous WAL archiving method. First you need to set wal_level=archive, then do a full filesystem-level backup (between issuing pg_start_backup() and pg_stop_backup() commands) and then just copy over newer WAL files by configuring the archive_command option.
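
For example, a minimal setup might look like the following. The archive directory, data directory, and backup label are placeholders, and newer PostgreSQL releases rename some of these knobs (e.g. wal_level = replica, and pg_start_backup()/pg_stop_backup() became pg_backup_start()/pg_backup_stop()):

    # postgresql.conf -- placeholder archive directory:
    #   wal_level = archive
    #   archive_mode = on
    #   archive_command = 'cp %p /mnt/wal_archive/%f'

    # full filesystem-level base backup between the two marker calls
    psql -c "SELECT pg_start_backup('base_backup');"
    rsync -a /var/lib/postgresql/main/ /mnt/backups/base/
    psql -c "SELECT pg_stop_backup();"

Newer releases also ship pg_basebackup, which automates the base-backup step.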

Advantages:

  • Incremental: the WAL archives include everything necessary to restore the current state of the database
  • Almost no overhead; copying WAL files is cheap
  • You can restore the database to any point in time (this feature is called PITR, or point-in-time recovery); see the recovery sketch at the end of this answer

Disadvantages:

  • More complicated to set up than pg_dump
  • The full backup will be much larger than a pg_dump because all internal table structures and indexes are included
  • Does not work well for write-heavy databases, since recovery will take a long time.

There are some tools such as pitrtools and omnipitr that can simplify setting up and restoring these configurations, but I haven't used them myself.
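
As for the restore side (and the PITR point above), the idea is to put the base backup back in place and let the server replay the archived WAL. A rough sketch, assuming the placeholder paths from above and a pre-12 server where recovery settings live in recovery.conf:

    # with the server stopped, restore the base backup into the data directory
    rsync -a /mnt/backups/base/ /var/lib/postgresql/main/

    # recovery.conf in the data directory (on 12+, use postgresql.conf plus
    # a recovery.signal file instead):
    #   restore_command = 'cp /mnt/wal_archive/%f %p'
    #   recovery_target_time = '2014-01-01 12:00:00'

    # on startup the server replays WAL up to the target time, giving PITR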

13
votes

Also check out http://www.pgbackrest.org

pgBackRest is another backup tool for PostgreSQL that is worth evaluating, as it supports the following (a usage sketch follows the list):

  • parallel backup (tested to scale almost linearly up to 32 cores, but it can probably go much further)
  • compressed-at-rest backups
  • incremental and differential (compressed!) backups
  • streaming compression (data is compressed only once at the source, then transferred across the network and stored)
  • parallel, delta restore (the ability to bring an older copy up to date)
  • full tablespace support
  • backup rotation and archive expiration
  • the ability to resume backups which failed for some reason
  • etc., etc.
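
As a rough illustration of the workflow (the stanza name is a placeholder and must already be defined in pgbackrest.conf):

    # full backup, then an incremental that stores only changes since the last backup
    pgbackrest --stanza=main --type=full backup
    pgbackrest --stanza=main --type=incr backup

    # delta restore: rewrites only the files that differ from the backup
    pgbackrest --stanza=main --delta restore
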
1
vote

Another method is to back up to plain text and use rdiff to create incremental diffs.
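
A minimal sketch of that approach with the rdiff utility from librsync (the database and file names are placeholders):

    # full plain-text dump once
    pg_dump mydb > base.sql

    # later: dump again, but store only the delta against the base
    pg_dump mydb > current.sql
    rdiff signature base.sql base.sig
    rdiff delta base.sig current.sql current.delta

    # to restore, rebuild the latest dump from base + delta and load it
    rdiff patch base.sql current.delta restored.sql
    psql mydb < restored.sql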