
I maintain some backups via daily cron jobs, and I'm looking to implement a more sophisticated system for aging out old archives using just bash and common Unix/Linux utilities (such as GNU date).

The policy I want to implement is something like:

  • Keep one for every month, forever
  • Keep one per week for 13 weeks (approximately 3 months)
  • Remove everything else that's older than 14 days
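
(For reference, the last rule on its own maps roughly onto find's -mtime test; a sketch assuming GNU find, though the script below keys off a reference log file instead:)

# regular files directly under /backups that are 14 or more days old
find /backups -maxdepth 1 -type f -mtime +13 -print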

The script I've sketched out maintains these as hard links in ./monthly and ./weekly (relative to $BASEDIR).

It looks something like this:

#!/bin/bash
TSTAMP="$(date +%s-%Y%m%d)"
BASEDIR=/backups
TODAYS="myapp-$TSTAMP.tar.gz"

# All relative paths below are relative to $BASEDIR
cd "$BASEDIR" || exit 1

# If the day of the month is 01:
[ "$(date +%d)" = "01" ] && {
   ln "$BASEDIR/$TODAYS" ./monthly/
   ln "$BASEDIR/backup-$TSTAMP.log" ./monthly
   }

# If day of week is 0 (Sunday)
[ "$(date +%w)" = "0" ] && {
   ln "$BASEDIR/$TODAYS" ./weekly
   ln "$BASEDIR/backup-$TSTAMP.log" ./weekly
   }

# Find the log from 14 days ago (capture the glob's matches in an array)
f=( ./backup-*"$(date -d '2 weeks ago' +%Y%m%d)"*.log )
# ... but use only the first match
f=${f[0]}
# If such a file exists: remove all regular files not newer than it,
# but only from . itself (-maxdepth 1 spares ./monthly and ./weekly)
[ -r "$f" ] && find . -maxdepth 1 -type f -not -newer "$f" -print0 \
  | xargs -0 rm

But what's a reasonable way to test this for corner cases? One I can see: if the backup/log failed to run on a given day (the system was down when cron was supposed to run it), then the glob will fail to match any file, so the [ -r "$f" ] test will fail and no trimming will be done. That seems harmless enough (I just get an extra day of backups). If someone messes up the permissions on the backups, then the -r could fail even when a matching file is there (but that's a bigger problem).
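
(One way to exercise those corner cases without waiting for real cron runs: build a throwaway tree of back-dated dummy files with GNU touch -d and run the script against it. A rough sketch; /tmp/baktest is a hypothetical scratch directory:)

# fabricate three weeks of empty "backups" and logs with matching mtimes
mkdir -p /tmp/baktest/{monthly,weekly} && cd /tmp/baktest
for d in $(seq 0 20); do
    stamp="$(date -d "$d days ago" +%s-%Y%m%d)"
    touch -d "$d days ago" "myapp-$stamp.tar.gz" "backup-$stamp.log"
done
# then run the script with BASEDIR=/tmp/baktest and inspect what survives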

(Note: this is just the part of the system that prepares the local application backup files. There's a completely separate system that gathers them all up onto central storage.)

It seems like there ought to be a more elegant way to do all this, though!

Is there a find-like utility which takes some sort of retention specification and generates a list of files in a tree that don't match the spec?

Why not mv the monthly backups into the ./monthly directory? That takes care of the keep-forever rule. Then write a persistent counter (1..13) to a file; when you reach 13, remove the oldest file from the ./weekly folder before mving the latest one in. – David C. Rankin
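
(A rough sketch of that counter approach; ./weekly.count is a hypothetical state file, and the filenames assume the myapp-*.tar.gz naming above:)

n=$(cat ./weekly.count 2>/dev/null || echo 0)
if [ "$n" -ge 13 ]; then
    # at capacity: drop the oldest weekly archive (ls -t lists newest first)
    ls -1t ./weekly/myapp-*.tar.gz | tail -n 1 | xargs -r rm --
else
    n=$((n + 1))
fi
mv "$BASEDIR/$TODAYS" ./weekly/
echo "$n" > ./weekly.count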
Have you looked at logrotate? It's the common Unix/Linux utility used for tasks like this. – Ed Morton
@EdMorton I'd not thought of logrotate for this, even though I've used it for logs forever. I'll have to think on that. – Jim Dennis
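
Update: to evaluate the logrotate suggestion, a dry run against a scratch config seems like the safest first step (a sketch with hypothetical paths; logrotate -d only reports what it would do, and whether this fits timestamped archives is exactly what I still need to think through):

# write a throwaway stanza: rotate weekly, keep 13 generations
cat > /tmp/myapp-backups.conf <<'EOF'
/backups/myapp.tar.gz {
    weekly
    rotate 13
    missingok
}
EOF
logrotate -d /tmp/myapp-backups.conf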