2
votes

I'm trying to extend our existing SVN pre-commit hook so that it checks for, and blocks, any increase in file size for files in specific directories.

I've written a Python script that takes two files as arguments, compares their sizes, and returns the result via sys.exit(0) or sys.exit(1). This part seems to work fine.

My problem is calling the Python script from the batch file: how do I reference the newly committed and previous versions of each file? The existing code is new to me, and it is a mess of %REPOS%, %TXN% and so on that I'm not sure how to use. Is there a simple, standard way of doing this?

The hook already contains code to loop through the changed files using svnlook changed, so that part shouldn't be an issue.

Thanks very much


2 Answers

5
votes

If comparing file sizes is all you need to do, look no further than the svnlook filesize command. The default invocation, svnlook filesize repo path, gives you the size of the HEAD revision of path. To get the size of path in the incoming commit, use svnlook filesize repo path -t argv[2], where argv[2] is the transaction name passed to the hook.

Still, here is an example of listing all revisions of a versioned path (except the incoming one, since this is a pre-commit hook).

#!/usr/bin/env python

from sys import argv, stderr, exit
from subprocess import check_output

repo = argv[1]
transaction = argv[2]

def path_history(path, limit=5):
    path = '/%s' % path
    cmd = ('svnlook', 'history', '-l', str(limit), repo, path)
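    # text mode, so the comparison below also works on Python 3;
    # the first two lines of output are the column header
    out = check_output(cmd, universal_newlines=True).splitlines()[2:]

    # split once, so that paths containing spaces stay intact
    for rev, _path in (i.split(None, 1) for i in out):
        if _path == path:
            yield rev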

def commit_changes():
    cmd = ('svnlook', 'changed', repo, '-t', transaction)
    out = check_output(cmd, universal_newlines=True).splitlines()

    for line in out:
        # split once, so that paths containing spaces stay intact
        yield line.split(None, 1)

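def filesize(path, rev=None, trans=None):
    '''Return the size in bytes of path, at revision rev or in
    transaction trans (the youngest revision if neither is given).'''
    cmd = ['svnlook', 'filesize', repo, path]
    if rev:
        cmd.extend(('-r', str(rev)))
    elif trans:
        cmd.extend(('-t', str(trans)))

    out = check_output(cmd)
    return int(out)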

def filesize_catwc(path, rev=None, trans=None):
    '''A `svnlook filesize` substitute for older versions of svn.
    Uses `svnlook cat ... | wc -c`, which is likely to be slow
    for large files.'''

    arg = '-r %s' % rev if rev else '-t %s' % trans
    cmd = 'svnlook cat %s %s %s | wc -c' % (arg, repo, path)

    out = check_output(cmd, shell=True)
    return int(out)


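for status, path in commit_changes():
    # only modified files ('U', 'UU') have an earlier version to compare
    # against; an added path has no history in the repository yet
    if status in ('U', 'UU'):
        # get the last 5 revisions of the modified path
        revisions = list(path_history(path))
        if not revisions:
            continue
        headrev = revisions[0]

        oldsize = int(filesize(path, rev=headrev))
        newsize = int(filesize(path, trans=transaction))
        # assumed policy, taken from the question: reject any growth
        if newsize > oldsize:
            stderr.write('%s grew from %d to %d bytes\n' % (path, oldsize, newsize))
            exit(1)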
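
The question asks for the check to apply only to certain directories. A minimal sketch of such a filter follows, assuming the watched directories are given as repository-relative prefixes. The name WATCHED_DIRS, its example values and the helper is_watched are hypothetical and not part of the original hook.

```python
# Hypothetical list of repository-relative directories to police.
WATCHED_DIRS = ('trunk/assets/', 'trunk/data/')

def is_watched(path, watched=WATCHED_DIRS):
    '''Return True if path lies inside one of the watched directories.'''
    # Strip a leading slash, if present, so that 'trunk/x' and '/trunk/x'
    # are treated alike.
    path = path.lstrip('/')
    return any(path.startswith(prefix) for prefix in watched)
```

In the loop above, a line such as `if not is_watched(path): continue` would skip every path outside those directories.
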
1
votes

It is probably easier to write the whole pre-commit script in Python. According to the Subversion handbook, pre-commit receives three inputs:

  • Two command line arguments
    • repository path
    • commit transaction name
  • lock-token info on standard input
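
As a rough sketch of how a Python hook might pick up those three inputs (the function name read_hook_inputs is made up for illustration, and the lock-token text is returned unparsed):

```python
def read_hook_inputs(argv, stdin):
    '''Return (repository path, transaction name, raw lock-token text).'''
    if len(argv) < 3:
        raise ValueError('usage: pre-commit REPOS TXN-NAME')
    repo, txn = argv[1], argv[2]
    # Lock-token info, if any, arrives on standard input.
    lock_tokens = stdin.read()
    return repo, txn, lock_tokens
```

Inside the hook this would be called as `read_hook_inputs(sys.argv, sys.stdin)`. Taking both as parameters keeps the function easy to exercise outside of Subversion.
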

To find out which files have changed, use the subprocess.check_output() function to call svnlook changed with -t and the transaction name. For files whose contents have changed, call svnlook filesize twice: without -t to get the size of the file in the last revision in the repository, and with -t to get its size in the incoming transaction. The hook runs on the server, where the committer's working copy is not available, so both sizes have to come from the repository. The function getsizes() shows this.

import subprocess
import sys

repo = sys.argv[1]
commit_name = sys.argv[2]

def getsizes(rname, rfile, txn):
   '''Get the size of the file rfile in the repository rname, both in
   the latest committed revision and in the transaction txn.
   Return the two sizes.
   '''
   cmd = ['svnlook', 'filesize', rname, rfile]
   reposize = int(subprocess.check_output(cmd))
   txnsize = int(subprocess.check_output(cmd + ['-t', txn]))
   return (reposize, txnsize)

# universal_newlines=True returns text, so startswith() works on Python 3 too
lines = subprocess.check_output(
    ['svnlook', 'changed', '-t', commit_name, repo],
    universal_newlines=True).splitlines()
for line in lines:
    if line.startswith('U ') or line.startswith('UU'):
       # file contents have changed; drop the status columns:
       # 'U   trunk/foo/bar.txt' -> 'trunk/foo/bar.txt'
       reposize, txnsize = getsizes(repo, line[4:], commit_name)
       # do something with the sizes here...
    elif line.startswith('_U'):
       # properties have changed
       pass
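
Each line of svnlook changed output starts with a short status field followed by the path. A small parser along these lines may be easier to maintain than repeated startswith checks. The helper names parse_changed_line and contents_changed are illustrative, and the fixed four-character status field is an assumption about the output layout (as in 'U   trunk/foo/bar.txt').

```python
def parse_changed_line(line):
    '''Split one line of `svnlook changed` output into (status, path).'''
    # Assumed layout: status flags in the first columns, path from column 4.
    status = line[:4].strip()
    path = line[4:].rstrip('\n')
    return status, path

def contents_changed(status):
    '''True for statuses that signal a content change ('U' or 'UU').'''
    return status in ('U', 'UU')
```

Slicing at a fixed column, rather than splitting on whitespace, keeps paths that contain spaces in one piece.
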