3
votes

I am currently working with a subversion repository but I am using git to work locally on my machine. It makes work much easier, but it also makes some of the bad behavior going on in the subversion repo quite glaring and that creates problems for me.

There is a somewhat complex local build process after pulling down the code, and it creates (and unfortunately modifies) a number of files. Obviously these changes are not meant to be committed back to the repository. Unfortunately, the build process is actually modifying some tracked files (yes, most likely because someone mistakenly committed these build artifacts to the Subversion repository at some point). Since these are modifications to tracked files, adding them to my ignore file does nothing for me.

I can avoid checking these changes back in: I simply don't stage or commit them. But having unstaged local changes means I can't rebase without first cleaning them up.

What I would like to know is whether there is any way to ignore future changes to a set of tracked files. Alternatively, is there another way to handle the problem I am having, or will I just have to tell whoever checked in these files to clean them up?


2 Answers

5
votes

As Nathan said, cleaning up those files (un-tracking them) is the smart move.

But if you must ignore tracked files (which is not the native Git way of ignoring files: Git only ignores untracked files), you can set up a process that copies the content of the files you want to ignore and restores it on commit.

I initially believed that a smudge/clean process, that is, a gitattributes filter driver, could do the trick, where:

  • the smudge process will make a copy of those files (when updating the working tree)
  • some modifications take place during the build
  • the clean step (during commit) will overwrite the files' content with the copy made in step 1.

BUT, as stated in this post, that would mean abusing this stateless file content transformation by adding a stateful context (i.e. the full path name of the file being smudged/cleaned).
And that is explicitly forbidden by J.C. Hamano:

Although I initially considered interpolating "%P" with pathname, I ended up deciding against it, to discourage people from abusing the filter for stateful conversion that changes the results depending on time, pathname, commit, branch and stuff.

and even Linus Torvalds had some reservations at the time about the whole mechanism:

I have to say, I'm obviously not a huge fan of playing games, but the diffs are very clean.

Are they actually useful? I dunno. I'm a bit nervous about what this means for any actual user of the feature, but I have to admit to being charmed by a clean implementation.

I suspect that this gets some complaining off our back, but I also suspect that people will actually end up really screwing themselves with something like this and then blaming us and causing a huge pain down the line when we've supported this and people want "extended semantics" that are no longer clean.

But I'm not sure how valid an argument that really is. I do happen to believe in the "give them rope" philosophy. I think you can probably screw yourself royally with this, but hey, anybody who does that only has himself to blame


So the right place to add some kind of save/restore mechanism (and thereby effectively ignore any changes to a set of tracked files in Git) would be in hooks:

  • post-checkout: invoked after git checkout has updated the worktree. There you can run a script collecting all the files to ignore and saving them somewhere.

  • pre-commit: you can run a second script which will restore the content of those files, before obtaining the proposed commit log message and making a commit.
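
To make the two hooks above more concrete, here is a minimal sketch of a helper script that both could call. It is only an illustration under assumptions, not a tested recipe: the script name (protect_files.py), the file list, and the backup location inside .git are all hypothetical, and it assumes a regular non-bare repository where .git is a directory and the hook runs from the top of the working tree.

```python
#!/usr/bin/env python3
"""Save and restore tracked files that the build modifies (hypothetical helper)."""
import shutil
import sys
from pathlib import Path

# Assumption: paths, relative to the repository root, of the tracked files
# that the build rewrites. These names are placeholders.
FILES_TO_PROTECT = ["generated/version.txt", "generated/config.h"]

# Assumption: backups live inside .git so Git never sees them as untracked files.
BACKUP_DIR = Path(".git") / "protected-backup"


def save(root, files=FILES_TO_PROTECT, backup=BACKUP_DIR):
    """Copy the freshly checked-out versions aside (for the post-checkout hook)."""
    root = Path(root)
    for rel in files:
        src = root / rel
        if src.is_file():
            dst = root / backup / rel
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)


def restore(root, files=FILES_TO_PROTECT, backup=BACKUP_DIR):
    """Overwrite build-modified files with the saved copies (for the pre-commit hook)."""
    root = Path(root)
    for rel in files:
        saved = root / backup / rel
        if saved.is_file():
            shutil.copy2(saved, root / rel)


if __name__ == "__main__":
    # Expected usage: protect_files.py save   or   protect_files.py restore
    actions = {"save": save, "restore": restore}
    action = actions.get(sys.argv[1] if len(sys.argv) > 1 else "")
    if action:
        action(".")
```

With this in place, .git/hooks/post-checkout would run the script with the save argument and .git/hooks/pre-commit would run it with restore. Running restore by hand before a rebase would also put the working tree back into a clean state for those files.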

1
votes

Unless there's some serious political brain damage going on, removing the artifacts from source control is the correct step. (Or rather, the "most expedient" step; it's always the correct step.)

I am not aware of a way to tell Git to ignore changes to tracked files.