27
votes

Some complex subversion merges are coming up in my project: big branches that have been apart for a long time. Svn gives too many conflicts - and some of them seem spurious.


Given that git is praised for a superior merge experience, would it be any good to use git-svn just to make the merge more manageable?


Can you recommend other alternatives (eg. svk, hgsvn) to lessen the merge pain?

Some conflicts are easy enough to resolve (e.g. Java imports, whitespace), so I'm also wondering if there are any automated solutions for those.

A full switch to DVCS might happen in the future (some of us would love that), but not right now. (UPDATE: this isn't true any longer - the team switched fully recently and are happy about it).

Thanks in advance.

PS: there are posts that seem to be related (eg. git-svn merge 2 svn branches) but they don't fully answer this question.

Update: see my -novice- answer after going down (and up:) this road.

3
Why not just use git-svn for everything? – Vi.
@Vi. this sounds like a separate top-level question; you may want to add it as such :-/ Mine was, approximately: "would you introduce git-svn in an SVN-based team, just to help out with a big merge?" – inger
After just merging they may think about starting to use it... – Vi.
that's what happened after all, as you can see below – inger

3 Answers

34
votes

Trying to answer my question: using git for svn merges seems promising.

Update: it's not just promising, it's a great success. In short, Linus was right.

Just completed a huge merge of 2 svn branches that have been apart for 1.5 years; 3k files were changed, got tons of conflicts in svn (~800 I think).

I found git & git-svn a life saver:

  • auto-conflict resolution: for a start, it produced far fewer conflicted files (~half, I think)
  • unbelievable performance
  • excellent repo/branching model, flexible workflows: easy experimentation with various approaches, such as chunk-by-chunk(in time) merge, always doing sanity checks(compile,etc); whenever trouble hits: just backtrack. You can always just take a step back when needed.
  • usability, great tooling:
    • git-log (and the underlying git-rev-parse options), nothing can be more powerful than this. It's handy as well: -p gives you diffs in one go; in svn you get a log, then find the diff for that "revision-1:revision", or use clumsy UIs. Find when a string was added/removed into the repo, search multiple branches simultaneously
    • gitk: hugely useful for visualising branch histories, combined with great search capabilities. Haven't seen anything like this in other tools, especially not as fast as this. Never mind that it's in Tk, it's just brilliant
    • git gui: works fine even if not the sexiest - great help for the novice to discover things
    • blame: a miracle. Yes, it detects where the original segment comes from (copy&paste etc)
    • mergetool: a much more pleasant experience than kicking off the big svn merge, which then stops every time (i.e. every 5 minutes) it runs into a conflict; you press '(p)ostpone', then manually hunt for conflicted files later. I preferred a flavour of this integrated in git gui (needed a tiny patch for that). I found integrating external diff tools more configurable than in svn.
    • pluggable merge drivers and fine grained control of them
    • rebase allowed me to filter out messier parts of the svn history
  • distribution: no need to come to the office when working on this, could pause & progress step-by-step on train/plane, etc..
    • a USB drive with Unison made syncing work<->home a piece of cake
    • this wouldn't have been possible without git's crazy compression (5-year-old project with 26k commits, tons of branches and binary files; trunk svn checkout: 1.9GB => all of these in the full git repo: 1.4GB!)
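
To make the "find when a string was added/removed" point concrete, here is a minimal sketch using git's pickaxe search (`git log -S`) across all branches. The throwaway repository, file names and the string `legacyLookup` are made up for illustration; in a git-svn clone the same two `git log` commands would be run against the mirrored svn branches.

```shell
# Throwaway repository with two branches (all names are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Example User"
git config user.email "user@example.invalid"

echo "int answer = 41;" > Calc.java
git add Calc.java
git commit -q -m "Add Calc"

git checkout -q -b feature
echo "int answer = legacyLookup();" > Calc.java
git commit -q -am "Use legacyLookup"

git checkout -q trunk
echo "// unrelated" > Notes.txt
git add Notes.txt
git commit -q -m "Add notes"

# Pickaxe: list commits on any branch whose diff adds or removes the string.
git log --all --oneline -S"legacyLookup"

# Same search, with the diff of each hit shown inline.
git log --all -p -S"legacyLookup"
```

Only the commit on `feature` should be listed, even though `trunk` is checked out, because `--all` walks every branch.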

So, this really can make the difference from a nightmare to joy - especially if you enjoy learning (which does take some effort in this case - I guess like learning a motorbike after a bicycle).

I can't force everyone in the company to switch immediately, and I really didn't intend to. Again, git-svn saves us with its 'dip a toe in first' approach. But judging by colleagues' reactions, the switch might happen much sooner than anyone expected :)

I'd say- even if we forget about merges & commits, this stuff is already great as a read-only frontend for queries, visualisation, backups, etc..

Caveat:

"Do not dcommit Git merge commits to the Subversion repository. Subversion doesn’t handle merges in the same way as Git, and this will cause problems. This means you should keep your Git development history linear (i.e., no merging from other branches, just rebasing)." (last paragraph of http://learn.github.com/p/git-svn.html )

Another excellent source is the Pro Git book: the section 'Switching Active Branches' basically says that the merge does work, but dcommit stores only the content of the merge and the history is compromised (which breaks subsequent merges), so you should drop the work branch after the merge. It makes sense after all, and in practice it's easy to avoid the traps here. In svn, I found people do not usually re-merge anyway, so this can only be seen as a step back if you come from the git world in the first place.

Anyhow, the dcommit just worked for me. I did it onto my own svn workbranch that I kept for this only, so avoided any extra conflicts that time. However, I decided to do the final merge from this workbranch to the svn trunk in svn (after syncing up everything in git); --ignore-ancestry gave the best results there.

Update: as I found out later, the last few steps above (extra svn branch and merge --ignore-ancestry) are easily avoided by just keeping the branch you're dcommitting from linear. As Gabe says below, merge --squash just creates a simple, svn-friendly commit. When done with the huge merge(s) on my local branch (which might take days/weeks), I would now just:

git checkout -b dcommit_helper_for_svnbranch  svnbranch
git merge --squash huge_merge_work_with_messy_nonlinear_history
git commit -m 'nice merge summary' # single parent, straight from the fresh svnbranch
git svn dcommit
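
To see what the squash buys you, here is a rough sketch in a throwaway repository that uses plain git only. The branch names `svnbranch`, `work` and `side` are stand-ins, and the final dcommit is left out because it needs a real Subversion server. It builds a nonlinear work branch, squashes it onto a fresh helper branch, and prints the parents of the resulting commit.

```shell
# Throwaway repository; "svnbranch" stands in for the git-svn mirror branch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/svnbranch
git config user.name "Example User"
git config user.email "user@example.invalid"

echo base > file.txt
git add file.txt
git commit -q -m "base"

# A work branch with messy, nonlinear history (it contains a merge commit).
git checkout -q -b work
echo a > a.txt
git add a.txt
git commit -q -m "work: a"
git checkout -q -b side svnbranch
echo b > b.txt
git add b.txt
git commit -q -m "side: b"
git checkout -q work
git merge -q --no-edit side

# Squash everything into one ordinary commit on top of svnbranch.
git checkout -q -b dcommit_helper svnbranch
git merge -q --squash work
git commit -q -m "nice merge summary"

# Prints the commit id followed by its parent ids; one parent means linear.
git rev-list --parents -n 1 HEAD
```

The last command should print exactly two hashes (the commit and its single parent), which is the shape git-svn can replay as one svn revision.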

I know the merge tracking won't work great from the svn-side, until we switch fully. I can't wait for that.


UPDATE: @Kevin requested some more details on the whole process of merging svn branches. There are lots of articles and posts out there, but as a novice I found some of them confusing, misleading or out of date. Anyhow, this is the way I do it these days (of course I stuck with git-svn after that merge affair, as did some newly infected colleagues):

git svn clone -s http://svn/path/to/just-above-trunk  # the slowest part, but needed only once ever; you get every single branch from the svn repo since revision #1
git svn fetch          # later, anytime: keep it up to date, talking to the svn server to grab new revisions. Again: all branches, and yet it's usually faster for me than a simple 'svn up' on the trunk :)
# Take a look, sniff around - some optional but handy commands:
git gui   &    # I usually keep this running, press F5 to refresh
gitk --all     # graph showing all branches
gitk my-svn-target-branch svn-branch-to-merge    # look at only the branches in question
git checkout -b my-merge-fun my-svn-target-branch  # this creates a local branch based on the svn one and switches to it..before you notice :)
# Some handy config, giving more context for conflicts
git config merge.conflictstyle diff3
# The actual merge.. 
git merge  svn-branch-to-merge    # the normal case, with a manageable amount of conflicts
# For the monster merge, this was actually a loop for me: due to the sheer size, I split up the 2 year period into reasonable chunks, e.g. ~1 month each, tagged those versions ma1..ma25 and mb1..mb25 on each branch using gitk, and then repeated the following for all of them
git merge ma1   # through ma25
git merge mb1   # through mb25
# When running into conflicts, just resolve them. The low-tech way: keep the wanted parts, then "git add file". Alternatively:
git mergetool   # loops through each conflicted file and opens your GUI mergetool of choice; when successful, it adds the file automatically
git mergetool  my-interesting-path # limit scope to that path
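
To illustrate what the `merge.conflictstyle diff3` setting above changes, here is a small sketch that provokes a conflict in a throwaway repository (file and branch names are made up). With diff3 style, the conflicted file contains a third section, introduced by a `|||||||` marker, showing the common ancestor's version between the two sides.

```shell
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/target
git config user.name "Example User"
git config user.email "user@example.invalid"
git config merge.conflictstyle diff3

printf 'timeout = 10\n' > app.conf
git add app.conf
git commit -q -m "base"

git checkout -q -b other
printf 'timeout = 30\n' > app.conf
git commit -q -am "other: raise timeout"

git checkout -q target
printf 'timeout = 20\n' > app.conf
git commit -q -am "target: raise timeout"

# The merge stops with a conflict, so a non-zero exit status is expected.
git merge other || true

# Shows both sides plus the common ancestor's line in the middle section.
cat app.conf
```

Seeing the original `timeout = 10` next to both changes usually makes it easier to tell who changed what, which matters a lot when the branches have been apart for a long time.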

Actually I preferred to use git gui's built-in mergetool integration (right click on the file in conflict). That's slightly limited though, so see my little patch above, which lets you plug in a shell script where you can invoke whatever mergetools you prefer. I tried a variety of them, sometimes in parallel, as they caused a surprising amount of grief, but normally I stick with kdiff3.

When a merge step goes fine (no conflict), a merge commit is done automatically; otherwise, you resolve conflicts then

git commit  # I usually do this in git gui as well.. again, lightning fast.
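
The chunk-by-chunk loop can be sketched roughly as follows. This is a toy version under stated assumptions: a throwaway repository, three checkpoint tags `ma1`..`ma3` instead of 25, and only one side being merged in; the tag names follow the ones used above, everything else is made up.

```shell
# Throwaway repository; "source" plays the svn branch being merged in.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/target
git config user.name "Example User"
git config user.email "user@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Source branch with three checkpoints tagged ma1..ma3.
git checkout -q -b source
for i in 1 2 3; do
    echo "change $i" > "file$i.txt"
    git add "file$i.txt"
    git commit -q -m "source change $i"
    git tag "ma$i"
done

# The target moves on independently, so each step is a real merge.
git checkout -q target
echo local > local.txt
git add local.txt
git commit -q -m "target change"

# Merge one checkpoint at a time on a local work branch.
git checkout -q -b my-merge-fun target
for i in 1 2 3; do
    if ! git merge -q --no-edit "ma$i"; then
        echo "Conflict while merging ma$i: resolve, commit, then rerun" >&2
        break
    fi
    # A sanity check (compile, tests) would go here before the next chunk.
done

git log --oneline --graph
```

Rerunning the loop after resolving a conflict is harmless, since checkpoints that are already merged are reported as up to date. If a chunk goes badly, `git merge --abort` returns to the state before that step.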

The last phase.. Note that so far we had only local commits, not talking to the svn server yet. Unless you've used --squash or other tricks, you now end up with a graph where your merge commit has 2 parents: the tips of your svn-mirror branches. Now this is the usual gotcha: svn can only take linear history.. so 'git-svn' simplifies it by just dropping the second parent (svn-branch-to-merge in the above case).. so the real merge tracking is gone on the svn side..but otherwise it's fine in this case.
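
A quick way to check whether a branch is still linear before dcommitting is to list the merge commits it has on top of the svn mirror branch. The sketch below does this in a throwaway repository with plain git; `svn-mirror`, `topic` and `work` are hypothetical names, and in a real git-svn clone the left side of the range would be the remote-tracking ref that git-svn created for the target branch.

```shell
# Throwaway repository; "svn-mirror" stands in for the git-svn tracking branch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/svn-mirror
git config user.name "Example User"
git config user.email "user@example.invalid"

echo base > f.txt
git add f.txt
git commit -q -m "base"

git checkout -q -b topic
echo t > t.txt
git add t.txt
git commit -q -m "topic work"

git checkout -q -b work svn-mirror
echo w > w.txt
git add w.txt
git commit -q -m "local work"
git merge -q --no-edit topic

# Merge commits reachable from work but not from the mirror branch.
merges=$(git rev-list --merges svn-mirror..work)
if [ -n "$merges" ]; then
    echo "nonlinear history, squash or rebase before dcommit:"
    # Commit id followed by its parents; a merge has two or more parents.
    git rev-list --parents -n 1 work
else
    echo "history is linear"
fi
```

Here the check reports the two-parent merge commit at the tip of `work`, which is exactly the case where git-svn would drop the second parent.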

If you want a safer/cleaner way, this is where my earlier snippet comes in: just do the final merge with --squash. Adapted the earlier one to this flow:

git checkout -b dcommit_helper_for_svnbranch my-svn-target-branch  # another local workbranch.. basically needed as svn branches (as any other remote branch) are read-only
git merge --squash my-merge-fun
git commit -m 'nice merge summary' # single parent, straight from the fresh svn branch
git svn dcommit  # this will result in a 'svn commit' on the my-svn-target-branch

oops, this is getting way too long, stopping before too late.. Good luck.

3
votes

I've just worked through this myself. A simpler method is to pass git merge the --squash option, which will perform the merge without recording a merge commit, keeping the history linear so as not to confuse git-svn.

My merge was also very large, and I had to set git config diff.renamelimit 0 so that git would correctly find all the renames.
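
For context on why rename detection matters in a merge, here is a minimal sketch in a throwaway repository (file and branch names are invented). One branch renames a file while the other edits it under the old name; with the rename detected, the edit ends up in the renamed file. A repository this small would never hit the rename limit, so the config line only shows where the setting goes.

```shell
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/target
git config user.name "Example User"
git config user.email "user@example.invalid"

# The setting from the answer, applied to this repository only.
git config diff.renamelimit 0

printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > OldName.java
git add OldName.java
git commit -q -m "base"

# One side renames the file...
git checkout -q -b renamer
git mv OldName.java NewName.java
git commit -q -m "rename"

# ...while the other side edits it under the old name.
git checkout -q target
printf 'line 1\nline 2\nline 3\nline 4\nline 5\nline 6\n' > OldName.java
git commit -q -am "edit under old name"

git merge -q --no-edit renamer

# The edit has followed the rename into NewName.java.
ls
cat NewName.java
```

When git gives up on rename detection because too many files changed, merges like this one tend to surface as delete/modify conflicts instead, which may be what inflates the conflict count on very large merges.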

3
votes

There are new tools available that fix many issues of git-svn and provide much better experience for using both Subversion and Git.

Among other things these tools fix some branching and merging problems. Here is an overview:

  1. git-svn

    From the documentation:

    CAVEATS

    ...

    Running git merge or git pull is NOT recommended on a branch you plan to dcommit from. Subversion does not represent merges in any reasonable or useful fashion; so users using Subversion cannot see any merges you've made. Furthermore, if you merge or pull from a git branch that is a mirror of an SVN branch, dcommit may commit to the wrong branch.

    There are primarily three reasons not to dcommit merge commits:

    • git-svn doesn't automatically send the svn:mergeinfo property for merged branches. As a result, Subversion is not able to track merges performed by git. This includes normal Git merges and cherry-picks.

    • as git-svn does not convert svn:ignore, svn:eol-style and other SVN properties automatically, a merge commit does not have the corresponding metadata in Git. As a result, dcommit does not send these properties to the SVN repository, so they get lost.

    • dcommit always sends changes to the branch referenced by the first parent of a merge commit. Sometimes changes appear where the user doesn't expect them.

  2. SubGit

    SubGit is a Git-SVN bi-directional server-side mirror.

    If one has local access to the Subversion repository, one can install SubGit into it:

    $ subgit configure $SVN_REPOS
    # Adjust $SVN_REPOS/conf/subgit.conf to specify your branches and tags
    # Adjust $SVN_REPOS/conf/authors.txt to specify git & svn authors mapping
    $ subgit install $SVN_REPOS
    ...
    INSTALLATION SUCCESSFUL
    

    At this point SubGit converts the Subversion repository into Git (it works in the opposite direction as well) and installs SVN and Git hooks. As a result, the Subversion and Git repositories are synchronized: every commit and push starts hooks that convert incoming modifications immediately.

    SubGit converts svn:ignore properties into .gitignore files, and svn:eol-style and svn:mime-type properties into .gitattributes, so merge commits in Git retain this metadata.

    When one pushes a merge commit, SubGit converts all the new commits into Subversion revisions. It honors the svn:mergeinfo property, so the merge operation is properly tracked by SVN afterwards.

    Even if a user pushes a very complex Git history, SubGit converts all the commits, keeping the merge tracking data valid. We once pushed the whole history of the git.git repository at once and it was properly converted into SVN.

    SubGit is a commercial product. It is free for open-source and academic projects and also for projects with up to 10 committers.

    For more details please refer to SubGit documentation and git-svn comparison page.

  3. SmartGit

    SmartGit is a client-side alternative to git-svn.

    SmartGit also supports conversion of the svn:ignore, svn:eol-style and svn:mime-type properties, and it sets the svn:mergeinfo property for merge commits. It even updates the necessary merge tracking data for cherry-pick commits.

    SmartGit is a commercial Git and Mercurial client. It is free for non-commercial usage.

Full disclosure: I'm one of the SubGit developers.