Why is branching and merging easier in Mercurial than in Subversion?

Question

Handling multiple merges onto branches in Subversion or CVS is just one of those things that has to be experienced. It is inordinately easier to keep track of branches and merges in Mercurial (and probably any other distributed system) but I don't know why. Does anyone else know?

My question stems from the fact that with Mercurial you can adopt a working practice similar to that of Subversion's/CVS's central repository and everything will work just fine. You can do multiple merges on the same branch and you won't need endless scraps of paper with commit numbers and tag names.

I know the latest version of Subversion has the ability to track merges to branches, so you don't get quite the same degree of hassle, but it was a major development effort on their side and it still doesn't do everything the development team would like it to do.

There must be a fundamental difference in the way it all works.

Damien Diederen · Accepted Answer · 2008-09-04T20:32:43

In Subversion (and CVS), the repository is first and foremost. In Git and Mercurial there is not really the concept of a repository in the same way; here changes are the central theme.

+1

The hassle in CVS/SVN comes from the fact that these systems do not remember the parenthood of changes. In Git and Mercurial, not only can a commit have multiple children, it can also have multiple parents!

That can easily be observed using one of the graphical tools, gitk or hg view. In the following example, branch #2 was forked from #1 at commit A, and has since been merged once (at M, merged with commit B):

o---A---o---B---o---C         (branch #1)
     \       \
      o---o---M---X---?       (branch #2)

Note how A and B have two children, whereas M has two parents. These relationships are recorded in the repository. Let's say the maintainer of branch #2 now wants to merge the latest changes from branch #1. They can issue a command such as:

$ git merge branch-1

and the tool will automatically know that the base is B--because it was recorded in commit M, an ancestor of the tip of #2--and that it has to merge whatever happened between B and C. CVS does not record this information, nor did SVN prior to version 1.5. In these systems, the graph would look like:
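To make the "automatically know that the base is B" part concrete, here is a minimal Python sketch of the idea, not how Git or Mercurial actually implement it. It models the first diagram as a map from each commit to its parents; the names p1, p2, p3, q1 and q2 are made up for the unlabeled "o" commits, and the "?" commit is left out since it does not exist yet.

```python
# Toy model of the first diagram: commit -> list of recorded parents.
# p1/p2/p3/q1/q2 are hypothetical names for the unlabeled "o" commits.
PARENTS = {
    "p1": [],
    "A": ["p1"],
    "p2": ["A"],
    "B": ["p2"],
    "p3": ["B"],
    "C": ["p3"],        # tip of branch #1
    "q1": ["A"],
    "q2": ["q1"],
    "M": ["q2", "B"],   # merge commit: two parents are recorded
    "X": ["M"],         # tip of branch #2
}


def ancestors(parents, commit):
    """Return the commit itself plus everything reachable via parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def merge_bases(parents, left, right):
    """Common ancestors that are not themselves ancestors of another one."""
    common = ancestors(parents, left) & ancestors(parents, right)
    return {
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    }


print(merge_bases(PARENTS, "X", "C"))  # {'B'}
```

Because M lists B as a parent, B is reachable from both tips, and it is the most recent commit for which that holds. No scraps of paper required.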

o---A---o---B---o---C         (branch #1)
     \    
      o---o---M---X---?       (branch #2)

where M is just a gigantic "squashed" commit of everything that happened between A and B, applied on top of the previous tip of branch #2. Note that after the deed is done, there is no trace left (except potentially in human-readable comments) of where M originated, nor of how many commits were collapsed together--making history much more impenetrable.

Worse still, performing a second merge becomes a nightmare: one has to figure out what the merge base was at the time of the first merge (and one has to know that there has been a merge in the first place!), then present that information to the tool so that it does not try to replay A..B on top of M. All of this is difficult enough when working in close collaboration, but is simply impossible in a distributed environment.
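The difference can be sketched in a few lines of Python. This is only an illustration under the same toy model as the diagrams (p1, p2, p3, q1, q2 are invented names for the "o" commits, and `incoming` is a hypothetical helper, not a real tool command): it lists the commits a second merge would have to bring over, once with M's second parent recorded and once without.

```python
def reachable(parents, tip):
    """All commits reachable from tip by following recorded parent links."""
    seen, stack = set(), [tip]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def incoming(parents, source_tip, target_tip):
    """Commits the source branch has that the target branch does not."""
    return reachable(parents, source_tip) - reachable(parents, target_tip)


# Commits shared by both pictures (hypothetical names for the "o" commits).
BRANCH_1 = {"p1": [], "A": ["p1"], "p2": ["A"], "B": ["p2"],
            "p3": ["B"], "C": ["p3"]}
BRANCH_2 = {"q1": ["A"], "q2": ["q1"], "X": ["M"]}

# First diagram: M remembers both parents.
recorded = {**BRANCH_1, **BRANCH_2, "M": ["q2", "B"]}
# Second diagram: M is a plain commit whose only parent is on branch #2.
squashed = {**BRANCH_1, **BRANCH_2, "M": ["q2"]}

print(sorted(incoming(recorded, "C", "X")))  # ['C', 'p3']
print(sorted(incoming(squashed, "C", "X")))  # ['B', 'C', 'p2', 'p3']
```

In the second case the tool has no way to tell that p2 and B are already contained in M, so it would try to apply them again unless a human supplies the right base by hand.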

A (related) problem is that there is no way to answer the question: "does X contain B?" where B is a potentially important bug fix. So, why not just record that information in the commit, since it is known at merge time?
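Once the parents are recorded, "does X contain B?" becomes a plain reachability query. The Python sketch below is an assumption-laden toy (the graphs mirror the two diagrams, q1 and q2 stand in for the unlabeled commits, and `contains` is a hypothetical helper); Git exposes the same kind of check as `git merge-base --is-ancestor`.

```python
def contains(parents, tip, commit):
    """True if commit is reachable from tip through recorded parent links."""
    seen, stack = set(), [tip]
    while stack:
        current = stack.pop()
        if current == commit:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return False


# Only the commits needed for the question; q1/q2 are hypothetical names.
with_merge_parents = {
    "A": [], "B": ["A"],
    "q1": ["A"], "q2": ["q1"],
    "M": ["q2", "B"],   # merge recorded with both parents
    "X": ["M"],
}
without_merge_parents = {
    "A": [], "B": ["A"],
    "q1": ["A"], "q2": ["q1"],
    "M": ["q2"],        # same content, but the link to B is lost
    "X": ["M"],
}

print(contains(with_merge_parents, "X", "B"))     # True
print(contains(without_merge_parents, "X", "B"))  # False
```

The second answer is wrong in practice, since the content of B really is in X; the history simply has no way of saying so.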

P.-S. -- I have no experience with SVN 1.5+ merge recording abilities, but the workflow seems to be much more contrived than in the distributed systems. If that is indeed the case, it's probably because--as mentioned in the above comment--the focus is put on repository organization rather than on the changes themselves.
