1
votes

Previous SVN workflow

  • svn cp $REPO/trunk $REPO/branches/feature_xxx
  • svn checkout $REPO/branches/feature_xxx
  • developers worked on a working copy of the branch, occasionally merging in changes from trunk, and once more before marking the branch as "ready to merge"
  • when the feature was ready, the owner ran svn merge --reintegrate $REPO/branches/feature_xxx, ran the complete test suite and, if everything was OK, ran svn commit -m "merged feature_xxx"
  • when a release was ready, we ran svn cp $REPO/trunk $REPO/tags/vX.Y.Z and released that tag.

The svn trunk history looked nice and clean:

version 4.5.4
feature_xxx
bugfix_yyy
version 4.5.3
feature_zzz
version 4.5.3
bugfix_aaa

We don't care about internal commits, but if we wanted we could use svn log -g, which expands the merged commits into their internal ones (this was useful once or twice).

Now we have switched to Git and GitLab, mostly because of its ability to comment on merge requests and diffs. But we are struggling to get the same clean workflow. This is more or less what happened:

Strategy 1: attempted to merge a feature branch (via git or GitLab "Merge" button)

Ok.. this is easy! Ehh wait WTF are those previous commits that are polluting my master's log.

Resolution: no way, this is a mess.

Strategy 2: learned about git merge --squash

Ok.. this is what we want.. now we have a nice history, but we have to do it from the command line because GitLab CE doesn't allow it from the GUI (https://gitlab.com/gitlab-org/gitlab-ce/issues/34591). No big deal, we do it from the command line.. oh wait, why hasn't GitLab detected that we closed the merge request..!? That was happening automatically before.

So, we learned there is no way for it to detect that the branch was merged.. surely there is another way (e.g. closing with a commit message). Oops, not implemented yet: https://gitlab.com/gitlab-org/gitlab-ce/issues/13268.

Resolution: the history looks nice, BUT we need to close the merge requests manually, so we lose the ability to tell the ones we closed as "not intended to do" from the ones "closed but merged manually via squash". Surely there must be a better way..

Strategy 3: learned about git rebase..

Ok, so when a branch is finished we do git rebase origin/master and then git rebase -i XXXXX, where XXXXX is the oldest common ancestor, and we squash all commits with a message "implemented feature_xxx". And then we simply merge with master via the command line or the GUI.

Resolution: now GitLab detects which merge requests are merged.. good. But rebasing can be a PITA for long branches and sometimes a waste of time, so when things get hairy we simply go back to merge --squash. Also, the log is not so clean because for every branch we have the implemented feature_xxx and then the merged feature_xxx commits.. it's not as bad as simply merging, but it is still noise.

Conclusion

So now we are using the last strategy.. trying to git rebase most of the time but reverting to git merge --squash when things get hairy. But to be honest we are not 100% happy, the SVN workflow was clean and simpler.

Are we missing something? Thanks

1 Answer

3
votes

Migrating from Subversion to Git is not an easy task because Git is much more powerful and more complex. But it is definitely a fruitful move. Here is a set of related remarks/advice/references:

Atomic commits

We don't care about internal commits, but if we wanted we could use svn log -g, which expands the merged commits into their internal ones (this was useful once or twice).

Even if you don't want to look at the internal commits that constitute a feature all the time, with Git the best practice is to "commit early and often". The idea is to make each commit small and have it implement (or fix) only one thing. See for example this blog article atomic-commits. There are also best practices for writing "good commit messages", see this article or that one.

Recall that in Git, each commit carries a lot of metadata (an author name + e-mail + timestamp; a committer name + e-mail + timestamp; and a SHA-1 hash that identifies it) and, unlike in SVN, git commit only acts locally, so you need to run git push to publish your changes to a remote repository. All this metadata can be shown by GUI tools such as gitk.
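
As a minimal sketch of this (the throwaway repository, the user name and the e-mail address are made up for illustration), the following creates one local commit and prints its metadata from the command line instead of gitk:

```shell
set -e
# Throwaway repository; nothing here touches a remote.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice Example"        # hypothetical identity
git config user.email "alice@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "Add file.txt"   # recorded locally only; git push would publish it

# %an/%ae/%ad = author name/e-mail/date, %cn/%ce/%cd = committer, %H = full hash
git log -1 --format='author:    %an <%ae> %ad%ncommitter: %cn <%ce> %cd%nhash:      %H'

author=$(git log -1 --format='%an <%ae>')
committer=$(git log -1 --format='%cn <%ce>')
commit_hash=$(git log -1 --format='%H')
```

Here author and committer are the same person; they differ, for instance, when someone else applies or rebases your commit.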

Feature branches

Among the 3 strategies you are mentioning, the first one is definitely the best one: create one feature branch per feature, then "merge" them in master. But this strategy has several variants that I'll elaborate in the Workflows section below.

Strategy 1: attempted to merge a feature branch (via git or GitLab "Merge" button) Resolution: no way, this is a mess.

A key concept to know about merges is the notion of fast-forward merge vs. non fast-forward merge or true merge.

If both types of merge are possible (a fast-forward is only possible when master has not moved since the feature branch was created), the CLI command git checkout master && git merge feature will do a fast-forward merge, which can be acceptable if the feature branch contains only one commit. Otherwise, it is best practice to force a non-fast-forward merge with git merge --no-ff feature. If you don't use the command line and instead click the "Merge" button of GitHub or GitLab, a non-fast-forward merge is performed by default.

The advantage of a non-fast-forward merge (= true merge) is that your history looks like a tree (unlike SVN's typical linear history), which makes it easy to keep track of the subset of commits that belong to the feature.
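
To make the difference concrete, here is a small sketch in a throwaway repository (branch name, identity and commit messages are assumptions chosen for the example). It merges the same branch twice, first as a fast-forward and then with --no-ff, and counts the parents of the resulting tip of master:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the first branch is called master
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial commit"

# A feature branch with two commits; master does not move in the meantime.
git checkout -q -b feature
git commit -q --allow-empty -m "feature: step 1"
git commit -q --allow-empty -m "feature: step 2"

# Case 1: plain merge. master is simply moved to the tip of feature.
git checkout -q master
git merge -q feature
ff_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')

# Undo that merge, then Case 2: --no-ff records a real merge commit.
git reset -q --hard HEAD~2
git merge -q --no-ff -m "Merge branch 'feature'" feature
noff_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')

echo "fast-forward tip has $ff_parents parent(s), --no-ff tip has $noff_parents"
git log --graph --oneline
```

The merge commit with two parents is what lets tools draw the feature as a separate line of development.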

See for example this screenshot of gitk with a possible master history with 2 branches merged:
[screenshot: gitk view of a master history with two merged feature branches]

Ok.. this is easy! Ehh wait WTF are those previous commits that are polluting my master's log.

This is not really an issue, but FYI git log and gitk have an option that hides the "internal commits" of the feature branches in such a history (assuming the feature branches have been merged with --no-ff): the --first-parent option.

Here is a screenshot corresponding to the same example with gitk --first-parent:
[screenshot: the same history viewed with gitk --first-parent]
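
The same effect can be reproduced on the command line. The sketch below is only an illustration: it builds a throwaway repository, reuses the branch names feature_xxx and bugfix_yyy from the question, and invents the commit messages. Two branches are merged with --no-ff, then the full log is compared with the --first-parent log:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial commit"

# Two short-lived branches, each merged into master with a true merge.
for name in feature_xxx bugfix_yyy; do
  git checkout -q -b "$name" master
  git commit -q --allow-empty -m "$name: work in progress"
  git commit -q --allow-empty -m "$name: done"
  git checkout -q master
  git merge -q --no-ff -m "merged $name" "$name"
done

all_commits=$(git log --oneline master | wc -l | tr -d ' ')
mainline=$(git log --oneline --first-parent master | wc -l | tr -d ' ')
echo "all commits: $all_commits, first-parent only: $mainline"

# Only the merge commits and the initial commit remain visible:
git log --first-parent --format=%s master
```

With --first-parent the log of master reads much like the SVN trunk history from the question, while the internal commits stay available in the plain git log.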

Workflows

Branching and merging are very powerful in Git, so there are many possible ways to develop and integrate changes in master. However, it is better to follow a systematic workflow within a development team, and two workflows are very popular and have proven very effective:

  • Git flow
  • GitHub flow

I just put a link to the main reference presenting these two workflows, but you can find many other references on the Web, including scripts to facilitate the application of the Git flow for example.

To summarize the difference between the two: the Git flow has two main branches, develop and master, with specific conventions for releases, while the GitHub flow is simpler (no develop branch) and better suited to continuous delivery.

Rebasing

You mentioned some git rebase commands in your post so I guess you are familiar with this command and related implications, but just to be self-contained here are several remarks:

Rebasing (git rebase another-branch) basically means "replaying the commits of the current branch on top of another branch", so rebasing is a form of "history rewriting", and rewriting a commit changes its SHA-1. The main rule here is that you should not rebase commits that have already been published with git push.
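
A small sketch of why this rule matters (throwaway repository; file names and messages are invented for the example): after the rebase the commit has the same message and the same change, but a different hash, so anyone who already fetched the old commit would end up with a diverging history.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "initial commit"

# One commit on a feature branch...
git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"
before=$(git rev-parse feature)

# ...while master moves on independently.
git checkout -q master
echo more > master.txt
git add master.txt
git commit -q -m "master moves on"

# Replay the feature commit on top of the new master.
git checkout -q feature
git rebase -q master
after=$(git rev-parse feature)

echo "before rebase: $before"
echo "after rebase:  $after"
```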

Note that both Git flow and GitHub flow use git merge, but not git rebase.

That said, it is good practice to rewrite one's local history before pushing, to ensure that the commits are atomic, have sensible commit messages, etc. To this end, one can use git commit --amend or git rebase --interactive ancestor-commit.
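
As an illustration of the simpler of the two commands, here is a sketch of git commit --amend on an unpushed commit (throwaway repository, made-up file name and messages). git rebase --interactive opens an editor, so it is not shown here.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# A first attempt with a poor message and unfinished content.
echo "draft" > notes.txt
git add notes.txt
git commit -q -m "wip"
old=$(git rev-parse HEAD)

# Fix both the content and the message; the last commit is replaced, not followed.
echo "final" > notes.txt
git add notes.txt
git commit -q --amend -m "Add notes.txt"
new=$(git rev-parse HEAD)

commit_count=$(git rev-list --count HEAD)
echo "commits on branch: $commit_count"
git log --oneline
```

The amended commit gets a new hash, which is why this is safe only before the commit has been pushed.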

So now we are using the last strategy.. trying to git rebase most of the time but reverting to git merge --squash when things get hairy. But to be honest we are not 100% happy, the SVN workflow was clean and simpler.

Strategies 2 and 3 that you mention are indeed possible ways to keep a linear history from a command-line-only perspective, but by doing this you'd actually be following an SVN workflow with Git, while it is much more usual to have a non-linear history in Git and take advantage of its feature-branch facilities...

Ok.. this is what we want.. now we have a nice history, but we have to do it from the command line because GitLab CE doesn't allow it from the GUI

Actually I guess that GitLab CE doesn't allow you to do this easily because it has been designed in the first place to support Git workflows (which typically implies doing true merges and so on).

Extra references

For more insight on possible workflows beyond the Git-flow/GitHub-flow, here is a long article, but worth reading, written by GitLab on these topics: https://docs.gitlab.com/ee/topics/gitlab_flow.html

Another useful reference is https://git.github.io/git-reference/, which gives a summary of the main Git commands, including git tag, which I did not mention in my post but which is very important when it comes to using the Git flow.