163
votes

I was googling around a little bit and couldn't find a good "beginners" guide to SVN, not in the meaning of "how do I use the commands" rather; How do I control my source code?

What I'd like to clear up is the following topics:

  • How often do you commit? As often as one would press Ctrl + s?
  • What is a Branch and what is a Tag and how do you control them?
  • What goes into the SVN? Only Source Code or do you share other files here aswell? (Not considered versioned files.. )

I don't have any idea what branch and tag is so I don't know the purpose, but my wild guess is that you upload stuff to the trunk and when you do a major build you move it to the branch? So, what is considered a major build in this case?

16
A good way to determine a commit 'frequency' is to come at it from a merge point of view. If you had a branch you needed to merge changes from trunk into, picking through a handful of revisions Vs 1000's soon helps you end up at a sensible approach.Phil Cooper

16 Answers

60
votes

The subversion book is an excellent source of information on strategies for laying out your repository, branching and tagging.

See also:

Do you continue development in a branch or in the trunk

Branching strategies

86
votes

I asked myself the same questions when we came to implement Subversion here -- about 20 developers spread across 4 - 6 projects. I didn't find any one good source with ''the answer''. Here are some parts of how our answer has developed over the last 3 years:

-- commit as often as is useful; our rule of thumb is commit whenever you have done sufficient work that it would be a problem having to re-do it if the modifications got lost; sometimes I commit every 15 minutes or so, other times it might be days (yes, sometimes it takes me a day to write 1 line of code)

-- we use branches, as one of your earlier answers suggested, for different development paths; right now for one of our programs we have 3 active branches: 1 for the main development, 1 for the as-yet-unfinished effort to parallelise the program, and 1 for the effort to revise it to use XML input and output files;

-- we scarcely use tags, though we think we ought to use them to identify releases to production;

Think of development proceeding along a single path. At some time or state of development marketing decide to release the first version of the product, so you plant a flag in the path labelled '1' (or '1.0' or what have you). At some other time some bright spark decides to parallelise the program, but decides that that will take weeks and that people want to keep going down the main path in the meantime. So you build a fork in the path and different people wander off down the different forks.

The flags in the road are called 'tags' ,and the forks in the road are where 'branches' divide. Occasionally, also, branches come back together.

-- we put all material necessary to build an executable (or system) into the repository; That means at least source code and make file (or project files for Visual Studio). But when we have icons and config files and all that other stuff, that goes into the repository. Some documentation finds its way into the repo; certainly any documentation such as help files which might be integral to the program does, and it's a useful place to put developer documentation.

We even put Windows executables for our production releases in there, to provide a single location for people looking for software -- our Linux releases go to a server so don't need to be stored.

-- we don't require that the repository at all times be capable of delivering a latest version which builds and executes; some projects work that way, some don't; the decision rests with the project manager and depends on many factors but I think it breaks down when making major changes to a program.

18
votes
* How often do you commit? As often as one would press ctrl + s?

As often as possible. Code doesn't exist unless it is under source control :)

Frequent commits (thereafter smaller change sets) allows you to integrate your changes easily and increase chances to not break something.

Other people noted that you should commit when you have a functional piece of code, however I find it useful to commit slightly more often. Few times I noticed that I use source control as a quick undo/redo mechanism.

When I work on my own branch I prefer to commit as much as possible (literally as often as I press ctrl+s).

* What is a Branch and what is a Tag and how do you control them?

Read SVN book - it is a place you should start with when learning SVN:

* What goes into the SVN?

Documentation, small binaries required for build and other stuff that have some value go to source control.

11
votes

Here are a few resources on commit frequency, commit messages, project structure, what to put under source control and other general guidelines:

These Stack Overflow questions also contain some useful information that may be of interest:

Regarding the basic Subversion concepts such as branching and tagging, I think this is very well explained in the Subversion book.

As you may realize after reading up a bit more on the subject, people's opinions on what's best practice in this area are often varying and sometimes conflicting. I think the best option for you is to read about what other people are doing and pick the guidelines and practices that you feel make most sense to you.

I don't think it's a good idea to adopt a practice if you do not understand the purpose of it or don't agree to the rationale behind it. So don't follow any advice blindly, but rather make up your own mind about what you think will work best for you. Also, experimenting with different ways of doing things is a good way to learn and find out how you best like to work. A good example of this is how you structure the repository. There is no right or wrong way to do it, and it's often hard to know which way you prefer until you have actually tried them in practice.

8
votes

Commit frequency depends on your style of project management. Many people refrain from committing if it'll break the build (or functionality).

Branches can be used in one of two ways, typically: 1) One active branch for development (and the trunk stays stable), or 2) branches for alternate dev paths.

Tags are generally used for identifying releases, so they don't get lost in the mix. The definition of 'release' is up to you.

7
votes

I think the main problem is the mental picture of source control is confused. We commonly have trunk and branches, but then we get unrelated ideas of tags/releases or something to that affect.

If you use the idea of a tree more completely it becomes clearer, at least for me it is.

We get the trunk -> forms branches -> produce fruit (tags/releases).

The idea being that you grow the project from a trunk, which then creates branches once the trunk is stable enough to hold the branch. Then when the branch has produced a fruit you pluck it from the branch and release it as a tag.

Tags are essentially deliverables. Whereas trunk and branches produce them.

4
votes

As others have said, the SVN Book is the best place to start and a great reference once you've gotten your sea legs. Now, to your questions ...

How often do you commit? As often as one would press ctrl + s?

Often, but not as often as you press ctrl + s. It's a matter of personal taste and/or team policy. Personally I would say commit when you complete a functional piece of code, however small.

What is a Branch and what is a Tag and how do you control them?

First, trunk is where you do your active development. It is the mainline of your code. A branch is some deviation from the mainline. It could be a major deviation, like a previous release, or just a minor tweak you want to try out. A tag is a snapshot of your code. It's a way to attach a label or bookmark to a particular revision.

It's also worth mentioning that in subversion, trunk, branches and tags are only convention. Nothing stops you from doing work in tags or having branches that are your mainline, or disregarding the tag-branch-trunk scheme all together. But, unless you have a very good reason, it's best to stick with convention.

What goes into the SVN? Only Source Code or do you share other files here aswell?

Also a personal or team choice. I prefer to keep anything related to the build in my repository. That includes config files, build scripts, related media files, docs, etc. You should not check in files that need to be different on each developer's machine. Nor do you need to check in by-products of your code. I'm thinking mostly of build folders, object files, and the like.

4
votes

Eric Sink, who appeared on SO podcast#36 in January 2009, wrote an excellent series of articles under the title Source Control How-to.

(Eric is the founder of SourceGear who market a plug-compatible version of SourceSafe, but without the horribleness.)

4
votes

Just to add another set of answers:

  • I commit whenever I finish a piece of work. Sometimes it's a tiny bugfix that just changed one line and took me 2 minutes to do; other times it's two weeks worth of sweat. Also, as a rule of thumb, you don't commit anything that breaks the build. Thus if it has taken you a long time to do something, take the latest version before committing, and see if your changes break the build. Of course, if I go a long time without committing, it makes me uneasy because I don't want to loose that work. In TFS I use this nice thing as "shelvesets" for this. In SVN you'll have to work around in another way. Perhaps create your own branch or backup these files manually to another machine.
  • Branches are copies of your whole project. The best illustration for their use is perhaps the versioning of products. Imagine that you are working at a large project (say, the Linux kernel). After months of sweat you've finally arrived at version 1.0 that you release to the public. After that you start to work on version 2.0 of you product which is going to be way better. But in the mean time there are also a lot of people out there which are using version 1.0. And these people find bugs that you have to fix. Now, you can't fix the bug in the upcoming 2.0 version and ship that to the clients - it's not ready at all. Instead you have to pull out an old copy of 1.0 source, fix the bug there, and ship that to the people. This is what branches are for. When you released the 1.0 version you made a branch in SVN which made a copy of the source code at that point. This branch was named "1.0". You then continued work on the next version in your main source copy, but the 1.0 copy remained there as it was at the moment of the release. And you can continue fixing bugs there. Tags are just names attached to specific revisions for ease of use. You could say "Revision 2342 of the source code", but it's easier to refer to it as "First stable revision". :)
  • I usually put everything in the source control that relates directly to the programming. For example, since I'm making webpages, I also put images and CSS files in source control, not to mention config files etc. Project documentation does not go in there, however that is actually just a matter of preference.
3
votes

Others have stated that it depends on your style.

The big question for you is how often you "integrate" your software. Test driven development, Agile and Scrum (and many, many others) rely on small changes and continuous integration. They preach that small changes are made, everyone finds the breaks and fixes them all the time.

However on a larger project (think government, defence, 100k+LOC) you simply can't use continuous integration as it's not possible. In these situations it may be better to use branching to do lots of little commits on but bring back into the trunk ONLY what will work and is ready to be integrated into the build.

One caveat with branching though is that if they aren't managed properly, it can be a nightmare in your repository to get work into the trunk, as everyone is developing from different spots on the trunk (which is incidentally one of the largest arguments for continuous integration).

There is no definitive answer on this question, the best way is to work with your team to come up with the best compromise solution.

2
votes

Version Control with Subversion is the guide for beginners and old hands alike.

I don't think you can use Subversion effectively without reading at least the first few chapters of this.

1
votes

For committing, I use the following strategies:

  • commit as often as possible.

  • Each feature change/bugfix should get its own commit (don't commit many files at once since that will make the history for that file unclear -- e.g. If I change a logging module and a GUI module independently and I commit both at once, both changes will be visible in both file histories. This makes reading a file history difficult),

  • don't break the build on any commit -- it should be possible to retrieve any version of the repository and build it.

All files that are necessary for building and running the app should be in SVN. Test files and such should not, unless they are part of the unit tests.

1
votes

A lot of good comments here, but something that hasn't been mentioned is commit messages. These should be mandatory and meaningful. Especially with branching/merging. This will allow you to keep track of what changes are relevant to which bugs features.

for example svn commit . -m 'bug #201 fixed y2k bug in code' will tell anyone who looks at the history what that revision was for.

Some bug tracking systems (eg trac) can look in the repository for these messages and associate them with the tickets. Which makes working out what changes are associated with each ticket very easy.

1
votes

The policy at our work goes like this (multi-developer team working on object oriented framework):

  • Update from SVN every day to get the previous day's changes

  • Commit daily so if you are sick or absent next day(s) someone else can easily take over from where you left off.

  • Don't commit code that breaks anything, since that will impact the other developers.

  • Work on small chunks and commit daily WITH MEANINGFUL COMMENTS!

  • As a team: Keep a Development branch, then move pre-release code (for QA) into a Production branch. This branch should only ever have fully working code.

0
votes

The TortoiseSVN TSVN Manual is based on subversion book, but available in a lot more languages.

0
votes

I thinks there is two way about committing frequency:

  1. Commit very often, for each implemented method, small part of code, etc.
  2. Commit only completed parts of code, like modules, etc.

I prefer the first one - because using source control system is very useful not only for project or company, the first of all it's useful for the developer. For me the best feature is to roll back all code while searching the best assigned task implementation.