106
votes

If you have multiple, unrelated projects, is it a good idea to put them in the same repository?

myRepo/projectA/trunk
myRepo/projectA/tags
myRepo/projectA/branches
myRepo/projectB/trunk
myRepo/projectB/tags
myRepo/projectB/branches

or would you create new repositories for each?

myRepoA/trunk
myRepoA/tags
myRepoA/branches
myRepoB/trunk
myRepoB/tags
myRepoB/branches

What are the pros and cons of each? All I can currently think of is that you get mixed revision numbers (so what?), and that you can't use svn:externals unless the repository is actually external (I think?).

The reason I ask is because I'm considering consolidating my multiple repos into one, since my SVN host has started charging per repo.

13
I also asked the same question a while ago, so if you need any more help there may be some here: stackoverflow.com/questions/130447/… - Nathan W
oh dammit - sorry for the dupe then. I tried searching, I swear! - nickf
No probs :) I'm not worried, I just mentioned it so that if you didn't get the help you needed here, there might be some more in my question. - Nathan W
nickf, svn:externals work just fine with one big repository. You just point to the subdirectory in the repo with the code you are interested in. - Ben Gartner
In any normal commercial setting, of course, obviously, you'd have multiple repos. (So that, obviously, you can be certain different clients/groups can only see their own different projects.) Imagine one of the big free-subversion sites where you can use a subversion repo.... Think how silly it would be if they only had one enormous repo instead of one for each of us!! - Fattie
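
To illustrate Ben Gartner's comment, here is a minimal sketch of an svn:externals definition that points at another project in the same repository. It assumes Subversion 1.5 or later (for the `^/` repository-root-relative URL syntax) and a hypothetical shared directory projectB/trunk/common; the helper name externals_definition is made up for this example.

```shell
#!/bin/sh
# Build an svn:externals definition that refers to a path inside the SAME
# repository, using the "^/" (relative to repository root) URL form.
#   $1 = path inside the repository (hypothetical, e.g. projectB/trunk/common)
#   $2 = local directory name the external should be checked out into
externals_definition() {
    printf '^/%s %s\n' "$1" "$2"
}

# Usage, from inside a working copy of projectA/trunk:
#   svn propset svn:externals "$(externals_definition projectB/trunk/common common)" .
#   svn commit -m "Pull in projectB's common code as an external"
#   svn update
```

Because the URL is relative to the repository root, the definition should keep working if the repository itself is moved to a different host.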

13 Answers

77
votes

The single vs. multiple issue comes down to personal or organizational preference.

Management of multiple vs. single mainly comes down to access control and maintenance.

Access control for a single repository can be contained in a single file; multiple repositories may require multiple files. Maintenance has similar issues: one big backup, or a lot of little backups.

I manage my own. There's one repository, multiple projects, each with its own tags, trunk and branches. If one gets too big or I need to physically isolate a customer's code for their comfort, I can quickly and easily create a new repository.
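
As a rough sketch of what "create a new repository" could involve when a project has to be moved out of a shared repository: the helper below only prints the commands for review and does not run them. The paths and the function name split_plan are hypothetical, and it assumes filesystem access to the repository and paths without spaces.

```shell
#!/bin/sh
# Print the commands that would copy one project's history out of a shared
# repository into a new repository of its own. Nothing is executed here.
#   $1 = existing shared repository (filesystem path)
#   $2 = project directory inside it, e.g. projectA
#   $3 = new repository to create (filesystem path)
split_plan() {
    echo "svnadmin create $3"
    echo "svnadmin dump $1 | svndumpfilter include --drop-empty-revs --renumber-revs $2 | svnadmin load $3"
}

# Usage:
#   split_plan /var/svn/myRepo projectA /var/svn/projectA        # review
#   split_plan /var/svn/myRepo projectA /var/svn/projectA | sh   # run
```

The new repository keeps the projectA/ prefix on its paths. svndumpfilter can fail when the included path contains copies from paths outside it, so treat this as a starting point rather than a guaranteed recipe.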

I recently consulted with a relatively large firm on migrating multiple source code control systems to Subversion. They have ~50 projects, ranging from very small to enterprise applications and their corporate website. Their plan? Start with a single repository, migrate to multiple if necessary. The migration is almost complete and they're still on a single repository, no complaints or issues reported due to it being a single repository.

This isn't a binary, black & white issue.

Do what works for you - were I in your position, I'd combine projects into a single repository as fast as I could type the commands, because the cost would be a major consideration in my (very, very small) company.
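
If you have filesystem access to the existing repositories (or can get dump files from your host), "typing the commands" might look roughly like the plan below. This is a sketch under assumptions: the paths are hypothetical, consolidation_plan is a made-up helper that only prints commands, and each old repository is assumed to have trunk/tags/branches at its root.

```shell
#!/bin/sh
# Print the commands that would load several repositories into one combined
# repository, each under its own top-level directory. Nothing is executed.
#   $1    = combined repository (absolute filesystem path, already created)
#   $2... = existing repositories to fold in (filesystem paths, no spaces)
consolidation_plan() {
    plan_target="$1"
    shift
    for plan_repo in "$@"; do
        plan_name=$(basename "$plan_repo")
        echo "svnadmin dump $plan_repo > $plan_name.dump"
        # The --parent-dir directory has to exist before the load
        echo "svn mkdir -m 'Add $plan_name' file://$plan_target/$plan_name"
        echo "svnadmin load --parent-dir $plan_name $plan_target < $plan_name.dump"
    done
}

# Usage:
#   consolidation_plan /var/svn/combined /var/svn/myRepoA /var/svn/myRepoB
```

Loaded revisions are appended to the combined repository and renumbered, so old references such as "fixed in r123" in commit messages will no longer line up.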

JFTR:

Revision numbers in Subversion really have no meaning outside the repository. If you need a meaningful name for a revision, create a tag.

Commit messages are easily filtered by path in the repository, so reading only those related to a particular project is a trivial exercise.
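
Both points can be sketched with two small wrappers. The repository URL, the tag name, and the function names are assumptions for illustration; with DRY_RUN=1 the commands are printed instead of executed.

```shell
#!/bin/sh
# Hypothetical repository URL; override it in the environment.
REPO_URL="${REPO_URL:-https://svn.example.com/myRepo}"

# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Give a revision a meaningful name by copying trunk to tags/<name>.
#   tag_project projectA release-1.0
tag_project() {
    run svn copy "${REPO_URL}/$1/trunk" "${REPO_URL}/$1/tags/$2" -m "Tag $1 as $2"
}

# Show only the commits that touched one project's subtree.
#   project_log projectA -l 5
project_log() {
    log_project="$1"
    shift
    run svn log "$@" "${REPO_URL}/${log_project}"
}
```

Running svn log against the project's URL restricts the output to revisions that changed something under that path, which is what makes mixed revision numbers tolerable in a shared repository.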


Edit: See Blade's response for details on using a single authorization/authentication configuration for SVN.

25
votes

For your specific case, one repository is perfect. You will save a lot of money. I always encourage people to use a single repository because, like a single filesystem, it is easier to work with:

  • You will have a single place where you look for code
  • You will have a single authorisation
  • You will have a single commit number (ever tried to build a project which is spread over 3 repos?)
  • You can better reuse common libraries and track your progress in these libs (svn:externals are a PITA and will not solve all problems)
  • Projects planned as fully different items can grow together and share functions and interfaces. This will be very difficult to achieve in multiple repos.

There is one argument for multiple repositories: administering huge repos is uncomfortable. Dumping and loading huge repos takes a lot of time. But as you do not do any administration yourself, I don't think it will concern you ;)

SVN scales very well with bigger repositories; there is no slowdown even on huge (>100 GB) repositories.

So you will have less hassle with a single repository. But you really should think about the repo layout!

7
votes

I would use multiple repositories. In addition to the user access issue, it also makes backup and restore easier. And if you find yourself in a position where somebody wants to pay you for your code (and its history), it's easier to give them just a repository dump.

I would suggest that consolidating repositories just because of the charging policies of your hosting provider is not a very good reason.

7
votes

We use a single repository. My only concern was scale, but after seeing ASF's repository (700k revisions and counting) I was pretty convinced performance would not be an issue.

Our projects are all related: different interlocking modules which form a set of dependencies for any given app. For this reason, a single repository is ideal. You may want separate trunk/branches/tags for each project, but you're still able to atomically commit a change across your entire codebase within a single revision. This is awesome for refactoring.

7
votes

When making your decision, be aware that many SVN repos can share the same config file.

Example (taken from link above):

In shell:

$ svnadmin create /var/svn/repos1
$ svnadmin create /var/svn/repos2
$ svnadmin create /var/svn/repos3

File: /var/svn/repos1/conf/svnserve.conf

[general]
# anon-access may be none, read, or write
anon-access = none
auth-access = write
password-db = /var/svn/conf/passwd
authz-db = /var/svn/conf/authz
realm = Repos1 SVN Repository

File: /var/svn/conf/authz

[groups]
group_repos1_read = user1, user2
group_repos1_write = user3, user4
group_repos2_read = user1, user4

### Global Right for all repositories ###
[/]
### Could be a superadmin or something else ###
user5 = rw

### Global Rights for one repository (e.g. repos1) ###
[repos1:/]
@group_repos1_read = r
@group_repos1_write = rw

### Repository folder specific rights (e.g. the trunk folder) ###
[repos1:/trunk]
user1 = rw

### And so on for the other repositories ###
[repos2:/]
@group_repos2_read = r
user3 = rw
5
votes

I would create separate repositories. Why? The revision numbers and commit messages just won't make any sense if you have a lot of unrelated projects in only one repository; it will surely become a big mess before long.

5
votes

We are a small software company and we use a single repo for all of our development. The tree looks like this:

/client/<clientname>/<project>/<trunk, branches, tags>

The idea was that we would have client and internal work in the same repo, but we ended up having our company as a "client" of itself.

This has worked really well for us, and we use Trac to interface to it. Revision numbers are across the whole repo and not specific to one project, but that doesn't faze us.

4
votes

Personally, I'd create new repositories for each. It keeps the check out process much simpler and makes administration on the whole easier, at least with regards to user access and backups. Also, it avoids the global version number problem, so the version number is meaningful on all projects.

Really though, you should just use git ;)

4
votes

One additional thing to consider is that using multiple repositories causes you to lose unified logging (the svn log command). This alone is a good reason for choosing a single repository.

I use TortoiseSVN and find its "Show Log" option indispensable. Although your projects are unrelated, I'm sure you will find that having centralized, cross-project information (paths, bug IDs, messages and so on) is always useful.

2
votes

If you use, or plan to use, a tool like Trac, which integrates with SVN, it makes more sense to use one repo per project.

2
votes

Similar to Blade's suggestion about sharing files, here is a slightly easier, yet less flexible solution. I set up ours like so:

  • /var/svn/
  • /var/svn/bin
  • /var/svn/repository_files
  • /var/svn/svnroot
  • /var/svn/svnroot/repos1
  • /var/svn/svnroot/repos2
  • ...

In "bin", I keep a script called svn-create.sh which will do all of the setup work of creating an empty repository. I also keep the backup script there.

In "repository_files", I keep common "conf" and "hooks" directories that all of the repositories symlink to, so there's only one set of files. This does eliminate the ability to have granular, per-project access without breaking the links, though. That was not a concern where I set this up.

Last, I keep the main directory /var/svn under source control ignoring everything in svnroot. That way the repository files and scripts are under source control as well.

#!/bin/bash

# Usage:
# svn-create.sh repository_name

# This will:
# - create a new repository
# - link the necessary commit scripts
# - setup permissions
# - create and commit the initial directory structure
# - clean up after itself

if [ -z "${1}" ] ; then
  echo "Usage:"
  echo "    ${0} repository_name"
  exit 1
fi

SVN_HOME=/var/svn
SVN_ROOT=${SVN_HOME}/svnroot
SVN_COMMON_FILES=${SVN_HOME}/repository_files
NEW_DIR=${SVN_ROOT}/${1}
TMP_DIR=/tmp/${1}_$$

echo "Creating repository: ${1}"

# Create the repository
svnadmin create "${NEW_DIR}" || exit 1

# Copy/Link the hook scripts
# (stop if the cd fails, so rm -rf never runs in the wrong directory)
cd "${NEW_DIR}" || exit 1
rm -rf hooks
ln -s "${SVN_COMMON_FILES}/hooks" hooks

# Setup the user configuration
rm -rf conf
ln -s "${SVN_COMMON_FILES}/conf" conf

# Checkout the newly created project
svn co "file://${NEW_DIR}" "${TMP_DIR}" || exit 1

# Create the initial directory structure
cd "${TMP_DIR}" || exit 1
mkdir trunk
mkdir tags
mkdir branches

# Schedule the directories addition to the repository
svn add trunk tags branches

# Check in the changes
svn ci -m "Initial Setup"

# Delete the temporary working copy
cd /
rm -rf "${TMP_DIR}"

# That's it!
echo "Repository ${1} created. (most likely)"
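
The backup script kept in "bin" is not shown above. Purely as an illustration, and not the author's script, a backup loop over this layout could be sketched as follows. The function names and the destination path are assumptions; with DRY_RUN=1 the commands are printed instead of executed.

```shell
#!/bin/sh
# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hot-copy every repository found directly under $1 into the directory $2.
#   $1 = directory holding the repositories (e.g. /var/svn/svnroot)
#   $2 = backup destination; use a fresh directory per run,
#        for example one named after the date
backup_all() {
    for bk_repo in "$1"/*/; do
        [ -d "$bk_repo" ] || continue     # the glob matched nothing
        bk_repo=${bk_repo%/}              # drop the trailing slash
        run svnadmin hotcopy "$bk_repo" "$2/$(basename "$bk_repo")" || return 1
    done
}

# Usage:
#   dest="/backup/svn/$(date +%Y%m%d)"
#   mkdir -p "$dest" && backup_all /var/svn/svnroot "$dest"
```

With one repository per project, each copy is small and can be restored on its own; with a single repository the same loop simply produces one large copy.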
2
votes

Similar to mlambie's approach of using a single repo, but I went a bit further with the folder structure so I can easily zoom in on a particular type of project: web (HTML-based projects) vs. cs (C#) vs. sql (SQL create/execute scripts) vs. xyz (domain-specific languages such as afl (AmiBroker Formula Language) or ts (TradeStation)):

/<src|lib>/<app-settings|afl|cs|js|iphone|sql|ts|web>/<ClientName>/<ProjectName>/<branches|tags>

Note that I have trunk live within branches, as I treat it as the default branch. The only pain is that when you want to quickly create another project, you need to build out the ProjectName/branches|tags structure. I use app-settings simply as a place to keep application-specific settings files in the repo so they are easy to share with others (substitute VendorName for ClientName and AppName for ProjectName in this folder structure); the branches|tags can also be useful for tagging settings across different major versions of vendor products.

Comments on my structure are welcome. I recently changed it to this and so far I'm pretty happy, but I sometimes find it burdensome to maintain the branches|tags structure per project, particularly if the project exists simply to unit test another project.

1
votes

My suggestion is one, unless you have different users accessing each project; then I'd say use multiple.

But again, even that's not a good reason to use multiple.