First, credit to the answer from @cmcginty. It was a great starting point for me, and much of what I'll post here borrows heavily from it. However, the repos I was moving have years of history, which caused a few problems when I followed that answer to the letter (for one, hundreds of branches and tags that would have needed to be moved manually; more on that later).
So after hours of searching and trial and error I was able to put together a script which allowed me to easily move several projects from SVN to GIT, and I've decided to share my findings here in case anyone else is in my shoes.
<tl;dr> Let's get started
First, create an authors file that maps SVN usernames to Git author identities (full name and email). The easiest way to do this is to extract every user from the SVN repo you are going to move.
svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors-transform.txt
This will produce a file called authors-transform.txt with one line for each user who has made a change in the svn repo it was run from.
someuser = someuser <someuser>
Update to include full name and email for git
someuser = Some User <someuser@example.com>
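With many committers it is easy to miss an entry while editing. As a minimal sketch (the function name `unedited_authors` is made up for illustration and is not part of git or svn), the following lists usernames whose line still has the auto-generated `user = user <user>` form:

```shell
# Print SVN usernames whose entry in the authors file is still the
# auto-generated placeholder "user = user <user>".
# $1 is the path to the authors file (authors-transform.txt above).
unedited_authors() {
    # Split each line on " = "; an untouched entry has a right-hand
    # side equal to "user <user>" built from the left-hand side.
    awk -F ' = ' '$2 == ($1 " <" $1 ">") { print $1 }' "$1"
}
```

Running `unedited_authors authors-transform.txt` before the clone should print nothing once every entry has been updated.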
Now start the clone using your authors file
git svn clone --stdlayout --no-metadata -r854:HEAD --authors-file=authors-transform.txt https://somesvnserver/somerepo/ temp
- --stdlayout indicates that the svn repo follows the standard /trunk /branches /tags layout
- --no-metadata tells git not to stamp metadata relating to the svn commits on each git commit. If this is not a one-way conversion, remove this flag
- -r854:HEAD only fetches history from revision 854 up. This is where I hit my first snag; the repo I was converting had a 'corrupted' commit at revision 853 so it would not clone. Using this parameter allows you to only clone part of the history.
- temp is the name of the directory that will be created to initialize the new git repo
This step can take a while, particularly on a large or old repo (roughly 18 hours for one of ours). You can also use the -r switch to clone only a small slice of history as a test, and fetch the rest later.
Move to the new directory
cd temp
If you cloned only part of the history, fetch the rest
git svn fetch
Tags are created as branches during cloning. If you only have a few you can convert them one at a time.
git tag 1.0.0 origin/tags/1.0.0
However, this is tedious if you have hundreds of tags, so the following script worked for me.
for brname in `git branch -r | grep tags | awk '{gsub(/^[^\/]+\//,"",$1); print $1}'`; do echo $brname; tname=${brname:5}; echo $tname; git tag $tname origin/tags/$tname; done
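If you would rather see what is going to happen before creating hundreds of tags, a dry-run variant of the same idea can help. This is only a sketch: `tag_commands` is a hypothetical helper name, and it assumes the tag refs appear as `origin/tags/<name>` in `git branch -r`, as they do in the commands above. Unlike `grep tags`, it matches only that exact prefix.

```shell
# Read `git branch -r` output on stdin and print one `git tag` command
# for every ref of the form origin/tags/<name>. Nothing is executed.
tag_commands() {
    # Strip git's leading indentation, capture the tag name, and emit
    # the command; lines that are not under origin/tags/ are dropped.
    sed -n 's|^ *origin/tags/\(.*\)$|git tag "\1" "origin/tags/\1"|p'
}
```

`git branch -r | tag_commands` prints the commands for review, and `git branch -r | tag_commands | sh` runs them. Tag names containing shell metacharacters would need extra care before piping to `sh`.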
You also need to check out all branches you want to keep (with --stdlayout, the svn branches appear directly under origin/)
git checkout -b branchname origin/branchname
And if you have a lot of branches as well, this script may help
for brname in `git branch -r | grep -v master | grep -v HEAD | grep -v trunk | grep -v tags | awk '{gsub(/^[^\/]+\//,"",$1); print $1}'`; do echo $brname; git checkout -b $brname origin/$brname; done
This skips the trunk branch, which is already checked out as master (so you won't have to delete a duplicate branch later), and it also skips the tags we already converted.
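The chain of `grep -v` filters drops any branch whose name merely contains "master", "trunk" or "tags". A slightly stricter sketch of the same filter is below; `svn_branch_names` is a hypothetical name, and the exact ref names (`origin/trunk`, `origin/master`, `origin/tags/...`) are assumed from the layout used above.

```shell
# Read `git branch -r` output on stdin and print the branch names that
# still need a local branch: everything except HEAD pointers, trunk,
# master and the refs under tags/.
svn_branch_names() {
    sed 's/^ *//' |    # drop the indentation git adds to each line
    grep -v -e ' -> ' -e '^origin/tags/' -e '^origin/trunk$' -e '^origin/master$' |
    sed 's|^origin/||' # keep only the part after the remote name
}
```

It can then drive the checkout loop, for example `git branch -r | svn_branch_names | while read -r b; do git checkout -b "$b" "origin/$b"; done`.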
Now is a good time to take a look at the new repo and make sure you have a local branch or tag for anything you want to keep, as remote branches will be dropped in a moment.
OK, now let's clone everything we've checked out to a clean repo (named temp2 here)
cd ..
git clone temp temp2
cd temp2
Now we'll need to check out all of the branches one more time before pushing them to their final remote, so follow your favorite method from above.
If you're following gitflow you can rename your working branch to develop.
git checkout WORKING
git branch -m develop
git push origin --delete WORKING
git push origin -u develop
Now, if everything looks good, you're ready to push to your git repository
git remote set-url origin https://somebitbucketserver/somerepo.git
git push -u origin --all
git push origin --tags
I did run into one final issue which was that Control Freak initially blocked me from pushing tags that I didn't create, so if your team uses Control Freak you may need to disable or adjust that setting for your initial push.