When we started with Source Code Management (SCM) – at the time – we still had a virtual Windows server running with QualityHosting.de. So a friend set up VisualSVN for us on this box. This got us started with Subversion.
Two years later we decided that was enough procrastinating. Finally the time had come to switch to git for good.
Subversion served us well, because – seriously – it had several attributes that I chose to interpret as advantages. Specifically for my components I wanted to have a central place that I am controlling access to.
Why not the Cloud?
Another question that I get often is: “Why don’t you host your source code in the cloud?”
We could be paying a few dollars per month to GitHub or BitBucket and never have to worry about the administration of the servers that host our precious source code. Simple reason: we don’t want to have to rely on another company and we want to keep our intellectual property in a place that we control.
Because of this, self-hosting is a MUST.
Now if I had the money I would happily pay GitHub $5000 per year to self-host a GitHub instance. But I don’t have the size for that: you need to have at least 20 programmers for this solution to be viable.
We briefly dabbled in Gitorious but it had many annoying quirks and so we finally settled on the highly recommended GitLab. I especially like how it very closely resembles the GitHub experience, but totally free. Also setup is a cinch, you just launch their virtual appliance as a new VirtualBox instance.
The criteria for a successful migration are:
- Move the users from user names to email addresses
- Migrate the Source
- Preserve Tags and Branches
My colleague René spearheaded the migration by boldly moving our ELO projects to the new server. He followed the instructions on git-scm.com on how to move from SVN to GIT. This is referred to as “the guide” below.
I was a bit more timid.
I can be bold if I pretend to myself that I am teaching the new information already to somebody. This is why I like to be writing a blog post at the same time as doing something the first time so that I can keep track of the individual pieces and see how it all fits together. Those articles also serve as our knowledge base for future endeavors.
Subversion uses user names for authors, whereas Git uses email addresses. So the first step needs to be to get a list of contributing users on a repo to establish a mapping.
You can create a basic mapping file by running the following command in a working copy of the SVN repo. Note that on OSX the grep parameter for a regular expression is -e instead of the -P mentioned in the guide.
svn log --xml | grep -e "^<author" | sort -u | perl -pe 's/<author>(.*?)<\/author>/$1 = /' > users.txt
Now on most of my repositories I was the only author, but a few had collaborators, so it makes sense to keep appending to the users file for each migrated repo.
You specify the mapping in this format. The fastest method I found was to find an email in mail.app and right-click copy the email address from there. This adds the full name as well as the email address in angle brackets.
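For reference, each line of the authors file maps one Subversion user name to a git identity in the form "svnuser = Full Name <email>". Here is a minimal sketch of such a file, plus a small helper to spot entries that still lack an identity. The file name users.example.txt, the names, the addresses and the unmapped_authors helper are all made up for illustration.

```shell
# Hypothetical authors file; names and addresses are placeholders.
cat > users.example.txt <<'EOF'
oliver = Oliver Example <oliver@example.com>
rene =
EOF

# Print the SVN user names whose right-hand side is still empty.
unmapped_authors() {
    awk -F' *= *' '$2 == "" { print $1 }' "$1"
}

unmapped_authors users.example.txt
```

Any name this prints still needs its identity filled in before you run the clone.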
The next step is to use this authors mapping file on a clone of the svn repo.
There is a handy tool that allows git fanboys to work with Subversion repositories: git-svn. It is able to talk to an SVN server and pull down the source code, but locally it becomes a git repository.
git-svn turns each SVN revision into an equivalent git commit. You have to kiss your revision numbers goodbye. Hello commit SHAs.
git svn clone https://svn.cocoanetics.com/DTLoupe --authors-file=users.txt --no-metadata --stdlayout DTLoupe
There are two differences here versus the guide. First, I am using git from the Xcode.app bundle via an alias, so there is no git-svn command for me; I have to call “git svn” with a space instead. Second, I find the -s confusing, so I substituted it with --stdlayout.
--authors-file specifies the user mapping file we created earlier.
--no-metadata eliminates some unnecessary meta info. This makes subsequent fetching impossible, but we won’t need that anyway.
--stdlayout causes the master branch to be connected to the svn standard trunk folder.
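Once the clone has finished, one way to check that the mapping was applied is to list the distinct authors git now knows about. This is only a sketch; list_commit_authors is a helper name I made up, and it assumes you run it inside the freshly cloned repository (or pass its path).

```shell
# List every distinct "Name <email>" pair found in the commit history.
# Optional first argument: path to the repository (defaults to the current dir).
list_commit_authors() {
    git -C "${1:-.}" log --all --format='%an <%ae>' | sort -u
}

# Usage, from inside the clone:
#   list_commit_authors
```

If a bare SVN user name shows up here instead of a full name and email, the mapping file was probably missing that author.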
If you look for tags or branches now you will not find them. Don’t worry we need some post-processing for these.
Tags and Branches
The git-svn tool creates a remote tracking branch for each tag it encounters on the Subversion repository because on SVN a tag is a folder under the tags folder of the standard layout.
This script enumerates the remote references, creates a tag for each and removes the branch reference.
git for-each-ref refs/remotes/tags | cut -d / -f 4- | grep -v @ | while read tagname; do git tag "$tagname" "tags/$tagname"; git branch -r -d "tags/$tagname"; done
Now you will have tags show up with
git tag -l
Branches get migrated the same way.
git for-each-ref refs/remotes | cut -d / -f 3- | grep -v @ | while read branchname; do git branch "$branchname" "refs/remotes/$branchname"; git branch -r -d "$branchname"; done
You should still execute this second step even if you have no branches. “trunk” is also a remote branch and this removes it and replaces it with a local branch.
So at this point I saw both a master and a trunk local branch.
Since both the trunk and the master branch point to the same commit, we can get rid of the obsolete trunk branch.
git branch -d trunk
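If you would rather not trust your eyes here, the comparison can be scripted. The following is a sketch under the assumption that both local branches are called master and trunk, as in my case; drop_trunk_if_same is a made-up helper name.

```shell
# Delete the leftover trunk branch only if it points at the same commit as master.
drop_trunk_if_same() {
    if [ "$(git rev-parse master)" = "$(git rev-parse trunk)" ]; then
        git branch -d trunk
    else
        echo "trunk and master differ; keeping trunk" >&2
        return 1
    fi
}

# Usage, from inside the migrated repository:
#   drop_trunk_if_same
```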
Now we are ready for the repository to be pushed to its new home.
A New Home
So the next step is to create a repository. On GitLab those are called projects. Projects always belong to a namespace, which can be Global, a specific user, or a Group. In GitLab nomenclature Groups group Projects, not Users. Users are grouped in Teams.
The namespace becomes part of the project URL, so you can have multiple projects with the same name as long as they belong to different namespaces.
Once you have created the repository GitLab shows you nicely how to add the remote to a repo.
We want all branches and tags pushed to the repo, so we do:
git push origin --all
git push origin --tags
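The two pushes can also be wrapped together with a quick check of what the server ended up with. This is a sketch that assumes the remote is named origin, as in GitLab's setup instructions; push_everything is a helper name of my own.

```shell
# Push all branches and all tags, then list the heads and tags the remote now holds.
# First argument: remote name (defaults to "origin").
push_everything() {
    remote="${1:-origin}"
    git push "$remote" --all &&
    git push "$remote" --tags &&
    git ls-remote --heads --tags "$remote"
}

# Usage, from inside the migrated repository:
#   push_everything origin
```

The final listing should contain a refs/heads entry for every branch and a refs/tags entry for every tag you expect.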
To confirm that everything really arrived at its new SCM server we can explore the project page, which now shows the repo contents instead of the setup instructions. We see our master branch; I didn’t have any other in this example.
Also the release tag has arrived.
At this time we permit ourselves a big sigh of relief. The next step is to migrate something larger that uses svn externals.
The next project to migrate will be DTRichTextEditor, which has several external dependencies: DTCoreText and DTWebArchive are open source projects on GitHub which until now I kept a full copy of in the svn repository. Those will turn into git references. The third dependency is the previously migrated DTLoupe, which was a so-called SVN External, aka submodule. This will also become a git submodule reference, but into my own repository.
After cloning the DTRichTextEditor project with git-svn there was no trace of the svn external. So all I needed to do was to add a new git submodule for it. At the repository root:
git submodule add firstname.lastname@example.org:parts/dtloupe.git Core/Externals/DTLoupe
This clones the above created repository into the specified subfolder. I had to point the reference to the xcodeproj to the new path in the Xcode project since the folder name changed.
For DTCoreText and DTWebArchive I still had the full copies below Core/Externals, so these had to be removed first. The final git clean removes the leftover untracked folders.
git rm -r Core/Externals/DTCoreText/
git rm -r Core/Externals/DTWebArchive/
git clean -df
Now we can add the submodules in their stead:
git submodule add https://github.com/Cocoanetics/DTCoreText.git Core/Externals/DTCoreText
git submodule add https://github.com/Cocoanetics/DTWebArchive.git Core/Externals/DTWebArchive
There is still one submodule missing because it is not referenced directly, but indirectly. DTCoreText needs DTFoundation and has it as a submodule.
cd Core/Externals/DTCoreText
git submodule init
git submodule update
The init sets up some internal reference and the update then clones a copy of DTFoundation.
The above can also be achieved with a one-liner (thanks James Munro and Fabio Gallonetto):
git submodule update --init --recursive
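To see whether any nested submodule is still missing, you can look at the submodule status, where git prefixes entries that are registered but not checked out with a minus sign. A small sketch; uninitialized_submodules is a made-up helper name.

```shell
# List submodule paths that are registered but not yet checked out.
# "git submodule status" marks such entries with a leading "-".
uninitialized_submodules() {
    git submodule status --recursive | awk '/^-/ { print $2 }'
}

# Usage, from the repository root:
#   uninitialized_submodules
```

An empty result means every submodule, including indirect ones that git has already discovered, has a working copy.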
I had to go into the Xcode project and make sure that all targets are still building. Some tweaking was necessary.
Since we don’t want to check in all sorts of Xcode meta info, it is wise to add a .gitignore file.
.DS_Store
build
*.mode1v3
*.pbxuser
project.xcworkspace
xcuserdata
.svn
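To double-check that the ignore rules do what you expect, git can tell you which rule matches a given path. A sketch, assuming the .gitignore sits at the repository root; show_ignored is a helper name I made up and the example paths are hypothetical.

```shell
# Show which ignore rule (file, line number, pattern) matches each given path.
# Exits non-zero if none of the paths is ignored. Paths need not exist.
show_ignored() {
    git check-ignore -v "$@"
}

# Usage, from the repository root:
#   show_ignored .DS_Store MyUser.pbxuser
```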
Having done all these steps we want to see the cleanup commit like this:
git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#	new file:   .gitignore
#	modified:   .gitmodules
#	new file:   Core/Externals/DTCoreText
#	new file:   Core/Externals/DTWebArchive
#
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#	modified:   DTRichTextEditor.xcodeproj/project.pbxproj
#
So we commit these changes and then push everything to the newly created git repo like before.
At this stage we have successfully migrated the project to our git server. All that remains now is to point the Jenkins CI to the new repository and then start adding users to give them access to it.
GitLab differs from GitHub slightly when it comes to working with users contributing changes. On GitHub the process involves forking a repo into your own workspace, modifying the fork and submitting a pull request to the origin maintainer. On GitLab, at the time of writing, there is no forking. Instead users push their changes to branches on the master repo and these branches can then be merged into master.
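On the command line, the branch-based flow described above could look roughly like this. It is a sketch only: the branch name in the usage line is hypothetical, publish_branch is a made-up helper, and the merge itself would still happen afterwards on the server or locally.

```shell
# Create a local branch from the current commit and publish it to the shared repo,
# setting up tracking so later pushes and pulls need no extra arguments.
publish_branch() {
    git checkout -b "$1" &&
    git push -u origin "$1"
}

# Usage, from inside a clone of the project:
#   publish_branch fix-loupe-crash
```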
There is also a “Network” view on GitLab that shows the branching history on a graph. Seeing this totally flat tells a tale of branching and merging not being very much fun on Subversion.
I bet we’ll be doing much more branching now that we are on git, if only to make this chart a little more interesting.