====== A Simplistic Comparison of Distributed Revision Control Systems by Example ======

Lately I've been wasting a lot of time reading articles about Distributed Revision Control Systems, trying to figure out which one is right for me. After reading dozens of diverging and/or outdated opinions, hateful rants and linus-fanboy-loveletters on the topic, I finally gave up and decided to find //the right one//(tm) by myself.

In this article I've recorded the most important properties of the different DRCSs which helped me decide on one, so others can draw their own conclusions from my findings. The properties I'm discussing are of course just a select few; however, they should be typical enough to draw conclusions about the general behavior of the different systems.

The attributes of the greatest importance to me were:

  - Ease-of-Use \\ I don't want to bother with anything that makes getting the job done more annoying than it has to be or is overly unintuitive to use.
  - Good Documentation \\ Without good documentation, learning something new can be a pain.
  - Portability \\ While I personally am happy as long as it runs on Linux, it is rather likely that I will want to develop software with others (e.g. Windows users). Having an RCS that is equally at home in both worlds suddenly sounds like a good idea.
  - Extensibility / Plugin-Systems \\ If it lacks a feature, I want to be able to hack it till it does what I want, or easily use the work of somebody else who had the same problem.
  - Performance \\ While I don't really care if one DRCS is 10% faster or slower than the other, things should stay reasonable.

The DRCSs I tested:

  * [[http://git.or.cz/|Git]] 1.5.4.3
  * [[http://www.selenic.com/mercurial/wiki/|Mercurial]] 0.9.5
  * [[http://bazaar-vcs.org/|Bazaar]] 1.2.0.candidate.1

I left out [[http://darcs.net/|darcs]] because it appears to have some serious performance issues (at least I've read so multiple times; however this information //might// be outdated or the issue //might// get fixed in the future) and three systems are already more than enough for a side-by-side comparison. Also, I don't have a clue about Haskell but quite some experience with the languages the others are written in (Git: C+Perl+sh, Mercurial: Python+C, Bazaar: Python), which counts as a lack of "Extensibility" on my personal attribute list.

I also left out [[http://monotone.ca/|monotone]] because apparently nobody is using it and the syntax looked rather cumbersome to me.

I will be using the following example pseudo project to demonstrate the differences between the DRCSs.

  >phil@straylight:~/tmp/project % ls
  bar.py  bar.pyc  foo.py

''bar.pyc'' contains Python bytecode and the .py files contain normal Python source code.

''foo.py'':

  #! /usr/bin/env python
  import bar
  bar.hello()

''bar.py'':

  def hello():
      print "hello"

===== Documentation =====

  * Git \\ http://git.or.cz/gitwiki/GitDocumentation
  * Mercurial \\ http://www.selenic.com/mercurial/wiki/
  * Bazaar \\ http://bazaar-vcs.org/Documentation

Against my expectations the documentation of all projects was pretty good; even that of Git, which I've read rather negative things about. I especially liked the explanation of the different possible [[http://bazaar-vcs.org/Workflows|workflows]] from the Bazaar project (Note: most of these are also possible with Mercurial).

===== Portability =====

Mercurial and Bazaar run wherever you want; Git, however, still has issues with that. Officially Git currently runs on Windows only with [[http://www.cygwin.com/|cygwin]], which is rather annoying to install for a single program.
Luckily there is a fork of Git which is compilable [[http://code.google.com/p/msysgit/|using MinGW]] and should soon be merged into the official Git tree. This should solve most issues except for the less-than-great performance on Windows (I haven't benchmarked this myself, but the consensus appears to be that Git was written using functions and system calls which are fast on Linux, but not on Windows).

Also, all tested systems have more or less advanced TortoiseCVS-clones for Windows, called [[http://repo.or.cz/w/git-cheetah.git/|git-cheetah]], [[http://tortoisehg.sourceforge.net/|TortoiseHg]] and [[http://bazaar-vcs.org/TortoiseBzr|TortoiseBZR]]; so getting Windows users with a dislike for command lines to use these should be a non-issue.

===== Extensibility =====

Both [[http://www.selenic.com/mercurial/wiki/index.cgi/UsingExtensions|Mercurial]] and [[http://bazaar-vcs.org/BzrPlugins|Bazaar]] support plugins. Git doesn't.

===== Performance =====

I didn't perform any benchmarking myself because I don't expect any noticeable performance differences on the projects I'm likely to work on. However, Git is supposed to be the fastest (as long as it runs on Linux at least) and Bazaar the slowest.

===== Ease of Use =====

Here I will take a look at the most common operations and how they manage to annoy me.

==== Getting started ====

In this section I will create a repository, add the project's files to it and make some minor changes to the code (using vim).

=== Git ===

  >phil@straylight:~/tmp/git % cp ../project/* .
  >phil@straylight:~/tmp/git % git init
  Initialized empty Git repository in .git/
  >phil@straylight:~/tmp/git % git add .
  >phil@straylight:~/tmp/git % git commit
  Created initial commit 3fdb29b: initial commit
   3 files changed, 7 insertions(+), 0 deletions(-)
   create mode 100644 bar.py
   create mode 100644 bar.pyc
   create mode 100644 foo.py
  >phil@straylight:~/tmp/git % vim bar.py
  >phil@straylight:~/tmp/git % git commit
  # On branch master
  # Changed but not updated:
  #   (use "git add ..." to update what will be committed)
  #
  #       modified:   bar.py
  #
  no changes added to commit (use "git add" and/or "git commit -a")
  >phil@straylight:~/tmp/git % git commit -a
  Created commit 006f2f7: 2nd commit
   1 files changed, 1 insertions(+), 1 deletions(-)

This is the point where most people normally start wondering why ''git commit'' doesn't do what they expect it to do. This is because "//Git tracks content not files//" or, to quote the explanation from the [[http://git.or.cz/gitwiki/GitDocumentation|Git tutorial]]:
> Many revision control systems provide an "add" command that tells the system to start tracking changes to a new file. Git's "add" command does something simpler and more powerful: git add is used both for new and newly modified files, and in both cases it takes a snapshot of the given files and stages that content in the index, ready for inclusion in the next commit.

So once I've changed a file and want to ''commit'' it, I've got to ''add'' it again first or explicitly call ''commit'' with the ''-a'' option? Personally I think this is just plain annoying instead of "//simpler and more powerful//". But to be honest I just don't get the use-case here; it must have something to do with the special circumstances of ultra-hierarchical kernel development or something.

**UPDATE:** [[http://reddit.com/info/6ay28/comments/c03cwvy|This thread]] on reddit explains a possible use-case; while I don't see why this should be the default behavior, it is at least an explanation.

=== Mercurial ===

  >phil@straylight:~/tmp/mercurial % cp ../project/* .
  >phil@straylight:~/tmp/mercurial % hg init
  >phil@straylight:~/tmp/mercurial % hg add
  adding bar.py
  adding bar.pyc
  adding foo.py
  >phil@straylight:~/tmp/mercurial % hg ci
  No username found, using 'phil@straylight' instead
  >phil@straylight:~/tmp/mercurial % vim bar.py
  >phil@straylight:~/tmp/mercurial % hg ci
  No username found, using 'phil@straylight' instead

No surprises here.

=== Bazaar ===

  >phil@straylight:~/tmp/bzr % cp ../project/* .
  >phil@straylight:~/tmp/bzr % bzr init
  >phil@straylight:~/tmp/bzr % bzr add
  added bar.py
  added foo.py
  ignored 1 file(s).
  If you wish to add some of these files, please add them by name.
  >phil@straylight:~/tmp/bzr % bzr ci
  Committing to: /home/phil/tmp/bzr/
  added bar.py
  added foo.py
  Committed revision 1.
  >phil@straylight:~/tmp/bzr % vim bar.py
  >phil@straylight:~/tmp/bzr % bzr ci
  Committing to: /home/phil/tmp/bzr/
  modified bar.py
  Committed revision 2.

''bar.pyc'' is ignored by default -- nice.

==== Branching ====

In this section I will create an additional branch called ''foo'' in the current repository, make some changes in it and then merge it into the main branch.

=== Git ===

  >phil@straylight:~/tmp/git % git branch foo
  >phil@straylight:~/tmp/git % git branch
    foo
  * master
  >phil@straylight:~/tmp/git % git checkout foo
  Switched to branch "foo"
  >phil@straylight:~/tmp/git % git branch
  * foo
    master
  >phil@straylight:~/tmp/git % vim bar.py
  >phil@straylight:~/tmp/git % git commit -a
  Created commit 03de70a: 1st exp commit
   1 files changed, 1 insertions(+), 1 deletions(-)
  >phil@straylight:~/tmp/git % git checkout master
  Switched to branch "master"
  >phil@straylight:~/tmp/git % git merge foo
  Updating 006f2f7..03de70a
  Fast forward
   bar.py |    2 +-
   1 files changed, 1 insertions(+), 1 deletions(-)

=== Mercurial ===

  >phil@straylight:~/tmp/mercurial % hg branch foo
  marked working directory as branch foo
  >phil@straylight:~/tmp/mercurial % hg ci
  No username found, using 'phil@straylight' instead
  >phil@straylight:~/tmp/mercurial % hg branches
  foo                            1:84125d4a839f
  default                        0:7da72b11a288 (inactive)
  >phil@straylight:~/tmp/mercurial % vim bar.py
  >phil@straylight:~/tmp/mercurial % hg ci
  No username found, using 'phil@straylight' instead
  >phil@straylight:~/tmp/mercurial % hg up default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  >phil@straylight:~/tmp/mercurial % hg merge foo
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  (branch merge, don't forget to commit)
  >phil@straylight:~/tmp/mercurial % hg ci
  No username found, using 'phil@straylight' instead

=== Bazaar ===

Bazaar doesn't support inline branching. However, lacking this feature, you can still just put your branches in their own repositories. While this is a somewhat less cool solution than real in-repository branching, it's IMHO easier to get into and more intuitive to use. On the downside, this makes it harder to work on multiple branches with others.

==== Development with a Central Server ====

This is the most important section for me because this is what people spend most of their time with and where they will run into the most problems.
To demonstrate this I will first pull non-conflicting changes from a remote repository and then pull conflicting changes and merge them into the local repository.

=== Git ===

  >phil@straylight:~/tmp/git % git remote add remote-rep ssh://localhost/~/tmp/git-remote
  >phil@straylight:~/tmp/git % git pull remote-rep master
  Updating 03de70a..d6745a4
  Fast forward
   bar.py  |    4 +++-
   bar.pyc |  Bin 214 -> 284 bytes
   2 files changed, 3 insertions(+), 1 deletions(-)

While it is possible to create aliases for repositories from the commandline, you've got to edit ''.git/config'' by hand to specify a default repository.

  >phil@straylight:~/tmp/git % git pull remote-rep master
  remote: Counting objects: 5, done.
  remote: Compressing objects: 100% (3/3), done.
  remote: Total 3 (delta 0), reused 0 (delta 0)
  Unpacking objects: 100% (3/3), done.
  Auto-merged bar.py
  CONFLICT (content): Merge conflict in bar.py
  Automatic merge failed; fix conflicts and then commit the result.

This leaves ''bar.py'' looking like this:

  def hello():
  <<<<<<< HEAD:bar.py
      print "hello local!"
  =======
      print "hello remote!"
  >>>>>>> 3cb2676a226b4399172b97367f8e791fc7a11c9a:bar.py

Alternatively you can use ''git mergetool'' to choose between different graphical merging tools (the default being xxdiff...). IMHO this is rather cumbersome for something that should be the default behavior.
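
For what it's worth, the conflict markers themselves are simple enough to handle with a few lines of script. The following is only a rough sketch in Python 3 (the example project itself is Python 2); ''resolve_conflicts'' and ''SAMPLE'' are names made up for illustration, and this is not what ''git mergetool'' does internally. It assumes the plain two-sided marker layout shown in the conflicted ''bar.py'' and simply keeps one side of every conflict hunk.

```python
# Sketch only: keep one side of every conflict hunk in a conflicted file.
# Assumes the two-sided "<<<<<<<" / "=======" / ">>>>>>>" marker layout.

# Hypothetical sample, copied from the conflicted bar.py.
SAMPLE = (
    'def hello():\n'
    '<<<<<<< HEAD:bar.py\n'
    '    print "hello local!"\n'
    '=======\n'
    '    print "hello remote!"\n'
    '>>>>>>> 3cb2676a226b4399172b97367f8e791fc7a11c9a:bar.py\n'
)


def resolve_conflicts(text, keep="ours"):
    """Return text with each conflict hunk replaced by one of its sides.

    keep="ours" keeps the local side (between <<<<<<< and =======),
    keep="theirs" keeps the pulled side (between ======= and >>>>>>>).
    """
    if keep not in ("ours", "theirs"):
        raise ValueError("keep must be 'ours' or 'theirs'")

    kept_lines = []
    side = None  # None: outside a hunk; otherwise "ours" or "theirs"
    for line in text.splitlines(keepends=True):
        if side is None and line.startswith("<<<<<<< "):
            side = "ours"      # start of the local side
        elif side == "ours" and line.startswith("======="):
            side = "theirs"    # separator: the pulled side follows
        elif side == "theirs" and line.startswith(">>>>>>> "):
            side = None        # end of the hunk
        elif side is None or side == keep:
            kept_lines.append(line)
    return "".join(kept_lines)
```

Calling ''resolve_conflicts(SAMPLE, keep="theirs")'' would keep the remote line instead of the local one. Blindly picking one side is rarely what you want for a real conflict, which is why an editor or a merge tool is usually the better choice.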

  >phil@straylight:~/tmp/git % git commit -a
  Created commit 08e16e1: Merge branch 'master' of ssh://localhost/~/tmp/git-remote

=== Mercurial ===

  >phil@straylight:~/tmp/mercurial % hg pull ssh://localhost/tmp/mercurial-remote/
  pulling from ssh://localhost/tmp/mercurial-remote/
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files
  (run 'hg update' to get a working copy)
  >phil@straylight:~/tmp/mercurial % hg update
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved

Again you've got to edit a configuration file by hand to specify the default repository (''.hg/hgrc'' in this case).

  >phil@straylight:~/tmp/mercurial % hg pull ssh://localhost/tmp/mercurial-remote/
  pulling from ssh://localhost/tmp/mercurial-remote/
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files (+1 heads)
  (run 'hg heads' to see heads, 'hg merge' to merge)
  >phil@straylight:~/tmp/mercurial % hg merge
  merging bar.py

What follows is the nicest **default** behavior I've seen so far: the graphical diff/merge tool [[http://meld.sourceforge.net|meld]] starts and you can conveniently resolve any conflicts with it.

{{ blog:dscm-meld.png?580 | merging with meld}}

  0 files updated, 1 files merged, 0 files removed, 0 files unresolved
  (branch merge, don't forget to commit)
  >phil@straylight:~/tmp/mercurial % hg ci
  No username found, using 'phil@straylight' instead

=== Bazaar ===

  >phil@straylight:~/tmp/bzr % bzr pull --remember bzr+ssh://localhost/home/phil/tmp/bzr-remote/
   M  bar.py
  All changes applied successfully.
  Now on revision 2.

''%%--%%remember'' specifies a default repository -- nice and simple.

  >phil@straylight:~/tmp/bzr % bzr merge
  Merging from remembered location bzr+ssh://localhost/home/phil/tmp/bzr-remote/
   M  bar.py
  Text conflict in bar.py
  1 conflicts encountered.
  >phil@straylight:~/tmp/bzr % ls
  bar.py  bar.py.BASE  bar.pyc  bar.py.OTHER  bar.py.THIS  foo.py

Which leaves ''bar.py'' like this:

  def hello():
  <<<<<<< TREE
      print "hello local!"
  =======
      print "hello remote!"
  >>>>>>> MERGE-SOURCE

...and also creates the following files:

''bar.py.BASE'':

  def hello():
      print "hello"

''bar.py.OTHER'':

  def hello():
      print "hello remote!"

''bar.py.THIS'':

  def hello():
      print "hello local!"

If you've got the [[http://erik.bagfors.nu/bzr-plugins/extmerge/index.html|extmerge]]-plugin installed you can also use ''bzr extmerge %%--%%all'' to resolve the conflict using your favorite graphical mergetool.

  >phil@straylight:~/tmp/bzr % bzr resolve
  All conflicts resolved.
  >phil@straylight:~/tmp/bzr % ls
  bar.py  bar.pyc  foo.py
  >phil@straylight:~/tmp/bzr % bzr ci
  Committing to: /home/phil/tmp/bzr/
  modified bar.py
  Committed revision 3.

===== Conclusion =====

Personally I prefer Bazaar because it's easy to use, has the features I need plus a plugin system, and generally just isn't in the way when I want to get something done.

The second place goes to Mercurial. Overall it may be more complex, but it's still a nice system that tries not to get in your way too much. The ability to use inline branching is also quite nice if you want to share multiple branches with others.

The last place goes to Git. It might have many nice features, but the overall lack of concern for usability, the unnecessary exposure of internals, the lack of a plugin system and many odd choices (''commit -a'', using SHA1-hashes as the only revision ID, ...) ruined it for me; but maybe my usual workflow just differs too much from everything Git was ever intended for. Still, Git has its good sides and might be the right choice for you, since apparently many people are happy with it (assuming they are not all just fanboys who follow the hype blindly).

Disagreeing with me? Drawing a different conclusion? Just feel like flaming me? Please feel free to comment below.

{{tag>software_development DRCS}}

~~LINKBACK~~
~~DISCUSSION~~