Directory renaming in SCM
SCM stands for Source Code Management. Pretty much the same thing is also called VCS, Version Control Software. There are probably even more TLAs out in the wild. It all boils down to a program that lets programmers manage their source code.
Pretty much everybody who started using an SCM started with CVS and then moved on to something else, probably Subversion, which is meant to be a CVS replacement. For more adventurous or demanding developers, there are many other SCMs: Git, Bazaar, Monotone, Mercurial, Darcs… and more.
Mark Shuttleworth has made an interesting point: renaming files and directories is one of the most important operations an SCM has to handle. I got curious and wrote a test case for the three SCMs I know: Bazaar, Git and Subversion. The scenario is:
1. A project is created, with one directory and one file in it.
`-- project
    `-- dir-a
        `-- foo.txt
2. Developer A creates a branch and renames the directory.
`-- project
    `-- dir-b
        `-- foo.txt
3. At the same time, developer B creates a branch, modifies the file
and adds another file to the directory.
`-- project
    `-- dir-a
        |-- bar.txt
        `-- foo.txt
4. Developer C creates a branch, merges A’s changes first and then B’s
changes. What I would expect after the two merges is one directory
with two files in it.
`-- project
    `-- dir-b
        |-- bar.txt
        `-- foo.txt
Of the tested SCMs, only Bazaar and Darcs got it right. Git left two directories (dir-a and dir-b), and Subversion discarded the bar.txt file altogether.
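To make the difference between the expected result and the two-directory result concrete, here is a toy Python model of the scenario. It is only a sketch under loud assumptions: it is not how Bazaar, Git or any other SCM is actually implemented, and every name in it (merge, remap, follow_dir_renames, the operation tuples) is made up for illustration. A branch is modelled as a list of operations on top of the base tree. The only thing that varies is whether a newly added file is moved along with a directory rename made on the other branch.

```python
# Toy model only: NOT the algorithm of any real SCM. All names are hypothetical.

BASE = {"dir-a/foo.txt": "original"}

# Developer A renames the directory.
BRANCH_A = [("rename_dir", "dir-a", "dir-b")]

# Developer B modifies foo.txt and adds bar.txt in the old directory.
BRANCH_B = [
    ("modify", "dir-a/foo.txt", "changed by B"),
    ("add", "dir-a/bar.txt", "new file from B"),
]


def remap(path, dir_renames):
    """Rewrite a path according to the recorded directory renames."""
    for old, new in dir_renames.items():
        if path.startswith(old + "/"):
            return new + path[len(old):]
    return path


def merge(base, first, second, follow_dir_renames):
    """Apply two branches' operations to the base tree, first then second."""
    tree = dict(base)
    dir_renames = {}   # old directory -> new directory
    file_renames = {}  # old file path -> new file path
    for op, *args in first + second:
        if op == "rename_dir":
            old, new = args
            dir_renames[old] = new
            for path in list(tree):
                if path.startswith(old + "/"):
                    moved = new + path[len(old):]
                    tree[moved] = tree.pop(path)
                    file_renames[path] = moved
        elif op == "modify":
            path, content = args
            # Files that already existed are followed to their new path
            # in both variants of the model.
            tree[file_renames.get(path, path)] = content
        elif op == "add":
            path, content = args
            # Only the directory-aware variant knows where a new file belongs.
            if follow_dir_renames:
                path = remap(path, dir_renames)
            tree[path] = content
    return tree


def directories(tree):
    """Return the sorted top-level directories present in the tree."""
    return sorted({path.split("/")[0] for path in tree})


if __name__ == "__main__":
    print(directories(merge(BASE, BRANCH_A, BRANCH_B, True)))   # ['dir-b']
    print(directories(merge(BASE, BRANCH_A, BRANCH_B, False)))  # ['dir-a', 'dir-b']
```

With follow_dir_renames=True the model ends up with one directory holding both files, which is the result I would expect. With follow_dir_renames=False, bar.txt stays behind in dir-a, which looks like the two-directory outcome described above. The Subversion outcome (bar.txt lost) is not modelled here.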
I wrote test cases for the three SCMs I’m familiar with. If you feel like writing a test case for your SCM, I will be glad to see it! Meanwhile, you can download the test cases and run them:
http://code.blizinski.pl/scm-rename-test.tar.gz
They are written as Bash scripts and were tested under Linux. A README file is included.
UPDATE: Dennis Lambe added a Darcs test case. I am pleased to tell you that Darcs handles the directory rename correctly: after the merge, there is one directory with two files. Dennis’ script is now included in the tarball.
UPDATE: I’ve noticed a short conversation about this issue on the #git IRC channel. No conclusions were drawn.
UPDATE, 2007-09-10: Paul wrote a Mercurial test case, now included in the tar.gz archive.
June 7, 2007 at 11:12 pm
I wrote an SCM tester for darcs, in the style of the testers for svn, git, and bzr. You can download it from http://malsyned.net/files/darcs-test.sh
Feel free to include it on your page or modify it in any way.
June 24, 2007 at 10:26 pm
As for darcs, please don’t forget about its bugs, most importantly:
http://lists.osuosl.org/pipermail/darcs-users/2007-March/010877.html
(also referenced from Wikipedia)
… sooo… the reality is: darcs is not yet ready for production use (but I really REALLY do hope it will be).
August 21, 2007 at 2:07 pm
It seems that Mercurial can also pass this test. I modified the darcs-test.sh script to try it out. The only thing of note is that after both branch-a and branch-b are pulled, the two heads need to be merged, which went through without any issues.
September 10, 2007 at 4:47 am
Here’s a Mercurial version of the script: http://dpaste.com/19091/
January 1, 2008 at 5:53 am
m–s:
sooo… the reality is: darcs is not yet ready for production use (but I really REALLY do hope it will be)
Darcs 2 came out and largely alleviated those bugs. As it stands, only Windows compatibility is really a problem, and only if you hate Cygwin.
January 1, 2008 at 9:54 pm
Clarification — Darcs2 is _going_ to come out. It is currently out as a release candidate, but still undergoing testing and some development.
I use the current darcs on smaller projects all the time and enjoy it greatly. I’m looking forward to being able to recommend darcs2 for projects of all sizes.
January 1, 2008 at 11:13 pm
I realize this is old, but I went ahead and wrote a test for Monotone: http://www.codefu.org/people/darkness/mtn-test.sh.txt
Monotone 0.38 seems to do the right thing here. I also went out of my way to do extra work to more closely simulate multiple developers working off of multiple DBs, which is probably unnecessary (I think you could simulate this and see the same results using a single DB).
January 17, 2008 at 11:11 am
Darcs2 still has a ton of problems, including ridiculous RAM usage and O(something-large) algorithms.
I have a darcs repo with 1900-ish patches in it and I want to move to git. However, tailor/darcs2git/etc need to be able to perform a ‘darcs pull’ one patch at a time. Darcs2 (or darcs 1) cannot even complete pulling the first patch in the repo. I have a machine with 4GB RAM and darcs runs out of memory.
I cannot run annotate on files in my repo either, which is part of the reason I want to dump it - my history has become useless. And unless I can get my history into something else, it’s all gone forever. (or until someone can lend me a machine with much more RAM than I have!)
Darcs is not suitable for production use if you care about your history. Learn git/hg/bzr/monotone - they are all infinitely more reliable, albeit not quite as simple to use.
Apologies for the rant.