Thursday, January 11, 2007

Importing SourceForge Subversion Projects into Git

At least for the initial import, it is faster to create a local mirror of the desired repository using rsync than to import over the network. Taking the Simplified Wrapper and Interface Generator project, SWIG, as an example:

rsync -avz rsync://swig.svn.sourceforge.net/svn/swig/* swig.svn
This made a copy of the Subversion SWIG repository in about 10 minutes.

(On CentOS 4, I had to upgrade svn before I could access this mirror. I installed the RHEL4 RPMs linked from the Subversion download page. I needed the apr, apr-util, subversion and subversion-perl packages.)

Then I was able to import this into a git repository using:

git-svnimport -C swig.git \
file://`pwd`/swig.svn
This took about 13 minutes.

After importing a new repository, repacking is recommended:

cd swig.git
git repack -a -d
This reduces the amount of space the repository requires. The savings can be significant on a large or active repository.
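
The three steps so far (mirror, import, repack) can be collected into one small script. This is only a sketch: it assumes rsync and git-svnimport are installed, as they were when this was written. The `DRY_RUN` switch and the `run`, `run_in` and `import_swig` helpers are names made up for this example. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the mirror/import/repack steps for SWIG.
# DRY_RUN is a hypothetical switch: 1 (default) prints commands, 0 runs them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Same as run, but inside a directory (a subshell keeps the caller's cwd).
run_in() {
    dir="$1"
    shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ (cd $dir && $*)"
    else
        (cd "$dir" && "$@")
    fi
}

import_swig() {
    # 1. Mirror the Subversion repository locally. The URL is quoted so the
    #    trailing * reaches rsync instead of being globbed by the local shell.
    run rsync -avz "rsync://swig.svn.sourceforge.net/svn/swig/*" swig.svn
    # 2. Import the local mirror into a new git repository.
    run git-svnimport -C swig.git "file://$(pwd)/swig.svn"
    # 3. Repack to reduce the space the new repository requires.
    run_in swig.git git repack -a -d
}

import_swig
```

Running it as-is lists the three commands; running it with `DRY_RUN=0` in the environment would actually execute them.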

Future updates can then be done fairly efficiently from within the swig.git subdirectory with:

git-svnimport -C . https://svn.sourceforge.net/svnroot/swig
Because it is no longer needed, the swig.svn directory can now be deleted.

Just some statistics
While the rsync'd Subversion repository took up 238MiB, the git repository used only 63MiB, most of which is the working copy of the code. The actual repository, swig.git/.git, occupied only 25MiB (roughly a tenth of what Subversion required).
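
For anyone who wants to check the proportions, here is the arithmetic as a short shell sketch. The sizes are the figures quoted above; `percent_of` is a helper name invented for this example, and it truncates to whole percent.

```shell
#!/bin/sh
# Sizes in MiB, as reported above.
svn_mirror=238      # rsync'd Subversion repository
git_total=63        # swig.git including the working copy
git_repo=25         # swig.git/.git only

# percent_of PART WHOLE: integer percentage, truncated (hypothetical helper).
percent_of() {
    echo $(( $1 * 100 / $2 ))
}

echo "working copy:      $(( git_total - git_repo )) MiB"                 # prints 38
echo "swig.git vs svn:   $(percent_of "$git_total" "$svn_mirror")%"       # prints 26
echo ".git vs svn:       $(percent_of "$git_repo" "$svn_mirror")%"        # prints 10
```

To take the same measurements on another repository, something like `du -sk swig.svn swig.git swig.git/.git` reports each size in KiB.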

Alternatives
While I could have run git-svnimport directly against the remote repository, this would have taken about 32 hours. Even using svm or SVN::Mirror in place of rsync would have required around 10 hours.

If you can use rsync or another tool to make a local copy of the repository, the import will run much more quickly.
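
To put the difference in perspective, the timings quoted in this post can be compared with a little shell arithmetic. This is a rough sketch: it takes the 32 hour and 10 hour figures at face value as totals, and `speedup` is a helper name invented for this example (integer division, so the ratios are truncated).

```shell
#!/bin/sh
# All times in minutes, taken from the figures in this post.
minutes_local=$(( 10 + 13 ))    # rsync mirror + local git-svnimport
minutes_direct=$(( 32 * 60 ))   # git-svnimport against the remote repository
minutes_svm=$(( 10 * 60 ))      # svm or SVN::Mirror in place of rsync

# speedup SLOW FAST: how many times longer SLOW takes (hypothetical helper).
speedup() {
    echo $(( $1 / $2 ))
}

echo "rsync route:   ${minutes_local} min"
echo "direct import: ${minutes_direct} min, about $(speedup "$minutes_direct" "$minutes_local")x longer"
echo "svm route:     ${minutes_svm} min, about $(speedup "$minutes_svm" "$minutes_local")x longer"
```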
