The Gentoo Linux Haskell Project recently moved its repositories to git, hosted on GitHub.

We've got various repositories, some larger with several contributors and some smaller with only one developer. Until now, all of them used the darcs revision control system.

You may browse the new repos at our new home.

Overlay repository

The overlay repository is the heart of the Gentoo Linux Haskell Project. The most common packages are available in the portage tree, and thus available to all Gentoo Linux users without any additional configuration on their part.

For all other packages we use the overlay. These are typically packages that change rapidly, are tricky to build, and so on.

It's our main repo, with ~4000 commits from many users. There were two tools to consider: darcs-to-git and darcs-fastconvert.

We decided to try both.

darcs-to-git

mkdir overlay.git && cd overlay.git
darcs-to-git ../overlay
git commit --allow-empty -m "phony" # hack, described later
darcs-to-git ../overlay

The hack with --allow-empty is used to work around an error:

Running: ["git", "log", "-n1", "--no-color"]
fatal: bad default revision 'HEAD'

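git does not record commits that only create directories (when no files are affected), and our very first darcs patch is exactly such a commit. As a result the repository still has no HEAD when darcs-to-git runs git log. To be reported upstream.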
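The situation can be reproduced without darcs at all. The following is a minimal sketch, assuming only that git is installed; the function name head_demo and the identity values are made up for the demo. It shows that git log fails in a repository without commits and succeeds once an empty commit exists.

```shell
# Sketch: HEAD only resolves once a repository has at least one commit.
# head_demo and the "Demo" identity are placeholders, not part of darcs-to-git.
head_demo() (
    repo=$(mktemp -d) || exit 1
    cd "$repo" || exit 1
    git init -q .

    # A brand-new repository has no commits, so git log cannot resolve HEAD.
    if git log -n1 --no-color >/dev/null 2>&1; then
        echo "before: HEAD exists"
    else
        echo "before: no HEAD"
    fi

    # An empty commit is enough to create HEAD.
    git -c user.name="Demo" -c user.email="demo@example.org" \
        commit -q --allow-empty -m "phony"

    if git log -n1 --no-color >/dev/null 2>&1; then
        echo "after: HEAD exists"
    else
        echo "after: no HEAD"
    fi

    # Clean up the temporary repository.
    cd / && rm -rf "$repo"
)

head_demo
```

The exact wording of the error differs between git versions, so the sketch only checks the exit status.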

darcs-to-git took 7.5 hours(!) to convert our repo.

darcs-fastconvert

mkdir overlay.git && cd overlay.git
git init # git fast-import needs an existing repository
(cd ../overlay ; darcs-fastconvert export) | git fast-import

It was very fast! It took less than 7 minutes to convert everything (~60 times faster than darcs-to-git!)
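For readers who have not used the pipeline before: git fast-import reads a plain-text stream describing commits on stdin, which is what the export step has to produce. Below is a small hand-written stream, not actual darcs-fastconvert output, fed into a throwaway repository. The function name fast_import_demo, the file README and the commit data are invented for illustration.

```shell
# Sketch: feed a minimal hand-written stream to git fast-import.
# All names and values here are illustrative, not darcs-fastconvert output.
fast_import_demo() (
    repo=$(mktemp -d) || exit 1
    cd "$repo" || exit 1
    git init -q .

    # "data N" announces exactly N bytes: "initial\n" is 8, "hello\n" is 6.
    git fast-import --quiet <<'EOF'
commit refs/heads/master
committer john <john@doe> 1300000000 +0000
data 8
initial
M 100644 inline README
data 6
hello

EOF

    # Print "author name|author email|subject" of the imported commit.
    git log -n1 --format='%an|%ae|%s' refs/heads/master

    # Clean up the temporary repository.
    cd / && rm -rf "$repo"
)

fast_import_demo
```

Because the stream names the author as "john <john@doe>", the call should print john|john@doe|initial. The committer line is also where a name such as 'john@doe <unknown>' would end up.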

some notes

  • darcs-fastconvert does not try to prettify usernames that consist only of an email address:

    username 'john@doe' becomes 'john@doe <unknown>'. A patch that converts such names to 'john <john@doe>' has been sent upstream (left copy here).

  • darcs-fastconvert does not filter out empty commits (directory-adding in darcs), so in order to get the same number of commits as with darcs-to-git you will need to run git filter-branch --prune-empty -f
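The username mapping from the first note can be sketched in a few lines of shell. This is not the patch sent upstream, only an illustration of the intended result; the function name normalize_author is hypothetical, and the fallback to '<unknown>' for names without any address is an assumption based on the behaviour described above.

```shell
# Sketch of the author mapping described above (not the upstream patch).
# normalize_author is a hypothetical helper name.
normalize_author() {
    author=$1
    case $author in
        *"<"*">"*)
            # Already in "Name <email>" form: keep as is.
            printf '%s\n' "$author"
            ;;
        *@*)
            # Email-only name: use the part before the first "@" as the name.
            printf '%s <%s>\n' "${author%%@*}" "$author"
            ;;
        *)
            # No address at all (assumed fallback).
            printf '%s <unknown>\n' "$author"
            ;;
    esac
}

normalize_author 'john@doe'    # prints: john <john@doe>
```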

hackport

hackport is the tool we use to generate Gentoo ebuilds from Hackage packages. It greatly simplifies the process and enables us to cover hundreds of packages with relatively few developers.

The hackport project started off as part of the overlay repository in 2005. At some point we decided that it deserved its own repository, as it really was a standalone project. The development was forked from the overlay repository, and continued without being mixed with the overlay commits. However, the result was a repository with a messy history: hackport and overlay changes were mixed. Moving to git gave us a new chance to clean it up.

When using git you have the option of changing the history of your repository. This is a powerful tool, but it should be used carefully. As we were in the middle of moving to git anyway, we took advantage of it. In general git requires a better understanding of the consequences of your actions than similar tools (darcs, mercurial).

For this job we used git filter-branch; see the git filter-branch documentation. As the projects were clearly separated, it was easy to tell git which files to drop:

git filter-branch --tree-filter \
      'rm -rf ignore-this-file and-this-directory' HEAD
git filter-branch --prune-empty -f

We repeated this until we had cleared the history of the overlay commits. The result is clean and only contains the hackport project. You can find it at hackport.

This way we could separate the ~400 hackport commits from the ~1100 commits that had nothing to do with the hackport project.
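To see the effect on a small scale, here is a sketch that builds a throwaway repository with mixed history and filters it the same way. The directory names overlay-stuff and hackport-stuff, the commit messages and the function name filter_demo are invented for the demo. Unlike the commands above, it passes --tree-filter and --prune-empty in a single run.

```shell
# Sketch: strip one directory from the whole history of a toy repository.
# All names are illustrative; this is not the real overlay or hackport repo.
filter_demo() (
    # Silences the deprecation notice in newer git; ignored by older versions.
    export FILTER_BRANCH_SQUELCH_WARNING=1
    repo=$(mktemp -d) || exit 1
    cd "$repo" || exit 1
    git init -q .
    git config user.name "Demo"
    git config user.email "demo@example.org"

    mkdir overlay-stuff hackport-stuff
    echo "readme" > hackport-stuff/README
    git add . && git commit -q -m "hackport: add README"
    echo "ebuild" > overlay-stuff/foo.ebuild
    git add . && git commit -q -m "overlay: add foo"
    echo "main = return ()" > hackport-stuff/Main.hs
    git add . && git commit -q -m "hackport: add Main"

    # Remove the directory from every commit, then drop commits left empty.
    git filter-branch -f --prune-empty \
        --tree-filter 'rm -rf overlay-stuff' HEAD >/dev/null 2>&1

    git log --format='%s'              # subjects of the remaining commits
    git ls-tree -r --name-only HEAD    # files left in the final tree

    # Clean up the temporary repository.
    cd / && rm -rf "$repo"
)

filter_demo
```

The overlay-only commit should disappear from the log, leaving just the two hackport commits and the files under hackport-stuff.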

keyword-stat

keyword-stat is a tool to help us see the status of packages regarding Gentoo’s concept of stable and testing status. Each Gentoo user is able to choose the stability level of each package through the keywording concept.

The repo was already in good shape and the conversion was straightforward:

mkdir keyword-stat.git && cd keyword-stat.git
git init
( cd ../keyword-stat ; darcs-fastconvert export ) \
    | git fast-import