The Gentoo Linux Haskell Project recently moved to git, hosted on GitHub.
We have various repositories: some larger, with several contributors, and some smaller, with only one developer. Until now, all of them used the darcs revision control system.
You may browse the new repos at our new home.
Overlay repository
The overlay repository is the heart of the Gentoo Linux Haskell Project. The most common packages are available in the portage tree, and thus available to all Gentoo Linux users without any additional configuration on their part.
For all other packages we use the overlay: packages that change rapidly, are tricky to build, and so on.
It’s our main repo, with ~4000 commits from many users. There were two tools to consider:
- darcs-fastconvert, written in Haskell
- darcs-to-git, written in Ruby
We decided to try both.
darcs-to-git
mkdir overlay.git && cd overlay.git
darcs-to-git ../overlay
git commit --allow-empty -m "phony" # hack, described later
darcs-to-git ../overlay
The hack with --allow-empty works around this error:
Running: ["git", "log", "-n1", "--no-color"]
fatal: bad default revision 'HEAD'
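git does not record commits that only create directories (no files are affected), and our first darcs patch is exactly that. The repository is therefore still without a commit, and without a HEAD, when darcs-to-git runs git log. To be reported upstream.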
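The failure can be reproduced without darcs at all. The following is a minimal sketch, assuming git is installed; the temporary directory and the demo identity are placeholders, not part of our setup. It shows that git log fails in a repository with no commits and succeeds once an empty commit exists. The exact wording of the error varies between git versions, so the script only checks the exit status.

```shell
#!/bin/sh
# Reproduce the missing-HEAD failure in a throwaway repository.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# No commits yet: HEAD cannot be resolved, so git log exits non-zero.
if git log -n1 --no-color >/dev/null 2>&1; then
    before=ok
else
    before=failed
fi

# An empty commit gives HEAD something to point at.
# Placeholder identity so the commit works without any global git config.
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "phony"

if git log -n1 --no-color >/dev/null 2>&1; then
    after=ok
else
    after=failed
fi

echo "before: $before, after: $after"
```

The script prints "before: failed, after: ok".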
darcs-to-git took 7.5 hours(!) to convert our repo.
darcs-fastconvert
mkdir overlay.git && cd overlay.git
git init
(cd ../overlay ; darcs-fastconvert export) | git fast-import
It was very fast! It took less than 7 minutes to convert everything. Compared with the 450 minutes (7.5 hours) that darcs-to-git needed, that is ~60 times faster!
Some notes
- darcs-fastconvert does not try to make email-only usernames prettier: the username ‘john@doe’ becomes ‘john@doe <unknown>’. A patch to convert such names to ‘john <john@doe>’ has been sent upstream (a copy is left here).
- darcs-fastconvert does not filter out empty commits (directory-adding patches in darcs), so in order to get the same number of commits as with darcs-to-git you will need to run git filter-branch --prune-empty -f
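The username rewrite from the first note can be sketched in plain shell. This is only an illustration of the mapping, not the patch we sent upstream: normalize_author is a hypothetical helper name, and leaving all other author strings unchanged is an assumption of this sketch.

```shell
#!/bin/sh
# Hypothetical helper, not part of darcs-fastconvert: turn an email-only
# darcs author into the "name <email>" form.
normalize_author() {
    case $1 in
        *'<'*'>'*) printf '%s\n' "$1" ;;                  # already "Name <email>"
        *@*)       printf '%s <%s>\n' "${1%%@*}" "$1" ;;  # email only: use the local part as name
        *)         printf '%s\n' "$1" ;;                  # anything else: left unchanged
    esac
}

normalize_author 'john@doe'              # prints: john <john@doe>
normalize_author 'John Doe <john@doe>'   # prints: John Doe <john@doe>
```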
hackport
hackport is the tool we use to generate Gentoo ebuilds from Hackage packages. It greatly simplifies the process and enables us to cover hundreds of packages with relatively few developers.
The hackport project started off as part of the overlay repository in 2005. At some point we decided that it deserved its own repository, as it really was a standalone project. The development was forked from the overlay repository, and continued without being mixed with the overlay commits. However, the result was a repository with a messy history: hackport and overlay stuff were mixed. Moving to git gave us a new chance to clean it up.
When using git you have the option of changing the history of your repository. This is a powerful tool, but it should be used carefully. As we were in the middle of the transition to git anyway, we took advantage of it. In general, git requires a better understanding of the consequences of your actions than similar tools (darcs, Mercurial).
For this job we used git filter-branch; see the filter-branch page in the git documentation. As the projects were clearly separated, it was easy to tell git which files to throw away:
git filter-branch --tree-filter \
'rm -rf ignore-this-file and-this-directory' HEAD
git filter-branch --prune-empty -f
We repeated this until we had cleared the overlay commits from the history. The result is clean and contains only the hackport project. You can find it at hackport.
This way we separated the ~400 hackport commits from the ~1100 commits that had nothing to do with the project.
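The effect of the two filter-branch steps is easiest to see on a tiny repository. The following is a minimal sketch, assuming git is installed: the directory names overlay-stuff and tool, the commit messages and the demo identity are made up for the demonstration. Both calls pass -f so they can be repeated, and FILTER_BRANCH_SQUELCH_WARNING only silences the notice that newer git versions print.

```shell
#!/bin/sh
# Throwaway repository with two unrelated trees mixed into one history.
# "overlay-stuff" and "tool" are made-up names for this demonstration.
export FILTER_BRANCH_SQUELCH_WARNING=1   # silence the notice newer git prints
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name demo
git config user.email demo@example.invalid

mkdir overlay-stuff tool
echo b > tool/b;          git add .; git commit -q -m "tool: add b"
echo a > overlay-stuff/a; git add .; git commit -q -m "overlay: add a"
echo d > tool/d;          git add .; git commit -q -m "tool: add d"
echo c > overlay-stuff/c; git add .; git commit -q -m "overlay: add c"

before=$(git rev-list --count HEAD)

# Step 1: drop the unwanted directory from every commit.
git filter-branch -f --tree-filter 'rm -rf overlay-stuff' HEAD >/dev/null 2>&1
# Step 2: remove the commits that no longer change anything.
git filter-branch -f --prune-empty HEAD >/dev/null 2>&1

after=$(git rev-list --count HEAD)
echo "commits before: $before, after: $after"
```

The script prints "commits before: 4, after: 2": only the two commits that touch tool/ survive.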
keyword-stat
keyword-stat is a tool that helps us see the status of packages with respect to Gentoo’s stable and testing levels. Each Gentoo user can choose the stability level of each package through keywording.
The repo was already in a nice state and the conversion was straightforward:
mkdir keyword-stat.git && cd keyword-stat.git
git init
( cd ../keyword-stat ; darcs-fastconvert export ) \
| git fast-import
3 comments
10 February, 2011 at 5:13 AM
Craig
Seeing as Gentoo is a Free Software project, and you guys clearly value software freedom… wouldn’t a move to gitorious (which is entirely Free Software, and available under AGPLv3) be more appropriate than github (which is proprietary)?
19 February, 2011 at 10:30 AM
Sergei Trofimovich
Our decision was to pick a service that is friendly to occasional contributors. Github is good enough for that: a lot of people know about it and it provides everything we need.
I don’t think the license of the service’s source code is relevant here. AFAIU AGPLv3 does not say anything about terms of service for content stored there.
I have never used gitorious, but judging by its home page it seems to provide a similar service. Anyway, in case something goes wrong with github we’ll consider gitorious as well.
20 February, 2011 at 12:57 AM
Craig
I know AGPLv3 doesn’t say anything about hosting services, my comment is more along the lines of consistency and supporting the Free Software world. Gentoo is all about Free Software, Haskell is Free Software, and the Gentoo packaging of Haskell is Free as well. Therefore, wouldn’t it make sense for this group of Free Software to be hosted on a Free Software system? It certainly doesn’t *have* to be, but wouldn’t it be cool if it was? I love it when Free Software supports Free Software (like Gentoo supporting Haskell by packaging it), and always find it a bit weird when Free Software supports proprietary software (like Gentoo’s Haskell project supporting GitHub), especially when a similarly-capable Free Software option is available (like Gitorious versus GitHub).
Thanks for your awesome work, btw. I don’t mean to be annoying, just hoping to point something out that may not have been considered, and hopefully help out one of my favorite (and IMHO most important) Free Software projects.