The Gentoo Linux Haskell Project recently moved to git, hosted on GitHub.

We’ve got various repositories, some larger with several contributors and some smaller with only one developer. Until now, all of them were kept in the darcs revision control system.

You may browse the new repos at our new home.

Overlay repository

The overlay repository is the heart of the Gentoo Linux Haskell Project. The most common packages are available in the portage tree, and thus available to all Gentoo Linux users without any additional configuration on their part.

For all other packages we use the overlay. These are packages that change rapidly, are tricky to build, etc.

It’s our main repo, with ~4000 commits from many users. There were two tools to consider: darcs-to-git and darcs-fastconvert.

We decided to try both.

darcs-to-git

mkdir overlay.git && cd overlay.git
darcs-to-git ../overlay
git commit --allow-empty -m "phony" # hack, described later
darcs-to-git ../overlay

The hack with --allow-empty is used to work around an error:

Running: ["git", "log", "-n1", "--no-color"]
fatal: bad default revision 'HEAD'

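git does not track directories on their own, so a darcs patch that only creates a directory (and affects no files) produces no git commit. Our first patch is exactly such a patch, so HEAD does not exist yet when darcs-to-git asks for it. To be reported upstream.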
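The workaround can be made a little less blind by creating the placeholder commit only when HEAD is really missing. This is a minimal sketch; `ensure_head` is a hypothetical helper name and is not part of darcs-to-git.

```shell
# ensure_head: hypothetical helper, run inside the target git repository.
# Creates the empty "phony" commit only if the repository has no HEAD yet.
ensure_head() {
    if git rev-parse --verify --quiet HEAD >/dev/null; then
        return 0    # HEAD already exists, nothing to do
    fi
    git commit --quiet --allow-empty -m "phony"
}
```

Calling it between the two darcs-to-git runs has the same effect as the manual commit above, and calling it again later is harmless.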

darcs-to-git took 7.5 hours(!) to convert our repo.

darcs-fastconvert

mkdir overlay.git && cd overlay.git
git init
(cd ../overlay ; darcs-fastconvert export) | git fast-import

It was very fast! It took less than 7 minutes to convert everything (~60 times faster than darcs-to-git!)

Some notes

  • darcs-fastconvert does not try to make prettier email-only usernames:

    username ‘john@doe’ becomes ‘john@doe <unknown>’. A patch to convert such names to ‘john <john@doe>’ has been sent upstream (left copy here).

  • darcs-fastconvert does not filter out empty commits (directory-adding patches in darcs), so in order to get the same number of commits as with darcs-to-git you will need to run git filter-branch --prune-empty -f
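The username rewrite described in the first note can be illustrated in a few lines of shell. This is only a sketch of the idea, not the patch that was sent upstream; `normalize_author` is a hypothetical name, and the fallback for names without any email address is an assumption.

```shell
# normalize_author: hypothetical helper that turns a darcs author string
# into the "Name <email>" form git expects.
normalize_author() {
    case $1 in
        *"<"*">"*)
            # Already in "Name <email>" form: keep as is.
            printf '%s\n' "$1" ;;
        *@*)
            # Bare email: use the part before the first '@' as the name.
            printf '%s <%s>\n' "${1%%@*}" "$1" ;;
        *)
            # No email at all (assumed fallback, mirrors the <unknown> marker).
            printf '%s <unknown>\n' "$1" ;;
    esac
}
```

For example, `normalize_author 'john@doe'` prints `john <john@doe>`.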

hackport

hackport is the tool we use to generate Gentoo ebuilds from Hackage packages. It greatly simplifies the process and enables us to cover hundreds of packages with relatively few developers.

The hackport project started off as part of the overlay repository in 2005. At some point we decided that it deserved its own repository, as it really was a standalone project. The development was forked from the overlay repository, and continued without being mixed with the overlay commits. However, the result was a repository with a messy history: hackport and overlay stuff was mixed. Moving to git gave us a new chance to clean it up.

When using git you have the option of changing the history of your repository. This is a powerful tool, but it should be used carefully. As we were in the middle of moving to git anyway, we took advantage of it. In general, git requires a better understanding of the consequences of your actions than similar tools (darcs, mercurial).

For this job we used the features of git filter-branch, see the git documentation at filter-branch. As the projects were clearly separated it was easy to tell git which files were interesting:

git filter-branch --tree-filter \
      'rm -rf ignore-this-file and-this-directory' HEAD
git filter-branch --prune-empty -f

We repeated this until we had cleared the history of the overlay commits. The result is clean and only contains the hackport project. You can find it at hackport.

This way we could separate the ~400 commits from the ~1100 commits that had nothing to do with the hackport project.
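When repeating the filter step, it helps to see which paths still appear anywhere in the history and how many commits are left. The two helpers below are a small sketch for that; `count_commits` and `paths_in_history` are hypothetical names, and both only read the repository.

```shell
# count_commits: number of commits reachable from HEAD.
count_commits() {
    git rev-list --count HEAD
}

# paths_in_history: every path touched by any commit reachable from HEAD,
# one per line, without duplicates. Anything listed here that does not
# belong to the project is a candidate for the next tree-filter run.
paths_in_history() {
    git log --pretty=format: --name-only HEAD | sed '/^$/d' | sort -u
}
```

Running `paths_in_history` after each pass shows what is left to remove, and `count_commits` shows how far `--prune-empty` has shrunk the history. For large repositories the git documentation also describes an `--index-filter` variant using `git rm -r --cached --ignore-unmatch`, which avoids checking out every tree.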

keyword-stat

keyword-stat is a tool to help us see the status of packages regarding Gentoo’s concept of stable and testing status. Each Gentoo user is able to choose the stability level of each package through the keywording concept.

The repo was already in a nice state and the conversion was straightforward:

mkdir keyword-stat.git && cd keyword-stat.git
git init
( cd ../keyword-stat ; darcs-fastconvert export ) \
    | git fast-import

We have moved the gentoo-haskell overlay to GitHub!

This requires some additional actions from overlay users:

layman -f
layman -d haskell
layman -a haskell

Our move was stimulated by a couple of events:

  • code.haskell.org was not very reliable, and

  • as you might have noticed, the haskell overlay has been (and still is) inaccessible since last week. The current status of code.haskell.org can be tracked here.

Good news!

We’ve got ghc-6.12.3 stable on x86, amd64 and sparc arches! (more to come)

What does it mean for the end user?

You’ll get the latest haskell compiler and will be able to taste its new features! We also got rid of the hacky ghc-updater and switched to haskell-updater.

Unfortunately, some haskell packages in the main tree might stop building with the new compiler. Please report them as problematic to gentoo bugzilla or drop a line in #gentoo-haskell on the freenode IRC network.

I’ve decided to look back and estimate how long it took us to deliver ghc for you:

So, it took us almost 4 months.

The major problems were:

  • Resurrecting GHC on exotic architectures (we have 10 patches on top of vanilla ghc!). Unfortunately, hppa support was lost.
  • Fixing packages that broke with the 6.10/6.12 branches of ghc (tons of them)

Brave souls can try to install ghc-7-rc2 (aka 7.0.0.20101028) from the overlay (currently masked). It has no base-3 (deprecated in ghc-6.10), so you’ll have a great chance to become a contributor to various haskell projects!

Enjoy!

For those of you who have been hanging around in #gentoo-haskell (especially when I’m around… :p) or who have upgraded to dev-lang/ghc-6.10.4, you will probably have noticed the new haskell-updater package. This is our replacement for the venerable ghc-updater script that was packaged with previous versions of GHC.

ghc-updater is a bash script that was based long ago on python-updater. However, whilst python-updater was updated to let users use alternative package managers like Paludis or PkgCore, it wasn’t as simple to upgrade ghc-updater as we bundle it with dev-lang/ghc. So, we (meaning I :p ) decided to split off a new haskell-updater package (with a deliberate name change to avoid problems with name clashes). At first this was just going to be based off the newer versions of python-updater, but this approach was soon abandoned (I had trouble finding my way through what it was doing).

Thus, haskell-updater is unique as far as I know amongst the various app-admin/*-updater packages (well, the only two that are there are python-updater and emacs-updater :p ) in that it is written in the language whose packages it aims to update and rebuild. Using Haskell for haskell-updater gives us several advantages over the “traditional” bash-based kludge:

  • It uses our language of choice; this means we’re more likely to be interested in it and maintain it (and be able to maintain it!) in future.
  • A more modular design makes it easier to split apart parts of the code rather than a one-file-fits-all bash script.
  • Ability to use the Cabal library (more on why this is a good thing later).
  • Speeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeed! After all, all Gentoo-ers are ricers, aren’t we? :p Seriously, haskell-updater takes roughly 2s for me to run (whilst doing more! see below), ghc-updater takes 27s and I killed python-updater after 8.5 minutes :s
  • Have a piece of software written in Haskell that Don Stewart isn’t going to brag about having available in Arch 😉

Now, haskell-updater doesn’t just find packages installed with previous versions of GHC like ghc-updater did; it also finds broken packages, making it equivalent to revdep-rebuild/reconcilio/etc. for Haskell packages (though just for libraries, since at the moment GHC creates statically-linked binaries). This has become a bigger problem in the last few years as the number of Haskell libraries has almost exploded (especially after the base library was split up). Until now, however, users have had to manually run “ghc-pkg check” and rebuild the corresponding packages by hand (otherwise you face the dreaded Diamond Dependency Problem). However, version 1.6 of the Cabal library includes support for parsing the output of “ghc-pkg check”, so we’re able to use this to have haskell-updater find these broken packages for you as well!
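To give a feel for the manual step that haskell-updater automates, here is a rough shell approximation. It is not how haskell-updater works (that uses the Cabal library); it assumes that `ghc-pkg check --simple-output` prints the broken packages as space-separated `name-version` identifiers, and `package_names` is a hypothetical helper.

```shell
# package_names: hypothetical helper. Reads space-separated package
# identifiers such as "parsec-3.1.0" on stdin and prints the bare
# package names, one per line, sorted and without duplicates.
package_names() {
    tr ' ' '\n' | sed -e '/^$/d' -e 's/-[0-9][0-9.]*$//' | sort -u
}

# Usage on a system with GHC installed (assumed output format):
#   ghc-pkg check --simple-output 2>/dev/null | package_names
```

The resulting names still have to be mapped to the Gentoo packages that own them and rebuilt in dependency order, which is exactly the tedious part the tool takes over.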

haskell-updater has several other incremental advantages over ghc-updater (it supports slotted packages properly; it is able to find packages installed with previous versions of GHC even if someone manually went and deleted the old GHC directory when they shouldn’t have, etc.). As such, we highly recommend that people install it and try it out. Note, however, that it requires one of the 6.10 series of GHC releases or higher to work (technically it doesn’t as long as you install the necessary libraries yourself; however, we’ve stated this in the ebuild to try and avoid dependency problems when upgrading GHC). With GHC version 6.10.4, it has completely replaced ghc-updater as the updating tool (ghc-updater is still shipped with previous versions).

Extra features we’re considering adding in future releases:

  • Allow user-defined package managers (in case of custom scripts, etc.).
  • The ability to print out the command to rebuild the packages rather than actually running it.
  • Detect and fix packages that didn’t get re-registered by ghc-pkg when re-building the same version of GHC (this seems to be a problem when using Paludis).
  • Adding colours to the output 😉

Note that haskell-updater is not available on HackageDB like most other Haskell software; this is to avoid name-pollution there, and because we don’t think non-Gentoo users are going to be interested in it (and Gentoo users will probably have it automatically installed for them anyway).

Gentoo Linux supports Haskell!

#gentoo-haskell on freenode