Friday, February 15, 2013

Managing HBase releases

Last year I (perhaps foolishly?) signed up to be the release manager for the 0.94 branch of HBase. I released 0.94.0 in Mai 2012. Since then I have learned a lot of the open source release process (which, in the end, is not that different from release proprietary software).

There are no defined responsibilities per se for such a role other than actually doing the release.

When I started HBase had relatively infrequent releases and there used to be many discussions and delays to a release to get some more "essential" features in.
The partial cure for this is two fold:
  1. Frequent releases on a somewhat strict schedule. If a feature or fix does not get in, it'll be in the next release a few weeks later.
    This reduces the pressure to push a feature into the next point release.
    The only discussions now are typically around serious bugs that have been discovered during the round of release candidates.
    This is the "release train" model. The train stops every few weeks, changes that are ready get on board, the other changes wait for the next train.
  2. A passing, comprehensive test suite, so that we can do the frequent releases with confidence. Problems are identified early (if the tests fail regularly nobody will check out new test failures, or these failures just drown in the noise of failing tests).
We're now on HBase 0.94.5 (released today actually!), and the pattern that emerged after some initial adjustment is one point release (0.94.0, 0.94.1, ...) about every four to six weeks (depending on how many rounds of release candidates were needed), with a relatively constant rate of change of two fixes and improvements per day (hence a point release ends up having 60-80 changes).

As you can see HBase is pretty actively maintained!

So to me being the release manager includes all of the following:
  • Help decide what features or fixes should be included in the release.
  • Help channel the discussion about whether a feature in (unstable) trunk is important enough to be backported to 0.94.
  • Try to review all the changes that go into 0.94. Due to the rate of change I cannot have a detailed look at every fix (I have other responsibilities in my day time job too), but I try to at least skim the changes to see if anything risky or incorrect sticks out.
  • Make sure the test suite passes reliably. This is a pet-peeve of mine and has been especially challenging, but we're now at pass rate of about 70% (up from 20-30% a few months back, but still needs to be improved).
    (Note that many of the failures are due to timing issues in the virtual build machines, and not due to a bug in the HBase code base. A single failing test out of over 1800 tests will make the test suite fail. So 70% is not as bad as it sounds.)
  • Keep timely releases. This my pet-peeve number two.
    Releases should be frequent, on a semi strict schedule, and backward compatible.
    That allows users to get features and fixes sooner and does not require cumbersome serial upgrades (where you need to upgrade from version 0.94.0 to 0.94.1 first in order to then upgrade from 0.94.1 to 0.94.2, and so on). Intermediary releases can be skipped (it is possible to upgrade from 0.94.0 to 0.94.5 directly).
    At the same time - as mentioned above - it allows developers to finish a feature or fix correctly rather than rushing it to "get it in", just because the next release will be 6 months from now.
  • (Sometimes) coordinate with vendors (such as Cloudera and Hortonworks) to time a release or a fix with their releases. This is on a best effort basis, the Apache release is independent of any vendor; but let's be honest, a significant fraction of our users run a release from these vendors.
  • Doing the actual release:
    • Tagging the release in SVN
    • Creating the release artifacts (currently we use the ones generated by the jenkins build for this).
    • Go through a round of one or more RCs and get other committers to test and vote for this RC. Here we need to improve with more automated integration test.
    • Uploading the release to the official Apache mirrors.
    • Pushing the release to the Maven repository (which involves a lot of black voodoo).
0.94 is the current stable release branch of HBase. As long as the next version (0.96) does not have a stable release we will keep backporting new features to 0.94 and keep the frequent releases going.

So far this has been fun (with the occasional frustration about the flaky test suite in the past).

The HBase community is very friendly and invites outside patches and improvements. So download HBase 0.94.5, and start contributing :)