Monday, March 3, 2014

HBase GC tuning observations

By Lars Hofhansl

Java garbage collection tuning depends heavily on the application.

HBase is somewhat special: It produces a lot of short lived objects that typically does not outlive a single RPC request; at the same time most of the heap is (and should be) used for the block cache and the memstores that hold objects that typically live for a while.
In most setups HBase also requires reasonably short response times, many seconds of garbage collection are not acceptable.

Quick GC primer 

There is a lot of (better informed) literature out there about this topic, so here just the very basics.
  • In typical applications most object "die young" (they are created and soon are no longer needed). This is certainly true for HBase. This is called the "weak generational hypothesis". Most modern garbage collectors accommodate this by organizing the heap into generations.
  • HotSpot: manages four generations:
    1. Eden for all new objects
    2. Survivor I and II where surviving objects are promoted when eden is collected
    3. Tenured space. Objects surviving a few rounds (16 by default) of eden/survivor collection are promoted into the tenured space
    4. Perm gen for classes, interned strings, and other more or less permanent objects.
  • The principle costs to the garbage collector are (1) tracing all objects from the "root" objects and (2) collecting all unreachable objects.
    Obviously #1 is expensive when many objects need to be traced, and #2 is expensive when objects have to be moved (for example to reduce memory fragmentation)

Back to HBase

So what does that mean for HBase?

Most of the heap should be set aside for the memstores and the blockcache as that is the principal data that HBase manages in memory.

HBase already attempts to minimize the number of objects used (see cost #1 above) and keeps them at the same size as much as possible (cost #2 above).

The memstore is allocated in chunks of 2mb (by default), and the block cache stores blocks of 64k (also by default).
With that HBase keeps the number of objects small and avoids heap fragmentation (since all objects are roughly the same size and hence new objects can fit into "holes" created by previously released objects).

Most other objects and datastructures do not outlive a single request and can (and should) be collected quickly.

That means we want to keep eden as small as possible - just big enough to handle the garbage from a few concurrent requests. That allows us to use most of the heap for the useful data and keep minor compactions short.
We also want to optimize for latency rather than throughput.
And we'd want GC settings where we do not move the objects in the tenured generation around and also collect new garbage as quickly as possible.
The young generation should be moving to avoid fragmentation (in fact all young gen collectors in Hotspot are moving).

TL;DR: With that all in mind here are the typical GC settings that I would recommend:

-Xmn256m - small eden space (maybe -Xmn512m, but not more than that)
-XX:+UseParNewGC - collect eden in parallel
-XX:+UseConcMarkSweepGC - use the non-moving CMS collector
-XX:CMSInitiatingOccupancyFraction=70 - start collecting when 70% of the tenured gen are full to avoid collection under pressure
-XX:+UseCMSInitiatingOccupancyOnly - do not try to adjust CMS setting

Those should be good settings for starters. You should find average GC times around 50ms and even 99'ile GC times around 150ms, and absolute worst case 1500ms or so; all on contemporary hardware with heaps of 32GB or less.

There are efforts to move cached data off the Java to heap to reduce these pauses especially the worst case.

Update Jan 12, 2015:
Also check out "more-hbase-gc-tuning" for more information about sizing the GC generations for HBase