Monday, December 21, 2015

Yet more on HBase GC tuning with CMS

By Lars Hofhansl
  
Some of my previous articles delve into gc tuning and
more tuning for scanning.

We have since performed more tests with heavy write loads and I need revise my previous recommendation based on the findings.

I wrote a simple test tool that generates 5 million random keys of approximately 200 bytes; then it starts 50 threads that each pick 100k random batches of these keys and write them to HBase. I then start multiple of these and point them to an HBase cluster.
The result is that we see a lot of churn in HBase as new versions of Cells are written but the overall size of the data is kept more or less constant with compactions - so I can test very large write loads with limited disk space.

We find that for these kinds of loads a young generation of 512MB that I had recommended before is hopelessly undersized. We're seeing lots of premature promotions into tenured space, followed by minute long full pauses that eventually cause the region servers to shut down.

For the tests I ran with a 16G heap and found a young gen of 2G is good.

Note that in CMS the size of the young gen is one of the knobs to trade latency for throughput. The smaller the young gen is size the smaller the pauses tend to be... If all short lived garbage fits into the young gen, that is. If it does not, short lived objects get promoted into the tenured space and that will eventually lead to very long (minutes with 30G heaps) full collection pauses.

So the goal is to size the young gen just large enough to fit the short lived objects. In HBase we do some guide posts for this. 40% of the heap (by default) is dedicated to the memstores, another 40% (again by default) to the block cache, and the rest (20%) to "day-to-day" garbage.

With I now recommend the following for the young generation:
  • at least 512MB
  • after that 10-15% of the heap
  • but at most 3GB

For most setups the following should cover all the bases:

-Xmn2g - small eden space (or even -Xmn3g for very large heaps)
-XX:+UseParNewGC - collect eden in parallel
-XX:+UseConcMarkSweepGC - use the non-moving CMS collector
-XX:CMSInitiatingOccupancyFraction=70 - start collecting when 70% of the tenured gen are full to avoid collection under pressure
-XX:+UseCMSInitiatingOccupancyOnly - do not try to adjust CMS setting


In the end you have to test with your own work loads. Start with the smallest young gen size that you think you can get away with. Then watch the GC log. If you see lot's of "promotion failed" type messages, you need to increase eden (-Xmn), do that until the promotion failures stop.

We'll be doing testing with G1GC soon.