I’ve been wanting to write up a juicy post on how we deal with very large heaps in Java to reduce GC pauses. Unfortunately, I keep getting sidetracked while getting the data together. The latest bump in the road is a JVM bug of sorts.
Backstory: Todd Lipcon’s Twitter post pointed me to the JVM option -XX:PrintFLSStatistics=1 as a way to get some good information about heap fragmentation. He was even kind enough to provide the Python and R scripts! I figured that it would take a few minutes of fiddling and I’d have some good data for a post. No such luck. Our JVM GC/heap options are -XX:+UseConcMarkSweepGC -Xms65g -Xmx65g. When -XX:PrintFLSStatistics=1 is used with these, the following output is seen:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: -1824684952
Max Chunk Size: -1824684952
Number of Blocks: 1
Av. Block Size: -1824684952
Tree Height: 1
A few seconds of digging into the Hotspot source reveals:
void BinaryTreeDictionary::reportStatistics() const {
  verify_par_locked();
  gclog_or_tty->print("Statistics for BinaryTreeDictionary:\n"
         "------------------------------------\n");
  size_t totalSize = totalChunkSize(debug_only(NULL));
  size_t freeBlocks = numFreeBlocks();
  gclog_or_tty->print("Total Free Space: %d\n", totalSize);
  gclog_or_tty->print("Max Chunk Size: %d\n", maxChunkSize());
  gclog_or_tty->print("Number of Blocks: %d\n", freeBlocks);
  if (freeBlocks > 0) {
    gclog_or_tty->print("Av. Block Size: %d\n", totalSize/freeBlocks);
  }
  gclog_or_tty->print("Tree Height: %d\n", treeHeight());
}
in hotspot/src/share/vm/gc_implementation/concurrentMarkSweep/binaryTreeDictionary.cpp. (“%d” just doesn’t cut it with a “long”’s worth of data: totalSize is a size_t, which is 64 bits wide on a 64-bit JVM, while “%d” expects a 32-bit int.) I filed a Hotspot bug, so hopefully it will be fixed in some release in the not-too-distant future.
I can work around this, but it has slowed down my getting to the juicy blog post. Stay tuned!