To tune your JVM successfully, you must have clearly defined performance targets. These targets are your definition of success; without one, you cannot succeed. The goal of performance tuning is to meet these targets, no more, no less.
This article discusses some concepts to help you understand and tune your ForgeRock application JVMs to meet your targets. The correct values for your organization depend entirely on your performance targets.
There has been significant work in the field of garbage collection in the last few years, and this is ongoing. At the time of writing, there are two promising new garbage collectors: Z Garbage Collector (ZGC) and Shenandoah GC. When these become more mainstream, the recommended GC for your ForgeRock applications will likely change. More information on ZGC and Shenandoah GC can be found in the See Also section.
As of Java 9, the Concurrent Mark Sweep (CMS) garbage collector has been deprecated. At this stage, we focus on the tried and tested G1 GC, which is available in both Java 8 and Java 11 (the supported JDK for most ForgeRock applications). The focus of this article is Java 11.
In some cases, CMS GC may be more performant than G1, which is why some organizations have decided to stick with CMS on Java 8. As CMS has been deprecated in later Java versions, a switch to another GC is inevitable; moving to G1 and understanding its concepts can therefore be considered a step towards future-proofing your solution. Remember, you don’t need the fastest solution; you only need to meet your performance targets. See Best practice for JVM Tuning with CMS GC for further information.
G1 GC is a generational garbage collector, that is, the heap is split into generations on the premise that most objects die young. It is more efficient to clean up objects in the young generation than to move them to the old generation and clean them up there. In this respect, G1 is no different from Serial, Parallel and CMS GC.
The G1 GC is a low pause-time collector whose priority is to meet a maximum pause-time target. This may come at the expense of throughput; however, from experience, avoiding long application pauses is generally deemed more important than overall throughput. Remember, the goal here is to meet your performance targets.
Most other generational garbage collectors split the heap into contiguous spaces, one per generation. In the diagram that follows, the heap has been split into its generations: Young Generation (consisting of Eden and Survivor spaces) and Old Generation. Each space is contiguous, that is, it runs from position 1, 2, 3, …, n.
G1 GC differs greatly in how it allocates heap space. G1 splits the heap into a number (typically around 2048) of smaller, equally sized heap regions; for this reason, the heap size directly affects the region size. Each region can be allocated as an Eden, a Survivor or an Old region. The number of regions allocated to Eden, Survivor and Old is flexible and determined by the GC at runtime. The following diagram depicts this very different allocation of the heap.
This method of breaking up the space into regions allows the GC to avoid (for as long as possible) large collections of the entire heap.
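To make the relationship between heap size and region size concrete, here is a minimal sketch in Java. It assumes, based on JDK 11 behavior, that G1 aims for about 2048 regions, rounds the region size down to a power of two, and keeps it between 1 MB and 32 MB. The class and method names are hypothetical, and the real JVM calculation may differ in detail.

```java
/** Rough sketch of how G1 might derive its region size from the heap size (assumed rules, not JVM source). */
final class G1RegionSizeSketch {
    static final long MIN_REGION_BYTES = 1L << 20;   // assumed 1 MB lower bound
    static final long MAX_REGION_BYTES = 32L << 20;  // assumed 32 MB upper bound
    static final long TARGET_REGION_COUNT = 2048;    // the "typically 2048" regions from the text

    /** Heap divided by the target region count, rounded down to a power of two and clamped. */
    static long regionSizeBytes(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGION_COUNT, MIN_REGION_BYTES);
        long powerOfTwo = Long.highestOneBit(raw);
        return Math.min(powerOfTwo, MAX_REGION_BYTES);
    }

    public static void main(String[] args) {
        long[] heapsInGb = {2, 6, 8, 32, 128};
        for (long gb : heapsInGb) {
            long heap = gb << 30;
            long region = regionSizeBytes(heap);
            System.out.printf("heap=%dg region=%dm regions=%d%n", gb, region >> 20, heap / region);
        }
    }
}
```

Under these assumptions, a 2 GB heap yields 1 MB regions, while a 6 GB heap yields 2 MB regions and therefore more than 2048 of them, which is why the region count is only "typically" 2048.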
During Garbage Collection, there are a number of events where the entire application is paused; these are often referred to as Stop-The-World (STW) events. During these pauses, any requests to the JVM need to wait until the pause is finished. As such, one of the goals of GC tuning is to minimize these pauses. The Full GC is usually the longest STW pause as the entire heap needs to be traversed, whereas some other STW pauses (which may be needed by the Garbage Collector) may be shorter and within acceptable limits. During the process of tuning the GC, log analysis should help to identify excessive pauses and their causes. You can then take steps to rectify the issues.
At a high level, the G1 GC has three main types of collection:
- The Young GC only cleans the young generation, that is, it moves live objects from Eden to Survivor and from one Survivor to another, and moves objects that have reached their MaxTenuringThreshold into the Old generation.
- A Full GC (both Young and Old regions) still occurs as a fallback position. This is very expensive, and significant effort goes into avoiding the need for a Full GC.
- The G1 GC also has the concept of a Mixed GC, which gives G1 GC its name: Garbage First. In this GC, the young generation is cleaned along with a (configurable) number of regions from the Old space that contain the most garbage, hence Garbage First. This mechanism allows the G1 GC to avoid Full GCs for as long as possible. As Full GCs are mainly responsible for the long pause times associated with garbage collection, the G1 GC is able to minimize the need for these expensive operations.
Young Garbage Collection
- Normal Young GC - A number of young GCs move objects between Eden and Survivor and eventually to old space. Once old generation occupancy reaches a threshold, determined by the Initiating Heap Occupancy Percent (IHOP), a Concurrent Start young GC is started.
- Concurrent Start - Starts the concurrent marking process. Marking runs concurrently with young GCs until it has finished. During marking, the live objects in the old regions are determined. Marking ends with two stop-the-world events (application pauses), Remark and Cleanup:
  - Remark - Finalizes marking, reclaims empty regions and performs class unloading; it also starts to determine the old regions that can be cleaned concurrently. Stop-the-world event.
  - Cleanup - Determines whether a mixed GC follows. Stop-the-world event.
Mixed Garbage Collection
This phase involves one or more mixed collections, that is, collections of the young generation as well as a (configurable) number of old regions that contain the most garbage. At the end of each mixed collection, G1 determines whether it needs another mixed collection in order to reach its (configurable) threshold. After this, the cycle starts again with another young GC phase.
Full Garbage Collection
Like other GCs, this is the fallback position. If the application runs out of memory while liveness information is being gathered, the result can be a stop-the-world Full GC of both the Young and Old generations. One of the major goals of G1 GC and other generational garbage collectors is to avoid expensive Full GCs.
G1 GC has significantly fewer JVM options available than CMS, and the intention is that you use fewer. When moving from CMS to G1, or between any two GCs, the majority of installations inherit their previous JVM options without consideration or understanding of their use. Do not do this.
The basic strategy to tune your JVM for G1 GC is to set heap size and pause-time goal, then let the JVM dynamically modify the required settings to attempt to meet the pause-time goal. If the performance goals are not met, then consider other options based on GC monitoring and log analysis. This is an iterative process and it is important to ensure enough time and resources are allocated to this critical task.
Tuning recommendations are outside the scope of ForgeRock support; if you want more tailored advice, consider engaging Deployment Support Services.
The following sections cover tuning advice.
It is recommended that you explicitly set the required GC. To select the G1 GC, add the following JVM option: -XX:+UseG1GC
By setting this explicitly, you know exactly what you are getting, and it will not change unless you decide to change it. For example, on Java 8 the default GC is Parallel GC, while on Java 11 the default is G1 GC. This means that when you upgrade your Java version, the GC is changed on your behalf, for better or worse, unless you’ve explicitly set it.
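One way to confirm which collector a running JVM actually selected is to ask the standard management beans. The following is a minimal sketch; the class and method names are hypothetical, and it assumes that G1's collector beans have names beginning with "G1" (for example "G1 Young Generation"), which is how HotSpot reports them in practice rather than something the API guarantees.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Lists the collectors registered by the running JVM and guesses whether G1 is active. */
final class ActiveCollectorCheck {
    /** Names of the collectors the running JVM registered, for example "G1 Young Generation". */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    /** True when any collector name starts with "G1" (an assumed naming convention). */
    static boolean looksLikeG1(List<String> names) {
        return names.stream().anyMatch(n -> n.startsWith("G1"));
    }

    public static void main(String[] args) {
        List<String> names = collectorNames();
        System.out.println("Collectors: " + names);
        System.out.println("G1 in use: " + looksLikeG1(names));
    }
}
```

Running this once with and once without -XX:+UseG1GC on the same JDK is a quick way to see what the default is for that Java version.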
It is recommended that you explicitly set the minimum and maximum heap size to the same value to avoid dynamic shrinking and growing of the heap during the application’s lifecycle. You do this using the following JVM options:
-XX:InitialHeapSize (Minimum Java heap size)
-XX:MaxHeapSize (Maximum Java heap size)
The -Xms and -Xmx options are shortcuts for the above options, so you can use either, but you will see the full option names in JVM output.
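To check that the heap really is fixed in size, you could compare the initial and maximum heap values the JVM reports at runtime. This is a small sketch with hypothetical class and method names; note that the JVM may round the sizes you requested, so treat the result as an indication rather than proof.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Reports whether the running JVM's initial and maximum heap sizes match. */
final class HeapSizeCheck {
    /** True when both sizes are defined (not -1) and equal. */
    static boolean isFixedSize(long initBytes, long maxBytes) {
        return initBytes > 0 && initBytes == maxBytes;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getInit() and getMax() return -1 when the value is undefined.
        System.out.printf("init=%dm max=%dm fixed=%b%n",
                heap.getInit() >> 20, heap.getMax() >> 20,
                isFixedSize(heap.getInit(), heap.getMax()));
    }
}
```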
See the following articles for product specific details:
- How do I change the JVM heap size for AM (All versions)?
- How do I tune DS (All versions) process sizes: JVM heap and database cache?
- How do I change the JVM heap size for IDM (All versions)?
- How do I change the JVM heap size for IG (All versions)?
Pause Goal and Young Generation Sizing
The G1 GC has a pause-time target that it tries to meet; this is a soft target, not a guarantee.
During young collections, the G1 GC adjusts the size of the young generation (both Eden and Survivor regions) to meet this pause-time target.
For this reason, it’s generally recommended to set the pause-time target and let the GC resize the generations as needed. This is an important concept: do not set the young generation size unless required.
To set the pause-time target, set the following JVM option (the default value is 200 milliseconds): -XX:MaxGCPauseMillis
For example, you could start by setting this to a value between 200 and 500 (for example, -XX:MaxGCPauseMillis=500) and test to see if your performance targets are met.
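As a rough first check against the pause-time target, you could compare the average time per collection reported by the JVM with the value you configured. The sketch below uses hypothetical names and takes the target as a command-line argument rather than reading it from the JVM. It assumes that the reported collection time is a reasonable proxy for pause time, which is only approximately true; averages also hide outliers, so GC logs remain the better source for real analysis.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Compares the average collection time per collector with a pause-time target. */
final class PauseTargetCheck {
    /** Mean time per collection in milliseconds, or 0 when no collections are reported. */
    static double averageMillis(long totalMillis, long collections) {
        return collections <= 0 ? 0.0 : (double) totalMillis / collections;
    }

    public static void main(String[] args) {
        // Hypothetical input: pass the value you set for -XX:MaxGCPauseMillis (defaults to 200 here).
        long targetMillis = args.length > 0 ? Long.parseLong(args[0]) : 200;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            double avg = averageMillis(gc.getCollectionTime(), gc.getCollectionCount());
            System.out.printf("%s: count=%d avg=%.1f ms %s%n", gc.getName(),
                    gc.getCollectionCount(), avg,
                    avg > targetMillis ? "(above target)" : "(within target)");
        }
    }
}
```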
Garbage Collection Logging
Tuning is an iterative process based on data collected throughout the tuning phases; therefore, it’s recommended to enable GC logging, even in Production environments. You will need a log rotation strategy to prevent logs from consuming excessive disk space. The default logging level is info and can be adjusted as required.
Unified JVM Logging has replaced the old logging options as of Java 9. The Java 8 logging options do not work with Java 11. See JEP 158: Unified JVM Logging for further details and the following articles for product specific details:
|-XX:+DisableExplicitGC||It is recommended that you set this option to disable processing of calls to the System.gc() method.|
|-XX:+UseStringDeduplication||String deduplication reduces the memory footprint of String objects on the Java heap. This is disabled by default.|
|-XX:MaxMetaspaceSize||Sets the maximum amount of native memory that can be allocated for class metadata. It is recommended that you set this value to 256MB and monitor for any issues.|
|-XX:MaxTenuringThreshold||Sets the maximum number of young collections a live object can survive in the young generation before it is moved to the old generation. This defaults to 15. If objects do not need to be kept in the young generation for a long time because they will end up in the old generation anyway, you can lower this value. For example, it is recommended to set this to 1 for Directory Server. The reasoning is that an object will likely either die within one collection or live for a long time, so a survivor may as well be moved to old space straight away, which frees Eden space for new objects. This recommendation does not apply to all applications: setting the value too low leaves garbage sitting in old space that could have been cleaned efficiently in the young generation. Monitor your application to determine the best setting for this value.|
|-XX:+ParallelRefProcEnabled||It is recommended that you set this option to enable parallel reference processing. By default, this option is disabled.|
The following is an example of all of the above settings as JVM options for a DS instance:
-XX:+UseG1GC -XX:InitialHeapSize=2g -XX:MaxHeapSize=2g -XX:MaxGCPauseMillis=500 -XX:+DisableExplicitGC -XX:+UseStringDeduplication -XX:+ParallelRefProcEnabled -XX:MaxMetaspaceSize=256m -XX:MaxTenuringThreshold=1 -Xlog:gc=debug:file=/tmp/gc.log:time,uptime,level,tags:filecount=5,filesize=100m
DO NOT cut and paste this into your application; make informed decisions about the values you set based on your targets.
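The -Xlog option in the example above is made up of four colon-separated parts: what to log, where to write it, which decorations to add to each line, and output options such as rotation. The sketch below simply assembles those parts so the structure is easier to see; the class and method names are hypothetical, and the values are the ones from the example, not recommendations.

```java
/** Assembles a unified logging option from its four colon-separated parts. */
final class XlogOptionSketch {
    /** Joins selectors, output, decorators and output options into one -Xlog option. */
    static String xlog(String selectors, String output, String decorators, String outputOptions) {
        return "-Xlog:" + String.join(":", selectors, output, decorators, outputOptions);
    }

    public static void main(String[] args) {
        System.out.println(xlog(
                "gc=debug",                     // what: the gc tag at debug level
                "file=/tmp/gc.log",             // where: a log file
                "time,uptime,level,tags",       // decorations on each log line
                "filecount=5,filesize=100m"));  // rotation: 5 files of 100 MB each
    }
}
```

Keeping the rotation settings (filecount and filesize) explicit is one way to implement the logging strategy mentioned earlier, so that GC logs cannot grow without bound.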
Here are a few options to consider. There are no specific recommended values as they will be based on your analysis of the GC behavior.
-XX:ConcGCThreads: This value sets the number of parallel marking threads. By default, this is set to approximately 25% of the number of parallel garbage collection threads (ParallelGCThreads). For example, a system with 8 logical processors defaults ParallelGCThreads to 8 and therefore ConcGCThreads to 2. Above 8 logical processors, ParallelGCThreads only grows by about 5/8 of each additional processor, so a system with 16 logical processors defaults ParallelGCThreads to 13 and ConcGCThreads to 3. You can increase ConcGCThreads to add parallel marking threads so that concurrent marking completes sooner.
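The default thread counts can be estimated with a little arithmetic. The sketch below encodes the commonly documented HotSpot rules as assumptions: one parallel GC thread per logical processor up to 8, then 5/8 of a thread for each processor beyond that, and about a quarter of that number for concurrent marking. The names are hypothetical and the exact figures may vary between JDK builds, so check the values your JVM reports before relying on them.

```java
/** Estimates default GC thread counts from the number of logical processors (assumed rules). */
final class GcThreadDefaultsSketch {
    /** Up to 8 processors: one thread each; beyond that, 5/8 of a thread per extra processor. */
    static int parallelGcThreads(int logicalProcessors) {
        return logicalProcessors <= 8
                ? logicalProcessors
                : 8 + (logicalProcessors - 8) * 5 / 8;
    }

    /** Roughly a quarter of the parallel threads, and never fewer than one. */
    static int concGcThreads(int parallelGcThreads) {
        return Math.max((parallelGcThreads + 2) / 4, 1);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int parallel = parallelGcThreads(cpus);
        System.out.printf("cpus=%d ParallelGCThreads=%d ConcGCThreads=%d%n",
                cpus, parallel, concGcThreads(parallel));
    }
}
```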
Mixed GCs are initiated when certain conditions are met and after successful completion of concurrent marking phase.
-XX:InitiatingHeapOccupancyPercent (IHOP): This value determines when the concurrent marking process starts. It defaults to 45%, but G1 GC attempts to find an optimal value for IHOP adaptively and only uses the configured value if there is not yet enough information to optimize, or if adaptive IHOP is disabled. You can set the value higher to start concurrent marking later, or lower to start marking earlier.
For example, if you are getting Full GCs due to allocation failure or you see Evacuation Pause/Evacuation Failure in the GC logs, you could try lowering the IHOP value to start marking earlier.
To override the adaptive behavior, set the -XX:-G1UseAdaptiveIHOP option.
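To get a feel for what an IHOP value means in practice, you can translate the percentage into an occupancy figure. The sketch below assumes, purely for illustration, that the percentage is applied to the total heap size and that adaptive IHOP is disabled; the names are hypothetical and the 2 GB heap matches the example settings used earlier in this article.

```java
/** Converts an IHOP percentage into an approximate occupancy threshold (illustrative assumption). */
final class IhopThresholdSketch {
    /** Occupancy in bytes at which marking would start for a given heap size and IHOP percentage. */
    static long markingThresholdBytes(long heapBytes, int ihopPercent) {
        return heapBytes * ihopPercent / 100;
    }

    public static void main(String[] args) {
        long heapBytes = 2L << 30; // 2 GB heap, as in the earlier example settings
        for (int percent : new int[] {30, 45, 60}) {
            System.out.printf("IHOP=%d%% -> marking starts near %d MB%n",
                    percent, markingThresholdBytes(heapBytes, percent) >> 20);
        }
    }
}
```

With the default of 45% on a 2 GB heap, marking would start at about 921 MB under this assumption; lowering the value moves that point earlier and leaves more headroom before the heap fills.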
These options can be considered when you want to change the mixed garbage collection decisions. That is, they are used to determine which regions to collect on mixed collection, with the goal being to decrease the time of mixed collections.
If you need to change these options, please refer to Space-Reclamation Phase Generation Sizing for further information.
|-XX:G1HeapRegionSize||As described earlier, G1 GC stores objects in regions. If an object’s size is equal to or greater than 50% of the region size, it is considered a humongous object. Humongous objects are allocated directly in the old generation, are handled differently and can cause fragmentation. A Full GC could be initiated to find contiguous regions in which to store humongous objects. You could consider increasing the region size, either by increasing the heap size (and therefore the region size) or by setting G1HeapRegionSize manually.|
|-Xmn or -XX:NewSize||It is possible that the young generation has been dynamically tuned by G1 GC based on previous application behavior and then the application behavior changes. This could result in incorrect young generation optimization. While it’s generally recommended to avoid setting these, under some conditions they might give a more accurate distribution of new/old heap than the G1 algorithm.|
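The humongous object rule described for -XX:G1HeapRegionSize can be sketched as a simple comparison. This example uses hypothetical names and a made-up 768 KB allocation to show how the same object stops being humongous once the region size grows; it follows the "equal to or greater than 50%" wording used above, and the JVM's exact boundary condition may differ slightly.

```java
/** Checks whether an allocation would count as humongous for a given region size. */
final class HumongousCheckSketch {
    /** An object is treated as humongous when it is at least half a region in size. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    public static void main(String[] args) {
        long objectBytes = 768L << 10; // hypothetical 768 KB allocation, such as a large byte array
        for (long regionMb : new long[] {1, 2, 4}) {
            System.out.printf("region=%dm humongous=%b%n",
                    regionMb, isHumongous(objectBytes, regionMb << 20));
        }
    }
}
```

If GC logs show frequent humongous allocations, comparing typical object sizes against half the region size in this way can indicate whether a larger region size is worth testing.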
This article provides a high-level view of the Garbage First Garbage Collector (G1 GC), as well as some useful and recommended JVM options for tuning your ForgeRock application JVMs. When tuning your JVM, consult the product documentation for application-specific recommendations. This article does not cover every possible option; a list of references is provided in the See Also section for further research.
The following concepts are very important and cannot be stressed strongly enough:
- Without a definition of success, you cannot succeed, so know your performance targets.
- Start with the basic options and let the JVM modify as required to meet its target. If your goals cannot be met, then, and only then, start using more detailed options.
- Allocate enough time and resources for performance testing.
If you want to look into monitoring the JVM’s Garbage Collection further, you can read this blog: How to Monitor your Java Application’s JVM.
- FAQ: AM performance and tuning
- FAQ: DS performance and tuning
- FAQ: IDM performance and tuning
- FAQ: IG performance and tuning
- Garbage First Garbage Collector Tuning - (Technical article)
- Garbage-First Garbage Collector Tuning - (JDK 11 Documentation)
- JEP 158: Unified JVM Logging
- Garbage Collection Logging in Java 11
- Java Platform Standard Edition Tools Reference - java
- Stack Overflow - Is there a replacement for the garbage collection JVM args in Java 11?
- JRockit to HotSpot Migration Guide - HotSpot Logging Options
- The Garbage-First Garbage Collector
- Understanding the JDK’s New Superfast Garbage Collectors - (ZGC, Shenandoah and G1)
- Java Platform Standard Edition Tools Reference - java Decorations