Troubleshooting IG/OpenIG

This book provides information on troubleshooting various issues in IG/OpenIG, including collecting useful troubleshooting information such as logs, heap dumps, and stack traces.


FAQ: IG/OpenIG performance and tuning

Last updated Aug 10, 2020

The purpose of this FAQ is to provide answers to commonly asked questions regarding performance and tuning for IG/OpenIG.


Frequently asked questions


This article contains tips on performance and tuning to get you started. Performance tuning is environment-specific and is outside the scope of ForgeRock support; if you want more tailored advice, consider engaging Deployment Support Services.

Q. How can I troubleshoot performance issues?

A. Follow the advice in How do I generate more detailed debug logs to diagnose an issue in IG (All versions)? to obtain more detailed logs for troubleshooting. You should also follow the good practice of decorating filters and handlers with the timer decorator, because it generates start/stop timings in the logs for each decorated component, which can help pinpoint possible bottlenecks in the route. See Configuration Reference › TimerDecorator for further information.
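For example, with IG's default decorators available, you can decorate individual objects in a route by adding "timer": true. The following is a minimal sketch (the rest of the route configuration is omitted for brevity):

```json
{
  "handler": {
    "type": "Chain",
    "config": {
      "filters": [
        {
          "type": "HeaderFilter",
          "timer": true,
          "config": {
            "messageType": "REQUEST"
          }
        }
      ],
      "handler": {
        "type": "ReverseProxyHandler",
        "timer": true
      }
    }
  }
}
```

Each decorated object then writes elapsed-time entries to the logs, which you can compare to find the slowest component in the chain.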

Q. What cache settings can I enable?

A. The following table details the objects for which you can enable caching, along with the relevant properties (caching is disabled by default):

Object                      Property                      What it caches                            Documentation
TemporaryStorage            --                            Streamed content                          Configuration Reference › TemporaryStorage
AmService                   sessionCache                  Session information from AM               Configuration Reference › AmService
UserProfileFilter           cache in userProfileService   AM user profiles                          Configuration Reference › UserProfileFilter
PolicyEnforcementFilter     cache                         Policy decisions from AM                  Configuration Reference › PolicyEnforcementFilter
OAuth2ClientFilter          cacheExpiration               User information from the OIDC Provider   Configuration Reference › OAuth2ClientFilter
OAuth2ResourceServerFilter  cache                         OAuth2 access tokens                      Configuration Reference › OAuth2ResourceServerFilter

See Maintenance Guide › Cache for further information. 
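For example, the following is a sketch of enabling the session cache on an AmService. The URL and agent credentials are placeholders, and the exact agent credential properties vary between IG versions; check the Configuration Reference for your version:

```json
{
  "name": "AmService-1",
  "type": "AmService",
  "config": {
    "url": "https://am.example.com/am",
    "agent": {
      "username": "ig_agent",
      "password": "password"
    },
    "sessionCache": {
      "enabled": true
    }
  }
}
```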

Q. How can I limit the number of requests if the backend is slow or it can't handle a high load?

A. Once a request (connection) has been accepted by the web application container where IG/OpenIG runs, it is put in the process queue. If the backend is slow or can't handle a high load, the process queue can build up. If the client subsequently closes the connection (for example, due to a timeout), the connection remains in the queue. These stale connections can accumulate and cause IG/OpenIG to become unresponsive and stop accepting new connections.

You can use one of the following approaches to prevent the queue from building up in the first place:

  • Limit the maximum number of connections/threads permitted by your web application container, so that excess requests are rejected rather than queued indefinitely.
  • Use a ThrottlingFilter to limit the rate at which requests are accepted (see the next question).

Q. What throttling algorithm does IG/OpenIG use and how does it work?

A. The throttling algorithm used in IG/OpenIG is the Token bucket algorithm. This is the only supported algorithm in IG/OpenIG and allows a more regulated flow under load; however, since this algorithm does not prevent traffic bursts, the throttling rate may occasionally deviate from the defined limit. See Gateway Guide › Throttling the Rate of Requests to Protected Applications for further information.

This algorithm throttles and smooths the traffic once the incoming rate of requests is higher than the declared throttling rate. Initially the bucket is filled with tokens (requests) without any rate limit (seen as a traffic burst). Once the bucket is full, the rate limit is applied:

  • Before the bucket is full: IG/OpenIG keeps accepting requests without applying a rate limit.
  • Once the bucket is full: if IG/OpenIG receives more requests than the configured throttling rate (for example, 6 requests every 60 seconds), the algorithm allows the bucket to push one token out every duration/requests seconds (in this example, 60/6, which is every 10 seconds). As tokens are pushed out, additional requests can be accepted.
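As a sketch, the 6-requests-per-60-seconds rate used in this example could be declared with a ThrottlingFilter similar to the following (the values are illustrative):

```json
{
  "type": "ThrottlingFilter",
  "config": {
    "rate": {
      "numberOfRequests": 6,
      "duration": "60 seconds"
    }
  }
}
```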

In your log, you will see HTTP/1.1 200 responses when tokens are accepted and HTTP/1.1 429 responses when the bucket is full. Among the HTTP/1.1 429 responses will be HTTP/1.1 200 responses when a token has been pushed out, thus allowing another token to be accepted. For example, you will see something similar to this in your log:

[.559s] HTTP/1.1 200 
[.663s] HTTP/1.1 200 
[.747s] HTTP/1.1 200 
[.837s] HTTP/1.1 200 
[.925s] HTTP/1.1 200 
[1.007s] HTTP/1.1 200 
[1.093s] HTTP/1.1 429 
[1.167s] HTTP/1.1 429 
[1.260s] HTTP/1.1 429 
[1.350s] HTTP/1.1 429 
[1.440s] HTTP/1.1 429 
[1.518s] HTTP/1.1 429 
[1.602s] HTTP/1.1 429 
[1.688s] HTTP/1.1 429 
[1.761s] HTTP/1.1 429 
[1.846s] HTTP/1.1 429 
[1.920s] HTTP/1.1 429 
[1.998s] HTTP/1.1 429 
[2.083s] HTTP/1.1 200 
[2.161s] HTTP/1.1 429 

Q. What other settings can I tune to determine the capacity a system can handle?

A. You can also tune the TCP level backlog queue, paying particular attention to the following operating system parameters:

  • net.ipv4.tcp_max_syn_backlog
  • net.core.somaxconn
  • net.core.netdev_max_backlog

It is recommended that you start with the default values and tune them only after you have finished tuning IG/OpenIG and the JVM. A third-party blog provides a good explanation of these particular settings: Tuning your Linux kernel and HAProxy instance for high loads > TCP parameters.

Large backlog queues are only suitable for well tuned environments that can process a lot of requests quickly. If your backlog queue is too big for your system, you will likely see slow responses, requests getting queued at the TCP level and clients timing out. This happens because the web application container is not able to accept requests at the same rate, which means they are queued up while they are waiting. Conversely, if the backlog queue is too small, clients will see connection exceptions instead of timing out.

You should also consider these settings in conjunction with the maximum number of connections/threads permitted by your web application container (Q. How can I limit the number of requests if the backend is slow or it can't handle a high load?). If the number of connections permitted is low but you have a big backlog queue, connections will continue to be accepted even when the servers are running slowly.

Monitoring the size of the backlog

You can use one of the following commands to monitor the size of the backlog:

  • lsof command - this command indicates how many connections are queued at the TCP level:
    $ lsof -a -i -s TCP:SYN_RECV -p [processId]
    Where [processId] is the process ID of the web application container that is running the IG/OpenIG process.
  • Netstat command - this command shows the current global count of connections in the queue:
    $ netstat -an | grep -c SYN_RECV 
     You can break this down further by the port of the container's http(s) listener:
    $ netstat -an | grep SYN_RECV | grep -c [port]
    Where [port] is the port that the container is listening on, for example, 443.

Depending on your operating system, you may need to replace SYN_RECV in the above commands with SYN_RCVD. You should check which version of SYN_RECEIVED is used in your distribution.

Q. How does streaming work for the ClientHandler and ReverseProxyHandler?

A. IG 6 introduced a streaming option, which is enabled in the ClientHandler and the ReverseProxyHandler with the following option:

asyncBehavior: streaming

When streaming is enabled, IG makes use of the following: 

  • Java's Executors.newFixedThreadPool() with the number of threads specified in the numberOfWorkers option to execute requests asynchronously. As a guide, numberOfWorkers should be set to approximately the number of CPUs.
  • Apache's HttpClient's I/O reactor to react to I/O events and to dispatch event notifications. See Asynchronous I/O based on NIO for further information.

It is possible to end up in a thread starving situation as described in OPENIG-2417 (Thread starvation on CHF response reception/consumption with async http client), particularly if the numberOfWorkers option is set incorrectly.

See Configuration Reference › ClientHandler and ReverseProxyHandler for further information about the asyncBehavior setting in these handlers.
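Putting these options together, a streaming ClientHandler might be declared as in the following sketch (the handler name is a placeholder, and numberOfWorkers of 4 assumes a 4-CPU host):

```json
{
  "name": "StreamingClientHandler",
  "type": "ClientHandler",
  "config": {
    "asyncBehavior": "streaming",
    "numberOfWorkers": 4
  }
}
```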

Q. How do I increase the connection timeout in IG/OpenIG?

A. You might want to increase the connection timeout in IG/OpenIG if you are seeing errors such as the following:

org.apache.http.ConnectionClosedException: Connection closed null

You can do this by increasing the timeouts (connectionTimeout and soTimeout) for the ClientHandler in the IG/OpenIG route. See Configuration Reference › ClientHandler and 502 Bad Gateway or SocketTimeoutException when using IG (All versions) for further information. 
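For example, the following is a sketch of a ClientHandler with increased timeouts (the handler name and timeout values are illustrative; tune them to your environment):

```json
{
  "name": "TimeoutClientHandler",
  "type": "ClientHandler",
  "config": {
    "connectionTimeout": "20 seconds",
    "soTimeout": "60 seconds"
  }
}
```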

Q. Is a handler (and associated settings such as connection pools and timeouts) shared across all routes?

A. Handlers, filters, and other objects defined in a heap are available to the current configuration and to all its children. For example, a ClientHandler called "MyClientHandler" defined in the heap of config.json is shared by all routes that reference "MyClientHandler" as a handler. When sharing handlers, you need to understand the implications for workers, timeouts, and connection pools. If a particular use case requires unique settings, defining the handler in the route's heap or inline may be a better option.

See Gateway Guide › Configuring Objects Inline or In the Heap for further information.
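For example, the following is a sketch of a config.json heap declaring a shared handler ("MyClientHandler" is a placeholder name; the rest of config.json is omitted):

```json
{
  "heap": [
    {
      "name": "MyClientHandler",
      "type": "ClientHandler",
      "config": {
        "connectionTimeout": "10 seconds"
      }
    }
  ]
}
```

Any route that sets "handler": "MyClientHandler" then shares this single instance, including its connection pool and timeout settings.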

Q. How can I avoid Cannot assign requested address exceptions?

A. The "Cannot assign requested address" exception indicates that there are too many open connections; it is often seen alongside 502 Bad Gateway errors.

For example, you may see the following exception in your logs:

java.lang.RuntimeException: Cannot assign requested address Cannot assign requested address Cannot assign requested address (Address not available)

You can resolve this issue on Linux® systems by changing the following operating system parameters:

  1. Increase the local ports range to something like 9000-65535:
    • Check the current local ports range:
      $ cat /proc/sys/net/ipv4/ip_local_port_range 
    • Increase the range:
      $ echo 9000 65535 > /proc/sys/net/ipv4/ip_local_port_range
  2. Enable sockets in a TIME_WAIT state to be re-used for new connections:
    $ echo "1" > /proc/sys/net/ipv4/tcp_tw_reuse
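Note that values written under /proc take effect immediately but do not survive a reboot. To make the changes persistent, you can add equivalent entries to /etc/sysctl.conf and reload them with sysctl -p (a sketch using the same example values):

```
# /etc/sysctl.conf
net.ipv4.ip_local_port_range = 9000 65535
net.ipv4.tcp_tw_reuse = 1
```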

Q. Should I use Apache's HttpClient in my scripts?

A. No. It is preferable to use IG/OpenIG's HttpClient in your configuration and Groovy scripts because it gives better performance (increased throughput). You can implement and tune IG/OpenIG's HttpClient (org.forgerock.http) using the ClientHandler. See Configuration Reference › ClientHandler and API Javadoc › Class Client for further information.

If you already use Apache's HttpClient (org.apache.http) and are seeing high CPU, you should change your scripts to use the IG/OpenIG implementation instead.
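As an illustrative sketch of the preferred approach (assuming a ScriptableFilter, where IG binds its HttpClient as http alongside next, context, and request; the backend URI is a placeholder):

```groovy
import org.forgerock.http.protocol.Request

// Build a request for IG's HttpClient (org.forgerock.http.Client),
// available in scripts via the 'http' binding.
Request backendRequest = new Request()
backendRequest.method = "GET"
backendRequest.uri = "https://backend.example.com/api/status"

// http.send() is asynchronous and returns a promise; chain the result
// instead of blocking, then continue the filter chain.
return http.send(backendRequest).thenAsync { response ->
    next.handle(context, request)
}
```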

See Also

How do I configure an idle timeout in IG/OpenIG (All versions)?

How do I change the JVM heap size for IG/OpenIG (All versions)?

How do I enable Garbage Collector (GC) Logging for IG/OpenIG (All versions)?

Best practice for JVM Tuning with G1 GC

Best practice for JVM Tuning with CMS GC

Performance tuning and monitoring ForgeRock products

Maintenance Guide › Tuning Performance

Copyright and Trademarks

Copyright © 2020 ForgeRock, all rights reserved.