
FAQ: IG performance and tuning

Last updated Jun 21, 2021

The purpose of this FAQ is to provide answers to commonly asked questions regarding performance and tuning for IG.



Frequently asked questions

Caution

This article contains tips on performance and tuning to get you started. Performance tuning is environment-specific and outside the scope of ForgeRock support; if you want more tailored advice, consider engaging Deployment Support Services.

Q. How can I troubleshoot performance issues?

A. Follow the advice in How do I generate more detailed debug logs to diagnose an issue in IG (All versions)? to obtain more detailed logs for troubleshooting. It is also good practice to decorate filters and handlers with the timer decorator, as this generates start/stop timings in the logs for each decorated component and can help pinpoint bottlenecks in the route. See TimerDecorator for further information.
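As an illustrative sketch, a decorated handler declaration might look like the following (the handler name and empty config are placeholders, not taken from this article):

```json
{
  "name": "TimedClientHandler",
  "type": "ClientHandler",
  "timer": true,
  "config": {}
}
```

With "timer": true set alongside the object's type, IG logs elapsed-time entries for this object each time it processes a request, which you can correlate across the route to find the slowest component.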

Q. What cache settings can I enable?

A. You can enable caching for the following objects using the properties listed (caching is disabled by default):

  • TemporaryStorage (no property) - caches streamed content. See TemporaryStorage.
  • AmService (sessionCache) - caches session information from AM. See AmService.
  • UserProfileFilter (cache in userProfileService) - caches AM user profiles. See UserProfileFilter.
  • PolicyEnforcementFilter (cache) - caches policy decisions from AM. See PolicyEnforcementFilter.
  • OAuth2ClientFilter (cacheExpiration) - caches user information from the OIDC Provider. See OAuth2ClientFilter.
  • OAuth2ResourceServerFilter (cache) - caches OAuth 2.0 access tokens. See OAuth2ResourceServerFilter.

See Cache for further information. 
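As an example, enabling the PolicyEnforcementFilter cache might look like the following sketch (the application name, subject expression, and timeout values are illustrative placeholders):

```json
{
  "type": "PolicyEnforcementFilter",
  "config": {
    "amService": "AmService",
    "application": "MyPolicySet",
    "ssoTokenSubject": "${attributes.ssoToken.value}",
    "cache": {
      "enabled": true,
      "defaultTimeout": "1 minute",
      "maxTimeout": "5 minutes"
    }
  }
}
```

Choose cache timeouts in line with how quickly you need policy changes in AM to take effect; a longer timeout reduces calls to AM but delays propagation of policy updates.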

Q. How can I limit the number of requests if the backend is slow or it can't handle a high load?

A. Once a request (connection) has been accepted by the web application container where IG runs, it is put in the process queue. If the backend is slow or can't handle a high load, then the process queue can build up. If the client subsequently closes the connection (for example, due to a timeout) the connection remains in the queue. Potentially, these connections can build up and cause IG to become unresponsive and stop accepting new connections.

You can use one of the following approaches to prevent the queue from building up in the first place:

  • Use a Throttling Filter - you can insert a ThrottlingFilter into your route to limit the number of requests that reach the application. See Throttle the Rate of Requests to Protected Applications and ThrottlingFilter for further information.
  • Set the maximum number of connections/threads on the HTTP connector - you can limit the number of concurrent connections that can be accepted and/or the number of simultaneous requests that can be handled by the web application container; the exact settings depend on your container (for example, the maxThreads and acceptCount attributes on Tomcat's HTTP connector).

See Tuning IG's Tomcat Container for further information.
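As an illustrative sketch, a ThrottlingFilter limiting each client to 6 requests every 60 seconds might be declared as follows (the grouping expression is a placeholder; choose one that identifies clients appropriately in your deployment):

```json
{
  "type": "ThrottlingFilter",
  "config": {
    "requestGroupingPolicy": "${request.headers['X-Forwarded-For'][0]}",
    "rate": {
      "numberOfRequests": 6,
      "duration": "60 seconds"
    }
  }
}
```

Requests that exceed the rate for their group receive an HTTP 429 response instead of being queued behind a slow backend.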

Q. What throttling algorithm does IG use and how does it work?

A. The throttling algorithm used in IG is the Token bucket algorithm. This is the only supported algorithm in IG and allows a more regulated flow under load; however, since this algorithm does not prevent traffic bursts, the throttling rate may occasionally deviate from the defined limit. See Throttle the Rate of Requests to Protected Applications for further information.

This algorithm throttles and smooths the traffic once the incoming rate of requests is higher than the declared throttling rate. Initially the bucket is filled with tokens (requests) without any rate limit (seen as a traffic burst). Once the bucket is full, the rate limit is applied:

  • Before the bucket is full: IG keeps accepting requests.
  • Once the bucket is full and IG receives more requests than the configured throttling rate (for example, 6 requests every 60 seconds), the algorithm allows the bucket to push one token out every duration/requests seconds (in this example, 60/6, which is every 10 seconds). As tokens are pushed out, additional tokens can be accepted.

In your log, you will see HTTP/1.1 200 responses when tokens are accepted and HTTP/1.1 429 responses when the bucket is full. Interspersed among the HTTP/1.1 429 responses, you will see an HTTP/1.1 200 response whenever a token has been pushed out, allowing another token to be accepted. For example, you will see something similar to this in your log:

[.559s] HTTP/1.1 200
[.663s] HTTP/1.1 200
[.747s] HTTP/1.1 200
[.837s] HTTP/1.1 200
[.925s] HTTP/1.1 200
[1.007s] HTTP/1.1 200
[1.093s] HTTP/1.1 429
[1.167s] HTTP/1.1 429
[1.260s] HTTP/1.1 429
[1.350s] HTTP/1.1 429
[1.440s] HTTP/1.1 429
[1.518s] HTTP/1.1 429
[1.602s] HTTP/1.1 429
[1.688s] HTTP/1.1 429
[1.761s] HTTP/1.1 429
[1.846s] HTTP/1.1 429
[1.920s] HTTP/1.1 429
[1.998s] HTTP/1.1 429
[2.083s] HTTP/1.1 200
[2.161s] HTTP/1.1 429
...

Q. What other settings can I tune to determine the capacity a system can handle?

A. You can also tune the TCP level backlog queue, paying particular attention to the following operating system parameters:

  • net.ipv4.tcp_max_syn_backlog
  • net.core.somaxconn
  • net.core.netdev_max_backlog

It is recommended that you start with the default values and then tune them once you have finished tuning IG and the JVM. There is a third-party blog that provides a good explanation on these particular settings: Tuning your Linux kernel and HAProxy instance for high loads > TCP parameters.

Large backlog queues are only suitable for well tuned environments that can process a lot of requests quickly. If your backlog queue is too big for your system, you will likely see slow responses, requests getting queued at the TCP level and clients timing out. This happens because the web application container is not able to accept requests at the same rate, which means they are queued up while they are waiting. Conversely, if the backlog queue is too small, clients will see connection exceptions instead of timing out.

You should also consider these settings in conjunction with the maximum number of connections/threads permitted by your web application container (Q. How can I limit the number of requests if the backend is slow or it can't handle a high load?). If the number of connections permitted is low but you have a big backlog queue, connections will continue to be accepted even when the servers are running slowly.

Monitoring the size of the backlog

You can use one of the following commands to monitor the size of the backlog:

  • lsof command - this command indicates how many connections are queued at the TCP level:
$ lsof -a -i -s TCP:SYN_RECV -p [processId]
Where [processId] is the process ID of the web application container that is running the IG process.
  • netstat command - this command shows the current global count of connections in the queue:
$ netstat -an | grep -c SYN_RECV
You can break this down further by the port of the container's http(s) listener:
$ netstat -an | grep SYN_RECV | grep -c [port]
Where [port] is the port that the container is listening on, for example, 443.
Note

Depending on your operating system, you may need to replace SYN_RECV in the above commands with SYN_RCVD. You should check which version of SYN_RECEIVED is used in your distribution.

Q. How does streaming work for the ClientHandler and ReverseProxyHandler?

A. As of IG 6, you can enable streaming mode in the ClientHandler and the ReverseProxyHandler with the following option: asyncBehavior: streaming

To understand how streaming works, let's first look at how the ClientHandler and the ReverseProxyHandler handle requests (this processing is the same regardless of whether streaming is enabled or not):

When the ClientHandler/ReverseProxyHandler receives an incoming request, it requests a pooled connection from the PoolingNHttpClientConnectionManager, which in turn creates a connection manager thread (named pool-<id>-thread-<id>) to process the request. This connection manager thread then runs worker threads (named I/O dispatcher <id>) to handle the actual I/O processing.

IG makes use of the following to process these incoming requests:

  • Java's Executors.newFixedThreadPool() with the number specified in the numberOfWorkers option to execute requests asynchronously. Because the I/O worker threads handle the actual I/O processing quickly and because thread management is resource intensive, we recommend setting the numberOfWorkers option to approximately the number of CPU cores.
  • Apache's HttpClient's I/O reactor to react to I/O events and to dispatch event notifications. See Asynchronous I/O based on NIO for further information.

Now let's look at how streaming mode affects this processing:

  • When streaming is not enabled, the connection manager thread waits for the worker thread to receive all the entity content from the backend. This means the request handling is: ClientHandler/ReverseProxyHandler -async-> connection manager thread -sync-> worker thread.
  • When streaming is enabled, the connection manager thread waits for the worker thread to receive the response header but does not wait for all the entity content to be received from the backend. This means the request handling is: ClientHandler/ReverseProxyHandler -async-> connection manager thread -async-> worker thread.

See ClientHandler and ReverseProxyHandler for further information about the asyncBehavior setting in these handlers.
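Combining the two settings discussed above, a streaming handler declaration might look like this sketch (the handler name is a placeholder, and numberOfWorkers assumes a 4-core host):

```json
{
  "name": "StreamingReverseProxyHandler",
  "type": "ReverseProxyHandler",
  "config": {
    "asyncBehavior": "streaming",
    "numberOfWorkers": 4
  }
}
```

The same two properties apply to a ClientHandler declaration.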

Note

It is possible to end up in a thread starvation situation when streaming is enabled, as described in OPENIG-2417 (Thread starvation on CHF response reception/consumption with async http client), particularly if the numberOfWorkers option is set incorrectly.

Q. How do I increase the connection timeout in IG?

A. You might want to increase the connection timeout in IG if you are seeing errors such as the following:

org.apache.http.ConnectionClosedException: Connection closed
java.net.SocketTimeoutException: null

You can do this by increasing the timeouts (connectionTimeout and soTimeout) for the ClientHandler in the IG route. See ClientHandler and 502 Bad Gateway or SocketTimeoutException when using IG (All versions) for further information. 
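For example, the timeouts can be raised on the ClientHandler in the route as in the following sketch (the duration values are illustrative; choose values that suit your backend's response times):

```json
{
  "type": "ClientHandler",
  "config": {
    "connectionTimeout": "20 seconds",
    "soTimeout": "30 seconds"
  }
}
```

connectionTimeout bounds how long IG waits to establish a connection to the backend, while soTimeout bounds how long it waits for data on an established connection.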

Q. Is a handler (and associated settings such as connection pools and timeouts) shared across all routes?

A. Handlers, filters, and other objects that are defined in a heap are available in the current configuration and to all its children. For example, a handler of type ClientHandler called "MyClientHandler" defined in the heap of config.json will be shared by all routes that use "MyClientHandler" as a handler. The implications for workers, timeouts, and connection pools need to be understood when sharing handlers. If unique settings are required for a particular use case, then defining the handler in the route's heap or inline might be a better option.

See Configure Objects Inline or In the Heap for further information.
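As a sketch, a shared handler declared in the heap of config.json might look like this (the name and config values are placeholders):

```json
{
  "heap": [
    {
      "name": "MyClientHandler",
      "type": "ClientHandler",
      "config": {
        "connectionTimeout": "10 seconds"
      }
    }
  ]
}
```

Any route can then reference it by name, for example with "handler": "MyClientHandler"; all such routes share the same connection pool and timeout settings.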

Q. How can I avoid Cannot assign requested address exceptions?

A. The "Cannot assign requested address" exception indicates there are too many open connections and is often seen with 502 Bad Gateway errors. 

Examples of the "Cannot assign requested address" exception that you may see in your logs:

java.lang.RuntimeException: java.net.BindException: Cannot assign requested address
java.net.NoRouteToHostException: Cannot assign requested address
java.net.NoRouteToHostException: Cannot assign requested address (Address not available)

You can resolve this issue on Linux® systems by changing the following operating system parameters:

  1. Increase the local port range to something like 9000-65535:
    • Check the current local port range:
$ cat /proc/sys/net/ipv4/ip_local_port_range
    • Increase the range:
$ echo 9000 65535 > /proc/sys/net/ipv4/ip_local_port_range
  2. Enable sockets in a TIME_WAIT state to be reused for new connections:
$ echo "1" > /proc/sys/net/ipv4/tcp_tw_reuse

Q. Should I use Apache's HttpClient in my scripts?

A. No. It is preferable to use IG's HttpClient in your configuration and Groovy scripts because it gives better performance (increased throughput). You can implement and tune IG's HttpClient (org.forgerock.http) using the ClientHandler. See ClientHandler and Class Client for further information.

If you already use Apache's HttpClient (org.apache.http) and are seeing high CPU, you should change your scripts to use the IG implementation instead.

See Also

How do I configure an idle timeout in IG (All versions)?

How do I change the JVM heap size for IG (All versions)?

How do I enable Garbage Collector (GC) Logging for IG (All versions)?

Best practice for JVM Tuning with G1 GC

Best practice for JVM Tuning with CMS GC

Performance tuning and monitoring ForgeRock products

Tuning Performance

Related Training

N/A


Copyright and Trademarks Copyright © 2021 ForgeRock, all rights reserved.