- Q. How can I troubleshoot performance issues?
- Q. What cache settings can I enable?
- Q. How can I limit the number of requests if the backend is slow or it can't handle a high load?
- Q. What throttling algorithm does IG use and how does it work?
- Q. What other settings can I tune to determine the capacity a system can handle?
- Q. How does streaming work for the ClientHandler and ReverseProxyHandler?
- Q. How do I increase the connection timeout in IG?
- Q. Is a handler (and associated settings such as connection pools and timeouts) shared across all routes?
- Q. How can I avoid Cannot assign requested address exceptions?
- Q. Should I use Apache's HttpClient in my scripts?
This article contains tips on Performance and Tuning to get you started. Performance tuning is environment specific and is outside the scope of ForgeRock support; if you want more tailored advice, consider engaging Deployment Support Services.
A. Follow the advice in How do I generate more detailed debug logs to diagnose an issue in IG (All versions)? to obtain more detailed logs for troubleshooting. It is also good practice to decorate filters and handlers with the timer decorator: it logs start/stop timings for each decorated component, which helps pinpoint possible bottlenecks in the route. See TimerDecorator for further information.
| Object | Property | What it caches | Documentation |
|---|---|---|---|
| AmService | sessionCache | Session information from AM | AmService |
| UserProfileFilter | cache in userProfileService | AM user profiles | UserProfileFilter |
| PolicyEnforcementFilter | cache | Policy decisions from AM | PolicyEnforcementFilter |
| OAuth2ClientFilter | cacheExpiration | User information from the OIDC Provider | OAuth2ClientFilter |
| OAuth2ResourceServerFilter | cache | OAuth2 access tokens | OAuth2ResourceServerFilter |
See Cache for further information.
A. Once a request (connection) has been accepted by the web application container where IG runs, it is put in the process queue. If the backend is slow or can't handle a high load, then the process queue can build up. If the client subsequently closes the connection (for example, due to a timeout) the connection remains in the queue. Potentially, these connections can build up and cause IG to become unresponsive and stop accepting new connections.
You can use one of the following approaches to prevent the queue building up in the first place:
- Use a Throttling Filter - you can insert a ThrottlingFilter into your route to limit the number of requests that reach the application. See Throttle the Rate of Requests to Protected Applications and ThrottlingFilter for further information.
- Set the maximum number of connections/threads on the HTTP connector - you can limit the number of concurrent connections that can be accepted and/or the number of simultaneous requests that can be handled by the web application container, using the connector's connection and thread settings.
See Tuning IG's Tomcat Container for further information.
A. The throttling algorithm used in IG is the Token bucket algorithm. This is the only supported algorithm in IG and allows a more regulated flow under load; however, since this algorithm does not prevent traffic bursts, the throttling rate may occasionally deviate from the defined limit. See Throttle the Rate of Requests to Protected Applications for further information.
This algorithm throttles and smooths the traffic once the incoming rate of requests is higher than the declared throttling rate. Initially the bucket is filled with tokens (requests) without any rate limit (seen as a traffic burst). Once the bucket is full, the rate limit is applied:
- Before the bucket is full: IG keeps accepting requests.
- Once the bucket is full and IG receives more requests than the configured throttling rate (for example, 6 requests every 60 seconds), the algorithm allows the bucket to push one token out every duration/requests seconds (in this example, 60/6, which is every 10 seconds). As tokens are pushed out, additional tokens can be accepted.
In your log, you will see HTTP/1.1 200 responses when tokens are accepted and HTTP/1.1 429 responses when the bucket is full. Interspersed among the HTTP/1.1 429 responses are HTTP/1.1 200 responses, each logged when a token has been pushed out and another token can be accepted. For example, you will see something similar to this in your log:
[.559s] HTTP/1.1 200
[.663s] HTTP/1.1 200
[.747s] HTTP/1.1 200
[.837s] HTTP/1.1 200
[.925s] HTTP/1.1 200
[1.007s] HTTP/1.1 200
[1.093s] HTTP/1.1 429
[1.167s] HTTP/1.1 429
[1.260s] HTTP/1.1 429
[1.350s] HTTP/1.1 429
[1.440s] HTTP/1.1 429
[1.518s] HTTP/1.1 429
[1.602s] HTTP/1.1 429
[1.688s] HTTP/1.1 429
[1.761s] HTTP/1.1 429
[1.846s] HTTP/1.1 429
[1.920s] HTTP/1.1 429
[1.998s] HTTP/1.1 429
[2.083s] HTTP/1.1 200
[2.161s] HTTP/1.1 429
...
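The behaviour described above can be sketched as a small simulation. This is only an illustration of the description in this article, not IG's implementation: the names BucketSketch and simulate are hypothetical, timestamps are supplied by the caller in milliseconds, and the rate of 6 requests every 60 seconds is taken from the example above.

```python
class BucketSketch:
    """Toy model of the throttling behaviour described above (not IG's code)."""

    def __init__(self, requests: int, duration_ms: int) -> None:
        self.capacity = requests                    # how many tokens the bucket holds
        self.interval_ms = duration_ms // requests  # one token is pushed out per interval
        self.used = 0                               # tokens currently in the bucket
        self.next_release_ms = 0                    # when the next token is pushed out

    def allow(self, now_ms: int) -> bool:
        # Push out one token for every full interval that has elapsed.
        while self.used > 0 and now_ms >= self.next_release_ms:
            self.used -= 1
            self.next_release_ms += self.interval_ms
        if self.used >= self.capacity:
            return False  # bucket is full: the request would get a 429
        if self.used == 0:
            # Empty bucket: the release timer starts with this request.
            self.next_release_ms = now_ms + self.interval_ms
        self.used += 1
        return True  # token accepted: the request would get a 200


def simulate(arrivals_ms, requests: int, duration_ms: int) -> list:
    """Return the status code (200 or 429) for each arrival time."""
    bucket = BucketSketch(requests, duration_ms)
    return [200 if bucket.allow(t) else 429 for t in arrivals_ms]


if __name__ == "__main__":
    # One request per second for 21 seconds against "6 requests every 60 seconds".
    arrivals = list(range(0, 21000, 1000))
    for t, code in zip(arrivals, simulate(arrivals, 6, 60000)):
        print(f"[{t // 1000}s] HTTP/1.1 {code}")
```

With these inputs the first six requests are accepted as a burst; after that, one request is accepted every 10 seconds (at 10 s and 20 s) and the others are rejected.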
It is recommended that you start with the default values and then tune them once you have finished tuning IG and the JVM. There is a third-party blog that provides a good explanation on these particular settings: Tuning your Linux kernel and HAProxy instance for high loads > TCP parameters.
Large backlog queues are only suitable for well tuned environments that can process a lot of requests quickly. If your backlog queue is too big for your system, you will likely see slow responses, requests getting queued at the TCP level and clients timing out. This happens because the web application container is not able to accept requests at the same rate, which means they are queued up while they are waiting. Conversely, if the backlog queue is too small, clients will see connection exceptions instead of timing out.
You should also consider these settings in conjunction with the maximum number of connections/threads permitted by your web application container (Q. How can I limit the number of requests if the backend is slow or it can't handle a high load?). If the number of connections permitted is low but you have a big backlog queue, connections will continue to be accepted even when the servers are running slowly.
Monitoring the size of the backlog
You can use one of the following commands to monitor the size of the backlog:
- lsof command - this command indicates how many connections are queued at the TCP level:
  $ lsof -a -i -s TCP:SYN_RECV -p [processId]
  Where [processId] is the process ID of the web application container that is running the IG process.
- netstat command - this command shows the current global count of connections in the queue:
  $ netstat -an | grep -c SYN_RECV
  You can break this down further by the port of the container's http(s) listener:
  $ netstat -an | grep SYN_RECV | grep -c [port]
  Where [port] is the port that the container is listening on, for example, 443. Note that the count (-c) must be applied by the last grep in the pipeline; counting first and then filtering by port would only search the number that the first grep prints.
Depending on your operating system, you may need to replace SYN_RECV in the above commands with SYN_RCVD. You should check which name your distribution uses for the SYN_RECEIVED state.
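If you want to count by the listener's local port only, and handle both spellings of the state, a small script can parse the netstat output instead of chaining grep. The following is a minimal sketch: the function name count_syn_recv is hypothetical, and it assumes the usual TCP line layout of netstat -an (protocol, two queue columns, local address, foreign address, state).

```python
from typing import Optional

SYN_STATES = {"SYN_RECV", "SYN_RCVD"}  # spelling differs between operating systems


def count_syn_recv(netstat_output: str, port: Optional[int] = None) -> int:
    """Count half-open connections, optionally only those on a given local port."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local-address foreign-address state
        if len(fields) < 6 or fields[-1] not in SYN_STATES:
            continue
        local_address = fields[3]
        # The port follows the last ':' (for example Linux) or '.' (for example macOS).
        local_port = local_address.replace(".", ":").rsplit(":", 1)[-1]
        if port is None or local_port == str(port):
            count += 1
    return count
```

Pass the captured output of netstat -an to the function. Matching on the local address column avoids counting lines where the port number only appears in the foreign address.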
When the ClientHandler/ReverseProxyHandler receives an incoming request, it requests a pooled connection from the PoolingNHttpClientConnectionManager, which in turn creates a connection manager thread (named pool-<id>-thread-<id>) to process the request. This connection manager thread then runs worker threads (named I/O dispatcher <id>) to handle the actual I/O processing.
IG makes use of the following to process these incoming requests:
- Java's Executors.newFixedThreadPool() with the number specified in the numberOfWorkers option to execute requests asynchronously. Because the I/O worker threads handle the actual I/O process quite quickly and because thread management is resource intensive, we recommend setting the numberOfWorkers option to approximately the number of CPU cores.
- Apache's HttpClient's I/O reactor to react to I/O events and to dispatch event notifications. See Asynchronous I/O based on NIO for further information.
Now let's look at how streaming mode affects this processing:
- When streaming is not enabled, the connection manager thread waits for the worker thread to receive all the entity content from the backend. This means the request handling is: ClientHandler/ReverseProxyHandler -async-> connection manager thread -sync-> worker thread.
- When streaming is enabled, the connection manager thread waits for the worker thread to receive the response header but does not wait for all the entity content to be received from the backend. This means the request handling is: ClientHandler/ReverseProxyHandler -async-> connection manager thread -async-> worker thread.
It is possible to end up with thread starvation when streaming is enabled, as described in OPENIG-2417 (Thread starvation on CHF response reception/consumption with async http client), particularly if the numberOfWorkers option is set incorrectly.
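As a generic illustration of how a fixed-size pool can starve (this is not a reproduction of OPENIG-2417 and does not use IG or Apache HttpClient code), the sketch below runs a task that waits for a second task submitted to the same pool. The function name run_with_workers and the returned strings are hypothetical; a timeout is used so that the example terminates instead of hanging.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def run_with_workers(workers: int, wait_s: float = 0.2) -> str:
    """Run a task that depends on another task queued in the same fixed pool."""
    pool = ThreadPoolExecutor(max_workers=workers)

    def read_body() -> str:
        return "body received"

    def handle_response() -> str:
        # This task occupies one worker while it waits for read_body(),
        # which needs a free worker from the same pool.
        pending = pool.submit(read_body)
        try:
            return pending.result(timeout=wait_s)
        except TimeoutError:
            return "starved"

    try:
        return pool.submit(handle_response).result()
    finally:
        pool.shutdown(wait=True)
```

With a single worker, the dependent task cannot start until the waiting task gives up, so the call returns "starved"; with two workers it returns "body received". The same principle applies to any fixed pool whose threads wait on work that needs a thread from that pool.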
A. You can increase the connection timeout by increasing the timeouts (connectionTimeout and soTimeout) for the ClientHandler in the IG route. See ClientHandler and 502 Bad Gateway or SocketTimeoutException when using IG (All versions) for further information.
Q. Is a handler (and associated settings such as connection pools and timeouts) shared across all routes?
A. Handlers, filters etc that are defined in a heap are available in the current configuration and to all its children. For example, a handler of type ClientHandler called "MyClientHandler" defined in the heap of config.json will be shared by all routes that make use of "MyClientHandler" as a handler. Considerations around workers, timeouts and connection pools need to be understood when sharing handlers. If unique settings are required for a particular use-case, then defining the handler in the route's heap or inline might be a better option.
See Configure Objects Inline or In the Heap for further information.
Examples of the "Cannot assign requested address" exception that you may see in your logs:
java.lang.RuntimeException: java.net.BindException: Cannot assign requested address
java.net.NoRouteToHostException: Cannot assign requested address
java.net.NoRouteToHostException: Cannot assign requested address (Address not available)
You can resolve this issue on Linux® systems by changing the following operating system parameters:
- Increase the local ports range to something like 9000-65535:
  - Check the current local ports range: $ cat /proc/sys/net/ipv4/ip_local_port_range
  - Increase the range: $ echo 9000 65535 > /proc/sys/net/ipv4/ip_local_port_range
- Enable sockets in a TIME_WAIT state to be re-used for new connections: $ echo "1" > /proc/sys/net/ipv4/tcp_tw_reuse
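To see how much headroom a given range provides, you can count the ports it contains. The sketch below parses the two numbers stored in ip_local_port_range; the function name ephemeral_port_count is hypothetical, and the 32768-60999 range used for comparison is an assumption (a common default on recent Linux kernels), so check the value on your own system.

```python
def ephemeral_port_count(port_range_text: str) -> int:
    """Return how many local ports a 'low high' range string contains."""
    low, high = (int(value) for value in port_range_text.split())
    if low > high:
        raise ValueError(f"invalid port range: {low} > {high}")
    return high - low + 1  # both ends of the range are usable


if __name__ == "__main__":
    path = "/proc/sys/net/ipv4/ip_local_port_range"
    try:
        with open(path) as handle:
            print("current range holds", ephemeral_port_count(handle.read()), "ports")
    except OSError:
        print(path, "is not available on this system")
    print("9000-65535 holds", ephemeral_port_count("9000 65535"), "ports")
```

Under the assumed default, the range holds 28232 ports, whereas 9000-65535 holds 56536, roughly doubling the number of local ports available for outbound connections.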
A. No. It is preferable to use IG's HttpClient in your configuration and Groovy scripts because it gives better performance (increased throughput). You can implement and tune IG's HttpClient (org.forgerock.http) using the ClientHandler. See ClientHandler and Class Client for further information.
If you already use Apache's HttpClient (org.apache.http) and are seeing high CPU, you should change your scripts to use the IG implementation instead.