Tuning concurrent request settings for Apache HTTP Client
The Apache HTTP client library is widely used in Java applications. It is easy to use and comes in both blocking and non-blocking (async) variants. Even though modern JDKs ship a built-in HTTP client that developers may prefer, Apache HTTP client can still end up in an application through third-party libraries that depend on it. Examples include Spring Boot RestTemplate and the Elasticsearch Java API Client (or the Elasticsearch Java REST Client in older versions), which rely, or can be configured to rely, on Apache HTTP client for making REST calls to the server.
In this post I discuss the default settings that control the number of concurrent requests for Apache HTTP client and how to tune them. I have prepared a set of example client-server applications to check the impact of tuning the connection pool settings. The source code and instructions for running each set of client and server applications are available at apache-http-client-tuning-examples.
When an instance of Apache HTTP client is created with default settings, whether the blocking or the non-blocking (async) variant, it has the following connection pool settings:
Max connections per route: 2
Max total connections: 20
Max connections per route controls how many concurrent requests can be sent per route (a hostname + port combination). For example, if a service is running on 184.108.40.206:8888, an Apache HTTP client instance with default settings will send at most 2 concurrent requests to that service. Any further requests initiated on the same route are queued until an active request completes and its connection is released. Max total connections controls the overall maximum number of concurrent requests from a given Apache HTTP client instance, regardless of route.
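The queuing effect described above can be illustrated with a toy model that uses only the JDK. This is a sketch of the idea, not how Apache HTTP client is implemented internally: a `Semaphore` stands in for the per-route connection limit, and the class name `PerRouteLimitModel` and its parameters are made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model (hypothetical, JDK only): a semaphore plays the role of the
// per-route connection limit of a pooled HTTP client.
final class PerRouteLimitModel {

    /** Simulates `requests` callers hitting one route; returns the peak number in flight. */
    static int peakInFlight(int requests, int maxPerRoute, long requestMillis) {
        Semaphore routeSlots = new Semaphore(maxPerRoute, true);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        // One thread per caller, like the 50 dedicated threads of the blocking demo client.
        ExecutorService callers = Executors.newFixedThreadPool(requests);

        for (int i = 0; i < requests; i++) {
            callers.submit(() -> {
                try {
                    routeSlots.acquire(); // callers beyond the limit wait here (queued)
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(requestMillis); // stands in for the server's response time
                    } finally {
                        inFlight.decrementAndGet();
                        routeSlots.release(); // frees the slot for a queued caller
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        callers.shutdown();
        try {
            if (!callers.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("simulation did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak in flight: " + peakInFlight(10, 2, 50));
    }
}
```

However many callers are started, the reported peak never exceeds `maxPerRoute`; the remaining callers simply wait for a slot, which is the behavior the default limit of 2 imposes on a real route.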
To test this behavior with default settings, we can use the demo-web application server (a reactive Spring Boot application) and the example apps mentioned in the Default HTTP Client Test section. The demo-web application server has a dummy API controller (/api/sample/request) which asynchronously waits for 5 seconds before sending a response. It also logs the total number of active requests every second.
The blocking version of the demo-client app creates a single Apache HTTP client (with default settings) and spawns 50 dedicated threads, which all attempt to send a request to the demo-web app server at the same time. It also logs the time taken to complete each request and the total time taken to complete all requests. The non-blocking version of the demo-client app is similar, but instead of creating 50 dedicated threads it initiates the async requests from the main thread and relies on a callback handler to calculate the time taken to complete each request.
The server logs show that only 2 concurrent requests reach the server at any given point in time.
Being limited to 2 concurrent requests per route is very restrictive. Modern server applications can usually handle hundreds of concurrent requests without noticeable degradation in throughput. We can tune these connection pool settings when creating an instance of Apache HTTP client.
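A minimal sketch of such tuning, assuming Apache HttpClient 4.x (and HttpAsyncClient 4.x for the async variant) is on the classpath. The per-route value of 15 is the one used in this post; the total of 100 and the class name `TunedHttpClients` are illustrative assumptions, not values taken from the example repository.

```java
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.nio.client.CloseableHttpAsyncClient;
import org.apache.http.impl.nio.client.HttpAsyncClients;

// Hypothetical helper class; requires the httpclient and httpasyncclient 4.x artifacts.
final class TunedHttpClients {

    static final int MAX_PER_ROUTE = 15; // value discussed in this post
    static final int MAX_TOTAL = 100;    // assumption: pick it to cover all routes in use

    /** Blocking client with a larger connection pool. */
    static CloseableHttpClient blockingClient() {
        return HttpClients.custom()
                .setMaxConnPerRoute(MAX_PER_ROUTE)
                .setMaxConnTotal(MAX_TOTAL)
                .build();
    }

    /** Non-blocking (async) client with the same pool limits. */
    static CloseableHttpAsyncClient asyncClient() {
        CloseableHttpAsyncClient client = HttpAsyncClients.custom()
                .setMaxConnPerRoute(MAX_PER_ROUTE)
                .setMaxConnTotal(MAX_TOTAL)
                .build();
        client.start(); // the async client must be started before it can execute requests
        return client;
    }
}
```

Both clients are `Closeable`, so whoever creates them should close them on shutdown. Keeping the total at or above the per-route value matters: a total lower than the per-route limit would cap concurrency first.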
With max connections per route raised to 15, the Apache HTTP client instance can send up to 15 concurrent requests per route. To test this behavior with tuned settings, we can use the demo-web application server (as mentioned previously) and the example apps mentioned in the Tuned HTTP Client Test section. The server logs show that these tuned example apps are able to send 15 concurrent requests to the server.
Many libraries are built on top of Apache HTTP client: they define higher-level, library-specific abstractions and rely on Apache HTTP client for the lower-level HTTP transport. The concurrent-request settings discussed in this post apply to such libraries as well, and they usually provide a way to customize the settings of the underlying Apache HTTP client. When we do not customize them, these libraries may apply their own defaults.
Take the Elasticsearch Java API Client, one of the popular libraries that depend on Apache HTTP client. It defines higher-level abstractions that let developers easily build requests for querying, indexing, monitoring and administering an Elasticsearch server. Under the hood it depends on Apache HTTP client for sending raw HTTP requests. When an instance of the Elasticsearch Java API Client is created without any customization, it uses the following connection pool settings:
Max connections per route: 10
Max total connections: 30
To test this behavior with the Elasticsearch client, we can use the demo-web application server again. It defines two Elasticsearch client instances: one uses the default settings and the other uses custom concurrent-request settings that raise max connections per route to 50.
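A tuned Elasticsearch client could be configured roughly as follows. This is a sketch, assuming the Elasticsearch Java API Client with its low-level `RestClient` transport and the Jackson JSON mapper on the classpath. The per-route limit of 50 matches the tuned test in this post; the host `localhost:9200`, the total of 100 and the class name `TunedElasticsearchClient` are assumptions made for illustration.

```java
import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.ElasticsearchTransport;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;

// Hypothetical factory class for a client with a larger connection pool.
final class TunedElasticsearchClient {

    static ElasticsearchClient create() {
        // The low-level REST client exposes the underlying Apache async client builder
        // through a callback, which is where the pool limits can be overridden.
        RestClient restClient = RestClient
                .builder(new HttpHost("localhost", 9200, "http")) // assumed address
                .setHttpClientConfigCallback(httpClientBuilder -> httpClientBuilder
                        .setMaxConnPerRoute(50)   // value used by the tuned test
                        .setMaxConnTotal(100))    // assumption, not from the post
                .build();

        ElasticsearchTransport transport =
                new RestClientTransport(restClient, new JacksonJsonpMapper());
        return new ElasticsearchClient(transport);
    }
}
```

Omitting the callback leaves the library defaults (10 per route, 30 total) in place, which is what the default client instance uses.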
In addition, the demo-web application server has two API controllers that index a received document in Elasticsearch. The first controller (/api/default/airline) uses the Elasticsearch client with default settings and the other controller (/api/tuned/airline) uses the Elasticsearch client with tuned settings. After a document is indexed, the controllers wait until it is available for search, that is, until the index is refreshed, which happens every second under the default index setting. Real-world applications usually do not need to wait until a newly indexed document is searchable, but I chose this request pattern in my example application to demonstrate the effect of connection pool settings on long-running requests.
The example client apps mentioned in the Default Index Request Test and Tuned Index Request Test sections call these controllers respectively. Each initiates 1000 concurrent requests to the demo-web application server and logs the time taken to complete each request and the total time taken to complete all requests.
The server logs show that all 1000 requests arrive simultaneously from the demo-client apps at the demo-web application server, and that requests to the Elasticsearch server are throttled according to the max connections per route setting. The Elasticsearch client with default settings sends 10 concurrent requests to Elasticsearch and takes about 1 minute 43 seconds to complete 1000 requests, while the client with tuned settings sends 50 concurrent requests and takes about 21 seconds.
By default, Apache HTTP client maintains a pool of persistent connections on a per-route basis. Idle persistent connections from the pool can be reused to send new requests on the same route. This avoids the repeated round trips needed to establish new connections, which can be time-consuming. However, the default pool sizes may be too constraining for many modern use cases and need to be tuned with the expected concurrent load and the target server's capacity in mind.
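These timings line up with a simple back-of-the-envelope model: if only N requests can be in flight at once, the requests drain in waves of N, and each wave lasts roughly one request latency. The sketch below encodes that model. The class and method names are hypothetical, and the latency of about 1 second per index request is an assumption based on the 1 second refresh interval, not a measured value.

```java
// Rough drain-time estimate for a fixed per-route concurrency limit (illustrative only).
final class DrainTimeEstimate {

    /** Estimated seconds to finish `requests` when at most `maxPerRoute` run at once. */
    static long estimateSeconds(int requests, int maxPerRoute, long secondsPerRequest) {
        if (requests < 0 || maxPerRoute <= 0 || secondsPerRequest < 0) {
            throw new IllegalArgumentException("invalid arguments");
        }
        // Ceiling division: the number of waves of at most maxPerRoute requests.
        long waves = (requests + (long) maxPerRoute - 1) / maxPerRoute;
        return waves * secondsPerRequest;
    }

    public static void main(String[] args) {
        // Elasticsearch tests: 1000 requests, assumed ~1 s each (wait for refresh).
        System.out.println(estimateSeconds(1000, 10, 1)); // default client: 100
        System.out.println(estimateSeconds(1000, 50, 1)); // tuned client: 20
        // Plain HTTP client tests: 50 requests against the 5 s dummy endpoint.
        System.out.println(estimateSeconds(50, 2, 5));    // default settings: 125
        System.out.println(estimateSeconds(50, 15, 5));   // tuned settings: 20
    }
}
```

The estimates of 100 and 20 seconds are close to the observed 1 minute 43 seconds and 21 seconds, which suggests that the connection limit, rather than Elasticsearch itself, dominates the total time in this test. The figures for the 50-request HTTP client test are estimates only.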