Thursday, 27 July 2017

Why I hate Java HTTPClient MaxConnectionsPerHost

This is a copy of a useful post on the topic, which has been taken down.
Originally at: http://blog.scriptkiddie.org/2010/09/06/why-i-hate-java-httpclient-maxconnectionsperhost/
Now on: http://archive.is/NFnl7


Background
The Java HttpClient package (Apache Commons HttpClient) is used by many software devs in SOA shops to make back-end connections from service to service. The Apache developers who wrote the HttpClient library were clearly thinking about its use in web browsers. As such they included a parameter, MaxConnectionsPerHost, which limits the number of simultaneous requests a browser can make to a website to 2, in order to avoid overloading the site. This made more sense back in the 90s when RFC 2068 was written with that recommendation; Firefox has since upped its default limit to 15, and I believe IE has raised its default to 8.
My contention is that in SOA shops, where servers are calling services, this connection limit is useless and should be disabled or raised to a very, very large value (10,000+).
The Problem
If you have a bank of servers calling a bank of other servers in an SOA environment, you can wind up hitting an artificial ceiling, imposed by this HttpClient limit, whose behavior is nearly identical to exhaustion of a database connection pool. It should be noted, however, that database connection pooling is necessary in order to reduce the cost of expensive database connection establishment and to reduce the memory impact on the database server of large numbers of connections (particularly with Oracle, less so with a thinner database like MySQL).
What this looks like is a bank of Java (Tomcat or whatever) servers which are idle but periodically spike to insane latency and timeouts. Nothing is maxing out CPU, I/O, network bandwidth or any other server resource. Similarly, the back-end service that these servers are trying to contact is also scaled out adequately for the load and has no obvious performance issues of its own, yet the response time for getting a reply back from the service is very, very slow as measured by the Java code making the HttpClient call. This problem can masquerade as a networking or load balancer issue and can drive network engineers nuts trying to track down why “the network is slow”.
What you do see on the Java app, however, is thread dumps with potentially hundreds of threads stuck in doGetConnection:
  "XXX THREAD NAME CHANGED TO PROTECT THE GUILTY XXX" daemon prio=10 tid=0x0000002ddb745800 nid=0x5edd in Object.wait() [0x000000005587f000]     java.lang.Thread.State: TIMED_WAITING (on object monitor)          at java.lang.Object.wait(Native Method)          at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:509)          - locked <0x0000002aa37a9208> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)          at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:394)          at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:152)          at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396)          at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:324)  [...etc...]  
So we have an idle service on one end, an idle service on the other end, everything is working, and yet the service is effectively crashed because of this artificial limit. This happens because in an SOA environment a limit of 2 simultaneous connections from a server to a service VIP is way, way too low. In a bank of 8 servers that means you can only have 16 simultaneous connections, and if you’re doing 100 tps in aggregate it only takes a few slow, expensive calls to the service to “clog up the pipes”.
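To put rough numbers on that (a back-of-the-envelope sketch using Little’s law; the latencies are made up for illustration), the number of simultaneous connections you need is roughly throughput times latency:

  connections needed ≈ 100 req/s × 0.05 s =   5    (healthy back end, 50 ms calls)
  connections needed ≈ 100 req/s × 2.00 s = 200    (a few expensive calls drag the average to 2 s)

With only 8 × 2 = 16 connections available across the farm, the second case can push at most 16 / 2 s = 8 requests per second through to the back end; the other ~92 requests per second pile up in the doGetConnection queue, and latency compounds from there.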
Why raising the limit to 10 or 20 or 30 is bad
The obviously prudent thing to do is to raise this limit to some value which is reasonable but still “protects” the back end service. We wouldn’t want to disable the limit entirely because… well, something bad would clearly happen and someone put that limit in there for a reason.
I’m going to argue that the reason that limit was put in there had nothing to do with your SOA environment, that I’ve worked in truly massive non-Java SOA environments that didn’t have this kind of limit and never saw an issue, and that there are much better ways to deal with scaling limits and brown-outs; this limit always gives you the behavior you don’t want.
I just diagnosed this problem again, for the hundredth time, in a situation where the MaxConnectionsPerHost limit had been raised to 10 on a bank of 8 servers. This had been running fine for a long time, but then 5 of the servers crashed at once due to a memory exhaustion issue. That was bad, but the set of servers was so overscaled that the surviving clients still had 60% idle CPU. The real problem was that the farm went from a limit of 80 simultaneous connections down to a limit of 30. That alone is what caused the entire farm to fail (due to timeouts).
Granted, having 5 of 8 servers out of rotation is a bad thing, but the farm could actually have taken the load, and this would have been an “oops, we had 5 servers out, damn we’re overscaled, good thing no customers were impacted” problem; instead the “prudent” limit of 10 turned it into an outage. I’d rather jack the limit up to something very, very large, make this problem simply go away, and stop encountering it. The limit didn’t do any good, and just caused us another outage.
Effect of Removing the Limit Completely
I worked for 5 years at a very large Seattle-based Internet retailer as one of the “Tier 3 or 4” senior SEs who would see any kind of crazy infrastructure problem like this bubble up to us. We were not Java-based at the time and were instead using process-based clients that simply had no concept of this kind of connection pooling to back-end SOA services. Any server could open up as many connections to a back-end service as it liked, each process could open up as many back-end connections as it liked, and the processes did not share any state to know in aggregate how many connections were open to any back-end server. With 30,000 servers and (literally) thousands of different deployed applications, we never encountered any issue that the MaxConnectionsPerHost limit would have solved. In my opinion, in an SOA shop this is a solution looking for a problem. I lived for 5 years in a massive environment and never once saw an issue that made me think to use something like this limit.
And I would argue that this is because HTTP connections are *massively* cheaper than Oracle connections. Sure, you need a little bit of TCP/IP to get one going, but modern processors can do many more of those connection opens per second than your servers are ever going to want to submit (100 tps coming out of a typical Java Tomcat app is going to be impressive programming, but the TCP/IP stack won’t break a sweat). Trying to do some kind of HTTP/1.1-based connection pooling with a finite limit on it (which in my experience is *not* what is typically going on when I see this HttpClient problem; most of the time the connections are not being reused at all) is a premature optimization in the Knuth sense.
Poor Behavior on Surge Traffic
A common thought is that this “protects” your back-end services from surges. But the infinite-queuing behavior of the HttpClient is precisely what you don’t want. As soon as the client as a whole needs more simultaneous connections than it is allowed to submit to the back-end service, the queue grows without bound, and so does the latency. What effectively happens is that every single request ends up taking as long as the timeout period of whatever meta-client is calling the Java service that uses HttpClient.
In brown-out loading, what you want is to start aggressively dropping connections to shed load, but you want to do that based on a real brown-out of your back-end service. SDEs are terrible at estimating what level of simultaneous connections would actually cause a real brown-out of the back-end service. Nobody measures this in QA or load testing, or looks at it in production, and it would probably take a team of people at a large site to keep measuring and tweaking all the clients in production. The only way to reliably tell that you really are in a brown-out situation is to wrap your back-end calls with timeouts, not to queue.
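As a concrete illustration, here is roughly what wrapping the calls with timeouts looks like against the Commons HttpClient 3.x API that appears in the stack trace above (a sketch only; the millisecond values are made-up examples, not recommendations):

  import org.apache.commons.httpclient.HttpClient;
  import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;

  public class FailFastClient {
      // Sketch: make every stage of a back-end call fail fast instead of queueing forever.
      public static HttpClient build() {
          MultiThreadedHttpConnectionManager mgr = new MultiThreadedHttpConnectionManager();
          mgr.getParams().setConnectionTimeout(2000);  // TCP connect timeout, in ms
          mgr.getParams().setSoTimeout(5000);          // socket read timeout, in ms

          HttpClient client = new HttpClient(mgr);
          // Cap how long a thread may sit in doGetConnection waiting for a free pooled
          // connection before it gives up with a timeout instead of blocking forever.
          client.getParams().setConnectionManagerTimeout(500);
          return client;
      }
  }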
Retries are also poor behavior, unless you use exponential backoff like TCP/IP does; otherwise you ensure that a momentary brown-out produces a permanent overload of the back-end service.
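Exponential backoff is only a few lines. Here is a sketch, where callBackend() stands in for whatever executeMethod() wrapper you actually have and the constants are purely illustrative:

  import java.io.IOException;
  import java.util.Random;

  public class BackoffRetry {
      private static final Random RANDOM = new Random();

      // Retry a back-end call with exponential backoff plus jitter, TCP-style.
      public static void callWithBackoff() throws IOException, InterruptedException {
          long backoffMs = 100;                          // initial wait
          for (int attempt = 1; ; attempt++) {
              try {
                  callBackend();                         // hypothetical back-end call
                  return;
              } catch (IOException e) {
                  if (attempt >= 5) {
                      throw e;                           // give up after a few tries
                  }
                  Thread.sleep(backoffMs + RANDOM.nextInt((int) backoffMs)); // add jitter
                  backoffMs *= 2;                        // double the wait each time
              }
          }
      }

      // Placeholder for the real HttpClient.executeMethod() call.
      private static void callBackend() throws IOException { /* ... */ }
  }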
You could still use a simultaneous connection limit if you must, but you must not queue; you must drop. If you queue, then in an overload you will build latency without limit once the pipes fill up, causing every request to time out, which results in a 100% outage anyway. If you immediately drop requests over the limit, there is at least the possibility that dropping 10% of the requests allows the other 90% to succeed in a timely manner. Again, however, this requires being able to accurately measure exactly what the simultaneous connection threshold should be. Set it too low and you start denying requests before you have overloaded your back-end service. Set it too high and you overload your back-end service anyway. You will never manage to set it correctly and budget adequate time to maintain it as the software changes, so it’s effectively useless to go down that road. In any case, HttpClient blocks requests in doGetConnection when all the connections are in use and does not drop them, so HttpClient does not implement this kind of behavior.
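If you do insist on a cap, the drop-instead-of-queue behavior can be sketched with a plain Semaphore; the cap of 50 and the callBackend() helper below are made up for illustration:

  import java.io.IOException;
  import java.util.concurrent.Semaphore;

  public class DropNotQueue {
      // Made-up cap; in practice nobody knows the right number, which is the point.
      private static final Semaphore IN_FLIGHT = new Semaphore(50);

      public static String guardedCall() throws IOException {
          if (!IN_FLIGHT.tryAcquire()) {
              // Over the cap: shed the request immediately rather than building a queue.
              throw new IOException("over concurrency cap, dropping request");
          }
          try {
              return callBackend();              // hypothetical HttpClient call
          } finally {
              IN_FLIGHT.release();
          }
      }

      // Placeholder for the real back-end call.
      private static String callBackend() throws IOException { return ""; }
  }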
MaxConnectionsPerHost recommendation
Find some way to disable it, or set it to something “insane” like 10,000. In an SOA environment, all it does is cause problems without usefully solving any.
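With Commons HttpClient 3.x (the version the stack trace above comes from), that configuration looks roughly like the sketch below; 10,000 is just the “insane” value from the text, not a tuned number:

  import org.apache.commons.httpclient.HttpClient;
  import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;

  public class UnlimitedIshClient {
      public static HttpClient build() {
          MultiThreadedHttpConnectionManager mgr = new MultiThreadedHttpConnectionManager();
          // The defaults are 2 connections per host and 20 total; raise both so the
          // connection pool never becomes the bottleneck.
          mgr.getParams().setDefaultMaxConnectionsPerHost(10000);
          mgr.getParams().setMaxTotalConnections(10000);
          return new HttpClient(mgr);
      }
  }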
You will then never again see the problem where an idle Java app is having trouble talking to an idle back-end service while everything on the network checks out fine: an outage due simply to configuration.
You can then focus on scalability and brown-outs using timeouts, exponential backoff, or simply elastic or “cloudy” scaling of services in response to demand.
And in general, you should not try to “protect” back-end services with any kind of artificial limit. Invariably this results in exercising the Law of Unintended Consequences when those limits are accidentally hit while all the services are fine. It will not ‘save’ you; it will cause you problems. Of course, when you encounter a problem like Oracle’s expensive connections you do need to manage that resource and establish limits, but even then those limits should be pushed up to the point where they exist to protect Oracle’s memory, not to prevent apps from delivering ‘too many’ requests to Oracle. If your database melts down under load you need to address the problem well upstream with whatever is misbehaving, or just buy a bigger database machine, or use a more horizontally scalable NoSQL solution.
Summary
The MaxConnectionsPerHost parameter is a throwback to RFC 2068, written in 1997, before anyone dreamed up the term “Service Oriented Architecture”. In an SOA shop, this behavior is worse than useless: it wastes time and causes needless outages without solving any problems.
The Apache HttpClient authors probably will not remove this limit by default, since they have to consider that their code could also be used to write web browsers, where some kind of finite limit is certainly prudent. But in any SOA shop this limit should be disabled, or raised to something like 10,000 in all service-calling code, effectively making it unlimited and removing it.

3 thoughts on “Why I hate Java HTTPClient MaxConnectionsPerHost”

  1. Vineet Tripathi says:
    On the Apache HttpClient web site, I found the following –
    “The process of establishing a connection from one host to another is quite complex and involves multiple packet exchanges between two endpoints, which can be quite time consuming. The overhead of connection handshaking can be significant, especially for small HTTP messages. One can achieve a much higher data throughput if open connections can be re-used to execute multiple requests.”
    As per this, should we not limit the maximum number of connections per host to a low value, such as 10/20/30? What do you suggest?
  2. Vineet Tripathi says:
    I guess I got my answer by reading the following lines on the Apache site (this is for the threadsafe/multithreaded connection manager) –
    “Per default this implementation will create no more than 2 concurrent connections per given route and no more 20 connections in total. For many real-world applications these limits may prove too constraining, especially if they use HTTP as a transport protocol for their services. Connection limits, however, can be adjusted using HTTP parameters.”
  3. Lamont Granquist says:
    Those are two different concepts.
    Re-using open connections to execute multiple requests is utilizing HTTP/1.1 keepalives — where the socket is left open after the HTTP request/response and another request can be sent in the pipeline.
    Tweaking the MaxConnectionsPerHost parameter does not affect keepalives at all; it simply means that a given server can only have N simultaneous requests in flight to a given back-end service. It is largely irrelevant whether those requests are going across kept-alive socket connections or not.
    I believe HttpClient will negotiate HTTP/1.1 keepalives by default, so you do not need to tweak anything there. But you still want to increase the MaxConnectionsPerHost setting to increase the number of pipelines available and reduce blocking while waiting for a pipeline to become available.
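For anyone on the newer Apache HttpClient 4.x line that the quoted javadoc describes, the equivalent adjustment might look roughly like this (a sketch only; exact class names vary a bit between 4.x releases):

  import org.apache.http.impl.client.CloseableHttpClient;
  import org.apache.http.impl.client.HttpClientBuilder;
  import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

  public class UnlimitedIshClient4x {
      public static CloseableHttpClient build() {
          PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
          cm.setDefaultMaxPerRoute(10000);   // replaces the old per-host default of 2
          cm.setMaxTotal(10000);             // replaces the old total default of 20
          return HttpClientBuilder.create().setConnectionManager(cm).build();
      }
  }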