Welcome to a new episode of Fun with Threads. Today we will cover the exciting topic of asynchronous servlet processing. In the last episode we learned that starting more than 10,000 threads in a modern JVM is not a problem; today we will play with the asynchronous, “threadless” model.
The first important thing to note is that we still need a thread every time our code is doing something. If you have your code in front of you, you always need a thread to move between the lines of the code. But if the code is waiting for something like IO, you do not need a thread, and we can leverage that. We have to realize that the vast majority of Java applications act as a middleman: they just toss data from one side to another. From a database to HTML, from a proprietary back-end to a SOAP WS, from one REST service to another REST service. You get the picture. I usually do not write chat applications that need to keep a connection open and wait for an event. I usually end up writing classical request-response applications that just need to handle a lot of requests.
Let’s examine how such applications work. What are their characteristics? The first thing to notice is that they spend most of their time waiting. They wait for an incoming request. When the request comes, they do some processing, validation and transformation, and then call a back-end. And then they wait again, this time for the response. When the response arrives, our application does some transformation and sends the response back. And then it waits again.
Of course it’s application dependent, but in most cases there are a few milliseconds of work followed by tens or hundreds of milliseconds of waiting. You do not have to block a thread while you are waiting. Even though threads are pretty cheap, we do not want to waste them.
But as is usually the case, there are some tradeoffs. The traditional threaded model is easier to reason about; the asynchronous model is harder to grasp and easier to mess up. I am not talking about shared state and similar stuff, I am just talking about the fact that my brain works better when I can read the code in a linear way and do not have to think about jumping here and there.
Take the following example of a naive proxy using HttpClient 4:
    log("Servlet no. {} called.", number);
    HttpGet backendRequest = createBackendRequest(req, number);
    HttpResponse result = client.execute(backendRequest);
    copyResultToResponse(result, resp);
    log("Servlet no. {} processed.", number);
    log("Servlet no. {} returning.", number);
It’s evident what’s going on. I create a back-end request, then I execute it and then process the result. It’s straightforward.
Now let’s take a look at the asynchronous version. The easiest way is to delegate the asynchronous processing to libraries. On the front side we will use Servlet 3 asynchronous processing; on the back-end side we will use HttpClient’s asynchronous capabilities. Luckily both features play well together, so I can use this code:
    log("Servlet no. {} called.", number);
    HttpGet backendRequest = createBackendRequest(req, number);
    // start async processing
    final AsyncContext asyncContext = req.startAsync(req, resp);
    client.execute(backendRequest, new FutureCallback<HttpResponse>() {
        @Override
        public void completed(HttpResponse result) {
            ServletResponse response = asyncContext.getResponse();
            copyResultToResponse(result, response);
            log("Servlet no. {} processed.", number);
            asyncContext.complete();
        }
        // error handling removed for brevity
    });
    log("Servlet no. {} returning.", number);
Well, it’s kind of readable. I can imagine that readability could be better in modern languages. What I do not like is that the control flow is hard to follow. Try it for yourself: what will be the order of the log messages? It can be
a) Servlet no. 1 called. – Servlet no. 1 processed. – Servlet no. 1 returning.
b) Servlet no. 1 called. – Servlet no. 1 returning. – Servlet no. 1 processed.
c) Servlet no. 1 returning. – Servlet no. 1 called. – Servlet no. 1 processed.
What’s your answer? It’s of course the second one. The first one is for the synchronous version and the third is just made up. Why the second one? It’s easy. By calling req.startAsync(req, resp) I say that I am starting asynchronous processing. Then I execute the back-end call, but I put the response processing into a callback, so it runs only after the back-end response arrives. At this point the servlet method has finished, and the request processing thread provided by the servlet container finishes its execution as well. Of course it has to unroll the whole stack, so it goes back through all your filters as if the processing had already finished.
But it has not. The request is still waiting for the back-end response. Once the response arrives, HttpClient calls the callback using its own thread. It has to, because we have already returned the servlet thread to the container.
In the callback we process the response and by calling asyncContext.complete() we say to the servlet container that we have really finished. When we leave the callback, HttpClient will use the same thread to process another back-end response.
So in reality, the log from the asynchronous call is
    ServletThread - Servlet no. 1 called.
    ServletThread - Servlet no. 1 returning.
    HttpClientThread - Servlet no. 1 processed.
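The ordering can be reproduced without a servlet container. The following is only a sketch using plain java.util.concurrent: two single-threaded executors stand in for the container thread and the HttpClient thread, and a latch stands in for back-end latency (it guarantees that the "response" arrives after the servlet method has returned). The class and method names (AsyncOrderDemo, handleOneRequest, line) are made up for this illustration and are not part of the Servlet or HttpClient APIs.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncOrderDemo {

    // Simulates one asynchronous request and returns the log lines in the order they were written.
    static List<String> handleOneRequest() {
        List<String> log = new CopyOnWriteArrayList<>();
        ExecutorService servletPool =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "ServletThread"));
        ExecutorService clientPool =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "HttpClientThread"));
        // Assumption: the back-end is slow enough that the servlet method returns first.
        CountDownLatch servletReturned = new CountDownLatch(1);
        // Stand-in for asyncContext.complete().
        CompletableFuture<Void> completed = new CompletableFuture<>();
        try {
            servletPool.execute(() -> {
                log.add(line("Servlet no. 1 called."));
                // Stand-in for client.execute(request, callback): the callback runs on the client thread.
                clientPool.execute(() -> {
                    try {
                        servletReturned.await(); // simulated back-end latency
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    log.add(line("Servlet no. 1 processed."));
                    completed.complete(null);
                });
                log.add(line("Servlet no. 1 returning."));
                servletReturned.countDown();
            });
            completed.join(); // wait until the "request" is really finished
            return List.copyOf(log);
        } finally {
            servletPool.shutdown();
            clientPool.shutdown();
        }
    }

    // Prefixes a message with the name of the thread that logs it.
    private static String line(String message) {
        return Thread.currentThread().getName() + " - " + message;
    }

    public static void main(String[] args) {
        handleOneRequest().forEach(System.out::println);
    }
}
```

Running main prints the "called" and "returning" lines from ServletThread first and the "processed" line from HttpClientThread last, which is answer b) above.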
It’s easy to understand once you get used to it. But you have to be aware of a few downsides. The first is that servlet filters do not work the way you are used to. Do you want to do response postprocessing in a filter? You are out of luck. You can use the asyncContext.dispatch() method, but it adds even more complexity. A better solution may be to use framework support, like the one in Spring MVC, that can help you with the task.
The other downside is that you have to think more about the threads and sizing of thread pools. For example, if we have more time-consuming response processing, we have to add more threads to the HttpClient thread pool.
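As a rough way to think about the sizing, the number of callback threads kept busy is approximately the response rate multiplied by the processing time per response (Little's law). The sketch below only does that arithmetic; the class name PoolSizing and the numbers in main are hypothetical and are not measurements from my test.

```java
class PoolSizing {

    // Back-of-envelope estimate: threads busy on average = arrival rate * time spent per response.
    static int requiredThreads(double responsesPerSecond, double processingMillis) {
        return (int) Math.ceil(responsesPerSecond * processingMillis / 1000.0);
    }

    public static void main(String[] args) {
        // Hypothetical load: 2,000 back-end responses per second.
        System.out.println(requiredThreads(2000, 5));  // 5 ms of processing per response
        System.out.println(requiredThreads(2000, 20)); // 20 ms of processing per response
    }
}
```

With these made-up numbers, the estimate grows from 10 to 40 threads when the processing time goes from 5 ms to 20 ms. Treat it as a lower bound and leave some headroom for bursts.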
The third downside is that if you do not have support in the client library, as we had in HttpClient, you have to take care of the threading yourself. Not only does the library need to support asynchronous calls, it also has to support callbacks. If it only returns Futures, you have to figure out how to get the result from the future and process it. Again, you will end up with some threading work.
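To illustrate the kind of threading work this means, here is one possible sketch of adapting a plain java.util.concurrent.Future to a callback. The helper whenDone and the class FutureToCallback are hypothetical names invented for this example, and the "legacy" future is just a task on an executor standing in for a client library call. Note the cost: a thread from the waiter pool is blocked in get() for every pending call, so the waiting is not threadless anymore.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

class FutureToCallback {

    // Hypothetical adapter: parks a waiter thread on future.get(), then invokes the callback on that thread.
    static <T> CompletableFuture<Void> whenDone(Future<T> future, Executor waiterPool,
                                                Consumer<T> onSuccess, Consumer<Throwable> onFailure) {
        return CompletableFuture.runAsync(() -> {
            T result;
            try {
                result = future.get(); // blocks this waiter thread until the result is available
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                onFailure.accept(e);
                return;
            } catch (ExecutionException e) {
                onFailure.accept(e.getCause());
                return;
            }
            onSuccess.accept(result);
        }, waiterPool);
    }

    // Small demonstration with a stand-in for a client library that only returns a Future.
    static String demo() {
        ExecutorService backend = Executors.newSingleThreadExecutor();
        ExecutorService waiters = Executors.newFixedThreadPool(2);
        try {
            Future<String> legacy = backend.submit(() -> "backend response");
            AtomicReference<String> seen = new AtomicReference<>();
            whenDone(legacy, waiters,
                    response -> seen.set("processed " + response),
                    error -> seen.set("failed: " + error))
                    .join(); // only the demo waits here; a servlet would return instead
            return seen.get();
        } finally {
            backend.shutdown();
            waiters.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In a servlet, the onSuccess callback is where the response would be copied and asyncContext.complete() called, just as in the FutureCallback version above.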
So, if the asynchronous model is so complicated, why would we want to use it? Why don’t we just use the old synchronous model? The truth is that the synchronous model makes sense for most use cases. It is able to process a few thousand parallel requests, and that is usually enough for most sites.
But if you have a really massive load and you have to utilize your hardware as much as possible, asynchronous processing performs much better. In my test, I call a servlet which in turn calls itself, which calls itself again, until some limit is reached. Jetty is able to process 20,000 connections in less than 15 seconds (after some warm-up, when it creates all the connections needed and then just reuses them). Keep in mind that in this test all connections are kept open. So effectively, in the middle of the test I have 20,000 connections open on the servlet side and 20,000 connections open on the HTTP client side. All of this with 30 threads, including JVM housekeeping threads. All of this on my five-year-old laptop with less than 1.5GB of heap. Pretty cool.
You can try it for yourself; the source code is available here.