How JBoss Netty helped me build a highly scalable HTTP communication server for GPRS

It has been nearly 1 year and 3 months since I discovered JBoss Netty. We were building an end-to-end bus transportation solution and the server-side required a highly scalable
communication layer which could support concurrent transportation devices to the tune of 10000s and with a throughput of above 500 requests/sec posting transportation related data to the server over GPRS using HTTP.

In the first version of this solution wherein I was the architect of the communication server, I had already burnt my fingers using a commercial application server's servlet container and naively building upon a synchronous architecture which could not scale upto more than 500 devices at nearly 25 requests/sec! Of course, a lot more was wrong with it than just using a servlet container - the entire architecture was synchronous and there were other humungous mistakes, but that's another story.

This time we were making a radically different architecture under the guidance of a seasoned architect. As I had built the communication server for the first generation of the solution and had already learnt a lot from our experiences of making the solution live, I was determined to get it right this time. In keeping with the overall architecture of the solution and for supporting this high scalability and throughput, I decided to use an asynchronous HTTP engine based on NIO.
Why NIO? Well my fascination for NIO started in early 2006 when I came across the following articles published in O'Reilly

However, my project manager-cum-architect at the time (early 2006) was perturbed with the idea of trying out a 'new' and 'untested' architectural approach and as I had not much architecture experience at the time to counter the arguments, I gave in and used a regular Core-J2EE-patterns inspired 3-tier architecture utilizing a commercial application server's servlet container.

When the first generation of the complete solution was built, deployed and made live, I became more and more convinced that my first inclination to use NIO was justified.

Here were the reasons-

  • I was building a communication server which used HTTP as a means of transporting the actual application data. I was not putting up a web site for human users with an HTML GUI. As such I did not need lots of the facilities of the servlet containers and was in fact hampered by a few restrictions as I will elaborate below.
  • Pre-Servlet3.0 specifications only supports synchronous architecture and a thread-per-connection or thread-per-request model. Servlet3.0 specification has only recently been released and the majority of application servers are yet to comply with this specification. For more information, read the article Asynchronous processing support in Servlet 3.0. Our solution was built and deployed much before that. Of course, application servers also have non-servlet-standard APIs for asynchronous HTTP in their servlet engines (read article Asynchronous HTTP and Comet architectures). But I had no awareness of it at the time.
  • The transportation medium was GPRS for which bandwidth is quite low. Moreover, in developing countries, the quality of GPRS is not as consistent as Europe or United States. Furthermore, our devices were mounted on moving vehicles which could move at very high speeds changing cells and GPRS connection could be quite fragile. Under the hood, GPRS is a packet oriented service offered on 2G networks and as such manages to send only a few packets so long as communication channels are not used up by voice communications. That means the entire HTTP request would arrive in packets or chunks or bursts over GPRS. Often we would also get HTTP requests that were incomplete and would eventually time out. For example, if the GPRS device was trying to send 512 bytes our server would get only 78 bytes and then the request would time out. In the thread-per-request model, this would mean that one precious thread of the container would be blocked just trying to read the complete request and would eventually timeout. Depending on the read timeout value set in the application server, this could mean as much as 30 secs which is a huge amount of time. What we required was an engine which could do a non-blocking read and assemble the entire HTTP POST request in memory and only then call a servlet or a handler to do the rest of the processing. In other words, we needed the proactor pattern instead of the reactor pattern. And while lots of application servers use NIO or native IO, they internally use the reactor pattern!

The case for NIO was thus clear. But I still was having a hangover of the Core J2EE patterns and the numerous MVC frameworks built around Servlets for presentation tier. So I would probably have used NIO frameworks in the typical synchronous style. Then along came our seasoned architect who reviewed our solution and proposed a radically different - distributed and asynchronous architecture. I realized that if I used a synchronous approach with the reliability that data sent by the clients was getting successfully processed by the backend, I would have to block my thread till the backend processing of the data sent by the vehicular devices was done.

The other possibility would have been to use reliable JMS, but with the store-and-forward technique of reliable JMS, we would have probably sacrificed the desired high scalability and throughput for the reliability. Besides that was rather like arm-twisting the architecture to comply with servlet container constraints.

What I needed was an HTTP engine which would not impose the following restriction of pre-Servlet-3.0-

Each request object is valid only within the scope of a servlet’s service method, or within the scope of a filter’s doFilter method. Containers commonly recycle request objects in order to avoid the performance overhead of request object creation. The developer must be aware that maintaining references to request objects outside the scope described above may lead to non-deterministic behavior.

An engine that would allow me to do this-

That way, there would be no scalability bottleneck in accepting more requests from larger number of clients and reliability would have been sufficiently achieved as well.

Moreover I needed this in late 2008-early 2009 when the final Servlet 3.0 specification was not yet released and there was no reference architecture either.
Relying on my NIO experience, I proceeded to evaluate the following three options of NIO frameworks-

I had had experience building NIO TCP/IP servers with Apache MINA but the MINA releases were taking a long time and HTTP protocol was not directly supported. Yes, there had been an asyncweb project from safehaus built on MINA0.8, but the project had been migrated to Apache and was not developed further to be compatible with latest versions of MINA. It was a big risk using Apache MINA+Asyncweb.

Grizzly, I found, had too many options - plain NIO framework, Embeddable HTTP Web Server, Embeddable Comet WebServer, Embeddable Servlet container. I was simply baffled. It seemed a great learning curve and I had very little time. Moreover, this post from Grizzly's creator alerted me that Grizzly would by default, use the reactor pattern rather than proactor, which I was not comfortable with for my scenario.

In contrast, JBoss Netty's HTTP support was short and sweet - with no-frills. All I had to do was to use the ready-made HTTP Protocol encoder and decoder and attach my own HTTP request handler. After all, HTTP for my scenario was just a protocol over TCP/IP sockets carrying the application data in its body.

Within no time, I had my prototype ready and I was confident that I was on the right track. The performance test reports were extremely heartening. But could Netty fit in my scenario?

Turns out that it far exceeded expectations. That simple HTTP communication server I built using Netty could sustain the concurrent load of 10000 clients with the desired throughput of 500 requests/sec on a single desktop PC with 1 GB RAM and 1 CPU in our lab. And I am not kidding! It was celebration time for all of us. The memory utilization was well within the 812MB heap size we had granted to the JVM and the average CPU utilization was 78%. In other words, we were utilizing the full resources that the machine offered.

My reasoning for building a scalable HTTP communication for GPRS devices with NIO, asynchronous HTTP was thus vindicated by using JBoss Netty Framework. Thanks to Trustin Lee for making this splendid framework.

Go Back