Maximum TPS (c10k!)
With all the benchmarks testing how many Transactions Per Second a web service can crank through, it is sometimes easy to forget that serving web service requests should fundamentally be quite simple and fast. Even on today's multi-gigahertz number-crunching beasts of CPUs, throughput numbers quoted for SOAP-based web services can apparently still be expressed with just two digits ("we improved SOAP TPS from 20 to 80! a breakthrough in efficiency!" [yes, I should dig up the reference]). This is pathetic.
So why do I think it's pathetic that getting up to 100 TPS was considered a huge achievement? What would be a more reasonable baseline performance to expect? I set out to establish one with a very simple experiment: writing a web service (in the basic sense of the term: a server that serves HTTP GET requests) that:
- Is based on Servlet API, and runs on a servlet container (Tomcat in this case)
- Serves basic HTTP GET requests
- Returns a very simple (non-XML) payload, along with a result code
In addition to the dead-simple Tomcat-deployed service, I obviously also need a matching simple client: one that can run multiple threads that send simple requests and verify responses.
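A minimal, self-contained sketch of this setup might look like the following. Note the assumptions: the JDK's built-in com.sun.net.httpserver stands in for the Tomcat-deployed servlet (the real thing needs a container), and the thread and request counts are illustrative, not the ones used in the actual test.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class TpsSketch {

    /** Runs a tiny GET service plus a multi-threaded client; returns the successful request count. */
    public static int run(int threads, int requestsPerThread) throws Exception {
        byte[] body = "OK".getBytes(StandardCharsets.UTF_8); // trivial non-XML payload
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/", exchange -> {
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        int port = server.getAddress().getPort();

        AtomicInteger ok = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            new Thread(() -> {
                try {
                    for (int i = 0; i < requestsPerThread; i++) {
                        // HttpURLConnection reuses persistent (keep-alive) connections by
                        // default; without reuse, connection setup would dominate
                        HttpURLConnection conn = (HttpURLConnection)
                                new URL("http://127.0.0.1:" + port + "/").openConnection();
                        if (conn.getResponseCode() == 200) {
                            ok.incrementAndGet();
                        }
                        try (InputStream in = conn.getInputStream()) {
                            while (in.read() != -1) { } // drain so the connection can be reused
                        }
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                } finally {
                    done.countDown();
                }
            }).start();
        }
        done.await();
        server.stop(0);
        return ok.get();
    }

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        int count = run(10, 100); // 10-thread client, mirroring the test setup
        double secs = (System.nanoTime() - start) / 1e9;
        System.out.println(count + " requests, ~" + Math.round(count / secs) + " TPS");
    }
}
```

Loopback numbers from a sketch like this are of course not comparable to a LAN benchmark, but the structure (tiny static response, N client threads hammering it, counting successes) is the same.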
Running the server on a single-CPU 2 GHz Intel Xeon box, and a 10-thread client on another similar box over a LAN, results in slightly over 4000 (!) requests per second of throughput. Yes, that's right: FOUR THOUSAND. And CPU load on the server is light, at around 15%. Adding another client box results in an almost linear increase: almost 7500 TPS, with slightly below 50% CPU utilization. A third one boosts this up to almost 9000 TPS. So with a little tweaking, such a toy service could conceivably serve up to 10k requests per second with Plain Old HTTP (although, it's worth noting, this does require HTTP 1.1 pipelining -- without pipelining, one gets less than 50% of the throughput).
Now, the service as tested is but a toy: it serves a short static string as a response and does not even do any logging. So what about the simplest possible POX service: an XML response to HTTP GET? With a set-up similar to the first case, but returning a payload of almost 4 kB of XML (a list of web service methods along with arguments and descriptions, as a simple XML structure), and with full request-level logging, 3 multi-threaded clients can get throughput of slightly over 3000 requests per second on the same hardware. And this with no funky optimization, just basic Woodstox XMLStreamWriters wrapped in the StaxMate output framework. The XML generation itself is dynamic, though, built from the configured list of services. This example service method was actually cut from a real web service: it will be interesting to see how the actual 'active' service methods (ones that need to access other backend services) will fare.
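To give an idea of what such dynamic XML generation looks like, here is a sketch using the plain StAX API (javax.xml.stream); Woodstox plugs in as the StAX implementation when it is on the classpath, and StaxMate adds conveniences on top. The element names and service descriptors below are illustrative, not the real service's.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class ServiceListWriter {

    /** Writes a list of (name, description) method entries as a simple XML document. */
    public static String writeServiceList(String[][] methods) throws XMLStreamException {
        StringWriter out = new StringWriter();
        // Picks up Woodstox if present; otherwise the JDK's default StAX implementation
        XMLStreamWriter sw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        sw.writeStartDocument();
        sw.writeStartElement("services");
        for (String[] m : methods) {
            sw.writeStartElement("method");
            sw.writeAttribute("name", m[0]);
            sw.writeStartElement("description");
            sw.writeCharacters(m[1]);
            sw.writeEndElement(); // description
            sw.writeEndElement(); // method
        }
        sw.writeEndElement(); // services
        sw.writeEndDocument();
        sw.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(writeServiceList(new String[][] {
            { "listMethods", "Lists available service methods" },
        }));
    }
}
```

Streaming writers like this avoid building an in-memory tree for the response, which is part of why the overhead over the static case stays modest.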
At this point, the trend should be clear: add more things, and get lower throughput. But the level of throughput is still an order of magnitude above what many consider the maximum obtainable. So what happens between fully static service invocations (like the ones mentioned above) and more dynamic ones to slow things down by a factor of 20? What kind of catastrophe can lead to such scalability degradation? One significant problem is that if the service itself has to call other services (or make database queries) -- something that usually happens -- request latency increases on the server side, and so does the number of threads needed to serve parallel requests (due to the Servlet processing model). So even if a request is heavily I/O-bound (just waiting for another service to reply), thread scheduling overhead starts mounting and eats CPU. And this is where projects like AsyncWeb start to matter a lot... but like they say, that's another story. ;-)
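The thread-count pressure can be quantified with Little's law: average concurrency equals throughput times average latency. A quick back-of-envelope sketch (the latency figures are illustrative assumptions, not measurements from the tests above):

```java
public class ThreadEstimate {

    /** Little's law: average concurrency = arrival rate x average time in system. */
    static long threadsNeeded(double requestsPerSecond, double latencySeconds) {
        return Math.round(requestsPerSecond * latencySeconds);
    }

    public static void main(String[] args) {
        // A purely in-memory response at ~1 ms latency:
        System.out.println(threadsNeeded(3000, 0.001)); // ~3 threads busy on average
        // Same throughput when each request waits ~100 ms on a backend call:
        System.out.println(threadsNeeded(3000, 0.100)); // ~300 threads busy on average
    }
}
```

Going from a handful of busy threads to hundreds, in a thread-per-request model, is exactly where scheduling and context-switching overhead starts to bite.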
Anyway: as simple and naive as the examples above are, they should be food for thought about what "optimal performance" really means, and what should be achievable. For me it's clear that I want my services to serve "a thousand or more" requests per second. One hundred is just... well, so late-90s. :-)