More on JSON performance in Java (or, lack thereof!)
(UPDATE, 06-May-2009: here are more recent results)
Other JSON parser implementations
After the previous blog entry, I decided to have a look at the alternative JSON parser implementations available for Java. I figured that json.org's reference implementation is probably aimed mostly at showcasing the concept rather than acting as the ultimate solution, so perhaps some of the other choices would be optimized for performance.
For some reason this appears not to be the case. Some implementations obviously have other goals, like StringTree JSON, which takes pride in having a minuscule bytecode footprint. Small can be beautiful. But others could conceivably perform quite nicely: for example, BerliOS' JSONTools would seem like a good candidate, given that it is built on top of a lexer-generated scanner, an approach that could yield some lean, mean tokenizers. But not in this case, it seems.
So let's have a look at some numbers. I will use a document similar to the earlier test (it would be nice to have a wider collection of test data, but it'll have to do for now). Here are numbers from "TestJsonPerf" (after running for a while to stabilize timings):
- Test 'Jackson, stream' -> 190 msecs
- Test 'Jackson, Java types' -> 304 msecs
- Test 'Json.org' -> 867 msecs
- Test 'StringTree' -> 733 msecs
- Test 'JSONTools (berlios.de)' -> 2727 msecs
(and as before, the time taken is for 2500 repetitions of parsing a given JSON document from an in-memory buffer)
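For readers who want to try something similar, here is a minimal sketch of the kind of timing loop described above: a few warm-up rounds to let timings stabilize, then one measured round of 2500 parses of the same in-memory buffer. This is not the actual "TestJsonPerf" code. The class name, the method names, the number of warm-up rounds and the stand-in "parser" (which merely counts structural characters) are all assumptions made for illustration; a real test would plug in an actual parser call instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.ToIntFunction;

// Hypothetical sketch of a parse-timing loop; NOT the original TestJsonPerf.
class ParseTimingSketch {
    // Written after every round so the JIT cannot discard the parse work.
    static volatile int sink;

    /** Runs the parse action 'reps' times over the same buffer; returns elapsed msecs. */
    static long timeRound(ToIntFunction<byte[]> parser, byte[] doc, int reps) {
        int total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            total += parser.applyAsInt(doc);
        }
        long elapsedMsecs = (System.nanoTime() - start) / 1_000_000L;
        sink = total;
        return elapsedMsecs;
    }

    /** Stand-in for a real parser (an assumption): counts JSON structural characters. */
    static int countStructuralChars(byte[] doc) {
        int count = 0;
        for (byte b : doc) {
            if (b == '{' || b == '}' || b == '[' || b == ']' || b == ':' || b == ',') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] doc = "{\"id\":1,\"tags\":[\"a\",\"b\"],\"ok\":true}"
                .getBytes(StandardCharsets.UTF_8);
        int reps = 2500;       // repetitions per round, as in the numbers above
        int warmupRounds = 20; // assumption: enough rounds for timings to settle

        // Warm-up rounds: results are thrown away.
        for (int i = 0; i < warmupRounds; i++) {
            timeRound(ParseTimingSketch::countStructuralChars, doc, reps);
        }
        long msecs = timeRound(ParseTimingSketch::countStructuralChars, doc, reps);
        System.out.println("Test 'stand-in' -> " + msecs + " msecs");
    }
}
```

Each parser under test would get its own `ToIntFunction<byte[]>` wrapper, so every implementation sees the same buffer and the same repetition count.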
So... the StringTree implementation is on par with the reference implementation; actually, it is even a little bit faster (although nowhere near Jackson speed-wise). But what is rather surprising is just how slow JSONTools appears to be, given that one would expect the opposite from a generated scanner. With the amount of code the package has, perhaps it has some other particularly interesting features to make up for its rather more relaxed pace?
Although benchmarks can be misleading, it does seem that Jackson has a suitable raison d'être, even if only for its efficiency. To me it is still a bit puzzling why no one had so far considered performance something to aim for. Did everyone just assume that JSON would be super-fast purely by virtue of being a simple format to handle? Surely it should be well known that XML parsers are extensively optimized, and that naive approaches yield less than stellar speeds.
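To put the gaps in perspective, the timings listed above can be turned into ratios against the fastest result ('Jackson, stream'). The snippet below is just that arithmetic; the class and method names are made up for illustration, and the only data in it are the five numbers reported in this entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: expresses the reported timings relative to the fastest one.
class RelativeTimings {
    /** The timings reported in this entry, in msecs per 2500 parses. */
    static Map<String, Integer> reportedMsecs() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("Jackson, stream", 190);
        m.put("Jackson, Java types", 304);
        m.put("Json.org", 867);
        m.put("StringTree", 733);
        m.put("JSONTools (berlios.de)", 2727);
        return m;
    }

    /** How many times slower 'msecs' is than 'baselineMsecs'. */
    static double ratio(int msecs, int baselineMsecs) {
        return (double) msecs / baselineMsecs;
    }

    public static void main(String[] args) {
        Map<String, Integer> timings = reportedMsecs();
        int baseline = timings.get("Jackson, stream");
        for (Map.Entry<String, Integer> e : timings.entrySet()) {
            System.out.println(String.format("%-24s %5d msecs  %.1fx",
                    e.getKey(), e.getValue(), ratio(e.getValue(), baseline)));
        }
    }
}
```

By this measure the reference implementation and StringTree take roughly 4.6 and 3.9 times as long as streaming Jackson, and JSONTools a bit over 14 times as long.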
Anyway, with this brief detour, I'll be off working on the second core data mapper for Jackson, "Dynamic JSON mapper" (or, perhaps, "JsonTypeMapper"?). It'll be something closer to how people work with XML trees, but without keeping simple things from being simple.