Faster Than The Speeding Bullet?
Nope: that would be Superman. But perhaps Jackson
can at least sting like a bee? Anyway, to try to answer the question, I
decided to repurpose code from StaxTest (a loose set of performance test
components used for Woodstox
development) and see how Jackson compares to the venerable Json.org
reference implementation. The test classes in question will be available as
part of the next Jackson source code bundle (under src/perf), so others
can try them out and report their own experiences. But here are some choice
tidbits until then.
First of all, I decided to use the sample documents from http://www.json.org/example.html.
The documents are quite short (from less than 1 kB to about 4 kB), but
since there do not seem to be sample document repositories for JSON like
there are for XML, these will have to do. The test consists of repeatedly
parsing a specified document. The document is first read into a byte array
before running the tests (to minimize I/O overhead), and is then fed to
each parser using an implementation-dependent mechanism.
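To make the setup concrete, here is a minimal sketch of that kind of test loop. It is not the actual StaxTest or src/perf code: the class name, the `run` method and the stand-in "parser" (which merely counts structural characters, and would also count them inside string values) are all hypothetical, and a real test would plug in Jackson or the Json.org parser instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.ToIntFunction;

/** Hypothetical harness illustrating the test setup; not the real src/perf code. */
final class RepeatedParseBench {

    /** Stand-in for a real parser: naively counts JSON structural characters. */
    static int countStructural(byte[] doc) {
        int n = 0;
        for (byte b : doc) {
            if (b == '{' || b == '}' || b == '[' || b == ']' || b == ':' || b == ',') {
                n++;
            }
        }
        return n;
    }

    /**
     * Parses the same in-memory document 'reps' times.
     * Returns { elapsed nanoseconds, checksum }; the checksum keeps the
     * work observable so the JIT cannot discard it.
     */
    static long[] run(byte[] doc, int reps, ToIntFunction<byte[]> parser) {
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            checksum += parser.applyAsInt(doc);
        }
        long elapsed = System.nanoTime() - start;
        return new long[] { elapsed, checksum };
    }

    public static void main(String[] args) {
        // The document lives in a byte array before timing starts, so no I/O is measured.
        byte[] doc = "{\"menu\":{\"id\":\"file\",\"items\":[1,2,3]}}"
                .getBytes(StandardCharsets.UTF_8);
        int reps = 2500;

        run(doc, reps, RepeatedParseBench::countStructural); // warm-up pass (my addition)
        long[] result = run(doc, reps, RepeatedParseBench::countStructural);
        System.out.printf("%d reps: %.3f ms (checksum %d)%n",
                reps, result[0] / 1e6, result[1]);
    }
}
```

The "implementation-dependent mechanism" is modeled here as a function from the byte array to an int; each parser under test would wrap the bytes in whatever input type it prefers (stream, reader, String) inside that function.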
For a repetition count of 2500 over the largest (4 kB) of the sample JSON
documents, on my (t)rusty old single-CPU Athlon box, I got the following
numbers:
- Jackson, fully streaming: 224 milliseconds
- Jackson using simple Java type mapper: 333 milliseconds
- Json.org reference implementation: 883 milliseconds
(I also tested the other documents; the numbers I saw were similar)
The fully streaming case just iterates over all tokens of the input,
without further processing. The Java mapper, on the other hand,
actually constructs an in-memory representation (Lists, Maps, Numbers,
Strings, Booleans). So for this particular case, Jackson is about
4 times as fast as the reference implementation when using the fastest
mode. That comparison is not completely fair, of course, since the
reference implementation does actually build an in-memory
representation, whereas the streaming mode does not. Then again, one does
not always need such a "tree", so your mileage may vary.
At any rate, a simplified and somewhat naive answer would be that
Jackson may be 3 - 4 times as fast as the reference implementation
if you use the fastest access mode (streaming), and 2 - 3 times as
fast if you need an Object representation of the JSON data. The usual
disclaimers apply, of course: it is not always easy to make a fair
comparison, different kinds of input might give different results, and so
forth. But hopefully this gives some perspective on the kinds of
improvements one could get. And I would love to see others doing similar
measurements.
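For anyone who wants to check the ratios, the arithmetic behind them is just the reference implementation's time divided by each Jackson time. A small sketch using the three timings listed earlier (the class and method names are made up for illustration):

```java
import java.util.Locale;

/** Illustrative only: speed-up ratios computed from the timings in this post. */
final class SpeedupRatios {

    /** How many times faster the candidate is than the baseline. */
    static double speedup(double baselineMillis, double candidateMillis) {
        return baselineMillis / candidateMillis;
    }

    public static void main(String[] args) {
        double jsonOrg = 883;   // Json.org reference implementation
        double streaming = 224; // Jackson, fully streaming
        double mapper = 333;    // Jackson, simple Java type mapper

        // prints "streaming: 3.94x"
        System.out.printf(Locale.ROOT, "streaming: %.2fx%n", speedup(jsonOrg, streaming));
        // prints "type mapper: 2.65x"
        System.out.printf(Locale.ROOT, "type mapper: %.2fx%n", speedup(jsonOrg, mapper));
    }
}
```

These are single-run numbers from one machine and one document, so the "3 - 4" and "2 - 3" ranges are best read as rough bands around 3.94 and 2.65 rather than measured bounds.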
But how about the absolute speed?
So it seems like Jackson might be a wee bit faster than the most
commonly used alternative. But beyond this, how would JSON compare to,
say, equivalent XML parsing? Well, given the input document size and
repetition count, streaming JSON parsing appears to proceed at a
respectable rate of about 50 MBps on this particular system. The usual
XML processing rates using Woodstox, on the same machine, are anywhere
between 10 and 30 MBps, depending on the complexity of the document (plain
text and elements are fastest to process, attributes slower, and so
forth). So assuming similar information density (some people claim JSON
has less fluff, which seems debatable, but I haven't heard
anyone claim that XML would have a more compact representation in its
textual serialization), it would appear that processing JSON is indeed
somewhat faster. That is to be expected given the simplicity of JSON as a
data (transfer) format.
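The throughput figure comes from total bytes parsed divided by elapsed time. A rough sketch of that calculation follows; note that the exact document size is an assumption (I only said "about 4 kB", and 4096 bytes is used here for illustration), so the result lands in the same ballpark as the quoted rate rather than matching it exactly.

```java
import java.util.Locale;

/** Illustrative only: back-of-the-envelope throughput from size, repetitions and time. */
final class ParseThroughput {

    /** Megabytes (10^6 bytes) parsed per second. */
    static double megabytesPerSecond(long docBytes, int reps, double elapsedMillis) {
        double totalBytes = (double) docBytes * reps;
        double seconds = elapsedMillis / 1000.0;
        return totalBytes / seconds / 1_000_000.0;
    }

    public static void main(String[] args) {
        long assumedDocBytes = 4096; // assumption: the post only says "about 4 kB"
        int reps = 2500;
        double streamingMillis = 224;

        System.out.printf(Locale.ROOT, "%.1f MB/s%n",
                megabytesPerSecond(assumedDocBytes, reps, streamingMillis));
    }
}
```

With those inputs the result is roughly 46 MB/s; a slightly larger document would push it closer to 50.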
The real question is whether this advantage can be converted into an even
more significant speed boost at a higher level, like when doing full Java
data binding (a la JAXB). We should find out in the near future, once people
get more serious about building toolkits on top of efficient JSON
parsers...