New xml performance comparison: Woodstox fastest Java xml parser
I bumped into another interesting new article at xml.com, this one regarding xml parser performance: XML Parser Benchmarks: Part 1. One particularly interesting feature is that this time the comparison included both native and Java xml parsers, using what seemed to be a fair comparison (although it'd be nice to independently verify the results, which doesn't seem possible).
Amongst the findings were:
- Woodstox was the fastest xml-conformant Java parser of the ones they measured
- Although libxml2 (written in C) was faster than the fastest Java parser, the difference was not particularly large (in my opinion)
- Throughput for all parsers was rather high: for Woodstox, sustained throughput looked something like 25 MBps, which is in line with my own measurements. So for one's typical short (in xml terms at least...) soap messages, it's possible to parse (and write) thousands of messages per second. Parsing and writing really should not be the most expensive step any more.
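To make the "thousands of messages per second" claim a bit more concrete, here is a rough sketch of how one could measure parse throughput oneself. It uses the standard javax.xml.stream (StAX) API, so it runs against whatever StAX implementation is on the classpath (the JDK default unless Woodstox is added). The class name StaxThroughput, the made-up soap-like sample message and the iteration count are all my own assumptions for illustration, not anything from the article, and a naive timing loop like this is no substitute for a proper benchmark.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxThroughput {
    // Factory is created once and reused; creating it per document would skew timings.
    private static final XMLInputFactory FACTORY = XMLInputFactory.newInstance();

    // Hypothetical small soap-like message: 3 wrapper elements plus 20 items.
    static byte[] sampleMessage() {
        StringBuilder sb = new StringBuilder("<Envelope><Body><order>");
        for (int i = 0; i < 20; i++) {
            sb.append("<item id=\"").append(i).append("\">value ").append(i).append("</item>");
        }
        sb.append("</order></Body></Envelope>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Fully parses the document and returns the number of start elements seen.
    static int countStartElements(byte[] doc) {
        try {
            XMLStreamReader reader = FACTORY.createXMLStreamReader(new ByteArrayInputStream(doc));
            try {
                int count = 0;
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        count++;
                    }
                }
                return count;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Sample document failed to parse", e);
        }
    }

    // Back-of-the-envelope conversion from byte throughput to message rate.
    static long messagesPerSecond(long bytesPerSecond, long messageBytes) {
        return bytesPerSecond / messageBytes;
    }

    public static void main(String[] args) {
        byte[] doc = sampleMessage();
        int iterations = 20_000; // arbitrary; large enough to let the JIT warm up

        for (int i = 0; i < iterations; i++) {
            countStartElements(doc); // warm-up pass, results discarded
        }

        long elements = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            elements += countStartElements(doc);
        }
        double seconds = (System.nanoTime() - start) / 1e9;

        double bytesPerSecond = (double) doc.length * iterations / seconds;
        System.out.printf("%d-byte message, %.1f MB/s, about %d messages/s (%d elements seen)%n",
                doc.length, bytesPerSecond / 1e6,
                messagesPerSecond((long) bytesPerSecond, doc.length), elements);
    }
}
```

As a sanity check on the arithmetic: at 25 MBps (taking MB as a million bytes) and an assumed 2,000-byte message, messagesPerSecond(25_000_000, 2_000) gives 12,500 messages per second, which is comfortably in "thousands per second" territory.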
According to their measurements, one would pay something like 30% extra overhead for Java. To me that seems like a bargain. On the other hand, the fact that there is some difference also suggests it is probably a fair comparison (as opposed to some of the more suspicious "my language is faster than your language" comparisons): there are parts of xml processing where native code still has an advantage (low-level byte manipulation for character decoding, for example, or memory mapping of content), so it seems reasonable that there is some speed benefit. It may also be a win-win situation: those who favour using low-level languages to squeeze out the last cpu cycle can take comfort in the fact that all that tweaking with memory management and buffer handling pays some dividends. And others can feel ok with the comfort of a managed runtime environment, with modest overhead.