As some of you already know, Yet Another high-performance Java XML
processor project was recently launched.
The Aalto XML processor is a work in progress and is approaching its
1.0 release.
I will try to write a bit more about the reasons for starting this
project in another entry, but for now it is enough to know that there
are two main technical goals:
- Be Wicked Fast (check this out for some suggestions as to what is
  achievable)
- Implement a non-blocking XML parsing mode (reads from the underlying
  content do not block, but rather return EVENT_INCOMPLETE or such)
Both of these goals have already been achieved to some degree. Aalto is
almost twice as fast as Woodstox on many common documents (and hence
matches or exceeds the speed of native-code parsers like libxml2 -- I
kid you not; likewise, binary XML parsers will get a good run for their
money when compared to Aalto), and it has an experimental non-blocking
(aka asynchronous) parser implementation. Challenges still remain, such
as how to define standard extensions to support non-blocking mode.
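To make the non-blocking idea a bit more concrete, here is a minimal toy sketch. It is not Aalto's API: the class, the feed()/next() methods and the event codes are all made up for illustration (only the name EVENT_INCOMPLETE is borrowed from the description above), and it only splits ASCII input into tags and text. The point is just that the caller pushes in whatever bytes it has, and the tokenizer reports "incomplete" instead of blocking on a read.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of a non-blocking ("push") tokenizer; NOT Aalto's API. */
public class NonBlockingSketch {
    // Hypothetical event codes, invented for this sketch
    static final int EVENT_INCOMPLETE = 0;
    static final int EVENT_TAG = 1;
    static final int EVENT_TEXT = 2;

    private final StringBuilder buffer = new StringBuilder();
    private String lastToken;

    /** Caller hands over whatever bytes are available; nothing is ever read here. */
    public void feed(byte[] chunk) {
        // Assumption: ASCII-only input, so a chunk boundary never splits a character
        buffer.append(new String(chunk, StandardCharsets.US_ASCII));
    }

    /** Returns the next event, or EVENT_INCOMPLETE if more input is needed. */
    public int next() {
        if (buffer.length() == 0) {
            return EVENT_INCOMPLETE;
        }
        if (buffer.charAt(0) == '<') {
            int end = buffer.indexOf(">");
            if (end < 0) {
                return EVENT_INCOMPLETE; // tag is not complete yet
            }
            lastToken = buffer.substring(0, end + 1);
            buffer.delete(0, end + 1);
            return EVENT_TAG;
        }
        int lt = buffer.indexOf("<");
        if (lt < 0) {
            return EVENT_INCOMPLETE; // text might continue in the next chunk
        }
        lastToken = buffer.substring(0, lt);
        buffer.delete(0, lt);
        return EVENT_TEXT;
    }

    public String token() {
        return lastToken;
    }

    /** Feeds the chunks one by one, draining events until input runs dry. */
    public static List<String> parseInChunks(String... chunks) {
        NonBlockingSketch p = new NonBlockingSketch();
        List<String> out = new ArrayList<>();
        for (String c : chunks) {
            p.feed(c.getBytes(StandardCharsets.US_ASCII));
            int ev;
            while ((ev = p.next()) != EVENT_INCOMPLETE) {
                out.add((ev == EVENT_TAG ? "TAG:" : "TEXT:") + p.token());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Chunk boundaries fall in the middle of text and in the middle of a tag
        System.out.println(parseInChunks("<a>he", "llo</", "a>"));
        // prints [TAG:<a>, TEXT:hello, TAG:</a>]
    }
}
```

A real non-blocking parser has to deal with much more (multi-byte encodings split across chunks, attributes, entities, end-of-input signalling), which is part of why defining a standard extension for this mode is not trivial.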
For those interested in learning more, the important links are:
And how about the immediate roadmap? The plan is to complete the Stax
1.0 API implementation for the 1.0 release (to be released within the
next few months). The missing pieces are:
- Implementation of coalescing mode (which, however, is missing from
  the Stax Reference Implementation, so it is hardly a must-have
  feature, even if it is supposedly non-optional as far as the Stax
  specs are concerned)
- Implementation of a repairing XMLStreamWriter
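For readers less familiar with these two Stax features, the sketch below shows what they are supposed to do, using only the standard javax.xml.stream API and whatever Stax implementation the JDK happens to provide (so it says nothing about Aalto itself). The class and method names are made up for illustration. Coalescing means adjacent text and CDATA segments are reported as a single text event; a repairing writer is expected to add namespace declarations that the caller did not write explicitly.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

/** Illustrative only: exercises two standard Stax properties via the JDK's parser. */
public class StaxFeatureSketch {

    /** Counts the text events (CHARACTERS or CDATA) the reader reports. */
    public static int countTextEvents(String xml, boolean coalescing)
            throws XMLStreamException {
        XMLInputFactory f = XMLInputFactory.newFactory();
        f.setProperty(XMLInputFactory.IS_COALESCING, coalescing);
        XMLStreamReader r = f.createXMLStreamReader(new StringReader(xml));
        int count = 0;
        while (r.hasNext()) {
            int ev = r.next();
            if (ev == XMLStreamConstants.CHARACTERS || ev == XMLStreamConstants.CDATA) {
                count++;
            }
        }
        r.close();
        return count;
    }

    /** Writes a prefixed element without ever calling writeNamespace(). */
    public static String writeWithRepairing() throws XMLStreamException {
        XMLOutputFactory f = XMLOutputFactory.newFactory();
        f.setProperty(XMLOutputFactory.IS_REPAIRING_NAMESPACES, true);
        StringWriter sw = new StringWriter();
        XMLStreamWriter w = f.createXMLStreamWriter(sw);
        // In repairing mode the writer should declare the "p" binding itself
        w.writeStartElement("p", "root", "http://example.com/ns");
        w.writeEndElement();
        w.writeEndDocument();
        w.close();
        return sw.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<r>a<![CDATA[b]]>c</r>";
        // Without coalescing, the number of segments is implementation-dependent
        System.out.println("non-coalescing: " + countTextEvents(xml, false));
        // With coalescing, "a", "b" and "c" should arrive as one text event
        System.out.println("coalescing: " + countTextEvents(xml, true));
        // Output should include an xmlns:p declaration for the namespace URI
        System.out.println(writeWithRepairing());
    }
}
```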
Other than these main features, the only significant missing piece is
DTD handling: Aalto does not parse DTDs (though it does know how to
skip internal subsets), and although nothing fundamentally prevents
adding support, the amount of work is big enough that it will not be
done before 2.0 (if even then).
Anyway, I hope to write a little bit more about this exciting new (or
"new old"... the project history is not all that short) project
shortly. Don't switch the channel!