Quick Introduction: project Aalto, from Cowtown Skunkworks
As some of you already know, yet another high-performance Java XML processor project was recently launched. The Aalto XML processor is a work in progress, approaching its 1.0 release. I will try to write a bit more about the reasons behind starting this project in another entry, but for now it is enough to know that there are 2 main technical goals:
- Be Wicked Fast (check this out for some suggestions as to what is achievable)
- Implement Non-Blocking XML parsing mode (reads from underlying content do not block, but rather return EVENT_INCOMPLETE or such)
Both of these goals are already achieved to some degree. Aalto is almost twice as fast as Woodstox on many common documents, and hence matches or exceeds the speed of native-code parsers like libxml2 (I kid you not); likewise, binary XML parsers will get a good run for their money when compared to Aalto. It also has an experimental non-blocking (aka asynchronous) parser implementation. Challenges still remain, such as how to define standard extensions to support non-blocking mode.
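To make the non-blocking idea a bit more concrete, here is a minimal toy sketch in plain Java. It is not Aalto's API: the Tokenizer class, its feed()/endOfInput() methods and the numeric event codes are all hypothetical, made up purely for illustration, and the "parser" only understands bare tags and text. The point is just the control flow: when buffered input runs out in the middle of a token, next() returns EVENT_INCOMPLETE instead of blocking, and the caller feeds more data and tries again.

```java
import java.util.ArrayList;
import java.util.List;

final class NonBlockingSketch {
    // Hypothetical event codes; EVENT_INCOMPLETE means "feed me more input".
    static final int START_TAG = 1, END_TAG = 2, TEXT = 3, END_OF_INPUT = 4;
    static final int EVENT_INCOMPLETE = 100;

    /** Toy tokenizer: handles only <name>, </name> and plain text. */
    static final class Tokenizer {
        private final StringBuilder buf = new StringBuilder();
        private boolean finished;
        private String value;

        void feed(String chunk) { buf.append(chunk); }
        void endOfInput() { finished = true; }
        String value() { return value; }

        /** Never blocks: returns EVENT_INCOMPLETE if the next token is not fully buffered. */
        int next() {
            if (buf.length() == 0) {
                return finished ? END_OF_INPUT : EVENT_INCOMPLETE;
            }
            if (buf.charAt(0) == '<') {
                int gt = buf.indexOf(">");
                if (gt < 0) {
                    if (finished) throw new IllegalStateException("unterminated tag");
                    return EVENT_INCOMPLETE; // tag is split across chunks
                }
                boolean end = gt > 1 && buf.charAt(1) == '/';
                value = buf.substring(end ? 2 : 1, gt);
                buf.delete(0, gt + 1);
                return end ? END_TAG : START_TAG;
            }
            int lt = buf.indexOf("<");
            if (lt < 0 && !finished) {
                return EVENT_INCOMPLETE; // text may continue in the next chunk
            }
            int endIdx = lt < 0 ? buf.length() : lt;
            value = buf.substring(0, endIdx);
            buf.delete(0, endIdx);
            return TEXT;
        }
    }

    /** Pulls events until the tokenizer needs more input or is done. */
    static void drain(Tokenizer t, List<String> log) {
        while (true) {
            int ev = t.next();
            switch (ev) {
                case START_TAG: log.add("START:" + t.value()); break;
                case END_TAG:   log.add("END:" + t.value()); break;
                case TEXT:      log.add("TEXT:" + t.value()); break;
                case EVENT_INCOMPLETE: log.add("INCOMPLETE"); return;
                default:        log.add("END_OF_INPUT"); return;
            }
        }
    }

    /** Feeds the chunks one by one, as a network read loop might. */
    static List<String> run(String... chunks) {
        Tokenizer t = new Tokenizer();
        List<String> log = new ArrayList<>();
        for (String chunk : chunks) {
            t.feed(chunk);
            drain(t, log);
        }
        t.endOfInput();
        drain(t, log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run("<a>h", "i</", "a>"));
    }
}
```

Note that the caller, not the parser, decides what to do while waiting for more data, which is what makes this style a natural fit for non-blocking I/O. It also shows why a standard extension is needed: the Stax 1.0 API has no event type for "incomplete" and no standard way to feed input.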
For those interested in learning more, the important links are:
And how about the immediate roadmap? The plan is to have the Stax 1.0 API fully implemented for the 1.0 release (to be released within the next few months). The missing pieces are:
- Implementation of coalescing mode (which, however, is missing from the Stax Reference Implementation, so hardly a must-have feature even if supposedly non-optional as far as Stax specs are concerned)
- Implementation of repairing XMLStreamWriter
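For readers who have not used these two Stax features, the following sketch shows how they are switched on through the standard javax.xml.stream API. It runs against whatever Stax implementation the JDK provides by default, not against Aalto (which, as noted, does not have these pieces yet). The class and method names, the sample document and the namespace URI are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

final class StaxFeatureSketch {
    /** Counts the text events (CHARACTERS or CDATA) reported for a document. */
    static int countTextEvents(String xml, boolean coalescing) {
        try {
            XMLInputFactory f = XMLInputFactory.newInstance();
            // Coalescing mode asks the reader to merge adjacent text and CDATA segments.
            f.setProperty(XMLInputFactory.IS_COALESCING, coalescing);
            XMLStreamReader r = f.createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (r.hasNext()) {
                int ev = r.next();
                if (ev == XMLStreamConstants.CHARACTERS || ev == XMLStreamConstants.CDATA) {
                    count++;
                }
            }
            r.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Writes a prefixed element without ever calling writeNamespace(). */
    static String writeWithRepairing() {
        try {
            XMLOutputFactory f = XMLOutputFactory.newInstance();
            // Repairing mode makes the writer responsible for namespace declarations.
            f.setProperty(XMLOutputFactory.IS_REPAIRING_NAMESPACES, true);
            StringWriter out = new StringWriter();
            XMLStreamWriter w = f.createXMLStreamWriter(out);
            w.writeStartElement("p", "root", "http://example.com/ns");
            w.writeEndElement();
            w.writeEndDocument();
            w.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countTextEvents("<a>x<![CDATA[y]]>z</a>", true));
        System.out.println(writeWithRepairing());
    }
}
```

With coalescing enabled, the text "x", the CDATA section "y" and the text "z" should arrive as a single text event; without it, the reader is free to report them separately. In repairing mode the writer emits the xmlns declaration for prefix "p" by itself, whereas a non-repairing writer would leave that to the caller.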
Other than these main features, the only significant missing piece is DTD handling: Aalto does not parse DTDs (although it does know how to skip internal subsets properly). Nothing fundamentally prevents adding support, but the amount of work is big enough that it will not be done before 2.0 (if even then).
Anyway, I hope to write a little bit more about this exciting new (or "new old"; the project history is not all that short) project shortly. Don't switch the channel!