Thursday, August 03, 2006

Using Stax2 (Woodstox 3.0) Validation API, part 1

One of the new features of Woodstox 3.0 is its completely redesigned and reimplemented validation system. The change is thorough: both the interface (2.0 implemented the basic Stax 1.0 API plus simple property-based native extensions) and the implementation (2.0 had a built-in DTD validator) have been rebuilt from scratch.

The new interface to the validation subsystem is the experimental Stax2 API (defined in the org.codehaus.stax2 package and its sub-packages, and included in the Woodstox distribution). It comes in addition to the basic "enable DTD validation" property, which was all that the original Stax 1.0 API defined with regard to validation. The internal DTD validator implementation was changed to be accessible via this new interface, and an optional Relax NG validator, based on Sun's Multi-Schema Validator, was added. (Initially it was hoped that a W3C Schema validator would also be included, but this was deferred until after the 3.0 release.)

The main features of the new Validation API can be summarized as follows:

  • Fully bi-directional: documents processed with Stream/Event Readers AND Writers can be validated against the same schemas, using the same interface. Schema and validator instances work with both, since the interface they define (and the context they get) is identical.
  • Implementations are pluggable: Schema instances are created using factories (org.codehaus.stax2.validation.XMLValidationSchemaFactory) similar to the basic Stax 1.0 XMLInputFactory and XMLOutputFactory, and implementations are registered using the standard service definition mechanism.
  • Validators are chainable: one can use more than one validator per input/output processor.
  • Dynamic enabling/disabling of validators: it is possible to start and stop validation mid-stream (within constraints that validator implementations may impose). Specifically, it should be possible to validate sub-trees instead of complete documents.
  • Error handlers can be registered to implement different strategies for handling validation errors: from fail-fast to collect-all-problems, or anything in between.
  • High-performance streaming validation: the interface is designed to avoid unnecessary overhead when passing content to validate, so that implementations can optimize for performance.
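
To make the "chainable validators" and "error handler" points a bit more concrete, here is a minimal, self-contained sketch in plain Java. It deliberately does not use the Stax2 API: every type and method name in it (ValidationSketch, Validator, ValidatorChain, ProblemHandler, CollectingHandler, FAIL_FAST and so on) is a hypothetical stand-in invented for illustration, and a "document" is reduced to a list of element names. It only shows the general shape of the idea, assuming validators receive events one by one and report problems to a pluggable handler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-ins for illustration only; these are NOT the Stax2 types.
final class ValidationSketch {

    /** Thrown by a handler that wants to abort processing at the first problem. */
    static final class ProblemException extends RuntimeException {
        ProblemException(String message) { super(message); }
    }

    /** Receives every reported problem and decides how to react. */
    interface ProblemHandler {
        void report(String problem);
    }

    /** Checks one start-element event; here reduced to just the element name. */
    interface Validator {
        void startElement(String name, ProblemHandler handler);
    }

    /** Chaining: forwards each event to every validator, in order. */
    static final class ValidatorChain implements Validator {
        private final List<Validator> validators;

        ValidatorChain(List<Validator> validators) {
            this.validators = new ArrayList<>(validators);
        }

        @Override
        public void startElement(String name, ProblemHandler handler) {
            for (Validator v : validators) {
                v.startElement(name, handler);
            }
        }
    }

    /** Fail-fast strategy: the first problem aborts processing. */
    static final ProblemHandler FAIL_FAST = problem -> {
        throw new ProblemException(problem);
    };

    /** Collect-all strategy: remember problems and keep going. */
    static final class CollectingHandler implements ProblemHandler {
        final List<String> problems = new ArrayList<>();

        @Override
        public void report(String problem) {
            problems.add(problem);
        }
    }

    /** Toy validator: only names in the given set are allowed. */
    static Validator allowedNames(Set<String> allowed) {
        return (name, handler) -> {
            if (!allowed.contains(name)) {
                handler.report("unexpected element <" + name + ">");
            }
        };
    }

    /** Toy validator: element names may not exceed the given length. */
    static Validator maxNameLength(int max) {
        return (name, handler) -> {
            if (name.length() > max) {
                handler.report("element name too long: " + name);
            }
        };
    }

    /** Stands in for a reader or writer pushing events through a validator. */
    static void validate(List<String> elementNames, Validator validator, ProblemHandler handler) {
        for (String name : elementNames) {
            validator.startElement(name, handler);
        }
    }
}
```

In this sketch the same chain can be driven with either handler: with a CollectingHandler, processing continues and every problem from every validator in the chain is recorded, whereas with FAIL_FAST the first reported problem stops processing with an exception. The validators themselves do not need to know which strategy is in use.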

So how does one use the new API? I recently added the first two sample classes to the Woodstox distribution to showcase simple reader-side validation. These classes are under 'src/samples' in the Woodstox SVN repository, for those who want to learn it now.

Tomorrow I will walk through specific examples (based on the above-mentioned sample classes) to show how simple validators can be written using Woodstox 3.0 and its Stax2 Validation API. Stay tuned!

About me

  • I am known as Cowtowncoder