Friday, November 23, 2007

W3C Schema Validation with Woodstox

To All You Schema Lovers

(... yes, both of you)

Ok, so maybe not many software developers truly love W3C Schema, deep down in their cold cold hearts. It may be ugly, bulky and all (and unlikely to grow into a swan, too!), but it also has its uses. It is used as a data typing language for things like SOAP. Occasionally it may even be useful for its original raison d'etre, validation of XML documents. So if the earlier validation support in Woodstox (DTD, RelaxNG) was not enough, you can now also validate documents you read (and write!) against W3C Schemas. This is possible with Woodstox 4.0, including the first pre-4.0 preview release, 3.9.0 (fresh out of the oven).

So how do I use it?

If you have been following this blog for a while, you may recall that this has already been covered. The same Stax2 API is used for all validation, whether reader- or writer-side and whichever supported schema language, so all you have to do is indicate the correct schema type and then validate exactly the same way as you would against a RelaxNG schema. For others, here are some helpful pointers:

Essentially it all boils down to these simple steps:

  1. Get a schema factory that knows how to parse W3C Schema instances
  2. Ask the factory nicely to parse a schema document and return the resulting Stax2 validation schema object (be sure to ask very nicely, otherwise it'll insist you must provide something less daft, like a RelaxNG schema instead! Sending a small bottle of Grand Marnier or a PayPal donation to the Woodstox author might help as well)
  3. Construct the stream reader/writer, and tell it to use the schema object for validation
  4. Read/write XML content; this is needed because the validator only gets called as content is read or written.

Which might look something like:

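  // Stax2 types live in org.codehaus.stax2 and org.codehaus.stax2.validation
  // 1. Get a schema factory that knows W3C Schema
  XMLValidationSchemaFactory sf = XMLValidationSchemaFactory.newInstance(XMLValidationSchema.SCHEMA_ID_W3C_SCHEMA);
  // 2. Parse the schema document into a reusable schema object
  XMLValidationSchema vs = sf.createSchema(new URL("http://www.w3.org/schema/sample.xsd"));
  // 3. Construct the reader and attach the schema; the cast applies to the
  //    reader (not the factory) and needs a Stax2 implementation such as Woodstox
  XMLStreamReader2 sr = (XMLStreamReader2) XMLInputFactory.newInstance().createXMLStreamReader(new FileInputStream("mydoc.xml"));
  sr.validateAgainst(vs);
  // 4. Read through the document; validation happens as content is read
  try {
    while (sr.hasNext()) {
      sr.next();
    }
    System.out.println("Validated ok!");
  } catch (XMLValidationException ve) {
    System.err.println("Validation problem: "+ve);
  }
  sr.close();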

Or something.

And for something quite cool, try the same when you are writing XML content. Instead of just catching the crap some other system sends you (by diligently validating incoming content), how about doing some due diligence of your own: validate your output and avoid sending garbage to others!
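
To give the "validate what you write" idea something concrete to chew on, here is a minimal sketch that runs on a plain JDK. Note the swap: it does not use the Stax2 API at all. It uses the JDK's own javax.xml.validation, and it checks the output after it has been fully written, rather than inline as the stream writer produces it (which is what attaching a schema to a Stax2 writer would do). The class name, the helper methods and the tiny inline schema are all made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class; not part of Woodstox or Stax2.
public class OutputCheckSketch {
    // Made-up sample schema: <root> holding a single <count> of type xs:int
    static final String SAMPLE_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='root'><xs:complexType><xs:sequence>"
      + "<xs:element name='count' type='xs:int'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Write a small document with the standard (non-validating) Stax writer
    public static String writeDoc(String countText) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter sw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        sw.writeStartDocument();
        sw.writeStartElement("root");
        sw.writeStartElement("count");
        sw.writeCharacters(countText);
        sw.writeEndElement();
        sw.writeEndElement();
        sw.writeEndDocument();
        sw.close();
        return out.toString();
    }

    // Check the finished output against the sample schema before sending it
    public static boolean isValid(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(SAMPLE_XSD)));
        try {
            // With no error handler set, validation errors surface as SAXException
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

With this, isValid(writeDoc("42")) passes, while isValid(writeDoc("forty-two")) fails because the text is not a valid xs:int. The obvious downside compared to validating in the writer itself is that the whole document has to be buffered and parsed a second time.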

About me

  • I am known as Cowtowncoder