Tuesday, March 27, 2012

Jackson 2.0: now with XML, too!

(note: for general information on Jackson 2.0.0, see the previous article, "Jackson 2.0.0 released")

While Jackson is most well-known as a JSON processor, its data-binding functionality is not tied to JSON format.
Because of this, there have been developments to extend support for XML and related things with Jackson; and in fact support for using JAXB (Java Api for Xml Binding) annotations has been included as an optional add-on since earliest official Jackson versions.

But Jackson 2.0.0 significantly increases the scope of XML-related functionality.

1. Improvements to JAXB annotation support

Optional support for using JAXB annotations (package 'javax.xml.bind' in JDK) became its own Github project with 2.0.

Functionality is provided by com.fasterxml.jackson.databind.AnnotationIntrospector implementation 'com.fasterxml.jackson.module.jaxb.JaxbAnnotationIntrospector', which can be used in addition to (or instead of) the standard 'com.fasterxml.jackson.databind.introspect.JacksonAnnotationIntrospector'.

But beyond becoming main-level project of its own, 2.0 adds to already extensive support for JAXB annotations by:

  • Making @XmlJavaTypeAdapter work for Lists and Maps
  • Adding support for @XmlID and @XmlIDREF -- this was possible due to addition of Object Identity feature in core Jackson databind -- which basically means that Object Graphs (even cyclic ones) can be supported even if only using JAXB annotations.

the second feature (@XmlID, @XmlIDREF) has been the number one request for JAXB annotation support, and we are happy that it now works.
Canonical example of using this feature would be:


    @XmlAccessorType(XmlAccessType.FIELD)
    public class Employee
{ @XmlAttribute @XmlID protected String id; @XmlAttribute protected String name; @XmlIDREF protected Employee manager; @XmlElement(name="report") @XmlIDREF protected List<Employee> reports; public Employee() { reports = new ArrayList<Employee>(); } }

where entries would be serialized such that the first reference to an Employee is serialized fully, and later references use value of 'id' field; conversely, when reading XML back, references get re-created using id values.

2. XML databinding

Support for JAXB annotations may be useful when there is need to provide both JSON and XML representations of data. But to actually produce XML, you need to use something like JAXB or XStream.

Or do you?

One of experimental new projects that Jackson project started a while ago was something called "jackson-xml-databind".
After being developed for a while along with Jackson 1.8 and 1.9, it eventually morphed into project "jackson-dataformat-xml", hosted at Github.

With 2.0.0 we have further improved functionality, added tests; and also worked with developers who have actually used this for production systems.
This means that the module is now considered full supported and no longer an experimental add-on.

So let's have a look at how to use XML databinding.

The very first thing is to create the mapper object. Here we must use a specific sub-class, XmlMapper

  XmlMapper xmlMapper = new XmlMapper();
// internally will use an XmlFactory for parsers, generators

(note: this step differs from some other data formats, like Smile, which only require use of custom JsonFactory sub-class, and can work with default ObjectMapper -- XML is bit trickier to support and thus we need to override some aspects of ObjectMapper)

With a mapper at hand, we can do serialization like so:


  public enum Gender { MALE, FEMALE };
  public class User {
    public Gender gender;
    public String name;
    public boolean verified;
    public byte[] image;
  }

  User user = new User(); // and configure
  String xml = xmlMapper.writeValueAsString(user);

and get XML like:
  <User>
<gender>MALE</gender>
<name>Bob</name>
<verified>true</verified>
<image>BARWJRRWRIWRKF01FK=</image>
</User>

which we could read back as a POJO:

  User userResult = xmlMapper.readValue(xml, User.class);

But beyond basics, we can obviously use annotations for customizing some aspects, like element/attribute distinction, use of namespaces:


  JacksonXmlRootElement("custUser")
public class CustomUser { @JacksonXmlProperty(namespace="http://test") public Gender gender;
@JacksonXmlProperty(localname="myName") public String name; @JacksonXmlProperty(isAttribute=true) public boolean verified; public byte[] image; } // gives XML like:
<custUser verified="true">
<ns:gender xmlns:ns="http://test">MALE</gender>
<myName>Bob</myName>
<image>BARWJRRWRIWRKF01FK=</image>
</custUser>

Apart from this, all standard Jackson databinding features should work: polymorphic type handling, object identity for full object graphs (new with 2.0); even value conversions and base64 encoding!

3. Jackson-based XML serialization for JAX-RS ("move over JAXB!")

So far so good: we can produce and consume XML using powerful Jackson databinding. But the latest platform-level improvement in Java lang is the use of JAX-RS implementations like Jersey. Wouldn't it be nice to make Jersey use Jackson for both JSON and XML? That would remove one previously necessary add-on library (like JAXB).

We think so too, which is why we created "jackson-jaxrs-xml-provider" project, which is the sibling of existing "jackson-jaxrs-json-provider" project.
As with the older JSON provider, by registering this provider you will get automatic data-binding to and from XML, using Jackson XML data handler explained in the previous section.

It is of course worth noting that Jersey (and RESTeasy, CXF) already provide XML databinding using other libraries (usually JAXB), so use of this provider is optional.
So why advocate use of Jackson-based variant? One benefits is good performance -- a bit better than JAXB, and much faster than XStream, as per jvm-serializer benchmark (performance is limited by the underlying XML Stax processor -- but Aalto is wicked fast, not much slower than Jackson).
But more important is simplification of configuration and code: it is all Jackson, so annotations can be shared, and all data-binding power can be used for both representations.

It is most likely that you find this provider useful if the focus has been on producing/consuming JSON, and XML is being added as a secondary addition. If so, this extension is a natural fit.

4. Caveat Emptor

4.1 Asymmetric: "POJO first"

It is worth noting that the main supported use case is that of starting with Java Objects, serializing them as XML, and reading such serialization back as Objects.
And the explicit goal is that ideally all POJOs that can be serialized as JSON should also be serializable (and deserializable back into same Objects) as XML.

But there is no guarantee that any given XML can be mapped to a Java Object: some can be, but not all.

This is mostly due to complexity of XML, and its inherent incompatibility with Object models ("Object/XML impedance mismatch"): for example, there is no counterpart to XML mixed content in Object world. Arbitrary sequences of XML elements are not necessarily supported; and in some cases explicit nesting must be used (as is the case with Lists, arrays).

This means that if you do start with XML, you need to be prepared for possibility that some changes are needed to format, or you need additional steps for deserialization to clean up or transform structures.

4.2 No XML Schema support, mixed content

Jackson XML functionality specifically has zero support for XML Schema. Although we may work in this area, and perhaps help in using XML Schemas for some tasks, your best bet currently is to use tools like XJC from JAXB project: it can generate POJOs from XML Schema.

Mixed content is also out of scope, explicitly. There is no natural representation for it; and it seems pointless to try to fall back to XML-specific representations (like DOM trees). If you need support for "XMLisms", you need to look for XML-centric tools.

4.3 Some root values problematic: Map, List

Although we try to support all Java Object types, there are some unresolved issues with "root values", values that are not referenced via POJO properties but are the starting point of serialization/deserialization. Maps are especially tricky, and we recommend that when using Maps and Lists, you use a wrapper root object, which then references Map(s) and/or List(s).

(it is worth noting that JAXB, too, has issues with Map handling in general: XML and Maps do not mesh particularly well, unlike JSON and Maps).

4.4 JsonNode not as useful as with JSON

Finally, Jackson Tree Model, as expressed by JsonNodes, does not necessarily work well with XML either. Problem here is partially general challenges of dealing with Maps (see above); but there is the additional problem that whereas POJO-based data binder can hide some of work-arounds, this is not the case with JsonNode.

So: you can deserialize all kinds of XML as JsonNodes; and you can serialize all kinds of JsonNodes as XML, but round-tripping might not work. If tree model is your thing, you may be better off using XML-specific tree models such as XOM, DOM4J, JDOM or plain old DOM.

5. Come and help us make it Even Better!

At this point we believe that Jackson provides a nice alternative for existing XML producing/consuming toolkits. But what will really make it the first-class package is Your Help -- with increased usage we can improve quality and further extend usability, ergonomics and design.

So if you are at all interested in dealing with XML, consider trying out Jackson XML functionality!

blog comments powered by Disqus

Sponsored By


Related Blogs

(by Author (topics))

Powered By

About me

  • I am known as Cowtowncoder
  • Contact me at@yahoo.com
Check my profile to learn more.