Tuesday, September 07, 2010

Jackson 1.6 released

After almost 6 months of development, Jackson 1.6 was finally released last night (download responsibly!). Despite preliminary plans of creating a somewhat smaller incremental version after the big bang of 1.5, over time things changed and we actually have another biggie-size increment on our hands. But whereas 1.5 was focused on implementing a complete solution for just one big hairy problem (handling of polymorphic types), 1.6 is a full frontal assault against remaining hard-to-handle use cases. It both expands the set of use cases that Jackson can handle and improves support for existing ones, making usage even more convenient and performant.

For the full list of features, check out FasterXML Jackson 1.6 features page and 1.6.0 release notes. But here is an overview of most notable changes.

1. Structural changes: 2 new optional jars

At surface level, one obvious thing is that there are now 2 more optional jars you can include. They contain new functionality known as "Mr Bean" and "Project Smile"; more on these in a moment. There is also an addition of "jackson-all" jar which simply contains everything from all the other jars; to be used when you just don't want n+1 separate jars around and would rather have a single fat jar for all Jackson stuff.

Otherwise packaging remains the same; and backwards compatibility works as expected for a "minor" release -- that is, code written for earlier 1.x versions should work as is. Considering scope of changes, upgrade from versions 1.4 and 1.5 specifically should be very safe.

2. Shiny New Things: Big 4 of 1.6

There are multiple ways to group changes and improvements. Let's start with what I view as 4 major new features:

  • ObjectMapper.updatingReader(): ability to merge changes, deltas
  • Automatic Parent/Child reference handling: better support for Object/Relational Mapper (ORM) values (Hibernate, iBatis)
  • Interface/Abstract class Materialization ("Mr Bean"): give Jackson your interfaces, forget about boilerplate classes
  • JSON-compatible high-performance binary data format ("Project Smile"): even more performance without sacrificing convenience of schema-free data model

2.1 ObjectMapper.updatingReader()

I actually wrote 'New feature: ability to "update" beans, not just recreate' a while ago, since this was the first new thing implemented after 1.5. The idea is that you can now optionally provide an existing object ("root value") when deserializing, and ObjectMapper can just update its properties, instead of instantiating a new object. This is useful when merging properties, for example by using default values and overrides, possibly with multiple levels of priorities, or when loading settings from multiple sources.

Usage is as simple as:

  Properties properties = new Properties();
  ObjectMapper mapper = new ObjectMapper();
  mapper.updatingReader(properties).readValue(jsonWithOverrides);
  // can call multiple times if you want to merge multiple sets of values:
  mapper.updatingReader(properties).readValue(higherPriorityOverrides);

The method ObjectMapper.updatingReader() creates an ObjectReader, which can be further configured (this also reduces the number of methods that need to be added to ObjectMapper itself). The object to update can be of any type supported by Jackson's regular ObjectMapper.readValue().
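To make the merge semantics concrete, here is a minimal plain-Java sketch of the "update, don't recreate" idea. It does not use Jackson at all: the Settings class, its update() method and the override maps are hypothetical stand-ins, meant only to show how applying lower-priority values first and higher-priority values last leaves untouched properties at their defaults.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical settings holder; stands in for any object that gets updated in place
class Settings {
    String host = "localhost";   // default value
    int port = 8080;             // default value
    boolean verbose = false;     // default value

    // Apply only the properties present in 'overrides'; everything else keeps its current value
    void update(Map<String, Object> overrides) {
        if (overrides.containsKey("host")) host = (String) overrides.get("host");
        if (overrides.containsKey("port")) port = (Integer) overrides.get("port");
        if (overrides.containsKey("verbose")) verbose = (Boolean) overrides.get("verbose");
    }
}

class UpdateDemo {
    public static void main(String[] args) {
        Settings settings = new Settings();

        Map<String, Object> fileOverrides = new LinkedHashMap<String, Object>();
        fileOverrides.put("port", 9090);
        fileOverrides.put("verbose", true);

        Map<String, Object> commandLineOverrides = new LinkedHashMap<String, Object>();
        commandLineOverrides.put("port", 9999);

        settings.update(fileOverrides);         // lower priority first
        settings.update(commandLineOverrides);  // higher priority last, so it wins

        // prints "localhost:9999 verbose=true"
        System.out.println(settings.host + ":" + settings.port + " verbose=" + settings.verbose);
    }
}
```

The same instance is modified by every call, which is what distinguishes updating from regular deserialization into a fresh object.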

2.2 Parent/Child reference handling

One thing that has been problematic for serialization is linkage between parent and child objects for trees and ORM-mapped classes (or for simple double-linked lists). Problem is that without special handling this cyclic dependency causes serialization failure. Prior to 1.6 the way to handle this problem has been to suppress serialization of one of links (usually the "back link" from child object to parent), and lose back link; or to write custom serializers and deserializers. Jackson 1.6 offers a better way: use of 2 new annotations, @JsonManagedReference and @JsonBackReference. Consider an example of two classes:

public class Root {
  @JsonManagedReference
  public Leaf[] leaves; // works for simple POJOs, arrays, Lists, Maps etc
}

public class Leaf {
  @JsonBackReference
  public Root root;

  public String id;
}

serialization of a Root object with 2 Leaf objects would produce something like:

{
  "leaves" : [
    { "id" : "leaf1Id" },
    { "id" : "leaf2Id" }
  ]
}

which is similar to just using @JsonIgnore on 'public Root root;' field. But the real trick is with deserialization, which will automatically set 'root' field to point to deserialized instance, as if that link was serialized.

So behavior is:

  • @JsonManagedReference will be used as marker for something that points to corresponding @JsonBackReference, used when deserializing; does not change serialization
  • @JsonBackReference will suppress serialization, allow re-constructing reference on deserialization

About only additional feature is that in case there are multiple link references, it is possible to explicitly define id to use for matching managed/back reference pairs. Note, too, that it is possible to use self-references; this would be needed for nodes of doubly-linked lists for example.

This feature is most useful for ORM beans, handling one-to-one and one-to-many references, but is also useful for some cyclic data structures.
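The effect on deserialization can be pictured with a small plain-Java sketch. It is not how Jackson implements the feature internally; ParentNode, ChildNode and restoreBackLinks() are hypothetical names used only to show the end result: children arrive without a parent link in the JSON, and the link is filled in once the parent exists.

```java
import java.util.ArrayList;
import java.util.List;

class ChildNode {
    ParentNode parent;   // back link: left out of the JSON, restored after reading
    String id;

    ChildNode(String id) { this.id = id; }
}

class ParentNode {
    // forward ("managed") link: this is what actually appears in the JSON
    List<ChildNode> children = new ArrayList<ChildNode>();
}

class RelinkDemo {
    // Hypothetical helper: point every child back at the parent that owns it
    static void restoreBackLinks(ParentNode parent) {
        for (ChildNode child : parent.children) {
            child.parent = parent;
        }
    }

    public static void main(String[] args) {
        // State right after reading the JSON: children attached, no back links yet
        ParentNode root = new ParentNode();
        root.children.add(new ChildNode("leaf1Id"));
        root.children.add(new ChildNode("leaf2Id"));

        restoreBackLinks(root);

        System.out.println(root.children.get(0).parent == root); // prints "true"
    }
}
```

With the two annotations in place this step happens automatically, so no such helper needs to be written by hand.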

2.3 Mr Bean: Let Jackson do Monkey Coding

One of more boring and mundane tasks with Java has traditionally been the requirement to fully write out basic value holding Beans: adding fields as well as getters and setters. Although Jackson has made it possible to reduce need for such boilerplate code (for example, eliminate need for setters by annotating private fields, or by using @JsonCreator annotated constructors -- see the previous article about Jackson with Immutable Objects!), more could still be done.

And Jackson 1.6 does just that. Consider following piece of code:

public interface Bean { // or could be an abstract class
  public String getName();
  public int getAge();
}

ObjectMapper mapper = new ObjectMapper();
// org.codehaus.jackson.mrbean.AbstractTypeMaterializer, extends org.codehaus.jackson.map.AbstractTypeResolver
mapper.getDeserializationConfig().setAbstractTypeResolver(new AbstractTypeMaterializer());
// readValue() needs the target type as its second argument
Bean value = mapper.readValue("{\"name\" : \"Billy\", \"age\" : 28 }", Bean.class);

With earlier versions of Jackson (and with any other Java data binder), what you would most likely get is an exception indicating problem of not being able to create an instance of an interface. But thanks to that AbstractTypeMaterializer thing, you can now let Jackson materialize bean classes and relax.
Just remember to include that new "jackson-mrbean-1.6.0.jar" (or Maven dependency) and you are good to go. Pretty neat eh?

Bean materialization works for simple interfaces and abstract classes: methods recognized as setters and getters are implemented; other methods either cause a failure, or can optionally be made to be implemented as error-throwing placeholders. Appropriate setters are created if there are getters (which is needed for cases like above) but if you want to modify values yourself, you can also add explicit setter signatures. You can also use all the usual Jackson annotations for configuration: since type materializer is only concerned with creating classes, and does NOT handle actual serialization or deserialization, standard Jackson ObjectMapper will use them as before. Ability to define abstract classes could be especially useful in cases where you want to control specific aspects or properties, but leave simple properties to Mr Bean.
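For readers who want a feel for what "implementing an interface at runtime" means, here is a rough sketch using the JDK's java.lang.reflect.Proxy. This is a different and much simpler technique than what Mr Bean does (Mr Bean creates actual classes, whereas a dynamic proxy dispatches every call through a handler), and the names Person, ProxyBeans and materialize() are made up for the illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

interface Person {
    String getName();
    int getAge();
}

class ProxyBeans {
    // Create an implementation of 'type' whose getXxx() methods read from 'values'
    @SuppressWarnings("unchecked")
    static <T> T materialize(Class<T> type, final Map<String, Object> values) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                // toString(), hashCode() and equals() are simply delegated to the backing map
                if (method.getDeclaringClass() == Object.class) {
                    return method.invoke(values, args);
                }
                String name = method.getName();
                if (name.startsWith("get") && name.length() > 3
                        && method.getParameterTypes().length == 0) {
                    // "getName" -> "name"
                    String property = Character.toLowerCase(name.charAt(3)) + name.substring(4);
                    return values.get(property);
                }
                // Anything that is not a getter becomes an error-throwing placeholder
                throw new UnsupportedOperationException("Not a getter: " + name);
            }
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] { type }, handler);
    }

    public static void main(String[] args) {
        Map<String, Object> values = new HashMap<String, Object>();
        values.put("name", "Billy");
        values.put("age", 28);

        Person person = materialize(Person.class, values);
        System.out.println(person.getName() + " is " + person.getAge()); // prints "Billy is 28"
    }
}
```

Note that this sketch only covers interfaces and read access; abstract classes, setters and annotation handling are exactly the parts where a real class-generating approach earns its keep.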

I hope to write some more about Mr Bean in another article. And I would especially appreciate feedback from users -- this has been the number one missing feature (as per my own prioritization) for almost 2 years now, and I expect it to be a big hit, comparable to effect of mix-in annotations.

Finally, big Thank You to Sunny G who contributed the initial version of Mr Bean, so it could be included in 1.6.

2.4 Even More Extreme Performance: Project Smile, JSON-compatible binary format

And last but not least, another longer-term project that I have wanted to do for a while is defining an efficient and 100% JSON compatible binary format, similar to how various Binary XML formats have tackled high-performance XML use cases. Although there have been prior attempts at doing this (like BSON), none have been both fully JSON compatible and performant (BSON for example is neither a superset nor a subset of JSON). Yet others insist on having to specify rigid schema to use (Thrift, Protocol Buffers).

Project Smile tackled this challenge, and produced what we hope to be a very compelling binary data format, as well as full support for using that data format exactly as one uses JSON. Sort of like just having different representation of JSON. For those interested in low-level details, feel free to check out Smile Data format specification (and specifically if anyone is interested in implementing Smile support on other platforms, PLEASE check it out!).

To use Smile, all you need is to instantiate org.codehaus.jackson.smile.SmileFactory (from jackson-smile-1.6.0.jar) -- which extends standard org.codehaus.jackson.JsonFactory -- and use it as is (to create SmileGenerators and SmileParsers; respectively extending JsonGenerator and JsonParser), or via ObjectMapper (construct ObjectMapper with SmileFactory). All the usual functionality should work as is, including streaming parsing and generation, full data binding and Tree Model access.

Obvious and measurable benefits include:

  • More compact data -- especially so for larger and more repetitive data, such as rows from database or entries for Map/Reduce tasks.
  • Even faster parsing, and possibly faster generation (one of design criteria was that generation speed is not sacrificed for parsing speed or data size -- and hence Smile is one of fastest binary data formats to write)

I hope to update Thrift-protobuf performance benchmark with Smile-based test results in near future: based on measurements I have done so far (using locally modified version), Smile is typically 25-50% faster and produces 25-50% more compact data than Jackson with textual JSON. This makes it generally faster than Thrift or Avro on Java (which are often no faster than textual JSON with Jackson), and comparable in speed to Protocol Buffers -- and all this without sacrificing any of Jackson flexibility or expressive power.

I am specifically hoping to show how Smile would be a good alternative to Avro for large-scale data processing; using optionally enabled property name and String value back references, data size can be compact enough to render schemas unnecessary; and turbo-charged Jackson parsing and generation keep data flowing at wire speeds.
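The general idea behind back references can be sketched in a few lines of plain Java. To be clear, this is NOT the Smile wire format (see the specification for that): the "#index" token syntax and the BackRefSketch class are invented for illustration, and the sketch assumes no real name starts with '#'. It only shows why repeated property names, as in rows of similar records, stop costing much once they can be referred to by position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BackRefSketch {
    // First occurrence of a name is written in full; repeats become "#<index of first occurrence>"
    static List<String> encode(List<String> names) {
        Map<String, Integer> seen = new HashMap<String, Integer>();
        List<String> out = new ArrayList<String>();
        for (String name : names) {
            Integer index = seen.get(name);
            if (index == null) {
                seen.put(name, seen.size()); // next free slot in the shared-name table
                out.add(name);
            } else {
                out.add("#" + index);
            }
        }
        return out;
    }

    // The reader rebuilds the same table as it goes, so no schema has to be shipped
    static List<String> decode(List<String> tokens) {
        List<String> table = new ArrayList<String>();
        List<String> out = new ArrayList<String>();
        for (String token : tokens) {
            if (token.startsWith("#")) {
                out.add(table.get(Integer.parseInt(token.substring(1))));
            } else {
                table.add(token);
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("id", "name", "id", "name");
        List<String> encoded = encode(names);
        System.out.println(encoded);          // prints "[id, name, #0, #1]"
        System.out.println(decode(encoded));  // prints "[id, name, id, name]"
    }
}
```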

I will definitely write some more about Smile in future so stay tuned.

3. Other significant areas of improvement

Beyond "big four", 1.6 includes numerous improvements and fixes (release notes include 39 resolved Jira issues, mostly improvements and new features).

3.1 Enum value handling, customization

Handling of enum values has been somewhat lacking prior to version 1.6. With 1.6 it is finally possible to simply define that Enum.toString() is to be used as serialization value (instead of Enum.name()) using SerializationConfig.Feature.WRITE_ENUMS_USING_TO_STRING (and matching DeserializationConfig.Feature.READ_ENUMS_USING_TO_STRING). Or for serialization, define serialization by existing @JsonValue annotation that is now supported for Enum types; obvious case being to annotate 'toString()' with @JsonValue.

It is also possible to define "Creator" methods (aka factories) using @JsonCreator annotation (constructors can not be used with Enums).
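As a rough illustration of the kind of enum this is aimed at, here is a plain-Java sketch with no Jackson dependency; the Level enum and its fromLabel() factory are hypothetical. The comments mark where the annotations mentioned above would presumably go.

```java
enum Level {
    LOW("low"), HIGH("high");

    private final String label;

    Level(String label) { this.label = label; }

    // External form differs from name(); this is the method one would mark with @JsonValue
    @Override
    public String toString() { return label; }

    // Static factory; this is the kind of method one would mark with @JsonCreator
    static Level fromLabel(String label) {
        for (Level level : values()) {
            if (level.label.equals(label)) {
                return level;
            }
        }
        throw new IllegalArgumentException("Unknown level: " + label);
    }
}

class EnumDemo {
    public static void main(String[] args) {
        System.out.println(Level.HIGH.name());                    // prints "HIGH"
        System.out.println(Level.HIGH.toString());                // prints "high"
        System.out.println(Level.fromLabel("low") == Level.LOW);  // prints "true"
    }
}
```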

3.2 More convenient Tree Model

There are numerous additions to Tree Model API (org.codehaus.jackson.JsonNode), such as:

  • coercion of numeric types (JsonNode.getValueAsInt() and other variants) to convert JSON String values
  • JsonNode.has(String fieldName) for checking existence of a property
  • set of findXxx() methods: JsonNode.findParent(), findParents(); findPath(), findValue(), findValueAsText() (check out Jackson Javadocs for details)

which should simplify common Tree traversal tasks a lot. I probably should write bit more about these methods in future.
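To show the sort of traversal the findXxx() methods save you from writing, here is a minimal sketch of a recursive lookup over nested Maps and Lists. It deliberately does not use JsonNode; TreeSearch.findValue() is a hypothetical helper, and the exact traversal order and missing-value handling of Jackson's own methods may differ (check the Javadocs).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TreeSearch {
    // Depth-first search for the first entry called 'fieldName' in nested Maps and Lists
    static Object findValue(Object node, String fieldName) {
        if (node instanceof Map) {
            Map<?, ?> map = (Map<?, ?>) node;
            if (map.containsKey(fieldName)) {
                return map.get(fieldName);
            }
            for (Object child : map.values()) {
                Object found = findValue(child, fieldName);
                if (found != null) {
                    return found;
                }
            }
        } else if (node instanceof List) {
            for (Object child : (List<?>) node) {
                Object found = findValue(child, fieldName);
                if (found != null) {
                    return found;
                }
            }
        }
        return null; // scalar value, or nothing found below this node
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<String, Object>();
        address.put("city", "Denver");

        Map<String, Object> person = new LinkedHashMap<String, Object>();
        person.put("name", "Billy");
        person.put("addresses", Arrays.asList(address));

        System.out.println(findValue(person, "city")); // prints "Denver"
    }
}
```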

3.3 Serialization performance improvements

Although performance has always been a strong point of Jackson, there was room for improvement on serialization side. With some tweaks, serialization speed was increased by an average of 20% (as per http://wiki.github.com/eishay/jvm-serializers/ test). No configuration changes are needed beyond upgrade to 1.6.

3.4 Allow registration of sub-types for Polymorphic Handling, without annotations

It is now possible to register subtypes for deserialization, instead of having to use @JsonSubTypes annotation -- this was number one request for improving polymorphic type handling. Registration is done using new ObjectMapper.registerSubtypes() method(s).

3.5 Better support for OpenContent using @JsonAnySetter and @JsonAnyGetter

Although @JsonAnySetter annotation has been around since 1.0, to allow binding unknown properties during deserialization, there wasn't anything similar to serialize miscellaneous set of properties. But now you can use @JsonAnyGetter to annotate a method that returns a Map with values; which generally works nicely for things collected using @JsonAnySetter.

3.6 Yet More Powerful Generics Support

Although Jackson has always supported generic types reasonably well, some advanced use cases (with type variable aliasing) could lead to sub-optimal handling in 1.5. These have been improved with 1.6.

4. What Next?

Probably 1.7. :-)

Seriously though, there is no need for major backwards-incompatible change (which would mean 2.0). But some obvious bigger areas for improvements are:

  • Better support for plug-in modules for third party datatypes -- this has been planned for a while, and really needs to be done to help further improve Jackson's support for all kinds of commonly used Java datatypes. This also includes support for contextual serializers/deserializers, ability to support per-property pluggable (datatype-specific) annotations
  • Support for fully cyclic data types, object identity. This is a rather hard nut to crack, but something that is needed for complete Java Object serialization support.
  • JSONPath support, ideally at JsonParser level; possibly as filter for materializing trees. This would be ideal for many large-scale data processing operations
  • Rewrite of annotation processing part to better support concept of logical property accessed using various accessors (setter, getter and/or direct field access)
  • Advanced code generation for generating optimal (as-fast-as-hand-written) serializers, deserializers.
  • Support for serializing as XML? ("JAXB mini") -- although we promise not to support weird automatic-mappings like Badgerfish, it is not out of question that we might be able to support clean solid subset of JAXB-style code-first serialization between POJOs and XML.
  • Improved support for values of non-Java runs-on-JVM languages; Scala, Clojure, Groovy.

Which of these get tackled depends on contributions, feedback from users, and general fun-factor of working on adding things. So let your voice be heard, be it via Jackson user group, mailing lists or Jira voting.

About me

  • I am known as Cowtowncoder