Wednesday, February 02, 2011

Jackson 1.7; quest for Maximum Extensibility

At this point Jackson 1.7 has been out for almost a month (and in fact, 1.7.2 is by now the latest patch release), so it's high time to write something about this release.
1.7 turns out to be the third "anything but minor" minor release in a row, which is part of the reason why I have procrastinated a bit: it is not a simple matter of just listing a set of simple features, or linking to the release notes page (which can be found here, for anyone interested). Rather, it makes sense to talk a bit about the 1.7 development cycle.

But it is actually good that I have had some time to think about what to write, instead of rushing to document a release that had just happened, especially since there is now some progress that grew directly out of this release. But more on this bit later.

1. Background

After 1.6, a whopper of a release that boasted 4 major new features and a boatload of smaller ones, the initial plan for 1.7 was to make a somewhat smaller incremental release. Beyond tackling some fixes that required API changes (and thus couldn't go in one of the 1.6.x patch releases), the focus was on the most important concern at the time: the difficulty of cleanly extending Jackson with modular extensions. So it seemed like this might be a modest incremental upgrade.

It quickly became clear that the changes needed to allow modular extensibility were quite widespread, since the information needed was not propagated through all the pieces. But the focus on a single cross-cutting concern turned out to be a good thing: major changes to interfaces could be done in one fell swoop, and hopefully the abstractions added (and the changes to existing ones) will form a solid foundation for further development.

2. Aspects of extensibility

While the main goal was to improve extensibility, there are multiple kinds of changes that are needed to support proper modular extensibility. For example:

  • Changes to allow registration of bundles of new functionality in a way that it is possible to add multiple extensions that ideally do not conflict, and that need not even be aware of other extensions that may be used.
  • Retrofitting existing components and interfaces to allow clean extension (i.e. avoiding having to sub-class things)
  • Adding new extension points to replace older extension methods
  • Making existing extension points more powerful, to further reduce need for more invasive techniques (overrides with sub-classing)

Another way to consider this is to think of Jackson becoming a platform, the way a web browser can be seen as a platform to build on (via addition of plug-ins and add-ons). In fact, given new projects that support many non-JSON data formats (see below), it is not a stretch to claim that Jackson is becoming a "Java data format conversion platform" at this point.

3. New mechanism for registering extensions: Module API

The most visible new construct is the Module API. It is also among the simplest, since there are basically just two things a Jackson Module developer needs to learn:

  1. org.codehaus.jackson.map.Module interface, which must be implemented by one class of the module; and specifically its "setupModule(SetupContext ctxt)" method (other methods are for exposing metadata such as module version)
  2. Module.SetupContext (passed to "setupModule" method) that exposes set of extension points (methods) that module can use to register handlers it wants to add.

And from the user's point of view, it is even simpler; there is but one thing to know. To use, say, the new jackson-module-guava (available from the FasterXML GitHub repository; provides support for reading/writing Guava data types), you will do:

  ObjectMapper mapper = new ObjectMapper();
  mapper.registerModule(new GuavaModule());

that is, add a one-line call to let module register whatever it wants to offer, via interfaces that ObjectMapper provides it.

From the above description, the definition of a Jackson module is quite simple: it is a piece of code that defines one class that implements org.codehaus.jackson.map.Module, and which registers all functionality offered by the module.
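To make the shape of this registration handshake concrete, here is a minimal sketch that uses only the JDK. It is a toy model, not Jackson code: ToyModule, ToySetupContext, ToyMapper and the two example modules are made-up names, and handlers are plain functions rather than real serializers. The point is only to show how a mapper can hand each module a context, and how handlers from several modules can be collected without one replacing another.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Toy stand-in for Module.SetupContext (hypothetical): modules can only add handlers.
interface ToySetupContext {
    <T> void addSerializer(Class<T> type, Function<T, String> serializer);
}

// Toy stand-in for Module (hypothetical): one class per module, one setup callback.
interface ToyModule {
    String getModuleName();
    void setupModule(ToySetupContext context);
}

// Toy stand-in for ObjectMapper (hypothetical): collects handlers from all modules.
class ToyMapper implements ToySetupContext {
    private static final class Entry {
        final Class<?> type;
        final Function<Object, String> serializer;

        Entry(Class<?> type, Function<Object, String> serializer) {
            this.type = type;
            this.serializer = serializer;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // The one-line call from the user's side: the module registers what it offers.
    void registerModule(ToyModule module) {
        module.setupModule(this);
    }

    @Override
    public <T> void addSerializer(Class<T> type, Function<T, String> serializer) {
        // Additive: earlier registrations are kept, nothing is replaced.
        entries.add(new Entry(type, value -> serializer.apply(type.cast(value))));
    }

    String write(Object value) {
        for (Entry entry : entries) {
            if (entry.type.isInstance(value)) {
                return entry.serializer.apply(value);
            }
        }
        return String.valueOf(value); // fallback for types no module handles
    }
}

// Two independent modules that know nothing about each other.
class QuotedStringModule implements ToyModule {
    public String getModuleName() { return "quoted-strings"; }
    public void setupModule(ToySetupContext context) {
        context.addSerializer(String.class, s -> "\"" + s + "\"");
    }
}

class YesNoModule implements ToyModule {
    public String getModuleName() { return "yes-no"; }
    public void setupModule(ToySetupContext context) {
        context.addSerializer(Boolean.class, b -> b ? "yes" : "no");
    }
}

class ModuleSketch {
    static ToyMapper newMapper() {
        ToyMapper mapper = new ToyMapper();
        mapper.registerModule(new QuotedStringModule());
        mapper.registerModule(new YesNoModule());
        return mapper;
    }

    public static void main(String[] args) {
        ToyMapper mapper = newMapper();
        System.out.println(mapper.write("hi"));
        System.out.println(mapper.write(Boolean.TRUE));
        System.out.println(mapper.write(42));
    }
}
```

In this toy version the mapper itself plays the role of the setup context; the real Module.SetupContext exposes more extension points than a single "add serializer" method.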

3.1. Module interface: not just for extensions -- use for your own app too!

One thing worth noting is that while Module interface is really designed to allow writing of reusable third-party extensions, it actually works pretty well just for encapsulating ObjectMapper configuration and extensions that are only used by a single application, or company-wide (but not published externally). So it is a good idea to use modules, for example, when registering custom serializers and deserializers; there is no overhead and this helps in encapsulating configurability and customization in one place.

4. Modular extension points: Serializers, Deserializers

Beyond having a simple registration mechanism for extensions (which I will from here on simply refer to as "modules"), the obvious problem with extensibility has been that it was limited to the application developer overriding behavior, either by setting an explicit handler, or by sub-classing and replacing existing components (like SerializerFactory). True extensibility requires that multiple modules can add handlers without overriding each other's changes (unless they truly conflict, for example by trying to define a handler for the same data type). In other words, modules should be able to peacefully co-exist and co-operate without explicitly having to plan for it.

The first obvious thing was to add mechanisms for adding custom serializers and deserializers without having to replace the default SerializerFactory and DeserializerFactory instances. This was done by adding new interfaces org.codehaus.jackson.map.Serializers and org.codehaus.jackson.map.Deserializers (and matching basic implementations), which just define a way for a module to provide serializers and deserializers for specific data types. These can then be registered with SerializerFactory.withAdditionalSerializers(Serializers) and DeserializerFactory.withAdditionalDeserializers(Deserializers), which is exactly what ObjectMapper exposes to modules via the SetupContext it passes to Module.setupModule().

These simple extension points alone cover much of what most modules need to do: provide specific handlers for third-party libraries. And when using org.codehaus.jackson.map.module.SimpleModule (the default implementation of Module), addition of these handlers is a one-line operation.

5. Modular extension points: BeanSerializerModifier, BeanDeserializerModifier

But beyond the ability to conveniently register deserializers and serializers, it was understood that the ability to modify the functioning of standard BeanSerializer and BeanDeserializer instances (things that take your POJOs, find out properties, handle annotations and pretty much do most of the magic Jackson provides) is a definite must. This is because in most cases much of the existing functionality is fine, but there is a need to tweak specific aspects of serialization or deserialization: for example, one may want to override handling of just one specific property, for a specific class of POJOs. And while annotations can configure many things well, there are limitations.

To support this, two new interfaces (and matching registration methods, added in Module.SetupContext) were added: BeanSerializerModifier and BeanDeserializerModifier.

Methods defined in these interfaces are called during (and right after) building BeanSerializer and BeanDeserializer instances; and can be used for example to:

  1. Add or remove properties to be serialized, deserialized
  2. Change the order in which properties are serialized
  3. Completely replace BeanSerializer/-Deserializer that has been built, with specified JsonSerializer/JsonDeserializer (this is often done by constructing a new BeanSerializer / BeanDeserializer, using some properties from initial serializer/deserializer)

Which pretty much means that the whole serializer and deserializer configuration and construction process can be modified; but without having to replace everything. Possibilities are unlimited.
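A rough illustration of the first two capabilities (removing a property, changing the order) follows. Again this is a JDK-only toy model and not the real BeanSerializerModifier interface: ToySerializerModifier, ToyBeanSerializerBuilder and the property names are assumptions made up for the example, and properties are represented by their names only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical hook: called while a bean serializer is being built, and
// allowed to add, remove or reorder the properties that were discovered.
interface ToySerializerModifier {
    List<String> changeProperties(Class<?> beanType, List<String> propertyNames);
}

// Hypothetical builder: applies every registered modifier, in registration order.
class ToyBeanSerializerBuilder {
    private final List<ToySerializerModifier> modifiers = new ArrayList<>();

    void addModifier(ToySerializerModifier modifier) {
        modifiers.add(modifier);
    }

    List<String> buildPropertyOrder(Class<?> beanType, List<String> discovered) {
        List<String> properties = new ArrayList<>(discovered);
        for (ToySerializerModifier modifier : modifiers) {
            properties = modifier.changeProperties(beanType, properties);
        }
        return properties;
    }
}

class ModifierSketch {
    static List<String> demo() {
        ToyBeanSerializerBuilder builder = new ToyBeanSerializerBuilder();

        // First modifier: drop a property that should never be written out.
        builder.addModifier((type, props) -> {
            List<String> kept = new ArrayList<>(props);
            kept.remove("password");
            return kept;
        });

        // Second modifier: force alphabetic ordering of whatever is left.
        builder.addModifier((type, props) -> {
            List<String> sorted = new ArrayList<>(props);
            Collections.sort(sorted);
            return sorted;
        });

        return builder.buildPropertyOrder(Object.class, Arrays.asList("name", "password", "age"));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [age, name]
    }
}
```

Because each modifier receives the output of the previous one, two modules can each tweak one aspect without either having to replace the whole construction process.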

6. Contextual configuration of serializers, deserializers

While ability to change the way bean serializers, deserializers are configured and constructed is powerful, there was one other aspect of construction process that needed revamping. Up until version 1.6, once a serializer (deserializer) was constructed for a given type, same instance was used for properties of that type. This meant that any context-specific behavior (serialization of a field of specific type being handled differently, depending on which exact property is being serialized) was hard to do; and basically could not be done from within serializer or deserializer implementation.

Consider something that would seem like a simple extension: ability to define which DateFormat to use for serializing specific properties. For example, we might want something like:

  public class Bean {
    @JsonDateFormat("yyyy-MM-dd")
    public Date createdDate;
  }

in which 'createdDate' property would be serialized using specified DateFormat, instead of the default DateFormat mapper uses.

Problem is two-fold: first of all, JsonSerializer/JsonDeserializer does not get enough contextual information to do much configuration. But worse, even if it did, there would be just one instance that is used regardless of location of property. So the only way (pre-1.7) to implement such feature would be to explicitly add support within core Jackson data binder; BeanSerializerFactory and AnnotationIntrospector would need to be modified at minimum.

One obvious way to solve the problem would have been to pass contextual information during serialization/deserialization. But while this would be a powerful mechanism, it would add significant amount of overhead, especially if configuration was to be done using annotations. Instead we decided to pass this information during construction of serializer/deserializer instance; from design perspective this is compatible with the general goal of trying to gather as much information as possible during non-performance-critical phase of constructing handlers, and minimize work to be done during performance-critical serialization phase.

Specific mechanism chosen is that of defining two interfaces (ContextualSerializer, ContextualDeserializer) that serializer and deserializer instances can implement. And if they do, SerializerProvider / DeserializerProvider will first construct instance, and then call methods in new interfaces, to allow creation of contextual instances, passing information about context in form of BeanProperty instance which gives property name and access to all related annotations (as well as currently active configuration).

With this information it will be possible to support use cases such as the one explained above: in fact, unit tests used to verify functionality define trivial serializer types (like StringSerializer that can conditionally lower-case property values based on existence of a test annotation).
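The idea of resolving a contextual instance at construction time can be sketched with plain JDK reflection. This is a toy model under stated assumptions, not the ContextualSerializer interface itself: ToyDateFormat, ToyDateSerializer, ToyBeanWriter and ToyBean are hypothetical names, a java.lang.reflect.Field stands in for BeanProperty, and only public Date fields are handled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TimeZone;

// Hypothetical annotation, standing in for the per-property date format idea.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface ToyDateFormat {
    String value();
}

// Toy serializer that can hand out a property-specific variant of itself.
class ToyDateSerializer {
    private final String pattern; // null means "write epoch milliseconds"

    ToyDateSerializer(String pattern) {
        this.pattern = pattern;
    }

    // Construction-time hook: look at the property once, pick the instance to use.
    ToyDateSerializer createContextual(Field property) {
        ToyDateFormat annotation = property.getAnnotation(ToyDateFormat.class);
        return (annotation == null) ? this : new ToyDateSerializer(annotation.value());
    }

    // Serialization-time path: no annotation lookups happen here.
    String serialize(Date value) {
        if (pattern == null) {
            return String.valueOf(value.getTime());
        }
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(value);
    }
}

class ToyBean {
    @ToyDateFormat("yyyy-MM-dd")
    public Date createdDate = new Date(0L);

    public Date modifiedDate = new Date(86_400_000L);
}

class ToyBeanWriter {
    private final Map<Field, ToyDateSerializer> serializers = new LinkedHashMap<>();

    // Construction phase: resolve one contextual serializer per Date field.
    ToyBeanWriter(Class<?> beanType) {
        ToyDateSerializer shared = new ToyDateSerializer(null);
        for (Field field : beanType.getFields()) {
            if (field.getType() == Date.class) {
                serializers.put(field, shared.createContextual(field));
            }
        }
    }

    // Serialization phase: apply the pre-resolved serializers.
    Map<String, String> write(Object bean) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<Field, ToyDateSerializer> entry : serializers.entrySet()) {
            try {
                Date value = (Date) entry.getKey().get(bean);
                out.put(entry.getKey().getName(), entry.getValue().serialize(value));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return out;
    }
}

class ContextualSketch {
    public static void main(String[] args) {
        System.out.println(new ToyBeanWriter(ToyBean.class).write(new ToyBean()));
    }
}
```

Here createdDate is written as 1970-01-01 while modifiedDate, which has no annotation, falls back to the shared instance and is written as 86400000. All annotation inspection happens once in the ToyBeanWriter constructor, which mirrors the design choice of doing the expensive work while handlers are constructed rather than during serialization.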

7. From theoretical to practical extensibility

While it has been just 4 weeks since the release, extensibility improvements outlined above have already been made good use of by multiple projects. I am aware of at least following extension projects (please let me know of others if you know):

  • bson4jackson (support for BSON format (used by MongoDB))
  • jackson-module-scala (support Scala data types) (there is also another noteworthy Scala-with-Jackson project, Jerkson)
  • jackson-module-hibernate (support lazy-loaded Hibernate types)
  • jackson-module-guava (support google Guava collection types)
  • jackson-xml-databind (support reading/writing XML instead of JSON, "mini-JAXB") -- I will definitely need to write bit more about this in near future (can't use XStream or JAXB at GAE? jackson-xml-databind actually can be -- and it is much faster than either on J2SE platform as well)

and new ones are bound to come up (there has been talk of adding a Joda module and a CSV module, for example).

8. Beyond extensibility: other new features, improvements

As important as extensibility is (along with the benefits it brings, such as new modules!), 1.7 actually contains a few other important improvements and new features that are not directly related to extensibility. Here's a quick list of the most noteworthy ones:

  • @JsonTypeInfo can now be used for properties (fields, getter/setter methods), not just types (classes) -- useful for "untyped" fields (like ones using java.lang.Object as value), so one need not enable default type information
  • Dynamic Filtering: powerful new filtering mechanism using @JsonFilter to specify filter id, ObjectMapper.filteredWriter(FilterProvider) to specify which id maps to which filter -- this is a major new feature, and I hope to write more about it too
  • Support for wrapping output within "root name" (similar to JAXB), for interoperability with other JSON tools, frameworks
  • @JsonRawValue for injecting "raw text" (such as pre-encoded JSON without re-parsing) during serialization
  • SerializedString for high-efficiency serialization of pre-encoded (quoted, utf-8 encoded) String values, property names
  • Feature to enable/disable wrapping of runtime exceptions (separately for serialization, deserialization)

About me

  • I am known as Cowtowncoder