Tuesday, November 30, 2010

7 Killer Features that set Jackson apart from competition (Java JSON)

Jackson is a well-established Java JSON processor, known to offer things like:

  • Simple and convenient parsing of JSON, with conversion to/from Java objects
  • Extensive configurability with annotations, settings
  • Ultra-fast performance with streaming parsing as well as full data binding

But many other Java JSON libraries offer convenience and configurability; and performance is not always among the most important aspects for users.
So why should a Java developer choose Jackson over competition?

The short list of features above is actually just the tip of the iceberg of Jackson functionality. True, these three general areas are important; but in a way they are just a starting point, the baseline that all JSON processors should offer to be worthy of even being considered as the tool to use. But beyond this baseline there is much, much more that could and should be offered; and this is where Jackson really delivers.

So let's look at a sampling of 7 -- nice round number -- "killer features" that set Jackson years ahead of the competition, presented in the order they were introduced (starting with version 1.0, with the last ones added by 1.6).

1. Multiple processing modes, all co-operating nicely

Starting with the basics, there are multiple ways to consume and produce JSON data. Although many libraries offer just a single way (processing model), there are essentially three complementary ways (read "There are Three ways..." for a longer explanation) to process JSON:

  • Incremental ("streaming") parsing and generation: high-performance, low-overhead sequential access. This is the lowest-level processing method, comparable to SAX and Stax APIs for XML processing. All packages must have such a parser internally, but not all expose it.
  • Tree-based data model ("DOM for JSON"). Tree is a natural conceptual model to present JSON content; and as such many packages offer functionality to operate on JSON as a logical tree. This is a flexible model, well-suited for some tasks, and great for prototyping or ad hoc access.
  • Data-binding (JSON to/from POJOs). The ultimate in convenience, and typically more efficient than tree-based access, data binding is usually the most natural fit for Java developers. It is used with most Java REST frameworks, such as JAX-RS.

Despite the obvious benefits of offering multiple views, each with its own optimal use cases, few (if any?) other Java JSON packages offer all three of these canonical processing models.
Most offer just one (org.json exposes data as Trees; Gson implements data binding, for example). Jackson offers the full set; all modes fully supported, and best of all, in such a way that it is easy to convert between modes, mix and match. For example, to process very large JSON streams, one typically starts with a streaming parser, but uses data binder to bind sub-sections of data into Java objects: this allows processing of huge files without excessive memory usage, but with full convenience of data binding.
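To make the "streaming parser plus data binder" combination concrete, here is a minimal sketch. It assumes Jackson 2.x packages (com.fasterxml.jackson), which differ from the 1.x org.codehaus.jackson ones; the Point class and the sumOfX method are made up for illustration.

```java
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.io.UncheckedIOException;

class StreamAndBindSketch {
    // Hypothetical POJO for one element of a (potentially huge) JSON array.
    static class Point {
        public int x;
        public int y;
    }

    // Walks a top-level JSON array with the streaming parser and binds
    // one element at a time, so only a single Point is held in memory.
    static int sumOfX(String json) {
        ObjectMapper mapper = new ObjectMapper();
        try (JsonParser parser = mapper.getFactory().createParser(json)) {
            if (parser.nextToken() != JsonToken.START_ARRAY) {
                throw new IllegalArgumentException("expected a JSON array");
            }
            int sum = 0;
            // Each START_OBJECT marks the next element; END_ARRAY ends the loop.
            while (parser.nextToken() == JsonToken.START_OBJECT) {
                Point p = mapper.readValue(parser, Point.class);
                sum += p.x;
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same loop works with a parser created from a file or stream instead of a String, which is where the memory savings actually matter.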

2. Use any constructors, factory methods that you want (not just default zero-arg one!)

Most data binding tools (for JSON as well as XML) require one to define and use a zero-argument constructor to instantiate Java objects, and then set properties with setters or direct field access. This is unfortunate, as it makes it difficult to use "immutable object" patterns, and it differs from the access patterns used with regular code.

Jackson thinks that developers deserve the ability to specify whatever constructors or factory methods they want for instantiation; just annotate the thing you want, like so:

public class MyBean {
  private final int value;

  public MyBean(@JsonProperty("value") int v) { this.value = v; }

  public int getValue() { return value; }
}

And you can define POJOs like, well, like you would do it anyway in the absence of JSON processing.
(for more on immutable objects with Jackson, see this blog entry)

3. Not just Annotations, but Mix-in Annotations!

Although there are many benefits to using Java Annotations for defining metadata (like type-safety, compile-time checking, elimination of separate XML configs, DRY principle etc. etc), there are drawbacks: the most obvious one being that to add annotations you must be able to modify classes. And you can not usually (nor should you) modify code of third-party libraries, at least not just to configure your JSON serialization aspects.

But what if you could just loosely associate annotations on-the-fly, instead of embedding them in code? I think that's a marvellous idea, and it is more or less what Jackson Mix-in Annotations are all about: you can associate annotations (which are declared as part of a surrogate interface or class) with target classes, so that the target classes are handled as if the annotations had been declared by the target class itself.

To learn more, read "use Mix-In Annotations to reuse, decouple"
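As a rough illustration of the idea, the sketch below attaches annotations to a class without touching its source. It assumes Jackson 2.5 or later (ObjectMapper.addMixIn and the com.fasterxml.jackson packages); Rectangle stands in for a third-party class, and both class names are hypothetical.

```java
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

class MixInSketch {
    // Stand-in for a third-party class that we can not (or should not) edit.
    static class Rectangle {
        private final int w;
        private final int h;
        Rectangle(int w, int h) { this.w = w; this.h = h; }
        public int getW() { return w; }
        public int getH() { return h; }
        public int getSize() { return w * h; }
    }

    // Surrogate that only carries annotations; method names and
    // parameters must match those of the target class.
    abstract static class RectangleMixIn {
        @JsonProperty("width") abstract int getW();
        @JsonProperty("height") abstract int getH();
        @JsonIgnore abstract int getSize();
    }

    static String toJson(Rectangle r) {
        ObjectMapper mapper = new ObjectMapper();
        // Treat Rectangle as if it declared RectangleMixIn's annotations itself.
        mapper.addMixIn(Rectangle.class, RectangleMixIn.class);
        try {
            return mapper.writeValueAsString(r);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With the mix-in registered, the output uses "width" and "height" as property names and leaves out the derived "size" value.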

4. Complete support for generic types

By now, generic types are an integral part of modern Java development. Except that not all JSON libraries support generics; and even those that do often fail at handling more complex cases.

Consider the following generic types:

  public class Wrapper<T> {
    public T value;
  }
  public class ListWrapper<E> extends Wrapper<List<E>> { }

When asked to deserialize such types, like so:

  ListWrapper<Integer> w = objectMapper.readValue("{\"value\":[13,7]}",
    new TypeReference<ListWrapper<Integer>>() { } );

Jackson has little trouble figuring out the necessary pieces and producing the expected value. It may well be the only Java JSON package that does this (and more) at this point.
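The anonymous-subclass idiom used in the readValue call above relies on the fact that a subclass keeps its generic superclass in class metadata, even though ordinary generic instances are erased. The JDK-only sketch below shows that mechanism in isolation; TypeToken is a hypothetical stand-in written for this example, not Jackson's own TypeReference code.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

class TypeTokenSketch {
    // Hypothetical stand-in for a TypeReference-style class.
    abstract static class TypeToken<T> {
        final Type type;

        protected TypeToken() {
            // The anonymous subclass records its generic superclass,
            // for example TypeToken<List<Integer>>, so T can be read back.
            ParameterizedType superType =
                (ParameterizedType) getClass().getGenericSuperclass();
            this.type = superType.getActualTypeArguments()[0];
        }
    }

    // Captures the full generic type List<Integer>, not just List.
    static Type listOfIntegers() {
        return new TypeToken<List<Integer>>() { }.type;
    }
}
```

From such a captured type, a data binder can work out element types recursively, which is what is needed to resolve something like ListWrapper<Integer> down to a list of Integer values.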

5. Polymorphic types

Here's another factoid: inheritance and polymorphic types can be great tools for OO developers; but they are also major PITA for anyone implementing data binding systems.
Much of complexity for ORMs (like Hibernate) is due to functionality needed to flatten and unflatten data along inheritance hierarchy; and same is true for data serialization packages like JAXB. It is little wonder, then, that very few Java JSON packages support deserialization of polymorphic types; most require user to build explicit type resolution as application code.

How about Jackson? Not only does Jackson support automatic serialization and deserialization of dynamic and polymorphic types, it tries its best to Do It Right. Specifically, one does not have to expose Java class names as type information (which is the mechanism used by the only other JSON packages that offer any support) -- although one can, as it is configurable -- but can instead use logical type names (configurable via annotations, or registration). And regardless of what type identifiers are used, the inclusion method is also configurable (which is nice given the extreme simplicity of JSON as a format). All this comes with sensible defaults and contextual applicability (meaning you can define different settings for different types, too!).

For more on how Jackson handles Polymorphic Types, see "Jackson 1.5: Polymorphic Type Handling"
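Here is a small sketch of logical type names included as a property, assuming Jackson 2.x annotations (com.fasterxml.jackson.annotation). The Animal hierarchy, the Zoo holder and the "type" property name are all made up for illustration.

```java
import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;

class PolymorphicSketch {
    // Logical names ("dog", "cat") are written to a "type" property,
    // so no Java class names leak into the JSON.
    @JsonTypeInfo(use = JsonTypeInfo.Id.NAME,
                  include = JsonTypeInfo.As.PROPERTY,
                  property = "type")
    @JsonSubTypes({
        @JsonSubTypes.Type(value = Dog.class, name = "dog"),
        @JsonSubTypes.Type(value = Cat.class, name = "cat")
    })
    abstract static class Animal {
        public String name;
    }

    static class Dog extends Animal {
        public boolean barks;
    }

    static class Cat extends Animal {
        public int lives;
    }

    // Holder with a declared element type, so the binder knows to expect Animals.
    static class Zoo {
        public List<Animal> animals;
    }

    static Zoo read(String json) {
        try {
            return new ObjectMapper().readValue(json, Zoo.class);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Given input such as {"animals":[{"type":"dog","name":"Rex","barks":true}]}, the list element comes back as a Dog instance rather than a generic map or the abstract base type.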

6. Materialized interfaces (even less monkey code to write!)

Whereas support for Polymorphic Types is a very powerful feature -- but one that has plenty of inherent complexity -- here is a feature that is all about simplifying things: ability to "materialize interfaces" (and abstract classes).

That is, given an interface, like:

  public interface Bean {
    public int getX();
    public void setX(int value);
  }

you might want to skip the step of "now implement Bean interface with a class that has twice as many lines of monkey code", and proceed straight to

  Bean bean = objectMapper.readValue(json, Bean.class);

(without collecting your $200... er, writing your 10 lines of monkey code -- note, too, that we could have omitted 'setX()' from interface; Mr Bean is smart enough to know that some method is needed for injecting values)

There is just one line of configuration to enable this magnificent piece of magic (aka "Mr Bean"); check out "Materialized Interfaces" for more info.

I have yet to find a coder who would rather write implementation for such interfaces, so if there's a single feature to sell Jackson, this just might be it.
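To give a feel for what "materializing" an interface means, here is a JDK-only sketch that backs a Bean-style interface with a map through java.lang.reflect.Proxy. This is a swapped-in technique chosen for illustration only: it is not how Mr Bean is implemented, and the materialize method and its map argument are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

class MaterializeSketch {
    interface Bean {
        int getX();
        void setX(int value);
    }

    // Builds an implementation of the interface on the fly; getters and
    // setters read and write entries of a property map.
    @SuppressWarnings("unchecked")
    static <T> T materialize(Class<T> iface, Map<String, Object> initial) {
        Map<String, Object> props = new HashMap<>(initial);
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (name.length() > 3 && name.startsWith("get")
                    && method.getParameterCount() == 0) {
                return props.get(propertyName(name));
            }
            if (name.length() > 3 && name.startsWith("set")
                    && method.getParameterCount() == 1) {
                props.put(propertyName(name), args[0]);
                return null;
            }
            // Anything else (including toString) is out of scope for this sketch.
            throw new UnsupportedOperationException(name);
        };
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[] { iface }, handler);
    }

    // "getX" or "setX" becomes "x".
    private static String propertyName(String accessor) {
        return Character.toLowerCase(accessor.charAt(3)) + accessor.substring(4);
    }
}
```

The point is the same as with Mr Bean: the caller only ever sees the interface, and nobody has to hand-write the field, getter and setter.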

7. Parent/child reference support (one-to-many, ORM)

After the preceding set of general-purpose functionality, let's wrap up with something more specific: the ability to cleanly handle a certain subset of cyclic type references, known as parent/child links. These are closely coupled references where two objects have cross-references that are structured in a hierarchic fashion, such as the parent/child linkage of tree nodes. Or, even more commonly, references used with Object-Relational Mappers (ORM) to express table joins.

The problem with such references (or more generally, with cyclic references) is that JSON has no way to handle them naturally; there is no identity information to use, unlike with Java objects.
One commonly used work-around is to just mark one of the references to be ignored (with Jackson this could be done by using the @JsonIgnore annotation), but this has the drawback of losing the actual intended coupling when deserializing.

Jackson has a simple annotation-based solution to the problem: both references need an annotation (@JsonManagedReference for the "child" link, @JsonBackReference for the "parent" or "back" link), and based on this Jackson knows to omit the back reference when serializing, but to reinstate it when deserializing objects. This works well for typical ORM use cases.
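The two annotations can be tried out with a sketch like the following, assuming Jackson 2.x packages (com.fasterxml.jackson); the Parent and Child classes and the helper methods are made up for illustration.

```java
import com.fasterxml.jackson.annotation.JsonBackReference;
import com.fasterxml.jackson.annotation.JsonManagedReference;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.ArrayList;
import java.util.List;

class ParentChildSketch {
    static class Parent {
        public String name;
        // Forward ("child") link: serialized normally.
        @JsonManagedReference
        public List<Child> children = new ArrayList<>();
    }

    static class Child {
        public String name;
        // Back link: left out of the JSON, set again when reading.
        @JsonBackReference
        public Parent parent;
    }

    static String toJson(Parent p) {
        try {
            return new ObjectMapper().writeValueAsString(p);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    static Parent fromJson(String json) {
        try {
            return new ObjectMapper().readValue(json, Parent.class);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Without the annotations, serializing a Parent whose children point back at it would recurse until the stack overflows; with them, the JSON contains only the forward links and the round trip still restores child.parent.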

8. Is that all, folks?

Actually, some of the Jackson features I did not include might be killer features for others. If so, feel free to add comments to point out what else should be showcased.

Saturday, November 20, 2010

StaxMate 2.0.1 released; improved DOM-from-Stax, compatibility with default JDK 1.6 Stax implementation

Quick update from "XML world" -- in which I have spent much less time, due to explosive growth in JSON land: StaxMate 2.0.1 was just released.

1. StaxMate?

First question you might ask is "What the heck is StaxMate?". Fair enough -- given how little attention it has gotten, here is the main idea.

StaxMate is meant to offer "convenience of DOM with performance of Stax (or SAX)". Although the Stax API was an improvement in usability for many use cases, it is still a rather low-level access API. StaxMate is built around the concept of "cursors" when reading content, and output context objects when writing content. Sample code and a bit more in-depth explanation can be found on the StaxMate Tutorial page; but the basic idea is to offer better abstractions than a simple flat event iterator. Sort of like how automatic transmission can simplify driving, compared to a manual stick shift.

Working with cursors is typically similar to how DOM documents are traversed in simple top-down (recursive-descent) fashion: you start with the root element, get child elements, locate more children, textual content and so forth. The same is done with StaxMate, with just one crucial limitation: all access must be done in document order (parent first, then children, in the order they appear in the XML document). If you need to retain some information, you must do it explicitly (attribute values from parents need to be accessed before child elements, for example). StaxMate will take care to synchronize access when you use child cursors, so you will never need to worry about skipping remaining siblings; you just can not access things in random order. The same is also true for the output side, although there are ways to temporarily "freeze" output, which does allow building content somewhat out of order as necessary. This may be needed for things like calculating parent attribute values based on content written for child elements.

The benefit of requiring access to be done in document order is that it means that there is no additional performance or memory overhead for keeping track of past content. Memory usage, therefore, is not very different from that of "raw" Stax parser or generator; same is true for performance. Overhead of DOM documents is often 3x - 5x that of streaming access; overhead of using StaxMate is typically in 10-20% range, sometimes even lower.
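For contrast, this is roughly what the "flat event iterator" style looks like with nothing but the JDK's javax.xml.stream API. It is not StaxMate code; the element names ("list", "item") and the readItems method are hypothetical. Note how the parent attribute has to be saved explicitly before the children are reached, which is the document-order constraint described above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class RawStaxSketch {
    // Reads <list name="..."><item>text</item>...</list> in document order
    // and returns one "listName:itemText" entry per item.
    static List<String> readItems(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            String listName = null;
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue; // only start tags matter in this sketch
                }
                if ("list".equals(reader.getLocalName())) {
                    // Parent data must be captured now; there is no going back.
                    listName = reader.getAttributeValue(null, "name");
                } else if ("item".equals(reader.getLocalName())) {
                    // Consumes text up to and including the matching end tag.
                    out.add(listName + ":" + reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out;
    }
}
```

All of the nesting logic lives in the if/else chain here; the cursor abstraction exists to take that bookkeeping off your hands while keeping the same streaming cost profile.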

2. Fixes in 2.0.1

This patch release contains just 2 fixes, but both are quite important, so upgrade is strongly recommended.

The first fix is to the DOM-compatibility part (see "Reading DOM documents using Stax XML parser, StaxMate" for details on usage). It turns out that although building a full DOM document worked fine with 2.0.0, there were issues when binding sub-trees; these issues should now be resolved.

The second fix is to interoperability with Stax parsers that do not implement the Stax2 extension API (to date, Woodstox and Aalto implement it, but others do not; most notably Sun Sjsxp, which is the default Stax parser bundled with JDK 6). Although most operations work just fine, Typed Access accessors (getting XML element text as a number, boolean value or enum) could cause state updates to work incorrectly, leading to issues when accessing a sequence of typed values. This has been resolved by fixing the underlying problem in the Stax2 API reference implementation library that StaxMate depends on (version 3.0.4 of the library contains the fixes).


About me

  • I am known as Cowtowncoder