Monday, August 31, 2009

Processing non-standard "JSON"

One of the basic design principles of Jackson is that it reads and writes the JSON format, and nothing else. That "else" includes not only other formats like XML, but also outcast mutant data that calls itself JSON but does not conform to the JSON specification.
For purposes of this entry, let's call such content "JSON", with quotes to emphasize its almost-but-not-quite nature (we could as well call it JSON--, !JSON, JSON* or pidginJSON -- whatever, name choice is arbitrary).

Whereas there is just one well-specified and well-documented structure for JSON, there are countless possible ways to do "something like JSON" ("JSON"). But there is a relatively small set of commonly seen apocryphal features for such JSON pidgins.
These are:

  • Use of comments: usually C ("/* ... */") or C++ ("// ....") comments. The origin of this addition is probably the JSON specification itself: some of the earliest drafts did indeed include the ability to add such comments
  • Optional quoting around field names (i.e. leaving double-quotes out of field names). This usage is probably related to the fact that quoting is optional in Javascript, and to the logical leap suggested by the origin of the name JSON itself

These are the most commonly seen deviations, at least based on user requests for supporting "JSON" content.

1. ToSupport or !ToSupport

From a philosophical perspective, it would be tempting to outright decline to have anything to do with "JSON" content. After all, adding support might be seen as encouraging the use of non-standard features, leading to interoperability problems (depending on which extra features various processors choose to support). But on the other hand, it is often useful to follow the "liberal with what you accept; conservative with what you produce" guideline when promoting interoperability -- especially when there already exist implementations that make use of the non-standard extensions mentioned earlier.

So with some procrastination, discussion and delay, optional support features have been added over time to let Jackson deal with "JSON".

With the warning that you should use such features only if you absolutely must (for interoperability reasons), here's what features exist and how they can be enabled.

2. Handling of comments

Comments that are encountered with "content like JSON" come in 2 main flavors: C-style (/* ... */) and C++-style (// ....). Both forms were actually allowed in some earlier drafts of the JSON specification, but were dropped from the final version (what a shame -- my opinion is that this would have been a valuable and useful feature).

By default such comments in content will result in a parse exception. But it is possible to make the parser simply skip such comments by configuring the factory with:

  jsonFactory.enable(JsonParser.Feature.ALLOW_COMMENTS); // Jackson 1.2+
  jsonFactory.configure(JsonParser.Feature.ALLOW_COMMENTS, true); // Jackson 1.0+

(and similarly directly for JsonParser instances, using 'parser.configure()')

or, when using ObjectMapper:

  objectMapper.configure(JsonParser.Feature.ALLOW_COMMENTS, true); // Jackson 1.2+

(which will basically configure the parser instances that the mapper creates and uses)

For interoperability, it's best not to create any such content: but if you really must create such comments, you can do that by using "raw output" methods of JsonGenerator:

  jsonGenerator.writeRaw(" // my non-standard comment\n");
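
To tie the two halves of this section together, here is a minimal round-trip sketch: it writes an object with a raw comment inside it, and then reads the result back with comment skipping enabled. It assumes Jackson 1.x (the org.codehaus.jackson packages) is on the classpath; the class and method names (CommentRoundTrip, writeWithComment, readName) are made up for illustration and are not part of Jackson.

```java
import java.io.IOException;
import java.io.StringWriter;

import org.codehaus.jackson.JsonFactory;
import org.codehaus.jackson.JsonGenerator;
import org.codehaus.jackson.JsonParser;
import org.codehaus.jackson.JsonToken;

// Hypothetical helper class, for illustration only
public class CommentRoundTrip {

    // Writes a one-field object with a C++-style comment injected as raw output
    public static String writeWithComment() throws IOException {
        JsonFactory factory = new JsonFactory();
        StringWriter out = new StringWriter();
        JsonGenerator gen = factory.createJsonGenerator(out);
        gen.writeStartObject();
        // Raw output is written as-is; the trailing linefeed ends the comment
        gen.writeRaw(" // my non-standard comment\n");
        gen.writeStringField("name", "Bob");
        gen.writeEndObject();
        gen.close();
        return out.toString();
    }

    // Reads the value of the "name" field, skipping any comments in the content
    public static String readName(String content) throws IOException {
        JsonFactory factory = new JsonFactory();
        factory.configure(JsonParser.Feature.ALLOW_COMMENTS, true);
        JsonParser parser = factory.createJsonParser(content);
        try {
            while (parser.nextToken() != null) {
                if (parser.getCurrentToken() == JsonToken.FIELD_NAME
                        && "name".equals(parser.getCurrentName())) {
                    parser.nextToken(); // advance to the field value
                    return parser.getText();
                }
            }
            return null; // no "name" field found
        } finally {
            parser.close();
        }
    }

    public static void main(String[] args) throws IOException {
        String content = writeWithComment();
        System.out.println(content);
        System.out.println(readName(content));
    }
}
```

Without the ALLOW_COMMENTS line, the same readName() call should fail with a parse exception on the comment, which is the default behavior described above.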

3. Handling of unquoted field names

Another common deviation from the standard is the use of unquoted names (field names not surrounded by double quotes). This deviation is probably due to the common misconception that JSON is a straightforward static subset of Javascript (which is not true).
At any rate such content is often seen, and is sometimes even preferred by some rogue JSON practitioners.

Generating such fields can be enabled by a JsonGenerator feature:

  jsonFactory.disable(JsonGenerator.Feature.QUOTE_FIELD_NAMES); // Jackson 1.2+
  jsonFactory.configure(JsonGenerator.Feature.QUOTE_FIELD_NAMES, false); // Jackson 1.0+

after which names will be output without surrounding double-quotes.
And to accept such content, the parser needs to be configured with:

  jsonFactory.enable(JsonParser.Feature.ALLOW_UNQUOTED_FIELD_NAMES); // Jackson 1.2+ only

or, when using ObjectMapper:

  objectMapper.configure(JsonParser.Feature.ALLOW_UNQUOTED_FIELD_NAMES, true); // Jackson 1.2+
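
As a rough end-to-end illustration, the sketch below enables both parser features on an ObjectMapper and binds a piece of "JSON" (with comments and one unquoted name) to a plain Map. It assumes Jackson 1.2 or later from the org.codehaus.jackson packages on the classpath; LenientJsonExample and parseLenient are hypothetical names chosen for this example only.

```java
import java.io.IOException;
import java.util.Map;

import org.codehaus.jackson.JsonParser;
import org.codehaus.jackson.map.ObjectMapper;

// Hypothetical helper class, for illustration only
public class LenientJsonExample {

    // Binds "JSON" that may contain comments and unquoted field names to a Map
    @SuppressWarnings("unchecked")
    public static Map<String, Object> parseLenient(String content) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        // Both features are off by default and must be enabled explicitly
        mapper.configure(JsonParser.Feature.ALLOW_COMMENTS, true);
        mapper.configure(JsonParser.Feature.ALLOW_UNQUOTED_FIELD_NAMES, true);
        return mapper.readValue(content, Map.class);
    }

    public static void main(String[] args) throws IOException {
        String content = "{ /* C-style comment */ name: \"Bob\", // C++-style comment\n"
                + " \"age\": 13 }";
        Map<String, Object> result = parseLenient(content);
        System.out.println(result.get("name") + " / " + result.get("age"));
    }
}
```

Quoted and unquoted names can be mixed freely in the same content, as the example input does; enabling the feature only relaxes what the parser accepts, and standard JSON still parses as before.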

4. Warning

Did I already mention that you shouldn't really be using the features I talked about above? And that if you do end up enabling these features, you should seriously consider repenting (take a shower after use, another one for extra measure; do 50 pater nosters, and spit over your left shoulder)?

About me

  • I am known as Cowtowncoder
  • Contact me at@yahoo.com
Check my profile to learn more.