Monday, June 29, 2009

So Long Marchex

(and thanks for all the pfish?)

Last week marked the end of another chapter in my career. Although that story is not a Tom Clancy style novel by any means, the cast so far has been pretty colorful. And while I am a bit sad to go, there are times when you realize fairly soon that it is time to move on. This was one of those times.

Regarding my stay: my biggest regret is not so much that my stint was a short one (it fell short of the 1 year minimum I had mentally prepared for), but that I didn't stay there long enough to get the system I was working on completed and pushed to production. There is something to be said for getting a system together and churning some real data; no matter how ad-hoc the design and implementation might be, as long as in the end it works, and works reliably. But in this case I only managed to wrap up some components (significant chunks, but still just pieces) -- the rest is up to the remaining Marchexians, who fortunately are competent and should be able to complete it. And with any luck, they will feel at least a pinch of pride for the release.

But wait: there are more cliches to be served! For example: I should note that for every end, there's a new beginning. Like this one has. As it happens, I am now starting my new (or, shall we say, old... more on that later on) job. With all the usual high hopes. :-)

... which, as they say, is another story altogether. So more on that later on.

Saturday, June 27, 2009

Woodstox, high impact factor & being #32 on Top Open Source Java libs list

Another interesting data point, this time from analysing Maven Dependency paths: "Most Referenced" list. Looks like Woodstox is quite widely used by projects that use or at least declare their dependencies using Maven: I assume the magic number 1838 (which gives rank #32) could mean the number of other projects depending on Woodstox. Not too shabby for an xml parser. Getting on the first result page is quite remarkable; especially considering that Woodstox ranks higher than many other worthy Java open source libraries like XStream, Hibernate, Quartz, Xalan and Velocity. And only slightly (by about 50% :-) ) trailing such a ubiquitous thingy as Spring.

Although this is just one way of estimating the popularity of various (Java) OS libs, it is still interesting, because it has similarities to how scientific articles are ranked (impact factor; although here weights are uniform). And also since it could lend itself to Google PageRank style extensions as well... let's see.
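To make the difference between uniform reference counting and a PageRank style weighting concrete, here is a minimal sketch. The dependency graph, the project names and the class `DependencyRank` are all made up for illustration; nothing here comes from the actual Maven data.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class DependencyRank {
    /** Hypothetical toy graph: project name -> libraries it declares as dependencies. */
    public static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("app-a", Arrays.asList("parser", "framework"));
        g.put("app-b", Arrays.asList("framework"));
        g.put("app-c", Arrays.asList("framework", "templating"));
        g.put("framework", Arrays.asList("parser"));
        return g;
    }

    /** Uniform weights: every declared dependency counts as one reference. */
    public static Map<String, Integer> referenceCounts(Map<String, List<String>> dependsOn) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> deps : dependsOn.values()) {
            for (String dep : deps) {
                counts.merge(dep, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** PageRank style weights: a reference from a highly ranked project counts for more. */
    public static Map<String, Double> pageRank(Map<String, List<String>> dependsOn,
                                               int iterations, double damping) {
        Set<String> nodes = new TreeSet<>(dependsOn.keySet());
        for (List<String> deps : dependsOn.values()) {
            nodes.addAll(deps);
        }
        int n = nodes.size();
        Map<String, Double> rank = new TreeMap<>();
        for (String node : nodes) {
            rank.put(node, 1.0 / n);
        }
        for (int i = 0; i < iterations; i++) {
            // Rank held by nodes without dependencies is spread evenly over all nodes
            double dangling = 0.0;
            for (String node : nodes) {
                if (dependsOn.getOrDefault(node, Collections.<String>emptyList()).isEmpty()) {
                    dangling += rank.get(node);
                }
            }
            Map<String, Double> next = new TreeMap<>();
            for (String node : nodes) {
                next.put(node, (1.0 - damping) / n + damping * dangling / n);
            }
            // Each project splits its rank evenly among the libraries it depends on
            for (String node : nodes) {
                List<String> deps = dependsOn.getOrDefault(node, Collections.<String>emptyList());
                for (String dep : deps) {
                    next.merge(dep, damping * rank.get(node) / deps.size(), Double::sum);
                }
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = sampleGraph();
        System.out.println("reference counts: " + referenceCounts(g));
        System.out.println("pagerank scores:  " + pageRank(g, 50, 0.85));
    }
}
```

In this toy graph "framework" has more direct references than "parser", but "parser" ends up with the higher PageRank style score, because the well-referenced "framework" passes its whole weight on to it. That is the kind of effect a weighted extension would add over the plain count.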

Friday, June 26, 2009

Finally, after 3 years and 160 blog entries...

I collected enough clicks for the Big G (and specifically their omni-present AdSense program) to get an actual payment scheduled! That is $100 (the minimum amount the program pays out), won by aspiration, inspiration and perspiration, and a bit of persistence and perseverance as well. But maybe next time I should just blog about something like asbestos lawsuits or mesothelioma. :-)

Anyway: the money is still not in the bank (hence, scheduled), but I trust to find a hundred bucks minus taxes in my low-interest-bearing checking account in the very near future. And in the meantime, my AdSense account now displays two new links too (stats since last payment!)

ps. Just to make sure this is not interpreted the wrong way by my colleagues -- no, this new-found wealth was in no way contributing to my recent career decisions. :-D

Thursday, June 25, 2009

Jackson data binding: which is faster, method (getter/setter) or field access?

After adding the ability to use direct field access for data binding for Jackson 1.1 -- in addition to the already existing setter/getter-based access -- (see the previous entry), I started wondering whether there might be significant performance differences between the two methods. Why? Because a micro-benchmark from a while back suggested that there might just be such a difference, in favor of method-based access.

But is there? I modified the code under 'src/perf' a bit (classes TestSerPerf, TestDeserPerf), and here's a sample of the output from running both micro benchmarks.

Serialization:

Test 'Jackson, object+GET, final' -> 51 msecs (236).
Test 'Jackson, object+Field, final' -> 51 msecs (8).
Test 'Jackson, tree' -> 49 msecs (36).

Deserialization:

Test 'Jackson, object+SET' -> 72 msecs (54).
Test 'Jackson, object+Field' -> 69 msecs (33).
Test 'Jackson, tree' -> 80 msecs (122).

(the "tree" variant gives perspective on how different the speed of serializing/deserializing JsonNode-based Tree Models is)

So the short answer is: no, there is not much difference (actual values fluctuate during the run, but the relative ratios are rather constant). This is quite different from the earlier results; and either JDK Reflection access for Fields has improved a lot with 1.6 updates; or most of the time is spent in places different from actual direct access (quite plausible).
So fortunately you can choose access method you prefer based on factors other than one of them being significantly slower than the other. :-)

Also, on a related note, it is interesting that it is slightly faster to serialize Tree Models (write json from an in-memory tree) than POJOs, but conversely a bit slower to deserialize into them. The latter is probably because building LinkedHashMaps is slower than constructing actual beans.
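For anyone who wants to poke at the raw Reflection side of the getter-versus-field question without Jackson in the picture, here is a minimal sketch using only the JDK. The class `AccessTimingSketch` and its `Point` bean are made up for this example, and this is not the code under 'src/perf'; it just reads the same value through `Method.invoke()` and `Field.getInt()` in a loop.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

public class AccessTimingSketch {
    /** Hypothetical bean exposing the same value via a public field and a getter. */
    public static class Point {
        public int x;
        public Point(int x) { this.x = x; }
        public int getX() { return x; }
    }

    /** Reads the value 'rounds' times through the getter, using Reflection. */
    public static long sumViaGetter(Point p, int rounds) {
        try {
            Method getter = Point.class.getMethod("getX");
            long sum = 0;
            for (int i = 0; i < rounds; i++) {
                sum += (Integer) getter.invoke(p);
            }
            return sum;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the value 'rounds' times directly from the public field, using Reflection. */
    public static long sumViaField(Point p, int rounds) {
        try {
            Field field = Point.class.getField("x");
            long sum = 0;
            for (int i = 0; i < rounds; i++) {
                sum += field.getInt(p);
            }
            return sum;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point p = new Point(3);
        int rounds = 1000000;
        // Warm-up so that the JIT has a chance to compile both paths
        sumViaGetter(p, rounds);
        sumViaField(p, rounds);

        long t0 = System.nanoTime();
        long viaGetter = sumViaGetter(p, rounds);
        long t1 = System.nanoTime();
        long viaField = sumViaField(p, rounds);
        long t2 = System.nanoTime();

        System.out.println("getter: " + (t1 - t0) / 1000000 + " msecs (" + viaGetter + ")");
        System.out.println("field:  " + (t2 - t1) / 1000000 + " msecs (" + viaField + ")");
    }
}
```

As with any micro-benchmark, actual numbers depend heavily on the JVM version and warm-up, and a loop like this only measures the access itself, not the rest of what a data binder does.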

Monday, June 22, 2009

Jackson Goes All 1.1

Due to the rapid speed of Jackson JSON processor development, a significant new release was just cut. The release goes by the catchy nickname "1.1" (given that 1.0 was to be known as "Hazelnut", this could perhaps be known as "Macadamia", but I digress).
Beyond the obvious utility aspect (I use Jackson myself and want to start using some of the new features at work, with the "official" version), this should also be good for getting feedback on some exciting new features (like JSON Schema generation).

Here are the highlights from 1.1 announcement (for more complete list, refer to full 1.1 release notes):

  • Support for JAXB annotations: you can reuse existing JAXB-annotated beans; and support can optionally be combined with 'native' Jackson annotations (using AnnotationIntrospector.Pair for chaining)
  • Ability to generate JSON Schema definitions using Jackson serializers on arbitrary POJO (package for schema is "org.codehaus.jackson.schema", part of Mapper jar, but it is invoked using ObjectMapper like all data mapping operations)
  • Support for direct field access: public member fields and explicitly annotated fields (using @JsonProperty) can be serialized and deserialized. And unlike with JAXB, it is ok to find both field and methods (methods have precedence if this happens)
  • Annotation set has been streamlined: although all existing 1.0 annotations work (and will work for all 1.x releases); almost all functionality can be defined using but 3 new annotations:
    • @JsonProperty for indicating getters/setters/accessible fields, and to override logical property name associated if need be
    • @JsonSerialize to configure serialization (external serializer to use, whether to output null/non-default properties etc)
    • @JsonDeserialize to configure deserialization (external deserializer to use, sub-types to use)

Part of new functionality (namely, JAXB annotation support) lives in a brand new jar ("jackson-xc" aka "Xtra-Curricular stuff"); otherwise deployment aspects haven't changed.

With 1.1 done, development for 1.2 version can start next. There's a big list of more functionality to implement -- but discussing that will be worth a separate blog entry. Stay tuned!
(and remember to check out 1.1 JavaDocs at Codehaus: it's the easiest way to document things and I really try to make them useful)

ps. The Really Useful Backing Corporate Entity known as FasterXML.com offers full support for using this new version to maximum effect. Just in case you weren't aware of such support.

Thursday, June 18, 2009

Jackson and JAX-RS: now with RESTEasy too (with its 1.1 release)

Ok, this following announcement (for RESTEasy 1.1 release) caught my eye. Looks like JBoss is another corporate user of Jackson. Release notes do not give detailed information (nor related Jira entry), so I am not 100% sure what the level of integration is, but it sounds good at any rate. Even if just as a sanity check to verify that Jackson's JAX-RS json provider implementation works on multiple JAX-RS implementations. Which is good.

This should be one of those win-win situations: besides RESTEasy users having a way to avoid Franken-JSON, Jackson users may benefit from the additional insights that an extended user community and visibility can provide. I see this as another sign pointing to Jackson becoming something of a "Woodstox of JAX-RS world" (as Woodstox is the de facto xml parser used by Soap 2 stacks, i.e. JAX-WS).

Wednesday, June 17, 2009

Reading DOM documents using Stax XML parser, StaxMate

One of the new features of StaxMate 2.0 is the ability to read DOM Documents (given a plain old Stax XMLStreamReader), and write DOM documents (using a Stax XMLStreamWriter). This is something no Stax parser (no, not even Woodstox!) provides, since it is in the "reverse" direction of what a Stax implementation could support (reading DOM documents as Stax streams, or directing output of a stream writer into a DOM document).

Functionality for converting to/from DOM is contained in class org.codehaus.staxmate.dom.DOMConverter.

To read DOM documents, you do:

  FileInputStream in = new FileInputStream("input.xml");
  XMLStreamReader sr = XMLInputFactory.newInstance().createXMLStreamReader(in);
  // ... then do whatever processing (if any), and point to START_ELEMENT
  // (or leave at START_DOCUMENT: that'll work too)
  Document doc = new DOMConverter().buildDocument(sr);
  in.close();

and to write DOM document:

  FileOutputStream out = new FileOutputStream("output.xml");
  XMLStreamWriter sw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
  // and output stuff, if need be...
  new DOMConverter().writeDocument(doc, sw);
  sw.close();
  out.close();
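For comparison, and for trying out the same round trip without StaxMate on the classpath, here is a minimal sketch that uses only JDK classes: an identity Transformer with StAXSource/DOMResult for reading, and DOMSource/StAXResult for writing. Note that this swaps DOMConverter for the JAXP transformer, so it is not how StaxMate does it; the class name `StaxDomBridge` and its methods are made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stax.StAXResult;
import javax.xml.transform.stax.StAXSource;
import org.w3c.dom.Document;

public class StaxDomBridge {
    /** Builds a DOM Document from a Stax stream reader (reader starts at START_DOCUMENT). */
    public static Document readDom(String xml) {
        try {
            XMLStreamReader sr = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // A Transformer created without a stylesheet just copies input to output
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            DOMResult result = new DOMResult();
            identity.transform(new StAXSource(sr), result);
            sr.close();
            return (Document) result.getNode();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Writes a DOM Document out through a Stax stream writer. */
    public static String writeDom(Document doc) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter sw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.transform(new DOMSource(doc), new StAXResult(sw));
            sw.flush();
            sw.close();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = readDom("<greeting>hello</greeting>");
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(writeDom(doc));
    }
}
```

The StaxMate version is shorter and does not drag in the whole XSLT machinery, which is part of the point of having DOMConverter.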

Ok, so you can do it, but why would you? Most commonly this is useful when there is a need to use tree-based processing tools like XSL transformers, or access using XPath. The ability to build smaller documents from sub-trees is crucial to limit memory usage and thereby improve performance (or make such usage possible at all).
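The sub-tree idea can be sketched roughly as follows: stream through a big document, and only build a small DOM tree for each element of interest, then run XPath against that small tree. This sketch again uses plain JDK classes (identity Transformer with StAXSource) rather than DOMConverter, and the class name `SubTreeXPath`, the element names "item" and "name", and the sample input are all assumptions made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stax.StAXSource;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;

public class SubTreeXPath {
    /** Collects the text of item/name, building one small DOM tree per 'item' element. */
    public static List<String> itemNames(String xml) {
        try {
            XMLStreamReader sr = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            XPath xpath = XPathFactory.newInstance().newXPath();
            List<String> names = new ArrayList<String>();
            while (sr.hasNext()) {
                if (sr.getEventType() == XMLStreamConstants.START_ELEMENT
                        && "item".equals(sr.getLocalName())) {
                    // Only this sub-tree is held in memory as a DOM tree
                    DOMResult subTree = new DOMResult();
                    identity.transform(new StAXSource(sr), subTree);
                    names.add(xpath.evaluate("/item/name", subTree.getNode()));
                    // Do not advance here: the transformer has already moved the
                    // reader past the sub-tree, so re-check the current event
                } else {
                    sr.next();
                }
            }
            sr.close();
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<items><item><name>a</name></item><item><name>b</name></item></items>";
        System.out.println(itemNames(xml));
    }
}
```

Memory usage is then bounded by the size of the largest "item" sub-tree instead of the whole document.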

So far this interoperability support is still quite limited; but with a little bit of encouragement, the following features could be implemented in the future:

  • Similar functionality for building JDOM trees (code actually exists, in the old Woodstox "stax-utils" package, it just needs cleaning up), and perhaps XOM, DOM4j. (for XOM, there is already NUX, however, that covers the use case)
  • Ability to directly bind things straight via StaxMate input cursors and output objects. This is an obvious improvement -- the main reason current functionality operates on "raw" Stax objects is just that code to do so existed; to use StaxMate objects, a little bit more work is needed to ensure proper synchronization. One nicety from doing this would be the ability to filter out non-text/non-element nodes (comments).

As usual, feel free to comment on this functionality, or join StaxMate mailing lists. I will also incorporate these code samples in StaxMate documentation page(s).

Tuesday, June 09, 2009

Faster, XML, Faster!

It appears that FasterXML (the commercial support organization behind Jackson, Woodstox, StaxMate and Aalto) is debuting on the Seattle Startup Scene: according to this survey, it is close to breaking into the hotly contested Northwest Startups Top-300 list. :-)
In fact, one of our fellow up-and-comers, MarketOutsider (hi Bryce!) is within our sight, with a ranking just north of the 300 limit.

One of the important next steps will be figuring out the exact details of licensing for Aalto -- it is something that actually has lots of potential, even if it is a bit of an uncut diamond right now. Its asynchronous (non-blocking) parsing specifically should be very useful for high-concurrency (thousands of concurrent connections) use cases. And being 2x as fast as Woodstox (essentially, as fast as fast C XML parsers!) is nice as well. Shaving off CPU cycles pays off if you pay by cycle (think EC2).

And beyond that, it would be good to get to build some actual new products, from Hadoop-on-S3 processing systems to plug-n-play database front-end web services. And of course there is all the momentum Jackson has: maybe it'll work nicely with GWT in the near future.
But more on these things when plans inch forward.

Thursday, June 04, 2009

Java odds and ends: replacing Quartz, waiting for Servlet 3.0, standardising IoC

Ok, here's a brief link parade on some things that look interesting within Java world.

First: Cron4j looks like it could replace Quartz for a fairly common use case I have, which is that of scheduling background batch processes. Quartz is all good, but it is rather a big library and does much more than what I mostly need (and pulls a few dependencies in too). So this might simplify some deployments a lot if it works like it should (I haven't yet played with it).

And then, the long-awaited Servlet 3 specification seems to be improving, as per Greg W of Jetty who has been watching it closely, and has good perspective on things it tries to address. Sounds good.

And last but probably not least, there's one more potentially small but useful project to "standardize the good stuff": specifically, the @Inject proposal (potentially to be known as JSR-330; see James' blog entry for more info and commentary). Seems like a good idea to allow some level of compatibility for IoC annotations, so that they can be used independent of the IoC provider being used. For what it's worth, I agree with James -- better get simple useful no-brainer stuff out sooner, and let the "big plans" take their course. No need to delay things that want to be agile.



About me

  • I am known as Cowtowncoder