Monday, May 31, 2010

On prioritizing Open Source projects, tasks: First In, First Out

I have been a bit less productive with my "extra-curricular" activities (open source coding, writing blog entries) lately. This is mostly due to intense focus on my paying daytime job, and is part of the natural cycle of things for me.
But although I have had a bit less time and energy to spend on these tasks, I have had relatively more time to think about things. It turns out that the amount of time I have to think about things does not quite correlate with the amount of time I have to do things (which is interesting in its own right, and maybe worth a separate blog entry). More thinking often leads to more ideas for blog entries, even if the writing itself is still constrained by the time crunch.

1. FIFO as a prioritization mechanism

Anyway: I have realized that a fundamental operating principle I have for managing my open source projects -- tasks within projects, relative focus between projects -- is to tackle the oldest issues first. Good old First-In-First-Out (FIFO) queuing.
Except for high-priority urgent bug fixes, which I generally fast-track (and which fortunately are not very common), I try to increase the priority of issues the longer they remain unfixed. This is quite different from many task prioritization methods (even though it is a well-known algorithm for operating system process scheduling), but I feel it is an important part of maintaining a "culture of excellence": it ensures that the quality of project output remains high, and it reduces the risk of letting most of your open source projects stagnate to death.
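To make the principle concrete, here is a minimal sketch in Java -- the Issue class and its fields are made up for illustration, not taken from any real issue tracker -- of ordering a backlog so that urgent fixes are fast-tracked and everything else is served strictly oldest-first:

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class FifoBacklog {
    // Hypothetical issue record: just enough state to order by
    static class Issue {
        final String title;
        final long createdMillis;   // creation timestamp
        final boolean urgent;       // high-priority bug fix?

        Issue(String title, long createdMillis, boolean urgent) {
            this.title = title;
            this.createdMillis = createdMillis;
            this.urgent = urgent;
        }
    }

    // Urgent fixes jump the queue; everything else is strictly FIFO (oldest first)
    static final Comparator<Issue> FIFO_WITH_FAST_TRACK = new Comparator<Issue>() {
        public int compare(Issue a, Issue b) {
            if (a.urgent != b.urgent) {
                return a.urgent ? -1 : 1;
            }
            return (a.createdMillis < b.createdMillis) ? -1
                    : (a.createdMillis > b.createdMillis) ? 1 : 0;
        }
    };

    public static void main(String[] args) {
        PriorityQueue<Issue> backlog = new PriorityQueue<Issue>(16, FIFO_WITH_FAST_TRACK);
        backlog.add(new Issue("Old XML Schema glitch", 1L, false));
        backlog.add(new Issue("Shiny new feature request", 3L, false));
        backlog.add(new Issue("Urgent NPE on empty input", 2L, true));

        // Prints the urgent fix first, then the remaining issues oldest-first
        while (!backlog.isEmpty()) {
            System.out.println(backlog.poll().title);
        }
    }
}
```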

Thing is: as with fish (and guests, as per Mark Twain), bugs also smell more the longer they stay. The smell of code rot comes from long-standing unresolved issues. It also tends to be the case that it is easier to keep momentum than to get things rolling -- this makes it hard to revive stagnant projects, but possible to keep recently(-enough) worked-on projects chugging along.

Now: it is somewhat better understood that teams should at least occasionally go through the list of long-standing issues, and ideally not only revisit them but also fix them. In for-profit development, targets and priorities tend to fluctuate more than with the labour-of-love projects that most open source projects still are. This tends to make it less likely that older issues get resolved, as priorities are driven more by whoever cries loudest, and most recently. Still, most good developers I know are uncomfortable leaving issues unresolved before starting to extend functionality and build new things.

But I think the FIFO principle as a driving force for prioritization goes beyond increasing the priority of old tasks. I think it also should (and, in my case, does) drive the relative priorities of different projects (or services, systems, libraries, frameworks). And this is something I also try to do more of.

2. Example case: my own project prioritization ("product backlog")

A specific case in point is my focus on my current "flagship" project, the Jackson JSON processor. Jackson has had my main focus for well over a year now -- mostly because it is by far the most popular thing I have built, and will likely remain so for a while. So if I chose to, I could spend all my time and then some just working on issues related to Jackson.
Doing so would, however, essentially kill the other projects I am heavily involved with -- Woodstox, StaxMate, Aalto, Java Uuid Generator -- as well as prevent me from expanding into new areas (like the upcoming "compact trie" package; google for "ning tr13", more on this once it is ready to be announced). And that is not something I find compelling as an idea, especially since code that is not worked on starts to rot at a fast rate and becomes useless surprisingly quickly.

So: as frustrating as it is to watch issues stack up for important projects like Jackson, the way I try to do things is to cycle through the projects that I consider worth keeping alive. Sometimes it just means flipping between two projects; and some projects may even become complete in their own right, meaning there just isn't that much to work on, and thus the cost of switching focus for some minor work just isn't worth it. But to give a further example of what I mean, here is my current thinking on roughly the order in which I hope to complete upcoming tasks, with respect to the projects I work on (note: priorities are not absolute or carved in stone; they are akin to a Scrum sprint plan, if even that solid):

  1. Complete Java Uuid Generator rewrite to version 3.0
  2. Work on Woodstox 4.1, focusing on XML Schema handling improvements (working with a couple of other OS developers who have a stake in this area -- including work on the Sun Multi-Schema Validator)
  3. Finish version 1.0 for the "compact trie" project
  4. Implement minor extensions for StaxMate, to get to version 2.1
  5. Jackson version 1.6: must have better Enum type handling; should have "materialized interfaces"; cannot be delayed too long, since there's always version 1.7 to write...
  6. Compact binary format alternative for JSON (with Jackson as reference implementation)
  7. ... maybe consider implementing "DataMate" (if you haven't heard my brainstorming on what it is, consider yourself lucky :) )
  8. Cycle through projects again!

So why consider work on JUG the first priority? Well, for one, the time is right to upgrade it to be useful and relevant again -- there are things like the java.util.UUID class introduced in JDK 1.5, the access to the Ethernet MAC address introduced in JDK 1.6, as well as "new" use cases (Cassandra, for example, uses time-based UUIDs heavily!). And just as importantly, the work has reasonably limited scope: it will take about a week more of focus to get it all done, as I have already spent some time over the past month or so to make this a reality.

And Woodstox? While XML does not have the same momentum in the J2EE world it once had, Woodstox is a very widely used, complete and useful library. But its XML Schema support has only recently been exercised more heavily, and multiple implementation flaws have been uncovered. To make Woodstox relevant as a first-class Java tool with XML Schema support, some work is needed. Further, there are some useful improvements in trunk that can only be released with version 4.1 (retrofitting them into 4.0 would be too risky).

I think you get the idea -- maybe not exactly why I feel this order is sensible, but at least that there are multiple conflicting factors. I know my own working habits well enough to know that "out of sight, out of mind" applies: the longer a project goes without being worked on, the harder it is to get back to it. And so the best way to keep all the balls in the air is to juggle through them, and to do this as a semi-formal process rather than relying on user requests to trigger such switches (especially since there are more requests for work than there is time for it).

Wednesday, May 19, 2010

Un-hibernating projects: Java Uuid Generator, getting ready for 3.0!

As the cycle of seasons has rolled around to late spring, it is time for hibernating things -- bears, and stagnant open source projects -- to wake up and start moving. It just so happens that this is the case with the venerable Java UUID Generator (JUG): my first true Open Source project.
Although it is not exactly the first thing that I ever released as open source (that would be something called "NetReaper", or perhaps "DLR", both from the late 90s -- few have ever heard of them -- heck, even "Fractalizer" is older!), much less the first piece of software I ever released (a shareware lib/app that compressed Amiga Soundtracker files using delta compression is probably the first one, from the late 80s!), I count it as basically the starting point of my open source "career".

So what is happening? Well: there is the new JUG project page (at the OS darling GitHub); a matching skeletal JUG product page at FasterXML; and of course the brand new JUG users discussion group (java-uuid-generator-users) at Google Groups, waiting for users to talk about it. And probably most interestingly, an actual development effort to produce the third major version, 3.0.

Given that the project has spent the past 5 or so years changing very little, why is there new development effort? Mostly because the JDK finally caught up with JUG, so to speak -- JDK 1.6 finally has a pure Java method of accessing Ethernet interface MAC addresses -- but partly also because of other niceties that can now be added (java.util.UUID was added in JDK 1.5, which was not the stable version at the time of writing JUG 2.0). And finally, there's quite a bit of clean-up that would be nice to do if I were to work on the code.
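For reference -- and this is just a sketch of the JDK 1.6 facility in question, not JUG code -- obtaining a local MAC address without any JNI now looks roughly like this:

```java
import java.net.NetworkInterface;
import java.util.Enumeration;

public class MacAddressSample {
    public static void main(String[] args) throws Exception {
        // Walk available interfaces; pick the first one that reports a hardware (MAC) address
        Enumeration<NetworkInterface> en = NetworkInterface.getNetworkInterfaces();
        while (en.hasMoreElements()) {
            NetworkInterface nic = en.nextElement();
            byte[] mac = nic.getHardwareAddress(); // JDK 1.6+; null for loopback/virtual interfaces
            if (mac != null && mac.length == 6) {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < mac.length; ++i) {
                    if (i > 0) sb.append(':');
                    sb.append(String.format("%02x", mac[i] & 0xff));
                }
                System.out.println(nic.getName() + " -> " + sb);
                return;
            }
        }
        System.out.println("No interface with a MAC address found");
    }
}
```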

Given the above, here are the modest goals for version 3.0:

  • Add convenient support for using the local Ethernet address without a JNI library (requires JDK 1.6), and remove the legacy code that was needed for JNI
  • Change the UUID type used from the JUG-specific one to java.util.UUID (which also allows removing quite a bit of code)
  • Build/deployment changes: change build to Maven (including releasing builds to Maven repos); jars built as OSGi bundles as well
  • SCM changes: move from Safehaus/svn to GitHub/git
  • Improve the API to avoid relying heavily on singletons; streamline it for simpler (and perhaps more elegant) access
  • Add support for one "new" UUID generation method (using SHA-1 instead of MD5 for name/hash-based generation -- a rough sketch follows at the end of this entry)
  • Maybe even write a simple tutorial for using the lib!

Which is just to say, renovate the package so it does not feel quite so 2002 any more (which is when it was written originally). :-)
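To illustrate the second-to-last goal above: java.util.UUID only offers MD5-based name UUIDs (UUID.nameUUIDFromBytes(), version 3), so a SHA-1 based (version 5) variant has to be built separately. What follows is just a rough sketch of the RFC 4122 recipe, not JUG's eventual API; and, like nameUUIDFromBytes(), it skips the namespace prefix that the RFC would normally prepend to the name:

```java
import java.security.MessageDigest;
import java.util.UUID;

public class Sha1NameUuid {
    // Sketch: SHA-1 based (version 5) name UUID, analogous to the MD5-based
    // (version 3) java.util.UUID.nameUUIDFromBytes()
    public static UUID nameUuidSha1(byte[] name) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] hash = md.digest(name);   // 20 bytes; only the first 16 are used

        hash[6] &= 0x0f;   // clear the version nibble...
        hash[6] |= 0x50;   // ...and set the version to 5 (SHA-1 name-based)
        hash[8] &= 0x3f;   // clear the variant bits...
        hash[8] |= 0x80;   // ...and set the IETF variant

        long msb = 0L, lsb = 0L;
        for (int i = 0; i < 8; ++i) {
            msb = (msb << 8) | (hash[i] & 0xff);
        }
        for (int i = 8; i < 16; ++i) {
            lsb = (lsb << 8) | (hash[i] & 0xff);
        }
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) throws Exception {
        byte[] name = "www.fasterxml.com".getBytes("UTF-8");
        System.out.println("MD5/v3 : " + UUID.nameUUIDFromBytes(name));
        System.out.println("SHA1/v5: " + nameUuidSha1(name));
    }
}
```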

Monday, May 17, 2010

Finally: Java JSON Schema validator; based on Jackson

Ok, here is some good news for Java developers who use (or might like to use) JSON, but are bothered by the lack of data format validation options: Nicolas Vahlas has written the first Java JSON Schema validator, and it is available from Gitorious as the project json-schema-validator.

I have not yet had time to dig deep into it, but all signs point to it being so-called Good Stuff. Not just because it is based on Jackson -- although proper reuse of existing solid components is generally a good sign -- but because the description gives the impression of an author who actually knows what he is doing.
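I have not looked at the validator's own API yet, so I won't guess at it here; but to give an idea of the kind of solid Jackson building block such a tool can reuse, reading an arbitrary (here, made-up) JSON document into a Jackson 1.x tree looks roughly like this:

```java
import org.codehaus.jackson.JsonNode;
import org.codehaus.jackson.map.ObjectMapper;

public class TreeReadSample {
    public static void main(String[] args) throws Exception {
        // A made-up document, just to have something to read
        String json = "{\"name\":\"json-schema-validator\",\"stars\":42}";
        ObjectMapper mapper = new ObjectMapper();
        // Read into Jackson's tree model; a schema validator can then walk the tree
        JsonNode root = mapper.readValue(json, JsonNode.class);
        System.out.println(root.path("name").getTextValue());
        System.out.println(root.path("stars").getIntValue());
    }
}
```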

So please check it out if said functionality seems at all interesting: the best way to ensure it becomes a first-class tool (and maybe even to help the JSON Schema standard improve along the way) is to use it, give feedback, and get the whole flywheel-of-virtue (aka virtuous cycle) thing going. That's how things like Jackson and Woodstox became good: feedback is the amplifier of open source productivity.

Ok, enough raving: I'm off to grab the sources for a bit of a closer look. :-)

Thursday, May 13, 2010

Happy First Birthday, Jackson the JSON processor!

Time sure flies when you are having fun: I just realized that the Jackson JSON processor had its first birthday, if we use the date of its 1.0 release (May 9th, 2009) as the reference point.

And how far has the Jackson community come? Here are some interesting statistics, mostly provided by good old Google Analytics:

  • 10,000 monthly visits to the Jackson wiki, with 33,000 page views (getting close to half a million page views per year)
  • 5,000 monthly visits to the Jackson Javadocs, with 17,000 page views (200k on an annual basis)
  • close to 4,500 monthly downloads (== visits to the Jackson Download page) -- meaning about 50k direct downloads on an annual basis, and many more if access via Maven were included (I wonder how that could be tracked)

It is fascinating how far and how fast Jackson has come: currently its downloads beat Woodstox's by a factor of 4:1, for example, although Woodstox has been the leading high-performance Java XML parser for years now -- the difference has to do both with the declining popularity of XML (compared to JSON) and with the fact that Woodstox is more commonly bundled with frameworks, whereas Jackson is quite pleasant to use directly. Furthermore, looking at trends, it appears that "Jackson traffic" has doubled or tripled over the past 8 months or so.

Of course, traffic numbers are not quite as interesting as more tangible things -- the number of users, and even more importantly, the amount of functionality Jackson offers. But still, the above numbers give some indication of the growing popularity of Jackson, as well as of JSON as THE dominant data format on the Java platform.


