Wednesday, April 28, 2010

Making Java Reflection field access 40 times faster

Think it can not be done? Think again!

Here is a useful nugget of information that should be common knowledge among performance-conscious Java developers; but based on code I've seen in the wild it might not yet be, hence this entry.

Basically, if you just do this:

  Field f = MyBean.class.getDeclaredField("id");
  f.set(beanInstance, valueForField);

performance stinks -- setting the value is hundreds of times slower than direct field access. But if you add this little call:

  f.setAccessible(true);

in there, overhead is drastically reduced. In a simple micro-benchmark running on my old desktop the speed difference is 40x, and while YMMV, it will be an order of magnitude or more; this ratio does not seem to change with newer JVMs (it has been steady with 1.4, 1.5 and 1.6). This actually takes overhead down enough to make reflection-based access "fast enough" for most use cases: Jackson, for example, just uses it and does not bother with byte-code generation to create optimal access (but who knows? maybe it will one day).

So what is the difference? Access checks that would be done by default are heavy-weight, as can be seen from profiler stack traces and JDK sources. And all setAccessible() method does is set a boolean flag that will by-pass these checks. Works for Methods and Constructors as well, although with slightly less pronounced results.
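To make the pattern concrete, here is a minimal sketch of looking up a field once, flipping the flag, and then reusing the Field for many writes. The class names (FieldAccessDemo, MyBean) and the helper method are made up for illustration; only getDeclaredField(), setAccessible() and set() come from the JDK. It makes no claims about timing, so measure on your own JVM.

```java
import java.lang.reflect.Field;

public class FieldAccessDemo {
    // Hypothetical bean, used only for illustration
    static class MyBean {
        private int id;
        int getId() { return id; }
    }

    // Looks up the field once and disables access checks;
    // the returned Field can then be cached and reused
    static Field accessibleField(Class<?> type, String name) throws NoSuchFieldException {
        Field f = type.getDeclaredField(name);
        f.setAccessible(true);
        return f;
    }

    public static void main(String[] args) throws Exception {
        Field idField = accessibleField(MyBean.class, "id");
        MyBean bean = new MyBean();
        for (int i = 1; i <= 1000; i++) {
            idField.set(bean, i); // no per-call access check once the flag is set
        }
        System.out.println(bean.getId()); // prints 1000
    }
}
```

The important part is to do the lookup and the setAccessible() call once, and hold on to the Field instance, rather than repeating them for every access.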

Anyway, thought others might find this useful.

Tuesday, April 27, 2010

Promising new Java libs: BoneCP connection pool

Given somewhat sorry state of Java JDBC connection pools -- there are a few existing almost-mature and nearly-well-working connection pools, but with lackluster development and documentation (namely, DBCP and C3P0; Proxool might be bit better, hard to say) -- it is refreshing to see little bit of reinvention going on. First I learnt that some webapp container teams are writing their own simple connection pools (wish I had the link at hand -- was it Tomcat team?); as are DB vendors (H2 bundles a decent, simple connection pool for example; maybe based on MiniConnectionPoolManager?). These may be more due to smell of code rot by alternatives than stand-alone developments.

But more interestingly there is BoneCP, which looks like it could actually become a full-feature mature and well-supported JDBC connection pool that actually "just works". I have not yet used it extensively, but by the looks of mailing lists, and some glimpses at code, it does look promising. There's a chance it might end up in my "Java library pearls" list quickly once I do properly test it.

Wednesday, April 14, 2010

Simpler Java database access with jDBI

One of the most common tasks in server-side Java development is that of interacting with relational databases. Given that support for doing this is almost as old as stable versions of Java (JDBC API was added in JDK as of Java 1.1, and drivers for most databases were quickly published), one might think of this as a well-known and covered area of Java development.

But it is not really: on one hand, there are rather complex ORM solutions that work acceptably for cases where complexity is on object-hierarchies and where significant overhead is accepted; and on the other hand, there is mostly just "raw" JDBC. If JDBC itself was a good API, this would not be so bad: unfortunately, it is not. JDBC is a victim of having been written back when apparently few knew how to write decent Java APIs (it has distinct feel of C style); and of course it was heavily influenced by companies not known for producing good Java APIs or code (relational database vendors -- Oracle may well win the title of "most crap Java shipped world-wide", and don't even get me started on catastrophe known as MySQL).
But I digress. It is enough to know that JDBC is a crappy API, first and foremost, and direct use should be avoided.

Which brings us to the actual point of this blog entry: there is hope for developers who feel squeezed between intrinsic ugliness of JDBC-based code, and torturous complexity of Hibernate and similar frameworks. There is a simpler way, using library called jDBI (which I briefly mentioned a few blog entries ago).

So here is a simple tutorial for jDBI.

1. It all starts with a Handle

To do anything useful, you need to start by creating DBI instance, which basically represents the database instance (schema). This in turn allows you to create Handle, which is essentially wrapper around JDBC Connection.

DBI instances can be created from DataSource instances (commonly the case when using JNDI or connection pools), or by giving the usual connect information (user, password, JDBC URL). So we will start with something like:

  DataSource ds = JdbcConnectionPool.create("jdbc:h2:mem:test", userName, pwd); // using in-mem H2 database
  DBI database = new DBI(ds);

(for more information, check out jDBI javadocs)

Now: DBI instance does allow creating Handles directly; however, it is usually more convenient to use callbacks, so that we do not have to worry about cleaning up after operation (especially not have to worry about exception cases).
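To show why the callback style takes care of clean-up, here is a rough sketch of the general pattern. This is NOT jDBI's implementation: FakeHandle and this withHandle() helper are hypothetical stand-ins, written with Java 7+ try-with-resources and a Java 8 Function just to keep things short.

```java
import java.util.function.Function;

public class CallbackCleanup {
    // Hypothetical stand-in for a connection-like resource (not a jDBI class)
    static class FakeHandle implements AutoCloseable {
        boolean closed = false;

        int query() {
            if (closed) throw new IllegalStateException("handle is closed");
            return 42;
        }

        @Override
        public void close() { closed = true; }
    }

    // Opens the resource, runs the callback, and always closes the resource,
    // whether the callback returns normally or throws
    static <T> T withHandle(Function<FakeHandle, T> callback) {
        try (FakeHandle h = new FakeHandle()) {
            return callback.apply(h);
        }
    }

    public static void main(String[] args) {
        Integer result = withHandle(h -> h.query());
        System.out.println(result); // prints 42
    }
}
```

Since the helper owns the resource's life-cycle, the calling code has no way to forget the close() call, even on the exception path.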

2. Inquisitive Idiots vs Simple Queries

If we are just doing one-off queries, we can use DBI.withHandle(), like so:

  Integer rowCount = database.withHandle(new HandleCallback<Integer>() {
    public Integer withHandle(Handle h) {
      return h.createQuery("select count(1) from USER").map(IntegerMapper.FIRST).first();
    }
  });

and not to have to worry about handling of all underlying Connections, ResultSets and so on.

So what's with the map() and first() calls? map() is for mapping the first column of ResultSet rows into an Integer (using one of the existing standard mappers), and first() is for choosing the value for the first result row.

In this case latter may seem superfluous, but we could also have done something like:

  List<Integer> ids = database.withHandle(new HandleCallback<List<Integer>>() {
    public List<Integer> withHandle(Handle h) {
      return h.createQuery("select ID from USER").map(IntegerMapper.FIRST).list();
    }
  });

and so on. You get the idea.

3. S/M section: binding your parameters

Well, simple queries are simple; and similarly simple updates (INSERTs, DELETEs; basically stuff other than queries) are simple. Next common need is for parameter binding. For example, to delete a user, you can do:

  final String lastName = "Foobar"; // final since it is referenced from inner class
  Integer deleted = database.withHandle(new HandleCallback<Integer>() {
    public Integer withHandle(Handle h) {
      return h.createStatement("delete from USER where LAST_NAME = ?").bind(0, lastName).execute(); // note: 0-based index
      // or:
      // return h.createStatement("delete from USER where LAST_NAME = :lastName").bind("lastName", lastName).execute();
    }
  });

You can bind parameters either by position (like in JDBC, except that indexes start at 0), or more conveniently, by name.

Parameter binding works similarly with queries; but there is an even more convenient way to do it.

3b. VIP: De Luxe Parameter Binding

Ok, so basic binding looks simple enough. But wait! There's more!

In fact, there are convenience short cuts for common CRUD operations, so that you could insert a new user like so:

  h.insert("insert into USER (FIRST_NAME, LAST_NAME) values (?, ?)", firstName, lastName);

and similarly for Handle.update() (SQL update statement) and Handle.select() (for SQL select statement).

4. And even more bondage: binding the results

The other place where more convenient data binding is needed is that of handling query results. There are two main ways to do this. First one is to use explicit user-provided ("custom") converters; these are implementations of jDBI ResultSetMapper:

  static class UserMapper implements ResultSetMapper<User>
  {
    public User map(int rowIndex, ResultSet rs, StatementContext ctxt) throws SQLException {
      return new User(rs.getString(1), rs.getString(2));
    }
  }

  static class User {
    private String firstName, lastName;

    public User() { } // no-arg constructor, as Java Beans require
    public User(String firstName, String lastName) { // used by UserMapper
      this.firstName = firstName;
      this.lastName = lastName;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public void setFirstName(String n) { firstName = n; }
    public void setLastName(String n) { lastName = n; }
  }

And then actual result binding:

  List<User> users = database.withHandle(new HandleCallback<List<User>>() {
    public List<User> withHandle(Handle h) {
      return h.createQuery("select FIRST_NAME, LAST_NAME from USER").map(new UserMapper()).list();
    }
  });

Alternatively (and even more conveniently since you need not write UserMapper class) you can use basic Bean-based binding: if value classes conform to the Java Beans specification (that is, have appropriately named "getters" and "setters"), you can also just use:

  List<User> users = database.withHandle(new HandleCallback<List<User>>() {
    public List<User> withHandle(Handle h) {
      return h.createQuery("select FIRST_NAME, LAST_NAME from USER")
        .map(new BeanMapper<User>(User.class)).list();
    }
  });

Nifty, eh? In fact, not very much less convenient than Hibernate for many common cases.

5. Transactions

So far all code has used "withHandle" callback, which basically uses JDBC auto-commit mode. It usually works well for queries and one-step CRUD operations; but not when regular transactions are needed. But we could have as easily used transactions for operations, like:

  final String account1 = "1234";
  final String account2 = "1235";
  database.inTransaction(new TransactionCallback<Integer>() { // we don't use nominal return value for anything
    public Integer inTransaction(Handle h, TransactionStatus status) {
      // transfer money as a transaction
      int updated = h.createStatement("update ACCOUNT set BALANCE = BALANCE - 10 where ACCOUNT_ID = :account1")
        .bind("account1", account1).execute();
      updated += h.createStatement("update ACCOUNT set BALANCE = BALANCE + 10 where ACCOUNT_ID = :account2")
        .bind("account2", account2).execute();
      return updated; // number of rows touched
    }
  });

So how does this differ from "withHandle"? Transaction is basically committed if execution terminates normally (no exception), but rolled back if an exception is thrown by code within callback block. This is the intuitive way of handling things, although not something basic JDBC would guarantee.
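The commit-or-rollback rule itself is easy to demonstrate without a database. The sketch below is only an illustration of the semantics, not how jDBI (or any JDBC driver) does it: TransactionSketch is a made-up class that keeps "balances" in a plain Map, keeps changes if the callback returns normally, and restores a snapshot if the callback throws.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

public class TransactionSketch {
    // Hypothetical in-memory "database": account id -> balance
    final Map<String, Integer> balances = new HashMap<>();

    // Runs the work; keeps changes on normal return ("commit"),
    // restores the earlier state if the work throws ("rollback")
    void inTransaction(Consumer<Map<String, Integer>> work) {
        Map<String, Integer> snapshot = new HashMap<>(balances);
        try {
            work.accept(balances);
        } catch (RuntimeException e) {
            balances.clear();
            balances.putAll(snapshot);
            throw e; // caller still sees the failure
        }
    }

    public static void main(String[] args) {
        TransactionSketch db = new TransactionSketch();
        db.balances.put("1234", 100);
        db.balances.put("1235", 0);
        db.inTransaction(b -> {
            b.put("1234", b.get("1234") - 10);
            b.put("1235", b.get("1235") + 10);
        });
        System.out.println(db.balances.get("1234") + " " + db.balances.get("1235")); // prints 90 10
    }
}
```

If the second update had thrown, the first one would have been undone as well, which is exactly what you want for a money transfer.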

6. Life is a Batch

One simple way to improve efficiency of database code in Java is to use batches. As with other basic operations, JDBC does not offer much convenience support, so jDBI can simplify things here as well:

  final List<User> users = figureOutUsersToCreate(); // however these are determined
  int inserted = database.inTransaction(new TransactionCallback<Integer>() {
    public Integer inTransaction(Handle h, TransactionStatus status) {
      PreparedBatch b = h.prepareBatch("insert into USER (FIRST_NAME, LAST_NAME) values (?, ?)");
      for (User user : users) {
        b.add() // start new batch entry
          .bind(0, user.getFirstName())
          .bind(1, user.getLastName());
      }
      int[] stmtResults = b.execute(); // should be all 1s; let's count for fun
      int count = 0;
      for (int c : stmtResults) {
        count += c;
      }
      return count;
    }
  });

which is relatively simple; the only gotcha comes with differing values that different JDBC drivers return for updates (for example, MySQL has most bizarre values ever; when using 'merge' statement, inserting one row can return 1, 2 or even 3 as value!).

Nonetheless, jDBI makes the process a little bit more tolerable; and still transparent and efficient (no excessive magic involved).
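Given that per-row counts can vary by driver, one option is to count statements that did not fail instead of summing the values. Here is a small sketch of that idea; BatchCounts and countSucceeded() are names I made up, while the constants come from java.sql.Statement (JDBC batch results may contain SUCCESS_NO_INFO or EXECUTE_FAILED in place of a count).

```java
import java.sql.Statement;

public class BatchCounts {
    // Counts batch entries that did not fail, ignoring how many rows
    // each driver claims to have touched
    static int countSucceeded(int[] results) {
        int ok = 0;
        for (int r : results) {
            if (r != Statement.EXECUTE_FAILED) {
                ok++; // a row count, or SUCCESS_NO_INFO
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        int[] results = { 1, 2, Statement.SUCCESS_NO_INFO, Statement.EXECUTE_FAILED };
        System.out.println(countSucceeded(results)); // prints 3
    }
}
```

This tells you how many statements went through, not how many rows changed, so it only fits cases like one-row-per-statement inserts.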

7. Anything more?

There is lots more to learn -- like how to call stored procedures; how to deal with database-specific access and so on -- but so far I have only used quite basic functionality. So if you need other features, make sure to check out author's blog and Javadocs.

Monday, April 12, 2010

More efficient client-side HTTP handling with the new Async HTTP client @GitHub

1. Yet another HTTP-client?

Ok now: I am aware of the fact there are quite a few contestants for the "best Java HTTP client"; starting with the well-rounded and respected Apache HTTP Client (esp. version 4.0). But there is now a very promising, up and coming young challenger, aptly named Async HTTP Client ("Ning async http client", considering its corporate sponsor at Github) written by a very competent guy whose past work includes things like Glassfish, and especially its Atmosphere module (async http goodness; Comet, WebSocket etc).

Given it has the single most important thing an open source project needs (at least one technically strong developer who knows the domain well), I have high hopes for this project, and recommend you to keep it in mind if you need an HTTP client for high-volume server-side systems (why server-side? because that's where you typically need much more concurrent client-side HTTP access, when talking to other web services).

2. Asynchronous? So... ?

So why does it actually matter whether you use a blocking or non-blocking client? Well, the "async" (aka non-blocking) part is obviously important in general for highly concurrent use cases, where JVM thread scaling is not very good beyond hundreds of threads.
But more interestingly, it also really starts to matter when you have "branching" with your service: that is, for each call your service handles, it needs to make multiple calls to other services. With blocking http clients you either have to spin new threads (complicated, and somewhat costly); or do requests sequentially. Former can achieve low(er) latency; latter is simpler and more efficient. But with asynchronous calls, you can actually fire all (or some) requests concurrently, as early as possible; do some processing after this, and when necessary, check for request results (via Futures). While not as trivially easy as sequential calls, this can be almost as good, and with much improved latency.
High branching factor is what powers many high-volume web sites: for example, high-traffic web pages such as Amazon.com's pages are composed from multiple separately computed blocks, many of which are built based on multiple independent calls to backend services. This can not be done with tolerable latency by using sequential web service calls.
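The "fire early, collect later" pattern can be sketched with plain JDK futures. Note that this does NOT use Async HTTP Client's API: callBackend() is a hypothetical stand-in that fakes a backend call with CompletableFuture (Java 8+), and the service names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class BranchingCalls {
    // Hypothetical stand-in for a non-blocking call to a backend service
    static CompletableFuture<String> callBackend(String serviceName) {
        return CompletableFuture.supplyAsync(() -> "block-from-" + serviceName);
    }

    // Fires all requests first, then collects results in request order
    static List<String> composePage(List<String> services) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (String service : services) {
            pending.add(callBackend(service)); // all calls are now in flight
        }
        // ... other processing could happen here while calls are in progress ...
        List<String> blocks = new ArrayList<>();
        for (CompletableFuture<String> f : pending) {
            blocks.add(f.join()); // waits only if this result is not yet ready
        }
        return blocks;
    }

    public static void main(String[] args) {
        System.out.println(composePage(Arrays.asList("reviews", "pricing")));
        // prints [block-from-reviews, block-from-pricing]
    }
}
```

With this structure total latency is roughly that of the slowest call, rather than the sum of all calls as with sequential requests.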

Beyond non-blocking part, it is also likely that over time blocking convenience facade will be developed as well, so it is not unreasonable to expect this to develop into more general-purpose solution for HTTP access (at least that is my personal opinion/wish).

Anyway: cool beans; we'll see how this project advances. So far progress has been remarkably rapid -- in fact, version 1.0 seems to be in sight; as tentative feature list has been discussed on the user list. More on 1.0 when it is out in the wild.

3. Disclosure

In spirit of full disclosure, I should mention that Jean-Francois (the author) is actually my current co-worker -- but at least I know what I am talking about when praising him. :-)

Sunday, April 11, 2010

Mental note: do NOT buy my hardware from these guys

At the risk of resorting to humor of last resort -- making fun of names of other people and companies is the bottom-feeding genre of comedy -- I'll have to say this: from security perspective, you might consider twice before ordering your computer gear from "Chown Hardware". Name just, you know, suggests that there is a reason why pricing appears rather attractive... And why is the number one entry on product list labeled "Access and Exit Control"?

What next: "Internet Cafe 404"?

ps. How did I spot this one? No, not via SPAM, just via regular unsolicited snail mail.

Friday, April 09, 2010

Rock on Kohsuke!

Term "Rock star programmer" is thrown around casually when discussing best software developers. But as with music, true stars are few and far between. While knowing the lifestyle can help, you got to have the chops, be able to influence and inspire others, and obviously deliver the goods to fill the stadiums, and data centers.

In Java enterprise programming world there are few more worthy of being called a rock star than Kohsuke Kawaguchi. List of projects he has single-handedly built is vast; list of projects he has contributed to immense, and his coding speed mighty fast (as confirmed by his use of term POTD, Project of the Day -- very very few individuals write sizable systems literally in a day!). It all makes you wonder whether he is actually a mere human being at all (maybe he's twin brother of Jon Skeet?!). For those not in the know, list of things he has authored or contributed to contains such programming pearls as Multi-Schema Validator, Sun JAXB (v2) and JAX-WS implementations, Hudson, Maven, Glassfish, Xerces, Args4j, Com4j, and so on and on (for a more complete list, check out his profile at Ohloh; read and weep).

But to the point: it seems that Mr. Kawaguchi is now moving on from sinking ship formerly known as Sun. This is not a sad thing per se (we all gotta move on at some point), nor unexpected -- steady stream of Sun people leaving Oracle has been and will be going on for a while -- but it still feels strange. End of an era in a way; gradual shutting down of Sun brand. Image of a lonely cowboy riding against Sun settings (pun intended) comes to mind.

Anyway: rock on Kohsuke, onnea & lycka till! I look forward to seeing exactly what awesomeness you will come up with next!

Growing pains: upgrading from Jetty 6 to Jetty 7

Over last week or so I decided to try upgrading my Jetty knowledge, by trying to deploy an existing web app (built to run on Jetty 6) on Jetty 7. Jetty 7 is a major upgrade, so it would not be realistic to expect no problems; but at the same time, it seems to be positioned as a transitional version before even bigger changes for version 8.

At any rate, a casual glance at documentation did not seem to have big blinking warnings. Jetty FAQ does say following: "Is Jetty 7 a drop in replacement for Jetty 6? -- No, while Jetty 7 has very much the same architecture as Jetty 6, there are packaging changes and other minor tweaks between the versions", but that does not sound like a big honking warning.

So I tried out using exact set up from previous deployment. And... surprise surprise, things did not work. What happened?

1. Library path: no more "-Djetty.lib=my-path"

Ok, first things first: turns out that a commonly used system property "jetty.lib" is no more. This was not mentioned directly anywhere; and more importantly, its replacement was not mentioned. And curiously some other similar properties (like "jetty.home") still work as before.

But simplest work-around I could find -- based on glancing at "start.config" and various documents -- was to simply "downgrade" System property into regular "property" (start.config distinguishes between the two by using either curly braces [System Property] or parentheses [plain properties]), like so:

  java -jar start.jar lib=my-path

and voila, libraries are once again found as expected.

2. Got JSP?

Ok, me neither. This is what I saw next:

  2010-04-09 18:56:06.927:INFO::NO JSP Support for /test-app, did not find 
  org.apache.jasper.servlet.JspServlet

Turns out that the Eclipse download specifically does NOT include things needed for JSP support; nor do the default start-up arguments include the needed options. Rather, you MUST add "OPTIONS=xxx" (where 'xxx' can be "All", for convenience, or a list of options like "Server,jsp").
And a very important point: DO NOT try to pass OPTIONS as a system property (with leading "-D") -- this will not work. I accidentally tried that (or maybe it is just because most other libs and apps pass any command-line arguments via system properties; and very few do their own parsing like Jetty does [so why does it? I have no clue]).

After changing my start up line to:

  java -jar start.jar -Djetty.home=$PWD/jetty-base lib=deploy/lib OPTIONS=Server,deploy,jsp $PWD/deploy/config/jetty.xml

(and, peculiarly, "jetty.home" IS a system property -- while cleaning things up, why was this NOT changed to a "regular" Jetty property? inquiring minds want to know!)

things started working. Happiness!

3. What else?

So far so good: these were the only hurdles. But given other indications in documentation, I suspect there are other differences I may bump into; in areas of default values (webdefault.xml), for example. But we'll cross those streams when we get there.

But to help others troubleshoot issues, here are links to a couple of really useful pages at Jetty wikis:

Thursday, April 08, 2010

Strive for Simplicity: So Long Quartz, yello' Cron4j!

A while ago I wrote that Cron4j seems like a nice replacement for venerable kitchen-sink-scheduler, Quartz. This especially in cases where all-you-can-eat-buffet of Quartz scheduling features is not needed; and when resulting largish set of dependencies is bit hard to swallow. And now I finally found time to test the idea in practice, tinkering with a little backburner project of mine which only needs very basic Cron-like triggering to start a task.

And yes: so far Cron4j has delivered as expected for my use case: it does what is needed (just reading Cron-expressions, doing callbacks), has reasonable documentation, and overall has the attractive smell of "Just-works-ness" that good libraries have. It is probably too early to declare a victory, but so far things are looking good.

The one issue I had to deal with, and which I need to follow up on further, is that cron4j only supports a limited set of 5 fields as opposed to all 7 (6 mandatory, one optional) as documented on the Cron expressions Wikipedia page. The missing fields are seconds and years, which in general are not all that useful. But it would have been better to support the standard notation: specifically, omission of seconds is unfortunate since it means cron expressions are not portable between the two.
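If you have existing 6- or 7-field expressions around, a small helper can trim them down to 5 fields. The sketch below is my own guess at a workable approach, not anything cron4j provides: CronFields and toFiveFields() are made-up names, it assumes seconds come first and the optional year last, it refuses anything that cannot be expressed without those fields, and it does NOT translate other dialect differences (such as day-of-week numbering).

```java
public class CronFields {
    // Drops the leading seconds field and the optional trailing year field,
    // and replaces the "no specific value" marker '?' with '*'
    static String toFiveFields(String expression) {
        String[] f = expression.trim().split("\\s+");
        if (f.length == 5) {
            return String.join(" ", f); // already in 5-field form
        }
        if (f.length != 6 && f.length != 7) {
            throw new IllegalArgumentException("Expected 5, 6 or 7 fields, got " + f.length);
        }
        if (!f[0].equals("0")) {
            throw new IllegalArgumentException("Seconds field must be 0, got " + f[0]);
        }
        if (f.length == 7 && !f[6].equals("*")) {
            throw new IllegalArgumentException("Year field must be *, got " + f[6]);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= 5; i++) {
            if (i > 1) sb.append(' ');
            sb.append(f[i].equals("?") ? "*" : f[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toFiveFields("0 30 2 * * ?")); // prints 30 2 * * *
    }
}
```

Failing loudly on non-zero seconds seems safer than silently changing when a task fires.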

Still, simple is good, and I think Cron4j will be a good little helper library for my toolbox.

Wednesday, April 07, 2010

Lesser-Known Java Pearls: jDBI for straight-forward SQL access

Ok: here is another pleasant recent discovery: jDBI library for doing SQL access. This is library that if it didn't exist, I would need to write. Actually, I have written fragments of it multiple times (and not as well); but no more.

So what is it? It is like starting with something like Spring's JdbcTemplate (to do away with having all those boilerplate try-catch-close constructs); adding convention name-bound variables, and convenience methods for common things like accessing first row of result set, binding single-column values, and building batch inserts/updates. Common, common-sense things that most developers would quickly build on top of "raw" JDBC access, in cases where full ORM solution is either overkill, or gets overly complicated. And using my favorite design principle of building something that "just works".

I will try to come up with some sample code to show why I think it's cat's meow, but right now I just want to mention that it exists, since javadocs and Brian's jDBI blog entry should be able to guide brave adopters on right path. API is intuitive, so even if method descriptions on Javadocs are bit sparse it should be possible

ps. I originally posted this on April 1st. I should have paid more attention to timing... :-)

Tuesday, April 06, 2010

How to pass system properties to "maven test" task?

Unit tests should by definition be very self-contained; and this includes their configuration. But there are cases where it is convenient to either allow overriding such default settings, or run other types of configurable tests (esp. integration tests) as JUnit or TestNG tests. And one straight-forward way to pass configuration is by using Java system properties.

Defining system properties is simple enough: just add "-DpropertyName=value" on command line. But with Maven these will not automatically get propagated (at least not with Surefire plugin, the default plugin used for "test" goal). This was apparently done to prevent accidental leakage of system properties that wrapper might otherwise pass (especially environment settings that might vary between environments). At any rate, you have to do some extra work to pass system properties to be accessible by unit test.

Although what seems to be the recommended way is to use <systemPropertyVariables> tag (see this page), there is actually an easier way (as suggested by this StackOverflow answer): pass system property "argLine" to Maven, which will then pass the whole value as additional command line argument when test plugin forks JVM. Something like:

  mvn test -DargLine="-Dsystem.test.property=test"

Without doing this, you would basically have to explicitly pass each and every system property using something like:

  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <configuration>
      <systemPropertyVariables>
        <propertyName>sys.prop.name</propertyName>
        <buildDirectory>${sys.prop.name}</buildDirectory>
        [...]
      </systemPropertyVariables>
    </configuration>
  </plugin>

which is just a whole lot of monkey code, best done without.
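On the test side, either approach ends up as a plain system property. A minimal sketch of reading one with a fallback, so that tests still run when nothing is passed: TestConfig and testProperty() are made-up names, and "system.test.property" is just the example property from the command line above.

```java
public class TestConfig {
    // Returns the system property value, or the given default if it was not passed
    static String testProperty(String name, String defaultValue) {
        return System.getProperty(name, defaultValue);
    }

    public static void main(String[] args) {
        // Prints "test" if run with -Dsystem.test.property=test, "default" otherwise
        System.out.println(testProperty("system.test.property", "default"));
    }
}
```

Having a default keeps the unit tests self-contained, while still allowing overrides for integration-style runs.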

On a related note, this somewhat complete list of Maven(-recognized) system properties is a nice resource too.



About me

  • I am known as Cowtowncoder