Saturday, February 24, 2007

Reflection considered slow -- but how slow is slow?

One commonly held notion about Java performance (or lack thereof) is that dynamic field access and method dispatch via Reflection are slow. This turns out to be true (see below). The relative slowness is due to multiple factors, some of which could be addressed (if this were a very high-priority performance problem), and some of which would be hard or impossible to eliminate. But it is probably reasonable to assume that the relative performance difference between straight access and Reflection-based access will remain similar for the foreseeable future. Because of this, it is useful to have a ballpark idea of how much slower Reflection access is, so that the benefits of very dynamic (if not very convenient, as Java did not start out with such dynamic dispatch in 1.0) method and field access can be balanced against the increased overhead.

So let's see what we can find out by doing some simple micro-benchmarking (with all the caveats micro-benchmarks come equipped with)!

1. Raw results

Just to make it easier to discuss the performance aspects, let's start with actual numbers from running a simple micro-benchmark (see the staxperf.misc.TestReflection class within the StaxPerf sub-project hosted in the Woodstox svn repository).

The system used is an AMD Athlon XP 1700+ workstation running Fedora Core 4.

Benchmark output when running under JDK 1.5.0:

Took 51 ms for 500000 x Reflection construct (x = 12)
Took 26 ms for 500000 x Direct construct (x = 11)
    
Took 62 ms for 2000000 x Reflection call set (x = -1)
Took 886 ms for 2000000 x Reflection assign set (x = 15)
Took 0 ms for 2000000 x Method set (x = 4)
    
Took 49 ms for 2000000 x Reflection call get (x = -1)
Took 464 ms for 2000000 x Reflection access get (x = -1)
Took 0 ms for 2000000 x Method get (x = -1)

And when running under JDK 1.6.0:

Took 39 ms for 500000 x Reflection construct (x = 3)
Took 19 ms for 500000 x Direct construct (x = 5)
    
Took 61 ms for 2000000 x Reflection call set (x = -1)
Took 391 ms for 2000000 x Reflection assign set (x = 6)
Took 1 ms for 2000000 x Method set (x = 0)
    
Took 65 ms for 2000000 x Reflection call get (x = -1)
Took 201 ms for 2000000 x Reflection access get (x = -1)
Took 0 ms for 2000000 x Method get (x = -1)

The iteration counts were chosen to give the values enough range; the exact values (especially where the runtime is at or below 1 millisecond) are less important than the proportions between the cases.
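For readers who want to reproduce this kind of measurement, here is a minimal sketch of a timing loop that prints lines in the same shape as the output above. It is not the actual staxperf.misc.TestReflection code; the class name TimingSketch and its methods are made up for illustration, and the usual micro-benchmark caveats (warm-up, dead-code elimination) apply.

```java
/** Hypothetical stand-in for a timing harness; NOT the actual TestReflection code. */
final class TimingSketch {
    /** Runs the task `reps` times and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(int reps, Runnable task) {
        long start = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    /** Formats one result line in the same shape as the benchmark output. */
    static String report(long millis, int reps, String label, int x) {
        return "Took " + millis + " ms for " + reps + " x " + label + " (x = " + x + ")";
    }

    public static void main(String[] args) {
        int[] sink = new int[1];           // printed at the end so the work is observable
        int reps = 2_000_000;
        timeMillis(reps, () -> sink[0]++); // warm-up pass; its timing is thrown away
        long ms = timeMillis(reps, () -> sink[0]++);
        System.out.println(report(ms, reps, "Direct increment", sink[0]));
    }
}
```

Printing a value derived from the loop (the "x = ..." part) is one simple way to keep the JIT from discarding the measured work entirely.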

These tests fall into 3 distinct groups, discussed separately below. As a general note, JDK 1.6 looks a bit faster than 1.5 in most cases (which makes sense).

2. Constructing Objects

Object construction tests are very simple: new instances are created using a no-arg constructor, the first test using Class.newInstance() (Reflection) and the other using the plain new operator. Due to the simplicity (no type conversions or checks are involved, and those appear to cause much or most of the overhead in the other tests), the difference between a straight constructor call and its Reflection counterpart is only about 2-to-1. While clearly visible, such a difference is negligible for most use cases, especially considering that the constructor in question is empty and could be inlined in the non-Reflection case. In more realistic cases, where objects are bigger and there is some actual initialization code, the difference should be negligible for all practical purposes.
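To make the two construction paths concrete, here is a small sketch. The Bean class and the ConstructSketch wrapper are hypothetical names, not taken from the benchmark. Note that Class.newInstance() is deprecated on Java 9 and later, but it is the call discussed here and it still works.

```java
final class ConstructSketch {
    /** Hypothetical bean with an empty no-arg constructor. */
    static class Bean {
        int x;
        Bean() { }
    }

    /** Plain construction with the new operator. */
    static Bean direct() {
        return new Bean();
    }

    /** Construction through Reflection, using the no-arg constructor. */
    @SuppressWarnings("deprecation") // deprecated since Java 9, kept to match the text
    static Bean reflective() {
        try {
            return Bean.class.newInstance();
        } catch (InstantiationException | IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(direct().getClass() == reflective().getClass());
    }
}
```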

It would be interesting to test whether and how performance differs when constructor arguments need to be passed. Based on the other tests, the overhead would probably be higher.

At any rate, it seems that constructing objects using Reflection and a no-arg constructor poses few performance problems in normal use, even if a significant number of objects need to be created.
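For reference, passing constructor arguments through Reflection would look roughly like the sketch below. This case was not benchmarked here; Point and ArgConstructSketch are hypothetical names. The arguments get boxed and packed into an Object[] for the call, which is one plausible source of extra overhead.

```java
import java.lang.reflect.Constructor;

final class ArgConstructSketch {
    /** Hypothetical class whose only constructor takes two int arguments. */
    static class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static Point reflective(int x, int y) {
        try {
            // A benchmark would look the constructor up once, outside the timed loop.
            Constructor<Point> ctor = Point.class.getDeclaredConstructor(int.class, int.class);
            return ctor.newInstance(x, y); // ints are boxed and passed as varargs
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point p = reflective(3, 4);
        System.out.println(p.x + "," + p.y);
    }
}
```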

3. Setting field values

Another set of tests measures how fast it is to set field/property values. The two main approaches (independent of whether Reflection is used) are direct field assignment (obj.x = 32) and calling a mutator ("setter") method.

The first challenge in measuring the speed of non-Reflection vs. Reflection access is using enough iterations to be able to measure the time taken by the former. It turns out that the JVM is rather fast at both doing assignments and calling (possibly inlined) set methods. While this is good news in general, it means that the ratios we get are crude approximations. Anyway, after cranking up the counter to 2 million calls, the time taken approaches one millisecond (indicating that one could do hundreds of millions of calls per second -- HotSpot must be completely inlining the calls), which gives at least some perspective on the speed ratio.

With the disclaimers stated above (regarding the accuracy of the ratios), it appears that Reflection is about two orders of magnitude (~100 times) slower when calling a setter method, and about three orders of magnitude (~1000 times) slower if a Reflection field accessor is used.

In this case, the speed difference may actually start to matter: order-of-magnitude differences cannot be trivially ignored. It is also quite interesting to note that there is a significant speed disadvantage in doing raw Field access via Reflection, compared to calling an equivalent method. This is not a very intuitive result, considering that the type checking should be similar. Perhaps there are some related security/access checks that are only needed in the direct field access case.
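The variants being compared can be sketched as follows. Bean and SetSketch are hypothetical names, and the use of Field.setInt (rather than the boxing Field.set) is an assumption; the benchmark source would have to be checked for what it actually calls.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

final class SetSketch {
    /** Hypothetical bean with a public field and a setter for it. */
    public static class Bean {
        public int x;
        public void setX(int value) { x = value; }
    }

    // Look up the reflective handles once, so that only the calls would be timed.
    static final Method SET_X;
    static final Field FIELD_X;
    static {
        try {
            SET_X = Bean.class.getMethod("setX", int.class);
            FIELD_X = Bean.class.getField("x");
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Non-Reflection baseline: plain setter call. */
    static void viaSetter(Bean b, int v) {
        b.setX(v);
    }

    /** Setter invoked through Reflection; the int argument is boxed. */
    static void viaReflectedSetter(Bean b, int v) {
        try {
            SET_X.invoke(b, v);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Field assigned through Reflection. */
    static void viaReflectedField(Bean b, int v) {
        try {
            FIELD_X.setInt(b, v);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Bean b = new Bean();
        viaSetter(b, 1);
        viaReflectedSetter(b, 2);
        viaReflectedField(b, 3);
        System.out.println(b.x);
    }
}
```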

On the other hand, even with the "slow" field access method one can still execute about 2 million operations per second (2,000,000 sets in 886 ms under JDK 1.5). So for most use cases, it is probably plenty fast enough. Plus, with Java 6 it is faster still, at about 5 million operations per second (2,000,000 sets in 391 ms).
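The per-second figures follow directly from the raw results for "Reflection assign set". A tiny sketch of the arithmetic (the class and method names are invented for this example):

```java
final class ThroughputSketch {
    /** Converts "N operations in M milliseconds" into operations per second. */
    static long opsPerSecond(long operations, long elapsedMillis) {
        return operations * 1000L / elapsedMillis;
    }

    public static void main(String[] args) {
        // Inputs are the "Reflection assign set" lines from the raw results.
        System.out.println("JDK 1.5: " + opsPerSecond(2_000_000, 886) + " ops/s");
        System.out.println("JDK 1.6: " + opsPerSecond(2_000_000, 391) + " ops/s");
    }
}
```

This comes out at roughly 2.26 million operations per second for JDK 1.5 and roughly 5.1 million for JDK 1.6.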

4. Getting field values

Similar to setting field values, three test cases exist for the reverse use case, accessing the value of a property. The results are similar; the main difference is that reading a value via Reflection is a bit faster than setting it, most visibly for raw field access, where gets take about half the time of sets. The reason appears to be type checking: since the get method in question takes no arguments and the set method takes one, the JVM has to do more type compatibility checking in the latter case.
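The reflective read paths look roughly like this. Bean and GetSketch are hypothetical names, and Field.getInt is an assumption about which accessor is used. Nothing is passed in, but Method.invoke returns the int boxed as an Integer.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

final class GetSketch {
    /** Hypothetical bean with a public field and a getter for it. */
    public static class Bean {
        public int x = 7;
        public int getX() { return x; }
    }

    /** Getter invoked through Reflection; the result comes back boxed. */
    static int viaReflectedGetter(Bean b) {
        try {
            // A benchmark would hoist this lookup out of the timed loop.
            Method getX = Bean.class.getMethod("getX");
            return (Integer) getX.invoke(b);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Field read through Reflection, without boxing. */
    static int viaReflectedField(Bean b) {
        try {
            Field x = Bean.class.getField("x");
            return x.getInt(b);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Bean b = new Bean();
        System.out.println(viaReflectedGetter(b) + " " + viaReflectedField(b));
    }
}
```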

Since the performance is similar, the performance ramifications are similar to those listed above.

5. Reasons for slow field access using Reflection

Although it is not surprising that Reflection-based field access (get/set) is slower, it is somewhat surprising that the difference is measured in orders of magnitude. Why does this happen? The default JDK profiler points at type assignability checks as the hot spots, followed by some security checks (for field sets). Most normal code can be statically verified by the class loader to ensure that the bytecode attempts no illegal access, but the same cannot be done in advance for Reflection access. It may also be that Reflection operations just haven't been very high on the JVM optimizers' work lists: they fall on the wrong side of the 80/20 split.
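The access checks mentioned above are easy to observe from user code. In the sketch below (Guarded and AccessCheckSketch are hypothetical names), reading a private field of another top-level class through Reflection is normally rejected at call time with IllegalAccessException, unless setAccessible(true) is called first to suppress the Java language access checks. Whether suppressing the checks changes the timings was not measured here.

```java
import java.lang.reflect.Field;

/** Hypothetical class with a private field, kept at top level on purpose. */
class Guarded {
    private int secret = 42;
}

final class AccessCheckSketch {
    /** Returns true if the runtime access check rejected the read. */
    static boolean readIsRejected() {
        try {
            Field f = Guarded.class.getDeclaredField("secret");
            f.getInt(new Guarded()); // access is checked here, at call time
            return false;
        } catch (IllegalAccessException e) {
            return true;
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the same field after suppressing the language access checks. */
    static int readWithChecksSuppressed() {
        try {
            Field f = Guarded.class.getDeclaredField("secret");
            f.setAccessible(true);
            return f.getInt(new Guarded());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected without setAccessible: " + readIsRejected());
        System.out.println("value with setAccessible: " + readWithChecksSuppressed());
    }
}
```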

6. But does it matter?

Even though there is a clear speed disadvantage to using Reflection (except for the simple no-arg constructor), how much does it matter? For most use cases, not much. For it to matter, these operations have to be done in significant numbers, and as part of an ongoing, time-critical operation (like handling a heavy load on a web service).

One use case where it just might matter is data binding (usually XML to/from Java objects) used with web services. Since requests and responses are generally bound to objects for each request, and each field is (in general) set separately when an object is constructed and read separately when it is serialized, the trade-off between Reflection-based access and code generation may become significant. Some data binding/serialization toolkits rely extensively on Reflection (XStream, for example, has multiple backends, but the default one uses Reflection field access), whereas others (like JiBX) appear to use dynamic code generation. One would expect the latter to be more performant; but once again, the degree to which this matters is open to debate (or preferably, performance benchmarking and profiling!).
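To illustrate the trade-off, here is a toy sketch of the two binding styles. It is not how XStream or JiBX are actually implemented; Person, BindingSketch and both bind methods are invented for this example. The Reflection version is generic but pays for a field lookup and a checked set per field, whereas the direct version is what generated (or hand-written) code would boil down to.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

final class BindingSketch {
    /** Hypothetical bean to bind values into. */
    public static class Person {
        public String name;
        public int age;
    }

    /** Reflection-based binding: one generic routine for any class with public fields. */
    static <T> T bindReflectively(Class<T> type, Map<String, Object> values) {
        try {
            T target = type.getDeclaredConstructor().newInstance();
            for (Map.Entry<String, Object> entry : values.entrySet()) {
                Field field = type.getField(entry.getKey());
                field.set(target, entry.getValue()); // unboxes Integer for the int field
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Roughly what code generation amounts to: plain assignments, one binder per class. */
    static Person bindDirectly(Map<String, Object> values) {
        Person p = new Person();
        p.name = (String) values.get("name");
        p.age = (Integer) values.get("age");
        return p;
    }

    public static void main(String[] args) {
        Map<String, Object> values = new HashMap<>();
        values.put("name", "Ann");
        values.put("age", 30);
        Person a = bindReflectively(Person.class, values);
        Person b = bindDirectly(values);
        System.out.println(a.name + "/" + a.age + " " + b.name + "/" + b.age);
    }
}
```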

About me

  • I am known as Cowtowncoder