Today many software developers consider Java to be the modern-day
equivalent of Cobol. This is evident from comments comparing the amount
of Java code needed to do tasks that can be written as one-liners using
more dynamic and expressive scripting languages such as Python or Ruby.
Funny how time flies -- it wasn't all THAT long ago that Java was seen
as a relatively concise language compared to C, due to its built-in
support for things like garbage collection and a standard library that
contained implementations for a host of things that in C were DIY (note
that I did not say "due to simplicity of the language itself").
1. Java verbose?
But while it is true that Java syntax can lead to code much more verbose
than seems prudent (especially when traversing and modifying data
structures), sometimes its reputation exceeds reality. I was reminded of
this by a tweet I came across. The tweet asked "and how many lines would
this be in Java", regarding a task of downloading JSON from a URL and
parsing the contents to extract data; something that can be done with a
single line of Python (or Ruby or Perl). The implied assumption was
that it would take many more lines of Java code.
2. Ain't necessarily so
This assumption is not completely baseless: if a developer were to do
this as part of a service, a typical Java developer might well end up
with code that exceeded ten lines; and this even without the code itself
being badly written. I will come back to the question of "why" in a
minute. But the assumption is also off base, for the simple reason that
it can be a one-liner even in Java; for example:
Response resp = new ObjectMapper().readValue(new URL("http://dot.com/api/?customerId=1234").openStream(),Response.class);
// or if you prefer, bind similarly as "Map<String,Object>"
(and in fact, ".openStream()" is actually unnecessary, as
ObjectMapper can just take the URL -- but if it didn't, one can open an
InputStream directly from the URL, which sends the request, takes the
response and so forth).
The code snippet just uses the standard JDK URLConnection via URL, and a
JSON library (Jackson in this case, but it might as well be Gson,
flex-json, whatever); and results in the request being made, contents
read, parsed and bound to an object of the caller's choosing, either a
Plain Old Java Object or a simple Map.
Given that it IS that simple, why was there an assumption that something
more was needed?
3. But often is
The above use case happens to be doable in quite concise form; but there
are other tasks where the Java equivalent ends up being either a call to
a very specific library tailored to condense usage, or is much fluffier
than equivalents in modern scripting languages. But I don't think this
is the main reason for the widespread perception of Java's bloatedness,
i.e. it is not just a case of choosing a wrong example.
I think it is because most Java developers would actually write a piece
of code that spanned more than a dozen lines. Why? Either because:
-
They don't know the JDK or libraries, and use much more cumbersome
methods (the case for less experienced developers)
-
They actually understand the complexities of the task, within the
context where the task needs to be done.
The first one is easy to understand: if you don't know your tools, you
can't expect a good outcome. But the second point needs more explanation.
Let's consider the same task of sending a request to a service that
returns a JSON response that we need to return as an object. What
possible additional things should we cover, beyond what the one-liner
did? Here's a sampling of possible issues:
-
There is no error handling in the code snippet: if there are transient
problems with the connection, it will just fail for good, regardless of
the type of problem
-
How about problems with the service itself? Requesting an unknown
customer? Do we get an HTTP error response, different JSON, or what?
-
Do we really want to wait for an unspecified amount of time if the
request cannot be made? (TCP will try its damnedest to connect, so if
there is an outage it'll be minutes before anything fails)
-
The URL to connect to is fixed (and hard-coded), including the
parameters to send; should they really be hard-coded?
-
How is caching handled? What are connection details?
-
When there are failures, who is notified and how?
-
Are we happy with the default JDK URLConnection? It may not work all
that well for some use cases (i.e. shouldn't we be using Apache
HttpClient or something?)
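To make just one of these concerns concrete: the open-ended wait can be
bounded with nothing but the JDK. What follows is a minimal sketch, not
a recommendation; the class name BoundedFetch, the method name
openWithTimeouts and any timeout values passed to it are made up for
illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

final class BoundedFetch {
    private BoundedFetch() { }

    /**
     * Prepares a connection with explicit timeouts (in milliseconds).
     * Nothing is sent yet; the request goes out when the caller asks
     * for the input stream. The JDK default for both values is 0,
     * which means "wait forever".
     */
    static URLConnection openWithTimeouts(URL url, int connectMillis, int readMillis)
            throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(connectMillis); // limit on establishing the connection
        conn.setReadTimeout(readMillis);       // limit on each blocking read
        return conn;
    }
}
```

The stream from openWithTimeouts(url, 2000, 5000).getInputStream() can
then be handed to the JSON library just as before, so the call site
grows by very little.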
To cover such concerns for production systems, one probably would want
much more complicated handling: possible retries for transient errors;
definitely logging to indicate hard failures; a way to handle error
responses and indicate those to the caller. Because of testing, the end
points being used are typically dynamically determined and passed in;
connection settings may need to be changed, and sometimes different
parameters need to be sent. And for production systems we probably need
more caching; whereas during testing we may want to disable any and all
caching.
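As a rough illustration of the "retries plus logging" part, here is a
small sketch using only the JDK. The names Retry and withRetries, the
linear backoff, and the decision to treat every IOException as
transient are all assumptions made for the example; a real system would
likely be pickier about which failures are worth retrying.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

final class Retry {
    private static final Logger LOG = Logger.getLogger(Retry.class.getName());

    private Retry() { }

    /**
     * Runs the call up to maxAttempts times. IOExceptions are assumed
     * to be transient and are retried; anything else propagates
     * immediately. The last IOException is rethrown when giving up.
     */
    static <T> T withRetries(Callable<T> call, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                last = e;
                LOG.log(Level.WARNING,
                        "Attempt " + attempt + " of " + maxAttempts + " failed", e);
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt); // simple linear backoff
                }
            }
        }
        LOG.log(Level.SEVERE, "Giving up after " + maxAttempts + " attempts", last);
        throw last;
    }
}
```

Even this small amount of care already exceeds a dozen lines, and it
covers only two of the concerns listed above.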
Since there are often many more aspects to cover, there is then a
tendency to wrap all calls within helper objects or functionality; and
if we did define something like "fetchJSONDataFromURL()", it surely
would end up being more than a dozen lines of code. Yet the code calling
that functionality might still be no longer than a single Java
statement.
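To give a feel for the split between helper and call site, here is a
deliberately small sketch of such a helper. Only the method name
fetchJSONDataFromURL comes from the text above; the JsonFetcher class,
its Parser interface and the timeout fields are hypothetical, and the
JSON binding itself is left to whatever library the caller plugs in.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

final class JsonFetcher {
    /** Hypothetical hook for the JSON library that does the binding. */
    interface Parser<T> {
        T parse(InputStream in) throws IOException;
    }

    private final int connectTimeoutMillis;
    private final int readTimeoutMillis;

    JsonFetcher(int connectTimeoutMillis, int readTimeoutMillis) {
        this.connectTimeoutMillis = connectTimeoutMillis;
        this.readTimeoutMillis = readTimeoutMillis;
    }

    /** Fetches the URL and hands the response body to the parser. */
    <T> T fetchJSONDataFromURL(URL url, Parser<T> parser) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(connectTimeoutMillis);
        conn.setReadTimeout(readTimeoutMillis);
        // For HTTP(S), turn error responses into exceptions instead of
        // trying to parse an error page as JSON.
        if (conn instanceof HttpURLConnection) {
            int status = ((HttpURLConnection) conn).getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " from " + url);
            }
        }
        try (InputStream in = conn.getInputStream()) {
            return parser.parse(in);
        }
    }
}
```

With Jackson, the call site could then stay a single statement, along
the lines of
fetcher.fetchJSONDataFromURL(url, in -> new ObjectMapper().readValue(in, Response.class));
and retries, logging or caching would all grow inside the helper rather
than at every place that uses it.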
So which one should we focus on? The helper method that is, say, 50
lines long; or the call to use it, which may be a one-liner? The former
is what can be used to "prove" how bloated Java code is; yet it is
written just once, whereas the one-liners that use it are, ideally,
written much more often.
By the way, the above is not meant to say that it is ALWAYS necessary to
handle all kinds of obscure error modes, or to create a perfect system
that is as efficient as possible. It is clearly not, and Java developers
seem especially prone to over-complicating and over-engineering
solutions. But in other cases, a happy-go-lucky approach (that I would
claim is more common with "perl scripters") won't do. This is just a
long way of saying that the complexity of code should be based on actual
requirements; and that those requirements vary widely.
4. Concise Java by Composition
I think my insight (if any) here is this: since Java, the language,
offers relatively little in the way of writing compact code, economical
source code must come from proper use of libraries, as well as the
design of those libraries. Furthermore, I think many Java developers
have started wrongly believing that Java code must be verbose; and that
this makes the perception more of a self-fulfilling prophecy. This means
that to write compact Java code one absolutely MUST be familiar with
libraries to use for things that the JDK does not support well (or at
all).