Monday, October 23, 2006

StaxMate based web service (UUID generator), part IV (Output details)

(continuing the story of StaxMate based simple web service, part 4)

After having a closer look at the input side, it is now time to do the same for the output side (the source code is still available from http://docs.codehaus.org/display/WSTX/StaxMateSampleWebService).

1. Handling output

After processing the input, the service may need to write one of two kinds of output, both in xml: error messages for input or processing problems, and result messages for successful generation of the requested uuids. The sequence of calls is similar in both cases: the error case is handled by method reportProblem() (lines 180 - 214), and the success case by writeResponse() (lines 146 - 165), so let's just go over writeResponse() (and writeDocWithRoot(), which it calls):


void writeResponse(HttpServletResponse resp, List<UUID> uuids,
                  int totalRequested)
  throws IOException, XMLStreamException
{
  resp.setContentType("text/xml");
  OutputStream out = resp.getOutputStream();
  SMOutputElement rootElem = writeDocWithRoot(out, "response");
  for (UUID uuid : uuids) {
    rootElem.addElement("uuid").addCharacters(uuid.toString());
  }
  if (totalRequested > uuids.size()) { // truncated?
    rootElem.addComment("had to truncate "+(totalRequested - uuids.size())
      +" uuids; will only generate up to "+MAX_UUIDS_PER_REQUEST+" UUIDs per call");
  }
  // Need to close the root, to ensure all elements closed, flushed
  ((SMOutputDocument)rootElem.getParent()).closeRoot();
  out.flush();
}

SMOutputElement writeDocWithRoot(OutputStream out, String nonnsRootName)
  throws XMLStreamException
{
  SMOutputDocument doc = SMOutputFactory.createOutputDocument(
    SMOutputFactory.getGlobalXMLOutputFactory().createXMLStreamWriter(out, "UTF-8"),
    "1.0", "UTF-8", true);
  // Let's indent for debugging purposes (not for production)
  // These settings give linefeed, plus 2 spaces per level
  doc.setIndentation("\n ", 1, 2);
  return doc.addElement(nonnsRootName);
}

First couple of lines get the output stream to use (as you may recall, it is generally preferable to pass OutputStreams instead of Writers, to Stax and StaxMate components).

Writing of the root element is separated as a helper method, since it is needed from multiple places (helper method could also deal with OutputStream generation in this case). Creation of StaxMate output document (which acts as the container for root-level entities, most important of which is the root element) is done in style similar to that of creating the root cursor.

The last thing done before adding the root element is to enable indentation -- although it is not really needed for production systems, it may be nice for debugging purposes, or when generating human editable xml documents.

Generating actual output is, well, trivial: since the root element directly contains <uuid> elements, code just loops over the list, creates the enclosing element, and adds textual serialization of the uuid to the element. Quoting of special characters ('<', '&', ...) is done if and as needed by the underlying stream writer. One minor addition over plain vanilla output is that if number of uuids generated was truncated (since service caps max number to 100 per request), an xml comment is added after elements.
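To make the point about quoting concrete, here is a minimal sketch that uses the plain Stax writer bundled with the JDK rather than StaxMate. The class and method names (EscapingSketch, writeText) and the <value> element are made up for illustration; the only thing it shows is that the caller passes raw text and the stream writer takes care of the escaping.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper, not part of the sample service
class EscapingSketch {
    // Writes <value>text</value>, leaving all escaping to the stream writer
    static String writeText(String text) {
        try {
            StringWriter sw = new StringWriter();
            XMLStreamWriter xw = XMLOutputFactory.newInstance().createXMLStreamWriter(sw);
            xw.writeStartElement("value");
            xw.writeCharacters(text); // raw text in; '<' and '&' get escaped on the way out
            xw.writeEndElement();
            xw.close();
            return sw.toString();
        } catch (XMLStreamException e) {
            // Not expected when writing to an in-memory buffer
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeText("a < b & c"));
    }
}
```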

The only unusual method call that is needed is call to SMOutputDocument.closeRoot(): this is needed so that all possibly open start elements are closed. Since code at this point does not have direct reference to the output document, we will just refer to it via parent reference (the alternative would have been to hold on to the document reference -- that would have saved us from having to upcast the parent reference).

2. Benefits over Plain Old Stax?

As with input handling, this simple service would be quite simple to implement using any xml output technology. But there are some areas where StaxMate aims to improve productivity of developers, over using Stax API directly. For example:

  • You don't have to worry about closing all of those start elements: as soon as you add a new sibling, previous one gets closed, recursively. You will never end up with unmatched tags.
  • It is easy to pass these output scopes to other methods (or even sub-systems): pass an SMOutputElement to a method, and it will only be able to add entries within the implied sub-tree.
  • Output calls can be easily chained (for deep single-branched sub-trees), resulting in compact code.
  • Namespace binding is fully automatic, very efficient, safe (never miss a namespace declaration again), and easy to use. While the example didn't use namespaces, many other use cases do, as most xml standards nowadays use namespaces for modularization. Using namespaces with StaxMate is a breeze.
  • Output indentation is easy to enable (and disable, on per sub-tree basis if necessary). While it is one of most commonly requested features for Stax, it is not implemented currently by any Stax processor (there are however other stax helper libraries available that do implement it externally).
  • It is possible to use limited amount of memory buffering to do out-of-order outputting: for example, it is possible to "freeze" output of a sub-tree for as long as necessary, to be able to add attributes to the root element of the sub-tree, and "release" buffered sub-tree as soon as attribute or attributes are added.
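For comparison with the first bullet, this is roughly what the same response looks like when written with the raw Stax API from the JDK. It is a sketch under assumptions, not code from the sample service: the class name PlainStaxResponseSketch and the render() helper are invented here, and the comment text is shortened. Note how every start element needs its own explicit end call.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;
import java.util.UUID;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical plain-Stax counterpart of writeResponse()
class PlainStaxResponseSketch {
    static void writeResponse(OutputStream out, List<UUID> uuids, int totalRequested)
            throws XMLStreamException {
        XMLStreamWriter xw = XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
        xw.writeStartDocument("UTF-8", "1.0");
        xw.writeStartElement("response");
        for (UUID uuid : uuids) {
            xw.writeStartElement("uuid");
            xw.writeCharacters(uuid.toString());
            xw.writeEndElement(); // each <uuid> has to be closed by hand
        }
        if (totalRequested > uuids.size()) { // truncated?
            xw.writeComment("had to truncate " + (totalRequested - uuids.size()) + " uuids");
        }
        xw.writeEndElement(); // and so does <response>
        xw.writeEndDocument();
        xw.close();
    }

    // Convenience wrapper for trying the method out without a servlet container
    static String render(List<UUID> uuids, int totalRequested) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            writeResponse(bytes, uuids, totalRequested);
            return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<UUID> uuids = Collections.singletonList(UUID.randomUUID());
        System.out.println(render(uuids, 3));
    }
}
```

Forget one writeEndElement() call in a deeper tree and the document comes out unbalanced; that bookkeeping is what the output scopes take over.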

Many of potential benefits listed above are similar to the ones that exist on the input side. This is not a coincidence: StaxMate tries to use similar concepts on both sides, where applicable.

3. Further Improvements on the Output Side

One thing that could be further simplified is the construction of the root output context. It should not be necessary to use the raw Stax API for creating the underlying XMLStreamWriter: the details could (and should) be encapsulated within SMOutputFactory.

4. Next Steps

Now that we have covered the actual service, it will be time to focus on the client side, and show how service can be used. It may also be interesting to know what kind of throughput is to be expected from a simple service like this.

To be continued...

Sunday, October 22, 2006

StaxMate based web service (UUID generator), part III (Input details)

(continuing the story of StaxMate based simple web service, part 3)

The previous entry outlined expected inputs and output of the web service. So let's now have a look at how this functionality is implemented (and as a reminder, source code is available from http://docs.codehaus.org/display/WSTX/StaxMateSampleWebService)

1. Handling input

Handling of GET and POST requests is different, since the input is in different form. For anyone interested in details of GET request input processing, have a look at doGet() method (lines 76 - 95) of sample.UuidServlet class (due to the simplicity of the service, all functionality is implemented in just one class).

But since we are more interested in how xml input is processed, let us have a look at doPost() method (lines 97 - 144). doPost() method is called by the Servlet base class, when a POST request is sent to the server. The core input handling starts on line 102, and continues through line 137 (comments are removed or modified to keep sample compact; code structure is unchanged):


InputStream in = httpRequest.getInputStream();
SMInputCursor rootc = SMInputFactory.rootElementCursor(SMInputFactory.getGlobalXMLInputFactory().createXMLStreamReader(in));
rootc.getNext(); // well-formed docs have single root
if (!"request".equals(rootc.getLocalName())) {
  reportProblem(resp, "Root element not <request>, as expected, but <"+rootc.getLocalName()+">", null);
  return;
}
// Request has no attributes, but has 0+ methods (batches)
SMInputCursor requests = rootc.childElementCursor();
int totalReq = 0;

List<UUID> uuids = new ArrayList<UUID>();
while (requests.getNext() != null) { // ignore or signal error? latter
  if (!"generate-uuid".equals(requests.getLocalName())) {
    reportProblem(resp, "Unrecognized element '"+requests.getLocalName()+"', expected <generate-uuid>", null);
    return;
  }
  UUIDMethod method = determineMethod(requests.getAttrValue("method"));
  int count = determineCount(requests.getAttrValue("count"));
  String name = requests.getAttrValue("name");
  checkParameters(method, count, name);
  // (removed code for restricting max. uuids per request)
  uuids.addAll(generateUuids(method, count, name));
}
// Proceed to write output

First couple of lines create the root-level cursor: this is just needed to match the root element, since we don't care about comments or processing instructions outside of the root element. The root element is checked to verify it is of the expected type; and if not, an error message is output (details of that output will be explained in the next entry -- for now, we'll skip it).

The main handling loop is done with the help of a child cursor: it will traverse over (immediate) child elements of <request>, ignoring all other node types, like white space that may be used for indentation, comments and processing instructions. As with the root element, we will verify that elements have the expected name. If so, necessary attributes are accessed, verified (with the help of a separate method, which need not parse any xml), and finally UUIDs are constructed and added to the result list. Nothing very complicated.
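As a point of reference, the same traversal can be sketched with nothing but the Stax reader that ships with the JDK. The class name PlainStaxRequestSketch and the method totalRequested() are invented for this sketch, and it only adds up the requested counts instead of generating anything; treating a missing count attribute as 1 is an assumption. The explicit depth counter is what stands in for the child cursor.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical plain-Stax counterpart of the doPost() input loop
class PlainStaxRequestSketch {
    // Adds up the "count" attributes of all <generate-uuid> children of <request>
    static int totalRequested(String xml) {
        try {
            XMLStreamReader sr = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // Skip comments and processing instructions before the root element
            while (sr.next() != XMLStreamConstants.START_ELEMENT) { }
            if (!"request".equals(sr.getLocalName())) {
                throw new IllegalArgumentException(
                        "Root element not <request>, but <" + sr.getLocalName() + ">");
            }
            int total = 0;
            int depth = 1; // we are inside <request>
            while (depth > 0) {
                int event = sr.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    if (depth == 1) { // immediate child of <request>
                        if (!"generate-uuid".equals(sr.getLocalName())) {
                            throw new IllegalArgumentException(
                                    "Unrecognized element '" + sr.getLocalName() + "'");
                        }
                        String count = sr.getAttributeValue(null, "count");
                        total += (count == null) ? 1 : Integer.parseInt(count); // assumed default
                    }
                    depth++;
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    depth--;
                }
                // white space, comments and everything else is ignored explicitly here
            }
            sr.close();
            return total;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed request document", e);
        }
    }

    public static void main(String[] args) {
        String request = "<request>\n"
                + " <generate-uuid method=\"random\" />\n"
                + " <generate-uuid method=\"location\" count=\"3\" />\n"
                + "</request>";
        System.out.println(totalRequested(request));
    }
}
```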

2. Benefits over Plain Old Stax?

Example above was simple, but mostly due to the service being quite simple. The equivalent pure Stax solution would be quite simple as well. However, even this simple example shows some areas where StaxMate does or can help:

  • Skipping over non-relevant xml events can be automatic. White space used for indentation is seldom of interest, so why require it to be explicitly skipped? Input cursors can use filtering (implemented using SMCursor interface): in the example case, all but START_ELEMENT events are automatically filtered out. Such filtering is optional -- you can traverse over any and all events you want to see.
  • It is easier to delegate handling of sub-trees to helper methods. For example, if <generate-uuid> elements could have child elements (to contain more complicated arguments), it would be easy to create a separate method that takes a child iterator as argument, and only handle that sub-tree. One of chief benefits of the delegation is that the called method can not inadvertently skip more end elements than it should: it is restricted to just the sub-tree that the cursor traverses over. Think of it as simple xml tree scoping: code only needs to concern itself with the scope it is responsible over.
  • Skipping over sub-trees is automatic. While there are no ignorable elements in this example, it is quite common for other tasks to have sub-trees that can and need to be ignored (such as annotation or comment sections of many xml vocabularies). With Stax, one has to keep track of the number of start and end elements: with StaxMate, one just moves cursor past the START_ELEMENT that is root of the sub-tree, and all enclosed events are ignored as well.
  • Collecting textual content that an element contains is trivial: just call cursor.collectDescendantText() and you will get a String containing contents of all text nodes under element the cursor currently points to.
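The last two bullets are easier to appreciate next to the hand-written versions. Below is a sketch of both chores using only the JDK's Stax reader: skipping a sub-tree by counting start and end elements, and gathering all text below an element. All names here (SubTreeSketch, skipSubTree, collectText, secondChildText) are made up for the sketch and are not StaxMate API.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helpers showing the bookkeeping that plain Stax leaves to the caller
class SubTreeSketch {
    // Reader must point at a START_ELEMENT; returns pointing at its matching END_ELEMENT
    static void skipSubTree(XMLStreamReader sr) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = sr.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }

    // Same traversal, but appends every text node found on the way
    static String collectText(XMLStreamReader sr) throws XMLStreamException {
        StringBuilder sb = new StringBuilder();
        int depth = 1;
        while (depth > 0) {
            int event = sr.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            } else if (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA
                    || event == XMLStreamConstants.SPACE) {
                sb.append(sr.getText());
            }
        }
        return sb.toString();
    }

    static void nextStartElement(XMLStreamReader sr) throws XMLStreamException {
        while (sr.next() != XMLStreamConstants.START_ELEMENT) { }
    }

    // Skips the first child of the root, then returns all text under the second child
    static String secondChildText(String xml) {
        try {
            XMLStreamReader sr = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            nextStartElement(sr); // root
            nextStartElement(sr); // first child: not interesting
            skipSubTree(sr);
            nextStartElement(sr); // second child
            return collectText(sr);
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed document", e);
        }
    }

    public static void main(String[] args) {
        String doc = "<doc><annotation>skip <b>me</b></annotation>"
                + "<title>Hello <i>big</i> world</title></doc>";
        System.out.println(secondChildText(doc));
    }
}
```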

There are other more advanced (element tracking, to effectively store partial sub-tree information in memory) or less often useful features (node and element indexes): but for now above should suffice as teasers to get you interested in learning more.

3. Further Improvements on Input Side

Although the code as shown is quite simple, I realized during writing this entry that it could be further simplified.

For example:

  • It seems unnecessary that application needs to create XMLStreamReader instance separately: instead, StaxMate could abstract away these details within its own SMInputFactory, and completely hide these low-level details. This could also simplify configuring of the input factories; task that Stax does not make particularly simple or type-safe.
  • In addition to basic event type based filtering, it would seem reasonable to have standard filters that allow skipping not only all non-element events, but also elements with names except for specified ones. Specifically, perhaps the child cursor should only return <generate-uuid> events, and ignore any others (if such exist). This would further simplify checking; if no strict validation is needed.

There is a good chance that such features will be added into the first "official" StaxMate major release, 1.0.

4. To be Continued...

So much for the input side: the next entry will deal with the output side.

Stay tuned!

Wednesday, October 18, 2006

StaxMate based web service (UUID generator), part II (GI/GO)

(continuing the story of StaxMate based simple web service, part 2)

1. Platform

Although the web service can be deployed on any servlet container, the bundled download archive comes with embedded Jetty 6 container. I highly recommend Jetty for tasks like this -- it is a mean clean servlet container. In fact, for this particular use case, I could have made even more light-weight version that need not use Servlet API, just Jetty's "raw" interface. But just in case someone wants to try it out on Tomcat or something, I decided to do it the standard way.

2. Garbage In

As I said earlier, the web service takes requests in two forms: simple GET requests, in which all arguments come in as part of the request URL (as query parameters); and xml-based POST requests, in which the client sends a request xml document that contains the same information. To demonstrate potential benefits of the latter approach (as well as to get to use StaxMate), POST requests allow slightly more complicated requests to be made: namely, one can request multiple batches of UUIDs to be generated using just a single request. While it would be possible to do this with query parameters, it would require hacks with query parameter naming, and soon become unmaintainable if new orthogonal features were to be added.

Instead of documenting in detail everything one needs to know about GET URLs, here is an example of a URL you could use to contact a locally running server (just punch it into your favourite browser and see the results), assuming the default server port from the downloadable package is used:

http://localhost:7272/uuid-server/generate-uuid?method=RANDOM&count=3

This would generate 3 UUIDs, using the Random number based generation method (version 4).

Likewise, a sample POST request could look like:

  <request>
   <generate-uuid method="random" />
   <generate-uuid method="location" count="3" />
   <generate-uuid method="name" name="http://www.cowtowncoder.com/foo" />
  </request>

and would create 5 UUIDs all in all, using all 3 different generation methods (last of which requires an argument). This one is trickier to invoke using browser, and would generally be invoked using a special client, be that written in Java, Perl, C# or Javascript.
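To give a feel for two of these generation methods without pulling in any library, here is a small sketch that uses the JDK's own java.util.UUID instead of JUG (which is what the service itself uses). The class name UuidMethodsSketch is invented for illustration. java.util.UUID covers random (version 4) and MD5 name-based (version 3) UUIDs but has no location/time-based generator, so that method is left out, and the name-based values may not match what JUG would produce for the same name.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical stand-in for JUG, limited to what java.util.UUID offers
class UuidMethodsSketch {
    // Random-number based UUIDs (version 4)
    static List<UUID> random(int count) {
        List<UUID> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add(UUID.randomUUID());
        }
        return result;
    }

    // Name-based UUID (version 3, MD5): the same name always gives the same UUID
    static UUID nameBased(String name) {
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        for (UUID uuid : random(3)) {
            System.out.println(uuid + " (version " + uuid.version() + ")");
        }
        UUID named = nameBased("http://www.cowtowncoder.com/foo");
        System.out.println(named + " (version " + named.version() + ")");
    }
}
```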

3. Garbage Out

Although there are 2 different ways to send a request, the output will always be in same format, a simple xml document, of type 'text/xml'. So, for example, the first request would return result document like:

  <response>
   <uuid>733365c3-2d44-4f93-accd-43cb39b0cedf</uuid>
   <uuid>249df610-c658-491f-bf58-d21bcee110cb</uuid>
   <uuid>a93a3ecf-2636-4f1c-8d14-d63bc84f2d67</uuid>
  </response>

About the only additional thing to note is that the servlet will cap maximum number of UUIDs generated per request to 100: if total requested (including all batches, for POST request) is more, only 100 will be returned, and the resulting xml document will contain a comment indicating this restriction, after all the uuid entries.

4. To be Continued...

So far so good. In the next episode, dark secrets of the underlying code (Java on server side, and just a teeny weeny dose of Javascript on client side) will be exposed. Stay tuned!

Tuesday, October 17, 2006

StaxMate based web service (UUID generator), part I

1. Problem to Solve

First challenge in writing a sample web application to showcase StaxMate is that of finding a simple (but hopefully non-trivial) problem to solve. This task is made difficult by the fact that many highly useful services have already been implemented: the problem of returning cheerful "Hello World" greetings has been adequately solved; likewise for the ever-so-useful "Echo the string I send" use case. And even highly complicated tasks such as calculating sum of two integer numbers have been automated as highly scalable web services.

And finally, anything that involves actual storing and retrieving of data is immensely boring, and besides it is something I already get paid to work on during daytime. I am not getting paid enough to also work on that during my free time (no, not even if my advertising revenue was skyrocketing!).

So what is there left to automate?

2. Very Funny... But What Are You Really Going to Write?

Given above constraints, I decided I should just recycle something I have already written earlier. And to this end, I just happen to be the author of the nifty Java Uuid Generator (JUG) library... so, how about just writing a web service that allows one to ask server to generate bunch of UUIDs? Yup, that oughta do it. Couple that with an invocation to Echo service, and perhaps add 128-bit UUID numbers, and we should have a fully functional insta-code SOA system, just add some hot gas!

(as a sidenote, in case you wonder what UUIDs are, feel free to consult Wikipedia)

3. Approach

Although by far the simplest way to implement such a web service would be to maybe just fire off a cgi script (after all, uuidgen exists for enough platforms), that wouldn't have much use of StaxMate. On the other hand, I could go for full Soap implementation, and either write 1000 lines of code to implement bare bones soap handler, or with the help of great automation toolkits that save me from writing 100 lines of code, write perhaps 5000 lines of code. :-)
But that would mostly showcase these helper toolkits (or soap handling in general). And great as they are, I'd like to have at least maybe 5% of code do actual xml tasks for which StaxMate was written. And since that code is maybe 5 lines all in all, it seems like I better hand write about 100 lines of glue code around xml handling.

Thus, I will write a simplish "REST" (aka Plain Old Xml, POX, over Http) web service. To make it slightly easier to demo, as well as access using but a browser, I will make it accept both GET and POST requests. Latter is mostly there to give an excuse to use StaxMate parsing: but in both cases, output will be xml written using StaxMate. The main difference is then that with GET, all arguments come as query parameters, whereas with POST, request is in xml. And just to show off, POST request can use batching, that is, request multiple ranges of UUIDs to be generated. GET request can only create multiple UUIDs of same type (which almost always is enough for your needs).

To top it off, the service as implemented will be invocable in at least 3 ways:

  • Typing in URL on your favourite browser, and see the xml response as it is returned
  • Using the simple but trendy html page (included in package, no batteries required), powered by some Ajax mojo, to do the same
  • Using a simple client class from your application, to encapsulate details of sending request, and parsing resulting xml

As of now, the first 2 methods have been implemented: the third will be implemented in the near future (as in the next 2 or 3 months), depending on how high the ad revenue for this article is. :-)

4. Sneak peek

At this point, the service itself exists, and all the needed pieces are downloadable from the StaxMate Subversion repository: links to Subversion as well as direct links can be found on this page. The page should explain the simple procedure to deploy the service; simple being a relative term, and referring to someone who has ever deployed a servlet using a web archive (.war file).

Unfortunately, at this point I do not have a publicly accessible Servlet-enabled server to run the web service on, but if anyone wants to host it, I would be more than willing to link to such instance (or, if you'd rather wait to see how much it would add, I am planning to load test the service in "near future", which will be after client library gets written).

5. In the next Episode, the Ab(d)ominable CoderMan will...

In the next blog entry, I will outline the code snippets that actually use StaxMate for input (parsing of input message, when serving POST requests) and output (when writing response message, either an error, or success message with set of generated UUIDs).

Stay tuned! Oh, and also, please add some feedback if you try out the web service (or just have strong opinions on the matters of xml, web services, or obscure subjects like, say, Finnish pop music) -- beyond adsense numbers, it's always nice to figure out if anyone is reading your musings.

Thursday, October 12, 2006

StaxMate - based UUID generator web service almost done...

But the article still needs to wait until I have enough time to wrap up odds and ends. However, the server side code is mostly done, and checked into the StaxMate Subversion repository (see StaxMate home page for details) if anyone wants to have a sneak peek.

One other thing that I realized I need to do before documenting the sample web service is to make sure that the latest StaxMate release has all the pieces I use. To that end, I decided I should just release version 0.9.1 of StaxMate, available from the StaxMate download page, and use that for the service. Changes to the API are minor, but the release does include one important fix to the way multiple input cursors are kept in sync -- big thanks to the developers who submitted the patch (you know who you are! and others can check out CREDITS file).

On a related note: one of the biggest things on my own wishlist for StaxMate is simple XPath capability: the ability to get a cursor pointing to the first element of the resulting node set (or an iterator over an ordered set of cursors, one of which can be active at any given point), given an XPath expression using a "streaming subset" of XPath (basically anything referring to the current stack of elements). I am hoping to investigate Jaxen a bit more, to see if I could use it. I did have a look at Java 5 XPath support, but was disappointed since the approach taken means I can not make it work over StaxMate (only Sun can). It does, however, seem like Jaxen might be a bit easier to repurpose. If anyone is interested in seeing such features, ping me with email (cowtowncoder at yahoo dot com).



About me

  • I am known as Cowtowncoder