Proposed Hessian Extension For Erlang

In an earlier series of articles, I discussed the motivation for and the protocol flow of an extension to the Hessian web services protocol. This installment outlines the first implementation of the proposed protocol, written in Erlang. Running the examples assumes knowledge of the Erlang programming language and the Hessian serialization protocol. Readers who are primarily interested in Hessian rather than Erlang should still be able to follow the article by skimming over the Erlang specifics.

Hessian for Erlang - The Cotton Project

Cotton began life as an implementation of version 1.0 of the Hessian protocol. Soon after this, handling for version 2.0 of the protocol was added. The code base for version 2.0 handling has now been forked to implement the proposed type negotiation extension to the protocol.

Implementing Type Negotiation

Although the article describing the motivation behind the extension included a patch to the Caucho Java implementation of Hessian 2.0, this implementation was little more than a proof-of-concept. It was based on the current Java library for the following reasons:

  • Caucho's module is the most widely used Hessian library and serves as an informal reference implementation of the protocol;
  • By patching an existing library, it provided a relatively neutral comparison between protocol variants. Doubt may have been cast over the comparison, had a different library or language been used to implement the extension proposal.

To turn the proof-of-concept patch into a formal library release, I implemented the protocol extension by forking the Cotton library, of which I am the author. The fork constitutes release 0.3.0 of Cotton. The intention is to embrace the type negotiation as the way forward, so the 0.2.x series will remain as a maintenance branch for those who require a 2.0 implementation for Erlang.

Because the flow of the protocol has been described previously, the rest of the article will focus on example usage of the Cotton library.

New Features In Cotton 0.3

Several new aspects have been added to version 0.3 of Cotton:

  • Implementation of type negotiation as a major feature;
  • Splitting the previous monolithic type mapping facility into two dedicated parts to maintain separate states for encoding and decoding respectively;
  • Dividing the encoding-decoding into two conceptual phases:
    • One phase deals with state contained solely within the transmission scope of an invocation. By and large this encapsulates the dynamic reference counting introduced in Hessian 2.0, which is implicitly yet not formally bound to a request-response cycle;
    • The other phase comprises the received types whose hash values span multiple invocations. This represents the receiver's view of the type hashes sent to it over the course of multiple calls. Since the prime motivation behind the hash mechanism was to implement an idempotent information exchange strategy, this phase reflects the inherently static nature of the pool of known types;
  • Extending the functionality of the core module to provide abstract mechanisms to handle type negotiation between peers:
    • The call/5 function provides negotiation-aware handling for a synchronous client;
    • The invoke/4 function provides the inverse handling to call/5 and is designed for server-side RPC handlers;
  • A simple HTTP server for the new version of Hessian based on the Mochiweb framework;
  • More extensive test coverage for protocol handling as a whole.
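
The split between per-invocation state and the longer-lived pool of known types can be sketched as follows. This is an illustration only, not Cotton's actual data structures: the module `state_sketch` and all of its function names are hypothetical, and an orddict and a plain list stand in for whatever the library uses internally.

```erlang
-module(state_sketch).
-export([new_pool/0, learn/3, begin_call/0, visit/2, resolve/3]).

%% Long-lived phase: pool of known types, kept across invocations.
%% Maps a type hash to its field names. (Hypothetical representation.)
new_pool() ->
    orddict:new().

learn(Hash, Fields, Pool) ->
    orddict:store(Hash, Fields, Pool).

%% Per-invocation phase: hashes referenced so far in the current call,
%% in order of appearance. Discarded when the call ends.
begin_call() ->
    [].

visit(Hash, CallState) ->
    CallState ++ [Hash].

%% Resolve a zero-based type index from the current call against the pool.
%% Returns {ok, Fields} if the type is known, or error if it is not.
resolve(Index, CallState, Pool) ->
    Hash = lists:nth(Index + 1, CallState),
    orddict:find(Hash, Pool).
```

A fresh call always starts from `begin_call()`, whereas the pool returned by `learn/3` would be carried over from one call to the next.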

Using Hessian Type Negotiation In Erlang

The core library provides abstract decoding and encoding facilities that are independent of the wire format and of the synchronicity of the underlying transport mechanism. This enables the module, for example, to be embedded in a synchronous HTTP request-response scenario or to be used asynchronously over a messaging service. The core API and, indeed, the protocol extension have been designed to cater for both interaction paradigms.

The simplest way to see how this works is to start the bundled Hessian HTTP server and run a test client against it. This example is taken from the integration_test module that is bundled with the distribution.

  1. First, obtain the source, either as a tarball or from the repository;

  2. After unpacking it, build the distribution using the supplied Makefile and start an Erlang shell (this has been tested with OTP R12B-0):

    $ make
    mkdir -p ebin
    erlc +debug_info -I include -o ebin -W0 src/*.erl
    $ erl -pa ebin
    Erlang (BEAM) emulator version 5.6 [source] [smp:2] 
    [async-threads:0] [kernel-poll:false]
    Eshell V5.6  (abort with ^G)
    1>
    
  3. Verify that the core library is functioning correctly by running the test suite:

    1> hessian_test:test().
    =INFO REPORT==== 15-Apr-2008::23:04:37 ===
        application: crypto
        exited: stopped
        type: temporary
      All 25 tests successful.
    ok
    2>
    
  4. Start up a second Erlang shell to run the Hessian HTTP server. This server depends on an installation of Mochiweb, which is a simple HTTP server that provides a straightforward mechanism for implementing custom HTTP request handlers. The easiest way to get the necessary Mochiweb modules is to check out the source from their Subversion repository and build it:

    $ svn co http://mochiweb.googlecode.com/svn/trunk/ mochiweb
    $ cd mochiweb
    $ make
    .....
    $ cd ..
    $ erl -pa ebin mochiweb/ebin
    Erlang (BEAM) emulator version 5.6 [source] [smp:2] 
    [async-threads:0] [kernel-poll:false]
    Eshell V5.6  (abort with ^G)
    1> hessian_server:start().
    {ok,<0.32.0>}
    2>
    
  5. In the server shell, register the type information of the tuples you want to serialize for this example. This provides the server with the static type information that it requires to successfully decode Hessian method calls and dispatch them to the appropriate function:

    2> integration_test:server_registry().
    ok
    3>
    
  6. Going back to the first shell, run the integration test suite:

    1> integration_test:test().
      All 2 tests successful.
    ok
    2>
    
  7. If you want to see a more verbose output of the main test case that illustrates the new protocol flow, stop the server shell and restart it. This will flush any type information it has negotiated from the previous test run. Make sure you rerun steps 4 and 5. Then switch back to the first shell and run the following test:

    2> integration_test:concatenate_test().
    Sent  payload:
    <<99,2,0,109,0,11,99,111,110,99,97,116,101,110,97,116,101,
      79,1,215,172,52,111,144,3,102,111,111,3,98,97,114,122>>
    Rec'd payload: <<113,2,0,1,215,172,52,122>>
    Sent  payload:
    <<79,1,215,172,52,116,0,26,110,101,116,46,115,102,46,99,
      111,116,116,111,110,46,114,101,99,111,114,100,115,46,80,
      97,105,114,146,5,102,105,114,115,116,6,115,101,99,111,
      110,100,122>>
    Rec'd payload: <<114,2,0,78,122>>
    Sent  payload:
    <<99,2,0,109,0,11,99,111,110,99,97,116,101,110,97,116,101,
      79,1,215,172,52,111,144,3,102,111,111,3,98,97,114,122>>
    Rec'd payload: <<114,2,0,6,102,111,111,98,97,114,122>>
    
    
    .... %% last 2 lines repeated a number of times
    
    
    ok
    3>
    

The last log shows the raw binary exchange between the two peers, which is only readable by those versed in the protocol.
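
For readers who want to pick the first payload apart themselves, here is a minimal sketch using Erlang binary pattern matching. The module `payload_sketch` and its functions are hypothetical and not part of Cotton. The layout it assumes (call tag 99, method tag 109 with a 16-bit length, type hash tag 79 with a 32-bit hash) is read directly off the trace above rather than taken from the library.

```erlang
-module(payload_sketch).
-export([decode_call/1, first_payload/0]).

%% The first "Sent payload" from the trace above, copied verbatim.
first_payload() ->
    <<99,2,0,109,0,11,99,111,110,99,97,116,101,110,97,116,101,
      79,1,215,172,52,111,144,3,102,111,111,3,98,97,114,122>>.

%% Split a call into its version, method name, type hash and the
%% remaining (undecoded) instance bytes.
%% $c = 99 (call), $m = 109 (method), $O = 79 (type hash).
decode_call(<<$c, Major, Minor,
              $m, Len:16, Method:Len/binary,
              $O, Hash:32,
              Rest/binary>>) ->
    {call, {Major, Minor}, binary_to_list(Method), Hash, Rest}.
```

Calling `payload_sketch:decode_call(payload_sketch:first_payload())` yields the method name "concatenate" and leaves the instance data (starting with 111,144) in the last element of the tuple.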

Protocol Flow

For those less familiar with the low-level details of the protocol, the log trace from step 7 breaks down as follows:

  1. The sender starts transmitting instance data without any regard for whether or not the receiver actually knows anything about the types being sent:

    • The byte sequence 99,2,0 represents a Hessian 2.0 call;
    • The bytes 109,0,11 followed by the 11 bytes 99 through 101 indicate the function to be invoked: the method tag 109, a 16-bit length (0,11) and the string "concatenate";
    • The following 5 bytes (79,1,215,172,52) indicate a type definition with a 32-bit hash value;
    • 111,144 indicates that an instance (111) of the 0th (144) type was visited in the current invocation context. The 0th type in this scenario is the hash sequence 1,215,172,52;
    • 3,102,111,111 is the first field of the 0th type - a string with the value "foo";
    • 3,98,97,114 is the second field of the 0th type - a string with the value "bar";
    • 122 ends the call;
  2. The receiver does not recognize the hash 1,215,172,52, so it replies with a type query for this value (113,2,0,1,215,172,52,122);

  3. The sender sends the explicit type information containing the hash value and the entire type definition including its fields and namespace. This is the exact information that the protocol extension intends to optimize away. Because this is inherently static, resending it on every request-response cycle would be redundant;

  4. The receiver acknowledges the reception of the type information with the 114,2,0,78,122 byte sequence. Whilst this is not required from a protocol perspective per se, in this example we are dealing with a synchronous request-response transport, so we need to send back something for the invoking function to be able to return. In an asynchronous scenario this step would not be necessary;

  5. After this, the sender resends the exact same information from step 1. The sender then proceeds to send the same pattern for each subsequent invocation: send a hash reference, then send instance data without sending any explicit type information. Because the type information was already sent in step 3, the receiver no longer needs to query the sender.
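
The receiver's side of these five steps can be simulated with a short sketch. It is not Cotton's implementation: the module `negotiation_sketch` and its functions are hypothetical, the pool of known types is an assumed orddict, and the "concatenate" handling is hard-coded for an instance with two short string fields so that the replies can be compared with the trace.

```erlang
-module(negotiation_sketch).
-export([receive_msg/2, demo/0]).

%% Handle one incoming message. Pool maps known type hashes to their
%% definitions and outlives individual calls. Returns {Reply, NewPool}.

%% Step 3/4: an explicit type definition arrives; store it and acknowledge.
receive_msg(<<$O, Hash:32, $t, _/binary>> = Def, Pool) ->
    {<<$r, 2, 0, $N, $z>>, orddict:store(Hash, Def, Pool)};
%% Steps 1/2/5: a call referencing a type by hash.
receive_msg(<<$c, 2, 0, $m, Len:16, _Method:Len/binary,
              $O, Hash:32, Instance/binary>>, Pool) ->
    case orddict:is_key(Hash, Pool) of
        false -> {<<$q, 2, 0, Hash:32, $z>>, Pool};  % unknown: query the type
        true  -> {concatenate(Instance), Pool}       % known: answer the call
    end.

%% Hard-coded handler: instance of the 0th type (144) with two strings.
concatenate(<<$o, 144, L1, S1:L1/binary, L2, S2:L2/binary, $z>>) ->
    <<$r, 2, 0, (L1 + L2), S1/binary, S2/binary, $z>>.

%% Replay the sender's messages from the trace against an empty pool.
demo() ->
    Call = <<99,2,0,109,0,11,99,111,110,99,97,116,101,110,97,116,101,
             79,1,215,172,52,111,144,3,102,111,111,3,98,97,114,122>>,
    TypeDef = <<79,1,215,172,52,116,0,26,110,101,116,46,115,102,46,99,
                111,116,116,111,110,46,114,101,99,111,114,100,115,46,80,
                97,105,114,146,5,102,105,114,115,116,6,115,101,99,111,
                110,100,122>>,
    {Query, Pool1} = receive_msg(Call, orddict:new()),
    {Ack, Pool2}   = receive_msg(TypeDef, Pool1),
    {Reply, _}     = receive_msg(Call, Pool2),
    [Query, Ack, Reply].
```

The three binaries returned by `negotiation_sketch:demo()` correspond to the three "Rec'd payload" lines in the trace: the type query, the acknowledgement and the final reply.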

Discussion

A number of implications arise from this approach to type negotiation:

  • The sender does not need to maintain any state about the types it has sent or to whom they have been sent. Because the hash function is deterministic, the sender can apply it dynamically to every type it transmits and a consistent value will always be sent;

  • The sender does not need to care about the sequencing of events nor be aware of an explicit session concept. The sender always assumes that the receiver has prior knowledge of each type they wish to send. When this is not the case, the receiver handles it by exception and sends a type query back. By not having an explicit session concept, the protocol is not tied to a specific transport;

  • In this particular example, the size of the type definition is 49 bytes. By only sending a hash reference and reverse querying by exception, the average wire size has been reduced significantly in comparison to the standard Hessian 2.0 protocol flow. Obviously your mileage will vary depending on the size and verbosity of the type definitions for the object graph you are serializing and on the overall chattiness of the interaction. However, this approach demonstrates that there are significant cost savings to be made.
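
To put rough numbers on this, the byte counts in the trace can be fed into a small back-of-the-envelope model. The module `wire_cost_sketch` is hypothetical. It also rests on a loudly stated assumption: a call without negotiation is modelled as carrying the full 49-byte definition in place of the 5-byte hash reference, which is only an approximation of the standard Hessian 2.0 encoding. Replies to successful calls are the same in both cases and are left out.

```erlang
-module(wire_cost_sketch).
-export([negotiated/1, inline/1, break_even/0]).

%% Byte counts measured from the trace in step 7.
-define(CALL, 33).     % call carrying a 5-byte hash reference
-define(QUERY, 8).     % type query sent back by the receiver
-define(TYPEDEF, 49).  % explicit type definition
-define(ACK, 5).       % acknowledgement of the definition
-define(HASHREF, 5).   % tag 79 plus the 32-bit hash

%% Bytes on the wire for N successful calls with negotiation: one
%% rejected call, query, definition and ack up front, then N calls.
negotiated(N) when N >= 1 ->
    ?CALL + ?QUERY + ?TYPEDEF + ?ACK + N * ?CALL.

%% ASSUMPTION: without negotiation every call carries the definition
%% inline instead of the hash reference.
inline(N) when N >= 1 ->
    N * (?CALL - ?HASHREF + ?TYPEDEF).

%% Smallest number of calls for which negotiation is cheaper.
break_even() ->
    break_even(1).

break_even(N) ->
    case negotiated(N) < inline(N) of
        true  -> N;
        false -> break_even(N + 1)
    end.
```

Under these assumptions the one-off negotiation costs 95 bytes and each later call saves 44 bytes, so negotiation pays for itself from the third call to the same receiver onwards.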

Outlook

This article has discussed the implementation of type negotiation as an extension to the Hessian protocol using Erlang. The Cotton library offers transport-agnostic encoding and decoding facilities as well as a simple synchronous Hessian server using the Mochiweb HTTP toolkit.

The next stage of the protocol extension will be to offer implementations in different languages. These will serve, firstly, to verify the existing Erlang library and, secondly, to allow different programming platforms to speak to each other.

Posted on Thursday, April 17, 2008 at 10:40PM by 0x6e6562
