Proposed Serialization Protocol in AS3

Release 0.3.2 of Cotton introduces an AS3 client library which implements a proposed type negotiation extension to the Hessian binary serialization protocol. This article demonstrates the protocol extension using the AS3 library as a client that speaks to a remote server implemented in Erlang. The first section discusses the design principles and protocol flow, so if you are more interested in the working example, you can skip to that section.

Introduction

Because the nature of the protocol extension is to facilitate inter-language type negotiation, implementing a peer library in a different language to the core Erlang library seemed to be an appropriate proof-of-concept for the proposition. Furthermore, every new language binding helps to increase the overall usefulness of the proposal.

The quintessence of type negotiation is that a peer need not redundantly encode type information on every request-response cycle. Instead, a peer may send a token that represents a commonly understood type. The advantage of this approach is that a token is far more compact on the wire than the verbatim type information. In a chatty exchange scenario, the smaller payload makes both transmission and decoding more efficient.

Design Principles

The primary motivation behind the proposed extension is to reduce the redundancies in Hessian 2.0. The initial patch to the Caucho Hessian Java library eliminated the redundant sending of type information by expanding the scope of transmission to span multiple request-response cycles. Though this was very efficient, it made many assumptions about the mode of exchange between two specific peers. Furthermore, it did not cater for the failure of any peer on either side.

In light of these observations, the formal design of the protocol extension has been further articulated. The protocol design aims to adhere to the following principles:

  1. The type exchange procedure should be robust;
  2. As a pure serialization protocol, it should not be bound to the semantics of a particular transport mechanism;
  3. The flow of the protocol should function in both synchronous and asynchronous exchange scenarios;
  4. The type exchange should be stateless with respect to the interaction with other peers, i.e. it should not rely on the concept of a session;
  5. The negotiation process can be asymmetric: peers need not use a unified token to represent a particular type; they merely need to understand the meaning of a token received from their opposing peer;
  6. A peer requires no knowledge of the other peers to which it has sent data;
  7. A participant need not maintain state about the type tokens it has already sent on a per-peer basis;
  8. The generation of type tokens should be idempotent with respect to their type;
  9. The cardinality of the peers in the exchange model should not affect the protocol flow.
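
Principles 5, 7 and 8 can be illustrated with a short sketch. It is written in Python rather than AS3 or Erlang so that it runs standalone. It assumes that a token is a SHA-1 digest over the fully qualified type name and the field names; neither the choice of hash function nor the `type_token` helper is taken from the Cotton sources.

```python
import hashlib


def type_token(qualified_name, field_names):
    """Derive a token from a type definition alone (hypothetical helper)."""
    # Assumption: the token is a digest of the type name and its field names.
    # The same definition always yields the same token, so generation is
    # idempotent and needs no per-peer or per-session state.
    canonical = "\x00".join([qualified_name, *field_names])
    return hashlib.sha1(canonical.encode("utf-8")).digest()


pair_fields = ["first", "second"]
token = type_token("net.sf.cotton.records.Pair", pair_fields)

# Verbatim type information that would otherwise accompany every instance.
verbatim = "net.sf.cotton.records.Pair" + "".join(pair_fields)

print(len(token), len(verbatim.encode("utf-8")))
```

Because the token depends only on the type definition, each peer can compute it independently and in any order. The saving over the verbatim form grows with the length of the type name and the number of fields.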

Protocol Flow

Following these design principles, an extension proposal has already been discussed. This proposal outlined the basic intention from a conceptual perspective. However, in the course of implementing the Erlang and AS3 libraries, it became apparent that some aspects of the original design required rethinking.

  • Firstly, the original design put the onus on the receiving peer to store payloads for which it couldn't resolve a hash reference. It would then be forced to buffer the undecodable request as opaque binary data whilst it queried the sender for the relevant type information;
  • Secondly, the flow didn't differentiate based on the directionality of the exchange. During the implementation, it transpired that it makes a difference whether the initiator is sending a potentially unknown type or whether the responder is returning a type of which the initiator has no knowledge.

The two different scenarios are illustrated in the following flow diagrams. The first scenario involves a caller sending an unknown type, whilst the second scenario covers the event in which the responder is sending an unknown type. These diagrams depict a timeline for the protocol exchange between both peers.

Caller Sends An Unknown Type

In this scenario, the calling peer encodes the instance n of type x and sends it to the receiver. Making no assumptions about whether the receiving peer recognizes the type token that has been sent, the sender starts the interaction by hashing the type, encoding the hash and then back-referencing all subsequent instances of this type with the type hash's relative position in the stream. In the diagram, this is highlighted by the dotted arrow from the position reference to the hash declaration during the first interaction.

At this point in time the receiver has not received anything from the sender, hence it cannot recognize the hash value. This is indicated in the diagram by the relative position of type x along the respective timelines of the sender and receiver. Because it has no knowledge of type x, the receiver cannot process the stream, so it drops the incoming payload and sends back a query for the hash value that the other peer sent.

In a synchronous scenario, the calling peer will receive the type query in the same execution context in which the original payload was sent. In this situation, therefore, the onus can be put on the caller to resend the original payload after the type has been successfully negotiated. From a robustness perspective, it would be brittle to force the receiver to buffer this stream in a stateful fashion. It is much easier for the sender to resend, because the original data it sent will still be on its execution stack when it receives the type query.

In an asynchronous scenario, the receiver has the opportunity to discard the incoming message and submit a query for the type. In this situation, redelivery can be achieved by leveraging broker infrastructure, because the original stream will have been sent in a fire-and-forget fashion.

After receiving a query for a particular hash value, the caller encodes the type information. This includes the fully qualified name of the type and the names of its fields. This is indicated in the diagram in the third transmission. Upon decoding this, the receiver now has the type information pertaining to the hash value. This is shown in the timeline on the receiver's side.

The receiver acknowledges the reception of the type information. This event indicates to the caller that it can now re-transmit instance n of type x. After that, the caller can transmit further instances of type x without having to resend the type information.
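
The synchronous variant of this flow can be sketched as a small in-memory simulation. It is written in Python so that it runs standalone, and it is only a model of the message order described above: the message kinds (`instance`, `type_query`, `type_definition`, `ack`), the dictionary layout and the `Receiver` and `call` names are all hypothetical and do not reflect the wire format or the API of the Cotton libraries.

```python
class Receiver:
    """Peer that only decodes instances whose type token it already knows."""

    def __init__(self):
        self.known_types = {}  # token -> (type name, field names)

    def handle(self, message):
        kind = message["kind"]
        if kind == "instance":
            token = message["token"]
            if token not in self.known_types:
                # Drop the payload and ask for the type; nothing is buffered.
                return {"kind": "type_query", "token": token}
            _name, fields = self.known_types[token]
            return {"kind": "reply", "value": dict(zip(fields, message["values"]))}
        if kind == "type_definition":
            self.known_types[message["token"]] = (message["name"], message["fields"])
            return {"kind": "ack", "token": message["token"]}
        raise ValueError("unexpected message kind: " + kind)


def call(receiver, token, name, fields, values, log):
    """Send an instance, answering at most one type query by resending."""
    payload = {"kind": "instance", "token": token, "values": values}
    log.append("instance")
    response = receiver.handle(payload)
    if response["kind"] == "type_query":
        log.append("type_query")
        definition = {"kind": "type_definition", "token": token,
                      "name": name, "fields": fields}
        log.append("type_definition")
        ack = receiver.handle(definition)
        assert ack["kind"] == "ack"
        log.append("ack")
        # The original payload is still in scope, so the caller resends it.
        log.append("instance")
        response = receiver.handle(payload)
    return response


receiver = Receiver()
first_log, second_log = [], []
first = call(receiver, "x-hash", "example.X", ["a", "b"], [1, 2], first_log)
second = call(receiver, "x-hash", "example.X", ["a", "b"], [3, 4], second_log)
print(first_log)
print(second_log)
```

The first call produces five transmissions, including the query, the definition, the acknowledgement and the retransmission. The second call needs only one, because the receiver has retained the type.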

Responder Sends An Unknown Type

If you look at the start of the timeline in this diagram, you can see that both peers have already negotiated type x. In this scenario, the initiator is invoking a remote function that takes type x as a parameter and returns an instance of type y as a response. At this point in time, the calling peer has no knowledge of type y so it cannot decode instance m.

This is where the directionality of the exchange becomes apparent. In the first scenario, the onus was put on the sender to buffer the original payload in the event that a retransmission should become necessary. This was because the original payload remains on the execution stack in a synchronous exchange.

For the same reason, in this scenario, the onus is put on the initiator to buffer the response from the other peer. The difference is that in the previous case, the receiver was querying the sender, whereas here, the sender is querying the receiver. The implication for the protocol flow is that the initiator does not have to acknowledge the receipt of the type information. In the previous scenario, this was necessary in order for the receiver to preempt retransmission of the original payload. This is not required in this scenario because the original payload is already in the execution context of the querying peer.

After receiving the information about type y, the calling peer then decodes instance m of type y.
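
The second scenario can be modelled in the same way. The following Python sketch is again only an illustration of the message order; the `Responder` and `Initiator` classes, their method names and the `y-hash` token are hypothetical. The point it tries to show is that the initiator keeps the undecoded reply locally while it queries the type, so neither an acknowledgement nor a retransmission is needed.

```python
class Responder:
    """Remote peer that returns an instance of a type the caller may not know."""

    TYPES = {"y-hash": ("example.Y", ["left", "right"])}

    def invoke(self):
        # The reply carries only the token and the field values.
        return {"token": "y-hash", "values": ["foo", "bar"]}

    def describe(self, token):
        # Answer a type query with the name and field names for the token.
        name, fields = self.TYPES[token]
        return {"token": token, "name": name, "fields": fields}


class Initiator:
    """Calling peer that buffers a reply it cannot decode yet."""

    def __init__(self, responder):
        self.responder = responder
        self.known_types = {}
        self.queries_sent = 0

    def call(self):
        reply = self.responder.invoke()  # stays in this execution context
        token = reply["token"]
        if token not in self.known_types:
            self.queries_sent += 1
            definition = self.responder.describe(token)
            self.known_types[token] = (definition["name"], definition["fields"])
            # No acknowledgement is sent: the buffered reply is decoded
            # locally, so the responder has nothing to retransmit.
        _name, fields = self.known_types[token]
        return dict(zip(fields, reply["values"]))


initiator = Initiator(Responder())
print(initiator.call())
print(initiator.call(), initiator.queries_sent)
```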

API Design

As with the Erlang library, the API has been divided up into two layers:

  • A lower layer that provides basic binary encoding and decoding functionality;
  • An upper layer that implements the protocol flow for type negotiation and provides high level bindings to physical transports such as HTTP or a messaging protocol.

Currently, the encoding layer is the most stable in terms of API changes. Because the transport binding layer has so far only been tested with HTTP, further changes may be necessary to accommodate different transports. One roadmap item is to integrate the library with the AMQP library for AS3.

Furthermore, the upper layer uses a factory pattern to create dynamically proxied instances of application interface definitions. There is scope to change this to use a more declarative mechanism, for example, one based on dependency injection.
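
The factory-plus-dynamic-proxy idea can be sketched independently of AS3. The following Python example is an analogy under stated assumptions, not a port of the Cotton code: `create_proxy`, `Proxy` and `loopback_transport` are hypothetical names, and the transport is a local function standing in for the encoding layer and an HTTP binding.

```python
class Proxy:
    """Forwards calls on declared interface methods to a transport."""

    def __init__(self, method_names, transport):
        self._method_names = frozenset(method_names)
        self._transport = transport

    def __getattr__(self, name):
        # Only reached for attributes not found normally, i.e. remote methods.
        if name not in self._method_names:
            raise AttributeError(name)

        def remote_method(*args):
            return self._transport(name, args)

        return remote_method


def create_proxy(interface, transport):
    """Factory: build a proxy exposing the public methods of `interface`."""
    names = [n for n in vars(interface) if not n.startswith("_")]
    return Proxy(names, transport)


class TestInterface:
    """Stand-in for the interface definition; the bodies are never executed."""

    def concatenate(self, pair): ...
    def split(self, s): ...
    def add(self, x, y): ...


def loopback_transport(method, args):
    # Stand-in for the lower encoding layer plus a transport binding.
    if method == "add":
        return args[0] + args[1]
    raise NotImplementedError(method)


proxy = create_proxy(TestInterface, loopback_transport)
print(proxy.add(2, 3))
```

Application code only sees the interface methods; swapping the transport function is what a different binding would amount to in this model.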

AS3 Implementation

This example is based on release 0.3.2 of the Cotton project. The AS3 library is available as a binary swc library. Alternatively you can check out the source tree from the repository for the 0.3.x series. If you choose to use the binary, you will have to set up your own project. If you check out the source, you can import the project containing the test suite into your IDE (if you are using one) and run the Flexunit test suite.

In order to test the library against a server, you will need a running instance of the Erlang server that implements the proposed Hessian extension. How to boot and test the server is described in full in this article. The article assumes that you have Erlang OTP installed on your machine. Erlang is available as a source distribution or as a binary for Windows from the OTP site. Various package managers can install Erlang on other platforms.

Once you have run the integration tests described in the server article to check that everything is working correctly, you can use the AS3 library to connect to the server. The server example exposes three interface methods that can be called remotely via the proposed protocol extension:

package org.hessian.transport
{
    import net.sf.cotton.records.Pair;

    public interface TestInterface
    {
        function concatenate(p:Pair):String;
        function split(s:String):Pair;
        function add(x:int,y:int):int;      
    }
}

Connectivity Test

To perform a simple connectivity test, invoke the add method, which requires no type negotiation because it only uses primitive types. To invoke the method, you will need the interface definition on your build path. The invocation uses a dynamic proxy that encapsulates the low level remoting details behind the interface of the business method that you want to call:

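import org.hessian.transport.ProxyFactory;
import org.hessian.transport.AsyncHandle;
import org.hessian.transport.BottomHalfEvent;
import org.hessian.transport.TestInterface;

// later on in your code

var url:String = "http://localhost:2345";
// The factory returns an untyped dynamic proxy, hence the Object annotation
var proxy:Object = ProxyFactory.createProxy(TestInterface, url);
// The call returns immediately with a handle for the pending response
var handle:AsyncHandle = proxy.add(2, 3);
handle.addResponseListener(onResult);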

Because HTTP responses are handled asynchronously in AS3, you will need to register an event handler to process the response, for example:

public function onResult(event:BottomHalfEvent):void {
    var sum:int = event.result as int;
    // assert that 2 + 3 = 5
    assertEquals(5, sum);   
}

Remoting Components

This simple example demonstrates the three main components of the library that are exposed to application code:

  • ProxyFactory - this is responsible for generating a dynamic instance that proxies the business interface definition and dispatches method calls to the remote peer;
  • AsyncHandle - this provides the application with an easy way to register a callback handler for an asynchronous response;
  • BottomHalfEvent - since the top half of the asynchronous RPC is incorporated into the proxy invocation, this event represents the bottom half of the RPC and is used to retrieve the return value from the remote response.
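
The relationship between the handle and the bottom-half event can be sketched as follows. This is a Python illustration of the callback pattern only: the class names mirror the AS3 ones, but the `complete` method and the choice to notify late listeners immediately are assumptions made for this sketch, not documented behaviour of the library.

```python
class BottomHalfEvent:
    """Carries the return value of the remote call to the listener."""

    def __init__(self, result):
        self.result = result


class AsyncHandle:
    """Returned by the top half of the call; completed by the transport."""

    def __init__(self):
        self._listeners = []
        self._event = None

    def add_response_listener(self, listener):
        # Assumption: a listener registered after completion is notified
        # straight away, so a fast response cannot be missed.
        if self._event is not None:
            listener(self._event)
        else:
            self._listeners.append(listener)

    def complete(self, result):
        # Hypothetical hook that the transport layer would call once the
        # response has been decoded.
        self._event = BottomHalfEvent(result)
        for listener in self._listeners:
            listener(self._event)
        self._listeners.clear()


results = []
handle = AsyncHandle()
handle.add_response_listener(lambda event: results.append(event.result))
handle.complete(5)  # simulates the arrival of the response to add(2, 3)
print(results)
```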

This approach has been taken to minimize the number of moving parts exposed to user code. There are two limitations within AS3 that prevent the dynamic proxy from being strongly typed (i.e. compile time type safety):

  • There is no concept of blocking I/O in AS3;
  • The dynamic type cast operator checks the compile time type of the instance returned from the factory, which in this case is Object. This won't work because you cannot coerce Object to the interface type.

In the future, it may be possible to generate AS3 byte code dynamically using AS3 Eval. This could feasibly create quasi-static dynamic instances, but this would require further investigation.

Testing Type Negotiation

Until now, we haven't tested any type negotiation because the add function only used primitive values. To see type negotiation, you can reuse the same code as before, but this time use either the split or concatenate functions:

var onResult:Function = function(event:BottomHalfEvent):void {
    var s:String = event.result as String;
    // assert that s = "foobar"
    assertEquals("foobar", s);
};
// Pair is a user-defined type, so this call triggers type negotiation
var pair:Pair = new Pair();
pair.first = "foo";
pair.second = "bar";
var handle:AsyncHandle = proxy.concatenate(pair);
handle.addResponseListener(onResult);

This demonstrates how the protocol handling is completely abstracted away from the application code. The only intrusion into user code from an interface perspective is the fact that the response from a remote call has to be handled asynchronously. This is a restriction imposed by the AS3 runtime as opposed to a limitation of the Cotton library.

Flexunit Test Suite

The previous examples were taken from the test suite which is available in the source code repository as a project that can be imported into an IDE of your choice. The project is located in the as3/test/ directory.

The project contains three Flexunit test suites:

  • main - this provides unit tests for the core encoding and decoding functionality. This does not require any remoting at all;
  • transport - this provides integration tests for connecting to the remote server and invoking dynamic proxies;
  • negative_flow - this tests the protocol handling when the remote peer fails.

Limitations

Currently, the AS3 library does not contain a complete implementation of all of the data types in the Hessian specification. The aspects of the protocol that have been implemented and tested in the AS3 library include:

  • 8, 16, 24 and 32-bit integer support;
  • Chunked string decoding;
  • Single chunk string encoding;
  • Support for type definitions and instances;
  • Type negotiation;
  • Call, reply and fault handling;
  • Null values;
  • Lists;
  • Maps;
  • Floating point numbers;
  • Dates;
  • Booleans;
  • Raw binary data.
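
As an illustration of the first item in the list above, the following Python sketch encodes and decodes integers in one, two, three or five bytes. It assumes the compact integer forms of the Hessian 2.0 draft specification (the single-byte, two-byte and three-byte ranges plus the full `I` form); the encoder in the AS3 library is not shown in this article and may be structured differently.

```python
import struct


def encode_int(value):
    """Encode a 32-bit integer using the shortest Hessian 2.0 form."""
    if -16 <= value <= 47:
        return bytes([0x90 + value])  # one byte
    if -2048 <= value <= 2047:
        return bytes([0xC8 + (value >> 8), value & 0xFF])  # two bytes
    if -262144 <= value <= 262143:
        return bytes([0xD4 + (value >> 16), (value >> 8) & 0xFF, value & 0xFF])
    return b"I" + struct.pack(">i", value)  # tag plus four big-endian bytes


def decode_int(data):
    """Decode an integer produced by encode_int."""
    code = data[0]
    if 0x80 <= code <= 0xBF:
        return code - 0x90
    if 0xC0 <= code <= 0xCF:
        return ((code - 0xC8) << 8) + data[1]
    if 0xD0 <= code <= 0xD7:
        return ((code - 0xD4) << 16) + (data[1] << 8) + data[2]
    if code == ord("I"):
        return struct.unpack(">i", data[1:5])[0]
    raise ValueError("not an integer tag: 0x%02x" % code)


for sample in (5, -17, 70000, 2 ** 31 - 1):
    print(sample, encode_int(sample).hex())
```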

It was decided to concentrate on proving the type negotiation flows between different implementations using a subset of the built-in types. Because the type negotiation is the major feature of Cotton, it was important to release a solid working version in order for interested parties to have the opportunity to test and critique the general approach.

The less commonly used data types that have not been included in the AS3 implementation include:

  • 64-bit integers - mainly because they don't exist in AS3;
  • Typed Lists - due to the typeless nature of Arrays in AS3;
  • References - for cyclical data structures, which are less prevalent in data exchange scenarios;
  • Remote objects - these appear to be included in the Hessian specification in order to support EJB remoting.

Roadmap

Depending on community feedback, some of the next scheduled tasks could include:

  • Completing the support for the remaining built-in types;
  • Implementing a further library in an alternative language;
  • Demonstrating the usage of the AS3 library in an asynchronous scenario.

Posted on Friday, May 16, 2008 at 08:35PM by 0x6e6562

Converting Monotone To Mercurial

Today I had to convert a Monotone repository to Mercurial. Tailor is a tool that can convert changesets from one SCM into the format of another SCM. It currently supports many major SCM systems, including Monotone and Mercurial.

Tailor is written in Python and you can check out the latest version from their source repository:

$ darcs get --partial http://darcs.arstecnica.it/tailor

To do this, you will need to have darcs installed. Once you've got this, install tailor by running:

$ [sudo] python setup.py install

To be able to perform any conversions, you will need to have both the source and target systems installed in your environment. For this scenario, I needed to have both Monotone and Mercurial installed locally.

Using Tailor involves a two step process:

  1. Use Tailor to create a configuration file that describes the source and target repositories;
  2. Supply this configuration file to Tailor and tell it to run the conversion.

There is a good manual page that describes all of the options you can supply, but I just want to demonstrate how to convert all revisions of a particular branch in a Monotone repository to a Mercurial repository.

To do this, I ran Tailor with the following options, redirecting the output into a configuration file, which will be used in step two:

$ tailor -s monotone -t hg -R mtn.db \
--module com.acme.your_branchname \
target_dir --verbose > config.tailor

This indicates the location of the Monotone database and the particular branch to be converted. To perform the conversion, run the following:

$ tailor --configfile config.tailor

On the first attempt, the conversion was successful, apart from the way the commit comments had been migrated. Instead of copying the committer's actual comments, Tailor had produced comments that look like this:

[project @ 4ff5bc70bed0ed59cbbc736d486044cb31047f6c]

To remedy this, there is an attribute called patch-name-format that you can set in the configuration file to tell Tailor to keep the original comments. You can see this option in the final version of my configuration file:

[DEFAULT]
verbose = True

[project]
target = hg:target
start-revision = INITIAL
root-directory = /tmp
state-file = tailor.state
source = monotone:source
subdir = target_dir
patch-name-format = ""

[hg:target]

[monotone:source]
module = com.acme.your_branchname 
repository = /tmp/mtn.db

After adjusting this, I reran the conversion and everything was fine.

If you want to, you can skip step 1 and just use this configuration file, adjusting the paths and branch names where necessary.
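
If you need to adapt the file for several branches, it can also be generated. The following Python sketch is one way to do that with the standard `configparser` module, assuming that the file is plain INI syntax as shown above. The `tailor_config` helper and its parameters are hypothetical and are not part of Tailor.

```python
import configparser
import io


def tailor_config(branch, repository, root_directory="/tmp", subdir="target_dir"):
    """Render a configuration like the one above for a given branch."""
    config = configparser.ConfigParser()
    config["DEFAULT"] = {"verbose": "True"}
    config["project"] = {
        "target": "hg:target",
        "start-revision": "INITIAL",
        "root-directory": root_directory,
        "state-file": "tailor.state",
        "source": "monotone:source",
        "subdir": subdir,
        # An empty format keeps the original commit comments.
        "patch-name-format": '""',
    }
    config["hg:target"] = {}
    config["monotone:source"] = {"module": branch, "repository": repository}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


print(tailor_config("com.acme.your_branchname", "/tmp/mtn.db"))
```

Redirect the output into a file and pass that file to Tailor in the same way as in step two.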

Posted on Friday, May 9, 2008 at 06:30PM by 0x6e6562

Proposed Hessian Extension For Erlang

In a preceding series of articles, I discussed the motivation and protocol flow of an extension to the Hessian web services protocol. This installment outlines the first implementation of the proposed protocol, written in Erlang. Running the examples requires knowledge of the Erlang programming language and the Hessian serialization protocol. For those who are primarily interested in Hessian and not Erlang, the article can still be followed by skimming over the Erlang specifics.


Posted on Thursday, April 17, 2008 at 10:40PM by 0x6e6562

AS3 AMQP Client: First Cut

Updated on Tuesday, April 1, 2008 at 05:39PM by 0x6e6562

In a previous article, I introduced a proof-of-concept client for AMQP written in AS3, which at the time had not been released as a formal artifact. The proof-of-concept implementation has now been refactored and cleaned up in order to release it formally. This article announces the first release of the library and describes its usage.


Posted on Monday, March 24, 2008 at 09:05PM by 0x6e6562

Camel Case

Recently I had to convert delimited String tokens into CamelCase Strings in Java. The input was something like "foo-bar" and the output should be "FooBar". A little googling found the TreeBind library, which does exactly that. An example of the API usage looks like this:

Util.ToUpperCamelCase("foo-bar");   // returns "FooBar"
Util.ToFirstUpper("foobar");        // returns "Foobar"

Not rocket science, but it did save me some string manipulation. BTW, I don't even know what the TreeBind API does in general.
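
For comparison, the same two conversions are easy to sketch by hand. The following Python functions are not TreeBind's implementation; the function names and the default set of delimiters are assumptions chosen for this example.

```python
import re


def to_first_upper(token):
    """Upper-case the first character and leave the rest untouched."""
    return token[:1].upper() + token[1:]


def to_upper_camel_case(text, delimiters="-_ "):
    """Join delimited tokens into an UpperCamelCase string."""
    # Split on runs of any delimiter character and drop empty tokens.
    parts = re.split("[" + re.escape(delimiters) + "]+", text)
    return "".join(to_first_upper(part) for part in parts if part)


print(to_upper_camel_case("foo-bar"))
print(to_first_upper("foobar"))
```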

Posted on Saturday, March 15, 2008 at 06:31PM by 0x6e6562