Build Your Own AMQP Client

This article describes how to build your own client for the AMQP protocol. It is intended to give implementors an overview of the considerations that may influence design decisions in their target language. The article is a discussion of the AMQP protocol and the relevant touch points for implementing a client in a particular language.

Introduction

Don't get put off by the fact that the name Advanced Message Queueing Protocol contains the word advanced. Writing an AMQP client is not an overly complicated task, provided you have gained an appreciation of the flow and overall intention of the protocol. Once you have understood the AMQP model, it is not hard to design a client that separates the concerns that are inherently bound to the wire format of the protocol from those that are specific to a target language.

The AMQP Protocol

In a broad sense, the AMQP protocol is divided into layers that address different concerns:

  • A high level execution model that defines a set of commands which represent the interactions an application may want to have with a broker, such as publish, subscribe, unsubscribe etc;
  • How individual commands are marshalled and unmarshalled to and from frames suitable for sending over a wire;
  • How the framing layer is mapped on to a lower level transport protocol such as TCP, UDP or SCTP;
  • How to represent primitive data types in a language independent wire format.
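To make the last layer concrete, here is a minimal sketch of how a few primitive types could be written to and read from the wire. It assumes the conventions of the 0-8 family of the specification: integers in network byte order, short strings prefixed with a one-octet length and long strings prefixed with a four-octet length. The function names are my own and are not taken from any existing client.

```python
import struct


def encode_short(value: int) -> bytes:
    """16-bit unsigned integer in network (big-endian) byte order."""
    return struct.pack(">H", value)


def encode_long(value: int) -> bytes:
    """32-bit unsigned integer in network (big-endian) byte order."""
    return struct.pack(">I", value)


def encode_shortstr(text: str) -> bytes:
    """One length octet followed by at most 255 octets of data."""
    data = text.encode("utf-8")
    if len(data) > 255:
        raise ValueError("short strings are limited to 255 octets")
    return struct.pack(">B", len(data)) + data


def encode_longstr(data: bytes) -> bytes:
    """Four-octet length followed by the raw octets."""
    return struct.pack(">I", len(data)) + data


def decode_shortstr(buf: bytes, offset: int = 0):
    """Return the decoded string and the offset of the next field."""
    (length,) = struct.unpack_from(">B", buf, offset)
    start = offset + 1
    end = start + length
    return buf[start:end].decode("utf-8"), end
```

Returning the next offset from each decoder is one way to let generated code read the fields of a method one after another without copying the buffer.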

The execution model follows a workflow that addresses concerns such as client authentication, defining queue endpoints, setting up message routing rules, conversational multiplexing and sending and receiving data. A simple generic workflow would follow these basic steps:

  1. Connect to a broker and provide credentials for authentication;
  2. Open up a communication channel in which a conversation can be held;
  3. Define an exchange that messages can be sent to;
  4. Define queues that messages can be read from;
  5. Set up routing rules between exchanges and queues so that messages can be delivered to the appropriate endpoint;
  6. Send messages to an exchange;
  7. Receive messages from queues;
  8. Delete queues and/or exchanges when they are no longer required;
  9. Close the communication channel opened previously;
  10. End the connection to the broker.
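As a rough orientation, the ten steps above can be lined up against the methods a client would typically send for each of them. The table below is a sketch that uses the class.method naming of the 0-8 family of the specification; it lists only what the client sends and leaves out the broker's replies, and it picks Basic.Consume for step 7 although Basic.Get would also fit. The names `WORKFLOW` and `client_methods` are purely illustrative.

```python
# Workflow step -> methods sent by the client for that step.
WORKFLOW = [
    ("connect and authenticate",
     ["connection.start-ok", "connection.tune-ok", "connection.open"]),
    ("open a channel", ["channel.open"]),
    ("define an exchange", ["exchange.declare"]),
    ("define queues", ["queue.declare"]),
    ("set up routing rules", ["queue.bind"]),
    ("send messages", ["basic.publish"]),
    ("receive messages", ["basic.consume"]),
    ("delete queues and exchanges", ["queue.delete", "exchange.delete"]),
    ("close the channel", ["channel.close"]),
    ("end the connection", ["connection.close"]),
]


def client_methods():
    """Flatten the workflow into the order in which methods are sent."""
    return [name for _, methods in WORKFLOW for name in methods]


if __name__ == "__main__":
    for number, (step, methods) in enumerate(WORKFLOW, start=1):
        print(f"{number:2}. {step}: {', '.join(methods)}")
```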

The goal of this article is not to discuss each facet of the protocol in detail. Rather, it is intended to give a high level overview of the overall flow and to discuss the implications for developing a client. An implementor would need to read the formal specification anyway, so to gain a complete picture of the low level details, one should refer to the specification as defined by the working group. This article aims to shed light on that specification by illustrating some things that may be unclear from a client's perspective, or that an implementor may easily overlook.

Nature Of The Protocol

Like many other protocols, the interaction in AMQP centers around sending and receiving commands in a certain order. So from a high level perspective, an AMQP client is nothing more than a component that knows:

  • How to format and send commands from a user application to an AMQP broker;
  • How to receive and decode commands received from a broker and how to dispatch these command events back to the application.

In contrast to a simple protocol like SMTP, each AMQP command contains a lot of attributes that need to be encoded to a certain predefined structure. Furthermore, AMQP performs command framing, which allows layered command decoding that is independent of the underlying transport protocol. The framing provides transparency to the encoding and decoding layers, which do not need to know whether the connection is based on TCP, UDP, SCTP or some other kind of driver.
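To illustrate what framing means in practice, here is a minimal sketch of a frame encoder and decoder. It assumes the general frame layout of the 0-8 and 0-9 specifications: a one-octet frame type, a two-octet channel number, a four-octet payload size, the payload itself and a closing frame-end octet (0xCE). Only a few of the frame type constants are shown, and the function names are my own.

```python
import struct

# A few frame types and the frame-end marker (assumed from the 0-8 spec).
FRAME_METHOD, FRAME_HEADER, FRAME_BODY, FRAME_HEARTBEAT = 1, 2, 3, 8
FRAME_END = 0xCE
HEADER_SIZE = 7  # type (1 octet) + channel (2 octets) + size (4 octets)


def encode_frame(frame_type: int, channel: int, payload: bytes) -> bytes:
    """Wrap a payload in a frame header and the frame-end octet."""
    header = struct.pack(">BHI", frame_type, channel, len(payload))
    return header + payload + bytes([FRAME_END])


def decode_frame(buf: bytes):
    """Return (frame_type, channel, payload, remaining), or None if the
    buffer does not yet hold a complete frame."""
    if len(buf) < HEADER_SIZE:
        return None
    frame_type, channel, size = struct.unpack_from(">BHI", buf)
    end = HEADER_SIZE + size
    if len(buf) < end + 1:
        return None
    if buf[end] != FRAME_END:
        raise ValueError("missing frame-end octet")
    return frame_type, channel, buf[HEADER_SIZE:end], buf[end + 1:]
```

Because the decoder works on a plain byte buffer and hands back whatever it did not consume, the same code can sit behind a TCP socket or any other source of bytes.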

The protocol has similarities with Stomp, because it also defines a language neutral way of sending asynchronous messages to an endpoint. The channel multiplexing and framing notions are mildly reminiscent of BEEP. It also has the capability to carry messaging APIs such as JMS.

Comparison To Existing MOM

In comparison to existing MOM implementations such as MQ Series, Tibco or Sonic MQ, AMQP defines an open specification for the interaction between messaging peers.

In contrast to JMS, which is merely an API for Java that a MOM vendor has to implement a client-side driver for, AMQP defines a language neutral wire format.

Furthermore, the AMQP model is consumer orientated rather than producer orientated. This means that messages are simply published to an exchange and it is up to a consumer to decide what gets routed into the queues that they consume from. To a limited extent this is akin to the selector mechanism in JMS, whereby a client can do a selective receive of published messages based on certain criteria.

Design Considerations

The execution model and protocol layering as defined in the specification provide a good basis on which to structure a client. From that structure, one can identify the individual components that will be required:

  1. A facility to encode and decode a defined set of data types to and from a stream. The data types and their wire format are defined in the protocol and include fairly generic types such as integers, booleans, strings, time stamps, key value maps etc;

  2. A module to parse and generate the AMQP commands such as publish, consume, get, queue declare, queue bind, etc. Part of the working group's specification artifacts is an XML document that defines the structure of every command in AMQP. This document is intended to be used by code generators to avoid implementations having to write and maintain large amounts of boilerplate code;

  3. A way to compose commands into wire frames and decompose them back again;

  4. Some code to write marshalled commands to a socket;

  5. A component that reads from a socket, decodes commands and dispatches them to a language specific handling routine. This is a point of demarcation between the language neutral AMQP model and the mechanics of the target language. For example,

    • When a client sends a QueueDeclare command, the broker would acknowledge this with a QueueDeclareOk. Although these are sent asynchronously, it is inherently a request-response semantic. From a language perspective, you may choose to either perform a blocking receive of the response or register an event handler to process the response event. This is entirely down to the style of the library that you would like to have and the constraints that the target language imposes;
    • When a client subscribes to a queue, the messages will arrive asynchronously. It is then down to the implementation to decide how it will dispatch message payloads and other subscription life cycle events to the application. One common practice is to define a listener interface that will be invoked when messages arrive;

  6. (Optionally) Some simple workflow functionality that makes sure that commands are dispatched in the correct order with respect to the execution model, rather than relying on the application code to take care of this. This option is one of the extension points that you may want to build in to your client library.
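The two dispatching styles described under item 5 can be sketched side by side. The following is a minimal, hypothetical dispatcher: the reading component calls `dispatch` for every decoded command, a caller that wants request-response behaviour blocks in `wait_for`, and a subscriber registers a callback with `add_listener`. The class and method names are assumptions for illustration only, and a real client would need to think harder about thread safety and about matching replies to channels.

```python
import queue


class Dispatcher:
    def __init__(self):
        self._replies = {}    # command name -> Queue used for blocking waits
        self._listeners = {}  # command name -> list of callbacks

    def expect(self, name):
        """Register interest in a reply, ideally before sending the request."""
        return self._replies.setdefault(name, queue.Queue())

    def add_listener(self, name, callback):
        """Register a callback for commands that arrive asynchronously."""
        self._listeners.setdefault(name, []).append(callback)

    def dispatch(self, name, arguments):
        """Called by the socket-reading component for each decoded command."""
        for callback in self._listeners.get(name, []):
            callback(arguments)
        if name in self._replies:
            self._replies[name].put(arguments)

    def wait_for(self, name, timeout=None):
        """Block until a reply with the given name has been dispatched."""
        return self.expect(name).get(timeout=timeout)
```

A blocking `declare_queue` call would then call `expect("queue.declare-ok")`, send the request and return the result of `wait_for`, whereas a message consumer would simply register a listener for the delivery command.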

Terminology

Until now, I have used the term command to indicate a normative message that is exchanged between AMQP peers. Strictly speaking, this is not the correct terminology, but I used it in an introductory fashion because it is a well understood concept.

The AMQP nomenclature uses the term Method to describe an action that an AMQP implementation can request from a peer. AMQP methods are grouped into Classes that represent different functional areas. For example, the methods Publish, Consume and Return belong to the Basic class, which defines operations to send and receive messages. Operations to create and bind queues, such as QueueDeclare, QueueDelete, and QueueBind belong to the Queue class.
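On the wire, a method is identified by a pair of numbers, one for its class and one for the method within that class. The sketch below shows a small lookup table and how the pair could be written at the start of a method payload as two 16-bit integers. The numeric values are quoted from the 0-8 era XML specification as I remember it, so treat them as an assumption and check them against the version you are targeting; in a real client this table would come out of the code generator.

```python
import struct

# (class name, method name) -> (class id, method id); values assumed
# from the 0-8 specification and to be verified against your target.
METHODS = {
    ("queue", "declare"): (50, 10),
    ("queue", "bind"): (50, 20),
    ("queue", "delete"): (50, 40),
    ("basic", "consume"): (60, 20),
    ("basic", "publish"): (60, 40),
    ("basic", "return"): (60, 50),
}


def method_header(class_name: str, method_name: str) -> bytes:
    """Encode the class id and method id that prefix a method payload."""
    class_id, method_id = METHODS[(class_name, method_name)]
    return struct.pack(">HH", class_id, method_id)
```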

Layering

Layering is an intrinsic aspect of the whole AMQP concept and it would be prudent to give this consideration when writing a client. For instance, it might be a good idea to divide your implementation into:

  • A low level layer that encodes and decodes AMQP methods. Rather than strongly typing a client API, if you write functions that can send any type of AMQP methods, you will have little, if any, command specific code in this layer. This means that you can delegate all of the command specific functionality to the generated code thus making this layer relatively impervious to any future changes in the specification;

  • Optionally, a middle layer that provides some simple form of method ordering. This allows the client to verify that the sequencing required by the AMQP model is maintained, as well as allowing application code to submit commands ahead of time. In addition to this, many AMQP methods cannot be transmitted concurrently, so this is an ideal location to enforce a barrier of serialization;

  • A high level API that provides convenience methods for application code. This reduces the number of flags and attributes that need to be set on each AMQP method. It is this layer that lends itself to techniques such as dependency injection and inversion of control in order to minimize and simplify the amount of application code required to interface with the client;

  • Templates for common programming idioms such as default message consumers or asynchronous RPC clients and servers;

  • Sensible defaults for the various method attributes that user code can override if they require specific options to be set.
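A minimal sketch of how the lowest and highest layers might relate is shown below. The low level class can send any method by name and knows nothing about individual methods, while the high level class adds a convenience call with defaults that user code can override. The class names are hypothetical, the flag names follow the Queue.Declare fields of the specification, and the default values are my own choice rather than anything the specification mandates.

```python
class LowLevelChannel:
    """Lowest layer: sends any method by name, with no method-specific code."""

    def __init__(self):
        self.sent = []  # stands in for encoding and writing to a socket

    def send_method(self, name, **arguments):
        self.sent.append((name, arguments))


# Illustrative defaults; an application can override any of them.
QUEUE_DECLARE_DEFAULTS = {
    "passive": False,
    "durable": False,
    "exclusive": False,
    "auto_delete": False,
    "nowait": False,
}


class Channel:
    """High level API: convenience methods built on the generic layer."""

    def __init__(self, low_level):
        self._low = low_level

    def declare_queue(self, queue, **overrides):
        arguments = {**QUEUE_DECLARE_DEFAULTS, **overrides, "queue": queue}
        self._low.send_method("queue.declare", **arguments)
```

With this split, application code can write `Channel(low).declare_queue("orders", durable=True)` and leave the remaining flags alone, and a change to the specification would mostly touch the generated code behind `send_method`.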

Available Tools

As previously mentioned, there is a parseable version of the specification that can be used to generate a lot of the boilerplate code for parsing and generating AMQP methods and headers. Because this work has already been done in a few variants, it may be a good idea to re-target an existing code generation facility.

One example is the Python based codegen component in the RabbitMQ source tree. This component is divided up into two parts:

  • A parser that reads a JSON version of the AMQP specification and creates a canonical in-memory representation of all of the AMQP methods and headers;
  • Different emitters that serialize the abstract model to a particular language syntax. Emitters exist for Java and Erlang.

Although this is conceptually an entity separate from the RabbitMQ broker code base, it hasn't yet been released as a standalone component.

The RabbitMQ C# client takes exactly the same approach but it has implemented the parsing and emitting entirely in C#.

A further example is the code generation submodule of the as3amqp library. This takes the same two step approach, except that it parses an XML variant of the specification and it uses the StringTemplate library to emit the model to the target language. In contrast to the RabbitMQ emitter, which is a custom code emitter, the as3amqp code generator uses a push based templating mechanism. Furthermore, it marries up requests with their associated responses in the generated code, which can be useful when implementing non-blocking event-based method dispatching.
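The two step approach shared by these generators, parse into a model and then emit, can be sketched in a few lines. The XML fragment below is a simplified, made-up excerpt in the spirit of the specification document rather than a copy of it, and the emitter is a plain custom emitter that produces Python source, not a template-based one. None of this reflects the actual code of the RabbitMQ or as3amqp generators.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment; the real specification is richer.
SPEC_FRAGMENT = """
<amqp>
  <class name="queue" index="50">
    <method name="declare" index="10">
      <field name="queue" type="shortstr"/>
      <field name="durable" type="bit"/>
    </method>
  </class>
</amqp>
"""


def parse_spec(xml_text):
    """Step 1: build a language-neutral in-memory model of the methods."""
    model = []
    for cls in ET.fromstring(xml_text).findall("class"):
        for method in cls.findall("method"):
            model.append({
                "class": cls.get("name"),
                "class_id": int(cls.get("index")),
                "method": method.get("name"),
                "method_id": int(method.get("index")),
                "fields": [(f.get("name"), f.get("type"))
                           for f in method.findall("field")],
            })
    return model


def emit_python(model):
    """Step 2: serialize the model to source code for one target language."""
    chunks = []
    for m in model:
        name = (m["class"] + "-" + m["method"]).title().replace("-", "")
        chunks.append(
            f"class {name}:\n"
            f"    CLASS_ID = {m['class_id']}\n"
            f"    METHOD_ID = {m['method_id']}\n"
            f"    FIELDS = {m['fields']!r}\n"
        )
    return "\n".join(chunks)
```

Targeting another language then means writing another emitter against the same model, which is the property that makes re-using an existing generator attractive.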

Example Implementations

Before jumping straight into the development of a new client, it may be advantageous to see how other people have approached the same problem. There are a variety of client implementations available that implement different versions of the specification. Without going into too much detail, some clients work with more than one broker implementation, whilst others have been designed with only one broker in mind. This is due to some incoherencies in the interpretation of the specification by the various broker implementations. However, the working group is directing efforts to an upcoming revision of the protocol that will harmonize the current broker implementations. Hopefully, in the near future, there will be an unambiguous specification that will prevent any misinterpretations.

For now, however, here is a list of the current client implementations with a side note about their respective broker compatibility and design considerations:

  1. QPid clients for C++, Java, Ruby, Python and C#. These work with the QPid broker implementations, which vary marginally from the original core specification;
  2. OpenAMQ client for C which is designed to work well with the OpenAMQ broker;
  3. RabbitMQ Java client which implements version 0-8 of the specification;
  4. RabbitMQ C# client which implements versions 0-8 and 0-9. This has been tested against the RabbitMQ, QPid and OpenAMQ brokers;
  5. RabbitMQ Erlang client, which implements version 0-8. In addition, this client can transparently use AMQP to speak to a co-located instance of RabbitMQ, thus avoiding network overhead whilst not tying the application into the proprietary RabbitMQ API;
  6. py-amqplib, which implements version 0-8. This provides a simple non-threaded client for Python for scenarios where threading is undesirable;
  7. as3amqp, which is an AS3 implementation of version 0-8. Because the Flash player only has a single thread of execution, this client is similar in nature to py-amqplib.

Gotchas

Although I cannot account for all of the potential incoherencies between the different versions of the specification and the various implementations, I will briefly list a few things that an implementor may be confused by:

  • It would be nice to know which of the eight frame types you have to deal with, and which ones aren't strictly necessary. For example, can an implementor ignore OOB-METHOD, OOB-HEADER, OOB-BODY, TRACE, and HEARTBEAT?
  • There are different versions of the specification (0-8, 0-9 and 0-10), some of which have an SP revision, and some of which have not been fully signed off yet. Because the forwards or backwards compatibility of each one may not be guaranteed, an implementor should have some idea of which version they are targeting;
  • There are ambiguities in the formal specification of certain elements, for example, how decimal types are coded in field tables is unclear in the 0-8 specification. Another example is that the exact match specification for a header exchange is unclear;
  • AMQP Realms are specified in version 0-8, but were removed from a subsequent version of the protocol, due in part to the fact that most of the implementations didn't actually include this feature;
  • The StartOk command encodes the username and password for authentication purposes. This employs SASL as an encoding scheme, so if you encode using 'PLAIN' you need to pad each key/value pair with a null byte. Alternatively, use 'AMQPLAIN' and don't pad at all.

These are just some examples of the pitfalls one may encounter when reading and interpreting the specification.
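To make the last of these points more tangible, here is a sketch of how the two credential encodings could look. The 'PLAIN' variant follows the standard SASL PLAIN layout with an empty authorization identity, so each value is preceded by a null byte. The 'AMQPLAIN' variant reflects my understanding of how RabbitMQ-compatible clients commonly build it: the body of a field table with LOGIN and PASSWORD entries, each a long string, without the table's own length prefix. Both should be treated as assumptions and verified against the broker you are targeting.

```python
import struct


def plain_response(username: str, password: str) -> bytes:
    """SASL PLAIN: empty authorization id, NUL, username, NUL, password."""
    return (b"\x00" + username.encode("utf-8")
            + b"\x00" + password.encode("utf-8"))


def _field_name(name: str) -> bytes:
    """Field table key: short string with a one-octet length."""
    data = name.encode("utf-8")
    return struct.pack(">B", len(data)) + data


def _longstr_value(value: str) -> bytes:
    """Field table value: type tag 'S' plus a long string."""
    data = value.encode("utf-8")
    return b"S" + struct.pack(">I", len(data)) + data


def amqplain_response(username: str, password: str) -> bytes:
    """Field table body with LOGIN and PASSWORD (assumed layout)."""
    return (_field_name("LOGIN") + _longstr_value(username)
            + _field_name("PASSWORD") + _longstr_value(password))
```

Either result would then be sent as the long string response field of StartOk, alongside the name of the chosen mechanism.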

Other Sources Of Information

Hopefully this has been a useful article for a potential AMQP client implementor. Apart from reading the specification, which is an essential requirement, you might want to use some of the following sources of information:

  • The issue system at amqp.org which forms the basis of the working group's discussions;
  • The mailing lists of OpenAMQ, QPid or RabbitMQ;
  • Articles illustrating the concept of AMQP;
  • The code base of the clients mentioned above;
  • Or simply post a comment to this article.
Posted on Saturday, June 21, 2008 at 06:01PM by 0x6e6562 | 2 Comments

Reader Comments (2)

This article is excellent, probably the best.

I like the bulleted list at the end with your gotchas; I think it is exceedingly important to expand on these types of critical lists -- and this from the social science perspective.

The most troubling is the lack of commitment to backwards compatibility; backwards compatibility modules would be nice, and should be required by an act of Congress.

September 4, 2008 | John van V

Nice article, thanks for sharing!

October 4, 2008 | Rajika
