Imperial College of Science, Technology and Medicine University of London Department of Computing.

Component Interaction in Distributed Systems

Nathaniel G. Pryce.

A Thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in the Faculty of Engineering of the University of London, and for the Diploma of the Imperial College of Science, Technology and Medicine

January 2000


Abstract

Component-based development is seen as a way to increase programmer productivity and reduce software maintenance costs through the reuse of off-the-shelf software components. Component-based development is distinguished from object-orientation in that systems are defined by the composition of black-box components that interact according to well-defined protocols. Current component models define interaction protocols between components as object interfaces. Such an approach is limited: an interface can only describe bundles of synchronous, request/reply operations, but cannot specify the protocol by which those operations must be invoked or how multiple interfaces are to be used in concert. Yet all but the simplest component models define interaction protocols that involve multiple parties, many-to-one communication or concurrency between communicating parties.

We introduce a model of component interaction that addresses the limitations of object invocation. This interaction model is used as the basis for a language, Midas, that is used to define interaction protocols between components. Midas definitions can be annotated with formal specifications of properties of the protocol. We show how the use of a notation to define the interaction protocol allows one to mechanically check the behaviour of the interaction protocol over different transports, thereby catching errors at the design stage. Models can also be translated into test code.

Midas definitions are compiled into runtime support code that makes use of the transport framework. This framework hides the platform-specific transport API and defines transport protocols as compositions of lightweight components. This componentisation of the transport protocol allows the designer to select the most appropriate protocol for each binding and insert additional functionality, such as compression or encryption, above existing protocols. Tests show that this componentisation does not affect the latency or throughput of a binding compared to the use of a traditional implementation.


Acknowledgements

I would like to thank the members of the Distributed Software Engineering group for their advice and support. In particular I would like to thank Naranker Dulay, my supervisor, Morris Sloman, Jeff Kramer and Jeff Magee, and ex-members of the group, Steve Crane, Kevin Twidle and Hal Fossa, for many stimulating discussions. The research presented in this thesis was funded by British Telecom through the Management of Multiservice Networks project. I would like to thank Ian Marshall, Paul McKee and Sohail Rana of British Telecom Laboratories for valuable feedback about the use of Midas and the Regent run-time frameworks.

The work presented in this thesis is built upon research performed over several years within the Distributed Software Engineering Group, especially the Darwin/Regis system and the TRACTA project.

The Regis system was originally developed by Jeff Magee, Stephen Crane and Kevin Twidle. The second version of the Regis system was developed by Stephen Crane and the author. The runtime system presented in this thesis, Regent, differs from Regis in several ways:

• Regis does not specify component interaction protocols separately from their implementation. Regis interaction protocols are defined solely by their implementation as C++ endpoint classes.

• Regis does not separate the concerns of application and presentation-layer protocols: endpoint classes are responsible for marshalling messages and interfacing with the transport protocol stacks. Therefore, Regis does not support the ability to plug application-layer protocol filters into a binding.

• Regis cannot generate marshalling code because there is no separate specification of the application-layer protocol. Programmers writing endpoint classes must implement marshalling themselves. In practice, this means that Regis endpoints can only transmit flat data structures.

The transport framework presented in this thesis is similar to that used by Regis. However, the model of control and event interfaces is significantly different. Regis protocol layers pass control messages up and down the stack to notify higher layers of significant events or request control operations from lower layers. Regent protocols provide control interfaces and JavaBean events that can be queried from higher in the stack. Regent allows bindings to specific control and event interfaces to bypass layers that have no interest in those interfaces, while Regis control messages must pass through all intermediate layers in the stack.

The Regis transport framework does not support the implementation of composite layers (layers defined solely as compositions of other layers) or provide a registry for giving stack descriptions well-known, human-friendly names.
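The contrast between the two control models described above can be illustrated with a small sketch. It is not taken from Regis or Regent: the class names, the layer names ("marshal", "compress", "multicast") and the `MulticastControl` interface are all hypothetical, and Python is used only for brevity. The first class forwards every control request through each layer in turn; the second lets a client query the stack once for a control interface and then call it directly, so intermediate layers are not involved in later calls.

```python
class MessageStyleLayer:
    """Hypothetical layer that forwards control requests down the stack."""

    def __init__(self, name, below=None, handles=()):
        self.name = name
        self.below = below
        self.handles = set(handles)
        self.requests_seen = 0  # how many control requests touched this layer

    def control(self, request):
        self.requests_seen += 1
        if request in self.handles:
            return f"{self.name} handled {request}"
        if self.below is None:
            raise LookupError(request)
        # Layers with no interest in the request must still forward it.
        return self.below.control(request)


class MulticastControl:
    """Hypothetical control interface offered by a bottom layer."""

    def __init__(self):
        self.groups = []
        self.calls = 0

    def join(self, group):
        self.calls += 1
        self.groups.append(group)
        return f"joined {group}"


class InterfaceStyleLayer:
    """Hypothetical layer that exposes named control interfaces."""

    def __init__(self, name, below=None, interfaces=None):
        self.name = name
        self.below = below
        self.interfaces = dict(interfaces or {})

    def query(self, interface_name):
        # Walk down the stack once to locate the interface; the caller
        # then holds a direct reference and bypasses intermediate layers.
        layer = self
        while layer is not None:
            if interface_name in layer.interfaces:
                return layer.interfaces[interface_name]
            layer = layer.below
        raise LookupError(interface_name)


def build_message_stack():
    bottom = MessageStyleLayer("multicast", handles={"join"})
    middle = MessageStyleLayer("compress", below=bottom)
    top = MessageStyleLayer("marshal", below=middle)
    return top, middle, bottom


def build_interface_stack():
    control = MulticastControl()
    bottom = InterfaceStyleLayer(
        "multicast", interfaces={"MulticastControl": control}
    )
    middle = InterfaceStyleLayer("compress", below=bottom)
    top = InterfaceStyleLayer("marshal", below=middle)
    return top, control
```

In the message-passing stack, every `control("join")` call increments `requests_seen` on all three layers. In the interface-based stack, `query("MulticastControl")` is the only operation that walks the stack, and subsequent `join` calls reach the control object directly.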

The Darwin language was designed by Naranker Dulay, Jeff Kramer and Jeff Magee. The FSP notation and LTSA model checker were designed and implemented by Jeff Magee, based on the work of Dimitra Giannakopoulou, Jeff Kramer and Sing-Chi Cheung.


For Dorothy Cannon.


Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables

Chapter 1. Introduction
    1.1. Motivation
    1.2. Example Scenario: An On-line Record Shop
    1.3. Conclusions from the Example Scenario
    1.4. Structure

Chapter 2. Component Interaction in Distributed Systems
    2.1. Introduction
        2.1.1. A Brief History of Distributed Programming
    2.2. Requirements of a Middleware Platform
    2.3. Object Oriented Middleware
        2.3.1. RM-ODP
        2.3.2. CORBA
        2.3.3. COM
        2.3.4. Java Beans, RMI, and Enterprise Java Beans
        2.3.5. Reflective Middleware
    2.4. Architecture Description Languages and Connectors
        2.4.1. Polylith
        2.4.2. Darwin/Regis
        2.4.3. OLAN
        2.4.4. UniCon
        2.4.5. Wright, Aesop and ACME
    2.5. Transport Protocol Frameworks
    2.6. Others
    2.7. Summary

Chapter 3. A Model of Component Interaction
    3.1. Introduction
    3.2. Abstract View of Component Interaction
        3.2.1. Binding
        3.2.2. Synchronisation Between Components over a Binding
        3.2.3. Control of Bindings
        3.2.4. Transport Protocols
    3.3. Conclusion

Chapter 4. Midas: A Language for Specifying Interaction Styles
        4.0.1. LongSlot Interaction Style
        4.0.2. Slot Interaction Style
        4.0.3. Mport Interaction Style
        4.0.4. Func Interaction Style
        4.0.5. Event Interaction Style
        4.0.6. Attribute Interaction Style
    4.1. Modelling Bindings in FSP
    4.2. Modelling Components in FSP
    4.3. Supporting Endpoint Implementation
        4.3.1. Guiding Implementation
        4.3.2. Runtime Verification
    4.4. Summary

Chapter 5. Case Study: On-line Music Shop
    5.1. Introduction
    5.2. Overview
    5.3. System Architecture
    5.4. Media Processing Components
    5.5. Comparison with CORBA
    5.6. Conclusion

Chapter 6. Mapping Midas To Java
    6.1. Introduction
    6.2. Modules
    6.3. Constants
    6.4. Basic Types
    6.5. User-Defined Types
        6.5.1. Marshalling Support
        6.5.2. Enums
        6.5.3. Structures
        6.5.4. Typedefs
    6.6. Interaction Types
        6.6.1. Message Interfaces
        6.6.2. Endpoint Stubs
        6.6.3. Support for Third-Party Binding
        6.6.4. Proxies and Service Access Points (SAPs)
        6.6.5. Reference Objects
    6.7. Additional Mappings for Interaction Types
        6.7.1. Reified Messages
        6.7.2. Datagram Proxies and SAPs
    6.8. Generic Types
    6.9. Summary

Chapter 7. Transport Protocol Framework
    7.1. Introduction
    7.2. Requirements of the Transport Subsystem
    7.3. Overview
    7.4. Implementation of Protocol Layers
    7.5. Memory Management
    7.6. Addressing
    7.7. Connection-Oriented Protocols
    7.8. Transport Filters
    7.9. Construction of Protocol Layers
        7.9.1. Automatic Stack Elaboration
    7.10. Protocol Registry
    7.11. Concurrency and Synchronisation
    7.12. Available Protocol Components
    7.13. Summary

Chapter 8. Performance Analysis
    8.1. Introduction
    8.2. Comparing Middleware Platforms
        8.2.1. Results
        8.2.2. Relative Code Sizes
    8.3. The Effect of Different Interaction and Transport Protocols
    8.4. Summary

Chapter 9. Conclusions
    9.1. Introduction
    9.2. Contributions
    9.3. Critical Evaluation
    9.4. Further Work
        9.4.1. A General Connector Model
        9.4.2. Improved Support for Dynamic Composition of Transport Layers
        9.4.3. QoS-Directed Transport Construction
        9.4.4. Runtime Visualisation
    9.5. Closing Remarks

Bibliography

Appendix A. Midas Syntax
    A.1. Comments and Preprocessing
    A.2. Types
    A.3. Interaction Definitions
    A.4. Constant Definitions
    A.5. Modules

Appendix B. FSP Syntax
    B.1. Processes
    B.2. Composite Processes
    B.3. Common Operators
    B.4. Properties

List of Figures

List of Figures 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35. 36. 37. 38. 39. 40. 41. 42. 43. 44. 45. 46. 47. 48. 49. 50. 51. 52.

Overview of the on-line record shop application. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 CORBA interceptors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 A graphical Java Bean development tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 An EJB deployment tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 The structure of a hierarchical open binding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 A simple filter pipeline in Darwin’s graphical and textual syntaxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 OLAN connector objects between collocated and distributed components . . . . . . . . . . . . . . . . . . . . . . . . 38 Abstract view of the interaction model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 Binding from client endpoint to service, and backbinding from service to client . . . . . . . . . . . . . . . . . . . 52 Client and service proxies provide distribution transparency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 SAPs identify service endpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 Synchronisation between concurrent components using the Attribute interaction . . . . . . . . . . . . . . . . . . . 55 Control interfaces allow management of bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 Multicast used for an event interaction protocol. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
57 The endpoints of an interaction represented graphically as FSP processes . . . . . . . . . . . . . . . . . . . . . . . . 63 Behaviour of the LongSlot endpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 Java Implementation of Func client endpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 Potential error caused by binding between address spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 Example executions of the Attribute interaction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 Constraint properties of the Attribute interaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 FSP model of the client endpoint of the Attribute interaction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 FSP model of the service endpoint of the Attribute interaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 FSP model of the value held by an Attribute service.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 FSP model of how client requests are queued at an Attribute service. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 The property to check that the queue of an Attribute service does not overflow. . . . . . . . . . . . . . . . . . . . 82 A model of C clients bound to a server in the same address space. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 Modelling bindings over a transport protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 One-slot buffer modelled in FSP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 Reliable, ordered, simplex and duplex connections . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . 84 Unreliable, ordered, simplex connection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 Reliable, Unordered Simplex Connection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 Unreliable, unordered simplex connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 A primitive component modelled as FSP processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 An animator allowing interactive exploration of a Midas interaction protocol . . . . . . . . . . . . . . . . . . . . . 87 Test filters can be monitored by remote management agents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 Overview of the on-line record store application.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 Architecture of the record shop server. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 Architecture of the record shop client . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 Instantiations of generic interaction types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 Components used to stream audio data between address spaces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 Source component for streaming audio. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 Sink component for streaming audio. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 An example system that streams media between two address spaces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
97 The MediaStream interaction type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 Model of the MediaStream client endpoint for 1-bit sequence numbers . . . . . . . . . . . . . . . . . . . . . . . . . . 99 Model of the MediaStream service endpoint for 1-bit sequence numbers . . . . . . . . . . . . . . . . . . . . . . . . 100 Classes of the record shop application. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 Placement of CORBA objects onto servers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 Sequence classes that hold primitive types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 Classes for the Attribute interaction type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 The Binder class generated for the Attribute interaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 Generated classes that implement distribution transparency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

9

53. Classes supporting generic types
54. Protocol layer interfaces
55. Protocol layers composed into a stack, showing control and event connections
56. Session layers created by a multiplexor for each of its clients
57. ProtocolService interface definition
58. MultiplexorService interface definition
59. Static structure of the memory management classes
60. Time taken to write data to a BufferOutputStream and ByteArrayOutputStream
61. Immutable address objects can be shared
62. Address objects are constructed from header fields of received messages
63. Static Structure of TransportFilter Classes
64. Threads in the protocol stack
65. Architecture of the Midas experiment
66. Comparison of CORBA, RMI and Midas performance
67. Sizes of the three test programs
68. The effect of transport layers on throughput
69. The effect of interaction style on throughput
70. Modules used to group definitions


List of Tables

1. A common synchronisation error detected by static analysis
2. Midas modules mapped to Java packages
3. Midas constant declaration mapped to Java
4. Primitive Midas types
5. Midas enum mapped to a Java class
6. Midas struct mapped to Java class
7. Midas typedefs mapped to Java
8. Message interfaces generated from a Midas interaction
9. A Midas interaction instance mapped to Java
10. Type objects representing primitive types
11. The type object generated for each user defined type
12. Genericity support generated from a Midas specification
13. Example transport filters
14. Protocol components
15. CORBA and RMI interfaces for the performance experiment
16. Primitive Midas types
17. An example Midas enumerated type
18. Basic structured Midas types
19. Data structure definitions
20. Generic structure declaration
21. Type aliases
22. An example interaction definition
23. Constant definitions
24. Process operators
25. Composite process operators
26. Common process operators
27. Safety and progress properties


1. Introduction


The last thing that we find in making a book is to know what we must put first. Blaise Pascal

1.1. Motivation

As the demand for distributed computer services rises, so does the need to easily construct, deploy and maintain such services in heterogeneous environments, where multiple types of network, end system hardware, operating system and programming language must be integrated. This has led to the development and standardisation of “middleware” systems - software environments that hide platform and language specific details and provide runtime support for transparent distributed processing.

Current middleware systems focus mainly on supporting communication between modules of the system. The object oriented paradigm has become the most widely accepted model for middleware systems [ODP95, OMG98, EE98]. An object-oriented system is composed of objects that make services available to other objects in the system through polymorphic interfaces composed of request/reply operations by which clients can request a service and receive some reply. However, object-oriented systems do not provide much support for creating and managing the bindings between objects: bindings are hidden within object implementations and are usually established by the object that uses the binding, either by searching a name service or trader or through out-of-band mechanisms. Such bindings are known as “first-party bindings”. They complicate the system in two ways: they hide architectural details within objects, and they mix the code that establishes bindings and handles binding errors with the functional code of each component.

Component-based development [SG96] is a refinement of object orientation that aims to solve the problems inherent in the use of first-party binding. A component is an object that supports a compositional approach to system construction by exposing at its interface both the services it provides to and those it requires from other components, and providing operations to allow its required services to be bound to those provided elsewhere. Component reuse is facilitated by defining standard inter-component protocols for application-independent tasks - serialisation or database access for example - and tasks specific to some application domain - media processing or healthcare management for example. Systems are constructed by instantiating components and binding their interfaces together. A binding between two components is performed by some other party, usually a composite component that encapsulates the components being bound. This binding model, known as “third-party binding” [Fried87], separates the algorithmic details of the system, encapsulated within components, from the architecture of the system, encapsulated within the code performing the binding actions. The architecture of a system can be defined declaratively using an architecture description language (ADL) [MDEK95] that is compiled into low-level code that performs the binding actions to elaborate the architecture, or visualised graphically.
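The difference between the two binding models can be illustrated with a minimal sketch. It is written in Python purely for brevity and is not taken from the framework described in this thesis; the names Logger, Worker, System and bind_log are hypothetical. The Worker component declares a required service but never looks for a provider itself; a composite component performs the binding.

```python
from typing import Callable, List, Optional


class Logger:
    """Provides a 'log' service and records what it receives."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def log(self, text: str) -> None:
        self.lines.append(text)


class Worker:
    """Requires a 'log' service but does not know who provides it."""

    def __init__(self) -> None:
        # Required service: unbound until some third party binds it.
        self.log: Optional[Callable[[str], None]] = None

    def bind_log(self, provider: Callable[[str], None]) -> None:
        self.log = provider

    def run(self, job: str) -> None:
        if self.log is None:
            raise RuntimeError("required service 'log' is not bound")
        self.log(f"finished {job}")


class System:
    """Composite component: holds the architecture, not the algorithms."""

    def __init__(self) -> None:
        self.logger = Logger()
        self.worker = Worker()
        # Third-party binding: the composite connects required to provided.
        self.worker.bind_log(self.logger.log)


system = System()
system.worker.run("job-1")
```

With first-party binding, Worker would instead contain the code that locates a Logger (for example through a name service), so the structure of the system would be buried in its implementation.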

As they grow in sophistication, distributed systems will require more sophisticated and more varied styles of intercomponent interaction beyond the many-to-one, synchronous request/reply invocations provided by object-oriented middleware. For example:
• Financial systems use one-to-many event notification to announce stock-quotes to brokers’ terminals.
• Compute-intensive parallel processing tasks require components to execute concurrently as much as possible and so use asynchronous message passing between components.
• The rapid acceptance of the Internet, especially as a consumer medium, has led to demands for distributed multimedia services, such as telephony and video, that differ radically from data processing, the traditional application domain of distributed computing systems, and require different qualities of service, such as reliability and timeliness.

A major weakness of current middleware platforms is that they typically provide the programmer with only a single form of interaction between components. This makes it difficult to build a system that requires a richer set of interaction styles. Other interaction styles must be implemented either in terms of the interaction style provided by the middleware platform or by using lower-level services, thereby losing the advantages of using the middleware.

1.2. Example Scenario: An On-line Record Shop

As an example, let’s examine a potential application that is receiving a great deal of interest from consumers and industry at the moment, that of selling music over the network. In such a scenario, multiple service providers maintain databases of digital music tracks. A client wanting to buy music browses the tracks available at an on-line record store and can listen to streamed samples of tracks in which they are interested before paying for and downloading high-quality versions of the files onto their local computer or hi-fi.

From the short description above we can identify some of the high-level components of the system, as shown in figure 1. The on-line record store is accessed through a component that maintains the database of information about the music in the store. The music itself is stored in one or more media stores. Components can be instantiated on these media stores to stream a low-resolution preview of the music or to download the file to the client’s hi-fi. The client program is made up of components that allow the user to visually browse the contents in the store, receive and play streams of audio, and download purchased files onto the client’s computer or hi-fi. Further components are used within the major application components to stream audio data to and from disk or audio devices and to process audio streams, for example to convert between formats.

Figure 1. Overview of the on-line record shop application. (Diagram labels: Client, Record Shop, Media Stores, HiFi; Browse, Purchase, Control, Preview Music, Download Music.)

These components must interact in different ways, and different interactions, and even separate uses of the same interaction type, need different qualities of service and levels of security. The client browses the contents of the music store by invoking request/reply operations over a reliable connection. When requesting a preview of a track, the client will receive a stream of continuous media, which does not have to be reliable but requires some guaranteed bandwidth and maximum jitter. When requesting purchase of one or more files, the client again uses a request/reply transaction over a reliable connection; however, unlike the connection used for browsing, the connection used for requesting a purchase must also be secure. Finally the music files are transferred to the client over a pipe that efficiently transfers large amounts of data; this interaction also requires a reliable, secure connection.

It is easy to use CORBA [OMG98] to implement the services that allow clients to browse and purchase the available music tracks because the request/reply interaction style required for browsing is the (only) interaction style supported by CORBA. The security required to handle on-line payment complicates matters, but is implemented by the CORBA security service, a standard that defines secure interoperability protocols and management interfaces implemented in terms of ORB mechanisms. The security service is not a standard component of the CORBA platform, and so one must make sure to purchase an ORB that implements this standard.

Streaming preview versions of music to the client is more complex, however. The reliable request/reply interaction abstraction used by CORBA objects is not appropriate for the streaming of continuous media because the reliability mechanisms and the synchronisation between client and server impede the smooth delivery of data and so disrupt playback. Interfaces for controlling media streams are defined by the CORBA telephony standard, a domain-specific facility defined in terms of the CORBA ORB and base services, but there are currently no available implementations. Thus the programmer is forced either to integrate the CORBA ORB with another middleware platform that provides support for the media streams and other interaction styles they require, or to implement media streams themselves, using non-portable operating system mechanisms, and integrate their own mechanisms with the ORB.

We will develop this scenario in more detail in chapter 5, to illustrate how the demands of this application are met by the software environment and tools resulting from this research.

1.3. Conclusions from the Example Scenario

We can see that even in this small example, components of a distributed system must communicate in a variety of ways, but that current middleware platforms do not adequately support these requirements. Middleware, therefore, must allow programmers to specify a wider variety of component interaction protocols than is currently possible. As with current middleware, programmers must be able to compile these specifications into the runtime support required to construct the system: support for binding, distribution transparency, monitoring and management. However, protocols that are more complex than sequences of synchronous request/reply pairs are both harder to design correctly and harder for programmers using the protocol to understand. These drawbacks can be overcome by defining precise, formal specifications of each interaction protocol, both to unambiguously describe the protocol to those who must implement it and to allow design-time error detection by mechanical means.

This thesis introduces a model of interaction between distributed components and a language that is used to define interaction styles based upon this interaction model. A major advantage of this interaction model and language is that it supports both the design-time analysis of interaction styles and the construction of components and systems that make use of those interaction styles. Design is supported by including specifications of the interaction protocol that can be checked mechanically for deadlock or violation of user-defined constraints. Construction is supported by translating these specifications into implementation language constructs that provide the “glue” to connect components using the interaction style, whether they are within the same address space or distributed across the network. Moreover, our approach links the design and construction phases by translating interaction specifications into objects that can be inserted into bindings to check that components interacting over those bindings conform to the specified interaction protocol.
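The idea of an object inserted into a binding to check protocol conformance can be sketched as a small state machine that inspects each message before passing it on. This is only an illustration under assumed names: ProtocolChecker, ProtocolViolation and the three-message STREAM protocol are hypothetical, and the sketch is hand-written rather than generated from a specification.

```python
class ProtocolViolation(Exception):
    """Raised when a message is not allowed in the current protocol state."""


class ProtocolChecker:
    """Filter placed in a binding: checks each message, then forwards it."""

    def __init__(self, transitions, start, deliver):
        self._transitions = transitions  # {(state, message): next_state}
        self._state = start
        self._deliver = deliver          # next stage of the binding

    def send(self, message):
        key = (self._state, message)
        if key not in self._transitions:
            raise ProtocolViolation(
                f"'{message}' not allowed in state '{self._state}'")
        self._state = self._transitions[key]
        self._deliver(message)


# Hypothetical protocol: open, then any number of data messages, then close.
STREAM = {
    ("idle", "open"): "streaming",
    ("streaming", "data"): "streaming",
    ("streaming", "close"): "idle",
}

received = []
binding = ProtocolChecker(STREAM, "idle", received.append)
for msg in ["open", "data", "data", "close"]:
    binding.send(msg)

try:
    binding.send("data")  # no stream is open, so this must be rejected
    violation_detected = False
except ProtocolViolation:
    violation_detected = True
```

Because the checker has the same send operation as the stage it wraps, it can be added to or removed from a binding without changing the components at either end.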

1.4. Structure

The rest of this thesis is structured as follows. In chapter 2 we examine the current state of the art in component interaction models for distributed systems and highlight the advantages and shortcomings of the various approaches. This concludes with a set of requirements that a system should meet to support programmers in defining system architecture in terms of communicating components. In chapter 3 we introduce our model of intercomponent interaction and the Midas language used to specify intercomponent interaction protocols and show how this model and language meet the requirements identified in chapter 2. We show how the Midas language integrates with existing modelling and CASE tools to support design-time mechanical analysis of interaction protocols and entire systems that follow our component model. Chapter 5 elaborates on the scenario of section 1.2, showing how a media-on-demand application is designed and built in terms of components communicating via Midas interaction styles. Chapter 6 describes in detail how Midas definitions are translated into the Java programming language and chapter 7 describes the Java component framework for implementing transport protocols. Chapter 8 examines the effect our approach has on the performance of a system and shows that the approach has a negligible effect and can in some cases improve performance. Finally, chapter 9 summarises the contributions of this thesis and examines how the work can be taken further.


2. Component Interaction in Distributed Systems

I may not agree with what you say, but I'll defend to the death your right to say it. Voltaire

2.1. Introduction

Distributed systems are constructed from components that interact together to perform common tasks. The architecture of a system can be viewed as the composition of its components and the bindings between them. This chapter examines how current distributed programming environments support the design of system architecture and component interaction styles and the mechanisms that realise these architectures and interaction styles at run-time.

2.1.1. A Brief History of Distributed Programming

Computers were first connected by networks during the late 1960s. At the time, processing resources were limited and operating systems simple. Programmers writing distributed programs were responsible for implementing all of the required communication code, even to the level of driving the link-layer hardware. In the early 1970s, researchers on the ARPANET project recognised the need to pass messages between different network architectures and formulated the “internet problem”: how to get computers to communicate across multiple packet networks without knowing the underlying network technologies.

A year later, they published the TCP protocol to solve the internet problem. At the time TCP handled both transport and internetwork concerns. The separate concepts of transport and internetwork protocol were developed later, when TCP and IP were separated to allow the implementation of UDP over IP. Transport protocols simplify distributed programming by providing the programmer with higher-level communication abstractions, such as reliable byte streams, hiding the unreliable, packet-based nature of the underlying internet protocols.

Until the early 1980s, IP and TCP were typically implemented as user-level processes and programs used platform specific interfaces to make use of these implementations. As operating systems became more sophisticated, the implementations of link-layer, internet and transport protocols became integrated into the kernel and provided to application programmers through standard APIs, such as the “sockets” API introduced in release 4.2 of Berkeley UNIX [Steve97].

The use of standard programming interfaces allows programmers to easily port applications to different architectures. However, in a heterogeneous internet, programmers have to handle communication between programs running on different architectures or written in different programming languages. Transport protocols only provide data transmission services between applications, and the application must translate data structures to and from raw bytes and convert data between representations used by different architectures and languages. To address this requirement, standard presentation layer protocols, such as XDR [RFC1832], were developed to define a standard format by which primitive and structured types are converted to and from bit strings. Types are defined in a specification language that is compiled into type definitions and marshalling code in some programming language. By using a separate specification language, programs written in different languages can easily communicate structured data.
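As an illustration of what generated marshalling code does, the following sketch hand-codes the XDR encoding of a small record using Python's standard struct module. It assumes a hypothetical record type with an integer track identifier and a string title; an XDR compiler would generate equivalent functions from a type definition. XDR encodes integers as four bytes in big-endian order, and strings as a four-byte length followed by the bytes, zero-padded to a multiple of four.

```python
import struct


def pack_int(value: int) -> bytes:
    # XDR encodes a signed integer as 4 bytes, most significant byte first.
    return struct.pack(">i", value)


def pack_string(text: str) -> bytes:
    # XDR string: 4-byte length, the bytes, zero padding to a multiple of 4.
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def unpack_int(buf: bytes, offset: int):
    (value,) = struct.unpack_from(">i", buf, offset)
    return value, offset + 4


def unpack_string(buf: bytes, offset: int):
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    text = buf[start:start + length].decode("ascii")
    padded = length + (4 - length % 4) % 4
    return text, start + padded


def marshal_track(track_id: int, title: str) -> bytes:
    # Hypothetical record: an integer identifier followed by a title.
    return pack_int(track_id) + pack_string(title)


def unmarshal_track(buf: bytes):
    track_id, offset = unpack_int(buf, 0)
    title, _ = unpack_string(buf, offset)
    return track_id, title


wire = marshal_track(7, "Blue")
```

The byte string in wire is the same whatever the native byte order of the sending machine, which is what allows programs on different architectures to exchange the record.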

For programmers familiar with procedural, structured programming, building distributed programs using only presentation and transport protocols is complex. The abstraction by which local program modules are used – the procedure call – is different from that by which remote modules are used – message passing. Remote procedure call (RPC) [BN84] was invented in the early 1980s to extend the procedure call abstraction between address spaces, providing the same mechanism for invoking both local and remote services. An RPC server exposes its services through a set of procedures. These procedures are defined using a definition language that specifies the names and types of the server’s procedures. This definition is compiled into client side and server side stubs. Client side stubs appear to be local procedures, but marshal and send arguments to a remote server and then wait for and unmarshal the server’s reply. Server side stubs receive messages, unmarshal parameters, invoke the appropriate procedure of the server and then marshal and send any returned data to the client.
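The division of labour between the two kinds of stub can be shown with a short hand-written sketch. It makes several simplifying assumptions: the RecordShop service and its price procedure are hypothetical, JSON stands in for a presentation protocol, and an in-process function call stands in for the network. Real stubs would be generated from the procedure definitions.

```python
import json


class RecordShop:
    """The server's real procedures (hypothetical example service)."""

    def price(self, track_id):
        return {1: 99, 2: 149}[track_id]


def server_stub(server, request_bytes):
    # Unmarshal the request, invoke the named procedure, marshal the reply.
    request = json.loads(request_bytes.decode("utf-8"))
    procedure = getattr(server, request["procedure"])
    result = procedure(*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


class ClientStub:
    """Looks like a local object but forwards each call as a message."""

    def __init__(self, transport):
        self._transport = transport  # callable: request bytes -> reply bytes

    def price(self, track_id):
        request = json.dumps({"procedure": "price", "args": [track_id]})
        reply = self._transport(request.encode("utf-8"))
        return json.loads(reply.decode("utf-8"))["result"]


shop = RecordShop()
# An in-process function stands in for the network in this sketch.
stub = ClientStub(lambda request: server_stub(shop, request))
```

A caller invokes stub.price(2) exactly as it would invoke shop.price(2); only the stub knows that the call is carried as a message.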

RPC provided programmers with a familiar programming model for building distributed applications but had two drawbacks: it was hard to build dynamic programs, in which servers create and remove services during their lifetime, and it did not match the object oriented style of programming that was becoming prevalent. Object oriented middleware systems, developed during the late 1980s and early 1990s, have become the predominant form of distributed programming environment and procedural RPC is now used only rarely. OO middleware, such as CORBA or DCOM, makes use of current programming language features, such as polymorphism and exception handling, to provide a simpler programming model for the construction of distributed programs.


2.2. Requirements of a Middleware Platform

Although current object-oriented middleware platforms provide the programmer with a great deal of support in implementing the communication between modules of the system, they have a number of drawbacks. In particular, object oriented middleware does not provide much support for deploying components onto nodes of a network or for establishing communication paths (bindings) between those components. Because objects are defined by their provided services only and do not expose the services that they require, objects must be responsible for connecting themselves to services that are provided by other objects. This increases each object’s dependence on its context, reducing the scope for its reuse, and pollutes its implementation with irrelevant structural details.

Furthermore, objects are limited to very few forms of interaction, typically either RPC-like object invocation or asynchronous message passing. Some middleware platforms include other interaction mechanisms as ad-hoc extensions. Otherwise programmers must implement other interaction mechanisms in terms of the predominant style or in terms of lower level mechanisms, such as platform specific networking APIs, that are more complex and error prone to use.

Finally, no commercial distributed programming environment supports design-time modelling and analysis of components, forms of interaction or systems composed from these architectural parts.

The requirements of a middleware platform are:
• Component Model. A middleware platform should present the programmer with a coherent component model that defines how a system is composed from individual components of functionality. The component model should define how a component’s interface is defined in terms of the services provided and required by the component. It should also define how components are packaged for deployment and instantiation, and how instantiated components are composed by binding components that require a service to compatible services provided by other components. The component model should provide some way of managing the complexity of large systems, by hierarchical composition, for example. The component model should also support visual programming, allowing the designer to specify compositions of components with graphical tools.
• Binding. The middleware platform should support both first-party and third-party binding establishment. First-party binding involves the client component locating and establishing a connection with the service that it requires. Third-party binding involves a third party, neither the client nor the service provider, that establishes bindings between the components that require a service and those that provide the service. A middleware platform must support both static binding, creating connections between components that do not change during the lifetime of those components, and dynamic binding, allowing components to be dynamically constructed and destroyed during the lifetime of the system or dynamically start and stop communicating with services during their lifetime. The middleware should be able to establish bindings over different transport protocols, allowing the designer to select the most appropriate transport for each binding.
• Open Interaction Styles. The system must not constrain the system designer to a single or limited number of interaction styles between components. The designer should be able to select or design interaction styles appropriate to their needs, choosing between synchronous or asynchronous communication and one-to-one, many-to-one or one-to-many communication. Interaction styles should be independent of transport protocol, allowing transport functionality and quality of service to be selected independently for each binding.
• Interaction Style Specification. The designer should be able to specify new interaction styles. Specifications should not be tied to any specific programming language. The specification language should support the requirements of open interaction styles, listed above. It should be possible for the programmer to compile interaction specifications into code that supports the implementation, composition and distribution of components that use those interaction styles.
• Formal Modelling. Interaction styles that are more complex than synchronous request/reply will require unambiguous definition. Formal modelling of interaction protocols allows interaction styles to be precisely defined and supports design-time analysis of the protocols to check for errors, such as deadlock between communicating parties or buffer overflow.
• Transport Protocols. As well as selecting appropriate transport protocols for individual bindings, a middleware platform should allow individual transport capabilities, such as encryption, compression, reliability, fragmentation and reassembly, to be easily combined and make it easy to extend the available transport protocols with new functionality. This also necessitates the naming of transport protocols in terms of their components and the dynamic construction of protocols that are compatible with remote protocol stacks.
• Management and Adaption. The middleware platform should support runtime management by exposing interfaces by which middleware layers can be controlled and by announcing events when state changes and errors occur. These interfaces should be accessible to the application components, allowing runtime adaption of the middleware platform, and to external management agents.
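The Transport Protocols requirement, that individual capabilities can be combined freely, can be sketched as a stack of layers that each transform outgoing and incoming data. The sketch below is illustrative only and is not the transport framework of this thesis: the class names Compression, Checksum and Stack are hypothetical, and the layers use Python's standard zlib and struct modules.

```python
import struct
import zlib


class Compression:
    def outgoing(self, data: bytes) -> bytes:
        return zlib.compress(data)

    def incoming(self, data: bytes) -> bytes:
        return zlib.decompress(data)


class Checksum:
    """Appends a CRC-32 so that corrupted messages can be detected."""

    def outgoing(self, data: bytes) -> bytes:
        return data + struct.pack(">I", zlib.crc32(data))

    def incoming(self, data: bytes) -> bytes:
        body, (crc,) = data[:-4], struct.unpack(">I", data[-4:])
        if zlib.crc32(body) != crc:
            raise ValueError("checksum mismatch")
        return body


class Stack:
    """Composes layers; both ends must use the same layers in the same order."""

    def __init__(self, layers):
        self.layers = list(layers)

    def send(self, data: bytes) -> bytes:
        for layer in self.layers:
            data = layer.outgoing(data)
        return data

    def receive(self, data: bytes) -> bytes:
        # Undo the layers in the reverse of the order they were applied.
        for layer in reversed(self.layers):
            data = layer.incoming(data)
        return data


stack = Stack([Compression(), Checksum()])
message = b"preview " * 50
wire = stack.send(message)
```

Because receive only works when the remote end has applied the same layers in the same order, a stack must be describable in terms of its components, which is the naming requirement stated above.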

2.3. Object Oriented Middleware

OO middleware platforms define systems in terms of objects as the main unit of architectural structure. An object encapsulates state and behaviour behind an abstract interface. An object’s interface is usually specified in an interface definition language (IDL) independently from the language used to implement its behaviour. The IDL definitions are used to generate runtime mechanisms that allow object invocations to be performed between address spaces.

2.3.1. RM-ODP

RM-ODP [ODP95] is a joint ISO and ITU standard, defined during the early 1990s, to standardise architectural concepts for distributed computing. It is intended to aid the development of future standards for distributed computing platforms rather than define a standard platform itself. RM-ODP presents distributed computing systems from five viewpoints, with concepts in higher level viewpoints being mapped to one or more lower level concepts in others. Of relevance to this thesis are the Computational Viewpoint, which defines the functional abstractions of a distributed system in a distribution-transparent manner, and the Engineering Viewpoint, which describes how the system implements the computational viewpoint.

The computational viewpoint views a distributed system as a set of interacting objects. An object provides one or more interfaces through which it makes its services available to other objects in the system. RM-ODP defines three types of interface: operational, stream and signal interfaces. An operational interface defines one or more one-way and request/reply operations that can be invoked upon an object. A stream interface defines one or more flows of continuous media accepted by the object. Both operational and stream interfaces are defined in terms of signals, low level operations that correspond to local procedure calls; operations and streams are the only interactions that are supported between address spaces.

In the computational viewpoint, objects interact via bindings between their interfaces. RM-ODP does not specify who initiates the binding, thereby allowing both first-party and third-party binding to be supported by the model. Bindings are either primitive, directly connecting two interfaces, or are compound, being supported by binding objects that are themselves bound between the two interfaces. In the engineering viewpoint, a binding object corresponds to a channel, a concept that represents the communication mechanism used by the distributed system. A channel is composed of stubs, binders and protocols. Stubs perform processing that requires knowledge of application semantics, binders perform processing that does not require application semantics and can be performed on raw bit-streams, and protocols transmit the bit-streams between address spaces.

Because RM-ODP defines only concepts, it is understandably vague as to exactly how objects and systems are defined and deployed. It does not prescribe any interface definition language or any method of constructing systems and does not define its model of object behaviour and interaction with any formality. A system could be constructed by composing components using an architecture description language, or it could be constructed using first-party binding with clients finding servers via naming and trading services. Considering this generality, it is strange that the standard limits components to only two interaction styles and treats stream bindings as fundamentally different from operation bindings at the engineering level when much of the infrastructure of a distributed system - such as the transport protocol framework - is not fundamentally different for the two interaction styles.

The concepts of the RM-ODP were realised in the ANSA system [APM93]. ANSA provided an interface definition language compiler and runtime support so that object interfaces defined in the IDL could be implemented in the C programming language. Recent developments of the CORBA standard have specified features that are compatible with the concepts of RM-ODP, including naming and trading services and facilities for controlling media streams, and RM-ODP has adopted the CORBA IDL as an ISO standard. However, the main aim of RM-ODP is to provide conceptual models rather than implementable standards that can be used to build interoperable distributed systems.

2.3.2. CORBA

CORBA [OMG98] is a middleware standard defined by the Object Management Group (OMG). Implementations of the CORBA standard are available from commercial vendors, such as Orbix by Iona [Iona99] and Visibroker by Inprise [Inprise99], as free software, a list of which is published on the web by the OMG [OMG99], and as one of the standard libraries of the Java programming language [AG98]. CORBA has been used to build systems ranging from modular graphical user interfaces to large, distributed, secure, fault tolerant IT applications.

Interaction Styles. CORBA provides the programmer with an object-oriented model of distribution. Servers export services to clients as objects that conform to abstract interfaces. Object interfaces are defined using the CORBA Interface Definition Language (CORBA-IDL, or merely IDL), standardised by the OMG and ISO. CORBA-IDL supports the definition of constants and data structures as well as object interfaces. Object interfaces are defined in terms of named, typed, read-write or read-only attributes and operations that can take multiple named, typed parameters and return multiple values or throw exceptions. The CORBA standard defines mappings from IDL to various programming languages, allowing the server and clients to be written in different languages. Each mapping defines how constants, data types, interfaces and exceptions defined in IDL are represented in the target language. Interface definitions are usually compiled into client side (proxy) and server side (skeleton) marshalling code that is itself compiled and statically linked into the application.

The CORBA IDL language defines interfaces as sets of synchronous request/reply operations or asynchronous one-way messages. This has several disadvantages. Clients are forced to synchronise with servers and cannot take advantage of concurrency between the operation of the client and that of the server to improve performance. Oneway operations are not guaranteed to be issued asynchronously and the CORBA standard provides no guarantee of their reliability, so they cannot be relied upon in a portable program unless that program implements reliability itself.

Clients bind to objects and invoke their operations to make use of the service. Binding is performed by the Object Request Broker (ORB), the fundamental component of the CORBA architecture. The ORB identifies objects using object references. Binding involves passing an object reference identifying a service from the server to the client. The client uses the reference to initialise a proxy object that represents the remote object in the client’s address space. Proxy implementations are generated from IDL specifications and compiled into the client program. Operations invoked on the proxy object are routed by the ORB to the implementation of the object. Proxy objects provide distribution transparency; an object and its proxy conform to the same interface and the ORB can return a direct pointer to the object if the code performing the binding is in the same address space.
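
The relationship between object reference, proxy and servant described above might be sketched as follows. This is a toy model in Python, not a CORBA API: `ToyOrb`, `Proxy`, `Account` and the string reference "account-1" are hypothetical names, the proxy is written by hand rather than generated from IDL, and no marshalling or network transfer takes place.

```python
class Account:
    """Servant: the real implementation of the object."""

    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance


class ToyOrb:
    """Maps object references to servants and routes requests to them."""

    def __init__(self):
        self._servants = {}

    def export(self, reference, servant):
        self._servants[reference] = servant

    def invoke(self, reference, operation, args):
        # A real ORB would marshal the request and send it across the network.
        return getattr(self._servants[reference], operation)(*args)

    def bind(self, reference, local=False):
        # Co-located clients may receive the object itself instead of a proxy.
        if local:
            return self._servants[reference]
        return Proxy(self, reference)


class Proxy:
    """Client-side stand-in that conforms to the same interface as the object."""

    def __init__(self, orb, reference):
        self._orb, self._reference = orb, reference

    def deposit(self, amount):
        return self._orb.invoke(self._reference, "deposit", (amount,))


orb = ToyOrb()
orb.export("account-1", Account())
remote = orb.bind("account-1")
first = remote.deposit(50)
direct = orb.bind("account-1", local=True)
second = direct.deposit(25)
```

Because the proxy and the servant expose the same `deposit` operation, client code is the same whether it holds `remote` or `direct`, which is the distribution transparency the paragraph describes.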

A more flexible method of invoking object operations is provided by the Dynamic Invocation Interface (DII) that allows object requests to be created and issued dynamically. Requests are represented as objects with operations for setting the name and parameters of the request and retrieving the returned values or an exception. To ensure type safety when using the DII, the IDL compiler stores type information about the definitions it is compiling in a system-wide Interface Repository which can be traversed by client code to ensure that it passes the correct values as parameters of a dynamic request. The DII also allows greater control over concurrency between the client issuing the request and the object serving it. However, issuing requests through the DII is less efficient, more complex and type unsafe compared to making requests through compiled proxies.
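
The idea of a request as an object, checked against stored type information, can be sketched in Python. The names `Request`, `InterfaceRepository` and `Counter` are hypothetical and do not reproduce the CORBA DII or Interface Repository interfaces; the sketch only shows why type checking becomes the client's responsibility.

```python
class InterfaceRepository:
    """Stores operation signatures: operation name -> list of parameter types."""

    def __init__(self):
        self._signatures = {}

    def register(self, operation, parameter_types):
        self._signatures[operation] = list(parameter_types)

    def lookup(self, operation):
        return self._signatures[operation]


class Request:
    """A request built at run time rather than through a compiled proxy."""

    def __init__(self, target, operation):
        self.target, self.operation = target, operation
        self.arguments = []
        self.result = None
        self.exception = None

    def add_argument(self, value):
        self.arguments.append(value)

    def check(self, repository):
        # The client must perform this check itself; nothing enforces it.
        expected = repository.lookup(self.operation)
        return (len(expected) == len(self.arguments) and
                all(isinstance(v, t) for v, t in zip(self.arguments, expected)))

    def invoke(self):
        try:
            self.result = getattr(self.target, self.operation)(*self.arguments)
        except Exception as error:
            self.exception = error


class Counter:
    def __init__(self):
        self.value = 0

    def add(self, amount):
        self.value += amount
        return self.value


repository = InterfaceRepository()
repository.register("add", [int])

good = Request(Counter(), "add")
good.add_argument(5)
good_ok = good.check(repository)
good.invoke()

bad = Request(Counter(), "add")
bad.add_argument("five")
bad_ok = bad.check(repository)
bad.invoke()  # issued anyway: the error only surfaces at invocation time
```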

Until version 2.2 of the CORBA standard, only the client-side interface of the ORB was standardised: programs that made object requests over the ORB were portable between ORB implementations but programs that implemented CORBA objects were not. CORBA 2.2 introduced the notion of the portable object adapter (POA) that makes object implementations available via the ORB. The POA encapsulates the mechanisms by which the ORB identifies an object and routes operation requests to the object separately from the implementation of the object itself, which is known as the servant. A server can create multiple POAs, organised hierarchically, to manage the namespace of objects within the server. The ORB interacts with each POA through standard interfaces allowing custom POAs to be implemented for different implementation strategies. For example, a POA that interfaces with a database would identify objects by the key that indexes the record containing their state and would load that state from the database on reception of a request for the object. The POA specification defines default POA implementations that link the POA and the servant by delegation or inheritance.
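
The database example above can be made concrete with a small Python sketch. `DatabaseAdapter` and `AccountServant` are hypothetical names, the "database" is a dictionary, and the code does not follow the POA interfaces defined by the standard; it only illustrates the separation between identifying an object and implementing it.

```python
class AccountServant:
    """Servant: holds the state of one object while it is active."""

    def __init__(self, balance):
        self.balance = balance

    def get_balance(self):
        return self.balance


class DatabaseAdapter:
    """Toy object adapter that identifies objects by database key."""

    def __init__(self, database):
        self._database = database      # key -> stored state
        self._active = {}              # key -> servant already loaded
        self.loads = 0

    def dispatch(self, key, operation, *args):
        servant = self._active.get(key)
        if servant is None:
            # Load the object's state on the first request for it.
            servant = AccountServant(self._database[key])
            self._active[key] = servant
            self.loads += 1
        return getattr(servant, operation)(*args)


adapter = DatabaseAdapter({"acc-1": 120, "acc-2": 75})
first = adapter.dispatch("acc-1", "get_balance")
again = adapter.dispatch("acc-1", "get_balance")
```

The servant class contains no knowledge of keys or loading policy, so a different adapter could manage the same servants with a different activation strategy.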

Interceptors. An object’s reference encapsulates the transport and presentation protocols used to communicate with the object. These details are encapsulated by the ORB and not made available to the programmer through standard APIs. CORBA 2.2 introduced the concept of the interceptor that can be used to extend or replace the protocols used to transfer operation requests. Two forms of interceptor are supported, corresponding to where they are inserted in the binding and the form of processing they perform. Request interceptors process unmarshalled requests: they can be inserted at the client to process requests after they have been issued but before they have been marshalled, or at the server to process requests after they have been unmarshalled but before they are invoked on the servant. A request interceptor processes DII request and reply objects containing the information about the request or reply including the target object, operation name and arguments. Message interceptors process marshalled requests which are sent between address spaces: they can be inserted at the client after marshalling but before transmission, or at the server after reception but before unmarshalling. Message interceptors are used to perform low-level transformations of the marshalled data, such as compression or encryption.

Figure 2. CORBA interceptors (at both the client and the object, requests and replies pass through request interceptors, marshalling and message interceptors on their way to and from the ORB core)

The interceptors that are required to communicate with an object are recorded in an ORB-specific manner in the object reference of the object. The interceptors that are available are also ORB or application specific and the interceptor specification does not provide a standard way of querying or loading the interceptors required for a binding or checking the compatibility of interceptors at either end of a binding.
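
The two interception points can be sketched in Python as follows. The function and class names are hypothetical, JSON stands in for marshalling and zlib for a message-level transformation; none of this reproduces the CORBA interceptor interfaces. Note that the sketch works only because both ends are configured with matching message interceptors, which is exactly the compatibility that the specification gives no standard way to check.

```python
import json
import zlib


class OperationLog:
    """Request interceptor: inspects the unmarshalled request."""

    def __init__(self):
        self.operations = []

    def process(self, request):
        self.operations.append(request["operation"])
        return request


class Compression:
    """Message interceptor: transforms the marshalled bytes."""

    def send(self, data):
        return zlib.compress(data)

    def receive(self, data):
        return zlib.decompress(data)


def client_send(request, request_interceptors, message_interceptors):
    for interceptor in request_interceptors:
        request = interceptor.process(request)
    data = json.dumps(request).encode("utf-8")          # marshalling
    for interceptor in message_interceptors:
        data = interceptor.send(data)
    return data


def server_receive(data, request_interceptors, message_interceptors):
    for interceptor in reversed(message_interceptors):
        data = interceptor.receive(data)
    request = json.loads(data.decode("utf-8"))          # unmarshalling
    for interceptor in request_interceptors:
        request = interceptor.process(request)
    return request


client_log, server_log = OperationLog(), OperationLog()
request = {"target": "account-1", "operation": "deposit", "arguments": [100]}
wire = client_send(request, [client_log], [Compression()])
delivered = server_receive(wire, [server_log], [Compression()])
```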

Services. In addition to the low-level binding and communication services provided by the ORB, the CORBA standard also specifies a number of higher-level interaction mechanisms and services that are implemented in terms of CORBA objects. These services include naming, trading, lifetime management and migration, concurrency control, transaction management and event dissemination. The CORBA standard also defines domain-specific services, known as facilities, that provide object models and services for particular application domains, such as financial services or health-care.

Evaluation. A major drawback of the CORBA environment is that there is no standard component model by which a system can be constructed in a compositional manner. Because CORBA does not have the concept of an object’s required interfaces, CORBA applications must be constructed using a first-party binding model: it is up to clients to get references for and bind to any other objects that they need to communicate with. This complicates the code of the clients with large amounts of boiler-plate code that is tedious and error prone to write and increases the amount of testing required. This also means that there is no way of visualising or analysing the architecture of a CORBA-based system, either at design time or at run time via management tools.

A request for proposals (RFP) was issued for a component model that would be standardised as part of CORBA 3.0. A proposal has been accepted but has not yet been completed. The component model, “CORBAComponents”, is similar to Enterprise Java Beans (see section 2.3.4) in that it concentrates on making it easy for components to be used with transaction monitors and database systems implemented by independent vendors. The CORBA 3.0 component model also extends the IDL with declarations of provided and required interfaces - it uses the terms “facets” and “receptacles” respectively - and event sources and sinks, allowing a compositional approach to software construction. Assemblies of components can be described in XML [BPS98] documents that are interpreted to instantiate and bind components; compared to an ADL, the XML document describing a configuration is excessively wordy and difficult to read. There is currently no available implementation of CORBAComponents.

Because there is no way to specify an interaction protocol as anything but a set of operations, one cannot specify the order in which operations must be called on a single interface. There is no way to define protocols in terms of two or more interfaces, again because CORBA does not have the concept of the required interfaces of an object. A request for proposals has been issued for a method of formally specifying object behaviour, but as yet there is no standard.
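
To illustrate what an interface definition leaves unsaid, the following Python sketch attaches an ordering constraint to a plain set of operations by means of an explicit state machine. The `Log` interface, the `TRANSITIONS` table and the `ProtocolChecker` wrapper are hypothetical and are not part of CORBA or of any notation discussed in this chapter; they merely show the kind of information that currently has to live in documentation or hand-written checks.

```python
class ProtocolError(Exception):
    """Raised when an operation is invoked out of order."""


class Log:
    """An interface as a bare set of operations: no ordering is stated."""

    def __init__(self):
        self.lines = []

    def open(self):
        pass

    def write(self, line):
        self.lines.append(line)

    def close(self):
        pass


# (current state, operation) -> next state
TRANSITIONS = {
    ("closed", "open"): "opened",
    ("opened", "write"): "opened",
    ("opened", "close"): "closed",
}


class ProtocolChecker:
    """Rejects invocations that the transition table does not allow."""

    def __init__(self, target, transitions, initial):
        self._target, self._transitions, self._state = target, transitions, initial

    def call(self, operation, *args):
        key = (self._state, operation)
        if key not in self._transitions:
            raise ProtocolError(f"{operation!r} not allowed in state {self._state!r}")
        result = getattr(self._target, operation)(*args)
        self._state = self._transitions[key]
        return result


log = Log()
checked = ProtocolChecker(log, TRANSITIONS, "closed")
checked.call("open")
checked.call("write", "hello")
checked.call("close")
try:
    checked.call("write", "too late")
    rejected = False
except ProtocolError:
    rejected = True
```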

Beyond the support for interceptors, there is no standard way to select a transport protocol or modify the QoS of a binding. Even interceptors are next to useless; the OMG admits that “the concept of interceptors in CORBA 2.2 is underspecified and not portable. This makes them largely useless as a mechanism for third parties to ‘plug into’ an ORB.”

2.3.3. COM

The Component Object Model (COM) [EE98] is Microsoft’s object model and middleware platform for distributed object computing and component-based development. As an object-oriented middleware platform it is similar to CORBA in concept and operation, although with a few differences.

Interaction Styles. A COM object encapsulates state and behaviour and provides one or more interfaces to other components in its environment. Interfaces are strongly typed and single inheritance can be used to extend existing interface definitions. Interface types are defined using the Microsoft Interface Definition Language (MIDL). This IDL allows more detailed type definitions than that of CORBA: data types can contain pointers and support aliasing of structures. However, object operations are always synchronous and cannot throw exceptions to report errors. Instead, a coding convention is used in which success or error indicators are returned as the result of an operation and any returned values are passed back to the caller through “out” parameters.
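
The coding convention can be imitated in Python to show its effect on calling code. Python has no “out” parameters, so a one-element list is used as a stand-in; the `Account` class and the `failed` helper are hypothetical. The two status values are the conventional COM codes S_OK and E_INVALIDARG, used here only as illustrative constants.

```python
S_OK = 0x00000000
E_INVALIDARG = 0x80070057


def failed(status):
    """A status code with its top bit set signals failure."""
    return bool(status & 0x80000000)


class Account:
    def __init__(self, balance):
        self._balance = balance

    def withdraw(self, amount, out_balance):
        """Returns a status code; the new balance is written to out_balance[0]."""
        if amount < 0 or amount > self._balance:
            return E_INVALIDARG
        self._balance -= amount
        out_balance[0] = self._balance
        return S_OK


account = Account(100)
balance = [None]                      # stand-in for an "out" parameter
status_ok = account.withdraw(30, balance)
status_bad = account.withdraw(500, balance)
```

Every call site must test the returned status explicitly; a forgotten check silently uses a stale “out” value, which is part of what makes invoking such operations tedious.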

Component Model. Unlike components in most architecture description languages, a COM object can only provide one interface of each type. All interfaces are derived from a base interface called IUnknown that supports garbage collection through reference counting and provides a method, QueryInterface, for requesting the other interfaces provided by a component. Thus COM’s use of multiple interfaces is actually a language-independent mechanism for implementing multiple interface inheritance; in CORBA this detail would be left to the language mapping, allowing the use of language-specific multiple inheritance mechanisms if supported.
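
A toy Python model of the IUnknown behaviour described above follows. It is a sketch of the idea only: `ComObject`, the interface names and the `destroyed` flag are hypothetical, and real COM identifies interfaces by GUID and returns interface pointers rather than arbitrary values.

```python
class ComObject:
    """Toy object with IUnknown-style navigation and lifetime management."""

    def __init__(self, interfaces):
        self._interfaces = interfaces   # interface name -> implementation
        self._references = 1            # the creator holds the first reference
        self.destroyed = False

    def query_interface(self, name):
        # At most one implementation per interface type can be returned.
        implementation = self._interfaces.get(name)
        if implementation is not None:
            self.add_ref()
        return implementation

    def add_ref(self):
        self._references += 1
        return self._references

    def release(self):
        self._references -= 1
        if self._references == 0:
            self.destroyed = True
        return self._references


obj = ComObject({"IPrinter": "printer implementation"})
printer = obj.query_interface("IPrinter")   # reference count becomes 2
missing = obj.query_interface("IScanner")   # not provided: count unchanged
obj.release()                               # reference count becomes 1
still_alive = not obj.destroyed
obj.release()                               # reference count becomes 0
```

Keying the lookup on the interface type is what restricts an object to one interface of each type, and each successful query must eventually be balanced by a release.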

All COM objects are instances of a particular class. The COM runtime uses a database to map a class identifier to a binary package, dynamically linked library or executable, that can be used to create instances of the class. COM functions allow the instantiation of objects within the same address space, a different address space on the same machine or on a remote machine. When instantiating objects in a different address space, the COM runtime transparently creates client-side proxies and server-side stubs to pass invocations between processes. COM uses reference counting and keep-alive messages to determine when to destroy an object; programmers have to be very careful about increasing and decreasing reference counts and avoiding circular references. In this way, COM differs from CORBA, which layers lifetime management above the ORB and leaves garbage collection up to the application.

Unlike CORBA, COM supports different programming languages within the same address space. Object interfaces are implemented as virtual function tables, similar to those of C++. As long as a language can manipulate arrays of function pointers it can invoke the operations of a COM object, but modifications to the compiler are required to make COM objects appear to be “native” objects of the language. The MIDL language is only used to generate C or C++ code that defines the interface vtables and stubs and proxies for distribution transparency: support for other languages is meant to be generated by parsing the C code generated from the MIDL. If a language cannot support the low-level interface model, it must invoke operations by calling an object’s IDispatch interface that provides scripting languages with access to a subset of the object’s operations.

COM has no real equivalent to CORBA’s DII and DSI. The nearest equivalent is the Automation framework [MSFT97], part of ActiveX [Chap96], that defines the IDispatch interface through which scripting languages can invoke operations of an object. This interface was originally designed to support VisualBasic: operation arguments are passed to the automation interface in structures that are used in the internal implementation of the VisualBasic runtime. However, IDL can be used to define more data types than can be held in these structures, meaning that only a subset of an object’s interface is available via scripting. A disadvantage of ActiveX Automation, compared to the CORBA DII, is that all scriptable objects must implement the conversion of argument types and the dispatching of operation requests and each object can do this slightly differently. CORBA generates automatic support for the dispatching of operations from IDL definitions and leaves it up to the run-time of the scripting language to convert from language-specific data types to those of IDL.

Evaluation. COM has a number of advantages and disadvantages compared to CORBA. The types available in MIDL are more expressive than those in CORBA IDL, allowing general graph structures to be declared. However, operation definitions are more primitive: one cannot define exceptions or return types, which makes invoking operations of COM objects tedious.

COM defines how object classes are packaged for deployment, and can instantiate components in the same address space as their clients, even if written in a different language. However, the packaging reduces platform independence: COM is tightly tied to the Windows platform. Furthermore, although COM is being ported to other platforms, most COM based component frameworks, such as ActiveX, rely on definitions from the Win32 API in their public interfaces and COM has not been used to develop commercial components or component frameworks for any other platform than Win32.

Like CORBA, COM does not support the specification of interaction protocols beyond defining each individual interface as a set of methods and documenting how interfaces are to be used in conjunction in natural language.

2.3.4. Java Beans, RMI, and Enterprise Java Beans

Java Beans [Chap96] is the standard component model for the Java programming language [AG98]. Java Beans are precompiled Java classes that can be instantiated, customised by changing values of exported properties, and composed into an application within a graphical development environment, without access to the source code of the component. A Java Bean is a serialisable Java class, allowing bean instances to be written to files or transmitted across network connections, that is bundled with an introspector class that provides development tools with information about the bean so that they can wire it into an application and modify its properties. A bean, being a normal Java object, encapsulates its state behind an interface defined as a set of methods. The bean framework provides the developer with a higher-level view of the interface as being made up of operations that can be invoked on the bean, properties that represent the outwardly visible state of the bean and events that the bean uses to notify other components of changes to its state. These abstractions are presented to the graphical tool by the bean’s introspector; a default introspector is provided that uses the Java Reflection API [CLK98] to query the names of a bean’s methods and determine the names of the bean’s events, properties and operations from the methods’ names. It is not possible to define other ways for beans to interact.

Figure 3. A graphical Java Bean development tool

Java Beans components gain a lot of advantages from the features of the Java language. They are platform independent because they are executed on a virtual machine, they can be dynamically loaded across a network, and the reflective capabilities of the Java language can be used to determine their interface elements.

Java Beans have a number of limitations:

• They provide a limited number of interaction styles, all of which must be implemented by the programmer in terms of individual Java methods, rather than being represented at a higher level of abstraction. For example, to implement an event for a bean, the programmer must implement methods to add and remove event listeners and maintain a list of registered listeners within the bean.
• There is no way of specifying or analysing the protocols by which bean operations may be invoked or bean properties set.
• Because the bean interface is defined in terms of methods, the programmer cannot define new bean types purely as compositions of simpler beans. They must write code to expose features of the composite bean as methods that delegate to a method of a contained component.
• The Java Beans framework is designed to support composition of components at design time, rather than run time. Java Beans tools usually generate source code of component configurations that is then compiled. Because the bean interface is defined in terms of methods, reflection must be used to make use of dynamically instantiated components. The Reflection API is tedious to use and invoking a method by reflection is about 100 times slower than invoking the method polymorphically, because of the additional security checks that must be performed.
• The Java Beans framework is limited to the construction of systems that execute in a single address space. Distributed applications must be constructed using other technologies.
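
The first limitation, and the name-based introspection on which the framework relies, can be illustrated by analogy. The sketch below is written in Python rather than Java and does not use the Java Beans API: the `Thermometer` class and its method names are hypothetical, and the `get_` prefix convention merely mimics the way a default introspector derives properties from method names.

```python
class Thermometer:
    """Bean-like class: the event plumbing is written by hand."""

    def __init__(self):
        self._temperature = 0
        self._listeners = []

    # Event support: every bean that raises events repeats this boilerplate.
    def add_temperature_listener(self, listener):
        self._listeners.append(listener)

    def remove_temperature_listener(self, listener):
        self._listeners.remove(listener)

    # Property support: accessor names follow a convention.
    def get_temperature(self):
        return self._temperature

    def set_temperature(self, value):
        old, self._temperature = self._temperature, value
        for listener in list(self._listeners):
            listener(old, value)


changes = []


def record(old, new):
    changes.append((old, new))


bean = Thermometer()
bean.add_temperature_listener(record)
bean.set_temperature(21)
bean.remove_temperature_listener(record)
bean.set_temperature(25)

# A tool can only discover "properties" by inspecting method names.
properties = sorted(name[4:] for name in dir(Thermometer) if name.startswith("get_"))
```

Nothing in the class declares that `temperature` is a property or that a temperature event exists; both are inferred from naming, and the listener list must be reimplemented in every bean.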

Java supports a number of APIs for the development of distributed applications, including low level networking and communication APIs that provide access to the TCP/IP and UDP/IP transport protocols and the serial and parallel ports of the machine, CORBA (see section 2.3.2), Java RMI, a Java-only mechanism for remote object invocation, and Enterprise Java Beans, a component model for writing multi-tier client/server applications.

Java RMI [WRW96] provides a mechanism for invoking operations on Java objects across the network. Because of its tight integration with the Java programming language and virtual machine, RMI provides a much simpler programming model than CORBA: RMI generates proxies and stubs from Java interface definitions and Java serialisation is used to marshal parameters, return types and exceptions, so there is no need for a separate IDL. Furthermore, RMI uses Java’s capabilities for dynamic linking to load the classes of parameters or returned objects over the network, allowing clients or servers to receive objects of classes that were not known at compile time. However, compared to CORBA, RMI makes inefficient use of system resources - for example, it keeps two TCP/IP connections open for each separate binding to a remote server object - and does not provide services, such as naming, trading, and transactions, that are required in a large distributed system.

RMI provides limited support for multiple transport protocols. Each Java object that is made available over RMI can be configured with a socket factory object that creates the sockets over which it receives invocations. By default, RMI objects use TCP/IP sockets, but socket factories can implement security protocols or compression over TCP/IP or use a different transport protocol. When clients bind to a remote object over RMI, the class of the socket factory for that object can be downloaded from the server, allowing the client to construct a compatible socket. This use of socket factories allows RMI objects to be configured with monolithic protocols, but does not support the construction of protocols from reusable components.

Enterprise Java Beans (EJB) [MH98] extends the Java Beans component model to support multi-tier client/server computing in which application logic is separated from the concerns of the client-side user interface and back-end database and is executed on one or more servers. An EJB encapsulates a part of the overall application logic and is executed within an EJB “container” that manages concurrency, transactions, persistence, security and the protocol used to invoke operations of the bean. This allows application components to be portable between different transaction monitors, databases, security environments and middleware platforms.

However, EJB does not support the use of graphical tools to compose EJ beans but instead allows the developer only to configure the properties of the bean’s runtime environment provided by its container. EJ beans are therefore self-contained components that are purely reactive, reacting to operation requests from clients but not making distributed requests themselves. This simplifies the programming model for a narrow range of tasks but severely limits the application domain of EJB components. The default protocol used to access EJ beans is Java RMI over TCP/IP, but extensions to the base specification support access to a subset of EJB functionality via CORBA IIOP [OMG98], which allows clients to be written in any language.

Figure 4. An EJB deployment tool

2.3.5. Reflective Middleware

Reflection is the ability of a program to examine its own implementation and to modify that implementation in order to change its own behaviour. Reflective object-oriented languages [KRB91] implement language mechanisms, such as polymorphic dispatch and inheritance, as meta-level objects that conform to predefined interfaces, or “meta-object protocols”. Programs can modify language behaviour for specific application-level objects by replacing the standard meta-objects for those application objects with application-specific meta-objects that conform to the appropriate meta-object protocol.

The Multimedia Programming Group of the Department of Computing at Lancaster University have extended a CORBA-based middleware platform with reflective interfaces to provide a facility through which distributed programs can adapt their behaviour to react to changing levels of network and end-system resources [BC97,BCRP97,CBC98]. This reflective middleware provides application objects with an object-oriented model of the middleware environment in which they are executing. The object-model is “live”; by modifying the model the application modifies the behaviour of its middleware.

Component Model. Each application object executing on the reflective middleware platform provides a “service interface”, through which it provides services to other application objects, and a “meta-level interface” through which meta-level information about the object can be obtained. The meta-level information for an object is comprised of three meta-models, each describing separate concerns provided by the middleware platform.

The encapsulation meta-model describes the service interface of the object, exposing its methods and attributes and providing access to the interface inheritance graph. The composition meta-model represents the implementation of an object as a graph of sub-objects and the bindings between them. These bindings may be “local bindings” between objects in the same address space or “distributed bindings” between objects in different address spaces. The environment meta-model represents the execution environment provided for each object interface by the middleware platform. In a distributed environment, this includes the marshalling and unmarshalling, transport protocols, message queuing and dispatching, and thread creation and scheduling.

The objects making up each meta-model themselves have a meta-meta-model. For example, the environment meta-model has a composition meta-meta-model that exposes the composition objects that perform the marshalling or provide transport functionality. This meta-meta-model can be modified to change the composition

Binding. Bindings between objects are reflected as “open bindings” [WEH97,FGCDR98], an extension of the RM-ODP concept of explicit bindings. Binding to a remote service returns an interface to a “binding object” through which the quality of service (QoS) of the binding can be controlled. Two types of binding are supported, “operation” bindings, used to invoke object operations, and “stream” bindings, used to transmit flows of continuous media. Thus CORBA is extended with a single additional interaction style and the programmer cannot define new interaction styles.

Binding objects are composed hierarchically. Sub-objects perform processing at either end of the binding and are themselves connected by a binding object or by a “local binding” - procedure calls or some equivalent mechanism. Processing objects include media codecs, transport protocols, the network device and the network switches responsible for routing the traffic of the binding and maintaining the network QoS reserved for the binding.

The interface of the binding object is used to monitor and control the QoS provided by the binding and to alter the configuration of the binding object if QoS degrades below an acceptable level. The component objects of a binding also have control interfaces through which their operation can be modified. Control interfaces are type-specific; for example, the control interfaces of MPEG and H.263 codecs would be different, as would control interfaces of UDP/IP and AALn/ATM transport protocols. The binding object implements its control operations by calling the control operations of its component objects.

Figure 5. The structure of a hierarchical open binding (two application objects connected through their service interfaces by a binding object that is built from sub-objects and exposes a control interface)
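
The delegation from a binding object to the type-specific control interfaces of its sub-objects might look like the following Python sketch. All class names, operation names and numeric values are invented for illustration and are not taken from the Lancaster platform.

```python
class MpegCodec:
    """Sub-object with a codec-specific control interface."""

    def __init__(self):
        self.bit_rate_kbps = 4000

    def set_bit_rate(self, kbps):
        self.bit_rate_kbps = kbps


class UdpTransport:
    """Sub-object with a transport-specific control interface."""

    def __init__(self):
        self.packet_size = 1400

    def set_packet_size(self, size):
        self.packet_size = size


class MpegOverUdpBinding:
    """Binding object written for exactly this pair of sub-object types."""

    def __init__(self, codec, transport):
        self._codec, self._transport = codec, transport

    def reduce_bandwidth(self):
        # The control operation delegates to type-specific operations.
        self._codec.set_bit_rate(self._codec.bit_rate_kbps // 2)
        self._transport.set_packet_size(self._transport.packet_size // 2)


codec, transport = MpegCodec(), UdpTransport()
binding = MpegOverUdpBinding(codec, transport)
binding.reduce_bandwidth()
```

Because `reduce_bandwidth` calls `set_bit_rate` and `set_packet_size` by name, a codec or transport with a different control interface could not be substituted without writing a new binding class.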

Evaluation. Reflection and open bindings are powerful tools to support the configuration of distributed programs and their middleware at both design-time and run-time. By reflecting and manipulating its own implementation details, a program can easily modify its behaviour to react to dynamic changes in its environment.

However, the open bindings model introduces scalability problems as programs grow in size and complexity. Because the control interfaces of sub-objects are type-specific, a different binding object implementation is required for each combination of sub-objects. As new codecs and transport protocols are introduced the number of binding object types required increases dramatically. It is also practically impossible to reconfigure a binding object because the binding object must have prior knowledge of its sub-objects; one cannot replace a sub-object that cannot achieve the desired QoS with an object that can if that object is not understood by the binding. These disadvantages could be avoided by separating the various concerns encapsulated by the meta-objects of the binding - codec or marshalling, transport and network protocols - and defining standard interfaces through which each particular concern could be controlled.

Reflection has many of the same disadvantages as first-party binding: the functional aspects of the application are intermixed with code that performs binding and meta-level manipulations. Attempting to simplify code by separating functional and meta-level concerns has led to the development of architecture description languages, that allow the programmer to describe bindings between components declaratively, and aspect oriented programming (see section 2.6) that uses multiple languages to specify the functional and various meta-level aspects of a program, and translates meta-level descriptions into operations performed by a reflective implementation.

2.4. Architecture Description Languages and Connectors

Object oriented systems provide programmers with support for encapsulation, abstraction and polymorphism. However, they provide little support for system composition and evolution because system structure is hidden within the implementations of the objects of the system. Component based systems are constructed from components that encapsulate the functional aspects of the system but do not encapsulate any of the structural aspects. The system structure is made explicit by defining the bindings between components separately from the implementations of those components, often in a separate architecture description language (ADL). An explicit architecture definition simplifies the implementation of individual components by allowing component developers to concentrate only on the algorithmic details of the implementation. An architecture description can also structure formal models of the system for design and analysis, and can act as a representation of the structure of a deployed system, providing a framework for the management and evolution of the system. A great many ADLs and component models have been designed for research projects and commercial products. In this section we review those that are influential or representative of the current state of the art.

2.4.1. Polylith

Polylith [Purt94] was a middleware platform developed at the University of Maryland during the late 1980s to support component based development of distributed systems. A Polylith component, known as a module, both provides services to and requires services from other modules. Polylith uses a module interconnection language (MIL) to define the interfaces of modules in terms of named, typed service provisions and requirements. The MIL is used to define systems as compositions of components and the bindings between their interfaces.

A service of a module is a procedure entry point. However, modules do not directly invoke the services of other modules. Instead they invoke services by making calls to a software bus, a layer of software that is responsible for transporting invocations between components. The software bus encapsulates details of the way in which modules communicate separately from the functional aspects of the system encapsulated within modules. For example, a bus may encapsulate how modules written in different languages communicate within the same address space or how modules on different nodes communicate over a network. As such, a software bus is analogous to the CORBA ORB. However the bus makes use of the structural information in the MIL description of the system to optimise the communication paths between communicating modules.
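The indirection through a software bus can be illustrated with a minimal in-process sketch. This is not Polylith's API: the class SoftwareBus, its methods and the module names are hypothetical, and the sketch ignores language and network boundaries. It shows only that a module invokes a named requirement on the bus and the bus uses the declared bindings to find the providing module.

```python
from typing import Callable, Dict, Tuple

Endpoint = Tuple[str, str]  # (module name, service or requirement name)


class SoftwareBus:
    """Hypothetical bus: routes invocations according to declared bindings."""

    def __init__(self) -> None:
        self._services: Dict[Endpoint, Callable] = {}
        self._bindings: Dict[Endpoint, Endpoint] = {}

    def provide(self, module: str, service: str, entry_point: Callable) -> None:
        # A provided service is a procedure entry point registered with the bus.
        self._services[(module, service)] = entry_point

    def bind(self, client: str, requirement: str, server: str, service: str) -> None:
        # Bindings come from the structural description, not from module code.
        self._bindings[(client, requirement)] = (server, service)

    def invoke(self, client: str, requirement: str, *args):
        # The calling module names only its own requirement.
        target = self._bindings[(client, requirement)]
        return self._services[target](*args)


bus = SoftwareBus()
bus.provide("sorter", "sort", lambda items: sorted(items))
bus.bind("ui", "sort_items", "sorter", "sort")
result = bus.invoke("ui", "sort_items", [3, 1, 2])
```

Because the "ui" module never names "sorter", the binding can be changed in the structural description without touching either module.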

The functionality of a software bus is tied closely to that of a respective packager that encapsulates how modules are compiled and linked. The packager generates information that is used by the bus to establish bindings. For example, the packager might generate stubs to flatten data structures into network messages or coerce data between the representations used by different languages.


Polylith has the advantage of cleanly decomposing a system into components and explicitly specifying the architecture of the system in terms of instantiated components and the bindings between their interface elements. However, components are limited to only two forms of interaction: synchronous (remote) procedure call and asynchronous message passing; other interaction patterns must be built in terms of these primitives. The ADL does not allow bindings to be annotated with the required quality of service or transport protocol, forcing all bindings to use the same transport protocol whether suitable or not. Finally, there is no support for design time modelling and analysis of systems.

2.4.2. Darwin/Regis

Darwin [MDEK95] is an architecture description language that is used to define the structure of systems implemented in the Regis distributed programming environment. However, the Darwin language strictly separates the structural concerns of the system from those to do with computation and communication, and is therefore independent of any one runtime system. Darwin has also been used to configure distributed components written for the CORBA and AnsaWare platforms, parallel programs, C++ objects within a single address space and C modules interacting by procedure calls.

Component Model. Darwin defines program structure in terms of component instances and bindings between their interfaces. Component interfaces are comprised of the services they provide to and those they require to be provided by other components in their context. Services are typed, but the Darwin language does not define, or limit, service types; it is up to the compiler to interpret types and check type safety of bindings.

Darwin manages structural complexity through the definition of composite components, the implementation of which is defined in terms of component instances and bindings between them. Programs are thus structured as a hierarchy of components with the root composite component representing the entire program, leaf components representing primitive components that encapsulate the computational concerns of the program, and mid-level composite components encapsulating the structural concerns of the program. The semantics of Darwin is defined such that composite components exist only during elaboration of the program’s structure: after elaboration primitive component interfaces are bound directly so that communication does not have to be routed through composite components.
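The effect of elaboration on bindings can be sketched as follows. This is an illustration only, not Darwin's elaboration algorithm (which is concurrent and defined in the π-calculus): the representation of ports as (instance path, service name) pairs and the function names resolve and flatten are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Port = Tuple[str, str]  # (component instance path, service name)


def resolve(port: Port, exports: Dict[Port, Port]) -> Port:
    """Follow a composite's interface mapping down to a primitive port."""
    while port in exports:
        port = exports[port]
    return port


def flatten(bindings: List[Tuple[Port, Port]],
            exports: Dict[Port, Port]) -> List[Tuple[Port, Port]]:
    """Rewrite every binding so that both ends name primitive components."""
    return [(resolve(req, exports), resolve(prov, exports))
            for req, prov in bindings]


# A hypothetical composite "stage" exposes the ports of two inner filters.
exports = {
    ("stage", "in"): ("stage.f1", "in"),
    ("stage", "out"): ("stage.f2", "out"),
}
bindings = [
    (("s", "out"), ("stage", "in")),
    (("stage.f1", "out"), ("stage.f2", "in")),
    (("stage", "out"), ("t", "in")),
]
direct = flatten(bindings, exports)
```

After flattening, no binding mentions the composite "stage", so no invocation is routed through it at run time.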

Darwin has some support for dynamic architectures. Components can contain arrays of subcomponents, the size of which can be defined as a function of the parameters of the component. More dynamism is provided by support for lazy and “worker” instantiation. Lazy instantiation of a component is performed in response to the first invocation on one of its services. Worker instantiation is provided by a special form of service, the only service type defined by the Darwin language, that can be invoked to create new component instances. How this service is implemented and invoked is left up to the runtime environment.
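Lazy instantiation can be pictured as a proxy that creates the component on first use. The following sketch is an assumption about how such behaviour could be written in Python; it is not how Darwin or any of its runtimes implement it, and LazyComponent and Logger are hypothetical names.

```python
class LazyComponent:
    """Hypothetical proxy that instantiates a component on first invocation."""

    def __init__(self, factory):
        self._factory = factory
        self._instance = None

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself,
        # that is, for the services of the wrapped component.
        if self._instance is None:
            self._instance = self._factory()
        return getattr(self._instance, name)


created = []


class Logger:
    def __init__(self):
        created.append("Logger")

    def log(self, message):
        return f"logged: {message}"


logger = LazyComponent(Logger)   # nothing is instantiated yet
first = logger.log("start")      # first invocation triggers instantiation
second = logger.log("again")     # later invocations reuse the instance
```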


Darwin has both a textual and graphical syntax. The graphical syntax is used by CASE environments [NKM96] and tools for runtime configuration management [FS97]. The CASE tools support simultaneous editing in both representations: it was found that using the textual syntax was easier when defining complex architectures. The graphical syntax represents component instances as rectangles, component types as rounded rectangles, provided services as filled circles, required services as hollow circles and bindings as lines joining a required service to a provided service. Names and types are represented as textual annotations. See the example in figure 6.

    component Source { require out : Pipe; }
    component Filter { provide in : Pipe; require out : Pipe; }
    component Sink { provide in : Pipe; }

    component System {
        inst s : Source;
        inst f : Filter;
        inst t : Sink;

        bind s.out -- f.in;
        bind f.out -- t.in;
    }

[Graphical form: a composite System containing the instances s : Source, f : Filter and t : Sink, with s.out bound to f.in and f.out bound to t.in; all services are of type Pipe.]

Figure 6. A simple filter pipeline in Darwin’s graphical and textual syntaxes

Modelling and Analysis. Darwin has a formal semantics, defined in the π-calculus [MPW92], that precisely defines the concurrent algorithm by which Darwin elaborates configurations of components. Darwin can also be used in the TRACTA method of modelling and analysis [Gian95,GJS96] to compose models of systems from models of components and their bindings. The behaviour of primitive components is modelled as labelled transition systems, expressed in FSP [MK99], a process calculus. Composition and binding statements in the ADL are translated into FSP relabelling and hiding operations that define how components synchronise on shared actions and how actions of the subcomponents of a composite are hidden from that composite’s siblings. This approach models component interactions as synchronous events shared by two or more processes: all processes perform the shared transition in lock-step. This model is appropriate for procedure calls between objects within the same address space or synchronous, reliable method invocations between distributed objects. However, modelling of the various interaction and transport protocols used by the Regis system is left up to the designer. There is no standard library of protocol models or tool support for generating such models.
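Synchronisation on shared actions can be made concrete with a small sketch of parallel composition over two labelled transition systems. It is a simplification written for this illustration, not FSP or the TRACTA tools: each LTS is assumed to be deterministic and is represented as a tuple of initial state, alphabet and transition map, and the names compose and deadlocks are hypothetical.

```python
from typing import Dict, Set, Tuple

# (initial state, alphabet, transitions keyed by (state, action))
LTS = Tuple[str, Set[str], Dict[Tuple[str, str], str]]


def compose(p: LTS, q: LTS):
    """Explore the reachable states of p || q, synchronising on shared actions."""
    (p0, pa, pt), (q0, qa, qt) = p, q
    shared = pa & qa
    seen = {(p0, q0)}
    frontier = [(p0, q0)]
    transitions = {}
    while frontier:
        ps, qs = frontier.pop()
        for action in pa | qa:
            if action in shared:
                # Lock-step: both processes must be able to perform the action.
                if (ps, action) not in pt or (qs, action) not in qt:
                    continue
                nxt = (pt[(ps, action)], qt[(qs, action)])
            elif action in pa:
                if (ps, action) not in pt:
                    continue
                nxt = (pt[(ps, action)], qs)
            else:
                if (qs, action) not in qt:
                    continue
                nxt = (ps, qt[(qs, action)])
            transitions[((ps, qs), action)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen, transitions


def deadlocks(states, transitions):
    """Reachable composite states with no outgoing transition."""
    live = {state for (state, _action) in transitions}
    return states - live


actions = {"call", "reply"}
client = ("idle", actions, {("idle", "call"): "waiting",
                            ("waiting", "reply"): "idle"})
server = ("ready", actions, {("ready", "call"): "busy",
                             ("busy", "reply"): "ready"})
# A faulty server that accepts a call but never replies.
faulty_server = ("ready", actions, {("ready", "call"): "busy"})
```

Composing client with server yields no deadlocked state, whereas composing client with faulty_server leaves the pair stuck in ("waiting", "busy"), which is the kind of error such analysis is intended to expose at design time.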

Runtime Support. Regis is a C++ framework for building distributed programs, the configuration of which can be specified in Darwin. Primitive components are active objects that communicate through interaction objects. Interaction objects correspond to Darwin’s provided and required services and the code to establish bindings between them is generated from the Darwin specification. Unlike most middleware platforms, Regis allows the programmer to use multiple types of interaction between components, such as message ports, request-reply or one-to-many event dissemination, and to define new interaction styles as necessary.

The first version of Regis [MDK94,CT94] supported interaction styles with two template classes, one implementing the service side of the interaction and one implementing the client side and reference. By convention, the client-side interaction class was named after the service side class with the suffix “ref”. For example, portref objects were used to access port services. “Ref” objects could be passed in messages to create dynamic binding structures, and so doubled as both references and the programming interface to the interaction protocol. Regis provided a basic port interaction type that implemented synchronous and asynchronous message passing. Other interaction abstractions were implemented in terms of ports by a mixture of inheritance and aggregation. Implementation of new interaction types in this way was complex and the use of interaction objects inconsistent. For example, the client-side “ref” objects of some interaction types contained local state, such as message queues, and had to be cast to and from portref values when transmitted in messages. Furthermore, Regis only supported a single transport protocol and did not allow the selection of different levels of QoS appropriate for each binding.

A second version of Regis [CMP95,PC96,Crane96,Crane97] was developed to overcome some of the deficiencies of the original platform. The framework for interaction objects was redesigned so that it was simple to define new interaction styles. Interaction classes were separated from the transport protocol, allowing programmers to select the most appropriate transport for each service. The concept of a reference was separated from that of a client-side endpoint and identified the protocol used to communicate with the service. Transport protocol components were dynamically loaded, allowing client and server programs to be evolved independently and make use of new protocol implementations as they became available.
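The separation of interaction classes from the transport protocol can be sketched in a few lines. Regis itself is a C++ framework; the Python below is only an illustration of the design idea, and the names Transport, LoopbackTransport, Port and PortRef are assumptions rather than the Regis class names or interfaces.

```python
from abc import ABC, abstractmethod
from collections import deque


class Transport(ABC):
    """Hypothetical transport interface, independent of any interaction style."""

    @abstractmethod
    def send(self, payload: bytes) -> None: ...

    @abstractmethod
    def receive(self) -> bytes: ...


class LoopbackTransport(Transport):
    """In-memory stand-in for a real network transport."""

    def __init__(self) -> None:
        self._queue = deque()

    def send(self, payload: bytes) -> None:
        self._queue.append(payload)

    def receive(self) -> bytes:
        return self._queue.popleft()


class PortRef:
    """Client endpoint of a message port; it is handed a transport to use."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def send(self, text: str) -> None:
        self._transport.send(text.encode("utf-8"))


class Port:
    """Service endpoint of a message port."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def receive(self) -> str:
        return self._transport.receive().decode("utf-8")


transport = LoopbackTransport()
PortRef(transport).send("hello")
received = Port(transport).receive()
```

Because the port classes depend only on the Transport interface, a different transport can be chosen for each binding without changing the interaction code.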

Interaction Styles. The new Regis interaction framework defines each interaction type as a set of related classes that implement the service and client endpoints, distribution transparency and reference format of the interaction protocol. These classes inherit from base classes defined by the Regis framework that define the interfaces for transport communication endpoints or bindable objects. They also follow a naming convention so that the C++ template functions that perform binding can instantiate and manipulate the appropriate classes with full type safety.

Using the new interaction framework, it is much easier for a programmer to implement a new interaction type. However, Regis still has a number of deficiencies. The details of the interaction protocol are completely encapsulated within the implementation classes for the protocol. Because an interaction protocol has no specification beyond the code it is difficult to reimplement the protocol in another language. The lack of an interaction specification from which marshalling code can be generated makes it difficult to transmit complex data structures: Regis only supports the transmission of flat records that do not contain pointers and the programmer must implement functions to translate the fields of the record between representations used by different architectures.


Evaluation. The Darwin/Regis system has many advantages over current middleware systems. Darwin is used to explicitly define system architecture, freeing the programmer from the need to write code to deploy and bind components. The set of potential interaction styles is completely open - programmers can define their own by extending base classes provided by the Regis system - and the transport protocol used for communication can be selected on a binding-by-binding basis and composed from primitive protocol components.

However, the system does not provide enough support for implementing new interaction styles; the programmer must implement the code to marshal complex data structures and convert between representations used on different systems. The TRACTA approach to modelling component interactions as synchronous events is appropriate for procedure calls between objects within the same address space or synchronous, reliable method invocations between distributed objects. However, modelling of the various interaction and transport protocols used by the Regis system is left up to the designer. There is no standard library of protocol models or tool support for generating such models.

2.4.3. OLAN

OLAN [BBMV98a, BBMV98b, MBVR97] is a component-based programming environment for the development of distributed systems. It is aimed specifically at building systems that integrate heterogeneous legacy software components and deploy them across a distributed network of processing nodes. Systems are constructed from components and connectors: components encapsulate the algorithmic aspects of the system by wrapping legacy software and connectors manage communication between components.

Component Model. An OLAN component is an object that encapsulates state and behaviour behind a strict interface, defined using the OLAN Interface Definition Language (OIL). As in the Darwin language, a component’s interface is comprised of services provided to other components and those required by the component to be provided by others. Unlike Darwin, which does not prescribe the service types used by components, the only services that can be provided or required by an OLAN component are object operations; operations cannot be “bundled” into higher level abstractions. A component’s interface exposes some information about how the component implements the operations at its interface: both provided and required operations must be marked as synchronous or asynchronous, indicating whether threads expect to wait for a reply or for the operation to complete.

An OLAN system is defined using the OLAN Configuration Language (OCL) as a hierarchy of components in which the root component defines the entire system, intermediate components are themselves composed of components, and leaf components are primitive, wrapping legacy code. The OCL “implementation” declaration is used to define composite components in terms of instantiated sub-components and the bindings between their interfaces via connectors. Like Darwin, an OLAN composite component can expose provided and required services of its subcomponents at its own interface. However, unlike Darwin, OCL requires connectors to be specified between the interface element of the subcomponent and that at the interface of the composite component. When a composite component is deployed these connectors remain in the system and communication between the components at the ends of the bindings is routed through the tree of connectors. Compared to Darwin, which flattens composite component structures at runtime and creates direct bindings between the interfaces of subcomponents of different composite components, the use of composite OLAN components results in unavoidable communication overheads that increase with greater use of hierarchical composition.

OCL ensures that bindings between components are type safe and that bindings are only established between provided and required services with compatible synchronisation constraints. The synchronisation constraints also constrain the connection patterns of bindings: many synchronous requirements can be bound to the same synchronous service but each synchronous requirement can only be bound to one service; conversely an asynchronous requirement can be bound to multiple asynchronous services, but an asynchronous service can only be the target of a single binding. However, both type and synchronisation constraints can be overridden by specific connector types.
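The connection-pattern rules in the preceding paragraph can be expressed as a small checker. This is a sketch of the rules as stated there, not of the OCL compiler: the function check_bindings, the "sync"/"async" labels and the component names are assumptions, and overriding by specific connector types is not modelled.

```python
from collections import Counter
from typing import Dict, List, Tuple


def check_bindings(requirements: Dict[str, str],
                   services: Dict[str, str],
                   bindings: List[Tuple[str, str]]) -> List[str]:
    """Return a list of violations; an empty list means the bindings are valid.

    requirements and services map names to "sync" or "async";
    each binding is a (requirement, service) pair.
    """
    errors = []
    for req, svc in bindings:
        if requirements[req] != services[svc]:
            errors.append(f"{req} -> {svc}: synchronisation mismatch")
    per_requirement = Counter(req for req, _svc in bindings)
    per_service = Counter(svc for _req, svc in bindings)
    for req, count in per_requirement.items():
        # A synchronous requirement may be bound to only one service.
        if requirements[req] == "sync" and count > 1:
            errors.append(f"{req}: synchronous requirement bound to {count} services")
    for svc, count in per_service.items():
        # An asynchronous service may be the target of only one binding.
        if services[svc] == "async" and count > 1:
            errors.append(f"{svc}: asynchronous service targeted by {count} bindings")
    return errors


requirements = {"a.lookup": "sync", "b.lookup": "sync", "c.notify": "async"}
services = {"db.query": "sync", "log.event": "async", "ui.event": "async"}

valid = [("a.lookup", "db.query"), ("b.lookup", "db.query"),
         ("c.notify", "log.event"), ("c.notify", "ui.event")]
invalid = [("a.lookup", "db.query"), ("a.lookup", "log.event")]
```

In the valid configuration two synchronous requirements share one synchronous service and one asynchronous requirement fans out to two services; the invalid configuration is rejected twice, once for the mode mismatch and once for the doubly bound synchronous requirement.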

OLAN primitive components are implemented as legacy software wrapped within packaging code and data structures that implement the OLAN component model and route invocations between the component’s interface and the programming interface of the legacy software. This packaging is generated from OCL “primitive implementation” declarations that describe the mapping between the component’s interface and the programming interface to the legacy software. These descriptions are programming-language specific: primitive components can be constructed from legacy software written in C, C++, Java and Python.

Connectors. OLAN connectors are responsible for transporting invocations between components. Connectors can also perform computation on the data passed through them, including coercing parameter types to allow binding between incompatible services, converting between different presentation-layer formats, calculating new argument values and routing invocations based on their argument values. An important use of connectors is to mediate interaction between components with some application-layer protocol, such as a floor-passing mechanism in a groupware application.

In the same manner as primitive OLAN components are used to adapt legacy code to the OLAN component model, so OLAN connectors are used to adapt legacy communication mechanisms, such as shared memory, TCP/IP connections or RPC calls, for use between OLAN components. A connector is separated into an adapter object and sender and receiver objects. The adapter interfaces with the component packaging generated from OCL definitions, performs data type and control flow translations and can execute arbitrary user-defined computation on the data flowing through it. When a connector is used to bind components in the same address space, the adapter is connected directly between the components. However, when a connector is used between components in different address spaces, an adapter is instantiated for each component being bound and sender and receiver objects are instantiated to act as proxies for remote adapters, routing calls to the adapter across the communication mechanism used by the connector.

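[Figure 7 diagram. Collocated components: the connector code is a single Adapter placed between the two Components. Distributed components: each Component has its own Adapter, and a Sender and a Receiver carry calls between the Adapters over the communication mechanism.]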

Figure 7. OLAN connector objects between collocated and distributed components

OCL system descriptions can also include management declarations that describe how components are deployed as a set of name/value pairs. Management attributes include the host on which to execute a component, the operating system required and the user name under which a component is to be run. Further control over deployment is provided by distribution policies which specify constraints over the execution of components, such as whether components must be collocated or executed on machines within the same domain.

Evaluation. The main drawbacks of the OLAN system are: the poorly delineated conceptual model of components vs. connectors, the inflexible connector model and the inefficient implementation. The simple model of components encapsulating algorithmic concerns and connectors mediating communication between components is confused by the ability to encapsulate arbitrarily complex computation within connectors - it is no longer obvious where computation is being performed within the system.

Although the concerns of adapters and proxies are separated within the implementation of connectors for reasons of efficiency, this separation is not made visible to components: a connector type cannot be instantiated with a different transport protocol, for example, in order to select an appropriate QoS. Neither can the parameters controlling the communication mechanism of the connector be controlled by the application.

The OLAN component packaging and connectors are implemented in Python [Rossum95], an interpreted scripting language, which reduces performance. Also the way that bindings between subcomponents are routed through connectors up and then back down the component hierarchy reduces the performance of communication between subcomponents of different composites.


2.4.4. UniCon

UniCon [SDKR95, SDZ96] is an architecture description language that focuses on building systems from existing architectural parts - components and connectors - and supporting architectural styles that are already in use. UniCon components are pieces of software encapsulated behind strict interfaces and connectors define protocols by which components interact, such as procedure calls, shared variables, UNIX pipes or RPC invocations. An architectural style is defined by the potential set of components and connectors that may be instantiated as part of the system and constraints on the topology in which they may be connected.

A component’s interface is comprised of the component’s type, such as a C library or UNIX process, properties that specialise the component type and “players”, points at which the component interacts with connectors. A connector defines a protocol, specified by the connector type, properties that specialise the type and the points, known as “roles”, through which components may interact according to the protocol. Both component players and connector roles are typed and may have properties that specialise their types.

A component may be primitive or composite. A primitive component is implemented as some element of software defined outside the UniCon language, such as source code in some programming language, object code in a library, a file in the file system or a process. Composite components are defined in terms of the components and connectors used, the “connections” between component players and connector roles that define how internal components interact, and “bindings” that specify how the interface of the composite component is mapped to elements of its internal configuration.

Because UniCon is designed to build systems from existing parts, it does not itself define a component model and run-time environment. Instead it makes use of component and connector types provided by the operating system on which the system is to execute, such as UNIX processes or binary object files and pipes or shared files. A noteworthy aspect of UniCon is that its set of connectors includes, in addition to those concerned with explicit intercomponent communication, indirect interactions caused by components competing for resources; an example of such a connector is a scheduler in a real-time operating system.

The UniCon compiler analyses the architecture description to ensure that it is correct: that players are connected to roles that they can fulfil and that the configuration of the implementation of a composite component fully implements its interface. If the checks are successful, the UniCon compiler generates the code for the components and connectors, the glue code to attach component players to connector roles, and scripts to compile and link the components, connectors and glue into an executable system. Like Darwin and OLAN, UniCon has both a textual and graphical representation; the graphical representation is used by tools that provide the user with a simple interface with which to specify system architectures.


Specifications can be attached to architectural parts as name/value pairs. UniCon tools do not interpret these specifications but pass them to external analysis tools. The UniCon language is being modified to support “credentials” [Shaw96], component specifications that can be incrementally evolved during the lifetime of a component, in order to better support the real-world usage of components in which new specifications are required after the component has been created or are created by analysis or synthesis of existing specifications. An important part of a component’s credential is the credibility of the specification, such as “asserted”, meaning that the specification is given by the designer and taken on faith, or “verified” meaning that the component has been mechanically verified to meet the specification.

The application domain of UniCon, that of building systems from existing, platform-specific parts, is also its main drawback. The UniCon compiler and graphical tools support a fixed set of built-in connector types. Adding new connectors requires extending the UniCon compiler, which, the designers admit, is not a trivial task, and also involves adding new icons representing the connector to the graphical tools. Implementing a new connector is performed by writing a “connector expert” for the toolset. A connector expert is comprised of data files and C source code that specify the syntactic representation of the connector type, perform semantic checks specific to the connector and can generate code and build rules to integrate the connector into a final system. A specialised compiler generator tool is used to integrate connector expert components into the UniCon compiler, raising the question as to why UniCon itself wasn’t used for the task.

UniCon is limited to the construction of systems that execute on a single machine. There is no support for distribution and connectors that are used to cross address spaces are treated as different from those used within a single address space, thereby ignoring issues of distribution transparency. For example, a player of type procedure call cannot be connected to a player of type remote procedure call. The lack of distribution support is evident in the way that connectors mix the concerns of transport, presentation and application layer issues. A pipe connector specifies the use of the UNIX pipe IPC mechanism, a transport protocol, but the players of a pipe are parameterised by a regular expression defining the format of data passed across the pipe, a presentation layer issue; however, it is impossible to specify that a procedure call or RPC can be routed across a pipe, even though it would be possible to generate presentation-layer code to marshal the procedure call into a textual format that could be understood by a pipe player.

Evaluation. UniCon provides designers with a rich component model in which components can interact in a variety of ways, allows one to annotate component definitions with formal specifications, and provides tool support for analysing specifications and generating executable code from architecture descriptions. However, the set of connector types available to the designer is limited to IPC mechanisms that are provided by the platform on which the system is to execute and supported by the UniCon compiler. UniCon does not provide the designer with any support for implementing new connector and component types: connector types are hard-coded into the UniCon compiler. A flexible way of specifying new interaction styles would be useful. The inability to build and deploy distributed systems is a disadvantage.


2.4.5. Wright, Aesop and ACME

Wright, Aesop and ACME are a set of languages and tools for specifying, analysing and constructing system architectures, developed by the ABLE project at Carnegie Mellon University.

Wright [Allen97,AG97,ADG98], like UniCon and OLAN, is an architecture description language that structures a system in terms of components and connectors with components interacting via connectors. However, unlike UniCon and OLAN, Wright is purely descriptive: it is used only for modelling and analysing software systems and cannot be used to build or deploy the system described. Wright defines the semantics of component and connector behaviour in CSP. The syntax of the ADL acts as a structure in which to organise these CSP specifications.

Wright characterises a component by the ports at its interface through which it interacts with other components via connectors. A port represents a role that the component can play in an interaction with other components. The protocol followed by the role is expressed in CSP. The behaviour of the component is also expressed in CSP in terms of the component’s interactions with its ports. Wright characterises a connector by the roles that can be played by components and a CSP specification of the protocol by which the roles communicate. Wright is open as to the set of connector types that it supports.

System specifications are built by declaring named component and connector instances and the bindings of the ports of the components to the roles of the connectors. From a system description Wright can generate a CSP model of the system and pass it to an external theorem prover for analysis. Among the system properties that can be checked in this manner are whether the system is deadlock free, whether a component’s implementation corresponds to the port protocols, whether a component’s port protocol is compatible with the connector role to which it is bound and whether any unbound ports of a component are actually required to be bound for the component to work correctly.

Aesop [GAO94, Monr96] is a toolkit to support the construction of graphical tools for the design and analysis of architectures that conform to some architectural style. An architectural style is a set of component and connector types together with rules as to how they can be composed into systems. Aesop represents an architecture as a system of objects stored in an object-oriented database. Classes are provided as part of the system to represent generic components, connectors, ports and roles. Objects representing architectural parts can be annotated with specifications that are not themselves represented as objects but are interpreted by other tools.

A style is defined by extending the base classes that represent architectural parts to implement constraints on how the parts can be connected. For example, a pipe-and-filters style is defined as the constraints that components interact only via pipes and that an output port of a component must be connected to the input role of a pipe and vice versa. Styles can also be defined by extending the classes of an existing style. For example, a pipeline architectural style can be defined by extending a pipe-and-filters style with the constraint that filters must have one input port and one output port. However, this use of inheritance is inflexible: for example, although a component in the pipeline style can be used in the pipe-and-filters style, a component from the pipe-and-filters style that happens to have a single input and output port cannot be used in the pipeline style. Styles would be better described as predicates or constraints over a configuration of components.
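The suggestion that styles be described as predicates over a configuration can be sketched as follows. This is an illustration of the idea only, not Aesop's representation: the encoding of a configuration as port counts plus a list of pipes, and the function names, are assumptions.

```python
from typing import Dict, List, Tuple

# component name -> (number of input ports, number of output ports)
Components = Dict[str, Tuple[int, int]]
# each pipe joins an output of the first component to an input of the second
Pipes = List[Tuple[str, str]]


def is_pipe_and_filter(components: Components, pipes: Pipes) -> bool:
    """Every pipe must connect two declared components."""
    return all(src in components and dst in components for src, dst in pipes)


def is_pipeline(components: Components, pipes: Pipes) -> bool:
    """Pipe-and-filter, plus every filter has one input and one output port."""
    return (is_pipe_and_filter(components, pipes)
            and all(ports == (1, 1) for ports in components.values()))


filters: Components = {"grep": (1, 1), "sort": (1, 1)}
pipes: Pipes = [("grep", "sort")]

with_tee: Components = {**filters, "tee": (1, 2)}
tee_pipes: Pipes = pipes + [("sort", "tee")]
```

Because the predicates examine the configuration rather than the class a component was declared with, a filter that happens to have one input and one output port satisfies the pipeline style without being redefined, which is the case that the inheritance-based definition excludes.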

The Aesop tools interact by making reads and writes to the architecture database and announcing events over a software bus to notify other tools that the database is modified. Aesop provides a generic graphical editor that can draw representations of architectures of any style. Other tools are style specific: depending on the style, Aesop tools can perform analysis of the model or generate code for a particular environment. For example, tools for a pipes-and-filters style can generate Unix filter programs.

Architectures described in Aesop and Wright are not directly interchangeable: that is, the Aesop toolset cannot read or write Wright architecture descriptions to be analysed by the Wright toolset. Instead, architecture descriptions are translated into an intermediate language, ACME [GAO94], that was designed as a lingua franca ADL to support the interchange of architecture information. ACME defines syntax for those architectural elements that most ADLs have in common: component types, connector types and configurations of component and connector instances. This syntax is used as a structure into which models are included. ACME itself does not interpret these specifications but the ACME parser can extract them for processing by external tools.

Evaluation. Although Wright can be used to specify and analyse architectures of components that interact in an extensible variety of ways, it has the drawback of being purely descriptive: one cannot generate the boilerplate code required to instantiate and deploy a system from a Wright description. This limitation does have some benefits: unlike UniCon, for example, it is trivial to integrate new connector types into a Wright architecture. However, designers are completely on their own when it comes to implementing them. The Aesop tools can generate the code required to instantiate and deploy a system, but cannot read an architectural description defined in Wright. The use of ACME to pass information between the Wright and Aesop tools is rather superfluous. It would be better to use a single ADL for all toolsets.

2.5. Transport Protocol Frameworks

Sophisticated distributed systems require control of the transport protocols used for the bindings between their components so that they may select appropriate reliability, security, compression, auditing and management properties. To allow the flexible combination of different transport-layer properties, the transport protocols used for a binding must be composed from primitive components. This allows, for example, the same encryption algorithms to be used with a reliable or unreliable transport. Protocol components must be dynamically loadable so that systems are not hard-coded with a limited set of protocols.

However, current object-oriented and component-based middleware systems provide only limited control over the transport-layer concerns of bindings, allowing the transport protocol over which a service is made available to be selected from a limited set when the service is initialised. Few allow the creation of new transport protocols for services or the dynamic loading of compatible transports by clients. The transport protocols supported typically provide reliable, in-order delivery of messages, which is suitable for RPC and object invocation but not for other forms of intercomponent interaction, such as streaming media. Systems that use “connectors” allow a connector to encapsulate the use of different transport protocols, but the connector is then used as a monolithic whole when constructing systems and the application-layer protocol of the connector cannot be reused with different transport protocols.

A number of object-oriented frameworks have been defined to support the implementation of transport protocol software from lightweight components.

An influential early system is STREAMS [Ritchie84], initially designed to provide greater modularity in the design of Unix device drivers. Drivers are structured as a linear chain of “modules” that encapsulate protocol processing. Each module provides a uniform interface through which messages are passed up and down the stack, and the modules execute concurrently as lightweight threads. STREAMS functionality is accessed through the standard UNIX file I/O API. User processes that open a device can perform I/O control operations (ioctls) to push STREAMS modules onto the top of the stack to modify the behaviour of the device. Because UNIX did not originally support dynamic loading of code into the kernel, the set of available modules was fixed. Later versions of the STREAMS system supported the construction of multiplexor trees and the dynamic loading of STREAMS modules into the kernel. The performance of STREAMS is limited by the execution of each layer as a separate thread.

The x-kernel [OP92] is a C library for building protocol software developed by the University of Arizona and since used in the OSF/1 Unix kernel. Protocols are implemented in terms of “layers” and “sessions”: a layer encapsulates protocol functionality and creates a session to encapsulate the state needed for each user of the layer. Layers and sessions are C data structures that provide uniform interfaces through which threads deliver messages containing application data and control information. The use of uniform interfaces allows layers to be composed into arbitrary directed graphs: the x-kernel encourages fine-grained composition in which lightweight protocol components encapsulate individual algorithms and are composed to implement transports with rich semantics. Protocol graphs are defined statically and cannot be modified at run-time. The x-kernel introduced the concept of the “virtual protocol” - a protocol layer that does not add headers to messages flowing through it but instead serves only to route messages within the local protocol graph.
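The split between layers and sessions, and the role of a virtual protocol, can be illustrated with a small sketch. The following Python is not x-kernel code (the x-kernel is a C library); the names `Layer`, `Session`, `HeaderLayer`, `VirtualRouter` and `Wire`, and the size-based routing rule, are assumptions chosen purely for illustration.

```python
class Session:
    """Per-user state created by a layer (hypothetical, not the x-kernel API)."""
    def __init__(self, layer, lower):
        self.layer = layer
        self.lower = lower      # whatever sits below: a session, router or wire
        self.sent = 0           # example of state held per user of the layer

    def push(self, msg: bytes):
        self.sent += 1
        return self.lower.push(self.layer.encode(msg))


class Layer:
    """Encapsulates protocol functionality; opens one session per user."""
    def encode(self, msg: bytes) -> bytes:
        return msg

    def open(self, lower):
        return Session(self, lower)


class HeaderLayer(Layer):
    """An ordinary protocol layer: prepends a header to each message."""
    def __init__(self, tag: bytes):
        self.tag = tag

    def encode(self, msg: bytes) -> bytes:
        return self.tag + msg


class VirtualRouter:
    """A 'virtual protocol': adds no header, only routes within the graph."""
    def __init__(self, small, large, limit):
        self.small, self.large, self.limit = small, large, limit

    def push(self, msg: bytes):
        target = self.small if len(msg) <= self.limit else self.large
        return target.push(msg)


class Wire:
    """Bottom of the graph: records what would be transmitted."""
    def __init__(self):
        self.frames = []

    def push(self, msg: bytes):
        self.frames.append(msg)
        return len(msg)


# A small protocol graph: one upper layer above a router that picks
# between two lower-level paths sharing the same wire.
wire = Wire()
datagram = HeaderLayer(b"D|").open(wire)
stream = HeaderLayer(b"S|").open(wire)
router = VirtualRouter(small=datagram, large=stream, limit=8)
top = HeaderLayer(b"RPC|").open(router)

top.push(b"hi")         # short message: routed down the datagram path
top.push(b"x" * 10)     # long message: routed down the stream path
```

Because every element exposes the same `push` operation, the router can be placed anywhere in the graph, which is the property that uniform interfaces are meant to provide.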

Further research based on the x-kernel added the concept of the “micro-protocol” to those of layers and sessions [Hiltu98]. A micro-protocol is a component of a layer that encapsulates part of a layer’s functionality and processes the messages delivered up or down to the layer. Micro-protocols are not composed into an explicit structure; instead they are added to a layer as “plug-ins” and communicate by manipulating data structures within the layer. The semantics of a layer then depends on the selection of micro-protocols that have been plugged into it. Tools are provided to help a programmer select the appropriate micro-protocols to achieve the semantics required of the layer. However, these tools do not help the programmer in constructing the protocol graph itself.

Horus [RBFH95,RBM96] is a communication system supporting group communication. Multicast semantics, such as reliability, stability and ordering, can be selected by composing protocol layers into a stack. Like the x-kernel, Horus defines a common interface for protocol layers through which threads carry messages, but enriches the interface with operations specifically supporting group communication. The protocol structures that can be constructed by Horus are limited to trees of multiplexors and Horus does not support the concept of virtual protocols responsible for routing messages within the protocol graph. The Horus protocol software executes in user space and is made available to user programs through a variety of APIs, including the UNIX sockets API and the Message Passing Interface (MPI), through an RPC library and as commands for the Tcl scripting language.

OmniOrb2 [LP98] is an implementation of the CORBA 2.0 specification by the AT&T Cambridge Research Laboratories. OmniOrb2 extends CORBA with support for multiple transport protocols. However, the transport framework is designed to support only CORBA GIOP as a higher layer protocol and does not provide a compositional approach to the construction of new transport protocols or support dynamic loading of protocols.

FlexiNet [HHD98] is a configurable middleware platform produced by APM. FlexiNet is based on Java: distributed services are made available as Java objects, and interaction with remote objects follows the semantics of Java object invocation as closely as is possible in a distributed system. Bindings to remote objects are implemented by stacks of protocol components. However, FlexiNet only supports linear stacks of protocols, and the protocol components are responsible only for presentation-layer marshalling. The base of any stack is responsible for sending messages over a transport protocol, but the transport itself is not constructed from components.

Softwired’s iBus [Softw99] is a commercial middleware platform for building Java “publish/subscribe” applications that includes a communication subsystem very similar to that of Horus. iBus applications are composed of groups of components that communicate by sending asynchronous messages to shared multicast channels; a message can be any serialised Java object. All components connected to a channel receive all messages from that channel. Interaction protocols between members of a channel must be implemented in terms of individual messages but are not defined separately from their implementation. Channels can use point-to-point UDP, IP-Multicast or TCP/IP as their transport protocol. Other protocols can be layered above these to extend the semantics of the transport, to provide reliable multicast or security for example. Protocol stacks are constructed dynamically based on information in the URLs identifying channels, but are limited to linear chains of layers.

Conduits+ [HJA95] is an object-oriented framework for the construction of network communication software in C++. Protocol software is composed from conduits that perform protocol processing on information chunks. Conduits have a uniform interface through which they pass information chunks between each other and so can be connected into arbitrary configurations. Unlike the protocol components of the x-kernel and Horus, conduits are not directional - they do not have a top and bottom - and so can be connected in any orientation. Multiplexing is implemented using two conduit types: a mux and a protocol factory. A mux performs (de)multiplexing by routing information chunks to other conduits depending on the content of each chunk. A protocol factory creates a new conduit when it receives an information chunk. A mux can be configured to pass chunks to a protocol factory when it cannot determine the destination of the chunk, thus causing the instantiation of a conduit to process the chunk. Demultiplexing (i.e. message routing) can be implemented by instantiating a mux “upside down” in the protocol graph. Unlike the x-kernel, Conduits+ is not used to compose fine-grained components; for example, the entire TCP protocol is implemented as a single conduit rather than conduits encapsulating individual reliability, flow control and multiplexing algorithms. Conduit graphs can be constructed dynamically but the set of conduit types is fixed when the application is constructed. Conduits+ has recently been ported to Java [NK98] and the conduit components implemented as Java Beans. This allows dynamic loading and configuration of conduit graphs at design time but dynamic loading of conduit classes is not supported at runtime.
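The cooperation between a mux and a protocol factory can be sketched as follows. This is an illustrative Python model, not the Conduits+ C++ API: the class names, the dictionary-based chunk format and the choice to have the factory register the new conduit with the mux are all assumptions.

```python
class Conduit:
    """Uniform interface: every conduit accepts information chunks."""
    def accept(self, chunk: dict):
        raise NotImplementedError


class Echo(Conduit):
    """Stand-in for a per-connection protocol conduit (hypothetical)."""
    def __init__(self, key):
        self.key = key
        self.seen = []

    def accept(self, chunk):
        self.seen.append(chunk["data"])


class Mux(Conduit):
    """Routes each chunk by its content; unknown keys go to the fallback."""
    def __init__(self, fallback=None):
        self.routes = {}
        self.fallback = fallback

    def accept(self, chunk):
        target = self.routes.get(chunk["key"], self.fallback)
        if target is None:
            raise KeyError(chunk["key"])
        target.accept(chunk)


class ProtocolFactory(Conduit):
    """Creates a conduit for an unroutable chunk, installs it, forwards the chunk."""
    def __init__(self, mux, make):
        self.mux, self.make = mux, make

    def accept(self, chunk):
        conduit = self.make(chunk["key"])
        self.mux.routes[chunk["key"]] = conduit   # later chunks bypass the factory
        conduit.accept(chunk)


mux = Mux()
mux.fallback = ProtocolFactory(mux, Echo)
for key, data in [("a", 1), ("b", 2), ("a", 3)]:
    mux.accept({"key": key, "data": data})
```

After the loop, the mux holds one conduit per key: the first chunk for each key passed through the factory and subsequent chunks were routed directly.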

2.6. Others

MPI [MPIF93] is an API for the construction of parallel, message-passing programs. It was originally designed for use on message-passing multiprocessors but has also been ported to networked clusters of workstations. An MPI program is structured as a set of tasks that execute in parallel and communicate by message passing. To maximise the concurrency between tasks executing on different processors and minimise overhead due to context switching between tasks on the same processor, message passing is asynchronous. Higher level interaction and synchronisation patterns between tasks must be implemented in terms of the low level message passing API calls.

Compared to MPI, object-oriented middleware forces the programmer to increase concurrency by spawning multiple threads of control within components, because objects communicate by synchronous object invocation. This approach to concurrency complicates the implementation of objects with explicit synchronisation code.

Metaobject protocols of open languages have been used to implement connectors and distribution transparency [CM93, ALP99]. In an open language, a program’s mapping onto lower level mechanisms, such as its compilation or runtime support, is controlled by objects that conform to well defined protocols. These objects are termed “metaobjects” because they model the implementation of the program rather than the program’s application domain. Programmers can replace one or more metaobjects to control the implementation of specific parts of their program. To define connectors, the metaobjects that implement method calls are replaced for specific application objects with metaobjects that implement the connectors’ mechanisms. For example, the connectors’ metaobjects might filter or redirect the method call or marshal and transmit the method invocation to a remote object. The main disadvantages of defining connectors using metaobject protocols are that connectors are limited to communication between components written in a single language and that connectors must be implemented by manipulating the lowest levels of the language’s implementation, which is complex and can introduce subtle errors.
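As a rough illustration of replacing the mechanism that implements method calls, the sketch below uses Python's attribute-lookup hook `__getattr__` as a stand-in for a metaobject protocol. It is not drawn from the cited systems: the `MarshallingConnector` class, the JSON encoding and the loopback transport are assumptions made for the example.

```python
import json


class Account:
    """Application object; it contains no distribution code."""
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance


class MarshallingConnector:
    """Stands in for the metaobject that implements method calls.

    Each call is marshalled to JSON and handed to `send`, a hypothetical
    transport hook, instead of being dispatched directly.
    """
    def __init__(self, send):
        self._send = send

    def __getattr__(self, name):
        # Invoked only for attributes not found normally, i.e. remote methods.
        def invoke(*args):
            request = json.dumps({"method": name, "args": list(args)})
            return json.loads(self._send(request))
        return invoke


def make_loopback(target):
    """Transport stub: unmarshal, invoke the real object, marshal the result."""
    def send(request):
        call = json.loads(request)
        result = getattr(target, call["method"])(*call["args"])
        return json.dumps(result)
    return send


real = Account()
remote = MarshallingConnector(make_loopback(real))
remote.deposit(10)
total = remote.deposit(5)
```

The caller uses `remote` exactly as it would use `real`, which is the distribution transparency such connectors aim for; the cost is that the connector depends on the dispatch machinery of one particular language.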

Aspect-oriented programming (AOP) [KLMM97, LK97] aims to provide the power of metaobject programming with fewer of the drawbacks and has also been used to implement connectors for distributed programs. AOP separates a program into components of functionality, object classes, and definitions of the non-functional aspects of the system, such as how data flows between distributed components and concurrency constraints. In traditional object-oriented languages, the code implementing an aspect is scattered throughout the code of the classes it affects, making it difficult to specify, understand and modify. By centralising details of each aspect in a single place, it is easier to define and modify an aspect, and therefore easier to adapt programs to use new distribution or concurrency strategies.

AOP components are written in a traditional object-oriented language, such as Java, and definitions of each aspect are written in distinct aspect languages. An “aspect weaver” compiles the functional components and aspects, building runtime support for each aspect, and integrates the components with this runtime support by connecting the components to the aspect runtimes at well defined “join points”, such as where code invokes methods, passes parameters, returns values or allocates new objects.
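The idea of weaving an aspect into a component at method-invocation join points can be approximated in a few lines. The sketch below is an assumption-laden stand-in: a real aspect weaver works from a separate aspect language at compile time, whereas this `weave` function (a hypothetical name) simply wraps methods at run time.

```python
import threading


def weave(cls, aspect, methods):
    """Wrap the named methods of cls so that aspect runs around each call."""
    for name in methods:
        original = getattr(cls, name)

        # _original is bound per iteration so each wrapper keeps its own method.
        def woven(self, *args, _original=original, **kwargs):
            return aspect(lambda: _original(self, *args, **kwargs))

        setattr(cls, name, woven)
    return cls


class Counter:
    """Functional component: no concurrency code appears here."""
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1


_lock = threading.Lock()


def mutual_exclusion(proceed):
    """Concurrency aspect: run the join point while holding a lock."""
    with _lock:
        return proceed()


# The concurrency strategy is stated once, outside the class it affects.
weave(Counter, mutual_exclusion, ["increment"])
```

Changing the concurrency strategy then means editing `mutual_exclusion` or the `weave` call only; `Counter` itself is untouched.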

AOP has the same aims as component based programming using interface and architecture definition languages, that of increasing the clarity of a system through clean separation of concerns. However, AOP is limited to a single language: the aspect languages refer to constructs of the component implementation language and hook into the metaobject protocol of that language. Although providing less flexibility, AOP removes one of the main drawbacks of metaobject programming: it is no longer necessary to explicitly manipulate the implementation details of the language; instead the programmer uses a high-level declarative notation that is translated into metaobjects or program transformations. As with the direct use of metaobject protocols, the intercomponent interaction protocols are still defined only in the implementation of each component.

Aster [IBS98] is a toolset that aims to support the systematic customisation of middleware based on an application’s non-functional requirements. Aster programs are defined, using an ADL, as a configuration of components that provide and require services. The interfaces of a component also define the non-functional requirements of the services of the component, such as reliability or transactional properties. Non-functional requirements are specified by names that refer to definitions of the property specified in first-order predicate logic. A base middleware layer is provided that can meet basic non-functional requirements; currently Aster supports CORBA or HTTP. The functionality of this layer is extended by middleware components. The non-functional properties provided by middleware components are also defined in first-order predicate logic. The Aster toolkit uses a theorem prover to determine which middleware components must be used to meet the non-functional requirements of the application, and then generates the C++ glue code to instantiate and deploy the application. Aster components can therefore only interact through synchronous object invocation, and formal models are only used to select components that mediate bindings between services in order to meet non-functional requirements, not to specify and check interaction protocols and the compatibility of bindings.

Rapide [LKAV95,LV95] is an ADL and set of specification languages for describing and simulating distributed computer systems; Rapide architectures are not translated into executable code. Rapide models a system as a set of interacting components that generate and react to events. A simulation of a Rapide architecture generates a partially ordered set of events (a “poset”); a poset represents causal relationships between events and is partially ordered in that not all events in a poset are causally related to each other. A poset generated by a simulation run can be checked for validity against a set of behavioural constraints defined by the system designer. Thus Rapide does not exhaustively check a system model against its required properties; it is up to the designer to design simulation traces that provide adequate test coverage.
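To make the notion of a poset of events concrete, the following sketch represents one simulated trace as a mapping from each event to its direct causes and checks a behavioural constraint against it. It is not Rapide syntax or tooling; the event names, the `precedes` and `concurrent` helpers and the example constraint are all hypothetical.

```python
def precedes(causes, a, b):
    """True if event a causally precedes event b (follows cause links transitively)."""
    stack, seen = [b], set()
    while stack:
        for parent in causes.get(stack.pop(), ()):
            if parent == a:
                return True
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return False


def concurrent(causes, a, b):
    """Two events are unordered when neither causally precedes the other."""
    return not precedes(causes, a, b) and not precedes(causes, b, a)


# One simulated trace: each event maps to the events that directly caused it.
trace = {
    "req1": [],
    "req2": [],
    "work1": ["req1"],
    "reply1": ["work1"],
    "reply2": ["req2"],
}


def replies_follow_requests(causes):
    """Example constraint: every reply<n> is causally preceded by req<n>."""
    return all(
        precedes(causes, "req" + event[len("reply"):], event)
        for event in causes
        if event.startswith("reply")
    )
```

Here `reply1` and `reply2` are unordered with respect to each other, which is what makes the set only partially ordered. The check applies to this single trace; as noted above, other traces of the same model may still violate the constraint.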

Finesse [BK98] is a coordination language for specifying interaction protocols between multiple objects. A protocol defines a number of roles, the events that may be emitted or received by each role, and causal and temporal relationships between events. Events are parameterised, and Finesse allows the designer to specify how the parameters of received events are related to parameters of earlier emitted events. Notable features of Finesse include the ability to define generic protocols that can be parameterised by role types, to define protocols in terms of sequential iterations of messages or parallel message sends, and to support multiparty communication. However, Finesse does not currently have a defined semantics and is, as yet, unimplemented.

2.7. Summary

Current object-oriented middleware platforms provide little or no support for composing systems from prebuilt components. Components of a system, typically objects, are defined by their provided services only. This forces clients to use first-party binding to connect to required services, and therefore increases components’ dependence on their context, reducing the scope for their reuse, and pollutes their implementation with irrelevant structural details. Components are limited to very few forms of interaction, typically one of RPC-like object invocation or asynchronous message passing. Other interaction mechanisms are added as ad-hoc extensions, must be implemented in terms of the predominant style or must be implemented above lower level mechanisms that are more complex and error prone to use. For example, Open Bindings add media stream interactions as an extension to the CORBA standard; Java Beans components interact via method calls, properties with change notification or event dissemination, but the latter two interaction styles are implemented in terms of the first; CORBA allows asynchronous invocations through the DII but this API is complex, slow and is not type safe. No commercial distributed programming environment supports design-time modelling and analysis of components, interaction styles or systems composed from these architectural parts.

The use of an architecture description language (ADL) makes it easier to construct systems from prebuilt parts. Unlike non-ADL approaches, the interface of a component used with an ADL includes specification of the services required by the component to be implemented by other components in the system. Systems are built by composition: components are instantiated and their required services bound to those provided by others. Components can typically make use of multiple interaction styles that are often encapsulated within connectors. However, component models that use connectors treat connectors as monolithic entities, encapsulating application and transport layer issues. As highlighted by Garlan, this makes connectors difficult to implement, modify and reuse [Garlan98]. There is little tool support for the definition of new connector types. Most current ADLs support either modelling and analysis or system construction and deployment, but not both; Darwin/Regis is a notable exception. However, no ADL that supports system construction also supports the modelling and analysis of the connectors that are actually used in constructed systems.

This introduces the requirements met by the research presented in this thesis: to develop language and tool support for the definition of component interaction styles. Such tools need to support both modelling and analysis of interactions at design time and generate the run-time support required to use the interaction style in an executable system. The interaction model must not be monolithic: the separation of application, presentation and transport layer concerns will allow the difficulties identified in this chapter to be addressed.


3. A Model of Component Interaction

My own view is that we have been wrong in taking communication as secondary. Many people seem to assume as a matter of course that there is, first, reality, and then, second, communication about it.
Raymond Williams

3.1. Introduction

As discussed in chapter 2, current interface and architecture description languages do not provide the support required for designing and building large distributed systems. Those languages that allow the programmer to define new interaction styles either aid the programmer by generating the runtime support for those interactions but limit the types of interaction that can be defined, or allow complex interaction styles to be modelled but leave their implementation completely up to the programmer. There is a need for an approach that allows programmers to specify an interaction style such that:

• The specification can be used at design time to analyse system behaviour.
• The specification is used at build time to generate runtime support for distribution and binding.
• The transport protocol used for the interaction style can be selected when bindings are established.
• Management and monitoring functionality can be inserted into bindings.

Our approach is based upon a new model for component interaction [PC98a,PC98b] and a language, named Midas, for its realisation. The advantage of this interaction model and language over those described in chapter 2 is that it supports both the design-time analysis of interaction styles and the construction of components and systems that make use of those interaction styles. Design is supported by including specifications of the interaction protocol that can be checked mechanically for deadlock or violation of user-defined constraints. Construction is supported by translating Midas definitions into implementation language constructs that provide the “glue” to connect components using the interaction style whether they are within the same address space or across the network. The runtime support for the interaction model overcomes the problems inherent in current middleware platforms through strict separation of concerns: programmers can select combinations of mechanisms within a binding to achieve the behaviour required for their particular context. Moreover, the Midas language links the design and construction phases by generating objects that can be inserted into bindings to check that components interacting over those bindings conform to the interaction protocol specified by the Midas definition.

Our model of component interaction is fully compatible with the use of an ADL for system structuring. We use Darwin [MDEK95] because it, alone among available languages, meets all our requirements of an ADL: it describes structural information independently of the runtime environment, it can be used for both system modelling and system construction, it has a formal semantics that defines the required output of a compiler and it is supported by a set of software engineering tools. Specifically, compared to the systems described in chapter 2, the Midas model and language combine all of the following properties:

• Open-Ended. Components can use multiple styles of interaction. Programmers can select the most appropriate interaction style from libraries of predefined styles or define new styles specifically for their application. Tools generate the run-time support to interface new styles to the binding and communication subsystems.
• Analysable. Formal models of interaction protocols allow a developer to analyse new protocols at design time, quickly finding errors in the protocol design without need for time consuming and costly code and test cycles. By separating concerns of interaction and transport protocols in the model, the behaviour of an interaction can be analysed over different transport semantics. Models of interaction and transport protocols can be used in the model of the application to accurately model the final system that makes use of the interaction styles.
• Flexible. The interaction model is defined at a high level of abstraction and mapped onto implementation-level concepts in a way that separates the concerns of API and synchronisation, presentation layer marshalling, binding and transport protocol. Furthermore, the transport protocol for a binding can be dynamically loaded and composed from protocol components and is made accessible through generic interfaces, giving applications control over low-level communication aspects without loss of generality.
• Distribution Transparent. Distribution transparency is provided for both the component programmer, who makes use of interaction styles, and the developer of new interaction styles. Unlike the Regis system, complex data structures can be transmitted between interaction endpoints.
• Language Independent. Interaction styles are defined using a specification language that is compiled into implementation language constructs. Distribution transparency is provided by converting data types of the implementation language to and from an on-the-wire format. Thus components written in different languages can communicate by using the common marshalled representation.
• Efficient. The abstract model of component interaction can be translated into efficient implementation constructs. The separation of concerns lets a system instantiate the minimal mechanisms necessary for any binding. The ability to select and compose transport protocols allows the designer to select the most efficient transport protocol for a binding, following the end-to-end principle by avoiding unnecessary transport protocol mechanisms if they are not required.

3.2. Abstract View of Component Interaction

In our model, a component is a reusable element of a distributed program that encapsulates state and/or behaviour behind a strict interface comprised of the roles that the component may take in various interaction protocols. A component can act as either a service provider or service client in each of the interactions in which it plays a role. Roles are represented by named, typed interaction endpoints. Interaction endpoints provide distribution transparency to the component implementation: the programming interface to the interaction protocol is identical whether the far end of a binding is within the same address space, on the same machine or on a remote machine. Of course, performance of local and remote interactions will differ, as will reliability, depending on the transport protocol selected for the binding.

An interaction between two communication endpoints, service and client, can be defined in terms of asynchronous messages. In more detail, an interaction consists of:

1. The set of asynchronous messages accepted by the service (the server-side message interface).
2. The set of asynchronous messages that the service requires the client to accept (the client-side message interface).
3. A protocol that defines when specific messages may be transmitted. In our approach we model protocol behaviour as communicating finite state machines at each endpoint of the interaction.
4. The programming abstractions through which the client and service view the protocol (the client- and service-side programming interfaces), including the synchronisation of component threads at those endpoints.
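The first three elements of this definition can be illustrated for a simple request/reply interaction. The sketch below is plain Python, not Midas notation; the `EndpointFSM` class, the state names and the `deliver` helper are hypothetical, and message delivery is modelled as instantaneous, which a real asynchronous transport would not guarantee.

```python
class EndpointFSM:
    """Finite state machine for one endpoint of an interaction.

    Transitions map (state, direction, message) to the next state, where
    direction is "send" or "recv". The names are illustrative only.
    """
    def __init__(self, start, transitions):
        self.state = start
        self.transitions = transitions

    def step(self, direction, message):
        key = (self.state, direction, message)
        if key not in self.transitions:
            raise RuntimeError(
                f"protocol violation: {direction} {message} in state {self.state}"
            )
        self.state = self.transitions[key]


# Message interfaces of a hypothetical request/reply interaction.
SERVICE_MESSAGES = {"request"}   # messages accepted by the service
CLIENT_MESSAGES = {"reply"}      # messages the client is required to accept


def make_client():
    return EndpointFSM("idle", {
        ("idle", "send", "request"): "waiting",
        ("waiting", "recv", "reply"): "idle",
    })


def make_service():
    return EndpointFSM("idle", {
        ("idle", "recv", "request"): "busy",
        ("busy", "send", "reply"): "idle",
    })


def deliver(sender, receiver, message):
    """Pass one message, stepping the state machine at each endpoint."""
    sender.step("send", message)
    receiver.step("recv", message)


client, service = make_client(), make_service()
deliver(client, service, "request")
deliver(service, client, "reply")
```

Any attempt to send a message that the current state does not permit, such as a second `reply` with no intervening `request`, is rejected by `step`.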

Figure 8. Abstract view of the interaction model

In our model, the messages define the application-layer protocol by which components communicate over a binding and the pair of message interfaces constitutes a contract [Meyer88] between two endpoints. That is, a service guarantees to react meaningfully to messages received from a client as long as those messages are in the set of messages accepted by the service at that point in time and as long as the client reacts meaningfully when the service sends messages back to the client.

3.2.1. Binding

We define bindings between endpoints solely in terms of the message interfaces defined by their Midas specification. A client endpoint is bound to a service by passing it a pointer to the message interface of the service. Due to the polymorphic nature of the message interfaces, any implementation of a client-side message interface can be bound to any implementation of a service-side message interface. Language-level strong typing is used to ensure that the endpoints being bound play opposite roles in the same interaction.

Local Bindings. The client endpoint includes a pointer to its own message interface with each message that it sends to the service endpoint. This “back-binding” allows the service endpoint to send a reply to the message, and can be stored by the service in order to send future messages to the client.

The use of pointers to perform binding means that interactions between endpoints in the same address space can be very efficient. The overhead is that of a dynamically dispatched method call, with an extra pointer argument for messages sent from the client to the service, although further overhead may be introduced by synchronisation inside each endpoint.

Figure 9. Binding from client endpoint to service, and back-binding from service to client
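Local binding and back-binding can be sketched with ordinary object references. The example below is a Python approximation under stated assumptions: the `request` and `reply` messages and all class names are hypothetical, and Python's type hints only document the roles rather than enforcing them as a strongly typed language would.

```python
class ClientMessages:
    """Client-side message interface (hypothetical 'reply' message)."""
    def reply(self, value):
        raise NotImplementedError


class ServiceMessages:
    """Service-side message interface; each message carries a back-binding."""
    def request(self, back: ClientMessages, value):
        raise NotImplementedError


class DoublingService(ServiceMessages):
    def request(self, back, value):
        # The back-binding is all the service needs to answer the client.
        back.reply(value * 2)


class Client(ClientMessages):
    def __init__(self):
        self.service = None
        self.results = []

    def bind(self, service: ServiceMessages):
        """Binding: store a reference to the service's message interface."""
        self.service = service

    def call(self, value):
        # Pass our own message interface along with the message.
        self.service.request(self, value)

    def reply(self, value):
        self.results.append(value)


client = Client()
client.bind(DoublingService())
client.call(21)
```

Each message costs one dynamically dispatched call plus the extra reference argument, which matches the overhead described above for bindings within a single address space.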

Proxies. The model of binding and interaction used within a single address space is extended between address spaces by the use of Proxy objects [GHJV94]. Proxy objects provide distribution transparency to endpoint implementations and the components that use those endpoints. Because messages can be sent in either direction at any time, dependent on the current state of the sending endpoint, a proxy is required at each end of a binding. In CORBA parlance, a Midas proxy performs the role of both a proxy object, marshalling and transmitting invocations on its message interface, and an object adaptor, receiving and unmarshalling messages and performing invocations on the message interface of its associated endpoint.

A service proxy provides the illusion that the service endpoint is within the same address space as the client endpoint and a client proxy provides the illusion that the client endpoint is within the same address space as the service endpoint. The proxies are connected by a communication channel through which they transmit raw data to each other. When components in different address spaces need to interact, a service proxy is created in the client's address space, the client endpoint is bound to the service proxy and the service proxy is connected via some transport protocol to a client proxy in the service's address space. The service proxy implements the service-side message interface of the interaction and, to the client, is indistinguishable from a true service endpoint.

Figure 10. Client and service proxies provide distribution transparency

Two remote endpoints can interact once a connection has been established between their proxies. When the client-side endpoint invokes operations of the message interface of its service proxy, the service proxy marshals the message parameters into a data buffer and transmits it over the channel. In the address space of the service endpoint, a system thread receives the data buffer and carries it up the protocol stack to the client proxy. The client proxy unmarshals the message from the data buffer and invokes the appropriate operation on the service endpoint's message interface. Similarly, the client proxy marshals invocations of the operations of its message interface into data buffers and transmits them to the service proxy in the client's address space, which unmarshals each message and invokes the corresponding operation of the client's message interface.
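The client-to-service half of this round trip can be illustrated with a minimal Java sketch. It is not the code generated by the Midas compiler: the names PutMessages, SketchServiceProxy and SketchClientProxy, the one-byte message tag and the in-memory Consumer standing in for the transport connection are all assumptions made for illustration, and the back-binding direction is omitted for brevity.

```java
import java.nio.ByteBuffer;
import java.util.function.Consumer;

/** Hypothetical service-side message interface with a single message. */
interface PutMessages {
    void put(int data);
}

/** Service proxy: lives in the client's address space and marshals each message. */
final class SketchServiceProxy implements PutMessages {
    static final byte PUT = 1;                 // assumed wire tag for the put message
    private final Consumer<byte[]> channel;    // stand-in for the transport connection

    SketchServiceProxy(Consumer<byte[]> channel) {
        this.channel = channel;
    }

    @Override
    public void put(int data) {
        ByteBuffer buf = ByteBuffer.allocate(5);
        buf.put(PUT).putInt(data);             // marshal tag and parameter
        channel.accept(buf.array());           // transmit the raw data buffer
    }
}

/** Client proxy: lives in the service's address space and unmarshals each buffer. */
final class SketchClientProxy {
    private final PutMessages endpoint;

    SketchClientProxy(PutMessages endpoint) {
        this.endpoint = endpoint;
    }

    /** Called by the (simulated) transport thread when a buffer arrives. */
    void deliver(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw);
        byte tag = buf.get();
        if (tag != SketchServiceProxy.PUT) {
            throw new IllegalArgumentException("unknown message tag " + tag);
        }
        endpoint.put(buf.getInt());            // invoke the real service endpoint
    }
}
```

Because SketchServiceProxy implements the same interface as the endpoint it stands in for, a client bound to it cannot tell whether the service is local or remote, which is the transparency property described above.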

Service References. For a service endpoint to be accessible to remote clients, it must be able to accept connections from remote nodes and have a unique identity within the network that clients can use to request a connection to it. A service is identified by a service reference, a tuple comprising the identifier of the transport protocol used to access the service and a transport-level address that can be used to establish a connection with that protocol. The mechanisms that allow a client to dynamically load and use transport protocols are described in detail in chapter 7.

Service Access Points. Service identity and connection establishment are managed by interaction-specific service access point (SAP) objects. SAPs provide a uniform interface through which code can acquire the reference of a service. This makes it possible to write code that can manipulate references and establish bindings without needing to know the full types of the endpoints it is binding. Such code might be compiled from an architecture definition to instantiate the system, or used within a name service or service trader. However, because the SAP interfaces are untyped, the binding service is assumed to use higher-level mechanisms to enforce type compatibility of bindings. For example, the compiler for the architecture description language performs type-checking before generating the code to perform the binding actions that construct the system.

Each interaction SAP is layered above a transport protocol endpoint that can accept transport-level connections (for instance, a TCP/IP server socket) and exposes the address of the transport endpoint in its service reference. The binding framework uses the transport SAP's address to initialise remote service proxy objects that then create transport-level connections to the transport SAP. The service's transport SAP passes connection requests up to the interaction SAP, which creates a client proxy object to manage each new connection. The definitions of the SAP and proxy classes are generated automatically by the Midas compiler for each interaction type and are independent of the underlying protocol, as long as the protocol is connection based and provides reliable, in-order delivery. Multiple interaction SAPs can be created for a single service endpoint, thereby making the service accessible simultaneously over different protocols.

Figure 11. SAPs identify service endpoints
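The relationship between service references, SAPs and client proxies can be sketched in Java as follows. This is an illustrative model only: the classes SketchServiceReference, SketchClientProxyRecord and SketchInteractionSap, their fields and the connectionAccepted upcall are hypothetical, and no real transport is involved.

```java
import java.util.ArrayList;
import java.util.List;

/** A service reference: (transport protocol identifier, transport-level address). */
final class SketchServiceReference {
    final String protocol;
    final String address;

    SketchServiceReference(String protocol, String address) {
        this.protocol = protocol;
        this.address = address;
    }
}

/** Placeholder for the client proxy that manages one accepted connection. */
final class SketchClientProxyRecord {
    final String remotePeer;

    SketchClientProxyRecord(String remotePeer) {
        this.remotePeer = remotePeer;
    }
}

/** Hypothetical interaction SAP layered above a single transport endpoint. */
final class SketchInteractionSap {
    private final SketchServiceReference reference;
    private final List<SketchClientProxyRecord> proxies = new ArrayList<>();

    SketchInteractionSap(String protocol, String transportAddress) {
        // The SAP exposes the address of its transport endpoint in its reference.
        this.reference = new SketchServiceReference(protocol, transportAddress);
    }

    /** Untyped access to the reference, usable by a name service or trader. */
    SketchServiceReference reference() {
        return reference;
    }

    /** Upcall from the transport SAP: one client proxy per accepted connection. */
    synchronized void connectionAccepted(String remotePeer) {
        proxies.add(new SketchClientProxyRecord(remotePeer));
    }

    synchronized int proxyCount() {
        return proxies.size();
    }
}
```

Creating two SketchInteractionSap objects with different protocol identifiers for the same endpoint corresponds to making one service accessible over several protocols at once.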

3.2.2. Synchronisation Between Components over a Binding

Midas specifies protocols by which components may interact, but does not concern itself with how components that interact according to those protocols implement their functional behaviour. A component may be an active object with its own thread of control, may itself have a concurrent implementation with multiple threads of control, or be completely reactive, with no internal threads at all. Therefore the interaction model must support all of these threading models.

Developers are free to implement the operations of the message interfaces in any way they like as long as their implementation, as well as conforming to the protocol of the interaction style, supports asynchronous message passing and the potential concurrent or reactive execution of components.

These constraints mean that an endpoint implementation must be prepared for the case where it sends a message and receives a response before the call to the outgoing message interface has completed. This constrains how threads are synchronised by the endpoint: a thread must not hold a lock on an endpoint while sending a message over the binding, because doing so could cause deadlock. This concurrency requirement supports communication between active, threaded components and purely reactive components. In a reactive component, the thread carrying a message to the component will always deliver a reply back to the calling component before returning from the original message call. These constraints also support distributed bindings implemented over an upcall-based protocol architecture: endpoints that conform to these constraints are guaranteed not to block the device threads responsible for receiving and delivering network messages.

Figure 12. Synchronisation between concurrent components using the Attribute interaction

The asynchrony of message passing constrains the implementation of an endpoint: it must not block threads that call the operations of its message interface, except for short durations while waiting for mutual exclusion. Synchronisation between threads on either side of the binding can only be implemented in terms of asynchronous messages passed across the binding. Figure 12 illustrates how synchronisation is implemented in this model. An endpoint object that synchronises the calling thread with threads at the other end of its binding must use encapsulated synchronisation objects, such as semaphores. The caller must invoke an operation on the bound message interface and then wait on a semaphore that the endpoint encapsulates. When another thread delivers a message to the endpoint, it can wake the blocked thread by signalling the semaphore.
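The pattern of Figure 12 can be sketched in Java using java.util.concurrent.Semaphore. The interface and class names below (AttrServiceMessages, AttrClientMessages, SketchAttributeClient, SketchReactiveAttribute) are hypothetical and only the setValue/setAck exchange is modelled; this is a sketch of the synchronisation idiom under those assumptions, not the actual Attribute endpoint.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical message interfaces of an Attribute-like interaction. */
interface AttrServiceMessages {
    void setValue(int value, AttrClientMessages client);
}

interface AttrClientMessages {
    void setAck();
}

/** Client endpoint: set() blocks the caller until the service acknowledges. */
final class SketchAttributeClient implements AttrClientMessages {
    private final AttrServiceMessages binding;
    private final Semaphore acked = new Semaphore(0);   // encapsulated sync object

    SketchAttributeClient(AttrServiceMessages binding) {
        this.binding = binding;
    }

    void set(int value) throws InterruptedException {
        // No lock is held on this endpoint while the message is sent, so the
        // acknowledgement may be delivered before setValue() has returned.
        binding.setValue(value, this);
        acked.acquire();                                // wait on the semaphore
    }

    @Override
    public void setAck() {
        acked.release();                                // never blocks the deliverer
    }
}

/** A purely reactive service: it replies on the thread that delivered the message. */
final class SketchReactiveAttribute implements AttrServiceMessages {
    private volatile int value;

    @Override
    public void setValue(int v, AttrClientMessages client) {
        value = v;
        client.setAck();
    }

    int value() {
        return value;
    }
}
```

A counting semaphore suits this idiom because a release that happens before the matching acquire is remembered: with the reactive service above, setAck runs before set reaches acquire, and the caller still proceeds without blocking.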

3.2.3. Control of Bindings

The component at each end of a binding can access control interfaces through which it may monitor and manage the operation of the binding. For example, a component may need to request particular quality of service parameters and be notified when quality of service drops below some minimum threshold. Control interfaces can exist at two points in a binding: above the presentation layer and within the transport stack. Both forms of control interface are acquired in the same way. The manner in which transport protocols provide control interfaces is described in detail in chapter 7.

Monitoring and control of a binding can be added above the presentation layer by inserting interaction filters into a binding. An interaction filter implements the client and service message interfaces of the interaction type and, optionally, control interfaces through which its operation can be managed. It is chained between other client and service message interfaces within the binding, such as those of the client endpoint and service proxy, or client endpoint and service endpoint. An interaction filter implements the operations of its message interfaces by performing processing before forwarding the request to the next interface in the binding.
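As an illustration, a filter that counts the messages passing through a binding might look like the following Java sketch. The FilterablePut interface and the SketchCountingFilter class are hypothetical, and only the service-side message interface is shown; a full filter would also forward messages travelling in the opposite direction.

```java
/** Hypothetical service-side message interface of a one-message interaction. */
interface FilterablePut {
    void put(int data);
}

/** An interaction filter: the same message interface on both sides. */
final class SketchCountingFilter implements FilterablePut {
    private final FilterablePut next;   // next message interface in the binding
    private long forwarded;             // state exposed through a control operation

    SketchCountingFilter(FilterablePut next) {
        this.next = next;
    }

    @Override
    public void put(int data) {
        synchronized (this) {
            forwarded++;                // processing performed before forwarding
        }
        next.put(data);                 // forward without holding the filter's lock
    }

    /** A minimal control operation for monitoring the binding. */
    synchronized long forwardedCount() {
        return forwarded;
    }
}
```

The filter releases its lock before forwarding, in keeping with the rule that an endpoint must not hold a lock while sending a message over the binding.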

The objects composing a binding handle requests for control interfaces as a Chain of Responsibility [GHJV94]: if an endpoint or filter does not support the requested interface, the request is passed to the next filter in the binding until the request reaches the endpoint at the other end of the binding or a proxy. If the request reaches the endpoint, the caller is informed that the requested control interface is not supported by the binding. If the request reaches the proxy, the proxy passes the request to the transport protocol that is used to transfer messages between address spaces.

Figure 13. Control interfaces allow management of bindings

In our implementation, standard control interfaces are defined for common tasks performed by interaction filters and transport protocols. Such tasks include checking that the endpoints conform to the interaction protocol, managing the state of the transport connection, and reserving QoS parameters. Control interfaces defined by the transport framework are discussed in chapter 7. Because the ADL runtime system has full information about the implementation of bindings between components, including the filters and transport protocols used, it is also able to use non-standard interfaces to configure parameters of the binding when it is established.
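The Chain of Responsibility lookup for control interfaces can be sketched in Java as below. All names (QosControl, BindingElement, PlainFilter, QosProxy) are hypothetical. As a simplification, the proxy stand-in implements the control interface itself rather than delegating to a transport protocol, and an unsupported interface is reported by returning null.

```java
/** Hypothetical control interface for quality-of-service monitoring. */
interface QosControl {
    int reservedBandwidthKbps();
}

/** Any object in a binding that can be asked for a control interface. */
abstract class BindingElement {
    private final BindingElement next;   // null at the far end of the chain

    BindingElement(BindingElement next) {
        this.next = next;
    }

    /** Returns the requested control interface, or null if the binding has none. */
    <T> T control(Class<T> type) {
        if (type.isInstance(this)) {
            return type.cast(this);      // this element supports the interface
        }
        return next == null ? null : next.control(type);
    }
}

/** A filter that offers no control interfaces and so always delegates. */
final class PlainFilter extends BindingElement {
    PlainFilter(BindingElement next) {
        super(next);
    }
}

/** Stand-in for a proxy whose transport supports QoS control. */
final class QosProxy extends BindingElement implements QosControl {
    QosProxy() {
        super(null);
    }

    @Override
    public int reservedBandwidthKbps() {
        return 64;                       // arbitrary illustrative value
    }
}
```

A caller asks only the first element of the binding, for example chain.control(QosControl.class), and does not need to know which filter or protocol layer eventually answers.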

Remote control and monitoring of bindings is implemented by instantiating “adaptor” objects that are collocated with the control interfaces and themselves provide Midas service endpoints that make the operations and events of the control interface available over the network.

3.2.4. Transport Protocols

Although the implementation of intercomponent bindings has been described in terms of connections, the concept of a binding does not necessitate the use of connection-based protocols, such as TCP/IP. The connection abstractions can be implemented using light-weight adaptor objects layered above datagram-based protocols, such as UDP/IP, or shared memory within the same host. Moreover, the separation of concerns within the interaction architecture provides enough flexibility to take advantage of protocols that are not connection oriented, which can result in significant performance improvements.

Take an event dissemination service as an example. Like most services, an event service can be provided over a connection-based protocol. In this configuration, each event sink is connected to the event source by a separate transport connection, and the source owns a proxy for each remote sink. The source announces events to each proxy individually, causing the same message to be transmitted separately over every connection. This is inefficient: performance can be improved by using a multicast protocol, such as IP Multicast [Deer91,RBM96] so that only a single event message need be transmitted to multiple event sinks.

The Event interaction can be used over a multicast protocol by deriving custom proxy and SAP classes from the base classes generated for the interaction type by the Midas compiler. A multicast client proxy is initialised with a reference to a multicast group to which it transmits event messages. A multicast service proxy is also initialised with the address of the multicast group. When the client endpoint registers with the event service, the multicast service proxy joins the group and starts receiving event messages. When the client endpoint disables the reception of events, the multicast service proxy leaves the group and therefore stops receiving event messages.

The event source service is made available over the multicast protocol using a multicast interaction SAP object. Since connections are not used, the multicast SAP does not listen for connection request events from a transport protocol layer. Instead, the multicast interaction SAP acquires a group reference from a name service, creates a multicast client proxy for the endpoint and attaches the endpoint and multicast client proxy. The multicast SAP exposes the address of the multicast group in its service reference and the binding framework passes this to remote multicast service proxies, allowing them to join the group and receive events. Note that a service can make use of this optimisation alongside less efficient connection-oriented SAPs, allowing clients that are not multicast capable to also receive events.

Figure 14. Multicast used for an event interaction protocol
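The benefit of the multicast configuration can be illustrated with an in-memory Java model. The classes SketchMulticastGroup and SketchMulticastServiceProxy are hypothetical stand-ins: the group simulates a multicast address and counts transmissions, and no real IP Multicast socket is used.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.IntConsumer;

/** In-memory stand-in for a multicast group; counts transmissions. */
final class SketchMulticastGroup {
    private final Set<IntConsumer> members = new LinkedHashSet<>();
    private int transmissions;

    synchronized void join(IntConsumer member) {
        members.add(member);
    }

    synchronized void leave(IntConsumer member) {
        members.remove(member);
    }

    /** A single transmission reaches every current member of the group. */
    void transmit(int event) {
        List<IntConsumer> snapshot;
        synchronized (this) {
            transmissions++;
            snapshot = new ArrayList<>(members);
        }
        for (IntConsumer member : snapshot) {
            member.accept(event);          // deliver outside the lock
        }
    }

    synchronized int transmissions() {
        return transmissions;
    }
}

/** Multicast service proxy (client side): joins on enable, leaves on disable. */
final class SketchMulticastServiceProxy {
    private final SketchMulticastGroup group;
    private final IntConsumer clientEndpoint;

    SketchMulticastServiceProxy(SketchMulticastGroup group, IntConsumer clientEndpoint) {
        this.group = group;
        this.clientEndpoint = clientEndpoint;
    }

    void enable() {
        group.join(clientEndpoint);
    }

    void disable() {
        group.leave(clientEndpoint);
    }
}
```

In this model the event source calls transmit once per event regardless of how many sinks have enabled reception, whereas the connection-based configuration would send one message per proxy.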

3.3. Conclusion

This chapter has introduced a model of component interaction that permits components to communicate by an open-ended set of interaction styles. We have shown how the interaction style can be separated from the transport protocol used to carry the application-layer messages of the interaction protocol, and how bindings can be augmented with monitoring and management functionality. Our interaction model supports efficient interactions within the same address space and allows the transport protocol to be selected at run time to transport interactions between address spaces.

Chapter 4 introduces the Midas language that is used to specify interaction styles that follow our model. Chapter 6 describes how the language is mapped to an implementation language, Java. Chapter 7 describes the framework used to implement transport protocols and the mechanisms used to dynamically load and compose protocol components.

4. Midas: A Language for Specifying Interaction Styles

Get the habit of analysis - analysis will in time enable synthesis to become your habit of mind.
Frank Lloyd Wright

We have defined a language, named Midas, with which one can specify interaction styles that are based upon the interaction model introduced in chapter 3. Midas specifications are compiled into implementation language constructs, such as Java classes, that define the message interfaces and implement distribution transparency. Midas allows interactions to be defined independently of the implementation language and simplifies the task of the programmer by generating support for distribution transparency.

To take advantage of programmer familiarity, Midas syntax is based upon that of the CORBA IDL [OMG98]. The syntax for constant and module definitions is virtually identical to that of IDL. Type definitions are extended to allow the definition of generic types. Interface definitions are replaced with interaction definitions that follow the model described above. Unlike CORBA IDL, however, Midas does not support the Any type and so does not allow untyped data to be held in data structures or passed in messages. Instead, data structures and interaction definitions can be generic, parameterised by type. Generic interactions allow type checking to be performed at design time by the Midas and ADL compilers. If components are implemented in a language that has direct support for genericity, such as C++, generic definitions can be mapped to language constructs that support compile-time type checking of component implementations. Otherwise, the Midas compiler generates code that supports genericity.

Midas interaction definitions can be annotated with one or more specifications of the interaction protocol. There are several points in the software lifecycle where a formal model of component interaction is useful. While a developer is designing a new interaction style it is faster, and therefore cheaper, if the designer can detect errors in the protocol by exhaustively checking a model of the behaviour of the interaction protocol, rather than performing multiple code/test/debug cycles. It is also useful to check the behaviour of the protocol when layered above different types of transport protocol. The use of a model checker has the advantage that synchronisation errors can be detected by the model checker which might go unobserved in real code due to vagaries of scheduling or timing during testing.

Different systems require different kinds of analysis. Concurrent systems must be checked for deadlock and progress violations while designs of real-time systems must include timing constraints. The Midas language allows interaction declarations to be annotated with arbitrary specifications. The Midas language identifies each specification with a string type-name but ignores the contents of the specification's body. A specification is extracted by a compiler back-end that passes the specification to external analysis tools. Each specification must follow some type-specific mapping from the elements of the interaction - name, message interfaces, message names etc. - so that the compiler back-end can associate elements of the specification with elements of the Midas definition.

Once the interaction protocol is designed, the programmer must implement the interface between interaction endpoints and the internal implementation of the component that owns the endpoint. The programmer can use the formal model to gain an accurate understanding of the protocol and can use visualisation tools to animate the state machines representing the behaviour of each role in the protocol, thereby aiding understanding. The formal models are also used to generate test code to verify that endpoint implementations conform to the protocol specification. An application architect building systems from components that make use of existing interaction styles similarly benefits from mechanical checks of the behaviour of the entire system for safety, progress and other relevant properties.

We will introduce the Midas language through a number of example interaction styles that incrementally introduce language features and their rationale. The styles described and the features introduced are:

• LongSlot. Introduces basic Midas syntax, the modelling of Midas endpoints with FSP and their implementation in Java.
• Slot. Motivates and introduces generic interaction types.
• Mport. Describes how to specify the queuing of messages for delivery.
• Func. Introduces synchronisation between clients and the service.
• Event. Introduces one-to-many communication. Shows how clients can selectively enable and disable messages from a service. Shows how messages sent from the service to a client by the volition of the service, rather than in response to a client, can cause synchronisation errors, and that these errors can be detected by the model checker.
• Attribute. Shows how the service endpoint of an interaction style can choose a subset of its clients to which to send messages and introduces interaction-specific properties defined by the designer.

4.0.1. LongSlot Interaction Style

The simplest interaction that can be specified in Midas is asynchronous message sending to a one-slot buffer. In this interaction, clients send a message containing a data value - let's assume a 32-bit long integer - to the service. The service holds a single value for consumption by the component. If a value is received at the service before a previous value has been consumed by the component, the previous value is overwritten by the new value.

To define this interaction, we give it a name - let's say “LongSlot” because it is a slot that holds long integer values - and define the message interfaces of the service and client. In this interaction, the service accepts a single message, put, and sends no messages back to the client. The Midas definition of the LongSlot message interfaces is shown in figure 15; the provide statement defines the message interface of the service and the require statement that of the clients. The body of each message interface consists of zero or more message definitions. Messages are named and can take any number of named, typed parameters. Unlike operation parameters in CORBA IDL, which can be used to pass values back from the server to the client, message parameters are always read-only. Values are returned in Midas interactions by explicitly defining reply messages.

    interaction LongSlot
    {
        provide {
            put( unsigned long data );
        };
        require {
        };
    };

Figure 15. The message sets of the LongSlot interaction

However, the message interfaces themselves do not fully define the interaction: the formal behaviour of the protocol by which messages are exchanged must also be defined. For this we augment the interaction definition with a specification of the protocol using a spec statement. Different systems require different kinds of analysis. Concurrent systems must be checked for deadlock and user-defined property violations while designs of real-time systems must include timing constraints. Therefore Midas allows interaction declarations to be annotated with arbitrary specifications, identifying each specification with a string type-name but ignoring the contents of its body. A specification is extracted by a compiler back-end that can parse the specification and process it or pass it to external tools. Each specification must follow some type-specific mapping from the elements of the interaction - name, message interfaces, message names etc. - so that the compiler back-end can associate elements of the specification with elements of the Midas definition.

We use FSP [MK99], a process calculus developed at Imperial College, to model interaction protocols. FSP specifications are given the type string “FSP”. FSP allows the designer to check that the interaction protocol meets desired safety properties. For example, the protocol can be checked to ensure that it is free from deadlock, often caused by messages being lost or being transmitted when not expected by the receiver. The designer can also define interaction-specific properties, such as properties to check that finite buffer space within the endpoints is never exceeded. Midas defines intercomponent interaction protocols as separate from transport protocols. This allows a designer to select the most appropriate transport protocol for each binding. This separation of concerns must be reflected in the model to allow the designer to analyse aspects of the system that will actually be deployed. Therefore interaction protocols and transport protocols are modelled separately and composed to form models of actual bindings.

The observable behaviour of an interaction protocol is modelled as two FSP constraint properties that specify valid sequences of messages that can be transmitted and received by each end of a binding. The names of these constraints are derived from the name of the interaction by capitalising all characters, separating words by underscores rather than by case, and adding the suffix _PROVIDE or _REQUIRE for the service-side or client-side constraint respectively. The incoming and outgoing message sets of each endpoint are represented by interfaces named in and out, the actions of which are named after the messages in the corresponding message set.
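The naming convention can be expressed as a small Java helper. The class FspNames and the method constraintName are hypothetical and are not part of the Midas compiler; the sketch assumes interaction names in simple mixed case, where every capital letter after the first character starts a new word.

```java
/** Hypothetical helper that derives FSP names from a Midas interaction name. */
final class FspNames {
    private FspNames() {
    }

    /** For example, "LongSlot" with suffix "PROVIDE" gives "LONG_SLOT_PROVIDE". */
    static String constraintName(String interaction, String suffix) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < interaction.length(); i++) {
            char c = interaction.charAt(i);
            // A capital letter after the first character starts a new word.
            if (i > 0 && Character.isUpperCase(c)) {
                sb.append('_');
            }
            sb.append(Character.toUpperCase(c));
        }
        return sb.append('_').append(suffix).toString();
    }
}
```

The same helper covers the endpoint process names used later in this chapter, since they differ only in their suffix.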

The constraints for the LongSlot interaction are trivial: a client repeatedly sends put messages and a service repeatedly receives put messages, as shown in figure 16.

    property LONG_SLOT_PROVIDE( N=0 ) = ( in.put[0..N] -> LONG_SLOT_PROVIDE ).
    property LONG_SLOT_REQUIRE( N=0 ) = ( out.put[0..N] -> LONG_SLOT_REQUIRE ).

(State diagrams: LONG_SLOT_PROVIDE and LONG_SLOT_REQUIRE each have a single state, 0, with a transition back to itself labelled in.put.0 and out.put.0 respectively.)

Figure 16. Constraint properties for the LongSlot interaction

Because a constraint property only defines the valid sequences of messages observable at one end of a single binding, further models are required to define the behaviour of the endpoints, including how messages are buffered, whether the service endpoint sends messages to a single client, all clients or a subset of the clients, and the way that a component implementation makes use of the interaction endpoints.

One may ask why one should bother to specify constraint properties in that case. Constraints are useful because they can be combined with endpoint models to double-check the correctness of the model. Furthermore, constraint properties can be translated into filters that can be inserted into a binding to check the correctness of the endpoint implementations. For that reason, the ranges of message parameters are limited in property specifications to a single value so that it is easier to translate the properties into executable code that implements the state machines.
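To illustrate how a constraint property might be turned into executable checking code, the following Java sketch implements a table-driven state machine of the kind a checking filter could consult for each message it forwards. The class SketchProtocolChecker and its methods are hypothetical, and the encoding of actions as strings is an assumption of this sketch rather than the output of the Midas tools.

```java
import java.util.HashMap;
import java.util.Map;

/** A table-driven state machine that accepts only the sequences a property allows. */
final class SketchProtocolChecker {
    private final Map<String, Integer> transitions = new HashMap<>();
    private int state;   // state 0 is the initial state

    /** Declares that action is legal in state from and leads to state to. */
    SketchProtocolChecker allow(int from, String action, int to) {
        transitions.put(from + "/" + action, to);
        return this;
    }

    /** Checks one observed message; any undeclared action is a violation. */
    synchronized void observe(String action) {
        Integer next = transitions.get(state + "/" + action);
        if (next == null) {
            throw new IllegalStateException(
                "protocol violation: " + action + " in state " + state);
        }
        state = next;
    }

    /** LONG_SLOT_PROVIDE with N=0: a single state that accepts in.put.0 forever. */
    static SketchProtocolChecker longSlotProvide() {
        return new SketchProtocolChecker().allow(0, "in.put.0", 0);
    }
}
```

Because the parameter range of the property is a single value, the checker needs only one action name per message and can ignore the data actually carried.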

Endpoint behaviour is modelled by two FSP processes, defining the behaviour of the client and service endpoints. The process names are based on the name of the interaction, as above, and suffixed with _CLIENT or _SERVICE. For example, the LongSlot interaction is described by two FSP processes named LONG_SLOT_CLIENT and LONG_SLOT_SERVICE. Models of service endpoints take the number of clients as their first parameter and have an array of in and out interfaces, one for each client.

As with the constraint properties of the interaction, the incoming and outgoing message sets of each endpoint model are represented by interfaces named in and out, the actions of which are named after the messages in the corresponding message set. Other actions of the process are assumed to represent the programming interface by which components make use of the interaction protocol. Because many client endpoints can be bound to a single service, the service endpoint has an array of in and out interfaces, one for each client that is bound to it, and must model the (de)multiplexing strategy of the interaction protocol. Therefore service endpoint models take the number of clients as their first parameter.

Figure 17. The endpoints of an interaction represented graphically as FSP processes

For the LongSlot interaction we must model the service endpoint as comprising a single slot. The slot is initially empty; when a message is received, the slot contains a value that can be read by the component through the endpoint's API, causing the slot to become empty once again. If a message is received when the slot is full, the slot then contains the value delivered in the message. The model of the client endpoint is much simpler: whenever the component puts a value into the slot of the service through the client endpoint, the endpoint sends a put message to the service endpoint. These models are included in the complete Midas definition of the interaction, shown in figure 18. Figure 19 graphically depicts the behaviour of the client and service endpoints of a slot that can hold an integer value that is either 1 or 0.

    interaction LongSlot
    {
        provide {
            put( unsigned long data );
        };
        require {
            // Nothing
        };
        spec "FSP" {
            property LONG_SLOT_PROVIDE( N=0 ) =
                ( in.put[0..N] -> LONG_SLOT_PROVIDE ).
            property LONG_SLOT_REQUIRE( N=0 ) =
                ( out.put[0..N] -> LONG_SLOT_REQUIRE ).

            LONG_SLOT_SERVICE( C=2, N=1 ) = EMPTY,
            EMPTY = ( in[1..C].put[n:0..N] -> FULL[n] ),
            FULL[n:0..N] = ( get[n] -> EMPTY ).

            LONG_SLOT_CLIENT( N=1 ) =
                ( put[n:0..N] -> out.put[n] -> LONG_SLOT_CLIENT ).
        };
    };

Figure 18. The complete definition of the LongSlot interaction

Figure 19. Behaviour of the LongSlot endpoints

From this definition we can implement endpoint abstractions in some programming language, such as Java. The first step is to compile the Midas source code into Java packages and classes. The mapping from Midas to Java is described in detail in chapter 6. For now, it is enough to know that a Midas interaction is translated into a Java package named after the interaction that contains classes supporting binding and distribution transparency. Endpoint classes are implemented by deriving from base classes generated by the Midas compiler. The FSP models provide an exact specification of the behaviour that must be implemented.

Figure 20 shows the Java implementation of the LongSlot client endpoint for use by components that are implemented as single-threaded active objects. The implementation follows the FSP model very closely: the put action of the specification is implemented as the put method, which takes an int argument and transmits it to the service by calling the put method of the service's message interface. The binding to the service is managed by the base class, LongSlotClientStub, and made available to derived classes by the protected binding method. The client passes a back-binding to its message interface as the final parameter of the put method of the service. A call to a message interface may fail due to network or node errors; failures are reported by throwing an exception derived from RegentException, which can be caught and handled by the component using the endpoint.

    public class LongSlotClient extends LongSlotClientStub
    {
        public void put( int i ) throws RegentException
        {
            // Send the value to the service; 'this' is the back-binding.
            binding().put( i, this );
        }
    }

Figure 20. Java implementation of the LongSlot client endpoint

Figure 21 shows the Java implementation of the LongSlot service endpoint for use by single-threaded components. Again, the implementation follows the FSP model very closely. Synchronisation is implemented using Java monitors: a thread calling the get method waits on the service endpoint if the slot is empty. Values are delivered to the endpoint by calls to the put method, which stores the value in the slot and notifies the thread waiting on the endpoint.

    public class LongSlotService extends LongSlotServiceStub
    {
        private int _slot = 0;
        private boolean _empty = true;

        // Called by the component: blocks until the slot holds a value.
        public synchronized int get() throws InterruptedException
        {
            while( _empty ) wait();
            _empty = true;
            return _slot;
        }

        // Called over the binding: stores the value and wakes the waiting thread.
        public synchronized void put( int i, LongSlotClientMessages client )
        {
            _slot = i;
            _empty = false;
            notify();
        }
    }

Figure 21. Java implementation of the LongSlot service endpoint
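Figure 21 depends on the generated LongSlotServiceStub base class, so it cannot be compiled on its own. The following stand-alone Java sketch reproduces only the slot logic so that the overwrite behaviour can be exercised directly. The class name SketchLongSlot and the isEmpty method are hypothetical, and the stub base class and the client back-binding parameter are deliberately left out.

```java
/** Stand-alone copy of the one-slot buffer logic, without the generated stub. */
final class SketchLongSlot {
    private int slot;
    private boolean empty = true;

    /** Blocks until a value is available, then empties the slot. */
    synchronized int get() throws InterruptedException {
        while (empty) {
            wait();
        }
        empty = true;
        return slot;
    }

    /** Stores the value, overwriting any value that has not yet been consumed. */
    synchronized void put(int value) {
        slot = value;
        empty = false;
        notify();   // a single consumer thread is assumed, as in Figure 21
    }

    synchronized boolean isEmpty() {
        return empty;
    }
}
```

Two calls to put followed by one call to get return the second value, matching the description of the interaction in which an unconsumed value is overwritten by a newer one.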

4.0.2. Slot Interaction Style

The one-slot buffer is a useful interaction style but the LongSlot definition is limited to the transmission of long values. However, the concept of a one-slot buffer is independent of the type of values stored in the slot and it would be better to define and implement a generic one-slot buffer and instantiate it to hold long values by parameterising it by the long type. Midas supports generic definitions of both data structures and interaction types.

A generic one-slot buffer interaction - let's call it “Slot” - is defined similarly to the LongSlot interaction but is parameterised by the type of values held in the slot. The LongSlot interaction can then be defined as an instantiation of the Slot interaction.

    interaction Slot < type T >
    {
        provide {
            put( T data );
        };
        require {
            // Nothing
        };
        spec "FSP" {
            property SLOT_PROVIDE( N=0 ) = ( in.put[0..N] -> SLOT_PROVIDE ).
            property SLOT_REQUIRE( N=0 ) = ( out.put[0..N] -> SLOT_REQUIRE ).

            SLOT_SERVICE( C=2, N=1 ) = EMPTY,
            EMPTY = ( in[1..C].put[n:0..N] -> FULL[n] ),
            FULL[n:0..N] = ( get[n] -> EMPTY ).

            SLOT_CLIENT( N=1 ) = ( put[n:0..N] -> out.put[n] -> SLOT_CLIENT ).
        };
    };
    typedef Slot LongSlot;

Figure 22. The generic Slot interaction


The FSP models of the Slot and LongSlot interactions are identical; by convention we use cardinal values, typically bits, to represent values of generic types in the FSP models of generic interactions. This allows the model to represent values changing without needing to model the types themselves.

In Java, values of the parameter types of a generic interaction are represented as untyped Object references. The full type of each such object is represented by a Type object that encapsulates how to marshal and unmarshal values of the type it represents. Generic interaction endpoints are handed Type objects representing their type parameters when constructed. Type objects are defined for all the primitive Midas types and are generated by the Midas compiler for all user-defined types. Thus, generic interaction styles can only be instantiated for types that are known to or written in Midas.
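The role of a Type object can be illustrated with a small sketch. The interface name ValueType, its two methods and the IntType class are hypothetical stand-ins invented for this example; the actual Type API used by the Midas compiler is not reproduced in this section and may well differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

final class TypeObjectDemo {
    // Hypothetical stand-in for a Type object: it knows how to marshal
    // and unmarshal untyped Object references of one particular type.
    interface ValueType {
        void marshal(Object value, DataOutputStream out) throws IOException;
        Object unmarshal(DataInputStream in) throws IOException;
    }

    // Assumed type object for 32-bit integer values.
    static final class IntType implements ValueType {
        public void marshal(Object value, DataOutputStream out) throws IOException {
            out.writeInt((Integer) value);
        }
        public Object unmarshal(DataInputStream in) throws IOException {
            return in.readInt();
        }
    }

    // Marshals a value to bytes and back using only its type object,
    // as a generic endpoint would when sending over a transport.
    static Object roundTrip(ValueType type, Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            type.marshal(value, new DataOutputStream(bytes));
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return type.unmarshal(in);
        } catch (IOException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new IntType(), 42)); // prints 42
    }
}
```

The generic code in roundTrip never inspects the value itself; everything type-specific is delegated to the type object, which is why an endpoint can only be instantiated for types that have one.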

The implementations of the generic Slot client and service are shown in figure 23. As can be seen, the implementation is virtually identical to that of the LongSlot interaction, except that the slot is an untyped Object reference and a Type parameter representing the type of the slot is passed to the constructor of the endpoint.

    public class SlotService extends SlotServiceStub {
        private Object _slot = null;
        private boolean _empty = true;

        public SlotService( Type T ) {
            super(T);
        }

        public synchronized Object get() throws InterruptedException {
            while( _empty ) wait();
            _empty = true;
            return _slot;
        }

        public synchronized void put( Object o, SlotClientMessages client ) {
            _slot = o;
            _empty = false;
            notifyAll();
        }
    }

    public class SlotClient extends SlotClientStub {
        public SlotClient( Type T ) {
            super(T);
        }

        public void put( Object o ) throws RegentException {
            binding().put( o, this );
        }
    }

Figure 23. Java implementation of the generic Slot interaction
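For comparison, the same one-slot buffer can be sketched with Java generics instead of Object references and Type objects. This is an illustrative alternative only: the class name TypedSlot is invented here, the stub base classes and bindings are omitted, and marshalling is not addressed at all.

```java
final class TypedSlotDemo {
    // One-slot buffer parameterised with Java generics (assumed design,
    // not the thesis implementation).
    static final class TypedSlot<T> {
        private T slot;
        private boolean empty = true;

        // Blocks until a value is present, then empties the slot.
        synchronized T get() throws InterruptedException {
            while (empty) wait();
            empty = true;
            T value = slot;
            slot = null;
            return value;
        }

        // Stores a value, overwriting any value that has not yet been read.
        synchronized void put(T value) {
            slot = value;
            empty = false;
            notifyAll();
        }
    }

    // Two puts before a get: only the most recent value survives.
    static String overwriteExample() {
        TypedSlot<String> slot = new TypedSlot<>();
        slot.put("first");
        slot.put("second");
        try {
            return slot.get();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(overwriteExample()); // prints second
    }
}
```

The example also makes visible that a slot holds only the latest value: a value delivered before the previous one has been read replaces it.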


4.0.3. Mport Interaction Style

The Slot interaction only buffers a single value at the service endpoint. The service can therefore lose values if the component that owns the service reads values from the slot at a slower rate than they are delivered from clients. We can avoid losing values by queuing them at the service. This interaction style is usually known as an asynchronous message port so let's name this interaction type “Mport”.

The first thing we notice is that the message interfaces of Mport and Slot are identical. The difference is in the FSP model and implementation of the service endpoint.

For the FSP specification, we must model a queue of messages. This is modelled as a chain of Q one-slot buffers, MPORT_BUF, that can hold a single value at any time. A service endpoint is defined in terms of the queue process.

The relabelling statements of line 13 specify that messages delivered to the port are put onto the end of the queue and messages received from the port are removed from the head of the queue.

     1  MPORT_BUF( N = 1 ) = ( put[n:0..N] -> get[n] -> MPORT_BUF ).
     2
     3  ||MPORT_QUEUE( Q=2, N=1 ) =
     4      if Q == 1 then
     5          MPORT_BUF(N)
     6      else
     7          ( b:MPORT_BUF(N) || q:MPORT_QUEUE( Q-1, N ) )
     8          /{ put/b.put, mid/b.get, mid/q.put, get/q.get }
     9          @{ put, get }.
    10
    11  ||MPORT_SERVICE( C=2, Q=2, N=1 ) =
    12      ( q:MPORT_QUEUE(Q,N) )
    13      /{ in[c:1..C].put/q.put, get/q.get }.

[State-transition diagram of MPORT_SERVICE: states 0 to 6, with transitions labelled in.1.put.0, in.2.put.0, in.1.put.1, in.2.put.1, get.0 and get.1.]

Figure 24. FSP model of the Mport service endpoint


The implementation of the Mport service endpoint follows the specification closely, the main difference being that the queue does not have a fixed maximum size.

    public class MportService extends MportServiceStub {
        private Queue _queue = new Queue();

        public MportService( Type T ) {
            super(T);
        }

        public synchronized Object get() throws InterruptedException {
            while( _queue.isEmpty() ) wait();
            return _queue.next();
        }

        public synchronized void put( Object o, MportClientMessages client ) {
            _queue.enqueue(o);
            notify();
        }
    }

Figure 25. Java implementation of the Mport service endpoint
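The following self-contained sketch shows the same queuing behaviour without the generated stub classes. It assumes java.util.ArrayDeque in place of the Queue class used in figure 25, and the names MessagePort and transfer are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

final class MessagePortDemo {
    // Unbounded asynchronous message port (illustrative stand-in for
    // the Mport service endpoint, without stubs or bindings).
    static final class MessagePort<T> {
        private final ArrayDeque<T> queue = new ArrayDeque<>();

        // Blocks until a message is queued, then removes the oldest one.
        synchronized T get() throws InterruptedException {
            while (queue.isEmpty()) wait();
            return queue.removeFirst();
        }

        // Never blocks: the message is appended and a waiting reader is woken.
        synchronized void put(T value) {
            queue.addLast(value);
            notifyAll();
        }
    }

    // A producer thread sends 1..count; the caller receives them in order.
    static List<Integer> transfer(int count) {
        MessagePort<Integer> port = new MessagePort<>();
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= count; i++) port.put(i);
        });
        producer.start();
        List<Integer> received = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) received.add(port.get());
            producer.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(transfer(5)); // prints [1, 2, 3, 4, 5]
    }
}
```

Unlike the one-slot buffer, no value is lost however far the producer runs ahead of the consumer; the cost is that the queue can grow without limit.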

4.0.4. Func Interaction Style

The Mport interaction has two main shortcomings:

• The client does not know whether its messages have been received by the component owning the service.
• Clients can exhaust available memory at the server by sending messages at a higher rate than the component owning the service can process them.

To overcome these shortcomings we need to provide some form of synchronisation between client and service. The simplest form is to use a request/reply interaction: a client sends a request message to the service and must wait for a reply message before sending another request; the reply message acts as an acknowledgement that the service has received the request. As well as limiting the maximum queue size at the service to be the same as the number of its clients, a request/reply interaction allows the service to send a value back to the client, rather like a function call. Therefore, we will call this interaction “Func”. To increase the reusability of the interaction we will define it to be generic, parameterised by the types of request and reply. The message interfaces are shown in figure 26.


    interaction Func< type REQUEST, type REPLY > {
        provide {
            request( REQUEST data );
        };
        require {
            reply( REPLY data );
        };
    };

Figure 26. Message interfaces of the Func interaction

To specify the protocol, we first define the constraints for a single binding. Defining these constraints is more worthwhile for the Func interaction than for the Slot and Mport interactions, because the interaction is made up of multiple messages that must be sent and received in a specific order. It is therefore useful to check both the FSP models of the endpoints at design time and their implementations at runtime in order to catch errors as quickly as possible. Using properties automates both of these checks.

    property FUNC_PROVIDE( M=0, N=0 ) =
        ( in.request[0..M] -> out.reply[0..N] -> FUNC_PROVIDE ).
    property FUNC_REQUIRE( M=0, N=0 ) =
        ( out.request[0..M] -> in.reply[0..N] -> FUNC_REQUIRE ).

Figure 27. Constraint properties of the Func interaction

The model of the Func client is shown in figure 28.

    FUNC_CLIENT( M=1, N=1 ) =
        ( request[m:0..M] -> out.request[m] ->
          in.reply[n:0..N] -> reply[n] -> FUNC_CLIENT ).

Figure 28. Model of the Func client endpoint

The model of the Func service is more complex. Because clients block waiting for a reply to their requests, requests must be queued; dropping a request will cause a deadlock because the client will never receive a reply to its request. However, when pulling a request from the queue the service needs to know which client sent the request so that it can send the reply to the correct client. Therefore the queue must store both the value of the client’s request and the identifier of the client that sent the request. This client identifier corresponds to the back-binding of the Midas binding model, described in section 3.2.1.

The service endpoint itself is composed of the queue and a process, FUNC_SERVICE_IMP, that defines how the component removes requests from the queue and sends replies back to the client that sent the request. The API, represented by the actions request[0..M] and reply[0..N], hides the details of the back-binding from the component. Finally, the relabelling statement of line 4 specifies that requests received from the client are placed onto the back of the queue, and the interface statement of line 5 hides the actions of the queue process, making the actions in, out, request and reply the only observable behaviour of the FUNC_SERVICE process.


     1  ||FUNC_SERVICE( C=2, M=1, N=1 ) =
     2      ( FUNC_SERVICE_IMP(C,M,N)
     3      || q:FUNC_QUEUE(C,C,M) )
     4      /{ in[c:1..C].request[m:0..M]/q.put[c][m] }
     5      @{ in, out, request, reply }.
     6
     7  FUNC_SERVICE_IMP( C=2, M=1, N=1 ) =
     8      ( q.get[c:1..C][m:0..M] -> request[m] ->
     9        reply[n:0..N] -> out[c].reply[n] -> FUNC_SERVICE_IMP ).
    10
    11  ||FUNC_QUEUE( Q=2, C=1, M=1 ) =
    12      if Q == 1 then
    13          FUNC_BUF(C,M)
    14      else
    15          ( b:FUNC_BUF(C,M) || q:FUNC_QUEUE( Q-1, C, M ) )
    16          /{ put/b.put, mid/b.get, mid/q.put, get/q.get }
    17          @{ put, get }.
    18
    19  FUNC_BUF( C=1, M=1 ) = ( put[c:1..C][m:0..M] -> get[c][m] -> FUNC_BUF ).

Figure 29. Model of the Func service endpoint

We can use the constraint properties to check our endpoint models by building a model of a binding between client and service endpoints within the same address space and combining the properties with that model. This is shown in figure 30. By checking the FUNC_BINDING process with the model checker we verify that our endpoint models match the constraints.

    ||FUNC_BINDING( C=2, M=1, N=2 ) =
        ( svc:FUNC_SERVICE(C,M,N)
        || forall[c:1..C] clt[c]:FUNC_CLIENT(M,N)
        || forall[c:1..C] clt[c]:FUNC_REQUIRE(M,N)
        || forall[c:1..C] chk[c]:FUNC_PROVIDE(M,N) )
        /{
            clt[c:1..C].out/{svc.in[c],chk[c].in},
            svc.out[c:1..C]/{clt[c].in,chk[c].out}
        }.

Figure 30. Model of a binding incorporating constraint properties

Again, the implementation of the endpoints for the Func interaction closely follows the FSP specifications. The two points to take note of are:

• The client identifier held in the queue of the service endpoint is implemented as the backbinding reference to the client’s message interface that is passed to the service along with the request message. A simple structure is defined so that the Queue can hold both the request object and the backbinding of the requesting client.
• The synchronisation between client and service is implemented using message passing, as described in section 3.2.2. The calling thread is blocked by waiting on the monitor of the client endpoint until a reply message is received, rather than in the service endpoint.

Additionally, the service class implements the actions request[0..M] and reply[0..N] as the methods getRequest and sendReply, in order to follow Java programming conventions.


    public class FuncService extends FuncServiceStub {
        private static class QueuedRq {
            Object data;
            FuncClientMessages client;

            QueuedRq( Object d, FuncClientMessages c ) {
                data = d;
                client = c;
            }
        }

        private Queue _queue = new Queue();

        public FuncService( Type M, Type N ) {
            super(M,N);
        }

        public synchronized Object getRequest()
            throws RegentException, InterruptedException
        {
            while( _queue.isEmpty() ) wait();
            return ((QueuedRq)_queue.peek()).data;
        }

        public void sendReply( Object d ) throws RegentException {
            QueuedRq rq;
            synchronized(this) {
                rq = (QueuedRq)_queue.next();
            }
            rq.client.reply(d);
        }

        public synchronized void request( Object r, FuncClientMessages c ) {
            _queue.enqueue( new QueuedRq( r, c ) );
            notify();
        }
    }

Figure 31. Java implementation of the Func service endpoint


    public class FuncClient extends FuncClientStub {
        private Object _reply;
        private boolean _waiting;

        public FuncClient( Type M, Type N ) {
            super(M,N);
        }

        public Object call( Object rq ) throws RegentException {
            // Mark the call as pending before the request is sent, so that
            // a reply arriving before this thread waits is not missed.
            _waiting = true;
            binding().request( rq, this );
            synchronized(this) {
                try {
                    while( _waiting ) wait();
                    // Take the reply out of the endpoint before returning it.
                    Object result = _reply;
                    _reply = null;
                    return result;
                }
                catch( InterruptedException ex ) {
                    throw new InteractionInterruptedException(ex.getMessage());
                }
            }
        }

        public synchronized void reply( Object r ) {
            _reply = r;
            _waiting = false;
            notify();
        }
    }

Figure 32. Java Implementation of Func client endpoint
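To show how the two endpoints cooperate, the sketch below wires a simplified client and service together directly, in one address space, without the generated stubs. The names ReplySink, Service, Client and doubleRemotely are invented for this example, java.util.ArrayDeque stands in for the Queue class, and the exception types of the real endpoints are left out.

```java
import java.util.ArrayDeque;

final class FuncDemo {
    // Plays the role of the back-binding to a client's message interface.
    interface ReplySink { void reply(Object value); }

    static final class Service {
        private static final class Queued {
            final Object data;
            final ReplySink client;
            Queued(Object d, ReplySink c) { data = d; client = c; }
        }
        private final ArrayDeque<Queued> queue = new ArrayDeque<>();

        // Message from a client: queue the request with its back-binding.
        synchronized void request(Object data, ReplySink client) {
            queue.addLast(new Queued(data, client));
            notifyAll();
        }

        // API action: wait for the oldest request without removing it.
        synchronized Object getRequest() throws InterruptedException {
            while (queue.isEmpty()) wait();
            return queue.peekFirst().data;
        }

        // API action: reply to the client whose request is at the head.
        void sendReply(Object value) {
            Queued head;
            synchronized (this) { head = queue.removeFirst(); }
            head.client.reply(value); // sent outside the service monitor
        }
    }

    static final class Client implements ReplySink {
        private final Service service;
        private Object reply;
        private boolean waiting;
        Client(Service s) { service = s; }

        Object call(Object request) throws InterruptedException {
            synchronized (this) { waiting = true; }
            service.request(request, this);
            synchronized (this) {
                while (waiting) wait();
                Object result = reply;
                reply = null;
                return result;
            }
        }

        public synchronized void reply(Object value) {
            reply = value;
            waiting = false;
            notifyAll();
        }
    }

    // One server thread answers a single request by doubling it.
    static int doubleRemotely(int x) {
        Service service = new Service();
        Thread server = new Thread(() -> {
            try {
                int request = (Integer) service.getRequest();
                service.sendReply(request * 2);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        server.start();
        try {
            Object result = new Client(service).call(x);
            server.join();
            return (Integer) result;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(doubleRemotely(21)); // prints 42
    }
}
```

Because each client blocks inside call until its reply arrives, a client can have at most one request in the service queue at any time, which is the property that bounds the queue by the number of clients.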

4.0.5. Event Interaction Style

So far we have only considered many-to-one communication in which clients send a request to the service, which may respond to a single client at a time. Event dissemination is an example of one-to-many communication: a service announces typed events to multiple clients. Before a client can receive events it must register with the service, by sending it an enable message, so that the service can store the backbinding to the client through which to send event messages. If the client wants to stop receiving events it sends a disable message; the service responds with a disableAck. The client can later re-enable events.

    interaction Event< type T > {
        provide {
            enable();
            disable();
        };
        require {
            event( T data );
            disableAck();
        };
    };

Figure 33. Message interfaces of the Event interaction

The constraint properties for the Event interaction, shown in figure 34, define this protocol:


    property EVENT_PROVIDE( N=0 ) = DISABLED,
    DISABLED  = ( in.enable -> ENABLED ),
    ENABLED   = ( out.event[0..N] -> ENABLED
                | in.disable -> DISABLING ),
    DISABLING = ( out.event[0..N] -> DISABLING
                | out.disableAck -> DISABLED ).

    property EVENT_REQUIRE( N=0 ) = DISABLED,
    DISABLED  = ( out.enable -> ENABLED ),
    ENABLED   = ( in.event[0..N] -> ENABLED
                | out.disable -> DISABLING ),
    DISABLING = ( in.event[0..N] -> DISABLING
                | in.disableAck -> DISABLED ).

[State-transition diagrams of EVENT_PROVIDE(N) and EVENT_REQUIRE(N): each has states -1, 0, 1 and 2, with transitions labelled by the enable, disable, event[0..N] and disableAck messages.]

Figure 34. Constraint properties of the Event interaction

Figure 35 shows the behaviour of the Event service endpoint. Typically, one would implement an Event service endpoint to maintain a list of back-bindings to the clients that have enabled event reception. On reception of an enable message the service would append the backbinding to the sender onto the list; on reception of a disable message the service would remove the sender’s backbinding from the list. However, FSP does not allow the modelling of processes that change dynamically, so the service behaviour is modelled as multiple EVENT_ANNOUNCER processes, each of which manages the dynamic enabling and disabling of event transmission to a single client.


All EVENT_ANNOUNCER processes share the announce API action of the event service. They start in the DISABLED state, in which they accept the announce action but do nothing in response. When they receive an enable message from their client they enter the ENABLED state. In this state they react to announce actions by sending an event message to their client. They remain in the ENABLED state until they receive a disable message at which point they return to the DISABLED state.
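The list-based implementation strategy described above can be sketched as follows. This is a deliberately single-threaded illustration: the interface EventSink and the classes EventService and Recorder are invented for the example, and the monitor usage that a real endpoint would need under the synchronisation rules of section 3.2.2 is omitted.

```java
import java.util.ArrayList;
import java.util.List;

final class EventServiceDemo {
    // Hypothetical client-side message interface (the back-binding).
    interface EventSink {
        void event(Object data);
        void disableAck();
    }

    // Single-threaded sketch of a service that keeps a list of the
    // back-bindings of clients that have enabled event reception.
    static final class EventService {
        private final List<EventSink> enabled = new ArrayList<>();

        void enable(EventSink client) {
            if (!enabled.contains(client)) enabled.add(client);
        }

        void disable(EventSink client) {
            enabled.remove(client);
            client.disableAck(); // tells the client no more events will follow
        }

        void announce(Object data) {
            // Iterate over a copy so that a client may disable itself
            // from within its event handler.
            for (EventSink client : new ArrayList<>(enabled)) client.event(data);
        }
    }

    // Test client that records every message it receives.
    static final class Recorder implements EventSink {
        final List<String> log = new ArrayList<>();
        public void event(Object data) { log.add("event:" + data); }
        public void disableAck() { log.add("disableAck"); }
    }

    static List<String> run() {
        EventService service = new EventService();
        Recorder client = new Recorder();
        service.announce("a");   // ignored: client has not enabled events
        service.enable(client);
        service.announce("b");   // delivered
        service.disable(client);
        service.announce("c");   // ignored again
        return client.log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [event:b, disableAck]
    }
}
```

Each entry in the list corresponds to one EVENT_ANNOUNCER process in the ENABLED state; a client absent from the list corresponds to an announcer in the DISABLED state.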

Note also that the service-side constraint properties are composed with the model of the service endpoint, rather than with the model of the binding itself. This reduces the intermediate state space of any models using Event interactions with constraint properties.

    ||EVENT_SERVICE( C=2, N=1 ) =
        ( forall[c:1..C] EVENT_ANNOUNCER(N,c)
        || forall[c:1..C] chk[c]:EVENT_PROVIDE(N) )
        /{ out[c:1..C] / chk[c].out,
           in[c:1..C] / chk[c].in }.

    EVENT_ANNOUNCER( N=1, C=1 ) = DISABLED,
    DISABLED = ( in[C].enable -> ENABLED
               | announce[0..N] -> DISABLED ),
    ENABLED  = ( announce[n:0..N] -> out[C].event[n] -> ENABLED
               | in[C].disable -> out[C].disableAck -> DISABLED ).

Figure 35. FSP model of the Event service endpoint

Figure 36 shows the model of the Event client endpoint. Events received at the client are queued for consumption by the component owning the client endpoint. The queue is defined similarly to that of the Mport and Func endpoints. The EVENT_IO process models the behaviour of the client endpoint, showing how its state changes in response to API actions and how received event messages are only placed onto the event queue when the client is in the ENABLED state.

    ||EVENT_CLIENT( N=1, Q=2 ) =
        ( EVENT_IO(N) || q:EVENT_QUEUE(N,Q) || EVENT_REQUIRE(N) )
        /{ receive[n:0..N]/q.get[n] }
        @{ receive[0..N], enable, disable, in, out }.

    EVENT_IO( N=1 ) = DISABLED,
    DISABLED    = ( enable -> out.enable -> ENABLED ),
    ENABLED     = ( in.event[n:0..N] -> q.put[n] -> ENABLED
                  | disable -> DISABLE_OUT ),
    DISABLE_OUT = ( out.disable -> DISABLE_ACK
                  | in.event[0..N] -> DISABLE_OUT ),
    DISABLE_ACK = ( in.disableAck -> DISABLED
                  | in.event[0..N] -> DISABLE_ACK ).

    ||EVENT_QUEUE( N=1, Q=2 ) =
        if Q == 1 then
            EVENT_BUF(N)
        else
            ( b:EVENT_BUF(N) || q:EVENT_QUEUE(N,Q-1) )
            /{ put/b.put, mid/{b.get,q.put}, get/q.get }
            \{mid}.

    EVENT_BUF(N=1) = ( put[n:0..N] -> get[n] -> EVENT_BUF ).

Figure 36. FSP Model of the Event client endpoint

A notable feature of this model is that disabling the endpoint is modelled by two explicit intermediate states, DISABLE_OUT and DISABLE_ACK, rather than a simple transition from the ENABLED to the DISABLED state. This is to avoid synchronisation problems caused by the service being able to send messages to the client at any time, without needing an initial request from the client.

One common synchronisation problem is caused by both endpoints blocking the reception of messages that are being sent by the other party: a classic deadlock situation. The model and code in table 1 illustrate how this can happen when endpoints are directly bound within the same address space: the client will not allow an event message to be delivered between the disable and out.disable actions and the service will not allow an in.disable action between the announce and out.event actions. This form of deadlock can be avoided by following the synchronisation rules described in section 3.2.2. The introduction of the DISABLE_OUT state allows the client endpoint to receive and ignore event messages while reacting to a disable action, and explicitly highlights where synchronisation errors might occur in the implementation if the programmer is not careful to follow the intercomponent synchronisation rules.

FSP model:

    EVENT_IO( N=1 ) = DISABLED,
    DISABLED = ( enable -> out.enable -> ENABLED ),
    ENABLED  = ( in.event[n:0..N] -> q.put[n] -> ENABLED
               | disable -> out.disable -> in.disableAck -> DISABLED ).

    EVENT_ANNOUNCER( N=1, C=1 ) = DISABLED,
    DISABLED = ( in[C].enable -> ENABLED
               | announce[0..N] -> DISABLED ),
    ENABLED  = ( announce[n:0..N] -> out[C].event[n] -> ENABLED
               | in[C].disable -> out[C].disableAck -> DISABLED ).

Java code:

    public class EventClient extends EventClientStub {
        ...
        public synchronized void disable() throws ... {
            binding().disable(this);
        }

        public synchronized void event( Object ev ) {
            ...
        }
        ...
    }

    public class EventService extends EventServiceStub {
        ...
        public synchronized void announce( Object ev ) throws ... {
            for each client, c, do {
                c.event(ev);
            }
        }
        ...
    }

Table 1. A common synchronisation error detected by static analysis


The disableAck message is used to avoid a problem caused by transport connections delaying messages that are sent between address spaces when the interaction is not request/reply. If an explicit acknowledgement were not included in the protocol, it would be possible for the client to send a disable request and enter the DISABLED state while an event message from the service is still in transit over the transport connection. This would result in an event message being received by the client when it is in a state in which that message is invalid, as shown in figure 37. The introduction of the disableAck message allows the client to wait in the DISABLE_ACK state until it has been informed that no more events will be sent to it.

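[Sequence diagram: anEventClient sends disable() to anEventService and enters the DISABLED state; an event(x) message from the service then arrives at the client, marked “Error!”.]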

Figure 37. Potential error caused by binding between address spaces
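The effect of the acknowledgement on the client's state machine can be sketched in Java. The sketch is an assumption-laden simplification: the class and method names are invented, outgoing messages are not actually sent, and the DISABLE_OUT state of the FSP model is merged into DISABLE_ACK for brevity.

```java
import java.util.ArrayList;
import java.util.List;

final class EventClientStates {
    enum State { DISABLED, ENABLED, DISABLE_ACK }

    // Simplified client endpoint: API actions are enable/disable,
    // incoming messages are onEvent/onDisableAck.
    static final class Client {
        private State state = State.DISABLED;
        private final List<Object> queue = new ArrayList<>();
        private int ignored = 0;

        void enable()  { require(State.DISABLED); state = State.ENABLED; }
        void disable() { require(State.ENABLED);  state = State.DISABLE_ACK; }

        void onEvent(Object data) {
            switch (state) {
                case ENABLED:     queue.add(data); break; // deliver to component
                case DISABLE_ACK: ignored++;       break; // in transit: discard
                default:
                    throw new IllegalStateException("event while DISABLED");
            }
        }

        void onDisableAck() { require(State.DISABLE_ACK); state = State.DISABLED; }

        private void require(State expected) {
            if (state != expected)
                throw new IllegalStateException("in " + state + ", expected " + expected);
        }

        String summary() {
            return state + " queued=" + queue + " ignored=" + ignored;
        }
    }

    // An event that was already in transit arrives after disable()
    // but before the acknowledgement.
    static String run() {
        Client c = new Client();
        c.enable();
        c.onEvent("x");
        c.disable();
        c.onEvent("y");
        c.onDisableAck();
        return c.summary();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints DISABLED queued=[x] ignored=1
    }
}
```

Without the intermediate state, the late event "y" would reach a client that is already DISABLED, which this sketch reports as an IllegalStateException.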

4.0.6. Attribute Interaction Style

The Attribute interaction demonstrates an interaction style in which the service can decide to send messages to a subset of its clients. The Attribute interaction defines typed attributes exposed at a component’s interface, also known as “properties”. An Attribute service maintains a value for the owner of the service and exposes that value for use by clients. Clients can query the current value of a component’s attribute. Whenever a component changes the value of its attribute, the new value is transmitted to all clients. Clients can also try to set the value of the Attribute; the component that owns the attribute can accept or refuse the new value. When a client changes the value of the attribute, an accept or refuse notification is sent back to the client requesting the change and an update is sent to all other clients if the change was accepted.


The UML [BJR97] sequence diagram of figure 38 shows potential collaborations between an Attribute service and its clients. Solid arrow heads signify procedure calls, dashed arrows signify threads returning from procedure calls and half arrow heads indicate asynchronous messages.

[UML sequence diagram with lifelines Client Component, AttributeClient Endpoint, AttributeService Endpoint, Other AttributeClient Endpoints and Server Component, showing three numbered steps that exchange getValue, update, waitForChange, setValue, getProposedValue, setAck and acceptValue.]

Figure 38. Example executions of the Attribute interaction

As shown in step 1 of figure 38, the client endpoint initialises itself the first time the component requests the attribute’s value by sending a getValue request to the service; the service responds with an update message containing the current value of the attribute. Further queries to the client endpoint for the attribute’s value will return a cached copy of that value.

The component owning an attribute service can change the value of the attribute. As shown in step 2, the service endpoint then notifies clients by sending them update messages containing the new value of the attribute. Client endpoints react to this message by updating their cached value and informing their owning component that the value has changed.

Clients can request a new value of the attribute. As shown in step 3, they do so by sending a setValue message containing the proposed new value of the attribute. The server component can accept or refuse proposed values. If it accepts the value, as is shown in step 3, the service sends a setAck message back to the client that proposed the value and notifies all other clients of the new value by sending them an update message containing that value. If the server component refuses the proposed value, the service sends a setNak message back to the client that proposed the value. Figure 39 shows the Midas definition of the message interfaces and protocol constraints for this protocol, and figure 40 graphically depicts the protocol constraints.

    interaction Attribute< type T > {
        provide {
            getValue();
            setValue( T value );
        };
        require {
            update( T value );
            setAck();
            setNak();
        };
        spec "FSP" {
            property ATTRIBUTE_PROVIDE( V=0 ) = READY,
            READY = ( in.getValue -> out.update[0..V] -> READY
                    | out.update[0..V] -> READY
                    | in.setValue[0..V] -> REPLY ),
            REPLY = ( out.update[0..V] -> REPLY
                    | out.{setAck,setNak} -> READY ).

            property ATTRIBUTE_REQUIRE( V=0 ) = READY,
            READY = ( out.getValue -> in.update[0..V] -> READY
                    | in.update[0..V] -> READY
                    | out.setValue[0..V] -> REPLY ),
            REPLY = ( in.update[0..V] -> REPLY
                    | in.{setAck,setNak} -> READY ).
        };
    };

Figure 39. Message sets and constraint properties of the Attribute interaction

[State-transition diagrams of the ATTRIBUTE_REQUIRE(N) and ATTRIBUTE_PROVIDE(N) properties, with transitions labelled by the getValue, setValue[0..N], update[0..N], setAck and setNak messages.]

Figure 40. Constraint properties of the Attribute interaction

The ATTRIBUTE_CLIENT process modelling the client side of the Attribute interaction, shown in figure 41, is the composition of a model of the implementation of the client endpoint, ATTRIBUTE_CLIENT_IMP, and the ATTRIBUTE_REQUIRE property that checks the model’s conformance to the interaction protocol. The ATTRIBUTE_CLIENT_IMP process is the actual model of the client endpoint’s behaviour.

Line 4 states that a client endpoint starts in the UNSET state in which it does not know the value of the attribute service to which it is bound. The UNSET state is defined by lines 5 to 7; the client can acquire the value from an unsolicited update message, transmitted when another component updates the attribute’s value, or may request the value by sending a getValue message and receiving an update message in reply. Either trace will result in the client entering the state VALUE[v], where v is the current value of the attribute. This definition allows for both eager and lazy initialisation of the client: the only constraint is that the client endpoint be in one of the VALUE states before a component first synchronises on its getValue API action. From a programmer’s point of view, if a component calls getValue the client must request and receive the value before the getValue action returns.

     1  ||ATTRIBUTE_CLIENT( V=1 ) =
     2      ( ATTRIBUTE_CLIENT_IMP(V) || ATTRIBUTE_REQUIRE(V) ).
     3
     4  ATTRIBUTE_CLIENT_IMP( V=1 ) = UNSET,
     5  UNSET =
     6      ( in.update[v:0..V] -> VALUE[v]
     7      | out.getValue -> in.update[v:0..V] -> VALUE[v] ),
     8  VALUE[v:0..V] =
     9      ( getValue[v] -> VALUE[v]
    10      | setValue[w:0..V] -> SENDING[v][w]
    11      | in.update[w:0..V] -> CHANGED[w] ),
    12  CHANGED[v:0..V] =
    13      ( valueChanged -> VALUE[v]
    14      | getValue[v] -> VALUE[v]
    15      | setValue[w:0..V] -> SENDING[v][w]
    16      | in.update[w:0..V] -> CHANGED[w] ),
    17  SENDING[v:0..V][w:0..V] =
    18      ( out.setValue[w] -> WAITING[v][w]
    19      | in.update[x:0..V] -> SENDING[x][w] ),
    20  WAITING[v:0..V][w:0..V] =
    21      ( in.update[x:0..V] -> WAITING[x][w]
    22      | in.setAck -> VALUE[w]
    23      | in.setNak -> VALUE[v] ).

Figure 41. FSP model of the client endpoint of the Attribute interaction.

The VALUE states, defined in lines 8 to 11, define the behaviour of the client endpoint when it has been informed of the attribute’s value and the component is processing the current value. Any call to getValue returns the current value. When an update message is received from the service, the component is notified of the change of value, via the valueChanged event, and the client enters the VALUE state for the new value. A call by the component to the setValue API action causes the endpoint to enter the SENDING state, from which it sends a setValue message to the service, enters the WAITING state and waits for a setAck or setNak in reply. However, because the service can also send update messages to the client at any time, the client must be willing to receive update messages while performing the request/reply transaction, in accordance with the Midas model of asynchronous and concurrent communication described in section 3.2.2. Thus both the SENDING and WAITING states accept incoming update messages and keep track of the current value of the attribute until a setAck or setNak message is received. A setAck message is handled by updating the local value to that requested by the client, while a setNak message is handled by resetting the local value to the current value of the remote service.
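The value-tracking behaviour described above can be sketched as a small Java class. This is an illustration under simplifying assumptions: the class and method names are invented, message sending and the valueChanged notification are omitted, and the SENDING and WAITING states of the FSP model are merged into a single waiting flag.

```java
final class AttributeClientSketch {
    static final class Client<T> {
        private boolean known = false;   // false corresponds to UNSET
        private boolean waiting = false; // true corresponds to SENDING/WAITING
        private T value;                 // last value reported by the service
        private T proposed;              // value carried by the pending setValue

        // Incoming update message: accepted in every state.
        void onUpdate(T v) {
            value = v;
            known = true;
        }

        // API action: the component proposes a new value.
        void setValue(T v) {
            if (!known || waiting)
                throw new IllegalStateException("setValue not allowed in this state");
            proposed = v;
            waiting = true;
        }

        // Incoming setAck: the proposed value becomes the local value.
        void onSetAck() {
            if (!waiting) throw new IllegalStateException("unexpected setAck");
            value = proposed;
            waiting = false;
        }

        // Incoming setNak: keep the value most recently reported by the service.
        void onSetNak() {
            if (!waiting) throw new IllegalStateException("unexpected setNak");
            waiting = false;
        }

        // API action: only available once a value is known and no
        // proposal is pending.
        T getValue() {
            if (!known || waiting)
                throw new IllegalStateException("getValue not allowed in this state");
            return value;
        }
    }

    // An update arrives while each proposal is pending; the first proposal
    // is refused and the second is accepted.
    static String run() {
        Client<Integer> c = new Client<>();
        c.onUpdate(1);
        c.setValue(5);
        c.onUpdate(2);
        c.onSetNak();
        int afterNak = c.getValue();
        c.setValue(7);
        c.onUpdate(3);
        c.onSetAck();
        int afterAck = c.getValue();
        return afterNak + "," + afterAck;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 2,7
    }
}
```

After the refused proposal the client holds 2, the value reported by the update that arrived in the meantime, not the value 1 it held when the proposal was made.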

The ATTRIBUTE_SERVICE model, shown in figure 42, is more complex, because it must specify how the service interacts with multiple clients and queues requests to be handled by the service owner. Because of the necessity to handle incoming and outgoing messages concurrently, the reception and queuing of messages and the storing of and updates to the attribute’s value are specified as two separate processes, ATTRIBUTE_QUEUE and ATTRIBUTE_VALUE. An ATTRIBUTE_PROVIDE property is composed to verify each binding from client to service by line 4 and by the relabelling statements of lines 7 and 8.

     1  ||ATTRIBUTE_SERVICE( C=2, V=1 ) =
     2      ( ATTRIBUTE_VALUE(C,V)
     3      || q:ATTRIBUTE_QUEUE(C,C,V)
     4      || forall [c:1..C] check[c]:ATTRIBUTE_PROVIDE(V) )
     5      /{
     6          in[c:1..C].setValue[v:0..V] / q.in.setValue[c][v],
     7          in[c:1..C] / check[c].in,
     8          out[c:1..C] / check[c].out
     9      }
    10      @{ in, out, proposal, accept, refuse, setValue, getValue }.

Figure 42. FSP model of the service endpoint of the Attribute interaction

The ATTRIBUTE_VALUE process shown in figure 43 models how the Attribute service stores the value of the attribute and handles local and remote updates to that value. The process starts in the VALUE[0] state. If a getValue request is received, an update reply is sent back to the client with the current value. If the component owning the service calls the setValue API action to set the attribute to a new value, the process sends an update message containing the new value to all the clients. If a setValue request is queued for processing, the process notifies the component of the proposed value and enters the PROPOSED state indexed by the requesting client, the current value and the proposed value. The component can either refuse the new value, in which case a setNak reply is sent back to the client before the process returns to the VALUE state for the original value, or accept the value, in which case the process enters the UPDATE_ACK loop, in which it sends a setAck message to the requesting client and an update with the new value to every other client before returning to the VALUE state for the new value.

    ATTRIBUTE_VALUE( C=2, V=1 ) = VALUE[0],
    VALUE[v:0..V] =
        ( setValue[w:0..V] -> if w == v then VALUE[w] else UPDATE[1][w]
        | getValue[v] -> VALUE[v]
        | in[c:1..C].getValue -> out[c].update[v] -> VALUE[v]
        | q.out.setValue[c:1..C][w:0..V] -> proposal[w] -> PROPOSED[c][v][w] ),
    UPDATE[c:1..C][v:0..V] =
        ( out[c].update[v] -> if c == C then VALUE[v] else UPDATE[c+1][v] ),
    PROPOSED[c:1..C][v:0..V][w:0..V] =
        ( accept -> UPDATE_ACK[1][c][w]
        | refuse -> out[c].setNak -> VALUE[v] ),
    UPDATE_ACK[c:1..C][d:1..C][v:0..V] =
        ( when (c == d) out[c].setAck -> UPDATE_CNT[c][d][v]
        | when (c != d) out[c].update[v] -> UPDATE_CNT[c][d][v] ),
    UPDATE_CNT[c:1..C][d:1..C][v:0..V] =
        if( c == C ) then VALUE[v] else UPDATE_ACK[c+1][d][v].

Figure 43. FSP model of the value held by an Attribute service.
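The way the ATTRIBUTE_VALUE process distributes messages can be mirrored in a short, single-threaded Java sketch. All names here (ClientSink, Service, bind, handleProposal, Recorder) are invented for the illustration, the attribute is fixed to an int, and the request queue and the owner's accept/refuse decision are reduced to a boolean argument.

```java
import java.util.ArrayList;
import java.util.List;

final class AttributeServiceSketch {
    // Hypothetical client-side message interface (the back-binding).
    interface ClientSink {
        void update(int value);
        void setAck();
        void setNak();
    }

    static final class Service {
        private final List<ClientSink> clients = new ArrayList<>();
        private int value = 0; // the model also starts in VALUE[0]

        void bind(ClientSink client) { clients.add(client); }

        // A client's getValue request is answered with an update.
        void getValue(ClientSink client) { client.update(value); }

        // Owner's setValue API action: every client is updated on a change.
        void setValue(int newValue) {
            if (newValue == value) return;
            value = newValue;
            for (ClientSink c : clients) c.update(newValue);
        }

        // One dequeued proposal, with the owner's decision passed in.
        void handleProposal(ClientSink proposer, int proposed, boolean accept) {
            if (!accept) {
                proposer.setNak();
                return;
            }
            value = proposed;
            for (ClientSink c : clients) {
                if (c == proposer) c.setAck(); // proposer gets the acknowledgement
                else c.update(proposed);       // everyone else gets the new value
            }
        }
    }

    // Test client that records every message it receives.
    static final class Recorder implements ClientSink {
        final List<String> log = new ArrayList<>();
        public void update(int value) { log.add("update:" + value); }
        public void setAck() { log.add("setAck"); }
        public void setNak() { log.add("setNak"); }
    }

    static String run() {
        Service service = new Service();
        Recorder a = new Recorder();
        Recorder b = new Recorder();
        service.bind(a);
        service.bind(b);
        service.handleProposal(a, 5, true);  // accepted proposal from a
        service.handleProposal(b, 9, false); // refused proposal from b
        return a.log + "|" + b.log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [setAck]|[update:5, setNak]
    }
}
```

The loop in handleProposal plays the part of the UPDATE_ACK and UPDATE_CNT states, which step through the clients in index order and choose between setAck and update for each one.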

The queuing of setValue requests is modelled by the ATTRIBUTE_QUEUE process shown in figure 44. The queue is defined recursively in terms of one-slot buffers that hold setValue requests for a particular value from an identified client. The client identifier, which corresponds to the back-binding to the client, as described in section 3.2.1, must be queued so that the service can send replies to the client and update all other clients.

||ATTRIBUTE_QUEUE( Q=1, C=1, V=1 ) =
  if Q == 1 then
    ATTRIBUTE_BUF(C,V)
  else
    ( b:ATTRIBUTE_BUF(C,V) || q:ATTRIBUTE_QUEUE(Q-1,C,V) )
    /{ in/b.in, mid/{b.out,q.in}, out/q.out }
  @{ in, out }.

ATTRIBUTE_BUF( C=1, V=1 ) =
  ( in.setValue[c:1..C][v:0..V] -> out.setValue[c][v] -> ATTRIBUTE_BUF ).

Figure 44. FSP model of how client requests are queued at an Attribute service.

In practice, the service does not know how many clients will be bound to it, and so queue size is limited by available memory. However, FSP is not able to model an unbounded queue. Luckily, the maximum size of the queue is guaranteed to be the same as the number of clients bound to the service because the interaction protocol does not allow a client to send a request while a reply to a previous request is pending. We can verify this constraint by defining a further property to be composed into the ATTRIBUTE_SERVICE definition that checks that the queue does not overflow. This property is shown in figure 45.

property ATTRIBUTE_QUEUE_OVERFLOW( Q=2, C=2, V=1 ) = QUEUE[0],
QUEUE[q:0..Q] = ( when q < Q q.in.setValue[1..C][0..V] -> QUEUE[q+1]
                | when q > 0 q.out.setValue[1..C][0..V] -> QUEUE[q-1] ).

[Diagram: labelled transition system of ATTRIBUTE_QUEUE_OVERFLOW with states -1 (error), 0, 1 and 2, and transitions labelled q.in.setValue[1..C][0..V] and q.out.setValue[1..C][0..V].]

Figure 45. The property to check that the queue of an Attribute service does not overflow.
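The argument that the queue never needs more slots than there are clients can be illustrated with a small simulation. The sketch below is written in Python rather than FSP and is an assumption-laden illustration, not a substitute for model checking: QueueOverflowMonitor and simulate are hypothetical names, and the random scheduler merely samples runs in which no client has more than one request outstanding.

```python
import random


class QueueOverflowMonitor:
    """Counts queued setValue requests; leaving the range 0..capacity is a violation."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.length = 0
        self.violated = False

    def enqueue(self):
        # corresponds to q.in.setValue, allowed only when length < capacity
        if self.length < self.capacity:
            self.length += 1
        else:
            self.violated = True

    def dequeue(self):
        # corresponds to q.out.setValue, allowed only when length > 0
        if self.length > 0:
            self.length -= 1
        else:
            self.violated = True


def simulate(clients, capacity, steps, seed=0):
    """Random run in which a client never has two requests outstanding."""
    rng = random.Random(seed)
    monitor = QueueOverflowMonitor(capacity)
    queue = []                            # clients whose request is queued
    idle = set(range(1, clients + 1))     # clients allowed to send a request
    for _ in range(steps):
        if idle and (not queue or rng.random() < 0.5):
            client = rng.choice(sorted(idle))
            idle.remove(client)
            queue.append(client)
            monitor.enqueue()
        else:
            client = queue.pop(0)
            monitor.dequeue()
            idle.add(client)              # the reply frees the client to send again
    return monitor.violated
```

With capacity equal to the number of clients the monitor is never violated, because every queued request belongs to a distinct client that is blocked until its reply arrives.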

4.1. Modelling Bindings in FSP

A local binding between endpoints is modelled by using the FSP relabelling operator to associate the in and out interfaces of each client endpoint with the service’s out and in interfaces with the same index.


||ATTRIBUTE_LOCAL_BINDING( C=2, V=1 ) =
  ( s:ATTRIBUTE_SERVICE(C,V) || forall c[n:1..C]:ATTRIBUTE_CLIENT(V) )
  /{ c[n:1..C].out/s.in[n], s.out[n:1..C]/c[n].in }.

Figure 46. A model of C clients bound to a server in the same address space.

Remote binding is implemented by connecting endpoints to two ends of a process representing a duplex transport connection. A process modelling the transport connection has an array of two endpoint interfaces, named end[1] and end[2], each of which has two sub-interfaces, named send and recv, through which messages are sent and received. The send and recv interfaces are themselves arrays; the indexed elements of each interface are relabelled to the identifiers of the messages sent or received through that interface. The definition of the process modelling the connection is parameterised by the number of messages and buffers in each direction. An example remote binding for the Attribute interaction is shown in figure 47.

||TRANSPORT_CONNECTION( M1=1, B1=2, M2=1, B2=2 ) = ...

||ATTRIBUTE_REMOTE_BINDING( C=1, B1=2, B2=2 ) =
  ( forall [n:1..C] c[n]:ATTRIBUTE_CLIENT
  || s:ATTRIBUTE_SERVICE(C)
  || forall [n:1..C] t[n]:TRANSPORT_CONNECTION( 2, B1, 3, B2 ) )
  /{ c[n:1..C].out.getValue / t[n].end[1].send[1],
     c[n:1..C].out.setValue / t[n].end[1].send[2],
     c[n:1..C].in.update / t[n].end[1].recv[1],
     c[n:1..C].in.setAck / t[n].end[1].recv[2],
     c[n:1..C].in.setNak / t[n].end[1].recv[3],
     s.out[n:1..C].update / t[n].end[2].send[1],
     s.out[n:1..C].setAck / t[n].end[2].send[2],
     s.out[n:1..C].setNak / t[n].end[2].send[3],
     s.in[n:1..C].getValue / t[n].end[2].recv[1],
     s.in[n:1..C].setValue / t[n].end[2].recv[2]
   }.

Figure 47. Modelling bindings over a transport protocol

At first glance, it would appear that FSP is not well suited to model distributed interactions that follow the Midas model. The Midas model of interaction is that protocol endpoints communicate by sending asynchronous messages but FSP processes synchronise through shared actions, equivalent to a synchronous procedure call, and must have a finite number of states, so making it impossible to model completely asynchronous communication. However, in practice, reliable transport protocols provide only a finite amount of asynchrony between sender and receiver because they perform buffering and flow control as part of their mechanisms to ensure reliability and in-order delivery, and networks can only queue a finite number of messages before becoming congested and discarding messages.

Message buffering is modelled using single-slot buffers, shown in figure 48, that are repeatedly used to send and receive messages: a sender shares the send actions and a receiver shares the recv actions. Each buffer can hold M different message types. A transport connection is modelled by composing buffers so that up to some maximum number of messages can be in transit over the connection at any one time.


BUF( M=2 ) = ( send[m:1..M] -> recv[m] -> BUF ).

Figure 48. One-slot buffer modelled in FSP.

Ordered, reliable, simplex connections that buffer multiple messages are modelled as chains of BUF processes in which subsequent buffers receive messages from the buffer in front. Duplex connections are modelled as two simplex connections carrying messages in each direction. These two connections are hidden; the interfaces end[1] and end[2] are exposed to represent the two ends of the duplex connection at which messages can be sent and received.

[Diagram: the RODCX process, composed of two simplex connections s and t, each a chain of BUF processes linking send to recv, with the two ends of the duplex connection exposed as end1 and end2.]

||ROSCX( M=1, B=2 ) =
  if B == 1 then
    BUF(M)
  else
    (BUF(M)/{mid/recv} || ROSCX( M, B-1 )/{mid/send} )
  @{send,recv}.

||RODCX( M1=1, B1=2, M2=1, B2=2 ) =
  (s:ROSCX(M1,B1) || t:ROSCX(M2,B2))
  /{ end[1].send / s.send,
     end[1].recv / t.recv,
     end[2].send / t.send,
     end[2].recv / s.recv }
  @{end[1],end[2]}.

Figure 49. Reliable, ordered, simplex and duplex connections

Unreliable, ordered connections are modelled using reliable connections with the first buffer replaced by one that may discard messages. Such connections can be used to model streaming protocols such as ATM.

[Diagram: the UOSCX process, a chain in which a UBUF process is followed by BUF processes between send and recv.]

UBUF( M=1 ) = ( send[i:1..M] -> {discard,recv[i]} -> UBUF ).

||UOSCX( M=1, B=2 ) =
  if B == 1 then
    UBUF(M)
  else
    (UBUF(M)/{mid/recv} || ROSCX( M, B-1 )/{mid/send} )
  @{send,recv}.

Figure 50. Unreliable, ordered, simplex connection.

Unordered, reliable, simplex connections are modelled using B one-slot buffers that work in parallel. An input process non-deterministically selects one empty buffer to which to pass each transmitted message and an output process non-deterministically selects one full buffer whose message to deliver. Thus up to B messages can be in transit at any time and can be delivered in any order. Unordered, reliable connections are useful for modelling reliable multicast protocols.

[Diagram: the RUSCX process, in which a FAN OUT process passes messages from send to parallel BUF processes and a FAN IN process passes buffered messages to recv.]

FAN_OUT( M = 1, B = 2 ) = ( send[m:1..M] -> buf[1..B].send[m] -> FAN_OUT ).

FAN_IN( M = 1, B = 2 ) = ( buf[1..B].recv[m:1..M] -> recv[m] -> FAN_IN ).

||RUSCX( M = 2, B = 2 ) =
  ( FAN_OUT(M,B) || forall[b:1..B] buf[b]:BUF(M) || FAN_IN(M,B) )
  @{send,recv}.

Figure 51. Reliable, Unordered Simplex Connection

Unreliable, unordered simplex connections are based on reliable, unordered connections but the FAN_OUT process is replaced by one that can discard messages. This model can be used to represent communication over datagram protocols such as UDP/IP.


UFAN_OUT( M = 1, B = 2 ) =
  ( send[m:1..M] -> {discard,buf[1..B].send[m]} -> UFAN_OUT ).

||UUSCX( M = 2, B = 2 ) =
  ( UFAN_OUT(M,B) || forall[b:1..B] buf[b]:BUF(M) || FAN_IN(M,B) )
  @{send,recv,discard}.

Figure 52. Unreliable, unordered simplex connection

Like reliable, ordered connections, duplex unreliable and unordered connections are composed from two simplex connections carrying messages in opposite directions.
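The four simplex connection models differ only in whether messages may be discarded and whether they may be reordered; in every case at most B messages are in transit. The following Python sketch illustrates this reading under stated assumptions: SimplexConnection, WouldBlock and the loss probability parameter are hypothetical names introduced for the example, and a seeded pseudo-random generator stands in for FSP's non-deterministic choice.

```python
import random


class WouldBlock(Exception):
    """Raised where the FSP model would simply not offer the action."""


class SimplexConnection:
    """Bounded channel holding at most `capacity` messages in transit."""

    def __init__(self, capacity, ordered=True, reliable=True, loss=0.0, seed=0):
        self.capacity = capacity
        self.ordered = ordered
        self.reliable = reliable
        self.loss = loss                  # drop probability, an assumed parameter
        self.rng = random.Random(seed)
        self.in_transit = []

    def can_send(self):
        # a send action is enabled only while some buffer is empty
        return len(self.in_transit) < self.capacity

    def send(self, message):
        """Return True if the message entered the connection, False if discarded."""
        if not self.can_send():
            raise WouldBlock("all buffers are full; the sender must wait")
        if not self.reliable and self.rng.random() < self.loss:
            return False                  # plays the role of the discard action
        self.in_transit.append(message)
        return True

    def recv(self):
        if not self.in_transit:
            raise WouldBlock("no message in transit")
        # ordered: oldest message first; unordered: any buffered message
        index = 0 if self.ordered else self.rng.randrange(len(self.in_transit))
        return self.in_transit.pop(index)
```

A duplex connection can then be represented by a pair of such objects, one per direction, mirroring the composition used for RODCX.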

4.2. Modelling Components in FSP

A primitive application component is modelled as a composite process composed of processes representing the interaction objects that it uses and a primitive process that shares API actions with those interaction processes. Only the interaction objects are exposed at the interface of the composite object. These models are generated automatically by the Darwin ADL compiler.

[Diagram: the composite process COMP, containing a : ATTRIBUTE_SERVICE(C,V), p : PIPE_CLIENT and a primitive process. Labels on the diagram: getValue[0..V], setValue[0..V], proposal[0..V], accept, refuse, write, in[1..C], out[1..C], in, out.]

Figure 53. A primitive component modelled as FSP processes

Primitive component models are composed to form models of composite components. Internal bindings between the endpoints of subcomponents may be modelled by a process representing a transport connection or be a direct local binding, as described in section 4.1. Following the semantics of Darwin defined in [MDEK95], bindings that expose provided or required endpoints of a subcomponent at the interface of the composite component are never modelled by transport connections.


4.3. Supporting Endpoint Implementation

4.3.1. Guiding Implementation

The Midas compiler translates interaction declarations into endpoint stubs and the “glue” that supports communication between interaction endpoints but not the implementation of the endpoints themselves. The programmer must extend the endpoint stubs to interface the endpoint with the internal implementation of their components. The FSP specification of the endpoint behaviour included in the Midas definition provides an exact specification of the protocol to be implemented.

However, programmers do not enjoy deciphering formal specifications to determine the behaviour that they must implement. It is important to support programmers by helping them understand the protocols that they must use. A powerful aid to understanding is the use of visualisation tools, particularly those that use animation and allow the interactive exploration of a model. The use of formal models in a Midas interaction definition allows the models to be passed to animation tools that let the user explore the state machines by hand, manually selecting which events are triggered, and that graphically highlight the transitions executed in response to those events. Plug-ins are provided to visualise message passing between client and service endpoints in a more natural manner.

Figure 54. An animator allowing interactive exploration of a Midas interaction protocol

Further support for the programmer implementing endpoints is provided by generating test code from the FSP properties that specify the observable behaviour of the endpoints, as described in section 4.3.2.


4.3.2. Runtime Verification

Midas interaction declarations include formal specifications of the interaction protocol that programmers can follow when implementing endpoints. However, programmers must still test each implementation in order to be confident that it conforms to the protocol. Writing such tests is tedious and is itself error prone. Therefore, a back-end to the Midas compiler is used to automatically generate this code from the formal specification of the protocol. This has the further advantage of providing a tangible benefit to the programmer and will therefore help ensure that the design information is included in the Midas declarations and kept up to date.

Test code that verifies endpoints’ conformance to the interaction protocol is implemented as interaction filters, as described in section 3.2.3. Test filters are generated from the FSP properties that specify the observable behaviour of each end of the binding. The filters execute the state machines defined by the properties, reacting to incoming and outgoing messages by performing state transitions. Any transitions to the error state are reported to the application by notifying monitoring agents of the error event; events can be announced through Midas services, allowing monitoring agents to be remote from the components being tested. Erroneous transmissions can also be reported by throwing an exception to be handled by the component that owns the endpoint; this behaviour can be turned on or off at runtime through the test filter’s control interface. Once in the error state, the binding is deactivated: the test filter discards any transmitted and received messages, including those that caused the transition to the error state. The binding can be reset to a valid state through a control interface on the test filter.

[Diagram: a Client Component in the Client Address Space bound over a Transport Connection to Service Thread(s) in the Service Address Space, and a Monitoring Agent in a separate Agent Address Space reached over a further Transport Connection.]

Figure 55. Test filters can be monitored by remote management agents
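The behaviour of a test filter can be sketched as a table-driven state machine placed in the message path. The Python below is a minimal illustration under assumptions, not the code generated by the Midas compiler: ConformanceFilter, ProtocolError, the on_error callback and the two-state example property are all hypothetical, and real filters would be derived from the FSP properties of each endpoint.

```python
class ProtocolError(Exception):
    """Raised to the owning component when exception reporting is enabled."""


class ConformanceFilter:
    """Runs a property state machine over the messages crossing a binding."""

    def __init__(self, transitions, start, on_error=None, raise_errors=False):
        self.transitions = transitions      # {(state, event): next_state}
        self.start = start
        self.state = start
        self.on_error = on_error or (lambda event: None)  # notify monitoring agents
        self.raise_errors = raise_errors    # switchable at run time
        self.failed = False

    def observe(self, event):
        """Return True if the message may pass, False if it is discarded."""
        if self.failed:
            return False                    # binding deactivated: drop everything
        key = (self.state, event)
        if key not in self.transitions:
            self.failed = True              # transition to the error state
            self.on_error(event)
            if self.raise_errors:
                raise ProtocolError(f"unexpected {event!r} in state {self.state!r}")
            return False                    # the offending message is also dropped
        self.state = self.transitions[key]
        return True

    def reset(self):
        """Control operation that returns the binding to a valid state."""
        self.state = self.start
        self.failed = False
```

For example, a simplified client-side property in which each getValue request must be answered by one update could be written as {("idle", "out.getValue"): "waiting", ("waiting", "in.update"): "idle"} with start state "idle"; an unsolicited in.update would then deactivate the binding until reset is called.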

4.4. Summary

This chapter has introduced the Midas language that is used to define interaction styles that follow the interaction model introduced in chapter 2. The advantage of Midas is that it supports both the design-time analysis of interaction styles and the construction of components and systems that make use of those interaction styles. Design is supported by including specifications of the interaction protocol within the Midas declarations. Construction is supported by translating Midas definitions into implementation language constructs that provide the “glue” to connect components using the interaction style; the design of the Java runtime support for Midas interactions is described in detail in chapter 6. Midas links the design and construction phases by generating objects that can be inserted into bindings to check that components interacting over those bindings conform to the interaction protocol specified by the Midas definition.

Midas interactions are independent from the transport used to carry interaction messages between endpoints. The Midas runtime model supports efficient interactions within the same address space and allows the transport protocol to be selected at run time to transport interactions between address spaces. The framework for implementing transport protocols and the mechanisms used to dynamically load and compose protocol components at run time are described in chapter 7.


5. Case Study: On-line Music Shop


Since Aristotle... the main emphasis in [man’s] language... has been on the identification of objects rather than on the relationships between objects.

The 13th Floor Elevators

5.1. Introduction

This chapter illustrates the benefits of using Darwin and Midas to design a distributed application: an on-line record shop server and client. The design highlights the need for multiple forms of interaction between system components and different transport protocols for individual bindings. The chapter shows the use of Midas to design and analyse interaction types and shows how generic interaction styles can be used to capture common forms of interaction for future reuse.

5.2. Overview

The selling of digital music tracks over public computer networks is an application that is receiving a great deal of interest from both consumers and industry. In such a scenario, multiple service providers maintain databases of digital music tracks. A client wanting to buy music browses the tracks available at an on-line record store and can listen to streamed samples of tracks in which they are interested before paying for and downloading high-quality versions of the files onto their local computer or hi-fi.

[Diagram: the Client, Record Shop, Media Stores and HiFi, with interactions labelled Browse, Purchase, Control, Preview Music and Download Music.]

Figure 56. Overview of the on-line record store application.

An informal overview of the system is shown in figure 56. The on-line record store is accessed through a component that maintains the database of information about the music in the store. The music itself is stored in one or more media stores. Components can be instantiated on these media stores to stream a low-resolution preview of a track to the client or to upload a track to the client’s hi-fi. The client program is made up of components that allow the user to visually browse the contents in the store, receive and play streams of audio, and download purchased files onto the client’s computer or hi-fi. Further components are used within the major application components to stream audio data to and from disk or audio devices and to process audio streams, to convert between formats for example.

These components interact in different ways and different interactions, even separate uses of the same interaction type, need different qualities of service and levels of security. The client browses the contents of the music store by invoking request/reply operations over a reliable connection. When requesting a preview of a track, the client will receive a stream of continuous media, which does not have to be reliable but requires some guaranteed bandwidth and maximum jitter. When requesting purchase of one or more files, the client again uses a request/reply transaction over a reliable connection; however, unlike the connection used for browsing, the connection used for requesting a purchase must also be secure. Finally, the music files are transferred to the client over a pipe that efficiently transfers large amounts of data; this interaction also requires a reliable, secure connection.

The rest of this section will demonstrate how the architecture of the system is described as components using the Darwin ADL, how interactions between those components are defined using Midas, and how bindings are implemented using the Regent runtime environment.


5.3. System Architecture

The Record Shop is a client/server application. The Record Shop itself is implemented as a server that exports its services into a name service. Clients of the Record Shop import the services of a shop from the name service. The architecture of the server is shown in figure 57 and that of the client in figure 58.

[Diagram: RecordShop( name : string, n : int ). Labels on the diagram: store[s:0..n-1] : MediaStore(s) @Host(s+1); c : ContentsDir; contents : ContentsAttribute; contents[0..n-1]; dyn src : AudioSource; str : MediaStream @Protocol( "order/cod/udp"); preview[0..n-1]; browse : Browse; dir : Directory(s); purchase @Protocol("secure/tcp"); upload[0..n-1]; bank; dyn u : Upload; pipe : Pipe; exported names name + "/browse", name + "/preview", name + "/purchase" and name + "/bank".]

Figure 57. Architecture of the record shop server.

The server is composed of multiple MediaStore components, each of which manages the storage of music tracks on a single physical host, and a single Directory component that maintains a hierarchical listing of the tracks on all MediaStores. The directory acts as a Facade [GHJV94], hiding the individual MediaStores from clients by directing preview or download requests for individual tracks to the appropriate MediaStore. The Directory component provides three request/reply services: “browse” through which clients can query the tracks in the shop before purchase, “preview” through which shoppers can request a stream of low-resolution audio be transmitted to their local client and “purchase” through which clients can authorise purchase of a music track by credit card and arrange for a high-resolution version of the track to be downloaded to their local machine.

The architecture of the client is simpler than that of the server. Its functionality is mostly implemented by the Browser component that provides a graphical user interface through which the customer may interactively browse the tracks on the Record Shop server, select tracks to preview and enter purchase details. The Browser requires the services exported by the server and imports them from the name service. The browser can dynamically instantiate components to receive streamed previews of music tracks and download track data to the local filestore.

[Diagram: RecordClient( server : string ). Labels on the diagram: b : Browser; dyn src : AudioSource; str : MediaStream; browse : Browse; preview; download; purchase; dyn d : Download; pipe : Pipe @Protocol( "secure/tcp"); imported names server + "/browse", server + "/preview" and server + "/purchase".]

Figure 58. Architecture of the record shop client

The client can request a streamed preview of a track by invoking the Directory’s “preview” service, passing the name of the track as the request and receiving a reference to the MediaStream service that is transmitting the preview as a reply. The Directory reacts to a preview request by mapping the track name to the MediaStore that contains that track and then invoking the “preview” service of that MediaStore. The “preview” service of a MediaStore component on the server is a “worker” service of type AudioSource, indicated by the Darwin “dyn” keyword. Invocations on the service create a new AudioSource component and return a reference to a control interface on the new component that can be used to obtain the references of its provided services and bind its required services. In this case, the Directory obtains the reference of the source’s “str” service and returns it to the client so that the client can bind to it and start receiving the audio stream.

Continuous media data that is transmitted for real-time human consumption does not need reliable delivery. Indeed, reliable delivery will degrade the quality of presentation because reliable protocols delay later frames while retransmitting earlier frames that have been lost. Therefore the “str” service of the AudioSource is provided over the transport protocol “order/cod/udp”, a composite protocol that implements unreliable but in-order delivery of messages (order) over a light-weight connection protocol (cod) over UDP/IP.

When the client wants to arrange purchase of a track it instantiates a Download component that receives blocks of raw data through its Pipe service named “input” and writes them to a file. The Pipe interaction is used for downloading large amounts of data in preference to the Port or Entry interactions because it uses a windowed flow-control protocol that allows some concurrency between sender and receiver but avoids overflowing the buffers of the receiver. The reference of the Download component’s input pipe is sent to the server as a field of the purchase request, along with the name of the track required and payment details. The server communicates with its bank to verify the purchase details and then instantiates an Upload component on the appropriate media store host and binds its pipe client endpoint to the input service of the Download component. The “purchase” service of the Directory component and the “input” service of the Download component are provided over the transport protocol “secure/tcp” that secures the TCP/IP connection used by the binding with encryption and authentication. These bindings must also be configured with parameters specific to the “secure” protocol, such as the name of the principal for whose identity the binding is to be authenticated. These configuration parameters are not shown in the graphical view of the system architecture, but are represented in the architectural description language.
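The reason a windowed protocol suits bulk transfer can be seen in a generic credit-based scheme. The Python sketch below illustrates that general technique only; it is not the Regent Pipe protocol, whose message formats are not given here, and PipeSender, PipeReceiver and the window parameter are hypothetical names.

```python
from collections import deque


class PipeReceiver:
    """Receiver with a fixed number of buffer slots."""

    def __init__(self, window):
        self.window = window
        self.buffer = deque()

    def deliver(self, block):
        # a correct sender never delivers more blocks than there are free slots
        assert len(self.buffer) < self.window, "receiver buffer overflow"
        self.buffer.append(block)

    def consume(self):
        """Hand the oldest block to the application, freeing one slot."""
        return self.buffer.popleft()


class PipeSender:
    """Sender that may have at most `window` unacknowledged blocks."""

    def __init__(self, window):
        self.credits = window

    def try_send(self, receiver, block):
        if self.credits == 0:
            return False              # window exhausted: wait for an acknowledgement
        self.credits -= 1
        receiver.deliver(block)
        return True

    def acknowledge(self, slots=1):
        # the receiver reports that `slots` buffer slots have been freed
        self.credits += slots
```

The sender can run ahead of the receiver by up to a window of blocks, which gives the concurrency described above, while the credit count ensures that the receiver's buffers are never overrun.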

Apart from the “worker” services, which are generated automatically by the Darwin compiler, and the mediastream interaction used for the previews, all interactions between components of the Record Shop application follow one of two styles: request/reply or typed attributes with change notification. In these cases, there is no need to define a new Midas interaction type. The Regent runtime libraries include generic request/reply and attributes in the set of standard interaction styles provided for component programmers. The programmer can parameterise these generic styles with application-specific data structures defined in Midas. The definitions of the ContentsAttribute and Browse interactions are shown in figure 59.

module record_shop {
    typedef string PathElement;
    typedef sequence TrackPath;
    typedef sequence TrackList;

    struct DirectoryElement {
        PathElement name;
        boolean is_directory;
    };
    typedef sequence ContentList;

    typedef ::regent::interact::Attribute ContentsAttribute;
    typedef ::regent::interact::Entry Browse;
};

Figure 59. Instantiations of generic interaction types.

The use of generic interaction types greatly reduces the effort required to design the component interactions. It is much easier to declare data structures than new interaction styles. It is preferable to make use of existing interactions where possible because they have been verified for correctness and implementations will typically be available in a library for use by programmers. For this application, the programmer need only design a single new interaction type for streaming media. As described in section 5.4, even the media streaming components make use of common interaction styles to monitor and control media flow.

5.4. Media Processing Components

The streaming, reception and presentation of streamed media, such as audio and video, can easily be broken down into separate tasks that are implemented as individual components. For the on-line record store, media data must be read as an MP3 file, converted into a low resolution format, streamed over a network connection, received at the client, converted from the low resolution format to that expected by the client’s audio hardware, buffered to remove jitter caused by queuing delays in the network and finally passed to the client’s audio hardware for playback.

[Diagram: in the Source Address Space, FileReader(file) feeds ConvertFmt; in the Sink Address Space, ConvertFmt feeds Buffer, which feeds AudioOut.]

Figure 60. Components used to stream audio data between address spaces.

These media processing components also need to communicate in other ways to control the media flow. The format of the data must be provided by the component reading the data from the file for the component converting it to the low-resolution format, and by that component for the component converting it to the format used by the sound card. The source of the audio stream needs to notify other components when the stream of media has started, paused and stopped, so that they can flush their internal buffers correctly and inform the user through a graphical interface. The audio output component must inform the buffer component as audio data is drained from the card to request more data. Other components might want to announce events as they process frames, allowing other components to synchronise their processing, to maintain lip-sync between an audio and video stream, for example.

There are several existing component frameworks for processing streamed media, such as Microsoft’s DirectShow, the latest Windows API for processing and presenting continuous media, and the Java Media Framework [SWDB98]. A drawback of these frameworks is that they combine several of the distinct interactions described above into complex interfaces through which media data, format information and control messages and events are all passed. This effectively limits the component model to use within a single address space: each interface must be bound as a whole even though different interactions require different qualities of service when transmitted between address spaces. For example, media data can be transmitted over an unreliable connection, because the timeliness of delivery is more important than reliability, but control messages must be transmitted reliably. Combining these different interactions into a single interface definition forces media data and control information to be transmitted with the same QoS. The result of this is that either the media is transmitted over a reliable connection, so adversely affecting presentation, or control messages are transmitted over an unreliable connection, resulting in lost messages that cause the system to enter an inconsistent state.

The method with which such media frameworks solve this binding problem is to encapsulate transmission and reception of media within specific components rather than within connectors. This negates any benefits of a component framework when developing a distributed application, because the component framework cannot be used to compose components in different address spaces, and complicates the resulting system by hiding structural information - the bindings between distributed media processing components communicating over the network - within the implementation of components rather than making it visible at the architectural level. By defining the interactions used for media processing in Midas, the developer can take full advantage of the capabilities of the Midas/Regent component model. Components can interact using the same interaction styles whether collocated or distributed, and the appropriate transport protocol can be selected for each binding between address spaces, thus allowing bindings to have different quality of service and reliability characteristics.

Figure 61 shows the architecture of the component that acts as source of the audio stream, as defined using Midas interactions. All its constituent components are collocated. The FileReader component named “reader” streams audio data from a local file through its MediaStream service named “stream”. It provides the format of the audio data via an Attribute, a generic interaction type, and announces flow events (the start and end of the stream) through the “flow” Event service. The ConvertFmt component named “convert” converts the frames of media from the format in which they are stored to a low resolution format for streaming across the network. It requires the format of incoming media from the FileReader and provides the format of outgoing media for other components.

[Diagram: AudioSource( string file ). Labels on the diagram: reader : FileReader(file); convert : ConvertFmt; flow : Event; format : Attribute; stream; fmt_in; fmt_out; format.]

Figure 61. Source component for streaming audio. Figure 62 shows the architecture of the sink of the audio stream. Media data is received by the ConvertFmt component named “convert” that converts from the low-resolution format to that used for playback by the audio hardware. An Attribute holding the format of incoming media is required by the “convert” component and is provided elsewhere, by the source of the stream. The “convert” component passes the media frames to the “smooth” component that buffers frames to remove jitter resulting from network transmission. Draining of the buffer is performed in response to messages sent to the Port named “pump”: when messages are sent to this buffer, frames are sent out of the component’s stream service. The Buffer requires notification of the end of the stream, so that it can detect the different between the end of the stream and buffer underflow, and provides notification of the end of the stream of media that is buffering. The “playback” component receives media frames and passed them to the audio hardware. It requires the output format of the “convert” component so that it can initialise the hardware, and receives notification of the end of the stream, so release the audio hardware for use by other programs when the stream is complete. As audio data is consumed by the card, the “playback” component announces its requirement for more

96

5. Case Study: On-line Music Shop data by sending notifications through its “drain” event. These events are routed to the “smooth” component by the “push” component that converts the events to messages sent to the Buffer’s “pump” port. The Buffer reacts to these messages by sending more media frames from its stream service. AudioSink flow

[Figure 62 diagram: AudioSink, containing convert : ConvertFmt, smooth : Buffer, playback : AudioOut and push : Ev2Port, with the interaction endpoints flow, src_flow, stream, fmt_in, fmt_out, format, drain and pump.]

Figure 62. Sink component for streaming audio.

Because stream communication is exposed as interaction endpoints owned by components, binding between components in different address spaces can be performed by the Darwin runtime based on information in the architectural description of the system. Figure 63 shows a simple system that streams from an AudioSource component in one node to an AudioSink component in another, showing how media stream bindings are explicitly declared between components, just like any other form of interaction.

[Figure 63 diagram: ExampleSystem, containing src : AudioSource @Node(1) and snk : AudioSink @Node(2), with bindings between their flow, format and stream endpoints; the stream binding is annotated Protocol("order/frag/cod/udp").]

Figure 63. An example system that streams media between two address spaces.

A further advantage of using Midas is that, apart from the media-stream interaction itself, the interactions between media-processing components are instantiations of common, generic interactions: event dissemination, typed attributes and message ports. In these cases, there is no need to define a new Midas interaction type: the programmer can parameterise existing generic interactions with application-specific Midas types. This greatly reduces the effort required to design the component interactions. It is much easier to declare a data structure than a new interaction style, and so it is preferable to make use of existing interactions where possible because they have been verified for correctness and implementations will typically be available in a library for use by programmers.

The media stream interaction must be defined in Midas, but is itself very simple: a stream service transmits each frame of media to all of its clients. Clients must send a connection request to the stream service to start the transmission of media. Because the stream interaction is designed to operate over an unreliable protocol, client endpoints repeat connection requests if no frame has been received after some timeout period. The Midas code and state machines of the protocol constraints are shown in figure 64. The state machines are parameterised by the size of the range of sequence numbers used to identify frames of media, the seq parameter of the frame message. The seq parameter of the frame message is represented as an index to the in.frame or out.frame events of the constraint properties.

    typedef long long Timestamp;
    typedef unsigned long long SequenceNumber;
    typedef sequence Frame;

    interaction MediaStream {
        provide {
            connect();
        };
        require {
            frame( SequenceNumber seq, Timestamp start, Frame data );
        };
        spec "FSP" {
            property MEDIA_STREAM_PROVIDE( S=2 ) = READY,
            READY = ( in.connect -> STREAMING ),
            STREAMING =
                ( out.frame[0..S-1] -> STREAMING
                | in.connect -> STREAMING ).

            property MEDIA_STREAM_REQUIRE( S=2 ) =
                ( out.connect -> WAITING ),
            WAITING =
                ( out.connect -> WAITING // on timeout
                | in.frame[0..S-1] -> STREAMING ),
            STREAMING =
                ( in.frame[0..S-1] -> STREAMING ).
        };
    };

[State machine diagrams of MEDIA_STREAM_PROVIDE(S) and MEDIA_STREAM_REQUIRE(S).]

Figure 64. The MediaStream interaction type

Figure 65 shows the FSP model of a client endpoint of the MediaStream interaction. The model is composed of the client-side protocol constraint and a model of the endpoint implementation, MEDIA_STREAM_CLIENT_IMP. This process defines how the connection protocol works and how frames are buffered within the endpoint until processed by the component’s implementation.

The client endpoint initially sends a connect message and then starts waiting to receive media in the WAITING state. In the WAITING state the client waits until a frame is delivered, in which case the frame is buffered, or a timeout occurs. If a timeout occurs, the client enters the TIMED_OUT state in which it can receive a frame of media or send a connect message and return to the WAITING state. This extra state is required to model the concurrency inherent in the Midas communication model: a message can be transmitted and received by different threads; forcing a timeout to be followed by the transmission of a connect message would cause a deadlock if the service was concurrently trying to deliver a frame of media. Once a frame is received by the client endpoint, it enters a HAVE_FRAME state, indexed by the sequence number of the frame received. This represents the buffering of a single frame of media. The component can then receive the frame by the getFrame API action, after which the endpoint waits for the next frame in the NO_FRAME state. If another frame is delivered while the endpoint is in a HAVE_FRAME state, it enters a HAVE_FRAME state indexed by the latest frame received, discarding the buffered frame.

    ||MEDIA_STREAM_CLIENT( S=3 ) =
        ( MEDIA_STREAM_CLIENT_IMP(S) || MEDIA_STREAM_REQUIRE(S) ).

    MEDIA_STREAM_CLIENT_IMP( S=4 ) = ( out.connect -> WAITING ),
    WAITING =
        ( in.frame[seq:0..S-1] -> HAVE_FRAME[seq]
        | timeout -> TIMED_OUT ),
    TIMED_OUT =
        ( out.connect -> WAITING
        | in.frame[seq:0..S-1] -> HAVE_FRAME[seq] ),
    HAVE_FRAME[seq:0..S-1] =
        ( getFrame[seq] -> NO_FRAME
        | in.frame[seq2:0..S-1] -> HAVE_FRAME[seq2] ),
    NO_FRAME =
        ( in.frame[seq:0..S-1] -> HAVE_FRAME[seq] ).

[State machine diagram of MEDIA_STREAM_CLIENT( S=2 ).]

Figure 65. Model of the MediaStream client endpoint for 1-bit sequence numbers

Figure 66 shows the FSP model of the service endpoint of a MediaStream. The model is composed of a model of the API through which the component sends frames of media and models representing how clients connect to the stream. The MEDIA_STREAM_SERVICE_API process represents how the component sends frames: each sendFrame action synchronised with the component results in an internal broadcast.frame action indexed by

the current sequence number; sequence numbers are incremented for each frame, modulo the range of sequence numbers, S. The MEDIA_STREAM_SERVICE_XMIT processes define how each client connects to the stream. When unconnected, a client does not receive media transmitted by the service; the broadcast.frame event is ignored. Once a connect message is received, each frame broadcast is transmitted to the client. The hiding operator, \{ broadcast }, hides the broadcast action so that it is internal to the endpoint. The combined state machine of the MEDIA_STREAM_SERVICE process is too large to display graphically.

    ||MEDIA_STREAM_SERVICE( C=2, S=4 ) =
        ( MEDIA_STREAM_SERVICE_API(S)
        || forall[c:1..C] MEDIA_STREAM_SERVICE_XMIT(c,S)
        || forall[c:1..C] check[c]:MEDIA_STREAM_PROVIDE(S) )
        /{ out[c:1..C] / check[c].out,
           in[c:1..C] / check[c].in }
        \{ broadcast }.

    MEDIA_STREAM_SERVICE_API( S=2 ) = SEQ[0],
    SEQ[s:0..S-1] =
        ( sendFrame -> broadcast.frame[s] -> SEQ[(s+1)%S] ).

    MEDIA_STREAM_SERVICE_XMIT( C=1, S=2 ) = UNCONNECTED,
    UNCONNECTED =
        ( in[C].connect -> CONNECTED
        | broadcast.frame[0..S-1] -> UNCONNECTED ),
    CONNECTED =
        ( in[C].connect -> CONNECTED
        | broadcast.frame[s:0..S-1] -> out[C].frame[s] -> CONNECTED ).

Figure 66. Model of the MediaStream service endpoint for 1-bit sequence numbers

By combining the MEDIA_STREAM_SERVICE and MEDIA_STREAM_CLIENT models with the model of an unreliable, in-order transport connection, UODCX, described in section 4.1, one can check that the protocol performs as required: that connection setup eventually succeeds and that frames can be lost without causing deadlock.
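As an informal companion to the FSP models, the MEDIA_STREAM_CLIENT_IMP process of figure 65 can be rendered as a small Java state machine. This is a hand-written sketch, not output of the Midas tools: the class and method names are hypothetical, and it models only the endpoint implementation process, not its composition with the MEDIA_STREAM_REQUIRE constraint or with a transport model.

```java
// Hypothetical Java rendering of the MEDIA_STREAM_CLIENT_IMP process.
// Each method corresponds to one FSP action; an action that is not enabled
// in the current state is rejected with an IllegalStateException.
class MediaStreamClientModel {
    enum State { INITIAL, WAITING, TIMED_OUT, HAVE_FRAME, NO_FRAME }

    private State state = State.INITIAL;
    private int bufferedSeq = -1; // sequence number held while in HAVE_FRAME

    State state() {
        return state;
    }

    // out.connect: enabled initially and after a timeout.
    void sendConnect() {
        require(state == State.INITIAL || state == State.TIMED_OUT, "out.connect");
        state = State.WAITING;
    }

    // timeout: enabled only while waiting for the first frame.
    void timeout() {
        require(state == State.WAITING, "timeout");
        state = State.TIMED_OUT;
    }

    // in.frame[seq]: enabled in every state after the first connect;
    // a newer frame replaces any frame that is already buffered.
    void frameReceived(int seq) {
        require(state != State.INITIAL, "in.frame");
        bufferedSeq = seq;
        state = State.HAVE_FRAME;
    }

    // getFrame[seq]: hands the buffered frame to the component.
    int getFrame() {
        require(state == State.HAVE_FRAME, "getFrame");
        state = State.NO_FRAME;
        return bufferedSeq;
    }

    private void require(boolean enabled, String action) {
        if (!enabled) {
            throw new IllegalStateException(action + " not enabled in " + state);
        }
    }
}
```

Accepting in.frame in the TIMED_OUT state mirrors the extra state of the FSP model: a frame may be delivered between the timeout and the retransmitted connect message.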

5.5. Comparison with CORBA

We have shown how the on-line record shopping application would be structured using the Darwin ADL and the interactions between components defined using Midas. But how would this application be implemented using an existing, object-oriented middleware platform?

Firstly, one would define an object model of the application in terms of the classes of objects and relationships between classes, as shown in figure 67. Clients access the record shop through a “shop-keeper” object, through which they can browse the music tracks in the shop and purchase tracks. Music tracks are organised by category and are annotated with information about the artist, track and publisher. The client can request a preview of a track and a low-quality version of the music is then streamed across the network to them.

The client purchases a track by giving the shop-keeper object their payment details. The shop-keeper then checks the client’s payment details with their bank over a secure connection. If the check succeeds (e.g. the client has given a correct account number and has enough money in the account to pay for the music) then the shop-keeper returns the client a stream by which the track may be downloaded over a secure connection. The download stream is created by the track object itself, since the track object encapsulates the storage of the digital music data; the shop-keeper asks the track to create the stream via a private operation not exposed to clients.

[Figure 67 diagram: UML class diagram of the following classes.

    ShopKeeper
        +browseMusic() : MusicCategoryList
        +purchase(p : PaymentDetails) : DownloadStream

    Bank
        +check(details : PaymentDetails) : boolean

    MusicCategory
        -name : string
        -description : string

    Track
        -name : string
        -artist : string
        -copyright : string
        -copyright_date : Date
        -publisher : string
        -published_date : Date
        -description : string
        -price : Currency
        +preview() : StreamAddress
        download(client_key : PublicKey) : DownloadStream

    DownloadStream
        +read(bytes : int) : sequence

Association labels and multiplicities: bank; categories (n); parent (0..1) and subcategory (0..n); category and tracks (0..n); creates (1 track).]

Figure 67. Classes of the record shop application.

The design of the CORBA system does not define how music is streamed to the client for playback, beyond stating that the reference to a preview stream is passed to the client as a StreamAddress value. Media streams cannot be defined using CORBA IDL interfaces because IDL operations are synchronous and reliable, which is not appropriate for continuous media. One-way operations are more appropriate, but the CORBA standard provides no guarantee that they are actually implemented differently from synchronous operations. Further, CORBA provides no mechanisms for selecting an appropriate transport protocol and quality of service for bindings, making it difficult to smoothly transfer and present media via CORBA interfaces.

Therefore we must either implement mechanisms for streaming audio data ourselves, using native APIs provided by each operating system we want to support, or use some other cross-platform framework for processing streamed media. Either approach requires that we integrate the event loops of the CORBA ORB and streaming framework; this is often a non-trivial task when off-the-shelf frameworks, such as ORBs, do not provide hooks into their event loop. Similarly, it is inefficient to download music tracks using multiple operation invocations, because each operation invocation involves a round-trip between client and server. Rather than using the DownloadStream interface, the system should use an interaction style that is better suited to bulk transfer of data: for example, a style that relaxes the synchronisation between sender and receiver by using a windowed flow-control protocol.

The next step is to decide the placement of objects on nodes, and thereby determine which application objects would be exposed as CORBA objects with IDL interfaces. The placement of objects onto operating system processes and machines is shown in figure 68 in UML notation. Note, however, that the placement information is not translated into executable code that deploys the system components; a system administrator must place each server that executes CORBA objects onto a physical node. Each server must register its objects in a naming service: components requiring the use of remote objects must look them up in the naming service. This duplicates binding information, the names of objects, and distributes it among the system components, requiring changes to multiple components when the system architecture is changed.

[Figure 68 diagram: UML placement diagram showing the nodes Clients, Front End Server, Second Tier Server, MediaStore 1 to MediaStore n and Bank Server, and the objects theShopKeeper, MusicCategories, Tracks and aBank.]

Figure 68. Placement of CORBA objects onto servers.

As this example demonstrates, CORBA does not provide appropriate mechanisms to define all interactions between components, even for this simple application. The separation of concerns provided by the Midas language and runtime libraries allows the designer to select the most appropriate combination of interaction and transport protocols for each binding. Compared to the UML placement diagram, the Darwin architecture description provides a clearer view of which components interact and the protocols by which those interactions are mediated.

5.6. Conclusion

This chapter has described the design of an on-line record shop expressed using Darwin and Midas. The design highlights that components need to interact in a variety of ways and that individual bindings require different transport protocols, levels of security and qualities of service. Midas allows different interaction styles to be described and named, and those names can be used within the Darwin architecture description. By clearly separating the concerns of interaction style from transport protocol, the system can select the appropriate transport for each binding, thereby securing those bindings that must be secure or using an unreliable transport for the transmission of continuous media data that would suffer from reliable transmission. As described in chapter 7, transport protocols are themselves implemented by composing components, allowing existing protocols to be augmented with additional behaviour; for example, any binding can be secured by including the security protocol layer as a component of its transport.

As we have shown, the design of a new interaction protocol can be a complex task. The ability for clients or services to communicate asynchronously and the ability to create bindings with different transport semantics can result in errors such as deadlock caused by messages being received by an endpoint while in an invalid state, being lost by an unreliable connection or being delivered in a different order than that in which they were transmitted.

Midas uses three methods to alleviate these problems. Firstly, many interactions can be defined as instantiations of existing, generic interaction styles. This allows the designer to reuse existing designs that have been checked for correctness and implementations that have been well tested. Secondly, Midas interaction definitions include formal specifications of their behaviour that can be combined with models of transport connections to check the behaviour of a protocol over different transport protocols. Thirdly, test code can be generated from the formal specifications to perform runtime checks that endpoint implementations conform to their interaction protocol.
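To give a flavour of the third method, the following Java sketch shows what a runtime check of the MEDIA_STREAM_PROVIDE property of figure 64 might look like. It is a hand-written illustration under our own assumptions, not the code produced by the Midas tools: the class name and its methods are hypothetical. The monitor tracks the READY and STREAMING states of the property and counts actions that the property does not allow.

```java
// Hypothetical runtime monitor for the MEDIA_STREAM_PROVIDE( S ) property:
//   READY     = ( in.connect -> STREAMING ),
//   STREAMING = ( out.frame[0..S-1] -> STREAMING | in.connect -> STREAMING ).
class MediaStreamProvideMonitor {
    private final int range;           // S, the size of the sequence number range
    private boolean streaming = false; // false = READY, true = STREAMING
    private int violations = 0;

    MediaStreamProvideMonitor(int range) {
        this.range = range;
    }

    // in.connect is allowed in both states and always leads to STREAMING.
    void inConnect() {
        streaming = true;
    }

    // out.frame[seq] is allowed only in STREAMING and only for 0 <= seq < S.
    void outFrame(int seq) {
        if (!streaming || seq < 0 || seq >= range) {
            violations++;
        }
    }

    int violations() {
        return violations;
    }
}
```

An endpoint implementation under test would call inConnect and outFrame as it receives and transmits messages; a non-zero violation count indicates that it sent a frame before any client had connected, or used a sequence number outside the declared range.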

6. Mapping Midas To Java

Never worry about theory as long as the machinery does what it's supposed to do. Robert A. Heinlein

6.1. Introduction

Chapter 3 introduced the Midas language that is used to specify component interaction styles that follow the interaction model introduced in chapter 2. Midas is used purely to specify interaction styles; Midas declarations must be translated into constructs of a programming language before a programmer can create components that use those interaction styles. This translation is performed automatically by one or more compilers that process the Midas code and, optionally, formal specifications with which that code is annotated.

In this chapter we show how Midas definitions are mapped to one particular implementation language, Java [AG98]. However, many of the decisions involved are applicable to other languages, for example C or C++. A scheme that translates Midas declarations into a programming language must decide:

• How to map primitive Midas types to the primitive types of the programming language. There may not always be a one-to-one correspondence between types in the two languages.

• How to map Midas type constructors, such as sequences and arrays, and user-defined types, such as enums and structs, to those of the programming language. Almost certainly, there will not be a one-to-one mapping between Midas types and those of the programming language. Programming languages typically provide the minimum data structuring facilities, such as arrays and records, and allow the programmer to build more complex data structures, such as sequences, from the basic language facilities.

• How to implement interaction endpoints in the programming language, and how to separate the various concerns of client and service endpoints so that component developers and integrators can select the most appropriate presentation and transport protocols and insert management functionality into bindings.

• How to map instances of interaction types into the programming language. For example, a type defined by an interaction statement can be used as a parameter of a message or as a field of a structure.

• How to map generic type and interaction declarations, those that are themselves parameterised by one or more types, onto features of programming languages that do not provide direct support for genericity.

Java is a strongly-typed object oriented programming language that features single-inheritance of state and behaviour and multiple inheritance of interfaces. The mapping of primitive types from Midas to Java is based on the Java mapping of CORBA IDL [OMG98]. User-defined types are mapped slightly differently to provide a more convenient interface for the programmer. Because Java has no support for parameterised types, the design decisions involved in mapping generic Midas declarations to Java classes and supporting genericity in the runtime support for distribution transparency and binding are described in detail, using UML notation [BJR97].

6.2. Modules

Midas allows names to be defined within modules to avoid name clashes and to group related type and interaction definitions. Unless declared within the body of a module statement, Midas definitions are defined within the global module. Modules can be nested; definitions in other modules can be referred to by their scoped names. A scoped name is preceded by the names of the modules containing the name, separated by the scoping operator, “::”. An initial “::” indicates that the name is resolved from the global module; otherwise it is resolved from the module in which the name is used, following the usual name resolution rules of block-structured languages such as C++ or Pascal.

Modules are mapped to Java packages. The global module is mapped to the unnamed global Java package and nested modules are mapped to Java sub-packages.

Midas Syntax

    module regent {
        module interact {
            ...
        };
    };

Java Mapping

    package regent.interact;
    ...

Table 2. Midas modules mapped to Java packages

6.3. Constants

The Midas const statement is used to define a named, typed constant. Constants can be values of any of the primitive types or character strings. Numeric and boolean constants can be calculated at compile time from literal values and other constants. Expressions follow the same syntax as those of CORBA IDL [OMG98].

Java does not support constant declarations outside class declarations, so a Midas const declaration is mapped to an interface that has the same name as the constant and contains the constant as a static, final variable that also has the same name as the constant. Classes can implement the constant’s interface if they want to use the Midas name of the constant.

Midas Syntax

    const unsigned long TIMEOUT = RTT * 2;

Java Mapping

    public interface TIMEOUT {
        public static final int TIMEOUT = RTT.RTT*2;
    }

Table 3. Midas constant declaration mapped to Java
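The following sketch illustrates how a class can use the Midas name of a constant by implementing the constant's interface. The value given to RTT and the Retransmitter class are hypothetical; only the shape of the two interfaces follows the mapping described above.

```java
// Constant interfaces in the style of the mapping above.
// The value 50 is an arbitrary, illustrative choice.
interface RTT {
    int RTT = 50; // interface fields are implicitly public, static and final
}

interface TIMEOUT {
    int TIMEOUT = RTT.RTT * 2;
}

// Hypothetical class that refers to the constant by its unqualified Midas name.
class Retransmitter implements TIMEOUT {
    int deadline(int now) {
        return now + TIMEOUT; // resolved through the implemented interface
    }
}
```

A class that does not implement the interface can still refer to the constant by its qualified name, TIMEOUT.TIMEOUT.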

6.4. Basic Types

Midas provides the same primitive types as CORBA IDL, as shown in table 4, below. The mapping of primitive Midas types to Java types follows the CORBA IDL/Java mapping.

    Midas Type            Data Value         Java Mapping
    boolean               TRUE or FALSE      boolean
    octet                 8-bit cardinal     byte
    short                 16-bit integer     short
    unsigned short        16-bit cardinal    short
    long                  32-bit integer     int
    unsigned long         32-bit cardinal    int
    long long             64-bit integer     long
    unsigned long long    64-bit cardinal    long
    float                 32-bit real        float
    double                64-bit real        double
    char                  8-bit character    char

Table 4. Primitive Midas types
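Because Java has no unsigned integer types, each Midas cardinal type shares a Java type with the signed type of the same width, and the programmer must reinterpret the bits when the cardinal value is needed. The helper class below is a sketch of one way to do this; its name and methods are hypothetical and are not part of the mapping. Note that Long.toUnsignedString is only available from Java 8, which postdates this work.

```java
// Hypothetical helpers that recover the cardinal value of a Midas unsigned
// type from the signed Java type to which it is mapped.
class UnsignedViews {
    // octet (8-bit cardinal) is mapped to byte.
    static int octetValue(byte b) {
        return b & 0xFF;
    }

    // unsigned short (16-bit cardinal) is mapped to short.
    static int unsignedShortValue(short s) {
        return s & 0xFFFF;
    }

    // unsigned long (32-bit cardinal) is mapped to int.
    static long unsignedLongValue(int i) {
        return i & 0xFFFFFFFFL;
    }

    // unsigned long long (64-bit cardinal) is mapped to long; no wider
    // primitive exists, so the value is rendered as a decimal string.
    static String unsignedLongLongString(long l) {
        return Long.toUnsignedString(l);
    }
}
```

For example, the byte with all bits set is -1 as a Java byte but 255 as a Midas octet.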

Midas provides a number of ways to specify structured data. The simplest are fixed size, multidimensional arrays of values. Variable sized sequences of values are supported by the sequence type constructor; sequences can optionally be bounded. The string type is similar to a sequence of characters, but is mapped into the most appropriate type for handling strings in the implementation language.

Arrays are translated directly to Java arrays. Midas sequences are mapped to instances of the class regent.interact.Sequence which maintains a controlled sequence of Object references. This class can be used to maintain sequences of types that are mapped into Java classes but is an inefficient way of representing sequences of primitive types: each element of the sequence would have to be stored within another Java object. Therefore the regent.interact package contains classes that hold sequences of each primitive type.

    Midas Type                      Java Mapping
    sequence<boolean>               regent.interact.BooleanSequence
    sequence<octet>                 regent.interact.OctetSequence
    sequence<short>                 regent.interact.ShortSequence
    sequence<unsigned short>        regent.interact.UShortSequence
    sequence<long>                  regent.interact.LongSequence
    sequence<unsigned long>         regent.interact.ULongSequence
    sequence<long long>             regent.interact.LongLongSequence
    sequence<unsigned long long>    regent.interact.ULongLongSequence
    sequence<float>                 regent.interact.FloatSequence
    sequence<double>                regent.interact.DoubleSequence
    sequence<char>                  regent.interact.CharSequence
    string                          java.lang.String

Figure 69. Sequence classes that hold primitive types

The sequence classes also implement marshalling support, as described in section 6.5.1, and run-time support for genericity, as described in section 6.8. The basic sequence class, regent.interact.Sequence, is itself a generic type and therefore its marshalling functions and run-time type information are parameterised by type as described in section 6.8.
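The motivation for the primitive sequence classes can be seen in a minimal sketch of such a class. The code below is not the regent.interact.LongSequence class, whose interface is not reproduced here; the class name and its methods are hypothetical. It stores Midas long values directly in an int array, so no wrapper object is allocated per element.

```java
import java.util.Arrays;

// Hypothetical stand-in for a primitive sequence class: a growable sequence
// of Midas long (Java int) values stored without per-element objects.
class IntSequenceSketch {
    private int[] items = new int[4];
    private int size = 0;

    // Appends a value, doubling the backing array when it is full.
    void append(int value) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2);
        }
        items[size++] = value;
    }

    // Returns the element at the given position in the sequence.
    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return items[index];
    }

    int length() {
        return size;
    }
}
```

A sequence of Object references would instead need one additional heap object for every int stored.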

6.5. User-Defined Types

6.5.1. Marshalling Support

For each user-defined type, the Midas compiler generates code that marshals and unmarshals values of that type to and from byte streams. Marshalling code is implemented as static functions of the class generated from the Midas type. The write function writes instances of the class to a byte stream via a DataOutput interface while the read function reads an instance of the class from a byte stream via a DataInput interface.

    public class UserDefinedType {
        ...
        public static void write( java.io.DataOutput out,
                                  UserDefinedType data )
            throws java.io.IOException
        {
            ...
        }

        public static UserDefinedType read( java.io.DataInput in )
            throws java.io.IOException
        {
            ...
        }
    }

Figure 70. Marshalling functions of a generated class

This approach has the disadvantage that, unlike the standard Java serialisation mechanism, untyped data cannot be marshalled because marshalling is not invoked through polymorphic interfaces. However, to increase type safety, Midas allows the specification of generic interaction types, and so does not allow the passing of untyped data between components. Thus the receiving endpoint has a priori knowledge of the type of data being received, and so type information does not have to be included in messages, as it is by the standard Java serialisation mechanism. This reduces the size of messages sent between address spaces.

The use of abstract DataInput and DataOutput interfaces to read and write data allows Midas types to be marshalled to any byte stream by pushing DataInputStream and DataOutputStream objects onto the front of the stream. Usually, marshalling code writes values into a TransmitBuffer to be passed to the transport subsystem, using a BufferOutputStream, and reads values from a ReceiveBuffer received from the transport subsystem. These classes implement efficient algorithms for data management that are described in detail in section 7.5.
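The effect on message size can be demonstrated with standard Java classes alone. The sketch below marshals the same value twice: once with a static write function in the style of figure 70, pushed onto a ByteArrayOutputStream through a DataOutputStream, and once with the standard Java serialisation mechanism. The TrackInfo class and the MarshalSizeDemo helper are hypothetical, and the use of writeUTF to encode strings is our assumption, not a statement about the wire format used by the Midas compiler.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical user-defined type with a static write function.
class TrackInfo implements Serializable {
    private static final long serialVersionUID = 1L;
    final int id;
    final String name;

    TrackInfo(int id, String name) {
        this.id = id;
        this.name = name;
    }

    // Writes only the field values; no type information is included.
    static void write(DataOutput out, TrackInfo data) throws IOException {
        out.writeInt(data.id);
        out.writeUTF(data.name); // assumed string encoding
    }
}

class MarshalSizeDemo {
    // Size in bytes when marshalled by the static write function.
    static int typedSize(TrackInfo t) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            DataOutputStream out = new DataOutputStream(bytes);
            TrackInfo.write(out, t);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Size in bytes when marshalled by standard Java serialisation.
    static int serializedSize(TrackInfo t) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(t);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }
}
```

For a TrackInfo with id 7 and name "Blue" the typed encoding occupies 10 bytes (a 4-byte integer, a 2-byte string length and 4 bytes of character data), while the serialised form is larger because it also carries a stream header and a description of the class.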

Marshalling support is elided from the example Java code below.

6.5.2. Enums

The enum statement is used to define a new enumerated type, values of which can be one of a set of symbolic constants. Java does not provide enumerated types and Midas enum declarations are therefore mapped to Java classes that follow a widely used programming idiom to emulate enumerations. Each enum declaration is mapped to a class with private constructors, making it impossible for client code to instantiate new objects of the class. The only objects of the class are held as constants scoped within the class itself, each of which has the same name as one of the enumeration values.

The alternative approach would be to translate enums into an interface containing static integer constants for each symbolic constant. This approach is type unsafe: a Java program that attempted to use invalid integer values where an enumerated type is expected would compile correctly but cause runtime errors. The approach we have chosen catches such errors at compile time.

Each object also stores its index in the enumeration and a string representation of the enumeration value, and static mappings from these integers and strings to the appropriate object instances are set up when the class is initialised. This allows client code to map enumerated values to integers or strings and vice versa. Additionally, iteration through the enumeration values is supported by the next and prev methods of each object that return the next and previous enumeration value respectively.

Midas Syntax

    enum E { x, y, z };

Java Mapping

    import java.util.Hashtable;

    public class E {
        private int _int_value;
        private String _str_value;

        // Lookup tables; declared before the constants so that they exist
        // when the constructors run during class initialisation.
        private static E[] _by_index = new E[3];
        private static Hashtable _by_name = new Hashtable();

        private E( int iv, String sv ) {
            _int_value = iv;
            _str_value = sv;
            _by_index[iv] = this;
            _by_name.put( sv, this );
        }

        public static final E x = new E( 0, "x" );
        public static final E y = new E( 1, "y" );
        public static final E z = new E( 2, "z" );

        public int intValue() { return _int_value; }
        public String toString() { return _str_value; }
        public E next() { ... }
        public E prev() { ... }

        public static E fromString( String str ) {
            return (E) _by_name.get(str);
        }
        public static E fromInt( int n ) {
            return _by_index[n];
        }
        ...
    }

Table 5. Midas enum mapped to a Java class

The write method of an enum class marshals values by writing their integer representation to the DataOutput stream as a 32-bit integer. The read method of an enum class unmarshals a value by reading a 32-bit integer from the DataInput stream and passing it to the fromInt static method. If fromInt throws an ArrayIndexOutOfBoundsException, a regent.interact.DataFormatException is thrown to abort further unmarshalling.
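The marshalling behaviour just described can be sketched as follows. The Colour enumeration class, the SketchDataFormatException class (a stand-in for regent.interact.DataFormatException, whose superclass we assume to be IOException) and the EnumMarshalDemo helper are all hypothetical; only the steps performed by write and read follow the description above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for regent.interact.DataFormatException.
class SketchDataFormatException extends IOException {
    SketchDataFormatException(String message) {
        super(message);
    }
}

// Hypothetical enumeration class in the style of table 5.
final class Colour {
    private static final Colour[] BY_INDEX = new Colour[3];
    static final Colour RED = new Colour(0);
    static final Colour GREEN = new Colour(1);
    static final Colour BLUE = new Colour(2);

    private final int index;

    private Colour(int index) {
        this.index = index;
        BY_INDEX[index] = this;
    }

    static Colour fromInt(int n) {
        return BY_INDEX[n]; // throws ArrayIndexOutOfBoundsException if invalid
    }

    // Marshals the value as a 32-bit integer.
    static void write(DataOutput out, Colour value) throws IOException {
        out.writeInt(value.index);
    }

    // Unmarshals a 32-bit integer and maps it back to an enumeration value.
    static Colour read(DataInput in) throws IOException {
        int n = in.readInt();
        try {
            return fromInt(n);
        } catch (ArrayIndexOutOfBoundsException e) {
            throw new SketchDataFormatException("invalid enum index " + n);
        }
    }
}

class EnumMarshalDemo {
    static byte[] encode(Colour value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            Colour.write(new DataOutputStream(bytes), value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Colour decode(byte[] wire) {
        try {
            return Colour.read(new DataInputStream(new ByteArrayInputStream(wire)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because each enumeration value exists exactly once, the value returned by decode can be compared with the original constant using reference equality.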

6.5.3. Structures

More complex data types can be defined using struct statements. A struct declaration defines a record type containing zero or more named, typed fields. Structures are translated into Java classes that contain the fields of the structure as public member variables. A structure class provides two constructors: one with no arguments that initialises the fields to safe initial values and one that takes the values of the fields as arguments.

Midas Syntax

    struct S {
        long x;
        string y;
    };

Java Mapping

    public class S {
        public int x;
        public String y;

        public S() {
            x = 0;
            y = "";
        }

        public S( int x, String y ) {
            this.x = x;
            this.y = y;
        }
        ...
    }

Table 6. Midas struct mapped to Java class

The write method of a structure marshals values by writing each field to the DataOutput stream in order of declaration in the Midas source file. The read method reads the values from the stream and uses them to construct an instance of the structure class.

6.5.4. Typedefs

Midas allows the definition of type aliases with the typedef statement. This can be useful to give convenient or descriptive names to type definitions that have long names, are in different modules or are instantiations of a generic type.

Java does not support the concept of a typedef so names defined by Midas typedefs are not mapped into Java. Any uses of a typedef in a Midas declaration are mapped to uses of the type renamed by the typedef. If the typedef is itself an alias for a typedef, the chain of typedefs is followed back to the “root” type definition, which is used.

Midas Syntax

    typedef string TypeName;
    typedef TypeName MajorType; // “root” type is string
    typedef TypeName MinorType; // “root” type is string

    struct MIMEType {
        MajorType major;
        MinorType minor;
    };

Java Mapping

    public class MIMEType {
        public String major;
        public String minor;
        ...
    }

Table 7. Midas typedefs mapped to Java
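Following a chain of typedefs back to its root type is a simple lookup loop. The sketch below is our own illustration and not the implementation used by the Midas compiler: the TypedefResolver class is hypothetical, and type names are represented as plain strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical resolver that follows a chain of typedefs to its root type.
class TypedefResolver {
    private final Map<String, String> aliases = new HashMap<>();

    // Records "typedef target alias;".
    void define(String alias, String target) {
        aliases.put(alias, target);
    }

    // Returns the first type in the chain that is not itself a typedef.
    String rootType(String name) {
        String current = name;
        int steps = 0;
        while (aliases.containsKey(current)) {
            current = aliases.get(current);
            // A chain longer than the number of aliases must contain a cycle.
            if (++steps > aliases.size()) {
                throw new IllegalStateException("cyclic typedef: " + name);
            }
        }
        return current;
    }
}
```

With the declarations of table 7, both MajorType and MinorType resolve through TypeName to string, which is then mapped to java.lang.String.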

6.6. Interaction Types

Interaction types are used to generate Java interfaces that define the provided and required message interfaces, base classes from which endpoint objects are derived and classes that implement distribution transparency as described in section 3.2.1. Other compiler back-ends generate useful classes from interaction types.

Each interaction type is mapped to a Java package that contains the generated interface and class definitions. The name of the package is derived from the name of the interaction by converting all characters in the name to lower case and inserting underscores before upper case characters that are preceded by a lower case letter or are preceded by an upper case character and followed by a lower case character. This algorithm transforms the name from one in which words are delimited by upper case characters to one containing only lower case characters in which words are delimited by underscores. For example, the interaction named Attribute is mapped to a package named attribute, the interaction named MediaStream is mapped to a package named media_stream and the interaction named RTPControl is mapped to a package named rtp_control. The names of all classes in an interaction’s package are prefixed with the name of the interaction. The text in the rest of this section does not show the prefix on class names, to keep the names short and clear, although the full names are shown in the corresponding UML diagrams for specific interaction types.
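The naming rule can be written directly as code. The following sketch is our own rendering of the rule as stated above, not an excerpt from the Midas compiler; the class and method names are hypothetical.

```java
// Hypothetical implementation of the interaction-name to package-name rule.
class PackageNames {
    static String fromInteractionName(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (Character.isUpperCase(c) && i > 0) {
                char prev = name.charAt(i - 1);
                // Case 1: an upper case character preceded by a lower case letter.
                boolean afterLower = Character.isLowerCase(prev);
                // Case 2: an upper case character preceded by an upper case
                // character and followed by a lower case character.
                boolean startsWord = Character.isUpperCase(prev)
                        && i + 1 < name.length()
                        && Character.isLowerCase(name.charAt(i + 1));
                if (afterLower || startsWord) {
                    out.append('_');
                }
            }
            out.append(Character.toLowerCase(c));
        }
        return out.toString();
    }
}
```

The second case is what keeps an acronym such as RTP together: in RTPControl only the C is both preceded by an upper case character and followed by a lower case one, so a single underscore is inserted before it.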

6.6.1. Message Interfaces

The provided and required message interfaces of a Midas interaction are translated into Java interfaces named ServiceMessages and ClientMessages, respectively. Each message in a message set is translated to a method of the appropriate interface, and each parameter of the message is translated to an argument of that method. Methods of the ServiceMessages interface take the back-binding to the client as an additional parameter. Each method of the message interface can throw exceptions of type regent.transport.TransportException, to report errors caused by transmitting messages, java.io.IOException, to report errors caused by marshalling the message, and InterruptedException, to allow threads sending a message to be interrupted if they are blocked waiting for a binding to be completed or resources to become available.

The Java message interfaces also include two “house-keeping” methods. The _unbind method of a message interface is called when the binding is closed by the peer. The _getClientControl or _getServiceControl methods provide access to control interfaces on the binding, allowing components to query and configure parameters of the protocols used to implement the binding.

Midas Syntax:

    interaction Attribute< type T > {
        provide messages {
            get();
            set( T value );
        };
        require messages {
            update( T value );
            setAck();
            setNak();
        };
        ...
    };

Java Mapping:

    package attribute;

    public interface AttributeServiceMessages {
        void getValue( AttributeClientMessages _client )
            throws RegentException;
        void setValue( Object value,
                       AttributeClientMessages _client )
            throws RegentException;
        void _unbind( ClientMessages client );
        ProtocolControl _getServiceControl( Class c );
    }

    public interface AttributeClientMessages {
        void update( Object value )
            throws RegentException;
        void setAck()
            throws RegentException;
        void setNak()
            throws RegentException;
        void _unbind();
        ProtocolControl _getClientControl( Class c );
    }

Table 8. Message interfaces generated from a Midas interaction

6.6.2. Endpoint Stubs

The message interfaces are implemented by base classes that provide support for the implementation of interaction objects. The abstract ServiceStub class is the base class from which programmers derive service-side endpoints. The ServiceStub class implements the ServiceMessages interface of the interaction but does not implement the methods of that interface, leaving their implementation to derived classes. It also implements the regent.interact.ServiceEndpoint interface that provides methods for attaching service access points to the endpoint to make it available over the network.

The abstract ClientStub class is the base class from which programmers derive client-side endpoints. The ClientStub class implements the ClientMessages interface of the interaction but, like the ServiceStub class, it leaves the implementation of the methods of that interface to derived classes. The ClientStub class is responsible for maintaining a reference to the ServiceMessages interface of the service endpoint to which it is bound. Initially this reference refers to a singleton member of the ClientStub class named UNBOUND, which implements the ServiceMessages interface by throwing exceptions whenever its methods are called. The client endpoint can be bound by passing a ServiceMessages reference to the bind method. The ServiceMessages interface of the binding is made available to derived classes through a protected method named binding. These classes are illustrated in figure 71.

[UML class diagram of the attribute package: AttributeServiceStub implements AttributeServiceMessages (get, set) and the ServiceEndpoint interface (addSAP, enumerateSaps, removeSap, getUntypedReference, getSupportClass); AttributeClientStub implements AttributeClientMessages (update, setAck, setNak) and the ClientEndpoint interface (bindUntyped, getSupportClass), and provides bind(s : ServiceMessages) and the protected binding() method. The programmer-defined classes AttributeService and AttributeClient extend the stubs and add their own API functions.]

Figure 71. Classes for the Attribute interaction type

Both the ClientStub and the ServiceStub classes provide an implementation of the _getControl method of their message interface that returns null. That is, there are no control interfaces available on a direct binding within the same address space.

6.6.3. Support for Third-Party Binding

The message interfaces are implemented by client-side Binder objects that support third-party binding, as shown in figure 72. A Binder implements the ServiceMessages interface and itself holds a binding to another ServiceMessages interface. Threads calling an operation of the ServiceMessages interface of a Binder are blocked until the Binder is bound. Once bound, calls to the ServiceMessages interface of the Binder are delegated to the binding. This allows components to execute concurrently with the binding service that is performing third-party binding operations without components receiving UnboundException errors when using an endpoint that has not yet been bound. A Binder implements the ClientMessages interface by delegating calls to the client endpoint that is bound to it.

[UML class diagram of the attribute package: AttributeBinder implements the Bindable interface (bindUntyped), AttributeServiceMessages (get, set) and AttributeClientMessages (update, setAck, setNak). It provides bind(s : AttributeServiceMessages), holds an optional (0..1) binding to an AttributeServiceMessages interface and refers to one client AttributeClientMessages interface.]

Figure 72. The Binder class generated for the Attribute interaction
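The blocking behaviour of a Binder can be illustrated with a small, self-contained sketch. It is not the generated Regent code: the interface ValueServiceMessages and the classes ValueBinder and BinderDemo are hypothetical, only the service-side delegation is shown, and a java.util.concurrent latch is assumed as the synchronisation mechanism.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a generated ServiceMessages interface.
interface ValueServiceMessages {
    void set(Object value) throws InterruptedException;
}

// Callers block in set() until a third party calls bind(); afterwards calls are delegated.
final class ValueBinder implements ValueServiceMessages {
    private final CountDownLatch bound = new CountDownLatch(1);
    private volatile ValueServiceMessages binding;

    void bind(ValueServiceMessages service) {
        binding = service;
        bound.countDown();          // release any threads waiting for the binding
    }

    @Override
    public void set(Object value) throws InterruptedException {
        bound.await();              // block until the binder has been bound
        binding.set(value);         // then delegate to the real service
    }
}

final class BinderDemo {
    /** A component thread sends a message while the binding is established concurrently. */
    static List<Object> run() {
        List<Object> received = Collections.synchronizedList(new ArrayList<>());
        ValueBinder binder = new ValueBinder();
        Thread component = new Thread(() -> {
            try {
                binder.set("v1");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        component.start();
        binder.bind(value -> received.add(value));   // performed by the "binding service"
        try {
            component.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }
}
```

Whether the component thread calls set before or after bind, the message is delivered exactly once and no unbound error is raised, which is the property the Binder is intended to provide.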

6.6.4. Proxies and Service Access Points (SAPs)

The interaction definition is also translated into classes that support distribution transparency. Inheritance is used to separate the concerns of presentation-layer translation (how messages are represented as octet sequences) and transport-layer protocol (how messages are transmitted between proxies). The abstract base classes ServiceProxyStub and ClientProxyStub encapsulate the presentation-layer marshalling. Each proxy stub class implements one message interface and has a reference to a message interface of its peer in the interaction. A proxy stub implements each method of its message interface by marshalling an identifier of the message and its arguments into a transport-level buffer and passing that buffer to a protected, abstract method named _transmit. It is the responsibility of derived classes to implement the _transmit method to transmit the buffer to the peer proxy using some transport mechanism. The proxy stub classes also define a protected method named _receive that takes a transport-level buffer as an argument, unmarshals a message identifier and arguments from the buffer and invokes the appropriate method of the peer message interface to which it holds a reference.

The concrete classes ConnectionServiceProxy, ConnectionClientProxy and ConnectionSAP are generated to provide bindings over connection-oriented transport protocols, such as TCP/IP. The connection proxy classes extend the proxy stub classes to interface with the Regent transport framework, which is described in detail in chapter 7. Each connection proxy implements the regent.transport.ProtocolUpcall interface, to which the transport connection delivers messages, and holds a reference to the regent.transport.ProtocolService interface of the connection protocol through which messages are transmitted. The protected _transmit method declared by the proxy stub base class is implemented to transmit the message via the connection’s ProtocolService, and delivery of messages from the connection is handled by passing them to the protected _receive method defined by the base class. The connection proxies implement the _unbind method by closing their connection and react to their connection being closed by calling the _unbind method of the message interface to which they hold a reference. They both implement the _getControl method by querying the protocol stack they are using for the requested control interface.
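The division of labour between a proxy stub and its transport-specific subclass can be sketched as follows. This is a simplified illustration and not the generated code: the interface CounterMessages and all class names are hypothetical, the same message interface is used in both directions for brevity, buffers are plain byte arrays, and the "transport" is a direct in-process call to the peer proxy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical message interface standing in for a generated one.
interface CounterMessages {
    void add(int amount) throws IOException;
    void reset() throws IOException;
}

// Presentation layer: marshals calls into buffers and unmarshals buffers into calls.
abstract class CounterProxyStub implements CounterMessages {
    private static final int ADD = 1;
    private static final int RESET = 2;
    private final CounterMessages local;   // local endpoint that receives unmarshalled messages

    protected CounterProxyStub(CounterMessages local) { this.local = local; }

    public void add(int amount) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(ADD);                 // message identifier
        out.writeInt(amount);              // message argument
        transmit(buffer.toByteArray());
    }

    public void reset() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        new DataOutputStream(buffer).writeInt(RESET);
        transmit(buffer.toByteArray());
    }

    // Transport layer hook: derived classes decide how the buffer reaches the peer.
    protected abstract void transmit(byte[] buffer) throws IOException;

    protected void receive(byte[] buffer) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer));
        int id = in.readInt();
        switch (id) {
            case ADD:   local.add(in.readInt()); break;
            case RESET: local.reset(); break;
            default:    throw new IOException("unknown message id " + id);
        }
    }
}

// A trivial "transport": hands the buffer straight to the peer proxy.
final class LoopbackProxy extends CounterProxyStub {
    LoopbackProxy peer;
    LoopbackProxy(CounterMessages local) { super(local); }
    protected void transmit(byte[] buffer) throws IOException { peer.receive(buffer); }
}

final class Counter implements CounterMessages {
    int total;
    public void add(int amount) { total += amount; }
    public void reset() { total = 0; }
}

final class ProxyDemo {
    static int run() {
        Counter counter = new Counter();
        LoopbackProxy sender = new LoopbackProxy(null);       // only sends in this demo
        LoopbackProxy receiver = new LoopbackProxy(counter);
        sender.peer = receiver;
        try {
            sender.add(5);
            sender.add(7);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.total;
    }
}
```

Replacing LoopbackProxy with a subclass that writes the buffer to a connection changes the transport without touching the marshalling code, which is the separation of concerns the inheritance structure is meant to achieve.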

Binding over connection-oriented transport protocols is supported by the ConnectionSAP class. A ConnectionSAP is created for a service endpoint in order to make that endpoint available over a specific connection-oriented protocol. The constructor of the ConnectionSAP class takes the endpoint and a textual description of the stack as arguments. It elaborates the stack description into a server-side stack that can be used to listen for and accept connection requests. When a ConnectionSAP receives a connection request it accepts the request, getting the stack of the new connection as a result, creates a new ConnectionClientProxy and pushes the new proxy onto the stack.

Each ConnectionSAP object makes its address available as a QualifiedAddress object. A QualifiedAddress encapsulates the address of a protocol endpoint with the textual description of the protocol stack itself. Binding is performed by passing the QualifiedAddress of a service’s ConnectionSAP to the client, where a ConnectionServiceProxy is created. The ConnectionServiceProxy class constructor takes a QualifiedAddress as one of its arguments, elaborates the stack description to create a client-side stack, and uses the ConnectionControl interface of the stack to establish a connection with the remote ConnectionSAP. The client endpoint is then bound to the service proxy to complete the binding.

[UML class diagram of the attribute package: AttributeServiceProxyStub implements AttributeServiceMessages (get, set) and AttributeClientProxyStub implements AttributeClientMessages (update, setAck, setNak); both declare the protected _transmit() and _receive() methods and refer to their peer client or service message interface. AttributeConnectionServiceProxy and AttributeConnectionClientProxy extend the proxy stubs and hold a transport reference to a ProtocolService. AttributeConnectionSAP implements the ServiceAccessPoint interface (getQualifiedAddress), is attached to an AttributeServiceStub endpoint (saps 0..n) and creates AttributeConnectionClientProxy objects.]

Figure 73. Generated classes that implement distribution transparency

6.6.5. Reference Objects

Each Midas interaction type has a corresponding Reference class that identifies a service endpoint of the interaction over multiple protocols as a list of QualifiedAddress objects. A Reference for an endpoint can be obtained by calling the endpoint's getReference method. The Reference class supports marshalling and genericity: uses of an interaction type in a Midas data structure definition are translated to References of the interaction in the generated Java data structures.

Midas Syntax:

    struct S {
        Attribute attr;
    };

Java Mapping:

    public class S {
        public attribute.AttributeReference attr;
    }

Table 9. A Midas interaction instance mapped to Java.

6.7. Additional Mappings for Interaction Types

Additional back-ends to the Midas compiler generate optional support classes from interaction definitions.

6.7.1. Reified Messages

Endpoints that synchronise with threads within a component need to queue the messages they receive until they can be handled by the component’s threads. This requires representing the messages as objects. Writing the classes for these objects is tedious, so a back-end of the Midas compiler is provided that can generate classes for reified messages from Midas interaction definitions.

A class is generated for each message to hold the arguments of that message. Each message class extends an abstract base class: there is one base class for provided messages of an interaction and one for required messages. This allows the programmer to take advantage of Java’s strong typing, if they so wish, when queuing or processing messages, because an endpoint implementation does not need to pass around untyped Object references. The message classes are defined as nested inner classes of their abstract base class. The name of each message class is defined to be the name of the Midas message with the first character capitalised.

    public abstract class AttributeServiceMsg {
        public abstract void dispatch( AttributeServiceMessages subject )
            throws RegentException;
        ...
        public static class SetValue extends AttributeServiceMsg {
            java.lang.Object value;
            AttributeClientMessages _client;

            public SetValue( java.lang.Object value,
                             AttributeClientMessages _clt ) {
                this.value = value;
                this._client = _clt;
            }

            public void dispatch( AttributeServiceMessages _subject )
                throws RegentException {
                _subject.setValue( value, _client );
            }
        }
    }

Figure 74. Reified message classes for the Attribute service message interface

When a message needs to be dispatched, the endpoint must determine the type of the message. Rather than store an identifier of the message in the base class and force client code to perform type cases based on the value of the identifier, the message classes use the Visitor pattern [GHJV94]: the base class defines an abstract dispatch method that takes a reference to a client or service message interface and concrete message classes implement the dispatch method by calling their corresponding operation of the message interface. Combined with anonymous inner classes, this provides a type safe equivalent to Java’s switch statement based on the type of the message.

    void dispatchMessage() {
        AttributeServiceMsg msg = ... Dequeue the message ...
        try {
            msg.dispatch( new AttributeServiceMessages() {
                public void setValue( Object val, AttributeClientMessages clt ) {
                    ... handle setValue message ...
                }
                public void getValue( AttributeClientMessages clt ) {
                    ... handle getValue message ...
                }
            } );
        }
        catch( RegentException ex ) {
            // Ignore Regent exceptions if we know they will not be thrown
        }
    }

Figure 75. Example usage of an anonymous inner class to dispatch reified messages

6.7.2. Datagram Proxies and SAPs

The default back-end of the Midas compiler generates distribution support (proxies and SAPs) that uses connection-oriented transport protocols. This allows greater control over the transport protocol used for individual bindings: the transport protocol can provide control interfaces through which QoS and other parameters can be configured on a binding-by-binding basis. However, the use of connection-oriented protocols has the disadvantage that a transport-level connection must be established for each application-level binding, involving at least one additional round-trip communication between client and server.

An additional back-end to the Midas compiler generates proxies and SAPs that use datagram-oriented, rather than connection-oriented, protocols. This is useful for those bindings for which transport-level control is not required.

Datagram proxies and SAPs usually make use of a reliable datagram protocol. The transport framework described in chapter 7 provides components that can be composed to create a reliable datagram protocol:

• The “bounds/tcp” protocol implements a reliable connection-oriented protocol that maintains message boundaries. Reliable connections can also be implemented by layering the appropriate components over UDP, such as “frag/rel/cod/udp”; see section 7.12 for a description of these protocol components.

• The “doc” protocol routes datagrams over connections. The “doc” layer maintains a table of outgoing connections indexed by the address of the remote peer of each connection. When a datagram is transmitted, the destination address is used to look up a connection in the table. If a connection is found, the datagram is transmitted over the connection; otherwise a new connection is created to the remote address, added to the table and the datagram is transmitted over the connection when it is established. If an outgoing connection is not used for some period of time it is removed from the table and closed.

• The “dmux” protocol multiplexes multiple datagram protocols over a single underlying datagram protocol.

Therefore a reliable datagram protocol suitable for use with the datagram proxies and SAPs is implemented using the stack “dmux/doc/bounds/tcp”. This stack shares a single TCP server socket between all endpoints using the same protocol, and uses a single outgoing and single incoming TCP connection to transmit messages between any two address spaces. TCP connections are created on demand and torn down when idle. This behaviour is the same as the method by which CORBA IIOP [OMG98] uses the TCP protocol to transmit object requests between address spaces, but is implemented only by combining reusable transport components.
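The connection table maintained by the “doc” layer can be sketched as follows. This is an illustrative approximation rather than the Regent implementation: the class ConnectionTable and its nested Connection and Connector interfaces are hypothetical, connection establishment is assumed to be synchronous (the real layer transmits once the connection has been established), and the current time is passed in explicitly so that the idle timeout is easy to exercise.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Routes datagrams over connections that are created on demand and closed when idle.
final class ConnectionTable {
    interface Connection {
        void send(byte[] datagram);
        void close();
    }

    interface Connector {
        Connection connect(String address);
    }

    private static final class Entry {
        final Connection connection;
        long lastUsed;
        Entry(Connection connection) { this.connection = connection; }
    }

    private final Map<String, Entry> table = new HashMap<>();
    private final Connector connector;
    private final long idleTimeoutMillis;

    ConnectionTable(Connector connector, long idleTimeoutMillis) {
        this.connector = connector;
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Sends a datagram, reusing the connection to the destination if one exists. */
    void transmit(String destination, byte[] datagram, long now) {
        Entry entry = table.get(destination);
        if (entry == null) {
            entry = new Entry(connector.connect(destination));
            table.put(destination, entry);
        }
        entry.lastUsed = now;
        entry.connection.send(datagram);
    }

    /** Closes and removes connections that have not been used within the timeout. */
    void closeIdle(long now) {
        Iterator<Entry> it = table.values().iterator();
        while (it.hasNext()) {
            Entry entry = it.next();
            if (now - entry.lastUsed >= idleTimeoutMillis) {
                entry.connection.close();
                it.remove();
            }
        }
    }

    int openConnections() { return table.size(); }
}
```

In this sketch two datagrams sent to the same address share one connection, and a periodic call to closeIdle tears down connections that have fallen out of use.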

6.8. Generic Types

Midas allows the definition of generic types: structures and interaction types can, themselves, be parameterised by both primitive and user-defined types. Java, by contrast, does not support generic types: class definitions cannot be parameterised. There are three possible ways to map generic Midas types into Java:

1. Use run-time type identification: The Midas compiler could generate objects that represent and manipulate type information at run-time. These type objects could be used to parameterise instances of the Java classes that are generated from generic Midas declarations. Message parameters and fields of user-defined types that have a generic parameter as their type would be represented as references to Object (the root of Java’s inheritance hierarchy), and the type objects and Java’s run-time type information would be used to enforce type compatibility and externalise or internalise such values.

2. Instantiate generics in the Midas compiler: A back-end to the Midas compiler could be used to instantiate generic types by generating Java classes representing the generic type parameterised by some set of type arguments. The advantage of this option is that better performance could be achieved because the use of run-time type information is avoided, and type information could be made available to the Java compiler, thereby catching type errors earlier in the development process. However, more code would be generated because a Java class would be generated for each instantiation of a generic Midas type, which would increase download times for Java applications served over the network.

3. Use a different language: Rather than strive for compatibility with the Java language, we could instead achieve compatibility at the level of Java byte-code. A Midas compiler could generate code in one of the variants of Java that does support genericity, such as PolyJ [MBL97] or GJ [BOSW98]. This option has the advantage that less code is generated and type information is still made available to the compiler. However, these variants of Java are not as widely used as Java or as well supported, often lagging one or two versions behind the latest Java development kit.

The third option, that of using another language, is not acceptable because it does not maintain source-code compatibility with Java. This reduces the ability to take advantage of programmers’ familiarity with the Java language and precludes the use of existing tools that manipulate Java source code, such as document generators or integrated development environments. The second option, that of instantiating generic types in the Midas compiler, is not acceptable because it complicates the development process for developers using prebuilt components and increases download times, which is a great disadvantage considering that the ability to download code is one of the main reasons that Java is widely used.

Therefore the Midas compiler generates run-time type information to support genericity. Midas types are represented at runtime by objects that extend the abstract class regent.interact.Type, shown in figure 76.

Run-time type information must be used at several points in a distributed system:

1. Within each node, to safely downcast untyped values.

2. In proxies, to marshal and unmarshal values into and out of network messages.

3. To check compatibility of bindings between endpoints in the same address space.

Java itself supports safe downcasts with the instanceof and cast operators and makes detailed type information available through the Reflection API [CLK98]. Therefore type information generated by Midas need only be concerned with marshalling, unmarshalling and checking the compatibility of bindings, and so only endpoint objects need to use type objects generated from Midas definitions. Java’s dynamic typing can be used to ensure that user-defined generic types are used correctly within component implementations.

The isCompatible method of the Type class is used to check whether a variable of this type can be assigned a value that is described by the given type object. The write method takes a DataOutput stream and an untyped object, checks whether the object is of the correct type and then writes a binary representation of the object onto the output stream. The read method reads a value of this type from a DataInput stream and returns the value as an untyped object.

    public abstract class Type {
        public abstract boolean isCompatible( Type t );

        public abstract void write( DataOutput out, Object data )
            throws TypeCompatibilityException, IOException;

        public abstract Object read( DataInput in )
            throws TypeCompatibilityException, IOException;
        ...
    }

Figure 76. Type class definition

Type objects representing the primitive Midas types are defined as static constant members of the regent.interact.Type class itself. These type objects marshal and unmarshal primitive types held in the standard “wrapper” classes defined in the java.lang package. For convenience, the type objects for numeric values do not always expect the exact type of wrapper object: the write method accepts any class derived from java.lang.Number, the abstract class from which all numeric wrapper classes are derived. The read method instantiates objects of the appropriate Java wrapper class so that Java run-time type checks can be used.

Primitive Type      Type Object  Type Marshalled      Type Unmarshalled
boolean             BOOLEAN      java.lang.Boolean    java.lang.Boolean
octet               OCTET        java.lang.Number     java.lang.Byte
short               SHORT        java.lang.Number     java.lang.Short
unsigned short      USHORT       java.lang.Number     java.lang.Short
long                LONG         java.lang.Number     java.lang.Integer
unsigned long       ULONG        java.lang.Number     java.lang.Integer
long long           LONGLONG     java.lang.Number     java.lang.Long
unsigned long long  ULONGLONG    java.lang.Number     java.lang.Long
float               FLOAT        java.lang.Number     java.lang.Float
double              DOUBLE       java.lang.Number     java.lang.Double
char                CHAR         java.lang.Character  java.lang.Character
string              STRING       java.lang.String     java.lang.String

Table 10. Type objects representing primitive types
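The behaviour of a primitive type object can be illustrated with a cut-down sketch of the LONG entry in table 10. The class SketchType is a hypothetical, simplified stand-in for regent.interact.Type: it omits isCompatible, assumes a 32-bit wire representation for the Midas long type (suggested by its mapping to java.lang.Integer) and throws IllegalArgumentException where the real class would throw TypeCompatibilityException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

abstract class SketchType {
    abstract void write(DataOutput out, Object data) throws IOException;
    abstract Object read(DataInput in) throws IOException;

    // Accepts any java.lang.Number when marshalling, always produces an Integer when unmarshalling.
    static final SketchType LONG = new SketchType() {
        void write(DataOutput out, Object data) throws IOException {
            if (!(data instanceof Number)) {
                throw new IllegalArgumentException("expected a java.lang.Number");
            }
            out.writeInt(((Number) data).intValue());
        }

        Object read(DataInput in) throws IOException {
            return Integer.valueOf(in.readInt());
        }
    };

    /** Marshals a value into a byte array and unmarshals it again. */
    static Object roundTrip(SketchType type, Object value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            type.write(new DataOutputStream(buffer), value);
            return type.read(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Marshalling a java.lang.Short through SketchType.LONG and reading it back yields a java.lang.Integer with the same numeric value, so receiving code can rely on an ordinary Java cast to Integer.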

For each struct, union or interaction type defined in a Midas specification, the Midas compiler generates a static, final member that holds a reference to a Type object of an anonymous nested class. The read and write methods of these type objects delegate to the static read and write functions of the generated class, performing run-time type checks to ensure type compatibility.

Midas Declaration:

    struct S {
        ...
    };

Generated Java Code:

    class S {
        ...
        public static S read( DataInput in ) { ... }
        public static void write( DataOutput o, S data ) { ... }

        static final Type TYPE = new Type() {
            ...
            public Object read( DataInput in ) ... {
                return S.read(in);
            }
            public void write( DataOutput out, Object o ) ... {
                try {
                    S.write( out, (S)o );
                }
                catch( ClassCastException e ) {
                    throw new regent.interact.TypeCompatibilityException(...);
                }
            }
        };
    }

Table 11. The type object generated for each user defined type

Generic user-defined types, other than interaction types, are compiled into Java classes that hold values of their type parameters as untyped Object references. It is the responsibility of code that uses generic types to ensure that correctly typed data is stored in these fields by using the standard Java mechanisms for run-time type identification.

Because instances of generic data types contain untyped values, Type objects representing the generic parameters of the data type are needed to marshal and unmarshal these generic data values. These Type objects are passed as extra parameters to the static read and write functions generated by the Midas compiler.

Similarly, the Type object representing an instantiation of the generic data type must also hold references to the Type objects representing the instantiation parameters. Generic user-defined types define a private nested class (named _Type) that represents the type of an instantiated user-defined type. Instances of _Type store references to other Type objects representing the parameters of the generic type. Instances of the type class are made available to client code via a factory function, named TYPE, that takes the Type parameters as arguments and constructs a new instance of the _Type class. Unlike the anonymous, singleton type objects generated from non-generic type definitions, multiple instances of _Type can exist at any time, so the _Type class defines the equals and hashCode methods to allow comparison of _Type objects and their use in Java hash-tables.

Unlike user-defined generic data structures, generic interaction objects store their Type parameters throughout their lifetime. This is necessary because it is interaction objects that pass typed data between address spaces and therefore must call the marshalling and unmarshalling functions generated by the Midas compiler, passing type objects to those functions as necessary. The type parameters of an interaction object provide enough information to marshal any data type that is used as an argument to a message of that interaction. Marshalling code generated for a generic user-defined type is shown in table 12.

Generic interaction objects make their type parameters available to other objects through the interface regent.interact.Generic. This allows proxy and SAP objects to obtain the Type objects necessary for marshalling and unmarshalling values of the interaction object’s parameter types. The regent.interact.GenericStub class implements the Generic interface by storing Type objects in an array. Proxy and SAP objects extend the class regent.interact.DelegatingGeneric, which implements the Generic interface by delegating calls to another Generic object; proxies and SAPs delegate to the interaction object with which they are associated. These relationships are illustrated in the UML diagram of figure 77.

[UML class diagram: the Generic interface (getParameterCount() : int, getParameter(n : int) : Type) is implemented by GenericStub, which holds 1..n Type parameters, and by DelegatingGeneric, which delegates to a master Generic object. ExampleEndpoint extends GenericStub; ExampleProxy extends DelegatingGeneric and refers to its endpoint. The generated class ExampleEndpoint._Type (equals, hashCode) implements the Type interface (isCompatible, read, write) and holds 1..n parameter Type objects.]

Figure 77. Classes supporting generic types

Midas Declaration:

    struct G {
        T generic_field;
        ...
    };

Generated Java Code:

    class G {
        Object generic_field;
        ...
        public static G read( DataInput in, Type T ) { ... }
        public static void write( DataOutput o, G data, Type T ) { ... }

        private static class _Type extends Type {
            private Type T;

            _Type( Type T ) {
                this.T = T;
            }

            public boolean isCompatible( regent.interact.Type _t ) {
                return _t instanceof G._Type &&
                       T.isCompatible(((G._Type)_t).T);
            }

            public Object read( DataInput in ) ... {
                return G.read( in, T );
            }

            public void write( DataOutput out, Object o ) ... {
                try {
                    G.write( out, (G)o, T );
                }
                catch( ClassCastException e ) {
                    throw new regent.interact.TypeCompatibilityException(...);
                }
            }

            public int hashCode() {
                return T.hashCode();
            }

            public boolean equals( java.lang.Object _o ) {
                return _o instanceof G._Type &&
                       T.equals(((G._Type)_o).T);
            }
        }

        public static Type TYPE( Type T ) {
            return new _Type(T);
        }
    }

Table 12. Genericity support generated from a Midas specification

6.9. Summary

This chapter has described how the interaction model and Midas language features described in chapter 3 are mapped to features of the Java language. Midas is translated into Java classes that support distribution transparency and binding. Following the principle of increasing flexibility by separating concerns, support for generic interaction protocols, binding, presentation layer marshalling and transport layer communication are separated and can be replaced or modified by the programmer as required. Further flexibility is provided by the Regent transport framework which allows the dynamic construction of transport protocols, allowing services to take advantage of new transport protocols as they become available and clients to dynamically load the protocols required to use a service without a-priori knowledge of those protocols. The Regent transport framework is described in detail in chapter 7.

7. Transport Protocol Framework

Nothing is particularly hard if you divide it into small jobs.
Henry Ford

7.1. Introduction

The proxies and SAPs for application-layer interaction protocols access transport protocol services via the Regent transport framework. This framework is distinguished by three main features: it is platform independent, hiding native networking APIs of the host operating system behind cross-platform, object-oriented abstractions; it is component based, allowing transport functionality to be implemented through the composition of lightweight protocol components; and it is dynamic, allowing protocol components to be loaded and composed at runtime.

7.2. Requirements of the Transport Subsystem

The transport subsystem for a flexible middleware platform has to meet the following requirements:

• Platform Independence. Different operating systems make the transport protocols they provide available through different programming interfaces. For example, Microsoft Windows provides the Winsock [QS95] and NetBIOS APIs and functions in the Win32 API for using Microsoft LanManager protocols, while Unix variants provide the Socket or XTI APIs [Steve97]. These different APIs need to be encapsulated behind a cross-platform API to allow code that uses the framework to be executed on different platforms.

• Independence from Higher Layer Protocols. Although the transport framework is designed to support Midas communication endpoints, it is important that it can be used separately if need be. This allows programmers to make use of components that use different presentation and application layer protocols, such as the CORBA GIOP [OMG98], but still gain the benefits provided by the transport framework, and allows the transport framework to be integrated with application-specific APIs such as the Java Media Framework [SWDB98].

• Composition. A typical transport protocol, such as TCP, is often implemented as a single, monolithic component. Such implementations can provide more guarantees than are actually required by the application and the implementation of these guarantees can have an adverse effect on overall performance. For example, TCP provides multiplexing and in-order, reliable delivery. If reliable delivery is not required, one cannot configure TCP to provide only multiplexing and in-order delivery. Furthermore, it is often necessary to compose other functions, such as compression or security, with existing protocols. Therefore, it is necessary to encapsulate individual protocol mechanisms as components and allow the application to compose them to achieve required functionality.

• Common Interface. To allow maximum reuse of protocol components, all components must conform to the same interface, so that any component can be layered above any other. Although this will allow the creation of invalid protocol stacks, we found during development that the use of strong typing to avoid invalid stacks severely limited the ability to reuse generic layers and therefore resulted in large amounts of duplicated code. Higher level mechanisms can be used to help developers create valid protocols [GS98].

• Dynamicity. The protocols required for a binding depend on the relative location of the components at either end of that binding. If they are located on the same host, shared-memory or an efficient local IPC mechanism can be used. If they are located on the same physical network, protocols for internetworking can be avoided. If they are separated by a private internet, the stack must contain layers that provide fragmentation and reassembly of messages and reliable delivery. If they are separated by an untrusted network, additional layers providing encryption and authentication may be necessary. The location of components is decided at run-time as components are launched. Therefore, a system must be able to load and compose protocol components into the stacks required for each binding depending on run-time information. Servers need to describe the composition of their stacks so that clients can build compatible stacks when binding.

• Manageability. The stack must support management. This breaks down into two requirements. Firstly, it must be possible to examine and modify parameters controlling the operation of the stack and invoke control operations on the stack. Secondly, the stack must provide a way for management agents to receive notification of important events occurring within the stack. Example events might be notification that a connection had failed or that the QoS available from the network had fallen below some threshold.

As discussed in section 2.5, no transport protocol framework fully meets these requirements. This chapter describes the transport framework that was developed to address these issues.

7.3. Overview

The fundamental abstractions used by the transport framework are the protocol graph, protocol layers, services and upcalls. Data is routed between network devices and communication endpoints through a graph of protocol layers. Each layer is an object that implements some protocol functionality. A layer provides one or more data transmission services to higher layers and can require such services from lower layers. Service requirements are represented by upcall interfaces. Upcalls are bound to services; when bound, data is passed down the stack through the service interface of the lower layer and passed up the stack through the upcall interface of the higher layer. This component model matches that of the Darwin language and so we use the Darwin graphical notation to represent configurations of protocol layers.

A protocol layer can also provide control interfaces, composed of attributes and control operations, through which higher layers can control the operation of the protocol. Control interfaces also export events to higher layers. To be notified of events occurring beneath it in the stack, a protocol layer must implement a compatible “event listener” interface and register the interface to be called back when events occur. The model of attributes and events used by control interfaces is the same as that used by the Java Beans component framework [Chap96].

[Diagram labels: service, control, rate, event, upcall]

Figure 78. Protocol layer interfaces

Stacks are constructed by instantiating layer objects and binding the required services of higher layers to services provided by layers below. As each requirement of a layer is bound, the layer queries the layer providing the bound service for any control interfaces it requires and registers for event callbacks. If a request for a control interface cannot be met by the layer directly below, the request is passed down the stack until a layer is found that implements the interface or the bottom of the stack is reached. This chain of responsibility [GHJV94] has a two-fold benefit: firstly, it increases the reusability of protocol implementations because each layer does not need to know the implementation details of the stack over which it is layered; secondly, it improves performance by allowing configuration requests and event notifications to bypass layers that are not interested in them.

A protocol configuration need not be linear; indeed, most protocol graphs are structured as a tree. In such graphs, layers are responsible for multiplexing data transmitted by multiple higher layers using the protocol provided by a single lower layer. In the Regent transport framework, a multiplexor is a normal protocol layer, with the same interface as any other layer in the graph. However, when a higher layer is bound to a service of a multiplexor, the multiplexor instantiates a hidden session [Pryce99] layer that implements the service and provides the service interface of the session to the higher layer instead. Each session encapsulates addressing information and protocol state for the client using it.

[Diagram labels: crypt, rate, cx]

Figure 79. Protocol layers composed into a stack, showing control and event connections

Another important classification of protocol layer is the virtual protocol. Most protocol layers modify the messages passing through them. They either add headers to outgoing messages and remove headers from incoming messages or completely transform the message contents, such as by compression or encryption. Virtual protocols, on the other hand, do not modify the messages passing through them. Instead, virtual protocols perform monitoring on the data flowing through the stack and can perform management actions in response to measured activity. Examples of virtual protocols include a layer that measures the QoS levels available to the application or a layer that performs leaky-bucket traffic shaping.

For two stacks to communicate, the same non-virtual protocols must be in each stack and in the same order in each stack. However, the presence or absence of virtual protocols in a stack does not affect the stack’s compatibility with others. Therefore virtual protocols may be inserted into a stack by individual nodes of an application, in order to perform monitoring and management, without affecting the nodes with which they communicate.
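The compatibility rule above can be sketched as a comparison of the two stacks after their virtual layers have been filtered out. This is only an illustration: the LayerInfo and StackCompatibility classes and the layer names are hypothetical and are not part of the Regent framework.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical description of one layer: its protocol name and whether it is virtual. */
final class LayerInfo {
    final String name;
    final boolean virtual;

    LayerInfo(String name, boolean virtual) {
        this.name = name;
        this.virtual = virtual;
    }
}

final class StackCompatibility {
    /** Returns the names of the non-virtual layers, preserving their order. */
    static List<String> wireProtocols(List<LayerInfo> stack) {
        List<String> names = new ArrayList<>();
        for (LayerInfo layer : stack) {
            if (!layer.virtual) {
                names.add(layer.name);
            }
        }
        return names;
    }

    /** Two stacks are compatible when their non-virtual layers match in name and order. */
    static boolean compatible(List<LayerInfo> a, List<LayerInfo> b) {
        return wireProtocols(a).equals(wireProtocols(b));
    }

    public static void main(String[] args) {
        List<LayerInfo> client = Arrays.asList(
            new LayerInfo("tcp", false),
            new LayerInfo("qos-monitor", true),   // virtual: ignored by the comparison
            new LayerInfo("crypt", false));
        List<LayerInfo> server = Arrays.asList(
            new LayerInfo("tcp", false),
            new LayerInfo("crypt", false));
        System.out.println(compatible(client, server)); // prints true
    }
}
```

Reordering the non-virtual layers, or omitting one of them, makes the comparison fail, which mirrors the ordering requirement stated above.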

[Diagram labels: client 1, client 2, client 2, three session layers, mux]

Figure 80. Session layers created by a multiplexor for each of its clients

7.4. Implementation of Protocol Layers

Each protocol layer in the stack is a Java object that conforms to the regent.transport.ProtocolLayer interface, which is shown in figure 81.

    import java.io.DataInput;
    import java.io.IOException;
    import java.io.PushbackReader;

    public interface ProtocolLayer {
        // Address identifying this layer within its protocol's address space
        Address getAddress();

        // Control interface lookup by class
        ProtocolControl getControl( Class control_class );

        // Services provided by the layer
        String[] getServiceNames();
        ProtocolService getService( String name, Object discriminator )
            throws TransportException;

        // Upcalls required by the layer
        String[] getUpcallNames();
        ProtocolUpcall getUpcall( String name, Object discriminator )
            throws TransportException;

        ProtocolLayer getMultiplexor();
        void abortStack();

        // Internalise an address from textual or binary form
        Address parseAddress( PushbackReader reader )
            throws AddressFormatException, IOException;
        Address readAddress( DataInput input )
            throws AddressFormatException, IOException;
    }

Figure 81. ProtocolLayer interface definition

The ProtocolLayer interface provides methods to enumerate the services and upcalls of the protocol layer. The getServiceNames method returns an array containing the names of the services provided by the layer. The services can be acquired by the getService method. This method takes the name of the service as a parameter and returns the ProtocolService interface of the service; if the name parameter is null, the service returned is that which is considered “default” for the layer. If the layer is a multiplexor, a new session layer is instantiated and the default service of the session is returned. The discriminator argument can be used to request a session with a specific protocol discriminator; if left null, the multiplexor assigns a unique discriminator for the new session. Similarly, the getUpcallNames and getUpcall methods return the names of the upcalls required by the layer and the ProtocolUpcall interface of a named upcall, respectively.

The getAddress method returns an Address object that uniquely identifies the layer object within the address space of the protocol that it implements. A layer can also internalise addresses that have been externalised as textual strings or raw binary data: the parseAddress method reads a textual address from a stream of characters and the readAddress method reads a binary address from a stream of bytes. Both methods can throw the AddressFormatException to indicate that the stream contains an invalid representation of an address for this type of layer. The run-time representation of addresses is described in section 7.6.

The getControl method is used to query the layer’s control interfaces. This method takes the Class object representing the required control interface as an argument. If the layer does not support the requested control interface, it should pass the request to the layer below, or return null if it is at the bottom of the stack. To maximise the ability to reuse protocol layers, a number of standard control and event interfaces are defined that support common protocol types and options, such as active and passive connection establishment, quality of service management and error handling. Standard control and event interfaces are defined in the package regent.transport.controls. User-defined control interfaces must extend the “tag” interface regent.transport.ProtocolControl. This allows the ProtocolLayerStub class to find a layer’s control interfaces by reflection. Similarly, user-defined event-listener interfaces must extend the interface regent.transport.ProtocolEventListener.
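The delegation performed by getControl can be illustrated with a minimal sketch. The types below (Control, RateControl, SketchLayer, RateLimitedLayer) are simplified, hypothetical stand-ins rather than the regent.transport classes, and the generic signature is an assumption made for type safety; the interface in figure 81 takes a raw Class argument.

```java
/** Simplified stand-in for the ProtocolControl tag interface. */
interface Control {}

/** Hypothetical control interface exposing a single attribute. */
interface RateControl extends Control {
    int getRate();
}

/** Simplified layer that knows only the layer directly beneath it. */
class SketchLayer {
    private final SketchLayer below;   // null at the bottom of the stack

    SketchLayer(SketchLayer below) {
        this.below = below;
    }

    /** Returns this layer if it implements the control, otherwise asks the layer below. */
    <T extends Control> T getControl(Class<T> controlClass) {
        if (controlClass.isInstance(this)) {
            return controlClass.cast(this);
        }
        return below == null ? null : below.getControl(controlClass);
    }
}

/** Hypothetical layer that implements the control interface itself. */
class RateLimitedLayer extends SketchLayer implements RateControl {
    RateLimitedLayer(SketchLayer below) {
        super(below);
    }

    public int getRate() {
        return 64;
    }
}

class ControlLookupDemo {
    public static void main(String[] args) {
        SketchLayer bottom = new RateLimitedLayer(null);
        SketchLayer middle = new SketchLayer(bottom);
        SketchLayer top = new SketchLayer(middle);

        // The request passes through two layers that do not implement RateControl.
        RateControl rate = top.getControl(RateControl.class);
        System.out.println(rate.getRate()); // prints 64
    }
}
```

The caller at the top of the stack never learns which layer answered the request, which is the property that lets layers be reused over different stacks.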

The interfaces regent.transport.ProtocolService and regent.transport.ProtocolUpcall define the interactions between a bound service and upcall. The attach methods of each interface are used to bind layers into a stack - a binding is established between an upcall and a service by passing a reference to the upcall to the attach method of the service and then passing a reference to the service to the attach method of the upcall. Data is passed down the stack by calling the transmit method of a service and passed up the stack by calling the receive method of an upcall. The transmit method can throw exceptions to report errors that can be detected immediately; the receive method does not throw exceptions to avoid disrupting the system threads responsible for receiving data from devices. Incoming and outgoing data is passed between layers in buffer and address objects that encapsulate efficient memory management algorithms; memory management issues are addressed in more detail in section 7.5.
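A minimal sketch of the two-step binding follows. It assumes simplified stand-ins for the service and upcall interfaces in which buffers are reduced to byte arrays and addresses are omitted; SketchService, SketchUpcall, LoopbackService, RecordingUpcall and BindingDemo are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a service: data flows down through transmit. */
interface SketchService {
    void attach(SketchUpcall upcall);
    void transmit(byte[] data);
}

/** Simplified stand-in for an upcall: data flows up through receive. */
interface SketchUpcall {
    void attach(SketchService service);
    void receive(byte[] data);
}

/** Hypothetical lower layer that loops transmitted data straight back up the stack. */
class LoopbackService implements SketchService {
    private SketchUpcall upcall;

    public void attach(SketchUpcall upcall) {
        if (this.upcall != null) {
            throw new IllegalStateException("already attached");
        }
        this.upcall = upcall;
    }

    public void transmit(byte[] data) {
        upcall.receive(data);
    }
}

/** Hypothetical higher layer that records everything it receives. */
class RecordingUpcall implements SketchUpcall {
    SketchService service;
    final List<byte[]> received = new ArrayList<>();

    public void attach(SketchService service) {
        this.service = service;
    }

    public void receive(byte[] data) {
        received.add(data);
    }
}

class BindingDemo {
    /** Binds both directions: the service learns its upcall, then the upcall learns its service. */
    static void bind(SketchUpcall upcall, SketchService service) {
        service.attach(upcall);
        upcall.attach(service);
    }

    public static void main(String[] args) {
        RecordingUpcall upper = new RecordingUpcall();
        bind(upper, new LoopbackService());
        upper.service.transmit(new byte[] {1, 2, 3});
        System.out.println(upper.received.size()); // prints 1
    }
}
```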

The release method of the ProtocolService interface detaches the upcall that has been bound to the service. After a call to release, no more data can be transmitted by that service and no more data will be delivered to the upcall that was detached. However, because of the multithreaded nature of the transport framework, data can be passed to the receive method of the associated upcall during the processing of the release method; concurrency issues are addressed in more detail in section 7.11. Once all the services of a layer have been released, the layer is no longer needed and can be destroyed. Typically this involves releasing the services of layers beneath it and freeing resources, such as threads and device handles.

    public interface ProtocolService {
        // Bind an upcall to this service
        void attach( ProtocolUpcall upcall )
            throws AlreadyAttachedException;

        ProtocolLayer getLayer();

        // Pass data down the stack
        void transmit( TransmitBuffer data, Address to )
            throws TransportException;

        // Detach the bound upcall
        void release();
    }

Figure 82. ProtocolService interface definition

    public interface ProtocolUpcall {
        // Bind a service to this upcall
        void attach( ProtocolService service )
            throws TransportException;

        // Pass data up the stack; declares no exceptions
        void receive( ReceiveBuffer data, Address sender );
    }

Figure 83. ProtocolUpcall interface definition

As described below, multiplexors require special processing when protocol stacks are constructed and the names of virtual protocols can be removed from stack descriptions that are passed between nodes, so it must be possible to identify whether a layer is a multiplexor or is virtual. “Tag” interfaces are used to identify multiplexors and virtual protocols: multiplexors must implement regent.transport.Multiplexor and virtual protocols must implement regent.transport.VirtualProtocol. These interfaces are empty - they provide no operations - but Java’s run-time type identification or Reflection API can be used to inspect a layer object or class to determine whether the layer implements one of these interfaces and so is a multiplexor or virtual.
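Both ways of testing for a tag interface can be sketched briefly. MultiplexorTag and VirtualProtocolTag below are hypothetical empty interfaces standing in for regent.transport.Multiplexor and regent.transport.VirtualProtocol; the layer classes are placeholders with no behaviour.

```java
/** Empty tag interfaces: they declare no operations. */
interface MultiplexorTag {}
interface VirtualProtocolTag {}

/** Placeholder layer classes used only to demonstrate the checks. */
class PlainLayer {}
class MuxLayer implements MultiplexorTag {}
class MonitorLayer implements VirtualProtocolTag {}

class TagDemo {
    /** Run-time type identification on a layer object. */
    static boolean isMultiplexor(Object layer) {
        return layer instanceof MultiplexorTag;
    }

    /** Reflection on a layer class, without instantiating the layer. */
    static boolean isVirtual(Class<?> layerClass) {
        return VirtualProtocolTag.class.isAssignableFrom(layerClass);
    }

    public static void main(String[] args) {
        System.out.println(isMultiplexor(new MuxLayer()));   // prints true
        System.out.println(isMultiplexor(new PlainLayer())); // prints false
        System.out.println(isVirtual(MonitorLayer.class));   // prints true
    }
}
```

The class-based check is the useful one when a stack description names layer classes that have not yet been instantiated.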

Implementing the component packaging defined by the ProtocolLayer interface for each new protocol layer is tedious and error prone. Therefore an implementation is provided in the form of the abstract ProtocolLayerStub class that uses the Java Reflection API [CLK98] to determine the services and upcalls and control interfaces of a derived layer from some simple coding conventions. The ProtocolLayerStub implementation of the ProtocolLayer interface examines the class that instantiated the layer object. Public instance variables holding references to objects derived from ProtocolUpcall are assumed to be upcalls to the layer. If the layer is not a multiplexor - that is, it is not tagged with the Multiplexor interface - then public instance variables of the layer that hold references to objects derived from ProtocolService are assumed to be services provided by the layer. If the layer is a multiplexor, then services are assumed to be defined as instance variables that hold references to the support interface, MultiplexorService, shown in figure 84. When a service is requested from a multiplexor, the ProtocolLayerStub implementation calls the createSession method of the appropriate MultiplexorService to create a new session layer and returns the default service of the session. Control interfaces are found by querying the class to see whether it is derived from the interface class passed to the getControl method; if so, a reference to the layer is returned.

    public interface MultiplexorService {
        // Create a new session layer and return its service
        ProtocolService createSession( Object discriminator )
            throws TransportException;
    }

Figure 84. MultiplexorService interface definition

A protocol layer can implement service and upcall interfaces in two ways. The simplest, but most restrictive, is for the layer object to implement the ProtocolService and ProtocolUpcall interfaces directly, in addition to the ProtocolLayer interface, and store references to itself in public member variables if they are to be accessible to the ProtocolLayerStub implementation. This is restrictive because a layer can only have at most one service and one upcall. Alternatively, the layer uses auxiliary objects that implement the ProtocolService or ProtocolUpcall interface and delegate calls to those interfaces to non-public methods of the layer object. This approach is greatly simplified by the use of anonymous inner classes, introduced to Java in version 1.1.
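The second approach, together with discovery of services from public instance variables, can be sketched as follows. This is an illustration of the coding convention only: MiniService, UppercaseLayer and ServiceDiscovery are hypothetical, and the reflection code is an assumption about how such a stub could work, not the ProtocolLayerStub implementation.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Simplified stand-in for a service interface. */
interface MiniService {
    String transmit(String data);
}

/** Hypothetical layer that exposes its service through an auxiliary object. */
class UppercaseLayer {
    /** Public instance variable of service type: treated as a service by the convention. */
    public final MiniService upper = new MiniService() {
        public String transmit(String data) {
            return doTransmit(data);   // delegate to the non-public method of the layer
        }
    };

    /** Public, but not of service type, so it is not treated as a service. */
    public final String label = "uppercase";

    private String doTransmit(String data) {
        return data.toUpperCase(Locale.ROOT);
    }
}

class ServiceDiscovery {
    /** Returns the names of public instance fields whose type derives from MiniService. */
    static List<String> serviceNames(Object layer) {
        List<String> names = new ArrayList<>();
        for (Field field : layer.getClass().getFields()) {   // public fields only
            boolean isStatic = Modifier.isStatic(field.getModifiers());
            if (!isStatic && MiniService.class.isAssignableFrom(field.getType())) {
                names.add(field.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        UppercaseLayer layer = new UppercaseLayer();
        System.out.println(serviceNames(layer));             // prints [upper]
        System.out.println(layer.upper.transmit("payload")); // prints PAYLOAD
    }
}
```

Because each service is a separate object, a layer written this way can expose any number of services, which the direct-implementation approach cannot.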

7.5. Memory Management

Empirical studies have shown that a major contributing factor to latency in middleware frameworks is excessive data copying when performing marshalling and protocol processing [GS98]. Highly layered protocols exacerbate this problem by passing data between many layers and processing messages piecemeal: each protocol component adds its header to messages passed down the stack and removes its header from messages passed up the stack. The applications using the stack cannot know a priori the exact composition of the stack or the memory requirements of individual layers and so cannot allocate message buffers that are exactly the right size. Therefore, the transport framework passes data between layers in objects that provide efficient mechanisms for constructing and processing messages piece by piece.

In our framework, the basic unit of memory management is the Chunk. A Chunk is a contiguous, read-only sequence of bytes. Because a Chunk is read-only, it can be shared between protocol layers and between threads. This allows a message made up of chunks to be buffered in different layers of the stack and delivered to endpoints within the same address space without copying or synchronisation overhead.

A Buffer object stores a sequence of chunks in a data structure that allows chunks to be efficiently appended to either end of the sequence. This is vital because the presentation layer has to write data onto the tail of the Buffer when marshalling messages and the transport layers have to add headers to the front of the Buffer. Buffer objects only hold the sequence of chunks; they do not themselves provide support for efficiently allocating memory for headers or the message payload or for reading data from the chunks in the sequence. The presentation and transport layers use buffers to pass data between each other but build and access the memory stored in Buffers through wrapper objects that act as streams and perform efficient memory allocation.

Buffers are passed down the stack for transmission in TransmitBuffer objects. These objects support efficient allocation and management of memory for message headers and provide methods for writing formatted data into headers. A TransmitBuffer object maintains an array of bytes into which headers are written. New headers are allocated by calling the TransmitBuffer’s newHeader method, which takes the size of the header in bytes and returns a DataOutput stream that can be used to write formatted data into the header. When the header array is full, it is pushed onto the head of the encapsulated Buffer and a new header array, twice the size, is allocated. When the TransmitBuffer is to be transmitted, the encapsulated Buffer is acquired by calling the getBuffer method. This flushes any headers in the header array onto the front of the encapsulated Buffer and returns it so that the raw data can be accessed. The getBuffer method also allows a TransmitBuffer to be converted to a ReceiveBuffer to support efficient, short-circuited delivery of data within the same address space.
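The idea of a chunk sequence that grows at both ends can be sketched with a double-ended queue. This is a simplification under stated assumptions: SketchBuffer is hypothetical, it uses java.util.ArrayDeque rather than the array with head and tail capacity described above, and its chunks are plain byte arrays that are treated as read-only by convention only.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified chunk sequence: chunks are added at either end without copying payload bytes. */
class SketchBuffer {
    private final Deque<byte[]> chunks = new ArrayDeque<>();
    private int size;

    /** Transport layers add headers to the front of the sequence. */
    void pushHeader(byte[] chunk) {
        chunks.addFirst(chunk);
        size += chunk.length;
    }

    /** The presentation layer appends marshalled data to the tail. */
    void pushTrailer(byte[] chunk) {
        chunks.addLast(chunk);
        size += chunk.length;
    }

    int size() {
        return size;
    }

    /** Flattens the chunks into one array; this is the only point at which bytes are copied. */
    byte[] toByteArray() {
        byte[] out = new byte[size];
        int position = 0;
        for (byte[] chunk : chunks) {
            System.arraycopy(chunk, 0, out, position, chunk.length);
            position += chunk.length;
        }
        return out;
    }

    /** Builds a message the way a two-layer stack would: payload first, then headers on the way down. */
    static String demo() {
        SketchBuffer buffer = new SketchBuffer();
        buffer.pushTrailer("data".getBytes(StandardCharsets.US_ASCII));
        buffer.pushHeader("T:".getBytes(StandardCharsets.US_ASCII));  // upper layer's header
        buffer.pushHeader("I:".getBytes(StandardCharsets.US_ASCII));  // lower layer's header
        return new String(buffer.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints I:T:data
    }
}
```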

Buffers are passed up the stack for delivery to the application in ReceiveBuffer objects. These objects support the efficient removal of message headers and provide methods for reading formatted data from the headers. A ReceiveBuffer maintains the index of the unread Chunk and byte within that Chunk at the head of the encapsulated Buffer. A ReceiveBuffer is derived from InputStream, so that bytes can be read from it, and is connected to a DataInputStream so that formatted data can be read from the head of the Buffer.

[Class diagram. Classes shown: InputStream, OutputStream, DataInputStream, DataOutputStream, ReceiveBuffer, TransmitBuffer (-header_offset : int, -header_length : int, +newHeader() : DataOutput, +write()), BufferOutputStream, Buffer (-size : int, -head_capacity : int, -tail_capacity : int, +pushHeader(), +copyHeader(), +pushTrailer(), +copyTrailer()), Chunk (-offset : int, -length : int) and byte[]. Association labels: hdr_in, hdr_out, buffer, current_header, chunks, data.]

Figure 85. Static structure of the memory management classes

Buffers are usually created by a BufferOutputStream, an object that conforms to the standard Java OutputStream interface and streams data into a Buffer. In this respect, it is very like the ByteArrayOutputStream class that is part of the standard Java I/O library. A ByteArrayOutputStream streams data into a contiguous byte array. Because it creates a single byte array it must periodically copy the streamed data to grow the array. This copying causes considerable overhead, especially as the array is sized arithmetically, rather than geometrically. In comparison, the BufferOutputStream does not copy the streamed data. Instead, as new space is needed, a new Chunk is allocated and appended to the Buffer. When the array of Chunks in the Buffer is full, a new array is constructed and the Chunk references are copied from one to the other. However, because the Chunks are stored by reference, the copying overhead is much less than that of copying the data itself. Additionally, both the size of each Chunk allocated and the array of Chunks in the Buffer are grown geometrically, so the overhead grows logarithmically with the amount of data written to the stream. The amount of time taken to write various sized arrays of bytes into a BufferOutputStream and a ByteArrayOutputStream is shown in figure 86; the constant overhead is caused by the JVM loading the stream classes, instantiating the stream objects and performing garbage collection.

[Plot. Vertical axis: Time (msecs), 0 to 80. Horizontal axis: Data Size, 32 to 65536, doubling at each step. Legend: ByteArrayInputStream, BufferInputStream.]

Figure 86. Time taken to write data to a BufferOutputStream and ByteArrayOutputStream
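The allocation strategy described for the BufferOutputStream can be sketched as an output stream that starts a new, larger chunk whenever the current one fills, leaving earlier chunks untouched. ChunkedOutputStream is a hypothetical illustration; the initial chunk size of 32 bytes and the doubling factor are assumptions, not values taken from the Regent implementation.

```java
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Sketch of a stream that never copies data it has already written. */
class ChunkedOutputStream extends OutputStream {
    private final List<byte[]> full = new ArrayList<>();  // completed chunks, held by reference
    private byte[] current = new byte[32];                // assumed initial chunk size
    private int used;                                     // bytes used in the current chunk
    private int size;                                     // total bytes written

    @Override
    public void write(int b) {
        if (used == current.length) {
            full.add(current);                            // keep the filled chunk as it is
            current = new byte[current.length * 2];       // geometric growth of chunk size
            used = 0;
        }
        current[used++] = (byte) b;
        size++;
    }

    int size() {
        return size;
    }

    /** Number of chunks that hold data. */
    int chunkCount() {
        return full.size() + (used > 0 ? 1 : 0);
    }

    public static void main(String[] args) {
        ChunkedOutputStream out = new ChunkedOutputStream();
        for (int i = 0; i < 1000; i++) {
            out.write(i);
        }
        // Chunks of 32, 64, 128, 256 and 512 bytes hold 992 bytes; the last 8 go into a sixth chunk.
        System.out.println(out.size() + " " + out.chunkCount()); // prints 1000 6
    }
}
```

Doubling the chunk size means the number of allocations grows with the logarithm of the data written, which is the behaviour the text attributes to the BufferOutputStream.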

7.6. Addressing

Each layer in the stack has a unique address that identifies that layer within the network. This address is used by other protocol layers to communicate with the identified layer: if the identified layer is a datagram protocol its address is used as the destination of messages transmitted to it; if it is a connection-oriented protocol its address is used to establish connections with it. Addresses are created or acquired in three ways:

1. Protocol layers create address objects to identify themselves.

2. Protocol layers construct address objects from data received in message headers to identify the sender of the message.

3. Address objects are internalised from some external form, such as a textual string or octet sequence, that is used to communicate the address via some “out-of-band” medium, such as in an e-mail address, a web page or in the payload of a network message.

Due to the composite, dynamic nature of the protocol stacks, it is impossible to know a priori the structure of an address or to use a fixed-size data structure to hold address information, as does the Berkeley Sockets API [Steve97]. Therefore addresses are accessed only through the abstract Address interface, shown in figure 87, through which addresses can be compared, used in a hash table, and serialised to an external form that can be translated back into an address at a future time. The internal implementation and serialised representation of an address object is known only to the implementation of the protocol layers that created them.

    public interface Address extends java.io.Serializable {
        // Externalise the address as binary data
        void write( java.io.DataOutput output )
            throws java.io.IOException;

        // Externalise the address as a textual string
        String toString();

        boolean equals( Object o );
        int hashCode();
    }

Figure 87. The Address interface

An atomic address object, one that does not refer to other address objects, identifies the root of a tree of protocols. Multiplexors create composite address objects to identify the sessions that they create, each of which is composed of the protocol discriminator for the session and a reference to the address of the multiplexor itself, which is typically the address of the layer or session upon which it is itself stacked.

[Diagram labels: Filter, Session, Session, Mux, Address, Address]

Figure 88. Immutable address objects can be shared

All address objects are immutable. This property allows address objects to be shared and composed; because individual address objects cannot be modified it is impossible to invalidate a composite address object by modifying one of its components. Immutability also removes the need to copy address data: a layer’s address object can be referred to directly by the composite addresses of layers above it, other protocol layers and threads performing computation on the address without danger of interference. Furthermore, because addresses are immutable, they can be shared between threads with no synchronisation overhead.

In addition to creating address objects to identify themselves, protocol layers create address objects to identify the source of messages that they pass up the stack. These addresses are initialised from header fields of the messages received. When a message is transmitted via a stack, each multiplexor session in the stack adds a header to the front of the message that contains, among other information, the identifier of the session. When the message is received, each multiplexor removes the header from the message and uses the session identifier in the field to construct an address object before routing the message up to the session. If the multiplexor is the root of the tree of layers, it creates an atomic address. Multiplexors higher in the stack will receive the message from a layer below and so will create a composite address containing the session identifier and a reference to the address passed to it by the layer below.

[Diagram labels: three Address objects, each marked “create”, above a message made up of Headers and a Message Body]

Figure 89. Address objects are constructed from header fields of received messages

Before communication can take place, the address of the receiver must be passed to the sender by some out-of-band mechanism. This requires the externalisation and internalisation of address objects. Although it is possible to use Java serialisation for this, Java serialisation would result in unnecessary space overhead and would inhibit interoperation with code written in other languages. Therefore address objects and the protocols that use them provide support for the serialisation of address objects to binary and textual format and the reconstruction of address objects from such serialised data.

The Address interface provides support for externalisation of the address as binary data or a textual string. The write method writes the address as an octet sequence to a DataOutput stream. The octet sequence of a serialised address is made up of the protocol discriminators that define the address with discriminators of lower layers preceding those of higher layers; the exact format of individual discriminators is defined by each layer. The toString method converts the address to a string. Address strings are made up of protocol discriminators separated by colon characters (‘:’), with discriminators of lower layers preceding those of higher layers; again the format of discriminators is defined by each layer.
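The textual form can be illustrated with a small immutable composite address. SketchAddress is a hypothetical class, not the framework's Address implementation; it reduces every discriminator to a string and omits the binary write method.

```java
import java.util.Objects;

/** Simplified immutable address: a discriminator plus an optional address of the layer below. */
final class SketchAddress {
    private final String discriminator;
    private final SketchAddress lower;   // null for an atomic address

    SketchAddress(String discriminator, SketchAddress lower) {
        this.discriminator = Objects.requireNonNull(discriminator);
        this.lower = lower;
    }

    /** Discriminators of lower layers come first, separated by colons. */
    @Override
    public String toString() {
        return lower == null ? discriminator : lower + ":" + discriminator;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchAddress)) {
            return false;
        }
        SketchAddress other = (SketchAddress) o;
        return discriminator.equals(other.discriminator) && Objects.equals(lower, other.lower);
    }

    @Override
    public int hashCode() {
        return Objects.hash(discriminator, lower);
    }

    public static void main(String[] args) {
        SketchAddress host = new SketchAddress("10.0.0.1", null);  // atomic address
        SketchAddress mux = new SketchAddress("5000", host);
        SketchAddress sessionA = new SketchAddress("7", mux);      // both sessions share the
        SketchAddress sessionB = new SketchAddress("8", mux);      // same mux address object
        System.out.println(sessionA); // prints 10.0.0.1:5000:7
        System.out.println(sessionB); // prints 10.0.0.1:5000:8
    }
}
```

Because every field is final and the referenced lower address is itself immutable, sharing the mux address between the two session addresses is safe without copying.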

Internalisation of address objects is supported by the ProtocolLayer and ProtocolFactory interfaces. The ProtocolLayer interface provides the readAddress method to internalise an address in its own format from a DataInput stream and the parseAddress method to parse an address from a PushbackReader - a character stream that supports unreading of characters. A layer with a composite address format internalises an address object by calling the readAddress or parseAddress method of the layer beneath it before reading its own protocol discriminator and constructing a composite address object. Layers that do not define an address format themselves, such as filter protocols, delegate calls to readAddress and parseAddress to the layers beneath them without performing any extra processing.

Constructing a protocol stack has a significant overhead because layers in the stack will have to allocate system resources, such as threads and operating-system handles. This overhead is wasted if the stack is being constructed solely to internalise an address, and will be discarded afterwards. Therefore, the ProtocolFactory interface provides the methods readAddress and parseAddress to internalise addresses in the format of the protocol that is created by the factory without the need to instantiate a stack. Both methods perform the same task as the method in the ProtocolLayer interface with the same name. However, because a protocol factory does not know how the layer it creates is going to be stacked, it cannot delegate to another factory to internalise parts of a composite address. Therefore the methods of the ProtocolFactory interface also take an additional Address parameter that holds the address internalised by the layers that would be beneath the factory’s layer. A factory that creates a composite address reads the protocol discriminator for its protocol and creates a new composite address made up of the discriminator and the address parameter passed to the method.
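A factory-style parse method that receives the already-internalised lower address as a parameter might look like the sketch below. Everything here is hypothetical: ParsedAddress and NumericFactory are illustrative classes, the discriminator is assumed to be a decimal number, and IOException stands in for the framework's AddressFormatException.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Minimal immutable address used only by this sketch. */
final class ParsedAddress {
    final ParsedAddress lower;   // null for an atomic address
    final int discriminator;

    ParsedAddress(ParsedAddress lower, int discriminator) {
        this.lower = lower;
        this.discriminator = discriminator;
    }

    @Override
    public String toString() {
        return lower == null ? String.valueOf(discriminator) : lower + ":" + discriminator;
    }
}

/** Hypothetical factory for a protocol whose discriminator is a decimal number. */
class NumericFactory {
    /** Reads this protocol's discriminator and combines it with the address parsed so far. */
    ParsedAddress parseAddress(PushbackReader reader, ParsedAddress lower) throws IOException {
        if (lower != null && reader.read() != ':') {
            throw new IOException("expected ':' before discriminator");
        }
        int value = 0;
        int digits = 0;
        int c;
        while ((c = reader.read()) >= '0' && c <= '9') {
            value = value * 10 + (c - '0');
            digits++;
        }
        if (c != -1) {
            reader.unread(c);   // leave the next separator for whoever parses the layer above
        }
        if (digits == 0) {
            throw new IOException("expected a decimal discriminator");
        }
        return new ParsedAddress(lower, value);
    }
}

class AddressParsingDemo {
    /** Parses a two-level address by calling the factory once per level, lowest level first. */
    static String parseTwoLevel(String text) {
        try {
            PushbackReader reader = new PushbackReader(new StringReader(text));
            NumericFactory factory = new NumericFactory();
            ParsedAddress lower = factory.parseAddress(reader, null);
            ParsedAddress upper = factory.parseAddress(reader, lower);
            return upper.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseTwoLevel("80:7")); // prints 80:7
    }
}
```

No layer object is created at any point: the caller threads the partially built address through each factory in turn, which is the reason the extra Address parameter is needed.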

7.7. Connection-Oriented Protocols

On their own, the ProtocolService and ProtocolUpcall interfaces provide an adequate interface for datagram-oriented protocols. That is, data can be transmitted via the stack without prior communication between sender and receiver. However, connection-oriented protocols have to establish a connection before data can be transmitted. Connection-oriented protocols are usually asymmetric: a server-side stack passively waits for connection requests while client-side stacks send connection requests to establish the connection. The application’s interface to these roles is provided through control and event interfaces.

Control of a connection, including initiating active connection establishment, is provided by the ConnectionControl interface, shown in figure 90. The connect method is used to establish a connection with a server and takes as parameters the address of the server and an optional buffer of data to be sent along with the connection request. Connection establishment is asynchronous; after calling connect, the higher layer must wait to be informed that the connection has been successfully established before sending data.

    public interface ConnectionControl extends ProtocolControl {
        boolean isConnected();

        // Start asynchronous connection establishment
        void connect( Address to, TransmitBuffer data )
            throws TransportException;

        // Start an asynchronous active close
        void close( TransmitBuffer data )
            throws TransportException;

        void addConnectionListener( ConnectionListener listener );
        void removeConnectionListener( ConnectionListener listener );
    }

    public interface ConnectionListener extends ProtocolEventListener {
        void connectionOpen( ReceiveBuffer data );
        void connectionClosed( ReceiveBuffer data );
        void connectionFailed();
    }

Figure 90. Control and event interfaces used to manage active connection setup and tear-down

The state of a connection can be monitored by registering a ConnectionListener interface via the addConnectionListener method of the ConnectionControl interface. The connection protocol will inform listeners when the state of the connection changes by calling the methods of the listener interface. The connectionOpen method is used to signal that connection establishment has been successful and that data can now be transmitted. The protocol can pass an optional ReceiveBuffer parameter that contains additional data passed between endpoints at connection-setup time. If the connection was refused by the server, the connectionClosed method of the listener interface is called, with an optional buffer of data. If the server could not be contacted at all, the connectionFailed method of the listener interface is called.

The connection can be actively closed by calling the close method, which also takes an optional TransmitBuffer of data to be sent along with the close request. Like connection establishment, closing a connection happens asynchronously. A successful active close or a passive close, where the connection is closed by the peer, is reported through the connectionClosed method of the ConnectionListener interface.
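Since both establishment and closure are reported through callbacks, a caller that wants to block until the outcome is known needs some adapter. One possible sketch, using java.util.concurrent.CompletableFuture, is shown below. The SketchConnectionListener interface is a simplified stand-in for ConnectionListener with buffers reduced to byte arrays, and the connect method of ConnectDemo is a hypothetical transport that merely reports an outcome from another thread.

```java
import java.util.concurrent.CompletableFuture;

/** Simplified stand-in for the connection listener; buffers are reduced to byte arrays. */
interface SketchConnectionListener {
    void connectionOpen(byte[] data);
    void connectionClosed(byte[] data);
    void connectionFailed();
}

/** Adapts the three callbacks onto a future so that a caller can wait for the first outcome. */
class ConnectionOutcome implements SketchConnectionListener {
    enum Result { OPEN, REFUSED, FAILED }

    final CompletableFuture<Result> future = new CompletableFuture<>();

    public void connectionOpen(byte[] data) {
        future.complete(Result.OPEN);
    }

    /** Before the connection opens, a close notification means the server refused it. */
    public void connectionClosed(byte[] data) {
        future.complete(Result.REFUSED);   // no effect if the future is already complete
    }

    public void connectionFailed() {
        future.complete(Result.FAILED);
    }
}

class ConnectDemo {
    /** Hypothetical transport: reports the outcome asynchronously, from another thread. */
    static void connect(boolean serverAccepts, SketchConnectionListener listener) {
        Thread worker = new Thread(() -> {
            if (serverAccepts) {
                listener.connectionOpen(new byte[0]);
            } else {
                listener.connectionClosed(new byte[0]);
            }
        });
        worker.start();
    }

    /** Starts a connection attempt and blocks until a callback has been received. */
    static ConnectionOutcome.Result connectAndWait(boolean serverAccepts) {
        ConnectionOutcome outcome = new ConnectionOutcome();
        connect(serverAccepts, outcome);
        return outcome.future.join();
    }

    public static void main(String[] args) {
        System.out.println(connectAndWait(true));  // prints OPEN
        System.out.println(connectAndWait(false)); // prints REFUSED
    }
}
```

The future records only the first notification; a layer that also needs to observe a later close would keep its own listener registered for that purpose.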

Passive connection establishment is performed using the interfaces ConnectionRequestControl, ConnectionRequestListener and ConnectionRequestEvent, shown in figure 91. A protocol that performs passive connection establishment waits for connection requests from a client and announces the requests to a higher layer. Higher layers must register a ConnectionRequestListener interface via the ConnectionRequestControl interface of the server-side stack. Connection requests are passed to the connectionRequest method of the ConnectionRequestListener interface as objects that implement the ConnectionRequestEvent interface. This interface provides methods through which the request can be accepted or refused, the address of the peer can be queried and an optional buffer of connection-setup data can be retrieved.

138

7. Transport Protocol Framework 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

public interface ConnectionRequestControl extends ProtocolControl { void addConnectionRequestListener( ConnectionRequestListener l ) throws java.util.TooManyListenersException; void removeConnectionRequestListener( ConnectionRequestListener l ); } public interface ConnectionRequestListener extends ProtocolEventListener { void connectionRequest( ConnectionRequestEvent rq ); } public interface ConnectionRequestEvent { ProtocolLayer accept( TransmitBuffer reply ) throws TransportException; ReceiveBuffer getData(); Address getPeerAddress(); void refuse( TransmitBuffer data ) throws TransportException; }

Figure 91. Control and event interfaces used to manage connection acceptance

If the connection request is accepted, a stack is created for the new connection and the ProtocolLayer interface of the layer at the top of the stack is returned to the acceptor. This stack is already in the connected state and can be used for transmission immediately. The accepted stack provides the ConnectionControl control interface so that the state of the accepted connection can be monitored and controlled.
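The following is a minimal sketch of how a higher layer might use these interfaces to accept or refuse requests. It does not use the Regent classes: the interfaces are simplified stand-ins for those of figure 91 (buffers and TransportException are omitted and addresses are plain strings), and AllowListAcceptor and FakeRequest are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Simplified stand-ins for the interfaces of figure 91 (assumption:
// buffers and exceptions omitted, addresses represented as strings).
interface ConnectionRequestEvent {
    Object accept();            // returns the top layer of the new stack
    String getPeerAddress();
    void refuse();
}

interface ConnectionRequestListener {
    void connectionRequest(ConnectionRequestEvent rq);
}

// Hypothetical higher layer that accepts requests from known peers only.
class AllowListAcceptor implements ConnectionRequestListener {
    private final Set<String> allowedPeers;
    final List<Object> acceptedStacks = new ArrayList<>();

    AllowListAcceptor(Set<String> allowedPeers) {
        this.allowedPeers = allowedPeers;
    }

    @Override
    public void connectionRequest(ConnectionRequestEvent rq) {
        if (allowedPeers.contains(rq.getPeerAddress())) {
            // The stack returned by accept is already connected.
            acceptedStacks.add(rq.accept());
        } else {
            rq.refuse();
        }
    }
}

// Test double standing in for a request announced by a server-side stack.
class FakeRequest implements ConnectionRequestEvent {
    private final String peer;
    boolean refused;

    FakeRequest(String peer) { this.peer = peer; }

    @Override public Object accept() { return "stack-for-" + peer; }
    @Override public String getPeerAddress() { return peer; }
    @Override public void refuse() { refused = true; }
}
```

In the real framework the listener would be registered through addConnectionRequestListener and the accepted ProtocolLayer would be used for transmission immediately.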

7.8. Transport Filters

A common form of protocol is that which acts as a filter, monitoring or transforming data that flows through it. Filter protocols are so common that it is worth providing special support for their implementation. The TransportFilter class extends ProtocolLayerStub with a single service and upcall and provides default implementations of those interfaces that pass data through the filter unchanged: derived classes need only redefine the methods transmit and receive to perform filter processing.

Special handling is required when filters are used with connection-oriented protocols. At the client side of the connection, filters can be used as expected. However, filters specified to be part of a server side stack get layered above the stack that is receiving connection requests, not above the stacks of accepted connections themselves. To handle this situation the TransportFilter class acts as a Prototype [GHJV94] when part of server side stacks by cloning itself when a new connection is accepted and pushing the clone onto the top of the stack of the new connection.

The attach method of the TransportFilter’s upcall interface queries the service to which it is attached for a ConnectionRequestControl interface. If such an interface is returned, the filter knows that it is part of a server-side stack, registers itself as the ConnectionRequestListener to receive connection requests from the layer below and provides a ConnectionRequestControl interface for use by a layer above.

Connection requests are handled by wrapping the received request in a FilterConnectionRequest object and passing that up to a ConnectionRequestListener registered with the filter through its own ConnectionRequestControl interface. When the accept method of a FilterConnectionRequest object is invoked, the FilterConnectionRequest accepts its wrapped connection request, clones the filter that created the FilterConnectionRequest, pushes the clone filter onto the stack accepted from the wrapped request and finally returns the top of the resulting stack to the caller.

FilterConnectionRequest objects call the newLayer method of their filters to create clones. This method is declared abstract in the TransportFilter class and must be defined by derived classes. Cloning a filter should copy the state of the filter that can be queried or modified through the control interfaces of the filter, except for connections to listener interfaces. This allows an application to elaborate a server-side stack containing transport filters and configure those filters to meet the application’s QoS requirements. The stacks of accepted connections will then contain filters with the required parameter settings.
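The cloning behaviour can be illustrated with a small sketch. The classes here are hypothetical stand-ins rather than the Regent classes: SketchFilter plays the role of TransportFilter, SketchRateFilter is an invented filter with a single configurable parameter, and SketchFilterRequest mimics only the clone-and-push step of FilterConnectionRequest.accept, with the accepted stack modelled as a Deque.

```java
import java.util.Deque;

// Stand-in for TransportFilter (assumption: data handling is omitted).
abstract class SketchFilter {
    // Prototype hook: create a filter carrying the same configurable state.
    protected abstract SketchFilter newLayer();
}

// Hypothetical filter whose control-interface state is a single rate limit.
class SketchRateFilter extends SketchFilter {
    private int packetsPerSecond;
    private Object listener;   // connections to listeners are not cloned

    void setPacketsPerSecond(int n) { packetsPerSecond = n; }
    int getPacketsPerSecond() { return packetsPerSecond; }
    void setListener(Object l) { listener = l; }
    Object getListener() { return listener; }

    @Override
    protected SketchFilter newLayer() {
        SketchRateFilter clone = new SketchRateFilter();
        clone.packetsPerSecond = packetsPerSecond;  // copy parameter settings
        return clone;                               // listener deliberately left unset
    }
}

// Mimics the clone-and-push step performed when a request is accepted.
class SketchFilterRequest {
    private final SketchFilter prototype;
    private final Deque<Object> acceptedStack;  // stack from the wrapped request

    SketchFilterRequest(SketchFilter prototype, Deque<Object> acceptedStack) {
        this.prototype = prototype;
        this.acceptedStack = acceptedStack;
    }

    Object accept() {
        acceptedStack.push(prototype.newLayer());  // clone becomes the new top
        return acceptedStack.peek();
    }
}
```

Configuring the prototype once on the server-side stack is then enough for every accepted connection to receive a filter with the same settings.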

[Figure 92: class diagram relating TransportFilter (#newLayer() : ProtocolLayer) and ProtocolLayerStub to the interfaces ProtocolLayer, ProtocolUpcall, ProtocolService, ConnectionRequestControl and ConnectionRequestListener.]

Figure 92. Static Structure of TransportFilter Classes

A number of useful transport filters have been implemented as part of the Regent package, including those shown in table 13.

Name          Function
frag          A transport filter that fragments and reassembles large messages into smaller fragments. It expects to be layered above a reliable, in-order, connection-oriented protocol.
rel           A transport filter that provides reliable connection establishment and sequenced, in-order delivery of messages. It expects to be layered above an unreliable connection-oriented protocol, such as “cod”.
compression   A transport filter that compresses transmitted messages using the Deflate algorithm [RFC1951] to reduce bandwidth requirements.
rate          A transport filter that limits the rate of transmission using a leaky-bucket algorithm. Packets that have been transmitted above the policed rate are discarded.
keyexch       A transport filter that secures a reliable, in-order connection using the Needham-Schroeder private-key exchange protocol [Schneier96].
threadpool    A transport filter that uses a pool of threads to deliver messages. A message delivered to the filter is placed on a queue. One or more threads within the filter remove messages from the head of the queue and deliver them up the stack. This can be used to increase the level of concurrency within the stack to improve performance when many endpoints share a multiplexor.
local         A datagram protocol that routes messages within the same address space. This is only really useful for debugging other protocol layers.
log           A transport filter that logs the size of all messages sent and received. This is only really useful for debugging other protocol layers.
error         A transport filter that throws away messages with some configurable probability. This is only really useful for debugging other protocol layers.

Table 13. Example transport filters
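As an illustration of the policing performed by a filter such as “rate”, the following sketch shows one common formulation of a leaky bucket used as a meter. It is not the Regent implementation: the class name, the unit of one bucket entry per packet and the explicit clock parameter are assumptions made to keep the example deterministic.

```java
// Leaky bucket used as a policer: the bucket drains at the policed rate and
// each conforming packet adds one unit. A packet that would overflow the
// bucket is non-conforming and is discarded by the caller.
class LeakyBucketPolicer {
    private final double capacity;      // largest tolerated burst, in packets
    private final double leakPerMilli;  // policed rate, in packets per millisecond
    private double level;               // current contents of the bucket
    private long lastMillis;            // time of the previous offer

    LeakyBucketPolicer(double capacity, double packetsPerSecond, long startMillis) {
        this.capacity = capacity;
        this.leakPerMilli = packetsPerSecond / 1000.0;
        this.lastMillis = startMillis;
    }

    // Returns true if a packet offered at nowMillis may be transmitted,
    // false if it exceeds the policed rate.
    synchronized boolean offer(long nowMillis) {
        double leaked = (nowMillis - lastMillis) * leakPerMilli;
        level = Math.max(0.0, level - leaked);
        lastMillis = nowMillis;
        if (level + 1.0 > capacity) {
            return false;
        }
        level += 1.0;
        return true;
    }
}
```

A filter built this way would call offer from its transmit method and silently drop the message when the result is false.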

7.9. Construction of Protocol Layers

Although it is possible to directly instantiate the Java classes that implement protocol layers it is useful to decouple the implementation of a protocol from the name of the protocol that is used by a program. This allows different implementations to be used on different platforms or by trusted and untrusted code. This decoupling is achieved through the use of dynamic linking and the use of factories to instantiate protocol layers.

public abstract class ProtocolFactory {
    // Operations implemented by the factory of each protocol.
    public abstract ProtocolLayer createLayer();
    public abstract Class layerClass();
    public abstract Address readAddress( DataInput in, Address addr )
        throws AddressFormatException, IOException;
    public abstract Address parseAddress( PushbackReader in, Address addr )
        throws AddressFormatException, IOException;

    // Static table mapping protocol names to factories.
    public static void register( String name, ProtocolFactory factory );
    public static void remove( String name );
    public static ProtocolFactory get( String protocol_name )
        throws LoadProtocolException;

    // Access to the protocol search path.
    public static String getProtocolPackage( String protocol_name )
        throws LoadProtocolException;
    public static Enumeration protocolPackages();
}

Figure 93. The abstract ProtocolFactory class

Protocol layers are created by factory objects derived from the abstract ProtocolFactory class, as shown in figure 93. Each type of protocol layer known to the system is given a short textual name, such as “tcp” or “udp”. The ProtocolFactory class maintains a static table mapping names to factory objects. The factory for a given name can be acquired via the ProtocolFactory.get function. An application can explicitly register factories by name by calling the ProtocolFactory.register function, and remove mappings by calling the ProtocolFactory.remove function. However, explicitly registering protocol implementations in this way is little better than directly instantiating protocol classes: the protocol namespace is fixed and the program must be modified to make use of different protocols. Therefore, the ProtocolFactory.get function uses dynamic linking to locate the factory class for a named protocol and load it into the JVM.

The ProtocolFactory.get function maps the name of the protocol to a package name, loads the class named “Factory” within that package and creates an instance of the loaded class that is returned to the caller. To reduce the time taken to obtain a factory for a protocol, the ProtocolFactory class maintains a static table of factories indexed by protocol name. Dynamic loading is only used if a factory for the required protocol is not already in the table, and any factories loaded dynamically are added to the table before being returned to the caller.

The protocol name is mapped to a package name by searching a list of “root” packages for a sub-package with the same name as the protocol that contains a class named Factory. The list of root packages is obtained by querying the value of the transport.protocol.path property from the database of properties used to configure the Regent system. If an implementation for a protocol is not found in any of the packages in the protocol path, the package regent.transport.protocols is searched. If that fails, the error is reported to the caller by throwing an exception.
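The lookup described above can be sketched with the standard Java reflection API. This is an illustrative reconstruction, not the ProtocolFactory source: the class name FactoryLocator, the use of a colon to separate root packages, reading the path from a Java system property and the unchecked IllegalArgumentException (in place of LoadProtocolException) are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class FactoryLocator {
    static final String DEFAULT_ROOT = "regent.transport.protocols";

    // Table of factories already loaded, indexed by protocol name.
    private static final Map<String, Object> CACHE = new ConcurrentHashMap<>();

    // Class names to try, in search order: each root package on the path,
    // then the default package.
    static List<String> candidateClassNames(String protocol, String path) {
        List<String> names = new ArrayList<>();
        for (String root : path.split(":")) {   // separator is an assumption
            if (!root.isEmpty()) {
                names.add(root + "." + protocol + ".Factory");
            }
        }
        names.add(DEFAULT_ROOT + "." + protocol + ".Factory");
        return names;
    }

    // Returns a cached factory, or loads and instantiates one dynamically.
    static Object get(String protocol) {
        Object cached = CACHE.get(protocol);
        if (cached != null) {
            return cached;
        }
        String path = System.getProperty("transport.protocol.path", "");
        for (String className : candidateClassNames(protocol, path)) {
            try {
                Object factory = Class.forName(className)
                        .getDeclaredConstructor().newInstance();
                CACHE.put(protocol, factory);   // remember for later lookups
                return factory;
            } catch (ReflectiveOperationException e) {
                // Not found or not instantiable in this root; try the next.
            }
        }
        throw new IllegalArgumentException("no factory found for protocol " + protocol);
    }
}
```

Because the search relies on Class.forName, a factory becomes available as soon as its package is visible to the class loader, without any change to the calling program.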

Access to these mechanisms is provided to other parts of the system through two static functions. The getProtocolPackage function returns the Java package that contains the implementation of a named protocol. The protocolPackages function returns an enumeration over the root packages in which protocol implementations may be found.

Java system properties can be specified on the command-line and modified at run-time, thereby allowing an application’s protocol search path to be specified when it is started or changed as it runs. Because protocol factory classes are loaded into the JVM using the normal Java mechanisms, protocols can be loaded both from a local filestore and from network servers, and the available protocols can be changed during the execution of an application by installing packages to or removing packages from the locations searched by the class loader.

7.9.1. Automatic Stack Elaboration

In a large distributed system, it is impractical for applications to statically define the protocol configurations that they will use. All systems are adapted over their lifetimes to fix programming errors, meet changing requirements, or take advantage of changes in their environment. In a large client/server system, where many clients have been deployed, changing a server component to use new protocols would invalidate the clients that were configured to use the old protocol stack. Therefore, servers must be able to describe the stack over which they are making their services available and clients must be able to dynamically build a compatible stack from such a description. The Regent transport framework supports dynamicity with a standard naming scheme for protocol stacks, naming conventions to be followed by layers that are to be stacked automatically and stacker functions that can load and compose protocols to match a named stack.

Protocol stacks are named using the path notation that is commonly used to name Internet protocols. A stack description is a list of layer names separated by slash characters. Layers are listed from the top-most to the bottom-most in the stack. For example, “tcp/ip” describes a stack that consists of the “tcp” layer stacked above the “ip” layer.

The Stacker class is responsible for elaborating a stack description. That is, it parses a stack description, locates or instantiates each layer and composes layers into a stack by connecting the services and upcalls of adjacent layers.

The stacker uses naming conventions to determine which upcalls and services should be attached when pushing a layer onto the stack. The Stacker iterates over the upcalls of the layer being pushed: each upcall has a name that is mapped by the naming convention to possible service names that it can be attached to. The Stacker tries each potential service name in turn to find a service provided by the top of the stack. It then attaches the upcall and service. If an upcall is found that is not supported by the naming convention, or no services are available that can be attached to the upcall, an exception is thrown to report the error.
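The matching of upcalls to services can be sketched as follows. The representation of a naming convention as a map from upcall names to ordered lists of service names is an assumption, as are the class name AttachPlanner, the upcall name used in the usage note and the use of IllegalStateException to report errors.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

class AttachPlanner {
    // Chooses the service on the top of the stack to which an upcall of the
    // layer being pushed should be attached.
    static String chooseService(String upcall,
                                Map<String, List<String>> convention,
                                Set<String> servicesOnTop) {
        List<String> candidates = convention.get(upcall);
        if (candidates == null) {
            // The naming convention does not cover this upcall.
            throw new IllegalStateException("upcall not covered by convention: " + upcall);
        }
        for (String service : candidates) {   // try each name in turn
            if (servicesOnTop.contains(service)) {
                return service;
            }
        }
        throw new IllegalStateException("no service available for upcall: " + upcall);
    }
}
```

For instance, a hypothetical “client” convention could map an upcall named “data” to the service names “connect” and then “data”, so that a filter is attached to the “connect” service when the layer below offers one.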

Different types of stacks use different naming conventions to allow filter components to be stacked above the appropriate multiplexor sessions. For example, a client-side, connection-oriented stack would need to stack filters over the sessions created for the “connect” service while a server-side stack would need to stack filters over sessions created for the “listen” service. Naming conventions are themselves given names and are stored in the Regent properties database. When a stack description is elaborated, the naming convention to be used for attaching layers is passed to the Stacker which loads the convention from the properties database and caches it for future use. Four standard naming conventions are defined: “client” for building client-side connection-oriented stacks, “server” for building server-side connection-oriented stacks, “connectionless” for building datagram-oriented stacks and “accepted” for building stacks of connections accepted by a listening server.

The stacker uses the following algorithm to elaborate stacks. The algorithm augments the existing protocol graph as stacks are instantiated, using existing multiplexors where possible and keeping track of multiplexors that are created so that they can be reused in future stacks.

1. Find the highest multiplexor already instantiated for this stack.

The Stacker maintains a table of multiplexors indexed by stack name. This allows the stacker to build requested stacks by pushing layers onto sessions of an existing multiplexor, rather than instantiating a new instance of the multiplexor class. When instantiating a stack description, the stacker iteratively looks up a multiplexor for the stack description, removing the first name of the description and trying again if no multiplexor is found until either a multiplexor is found or there are no names left in the stack description.

For example, when instantiating the stack “compression/bounds/tcp”, the Stacker tries to find a multiplexor named “compression/bounds/tcp”, then “bounds/tcp”, and finally “tcp”.

2. If no multiplexor is found, load the base of the stack.

The base of the stack is loaded by trying to load the protocol with the last name of the stack description. If this cannot be loaded, the stacker tries to load a protocol that implements a compound protocol with the last two names, and iterates until a protocol can be loaded or there are no more names in the stack, in which case an exception is thrown. If the base of the stack is a multiplexor, it is added to the multiplexor table.

For example, when instantiating the stack “compression/bounds/tcp”, the Stacker tries to load “tcp”, then “bounds/tcp”, and finally “compression/bounds/tcp”.

3. For each remaining name, instantiate the named layer and push it onto the stack.

Each remaining layer to be added to the stack is instantiated and attached to the top of the stack using a naming convention passed to the Stacker as a parameter. If the layer is a multiplexor, it is added to the multiplexor table.
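The order in which names are tried in each of the three steps can be made concrete with a short sketch. It computes only the sequences of names; loading protocols and maintaining the multiplexor table are left out, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class StackNames {
    // Splits a stack description into layer names, top-most first.
    static List<String> layers(String description) {
        return Arrays.asList(description.split("/"));
    }

    // Step 1: names tried when looking for an existing multiplexor,
    // dropping the first (top-most) name after each failed lookup.
    static List<String> multiplexorLookupOrder(String description) {
        List<String> names = layers(description);
        List<String> order = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            order.add(String.join("/", names.subList(i, names.size())));
        }
        return order;
    }

    // Step 2: names tried when loading the base, starting with the last
    // name and adding one more name after each failed load.
    static List<String> baseLoadOrder(String description) {
        List<String> names = layers(description);
        List<String> order = new ArrayList<>();
        for (int i = names.size() - 1; i >= 0; i--) {
            order.add(String.join("/", names.subList(i, names.size())));
        }
        return order;
    }

    // Step 3: layers still to be pushed above the base, in push order
    // (bottom-most first). Assumes the description ends with the base.
    static List<String> remainingLayers(String description, String base) {
        List<String> names = layers(description);
        int baseSize = layers(base).size();
        List<String> rest = new ArrayList<>(names.subList(0, names.size() - baseSize));
        Collections.reverse(rest);
        return rest;
    }
}
```

For the description “compression/bounds/tcp” with “tcp” found as the base, remainingLayers yields “bounds” followed by “compression”, which is the order in which the layers would be pushed.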

7.10. Protocol Registry

Protocol layers all support the same upwards and downwards interfaces. This allows great flexibility in constructing stacks: any layer can be stacked above any other. However, this has the downside that an invalid stack configuration cannot be caught at compile time. Invalid configurations can be caught when the stack is constructed if a higher layer cannot obtain a control interface that it needs from a layer lower in the stack. Otherwise the stack might fail silently or cause obscure errors during communication.

Designing a valid stack requires knowledge of the behaviour of the components in the stack and the resultant behaviour of their combination. As a rule of thumb, protocol components are designed to each perform a single, medium-grained element of protocol functionality, such as handling connection setup, providing reliability over an unreliable connection protocol, routing datagrams over underlying connections or compressing and decompressing messages. This design guideline makes it easier to predict the behaviour of combinations of protocol components. However, not all programmers find it easy to understand which protocol components they should select to achieve the properties they require for a binding.

To alleviate these problems, the transport framework provides a protocol registry that maps human-readable names to stack configurations. This allows experienced engineers to design stacks that provide specific properties and assign them meaningful names that can be used by less experienced programmers when integrating components into a system.

The mappings in the protocol registry are accessed through static methods of the class regent.transport.ProtocolRegistry. The name/stack mappings of the registry are stored in a file that is made available over the web. The URL of this file is specified in the Regent properties database that is used to configure the Regent installation. When a program first maps a name to a stack description, the name/description pairs in the protocol registry are loaded from the web and stored in a local cache. Scalability of the registry is achieved by using existing web cache proxies to replicate the protocol registry at various points in the enterprise network. By encapsulating the registry lookup within a class, the implementation can be changed, to use a distributed relational database for example, without needing to modify, or even recompile, code that uses the registry.

Stack descriptions obtained from the protocol registry are elaborated using the algorithm described in section 7.9.1. References to endpoints that make use of a stack named in the protocol registry contain the description of the stack, not its convenient name, so that the reference remains valid if the stack assigned to that name is changed during the lifetime of the endpoint.
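The load-on-first-use behaviour of the registry can be sketched as follows. This is not the regent.transport.ProtocolRegistry class: RegistrySketch is a hypothetical name, the Supplier stands in for fetching the registry file from its URL, the name=description file format is an assumption (parsed here with java.util.Properties), and the stack names in the usage note are invented.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.function.Supplier;

class RegistrySketch {
    private final Supplier<String> fetch;  // stands in for reading the registry URL
    private Properties cache;              // name/description pairs, loaded lazily

    RegistrySketch(Supplier<String> fetch) {
        this.fetch = fetch;
    }

    // Maps a convenient name to a stack description, loading the whole
    // registry into the local cache on the first call.
    synchronized String lookup(String name) {
        if (cache == null) {
            Properties loaded = new Properties();
            try {
                loaded.load(new StringReader(fetch.get()));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            cache = loaded;
        }
        String description = cache.getProperty(name);
        if (description == null) {
            throw new IllegalArgumentException("unknown stack name: " + name);
        }
        return description;
    }
}
```

A registry file containing the line reliable=frag/rel/cod/udp would then let a program ask for “reliable” and pass the returned description to the stacker.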

7.11. Concurrency and Synchronisation

The protocol framework uses multiple threads of control. System threads carry data and timer notifications up the stack from the devices and application threads carry data down the stack from application components.

[Diagram with three labelled regions: Application, Protocol Stack, Devices]

Figure 94. Threads in the protocol stack

Access to protocol state in each layer must be serialised to avoid corrupted data, lost updates and inconsistent analysis. Since threads may concurrently traverse the graph of layers in opposite directions, simply locking each layer will cause deadlock and race conditions. Therefore, a more elaborate locking strategy is used by layers in the stack. Any layer that needs to protect its state from concurrent access must implement this locking strategy:

• Threads travelling down the stack must release the locks of higher layers before acquiring the locks of lower layers.
• Threads travelling up the stack do not release locks of lower layers when acquiring locks on higher layers.
• When the user of a layer releases the layer, the layer must release any lower layers that it owns before freeing resources and allowing the garbage collector to destroy it.
• Protocol layers must not block calling threads on monitor condition variables (they should not call the wait method of the Object class).
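The first two rules of the strategy can be sketched with Java monitors. LockedLayer is a hypothetical class, not part of the framework; the two Boolean fields exist only so that the locking behaviour can be observed, using the standard Thread.holdsLock method.

```java
class LockedLayer {
    LockedLayer above;   // next layer up the stack, or null
    LockedLayer below;   // next layer down the stack, or null
    private int sent;
    private int received;

    // Recorded for demonstration only: whether the calling layer's lock
    // was held when this layer was entered.
    Boolean callerLockHeldOnTransmit;
    Boolean callerLockHeldOnReceive;

    // Downward path: state is updated under this layer's lock, and the lock
    // is released before the lower layer is entered.
    void transmit(byte[] msg, LockedLayer from) {
        if (from != null) {
            callerLockHeldOnTransmit = Thread.holdsLock(from);
        }
        synchronized (this) {
            sent++;
        }
        if (below != null) {
            below.transmit(msg, this);
        }
    }

    // Upward path: this layer's lock stays held while the higher layer
    // is entered.
    void receive(byte[] msg, LockedLayer from) {
        if (from != null) {
            callerLockHeldOnReceive = Thread.holdsLock(from);
        }
        synchronized (this) {
            received++;
            if (above != null) {
                above.receive(msg, this);
            }
        }
    }

    synchronized int sent() { return sent; }
    synchronized int received() { return received; }
}
```

In this sketch a thread travelling down never holds one layer's lock while waiting for another, and threads travelling up always acquire locks from the bottom of the stack towards the top, so no cycle of waiting threads can form between the two directions.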

7.12. Available Protocol Components

The following table describes some of the protocol components that have been implemented using the transport framework:

Name          Function
tcp           A “device” layer that provides an interface to the TCP/IP protocol implementation of the host operating system.
udp           A “device” layer that provides an interface to the UDP/IP protocol implementation of the host operating system. Two versions of the “udp” layer are provided, one that interfaces to the standard Java networking APIs in the java.net package and one that uses native methods to interface to the Winsock-2 API on Windows. The implementation is selected at runtime by the protocol loader, as described in section 7.9. The native implementation is provided because the java.net package provides very poor support for the upcall-driven style of protocol implementation and the memory management buffers used by the Regent transport framework. Specifically, the java.net.DatagramSocket class does not support scatter-gather I/O and does not allow a program to query for the size of the datagram at the head of the socket’s receive queue.
dmux          A multiplexor that allows multiple higher layers to use a single datagram-oriented protocol.
cod           A simple connection-oriented protocol that implements connections above a datagram-oriented protocol, such as UDP. Cod connections are unreliable; reliability can be layered above the “cod” protocol.
doc           A datagram-oriented protocol that uses an underlying connection-oriented protocol to route datagrams between address spaces, establishing new connections as necessary and closing connections when they fall idle to minimise its use of resources. By layering the “cod” layer above the “doc” layer, multiple lightweight connections can be multiplexed over individual heavyweight connections between each address space. This is useful when processes can only own a limited number of operating system resources, such as TCP/IP sockets.
cmux          A multiplexor that multiplexes multiple lightweight connections over individual connections of a lower connection-oriented protocol. This protocol is implemented as a composite component, composed of a “cod” layer above a “doc” layer.
frag          A transport filter that fragments and reassembles large messages into smaller fragments. It expects to be layered above a reliable, in-order, connection-oriented protocol.
rel           A transport filter that provides reliable connection establishment and sequenced, in-order delivery of messages. It expects to be layered above an unreliable connection-oriented protocol, such as “cod”.
compression   A transport filter that compresses transmitted messages using the Deflate algorithm [RFC1951] to reduce bandwidth requirements.
rate          A transport filter that limits the rate of transmission using a leaky-bucket algorithm. Packets that have been transmitted above the policed rate are discarded.
keyexch       A transport filter that secures a reliable, in-order connection using the Needham-Schroeder private-key exchange protocol [Schneier96].
threadpool    A transport filter that uses a pool of threads to deliver messages. A message delivered to the filter is placed on a queue. One or more threads within the filter remove messages from the head of the queue and deliver them up the stack. This can be used to increase the level of concurrency within the stack to improve performance when many endpoints share a multiplexor.
local         A datagram protocol that routes messages within the same address space. This is only really useful for debugging other protocol layers.
log           A transport filter that logs the size of all messages sent and received. This is only really useful for debugging other protocol layers.
error         A transport filter that throws away messages with some configurable probability. This is only really useful for debugging other protocol layers.

Table 14. Protocol components
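The transformation applied by a filter such as “compression” can be sketched with the java.util.zip classes of the Java standard library. This is an illustration rather than the Regent filter: DeflateSketch is a hypothetical class, and it relies on the default Deflater, which wraps the Deflate data in the zlib format.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class DeflateSketch {
    // Applied to a message on its way down the stack.
    static byte[] compress(byte[] message) {
        Deflater deflater = new Deflater();
        deflater.setInput(message);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[512];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Applied to a message on its way up the stack.
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[512];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    // No progress is possible: the input is incomplete.
                    throw new IllegalArgumentException("truncated message");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt message", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

A filter based on this sketch would call compress in its transmit method and decompress in its receive method, leaving the rest of the stack unaware of the transformation.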

7.13. Summary

This chapter has presented the transport framework used by the Regent middleware platform and shown how it meets the requirements set out in section 7.2. Platform independence is supported by the definition of abstract interfaces through which transport protocols are acquired and used. These interfaces are independent of the presentation and application layer protocols making use of the transport.

The transport framework supports a compositional approach to the construction of transport protocol software. Protocol components all implement the same abstract interfaces and so can be composed into arbitrary directed graphs through which data flows. Address objects hide the composite nature of their implementation behind an abstract interface supporting externalisation and internalisation of address information. Control interfaces allow configuration of protocol parameters and notification of significant events in a manner that hides the exact configuration of the stack being controlled, and, along with the definition of generic control interfaces that are used by many different protocols, supports the operation of generic stack management agents.

The use of factories and dynamic linking to load protocols into the JVM allows distributed system components to evolve separately yet remain able to communicate by loading the required protocols on demand. The use of standard naming schemes for protocols and textual addresses allows composite protocols to be named and stacks to be generated from names on demand. The protocol registry allows convenient names to be assigned to useful stack configurations, so that less experienced programmers do not need to deal with the full complexity of the transport framework.

The transport framework includes several classes that provide efficient memory management. Section 7.5 compares the performance of these classes with that of the memory management classes provided by the Java standard library. Further performance analysis of the transport framework and the Midas interaction model is described in chapter 8.

8. Performance Analysis

Never promise more than you can perform.
Publilius Syrus

8.1. Introduction

The previous chapters have shown how the use of Midas and underlying runtime frameworks provides a great deal of flexibility to the system architect. The architect can design a system composed of components that use different interaction styles and can specify the transport protocols over which those interactions are carried as compositions of lightweight components. The binding model allows monitoring and management functionality to be inserted into bindings as interaction filters and the formal models used in Midas specifications can be automatically translated into filters that test component and endpoint implementations.

However, up to now the issue of how this flexibility affects performance has not been addressed. This chapter describes a set of experiments that compare the performance of Midas with existing middleware platforms. These experiments demonstrate that performance is not adversely affected by the additional flexibility provided by the component interaction model and runtime system. Further experiments show how performance can be improved by the judicious selection of interaction style and transport protocol components.

8.2. Comparing Middleware Platforms

In this chapter we will compare the latency and throughput of the same application written in Java for three different middleware solutions:

• Midas: Synchronous message passing over a protocol providing similar functionality to TCP/IP.
• Java RMI: Remote object invocation over the TCP/IP protocol.
• Java & CORBA: Remote object invocation over the CORBA IIOP protocol.

The application used in the experiments consists of a single server that provides a file-like service: clients can open the service, send sequences of octets to it and finally close the service. Only a single client can open the service at a time: other clients wishing to open the service are blocked until the current client has closed the service. The experiment comprises multiple iterations, during which the client opens the server, transmits 10,000 blocks of data and then closes the server. The client sends larger blocks of data on each iteration, from 10 octets long to 5000 octets. The client times the duration from issuing the open request to receiving acknowledgement of the close request. Several iterations are performed before starting to record the timing measurements in order to reduce variations in timing caused by dynamic class loading, just-in-time compilation, memory allocation and garbage collection.
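The structure of such a measurement loop can be sketched as follows. This is not the test program used in the experiments: TransferTimer is a hypothetical class, the IntConsumer stands in for one complete iteration (open, transmit the blocks, close) at a given block size, and the number of warm-up iterations and the step between block sizes are parameters chosen by the caller.

```java
import java.util.function.IntConsumer;

class TransferTimer {
    // Block sizes from first up to at most last, in equal steps.
    static int[] blockSizes(int first, int last, int step) {
        int count = (last - first) / step + 1;
        int[] sizes = new int[count];
        for (int i = 0; i < count; i++) {
            sizes[i] = first + i * step;
        }
        return sizes;
    }

    // Runs untimed warm-up iterations, then times one iteration per block
    // size. Returns the elapsed time of each timed iteration in nanoseconds.
    static long[] run(int[] sizes, int warmUps, IntConsumer iteration) {
        for (int i = 0; i < warmUps; i++) {
            iteration.accept(sizes[0]);   // lets class loading and JIT settle
        }
        long[] nanos = new long[sizes.length];
        for (int i = 0; i < sizes.length; i++) {
            long start = System.nanoTime();
            iteration.accept(sizes[i]);
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }
}
```

Timing whole iterations rather than individual sends matches the measurement described above, which runs from the open request to the acknowledgement of the close request.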

The experiment was executed on two machines running Red Hat Linux 6.0 connected by a lightly loaded 100 Mbit/s switched Ethernet. The server was run on a machine with two 300 MHz Pentium Pro processors and 192 MB of memory and the client on a machine with a single 300 MHz Pentium Pro processor and 128 MB of memory.

The service is implemented as an object in the CORBA and RMI implementations, the interfaces of which are shown in table 15.

CORBA:

module perf {
    module corba {
        typedef sequence<octet> Block;

        interface TransferService {
            void open();
            void data( in Block block );
            void close();
        };
    };
};

Java RMI:

package perf.rmi;

import java.rmi.*;

public interface TransferService extends Remote {
    void open() throws RemoteException;
    void data( byte[] block ) throws RemoteException;
    void close() throws RemoteException;
}

Table 15. CORBA and RMI interfaces for the performance experiment

The Midas version implements the client and server components as single-threaded active objects. The server provides three synchronous message ports, “open”, “in” and “close”. The client requires three message ports, “open”, “out” and “close”, that are bound to those of the service as shown in figure 95. The message port interaction, Port, is a generic interaction template that is supplied as part of the runtime library and instantiated by the client and server components parameterised by the types of message passed between them.

[Figure 95: the composite perf : Perf contains c : Client and s : Server; the client ports open, out and close are bound to the server ports open : Port, in : Port and close : Port.]

Figure 95. Architecture of the Midas experiment

The transport protocol used by the bindings is “frag/rel/cod/udp”. The “frag” layer implements fragmentation and reassembly of large messages, “rel” implements reliable, in-order delivery of messages, “cod” implements a connection-oriented protocol over a datagram protocol and “udp” provides access to the UDP/IP protocol provided by the host operating system. The experiment used an implementation of the “udp” layer that makes direct calls to Winsock-2 API functions via native methods because the standard Java networking APIs in the java.net package provide very poor support for the upcall-driven style of protocol implementation and the memory management buffers used by the transport framework. Specifically, the java.net.DatagramSocket class does not support scatter-gather I/O and does not allow a program to query for the size of the datagram at the head of the socket’s receive queue.

8.2.1. Results

The following graph shows the time taken to perform the test for each block size.

[Figure 96: graph of Time (secs), from 0 to 140, against Block Size (octets), from 10 to 4510, with one series each for RMI, CORBA and Port.]

Figure 96. Comparison of CORBA, RMI and Midas performance.

The graph shows that initially the performance of the Midas version is better than that of the CORBA version, despite the use of additional threads with the extra context switch and synchronisation overhead that implies. This indicates that the Midas interaction model and the use of composite, layered transport protocols do not significantly reduce performance.

However, the performance of both CORBA and Midas is significantly worse than that of RMI. This is probably caused by overhead due to memory allocation and garbage collection. Midas and CORBA have a greater memory allocation overhead during marshalling because they marshal invocations into in-memory buffers before transmission. RMI, on the other hand, marshals invocations into a TCP/IP stream through a fixed size buffer, transmitting the buffer whenever it becomes full. Because CORBA and Midas allocate memory for each message, memory is used more quickly, causing the garbage collector to run more often. Furthermore, the JVM zeros all allocated memory, causing further overhead when allocating buffers.

The performance advantage of RMI is gained at the price of reduced flexibility and inefficient utilisation of other resources: RMI does not allow the programmer to insert interaction or transport filters into a binding and must establish separate TCP/IP connections for each binding (see footnote 1). On many operating systems a process can only open a limited number of TCP/IP connections; Midas systems can use transport multiplexors to share operating-system sockets between multiple endpoints.

The Midas platform shows a marked decrease in throughput with block sizes above about 1000 octets. This is because the “frag” layer was configured to divide large messages into 1000-byte fragments. The configuration interfaces of the Frag layer allow a system administrator to tune the binding to fit the size of each fragment to the MTU of the link-layer protocol if the link-layer protocol is known a priori. However, UDP/IP does not expose this information to higher layers, because it cannot be determined for an internet. RMI and CORBA both use TCP/IP as their transport; TCP has more information about the configuration of the link-layer protocol and uses the optimum maximum segment size if the connection is between two hosts on the same physical segment. Nevertheless, the throughput of the Midas application decreases less rapidly than that of the CORBA application as the amount of data increases.
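The effect of the 1000-octet fragment size can be seen in a minimal sketch of fragmentation and reassembly. This is not the thesis's “frag” layer: the class `Fragmenter` is hypothetical, and the fragment headers a real layer would need (for example sequence numbers) are omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: splits a message into bounded fragments and rejoins them. */
final class Fragmenter {
    /** Splits the message into fragments of at most maxFragment octets. */
    static List<byte[]> fragment(byte[] message, int maxFragment) {
        if (maxFragment <= 0) {
            throw new IllegalArgumentException("maxFragment must be positive");
        }
        List<byte[]> fragments = new ArrayList<>();
        for (int off = 0; off < message.length; off += maxFragment) {
            int end = Math.min(message.length, off + maxFragment);
            fragments.add(Arrays.copyOfRange(message, off, end));
        }
        if (fragments.isEmpty()) {
            fragments.add(new byte[0]); // an empty message still travels as one fragment
        }
        return fragments;
    }

    /** Concatenates fragments, assumed to arrive complete and in order. */
    static byte[] reassemble(List<byte[]> fragments) {
        int total = 0;
        for (byte[] f : fragments) {
            total += f.length;
        }
        byte[] message = new byte[total];
        int off = 0;
        for (byte[] f : fragments) {
            System.arraycopy(f, 0, message, off, f.length);
            off += f.length;
        }
        return message;
    }
}
```

With a limit of 1000 octets, a 510-octet block travels as a single fragment while a 1010-octet block needs two, which is consistent with the drop in throughput just above 1000 octets.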

Footnote 1. RMI uses two connections for every RMI reference being held to a Remote object, even if two or more references refer to the same object or different objects in the same process.

8.2.2. Relative Code Sizes Figure 97 shows the sizes of each test program. As can be seen, the programs written using Midas are significantly smaller. This is mainly because the application components do not include any code to perform the binding of interaction endpoints; this is handled by the runtime system. Another contributing factor is the use of predefined, generic interaction abstractions that can be instantiated for use by this particular application; this reduces the need for application-specific interaction declarations. It is worth noting that the difference in sizes would increase as the programs became more complex, with a greater number of distributed components.

[Bar chart: Size (lines) of the Java RMI, CORBA and Regent/Midas test programs, on an axis from 0 to 160; the bar labels read 140, 127 and 96.]

Figure 97. Sizes of the three test programs

8.3. The Effect of Different Interaction and Transport Protocols The latency and throughput of a Midas system are affected by the selection of interaction and transport protocols for a binding. Figure 98 shows how additional protocol functionality affects the throughput of the Midas version of the test application when transmitting an HTML file, 1571 octets in length (see footnote 1), between Port endpoints on the same host. As can be seen, significant benefits can be obtained by removing unnecessary functionality from the stack. When endpoints are on the same host, fragmentation and reassembly can safely be removed. Further performance improvements can be achieved by using a shared memory transport as the base of the stack, rather than UDP/IP. Similarly, bindings between endpoints on the same physical network could safely remove the fragmentation layer if the interaction protocol and higher transport layers never create messages larger than the MTU of the network.

Footnote 1. The reason for this rather strange choice of number is that it was the size of the author’s home page.

[Bar chart: Time (msecs) for the protocol stacks cod/udp, rel/cod/udp, frag/cod/udp and frag/rel/cod/udp, on an axis from 0 to 16; the bar labels read 13.22, 9.44, 7.64 and 6.91.]

Figure 98. The effect of transport layers on throughput

The synchronisation policy of the interaction protocol can also have a significant effect on throughput when sender and receiver produce or consume messages at variable rates. Figure 99 shows the difference in throughput of two programs that transmit 1000 messages from sender to receiver. Both the sender and receiver sleep for a random duration between 0 and 500 milliseconds after each transmission. The programs are identical except that one version transmits messages synchronously, using a request/reply interaction, while the other uses an interaction style that allows asynchronous messages to be sent up to some window size.

[Bar chart: Time (msecs) for the Synchronous and Asynchronous versions, on an axis from 0 to 400; the bar labels read 347.7 and 244.05.]

Figure 99. The effect of interaction style on throughput
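Why a windowed style can help when rates vary may be seen in a small timing model. It is a sketch under stated assumptions, not the experiment's code: the class `InteractionTiming` is hypothetical, transmission itself is assumed to take no time, and the pauses are supplied explicitly rather than drawn at random.

```java
/** Illustrative only: completion time of two interaction styles in a simple model. */
final class InteractionTiming {
    /**
     * Synchronous request/reply: the sender resumes only once the receiver
     * has accepted the current message.
     */
    static long synchronous(long[] senderPause, long[] receiverPause) {
        long sendReady = 0, recvReady = 0, delivered = 0;
        for (int i = 0; i < senderPause.length; i++) {
            delivered = Math.max(sendReady, recvReady);
            sendReady = delivered + senderPause[i];   // sender pauses after the reply
            recvReady = delivered + receiverPause[i]; // receiver pauses after receipt
        }
        return delivered;
    }

    /**
     * Windowed asynchronous sends: message i may be sent as soon as the sender
     * is ready and message (i - window) has been delivered.
     */
    static long windowed(long[] senderPause, long[] receiverPause, int window) {
        if (window < 1) {
            throw new IllegalArgumentException("window must be at least 1");
        }
        int n = senderPause.length;
        long[] delivered = new long[n];
        long sendReady = 0, recvReady = 0;
        for (int i = 0; i < n; i++) {
            long windowOpen = i >= window ? delivered[i - window] : 0;
            long sent = Math.max(sendReady, windowOpen);
            delivered[i] = Math.max(sent, recvReady);
            sendReady = sent + senderPause[i];
            recvReady = delivered[i] + receiverPause[i];
        }
        return n == 0 ? 0 : delivered[n - 1];
    }
}
```

In this model the synchronous style forces each party to wait out the other's pauses one after another, whereas the window lets a pause by the sender overlap with a pause by the receiver.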

8.4. Summary This chapter has presented the results of performance measurement experiments that compare the throughput of the same application written using CORBA, RMI and Midas. These demonstrate that the Midas interaction model and the use of composite transport stacks do not have a significant effect on latency or throughput. Furthermore, selection of an appropriate transport stack and interaction style for a binding can result in significant improvements in performance.

9. Conclusions

Problems worthy of attack
Prove their worth by hitting back.
Piet Hein

9.1. Introduction As described in chapter 2, current component models are almost exclusively defined in terms of objects, the interaction protocols between components being specified as abstract interfaces. Such an approach has severe limitations: an object interface can only describe bundles of related synchronous request/reply operations, but cannot specify the protocol by which those operations may be invoked. Furthermore, such an approach has no way of defining protocols in which operations are invoked on multiple interfaces. Yet in general all but the simplest component models define interaction protocols that do require multiple interfaces.

9.2. Contributions This thesis has presented a model of component interaction that supports an open variety of interaction styles, a language by which the programmer may define interaction styles that follow our model and software frameworks providing run time support for our interaction style between distributed components. Our model of interaction is more general than the interaction styles supported by current middleware platforms, such as RPC or object invocation. During the lifetime of a binding, communication can be synchronous or asynchronous, initiated by the client or service, pointcast from client to service, service to client, or multicast from the service to all or a subset of the clients.

Our language, Midas, is used to define interaction styles that follow our model. The Midas compiler translates these definitions into run-time support for binding and distribution transparency, base classes for application-layer filters and support for third-party binding. Unlike current interface definition languages, Midas lets the designer annotate interaction definitions with formal specifications of the application-layer protocol. This specification can be exported from the Midas source code, combined with abstract models of transport protocols and analysed by model checkers, to catch design errors before implementation begins. Midas links the design and implementation phases of system development by translating these formal specifications in FSP format into animations that allow a programmer to interact with the model of the interaction protocol, and into application-layer monitors that check that component implementations conform to the specified protocol.

The code generated from the Midas definitions interfaces with generic runtime frameworks that support the interaction model and implement transport protocols. The runtime frameworks are efficient – the interaction model and runtime support do not impose additional overhead compared to existing object-oriented middleware platforms. The clean separation of concerns allows designers to select the most appropriate and efficient mechanism for each concern – application-layer protocol, marshalling, transport protocol, programming interface, synchronisation – on a binding-by-binding basis, and allows filters that implement management and monitoring functionality to be added to a binding at both the application layer and the transport layer.

Transport protocols are modelled in FSP in terms of the reliability and ordering of message delivery. This high level of abstraction allows the designer to check the correctness of interaction protocols when used over a variety of transports, but ensures that model checking is still feasible. In practice, the selection of transport protocol components is also influenced by other, non-functional decisions which cannot easily be expressed in any one modelling notation. Concerns such as latency, throughput, efficient use of operating-system resources and security, as well as reliability and message ordering, must be taken into account when selecting the transport protocol. The selection of individual transport components is exposed to the designer to allow them to make these decisions, but it is up to the designer to select components that meet those properties that are shown to be necessary by design-time analysis.

9.3. Critical Evaluation Chapter 3 described the goals of this research as six properties that are required for interactions between distributed components. In this section we review how the Midas language and runtime system provide these properties.

• Open-Ended. The Midas language defines interaction protocols in terms of asynchronous messages between components. Synchronisation is implemented in terms of these messages, allowing an interaction definition to specify both information flow and synchronisation patterns. Messages can be transmitted at any time by both clients and the service, allowing the definition of many-to-one interactions, one-to-many interactions or any mixture of the two.

• Analysable. The Midas language allows the designer to annotate interaction definitions with arbitrary specifications. Chapter 2 defined a mapping between Midas and the FSP notation. This allows the formal specification of interaction protocols and mechanical analysis for deadlock and other protocol errors. Additional advantages of a formal specification include using interactive animations to help programmers understand interaction protocols and the automatic generation of test code.

• Flexible. Midas defines interactions purely in terms of application-layer messages. As described in section 3.2, the run-time support for an interaction separates the concerns of API and synchronisation, presentation layer marshalling, binding and transport protocol. Chapter 7 described how the transport protocol for a binding can be dynamically loaded and composed from protocol components. Applications can manage bindings by querying for standard control interfaces, allowing low-level control of communication without loss of generality.

• Distribution Transparent. Midas interaction declarations are translated into proxies that provide distribution transparency by marshalling invocations and transmitting them between address spaces. This approach provides distribution transparency to component implementations that communicate through interaction endpoints at their interfaces, and to the implementations of those endpoints themselves.

• Language Independent. Midas is a pure specification language: it does not include any implementation details of how endpoints or components are implemented. Instead, language mappings define how Midas interaction definitions are translated into programming language constructs. Chapter 6 described the mapping from Midas to Java and showed that it is possible to write compilers that generate code in other programming languages.

• Efficient. Chapter 8 analysed the impact on performance of the Midas interaction model and the runtime support. These experiments demonstrate that the Midas interaction model and the use of composite transport stacks do not have a significant effect on latency or throughput. Furthermore, selection of an appropriate transport stack and interaction style for a binding can result in significant improvements in performance.

The main drawback that we have found in using Midas is the complexity of the language. Compared to existing IDL languages, Midas can be used to define richer, more complex interaction protocols, and provides the programmer with a greater amount of flexibility and choice. This has the disadvantage that, compared to IDL, it is harder to define object-oriented interfaces in terms of request/reply invocations. This disadvantage could be overcome by a compiler that translates CORBA IDL into Midas definitions. Such a translation would be trivial to perform because the type, constant and module syntax of Midas is based on that of CORBA IDL, and the compiler could easily generate the FSP describing the interaction protocol as described by IDL.

9.4. Further Work The results presented in this thesis can be extended in several ways.

9.4.1. A General Connector Model Midas supports the Darwin model of component binding which is inherently client/server: many client endpoints can be bound to a single service endpoint, but not vice versa. As illustrated in chapter 3, Midas can specify many types of interaction within these constraints but there are styles of interaction that do not easily fit into this model. Examples include group communication, in which all members of the group play the same role, or protocols between three or more different types of participant.

An alternative view of the Darwin and Midas model is that components are bound via connectors that define two types of roles: “service” and “client”. More complex communication patterns could be described by extending Midas so that it could define interaction protocols between more than two types of role. This opens a number of issues:

• The current Midas model allows multiple clients to communicate with a single server. With more than two role types, how does one define the multiplicity of each role?

• Not all roles will communicate with one another. Can one determine which roles must be connected from the specification of the interaction protocol, or must the designer explicitly define connections? If the latter, then should such a specification be defined with the ADL, which is also used to define connections between components, with Midas only being used to define the roles’ message interfaces?

• How would such a connector model map to run-time support classes? Should all communication between roles of a connector use the same transport protocol?

Such a change would also necessitate changes to Darwin or the use of another ADL that defines architectures in terms of components communicating through connectors.

9.4.2. Improved Support for Dynamic Composition of Transport Layers There are a number of shortcomings with the current transport protocol framework.

• Multiplexors create hidden session layers when a higher layer requests a reference to one of their protocol services. This is necessary to allow reusable layers to be stacked above multiplexors or non-multiplexors. However, it makes it difficult to write a layer that allocates multiple sessions from a lower multiplexor: that layer cannot be bound to a service that creates sessions because no such service exists. Instead it must be bound to a session and then query the session for the layer interface of its multiplexor, and then request a service of the layer whenever it wants a new session. This is an overly tortuous way of doing something that should be very simple.

• Components that require control interfaces from lower layers must explicitly query for them when bound to a protocol service.

• The logic by which a layer must pass a request for a control interface to lower layers if it cannot satisfy the request must be implemented by each component. It is possible to encapsulate this logic in a reusable base class that uses Java reflection, but this is inefficient, only works for classes that derive from the reflective base class and forces derived classes to be written to specific coding conventions which are not always convenient.

• Stacks are torn down by “releasing” the upper layers, which release layers beneath them, and so on. Before releasing layers beneath it, a released layer must set any references to higher layers to null, so that the higher layer can be garbage collected, communicate with remote layers if necessary, remove its event listeners from lower layers and then release layers beneath it. This logic must be encoded in each component.

These problems can be alleviated by moving the logic for dynamically composing transport components out of the components themselves. The rules by which components can be composed into valid stacks, stacks can be named, and multiplexor layers reused between individually named stacks together define the architectural style of the transport subsystem. This architectural style can be described as constraints over the possible components in a stack and the valid bindings between components:

• Higher layers can be bound to the data transmission interfaces of lower layers, but not vice versa.

• Lower layers can be bound to the data reception interfaces of higher layers, but not vice versa.

• A layer's requirement for a control interface must be met by the closest lower layer that provides that interface.

• A layer's events must be routed to the closest higher layer that can handle those events.

• Binding a requirement for a transmission service to a multiplexor will instantiate a new session that provides a transmission service. Binding a requirement for a multiplexor to a multiplexor will not instantiate a new session.
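One of the constraints above, that a requirement for a control interface is met by the closest lower layer providing it, can be sketched as a lookup performed outside the layers themselves. This is only an illustration: the class `ControlLookup`, its `Layer` type and the control interface names used with it are hypothetical and do not come from the transport framework.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: resolves a control interface externally to the layers. */
final class ControlLookup {
    /** A layer described by its name and the control interfaces it provides. */
    static final class Layer {
        final String name;
        final Set<String> controls;

        Layer(String name, String... controls) {
            this.name = name;
            this.controls = new HashSet<>(Arrays.asList(controls));
        }
    }

    /**
     * The stack is ordered from top (index 0) to bottom. Returns the name of
     * the closest layer below the requester that provides the control
     * interface, or null if no lower layer provides it.
     */
    static String resolve(List<Layer> stack, int requester, String control) {
        for (int i = requester + 1; i < stack.size(); i++) {
            if (stack.get(i).controls.contains(control)) {
                return stack.get(i).name;
            }
        }
        return null;
    }
}
```

Because the search lives in one place, an individual layer in this sketch needs no forwarding logic of its own.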

Further constraints need to be placed on individual layer types. For example:

• A layer that performs encryption should be placed as low in the stack as possible.

• A layer that performs compression should be placed as low in the stack as possible, but above the layer that performs encryption, if there is one.
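The relative ordering of compression and encryption could be checked mechanically once a stack is described as data. The sketch below checks only that ordering, not the "as low as possible" part of the constraints; the class `StackOrderCheck` and the layer names `zip` and `crypt` are hypothetical.

```java
import java.util.List;

/** Illustrative only: checks one ordering constraint on a stack description. */
final class StackOrderCheck {
    /**
     * The stack is listed from top to bottom. Returns true when the stack has
     * no compression layer ("zip"), no encryption layer ("crypt"), or when the
     * compression layer sits above the encryption layer.
     */
    static boolean compressionAboveEncryption(List<String> topToBottom) {
        int zip = topToBottom.indexOf("zip");
        int crypt = topToBottom.indexOf("crypt");
        return zip < 0 || crypt < 0 || zip < crypt;
    }
}
```

Placing compression above encryption means outgoing data is compressed before it is encrypted, which matters because encrypted data generally compresses poorly.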

We are currently developing a component model and ADL that will allow designers to specify such constraints externally to the functional implementation of each component. This will allow much of the binding logic and error detection to be removed from the layer implementations, making them simpler to write and compose. The transport framework will be used as a case study: it will be ported to the new component model and a style defined for protocol stacks.

9.4.3. QoS-Directed Transport Construction The runtime system currently requires the programmer to explicitly define the structure of the transport stack used for a binding. The programmer must therefore understand what each protocol component does and how they interact. A better approach would be for the programmer to define a required QoS for a binding and let the runtime system construct a stack to meet that QoS.

This is still an open research topic. There has been some work on tools that guide the developer in selecting “plugin” components that are plugged into a protocol layer to implement the required transport semantics [Hiltu98]. However, these tools do not help the programmer build a protocol stack or graph.

9.4.4. Runtime Visualisation A visualisation of the behaviour of endpoints of a binding could be created by generating interaction filters from Midas and FSP that drive the animation tool. This would be a useful aid for debugging.

9.5. Closing Remarks The task of developing distributed computer systems is complex. Current middleware platforms attempt to hide this complexity from the developer by providing simple, but limited, abstractions of communication and system structuring. However, in practice, this has the opposite of the intended effect. As the sophistication demanded of distributed applications increases, those applications require communication mechanisms that are more varied and sophisticated than those provided by current middleware platforms. Developers must both implement any mechanisms not provided by the middleware using lower level abstractions and then integrate their implementations with those of the middleware platform.

This thesis has shown that the design of new communication abstractions, although more complex than defining RPC interfaces, can be aided by formal modelling and model checkers, allowing the developer to catch errors early in the development cycle. However, by basing communication abstractions upon a clear model of communication and cleanly separating the implementation concerns of that model, a middleware platform can offer component developers, those using predefined communication abstractions, both flexibility and simplicity.

Bibliography

[ADG98]

Robert J. Allen, Remi Douence, and David Garlan, Specifying and Analyzing Dynamic Software Architectures. Proceedings of the 1998 Conference on Fundamental Approaches to Software Engineering (FASE '98), March 1998.

[AG97]

Robert Allen and David Garlan. A Formal Basis For Architectural Connection. ACM Transactions on Software Engineering and Methodology, July 1997.

[AG98]

K. Arnold and J. Gosling. The Java Programming Language, Second Edition. Addison-Wesley, 1998. ISBN 0-201-31006-6.

[Allen97]

Robert J. Allen. A Formal Approach to Software Architecture. Ph.D. Thesis, Carnegie Mellon University, Technical Report Number: CMU-CS-97-144, May, 1997.

[ALP99]

U. Aßmann, A. Ludwig and D. Pfeifer. Programming Connectors In an Open Language. In Web-Published Proceedings of Position Papers, WICSA 1, Working IFIP Conference on Software Architecture, IFIP WG 2.9, February 1999.

[AP93]

M.B. Abbott and L.L. Peterson. A Language-based Approach to Protocol Implementation. IEEE Transactions on Networking, 1(1):4-19, February 1993.

[APM93]

APM Ltd. (Editors). Application Programming in AnsaWare - Document RM. 102.02. APM Ltd., Poseidon House, Castle Park, Cambridge, UK. 1993.

[BBMV98a]

R. Balter, L. Bellissard, F. Boyer, M. Riveill and J.Y. Vion-Dury. Architecturing and Configuring Distributed Applications with Olan. Proc. IFIP Int. Conf. on Distributed Systems Platforms and Open Distributed Processing (Middleware'98), The Lake District, 15-18 September 1998.

[BBMV98b]

L. Bellissard, F. Boyer, M. Riveill and J.Y. Vion-Dury. System Services for Distributed Application Configuration. Proc. of the 4th IEEE Intn'l Conf. on Configurable Distributed Systems (ICCDS'98), Annapolis MD, 4-6 May 1998.

[BC97]

G.S. Blair and G. Coulson. The Case for Reflective Middleware. Proc. 3rd Cabernet Plenary Workshop, Rennes, France, April 1997.

[BCRP97]

G.S. Blair, G. Coulson, P. Robin, M. Papathomas. An Architecture for Next Generation Middleware. Proc. IFIP International Conference on Distributed Systems, Platforms and Open Distributed Processing (Middleware’98), Lake District, UK. N. Davies, K. Raymond and J. Seitz (eds), Springer-Verlag, 1998.

[BJR97]

G. Booch, I. Jacobson and J. Rumbaugh. Unified Modelling Language. Rational Software Corporation. Version 1.0.

[BK98]

A. Berry and S. Kaplan. Open, Distributed Coordination with Finesse. Presented at the 1998 ACM Symposium on Applied Computing (SAC’98), Atlanta, GA. February 1998.

[BN84]

A. Birrell and B. Nelson. Implementing Remote Procedure Calls. ACM Transactions on Computer Systems, 2(1): 39-59, February 1984.

[BOSW98]

G. Bracha, M. Odersky, D. Stoutamire, and P. Wadler. Making the future safe for the past: Adding Genericity to the Java Programming Language. OOPSLA 98, Vancouver, October 1998.

[BPS98]

Tim Bray, Jean Paoli, C. M. Sperberg-McQueen. W3C Recommendation REC-xml-19980210: Extensible Markup Language (XML) 1.0. Published on the web-site of the World Wide Web Consortium (W3C), http://www.w3c.org/XML/, February 1998.

[CBC98]

F.M. Costa, G. Blair and G. Coulson. Experiments with Reflective Middleware. ECOOP Workshop on Reflective Object Oriented Programming and Systems (ROOPS’98), Brussels, Springer-Verlag, 1998.

[CD97]

S. Crane, N. Dulay. A Configurable Protocol Architecture for CORBA Environments. In Proc. Third International Symposium on Autonomous Decentralised Systems, Berlin, Germany, April 8-11, 1997.

[CDFK95]

S. Crane, N. Dulay, H. Fosså, J. Kramer, J. Magee, M. Sloman and K. Twidle. Configuration Management for Distributed Software Services. In Y. Raynaud, A. Sethi and F. Faure-Vincent, editors, Integrated Network Management IV, pp. 29-42. Chapman and Hall, 1995.

[Chap96]

D. Chappell. Understanding ActiveX and OLE. Microsoft Press. 1996. ISBN 1572312165.

[CL98]

P. Chan and R. Lee. The Java Class Libraries, Second Edition, Volume 2: java.applet, java.awt, java.beans. Addison-Wesley, 1998. ISBN 0-201-31003-1.

[CLK98]

P. Chan, R. Lee and D. Kramer. The Java Class Libraries, Second Edition, Volume 1: java.io, java.lang, java.math, java.net, java.text, java.util. Addison-Wesley 1998. ISBN 0-201-31002-3.

[CM93]

S. Chiba and T. Masuda. Designing an Extensible Distributed Language with a Meta-Level Architecture. Proceedings of the 7th European Conference on Object-Oriented Programming (ECOOP’93), Kaiserslautern, July 1993, LNCS 707, pp 482-501.

[CMP95]

S. Crane, J. Magee and N. Pryce. Design Patterns for Binding in Distributed Systems. Presented at the Workshop on Design Patterns for Concurrent, Parallel and Distributed Object-Oriented Systems at OOPSLA '95.

[Crane96]

S. Crane. A Framework for Distributed Interaction. Presented at the International Workshop on Development and Evolution of Software Architectures for Product Families, Madrid, November 1996.

[Crane97]

S. Crane. Dynamic Binding for Distributed Systems. PhD Thesis, Imperial College, University of London, March 1997.

[CT94]

S. Crane and K. Twidle. Constructing Distributed Unix Utilities in Regis. Proc. International Workshop on Configurable Distributed Systems, Pittsburgh, March 1994.

[Deer91]

S. Deering. Multicast Routing in a Datagram Internetwork. PhD Thesis, Stanford University, December 1991.

[EE98]

G. Eddon and H. Eddon. Inside Distributed COM. Microsoft Press, 1998. ISBN 157231849X.

[FGCDR98]

Tom Fitzpatrick, Gordon Blair, Geoff Coulson, Nigel Davies and Philippe Robin. Supporting Adaptive Multimedia Applications through Open Bindings. International Conference on Configurable Distributed Systems (ICCDS '98), Annapolis, Maryland, USA, May 1998.

[FKT94]

I. Foster, C. Kesselman and S. Tuecke. Nexus: Runtime Support for Task-Parallel Programming Languages. Technical Report, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne Il. 60439, August 1994.

[Fried87]

S. Friedberg. Transparent Reconfiguration requires a Third-Party Connect. Technical Report 220, Department of Computer Science, University of Rochester, November 1987.

[FS97]

H. Fossa, M. Sloman. Interactive Configuration Management for Distributed Object Systems. Proceedings of the First Enterprise Distributed Object Computing Workshop (EDOC’97), Gold Coast, Australia, October 1997.

[GAO94]

David Garlan, Robert Allen, John Ockerbloom. Exploiting Style in Architectural Design Environments. Proceedings of SIGSOFT '94 Symposium on the Foundations of Software Engineering, December 1994.

[Garlan98]

D. Garlan. Higher-Order Connectors. Presented at the Workshop on Compositional Software Architectures, Monterey, CA, January 6-7, 1998.

[GHJV94]

E. Gamma, R. Helm, R. Johnson and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley. 1994.

[Gian95]

D. Giannakopoulou. The TRACTA Approach for Behaviour Analysis of Concurrent Systems. Department of Computing, Imperial College of Science, Technology and Medicine DoC 95/16, September 1995.

[GJS96]

J. Gosling, B. Joy and G. Steele. The Java Language Specification. Addison-Wesley, 1996. ISBN 0-201-63451-1.

[GKC99]

D. Giannakopoulou, J. Kramer, and S.C. Cheung. Analysing the Behaviour of Distributed Systems using Tracta. Journal of Automated Software Engineering, special issue on Automated Analysis of Software, vol. 6(1), pp. 7-35, January 1999. R. Cleaveland and D. Jackson, Eds.

[GMW97]

David Garlan, Robert T. Monroe, David Wile. Acme: An Architecture Description Interchange Language. Proceedings of CASCON '97, November 1997.

[GS98]

A. Gokhale and D. Schmidt. Managing and Optimizing CORBA Latency and Scalability Over High-Speed Networks. IEEE Transactions on Computers, Vol. 47, No. 4, April 1998.

[HHD98]

R. Hayton, A. Herbert and D. Donaldson. FlexiNet - A flexible component oriented middleware system. Presented at ACM SIGOPS European Workshop, Lisbon, 7-10 September, 1998.

[Hiltu98]

M. Hiltunen. Configuration Management for Highly Customizable Services. Proceedings of the Fourth International Conference on Configurable Distributed Systems, Annapolis, Maryland, May 1998.

[HJA95]

H. Hueni, R. Johnson, R. Angel. A framework for network protocol software. Object Oriented Programming Systems, Languages and Applications Conference Proceedings (OOPSLA'95), ACM Press 1995.

[IBS98]

V. Issarny, C. Bidan and T. Saridakis. Achieving Middleware Customization in a Configuration-Based Development Environment: Experience with the Aster Prototype. In Proceedings of the 4th International Conference on Configurable Distributed Systems, pages 207-214, May 1998, Annapolis, Maryland, USA.

[Inprise99]

Inprise Inc. VisiBroker: CORBA Technology from Inprise. Published on the web-site of Inprise Inc., http://www.inprise.com/visibroker/.

[Iona99]

Iona Technologies. Orbix Product Information. Published on the web-site of Iona Technologies, http://www.iona.com/products/orbix/.

[KLMM97]

Gregor Kiczales, John Lamping, Anurag Mendhekar, Chris Maeda, Cristina Videira Lopes, Jean-Marc Loingtier, John Irwin. Aspect-Oriented Programming. Proceedings of the European Conference on Object-Oriented Programming (ECOOP), Finland. Springer-Verlag LNCS 1241. June 1997.

[KRB91]

G. Kiczales, J. des Rivières and D.G. Bobrow. The Art of the Metaobject Protocol. MIT Press. 1991.

[LK97]

C. Lopes and G. Kiczales. D: A Language Framework for Distributed Programming. Technical report SPL97-010, P9710047 Xerox Palo Alto Research Center, February 1997.

[LKAV95]

D.C. Luckham, J.J. Kenney, L.M. Augustin, J. Vera, D. Bryan and W. Mann. Specification and Analysis of System Architecture Using Rapide. IEEE Transactions on Software Engineering, Special Issue on Software Architecture, Vol. 21, No. 4, pp. 336-355, April 1995.

[LP98]

S. Lo and S. Pope. The Implementation of a High Performance ORB over Multiple Transport Protocols. Technical Report 98.4, AT&T Laboratories Cambridge, 24a Trumpington Street, Cambridge CB2 1QA, England, 1998.

[LN79]

H. Lauer and R. Needham. On the Duality of Operating System Structures. Operating Systems Review, 13(2): 3-19, April 1979.

[LV95]

D.C. Luckham and J. Vera. An Event-Based Architecture Definition Language. IEEE Transactions on Software Engineering, Vol 21, No 9, pp. 717-734. September 1995.

[LY97]

T. Lindholm and F. Yellin. The Java Virtual Machine Specification. Addison-Wesley, 1997. ISBN 0-201-63452-X.

[MBL97]

A. C. Myers, J. A. Bank and B. Liskov. Parameterized Types for Java. Proc. 24th ACM Symposium on Principles of Programming Languages, Paris, France, January 1997.

[MBVR97]

V. Marangozov, L. Bellissard, J.-Y. Vion-Dury, M. Riveill. Connectors: a Key Feature for Building Distributed Component-Based Architectures. Proc. 2nd European Research Seminar on Advances in Distributed Systems (ERSADS'97), Zinal, 17-21 March 1997, pp. 246-251.

[MDEK95]

J. Magee, N. Dulay, S. Eisenbach, J. Kramer. Specifying Distributed Software Architectures. Proc. of 5th European Software Engineering Conference (ESEC '95), Sitges, September 1995, LNCS 989, (Springer-Verlag), 1995, 137-153.

[MDK94]

J. Magee, N. Dulay and J. Kramer. A Constructive Development Environment for Parallel and Distributed Programs. In IEE/IOP/BCS Distributed Systems Engineering, 1(5): 304-312, Sept 1994.

[Meyer88]

B. Meyer. Object-Oriented Software Construction. Prentice-Hall International Series in Computer Science, 1988.

[MH98]

V. Matena and M. Hapner. Enterprise JavaBeans, Version 1.0. Sun Microsystems, 901 San Antonio Road, Palo Alto, CA 94303. March 1998.

[MK99]

J. Magee and J. Kramer. Concurrency: State Models and Java Programs. John Wiley and Sons, 1999.

[MKG97]

J. Magee, J. Kramer, and D. Giannakopoulou. Analysing the Behaviour of Distributed Software Architectures: a Case Study. Presented at 5th IEEE Workshop on Future Trends of Distributed Computing Systems, Tunisia, October 1997.

[Monr96]

Robert T. Monroe. Capturing Design Expertise in Customized Software Architecture Design Environments. Proc. of the Second International Software Architecture Workshop, October 1996.

[MPIF93]

Message Passing Interface Forum. Document for a Standard Message-Passing Interface, 1993. Message Passing Interface Forum, University of Tennessee, Knoxville, Tennessee.

[MPW92]

R. Milner, J. Parrow and D. Walker. A Calculus of Mobile Processes, Parts I and II. Journal of Information and Computation, Vol 100, pp 1-40 and pp 41-77, 1992.

[MSFT97]

Microsoft Corp. (Editor). Automation Programmer’s Reference: Using ActiveX Technology to Create Programmable Applications. Microsoft Press. 1997. ISBN 1572315849.

[NK98]

P. Nikander and A. Karila. A Java Beans Component Architecture for Cryptographic Protocols. Proceedings of the 7th Usenix Security Symposium, January 26-29, 1998, San Antonio, Texas.

[NKM96]

K. Ng, J. Kramer and J. Magee. A CASE Tool for Software Architecture Design. Journal of Automated Software Engineering (JASE), Special Issue on CASE-95, 1996.

[ODP95]

Secretariat: ISO/IEC JTC1/SC21/WG7. Reference Model of Open Distributed Processing, part 3: Architecture. Document ITU-T X.903 (ISO/IEC 10746-3). Standards Association of Australia, PO Box 1055, Strathfield, NSW, Australia 2135, May 1995.

[OMG98]

The Object Management Group. The Common Object Request Broker: Architecture and Specification, Version 2.2. The Object Management Group, OMG Headquarters, 492 Old Connecticut Path, Framingham, MA 01701, USA. July 1998.

[OMG99]

The Object Management Group. CORBA: Free Downloads! Published on the web-site of the OMG, http://www.omg.org/corba/freestuff.html.

[OP92]

S.W. O'Malley and L. L. Peterson. A Dynamic Network Architecture. ACM Transactions on Computer Systems, 10(2): 110-143, May 1992.

[OW97]

M. Odersky and P. Wadler. Pizza into Java: Translating Theory into Practice. In Proc. 24th ACM Symposium on Principles of Programming Languages, Paris, France, January 1997.

[PC96]

N. Pryce and S. Crane. A Uniform Approach to Communication and Configuration in Distributed Systems. Proc. Third International Conference on Configurable Distributed Systems, Annapolis, May 1996.

[PC98a]

N. Pryce and S. Crane. A Model of Interaction in Concurrent and Distributed Systems. In Proc. Second International Workshop on Development and Evolution of Software Architectures for Product Families, Las Palmas de Gran Canaria, Spain, February 26-27 1998.

[PC98b]

N. Pryce and S. Crane. Component Interaction in Concurrent and Distributed Systems. In Proc. Fourth International Conference on Configurable Distributed Systems, Annapolis, MD, USA. May 4-6 1998. IEEE Computer Society. ISBN 0-8186-8451-8.

[Pryce99]

N. Pryce. Type-safe Session. To be published in Pattern Languages of Program Design, Volume 4. Addison-Wesley Longman, 1999.

[Purt94]

J. Purtilo. The POLYLITH Software Bus. ACM Transactions on Programming Languages and Systems, 16 (1), January 1994, pp 151-174.

[QS95]

B. Quinn and D. Shute. Windows Sockets Network Programming. Addison-Wesley, 1995. ISBN 0201633728.

[RBFH95]

R. van Renesse, K. Birman, R. Friedman, M. Hayden, and D. Karr. A Framework for Protocol Composition in Horus. In Proceedings of Principles of Distributed Computing, August, 1995.

[RBM96]

R. van Renesse, K. Birman and S. Maffeis. Horus, a flexible Group Communication System. Communications of the ACM, April 1996.

[RFC1112]

S. Deering. Extensions for IP Multicasting. RFC 1112, IETF, August 1989.

[RFC1832]

R. Srinivasan. XDR: External Data Representation. RFC 1832, IETF, August 1995.

[RFC1951]

P. Deutsch. DEFLATE Compressed Data Format Specification version 1.3. RFC 1951, IETF, May 1996.

[Ritchie84]

D. M. Ritchie. A Stream Input-Output System. AT&T Bell Laboratories Technical Journal, 63(8): 1897-1910, October 1984.

[Roger97]

D. Rogerson. Inside COM - Microsoft's Component Object Model. Microsoft Press, 1997.

[Rossum95]

G. van Rossum. Python Reference Manual, Release 1.5.2. Corporation for National Research Initiatives (CNRI), 1895 Preston White Drive, Reston, Va 20191, USA, January 12, 1999. Available online from http://www.python.org/.

[Schneier96]

B. Schneier. Applied Cryptography, Second Edition: Protocols, Algorithms and Source Code in C. John Wiley and Sons. 1996. ISBN 0471117099.

[SDKR95]

M. Shaw, R. DeLine, D. Klein, T. Ross, D. Young, and G. Zelesnik. Abstractions for Software Architecture and Tools to Support Them. IEEE Transactions on Software Engineering, April 1995.

[SDZ96]

M. Shaw, R. DeLine, and G. Zelesnik. Abstractions and Implementations for Architectural Connections. Third International Conference on Configurable Distributed Systems, May 1996.

[SG96]

M. Shaw and D. Garlan. Software Architecture: Perspectives on an Emerging Discipline. Prentice Hall. 1996. ISBN 0131829572.

[Shaw96]

M. Shaw. Truth vs. Knowledge: The Difference Between What a Component Does and What We Know It Does. In Proc. 8th International Workshop on Software Specification and Design, March 1996.

[Softw99]

Softwired Inc. (Editor). iBus Technical White Paper. Published on the web-site of Softwired Inc., http://www.softwired-inc.com/

[Steve97]

W. Richard Stevens. Unix Network Programming : Networking APIs: Sockets and XTI (Volume 1). Addison-Wesley, 1997. ISBN 013490012X.

[SWDB98]

S. Sullivan, L. Winzeler, J. Deagen and D. Brown. Programming With the Java Media Framework. John Wiley & Sons. 1998. ISBN 0471251690.

[WEH97]

D.G. Waddington, C. Edwards and D. Hutchison. Resource Management for Distributed Multimedia Applications. Proceedings of the Second European Conference on Multimedia Applications, Services and Techniques - ECMAST '97; LNCS 1242, Milan, Italy, May 1997.

[WRW96]

A. Wollrath, R. Riggs and J. Waldo. A Distributed Object Model for the Java System. Computing Systems, 9(4), pp. 291-312, 1996.


A. Midas Syntax


Language serves three functions. The first is to communicate ideas. The second is to conceal ideas. The third is to conceal the absence of ideas.
Otto Jespersen

This appendix describes the syntax of the Midas language.

A.1. Comments and Preprocessing

Midas code is preprocessed by the C preprocessor before parsing. This allows the inclusion of Midas files within each other, the use of macros and conditional compilation.

Midas definitions can include comments. Comment syntax is the same as that of CORBA IDL. The parser ignores text between the symbols /* and */ and from the symbol // to the end of the line. Comments cannot be nested.
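The comment rules above can be sketched in a few lines of Python. This is only an illustration, not the Midas parser: the name strip_comments is hypothetical, and the sketch assumes that comment markers never occur inside string literals.

```python
import re

# A block comment runs from /* to the FIRST following */ (non-greedy), which is
# exactly why comments cannot nest; a line comment runs from // to end of line.
_COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def strip_comments(source: str) -> str:
    """Replace every comment with one space so neighbouring tokens stay apart."""
    return _COMMENT.sub(" ", source)


if __name__ == "__main__":
    print(strip_comments("struct P { /* fields */ long x; // width\n};"))
    # An inner /* does not open a nested comment: the text after the first */
    # is left in place for the parser to reject.
    print(repr(strip_comments("/* a /* b */ c */")))
```

Replacing a comment with a space rather than deleting it avoids accidentally joining the tokens on either side of the comment.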

A.2. Types

Midas provides the same primitive types as CORBA IDL, as shown in table 16, below.

    boolean               TRUE or FALSE
    octet                 8-bit cardinal
    short                 16-bit integer
    unsigned short        16-bit cardinal
    long                  32-bit integer
    unsigned long         32-bit cardinal
    long long             64-bit integer
    unsigned long long    64-bit cardinal
    float                 32-bit real
    double                64-bit real
    char                  8-bit character

Table 16. Primitive Midas types
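As a rough illustration of what the integral types in table 16 imply for an implementation, the following Python sketch computes the range of values each can hold. It assumes that "cardinal" means unsigned, that "integer" means two's-complement signed, and that, as in CORBA IDL, long long is the signed 64-bit type and unsigned long long the unsigned one. The names INTEGRAL_TYPES, value_range and fits are hypothetical.

```python
from typing import Tuple

# (width in bits, signed?) for each integral primitive type.
INTEGRAL_TYPES = {
    "octet": (8, False),
    "short": (16, True),
    "unsigned short": (16, False),
    "long": (32, True),
    "unsigned long": (32, False),
    "long long": (64, True),
    "unsigned long long": (64, False),
}


def value_range(type_name: str) -> Tuple[int, int]:
    """Return the (minimum, maximum) value representable by the named type."""
    bits, signed = INTEGRAL_TYPES[type_name]
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def fits(type_name: str, value: int) -> bool:
    """True if value can be carried in a field of the named type."""
    low, high = value_range(type_name)
    return low <= value <= high


if __name__ == "__main__":
    for name in INTEGRAL_TYPES:
        print(name, value_range(name))
```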

The enum statement is used to define a new enumerated type, values of which can be one of a set of symbolic constants:

    enum Direction {
        north, south, east, west
    };

An enumerated type, named Direction, that can have the value north, south, east or west.

Table 17. An example Midas enumerated type.

Midas provides a number of ways to specify structured data. The simplest are fixed size, multidimensional arrays of values. Variable sized sequences of values are supported by the sequence type constructor; sequences can optionally be bounded. The string type is similar to a sequence of characters, but is mapped into the most appropriate type for handling strings in the implementation language.

    T[N]              Fixed size array of N elements of type T.
    sequence<T>       Unbounded sequence of zero or more elements of type T.
    sequence<T,N>     Bounded sequence of zero to N elements of type T.
    string            Unbounded character string.
    string<N>         Bounded character string containing up to N characters.

Table 18. Basic structured Midas types

More complex data types can be defined using struct and union statements. A struct declaration defines a record type containing zero or more named, typed fields. A union declaration defines a discriminated union type that can, at run time, hold values of different types; the current value and type depends on a discriminator value that can be checked to ensure type-safe usage of the union.

    struct StringPair {
        string first;
        string second;
    };

A record containing two strings, named “first” and “second”.

    union StringOrDouble switch(short) {
        case 0: string the_string;
        default: double the_double;
    };

A union that holds a string, named “the_string”, if its discriminator is zero, or a double, named “the_double”, otherwise.

Table 19. Data structure definitions
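To make the role of the discriminator concrete, here is a minimal Python sketch of how the StringOrDouble union might be represented in an implementation language. It is not the mapping used by any Midas compiler; the class layout, the choice of exceptions and the field name discriminator are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class StringOrDouble:
    """Hypothetical mapping of the union: case 0 is a string, default a double."""

    discriminator: int
    value: Union[str, float]

    def __post_init__(self) -> None:
        # The discriminator decides which member type is legal.
        expected = str if self.discriminator == 0 else float
        if not isinstance(self.value, expected):
            raise TypeError(
                f"discriminator {self.discriminator} requires {expected.__name__}"
            )

    @property
    def the_string(self) -> str:
        if self.discriminator != 0:
            raise ValueError("union does not currently hold the_string")
        return self.value

    @property
    def the_double(self) -> float:
        if self.discriminator == 0:
            raise ValueError("union does not currently hold the_double")
        return self.value


if __name__ == "__main__":
    print(StringOrDouble(0, "hello").the_string)
    print(StringOrDouble(1, 2.5).the_double)
```

Checking the discriminator in each accessor is what makes use of the union type-safe: a reader can never interpret a double as a string or vice versa.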

In addition to type definitions supported by CORBA IDL, Midas allows generic type definitions; that is, type definitions can themselves be parameterised by one or more types. This allows the definition of generic data structures that can be used without sacrificing type safety and allows the definition of generic interaction protocols, as discussed in section 4.0.2. Generics are defined by specifying one or more type parameters within angled brackets (“<>”) after the type name, as shown in table 20. Generic definitions can be parameterised only by primitive Midas types or types that have themselves been defined in Midas.

    struct Pair < type T, type U > {
        T first;
        U second;
    };

A generic struct that holds two values, named “first” and “second”, the types of which are parameters of the struct.

Table 20. Generic structure declaration

Finally, Midas allows the definition of type aliases with the typedef statement. This can be useful to give convenient or descriptive names to type definitions that have long names, are in different modules or are instantiations of a generic type.

    typedef string TypeId;
    typedef Pair< TypeId, TypeId > MIMEType;

Use of typedefs to give descriptive names to an existing type and an instantiation of a generic type.

Table 21. Type aliases
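For readers more familiar with mainstream languages, the generic Pair struct and the two typedefs have close analogues in Python's typing module. The sketch below is an analogy only, not a defined Midas language mapping; note that Python checks the type parameters statically (with an external type checker), not at run time.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")
U = TypeVar("U")


@dataclass
class Pair(Generic[T, U]):
    """Analogue of: struct Pair< type T, type U > { T first; U second; };"""

    first: T
    second: U


# Analogue of: typedef string TypeId;
TypeId = str

# Analogue of: typedef Pair< TypeId, TypeId > MIMEType;
MIMEType = Pair[TypeId, TypeId]


if __name__ == "__main__":
    html = MIMEType("text", "html")
    print(html.first, html.second)
```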

A.3. Interaction Definitions

The interaction statement is used to define an interaction style. The interaction statement is followed by the name of the interaction style, optional formal parameters if the interaction style is generic, and, finally, the body of the interaction statement containing definitions of the message interfaces and specifications of the interaction protocol.

The body of an interaction statement must contain a provide statement, that defines the message interface of the service endpoint, and a require statement, that defines the message interface of the client endpoints. The body of each message interface consists of one or more message definitions. Messages are named and can take any number of named, typed parameters. Unlike operation parameters in CORBA IDL that can be used to pass values back from the server to the client, message parameters are always read-only. Values are returned in Midas interactions by explicitly defining reply messages.

The body can contain zero or more spec statements that annotate the interaction definition with specifications of the interaction protocol. Each specification has a string type-name and arbitrary contents. A specification is processed by a type-specific compiler that can parse the specification and pass it to external tools. Each specification must follow some type-specific mapping between the specification notation and the elements of the interaction – name, message interfaces, message names – so that the compiler back-end can associate elements of the specification with elements of the Midas definition.

Like structures and unions, interaction styles can be generic and instantiated for any primitive or user-defined type; type parameters can be used as the types of message arguments. An interaction statement itself defines a new type that can then be used as the type of fields in a user-defined data type, the type of message parameters of other interaction types or as arguments to instantiate a generic type.

    interaction Func< type REQUEST, type REPLY > {
        provide {
            request( REQUEST data );
        };
        require {
            reply( REPLY data );
        };
        spec "FSP" { ... };
    };

A definition of a generic interaction named “Func” parameterised by two types, REQUEST and REPLY. The service endpoint receives request messages with a single argument of type REQUEST and sends reply messages to the clients with a single argument of type REPLY. The protocol by which request and reply messages are sent is defined by the FSP specification defined in the spec block (the contents of which have been elided for brevity). This interaction definition is described in detail in section 4.0.4.

Table 22. An example interaction definition
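The two message interfaces of Func can be pictured as a pair of endpoint classes. The Python sketch below is purely illustrative and is not code generated from Midas: the class names, the explicit client argument to request and, in particular, the assumption that every request is answered immediately by exactly one reply are all hypothetical, since the FSP specification that fixes the real protocol is elided in the table.

```python
from collections import deque
from typing import Callable, Deque, Generic, TypeVar

REQUEST = TypeVar("REQUEST")
REPLY = TypeVar("REPLY")


class FuncClient(Generic[REQUEST, REPLY]):
    """Client endpoint: receives the messages of the 'require' interface."""

    def __init__(self) -> None:
        self.replies: Deque[REPLY] = deque()

    def reply(self, data: REPLY) -> None:
        # Message parameters are read-only values; results arrive as messages.
        self.replies.append(data)


class FuncService(Generic[REQUEST, REPLY]):
    """Service endpoint: receives the messages of the 'provide' interface."""

    def __init__(self, handler: Callable[[REQUEST], REPLY]) -> None:
        self.handler = handler

    def request(self, client: FuncClient, data: REQUEST) -> None:
        # Assumed protocol: one reply message is sent for each request.
        client.reply(self.handler(data))


if __name__ == "__main__":
    service = FuncService(len)      # REQUEST = str, REPLY = int
    client = FuncClient()
    service.request(client, "midas")
    print(list(client.replies))
```

The sketch returns the result through an explicit reply message rather than a return value, mirroring the way Midas interactions return values.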

A.4. Constant Definitions

The const statement is used to define a named, typed constant. Constants can be values of any of the primitive types or character strings. Numeric and boolean constants can be calculated at compile time from literal values and other constants. Expressions follow the same syntax as those of CORBA IDL [OMG98].

    const string name = "Nat Pryce";
    const long max_size = num_elements * max_element_size;

Definitions of a string constant, named name with the value “Nat Pryce”, and a long constant, named max_size, calculated from two other constant values.

Table 23. Constant definitions

A.5. Modules

Modules allow names to be defined within packages to avoid name clashes and to group related type and interaction definitions. Unless declared within the body of a module statement, Midas definitions are defined within the global module. Modules can be nested; definitions in other modules can be referred to by their scoped names. A scoped name is preceded by the names of the modules containing the name separated by the scoping operator, “::”. An initial “::” indicates that the name is resolved from the global module, otherwise it is resolved from the module in which the name is used following the usual name resolution rules of block structured languages such as C++ or Pascal.

    module regent {
        module media {
            typedef string TypeId;
            struct MediaType {
                TypeId major_type;
                TypeId minor_type;
            };
        };
    };

Modules used to group definitions related to media processing that are defined as part of the Regent set of libraries. The third definition declares a typedef for an interaction type in a different submodule of the regent module.

Figure 100. Modules used to group definitions
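The name resolution rule described above can be sketched as follows. This is a simplified model, not the Midas compiler's symbol table: the Module class and the resolve function are hypothetical, and definitions are represented by plain strings. An unqualified first component is looked up in the current module and then in each enclosing module; a leading "::" starts the search at the global module.

```python
class Module:
    """A named scope holding definitions and nested modules."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.members = {}

    def define(self, name, definition):
        self.members[name] = definition
        return definition

    def submodule(self, name):
        return self.define(name, Module(name, parent=self))


def _walk(start, parts):
    """Follow each component of a scoped name downwards from start."""
    node = start
    for part in parts:
        if not isinstance(node, Module) or part not in node.members:
            raise KeyError("::".join(parts))
        node = node.members[part]
    return node


def resolve(scoped_name, scope):
    """Resolve scoped_name as used inside the module scope."""
    parts = scoped_name.split("::")
    if parts[0] == "":                    # leading "::" means the global module
        while scope.parent is not None:
            scope = scope.parent
        return _walk(scope, parts[1:])
    while scope is not None:              # innermost enclosing scope wins
        if parts[0] in scope.members:
            return _walk(scope, parts)
        scope = scope.parent
    raise KeyError(scoped_name)


if __name__ == "__main__":
    root = Module("")
    media = root.submodule("regent").submodule("media")
    media.define("TypeId", "typedef string TypeId")
    print(resolve("::regent::media::TypeId", media))
```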


B. FSP Syntax


In theory, there is no difference between theory and practice. But, in practice, there is.
Jan L.A. van de Snepscheut

This appendix describes the FSP notation and is taken, with permission, from [MK99].

B.1. Processes

A process is defined by one or more local processes separated by commas. The definition is terminated by a full stop. STOP and ERROR are primitive local processes.

Example:

    Process = (a -> Local),
    Local = (b -> STOP).

Action Prefix (->)
    If x is an action and P a process then (x->P) describes a process that initially engages in the action x and then behaves exactly as described by P.

Choice (|)
    If x and y are actions then (x->P|y->Q) describes a process which initially engages in either of the actions x or y. After the first action has occurred, the subsequent behaviour is described by P if the first action was x and Q if the first action was y. Abbreviation: (x->P|y->P) can be written as ({x,y}->P).

Guarded Action (when)
    The choice (when B x -> P | y -> Q) means that when the guard B is true then the actions x and y are both eligible to be chosen, otherwise if B is false then the action x cannot be chosen.

Alphabet Extension (+)
    The alphabet of a process is the set of actions in which it can engage. P + S extends the alphabet of the process P with the actions in the set S.

Table 24. Process operators
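The meaning of action prefix and choice can be illustrated by encoding a process as a table of transitions and checking which sequences of actions it can perform. The Python sketch below is an informal reading of the operators, not the semantics implemented by any FSP tool: the dictionary encoding and the function run are assumptions, and the ERROR process is not modelled.

```python
# Process = (a -> Local), Local = (b -> STOP).
# Each local process maps the actions it can engage in to the local process
# that describes its subsequent behaviour. STOP engages in no actions.
PROCESS = {
    "Process": {"a": "Local"},
    "Local": {"b": "STOP"},
    "STOP": {},
}

# Choice = (x -> P | y -> Q), with P and Q left as STOP-like processes.
CHOICE = {
    "Choice": {"x": "P", "y": "Q"},
    "P": {},
    "Q": {},
}


def run(definition, start, trace):
    """Return the local process reached after trace, or None if the process
    cannot engage in one of the actions at the point it is offered."""
    state = start
    for action in trace:
        if action not in definition[state]:
            return None
        state = definition[state][action]
    return state


if __name__ == "__main__":
    print(run(PROCESS, "Process", ["a", "b"]))
    print(run(PROCESS, "Process", ["b"]))
    print(run(CHOICE, "Choice", ["y"]))
```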

B.2. Composite Processes

A composite process is the parallel composition of one or more processes. The definition of a composite process is preceded by ||.

Example:

    ||Composite = (P || Q).

Parallel Composition (||)
    If P and Q are processes then (P||Q) represents the concurrent execution of P and Q. If processes in a composition have actions in common, these actions are said to be shared. Unshared actions are arbitrarily interleaved; shared actions are executed at the same time by all processes that share the action.

Replicator (forall)
    forall [i:1..N] P(i) is the parallel composition (P(1) || ... || P(N)).

Process Labelling (:)
    a:P prefixes each label in the alphabet of P with a.

Process Sharing (::)
    [...] Q).

Priority High (<<)
    ||C=(P||Q) [...]

Priority Low (>>)
    ||C=(P||Q)>>{a1,...,an} specifies a composition in which the actions a1,...,an have lower priority than any other action in the alphabet of P||Q including the silent action tau. In any choice in this system which has one or more transitions not labeled by a1,...,an, the transitions labeled by a1,...,an are discarded.

Table 25. Composite process operators
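The rule for parallel composition (unshared actions interleave, shared actions happen at the same time in every process that has them in its alphabet) can be sketched as a product construction. The Python below is a simplified illustration rather than the algorithm used by any FSP tool: it handles only two deterministic processes, each given as a hypothetical triple of initial state, alphabet and transition table.

```python
def compose(p, q):
    """Return the reachable part of (P || Q) as (initial, alphabet, transitions)."""
    p_init, p_alpha, p_trans = p
    q_init, q_alpha, q_trans = q
    shared = p_alpha & q_alpha
    initial = (p_init, q_init)
    transitions = {}
    todo = [initial]
    while todo:
        state = todo.pop()
        if state in transitions:
            continue
        ps, qs = state
        moves = {}
        for action, target in p_trans[ps].items():
            if action not in shared:
                moves[action] = (target, qs)                  # P moves alone
            elif action in q_trans[qs]:
                moves[action] = (target, q_trans[qs][action])  # both move
        for action, target in q_trans[qs].items():
            if action not in shared:
                moves[action] = (ps, target)                  # Q moves alone
        transitions[state] = moves
        todo.extend(moves.values())
    return initial, p_alpha | q_alpha, transitions


# P = (a -> sync -> P).    Q = (sync -> b -> Q).
P = (0, {"a", "sync"}, {0: {"a": 1}, 1: {"sync": 0}})
Q = (0, {"sync", "b"}, {0: {"sync": 1}, 1: {"b": 0}})

if __name__ == "__main__":
    initial, alphabet, transitions = compose(P, Q)
    for state, moves in sorted(transitions.items()):
        print(state, moves)
```

In the composite state (1, 1) the action sync is blocked: P is ready to perform it but Q is not, and a shared action can only occur when every process that shares it can take part.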

B.3. Common Operators

The operators in table 26 may be used in the definition of both processes and composite processes.

Conditional (if then else)
    The process (if B then P else Q) behaves as the process P if the condition B is true otherwise it behaves as Q. If the else Q is omitted and B is false, then the process behaves as STOP.

Relabelling (/)
    Re-labelling is applied to a process to change the names of action labels. The general form of re-labelling is /{newlabel_1/oldlabel_1,... newlabel_n/oldlabel_n}.

Hiding (\)
    When applied to a process P, the hiding operator \{a1..ax} removes the action names a1..ax from the alphabet of P and makes these concealed actions “silent”. These silent actions are labelled tau. Silent actions in different processes are not shared.

Interface (@)
    When applied to a process P, the interface operator @{a1..ax} hides all actions in the alphabet of P not labelled in the set a1..ax.

Table 26. Common process operators
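Relabelling, hiding and the interface operator are all transformations of the labels on a process's transitions, which the following Python sketch makes explicit. It is an informal model and not taken from any FSP tool: the representation (each state mapped to a set of (action, target) pairs) and the function names are assumptions, and the alphabet is approximated by the labels that actually appear on transitions.

```python
TAU = "tau"


def relabel(process, new_for_old):
    """Apply /{new/old, ...}; new_for_old maps each old label to its new label."""
    return {
        state: {(new_for_old.get(action, action), target) for action, target in moves}
        for state, moves in process.items()
    }


def hide(process, hidden):
    """Apply \\{a1..ax}: the listed actions become the silent action tau."""
    return {
        state: {(TAU if action in hidden else action, target) for action, target in moves}
        for state, moves in process.items()
    }


def interface(process, visible):
    """Apply @{a1..ax}: hide every action that is not in the visible set."""
    alphabet = {action for moves in process.values() for action, _ in moves}
    return hide(process, alphabet - set(visible))


# P = (a -> b -> STOP), with states numbered 0, 1, 2.
P = {0: {("a", 1)}, 1: {("b", 2)}, 2: set()}

if __name__ == "__main__":
    print(relabel(P, {"a": "start"}))
    print(hide(P, {"b"}))
    print(interface(P, {"b"}))
```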


B.4. Properties

Safety (property)
    A safety property P defines a deterministic process that asserts that any trace including actions in the alphabet of P is accepted by P.

Progress (progress)
    progress P = {a1,a2..an} defines a progress property P which asserts that in an infinite execution of a target system, at least one of the actions a1,a2..an will be executed infinitely often.

Table 27. Safety and progress properties
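The idea of a safety property as a deterministic process that must accept every trace can be illustrated on a single finite trace. The Python sketch below is only an approximation of what a model checker does: a real tool checks all traces of the target system exhaustively, whereas the hypothetical function satisfies examines one trace supplied by the caller. Progress properties, which concern infinite executions, are not covered.

```python
def satisfies(trace, prop, start, alphabet):
    """Check one finite trace against a deterministic safety property.

    Actions outside the property's alphabet are ignored; an action inside the
    alphabet that the property cannot perform in its current state violates it.
    """
    state = start
    for action in trace:
        if action not in alphabet:
            continue
        if action not in prop[state]:
            return False
        state = prop[state][action]
    return True


# property SAFE = (enter -> exit -> SAFE).
SAFE = {0: {"enter": 1}, 1: {"exit": 0}}
SAFE_ALPHABET = {"enter", "exit"}

if __name__ == "__main__":
    print(satisfies(["enter", "work", "exit", "enter"], SAFE, 0, SAFE_ALPHABET))
    print(satisfies(["enter", "enter"], SAFE, 0, SAFE_ALPHABET))
```

The second trace violates the property because a second enter occurs before the matching exit, which is the kind of error a mutual exclusion property is written to catch.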
