Towards a Table Driven XML QoS Aware Transmission Framework
Alex Ng
Department of Computing, Macquarie University, North Ryde, NSW 2109, Australia
Abstract
Table Driven XML (TDXML) is an encoding mechanism designed to reduce the cost of serialising and deserialising machine object representations to and from XML. It encodes XML message data in tabular form and assigns a unique identifier to each data element and attribute. The table structure of TDXML gives rise to a flexible transmission framework, the Table Driven XML QoS Aware Transmission Framework (TDQAT), which applies content-based routing and uses different tables to convey the functional and non-functional properties of a Web Service. TDQAT aims to deliver the following benefits: improved performance and Quality-of-Service (QoS) control, reduced implementation cost and time, and easier cross-platform integration.
Keywords
Web Services, Optimisations, QoS
1. Introduction
The first Web Service appeared in 1998 [3], yet Web Services have still not reached the level of ubiquity and performance that was hoped for. Some of the hindering factors identified in [12] relate to platform differences; one example is the difference in behaviour between java.util.Date and the .NET System.DateTime. Other factors limit the performance of XML Web Services, such as XML’s verbosity and processing overhead, storage requirements, and bandwidth consumption [25]. Furthermore, the presence of SOAP intermediaries, the object-XML mismatch, the need for stronger security measures in the Web Services protocol stack, and the incorporation of Quality-of-Service (QoS) in Web Services [15, 21] have made an optimised transfer mechanism for Web Services a pressing need. Earlier studies [19, 20] showed that the performance of SOAP (formerly known as the Simple Object Access Protocol) is affected by several factors: the implementation platform, the choice of encoding style, and the complexity of the message structure. However, few of the available techniques provide efficient ways to reduce network verbosity, improve the inefficient serialisation and deserialisation processes in SOAP, and provide ubiquitous invocation.
The Table Driven XML QoS Aware Transmission Framework (TDQAT) uses content-based routing to enable the sender and receiver to process XML data efficiently and to handle native binary data. TDQAT is based on the TDXML message encoding technique, which puts XML data into different table formats. According to Kay and Pasquale [13], data-touching is a major source of overhead in protocol processing, and this overhead usually scales linearly with message size. Therefore, while TDXML puts data into table format to enable direct data access, the content-based transmission technique in TDQAT feeds different TDXML tables to different table processors with the following goals: (1) reduce the frequency and duration of data-touching; (2) avoid unnecessary serialisation and deserialisation; (3) enable concurrent processing; and (4) improve overall performance and enforce different QoS mechanisms through content-aware routing.
The rest of this paper is organised as follows. Section 2 discusses related work. Section 3 presents an overview of the proposed TDQAT framework. Section 4 presents conclusions and the future work plan for this research.
2. Related Work
A number of performance enhancement techniques have been proposed. The majority concentrate on reducing XML message size through techniques such as software or hardware compression [9], shorter XML tags [26], binary metadata [4, 33], and binary XML encoding in place of the unparsed, text-based XML format [17, 23]. The use of binary XML is exemplified by the Sun Fast Web Services [23] and Fast Infoset [24] proposals. The MTOM [10] proposal selectively encodes portions of a SOAP message using XML-binary Optimized Packaging (XOP) [11] to serialise XML Infosets containing binary data efficiently. The Portable Binary I/O (PBIO) [4] technique of creating efficient wire formats using Natural Data Representation (NDR) [33] is an example of using binary metadata on the wire, which is maintained by the sender and then decoded by the receiver into its desired form. Pull parsing (XPP) [28] and schema-specific parsers [5] are techniques that improve the efficiency of XML parsing. To accommodate multiple data type platforms, the ShareMemory Access (SMA) [1] technique has been suggested to improve the performance of SOAP-based Web Services using in-memory shared data, a distributed architecture, and standardised APIs.
The notion of QoS is broad and may be applied to a variety of areas, such as end-user quality perception, network performance, or system performance. QoS is a set of user-perceivable attributes that describe a service in user-understandable language and manifest themselves as a number of parameters. According to Sumra and Arulazi [29], the Web Services QoS requirement mainly refers to the quality aspects of a Web Service, both functional and non-functional. These include performance, reliability, integrity, accessibility, availability, interoperability, and security.
Research on QoS-aware Web Services architectures has been conducted by many organisations, targeting QoS management and development. The Quality of Service Aware Component Architecture (QuA) [2] is designed for existing distributed object systems such as CORBA; it offers hooks to which QoS management components are attached so that the execution environment for components can meet committed QoS levels at run time. IBM’s Web Service Level Agreement (WSLA) framework [15] and HP’s Web Service Management Network Agent (WSMN) [22] both target the definition and monitoring of service level agreements (SLAs) for Web Services.
The WSLA framework defines precise SLAs and uses distinct interaction stages (Management, Condition Evaluation, Measurement, and Instrumentation) to implement runtime management services. The WSMN Agent model uses a model generator to receive the WSDL/WSFL specifications and create a model of the Web Service in the model repository. The SLO evaluator obtains the required management information and determines compliance or violations. The SLA violation engine maintains a record of violations, their timestamps, and the clauses that are violated. Ran [21] proposes a QoS model that extends the current UDDI registry model [6] with a new certifier role to verify service providers’ QoS claims. The UDDI data structure types are extended to include a quality information data type. The Web Services QoS Architecture (WS QoS) [30] addresses the integration issue in Web Services based on the WS QoS ontology concept, with three upper elements (QoSInfo, WSQoSOntology, and QoSOfferDefinition) that allow service providers and service clients to specify requests and offers with QoS properties. This model consists of the following components: QoS Editor, Requirement Manager, Web Service Broker, and QoS Monitor.
SOAP-binQ [27] differs from the other work described above but is similar to the TDQAT work proposed in this paper. Both SOAP-binQ and TDQAT can dynamically adjust the volume of data sent to match the available network resources by specifying quality attributes in a quality file, which is compiled jointly with the WSDL. The quality file contains the data types of the parameters sent in SOAP messages together with various quality attribute values. Both TDQAT and SOAP-binQ allow the application layer to communicate with the transport layer either via XML data or via native data. The main difference between them lies in the transmission data format: SOAP-binQ uses PBIO, which requires format conversions to be performed at the receiver side, whereas TDQAT uses the XOP technique to maintain XML Infoset compliance.
3. Overview of TDXML and TDQAT
TDQAT makes use of content-based routing, a technique that has been widely used in many areas. Lockwood et al. [14] used layered protocol wrappers to parse the content of Internet data in a System-On-Programmable-Chip (SOPC) content-aware Internet firewall. Mahajan and Parashar [16] implemented the content-aware bandwidth broker (CABB) to provide adaptive brokering for networked multimedia applications in a Differentiated Services environment. In mobile ad hoc networks, Yang and Hurson [34] use content-based image indexing to perform reactive, multi-path, and hybrid routing strategies.
TDQAT uses content-based routing to segregate message data processing. A content-based router (in TDQAT’s design, a table scheduler) examines the message content and routes the message onto a different channel based on the data it contains. The routing can be based on a number of criteria, such as the existence of fields or specific field values. In designing a content-based router for TDQAT, special care has been taken to make the routing function easy to maintain, as the router can become a point of frequent maintenance. In TDQAT’s design, a message filter is a special form of content-based router: it examines the message content and passes the message to another channel if the content matches certain criteria; otherwise, it discards the message.
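The distinction between a content-based router and a message filter can be illustrated with a minimal sketch. It is not the TDQAT table scheduler itself: the message model (a dictionary of fields), the channel names, and the rule list are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical message model: field name -> value.
Message = Dict[str, str]
# A routing rule pairs a predicate with the name of a target channel.
Rule = Tuple[Callable[[Message], bool], str]


def route(message: Message, rules: List[Rule], default: str = "default") -> str:
    """Return the channel of the first rule whose predicate matches."""
    for predicate, channel in rules:
        if predicate(message):
            return channel
    return default


def message_filter(
    message: Message, predicate: Callable[[Message], bool]
) -> Optional[Message]:
    """Pass the message on if it matches; otherwise discard it (None)."""
    return message if predicate(message) else None


# Rules kept as plain data so that the routing function stays easy to maintain.
# The first rule tests for the existence of a field, the second for a field value.
RULES: List[Rule] = [
    (lambda m: "qos" in m, "quality-channel"),
    (lambda m: m.get("encoding") == "binary", "natural-data-channel"),
]
```

Keeping the criteria in a rule list rather than in branching code is one way to limit the maintenance burden mentioned above, since adding a channel only requires adding a rule.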
3.1 Table Driven XML (TDXML)
TDQAT is built upon the Table Driven XML (TDXML) [18] encoding mechanism. A TDXML document is embedded in an XML document in the same way that a SOAP message is embedded in an XML document. The presence of a TDXML document is signified by a pair of tags ( in short form). A TDXML document is composed of two entities: Data Schema and Data. The TDXML Data Schema ( ) provides the information needed to describe the structure and constrain the contents of a TDXML document. It is required at service setup time only. Abstract Syntax Notation One (ASN.1) is used to describe the data schema contained in a TDXML Data Schema block.
    person [0]::= SEQUENCE OF{
        lang(0) [ATTRIBUTE] UTF8String (“English”)
        firstname[0] UTF8String,
        lastname[1] UTF8String
    }
    [0]0(0)|English
    [0]0[0]|Alan
    [0]0[1]|Johnson
    [0]1(0)|French
    [0]1[0]|Chris
    [0]1[1]|Michell
Figure 1: An example showing a block of TDXML data representing the details of two persons
The Content Table ( ) contains the actual data and attribute representations of a structure instance described in a TDXML Data Schema. It holds a unique identifier for each XML element or attribute and the corresponding value, in table format. In the example shown in Figure 1, the element person is assigned tag [0] and is the root element. The attribute lang is assigned tag (0), as it is the first attribute of the element person. The first row of data has the entry ‘[0]0(0)|English’, which identifies the attribute lang in the first occurrence of the person element.
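To make the row format of Figure 1 concrete, the following sketch decodes the six Content Table rows back into two person records. It is only an illustration under stated assumptions: the regular expression covers just the single-level row shape seen in Figure 1 (root tag, occurrence index, then an attribute tag in parentheses or an element tag in brackets), and the FIELD_NAMES mapping is written by hand here rather than derived from the ASN.1 Data Schema.

```python
import re
from typing import Dict, List, NamedTuple

# Row shape assumed from Figure 1: [root]occurrence(attr)|value or
# [root]occurrence[elem]|value. Deeper nesting is not handled.
ROW = re.compile(r"^\[(\d+)\](\d+)(?:\((\d+)\)|\[(\d+)\])\|(.*)$")

# Hand-written stand-in for the information in the Data Schema of Figure 1.
FIELD_NAMES = {
    ("attribute", 0): "lang",
    ("element", 0): "firstname",
    ("element", 1): "lastname",
}

FIGURE_1_ROWS = [
    "[0]0(0)|English", "[0]0[0]|Alan", "[0]0[1]|Johnson",
    "[0]1(0)|French", "[0]1[0]|Chris", "[0]1[1]|Michell",
]


class Cell(NamedTuple):
    root_tag: int
    occurrence: int
    kind: str  # "attribute" or "element"
    tag: int
    value: str


def parse_row(row: str) -> Cell:
    """Split one Content Table row into its identifier parts and value."""
    match = ROW.match(row)
    if match is None:
        raise ValueError(f"not a recognised Content Table row: {row!r}")
    root, occurrence, attr_tag, elem_tag, value = match.groups()
    if attr_tag is not None:
        return Cell(int(root), int(occurrence), "attribute", int(attr_tag), value)
    return Cell(int(root), int(occurrence), "element", int(elem_tag), value)


def decode(rows: List[str]) -> List[Dict[str, str]]:
    """Group rows by occurrence index and name each field."""
    records: Dict[int, Dict[str, str]] = {}
    for row in rows:
        cell = parse_row(row)
        name = FIELD_NAMES[(cell.kind, cell.tag)]
        records.setdefault(cell.occurrence, {})[name] = cell.value
    return [records[index] for index in sorted(records)]
```

Because every value carries its own identifier, a receiver can reach a single field, for example the lastname of the second person, without walking an XML tree.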
TDXML offers a versatile method of encapsulating data, so another set of tables can be used to maintain QoS attributes. A TDXML Quality Schema is marked by the block. The Quality Schema table defines the data structure and constraints of all the QoS attributes required for a particular transmission. A TDXML Quality Table is marked by the block. The Quality Table contains the values of the QoS attributes.
    QoS ::= SEQUENCE OF{
        RTT [0] Integer,
        Reliable [1] UTF8String,
        Security [2] UTF8String
    }
    [0]0[0]|5
    [0]0[1]|On
    [0]0[2]|WS-Security
Figure 2: An example showing a block of TDXML data representing the QoS attributes required for a particular transmission
Figure 2 shows an example of a block of TDXML data used to specify the QoS attributes for a particular transmission. Round-Trip-Time (RTT) is one of the quality attributes for message types. However, a monitored attribute can be any value suitable for triggering changes in data quality, including attributes specified by end users, such as desired image resolution, jitter, or time-to-live. Quality attributes can also be specific to an application, such as a SOAP-based Web Service, or to its execution environment; for example, they may capture CPU load (by measuring serialisation or deserialisation costs), memory consumption, or similar factors.
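As an illustration of how a monitored attribute might trigger a change, the sketch below reads the Quality Table rows of Figure 2 and compares a measured round-trip time against the requested one. The QOS_FIELDS mapping is copied by hand from the Quality Schema in Figure 2; the function names, the unit of RTT (not stated in the figure), and the simple threshold test are assumptions made for this example only.

```python
from typing import Dict, List, Union

# Hand-written stand-in for the Quality Schema of Figure 2: tag -> attribute.
QOS_FIELDS = {0: "RTT", 1: "Reliable", 2: "Security"}

FIGURE_2_ROWS = ["[0]0[0]|5", "[0]0[1]|On", "[0]0[2]|WS-Security"]


def parse_quality_table(rows: List[str]) -> Dict[str, Union[int, str]]:
    """Turn Quality Table rows into a dictionary of QoS attribute values."""
    qos: Dict[str, Union[int, str]] = {}
    for row in rows:
        key, _, value = row.partition("|")
        # The attribute tag is the number inside the last pair of brackets.
        tag = int(key[key.rindex("[") + 1 : -1])
        qos[QOS_FIELDS[tag]] = value
    if "RTT" in qos:
        qos["RTT"] = int(qos["RTT"])  # the schema declares RTT as Integer
    return qos


def rtt_violated(qos: Dict[str, Union[int, str]], measured_rtt: int) -> bool:
    """True if the measured RTT exceeds the requested RTT (same unit assumed)."""
    return measured_rtt > int(qos["RTT"])
```

A framework component could call rtt_violated after each exchange and, on a violation, switch to a cheaper data representation; how TDQAT reacts to such a violation is not specified in Figure 2.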
3.2 Table Driven XML QoS Aware Transmission Framework (TDQAT)
The design principle of TDQAT is that if the client and the service provider agree on a common representation of a data type that differs from the standard XML representation (whether binary or NDR), they can send the raw data in that agreed representation, encapsulated in a specially assigned TDXML Content Table, and thus avoid serialisation and deserialisation. This design concept borrows ideas from Widener et al. [33] and Seshasayee et al. [27]. For example, a signed integer in Java uses 32 bits (4 bytes) to represent values from -2,147,483,648 to +2,147,483,647. If the client and the service provider agree to use the native Java integer representation in the data transmission phase, the sender sends the 4-byte integer representation in a TDXML custom binary data Content Table instead of serialising the integer into its string representation. This saves the sender the effort of serialising the integer into a string, and the receiver the effort of deserialising the string back into an integer.
The TDQAT framework involves the following life cycle, which is typical in a Service Oriented Architecture (SOA) environment:
Phase 1: Service creation, in which interface references for the service (WSDL) are created and can be referred to by other services in a UDDI registry. While TDXML puts data into table form to enable direct data access, TDQAT segregates data into different portions to avoid serialisation and deserialisation overheads. To achieve such agreement, the service provider has to specify in the WSDL, for each SOAP request/response parameter, which internal representations are supported, and the client needs to let the service provider know whether the transmitted data uses the internal representation or not.
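The 4-byte integer example given earlier in this section can be sketched as follows, contrasting the native representation with the text form that XML serialisation would produce. The function names are hypothetical. Big-endian byte order is an assumption made here (it matches what Java's DataOutputStream.writeInt writes); the agreed representation between client and provider could equally specify another order.

```python
import struct


def encode_native_int32(value: int) -> bytes:
    """Pack a signed 32-bit integer into 4 bytes (big-endian, assumed)."""
    return struct.pack(">i", value)


def decode_native_int32(data: bytes) -> int:
    """Unpack 4 big-endian bytes into a signed 32-bit integer."""
    return struct.unpack(">i", data)[0]


def encode_text_int(value: int) -> bytes:
    """The text form that would otherwise be produced by serialisation."""
    return str(value).encode("utf-8")


smallest = -2_147_483_648
native = encode_native_int32(smallest)   # always 4 bytes
text = encode_text_int(smallest)         # 11 bytes for this value
```

Beyond the difference in size, the native form lets the receiver obtain the value with a fixed-width read instead of parsing digits, which is the serialisation and deserialisation effort the agreed representation is meant to avoid.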
Figure 3: An example showing the qat:type attribute used in WSDL to indicate that an element supports a TDQAT internal representation
Figure 3 shows an example of using the qat:type attribute in a WSDL document. The presence of a qat:type attribute in an element indicates that the described element supports a certain type of TDQAT-defined system internal representation. All TDQAT supported types use the namespace xmlns:qat=http://www.tdqat.org/qat/2007/qatSchema.
To avoid the overhead of exchanging application-level information between client and service provider, TDQAT lets the service provider dictate which internal representation is used; the client then chooses whether or not to use the service provider’s internal representation.
Phase 2: Service finding, in which a client performs a look-up in a UDDI registry for that particular service.
Phase 3: Service binding, in which a client negotiates with the service provider for an agreed service environment (i.e. an SLA). Figure 4 explains the processes involved in the binding establishment phase of TDQAT. Implicit binding is not desirable for the TDQAT platform, as the QoS requirements may demand that resource allocation procedures are performed before the request is executed. Therefore, TDQAT requires explicit binding, which consists of taking explicit actions at the computational level to establish the binding before actual interaction. The agreed QoS serves as a contract between the client and the TDQAT platform, which should be respected during the activation phase.
Figure 4: TDQAT Binding Establishment
Phase 4: Service activation, in which a service starts execution, which implies that all resources necessary for the service to execute should be properly allocated.
Phase 5: Service deactivation, in which all resources allocated to a service may be released, although the interface references may still be valid if persistent services are supported.
Phase 6: Service destruction, the last phase, in which the service is deactivated (if it is still active) and its interface references become invalid.
Figure 5: A schematic diagram showing the components involved in transmitting data using the TDQAT framework
The components involved in sending data using the TDQAT framework are shown in Figure 5. Every object involved in a transmission requires the sender to define an Encoding Ruler (one per object) that specifies which parts of the object use the natural representation and which parts use the XML representation. When an object is to be sent, a Data Analyser is activated. The Data Analyser works according to the rules specified in the Encoding Ruler for that object and generates the following three intermediate parts:
• Message Signature: the TDXML Data Schema;
• Natural Data: the portion of data that contains the object fields in natural binary data format; and
• Serialise Data: the portion of object fields that require serialisation. This portion undergoes the TDXML serialisation process into its designated TDXML Content Table.
The TDQAT Encoding Ruler acts as a blueprint that describes the schema of the object structure, the data format of each field, and the name of the target Web Service where the object will be processed. ASN.1 notation will be used to specify the rules. The TDXML Data Container is the repository for all the different TDXML tables. Once a TDXML table is ready to be sent, further security and other QoS-related functions can be performed on the data, or the table can be handed over to the network buffer for sending without waiting for the other related tables to be ready.
On the receiving end, as shown in Figure 6, the identity of each TDXML Content Table is interrogated and the table is fed to the appropriate TDXML handler, which performs task-specific processing and enables concurrent processing. The Message Scheduler is responsible for interpreting the Message Signature portion and performing the necessary action for each data field contained in the Natural Data portion. It will either call the application using an interrupt or put the request on a queue, from which it will eventually be consumed by the application. Splitting incoming requests into interrupts and queues makes it easier to handle requests at different priorities. This design follows the recommendations of Welsh, Culler and Brewer [32].
Figure 6: A schematic diagram showing the components involved in processing the received data using the TDQAT framework
The Object Data Scheduler is responsible for handling data elements in natural binary format, which do not require deserialisation. It retrieves the value from the data table, together with the matching action specified in the Message Signature portion, and puts the data into the scheduler module for processing. For tables that require deserialisation, the TDXML Deserialiser, a set of specialised routines, deserialises the TDXML data back to its internal system format and then feeds the internal system data to the message scheduler module for processing.
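The receiver-side idea of feeding each table to a handler chosen by the table's identity, with handlers running concurrently, can be sketched as below. This is a simplified stand-in rather than the components of Figure 6: the two table kinds, the handler functions, and the use of a thread pool are assumptions made for illustration, and the natural-data handler assumes a single big-endian 32-bit integer payload.

```python
import struct
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple, Union

Result = Union[int, str]


def handle_natural(payload: bytes) -> int:
    """Natural data needs no deserialisation: read the value directly."""
    return struct.unpack(">i", payload)[0]


def handle_serialised(payload: bytes) -> str:
    """Serialised data is converted back to an internal (string) form."""
    return payload.decode("utf-8")


# Hypothetical table identities mapped to their task-specific handlers.
HANDLERS: Dict[str, Callable[[bytes], Result]] = {
    "natural": handle_natural,
    "serialised": handle_serialised,
}


def process_tables(tables: List[Tuple[str, bytes]]) -> List[Result]:
    """Dispatch each (kind, payload) table to its handler concurrently.

    Results are returned in the order in which the tables were given.
    """
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(HANDLERS[kind], data) for kind, data in tables]
        return [future.result() for future in futures]
```

Because each table is self-describing, no handler has to wait for the others, which is what allows the processing of independent tables to overlap.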
3.3 TDQAT QoS Mechanisms
The design of TDQAT acknowledges the importance of adaptability for a QoS-aware transmission framework, in both the short and the long term. Short-term adaptability, i.e. to runtime changes, is required because in a heterogeneous environment the qualitative properties are subject to change, owing to a changing number of service requests and to the unavailability of intermediaries during planned or unplanned downtime. Long-term adaptability, i.e. to evolutionary changes, is needed because new computing and communication resources with additional functionality are made available to the Web Services protocol stacks [31] over time. Our objective is to design a generic QoS-aware transmission framework that hides run-time changes, is adaptable to evolutionary changes, and simplifies the establishment of agreements.
QoS in traditional communication networks is mainly driven by the need to provide QoS differentiation to various users and/or application channels. The basic QoS mechanisms in TDQAT combine a scheduling mechanism with a buffer management scheme to determine the QoS perceived by a particular message flow. More frequent scheduling of messages from a source flow increases the bandwidth for that flow, whereas the buffer size and buffer management determine the delay and jitter (i.e. delay variance). The research community has proposed schemes such as Class Based Queuing (CBQ) [8] and Weighted Fair Queuing [7]. Figure 7 shows that a TDQAT intermediary provides two channel planes. The Control Plane processes and forwards signalling tables, while the Message Plane processes two classes of application tables, premium and best-effort, where premium tables take precedence over best-effort tables. The number of precedence classes can be controlled by global parameters.
Figure 7: A schematic showing TDQAT using different control and data transfer planes in an intermediary
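The precedence rule of the Message Plane can be illustrated with a small priority queue. This is a sketch under stated assumptions, not the scheduler of a TDQAT intermediary: strict priority is assumed (schemes such as CBQ [8] or fair queuing [7] share capacity differently), the class names and the PRECEDENCE mapping stand in for the global parameters mentioned above, and the Control Plane is left out.

```python
import heapq
from itertools import count
from typing import Any, List, Tuple

# Hypothetical global parameter: lower number means higher precedence.
PRECEDENCE = {"premium": 0, "best-effort": 1}


class MessagePlane:
    """Strict-priority queue of application tables."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, Any]] = []
        self._sequence = count()  # preserves arrival order within a class

    def enqueue(self, table_class: str, table: Any) -> None:
        entry = (PRECEDENCE[table_class], next(self._sequence), table)
        heapq.heappush(self._heap, entry)

    def dequeue(self) -> Any:
        """Return the next table: premium first, then best-effort."""
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

Adding a precedence class only requires another entry in PRECEDENCE. Note that strict priority can starve best-effort tables under sustained premium load, which is one reason weighted schemes are used in practice.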
3.4 TDQAT Validation
The proposed TDQAT architecture will be validated by designing and implementing a broker between the TDQAT-capable end-point and at least one intermediary node, under the control of a quality requirements and provisioning application component called the TDQAT QoS Provisioning Service (TDQPS). To further justify the proposed extensions to existing SOA and to ensure that these extensions are not tied to a specific implementation platform, we will develop a reference model together with a reference implementation. This will allow us to analyse the performance characteristics and limitations of the TDQAT reference implementation by collecting the following metrics: parsing and content interrogation overheads, overall serialisation and deserialisation overheads, CPU utilisation, overall throughput, and memory usage patterns.
4. Conclusion
This paper has briefly described the rationale and design of a special kind of realisation of XML using tables, in conjunction with a QoS-aware transport framework called the Table Driven XML QoS Aware Transmission Framework (TDQAT). This framework aims to lay the groundwork for improved performance, as well as a higher degree of security and interoperability, for Web Services with QoS capability. We expect that TDQAT will be an efficient alternative processing framework for the Web Services community. We are in the process of implementing the code needed for parsing, for interrogating different TDXML document signatures, and for the different task schedulers and handlers. We are also developing tools and converters to help automate and manage the setup and maintenance of TDQAT, as well as viewers and tools to assist programmers in making use of our approach. In future work, to validate our approach against similar implementations, we shall identify other implementations for our experimental measurements. Some examples are:
• The MTOM [10] standard is a suitable candidate for comparison, since MTOM is a standard and implementations of it (e.g. Apache Axis2) are available.
• The SOAP-binQ [27] mechanism is another good candidate for comparison, due to the similarity between TDQAT and PBIO in the optimisation approach.
References
[1] Achieving Extreme Performance with In-Memory Shared Data between Java and C++, Rogue Wave Software, 2006.
[2] Amundsen, S., Lund, K., Eliassen, F., et al. QuA: Platform-Managed QoS for Component Architectures. In Proceedings of the Norwegian Informatics Conference, p.55-66, 2004.
[3] Box, D. A Brief History of SOAP, 4 April 2004 (on-line). http://www.xml.com/pub/a/ws/2001/04/04/soap.html
[4] Bustamante, F., Eisenhauer, G., Schwan, K., et al. Efficient Wire Formats for High Performance Computing. In Proceedings of the Conference on Supercomputing, 2000.
[5] Chiu, K. and Lu, W., A Compiler-Based Approach to Schema-Specific Parsers for XML, Tech Report, No. 592, Indiana University, Feb 2004.
[6] Clement, L., Hately, A., Riegen, C.v., et al., UDDI Version 3.0.2, OASIS, 19 October 2004.
[7] Demers, A., Keshav, S., and Shenker, S., Analysis and simulation of a fair queueing algorithm. Journal of Internetworking: Research and Experience: 3-26, 1, January 1990.
[8] Floyd, S. and Jacobson, V., Link-sharing and resource management models for packet networks. IEEE ACM Trans. Networking. 3(4): 365-386, August 1995.
[9] Ghandeharizadeh, S., Papadopoulos, C., Cai, M., et al., Performance of Networked XML-Driven Cooperative Applications. Concurrent Engineering. 12(3): 195-203, September 1, 2004.
[10] Gudgin, M., Mendelsohn, N., Nottingham, M., et al., SOAP Message Transmission Optimization Mechanism, W3C Recommendation, 25 January 2005.
[11] Gudgin, M., Mendelsohn, N., Nottingham, M., et al., XML-binary Optimized Packaging, W3C Recommendation, 25 January 2005.
[12] Guest, S. Top Ten Tips for Web Services Interoperability, (on-line). http://blogs.msdn.com/smguest/archive/2004/08/12/213659.aspx
[13] Kay, J. and Pasquale, J. The importance of non-data touching processing overheads in TCP/IP. In Proceedings of the ACM SIGCOMM, San Francisco, California, United States, p.259-268: ACM Press, 1993.
[14] Lockwood, J.W., Neely, C., Zuver, C., et al. An Extensible, System-on-Programmable-Chip, Content-Aware Internet Firewall. In Proceedings of the Field Programmable Logic and Applications (FPL), 2003.
[15] Ludwig, H., Keller, A., Dan, A., et al., Web Service Level Agreement (WSLA) Language Specification, Specification, IBM, 2003.
[16] Mahajan, M. and Parashar, M., Managing QoS for Multimedia Applications in a Differentiated Services Environment, Rutgers University, 2002.
[17] Martin, B. and Jano, B., WAP Binary XML Content Format, W3C NOTE, W3C, 24 June 1999.
[18] Ng, A. Optimising Web Services Performance with Table Driven XML. In Proceedings of the Australian Software Engineering Conference (ASWEC 2006), Sydney, Australia, p.100-112, 2006.
[19] Ng, A., Chen, S., and Greenfield, P. An Evaluation of Contemporary Commercial SOAP Implementations. In Proceedings of the Fifth Australian Workshop on Software and System Architectures (AWSA2004), Melbourne, p.64-71: Swinburne University of Technology, 2004.
[20] Ng, A., Greenfield, P., and Chen, S. A Study of the Impact of Compression and Binary Encoding on SOAP Performance. In Proceedings of the Sixth Australasian Workshop on Software and System Architectures (AWSA2005), Brisbane, p.46-56: Swinburne University of Technology, 2005.
[21] Ran, S., A model for web services discovery with QoS. ACM SIGecom Exchanges. 4(1): 1-10, 2003.
[22] Sahai, A., Machiraju, V., Sayal, M., et al., Automated SLA Monitoring for Web Services, Technical Report, HPL-2002-191, HP, 17 July 2002.
[23] Sandoz, P., Pericas-Geertsen, S., Kawaguchi, K., et al., Fast Web Services, Sun Microsystems, August 2003.
[24] Sandoz, P., Triglia, A., and Pericas-Geertsen, S., Fast Infoset, Sun Microsystems, June 2004.
[25] Schmelzer, R. Will binary XML solve XML performance woes?, 22 Nov 2004 (on-line). http://searchwebservices.techtarget.com/tip/1,289483,sid26_gci1027726,00.html
[26] Serin, E., Design and test of the cross-format schema protocol (XFSP) for networked virtual environments. Naval Postgraduate School: Monterey, California. p. 133, 2003.
[27] Seshasayee, B., Schwan, K., and Widener, P. SOAP-binQ: High-Performance SOAP with Continuous Quality Management. In Proceedings of the 24th International Conference On Distributed Computing Systems (ICDCS), 2004.
[28] Slominski, A. Home page of XML Pull Parser (XPP), (on-line). http://www.extreme.indiana.edu/xgws/xsoap/xpp/
[29] Sumra, R. and Arulazi, D. Quality of Service for Web Services - Demystification, Limitations, and Best Practices, March 4, 2003 (on-line). http://www.developer.com/services/print.php/2027911
[30] Tian, M., Gramm, A., Naumowicz, T., et al. A Concept for QoS Integration in Web Services. In Proceedings of the 1st Web Services Quality Workshop (WQW 2003), 2003.
[31] Weerawarana, S., Curbera, F., Leymann, F., et al., Web Services Platform Architecture: SOAP, WSDL, WS-Policy, WS-Addressing, WS-BPEL, WS-ReliableMessaging, and More. 1st ed. Prentice Hall PTR. 416, 2005.
[32] Welsh, M., Culler, D., and Brewer, E. SEDA: An Architecture for Well-Conditioned, Scalable Internet Services. In Proceedings of the Eighteenth Symposium on Operating Systems Principles (SOSP18), Chateau Lake Louise, Canada, 2001.
[33] Widener, P., Eisenhauer, G., Schwan, K., et al. Open Metadata Formats: Efficient XML-Based Communication for High Performance Computing. In Proceedings of the Tenth IEEE International Symposium on High Performance Distributed Computing (HPDC-10), San Francisco, 2001.
[34] Yang, B. and Hurson, A.R. A Content-Aware Multimedia Accessing Model in Ad Hoc Networks. In Proceedings of the IEEE International Conference on Parallel and Distributed Systems (ICPADS), 2005.