Bringing Service Differentiation to the End System
Domenico Cotroneo, Massimo Ficco, Simon Pietro Romano, and Giorgio Ventre
Dipartimento di Informatica e Sistemistica, Università degli Studi di Napoli "Federico II"
Via Claudio 21, 80125 Napoli, Italy
{cotroneo,spromano,ventre}@unina.it, [email protected]
Abstract. A number of distributed applications require communication services with Quality of Service (QoS) guarantees. To this end, resource management is required both in the end-system and inside the network. To date, several approaches have been proposed, dealing either with end-system issues or with those more strictly related to the network, but no unified view exists. Furthermore, while network QoS provisioning has reached a good level of standardization, no standard proposals exist for the end-systems. We argue that a single architectural model is needed, based on a more exhaustive concept of "resource management" in which quality of service is defined at the level of user perception. In this paper we propose an architectural model that enhances the IETF Diffserv framework in order to provide service differentiation even inside network end-points. We also present a prototype implementation of this architecture, along with experimental results validating the proposed solution.
Index terms — Quality of Service, Operating Systems, Differentiated Services
A. INTRODUCTION

To provide applications with end-to-end guarantees, network resource management alone is not sufficient. This is particularly true when end-points become more sophisticated, e.g., workstations with rich device support, multiprocessing, and multiple users. Distributed applications usually require a diversity of resources, of which network bandwidth is just one. Other resources, such as processing capacity, have to be managed in concert with the network to provide applications with a guaranteed behavior [1]. With respect to the network, experience over the Internet has shown the lack of a fundamental technical element: real-time applications do not work well across the network because of variable queueing delays and congestion losses. The Internet, as originally conceived, offers only a very simple quality of service: point-to-point best-effort data delivery. Thus, before real-time applications can be broadly used, the Internet infrastructure must be modified to support more stringent QoS guarantees.
Based on these assumptions, the Internet Engineering Task Force (IETF) has defined two different models for network QoS provisioning: Integrated Services [2] and Differentiated Services [3]. These frameworks provide different, yet complementary, solutions to the issue of network support for quality of service. Although network designers stress the point that the network is more than its applications, we argue that the applications themselves drive its evolution and deployment. A single architectural model is thus needed, taking into account a more exhaustive concept of "resource management", where the quality of the service is defined at the level of user perception. We propose a framework that extends the concept of service differentiation, introducing a schema for priority-based communication mechanisms inside the end-point operating system. We also present a testbed implementation of this framework running on Solaris 2.6 and provide a critical discussion of the main experimental results. The rest of the paper is organised as follows. In section B we briefly introduce the QoS provisioning issue in the Internet, taking a look at the two standard proposals of the IETF. In section C we present an overview of some interesting proposals for end-to-end QoS support inside the end system. An innovative proposal for end-system resource management is depicted in section D, where we introduce a new component called the Priority Broker (PB). We describe the overall structure of our enhanced differentiated-services end-to-end architecture in section E. In section F we illustrate implementation issues related to the Priority Broker; section G describes the Broker Library Interface, while experimental results are presented in section H. Conclusions and directions for future work are provided in section I.

B. NETWORK QOS

In the past several years, work on QoS-enabled networks has led to the development of the Integrated
Services Architecture, where the term Integrated Services (IS) is used to denote an Internet service model that includes best-effort service, real-time service, and controlled link sharing. This architecture, together with an appropriate signalling protocol such as RSVP (Resource ReSerVation Protocol) [4], enables applications to demand per-flow guarantees from the network. However, as work has proceeded in this direction, a number of basic limitations have emerged which impede large-scale deployment of QoS mechanisms in the Internet: the reliance of IntServ protocols on per-flow state and per-flow processing results in a granularity too fine to avoid scalability problems in large networks. The need thus arises for a scalable architecture for service differentiation. Such an architecture is the subject of the studies conducted by the Diffserv working group of the IETF. A differentiated-services network achieves scalability by aggregating traffic classification state, conveyed in an appropriate DS field set by IP-layer packet marking [5]. In the Diffserv framework, traffic entering the network is subjected to a conditioning process at the network edges and then assigned to one of several behavior aggregates, each identified by a single DS codepoint. Each DS codepoint is in turn assigned a different per-hop behavior, which determines the way packets are treated in the core of the network. Thus, differentiated services are achieved through the appropriate combination of traffic conditioning and per-hop-behavior forwarding. To date, the most interesting proposals are the Expedited Forwarding (EF) [12] and Assured Forwarding (AF) [11] per-hop behaviors. The former can be used to build an end-to-end service characterized by low packet loss, low latency and jitter, and assured bandwidth, thus appearing to the end-points like a point-to-point connection or a "virtual leased line".
The latter provides delivery of IP packets in four different flavours (AF classes); within each AF class, a datagram can be assigned one of three different levels of drop precedence. A congested Diffserv node tries to protect packets with a lower drop precedence value from being lost, by preferentially discarding packets with a higher drop precedence value.

C. END SYSTEM QOS

A number of researchers have dealt with the issue of providing Quality of Service guarantees inside network end-points. David K.Y. Yau and Simon S. Lam proposed an end-system architecture designed to support network QoS [6]. They implemented a framework called Migrating Sockets: user-level protocols that minimize hidden scheduling in protocol processing are provided on the sender side, while an active demultiplexing mechanism with constant overhead is implemented on the receiver side. With Migrating Sockets, performance-critical protocol services such as send and receive are accessed as a user-level library linked with applications. An architectural model providing Quality of Service guarantees to delay-sensitive networked multimedia applications, including end systems, has been proposed by K. Nahrstedt and J. Smith [7]. This model provides a number of services. The Translation Service is in charge of mapping parameters between application QoS and network QoS, as well as between application/network QoS and system QoS. The Resource Admission Service provides a control mechanism for shared resource availability. The Information Distribution Services implement layer-to-layer and peer-to-peer communication for call establishment with QoS guarantees. K. Nahrstedt and J. Smith also proposed a new model for resource management at the end-points, called the QoS Broker [8]. The broker orchestrates resources at the end-points, coordinating resource management across layer boundaries. As an intermediary, it hides implementation details from applications and per-layer resource managers. The QoS Broker manages communication among the entities to create the desired system configuration; configuration is achieved via QoS negotiation, resulting in one or more connections through the communication system. While we recognize that the cited works represent fundamental milestones in the pursuit of quality-of-service mechanisms inside the end-systems, we claim that the time is ripe for an integrated view of QoS-enabled communication. A single architectural model is needed, integrating network-centric and end-system-centric approaches and taking into account the most relevant standards proposed by the research community. In the next sections we propose a framework that extends the concept of service differentiation, introducing a schema for priority-based communication mechanisms inside the end-point operating system.

D. A NEW PROPOSAL: THE PRIORITY BROKER
A number of research studies [1][6] show that the operating system has a substantial influence on communication delays in distributed environments. In particular, packet scheduling policies inside the operating system turn out to be a crucial issue for the provisioning of guarantees that are of paramount importance for end-to-end QoS enforcement. To this end, we developed a schema for priority-based communication mechanisms inside the operating system, which we called the Priority Broker (PB). The PB is an architectural model conceived to support communication flows with a guaranteed priority level. It is implemented as a new software module residing between applications and the kernel, providing an additional level of abstraction in the form of a virtual machine.
Our Priority Broker has been conceived as a daemon process running on the end-systems. Communication between users' applications and the broker is made possible by an ad hoc library, which we called the Broker Library Interface (BLI). The PB, in turn, relies on the services made available by both the transport provider and the I/O mechanisms of the operating system upon which it resides. Figure 1 shows a conceptual view of the Priority Broker.

Figure 1. Simplified model for the Priority Broker

The PB offers a number of communication services, belonging to the following three classes:
• best-effort: no QoS guarantees;
• low priority: for adaptive applications;
• high priority: for applications that are most demanding in terms of QoS guarantees.

The new module is part of a software architecture capable of interacting with the operating system in order to provide the applications with a well-defined set of differentiated services, while monitoring the quality of the communication by means of flow control techniques. Four major components build the broker (see Figure 2): the sender-broker, the receiver-broker, the communication subsystem, and the connection manager. The sender-broker is responsible for the transmission of packets with the priority required by end-users, while the receiver-broker manages incoming packets. The communication subsystem provides the I/O primitives that allow discriminating outgoing packets in the kernel I/O subsystem. Finally, the connection manager is in charge of handling all service requests on behalf of local and remote processes.

To provide QoS guarantees, the Priority Broker exploits the following mechanisms:
• I/O primitives made available by the operating system, to assure flow control on the sender side when accessing the network;
• shared memory, to avoid multiple copies between applications and kernel when sending and receiving messages;
• synchronization mechanisms, to implement the communication protocol between user applications and the PB;
• demultiplexing capabilities on the receiving side.
E. AN ARCHITECTURE FOR END-TO-END QOS PROVISIONING

Starting from the consideration that end-systems and network are both fundamental to QoS provisioning, we tried to design a framework general enough to take local and distributed resources into account in a coherent way. We propose here an architecture that enhances the Diffserv communication paradigm in order to bring the concept of differentiation to the end-system as well, putting aside the simplistic view that treats it as an atomic point with no specific requirements in terms of resource allocation and control. Thus, we envision a distributed environment where resources are explicitly requested by the applications and appropriately managed by the local OS, which is also in charge of mapping local requests onto the set of services made available by the network. More precisely, we enforce a correspondence between end-system priorities and network classes of service, thus obtaining an actual end-to-end Per Hop Behavior, where the first and last hops take into account transmission between end hosts and network routers. The framework we propose is depicted in Figure 2.
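The correspondence between end-system priorities and network classes of service is ultimately realized by marking outgoing packets with a DS codepoint. A minimal sketch of such a mapping follows; the enum names are ours, and the choice of AF11 for the low-priority class is an assumption (the paper only states that low priority maps onto an AF class), while 46 is the standard EF codepoint.

```c
#include <stdint.h>

/* Local service classes offered by the Priority Broker (names ours). */
enum pb_class { PB_BEST_EFFORT, PB_LOW_PRIORITY, PB_HIGH_PRIORITY };

/* DS codepoints: EF = 101110b (46), AF11 = 001010b (10), default = 0. */
static uint8_t dscp_for_class(enum pb_class c)
{
    switch (c) {
    case PB_HIGH_PRIORITY: return 46;  /* EF PHB */
    case PB_LOW_PRIORITY:  return 10;  /* AF11 PHB (our assumption) */
    default:               return 0;   /* best-effort (default PHB) */
    }
}

/* The DS field occupies the six most significant bits of the former
 * IPv4 TOS octet, so the byte written into the packet header is the
 * DSCP shifted left by two bits. */
static uint8_t ds_field_for_class(enum pb_class c)
{
    return (uint8_t)(dscp_for_class(c) << 2);
}
```

On a BSD-sockets platform this byte would typically be installed per-socket via `setsockopt` with the `IP_TOS` option, which is one plausible realization of the "local marker" mentioned above.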
Figure 2. End-to-end communication architecture

As Figure 2 shows, communication in this scenario takes place in the following steps:
1. A sender application chooses its desired level of service among the set of available QoS classes and asks the local connection manager to appropriately set up the required resources.
2. The connection manager performs local admission control and, in case of success, configures the local marker so as to set the DS codepoint corresponding to the required class of service. To this end, an appropriate mapping must be defined between local classes of service and network QoS levels. The solution we propose is shown in the table below.

Local priority               Network service
High                         EF
Low                          AF
Best Effort (no priority)    Default

F. PB IMPLEMENTATION ISSUES

The prototype Priority Broker we implemented is a software module placed between application and kernel space. It provides a set of communication services, by means of a library (the BLI, Broker Library Interface), that allow applications to request a communication channel with a fixed priority. The next sub-sections describe in detail the PB components mentioned in section D and how they interact in an inter-PB communication scenario. We implemented the PB on the Sun Solaris 2.6 operating system.

1st. Connection Manager
Figure 3. Connection manager scheme
The connection manager is the PB component in charge of handling all service requests on behalf of local and remote processes. It provides the following services:
• control and data sockets management;
• scheduling of service requests on behalf of remote processes;
• control information handling;
• I/O buffers handling;
• local database management.

Figure 3 shows a scheme of the Connection Manager. This component manages a local database containing the following fields:
• virtual name of the local process (VnL);
• local process ID (IDL);
• IP address of the remote end-system (IP);
• port number of the remote PB (Pn);
• virtual name of the remote process (VnR);
• remote process ID (IDR).
At startup, a local process requests a communication service with a fixed priority from the local PB, specifying the remote host it wishes to communicate with. A control socket towards the remote PB is created and a request message is sent to the remote PB connection manager. The request message is structured as follows:
• virtual name of the source process;
• source process ID;
• virtual name of the destination process.

Once the request message has been accepted, the remote PB connection manager stores a new record into its local database and sends back an acknowledgement message to the source PB, indicating the virtual name of the remote process. Upon reception of this acknowledgement, the source PB creates a data socket and configures its Diffserv marker so as to map the local priority level onto the corresponding level of service adopted over the Diffserv communication network.
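One way to serialize the three-field request message for the control socket is sketched below. The wire layout (fixed-size names, a 32-bit process ID, host byte order) is entirely our assumption; the paper lists the fields but not their encoding.

```c
#include <stdint.h>
#include <string.h>

#define PB_VNAME_MAX 32   /* fixed virtual-name length: our assumption */

/* Connection-manager request message (fields as listed in the text). */
struct pb_request {
    char    vn_src[PB_VNAME_MAX];  /* virtual name of the source process */
    int32_t src_pid;               /* source process ID                  */
    char    vn_dst[PB_VNAME_MAX];  /* virtual name of the destination    */
};

#define PB_REQ_SIZE (2 * PB_VNAME_MAX + sizeof(int32_t))

/* Flatten the request into a buffer suitable for the control socket.
 * Byte-order conversion is omitted here for brevity. */
static void pb_request_pack(const struct pb_request *r, unsigned char *buf)
{
    memcpy(buf, r->vn_src, PB_VNAME_MAX);
    memcpy(buf + PB_VNAME_MAX, &r->src_pid, sizeof r->src_pid);
    memcpy(buf + PB_VNAME_MAX + sizeof r->src_pid, r->vn_dst, PB_VNAME_MAX);
}

static void pb_request_unpack(const unsigned char *buf, struct pb_request *r)
{
    memcpy(r->vn_src, buf, PB_VNAME_MAX);
    memcpy(&r->src_pid, buf + PB_VNAME_MAX, sizeof r->src_pid);
    memcpy(r->vn_dst, buf + PB_VNAME_MAX + sizeof r->src_pid, PB_VNAME_MAX);
}
```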
2nd. Sender Subsystem
The Sender subsystem provides the following services:
• Inter-PB data unit creation. An inter-PB data unit is a unit of information that travels from the sender PB to the receiver PB. It is created by adding a header to the process message; the header contains two fields: the destination process ID and the message size.
• Inter-PB data unit transmission with the requested priority. The PB sends data units using the communication channel made available by the Connection Manager.

The Sender subsystem architecture is shown in Figure 4.

Figure 4. Sender subsystem architecture

As the figure shows, the Sender subsystem is structured as a multithreaded process. It manages a pool of LWP threads [9]: each priority level requested by local processes is associated with one thread, and each thread is in charge of transmitting data units with the assigned priority, relying on the set of communication services defined in the BLI (Broker Library Interface) library. Data exchange between the transport and application layers takes place via shared memory, thus avoiding unnecessary copies. Each process connecting to the PB is assigned a portion of this memory area, where all data coming from the Receiver Broker are placed. To achieve synchronization between user processes and the Sender, we adopted a "1 producer, n consumers" paradigm, implemented by means of mutexes [10].
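The inter-PB data unit creation described above can be sketched as follows. The header layout (a 32-bit destination process ID and a 32-bit length) is our assumption; the paper specifies the two fields but not their widths.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Inter-PB data unit header: destination process ID plus message size
 * (field widths are our assumptions). */
struct pb_du_header {
    int32_t  dst_pid;   /* destination process ID */
    uint32_t msg_len;   /* message size in bytes  */
};

/* Build a data unit by prepending the header to the process message.
 * The caller frees the returned buffer; returns NULL on failure. */
static unsigned char *pb_du_create(int32_t dst_pid,
                                   const void *msg, uint32_t len,
                                   size_t *du_len)
{
    unsigned char *du = malloc(sizeof(struct pb_du_header) + len);
    if (!du)
        return NULL;
    struct pb_du_header h = { dst_pid, len };
    memcpy(du, &h, sizeof h);
    memcpy(du + sizeof h, msg, len);
    *du_len = sizeof h + len;
    return du;
}
```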
3rd. Receiver Subsystem
The Receiver Broker accomplishes the following tasks:
• Message reconstruction. It extracts sender messages from the packets received on the data socket (each of which carries an additional header).
• Message demultiplexing. It delivers received messages to the appropriate processes. To this end, the Receiver Broker uses the process ID contained in each packet.
The Receiver Broker is implemented as a single LWP thread handling all of the defined priority levels, as shown in Figure 5.
Figure 5. Receiver Broker architecture
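The demultiplexing step can be sketched as below: the receiver strips the inter-PB header and copies the message into the shared-memory area of the destination process, selected by the process ID carried in the header. The header layout and the per-process "mailbox" table are our assumptions, made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Header prepended by the sender PB (widths are our assumptions). */
struct pb_du_header { int32_t dst_pid; uint32_t msg_len; };

/* Per-process delivery area standing in for the shared-memory portion
 * assigned to each connected process. */
struct pb_mailbox {
    int32_t  pid;
    unsigned char data[256];
    uint32_t len;
};

/* Deliver one data unit; returns 0 on success, -1 if no process with
 * the carried ID is connected to this PB. */
static int pb_demux(const unsigned char *du, struct pb_mailbox *boxes, int n)
{
    struct pb_du_header h;
    memcpy(&h, du, sizeof h);
    for (int i = 0; i < n; i++) {
        if (boxes[i].pid == h.dst_pid) {
            memcpy(boxes[i].data, du + sizeof h, h.msg_len);
            boxes[i].len = h.msg_len;
            return 0;
        }
    }
    return -1;
}
```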
4th. Communication Subsystem

As a general remark, the features of the communication subsystem depend heavily on the particular operating system adopted. For our implementation we chose a general-purpose OS: our PB runs on Solaris 2.6. This system, which derives from the Unix System V family, provides a powerful I/O mechanism, the STREAMS framework [10]. This mechanism consists of a set of system calls, kernel resources, and kernel routines, and provides an effective environment for kernel services and drivers requiring modularity. The fundamental STREAMS unit is the Stream: a full-duplex data-transfer path between a process in user space and a STREAMS driver in kernel space, composed of a Stream head, zero or more modules, and a driver. Data on a Stream are passed in the form of messages. Each module in a Stream has its own pair of queues, one for reading and one for writing; queues are the basic elements by which the Stream head, modules, and drivers are connected. When called by the scheduler, a module's service procedure processes queued messages in a first-in, first-out (FIFO) manner, according to the priorities associated with them. Thus, STREAMS uses message queuing priority: all messages have an associated priority field; ordinary messages have a priority of zero, while priority messages have a priority band greater than zero. Ordinary messages are placed at the end of the queue, following all other waiting messages. Priority band messages are placed below all messages that have a priority greater than or equal to their own, but above any with a smaller priority [10]. To discriminate among three different classes of service we found, by trial and error, that the best choice was a priority band of 0 for Best Effort, 1 for Low Priority, and 255 for High Priority.
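The class-to-band mapping just described can be stated compactly as below. Band 255 is the largest value a STREAMS band (an unsigned byte) can hold, which keeps high-priority messages ahead of every other band; on SVR4-derived systems the band is passed to the kernel when writing with `putpmsg`. The enum names are ours.

```c
/* Local service classes (names ours). */
enum pb_class { PB_BEST_EFFORT, PB_LOW_PRIORITY, PB_HIGH_PRIORITY };

/* STREAMS priority band for each class, using the values found by
 * trial and error in the text: 0, 1 and 255. */
static int pb_band_for_class(enum pb_class c)
{
    switch (c) {
    case PB_HIGH_PRIORITY: return 255;  /* max band: always served first */
    case PB_LOW_PRIORITY:  return 1;    /* ahead of ordinary messages    */
    default:               return 0;    /* ordinary (best-effort) band   */
    }
}
```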
G. BLI LIBRARY

The Broker Library Interface (BLI) acts as an interface between applications and the Priority Broker, as shown in Figure 6. This library makes available to the end-user a set of primitives that provide applications with the necessary support for differentiated communication services. Two major entities are involved in the communication: the broker endpoint and the broker provider. The former represents either a local or a remote process, whereas the latter is the set of routines implementing the PB on the end-system.

Figure 6. The Broker Library Interface (application, BLI, Priority Broker, Stream head, TCP, IP, Ethernet driver)

Table 1 provides a summary description of the BLI functions.

BLI function    Description
Broker()        Activates and configures the PB on the end-system
Removebrk()     Removes all resources allocated to the PB when no more processes are connected to it
B_open()        Establishes a connection with the local PB
B_connect()     Enables a client process to connect with a remote server process
B_accept()      Enables a server process to accept a connection request coming from a remote client process
B_alloc()       Allocates two structures containing, respectively, a header and a pointer to the shared memory segment where the message is to be stored
B_snd()         Enables data transmission with a fixed priority
B_rcv()         Enables data reception
B_close()       Closes the connection with the PB

Table 1. The BLI primitives
Using these functions, an application may be granted access to the services provided by the Priority Broker. Figure 7 illustrates a functional schema describing the client-server communication protocol implemented via the BLI.
Figure 7. Client-server communication via the BLI

As depicted in the figure, the first step is the connection with the local PB (a b_open() call on both the client and the server side). Afterwards, the client invokes b_connect() to ask a server application for a specific service. Upon acceptance of such a request (b_accept() on the server side), the receiver broker stores the client's ID into its local database and then sends the client its own process ID through the control socket; this ID is stored in the client's local database. These process IDs act as unique identifiers for a virtual channel between the client and server end-points. The next step is the invocation of b_alloc(), which results in memory allocation for both reading and writing buffers. Finally, data transmission and reception are achieved via b_snd() and b_rcv() invocations.

H. EXPERIMENTAL RESULTS

In this section we present some of the experiments we made in order to validate the behavior of our Priority Broker, and we provide a critical analysis of the most significant results. Since our primary interest was to test the correct behavior of the inner components of the PB, we present here experiments dealing with the PB architecture described in section D.

Test 1 – Service guarantees provided by the PB
The first experiment was run to investigate the kind of guarantees the Broker is able to provide. The setup for this experiment is shown in Figure 8.

Figure 8. Setup for the first experiment

In the scenario depicted, a server application runs on workstation Minori, whereas two client applications (A and B) run concurrently on workstation Furore. Both clients are connected to the server through PB communication services. A asks for a Best-Effort treatment, while B requires a High Priority service. Both clients send 20000 messages, each 1200 bytes long. Communication is activated first by client A; client B comes into play some seconds afterwards. Figure 9 shows the results obtained with this experiment.

Figure 9. Results of the first experiment

It is worth observing that client A's transmission is preempted as soon as client B starts sending packets. Transmission from A is resumed only when B stops.
Test 2 – PB overhead evaluation
The purpose of the second experiment was to evaluate the overhead introduced by the software module we added between application and kernel space in order to implement the new services provided by the Priority Broker.
The test scenario is the same as in the previous subsection, with the only difference that all of the clients request a Best-Effort service. This enables a comparison between our implementation and a standard client-server application using BSD sockets. Figure 10 shows a comparative evaluation based on the average bandwidth achieved during transmission (the bandwidth has been computed as the ratio of the packets sent by the client to the time elapsed between the first and last packets received at the server). As Figure 10 shows, the performance attained by the additional module is almost as good as that obtained with standard sockets.

Figure 10. PB overhead evaluation

Test 3 – Broker's degree of intrusion on standard communication
The testbed described in Figure 11 was set up to evaluate the impact of the Broker module on applications using standard communication libraries while running concurrently: on end-system i, two PB clients (A and B) and a client using standard sockets run side by side, communicating through the network with a server broker and a standard server socket on end-system j.

Figure 11. Testbed used for experiments 3 and 4

Bandwidth measures (Figure 12) show that the increase in the number of clients exploiting PB services has no effect on the resources assigned to the client that uses standard sockets.
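The average-bandwidth figure used throughout these tests can be stated as a one-liner, which also makes the units explicit (the function name and signature are ours):

```c
/* Average bandwidth as defined for these tests: bytes sent by the
 * client divided by the time elapsed between the first and last
 * packets received at the server. Times in seconds; result in
 * bytes per second. */
static double avg_bandwidth(unsigned long num_msgs, unsigned long msg_size,
                            double t_first, double t_last)
{
    return (double)num_msgs * (double)msg_size / (t_last - t_first);
}
```

For instance, 20000 messages of 1200 bytes received over a 10-second span correspond to 2.4 MB/s.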
Figure 12. Results of the third experiment
Test 4 – Qualitative comparison between standard sockets and BLI processing
To give a flavour of the difference in processor load induced by BLI clients versus standard-socket clients, Figure 13 provides a snapshot of the Solaris perfmeter tool while running the two kinds of clients in sequence.
Figure 13. Qualitative CPU load comparison

I. CONCLUSIONS

In this paper we discussed some of the issues involved in the design of a framework that extends the concept of service differentiation. We proposed an architecture that enhances the Diffserv communication paradigm in order to bring the concept of differentiation to the end-system as well, putting aside the simplistic view that treats it as an atomic point with no specific requirements in terms of resource allocation and control. Thus, we introduced a schema for priority-based communication mechanisms inside the end-point operating system. We also presented a testbed implementation of this framework and provided a critical discussion of the main experimental results. Work is in full swing to integrate our Priority Broker component in the framework of a Diffserv network testbed we set up at our laboratory. Ongoing experiments deal with the refinement of the interaction between the PB and the edge routers of the Diffserv network, and with the validation of the mapping process we implemented. Our aim is to design a distributed environment where resources are explicitly requested by the applications and appropriately managed by the local OS, which is also in charge of mapping local requests onto the set of services made available by the network.

J. REFERENCES

[1] H. Schulzrinne, "Operating System Issues for Continuous Media", Multimedia Systems, vol. 4, no. 5, pp. 269-280, Oct. 1996.
[2] R. Braden, D. Clark, and S. Shenker, "Integrated Services in the Internet Architecture: an Overview", RFC 1633, July 1994.
[3] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss, "An Architecture for Differentiated Services", RFC 2475, Dec. 1998.
[4] R. Braden, L. Zhang, S. Berson, S. Herzog, and S. Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1 Functional Specification", RFC 2205, Sep. 1997.
[5] K. Nichols, V. Jacobson, and L. Zhang, "A Two-bit Differentiated Services Architecture for the Internet", Internet draft, draft-nichols-diffsvc-arch-00.txt, Nov. 1997.
[6] D. K. Y. Yau and S. S. Lam, "Migrating Sockets - End System Support for Networking with Quality of Service Guarantees", IEEE/ACM Transactions on Networking, vol. 6, Dec. 1998.
[7] K. Nahrstedt and J. Smith, "A Service Kernel for Multimedia Endstations", Technical Report, MS-CIS, University of Pennsylvania.
[8] K. Nahrstedt and J. Smith, "The QoS Broker", IEEE Multimedia, vol. 2, no. 1, pp. 53-67, Spring 1995.
[9] J. R. Graham, Solaris 2.x: Internals and Architecture, McGraw-Hill, 1995.
[10] U. Vahalia, UNIX Internals: The New Frontiers, Prentice Hall International, 1996.
[11] J. Heinanen, F. Baker, W. Weiss, and J. Wroclawski, "Assured Forwarding PHB Group", Internet draft, draft-ietf-diffserv-af-04.txt, Jan. 1999.
[12] V. Jacobson, K. Nichols, and K. Poduri, "An Expedited Forwarding PHB", Internet draft, draft-ietf-diffserv-phb-ef-01.txt, Nov. 1998.