CLARA: A CLuster-based Active Router Architecture

Girish Welling, Maximilian Ott, Saurabh Mathur
C&C Research Laboratories, NEC USA, Inc., Princeton, NJ 08540, USA.
http://www.ccrl.nj.nec.com

Abstract— Personalization of web-based content, coupled with a proliferation of heterogeneous terminals and devices, will dramatically increase the need for computational services. Current trends towards thin clients make it difficult to perform such computation at the terminal end, while service providers frequently out-source computation, especially when operating under varying load. Today, most computational services utilize a traditional distributed systems model, which does not scale to Internet proportions. To alleviate this problem, we believe that computational services should instead be provided on a highly-distributed, best-effort basis, an approach which has proven successful for packet routing in IP networks. In this paper, we describe an architecture that collocates routing and computational functionality, thereby providing a scalable computational service within a network. The prototype we have built utilizes multiple off-the-shelf PC's to provide the necessary computational power. Initial experiments indicate that our proposed architecture can be utilized to perform real-time transcoding of video with minimal overhead. Moreover, the architecture does not incur any overhead on conventional IP routing.

Keywords— cluster computing, active network, router architecture, multimedia streaming, MPEG transcoder

Contact email: [email protected]

I. INTRODUCTION

The advent of the Web has resulted in increasingly large-scale deployment of distributed computing systems. However, current distributed system models do not scale to Internet proportions. Wide-spread differences in users' personal preferences call for the customization of presentations provided by a server. Performing the necessary computation exclusively at the server end is hard because of the often dramatic load variations of servers, which are usually tightly associated with specific services whose popularity can change dramatically with time. At the terminal end, there is an increasing shift from a single, all-purpose, high-performance PC to a plethora of cheaper, specialized network appliances and thin terminal devices. It has therefore become difficult to assume that the necessary computation can be performed exclusively at the terminal end-point.

While limited network bandwidth is still the foremost cause of service degradation, the growing use of e-commerce applications is already resulting in scalability concerns for the processing requirements of such activities as database access. It is our belief that the imminent shift to multimedia services will only result in a dramatic increase in this need for processing capacity. In order to be resilient to burstiness in the demand for specific services, it is therefore vital for service providers to be able to off-load processing. This issue has been addressed by several commercial service providers, who utilize a proxy model. In this architecture, a specific site called a proxy is chosen at the beginning of a session to perform the required processing for the service. However, the proxy architecture only succeeds in moving the problem of scale from the server to the proxy.

We envision a transition from the mostly request-response processing of current distributed systems to continuous processing on data streams. Services in such systems can be modeled as transformations on the data stream, which need only be performed before the stream reaches its destination. We propose the JOURNEY network model, which consists of a network of routers with additional computational capabilities. Under this model, a unit of computation can be completely performed at a single computing router. Drawing from the highly successful properties of the Internet, the decision to perform a unit of computation is taken independently by each computing router on a stream's path. Although this model does not guarantee processing, it maintains the simplicity of the network. Additional guarantees must be implemented end-to-end, according to the requirements of individual streams. Even with no coordination between computing nodes, we observe performance behavior similar to that of IP networks.
We believe this indicates that the JOURNEY model for computational services can potentially become as successful as the IP model has become for network services.

This paper describes the design of CLARA, the prototype architecture of a routing node in a JOURNEY network. Our goal was to develop a scalable, high-performance computing switch/router out of off-the-shelf hardware. Section II first outlines the JOURNEY active network model. Section III then presents the architecture of the computing router, and gives a functional overview of its software support. Section IV evaluates our prototype implementation when providing a real-time MPEG stream transcoding service. Section V positions our work relative to other work in the community, and finally, Section VI concludes.

[Figure 1: a CLARA cluster — PCs (CPU, Mem) with network interfaces (NI) interconnected by a SAN]

Fig. 1. The CLARA Architecture
II. THE JOURNEY NETWORK MODEL

The goal of the JOURNEY network model is to provide processing as an additional network service with properties similar to those of current networks. Under this model, streams of active multimedia units are injected into the network for routing to their destination, as well as for customization to the needs of their clients. Each multimedia unit is computationally independent, even of others belonging to the same stream, and can therefore be processed independently of the rest of the stream. For instance, in an MPEG stream, a group of pictures (GoP) is independent of other GoPs, and can therefore be considered a multimedia unit under the JOURNEY model.

In JOURNEY, it is not necessary to process all media units that belong to a particular stream at a single computing router in the network. This improves the scalability of the computational power of a JOURNEY network, and is made possible only by the restriction that media units be computationally independent, even within the same stream. This is not a severe restriction, because a long-running stream is usually composed of such independent units for resilience, while a short stream can be considered a media unit in its entirety.

Each computing router in a JOURNEY network independently decides whether a media unit should be processed at that router. The computing router utilizes only local conditions of resource availability to make the decision, and does not require any global stream state other than the customization parameters. Eliminating the need for inter-router control messaging to process a particular stream simplifies the overall architecture considerably, but requires that customization parameters be propagated along the complete routing path. Making an independent decision at each computing router will not guarantee that each media unit arrives processed.
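As a rough illustration of the per-router decision described above, a best-effort path walk might look like the following sketch. All names, the greedy policy, and the load threshold are our own invention for illustration, not details from the paper:

```python
# Hypothetical sketch: each computing router decides independently, from
# purely local state, whether to process a media unit (MU). Unprocessed
# MUs are simply forwarded; a later router on the path may pick them up.

def admit_mu(cpu_load: float, service_installed: bool, threshold: float = 0.8) -> bool:
    """Greedy local admission: process the MU only if the required
    service is installed here and the local CPU load is below threshold."""
    return service_installed and cpu_load < threshold

def journey_path(mu, routers):
    """Walk an MU along a path of routers; the first router that admits it
    performs the processing. No inter-router coordination is needed."""
    for r in routers:
        if admit_mu(r["load"], mu["service"] in r["services"]):
            return r["name"]          # processed at this router
    return None                        # best effort: may arrive unprocessed
```

Note that `None` here corresponds to the case the model explicitly allows: the MU reaches its destination unprocessed, and recovery is left to higher layers.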
However, such best-effort processing is in the spirit of best-effort routing in IP networks, where there are no guarantees of perfect packet delivery either. Similar to IP networks, JOURNEY treats unprocessed media units as errors from which higher layers must recover. In JOURNEY, media units may also arrive out-of-sequence at their destination. This issue is again similar to IP, and can be addressed by higher-layer buffering, coupled with the use of sequence numbers for in-order delivery if so desired.

The JOURNEY model results in a network that presents a scalable computational service that is not bound to any fixed node or site. We believe the advantages of this model in terms of network overheads and programming simplicity far outweigh the disadvantages. Details of an analytical model we are developing, which illustrates the duality between processing in a JOURNEY network and routing in a packet network, are omitted from this paper. Instead, we focus on our proposed architecture for the key element of a JOURNEY network, the computing router.

III. THE COMPUTING ROUTER ARCHITECTURE

Our goal was an architecture that could not only perform competitively as a conventional router, but also scale well with respect to the computational resources that are required. We therefore proposed the CLuster-based Active Router Architecture (CLARA), illustrated in Figure 1, which consists of a cluster of generic PC's connected by a fast System Area Network (SAN). Currently, one PC (the routing element) is configured as a normal IP router, while the others (the computing elements) provide computational resources for the customization services. Concentrating all routing functionality within a single element of the cluster simplifies the incorporation of special-purpose switching technology, giving a CLARA router the potential to perform routing just as well as any conventional router or switch. For instance, it would allow us, in the future, to replace the PC-based routing element with a dedicated router. Moreover, the size of a CLARA cluster can be adjusted to accommodate the computational power that is required at a particular node in a JOURNEY network.

This also makes it possible to simultaneously track the possibly independent technology improvements in routing and computation. Our current prototype consists of three Dual-Pentium III PC’s running SMP Linux connected by a 2.5 Gbps Myrinet [1]. The routing element of the cluster behaves as a normal IP router for incoming packets that have already been processed (indicated by diamonds in Figure 1), but captures unprocessed packets (indicated by squares in Figure 1) for processing consideration. Whether a packet is processed or unprocessed is currently determined by whether the IP Router Alert option has been set or not. Therefore, processed packets are directly routed by IP, while unprocessed packets are handed up to the CLARA software for possible processing.
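The Router Alert check that separates processed from unprocessed packets can be sketched as follows. This is our own minimal IPv4 option parser, not the authors' code; the Router Alert option is type 148, per RFC 2113:

```python
# Sketch of the classification step: a packet is considered "unprocessed"
# iff its IPv4 header carries the Router Alert option (type 148, RFC 2113).

ROUTER_ALERT = 148

def has_router_alert(ip_header: bytes) -> bool:
    ihl = (ip_header[0] & 0x0F) * 4         # header length in bytes
    options, i = ip_header[20:ihl], 0
    while i < len(options):
        opt = options[i]
        if opt == 0:                         # End of Option List
            break
        if opt == 1:                         # No-Operation (1 byte)
            i += 1
            continue
        if opt == ROUTER_ALERT:
            return True
        i += options[i + 1]                  # skip: type, length, data
    return False
```

A routing element would route the packet directly when this returns `False`, and hand it up for processing consideration when it returns `True`.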

[Figure 2: network-specific modules on the routing element (Ingress → Admit ("Accept?") → Dispatch ("Which engine?") → Collect → Egress), with service-specific processing Engines on the computing elements and per-Stream state]

Fig. 2. Functional Overview

A functional overview of the CLARA software is illustrated in Figure 2. The network-specific functionality resides on the routing element, while the service-specific processing engines reside on the computing elements. The Ingress module consists of a Raw-IP socket suitably programmed to deliver only packets that have the Router Alert option set. Such incoming packets are sent to a Call-Admission module, where the decision of whether or not to accept a packet for processing is made. This decision may depend on such local router conditions as whether the required processing functionality is available, or what the aggregate computational load on the router is. Since a JOURNEY MU may span several packets, the acceptance of the first packet of an MU necessitates the acceptance of all packets belonging to the same MU. Packets that are "rejected" are routed without processing, while packets that are accepted for processing are sent on to a Dispatch module. The Dispatch module decides which processing engine a packet should be sent to for processing. This depends on the functional capabilities of the various processing engines, as well as their transient load conditions. Packets are dispatched to a computing element that supports the required processing functionality, with the restriction that all packets that belong to the same MU are dispatched to the same computing element. After processing, packets are gathered in the routing element, and are sent out on the appropriate interface according to their routing requirements.

Cluster elements can dynamically join and leave a CLARA cluster while it is on-line. To aid this process, one element of a CLARA cluster (the routing element in the current prototype) is elected to serve as a cluster manager. Cluster elements also inform the cluster manager about current resource utilization, such as processor load and memory utilization. The cluster manager aggregates this information, making it available to various decision-making processes. These include the admission policy that determines which MUs to admit for processing, and the dispatch policy that schedules work among the elements of the cluster.

A. Router Programming Framework

Besides enabling efficient multimedia processing, the CLARA software framework is designed to support:
- accounting of packet/stream resource utilization,
- division and vending of portions of the computational resources available on a router,
- the dynamic addition of customization functionality.

[Figure 3: flow of packets through a hierarchy of Stages — an IP Stage classifies packets into TCP and UDP Stages, which classify them further into per-stream/service Stages]

Fig. 3. Stages and Accounting
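The Stage organization of Figure 3 might be rendered as the following sketch. The class names and the timing-based cost model are our own assumptions; the point is that the last Module classifies toward the next Stage, and the packet carries its cumulative resource utilization with it until the cost is settled at the most specific Stage:

```python
import time

class Stage:
    def __init__(self, name, modules, routes=None):
        self.name, self.modules = name, modules
        self.routes = routes or {}           # classifier result -> next Stage
        self.account = 0.0                   # resources charged to this Stage

    def handle(self, packet):
        t0 = time.perf_counter()
        nxt = None
        for m in self.modules:
            nxt = m(packet)                  # last Module's result classifies
        packet["cost"] = packet.get("cost", 0.0) + time.perf_counter() - t0
        if nxt in self.routes:
            return self.routes[nxt].handle(packet)
        self.account += packet["cost"]       # terminal Stage settles the bill
        return packet

# Example wiring: an IP Stage that forwards UDP packets to a UDP Stage.
udp_stage = Stage("udp", [lambda p: None])
ip_stage = Stage("ip", [lambda p: p["proto"]], routes={"udp": udp_stage})
```

A real implementation would of course measure more than wall-clock time, but the accumulate-and-hand-on pattern is the same.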

Within each CLARA entity, packets pass through multiple Stages (Figure 3). A Stage is composed of a sequence of Modules, each of which encapsulates packet processing capabilities. Usually, the last Module in each Stage is a classifier that determines to which of several possible next Stages a packet should be sent. Typically, the number of choices increases as more information about the packet becomes known. For instance, in Figure 3, the IP Stage sends all UDP packets to the UDP Stage, and TCP packets to the TCP Stage. Moreover, the TCP and UDP Stages classify packets according to the service family they belong to, and send them to the appropriate Stage. Along the out-going path, a packet is transferred to increasingly generic Stages until it leaves the CLARA entity as a generic raw packet.

The CLARA software framework charges the resources utilized by a packet to the most specific class or service it belongs to. Towards this end, each Stage accumulates a packet's resource utilization while the packet is being processed. The cumulative resource utilization of a packet is then transferred, along with the packet, to its next Stage. In Figure 3, the resource utilization of a TCP/IP packet is transferred from the IP Stage to the TCP Stage, and then to the Stage associated with the particular TCP stream that the packet belongs to. After this point, any further resources that are utilized by the packet in Stages along its out-going path are accumulated at the Stage associated with its stream.

The ability to maintain such utilization accounts for individual streams within the CLARA software enables new network management policies. For instance, per-stream accounts can form the basis for associating a dollar cost with services rendered at a computing router. It is also possible to allocate portions of the available resources to different streams, which a cluster manager can enforce by penalizing streams with heavy resource usage. Another possibility is to aggregate the accounting information so that the cluster manager can ensure that the available resources are fairly allocated to all the services on a particular computing router. While we have not fully exploited the possible uses of resource accounting, we believe it is important to make accounting part of the basic software design.

Computational resources in CLARA are allocated on the occurrence of such triggering events as the arrival of a packet or the expiration of a timer. A Scheduler implements a strategy to allocate computation, while a Job implements a unit of work that must be triggered by an event. In CLARA, a Scheduler is itself a Job.
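The observation that a Scheduler is itself a Job is what lets schedulers nest. A minimal sketch, with class names and the weighting scheme of our own choosing:

```python
# Because a Scheduler exposes the same run() interface as a Job, schedulers
# can be composed, and each level can hand out a fixed share of the
# invocations it receives to its children.

class Job:
    def __init__(self, name, log):
        self.name, self.log = name, log
    def run(self):
        self.log.append(self.name)

class WeightedScheduler(Job):
    """Round-robin over (child, weight): a child gets `weight` slots per cycle."""
    def __init__(self, children):
        self.slots = [c for c, w in children for _ in range(w)]
        self.i = 0
    def run(self):
        self.slots[self.i % len(self.slots)].run()
        self.i += 1

log = []
# Hypothetical split: provider A bought 2/3 of this router's capacity, B 1/3.
root = WeightedScheduler([(Job("A", log), 2), (Job("B", log), 1)])
for _ in range(6):
    root.run()
# log is now ["A", "A", "B", "A", "A", "B"]
```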
This permits the creation of hierarchies of Schedulers, making it possible to divide the computational resources available on a computing router. For instance, a portion of the computational capacity of a computing router can be allocated to a service provider for a fee. The CLARA scheduling framework can then guarantee that streams belonging to that provider receive at least the portion of the computational capacity that has been allocated.

The CLARA cluster management system is specified using OMG-IDL [2]. A custom mini-ORB integrates the cluster management system with the CLARA software, which has mostly been designed for processing media streams with real-time requirements. The mini-ORB presents a one-way messaging service with a programming interface identical to that provided by a distributed object system like CORBA or Java RMI. This simplifies the evolution of the cluster management system, while avoiding the overheads associated with a complete CORBA ORB. Details of the customizable IDL compiler and the mini-ORB that we built can be obtained from [3].

B. Active Media Packet Format

[Figure 4: an active media packet — IP/UDP header (+ IP Router Alert), active header (+ magic pattern, version, service, unit identifier, end-of-unit), data header (+ type, length), and payload]

Fig. 4. Format of an Active Media packet

The format of an active media packet is shown in Figure 4. An incoming packet in our test-bed is deemed to be active if the IP Router Alert option is set and it contains the active media magic pattern. The active media header contains a service identifier that indicates the computational service to be performed on the packet, and a unit identifier that identifies packets belonging to the same media unit. Packets belonging to the same stream are identified by their source and destination IP addresses and port numbers, similar to the identification of a flow in RTP or RSVP. We chose UDP as the basic transport not only because it was easiest to integrate, but also because it allowed us to operate the JOURNEY network as an overlay over an existing network. However, we have also experimented with user-managed device-drivers and user-level protocol stacks in order to reduce data-copying overhead.
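Assuming illustrative field widths and a made-up magic value (the paper names the header fields but does not give their sizes), parsing the active header might look like:

```python
import struct

# ASSUMED layout for illustration only: 4-byte magic, 1-byte version,
# 2-byte service id, 4-byte unit id, 1-byte end-of-unit flag.
MAGIC = 0x4A524E59                           # hypothetical magic pattern
HDR = struct.Struct("!IBHIB")                # magic, version, service, unit, eou

def parse_active_header(payload: bytes):
    magic, version, service, unit_id, eou = HDR.unpack_from(payload)
    if magic != MAGIC:
        return None                          # not an active media packet
    return {"version": version, "service": service,
            "unit": unit_id, "end_of_unit": bool(eou)}
```

The `unit` field is what lets a dispatcher keep all packets of one media unit on the same computing element, and `end_of_unit` marks where the next MU may begin.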

Packet

MAC

IP

TCP

Payload

Fig. 5. Packet Programming Interface
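One way to render the organization of Figure 5 in code — a sketch with names of our own choosing, using zero-copy views onto the raw bytes:

```python
# The raw bytes are stored once; Descriptors are attached lazily as views
# into the buffer, so classification never copies the payload.

class Packet:
    def __init__(self, raw: bytes):
        self.raw = memoryview(raw)           # single copy of the data
        self.descriptors = {}

    def attach(self, name, offset, length):
        # A Descriptor is just a zero-copy window onto the raw bytes.
        self.descriptors[name] = self.raw[offset:offset + length]

p = Packet(b"\xaa" * 14 + b"\xbb" * 20 + b"payload")
p.attach("mac", 0, 14)                       # known as soon as the packet is read
p.attach("ip", 14, 20)                       # added once classified as IP
```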

A Packet provides the abstraction of a raw packet within the CLARA software framework; its organization is shown in Figure 5. Descriptors provide the programming interface to the various levels of semantic content of the packet data. Such Descriptors can be dynamically attached to the Packet as more information about the packet data becomes known. For instance, when a packet is read off the network interface, the only relevant information may be the MAC header, which therefore only requires a MAC Descriptor to be attached to the Packet. If it is then determined that the packet is UDP/IP, then the IP module may attach an IP Descriptor and a UDP Descriptor to the packet. This incremental process simplifies handling the packet within the CLARA software, not only in terms of the classification of incoming packets, but also the creation of outgoing packets, while avoiding the copying of payload data.

IV. EXPERIENCE AND EVALUATION

[Figure 6: a video server streams through an active network that transcodes the video for a small wireless client]

Fig. 6. MPEG-transcoding as a Network Service

It is often necessary to customize the presentation provided by a server to match personal user preferences or the capabilities of heterogeneous networks and display devices. In the scenario shown in Figure 6, the user wants to watch a news program on a small wireless device, but the media-server only stores the news program in a high-quality format. Because of the low-resolution screen of the wireless device, much of the information contained in the original video stream will be discarded. Researchers have addressed this problem in the context of an "expensive" wireless link or the "expensive" computation in a battery-powered terminal [4], [5], [6], [7]. Architectures consisting of media gateways that translate the high bit-rate video stream into a low bit-rate one have been proposed, often considering decompression on the terminal as well. Unfortunately, the reduction is mostly restricted to dropping frames, removing color, or stronger uniform compression. In contrast, a semantics-driven transcoder can often achieve the same reduction in resource consumption without sacrificing video quality. For instance, the video in Figure 6 consists of a newscaster in front of a decorative background. If the service is aware that the background is not essential to the content, cropping around the newscaster will usually reduce the data-rate considerably without reducing the "information rate" and, in turn, the user experience.
While this may sound futuristic in light of the state of the art in automatic video understanding, the unnecessary information is already available in the production process. Recent video encoding standardization efforts such as MPEG-7 [8] and MPEG-4 will not only allow the addition of meta-information like story-boards, but also preserve the layers of video that were available early in the production process. We envision these kinds of "smart" data-stream transformations becoming an integral feature of future multimedia applications.

Implementing dynamic media adaptation strategies requires processing within the semantic context of a presentation, utilizing such higher-level information as the relative importance of presence (IoP) of the various objects in the presentation. On-going efforts to augment media data with meta-information [8] can only aid such semantic transformation, while today's powerful general-purpose CPU's already provide the significant computational resources that are required to perform the processing in real-time. However, performing such transformations at the media client may be impossible, not only because of the current trend towards thin clients with minimal functionality, but also because of the problem of moving large amounts of data over a slow wireless link to mobile clients. Moreover, performing per-client customization at any specific server, proxy, or gateway can result in a scalability problem with a large number of clients.

Implementing application-level services in the network allows better management of resources both within the network and at the end hosts. For example, in the system illustrated in Figure 6, only a single MPEG stream need be stored at the video server for all the clients, irrespective of their capabilities and needs. This drastically reduces the server's storage requirements. A CLARA router can also be programmed to selectively drop non-critical MPEG frames on network congestion, without substantially affecting the end hosts. This was hitherto not possible because of the unavailability of semantic information to the routers.
The network model also enables bandwidth sharing, permitting a router to be programmed to multicast an MPEG stream to a group of clients with different capabilities, thus offloading processing requirements from the server. The performance of applications that do not need the transcoding service of the router remains unaffected, as the router only processes packets that have been marked as active.

A. The MPEG Transcoding Service

We utilized the CLARA software framework to implement an MPEG stream transcoding service. The transcoder [9] converts a high bit-rate MPEG-1 stream into a low bit-rate one in real-time. The implementation of the MPEG transcoder takes advantage of the new MMX hardware available in Intel Pentium processors. The transcoder's API allows spatial resolution, frame-rate, and bit-rate parameters to be changed dynamically, while the architecture permits the insertion of additional filters. We added a cropping filter whose parameters can also be changed dynamically.

[Figure 7: the JOURNEY test-bed — a media server streams video through three CLARA router clusters to a wireless client; left image: original newscast frame, right image: frame cropped around the news-reader and scaled]

Fig. 7. Overview of the JOURNEY Test-bed

In this experiment, a media server streams MPEG-1 video to a client, using UDP as the underlying transport (Figure 7). The client was a SONY Picturebook, which is a relatively powerful PDA with a small screen. The video stream was a newscast, where the news-reader occupies only a small part of the video image (left image of Figure 7). Although the presentation of such a stream on a work-station would usually be acceptable, its play-out on the Picturebook is unacceptable because the news-reader would be rendered too coarsely on the low-resolution screen of the PDA. To improve the presentation on the Picturebook, the stream was semantically transformed by first cropping around the image of the news-reader and then scaling to fit the screen (right image of Figure 7).

Our test-bed consisted of three computing router clusters, each composed of one routing element and two computing elements. The routers were connected as shown in Figure 7, and each utilized a greedy admission policy. On an unloaded system, the first cluster processed the entire stream. However, as "cross traffic" was injected, an automatic distribution of processing across all routers was observed. A round-robin dispatching algorithm was utilized for dispatching media units for processing amongst the computing elements on each cluster. This was sufficient, as the processing workload for the experiment was fairly uniform. However, a more sophisticated dispatching policy will be required for a cluster that supports a wider variety of services.

We performed a series of experiments to determine the performance of our test-bed on the prototypical MPEG stream transcoding service. Figure 8 shows traces for the stream's input bit-rate and the corresponding bit-rate of the transcoded output. The horizontal axis shows the GOP number, where each GOP consists of half a second of video (15 MPEG video frames displayed at 30 fps). The vertical axis gives the size in kilobytes of the GOPs. The input to the transcoder is a standard MPEG-1 video stream encoded at 1.2 Mbps (80 KB/GOP, 2 GOP/sec.). The output streams were obtained by adjusting a combination of spatial resolution, frame-rate, and/or re-quantization of DCT coefficients, depending on the targeted output bit-rate.

[Figure 8: GOP size (Kbytes) versus GOP number for a section of a typical MPEG stream — one curve for the input GOP size and one per target output rate (64, 128, 256, 350, and 512 Kbps)]

Fig. 8. Bit-rate Comparison for MPEG-transcoding

Figure 9 shows the corresponding transcoding times in milliseconds. We observe that the larger the input/output bit-rate ratio, the less time it takes to transcode. This is because larger input/output ratios can be achieved simply by frame dropping and/or spatial resolution adjustments, which are less computationally intensive than DCT re-quantization. For small-to-medium input/output ratios (less than 4), the transcoder has to re-quantize the DCT coefficients, and for that it has to process all information in the input stream, down to the macroblock level. In summary, we have observed that transcoding a GOP in the current JOURNEY test-bed takes between 150-250 ms for large input/output bit-rate ratios, and 350-400 ms for small input/output bit-rate ratios.

[Figure 9: transcode time (ms) versus GOP number for the same section of the MPEG stream — one curve per target output rate (64, 128, 256, 350, and 512 Kbps)]

Fig. 9. Processing Overhead for MPEG-transcoding

We have also observed that the total store-processing-forward service time for a GOP at a CLARA node is approximately double the processing time described above. This overhead is due to current implementation issues. A significant contribution comes from the user-space implementation of the routing engine, which introduces a large number of context switches per GOP (more than 10, but less than 100), associated with the I/O of all packets belonging to a GOP unit. A second contribution comes from the use of IP over Myrinet. Preliminary experiments have shown that the available interconnect throughput is about half of what we could get without using IP within the cluster. We believe that a CLARA cluster with a kernel implementation of its routing module and a raw Myrinet interconnect could halve the store-and-forward overhead of our current test-bed implementation.

V. RELATED WORK

As stated early in this paper, the goal of the JOURNEY project is to provide the infrastructure for deploying and utilizing customizable data-stream transcoding functionality. Towards this end, the CLARA architecture collocates computing and routing functionality in a single cluster-based network element. In this section, we position our work in the context of other research in the community.
The proxy architecture is widely utilized to customize server data to match personal user preferences or the capabilities of heterogeneous networks and display devices. The necessity of such customization is especially visible in mobile wireless environments. For instance, when a wireless terminal on a slow link retrieves multimedia content from a server that has only a high-quality version of the content, transcoding may be needed to convert the high bit-rate stream into a low bit-rate one; alternate frames can be dropped, or the stream can be re-quantized [10]. In TranSend [11], web images were dynamically distilled to match user preferences, while Hess et al. [4] proposed using a proxy to perform media transcoding to match a media stream to client resource availability. The MeGa project at the University of California, Berkeley [12], also utilizes a similar transcoding service, which must be located before any data is sent. The WAP architecture [13] also uses this approach to bridge the incompatibilities between the requirements of wireless applications and the mostly wireline infrastructure that is available. However, in all this work, the service usually must be located a priori, and is fixed for the duration of the stream. In contrast, JOURNEY only requires the service to be performed somewhere in the network, and the architecture does not even bind the service provider to any fixed active node in the network.

Researchers in Active Networking are also exploring alternative distributed computation models. Broadly, there are two approaches to active networking. At one end of the spectrum, network nodes are fully programmable, and active packets carry all the code that must be executed on them. The NetScript project at Columbia University [14], the ANTS system from MIT [15], and SwitchWare from the University of Pennsylvania [16] all aim towards this model. Most research in active networks is focused on this model, with the goal of providing an efficient and secure network environment. At the other end of the spectrum, active nodes are dynamically configurable to provide different customizable services, and active packets utilize some of these deployed services. The JOURNEY network model falls into this category, as does the previously mentioned MeGa project [12].

The security of fully programmable nodes is usually addressed by executing interpreted code in a secure environment. Correspondingly, many of the projects that focus on this model, such as SwitchWare [16], NetScript [14], and Smart Packets [17], adopt a language-based approach. In SwitchWare [16], every packet executes a program, and the system architecture provides the framework for adding complex functionality in support of packet programs.
NetScript [14] provides an abstraction of a programmable networking environment, while the Active IP Option [18] proposed by Tennenhouse and Wetherall utilizes packets carrying Tcl code fragments that are executed at the active nodes. While the problems associated with such approaches are clearly hard, we believe the performance penalty associated with the required interpretive environments is currently unacceptable for the application domains we are focusing on. This has also been observed by researchers working on PAN [19] which used interpreted Java byte code. Directly using precompiled native code, on the other hand, compromises both security at the active node, and portability of the active packet. The active services approach seeks to provide a solution to this domain of applications, and both J OURNEY and MeGa take this view. In [20], Tennenhouse and Wetherall describe a capsule

model in which each packet carries code for forwarding and processing data. Each packet is therefore independent, and no state is shared between packets of the same stream. With such a scheme, the loss of a packet will not prevent others from being correctly processed. While such an approach can be easily implemented under the JOURNEY model by including all customization parameters in every MU, we have also found that the ability to send independent control packets upstream is extremely useful. The advantage of such messages is that a server does not need to keep track of the customization parameters for all possible receivers. For instance, in the MPEG transcoding example, the client can independently customize the datastream according to user preferences, as can a management process within the network, in order to handle network conditions such as congestion. As a result, we prefer to utilize the soft-state built by out-of-band control messages.
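The soft-state mechanism described above can be sketched as follows. This is an illustrative sketch only, not the CLARA implementation: the class and method names, the TTL value, and the `transcode` placeholder are all hypothetical. An out-of-band control packet installs or refreshes per-stream customization parameters at an active node; data packets are transformed using whatever state is currently held, and are simply forwarded unmodified once the state has expired, so the server never tracks receivers and lost control packets degrade gracefully.

```python
import time

STATE_TTL = 30.0  # hypothetical lifetime of installed soft state, in seconds


class ActiveNode:
    """Sketch of a JOURNEY-style active node using soft state.

    Control packets (sent upstream, e.g. by the client or by a network
    management process) install per-stream parameters; data packets (MUs)
    are processed with the currently installed state, if any.
    """

    def __init__(self):
        self._state = {}  # stream_id -> (params, expiry time)

    def on_control(self, stream_id, params):
        # Each control message installs or refreshes the soft state;
        # periodic refreshes keep the state alive.
        self._state[stream_id] = (params, time.monotonic() + STATE_TTL)

    def on_data(self, stream_id, payload):
        entry = self._state.get(stream_id)
        if entry is None or entry[1] < time.monotonic():
            # No state, or state expired: forward the packet unmodified.
            self._state.pop(stream_id, None)
            return payload
        params, _ = entry
        return transcode(payload, params)


def transcode(payload, params):
    # Placeholder for the real transformation (e.g. MPEG transcoding);
    # here it just truncates the payload to a hypothetical size limit.
    return payload[: params.get("max_bytes", len(payload))]
```

Because the state is soft, a client that changes its preferences simply sends a new control packet, and a node that never receives one behaves as a plain forwarder, which matches the best-effort character of the JOURNEY model.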

VI. CONCLUSIONS

In this paper, we first outlined the JOURNEY network model for providing computation as a scalable network service. The computation model trades off hard guarantees for computation in favor of architectural simplicity. We then described CLARA, an architecture for a cluster-based computing router that can be utilized to perform computation in a JOURNEY network. Besides being capable of high-performance conventional routing, CLARA also scales to provide significant computational power. Moreover, the architecture supports dynamic installation of transformation functionality, accounting of resource utilization, and the ability to partition the computational capacity of the router. We evaluated the prototype we built in the context of real-time transcoding of MPEG video. Architecturally, our approach to performing computation as a network service is unique in that it does not fix the node at which the computation is performed. Moreover, there is no guarantee that all packets are processed before reaching their destination. This introduces a probabilistic aspect to computation that hitherto was only associated with networking. We are currently developing an analytical model of this behavior by leveraging known results of network theory. Future experiments with our test-bed will include studying the performance of the admission control and routing mechanisms at different traffic loads.

REFERENCES

[1] Myrinet Overview, Jan. 2000, http://www.myrinet.com.
[2] Object Management Group, Inc., The Common Object Request Broker: Architecture and Specification, Sept. 1996, Document PTC/96-08-04, Revision 2.0.
[3] Girish Welling and Maximilian Ott, "Customizing IDL mappings and ORB protocols," in Middleware 2000, Lecture Notes in Computer Science, J. Sventek and G. Coulson, Eds., vol. 1795, Springer-Verlag, Apr. 2000.
[4] Christopher K. Hess, David Raila, Roy H. Campbell, and Dennis Mickunas, "Design and performance of MPEG video streaming to palmtop computers," in Proceedings of Multimedia Computing and Networks 2000, San Jose, CA, USA, Jan. 2000.
[5] Armando Fox, Steven D. Gribble, Eric A. Brewer, and Elan Amir, "Adapting to network and client variability via on-demand dynamic distillation," in Proceedings of the 7th Intl. Conference on Architectural Support for Programming Languages and Operating Systems, Cambridge, MA, USA, Oct. 1996.
[6] Akihiro Hokimoto, Kuniaki Kurihara, and Tatsuo Nakajima, "An approach for constructing mobile applications using service proxies," in Proceedings of the 16th Intl. Conference on Distributed Computing Systems, Hong Kong, June 1996.
[7] Anthony Joseph, Joshua Tauber, and M. Frans Kaashoek, "Mobile computing with the Rover toolkit," IEEE Transactions on Computers: Special Issue on Mobile Computing, Feb. 1997.
[8] Overview of the MPEG-7 Standard, Dec. 1999, http://drogo.cselt.stet.it/mpeg/standards/mpeg-7/mpeg-7.htm.
[9] Y. Senda and H. Harasaki, "A realtime software MPEG transcoder using a novel motion-vector reuse and a SIMD optimization," in Proceedings of the IEEE Intl. Conference on Acoustics, Speech and Signal Processing, Phoenix, AZ, USA, Mar. 1999.
[10] Suresh Gopalakrishnan, Daniel Reininger, and Maximilian Ott, "Realtime MPEG system transcoder for heterogeneous networks," in Proceedings of the Intl. Packet Video Workshop, Columbia University, New York, USA, Apr. 1999.
[11] A. Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, and Paul Gauthier, "Cluster-based scalable network services," in Proceedings of the 16th ACM Symposium on Operating Systems Principles, Saint-Malo, France, Oct. 1997.
[12] Elan Amir, Steven McCanne, and Randy Katz, "An active service framework and its application to real-time multimedia transcoding," in Proceedings of ACM SIGCOMM '98, Vancouver, BC, Canada, Sept. 1998.
[13] Wireless Application Protocol Forum Ltd., 1999, http://www.wapforum.org.
[14] Y. Yemini and S. da Silva, "Towards programmable networks," in Proceedings of the IFIP/IEEE Intl. Workshop on Distributed Systems, Operation and Management, L'Aquila, Italy, Oct. 1996.
[15] David J. Wetherall, John V. Guttag, and David L. Tennenhouse, "ANTS: A toolkit for building and dynamically deploying network protocols," in Proceedings of IEEE OPENARCH '98, San Francisco, CA, USA, Apr. 1998.
[16] D. Scott Alexander, William A. Arbaugh, Michael W. Hicks, Pankaj Kakkar, Angelos D. Keromytis, Jonathan T. Moore, Carl A. Gunter, Scott M. Nettles, and Jonathan M. Smith, "The SwitchWare active network architecture," IEEE Network, May/June 1998.
[17] A. W. Jackson and C. Partridge, Smart Packets, Mar. 1997, slides from the 2nd Active Nets Workshop, http://www.nettech.bbn.com/smtpkts/baltimore/index.htm.
[18] David L. Tennenhouse and David J. Wetherall, "The active IP option," in Proceedings of the 7th ACM SIGOPS European Workshop, Connemara, Ireland, Sept. 1996.
[19] Erik L. Nygren, Stephen J. Garland, and M. Frans Kaashoek, "PAN: A high-performance active network node supporting multiple mobile code systems," in Proceedings of IEEE OPENARCH '99, Mar. 1999.
[20] David L. Tennenhouse, Jonathan M. Smith, W. David Sincoskie, David J. Wetherall, and Gary J. Minden, "A survey of active network research," IEEE Communications Magazine, Jan. 1997.