
November 2007 (vol. 8, no. 11), art. no. 0711-mds2007110003 1541-4922 © 2007 IEEE Published by the IEEE Computer Society

Works in Progress

Middleware 2007
Edward Curry • WiP Papers Chair, Middleware 2007 Conference

The 8th International Middleware Conference (http://middleware2007.ics.uci.edu/), taking place in late November in Newport Beach, CA, aims to be the premier conference on middleware research and technology in 2007. The conference covers the design, implementation, deployment, and evaluation of distributed systems platforms and architectures for future computing environments. These work-in-progress articles, which the conference program committee chose from a large selection of excellent submissions, describe ongoing work and interim results.

Edward Curry is a researcher in the Digital Enterprise Research Institute at the National University of Ireland, Galway. Contact him at [email protected].

The GloServ Service-Discovery System: Combining Ontology Queries with Keyword Search
Knarig Arabshian and Henning Schulzrinne • Columbia University

Current service-discovery systems use simple attribute-value pair matching, which limits the results to exact matches. They also don’t scale well and are limited to local area networks (LANs). However, as more services become available, we’ll need service discovery in a wider area network, so network scaling will become an issue. The proliferation of services also means systems will need to discover relationships between services and perform intelligent query matching. Systems might also have to handle frequent service updates for dynamic services.

To address these problems, we developed GloServ, a global service-discovery system. GloServ uses the Description Logic sublanguage of the W3C Web Ontology Language (OWL) to classify services in an ontology. It also maps the knowledge obtained from the ontology onto a hybrid hierarchical peer-to-peer (P2P) network. GloServ operates in wide area networks and LANs and supports a large range of services that are aggregated and classified in ontologies. Organizing services in an ontology lets users search for general categories of services and then narrow their search to specific services. Figure 1 gives an overview of how GloServ finds servers with the appropriate services.
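The idea of searching a general category and then narrowing the results can be illustrated with a minimal Python sketch. It is not GloServ's implementation: the toy class hierarchy, the `ServiceRecord` type, and the `search` function are hypothetical, and a plain subclass walk stands in for the OWL Description Logic reasoning that the article describes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical toy ontology: each class maps to its direct subclasses.
SUBCLASSES: Dict[str, List[str]] = {
    "Service": ["Restaurant", "Travel"],
    "Restaurant": ["Cafe", "FastFood"],
    "Cafe": [],
    "FastFood": [],
    "Travel": [],
}


@dataclass
class ServiceRecord:
    name: str
    service_class: str      # class of the toy ontology the service belongs to
    description: str = ""   # free text used by the keyword filter


def related_classes(cls: str) -> List[str]:
    """Return cls followed by all of its descendants in the toy ontology."""
    found = [cls]
    for child in SUBCLASSES.get(cls, []):
        found.extend(related_classes(child))
    return found


def search(records: List[ServiceRecord], category: str,
           keyword: Optional[str] = None) -> List[str]:
    """Ontology step: keep services whose class falls under `category`.
    Keyword step (optional): keep those whose description contains `keyword`."""
    classes = set(related_classes(category))
    hits = [r for r in records if r.service_class in classes]
    if keyword is not None:
        hits = [r for r in hits if keyword.lower() in r.description.lower()]
    return [r.name for r in hits]
```

Calling `search(records, "Restaurant")` returns every service classified under Restaurant or one of its subclasses, and adding a keyword narrows that set by free-text description.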


Figure 1. GloServ’s five-step process for finding the appropriate services: (1) A query for “café” comes into one of the high-level servers and (2) is mapped to the Restaurant class. (3) The server forwards it to the high-level server that handles restaurant services. (4) The restaurant ontology is converted to a form, which the user fills out. The data within the form is converted to a first-order logic query, which is classified within the Restaurant ontology to find all classes related to the query. (5) The query is then forwarded to the servers that handle those classes and results are sent to the user.

The motivation behind using ontologies for service discovery rather than simple attribute-value representations of data, such as in traditional databases, is their reasoning power. For example, the following query would be much easier to answer using an ontology: “Given a service class, find all related matches to my query.” Systems can share ontologies, services can reuse the ontologies’ knowledge, and we can change ontologies to easily add new classes and establish new relationships between classes.

Furthermore, a hybrid hierarchical P2P architecture can efficiently distribute dynamic services. We envision GloServ handling all types of services, including service-description updates in real time. This will help resolve the network scaling issue for data replicated across servers.

Also, combining ontology and keyword queries gives GloServ more flexibility and accuracy in obtaining query results. Performing pure ontology queries limits a service description and query to the service ontology definition, but it lets the system infer relationships between classes so that when a query is classified, all classes related to the query can be found. On the other hand, although text queries let service providers add all types of descriptions, the system can only approximate the results, which might not be what the users want. So, combining these two types of queries lets us reap the benefits of both mechanisms.

Knarig Arabshian is a PhD candidate in Columbia University’s Department of Computer Science. Contact her at [email protected]. Henning Schulzrinne is a professor and chair in Columbia University’s Department of Computer Science. Contact him at [email protected].



SOVoIP: Middleware for Universal VoIP Connectivity
Mohammed Jubaer Arif and Shanika Karunasekera • University of Melbourne
Santosh Kulkarni • National Information and Communications Technology Australia, Victoria Research Laboratories

Voice over Internet Protocol (VoIP) has changed how people communicate by enabling telephony services over the Internet. The main technical difference between conventional telephony and VoIP is that VoIP uses a packet-switched network, the Internet, instead of a circuit-switched network. This has resulted in VoIP-specific protocols that aren’t interoperable or compatible with existing Internet protocols. Client development for these solutions is difficult, because developers must implement specific protocol stacks on the client end. Changes in these protocols’ behaviors require upgrading all the clients, which involves significant work and cost.

VoIP extends not only across geographical boundaries but also over various hardware and software platforms. Providers and users don’t always use the same platform. Moreover, users might need to stick to their deployed protocols for specific business reasons even though interoperability with the external world and other systems is required.

Web services can solve these problems. We propose a Web-service-based P2P architecture for VoIP called Service-Oriented VoIP (SOVoIP; see figure 2). SOVoIP provides interoperability between protocols from both telephony and data networks using the converging behavior of Web services while ensuring security, extendability, and mobility. We also address other critical issues related to VoIP such as Network Address Translation and firewall traversal, Enhanced 911, and the Communication Assistance for Law Enforcement Act. SOVoIP also provides modularity and reusability, making client development easier. Extendability in the Web service architecture is transparent to the client, so frequent client updates aren’t required in order to consume new features. This reduces the time and money spent on upgrades.

Figure 2. The Web-service-based Service-Oriented VoIP architecture.



SOVoIP thus has several advantages over other VoIP architectures. Skype is one of the most popular VoIP architectures. Its clients are heavy (for example, they consume a lot of memory) and form a P2P network. Skype uses a proprietary protocol, encrypts its messaging, and uses TCP for call setup. Although SOVoIP also follows a P2P architecture for call establishment, it first invokes a user-search method on the server (see figure 3).

Figure 3. Call establishment in the SOVoIP architecture.

H.323, an umbrella recommendation for VoIP from the International Telecommunication Union, also uses TCP for call setup. However, H.323’s binary encoding and requirement for extra network configuration make it more complex than SOVoIP, which doesn’t require extra hardware in a domain for call setup. Compared to H.323, the Session Initiation Protocol (SIP) from the Internet Engineering Task Force is lightweight and simple, but it demands network configuration in each domain. SIP’s text-based messaging makes message implementation easier. It works on both the User Datagram Protocol and TCP. Both H.323 and SIP use the Real-time Transport Protocol (RTP) for media communication, but these protocols aren’t interoperable with each other or with other Internet protocols.



Other VoIP protocols merely introduce new protocols for voice over the Internet, but SOVoIP proposes a unified platform via Web services. Different protocols from telephony and the Internet can join SOVoIP yet retain their protocol-specific behavior. Studies have tried to introduce middleware to solve the problems we’ve mentioned, but they lack important VoIP functionalities such as the ability to track a user’s physical location during an emergency.1,2

References
1. M. Hillenbrand and G. Zhang, “A Web Service Based Framework for Voice over IP,” Proc. 30th Euromicro Conf., ACM Press, 2004, pp. 258–264.
2. W. Chou, L. Li, and F. Liu, “WIP Web Service Initiation Protocol for Multimedia and Voice Communication over IP,” Proc. IEEE Int’l Conf. Web Services (ICWS 06), IEEE CS Press, 2006, pp. 515–522.

Mohammed Jubaer Arif is a PhD student in the University of Melbourne’s Department of Computer Science and Software Engineering. Contact him at [email protected]. Shanika Karunasekera is a senior lecturer in the University of Melbourne’s Department of Computer Science and Software Engineering. Contact him at [email protected]. Santosh Kulkarni is a research fellow at the National ICT Australia, Victoria Research Laboratories. Contact him at [email protected].

EVEY: Enhancing Privacy in Pervasive Service Discovery
Roberto Speicys Cardoso, Sonia Ben-Mokhtar, and Valérie Issarny • INRIA
Aitor Urbieta • Mondragon Unibertsitatea

Service discovery is critical to maintaining the privacy of clients and service providers. Individuals often disclose personal information when looking for services, or intruders can deduce this information from historical discovery data, so service-discovery protocols need to protect it from illegitimate access. Privacy-aware service-discovery protocols, however, must still be flexible and scalable so that they can support the requirements of service-oriented pervasive computing.

We present EVEY (Enhancing Privacy of Service Discovery), a privacy-aware service-discovery protocol that supports syntactic and semantic matching between service requests and advertisements. It protects private information related to service discovery by introducing ambiguity in both service advertisements and service requests so that they can represent a group of services instead of a single, specific service instance.

To enhance the privacy of syntactic service descriptions, EVEY uses a hash function with weak collision resistance to hash service-description element names. Consider a hash function h(). If the function provides weak collision resistance, it’s relatively easy to find distinct values x and y such that h(x) = h(y). In EVEY, instead of containing a category name (such as carDealer), requests and advertisements contain the hash of the category name (h(“carDealer”)), which can be the same as the hash of other service categories.

Semantic service descriptions, on the other hand, are privacy-enhanced by encoding the ontologies used to create the service description and replacing semantic annotations with the codes of the concepts in the ontology. This approach has already been proposed to speed up semantic reasoning during service discovery.1,2 EVEY, however, encrypts ontology names in advertisements and omits them in requests, so the same code can be found in different ontologies, and a single privacy-enhanced service description can represent many concrete services.

EVEY uses privacy-enhanced service descriptions in all phases of service discovery, namely publication, location, matching, and selection (see figure 4). Because service descriptions are ambiguous, deducing user activities from discovery data is complicated. This approach extends existing privacy-enhanced service-discovery protocols3,4 by reducing trust requirements on service directories and by supporting not only syntactic but also semantically annotated service descriptions.
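The effect of hashing category names with a deliberately collision-prone function can be sketched as follows. This is an illustration only: the article does not say which hash function EVEY uses, so truncating SHA-256 to a few bits is an assumption, and the names `ambiguous_hash`, `matches`, and `anonymity_set` are hypothetical.

```python
import hashlib
from typing import List


def ambiguous_hash(name: str, bits: int = 4) -> int:
    """Hash a category name into only 2**bits codes so collisions are common.

    Assumption: a truncated SHA-256 stands in for EVEY's actual hash function.
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % (1 << bits)


def matches(request_category: str, advertised_code: int, bits: int = 4) -> bool:
    """A request matches an advertisement when their hashed categories agree."""
    return ambiguous_hash(request_category, bits) == advertised_code


def anonymity_set(code: int, known_categories: List[str], bits: int = 4) -> List[str]:
    """All known categories an observer cannot tell apart from the given code."""
    return [c for c in known_categories if ambiguous_hash(c, bits) == code]
```

With `bits=4` there are only 16 possible codes, so any list of 17 or more category names must contain at least one collision. An observer who sees a code learns only that the request belongs to one of the categories in its anonymity set; fewer bits give more ambiguity at the cost of more false matches.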

Figure 4. Privacy-enhanced service publication, location, matching, and selection.

References
1. S. Ben Mokhtar et al., “Efficient Semantic Service Discovery in Pervasive Computing Environments,” Proc. ACM/IFIP/Usenix 7th Int’l Middleware Conf., Springer, 2006, pp. 240–259.
2. I. Constantinescu and B. Faltings, “Efficient Matchmaking and Directory Services,” Proc. IEEE/WIC Int’l Conf. Web Intelligence (WI 03), IEEE CS Press, 2003, pp. 75–81.
3. F. Zhu et al., “Expose or Not? A Progressive Exposure Approach for Service Discovery in Pervasive Computing Environments,” Proc. 3rd IEEE Int’l Conf. Pervasive Computing and Communications (Percom 05), IEEE CS Press, 2005, pp. 225–234.
4. S.E. Czerwinski et al., “An Architecture for a Secure Service Discovery Service,” Proc. 5th Ann. ACM/IEEE Int’l Conf. Mobile Computing and Networking (MobiCom 99), ACM Press, 1999, pp. 24–35.

Roberto Speicys Cardoso is a PhD candidate at INRIA. Contact him at [email protected]. Sonia Ben-Mokhtar is a PhD candidate at INRIA. Contact her at [email protected]. Aitor Urbieta is a PhD candidate at Mondragon Unibertsitatea. Contact him at [email protected]. Valérie Issarny is a research director at INRIA. Contact her at [email protected].



Bandwidth Adaptive Dissemination: The Case for BAD Trees
Manos Kapritsos and Peter Triantafillou • University of Patras

Bandwidth Adaptive Dissemination (BAD) is an application-level multicast infrastructure. BAD improves the performance of multicast dissemination trees, in both static and dynamic environments, where the network links’ effective bandwidth changes over time. Its main goal is to improve the data rate that end users experience during a multicast operation. BAD can be used to create and manage multicast groups. It can be deployed over any distributed hash table (DHT), retaining its ability to improve bandwidth.

BAD consists of a suite of algorithms that support node joins and leaves, distribute bandwidth to heterogeneous nodes, and rearrange trees, while the overhead remains low. BAD rearranges the multicast tree to improve its performance, taking into consideration the nodes’ bandwidth capacities. Specifically, nodes try to reposition themselves under other nodes that provide them with better bandwidth from the tree’s root. Nodes keep information about specific other nodes on the tree that they can probe at predefined time intervals. BAD uses several techniques to reduce the number of messages sent, including keeping logarithmic state in each node.

Moreover, BAD introduces the idea of a SuperRing, a separate DHT where all high-bandwidth nodes participate (see figure 5). Join messages are first routed through the SuperRing to ensure that the tree’s root is always a high-bandwidth node. That way, BAD performs much better than conventional multicast systems.
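The repositioning step can be sketched in Python under simplifying assumptions that are not taken from the article: each node has a single `uplink` rate that it can forward to every child, fan-out limits and bandwidth splitting among children are ignored, and the candidate list stands in for the nodes that a real BAD node would probe. The names `Node`, `rate_from_root`, and `reposition` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    uplink: float                     # assumed rate this node can forward to a child
    parent: Optional["Node"] = None   # None marks the tree's root


def rate_from_root(node: Node) -> float:
    """Bottleneck rate on the path from the root down to `node`."""
    rate = float("inf")
    current = node.parent
    while current is not None:
        rate = min(rate, current.uplink)
        current = current.parent
    return rate


def rate_under(candidate: Node) -> float:
    """Rate a node would receive if it attached directly beneath `candidate`."""
    return min(rate_from_root(candidate), candidate.uplink)


def in_subtree(node: Node, subtree_root: Node) -> bool:
    """True if `node` is `subtree_root` or one of its descendants."""
    current: Optional[Node] = node
    while current is not None:
        if current is subtree_root:
            return True
        current = current.parent
    return False


def reposition(node: Node, candidates: List[Node]) -> float:
    """Move `node` under the candidate offering the best rate, if any beats
    its current position. Candidates inside node's own subtree are skipped
    to avoid creating a cycle. Returns the rate after the move."""
    best, best_rate = node.parent, rate_from_root(node)
    for candidate in candidates:
        if in_subtree(candidate, node):
            continue
        rate = rate_under(candidate)
        if rate > best_rate:
            best, best_rate = candidate, rate
    node.parent = best
    return best_rate
```

In this toy model a leaf stuck behind a slow parent moves under a faster sibling of that parent as soon as the faster node appears among its candidates, which is the behavior the paragraph above describes in prose.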

Figure 5. Routing a message using the SuperRing.



We’ve implemented BAD in the FreePastry system. A detailed performance evaluation reveals BAD’s efficiency and low overhead. Specifically, our experiments show that the improvement of the minimum bandwidth (compared to the Scribe system) ranges from 40 to 1,400 percent, and the improvement of the average bandwidth ranges from 160 to 250 percent (see figure 6). The main reason for BAD’s superior performance is that it takes into account node heterogeneity. Moreover, BAD adapts to changing network conditions, thereby avoiding bottlenecks. Finally, the SuperRing guarantees that the tree’s performance will never decrease significantly because of a low-bandwidth root.

Figure 6. Improvement of bandwidth using BAD (compared to using the Scribe system). The y axis shows a level (rather than a percentage) of improvement; for example, the minimum bandwidth is four to 14 times the level of improvement.

Manos Kapritsos is a graduate student pursuing his master's degree at the University of Patras. Contact him at [email protected]. Peter Triantafillou is a full professor at the University of Patras. Contact him at [email protected].

A Middleware-Based Architecture for Low-Cost UAVs
Juan López Rubio • Technical University of Catalonia

An unmanned aerial vehicle (UAV) is a nonpiloted airplane designed to operate in dangerous and repetitive situations. With the advent of UAV civil applications, UAVs are emerging as a valid option in commercial scenarios. For this platform to be economically viable, it should implement a variety of missions with little reconfiguration time and overhead.

At the Technical University of Catalonia, we’ve developed a middleware-based architecture specially suited to operate as a flexible payload and mission controller for UAVs. The system comprises low-cost computing devices connected by a network. The functionality is divided into reusable services distributed over several nodes with a middleware managing their life cycle and communication.

Previous research in this area has focused on the control domain and its real-time operation. Our proposal differs in that we address the implementation of adaptable and reconfigurable unmanned missions in low-cost and low-resource hardware. Also, most middleware promotes a distributed-computing paradigm; however, our target application, UAV avionics, suggests a global-data-space approach. In this environment, most communicating components are sensors that spread their data samples to several controlling components (see figure 7). These components evaluate the data from several sources and again send control data to many actuator components.

Figure 7. UAV service architecture.

In our architecture, the middleware takes the form of a service container that’s unique in each node of the distributed-system network (see figure 8). The service containers execute and manage services and provide common functionalities that the services contain (such as network access, local message delivery, and name resolution and caching).

The key benefit is that services are entirely decoupled. Little design time has to be spent on how to handle their mutual interactions. In particular, the services never need information about the other participating services, including their existence or locations. The container automatically handles all aspects of message delivery, without requiring any intervention from the services, including determining who should receive the messages, where recipients are located, and what happens if messages can’t be delivered.

Figure 8. UAV services distributed over several containers.

For the specific characteristics of a UAV mission, which might have many interacting systems, we propose a solution based on the Data Distribution Service (DDS) paradigm. DDS promotes a publish/subscribe model for sending and receiving data, events, and commands among the services. Many DDS frameworks have already been developed, each one contributing new primitives for such an open communication scenario. In our proposal, we implement only the communication primitives required by a minimal distributed embedded system to keep the system simple and soft real-time compliant.
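A minimal Python sketch can illustrate how a publish/subscribe container decouples services. It is not the authors' C# prototype or any DDS API: the `ServiceContainer` class, its methods, and the topic names are hypothetical, and delivery is local and synchronous, whereas the container described above also handles the network.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[[Any], None]


class ServiceContainer:
    """Toy in-process container: services interact only through named topics."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a service callback for every sample published on `topic`."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, sample: Any) -> int:
        """Deliver `sample` to all current subscribers of `topic`.

        The publisher does not know who the subscribers are or whether any
        exist. Returns the number of deliveries."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(sample)
        return len(handlers)
```

For example, a GPS sensor service could call `container.publish("gps.position", (41.39, 2.11))` while a flight-plan service and a camera-control service each register a handler with `container.subscribe("gps.position", ...)`. None of the three services holds a reference to another, so any of them can be replaced without touching the rest.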



Our current minimalist prototype is based on Microsoft C# and has 36 classes and fewer than 1,500 lines of code. We’re performing a functional analysis with several avionics use cases; then we’ll test performance and soft real-time compliance. The service model and its communication primitives have demonstrated that they’re flexible and simple enough to easily migrate existing UAV applications into reusable services.

Juan López Rubio is an assistant professor at the Technical University of Catalonia. Contact him at [email protected].

Resource Discovery in Federated Systems with Voluntary Sharing
Hao Yang, Fan Ye, and Zhen Liu • IBM T.J. Watson Research Center

The federated system is a popular paradigm for supporting large-scale, distributed applications in Web services, Grid computing, and stream processing.1 Such a system usually consists of loosely coupled autonomous systems that different administrative authorities manage. The participating entities or organizations, motivated by some common interest, are willing to share their resources (for example, computing, storage, and data sources) for mutual benefits that they can’t achieve individually.

A desirable feature for resource discovery in federated systems is voluntary sharing. The participants usually want to preserve control of their resources so that they can determine the type of access they want to grant, which services and participants to grant access to, and under what conditions. They don’t want to blindly publicize all their resources or allow just any other participant to store and serve their resource information.

There are currently two main approaches to resource discovery: hierarchical and peer-to-peer. Hierarchical resource discovery is inefficient owing to the flooding of the hierarchy; P2P resource discovery lacks management flexibility because of deterministic hash functions. As such, neither approach lends itself well to federated systems.

To address these problems, we propose a replication-overlay-enhanced hierarchy. Servers belonging to different organizations form a hierarchy (see figure 9) by voluntarily associating with each other. Instead of publicizing their original resource information, they export summaries that are condensed representations of their resource records. The servers periodically propagate and aggregate the summaries in a bottom-up manner in the hierarchy. When a server receives a query from a client, it can compare the query to the summaries and direct the client down relevant branches of the hierarchy, eventually leading to matching servers. The use of summaries avoids flooding the hierarchy because they allow informed forwarding decisions based on multiple attributes in the query.
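Summary-based forwarding can be sketched in Python under assumptions the article does not state: a summary is taken to be a per-attribute (min, max) range, queries are per-attribute ranges, and summaries are recomputed on demand rather than propagated periodically. The `Server` class and the `may_match` and `find` functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Range = Tuple[float, float]


@dataclass
class Server:
    name: str
    records: List[Dict[str, float]] = field(default_factory=list)
    children: List["Server"] = field(default_factory=list)

    def summary(self) -> Dict[str, Range]:
        """Condensed view: per-attribute (min, max) over this server's own
        records and its children's summaries (bottom-up aggregation)."""
        sources = [{k: (v, v) for k, v in rec.items()} for rec in self.records]
        sources += [child.summary() for child in self.children]
        merged: Dict[str, Range] = {}
        for source in sources:
            for attr, (lo, hi) in source.items():
                old_lo, old_hi = merged.get(attr, (lo, hi))
                merged[attr] = (min(old_lo, lo), max(old_hi, hi))
        return merged


def may_match(summary: Dict[str, Range], query: Dict[str, Range]) -> bool:
    """False only if no record below can satisfy every attribute of the query."""
    return all(
        attr in summary and summary[attr][0] <= hi and lo <= summary[attr][1]
        for attr, (lo, hi) in query.items()
    )


def record_matches(record: Dict[str, float], query: Dict[str, Range]) -> bool:
    return all(a in record and lo <= record[a] <= hi for a, (lo, hi) in query.items())


def find(server: Server, query: Dict[str, Range]) -> List[str]:
    """Descend only into branches whose summary may match the query."""
    if not may_match(server.summary(), query):
        return []  # the whole branch is pruned instead of being flooded
    hits = [server.name] if any(record_matches(r, query) for r in server.records) else []
    for child in server.children:
        hits += find(child, query)
    return hits
```

Range summaries can produce false positives (a branch is visited and nothing matches) but never false negatives, so pruning is safe. The original records stay on the owning server; only the condensed ranges travel up the hierarchy.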

Figure 9. The hierarchy of servers with replication overlays. Each participant (resource owner) exports summaries to support resource discovery. Nodes D1, D2, C2, and B2 form one replication overlay. The search from the client can start at any node and be directed to corresponding servers.
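One way to read the replication overlay in figure 9 is that a node stores the summaries of its siblings and of its ancestors' siblings. The Python sketch below computes that set for a hypothetical tree; the tree shape is an assumption chosen to be consistent with the node names in the caption, not a copy of the figure, and `replication_set` is a hypothetical name.

```python
from typing import Dict, List

# Hypothetical hierarchy: parent -> children.
CHILDREN_OF: Dict[str, List[str]] = {
    "A": ["B1", "B2"],
    "B1": ["C1", "C2"],
    "C1": ["D1", "D2"],
}
# Reverse index: child -> parent.
PARENT_OF: Dict[str, str] = {
    child: parent for parent, kids in CHILDREN_OF.items() for child in kids
}


def replication_set(node: str,
                    parent_of: Dict[str, str],
                    children_of: Dict[str, List[str]]) -> List[str]:
    """Nodes whose summaries `node` replicates: its own siblings, then the
    siblings of each ancestor, walking upward until the root is reached."""
    replicated: List[str] = []
    current = node
    while current in parent_of:
        parent = parent_of[current]
        replicated.extend(c for c in children_of[parent] if c != current)
        current = parent
    return replicated
```

For D1 in this tree the function returns D2, C2, and B2. Together those summaries cover every branch that D1 is not part of, which is why a search can begin at D1 without first climbing to the root.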



To improve efficiency and robustness, each node replicates the summaries of its siblings and ancestors’ siblings. The search can start from any node, speeding up the search and eliminating any single point of failure in the hierarchy. To support the discovery of up-to-date dynamic resources, a consistency control mechanism automatically tunes the update period among servers and ensures the timeliness of summaries is within a given threshold.

Evaluation of our system’s performance reveals that it provides efficient resource discovery while preserving voluntary sharing. It achieves an order-of-magnitude less message overhead than a similar system2 and provides response times comparable to a centralized system.

References
1. M. Branson et al., “Clasp: Collaborating, Autonomous Stream Processing Systems,” to appear in Proc. ACM/IFIP/Usenix 8th Int'l Middleware Conf. (Middleware 2007), LNCS 4834, Springer, 2007.
2. D. Oppenheimer et al., “Design and Implementation Tradeoffs for Wide-Area Resource Discovery,” Proc. IEEE High Performance Distributed Computing (HPDC 05), IEEE Press, 2005, pp. 113–124.

Hao Yang is a research staff member at the IBM T.J. Watson Research Center. Contact him at [email protected]. Fan Ye is a research staff member at the IBM T.J. Watson Research Center. Contact him at [email protected]. Zhen Liu is a senior manager at the IBM T.J. Watson Research Center. Contact him at [email protected].

Related Links
• DS Online's Middleware community

Cite this article: Knarig Arabshian, Henning Schulzrinne, Mohammed Jubaer Arif, Shanika Karunasekera, Santosh Kulkarni, Roberto Speicys Cardoso, Sonia Ben-Mokhtar, Valérie Issarny, Aitor Urbieta, Manos Kapritsos, Peter Triantafillou, Juan López Rubio, Hao Yang, Fan Ye, and Zhen Liu, "Works in Progress: Middleware 2007," IEEE Distributed Systems Online, vol. 8, no. 11, 2007, art. no. 0711-mds2007110003.

