
Caching in Opportunistic Networks with Churn

Sanpetch Chupisanyarote, Sylvia Kouyoumdjieva, Ólafur Helgason, and Gunnar Karlsson
KTH School of Electrical Engineering and Linnaeus Center ACCESS
Stockholm, Sweden
{sanpetch, stkou, olafurr, gk}@kth.se

Abstract—In this paper we examine opportunistic content distribution. We design and evaluate a caching strategy in which a node fetches and shares contents on behalf of other nodes, even though those contents are not of its own interest. We propose three caching options for improving the use of network resources: relay request on demand, hop-limited relay request, and greedy relay request. The proposed strategies are implemented in the OMNeT++ simulator and evaluated on mobility traces with churn generated by Legion Studio. We also compare our strategies with a strategy from the literature. The results show that opportunistic caching for a community of nodes may enhance performance only marginally, while overhead increases significantly.

Index Terms—Wireless communication, ad hoc networks, opportunistic communication, content distribution, delay tolerant networks.

I. INTRODUCTION

Opportunistic wireless networks make use of the availability of individual devices for establishing communication [1]. Nodes that are in contact range establish direct communication via radio. If a node discovers that a neighboring node stores contents of interest, it will request those items. A node that seeks contents is called a subscriber, and a node that makes contents available is referred to as a publisher. We assume that nodes subscribe to contents of their interest and that they are willing to share those contents. Thus, each node can operate both as publisher and subscriber.

Here we study a system in which nodes fetch and store not only private (subscribed) contents but also contents that are not of their own interest. To support this system, two types of caching are introduced: private and public. The private cache contains only subscribed contents, while the public cache stores data that are potentially of interest to neighbors. The purpose of caching public contents is to increase content availability in the system with the goal of improving performance. We study different caching strategies and evaluate the performance of data dissemination versus the overhead that the system experiences due to public caching.

The rest of the paper is structured as follows. We review related work in Section II. The caching strategies are outlined in Section III. The performance evaluation setup and test data are described in Section IV. Results and evaluation are presented in Section V. Section VI summarizes the study, discusses our findings, and concludes the paper.

II. RELATED WORK

Caching strategies for opportunistic content dissemination usually make use of community formations and exploit the social roles of nodes inside such communities. The term community does not refer only to a physical place or location, nor does it encompass only people who are familiar with one another; what also unites nodes belonging to a community are their common interests. The authors in [2] use this concept to form an overlay and implement publish/subscribe communication in delay tolerant networks. The overlay formation, which is a set of logical links, relies on community detection algorithms. In [3], a node fetches contents from the peer that gives the maximum value of a utility function defined by the size of the content, the probability that the content is available, and the probability that the node can get access to that content. In [4], the authors present an idea of pre-fetching: a node determines whether to store contents in its cache based on the analytic hierarchy process and grey relational analysis. The authors in [5] propose three content selection strategies (uniform, most popular, and optimized social welfare) according to which nodes select contents to carry that are not of their own interest. SocialCast [6] presents a routing protocol supporting publish/subscribe communication, which relies on predictions of node popularity for making caching decisions. Nodes that have higher importance in the community are preferred as content carriers. The authors in [7] propose an algorithm called BUBBLE, which tries to select popular nodes (with high centrality) as relays by making decisions based on knowledge of the community structure. However, it relies on the routing protocol rather than on a caching mechanism.

Our work does not rely on the notion of community, nor does it exploit any of the presented community features. Our strategies do not require a routing algorithm, overlay formation, or prediction of content selection. Instead, we cache content items based on nodes' requests.

III. CACHING STRATEGIES

A. Private vs. Public Caching

Initially, users subscribe to all contents they are privately interested in. However, not all users have the same interests; their subscriptions vary. Thus, the exchange of private contents is highly dependent on the popularity of the data. When a node contacts another node, the node's missing contents might not be found. To address this, we propose a number of public caching strategies.


Fig. 1. (a) Example of the caching model. (b) The relay request protocol.

The general idea of public caching is that when a node receives a request it cannot promptly satisfy, it may try to fetch the contents from its neighbors on behalf of the requesting node. Since these contents are not of its own interest, the node stores them in a public cache.

To support the caching strategy, each node maintains an available private/public list of the items in the respective cache. Moreover, it keeps track of the missing private/public items in another list. We call the union of the lists of missing items the interested list, and the list of items that correspond to the node's subscriptions the subscribed list. A waiting list is populated with the IDs of the content items to be downloaded from a peer during a contact. We call a neighbor any node in direct communication range, and we refer to peers that are reachable via one or more intermediate nodes as indirect neighbors. Figure 1(a) shows an example of the caching model defined according to this terminology.

B. Relay Requests

Relaying requests allows neighbors to help a subscriber find missing contents at its indirect neighbors. The relay request protocol is illustrated in Figure 1(b). Each node periodically broadcasts Broadcast REQ messages containing its interested list. A neighboring node replies by sending its available list as a unicast message destined to the node that broadcasted the request. Any matching items are added to the waiting list to be downloaded from the peering node, one content item at a time (Unicast REQ). We choose to download content items consecutively because of the high mobility of nodes and the short contact durations. Furthermore, nodes give priority to private data over public data.
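To make the per-node bookkeeping concrete, the following is a minimal sketch of the state a node might keep and of how a received broadcast request and a peer's available list could be matched against it. This is not the authors' implementation; all class and function names are illustrative assumptions.

```python
class Node:
    """Illustrative node state for the private/public caching model (Sec. III-A)."""

    def __init__(self, subscriptions):
        self.subscribed = set(subscriptions)        # content IDs the user subscribes to
        self.private_cache = {}                     # id -> data, subscribed items only
        self.public_cache = {}                      # id -> data, items cached for others
        self.missing_private = set(subscriptions)   # subscribed but not yet fetched
        self.missing_public = set()                 # items requested on behalf of neighbors
        self.waiting = []                           # IDs queued for download from current peer

    def interested(self):
        """Interested list: union of missing private and missing public items."""
        return self.missing_private | self.missing_public

    def available(self):
        """Available list advertised to neighbors (private and public items)."""
        return set(self.private_cache) | set(self.public_cache)

    def on_broadcast_req(self, peer_interested):
        """Reply to a Broadcast REQ with our available list (sent as a unicast)."""
        return self.available()

    def on_available_list(self, peer_available):
        """Queue matching items; private items are given priority over public ones."""
        private_hits = [i for i in self.missing_private if i in peer_available]
        public_hits = [i for i in self.missing_public if i in peer_available]
        self.waiting = private_hits + public_hits   # downloaded one item at a time
```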

We try to optimize the relay-request process by introducing three strategies with the goal of increasing the chances of bringing data to subscribers.

1) Relay on Demand: In a dense network with high mobility and short contact durations, altruistic nodes may waste storage, transmission capacity and energy on downloading contents which might never be delivered to the initial requester. The relay on demand strategy therefore requires each incoming broadcast request that cannot be served to be relayed only once. If other nodes in communication range can provide the missing public content, it is fetched and stored by the neighboring node and transferred to the actual requester upon arrival of the next broadcast request message. However, if no one can provide the content, the helping functionality of the neighboring node is silently suppressed: the neighboring node will not initiate new request attempts on behalf of the initial requester if there are no further incoming broadcast messages searching for the same content item.

2) Hop-limited Relay Request: Multi-hop relaying is a strategy that allows contents to be downloaded from other nodes that are many hops away. However, in a network with high mobility multi-hop relaying is not suitable. We therefore present a limited version of this strategy, in which we bound the hop limit to two or three hops from the initial requesting node.

3) Greedy Relay Request: The previous strategies assume that a node can download from only one neighbor at a time, regardless of whether this neighbor is providing contents of private or public interest. The greedy relay request strategy relaxes this assumption by allowing nodes to constantly monitor content announcements from neighbors while at the same time downloading data from a peer. To do this, the node continues to periodically broadcast its interested list while associated with a peer in a download session. Whenever a new Unicast REQ is received, the node checks both its current download (whether it is public content) and the advertised content from the new neighbor (whether it is of private interest). If both conditions hold, the node initiates a download session to the new neighbor and terminates the current download.
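As an illustration of these rules, the sketch below builds on the Node state shown earlier and renders the greedy switching decision and a simplified hop-limited relay step. It is a rough rendering under stated assumptions, not the simulator code; the helper names and the hop-count field are illustrative.

```python
MAX_HOPS = 2  # hop-limited relay request: bounded to two or three hops (Sec. III-B.2)

def should_switch(current_item, node, advertised_items):
    """Greedy relay request (Sec. III-B.3): abandon a public download when a newly
    discovered neighbor advertises content of private interest."""
    current_is_public = current_item not in node.subscribed
    offers_private = any(i in node.missing_private for i in advertised_items)
    return current_is_public and offers_private

def relay_request(broadcast_req, node):
    """Simplified relay step: forward an unserved request once, within the hop
    limit, and remember the items so they can be fetched on the requester's behalf."""
    unserved = [i for i in broadcast_req.items if i not in node.available()]
    if unserved and broadcast_req.hops < MAX_HOPS:
        node.missing_public.update(unserved)                 # fetch for the requester
        return broadcast_req.forward(hops=broadcast_req.hops + 1)
    return None  # otherwise stay silent (helping functionality suppressed)
```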

IV. PERFORMANCE EVALUATION

A. Simulation Model

To conduct our study, we use the MiXiM framework [8] for mobile wireless networks together with the OMNeT++ simulator [9]. MiXiM includes models of protocols, radio propagation, node components, and message delivery, and it provides a well-constructed API for applications to be run on OMNeT++. In this work we use an extension of MiXiM that supports opportunistic networking [10].

B. Content Popularity

We first create a pool of contents, which specifies the number of content items available in the network and the size of each content item, drawn from a normal distribution. Then, based on a Zipf distribution, each node randomly selects the contents it subscribes to from the pool. The private cache of the device is then populated with a randomly selected portion of those content items. The public cache is initially empty.

C. Mobility Model

For obtaining realistic mobility traces, we use Legion Studio [11], a multi-agent simulation tool which models pedestrian movement in large spaces, such as subway stations and shopping malls. The output trace created by Legion Studio provides a snapshot of each node's location in the system at a granularity of 0.6 seconds. Legion Studio allows us to simulate open systems where nodes enter and leave the system during the simulation (churn).
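As a rough illustration of how such a trace could drive node mobility, the sketch below replays per-node position snapshots and lets node IDs appear and disappear to model churn. The file layout and column names are assumptions, not the actual Legion Studio output format.

```python
import csv
from collections import defaultdict

def replay_trace(path):
    """Yield (time, {node_id: (x, y)}) for every 0.6 s snapshot of the trace.
    Node IDs absent from a snapshot have left the system (churn); new IDs have
    just entered."""
    snapshots = defaultdict(dict)
    with open(path) as f:
        for row in csv.DictReader(f):  # assumed columns: t, node, x, y
            snapshots[float(row["t"])][row["node"]] = (float(row["x"]), float(row["y"]))
    for t in sorted(snapshots):
        yield t, snapshots[t]
```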

Fig. 2. The simulation scenarios: a part of downtown Stockholm (a) and a two-level subway station (b).

TABLE I
SIMULATION PARAMETERS (italicized values represent different settings for mobility trace 2)

Simulation parameter                        Value
Periodic broadcast of a request             every 1 sec
Timeout threshold                           60 sec
Communication range                         10 m, 20 m
Total contents available in the network     1000 items
Number of subscribed content items          10 items
Initial number of content items             5 items
Content popularity                          Zipf (alpha = 0.368)
Content size                                Normal (mean 3 kB, std. dev. 1 kB)
Traces                                      Subway Station, Östermalm
Target speed                                Truncated Normal (mean 1.3 m/s)
Simulation period                           1 hour, 2 hours
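The parameters in Table I determine how each node's content is initialized (Section IV-B). The following is a minimal, illustrative sketch of that initialization; the exact sampling details are not specified in the paper beyond the distributions and parameters above, so the clipping of sizes and the sampling helpers are assumptions.

```python
import random

POOL_SIZE = 1000        # total content items in the network (Table I)
NUM_SUBSCRIBED = 10     # subscribed content items per node
NUM_INITIAL = 5         # items pre-loaded into the private cache
ZIPF_ALPHA = 0.368      # Zipf exponent for content popularity
SIZE_MEAN_KB, SIZE_STD_KB = 3.0, 1.0

# Zipf popularity: the item of rank k (1-indexed) is drawn with probability ~ 1 / k^alpha.
popularity = [1.0 / (k ** ZIPF_ALPHA) for k in range(1, POOL_SIZE + 1)]

# Content sizes drawn from a normal distribution, clipped to stay positive.
sizes_kb = [max(0.1, random.gauss(SIZE_MEAN_KB, SIZE_STD_KB)) for _ in range(POOL_SIZE)]

def init_node():
    """Pick the node's subscriptions by popularity and pre-load part of them."""
    subscribed = set()
    while len(subscribed) < NUM_SUBSCRIBED:
        subscribed.add(random.choices(range(POOL_SIZE), weights=popularity)[0])
    private_cache = set(random.sample(sorted(subscribed), NUM_INITIAL))
    return subscribed, private_cache
```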

Here we use traces for two scenarios: a subway station and an outdoor city district [12].

1) Subway Station: This scenario depicts a two-level indoor subway station, as shown in Figure 2(b). Passengers enter or leave the station through any of the entrance points on the top floor or by trains at the platforms. The active area of the station is 1921 m².

2) Östermalm: Östermalm is a downtown area in Stockholm consisting of a grid of interconnected streets, as shown in Figure 2(a). The length of each street varies between 20 m and 200 m, and the width is 2 m, representing sidewalks. We assume that the arrival rates on all streets are equal and that, when nodes arrive at an intersection, they continue straight on the same street with probability 0.5 or turn to one of the other adjoining streets with equal probability. The active area is 5872 m².

D. Experiments

Each node is configured according to the parameters in Table I, unless otherwise stated. In the Östermalm scenario, each node picks upon arrival a target speed from a truncated normal distribution with mean 1.3 m/s. Nodes arrive at each entrance point according to a Poisson process with rate 0.5 nodes/sec. The subway station has a predefined arrival model. We evaluate the following scenarios:

1) No caching: a baseline scenario in which nodes cache only private contents. The dissemination process is driven only by the users' interests.
2) Public cache with (a) a two-hop limit and (b) a three-hop limit.
3) Public cache with (a) a two-hop limit and greedy relay request, and (b) a three-hop limit and greedy relay request.

We then compare these strategies with the optimal channel choice proposed in [5].


Fig. 3. Normalized values of (a) Gpri and (b) Gpub /Gpri with applied relay request strategy for the Subway scenario.

4) Optimal Channel Choice Strategy: Apart from periodically broadcasting their available lists and requesting private data, nodes also fetch contents that are not of their own interest, to be stored in the public cache. The strategy for selecting public contents is presented in Algorithm 2 of [5]. The node first selects a content item from its public cache (content i) and a new public content item from its neighbor (content j). Then it computes a probability q. If min(1, q) > Uniform[0, 1], it fetches and stores the new item j and drops its item i. We have modified the calculation of q to correspond to the Zipf distribution:

q = (total number of nodes that store content i) / (total number of nodes that store content j)

We evaluate this strategy with our mobility scenarios, varying the size of the public cache between 0, 5 and 10 content items per node.
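The replacement step of this strategy, with our modified q, can be summarized as in the sketch below. This is an illustrative rendering rather than the code of [5]; how the per-content replica counts are obtained is not detailed in the text and is assumed here to be given.

```python
import random

def maybe_swap(public_cache, item_j, replicas):
    """Optimal channel choice replacement (after Algorithm 2 of [5], modified):
    swap a cached public item i for a neighbor's item j with probability min(1, q),
    where q = replicas[i] / replicas[j] (numbers of nodes storing i and j)."""
    if item_j in public_cache or not public_cache:
        return public_cache
    item_i = random.choice(sorted(public_cache))   # candidate to be replaced
    q = replicas[item_i] / replicas[item_j]
    if min(1.0, q) > random.random():              # Uniform[0, 1] draw
        public_cache = (public_cache - {item_i}) | {item_j}
    return public_cache
```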

V. RESULTS AND EVALUATION

In this section, we present the results from the simulated scenarios. We compare the results in terms of goodput, which we define as follows:

1) Private goodput: the number of bytes downloaded into the private cache of a node, divided by the lifetime of that node in the system.
2) Public goodput: the number of bytes downloaded into the public cache of a node, divided by the lifetime of that node in the system.

For our evaluation we use the mean values of the two goodput metrics, Gpri and Gpub respectively, which are normalized with respect to all nodes in the system.
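For clarity, the two mean goodput metrics could be computed from per-node download records as in this small sketch; the record layout is an assumption about bookkeeping, not the authors' evaluation scripts.

```python
def mean_goodputs(records):
    """records: iterable of (bytes_private, bytes_public, lifetime_s) per node.
    Returns the mean private and public goodput (bytes/s) over all nodes."""
    records = [r for r in records if r[2] > 0]     # ignore nodes with zero lifetime
    g_pri = sum(b_pri / life for b_pri, _, life in records) / len(records)
    g_pub = sum(b_pub / life for _, b_pub, life in records) / len(records)
    return g_pri, g_pub
```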

A. Results for Subway Station Case

Figure 3(a) presents the performance of the proposed caching strategies in terms of Gpri; the goodput is normalized with respect to the baseline scenario. The graphs show that Gpri in strategy 1 is higher than in strategies 2a and 2b. When a node acts altruistically, it tries to download contents on behalf of others and loses the opportunity to obtain content of its own interest. Additionally, the public content which the node has completely fetched might not be provided to the node that requested it in the first place. Consequently, the results for strategies 3a and 3b show that the greedy relay request leads to more contents of interest being disseminated in the network. Furthermore, the rate is higher for the three-hop limit than for the two-hop limit in the greedy case, but not otherwise.


Fig. 4. Normalized values of (a) Gpri and (b) Gpub /Gpri with applied optimal channel choice strategy for the Subway scenario.



Fig. 6. Normalized values of (a) Gpri and (b) Gpub /Gpri with applied optimal channel choice strategy for the Östermalm scenario.




Fig. 5. Normalized values of (a) Gpri and (b) Gpub /Gpri with applied relay request strategy for the Östermalm scenario.

Moreover, Figure 3(b) illustrates that the public goodput Gpub in the system is considerably higher than the private goodput Gpri. This means that a lot of public contents are distributed in the network but they make quite small contributions to the contents of interest. The increased overhead is the price to be paid for slightly better data dissemination.

The results for the optimal channel choice strategy are presented in Figure 4. It can be seen that when nodes increase the size of the public cache, they obtain higher Gpri. However, Gpri increases only slightly while Gpub increases considerably, which is similar to the result for the relay request strategy. Gpub in the optimal channel choice strategy is lower than for the relay request strategy because of the limited size of the public cache. Comparing only the private goodput Gpri of the relay request strategy and the optimal channel choice strategy, one can see that the former provides slightly higher goodput values. However, comparing the ratio Gpub/Gpri in Figure 3(b) and Figure 4(b), it turns out that the relay request strategy actually wastes more resources (the ratio values are 16.12 and 2.86, respectively).

B. Results for Östermalm Case

The results for the relay request strategies in the Östermalm case are presented in Figure 5 and Figure 6. The graphs in Figure 5 show again that public caching with relay requests can slightly increase Gpri but results in significant growth of Gpub. The public cache with a three-hop limit and greedy relay request gives the highest Gpri among the five scenarios. The results for the optimal channel choice strategy in Figure 6 confirm that a bigger public cache results in higher Gpri. However, Gpri increases only slightly while Gpub increases considerably. Comparing again the best relay strategy and the optimal channel choice, we find that the former provides higher goodput and a lower overhead ratio, thus making the relay request strategy a better choice in this case.

VI. CONCLUSION

We have studied the use of public caching for enhancing content dissemination in opportunistic networks in which data exchange is based purely on subscriptions. The main question we have addressed is whether nodes should fetch and store contents which are not of private interest. The study has been conducted with a complete system simulation and with mobility traces that reflect pedestrian mobility in two different urban environments that exhibit churn.

The simulation results suggest that public caching slightly improves the application-layer goodput. However, this improvement comes at the cost of increased communication overhead. We do not find a unique strategy that maximizes the goodput in all cases. Our recommendation based on these findings is that highly mobile nodes in opportunistic content distribution systems should only exchange private contents. This avoids high overhead at only a small loss of goodput. It also relieves the design of the integrity and security problems that might arise when nodes store contents that the user would not like to promote in a network of nodes. Our future work targets evaluating public caching strategies in environments with lower mobility and churn.

REFERENCES

[1] O. R. Helgason, E. A. Yavuz, S. T. Kouyoumdjieva, L. Pajevic, and G. Karlsson, "A mobile peer-to-peer system for opportunistic content-centric networking," in Proc. of ACM SIGCOMM MobiHeld, 2010.
[2] E. Yoneki, P. Hui, S. Chan, and J. Crowcroft, "A socio-aware overlay for publish/subscribe communication in delay tolerant networks," in Proc. of ACM MSWiM, 2007.
[3] C. Boldrini, M. Conti, and A. Passarella, "ContentPlace: social-aware data dissemination in opportunistic networks," in Proc. of ACM MSWiM, 2008.
[4] Y. Ma, M. Kibria, and A. Jamalipour, "Cache-based content delivery in opportunistic mobile ad hoc networks," in Proc. of IEEE GLOBECOM, 2008.
[5] L. Hu, J.-Y. Le Boudec, and M. Vojnovic, "Optimal channel choice for collaborative ad-hoc dissemination," in Proc. of IEEE INFOCOM, 2010.
[6] P. Costa, C. Mascolo, M. Musolesi, and G. Picco, "Socially-aware routing for publish-subscribe in delay-tolerant mobile ad hoc networks," IEEE Journal on Selected Areas in Communications, vol. 26, no. 5, pp. 748–760, June 2008.
[7] P. Hui, J. Crowcroft, and E. Yoneki, "Bubble Rap: social-based forwarding in delay tolerant networks," in Proc. of ACM MobiHoc, 2008.
[8] "MiXiM," http://mixim.sourceforge.net/developers.html.
[9] "OMNeT++," http://www.omnetpp.org/.
[10] O. R. Helgason and K. V. Jónsson, "Opportunistic networking in OMNeT++," in Proc. of SIMUtools, 2008.
[11] "Legion Studio," http://www.legion.com/legion-studio.
[12] O. Helgason, S. T. Kouyoumdjieva, and G. Karlsson, "Does mobility matter?" in Proc. of WONS, 2010.
