A Popularity-driven Controller-based Routing and Cooperative Caching for Named Data Networks

Narjes Aloulou∗, Mouna Ayari∗†, Mohamed Faten Zhani‡ and Leila Saidane∗

∗ CRISTAL Lab., National School of Computer Sciences, Univ. of Manouba, Tunisia.
† LIP6 Lab., Univ. of Paris 6, Paris, France.
‡ Ecole de Technologie Superieure, University of Quebec, Canada.

Abstract: In this work, we investigate cache admission policies and forwarding decisions that account for content popularity in order to optimize name-based routing performance in Named Data Networks (NDN). To this end, we propose a new controller-based, neighborhood content-aware, popularity-driven routing scheme tightly coupled with a cooperative caching strategy, named Controller-Based Caching and Forwarding Scheme (CCFS). The rationale behind our proposal is to divide the network into domains managed by cooperative Controllers forming a Connected Dominating Set (CDS). Controllers are responsible for making judicious request forwarding decisions while holding the most popular content requested in their domains. To gauge the effectiveness of our proposal, we conducted extensive simulations. Results show that CCFS reduces the mean hit distance by up to 30% and the communication cost by up to 47 times in comparison with a prominent existing approach, the COntent-driven Bloom filter based Routing Algorithm (COBRA), as well as with the default NDN routing. In addition, we observe that the performance of our proposal is very close to that achieved by the Best-route forwarding scheme using the location-based Dijkstra's algorithm.

Keywords: Named Data Networks, Cooperative Caching, Request Forwarding, Connected Dominating Set, Content Popularity.

I. INTRODUCTION

Nowadays, the Internet is increasingly used for information dissemination and content retrieval rather than for end-to-end host communication. This paradigm shift in Internet usage has led to the design of Information Centric Networking (ICN) as a promising candidate for the architecture of the future Internet. Many ICN architectures, in which communication is centered around named data instead of host addresses, have been proposed in the literature [1]. In particular, the Named Data Network (NDN) design has emerged as one of the dominant paradigms in ICN research. The NDN architecture is based on three key concepts: (1) hierarchically structured content names, (2) in-network caching mechanisms, and (3) name-based routing strategies.

Despite its many advantages, NDN suffers from inefficient default caching and forwarding modules. The default NDN caching policy leaves a copy of the requested data in every intermediate router along the path from the hit node to the client. Such a whole-course caching policy can easily lead to high cache redundancy in the network. Moreover, the forwarding module, which operates independently of the caching module, is based on a blind flooding approach: content requests are forwarded to all available interfaces in the FIB entry matching the requested content's prefix, except the incoming one. Thus, in order to make NDN a viable architecture for the future Internet, both the research and industrial communities have recently focused their efforts on improving NDN routing and caching efficiency. In particular, a great deal of interest has been given to cache placement and cache discovery optimization.

In this paper, we investigate this particular issue and propose an efficient popularity-driven, controller-based forwarding and caching strategy called CCFS, in which content placement and forwarding decisions are tightly coupled. We define cooperative Controllers forming a Connected Dominating Set (CDS) in the network. Controllers are responsible for holding the most popular content requested in their domains, maintaining an up-to-date intra-domain and inter-domain cache view, and making judicious forwarding decisions. The novel part of the proposed design is how cache cooperation is maintained by Controllers and how the forwarding scheme operates accordingly. We have compared the performance of our proposal with a prominent existing routing algorithm named COntent-driven Bloom filter based Routing Algorithm (COBRA), as well as with the Basic NDN and Best Route approaches, on a real-world network topology. Simulation results show a significant improvement of the mean hit distance and the routing overhead for different cache sizes and content popularity distributions.

The rest of the paper is organized as follows. Related work is briefly reviewed in Section II. Section III provides a detailed description of our proposed CCFS strategy. In Section IV, simulation results are analyzed. Finally, Section V concludes the paper.

II. RELATED WORK

Extensive research efforts have been dedicated to exploring more effective caching and forwarding schemes for NDN. In [2], [3], [4], the authors proposed name-based routing approaches that use breadcrumbs to guide the forwarding decision process. These approaches are designed to operate only over a hierarchical network architecture, neglecting the arbitrary graph topology of the Internet. The forwarding direction of a content request is selected based on trail information that stores the routing history of the current node. Each approach proposes its own cache placement policy ensuring that at most one copy of the content is cached on the path from the server to the client.

Based on the same concept as the breadcrumbs solutions, the authors of [5] proposed the COntent-driven Bloom filter based Routing Algorithm (COBRA). This routing approach uses Stable Bloom Filters (SBF) to leave traces about retrieved contents along the downstream paths towards the clients. COBRA also ranks interfaces in order to choose the best path to retrieve the content: the more content names an SBF matches, the higher the associated interface is ranked. COBRA interprets a retransmission as the sign of a wrong forwarding decision in the past. With each retransmission of the same request, a router adds the next ranked interfaces to those probed before. As a result, if multiple successive retransmissions occur, COBRA converges to the Basic NDN forwarding scheme.

It is worth noting that all the aforementioned approaches rely on implicit cache coordination and introduce no additional control overhead in the network. Compared with Basic NDN, they present interesting features such as alleviating the high cache redundancy introduced by the default NDN caching policy and reducing both the server workload and the file download time. However, these schemes may suffer from a scalability issue. Cache discovery is based only on the history of the content that has passed through the node. In other words, a router has no information about the content cached in its neighborhood. Further, this limited in-path cache availability information becomes less and less accurate as the number of cache replacements increases. On the other hand, an accurate view of cached content availability at different scales (in-path, neighborhood or global) can be achieved through explicit cache coordination approaches. However, in most cases, these approaches generate prohibitive additional signaling traffic, which has prevented them from being considered a valuable contribution.

III. CCFS: DESIGN DESCRIPTION

We consider the original NDN architecture [6] as the reference model of a Named Data Network. We distinguish three types of nodes: clients (content requesters), NDN routers (core and edge) and content providers. Each NDN router has Forwarding Information Base (FIB), Pending Interest Table (PIT) and Content Store (CS) structures. Interest packets explicitly address a data chunk of a content file. Data packets are sent back in response and carry the requested data chunk.

Recall that our primary goal in this work is to design an efficient cooperative caching mechanism tightly combined with a request forwarding strategy for NDN. To achieve this goal, three main issues should be addressed. First, a judicious cache cooperation approach is needed for an optimal spread and positioning of data in the network. Second, the forwarding module should be able to use the caching location information in order to forward a content request to the closest copy in its neighborhood (either temporary or permanent). Third, the solution should be scalable, supporting a large number of nodes, links, prefixes and content requests in the network. To do so, we define specific collaborative NDN Controllers and propose to divide the network into domains, each managed by a Controller. The remaining nodes are referred to as Regular NDN routers. Forwarding decisions and control tasks are delegated to the Controllers.

The question that arises is how to locate the nodes in the network most suitable to assume the role of NDN Controllers. An important point is to reduce as much as possible the distance between a Controller and the Regular routers it dominates on the one side, and between neighbor Controllers on the other side. Two Controllers are neighbors if they manage neighboring domains. A good candidate that meets this objective is the Connected Dominating Set (CDS). A CDS is a connected subgraph such that every node in the network is at most one hop away from at least one node of the subset. The distance between a Controller and its Regular routers is thus reduced to one hop. Moreover, all Controllers are directly connected through the subgraph of the CDS. We note that several CDS algorithms have been proposed in the literature [7]. In our implementation, we have considered a classical one [8].

In the following, we describe our routing design, called Controller-based Caching and Forwarding Scheme (CCFS). CCFS is based on three key design concepts: (1) a hybrid cache coordination approach, (2) a popularity-driven caching admission policy, and (3) a Controller-based request forwarding algorithm.

A. Hybrid Cache Coordination

In order to achieve an efficient request forwarding and data retrieval scheme, Controllers should be aware of the permanent and temporary content copies cached locally as well as in their neighborhood. To do so, a cache coordination mechanism is required. We argue that neither implicit nor explicit coordination, taken on its own, is the optimal solution. On the one hand, implicit coordination generally relies on local cache decision policies, the history of the traffic that has passed through the node, and the relative position or particular role of each node in the network. Despite its considerable advantages, the router's cache awareness is limited to the locally cached content, and the neighborhood state is only implicitly deduced from the traffic that has already passed through the node. A router has no way of knowing what content is cached by a neighbor router that lies outside the data retrieval path.
On the other hand, with explicit coordination, caches communicate, share their state and exchange information with each other. Nodes participating in the explicit coordination can easily build a content availability view within the scope of their cooperation and improve the performance of in-network caching. However, a considerable additional computation overhead is usually introduced in the network. Such a scheme does not work well in a large-scale network such as the Internet.

In order to take advantage of both cache coordination approaches, we opt for a hybrid cooperative cache strategy: implicit coordination within a domain and explicit coordination between domains. We equip each Controller with a new structure called the Cache Information Base (CIB), which maintains a summary of the content cached in its neighborhood. The CIB has as many entries as the Controller has interfaces towards the Regular routers it manages and towards its one-hop neighbor Controllers. Each entry contains the interface ID, denoted i, the type (Regular or Controller) of the neighbor node n_i at the other end of the link attached to that interface, and a Stable Bloom Filter (SBF_i) associated with interface i. Each SBF_i is a summary of the content stored in node n_i. A Bloom filter is a space-efficient randomized data structure designed to perform membership queries efficiently on a large data set. The SBF is a specific counting Bloom filter that guarantees a lower false positive probability than other types of Bloom filters. A comprehensive survey of Bloom filters can be found in [9].
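The CIB layout described above can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the class names, the filter dimensions (`m`, `k`, `max_val`, `p`), the SHA-256 based hashing and the random decay step are all choices made here for the example; the paper dimensions its SBFs with the formula of [13], which is not reproduced.

```python
import hashlib
import random


class StableBloomFilter:
    """Toy Stable Bloom Filter: m small counters and k hash functions.

    All parameter values are illustrative assumptions.
    """

    def __init__(self, m=1024, k=3, max_val=3, p=10, seed=0):
        self.m, self.k, self.max_val, self.p = m, k, max_val, p
        self.cells = [0] * m
        self.rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    def _indexes(self, name):
        # Derive k cell indexes from salted SHA-256 digests of the content name.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{name}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, name):
        # Decay step: decrement p randomly chosen cells so stale names fade out.
        for _ in range(self.p):
            j = self.rng.randrange(self.m)
            if self.cells[j] > 0:
                self.cells[j] -= 1
        # Insert step: set the k hashed cells to the maximum counter value.
        for j in self._indexes(name):
            self.cells[j] = self.max_val

    def __contains__(self, name):
        # A name is reported present only if all of its k cells are non-zero.
        return all(self.cells[j] > 0 for j in self._indexes(name))


class CacheInformationBase:
    """One entry per interface: (neighbor type, SBF summarizing that neighbor)."""

    def __init__(self):
        self.entries = {}

    def add_interface(self, iface, neighbor_type):
        self.entries[iface] = (neighbor_type, StableBloomFilter())

    def record(self, iface, name):
        self.entries[iface][1].add(name)

    def lookup(self, name):
        # Interfaces whose SBF reports the name (false positives are possible).
        return [i for i, (_, sbf) in self.entries.items() if name in sbf]


cib = CacheInformationBase()
cib.add_interface(1, "Regular")
cib.add_interface(2, "Controller")
cib.record(1, "/video/a/chunk0")
print(cib.lookup("/video/a/chunk0"))
```

Because cells decay as new names are inserted, an old name eventually stops matching, which is what lets such a filter track a cache whose content is replaced over time.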

B. Popularity-driven Caching Admission Policy

Another key design choice of CCFS is to cache highly popular content at the Controllers and less popular content at the Regular routers. As a consequence, requests that cannot be satisfied at the Regular routers have a high chance of being satisfied at the Controllers. Moreover, we can easily avoid inefficient cache replacements in which less popular content evicts highly popular content. To do so, we suppose that content is categorized into two main classes: High Popular (HP), representing the hot content requested in the network, and Low Popular (LP), grouping the cold content. Content categorization itself is out of the scope of this paper.

The data retrieval and caching admission policy defined by CCFS is as follows. Upon receiving a data chunk packet, each router, whether Controller or Regular, forwards the retrieved content to all matching entries in the PIT. Meanwhile, CCFS applies a simple popularity-driven cache admission policy. After receiving a content, a Controller checks its popularity. If it is HP, the Controller stores a copy in its CS. Otherwise, if the outgoing interface leads to one of the Regular routers it manages, the name of the content is hashed and inserted in the SBF associated with that interface in the CIB. On the other hand, if a Regular router receives a data chunk from its Controller, it checks its popularity. If it is LP, the router stores a copy in its CS. Otherwise, the content is not cached. However, if a Regular router receives HP content from an incoming interface other than the one towards its Controller, it keeps a copy of the data chunk in its CS. Hence, based on this cache admission policy, a Controller is able to implicitly build a cache view of the LP content stored in its domain while holding the HP content locally. Furthermore, each Controller periodically, at a low frequency, computes the SBF associated with its CS and sends it to its one-hop neighbor Controllers in a specific Interest packet.
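The admission rules above can be condensed into a single decision function. The sketch below is one reading of the text, with hypothetical names (`admission_action` and its string results); the case of a Regular router receiving LP content from a non-Controller interface is not specified in the text, so the function treats it as "no caching", which is an assumption.

```python
def admission_action(is_controller, is_hp, from_controller=False, towards_regular=False):
    """Return what a router does with an arriving data chunk, besides forwarding it.

    "cache"         -> store a copy in the local Content Store
    "record_in_sbf" -> Controller hashes the name into the SBF of the outgoing interface
    "no_cache"      -> forward to the PIT entries only
    """
    if is_controller:
        if is_hp:
            return "cache"
        # LP content is only tracked when it heads to a Regular router of the domain.
        return "record_in_sbf" if towards_regular else "no_cache"
    if from_controller:
        # Regular routers keep LP content delivered by their Controller, not HP content.
        return "no_cache" if is_hp else "cache"
    if is_hp:
        # HP content arriving from another interface is kept by the Regular router.
        return "cache"
    # Assumption: the text does not cover LP content from a non-Controller interface.
    return "no_cache"


print(admission_action(is_controller=True, is_hp=False, towards_regular=True))
```

Keeping the decision free of side effects makes each branch easy to check against the prose, one sentence at a time.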
Upon receiving such an Interest, a Controller updates the SBF of the CIB entry associated with the incoming interface. We also note that HP content may become LP over time, and vice versa. To account for this, we use the LRU replacement policy in the Controllers' CSs and LFU in the Regular routers' CSs.

C. Controller-based Request Forwarding Algorithm

To make efficient request forwarding decisions, the Controllers consider not only the routing table (FIB) but also the neighborhood caching information summarized in the CIB. When receiving an Interest packet, a router checks its CS. If a hit occurs, it returns the matching Data. Otherwise, if the content has already been requested, it adds the incoming interface to the matching PIT entry. If no entry in the PIT matches the Interest, the Interest is forwarded by the node's forwarding module, which keeps a breadcrumb of the incoming interface in the PIT. In that case, if the router is Regular and the incoming interface is not the one towards its Controller, it simply forwards the Interest to its Controller. If the Interest arrived from its Controller, the Regular router forwards it according to its FIB. On the other hand, if the router is a Controller, it first checks whether an entry in its CIB matches the footprint of the requested content. If so, it forwards the Interest to the matching interface. Otherwise, it forwards the Interest according to the FIB. We note that the CCFS request forwarding algorithm supports the default NDN forwarding scheme. It does not require all routers in the network to implement the CCFS modules.
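The forwarding steps of this subsection can be sketched as follows. This is a simplified model rather than the authors' ndnSIM code: the `Router` dataclass, `fib_lookup` and `on_interest` are hypothetical names, plain sets stand in for the SBFs of the CIB, the FIB uses a naive longest-prefix match on strings, and returning every matching CIB interface (instead of a single one) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Router:
    is_controller: bool
    controller_iface: Optional[int] = None   # set on Regular routers only
    cs: set = field(default_factory=set)     # cached content names
    pit: dict = field(default_factory=dict)  # name -> set of incoming interfaces
    fib: dict = field(default_factory=dict)  # name prefix -> list of interfaces
    cib: dict = field(default_factory=dict)  # interface -> set of names (SBF stand-in)


def fib_lookup(router, name):
    """Longest-prefix match on the FIB; returns [] when nothing matches."""
    prefixes = [p for p in router.fib if name.startswith(p)]
    return list(router.fib[max(prefixes, key=len)]) if prefixes else []


def on_interest(router, name, in_iface):
    """Return (action, interfaces) for an incoming Interest."""
    if name in router.cs:                      # CS hit: answer with Data
        return ("data", [in_iface])
    if name in router.pit:                     # already pending: aggregate
        router.pit[name].add(in_iface)
        return ("aggregate", [])
    router.pit[name] = {in_iface}              # breadcrumb for the returning Data
    if not router.is_controller:
        if in_iface != router.controller_iface:
            return ("forward", [router.controller_iface])
        return ("forward", fib_lookup(router, name))
    # Controller: prefer a neighbor whose summary reports the content.
    matches = [i for i, names in router.cib.items() if name in names]
    return ("forward", matches or fib_lookup(router, name))


controller = Router(is_controller=True, fib={"/video": [9]}, cib={3: {"/video/a/0"}})
regular = Router(is_controller=False, controller_iface=0, fib={"/": [5]}, cs={"/x"})
print(on_interest(controller, "/video/a/0", 1))
print(on_interest(regular, "/y", 4))
```

With real SBFs in the CIB, a false positive would send the Interest to a neighbor that no longer holds the content; the sketch does not model that case.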

IV. PERFORMANCE EVALUATION

Extensive experiments have been performed to evaluate the performance of CCFS with respect to the following routing approaches: COBRA, Basic NDN and Best Route. Both Basic NDN and Best Route use the default NDN caching policy. As for the forwarding decisions, Basic NDN uses default flooding. Best Route, on the other hand, uses shortest-path forwarding, where Interests are sent to the router interface on the shortest path, in number of hops, to the permanent content copy. Best Route is not a name-based routing approach. It uses an IP-like routing algorithm where the best routes are calculated with Dijkstra's algorithm. We have considered this approach, commonly used in simulation tests, for performance comparison purposes only.

A. Simulation Environment

We implemented the CCFS strategy and evaluated its performance using the ndnSIM simulator [10]. In order to simulate a realistic Internet environment, we adopt the well-known Internet2 network [11] as the topology. The deployed topology consists of 10 core routers, 37 edge routers and 255 clients, corresponding to the 255 Internet2 primary participants. We consider one repository attached to a randomly chosen edge router and create a content name catalog of 10⁶ unique contents. Content items are segmented into chunks of 10 KB each. The content size follows a uniform distribution with an average of 100 chunks, resulting in a catalog of around 10⁸ unique data chunks. We assume that content popularity follows the Zipf model tuned with the parameter α, which indicates the degree of concentration of content requests. The larger α is, the fewer distinct contents account for the majority of content requests [12]. We assume that the content items attracting the majority of content requests are labeled HP. All the others are categorized as LP.
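To make the role of α concrete, the following sketch computes Zipf request probabilities and counts how many top-ranked items are needed to cover a given share of requests. The function names, the catalog of 10,000 items (much smaller than the paper's 10⁶) and the 50% threshold used to stand for "the majority of requests" are assumptions made here for illustration.

```python
def zipf_probabilities(n, alpha):
    """Request probability of each of n items, ranked from most to least popular."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def items_covering_share(probs, share=0.5):
    """Smallest number of top-ranked items whose probabilities sum to at least `share`."""
    cumulative = 0.0
    for count, p in enumerate(probs, start=1):
        cumulative += p
        if cumulative >= share:
            return count
    return len(probs)


for alpha in (0.8, 1.2, 2.0):
    probs = zipf_probabilities(10_000, alpha)
    print(alpha, items_covering_share(probs))
```

The count shrinks as α grows, so the set of items that would be labeled HP under such a threshold becomes smaller and easier to hold in a cache.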
Clients are connected to edge routers and generate Interest packets whose arrivals follow a Poisson process with a mean rate λ = 50 chunk requests/s. All routers are equipped with LFU Content Stores (CS). In order to have a fair comparison with the aforementioned routing approaches, we assume that all routers have homogeneous CSs with the same storage capacity. In addition to the FIB, PIT and CS, Controllers maintain a CIB table. We have used the optimal formula of [13] to dimension the SBFs of each Controller. Every 30 seconds, each Controller sends its one-hop neighbor Controllers an Interest packet containing the SBF that summarizes the content of its CS.

B. Simulation Results

To show the performance of CCFS in different application environments, we have carried out two sets of simulation tests. In the first set, we examine the impact of varying the network cache size on the performance of the tested approaches. In the second set, we evaluate the performance of our approach under different content popularity patterns. As performance metrics, we have considered the mean hit distance, the server hit ratio, the Interest overhead and the per-node bandwidth consumption.

Fig. 1. Evaluated metrics for different values of cache size: (a) mean hit distance, (b) server hit ratio, (c) Interest overhead, and (d) per-node bandwidth consumption.

The mean hit distance measures the average number of hops that an Interest packet has to travel to find the requested data. The server hit ratio measures the server workload as the fraction of requests served by the repository. The Interest overhead refers to the data retrieval overhead generated in the network per unit of time. It is measured as the average number of Interest packets flowing through the network in one second. The per-node bandwidth consumption refers to the communication cost generated by each router in the network in one second. The results of each metric for each simulation scenario are averaged over 10 runs performed with different random seeds. Each run lasted 48 hours of simulation time.

1) Impact of Cache Size Variation: In this set of experiments, we set the Zipf distribution parameter α to 1. With this value of α, we obtained client requests with a popularity pattern close to the real-life distribution shown in [14]. Fig. 1.(a) depicts the mean hit distance versus the cache-to-catalog size ratio. We varied the total cache size from 1% to 10% of the whole content, giving a CS capacity varying from around 2.1 × 10⁴ to 2.1 × 10⁵ chunks per router. As a general observation, we note that the mean hit distance decreases slightly as the total cache size in the network increases. This is expected: the larger the CS storage capacity, the less frequently chunk replacements occur in each CS and the more distinct content chunks each intermediate node can cache. As a result, the probability of finding a cached copy of the requested chunk closer to the client increases. Additionally, we observe that CCFS reduces the mean hit distance in comparison with COBRA and Basic NDN for all total cache size values. As depicted in the figure, CCFS is able to retrieve the content from a hit point 2 hops closer to the client than COBRA and Basic NDN, giving a mean hit distance 30% shorter on average.
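The per-router CS capacity can be checked with a short calculation. The sketch below assumes that the total cache budget is split evenly across the 47 routers of the topology (10 core and 37 edge), in line with the homogeneous CSs stated in the simulation setup; the even split itself and the function name are assumptions of this example.

```python
CATALOG_CHUNKS = 10**6 * 100   # 10^6 contents of 100 chunks on average
ROUTERS = 10 + 37              # core routers + edge routers


def per_router_capacity(catalog_chunks, cache_fraction, routers):
    """Chunks per Content Store when the total cache budget is shared evenly."""
    return catalog_chunks * cache_fraction / routers


for fraction in (0.01, 0.10):
    capacity = per_router_capacity(CATALOG_CHUNKS, fraction, ROUTERS)
    print(f"{fraction:.0%} of the catalog: about {round(capacity)} chunks per router")
```

Under these assumptions, a 1% budget gives roughly 2.1 × 10⁴ chunks per router and a 10% budget roughly 2.1 × 10⁵.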
The shorter hit distance of CCFS is expected, since COBRA makes routing decisions based only on the traces that retrieved contents leave on each router's interfaces. The Controllers, in contrast, hold the most popular content of their domains and, in addition, have a view of the most popular content cached in the neighbor domains only one hop away. We further notice that the performance of our approach is very close to that achieved by Best Route.

As for the server hit ratio, shown in Fig. 1.(b), we note first that CCFS provides slightly better results than Best Route: CCFS improves the server hit ratio by up to 8% over Best Route. It is also noticeable that the server hit ratio decreases as the overall cache size in the network increases. This is expected, since when the cache capacity increases, cache replacements decrease and more cache hits occur as a consequence. Nevertheless, we observe that the COBRA and Basic NDN routing algorithms outperform CCFS in terms of server workload. With COBRA, this is simply because router interfaces towards the server are generally ranked lower than those towards downstream. As a result, an Interest packet is forwarded to a temporary cached copy of the requested content rather than to the permanent one stored in the repository. As for the Basic NDN routing approach, its forwarding scheme is based on flooding. Hence, Interests broadcast in the network are more likely to be served by an intermediate router cache than by the origin server. With CCFS, Interests are forwarded to the closest copy of the content stored in the neighborhood, including the repository.

The server hit ratio improvement of COBRA over CCFS comes at the price of a very high overhead introduced in the network. As far as load is concerned, CCFS is far more lightweight than COBRA and Basic NDN, as clearly shown in Fig. 1.(c). Both COBRA and Basic NDN inject a huge number of Interest packets into the network. For instance, while the maximum Interest overhead introduced by CCFS does not exceed 500 KB/s on average, COBRA introduces at least 16 MB/s. For a cache size of 10%, the Interest overhead introduced by COBRA is on average 32 times larger than that of CCFS, even though COBRA uses an implicit cache coordination approach. These results show how well CCFS reduces the communication load and hence the bandwidth consumption in the network. This is confirmed by the results in Fig. 1.(d), where CCFS reduces the per-node bandwidth consumption by up to 38 times on average in comparison with COBRA and Basic NDN.
This confirms the scalability of our routing approach and highlights the benefits of delegating the forwarding decision process to cooperative, neighborhood cache-aware Controllers forming a connected dominating set in the network. We also note that, for all the tested scenarios, the performance of COBRA is very close to that of Basic NDN. We can conclude that, compared to COBRA, our forwarding plane reduces the retransmissions caused by wrong forwarding decisions and increases the efficiency of the routing approach.

Fig. 2. Evaluated metrics for different values of α: (a) mean hit distance, (b) server hit ratio, and (c) Interest overhead in one time unit.

2) Impact of Content Popularity Variation: Fig. 2 shows the impact of varying the Zipf parameter α on three of the aforementioned performance metrics: the mean hit distance, the server hit ratio and the Interest overhead. The overall cache size is fixed to 10% of all content. The comparison is done using different values of α in the interval [0.8, 2]. It is worth noting that α = 0.8 captures the worst case, in which popularity is spread fairly evenly across content. Moreover, recall that the larger α is, the fewer distinct contents are targeted by the generated client requests. From the results, we can first observe that the performance of the four routing approaches becomes of the same order for α ∈ [1.6, 2]. This is expected, since in that case the majority of client requests are concentrated on a very small number of files that can easily be stored in a CS close to the clients without triggering cache replacements.

Fig. 2.(a) reveals that CCFS reduces the mean hit distance with respect to COBRA by 30%, 25% and 7% for α equal to 0.8, 1 and 1.2 respectively. Then, for α ≥ 1.6, the mean hit distance is around one hop on average for COBRA, Best Route and Basic NDN, while it is around 2 hops for CCFS. This is because, for values of α greater than 1.5, the cache of the edge routers is big enough to store the small number of hot content files. With CCFS, however, HP content is stored at the Controllers, two hops from the client. It is further worth noting that the mean hit distance and the Interest overhead of the COBRA and Basic NDN approaches decrease at a much faster rate than those of CCFS and Best Route. In particular, Fig. 2.(c) shows an exponential increase in the Interest overhead of COBRA and Basic NDN for content with a fairly evenly distributed popularity (α = 0.8). For Basic NDN, these observations are expected: on the one hand, its caching policy caches everything everywhere, which decreases the content diversity in the network. On the other hand, its forwarding decision policy is based on flooding.
As far as COBRA is concerned, the results indicate that, for a content popularity pattern with α < 1.4 and a highly loaded network (255 clients, each requesting content at a rate of 50 chunks/s), the requested content becomes much harder to reach and retransmissions due to wrong forwarding decisions increase significantly. We also note that the server hit ratio achieved by both COBRA and Basic NDN is slightly better than that of CCFS and Best Route for small values of α. Performance converges for α ≥ 1.4. Finally, we note that with CCFS, routing performance is not greatly affected by the variation of the content popularity distribution. All metrics decrease slightly as α increases, and no exponential variation is observed.

V. CONCLUSION

In this work, we have proposed a new popularity-driven Controller-based Forwarding and Caching Strategy in which cooperative Controllers forming a CDS act as the principal forwarding decision makers in the network. We have also introduced a novel hybrid cache cooperation approach that gives each Controller awareness of cache availability in its neighborhood without introducing prohibitive signaling traffic in the network. Observations were made under worst-case conditions: a large-scale and highly loaded network. Results show that our approach outperforms existing ones and achieves a very promising performance, close to that of Best Route, for different cache sizes and content popularity patterns. In particular, the results show that our routing scheme is lightweight and makes efficient forwarding decisions.

REFERENCES

[1] G. Xylomenos et al., A survey of Information-Centric Networking research, IEEE Communications Surveys & Tutorials, vol. 6, no. 2, pp. 1024-1049, 2013.
[2] E. Rosensweig and J. Kurose, Breadcrumbs: efficient, best-effort content location in cache networks, INFOCOM, Apr. 2009.
[3] Y. Li, T. Lin, H. Tang, and P. Sun, A chunk caching location and searching scheme in content-centric networking, ICC, 2012.
[4] Y. Li et al., Self assembly caching with dynamic request routing for information-centric networking, Globecom 13, December 2013.
[5] M. Tortelli et al., COBRA: Lean intra-domain routing in NDN, Proc. of IEEE Consumer Communications and Networking Conference (CCNC), Las Vegas, NV, USA, 2014.
[6] L. Zhang et al., Named data networking, ACM SIGCOMM Computer Communication Review, vol. 44, no. 3, pp. 66-73, July 2014.
[7] D. Du and P. Wan, Connected Dominating Set: Theory and Applications, 2013 ed., Springer Science & Business Media, 2013.
[8] M. Guha and S. Khuller, Approximation algorithms for connected dominating set, Algorithmica, vol. 20, pp. 374-387, 1998.
[9] S. Tarkoma, C. E. Rothenberg, and E. Lagerspetz, Theory and practice of bloom filters for distributed systems, IEEE Communications Surveys and Tutorials, vol. 14, no. 1, pp. 131-155, 2012.
[10] A. Afanasyev, I. Moiseenko, and L. Zhang, ndnSIM: NDN simulator for NS-3, NDN, Technical Report NDN-0005, 2012.
[11] Internet2 Network Connectors, www.internet2.edu, April 2014.
[12] G. Carofiglio et al., Modeling data transfer in content-centric networking, International Teletraffic Congress, pp. 111-118, 2011.
[13] F. Deng and D. Rafiei, Approximately detecting duplicates for streaming data using stable bloom filters, SIGMOD 2006, pp. 25-36, ACM Press, New York, 2006.
[14] A. Anand, C. Muthukrishnan, A. Akella, and R. Ramjee, Redundancy in network traffic: findings and implications, SIGMETRICS, pp. 37-48, New York, USA, 2009.
