Collaborative Caching Based on Hash-Routing for ... - acm sigcomm

12 downloads 14434 Views 13MB Size Report
Aug 12, 2013 - Keywords. Information-Centric Networking, collaborative caching, hash- routing. 1. .... on content-oriented networking for efficient content delivery,” ... [4] Z. Li, and G. Simon, “Time-Shifted TV in Content Centric. Networks: the ...
Collaborative Caching Based on Hash-Routing for Information-Centric Networking Sen Wang, Jun Bi, Jianping Wu Institute for Network Sciences and Cyberspace, Tsinghua University Tsinghua National Laboratory for Information Science and Technology (TNList) Beijing, China

[email protected];[email protected];[email protected] Categories and Subject Descriptors C.2.2 [Computer-Communication Networks]: Network Protocols

General Terms Design, Algorithms, Performance

Keywords Information-Centric Networking, collaborative caching, hashrouting.

1. INTRODUCTION Recently research on Information-Centric Networking (ICN overviewed in [1]) has flourished, which attempts to shift the current host-oriented Internet architecture to informationoriented. The built-in caching capability in network is a typical feature of ICN. In ICN, along the path of the request, the first cache having the desired content object responds with its cached copy, which we refer to as en-route caching mode. As the works in [2] and [3] together show, there is a large gap between the practical cache size and the caching capacity needed to cope with the vast population of content in the Internet. In [4], a cooperative in-network caching scheme is proposed, which could achieve collaboration of a cluster of caches with hash-based partitioning of content space. However, as the cluster grows, the overhead of inter-communication of caches could increase significantly. In this paper, we propose a collaborative caching scheme, named as CPHR, to maximize the overall hit ratio of an ISP network by partitioning content space and hash-routing.

As Figure 1 shows, when an Interest (a.k.a request) comes from an ingress link, content router F looks up its FIB to find the corresponding egress router and PAT. By the hash value of the Interest name, content router F can determine the partition the Interest belongs to and the assigned cache in PAT. Note that CPHR allows Interests to be sent directly to the egress router without any caching to avoid extremely large path stretch. Then the name of the assigned cache is prefixed to the name of the Interest. Besides, two additional attributes are added in the Interest. One attribute named “CacheName” indicates the name of the assigned cache and the other attribute named “EgressName” indicates the name of the egress router. After that, the Interest is forwarded according to the new name. Along the path to the assigned cache B, content routers, such as A, do not need to look up their CS tables for the Interest. They forward the Interest normally according to its name. When the assigned cache B receives the Interest, it looks up its CS table to find out if it has the corresponding content packet. If so, it responds the Interest with its cached copy; otherwise, it replaces its name in the Interest name with the value of the EgressName attribute and forwards further the Interest. When the egress router D receives the Interest, it can tell that it is the egress router of the Interest from the EgressName attribute. It deletes its name from the name of the Interest and forward the Interest outside the network. The CacheName attribute remains in the Interest and will be copied into the corresponding content packet. Therefore, along the reverse path of the Interest, the content packet can be cached only in the assigned cache. Interest Name: /Node_D/www.youtube.com/video/spiderman.rm; CacheName: Node_B; EgressName: Node_D

2. SYSTEM DESIGN First we partition the content population in a way that content objects originating from the same egress router are assigned in the same partition. Subsequently these resulting subsets are partitioned evenly by mapping the hash values of content names into partition numbers. As a design example, we leverage NDN [6] (a.k.a CCN) to illustrate how to extend ICN architectures to implement CPHR. To implement CPHR, we extend the FIB table with a new column named “Egress Router” as shown in Figure 1. This column indicates which egress router Interests belonging to a specific prefix are to be sent to before leaving the network. Additionally, at each ingress router, there is a corresponding Partition Assignment Table for each egress router. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). SIGCOMM'13, August 12-16, 2013, Hong Kong, China ACM 978-1-4503-2056-6/13/08.

Inter-ISP Link D

C

B P1 of Egress D

P2 of Egress D E

A Interest Name: /Node_B/www.youtube.com/video/spiderman.rm; CacheName: Node_B; EgressName: Node_D Prefix

Next Hop

/www.youtube.com/ A /www.bbc.com/ E … …

Name: /www.youtube.com/video/spiderman.rm

Inter-ISP Link

F Inte rest

Egress Router D E …

Hash Function

Partition P1 P2 …

Cache B E …

Figure 1. A design example of CPHR on NDN. Before routing Interests to the corresponding content providers, routing them to assigned caches would inevitably result in path stretch. For example, in Figure 1, the Interest experience 2-hop path stretch. We control this negative impact by constraining the worst path stretch during the process of assigning partitions to caches.

535

To solve the problem of assigning partitions originating from different egress routers to caches, we break it down into subproblems which assign partitions originating from the same egress router to caches in order to maximize the overall hit ratio while satisfying the constraint of the worst path stretch. These subproblems could be mapped into the MultiCommodity Facility Location (MCL) problem which is NP-hard. We propose a heuristic algorithm named AssignCache to solve it, by the following two means: 1) Assign exclusively a set of caches to a partition; 2) Limit the volume of requests which are directly sent to the egress router, while trying to find as many cache sets exclusively for partitions as possible. The least served ratio (lsr for short) is imposed on the cache sets found, which denotes the least request proportion of a partition should be served by caches. We first calculate the sum of the request proportions from ingress routers that a router can serve for a specific egress router , denoted by , . Here, “can serve” means the path stretch imposed by using the router as cache does not exceed the path-stretch constraint. Function served_ratio in the following algorithm sums the request proportions from ingress routers in served_set. Algorithm AssignCache for all in calculate , Sort in in decreasing order of , selected_routers =∅ count = 0 for all in current_set = ∅ served_set = ∅ if not in selected_routers current_set = current_set ∪ selected_routers = selected_routers ∪ served_set = served_set ∪ _ while _ served_set selected_routers served_set Find a router , which can serve the largest and non-zero proportion of requests from ingress routers in , If cannot, exit() current_set = current_set ∪ selected_routers = selected_routers ∪ served_set = served_set ∪ _ count=count+1 cache_sets[count]=current_set

4. EVALUATION WITH P2P TRAFFIC We evaluate CPHR with a trace of P2P traffic which is provided by [5]. Basically, the original collected data are the distribution information of a large population of file replicas, which is attained from two-month passive measurement study of the Gnutella file sharing network. Like what the authors in [5] do, we assume that all the file replicas observed in an AS were downloaded from peers outside the AS sometime in the past. Here, we choose the file replicas observed in the network of AT&T. We map the IP address of the owner of each file replica into geographical location by Geo-IP database and choose the nearest PoP as its ingress point. As Figure 2 shows, the scattered red points are the file replicas (Only 1% of the total 1,026,081 replicas are randomly selected and shown in the figure), and the red bars on PoPs are the resulting request number from them. We assume the PoP at Los Angeles is connected to the single egress router from which all the file replicas originate. The green lines are the links of the shortest paths from all the PoPs to the egress point, along

which the en-route caching mode is effective. We set the worst path stretch to 20ms which is roughly equal to the propagation latency from Los Angeles to New York, and set lsr to 0.7, which means at least 0.7 of the requests of a partition from all the ingresses are sent to assigned caches. With Algorithm AssignCache, 11 nodes are found for 11 partitions, which are plotted in Figure 2. 50

45

7 6 8

40

4

3

5

35

1 10

11

2

30

25

9 -120

-110

-100

-90

-80

-70

Figure 2. Evaluation scenario with a P2P trace of AT&T The preliminary result of the trace-driven simulation is presented in Figure 3. CPHR can increase the overall byte hit ratio signifycantly for both cache replacement policy LRU and LFU. Setting cache size to 160G, CPHR can increase the overall byte hit ratio by 105.7% with LRU. 0.5 Overall Byte Hit Ratio

3. CACHE ASSIGNMENT ALGORITHM

0.4 0.3

En-route LRU CPHR LRU En-route LFU CPHR LFU

0.2 0.1 0.0 G G G G G G G G G 10 20 40 80 160 320 640 1280 2560 Cache Size

Figure 3. Overall byte hit ratio vs cache size

5. ACKNOWLEDGMENTS This research is supported by the National High-tech R&D Program (“863” Program) of China (No.2013AA0l0605) and the National Science Foundation of China (No.61073172). Jun Bi is the corresponding author.

6. REFERENCES [1]

[2]

[3]

[4]

[5]

[6]

536

J. Choi, J. Han, E. Cho, T. T. Kwon, and Y. Choi, “A survey on content-oriented networking for efficient content delivery,” IEEE Communications Magazine, pp. 121–127, 2011. C. Fricker, P. Robert, J. Roberts and N. Sbihi, “Impact of traffic mix on caching performance in a content-centric network,” In Proc. of IEEE INFOCOM NOMEN, 2012. D. Perino, and M. Varvello, “A Reality Check for Content Centric Networking,” Proceedings of the ACM SIGCOMM workshop on Information-centric networking, 2011. Z. Li, and G. Simon, “Time-Shifted TV in Content Centric Networks: the Case for Cooperative In-Network Caching,” 2011 IEEE International Conference on Communications (ICC), 2011. Mohamed Hefeeda, and Behrooz Noorizadeh, “On the benefits of cooperative proxy caching for peer-to-peer traffic,” IEEE Transactions on Parallel and Distributed Systems, vol. 21, no. 7, pp. 998–1010, 2010. L. Zhang et al. Named Data Networking (NDN) Project, 2010.http://named-data.net/ndn-proj.pdf.