Simulation Framework for Adaptive Overlay Content Caching

Theodore Zahariadis, Andreas Papadakis, and Nikolaos Nikolaou
Abstract—Current Internet architecture is challenged by the overwhelming multiplication of available content and the increasing requirements of content delivery applications. Content Delivery Networks have been playing a catalytic role in content delivery solutions, replicating (caching) content at the edges of the network. In this paper, we investigate the role and the benefits of a more flexible and adaptive form of overlay content replication, considering changing conditions such as the popularity and the size of the content objects. We have designed and simulated a cache evaluator framework which considers the varying external conditions and can provide guidance on the operation of in-network caches.

Keywords—Adaptive, caching, content, overlay, replication, simulation.

I. INTRODUCTION

The overwhelming production of multimedia content, along with increasing user requirements such as reduced latency, guaranteed quality of service, and data availability, sets new challenges for content and service providers, and even more so for network providers. Even the prioritization of the requirements is changing. When the size of requested content items was limited (mainly web pages), content access latency was the prime requirement. Currently, the size of individual content items has increased (hundreds of MBs or even GBs), as has the demand for stable quality of experience. This raises the question of efficient access to the content. One of the (commercially successful) solutions employed to enable or facilitate seamless content delivery has been the replication (caching) of content, mainly at the edges of the network, with the usage of surrogate (caching) servers. Content replication mechanisms can be deployed by content providers, network providers, and, in principle, by third parties forming Content Delivery Networks (CDNs). Content replication mechanisms have been primarily associated with the HTTP protocol, although they can in theory be supported over every delivery mechanism (including streaming protocols). However, the ubiquitous nature of HTTP has resulted in straightforward caching deployment, alleviating unnecessary complexities and security configuration issues. According to [1], HTTP is becoming the workhorse of multimedia content delivery, being used for approximately 80% of content (mainly Flash video) and having overtaken the (until recently dominant) peer-to-peer protocols. This allows us to estimate that (HTTP-based) content caching solutions are here to stay. The role of caching is fostered by the stable and significant percentage of cacheable content. Indeed, not all traffic is appropriate for caching purposes. Single-usage content, such as VoIP or teleconference traffic, as well as content delivered over secure channels (in encrypted form), is not among the cacheable candidates. Web and multimedia traffic (excluding peer-to-peer) is currently estimated at 70% of the overall traffic, with roughly 30% of it being reusable, meaning that at least 20% of the overall network traffic can potentially be cached. In fact, content replication solutions become increasingly important not only for current but especially for future networks. It is our view that the Future Internet architecture will encompass, apart from legacy data centres, micro and tiny ones acting as distributed network nodes. Such nodes can be co-located or combined with large network routers and have adequate storage and processing power to serve a limited number of users. As they will be distributed in the network, at the access network edges, they are candidates for operating as dynamic, in-network content caches. Under this view, we estimate that content replication is finding its way towards the internal part of the network, in the form of in-network caching. This distributed caching overlay forms a complex networking ecosystem which has to consider multiple, sometimes interdependent parameters related to the content, the end users, and the network conditions.

These parameters, largely unexplored in a holistic manner, affect the smooth operation of the caching overlay and its performance [2]. In the following, we consider the parameters that affect content caching schemes, focusing on network neighborhoods in a distributed scenario, and propose a simulation framework modeling the most important of them. Performing a series of simulations, we associate the performance of caches (in terms of hit rates) with these parameters, considering two of the most typical content replacement algorithms. This association can provide a level of adaptability which is much needed in a dynamic environment where the external conditions are constantly changing. The structure of the paper is as follows: Section II presents a view of the related work in the area of content replication, the architecture which has been the starting point of our work, and the factors that may affect the operation and the performance of the content replication mechanisms.
Manuscript received February 21, 2012. This work was supported in part by the projects COAST ICT-248036 and REVERIE ICT-287723, which are co-funded by the European Community.
T. Zahariadis is with Synelixis Solutions Ltd, Chalkida, Greece (e-mail: [email protected]).
A. Papadakis is with the Department of Electronics Engineering Educators, School of Pedagogical and Technological Education (ASPETE), Athens, Greece and with Synelixis Solutions Ltd, Chalkida, Greece (phone: +306936887333; e-mail: [email protected]).
N. Nikolaou is with Synelixis Solutions Ltd, Chalkida, Greece (e-mail: [email protected]).
978-1-4673-1118-2/12/$31.00 ©2012 IEEE
TSP 2012
In Section III we present the way we model these parameters and describe the simulation environment we have designed and implemented. In Section IV the simulation scenarios and their results are presented and discussed. The last section includes the conclusions and a short evaluation of the overall work.

II. RELATED WORK AND ARCHITECTURE

A. Related Work

Web caching mechanisms have been extensively researched, and a series of content replacement algorithms has been described, including the well-known LRU (Least Recently Used) and LFU (Least Frequently Used), their variations (LRU-min, LRU-Threshold), and others such as GDS (Greedy Dual Size) and HGD (Hierarchical Greedy Dual) [3]. Caching topologies are a hot research topic; they are coarsely separated into structured and unstructured [4]. The level of imposed structure varies from the strictly hierarchical, as met in tree-like IPTV networks (described in [5]), to peer-to-peer, where nodes and content are identified using e.g. DHT (Distributed Hash Table) and similar algorithms without the need for centralized, orchestrating entities (described in [6]), to hybrid schemes. Distributed caching scenarios and the subsequent collaboration have also attracted the attention of researchers [7]. The interconnection of separately administered CDNs in support of end-to-end content delivery is pursued by the recently formed CDNI (Content Delivery Networks Interconnection) working group [8]. The distribution of web requests from fixed user communities and the correlation of popularity with other content characteristics have been investigated in [9]. Deep Packet Inspection functionality and initiatives such as Application Layer Traffic Optimization (ALTO) and Decoupled Application Data Enroute (DECADE) provide additional information to the caching mechanisms, related to the state of the network itself, that can guide the usage of original or cached content, combining data on the number of hops to be traversed and the availability of link capacity between the cache and the end users [11], [12].

B. Our Architecture

The starting point of our work is the architecture designed and developed within the project COAST [10], as depicted in Figure 1. The architecture consists of three network layers / overlays: 1) the network infrastructure layer, 2) the content caching overlay, and 3) the information overlay. Content is replicated in caches at the network edge, in the content caching overlay. The cache optimizer is responsible for monitoring external conditions, running simulations in real time (or near real time), and providing guidance to the cache nodes. It can support caches in deciding which (kinds or types of) objects they should store or evict. Coordinating optimizers are considered a distributed functionality. The Network Monitor is responsible for gathering all network-related information: topology, traffic, characteristics of the user Internet access, and optionally user location. It can be a variation of an IETF ALTO server, or communicate with and be supported by external traffic and network optimizer servers. We consider that it is supported by Deep Packet Inspection (DPI). DPI contributes to the discovery of content and services. It also generates information on content characteristics, including popularity. The design and implementation of the DPI service are outside the scope of this work. Nevertheless, we consider the information on the content characteristics in the network neighborhood the primary input for the cache optimizer (and our simulation model). In the next subsection we investigate the factors that affect the performance of the local caches. We model them in a quantitative and manageable manner in order to design and implement a simple but powerful simulation framework.
C. Factors Affecting Caching Performance

The factors that may affect the performance of the content caching mechanisms are related to:
1) The characteristics of the content (e.g. size of files and interdependence)
2) The distribution of the requests for that content
3) The characteristics of the cache server
4) The network conditions
The characteristics of the content (and especially of multimedia / video content) are being transformed due to the abundance and popularity of user-generated clips, the growth of multimedia libraries, the widespread deployment of IPTV services with advanced, personalized features, and the focus on quality based on advanced coding capabilities. Light web pages are replaced by multimedia files, with potentially multiple versions (e.g. with escalating quality in a Scalable Video Coding scenario) or even complementary versions (e.g. in emerging Multi-View Coding scenarios, where multiple views of the content may be simultaneously available). This means that the number of multimedia files proliferates and their size increases. The patterns of the content requests depend on the application (e.g. IPTV or ordinary web page browsing), the characteristics of the end user group requesting the content, and the (temporally evolving) content popularity.
Figure 1. Overlay caching architecture.
TABLE I
INITIALIZED PARAMETERS
The content requests can be dense (in terms of requested content), when similar content items are frequently requested within the same user group, or sparse, where dispersed clients are interested in the same content item. Short-lived content can quickly become popular and highly requested (the hot-spot case is a temporally dependent example). The characteristics of the cache server are related to its capacity, the maximum number of incoming and outgoing connections, and its processing power. The network conditions are related to the capacity of the links and the temporally dependent traffic; network monitoring information can be retrieved from ALTO servers.
Number of active content objects
Size range of content objects
Number of requests
Timestamps of each request
Association between requests and content objects
Number of users
Duration
Cache capacity
Content replacement algorithms
Max incoming / outgoing connections
Link bandwidth
Accommodated content objects
Weights of accommodated content objects
III. THE SIMULATION FRAMEWORK

In the following we describe the way we have modeled the affecting factors investigated in the previous section, as well as the simulation framework. Regarding the content (re)placement mechanisms, we consider two of the most frequently used algorithms, LRU (Least Recently Used) and LFU (Least Frequently Used). The requested content files may reside in the nearest cache server, another cache server, or the origin server. Assuming that the maximum benefit is achieved when the content is present at the nearest cache, we define and evaluate two typical metrics (file and byte hit rates). The typical workflow in the simulation environment includes (a) the initialization process, (b) the serving of the incoming requests, and (c) the preparation of the statistics.

Each cache object (server) can support a variety of content (re)placement mechanisms in order to manage the cached content. As mentioned, in the current version the two typical mechanisms have been implemented: LRU and LFU. When the cache is full, in the case of LRU the least recently used content objects are expelled, while in the case of LFU space is freed by expelling the content objects which are least frequently requested (to alleviate the administrative burden, a percentage of the overall space, e.g. 10-20%, is freed at once). Each cache also has a set of properties whose values change over time; they are primarily associated with the accommodated content (the content objects currently available in the cache) and their weights (e.g. frequency of requests). Table I summarizes the parameters of the models that are created and parameterized in the initialization phase of the simulation. We consider a continuous, periodic retro-fit of the values of the dynamic ones (e.g. the content popularity), based on the analysis of the DPI service and other sources.

A. The Initialization Process

In the initialization process we configure the parameters related to the content, the requests, the clients, and the cache itself. The cacheable content consists of individual, independent multimedia files. The number of active content objects / individual files (NObjects) and the range of their sizes are configurable. The size range is defined as [Smin, Smax], where Smin and Smax are the minimum and maximum file size values respectively. During the initialization process each file is assigned a size (in KBytes) within this range, either in a random manner or following a pre-defined distribution. The number of end users (NUsers) and of requests to be performed and served during the execution (NRequests) are also configurable. Each request is performed by a specific user for a content object (individual file) at a specific point in time (TRequest), recorded as a timestamp (in seconds elapsed after the beginning of the simulation). The larger the number of requests (or the smaller the configured simulation duration), the (temporally) denser the incoming requests are. In order to support large numbers of requests, the request information is stored in an external database; during the request handling phase, the simulator retrieves and handles it in batches. The cache is characterized by static properties, including its capacity (in Bytes) and the maximum number of incoming and outgoing requests that can be handled concurrently. The link bandwidth capacities and the sizes of the requested content objects affect the duration of each connection.
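As an illustration, the initialization step described above can be sketched in Python as follows. This is a sketch under our own assumptions: the function name `generate_workload`, the uniform size distribution, and the default parameter values are illustrative, not the framework's actual implementation.

```python
import random

def generate_workload(n_objects, n_requests, n_users, duration,
                      size_range=(10, 10_000), zipf_s=1.2, seed=42):
    """Sketch of the initialization phase (hypothetical helper).

    Assigns each content object a size in KBytes drawn uniformly from
    [Smin, Smax], then draws requests whose object choice follows a
    Zipf-like popularity law with exponent zipf_s. Each request is a
    (timestamp, user_id, object_id) tuple, sorted by timestamp.
    """
    rng = random.Random(seed)
    smin, smax = size_range
    sizes = {i: rng.randint(smin, smax) for i in range(n_objects)}
    # Zipf-like popularity: P(rank k) proportional to 1 / k**s
    weights = [1.0 / (k + 1) ** zipf_s for k in range(n_objects)]
    requests = sorted(
        (rng.uniform(0, duration),                            # timestamp (s)
         rng.randrange(n_users),                              # requesting user
         rng.choices(range(n_objects), weights=weights)[0])   # object id
        for _ in range(n_requests))
    return sizes, requests
```

A Zipf-like law with a larger exponent concentrates the requests on fewer objects, which is how the request-density levels discussed later can be configured.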
B. Simulation Execution and Statistics

During the execution of the simulation, the users perform requests for content (each request is associated with a content object), and these requests are routed to the cache. The cache is responsible for managing them in the most effective way. In principle, when the requested content object is available in the cache it is directly served; otherwise it has to be served from the origin server or another cache. In the latter case (the requested content object is absent) we can consider two basic sub-scenarios: (a) the request of the user is re-routed to the origin server and the cache is not involved any further, or (b) the content is retrieved from the origin server on behalf of the cache, and it may also be stored in the cache. While both cases can be followed, we implement the second one, in order to update the conditions for caching the requested content. The cache is designed and implemented as a programmable object with static properties (e.g. the capacity and the intrinsic characteristics) and dynamic ones (current size, cached objects and their weights). At any time during the simulation execution, the cache object can provide information on its status: the size (volume) of the stored content, the concurrent incoming and outgoing connections, and the processing load.
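The serving logic, together with the LRU and LFU eviction rules and the store-on-miss behavior of sub-scenario (b), can be sketched as follows. This is a minimal sketch: the class name, the `evict_fraction` parameter (mirroring the 10-20% of freed space mentioned earlier), and the in-memory data structures are illustrative assumptions, not the paper's implementation.

```python
from collections import OrderedDict

class Cache:
    """Minimal cache sketch with LRU and LFU eviction (hypothetical names)."""

    def __init__(self, capacity, policy="LRU", evict_fraction=0.1):
        self.capacity = capacity          # static property, in bytes
        self.policy = policy              # "LRU" or "LFU"
        self.evict_fraction = evict_fraction
        self.objects = OrderedDict()      # object_id -> size; order = recency
        self.freq = {}                    # object_id -> request count (LFU weight)
        self.used = 0                     # dynamic property: current size

    def _evict(self, needed):
        # Free enough space for the incoming object plus a slack fraction
        # of the capacity, so evictions happen in batches.
        target = self.used + needed - self.capacity + self.evict_fraction * self.capacity
        freed = 0
        while freed < target and self.objects:
            if self.policy == "LRU":
                victim, size = self.objects.popitem(last=False)   # least recently used
            else:  # LFU: expel the least frequently requested object
                victim = min(self.objects, key=lambda o: self.freq[o])
                size = self.objects.pop(victim)
            self.freq.pop(victim, None)
            self.used -= size
            freed += size

    def request(self, object_id, size):
        """Serve one request; return True on a cache hit.

        On a miss, the object is (conceptually) fetched from the origin
        on the cache's behalf and stored locally (sub-scenario (b)).
        """
        self.freq[object_id] = self.freq.get(object_id, 0) + 1
        if object_id in self.objects:
            self.objects.move_to_end(object_id)   # refresh recency for LRU
            return True
        if size > self.capacity:
            return False                          # object cannot fit at all
        if self.used + size > self.capacity:
            self._evict(size)
        self.objects[object_id] = size            # store so future requests hit
        self.used += size
        return False
```

The choice of an `OrderedDict` makes the LRU bookkeeping a constant-time move-to-end operation, which matches the remark later in the paper that LRU is based on the sequence of requests (a linked-list-like structure), while LFU additionally maintains per-object frequencies.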
C. Statistics

The statistics are available at the end of the simulation. The main metrics that we have used for evaluating the cache performance are the file hit rate and the byte hit rate:
1) The file hit rate is the percentage of the requested content objects that are served from the cache over the overall number of requested objects, considering each file as an individual entity (independent of its size).
2) The byte hit rate is the percentage of the volume (total size) of the requested content objects served from the cache over the total volume of the requested content objects.
The statistics are completed with the figures of the concurrent connections throughout the simulation duration, and the cases when the operation of the cache is hindered due to reaching the corresponding upper limits. Measurements are logged in external files at configurable intervals (currently every second).
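The two metrics can be computed over a request trace as follows. Here `hit_rates` and the `is_hit` callback are hypothetical names: the callback stands in for any cache lookup (for instance, the `request` method of a cache object).

```python
def hit_rates(requests, sizes, is_hit):
    """Compute (file hit rate, byte hit rate) over a request trace.

    requests: iterable of (timestamp, user_id, object_id) tuples
    sizes:    mapping object_id -> size
    is_hit:   callable (object_id, size) -> bool, True on a cache hit
    """
    file_hits = byte_hits = total_bytes = 0
    for _, _, obj in requests:
        size = sizes[obj]
        total_bytes += size
        if is_hit(obj, size):            # hit: served directly from the cache
            file_hits += 1
            byte_hits += size
    n = len(requests)
    return (file_hits / n if n else 0.0,
            byte_hits / total_bytes if total_bytes else 0.0)
```

Note that with variable object sizes the two rates can diverge: a cache that hits mostly on small objects has a high file hit rate but a much lower byte hit rate.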
Figure 2. LFU byte hit rate.
IV. SIMULATION SCENARIOS AND RESULTS

A. Scenarios

Performing the simulation scenarios, we compare the cache performance under varying conditions. We investigate the influence of the modeled parameters on the efficiency of the content caching mechanism. The parameters include the characteristics of the user requests and their distribution over the content files, defined through a Zipf-based popularity distribution, as well as the range of the sizes of the content files. The number of content requests takes values from 1,000 to 100,000. The content popularity is associated with the density of the requests by configuring the parameter defining the distribution, which takes three values: 1.05 (sparse requests, i.e. not many requests addressing the same content), 1.2 (medium density), and 1.8 (dense requests, i.e. many requests addressing the same content). The sizes of the content files have been considered in three ranges:
1) From 10KB to 10MB, addressing web pages and related objects such as images (referred to as small sized)
2) From 100KB to 100MB, which refers to typical multimedia objects (referred to as medium sized)
3) From 1MB to 1GB, which includes large multimedia objects (referred to as large objects)
The parameters are completed with the number of requesting clients for a single cache and the number of active content items during the simulation period. The number of scenarios depends on the number of replacement mechanisms, the request volumes, the density levels of the requests, and the ranges of the content size.
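The scenario space described above can be enumerated as a cross product of the varied parameters. In this sketch the intermediate request volumes (10,000 and 50,000) are our own assumptions for illustration, since the text only states the 1,000-100,000 range; the function name is hypothetical.

```python
import itertools

def scenario_grid():
    """Enumerate simulation scenarios: every combination of replacement
    policy, request volume, Zipf exponent (request density), and content
    size range (sizes in KB)."""
    policies = ["LRU", "LFU"]
    n_requests = [1_000, 10_000, 50_000, 100_000]   # intermediate values assumed
    zipf_s = [1.05, 1.2, 1.8]                       # sparse, medium, dense
    size_ranges = [(10, 10_000),                    # small:  10KB - 10MB
                   (100, 100_000),                  # medium: 100KB - 100MB
                   (1_000, 1_000_000)]              # large:  1MB - 1GB
    return list(itertools.product(policies, n_requests, zipf_s, size_ranges))
```

With these assumed request volumes the grid yields 2 x 4 x 3 x 3 = 72 scenarios, one simulation run per combination.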
Figure 3. LRU byte hit rate.
B. Discussion of the Results

The execution of the scenarios has yielded interesting results. In Figures 2 and 3, the byte hit rates of the LFU and LRU algorithms are presented under varying conditions. The hit rates present variability, and they depend on all the independent parameters we have identified. For example, in the case of the LFU algorithm the lowest value of the byte hit rate has been 52% (for distributed content popularity and the lowest number of requests), reaching a maximum value of 99% (for concentrated content popularity and the largest number of requests), i.e. almost every requested content object can be retrieved from the cache server. In the case of LRU the corresponding minimum and maximum values are 45% and 90%. All hit rates have an increasing tendency as the number of requests increases. This tendency is sharp for small and medium numbers of requests (until roughly 50,000 requests) and then stabilizes. The distribution of the popularity of the content has affected the majority of the scenarios: when the requests are dense, much better results are achieved. This suggests the idea of directing potentially similar requests (based on the characteristics of the end user group related to its homogeneity, such as the request originating location) to specific cache servers. Another interesting observation is that the performance of all mechanisms, independent of the popularity of the content and the density of the requests, increases when the size of the content objects reaches its lower limit. The smaller the size (e.g. from 10KB to 10MB), the higher the performance. This is a quite interesting observation, and we assume that splitting large multimedia files (e.g. those larger than 100MB) and serving them as individual files (with a given sequence in their retrieval) can be beneficial for caching purposes. For example, the MPEG multimedia streaming technology DASH (Dynamic Adaptive Streaming over HTTP), where a multimedia file is partitioned into one or more segments and delivered using the HTTP protocol, may have benefits in the area of caching as well [14]. Regarding the replacement algorithms, the LFU algorithm is persistently more effective than the LRU one. This advantage increases at the two ends of the range of user requests (minimum and maximum numbers of requests) and declines in the upper middle (around 50,000 requests). Of course, LFU is more complex, as it has to store information on the frequency of the user requests, while LRU is based only on the sequence of the content requests (typically implemented as a linked list).
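The splitting of large files into independently cacheable segments, as in DASH, can be illustrated with a simple partitioning sketch; the segment size and the function name are hypothetical, chosen only to show how a large object decomposes into small, cache-friendly pieces.

```python
def dash_segments(total_size, segment_size):
    """Partition a file of total_size bytes into DASH-like segments.

    Returns the list of segment sizes: full segments of segment_size
    bytes plus one trailing segment for the remainder, if any. Each
    segment can then be requested (and cached) as an individual object.
    """
    full, rest = divmod(total_size, segment_size)
    return [segment_size] * full + ([rest] if rest else [])
```

Because each segment falls into the small-object range that performed best in the scenarios above, a cache can retain the popular opening segments of a large file even when it cannot hold the file as a whole.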
V. CONCLUSIONS

The increase of network capabilities cannot satisfy the explosion of the available content and the subsequent demand on behalf of the end users. Content replication is currently considered a successful solution, which, we estimate, will have an increasing impact in future architectures. Caching mechanisms can be deployed as overlays aiming at the enhancement of network performance. The problem of maximizing the benefit and efficiency of the caching mechanism is both multi-dimensional and dynamic. We have investigated, in a pragmatic way, the parameters that affect the performance, considering the network neighborhood. We have designed and implemented a simulation framework considering these parameters and performed an extensive set of scenarios, assuming varying conditions. Evaluating the behavior of the caching mechanism, we have verified that its effectiveness depends on the number of requests, the popularity of the content, and the size of the content objects.

REFERENCES
[1] J. Erman, A. Gerber, M. Hajiaghayi, D. Pei, O. Spatscheck, "Network-Aware Forward Caching", Proceedings of the 18th International Conference on World Wide Web (WWW), ACM, 2009.
[2] T. Zahariadis, E. Quacchio, "Fast content-aware delivery in overlay networks", IEEE COMSOC MMTC E-Letter, vol. 5, no. 1, 2010.
[3] S. V. Nagaraj, "Web Caching and its Applications", The Springer International Series in Engineering and Computer Science, 2004.
[4] P. Rodriguez, C. Spanner, E. Biersack, "Analysis of Web Caching Architectures: Hierarchical and Distributed Caching", IEEE/ACM Transactions on Networking, vol. 9, no. 4, 2001.
[5] B. Krogfoss, "Hierarchical Cache Optimization in IPTV Networks", IEEE Transactions on Broadcasting, vol. 55, issue 1, 2009.
[6] W. Shi, Y. Mai, "Performance evaluation of peer-to-peer web caching systems", Journal of Systems and Software, Special issue: Quality software, vol. 79, issue 5, May 2006.
[7] S. Borst, V. Gupta, A. Walid, "Distributed Caching Algorithms for Content Distribution Networks", Proceedings of IEEE INFOCOM, 2010.
[8] The Internet Engineering Task Force (IETF), CDNI - Content Delivery Network Interconnection, Internet Draft.
[9] L. Breslau, P. Cao, L. Fan, G. Phillips, S. Shenker, "Web caching and Zipf-like distributions: evidence and implications", Proceedings of IEEE INFOCOM, 1999.
[10] EU project COAST: COntent Aware Searching, reTrieval and Streaming, www.coast-fp7.eu
[11] The Internet Engineering Task Force (IETF), ALTO Protocol, draft-ietf-alto-protocol-07.
[12] The Internet Engineering Task Force (IETF), DECADE - Decoupled Application Data Enroute, draft-ietf-decade.
[13] EU project REVERIE: Real and Virtual Engagement in Realistic Immersive Environments, http://www.reveriefp7.eu/
[14] Dynamic adaptive streaming over HTTP (DASH), Part 1: Media presentation description and segment format, ISO/IEC FDIS 23009-1.