Dynamic Cache Resizing in Flashcache

Amar More and Pramod Ganjewar
MIT Academy of Engineering, Pune, Maharashtra, India
{ahmore,pdganjewar}@comp.maepune.ac.in

Abstract. With the increase in the capacity of storage servers, storage caches have become a necessity for good performance, and flash-based disks have proved to be one of the best media for such caches. When a flashcache layer is part of a storage server, it often needs to be expanded or scaled to handle larger workloads without affecting the quality of service delivered. Today, resizing the cache space requires shutting down the server for some time, or transferring the workload to another server and restarting, and performance suffers further while the cache warms up to repopulate valid data. In this paper, we introduce a solution for dynamic resizing of the cache for the Facebook Flashcache module without affecting performance. Our solution resizes the flash cache on the fly, removing the need to restart the server or transfer its load to another server, and it eliminates the need to warm up the cache after resizing.

1 Introduction

The growing need for performance and improvements in the lifecycle and capacity of SSDs have encouraged many storage vendors to use flash-based memory or SSDs as a cache for their storage and database servers [2,3]. Such caches deliver significant performance gains for I/O-intensive workloads at reasonable cost [1], and flash cache is one of the best-performing storage solutions offered by major storage providers [1]. Integrating such a caching mechanism within the operating system can achieve the required throughput for I/O-intensive workloads while reducing design complexity. Some storage vendors provide modules for their storage products that can be easily integrated within their systems [1]. There are also open source modules, such as Facebook Flashcache [4] and Bcache [5], which can be used in open source operating systems like Linux. Both Flashcache and Bcache can be used as a block-layer cache for the Linux kernel, but they differ significantly in operation. Bcache supports multiple backing devices for a single cache device, and these can be added or removed dynamically; it is designed around the performance characteristics of SSDs and makes optimal use of the SSDs employed as cache. Facebook Flashcache was originally designed to optimize database I/O performance by acting as a block cache and was later extended for general-purpose I/O and other applications. Flashcache supports a variety of disks, with respect to their I/O latency, for use as cache: any portable flash device, SSD, or even a high-speed rotational disk can serve as the cache device. All data available on disk can be cached on such devices and made persistent.

© Springer International Publishing Switzerland 2015. S.C. Satapathy et al. (eds.), Proc. of the 3rd Int. Conf. on Front. of Intell. Comput. (FICTA) 2014 – Vol. 1, Advances in Intelligent Systems and Computing 327, DOI: 10.1007/978-3-319-11933-5_60

In order to meet clients' service-level agreements or to handle specific I/O-intensive workloads, the cache space often needs to be resized dynamically. Today this requires shutting down the server, or transferring its workload to another server while the cache space is resized, and then restarting the server. When the cache space is increased, the newly created cache space is empty and takes hours to days to warm up [6]. Some algorithms have been designed to warm up caches, but they must maintain and trace information such as logs of frequently used data and heuristics about data that may be used; this data is then used to warm up the newly created cache. In this paper, we present an efficient solution for dynamic resizing of a cache based on Facebook Flashcache which eliminates the need for restarting the server. The rest of the paper is organized as follows: Section 2 briefly describes the working of Facebook Flashcache, followed by the challenges we identified in Section 3. Sections 4 and 5 describe our design and the evaluation of its performance.

2 Facebook's Flashcache

Flashcache is a device-mapper target developed using the Linux Device Mapper. It is a block-layer Linux kernel module that caches data on disks faster than the disks used for secondary storage, boosting I/O performance. Flashcache was primarily designed for InnoDB and was later extended for general-purpose use. In Flashcache, the cache is divided into uniformly sized buckets, and I/Os on the cache are mapped using a set-associative hash. Each bucket consists of data blocks. Metadata is maintained separately for blocks and for cache buckets, which makes it easy to handle. Cleaning of sets is triggered on two conditions: first, when the number of dirty blocks exceeds the configured threshold value, and second, when some blocks have not been used for some time, i.e., stale blocks. Cache block metadata records the state of the block, i.e., VALID, DIRTY, or INVALID.
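As a rough illustration of this layout, the following Python sketch models a cache set with per-block state and a dirty-threshold cleaning trigger. All names and the threshold value are illustrative; they are not Flashcache's actual C structures.

```python
# Illustrative model of a Flashcache set: blocks carry a state, and the
# set is cleaned when the fraction of dirty blocks exceeds a threshold.
from enum import Enum

class BlockState(Enum):
    INVALID = 0   # slot unused
    VALID = 1     # clean copy, identical to the backing disk
    DIRTY = 2     # newer than the backing disk, must be written back

class CacheSet:
    def __init__(self, set_size, dirty_thresh_pct=20):
        self.blocks = [{"dbn": None, "state": BlockState.INVALID}
                       for _ in range(set_size)]
        self.dirty_thresh_pct = dirty_thresh_pct

    def needs_cleaning(self):
        # First cleaning trigger: dirty blocks exceed the configured threshold.
        dirty = sum(b["state"] is BlockState.DIRTY for b in self.blocks)
        return dirty * 100 // len(self.blocks) >= self.dirty_thresh_pct
```

The second trigger, staleness, would require per-block access timestamps and is omitted here for brevity.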

Fig. 1. Overview of Facebook Flashcache


Each I/O arriving at Flashcache is divided into block-sized requests by the device mapper and forwarded to Flashcache for mapping. A block is stored in a bucket by computing its target bucket with a hash function. The hash function used for target-set mapping is:

target_set = (dbn / block_size / set_size) mod number_of_sets

where dbn is the disk block number. After calculating the target bucket of a block, linear probing is used to find the block within that bucket. The replacement policy within a bucket is FIFO by default and can be changed on the fly to LRU via sysctl. To keep data persistent, metadata is written to disk on a scheduled shutdown. In case of an unplanned shutdown, only DIRTY blocks persist on the cache, and the cache needs warming up only for VALID blocks.
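The mapping above can be sketched as follows. The helper names and parameter values are illustrative, not Flashcache's API; sizes are in sectors.

```python
# Hedged sketch of the target-set calculation and the in-set lookup.
def target_set(dbn, block_size, set_size, num_sets):
    # target_set = (dbn / block_size / set_size) mod num_sets
    return (dbn // block_size // set_size) % num_sets

def find_block(cache_set, dbn):
    # Linear probing: scan the set for a slot holding this disk block number.
    for i, slot in enumerate(cache_set):
        if slot == dbn:
            return i
    return None  # miss within this set

# The result depends on num_sets: with 512 sets this block maps to set 188,
# with 1024 sets it maps to set 700.
s_old = target_set(2867200, block_size=8, set_size=512, num_sets=512)   # 188
s_new = target_set(2867200, block_size=8, set_size=512, num_sets=1024)  # 700
```

The last two lines foreshadow the resizing problem discussed later: because number_of_sets is an input to the hash, growing the cache silently changes where existing blocks are expected to live.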

3 Challenges Identified in Facebook Flashcache

In order to resize the cache in Flashcache, we need to bring the system offline, resize the cache, and restart and reconfigure Flashcache. Although Flashcache provides cache persistence in writeback mode, warming up the cache after resizing degrades system performance, because an increase in the number of buckets changes the mapping of cache blocks to buckets. The following issues must be considered while implementing resizing.

3.1 Remapping of Blocks to Their Appropriate Sets While Resizing

Block mapping in Flashcache is done through linear probing within the target set, which is calculated by a hash function. The hash function takes the start sector number as the key to calculate the respective target set, and it also requires the total number of sets in the cache. When the cache is resized dynamically by adding a device, the total number of sets in the cache changes. This difference results in incorrect target-set calculations, which can ultimately lead to inconsistent I/O operations. Thus we need to maintain a consistent block-to-target-set mapping during and after the resizing process.

3.2 Data Consistency during Resizing in Writeback Mode

In the writeback mode of Flashcache, data is written only to the cache and is later lazily written to the disk in the background while sets are cleaned. This cleaning is triggered on two conditions: when the number of dirty blocks in a set exceeds its configured threshold, or when a block has not been used for a long period, i.e., it lies fallow on the cache. So data written to the cache is not reflected on disk immediately. Metadata updates are done only on the transition of a block from dirty to valid or invalid, or vice versa. Here the major challenge is to prevent data loss and incorrect metadata updates, and to maintain the consistency of the data on the cache, while resizing it. In writeback mode most of the data is present only on the cache, and we cannot bypass the cache for even a single I/O. Another challenge is to handle I/Os in an appropriate order during resizing without serving inconsistent data.

4 Design for Dynamic Resizing

The existing Flashcache implementation supports only a single cache device. While creating the cache, this cache device is configured and divided into sets and blocks. Arrays keep track of the attributes of each set and block of the cache device. These arrays are created at cache-creation time, and every internal algorithm of Flashcache depends on them, as shown in Figure 2.

Dynamic resizing supports multiple devices being added online. For this purpose we maintain a list of devices instead of a single cache device. Each device in the list has its own arrays for sets and blocks, which are created at resizing time. To keep the internal algorithms intact, a few mapping functions introduce a virtual layer between the cache devices and the internal algorithms. This virtual layer lets Flashcache work as if it were dealing with a single cache device. Once the cache is resized, its number of sets changes, which affects the hash function used to map a disk block to a cache block. As this change may introduce inconsistency in the stored data, we first implemented a basic framework for resizing and then built an optimal resizing solution on top of it.

Fig. 2. Array Mapping
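The virtual layer described above can be sketched as follows. The class and function names are our own illustration of the idea, not Flashcache's internals: the internal algorithms keep addressing one flat range of set indices, and a resolver maps a global set index onto one device in the list.

```python
# Sketch of the virtual layer over a list of cache devices.
class CacheDevice:
    def __init__(self, name, num_sets):
        self.name = name
        self.num_sets = num_sets

def resolve_set(devices, global_set):
    # Walk the device list, subtracting each device's set count, until the
    # global index falls inside one device's local range.
    for dev in devices:
        if global_set < dev.num_sets:
            return dev.name, global_set
        global_set -= dev.num_sets
    raise IndexError("set index beyond total cache size")

devices = [CacheDevice("ssd0", 512), CacheDevice("ssd1", 512)]
```

With this indirection, adding a device only appends an entry to the list; the set and block arrays of existing devices stay untouched.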

4.1 Basic Approach

In this approach, we follow a basic framework for resizing while accounting for all the complexities involved. The process begins by configuring the newly added cache device. Once the device is configured, it is added to the list of cache devices. After that, all incoming I/O requests are put on hold. The complete cache is cleaned to avoid inconsistency due to re-mapping. The size of the cache and the total number of sets are updated, and the held I/O requests are released. Because the cache has been cleaned completely, the cache-miss count will be higher on subsequent I/O operations, and performance will degrade slightly until the cache is repopulated.

4.2 Optimal Resizing Approach

The optimal resizing approach is more complex than the basic approach, but its performance is much higher. We divide the cache sets into three logical categories: the trap set, re-mapped sets, and disguised sets. The properties of each category are as follows:

1. Trap set (its blocks are being re-mapped to their new positions after resizing): all I/Os arriving at this set are held until every block in the set has been remapped to its correct position. At any time only a single set is a trap set.
2. Re-mapped sets (all blocks are in their correct positions after resizing): all I/Os arriving at these sets are mapped with the new hash function, which uses the updated number of sets.
3. Disguised sets (blocks are not yet in their correct positions after resizing and need re-mapping): all I/Os arriving at these sets are mapped with the old hash function.

Fig. 3. Set States in Optimal Resizing Approach
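The per-category routing rules can be sketched as follows. The state names, function, and example set counts are our illustration, under the assumption that each set's category is tracked during resizing.

```python
# Route an I/O according to the state of its set during optimal resizing.
TRAP, REMAPPED, DISGUISED = "trap", "remapped", "disguised"

def route_io(set_state, dbn, old_hash, new_hash):
    if set_state == TRAP:
        return ("hold", None)          # queue until the set is untrapped
    if set_state == REMAPPED:
        return ("map", new_hash(dbn))  # hash with the updated set count
    return ("map", old_hash(dbn))      # disguised: pre-resize mapping

old_hash = lambda dbn: dbn % 512       # illustrative set counts
new_hash = lambda dbn: dbn % 1024
```

Both hash functions therefore coexist for the duration of the resize, and each I/O picks one based solely on the state of the set it targets.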


As shown in Figure 3, the resizing process begins by marking a set as the trap set and holding all I/Os arriving at that set only. Each block is visited sequentially and the new hash function is applied to it; if the block maps to the same set, i.e., the trap set, it is left as it is. If the block maps to a different set, it is marked invalid in the current set, and an I/O is triggered for that block at its new position after remapping it to the new set. This procedure continues for all blocks in the trap set. Next, the set is untrapped and marked as a re-mapped set, and all I/Os arriving at it are thereafter calculated with the new hash function. After removing the trap from the previous set, the next set is visited, marked as the trap set, and handled in the same way. We trap and remap every set in turn until all sets are remapped, and then update the superblock and the total number of sets in the cache context. Summarizing the overall process: during resizing we use both hash functions, old and new; after resizing completes and the superblock and cache context are updated, we use only the new hash function.
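A minimal sketch of this trap-and-remap pass, assuming sets are modeled as plain lists of disk block numbers and placement is a simple modulo over the set count (both assumptions are ours, for illustration only):

```python
# One-set-at-a-time remapping: trap a set, re-hash each of its blocks with
# the new set count, invalidate movers and re-insert them at their new set.
def remap_all(sets, new_num_sets, new_target_set):
    for s in range(len(sets)):
        # set s is now the trap set; a real implementation holds its I/O here
        moved = []
        for dbn in list(sets[s]):
            t = new_target_set(dbn, new_num_sets)
            if t != s:                 # block belongs elsewhere after resize
                sets[s].remove(dbn)    # invalidate in the old position
                moved.append((dbn, t))
        for dbn, t in moved:
            sets[t].append(dbn)        # I/O to the block's new position
        # set s is untrapped and becomes a re-mapped set

# Doubling from 2 to 4 sets with a simple modulo placement:
sets = {0: [0, 2, 4], 1: [1, 3, 5], 2: [], 3: []}
remap_all(sets, 4, lambda dbn, n: dbn % n)
```

Because only one set is trapped at a time, I/O to every other set keeps flowing, served by whichever hash function matches that set's current state.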

5 Evaluation

We tested Flashcache on a system with an Intel dual-core processor clocked at 2.6 GHz and 4 GB of RAM. For the disk device and cache device we used loop devices. We compared the performance of our implementation of Flashcache with the original Flashcache configured with a 2 GB cache and a 5 GB disk. Our system initially had a 1 GB cache, which we resized dynamically by adding an additional 1 GB of cache, with the same 5 GB disk. The following are the test results generated by the IOzone benchmark for read/write.

Fig. 4. IOzone Reader Report


Fig. 5. IOzone Writer Report

Fig. 6. Flashcache Read/Write Status Report

From the IOzone reader and writer results shown in Figures 4 and 5, we can observe that our implementation of Flashcache does not degrade normal read/write performance. The throughputs of all the Flashcache variants, in particular the original and the advanced one, are almost equal on average. Maintaining the original performance while introducing resizing was a necessity, and the charts show that the mapping functions in our implementation do not degrade it.

Flashcache maintains a count of all read hits in a cache for displaying cache statistics. To confirm that our implementation offers a decent read-hit percentage even after resizing, we tested the following scenario. We created three writeback caches using the original, basic, and advanced Flashcache implementations (one for each). For each cache, the cache device was 1 GB and the disk device was 5 GB. We wrote a 2.4 GB file to each cache. Then we resized the caches created using the basic and advanced implementations to 2 GB (adding one more 1 GB cache device to each). The basic implementation cleaned all blocks in the cache device, while the advanced implementation re-mapped all blocks. Then we read the same file back from all three cache devices. Figure 6 shows the read-hit percentage obtained from each cache while reading this file. We observe that advanced resizing gives a slightly higher percentage than basic resizing. The original Flashcache gives a small read-hit percentage because it was not resized from a 1 GB to a 2 GB cache.

6 Conclusion

Flash caches in storage servers are among the best solutions for boosting I/O performance efficiently in terms of cost and energy. However, such caches often need to be resized, which today requires restarting the server and warming up the cache. We have implemented a system for dynamic cache resizing in Facebook Flashcache that does not affect its performance. Cache hits are preserved across resizing, so the cache does not need to be warmed up afterwards. Similarly, dynamically resizing the backing disk could be useful in scenarios where one cache device must be shared by multiple backing devices; we leave dynamic resizing of the backing disk as future work.

References

1. Byan, S., Lentini, J., Madan, A., Pabon, L.: Mercury: Host-side flash caching for the data center. In: MSST, pp. 1–12 (2012)
2. Lee, S., Kim, T., Kim, K., Kim, J.: Lifetime management of flash-based SSDs using recovery-aware dynamic throttling. In: Proc. of USENIX FAST 2012 (2012)
3. Oh, Y., Choi, J., Lee, D., Noh, S.H.: Caching less for better performance: balancing cache size and update cost of flash memory cache in hybrid storage systems. In: Proceedings of the 10th USENIX Conference on File and Storage Technologies, FAST 2012, Berkeley, CA, USA, p. 25. USENIX Association (2012)
4. Saab, P.: Releasing Flashcache. MySQL at Facebook Blog, http://www.facebook.com/note.php?noteid=388112370932 (accessed April 27, 2011)
5. Stearns, W., Overstreet, K.: Bcache: Caching Beyond Just RAM, http://bcache.evilpiepirate.org/ (accessed July 2, 2012)
6. Zhang, Y., Soundararajan, G., Storer, M.W., Bairavasundaram, L.N., Subbiah, S., Arpaci-Dusseau, A.C., Arpaci-Dusseau, R.H.: Warming up storage-level caches with Bonfire. In: Proceedings of the 11th USENIX Conference on File and Storage Technologies (FAST 2013), San Jose, California (February 2013)