A New Flash Memory Management for Flash Storage System

Han-joon Kim    Sang-goo Lee
Department of Computer Science, Seoul National University
Shilim-dong, San 56-1, Seoul, Korea 151-742
{hjkim, sglee}@cygnus.snu.ac.kr

Abstract

We propose a new way of managing flash memory space for flash memory-specific file systems based on the log-structured file system. Flash memory has attractive features such as non-volatility and fast I/O speed, but it also suffers from an inability to update in place and from limited usage cycles. These drawbacks require many changes to conventional storage (file) management techniques. Our focus is on lowering cleaning cost and evenly utilizing flash memory cells while maintaining a balance between these two often-conflicting goals. Cleaning efficiency is enhanced by dynamically separating cold data from non-cold data. The second goal, cycle-leveling, is achieved to the degree where the maximum difference between erase cycles is below the error range of the hardware. Simulation results show that the proposed method has a significant benefit over naive methods: a maximum of 35% reduction in cleaning cost while spreading writes evenly across segments.

1. Introduction

Flash memory is a non-volatile solid-state memory. Its density and I/O performance have improved to a level at which it can be used not only as auxiliary storage for mobile computers [2,4] but also as mass storage in a general computing system [10]. Flash memory is as fast as DRAM in read operations and hundreds of times faster than a hard disk in writes. Thus a storage system based on flash memory can provide a significant performance improvement over disk file systems by reducing the I/O bottleneck between the processor and the mechanical disk. These attractive features make flash memory one of the best choices for a mass storage system.

Unfortunately, flash memory has two critical drawbacks. First, blocks of memory need to be erased before they can be rewritten, because flash memory technology only allows the toggling of individual bits or bytes in one direction for writes. The erase operation resets the memory cells to either all ones or all zeros; this takes more time than a read or write operation (0.6-0.8 seconds). The second drawback is that the number of rewrite operations allowed on each memory cell is limited¹. This becomes a great obstacle in developing a 'stable' flash memory-based system that is as highly reliable as a disk-based system, and it requires the system to wear down all memory blocks as evenly as possible; this process is called cycle-leveling (or wear-leveling). Due to these disadvantages, conventional storage (or file) system technologies cannot be applied directly to flash memories. A flash memory-based storage system must make effective use of the advantages of flash memory while overcoming its constraints.

In this paper, we describe a new way of managing memory for a flash storage system that copes effectively with these features of flash memory. A flash storage system requires a cleaning mechanism (or cleaner) to reclaim invalidated space for future use, because data cannot be updated in place. The cleaner for a flash storage system is basically the same as the cleaner for the log-structured file system (LFS) [7]. In either case, the performance of a system that requires such a cleaner usually depends on its cleaning algorithm. Thus the methods found in previous works, such as [1,7] on disk management and [4,6,10] on flash memory management, have focused mainly on enhancing the efficiency of the cleaning mechanism, that is, on lowering cleaning cost. However, in addition to minimizing cleaning cost, the system needs to provide acceptable cycle-leveling capability; as the amount of flash memory space gets larger, cycle-leveling becomes more essential. The problem is that cycle-leveling often prevents reducing the cleaning cost. The proposed method accommodates these two often-conflicting objectives. In our work, cleaning efficiency is enhanced through a 'collection' operation that collects fragmented cold (infrequently updated) data. The conflict between cleaning and cycle-leveling is resolved by integrating cycle-leveling into the cleaning process through a special cleaning criterion for selecting victim segments.


¹ Typically, the cycling limit is between 100,000 and 1,000,000 erases. A flash region that approaches its cycling limit will experience frequent write failures.


The rest of this paper is organized as follows. Section 2 describes how free space is managed under a logging strategy for flash memory and presents our goals. Section 3 describes our solution for achieving these goals. Section 4 presents the results of our simulations, and Section 5 concludes.


2. Flash memory management

As a basic strategy for flash memory space management, we adopt append-only logging as in LFS. In previous works such as [3,6,9], the logging approach has been recommended for a flash memory-specific file system, since it automatically overcomes the inability to update in place and has a positive side effect on cycle-leveling, because append-only writes lead to sequential use of unused flash space. To support logging, the flash memory space is partitioned into segments. Each segment is a set of erase blocks² that are logically clustered so as to be erased together in a single erase operation. Each segment is organized as a fixed number of pages; the page is the unit of I/O operation, i.e., the amount of data transferred between the flash memory and main memory.

² An erase block is an independently erasable element within a flash chip and is 4-64 Kbytes in size. The erase block is distinct from the logical I/O unit, the page.
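To make this layout concrete, here is a minimal sketch of the segment abstraction; the class and field names are our own illustration, with the geometry taken from the simulation parameters of Section 4:

```python
from dataclasses import dataclass, field

# Geometry from the simulation parameters in Section 4:
# 64 segments, 256 erase blocks (64 Kbytes each) per segment, 512-byte pages.
NUM_SEGMENTS = 64
BLOCKS_PER_SEGMENT = 256
ERASE_BLOCK_BYTES = 64 * 1024
PAGE_BYTES = 512
PAGES_PER_SEGMENT = BLOCKS_PER_SEGMENT * ERASE_BLOCK_BYTES // PAGE_BYTES

@dataclass
class Segment:
    """A set of erase blocks logically clustered to be erased together."""
    erase_count: int = 0                     # erase cycles consumed so far
    valid: set = field(default_factory=set)  # slots of pages holding live data

    @property
    def utilization(self) -> float:
        """Fraction of the segment still occupied by valid pages."""
        return len(self.valid) / PAGES_PER_SEGMENT
```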

2.1. Architecture

The flash storage system proposed in this paper is illustrated in Figure 1. First, the File Manager is responsible for organizing and maintaining files as sets of logical pages. Next, the Flash Memory Manager manages the physical flash memory; it determines the physical location of a logical page and issues the necessary flash I/O operations. The activity of the Flash Memory Manager is similar to that of a disk manager in disk-based storage systems. Finally, the Flash Media Interface provides a set of low-level operations on the flash memory array, including erasing and programming. Among these components, our work is mainly concerned with the Flash Memory Manager.

The Flash Memory Manager is composed of four major components: Allocator, Cleaner, Cycle-Leveler, and Collector. The Allocator is responsible for keeping a pool of free segments; it decides which of the free segments is to be assigned next. The Cleaner reclaims invalid pages of segments to generate new free space. The Cycle-Leveler is responsible for the even distribution of erase cycles over the flash segments. Finally, the Collector isolates cold data from non-cold data on the log, which reduces the overhead of the Cleaner.

The Cleaner is triggered when the size of free space falls below a certain threshold. The cleaning operation on the flash memory log works in two phases: a rewriting phase and an erasing phase. In the rewriting phase, the Cleaner first selects a segment in the log, collects the valid data from the selected segment, and rewrites them to the end of the log. For continual logging, this can invoke the Allocator. Then, in the erasing phase, the erase operation is performed in parallel on all erase blocks in the selected segment, and the cleaned segment is reclaimed into the free-segment pool. Cleaning efficiency depends mainly on the rewriting phase, since the erasing phase consists only of a hardware operation, erase, which incurs a fixed cost.

Additionally, the Cleaner checks whether segments are unevenly used (cycled). If the degree of skew in cycles is higher than a given criterion, the Cleaner activates the Cycle-Leveler. The Cycle-Leveler moves the data in the least worn-out segment to the segment at the end of the log. Generally, cycle-leveling is improved when data in the least worn-out segment are moved to other, more worn-out regions, invalidating blocks in the former segment, which will then have a better chance of being erased at cleaning time.

After a large number of logging and cleaning operations, cold data gets mixed with non-cold data within each segment on the log. This co-existence of cold and non-cold data can have an adverse effect on cleaning efficiency, because cold data is uselessly rewritten whenever its segment is cleaned. To prevent such rewriting of cold data, the Cleaner periodically invokes the Collector, which checks the degree of co-existence and, if necessary, separates cold data from non-cold data.
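The two-phase cleaning operation can be sketched as follows; `log`, `allocator`, and `free_pool` are hypothetical stand-ins for the surrounding machinery, not interfaces from the paper:

```python
def clean(victim, log, allocator, free_pool):
    """Reclaim one segment: rewrite its valid pages, then erase it."""
    # Rewriting phase: copy valid data from the victim to the end of the log.
    for page in list(victim.valid):
        if log.is_full():                      # continual logging may need
            log.extend(allocator.next_free())  # a fresh segment
        log.append(page)
        victim.valid.discard(page)

    # Erasing phase: the hardware erases all erase blocks of the segment in
    # parallel at a fixed cost; the segment then returns to the free pool.
    victim.erase_count += 1
    free_pool.append(victim)
```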

Figure 1. Flash storage system (main components of the Flash Memory Manager).

2.2. Problem statements & performance metrics


The objectives of flash memory management are to reduce the cleaning cost and cleaning frequency as much as possible, and to distribute erase cycles evenly over all segments. In order to quantify the degree to which these objectives are achieved, we present two performance measures: the cleaning cost and the degree of cycle-leveling. Our cleaning cost model is based on the cleaning cost formula defined for the eNVy flash storage system [10].





Because frequent cleaning (erasing) over a fixed period results in higher overall costs, the proposed cleaning cost measure considers the number of cleanings over a given period as well as the cleaning cost at each cleaning time.

Definition 1. The cumulative cleaning cost is

$$\sum_{i=1}^{cf} \frac{\rho_i}{1-\rho_i}$$

where cf is the number of cleanings over a fixed amount of write request stream, and ρ_i is the utilization of the segment selected for the i-th cleaning.

Next, as a measure of how evenly cycles are distributed over all segments, we use the leveling degree as defined below.

Definition 2. The leveling degree (Δ_c) is the difference between the maximum erase count (E_max) and the minimum erase count (E_min), i.e., Δ_c = E_max − E_min.

Instead of the variance or the standard deviation of erase counts, we chose the difference as the measure of the degree of cycle-leveling, because the system's lifetime depends on the lifetime of the most worn-out segments. Generally, perfect cycle-leveling, where the leveling degree is kept at 0 or 1 at all times, need not be supported to assure the reliability of flash storage systems. This is because erase limits vary from segment to segment within the hardware error range³, and excessive cycle-leveling degrades cleaning (writing) performance. Therefore, we attempt to achieve cycle-leveling only to the extent that the leveling degree stays within the error range of the erasure limit.

³ Currently, this error range has not been reported because its measurement is difficult. We will assume that the error range is a number e (say, 100).
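Both measures follow directly from the definitions; a minimal sketch:

```python
def cumulative_cleaning_cost(victim_utilizations):
    """Definition 1: sum of rho_i / (1 - rho_i) over the cf cleanings,
    where rho_i is the utilization of the i-th victim segment."""
    return sum(rho / (1.0 - rho) for rho in victim_utilizations)

def leveling_degree(segments):
    """Definition 2: difference between maximum and minimum erase counts."""
    counts = [s.erase_count for s in segments]
    return max(counts) - min(counts)
```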

3. Optimization of memory management

In this section, we first introduce the basic cleaning method of [7], and then, based on this method, incrementally optimize the way the flash memory is managed.

3.1. Separation of cold data from non-cold data

Since the cleaning cost of a segment depends strongly upon its utilization, we can expect that it is better to select the segment with the lowest utilization. This is known as the greedy policy [7], which we refer to as Greedy I. However, as mentioned in [6,7,10], the greedy policy does not perform well when the degree of locality is high. Especially when utilization is high, locality has a much bigger impact on cleaning cost. The primary cause of the increase in cleaning cost with locality is that cold data fragmented over the log area moves around uselessly without ever being invalidated. For the same reason, cleaning frequency increases as well; a segment whose utilization is kept high by cold data fills up with writes more quickly, so it has a greater chance of being cleaned. We try to overcome this limitation of the greedy cleaning method through a Collector that reduces the degree of co-existence of cold and non-cold data.

The Collector isolates cold data from non-cold data; it is implemented as a module that periodically collects fragmented cold data over the log. Our Collector is invoked once every n cleanings (n is called the 'collection period'). The module collects the fragmented cold data of a file and writes them back to the log. The Collector works in two steps (see Figure 2): I) search for and collect fragmented cold data (file) pages, and II) cluster them at the end of the log. If the log space is insufficient during collection, the Cleaner can be invoked to generate new space. Thus the collection operation must have reasonable policies with regard to 1) which cold data are to be collected, 2) when cold data are to be collected (collection time), and 3) how much cold data is to be collected (collection size).

Figure 2. Procedure of the collection operation (Step I: determine the collection size and identify the files to be collected; Step II: collect and rewrite the pages of the selected files at the end of the log, activating the Cleaner if necessary).

3.1.1. Identification of data being collected. The data to be collected are selected by considering the degree of fragmentation of randomly selected files. To do this, the mapping table between a file and its pages, which is maintained by the File Manager, is referenced to locate the pages of the file. The fragmentation degree F is defined as follows:

$$F = \frac{s}{\min(p, L_{sg})} \qquad (1)$$

where p is the number of pages occupied by the file, s is the number of segments over which the file is fragmented, and L_sg is the number of segments in the log. If p is larger than L_sg, p is replaced by L_sg, since s cannot be larger than L_sg. Depending on this fragmentation degree, the Collector decides whether the candidates are to be collected; it selects the ones with higher fragmentation degrees.
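A sketch of the fragmentation-degree test, assuming the File Manager exposes, for each candidate file, its page count and the number of log segments it touches:

```python
def fragmentation_degree(p, s, l_sg):
    """F = s / min(p, L_sg); F near 1 means the file is maximally scattered."""
    return s / min(p, l_sg)   # s can never exceed L_sg, so cap p at L_sg

# Threshold used in the simulations of Section 4.
COLLECT_THRESHOLD = 0.8

def collection_candidates(files, l_sg):
    """Keep the randomly sampled files whose pages are highly fragmented."""
    return [f for f in files
            if fragmentation_degree(f.pages, f.segments_spanned, l_sg)
               > COLLECT_THRESHOLD]
```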


3.1.2. Collection time & collection size. Since deciding the collection time (period) and collection size requires fine-tuning knobs that would otherwise be chosen on an ad hoc basis, we propose a simple heuristic based on utilization; the collection size ColSize and collection period ColPeriod are set as in (2) and (3) below:

$$ColSize = k_c \cdot \bar{\rho} \cdot N_{seg} \qquad (2)$$

$$ColPeriod = \frac{k_p}{\bar{\rho}} \qquad (3)$$

where N_seg is the number of segments in the flash memory space, ρ̄ is the average utilization, and k_c and k_p are constants that are adjusted experimentally so as to achieve good cleaning efficiency. The heuristic reflects the fact that high utilization generally yields more cold data, which is more likely to be fragmented by continuous logging and cleaning operations. We shall refer to the algorithm that employs the collection operation on top of the greedy method as CICL I.
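Equations (2) and (3), as reconstructed above, translate directly into code; k_c and k_p are tuning constants fixed by experiment:

```python
def collection_size(k_c, avg_util, n_seg):
    """(2): collect more cold data when average utilization is high."""
    return k_c * avg_util * n_seg

def collection_period(k_p, avg_util):
    """(3): collect more often (a shorter period) at high utilization,
    since high utilization yields more fragmented cold data."""
    return k_p / avg_util
```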

3.1.3. Modification to the cleaning cost. The previously defined cleaning cost needs to be modified to accommodate the cost incurred by the collection operation, which also rewrites valid data. The cleaning cost can be modified as follows:

$$\text{Cleaning cost} = \text{Pure cleaning cost} + \text{Collection cost} = \frac{\rho}{1-\rho} + \frac{c}{1-\rho} \qquad (5)$$

$$c = \frac{ColSize}{ColPeriod \times SegSize} \qquad (6)$$

where c/(1−ρ) denotes the collection cost, the additional cost required to generate 1−ρ of fresh space. This value is best explained as the cost that would be required if one collection were performed at each cleaning; in reality, one collection occurs only after as many cleaning operations as the collection period have been executed. Thus c can be computed as in (6), where SegSize is the number of pages in a segment.

3.2. Cycle-leveling

For any data access pattern, an uneven distribution of erasures tends to occur. In the worst case, if there are static data that are not accessed at all, severe skew in cycles will occur. In a simple bid to promote cycle-leveling, the least worn-out segment can be chosen when allocating a free segment to the log; Greedy II uses this policy on top of Greedy I. Another solution is to swap the data in the least worn-out segment and the most worn-out segment; Greedy III is a method that integrates this swapping solution with Greedy I. Although logging has a potential cycle-leveling effect of its own, a flash system needs a more refined leveling method, especially under write-intensive environments. We propose to integrate cycle-leveling into the cleaning process through a special segment cleaning criterion. We also suggest a new segment allocation policy for deciding which segment from the free segment pool is allocated to the log.

3.2.1. Cleaning index. A special cleaning index is defined to both lower the cleaning cost and promote cycle-leveling. The cleaning index is the criterion for selecting segments to be cleaned; the segment with the lowest index value⁴ is selected at cleaning time:

$$\text{CleaningIndex}_i = (1-\lambda)\,\rho_i + \lambda\,\frac{E_i}{E_{max}} \qquad (7)$$

where ρ_i and E_i, respectively, are the utilization and erase count of segment i, and E_max is the maximum of the erase counts. The cleaning index is composed of two factors: the utilization and the erase count of each segment. These two factors are weighted by the normalized leveling degree λ, which maps the leveling degree to a value between 0 and 1. At normal times, when the leveling degree is below a tolerable threshold, the utilization ρ_i should be the chief factor in selecting a segment; when the leveling degree exceeds the tolerable limit, the cleaning index must be calculated considering both factors. Thus, as the value of λ approaches 1, that is, when the leveling degree is high, the cleaning index is biased toward cycle-leveling. On the other hand, when the value of λ approaches 0, that is, when erasures are evenly distributed, segment selection is based on reducing the cleaning cost. In (7), it is desirable that the weighting variable λ take the form of a sigmoid function, so that a small measure of skew in cycles can be ignored. We define λ as the following sigmoid function, which increases monotonically with the leveling degree Δ_c:

$$\lambda = \begin{cases} e^{-k_s/\Delta_c} & \text{if } \Delta_c \neq 0 \\ 0 & \text{if } \Delta_c = 0 \end{cases} \qquad (8)$$

where k_s is a constant that determines the steepness of this function. The constant k_s can be used to control the degree of cycle-leveling: should we wish to achieve cycle-leveling loosely, we have only to make k_s larger; when k_s is small, near-perfect cycle-leveling will be achieved. The effect of controlling the leveling degree via k_s can be observed in Figure 4(d). In summary, our Cycle-Leveler is implemented within the Cleaner through this special cleaning index. We refer to the method that integrates this leveling method into CICL I as CICL II.

⁴ In the case of the cost-benefit policy introduced in Sprite-LFS [7], the segment whose cleaning index, age × (1−u)/u, is the greatest is selected.
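Putting (7) and (8) together, victim selection can be sketched as below, reusing the Segment sketch from Section 2; the closed forms follow the reconstructions given above:

```python
import math

def weighting(delta_c, k_s):
    """(8): sigmoid in the leveling degree. Near 0 for small skew, approaching
    1 as skew grows; a larger k_s keeps it small longer (looser leveling)."""
    return math.exp(-k_s / delta_c) if delta_c != 0 else 0.0

def cleaning_index(seg, lam, e_max):
    """(7): weight utilization against wear by the normalized degree lam."""
    return (1.0 - lam) * seg.utilization + lam * (seg.erase_count / e_max)

def select_victim(segments, k_s):
    """Choose the segment with the lowest cleaning index."""
    counts = [s.erase_count for s in segments]
    lam = weighting(max(counts) - min(counts), k_s)
    e_max = max(counts) or 1   # avoid division by zero before any erases
    return min(segments, key=lambda s: cleaning_index(s, lam, e_max))
```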

3.2.2. Segment allocation policy. Cycle-leveling can also be helped by carefully selecting the next fresh segment to be taken from free space. Segment allocation by the Allocator considers two cases: a) appending data to the log, and b) collecting cold data. In case a), the segment with the lowest erase count is selected, to increase its usage; the new segment can be expected to reach its next cleaning cycle earlier because of its hot data. Correspondingly, in case b), the segment with the highest erase count is allocated to the log. This is because a large amount of cold data will be clustered onto the segment allocated for a collection operation, and thus the segment will experience less invalidation than the other segments. The method that uses this policy on top of CICL II will be referred to as CICL III.
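The Allocator's policy reduces to a min/max choice over erase counts; a minimal sketch:

```python
def allocate(free_pool, for_collection=False):
    """Pick the next fresh segment for the log.

    Case a) ordinary appends: least worn-out segment, since its hot data
            will be invalidated quickly, returning it for cleaning soon.
    Case b) collection writes: most worn-out segment, since the cold data
            clustered onto it will see little further invalidation.
    """
    pick = max if for_collection else min
    seg = pick(free_pool, key=lambda s: s.erase_count)
    free_pool.remove(seg)
    return seg
```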

4. Performance analysis

4.1. Simulator and workload

Flash memory type:                     1M/chip
Size of flash memory:                  1 Gbytes
Segment size:                          16 Mbytes
Number of segments (N_seg):            64
Number of erase blocks in a segment:   256
Page size:                             512 bytes

We have built a simulator with 1 Gbytes of flash space divided into 64 flash segments; each segment is composed of 256 erase blocks. At every 100 write requests, the Cleaner checks whether the ratio of the number of free segments to the number of total segments is below 0.1, in which case one segment is cleaned. The segment usage information required by the Cleaner, including the number of valid pages and the erase count of each segment, is kept in the simulator. The Collector chooses the files whose fragmentation degree exceeds 0.8. And, in order to achieve cycle-leveling, the constant k_s in the weighting function (8) that controls the leveling degree in the cleaning index is set below 100.

We have synthetically generated two types of workload, named UNIX and LFS, in which write accesses exhibit certain amounts of locality of reference. The degree of locality is denoted in parentheses following the workload type, as in [7]. To represent unchanging (static) data, we specify one more parameter. For example, '90→10,80' means that 90% of write accesses go to 10% of the data, the other 10% of the references go to 80% of the data, and 10% of the data remains unchanged. The total number of write requests is 1 million, and the data generated from them amount to between 500 and 2,000 files; initially, the write requests create the files, which are then repeatedly overwritten. At every write request, a file is selected based on a pseudo-random number, and the write position within the selected file is also decided randomly. The UNIX workloads are based on the size distribution of UNIX files described in [5], where we reflect the fact that cold data is larger in size than hot data [8]. The LFS workloads, on the other hand, are characterized by frequent modification of small data files (approximately 4-Kbyte files); the LFS (90→10) workload is the same as the 'hot-and-cold' workload used in [7].
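For illustration, a hot-and-cold reference stream of this kind can be generated as follows; the generator is our own sketch, not the authors' simulator code:

```python
import random

def write_stream(files, hot_frac=0.10, hot_refs=0.90, static_frac=0.10):
    """Yield files to overwrite. With the defaults this is the '90->10,80'
    pattern: 90% of writes hit 10% of the files, the remaining 10% of
    writes hit 80% of the files, and 10% of the files stay unchanged."""
    n = len(files)
    hot = files[:int(n * hot_frac)]
    cold = files[int(n * hot_frac):int(n * (1.0 - static_frac))]
    while True:
        pool = hot if random.random() < hot_refs else cold
        yield random.choice(pool)   # the write offset within the file is
                                    # likewise chosen pseudo-randomly
```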


4.2. Simulation results

Figure 3. Comparison of cumulative cleaning cost ((b): utilization = 0.85).

4.2.1. Cleaning cost. In most workloads, our method shows a significant benefit over the greedy method. Figure 3 shows that when locality exists and segment utilization is high, our methods (the CICL series) outperform the greedy methods (the Greedy series) and the cost-benefit method in terms of cleaning cost. The Collector, by paying a small amount of cost, contributes to the reduction of the overall cleaning cost. For example, by paying a collection cost of only about 8% of the total cleaning cost in the UNIX (95→5) workload, CICL III reduces the overall cleaning cost by about 35% compared to Greedy I and II. In CICL II, the drop in cleaning cost achieved by collection is less visible due to cycle-leveling. However, CICL III regains the lost cost reduction as a result of the

proposed segment allocation policy. In some cases, while CICL III levels cycles more strictly than CICL I and II, it lowers the cleaning cost further than the other CICL methods. In the UNIX (90→10,80) workload, the cleaning cost reduction due to collection is smaller than in the other workloads, since the data are less fragmented. In the LFS workloads, our methods show a smaller cost reduction than in the UNIX workloads; since an LFS workload contains only small files, less cold data moves around in segments than in the UNIX workloads. The Cost-Benefit method, which supports the cost-benefit policy and age-sorting suggested in [7], incurs a high overhead in our simulation environment because of the large overhead required to enforce an even age distribution within a segment. In our simulations, even though we maintain rather strict cycle-leveling, the cleaning efficiency does not degrade. We project that, for less strict cycle-leveling, the cleaning cost and frequency would be reduced further.

Figure 4. Changes of leveling degree: (a) UNIX (80→20) workload, (b) UNIX (90→10,80) workload, (c) LFS (90→10) workload, (d) control of the leveling degree.

4.2.3. Cycle-leveling effectiveness. Figure 4 shows the cycle-leveling effect of the proposed method. In most workloads, CICL II and III hold Δ_c to fluctuations within a given error range. For the UNIX (90→10,80) workload in particular, CICL II and III provide stable leveling effects, while Greedy I and II show severe skew in cycles. Greedy III also has a leveling effect similar to the CICL methods, but among the methods other than Cost-Benefit, Greedy III has the worst cleaning efficiency; this is because a large amount of write cost is required to swap the data in two segments. Figure 4 also suggests that the segment allocation policy of CICL III helps cycle-leveling more than the simple policy of selecting the segment with the lowest erase count; in effect, this policy contributes to lowering the cleaning cost as well. Greedy II outperforms Greedy I, but its effect is not satisfactory. Moreover, CICL I, which employs only collection without cycle-leveling, also shows some degree of cycle-leveling effect. This is because the collection operation invalidates the pages in the segments that contain cold data, which makes these segments more likely to be cleaned.

5. Conclusions

We have presented a cost model appropriate to the proposed method and, using this cost model, extended basic flash memory management with dynamic separation of cold and non-cold data. With this collection operation, a cleaning index for cycle-leveling is integrated. This solution does not depend on logging, even though we use a logging strategy for flash space management; other strategies can easily be used together with our method. In the future, we plan to analyze performance under a more diverse set of real workloads; moreover, we will try to obtain better cleaning performance by adding an appropriate caching scheme, which should reduce the write traffic to flash memory.

References

[1] T. Blackwell, J. Harris, and M. Seltzer, "Heuristic Cleaning Algorithms in Log-Structured File Systems", Proceedings of the 1995 Winter USENIX Technical Conference, 1995, pp. 277-287.
[2] R. Caceres, F. Douglis, K. Li, and B. Marsh, "Operating System Implications of Solid-State Mobile Computers", Technical Report MITL-TR-56-93, Matsushita Information Technology Laboratory, May 1993.
[3] B. Dipert and M. Levy, Designing with FLASH MEMORY, Annabooks, 1994, pp. 227-271.
[4] F. Douglis, R. Caceres, F. Kaashoek, K. Li, B. Marsh, and J. A. Tauber, "Storage Alternatives for Mobile Computers", Proceedings of the 1st Symposium on Operating Systems Design and Implementation, 1994, pp. 25-37.
[5] G. Irlam, "Unix File Size Survey", 1993; http://www.base.com/gordoni/gordoni.html.
[6] A. Kawaguchi, S. Nishioka, and H. Motoda, "A Flash-Memory Based File System", Proceedings of the 1995 Winter USENIX Technical Conference, 1995, pp. 155-164.
[7] M. Rosenblum and J. K. Ousterhout, "The Design and Implementation of a Log-Structured File System", ACM Transactions on Computer Systems, Vol. 10, 1992, pp. 26-52.
[8] C. Ruemmler and J. Wilkes, "UNIX Disk Access Patterns", Proceedings of the 1993 Winter USENIX Technical Conference, 1993, pp. 405-420.
[9] D. See and C. Thurlo, "Managing Data in an Embedded System Utilizing Flash Memory", Intel Technical Note, 1995; http://www.intel.com/design/flcomp/papers/esc-flsh.htm.
[10] M. Wu and W. Zwaenepoel, "eNVy: A Non-Volatile, Main Memory Storage System", Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems, 1994, pp. 86-97.

