Towards an Optimal File Allocation Strategy for SPECweb99

Tom W. Keller, Karthikeyan Sankaralingam(1), H. Peter Hofstee
Austin Research Lab, IBM Research, Austin, TX, USA
(1) Also with the Dept. of Computer Sciences, University of Texas at Austin, Austin, TX, USA
{tkeller,hofstee}@us.ibm.com, [email protected]
Abstract
This paper studies the file caching characteristics of the industry-standard web-serving benchmark SPECweb99 and develops an optimal cost model for balancing performance against disk and RAM costs, where "cost" can be very broadly defined. The model is applied to a realistic 32-bit address hardware configuration to demonstrate a solution that eliminates file accesses as a potential web-serving bottleneck at very high workload levels.

Index terms - Web-server, file-cache, HTTP, analytic modeling, simulation, cost-model
I. INTRODUCTION
The aim of this study is to develop an understanding of file caching scalability issues in SPECweb99 and to present the design trade-offs involved in configuring machines to serve large numbers of connections. In [1] the authors studied the performance impact of uncached file accesses in SPECweb99. In this paper we study the performance of caching files and different allocation strategies to optimize performance and cost. The factors that we model in this study are disk delays (average seek and latency time), disk transfer rate and RAM size.
In Section 2 we provide an overview of the SPECweb99 benchmark. The problem statement and related work are discussed in Sections 3 and 4, respectively. Section 5 explores the file access characteristics of SPECweb99 in more detail. In Section 6 we develop a theoretical model for file allocation to determine the optimal allocation strategy when disks are limited by bandwidth and we want to minimize the number of disks and the amount of RAM simultaneously. This model is applied to a realistic case in Section 7 by discussing a scheme for file allocation that can be implemented on a conventional system with 4 Gbytes of RAM. Our analysis is summarized in Section 8.
II. SPECWEB99: OVERVIEW
SPECweb99 [2] is the latest benchmark for web-server performance, updated from SPECweb96. One or more clients drive an HTTP server with requests in a cycle of "sleep -- HTTP request -- wait for response". Requests take the form of static file serving requests (static GET), dynamic file serving requests (dynamic GET) or logging requests (POST). In POST requests a cookie is returned by the server, instead of an unchanged file (static GET) or a file generated from a small number of static files (dynamic GET). Each request results in an operation. Server throughput is measured in operations per second, given that a bandwidth requirement is met for each connection. The sleep interval in the client cycle is dynamically varied such that between 40,000 bytes/sec and 50,000 bytes/sec are maintained for each connection. A connection that meets the bandwidth requirement is counted towards the publishable metric, the number of conforming connections. This bandwidth requirement, and the way it is implemented, also enforces a de facto response-time requirement of 0.03 seconds per operation. Since POST requests result in a cookie being returned to the client, they have no impact upon file caching. Static GET requests comprise 70% of the requests, dynamic GETs comprise 25.2%, and the remaining 4.8% are POSTs. The dynamic GETs return a variety of slightly modified files. Regardless of whether the GET request is static or dynamic, the contents of the request are drawn from a fixed inventory of files. While dynamic GETs are computationally intensive, they add only about 6KB to the size of the returned file.
Like [1] we safely ignore POSTs in this analysis and concentrate on the static and dynamic GETs, which comprise 95.8% of the workload and which fully characterize the file caching demands placed by the workload. The composition of the files to be served is fully specified by the SPECweb99 benchmark in [2] and described in detail in [1]. We will therefore be brief in our characterization, touching only on those points relevant to a file caching analysis.
The set of files is composed of directories. Directories are identical in terms of file counts, sizes and frequencies of access within the directory. Each directory contains 36 files divided into 4 classes of 9 files each. Class 0 contains the smallest files, which range in size from 0.1KB to 0.9KB, while Class 3 contains the largest files, ranging in size from 100KB to 900KB. A file is uniquely identified by the triple of its Directory Level, Class Level and File Level. The frequency of access to the 9 files within each class is fixed and constant across all classes. The frequency of access to any particular file can be calculated as the product of (1) the frequency of access to its Directory Level, (2) the frequency of access to its Class Level and (3) the frequency of access to its File Level. The Class Level and File Level frequencies are given in Table 1. All frequencies are defined by Zipf distributions with the characteristic alpha set to 1. For a set of files composed of N_d directories, the probability of accessing directory i is K_d / i, where

    K_d = 1 / (sum over i = 1 to N_d of 1/i)
Directory Level 1 has the greatest frequency of access while Directory Level N_d has the smallest. For a configuration with 1000 directories, this difference is 1000 to 1. For very large configurations of around 30 Gbytes of files, the number of directories is on the order of 6,000, making this a very skewed distribution. The distribution of file sizes and frequencies of access within each directory (while identical for each directory) is also skewed, with the smallest files generally receiving the greatest frequency of access. Regardless of the size of the file set, the mean (frequency-weighted) file size is 14.4KB, while the median (as noted in [1]) is around 3KB. The size of each directory is nearly 5.0Mbytes.
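As an illustration of these distributions, the short Python sketch below computes the directory access probability from the equation above and the frequency-weighted mean file size. The class and in-class file frequencies are transcribed from Table 1; this is only an illustration, not part of the benchmark harness.

    # Sketch: directory access probability (Zipf, alpha=1) and the
    # frequency-weighted mean file size implied by the Table 1 distributions.
    CLASS_FREQ = [0.35, 0.50, 0.14, 0.01]           # classes 0..3
    FILE_FREQ = [0.039, 0.059, 0.088, 0.177,        # files 1..9 within a class
                 0.353, 0.118, 0.071, 0.050, 0.044]

    def dir_probability(i, n_dirs):
        """Probability of accessing directory i out of n_dirs: K_d / i."""
        k_d = 1.0 / sum(1.0 / j for j in range(1, n_dirs + 1))
        return k_d / i

    def file_probability(dir_prob, cls, file_idx):
        """P(file) = P(directory) * P(class) * P(file within class)."""
        return dir_prob * CLASS_FREQ[cls] * FILE_FREQ[file_idx - 1]

    # File k of class c is k * 0.1 * 10**c KB, so the weighted mean size is:
    mean_kb = sum(CLASS_FREQ[c] * FILE_FREQ[k - 1] * k * 0.1 * 10 ** c
                  for c in range(4) for k in range(1, 10))
    print(round(mean_kb, 1))    # ~14.4 KB, the figure quoted above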
The number of directories comprising the file set is specified by SPECweb99 as a linear function of the number of target connections. The number of achieved connections is the measure reported for the benchmark result and for all practical purposes equals the number of target connections, so we will simply refer to "connections" for the remainder of the paper. The relationship is given by N_d = 25 + (40/61) * C, where C is the number of connections. Thus a 4GB address space can fully cache a file set of 3,800 connections.

III. PROBLEM STATEMENT
Examining the access patterns and the sizes of the different files, we notice that both the frequency of access and the bandwidth requirement of the files are very skewed, with the smallest files being the most frequently accessed. This means that file caching can be exploited. In this paper we wish to determine a "near optimal" caching strategy that takes into account realistic 32-bit system configurations of RAM and disk, that will eliminate static file serving as a bottleneck, and to apply that strategy to design feasible target configurations with very large file serving capacities. The design trade-offs involved are:
1. Bandwidth. How much can be sustained for a given RAM and disk configuration?
2. Size and cost. How much RAM can be afforded? Disk space is relatively inexpensive.
3. Frequencies of access to RAM and disks. A larger miss ratio in memory means we need to fetch more frequently from disk, in turn putting more load on the server and increasing the bandwidth requirement of files. This effect is explained later.
Examining these trade-offs gives us an understanding of the problem space. We analyzed the data in two ways: a trade-off of frequency and bandwidth with bandwidth having priority, and a trade-off of bandwidth and frequency with frequency of access having priority; both are discussed in Section 5. The trends observed are independent of problem size. For concreteness, we will analyze the solution for 10,000 connections. Characteristics of a 10,000-connection workload are summarized in Table 2.
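The scaling formula can be checked with a few lines of Python. The per-directory size (about 4.88 MB, from the Table 1 totals) and the rounding up to a whole directory are our assumptions; the 10,000-connection output can be compared against Table 2.

    # Sketch: SPECweb99 fileset scaling, N_d = 25 + (40/61) * C.
    import math

    DIR_SIZE_KB = 4.5 + 45 + 450 + 4500      # per-directory total from Table 1

    def num_directories(connections):
        # Rounding up to a whole directory is an assumption on our part.
        return 25 + math.ceil(40.0 / 61.0 * connections)

    for c in (1000, 10000):
        nd = num_directories(c)
        gb = nd * DIR_SIZE_KB / (1024.0 * 1024.0)
        print(c, nd, round(gb, 2))
    # 10000 connections -> 6583 directories and ~31.4 GB, matching Table 2.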
This work is motivated by the addressing limitation of 32-bit systems, 4GB of RAM, which limits the size of a RAM file cache to less than 4GB and in turn limits the number of connections to 3,800 or so. Current SPECweb99 results for single and multiprocessor 32-bit systems are in the range of 1,000 connections [3] and are not RAM limited but CPU limited, by the cost of serving dynamic GETs [1]. The aggregate CPU power that can be placed on a commodity webserver is limited by the MHz of the individual processors and the relatively poor multiprocessor scaling available in commodity operating systems. The market is addressing both of these limitations by steadily increasing processor MHz and improving operating system multiprocessor scaling, as well as dramatically reducing the instruction pathlengths. Furthermore, current commodity webserving configurations are growing from 2GB to 4GB for 32-bit systems. The largest webserving configurations today produce SPECweb99 results in the few thousands [3] and require 64-bit systems, where there is no 4GB RAM limitation for caching. This work attempts to eliminate the future bottleneck imposed by 32-bit addressing on file caching by demonstrating how a 4GB system with a few disks can effectively cache the filesystem for a 10,000-connection SPECweb99 configuration, and larger.

IV. RELATED WORK
Kant and Won in [1] examined the file caching characteristics of the SPECweb99 benchmark and observed that the highly skewed (long-tailed) characteristics of file accesses make the workload very well suited for caching. They developed a simple analytic model of an LRU-replacement cache that determined cache miss ratios as well as the number of I/Os required to replenish the cache. They estimated the impact of I/O pathlength on obtainable throughputs. Their model is characterized by the fraction of bytes cached and the hit ratio. Their goal was to develop a complete analytic model that could be used for performance prediction of the workload across different configurations. This work concentrates on studying caching schemes, rather than developing a complete model. Early in our investigation we realized
that I/O latencies play a crucial role in determining the number of disk drives required to replenish a RAM cache of files. This is because of the bandwidth requirement of each connection of at least 40,000 bytes/sec. While a RAM-resident file may begin being served immediately, with little delay to the connection, a disk-resident file must wait for a seek and rotational latency before service can begin. For today's drives this total I/O delay due to the media is equivalent to about 360Kbytes of transfer, which is far in excess of the mean, frequency-weighted, file size of 14.4KB. This "wasted disk bandwidth" means that a 10,000-connection (30,000 operations per second) configuration requires 35 disks to sustain just 10% of the accesses from disk. By focusing on the I/O latency issue, we develop a caching scheme that reduces this to 4 or so disks.
V. SIMPLE FILE ALLOCATION STRATEGIES
A 10,000-connection filesystem is characterized by frequency of access, bandwidth and size. We examine (and reject) three simple strategies for prioritizing cache placement: by frequency of file access, by file bandwidth and by file size. Six plots are used to graph these three metrics.
A. Prioritizing Cache Placement by Frequency of Access
Figure 1 displays a plot of the cumulative fraction of accesses against the fraction of all files, where the files are sorted by their decreasing frequency of access. It can be seen that about 20% of the files in the file set contribute almost 90% of the accesses. This means that if we cache these files we are guaranteed a 90% hit rate. Figure 2 displays a plot of the cumulative file size against the cumulative fraction of accesses, where again the files are sorted by their decreasing frequency of access. (Note: the file size is displayed on a logarithmic scale.) From this graph we can conclude that for a 90% hit rate we would need less than 1GB of space in RAM. Figure 3 displays a plot of the cumulative bandwidth requirement of the files against the cumulative frequency of accesses when the files are sorted in decreasing order of frequency. For a hit rate of 90%, the non-RAM
bandwidth would be approximately 500Mbits/sec. From the above three figures one might conclude that a 1GB file cache in RAM would suffice for a 10,000-connection system, but we see from the following example that realistic disk latencies make this impractical. Assume a disk can sustain a transfer rate of 256Mbits/sec (32 Mbytes/sec), with an average access time of 11ms (these measures are described in Appendix I). Unfortunately, a 10,000-connection (30,000 operations/sec) server with a miss rate of 10% generates 3,000 file requests to disk every second. The stated access time and bandwidth mean that an additional 360KB worth of transfer capacity is wasted for each file access. The three thousand accesses per second thus result in an additional 3,000 accesses/sec * 360KB/access = 8,437.5 Mb/s in bandwidth, which requires an additional 33 (8437.5/256) disk drives. Clearly this solution requires an unrealistic number of disks and we should attempt to minimize the disk bandwidth requirements further. According to the data used to generate Figure 2, a 3.9Gbyte cache (4Gbytes minus 0.1Gbyte of overhead) holds the most frequently accessed files, accounting for 99.1% of the accesses. The remaining 0.9% of the accesses requires a bandwidth of 365Mbits/sec and leads to an additional wasted bandwidth of 760Mbits/sec. Thus a total of 5 (4.4) disks would be required. In practice, however, caching schemes do not achieve these low miss rates. Kant and Won [1] report miss rates of at best 5.2% for a cache that holds 10% of the data using an LRU replacement scheme. Figures 4 and 5 show the result of a simulation using random replacement from our 3.9Gbyte cache (see Section 7 for details of the simulation methodology). Figure 4 shows the miss rate in the cache and Figure 5 shows the effective disk bandwidth required. The performance is slightly worse than that quoted in [1], with a miss rate converging to 6.5%. These miss rates would require more than 1,500 requests per second to be served by disks and lead to 16 additional disks due to the wasted bandwidth associated with seek time. Therefore, even though sorting by frequency of access leads to an attractive miss rate, this miss rate is hard to achieve in practice and bandwidth costs dominate. The sketch below summarizes this disk-count arithmetic; we next examine prioritizing by bandwidth requirements.
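The following short Python sketch reproduces that arithmetic; the disk parameters are the ones quoted above, while the helper name and structure are ours.

    # Sketch: misses per second -> wasted bandwidth -> additional disks.
    OPS = 30000                   # operations/sec at 10,000 connections
    E_KB = 360.0                  # latency expressed as extra KB per disk access
    DISK_MBPS = 256.0             # media transfer rate per disk, Mbits/sec

    def extra_disks(miss_rate, data_mbps=0.0):
        """Disks needed for `miss_rate` of OPS going to disk, plus the real
        data bandwidth `data_mbps` those misses require."""
        accesses = miss_rate * OPS
        wasted_mbps = accesses * E_KB * 8 / 1024.0
        return wasted_mbps, (wasted_mbps + data_mbps) / DISK_MBPS

    print(extra_disks(0.10))            # ~8437 Mb/s wasted, ~33 disks
    print(extra_disks(0.009, 365.0))    # 0.9% misses + 365 Mb/s data -> ~4.4 disks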
B. Prioritizing Cache Placement by Bandwidth
Every file has a bandwidth requirement associated with it, given by:

    Bandwidth requirement of a file = filesize * freq * OPS * 8/1024 Mbits/sec

We sort the entire file set by bandwidth requirement, cache the files with the greatest bandwidth requirement, and plot the number of files against the cumulative bandwidth of those files. This is shown in Figure 6. A small fraction of the files captures a large fraction of the required bandwidth. Figure 7 shows the plot of cumulative file size vs. the cumulative bandwidth requirement when files are sorted by bandwidth. A 4Gbyte cache captures 80% (2685Mbits/sec) of the bandwidth. Figure 8 shows the plot of frequency of access vs. cumulative bandwidth. It can be seen that 80% of the bandwidth gives us only a 62% hit rate. This means the disks will see 11,400 requests per second and, by the same calculation made in the previous subsection, this results in an additional 125 disk drives being required. Rejecting this prioritization scheme, we examine prioritization by file size.
C. Prioritizing Cache Placement by File Size
If we decide to place all of the largest files (Class 3) on disk, we would have 300 accesses to disk per second, since 1% of all accesses are made to Class 3 files (Table 1). This would require 3.33 drives for the "wasted bandwidth" in addition to the 4.48 drives for the bandwidth required for Class 3 files, for 8 drives total with 3.2Gbytes of RAM. This appears to be the most effective "simple" strategy. Class 3 files have the lowest priority by file size (they are the largest) and the lowest priority by class frequency (0.01). There will, however, be some Class 3 files with a higher priority by frequency than some Class 0, 1 and 2 files, because the directories have a 1/x frequency distribution. This simple strategy is a good starting point towards an optimal file allocation strategy.
D. Conclusion
It is clear that there is a trade-off between frequency and bandwidth. In the next section we derive an analytical solution to determine which files should be on disk and derive the minimum number of disks needed to support a given placement.

VI. OPTIMAL FILE ALLOCATION STRATEGY
We model a webserver in terms of the amount of RAM and the number of disks. We assume that RAM is not limited by bandwidth, and thus memory bandwidth is not a bottleneck as the number of connections increases. Disks are limited by bandwidth, which is dictated by the media transfer rate. Disks also have an associated delay time, which can be modeled in two ways:
• an increase in the file size for every file fetched from disk, or
• an increase in the bandwidth the disk has to serve for every file fetched from disk.
A. Increase in file size
Consider a disk with a media transfer speed (MS) of 256Mbits/sec (see Appendix I). This disk has a "latency" of 11ms (L), where latency includes the average seek time and the average rotational latency. When a file of size F Kbytes is accessed from the disk, on average 11ms are spent getting to the correct location on the disk. Once the correct location has been found, the disk can start transferring data at its maximum sustained rate of 256 Mbits/sec. The total time spent, T_total, and the actual disk bandwidth required for this file, File_bw, can be calculated as:

    T_total = L + F/MS   =>   File_bw = F / T_total = F / (L + F/MS)

This "actual bandwidth" can be thought of as transferring a larger file of size F + E at the original disk transfer rate of MS, where E is simply defined as:

    E = L * MS

For the above disk, this number of extra bytes is 360Kbytes. In other words, a realistic transfer of a file of size F requires the same bandwidth as transferring a file of size F + E from a hypothetical "latency free" disk.
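A few lines of Python make the penalty concrete; the drive parameters are the Appendix I values and the mean-sized example file is just an illustration.

    # Sketch: E = L * MS and the effective bandwidth of a latency-bound disk.
    MS_KBPS = 256.0 * 1024 / 8      # media speed: 256 Mbits/sec = 32768 KB/sec
    L_SEC = 0.011                   # average seek + rotational latency

    E_KB = L_SEC * MS_KBPS          # "extra file size", ~360 KB

    def effective_bandwidth_kbps(file_kb):
        """Bandwidth actually delivered for one file: F / (L + F/MS)."""
        return file_kb / (L_SEC + file_kb / MS_KBPS)

    print(round(E_KB))                               # ~360
    print(round(effective_bandwidth_kbps(14.4)))     # ~1259 KB/s for a mean-sized
                                                     # file, vs. 32768 KB/s raw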
B. Increase in bandwidth
Consider the same example disk as above. Transferring a file of size F keeps the disk busy for a time (F + E)/MS, i.e., it consumes the disk bandwidth of a file of size F + E. If the frequency of access of the file is freq, and OPS is the number of operations per second, then the extra bandwidth requirement is:

    E * freq * OPS * 8/1024 Mbits/sec

For the above disk, the extra bandwidth for any file with freq = 0.01 at OPS = 30,000 is 843.75 Mb/s.
C. Model
Examining the SPECweb99 access patterns, we note that some files are best stored on disk and some are best stored in RAM. The cost function we consider is the cost of disk (in any convenient unit) plus the cost of RAM (in the same unit) that a file requires. We assume that the cost of storing a file in RAM is dictated purely by its size:

    Cost of storing a file in RAM = F * C_RAM,

where C_RAM is the cost per Kbyte (dollars, power) of RAM. We assume that the cost of storing a file on disk is dictated by its bandwidth requirement:

    Cost of storing a file on disk = (F + E) * freq * OPS * C_DISK,

where C_DISK = C_SINGLE-DISK / MS, with C_SINGLE-DISK being the cost of a single disk and MS the media speed measured in Kbytes/sec.
D. Optimal allocation to minimize total cost
We derive an optimal allocation scheme to reduce the total cost of the server and present a summary of the scheme; the exact derivation can be found in [4]. For a given class and file inside that class, the cost of having the file in RAM is constant for every directory, while the cost of having the file on disk is a monotonically decreasing function of the directory index, because the frequency of access is inversely proportional to the directory index. When the cost of a file on disk becomes less than the cost of that file in RAM for one directory, that file in all directories that follow is also placed on disk. From this observation, we determine, for every class and each file in that class, the latest directory for which the file resides in RAM.
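The rule can be written down directly. In the sketch below the unit costs C_RAM and C_SINGLE_DISK are hypothetical placeholders (the paper deliberately leaves the cost units open), so only the shape of the result, not the particular number printed, is meaningful.

    # Sketch: the per-file RAM/disk cost crossover. RAM cost is flat across
    # directories; disk cost falls as K_d/i, so each (class, file) has a
    # "latest directory" that still belongs in RAM. Unit costs are placeholders.
    E_KB, OPS = 360.0, 30000
    C_RAM = 1.0                      # hypothetical cost per KB of RAM
    C_SINGLE_DISK = 300000.0         # hypothetical cost of one disk
    MS_KBPS = 32768.0                # media speed in KB/sec
    C_DISK = C_SINGLE_DISK / MS_KBPS

    def latest_dir_in_ram(file_kb, class_file_freq, k_d):
        """Largest directory index i for which RAM is still no more expensive:
        (F + E) * freq * (K_d / i) * OPS * C_DISK >= F * C_RAM."""
        return int((file_kb + E_KB) * class_file_freq * k_d * OPS * C_DISK
                   / (file_kb * C_RAM))

    k_d = 1.0 / sum(1.0 / j for j in range(1, 6584))     # 6583 directories
    # Example: the 500 KB Class 3 file (class freq 0.01, in-class freq 0.353).
    print(latest_dir_in_ram(500.0, 0.01 * 0.353, k_d))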
The 36 values in Table 3 give, for each file, the latest directory that resides in RAM as a fraction of the total number of directories. A value greater than 1 indicates that the file resides in RAM for all directories. It can be seen from the table that all of Class 0 and Class 1 always reside in RAM, while some files in Class 2 and most of the files in Class 3 should be placed on disk.
VII. APPLICATION TO A TARGET CONFIGURATION
Drawing on the results of the previous section, we derive configurations and file allocations that minimize the cost and the amount of RAM needed to serve a large number of connections. Our design goal is to configure a 10,000-connection machine with 4 GBytes of RAM.
Idealized file allocation
If we place all of Classes 0, 1 and 2 in RAM, this will occupy 10% of the entire file set size, or 3.2Gbytes. We then cache a "small" percentage of the Class 3 files such that they account for most of the bandwidth and most of the hits (frequency). Analyzing the frequency distribution of files inside a class and their corresponding sizes, given in Table 4, we can see that 80% of the accesses contribute 80% of the bandwidth and 55.5% of the disk space. If we cache the files of size 300--700 KB in Class 3 of a directory, we will have a hit rate of 0.806 and account for 80.3% of the bandwidth. By caching only these "hot" files, each Class 3 directory requires 2500 Kbytes of RAM. We now need to select a fraction of the directories to cache, as caching all the directories is not feasible.
For a hit rate of f in the cache, if the first x directories are cached, we have the relation x = N^f, where N is the total number of directories. With a cache of 0.7 Gbytes for Class 3 files (3.2 GB for Classes 0--2 and 0.1 GB for the system out of the 4GB total) we can cache 293 directories. The first 293 directories account for 65% of the directory accesses, leaving a miss rate of 0.35 for Class 3 directories.
The cached Class 3 files therefore account for 80% * 65% = 52% of the Class 3 bandwidth; the remaining 48% of the bandwidth needs to be served from disk. However, there is an increase in the bandwidth required when a file is put on disk, because of the disk latency: as shown earlier, an 11ms latency on a 256Mbits/sec disk incurs an extra file size of 360 Kbytes, so the extra bandwidth is E * (frequency of disk access) * OPS. The total bandwidth that has to be served from disk is the bandwidth of the files plus the bandwidth due to the latency effect:

    Bandwidth of the files              = 48% of total Class 3 b/w = 0.48 * 1147.5 = 550.8 Mb/s
    Frequency of disk access            = freq of Class 3 * miss rate = 0.01 * 0.34
    Bandwidth due to the latency effect = E * freq of disk access * OPS
                                        = 360KB * 0.01 * 0.34 * 30000 * 8/1K = 286.9 Mb/s
    Total disk bandwidth requirement    = 550.8 + 286.9 = 837.7 Mb/s

With each disk serving 256Mbits/sec, we would need 4 disks to serve this bandwidth, which is supportable by a single SCSI chain.
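This total is easy to re-derive; the sketch below uses the Class 3 bandwidth from Table 2 and the figures above.

    # Sketch: total Class 3 disk bandwidth and the resulting disk count.
    import math

    CLASS3_MBPS = 1147.5            # Class 3 bandwidth from Table 2
    E_KB, OPS, DISK_MBPS = 360.0, 30000, 256.0

    file_bw = 0.48 * CLASS3_MBPS                 # portion served from disk
    disk_access_freq = 0.01 * 0.34               # Class 3 freq * its miss rate
    latency_bw = E_KB * disk_access_freq * OPS * 8 / 1024.0
    total = file_bw + latency_bw

    print(round(file_bw, 1), round(latency_bw, 1), round(total, 1))  # 550.8 286.9 837.7
    print(math.ceil(total / DISK_MBPS))                              # 4 disks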
The relation x = N^f can be derived as follows. For a 1/i distribution, approximate

    sum over i = 1 to N of 1/i  ~  log(N).

If the first x directories are cached, their cumulative contribution to the accesses is log(x)/K, where K = log(N_d). If these directories account for a hit rate of f, we get the relation

    log(x) / log(N_d) = f   =>   x = N^f.
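As a check, the approximation can be compared with the exact harmonic sums for the 10,000-connection fileset; this is only a numerical illustration.

    # Sketch: x = N^f versus the exact harmonic-sum hit rate for 293 cached
    # directories out of 6583.
    import math

    N_D, CACHED = 6583, 293
    harmonic = lambda n: sum(1.0 / i for i in range(1, n + 1))

    exact_f = harmonic(CACHED) / harmonic(N_D)       # ~0.67
    approx_f = math.log(CACHED) / math.log(N_D)      # ~0.65, the figure used above
    print(round(exact_f, 3), round(approx_f, 3))
    print(round(N_D ** approx_f))                    # recovers ~293 directories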
The total memory used is 3.2 + 0.7 = 3.9 GB and the total number of disks required is 4. On a 4Gbyte system this leaves the operating system 0.1GBytes. Of course, using the above formulas one can configure a variety of systems.
A. Application to a Realistic Configuration
We model a simple caching scheme with LRU replacement and bound its performance between two caching schemes. The "idealized" scheme discussed above is not achievable in practice but serves as an asymptotic bound for our analysis and can be computed analytically. If one were able to implement the idealized scheme, then all Class 0, 1 and 2 files, as well as the subset of Class 3 files described above, would be locked in RAM; disk-resident Class 3 files would use a negligibly small RAM buffer while being served. The implementable scheme allows files of Classes 0, 1 and 2 to be cached and never evicted once cached. Class 3 files are cached and are the only class of files which will be
evicted. The replacement policy is LRU; the least recently used Class 3 file is evicted. The miss rate for the idealized scheme is analytically tractable, as is the aggregate bandwidth (which includes wasted bandwidth) since the composition of the disk resident files is known. For the implementable scheme we want to determine the miss rate and the required disk bandwidth (again including wasted bandwidth). We report simulation results for the implementable scheme using random file replacement instead of LRU file replacement, for tractability. Random replacement will perform more poorly than LRU and provides a worse-performance bound on the desired LRU case. Thus, the implementable case performance will be bounded by the idealized case and the random file replacement simulation. We will see that these bounds are close.
1) Simulation Methodology
Simple simulations were run by generating an access pattern according to the SPECweb99 specification and simulating a cache of 3.9 GB in which whole files, rather than pages, are evicted. In other words, when a file is chosen for eviction the entire file is evicted, not just the number of pages required to accommodate the accessed file.
2) Generating a file trace
1,500 seconds of web access for the 10,000-connection configuration were simulated, which amounts to 450,000,000 file accesses. A file trace is constructed by generating a random access pattern with the distribution of directories, classes and files specified in SPECweb99. The frequency f of access of a file is uniquely determined by its Directory Level, Class Level and File Level, and the number of times the file is accessed in this run is 450,000,000 * f. We generate a list as follows:

    for every file i {
        n = f * 450000000        /* f = freq of access of file i */
        for (j = 1 to n)
            push i on LIST
    }

LIST is a 450,000,000-entry array whose elements are then randomized to obtain a random access pattern following the SPECweb99 distribution.
3) Simulation Results
Figure 9 contains the graph of the miss rate, calculated every second, for the 1,500 seconds simulated. The first 100 to 200 seconds have a very high miss rate because this is the warm-up time for the initially empty cache. Note that a sufficient warm-up time is allowed under the benchmark run rules. The straight line on the graph is the miss rate when the most frequently used files are locked in the cache. This asymptote is derived from Figures 2 and 3. Figure 2 plots size vs. frequency of access; reading the size axis at 3.9 Gbytes gives the cumulative frequency of access of the most frequently accessed files that add up to 3.9 Gbytes. Using Figure 3 we read the cumulative bandwidth of these files to determine the bandwidth from disk; to this we add the extra bandwidth required because of the extra file size.
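For readers who want to reproduce the flavor of this experiment, the self-contained Python sketch below implements the same idea at reduced scale: Classes 0--2 are treated as already resident in RAM, Class 3 files are cached whole with random eviction, and accesses are drawn directly from the SPECweb99 distributions instead of materializing the full 450,000,000-entry list. The trace length and random seed are ours; the output only illustrates the method and will not match Figures 9 and 10 exactly.

    # Sketch: scaled-down random-replacement cache simulation for the
    # 10,000-connection fileset. Classes 0-2 are pinned; only Class 3 files
    # are cached and evicted (whole files at a time).
    import random
    random.seed(0)

    N_DIRS = 6583
    CLASS_FREQ = [0.35, 0.50, 0.14, 0.01]
    FILE_FREQ = [0.039, 0.059, 0.088, 0.177, 0.353, 0.118, 0.071, 0.050, 0.044]
    DIR_W = [1.0 / i for i in range(1, N_DIRS + 1)]      # Zipf(1) over directories

    def size_kb(cls, f):                                  # file f (0..8) of class cls
        return (f + 1) * 0.1 * 10 ** cls

    CACHE_KB = 3.9 * 1024 * 1024
    PINNED_KB = N_DIRS * sum(size_kb(c, f) for c in range(3) for f in range(9))
    class3_budget = CACHE_KB - PINNED_KB                  # ~0.76 GB for Class 3

    N_ACC = 2_000_000                                     # scaled-down trace length
    dirs = random.choices(range(N_DIRS), weights=DIR_W, k=N_ACC)
    classes = random.choices(range(4), weights=CLASS_FREQ, k=N_ACC)
    files = random.choices(range(9), weights=FILE_FREQ, k=N_ACC)

    cache, used, misses = {}, 0.0, 0
    for d, c, f in zip(dirs, classes, files):
        if c < 3:                                         # pinned classes always hit
            continue
        key, sz = (d, f), size_kb(3, f)
        if key in cache:
            continue
        misses += 1
        while used + sz > class3_budget and cache:        # random whole-file eviction
            victim = random.choice(list(cache))
            used -= cache.pop(victim)
        cache[key] = sz
        used += sz

    # Overall miss rate; warm-up is included, so expect a value somewhat above
    # the steady-state asymptote reported for Figure 9.
    print(round(misses / N_ACC, 4))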
Figure 10 contains the graph of the total bandwidth (including the bandwidth required because of the latency effect) required from the disks. The straight line gives the bandwidth required when the most frequently used files are locked in the cache, for the idealized case.
We see that the asymptotic miss rate is bounded below by the idealized case at 0.0039 and above by the random replacement policy at approximately 0.0044. The asymptotic disk bandwidth for the implementable case is bounded above by the random file replacement strategy at approximately 882Mbits/sec (4 disks) and below by the idealized case at 703.125 Mbits/sec (3 disks).

VIII. SUMMARY
Disk latencies play a major role in the performance of a web server that caches only a fraction of all files in the SPECweb99 benchmark. With absolute control over the placement of the files in RAM versus disk, good results can be obtained by selecting the most frequently accessed files or the files from Classes 0, 1 and 2. An optimal file placement scheme is derived based on the relative cost of placing files in RAM and on disk. Conventional cache replacement schemes, however, do not approach these ideal configurations because of their relatively high miss rates of greater than 5% for a cache of less than 4Gbytes and 10,000 connections. Supporting 5% of the file accesses from disk requires an additional 15 disks because of the wasted bandwidth associated with the disk latency. A cache replacement scheme is therefore proposed that controls eviction from the cache depending on file attributes (size). In the ideal case, this scheme can support 10,000 connections on a system with a 4Gbyte cache and 3 disks with a bandwidth of 256Mbits/sec each. A practical implementation of the scheme using random replacement is shown to support 10,000 connections using 4Gbytes of cache and 4 disks.
IX. REFERENCES
[1] Krishna Kant and Youjip Won, "Performance Impact of Uncached File Accesses in SPECweb99," in Workload Characterization for Computer System Design, Eds. L. John and A. Maynard, pages 87-104, Kluwer Academic Publishers, Boston, 2000.
[2] "An explanation of the SPECweb99 benchmark," available from the SPEC web site, www.specbench.org/osg/web99.
[3] SPECweb99 benchmarking results, available from the SPEC web site, www.specbench.org/osg/web99.
[4] Tom W. Keller, Karthikeyan Sankaralingam and H. Peter Hofstee, "An Optimal File Allocation Strategy for SPECweb99," available from the authors.
APPENDIX I
A moderate cost/performance SCSI drive is chosen as representative, the IBM Ultrastar 18ES 3.5-inch, 9.1 GB hard disk. All measures are taken from http://www.storage.ibm.com/hardsoft/diskdrdl/ultra/18esdata.htm. The sustained data rate for a single unit on a SCSI chain has been measured in our lab, on a 600MHz Pentium III Intellistation system running Linux, at 20 Mbytes/sec (160 Mbits/sec) when reading a large file that is not stored contiguously on the disk. 256 Mbits/sec was chosen as the media transfer rate for our target configuration as a more convenient constant to remember.
IBM Ultrastar 18ES Hard Disk Drive
  Data buffer                 2 MB
  Rotational speed            7200 rpm
  Latency (avg)               4.17 ms
  Media transfer speed        159 to 244 Mbits/sec
  Sustained data rate         12.7 to 20.2 MB/sec
  Seek time (typical read)
    Average                   7.0 ms
    Track-to-track            0.8 ms
    Full track                13.0 ms
TABLES AND FIGURES
Class  Class Freq %  File  Size (KB)  In-Class File Freq %  In-Dir. Freq  Bandwidth (KB per access)
0      35            1     0.1        3.9                   0.0137        0.0014
0      35            2     0.2        5.9                   0.0207        0.0041
0      35            3     0.3        8.8                   0.0308        0.0092
0      35            4     0.4        17.7                  0.0620        0.0248
0      35            5     0.5        35.3                  0.1236        0.0618
0      35            6     0.6        11.8                  0.0413        0.0248
0      35            7     0.7        7.1                   0.0249        0.0174
0      35            8     0.8        5.0                   0.0175        0.0140
0      35            9     0.9        4.4                   0.0154        0.0139
0      35            Tot   4.5                                            0.1713
1      50            1     1          3.9                   0.0195        0.0195
1      50            2     2          5.9                   0.0295        0.0590
1      50            3     3          8.8                   0.0440        0.1320
1      50            4     4          17.7                  0.0885        0.3540
1      50            5     5          35.3                  0.1765        0.8825
1      50            6     6          11.8                  0.0590        0.3540
1      50            7     7          7.1                   0.0355        0.2485
1      50            8     8          5.0                   0.0250        0.2000
1      50            9     9          4.4                   0.0220        0.1980
1      50            Tot   45                                             2.4475
2      14            1     10         3.9                   0.0055        0.0546
2      14            2     20         5.9                   0.0083        0.1652
2      14            3     30         8.8                   0.0123        0.3696
2      14            4     40         17.7                  0.0248        0.9912
2      14            5     50         35.3                  0.0494        2.4710
2      14            6     60         11.8                  0.0165        0.9912
2      14            7     70         7.1                   0.0099        0.6958
2      14            8     80         5.0                   0.0070        0.5600
2      14            9     90         4.4                   0.0062        0.5544
2      14            Tot   450                                            6.8530
3      1             1     100        3.9                   0.0004        0.0390
3      1             2     200        5.9                   0.0006        0.1180
3      1             3     300        8.8                   0.0009        0.2640
3      1             4     400        17.7                  0.0018        0.7080
3      1             5     500        35.3                  0.0035        1.7650
3      1             6     600        11.8                  0.0012        0.7080
3      1             7     700        7.1                   0.0007        0.4970
3      1             8     800        5.0                   0.0005        0.4000
3      1             9     900        4.4                   0.0004        0.3960
3      1             Tot   4500                                           4.8950

TABLE 1: File Characteristics of a SPECweb99 Directory
Number of connections          10000
Number of operations/second    30000
Number of directories          6583
Number of files                236988
Total data                     31.39 Gbytes

                       Size           Bandwidth Requirement
Class 0, 1, 2 files    3.20 GBytes    2220 Mbits/sec
Class 3 files          28.19 GBytes   1147 Mbits/sec
All classes            31.39 GBytes   3367 Mbits/sec

Table 2: Characteristics of SPECweb99 with 10000 Connections
File#                1      2      3      4      5      Total
Access Freq.         0.353  0.177  0.117  0.088  0.071  0.806
Relative File Size   5      4      6      3      7      25 (55.5%)
Bandwidth            1.765  0.708  0.702  0.264  0.497  3.936 (80.3%)

Table 4: File characteristics (the Class 3 files cached per directory)
Class #  File Number  File Size (KB)  Freq of Access  Latest Dir in RAM / N_d  Latest Dir in RAM
0        1            0.5             0.353357        582.7                    INF
0        2            0.4             0.176678        364.1                    INF
0        3            0.6             0.117786        161.9                    INF
0        4            0.3             0.088339        242.6                    INF
0        5            0.7             0.070671        83.2                     INF
0        6            0.2             0.058893        242.6                    INF
0        7            0.8             0.050480        52.0                     1.07e+199
0        8            0.9             0.044170        40.5                     7.69e+154
0        9            0.1             0.039262        323.4                    INF
1        1            5               0.353357        84.2                     INF
1        2            4               0.176678        52.5                     6.21e+200
1        3            6               0.117786        23.4                     6.80e+89
1        4            3               0.088339        34.9                     3.60e+133
1        5            7               0.070671        12.1                     2.59e+46
1        6            2               0.058893        34.8                     1.54e+133
1        7            8               0.050480        7.5                      1.43e+29
1        8            9               0.044170        5.9                      6.01e+22
1        9            1               0.039262        46.3                     1.08e+177

Table 3A: Number of connections: 10000, Number of directories: 6583
Class #  File Number  File Size (KB)  Freq of Access  Latest Dir in RAM / N_d  Latest Dir in RAM
2        1            50              0.353357        2.6                      2.02e+10
2        2            40              0.176678        1.6                      2.26e+06
2        3            60              0.117786        0.7                      1.15e+03
2        4            30              0.088339        1.0                      1.59e+04
2        5            70              0.070671        0.3971                   5.01e+01
2        6            20              0.058893        1.0237                   1.23e+04
2        7            80              0.050480        0.2539                   1.42e+01
2        8            90              0.044170        0.2020                   9.01e+00
2        9            10              0.039262        1.3291                   1.81e+05
3        1            500             0.353357        0.0396                   2.16e+00
3        2            400             0.176678        0.0219                   1.85e+00
3        3            600             0.117786        0.0123                   1.70e+00
3        4            300             0.088339        0.0126                   1.70e+00
3        5            700             0.070671        0.0069                   1.62e+00
3        6            200             0.058893        0.0107                   1.67e+00
3        7            800             0.050480        0.0047                   1.55e+00
3        8            900             0.044170        0.0040                   1.58e+00
3        9            100             0.039262        0.0118                   1.69e+00

Table 3B: Number of connections: 10000, Number of directories: 6583