On the Role of Compression in Distributed Systems

Fred Douglis
(douglis@mitl.com)
Matsushita Information Technology Laboratory
182 Nassau Street, Third Floor
Princeton, NJ 08542 USA

Abstract

Compression has been used in numerous ways for many years, but recently two factors have combined in a way to push compression to the forefront of distributed systems. First, the disparity between processor speeds and I/O rates is ever-increasing, making it possible to perform compression in software to a much greater extent than was previously feasible. Second, the growth of new applications demanding enormous data rates, such as digital video and audio, makes hardware compression increasingly desirable. I discuss the importance of compression in various environments and describe how compression may be used not only to reduce the demand for disk space, disk bandwidth, and network bandwidth, but also to appear to extend physical memory.

1 Introduction

Data compression has long been used as a method to reduce the number of bits being stored or transmitted [7], but stored where and transmitted over what? The answer to this question is changing as the gap between processing speed and I/O bandwidth increases, and the uses for computers shift toward multimedia and other high-data-rate applications. Traditionally, compression was used to reduce disk storage demands, by selectively compressing files that had not been accessed for a long time, and to reduce bandwidth requirements over such media as wide-area networks and modems. Some forms of compression are now nearly ubiquitous: for example, the vast majority of "anonymous FTP" archives store all but the most trivial of files in compressed format, saving disk space while making retrievals over the Internet faster. (Other sites sometimes store the original files in uncompressed format but allow on-the-fly compression upon request.) Images impose especially excessive requirements on disk storage, disk bandwidth, and network bandwidth if they are not compressed; the current trend toward multimedia environments ensures that compression will play an increasingly important role in most distributed systems, even over only local-area networks.

But multimedia is just one aspect of a trend in computing in general and distributed systems specifically. Processor speeds continue to increase at substantial rates, and user interfaces take advantage of the increase in processing power to provide various forms of new functionality. At the same time, though, disk speeds are improving much less rapidly, and soon the fast processors may connect to a slow wireless metropolitan-area network rather than a relatively fast local network. The "I/O bottleneck" is already a problem for disks, and some systems attempt to address it using techniques such as caching and logging [9].

In this paper I give a specific example of a use for compression that goes beyond existing file system and networking work: using an area of physical memory as a cache of compressed virtual memory pages to reduce or eliminate transfers to and from a slower backing store. Mobile computers have the greatest potential of benefitting from this use of compression, since they are likely to execute applications that do not fit comfortably in physical memory.

2 What Does Compression Buy?

As was mentioned above, compression can be used to store something in less space or to transfer it less expensively. Often the two go hand in hand, such as when compressed data is written to a disk: the disk I/O takes less time, since fewer bits are being transferred, and the storage occupied on the disk after the transfer is smaller. Unfortunately, the time to compress the data can exceed the savings from transferring less data, resulting in an overall degradation [2]. But with hardware compression, or when the disparity between processing speed and I/O is great, compression actually improves overall performance. Either or both of these situations will be increasingly common in the future.

Even today, compression buys a significant improvement in disk capacity with only a slight degradation in performance. Cate and Gross combined compression and caching in a two-level file store, which automatically compresses least-recently-used files in order to save disk space [3]. Stacker¹ uses compression to increase the effective capacity of a PC hard disk, sometimes improving I/O performance but often being measurably slower than without compression; the slowness is due to the relative performance of an Intel 386 chip compared to a typical PC hard drive, and can be improved with a special processor board [5]. Taunton described a mechanism for compressing binary executables on Acorn personal computers, reducing disk space requirements and improving file system bandwidth; since only decompression was being performed on-line, the processing overhead was offset by the reduction in I/O [10]. Burrows et al. combined on-line compression with Sprite LFS [9] to compress and decompress all disk I/O automatically [2]. They used it primarily to increase effective disk capacity, but expect that with hardware compression the increase in effective disk bandwidth would improve overall system performance.

So far I have talked primarily … systems? The answer is that dis… years ago on Sun-3 workstation… 10-Mbit Ethernet into a server's … when a distributed system cont… data over a wide-area network … may not find a Gigabit wide-area network much of a … such pairs suggests that all hosts should go to lengths … transferred. Even if compression slowed … improve performance once contention is … charge users in proportion to the amount … transmitting it may lower costs significantly. And any communication over the low-bandwidth … benefit from compression regardless of … allows computers to avoid slow I/O completely.

¹ Stacker is a trademark of Stac Electronics.

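The trade-off described at the start of this section (compression time versus the time saved by moving fewer bits) can be made concrete with a back-of-the-envelope model. The sketch below is illustrative only; the bandwidth, compression-speed, and compression-ratio figures are assumptions, not measurements from any of the systems cited above.

    #include <stdio.h>

    /*
     * Rough break-even model for compress-then-transfer versus transferring
     * raw data (illustrative assumptions only; not measurements from this paper).
     */
    int main(void) {
        double size_mb       = 1.0;   /* amount of data to move (MB)              */
        double io_mb_per_s   = 1.0;   /* assumed disk or network bandwidth        */
        double comp_mb_per_s = 0.5;   /* assumed software compression speed       */
        double ratio         = 0.4;   /* assumed compressed size / original size  */

        double t_raw  = size_mb / io_mb_per_s;
        double t_comp = size_mb / comp_mb_per_s          /* compress first        */
                      + (size_mb * ratio) / io_mb_per_s; /* then move fewer bytes */

        printf("raw transfer: %.2f s, compress+transfer: %.2f s\n", t_raw, t_comp);
        /* Compression wins whenever compression time < (1 - ratio) * raw time. */
        return 0;
    }

With these assumed numbers software compression loses (2.4 s versus 1.0 s); doubling the compression speed or halving the I/O bandwidth reverses the outcome, which is exactly the shift this section argues is under way.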

3 The Compression Cache

Just as compressing data on disk can allow a user to save larger files (or more files) without buying another disk, compressing data in memory can allow a user to run larger processes without buying more memory. In this case, virtual memory pages would be compressed and written back into physical memory rather than to a backing store. This idea was briefly mentioned by Appel and Li, who claimed that user-level support for compression would be more effective than generic operating system support [1]. Nevertheless, I believe that OS support for compressed virtual memory can prove useful. I refer to the technique of trading regular physical memory for data stored in compressed format as a compression cache (CCACHE).

One question, of course, is whether paging is even interesting or desirable. In some environments, paging is largely passé due to the large amounts of memory available [4]. However, I believe there are still systems and applications that could benefit greatly from a cheaper method of paging. Mobile computers are especially likely to have less memory available than needed, and the cost of paging over a wireless network or onto a small local disk might be much more than the cost of compressing the page. Even in a workstation-based environment, a main-memory database that is within a factor of two or three of the size of available physical memory might fit completely in memory in compressed format. The important issues are how effective compression is at saving memory and how costly compression is by comparison to network or disk I/O. In the best case, compression can be done in hardware, and the cost of compression will be limited only by memory-to-memory bandwidth. In this case I would expect the CCACHE to provide a substantial improvement in performance. Beyond this, however, I argue that on-line compression in software can still provide benefits to a wide class of applications if designed properly. In this section I describe an implementation of the CCACHE in software, and the performance of some benchmarks, to support this claim.

3.1 Overview

There are a number of potential benefits of using a compression cache. First, compressing a page and retaining it in memory can be cheaper than writing it to backing store. Second, reading it back from memory is likely to be much cheaper than reading it from backing store, even taking the cost of decompression into account. Finally, if the page is eventually written to backing store, it can be written in compressed format, requiring less bandwidth.
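To make the page-out and read-back paths concrete, here is a minimal user-level sketch of a compression cache with a put (page-out) and get (page-fault) operation. This is not the Sprite implementation: the names, the fixed slot array, the trivial placement policy, and the use of zlib as the compressor are all assumptions made purely for illustration (the prototype described below uses the LZ algorithm of [11] inside the kernel). Link with -lz.

    #include <stdio.h>
    #include <string.h>
    #include <zlib.h>

    /* Hypothetical user-level sketch of a compression cache; not Sprite code. */

    #define PAGE_SIZE    4096
    #define CCACHE_SLOTS 256

    struct cslot {
        long   vpage;              /* which virtual page is cached            */
        uLongf clen;               /* compressed length in bytes              */
        Bytef  cdata[PAGE_SIZE];   /* compressed bytes (at most one page)     */
    };

    static struct cslot ccache[CCACHE_SLOTS];   /* empty slots fail uncompress */

    /* Evict a page into the compression cache instead of writing it to disk.
     * A real cache would first push any displaced victim to backing store.   */
    int ccache_put(long vpage, const Bytef *page)
    {
        struct cslot *s = &ccache[vpage % CCACHE_SLOTS];  /* trivial placement */
        s->clen = sizeof(s->cdata);
        if (compress2(s->cdata, &s->clen, page, PAGE_SIZE, Z_BEST_SPEED) != Z_OK)
            return -1;             /* incompressible: caller pages to disk     */
        s->vpage = vpage;
        return 0;
    }

    /* On a fault, try the compression cache before going to backing store.   */
    int ccache_get(long vpage, Bytef *page)
    {
        struct cslot *s = &ccache[vpage % CCACHE_SLOTS];
        uLongf len = PAGE_SIZE;
        if (s->vpage != vpage)
            return -1;             /* miss: caller must read from disk         */
        if (uncompress(page, &len, s->cdata, s->clen) != Z_OK || len != PAGE_SIZE)
            return -1;
        s->vpage = -1;             /* page is uncompressed again; free the slot */
        return 0;                  /* hit: fault satisfied without disk I/O    */
    }

    int main(void)
    {
        Bytef page[PAGE_SIZE], back[PAGE_SIZE];
        memset(page, 'x', PAGE_SIZE);            /* a highly compressible page */
        if (ccache_put(42, page) == 0 && ccache_get(42, back) == 0)
            printf("fault on page 42 satisfied from ccache (%s)\n",
                   memcmp(page, back, PAGE_SIZE) == 0 ? "data intact" : "corrupt");
        return 0;
    }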
The CCACHE has some potential disadvantages as well. Most importantly, it competes with user processes (and perhaps the file system cache) for memory. Putting a page in the CCACHE has the effect of freeing up only part of a page, so more pages must be transferred from uncompressed memory to the CCACHE in order to make room for one new page. Processes will take additional page faults because they are given less memory. (This suggests that the CCACHE should be of a variable size, and should be used only when its presence improves performance rather than degrading it.) Additionally, the CCACHE adds overhead in the form of added complexity during page faults, as well as extra code and data associated with the cache.
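To make the first disadvantage concrete: if a page compresses to, say, 35% of its original size (in line with the 30-40% ratios reported in the next subsection), then moving one page into the cache frees only about 65% of a page frame. Roughly 1/0.65, or about 1.5, resident pages must therefore be compressed to make room for each new uncompressed page, while the memory handed to the CCACHE effectively holds about 1/0.35, or nearly 3, times as many pages as it would uncompressed. These figures are illustrative arithmetic, not measurements from the paper.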

90

3.2 Target Environment

As was mentioned above, the CCACHE is primarily targeted for mobile computers, which typically have less memory than desk-based computers of the same generation. Nevertheless, the idea of the CCACHE extends naturally to any environment that pages when applications use more virtual memory than physical memory.

I have added a compression cache to the Sprite operating system [8], running on DECstation² 5000/200 workstations (approximately 18 SPECmarks, running a 25 MHz MIPS R3000 processor). Sprite is a particularly suitable system for the CCACHE, because it already supports the idea of trading physical memory between the virtual memory system and the file system [6]. Trading memory a third way, including the compression cache, would be a natural extension of this technique.

The initial prototype of the compression cache has been simplified in a number of ways. First of all, I use a fixed-size CCACHE, which is configured at kernel load time. The fixed-size cache is simpler but results in unneeded overhead for applications that would otherwise fit in memory; measurements of the effects of this are presented below. Secondly, compressed pages are written out using the same number of bytes as uncompressed pages: a full 4-Kbyte file block. In the future, compressed pages will be combined into a smaller number of file blocks to save disk bandwidth. Third, I use only a single LZ-based compression algorithm [11] for all data. The compression algorithm reduces pages to 30-40% of their original size on average, depending on the application mix; I would expect application-specific compression algorithms to do much better. Finally, in order to estimate the effects of compression in a system with a large gap between processor and I/O speed, I configured the test system to page to a local disk running the original Sprite file system [6] rather than the faster Sprite LFS [9].
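The 30-40% figure above can be checked for any workload with a small user-level tool. The sketch below is not part of the prototype; it uses zlib's fastest setting as a stand-in for the algorithm of [11] (an assumption) and reports the average per-4-Kbyte-page compression ratio for a file named on the command line. Link with -lz.

    #include <stdio.h>
    #include <zlib.h>

    /* Report the average compression ratio of a file, treated as 4-Kbyte pages.
     * zlib is used here only as a stand-in for the LZ-based algorithm of [11]. */
    int main(int argc, char **argv)
    {
        if (argc != 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }
        FILE *f = fopen(argv[1], "rb");
        if (!f) { perror(argv[1]); return 1; }

        Bytef page[4096], out[8192];
        double total_in = 0, total_out = 0;
        size_t n;
        while ((n = fread(page, 1, sizeof(page), f)) > 0) {
            uLongf clen = sizeof(out);
            if (compress2(out, &clen, page, n, Z_BEST_SPEED) != Z_OK)
                continue;               /* skip pages that fail to compress */
            total_in  += n;
            total_out += clen;
        }
        fclose(f);
        if (total_in > 0)
            printf("average compressed size: %.0f%% of original\n",
                   100.0 * total_out / total_in);
        return 0;
    }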
3.3 Target Applications

I next address the issue of what applications might make effective use of a compression cache. Obviously, many applications will fit into memory without the need for compression, and other applications may cause excessive I/O even with compression. The applications that can best make use of a CCACHE are those that will not comfortably fit into physical memory without compression but will fit if some of their pages are compressed.

One can imagine a contrived scenario in which the CCACHE would provide significant performance benefits: if an application cycles linearly through a working set that is one page larger than the maximum number of pages allowed to be resident, and a least-recently-used algorithm is used for page replacement, then the process will take a page fault on each new page. With the compression cache, … will be satisfied by a … Another scenario … formly in an address … memory for the CCACHE … the average cost of a … compression is much … in the compression cache …

3.4 Performance

In order to evaluate the CCACHE, I ran a benchmark that cycles linearly through a set of pages; after touching the last page, it would repeat the cycle. Figure 1 shows the average time for each access, with each line indicating a different configuration, on a system that was artificially restricted to approximately 14 Mbytes of memory available to user processes, including any memory set aside for the CCACHE.


² DECstation is a trademark of Digital Equipment Corporation.
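The cyclic benchmark of Section 3.4 can be approximated with a few lines of user-level code. The sketch below is a reconstruction, not the program used in the paper: the working-set size, pass count, and use of gettimeofday are choices made here for illustration, and on current hardware it demonstrates the access pattern rather than reproducing the 1992 timings.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>

    #define PAGE 4096

    /* Cycle linearly through a working set, touching one byte per page, and
     * report the average time per page access. With a working set slightly
     * larger than available memory and LRU replacement, every access faults. */
    int main(int argc, char **argv)
    {
        long mbytes = (argc > 1) ? atol(argv[1]) : 20;   /* working-set size (MB) */
        long npages = mbytes * 1024 * 1024 / PAGE;
        long passes = 10, touches = 0;
        volatile char *buf = malloc(npages * PAGE);
        if (!buf) { perror("malloc"); return 1; }

        struct timeval t0, t1;
        gettimeofday(&t0, NULL);
        for (long p = 0; p < passes; p++)
            for (long i = 0; i < npages; i++, touches++)
                buf[i * PAGE] += 1;                      /* touch (dirty) each page */
        gettimeofday(&t1, NULL);

        double us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
        printf("%ld MB working set: %.1f us per page access\n", mbytes, us / touches);
        return 0;
    }

Writing to each page (rather than only reading it) matters: dirty pages must be written out when evicted, which is why the figure's caption charges a disk read and a disk write per access once the unmodified system starts thrashing.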


[Figure 1 appears here: a plot of average page access time against working-set size, for working sets of roughly 10 to 40 Mbytes.]

Figure 1: Compression Cache Performance Under Thrashing. With a zero-sized cache (i.e., an unmodified system), a large number of pages fit in memory without measurable page-fault overhead, but once the system starts thrashing it pays for a disk read and disk write per page access. The average access cost continues to increase as the initial startup period without page faults becomes a lesser portion of total access time. The CCACHE reduces the average access time considerably even though compression is done in software. Measurements were taken on a DECstation 5000/200 with approximately 14 Mbytes available for user processes, paging to a local RZ57 disk, with a page size of 4 Kbytes.

For address spaces that fit into physical memory when partly compressed but not when the CCACHE was not used, the CCACHE provided a three-fold improvement of the average access time. Note that with a variable-sized compression cache, the performance of the system on this benchmark should follow the lowest line of the graph for any given address space size (i.e., negligible overhead when the benchmark fits in memory, and moderate overhead all the way up to about 30 Mbytes). Furthermore, I estimate that this line would shift downward by about 3 ms/access if compression were performed in hardware.³

As a second test, I ran an application that responds to queries based on the contents of a hash table. The database containing the hash table was over 40 Mbytes, all of which was cached in memory, and the total address space of the process was over 60 Mbytes. The benchmark consisted of running a set of 9 queries against this database; the queries were run once to load the process's address space with the appropriate parts of its database and a second time to evaluate performance. The benchmark was run on the same DECstation as the previous measurements, but without the artificial restriction, so it had about 25 Mbytes of memory available. Running with a 12-Mbyte CCACHE was 32% faster than without the compression cache (9.28 minutes versus 13.7).

³ This estimate was obtained by setting aside memory for the compression cache but simply bcopying the data rather than compressing and decompressing it: compression added 3 ms per access.
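These results are consistent with a simple cost model for the thrashing benchmark. The constants below are illustrative assumptions rather than measurements from the paper, apart from the roughly 3 ms of software compression overhead per access estimated above: a faulting access that goes to disk is charged a read plus a write of the evicted dirty page, while an access satisfied by the compression cache is charged only software compression and decompression.

    #include <stdio.h>

    /* Back-of-the-envelope model of average access cost under thrashing.
     * All constants are assumptions for illustration, except the ~3 ms
     * software compression cost per access estimated in the text above. */
    int main(void) {
        double disk_read_ms  = 20.0;  /* assumed seek + rotation + 4KB transfer   */
        double disk_write_ms = 20.0;  /* assumed cost of writing the evicted page */
        double sw_comp_ms    = 3.0;   /* software compress + decompress per access */

        double per_access_disk   = disk_read_ms + disk_write_ms;
        double per_access_ccache = sw_comp_ms;

        printf("disk-backed fault:   %.0f ms/access\n", per_access_disk);
        printf("ccache-backed fault: %.0f ms/access\n", per_access_ccache);
        /* Misses in the compression cache, and the extra faults caused by
         * giving user pages less memory, erode this gap, which is why the
         * measured improvement is closer to three-fold than ten-fold.      */
        return 0;
    }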

4 Conclusions

The widespread use of compression for file systems, Internet FTP access, modems, and multimedia indicates that compression will play an increasingly important role in future distributed systems. The disparity between processor speeds and network and disk speeds will make it more and more desirable to perform compression for many forms of I/O, even if compression is done in software. This will be especially true when designing systems to use wireless networks. At the same time, though, the widespread use of compression for multimedia should make hardware compression more readily available. Techniques such as the compression cache that have limited usefulness when compression is performed in software will thrive once that is done.

Acknowledgements

Marvin Theimer and others at Xerox PARC previously pursued the idea of integrating virtual memory and compression, cited as a personal communication from Petersen in the paper by Appel and Li [1]. Brian Marsh and Rafael Alonso contributed to the design and initial implementation of the compression cache, and helped to formalize the expected performance of the CCACHE. Lisa Bahler and Dick Lipton provided helpful feedback on earlier drafts of this paper, which helped improve its content and presentation. Lastly, I would like to thank Andrew Appel, Brian Bershad, Mike Burrows, Mike Jones, Kai Li, Krish Ponamgi, and Jonathan Sandberg for their comments and suggestions.

References

[1] Andrew W. Appel and Kai Li. Virtual memory primitives for user programs. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 96-107, Santa Clara, CA, April 1991.

[2] M. Burrows, C. Jerian, B. Lampson, and T. Mann. On-line data compression in a log-structured file system. Technical Report 85, DEC Systems Research Center, April 1992.

[3] Vincent Cate and Thomas Gross. Combining the concepts of compression and caching for a two-level file system. In The Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 200-211. ACM, April 1991.

[4] Robert Hagmann. Comments on workstation operating systems and virtual memory. In Proceedings of the Second Workshop on Workstation Operating Systems, pages 43-48, Pacific Grove, CA, September 1989. IEEE.

[5] John Markoff. Double hard-disk capacity, through software. The New York Times, February 23, 1992.

[6] M. Nelson, B. Welch, and J. Ousterhout. Caching in the Sprite network file system. ACM Transactions on Computer Systems, 6(1):134-154, February 1988.

[7] Mark Nelson. The Data Compression Book. M&T Books, 1991.


[8] J. Ousterhout, A. Cherenson, F. Douglis, M. Nelson, and B. Welch. The Sprite network operating system. IEEE Computer, 21(2):23-36, February 1988.

[9] M. Rosenblum and J. Ousterhout. The design and implementation of a log-structured file system. In Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles, pages 1-15, October 1991.

[10] Mark Taunton. Compressed executables: an exercise in thinking small. In Proceedings of the USENIX Summer 1991 Conference, 1991.

[11] Ross N. Williams. An extremely fast Ziv-Lempel data compression algorithm. In Data Compression Conference, pages 362-371, April 1991.
