StarOS, a Multiprocessor Operating System for the Support of Task Forces
Anita K. Jones, Robert J. Chansler Jr., Ivor Durham, Karsten Schwans and Steven R. Vegdahl
Department of Computer Science
Carnegie-Mellon University
Pittsburgh, PA 15213
StarOS is a message-based, object-oriented, multiprocessor operating system, specifically designed to support task forces, large collections of concurrently executing processes that cooperate to accomplish a single purpose. StarOS has been implemented at Carnegie-Mellon University for the 50 processor Cm* multi-microprocessor computer.
Realizing these potential benefits requires software structures that make effective use of the hardware. StarOS is an experimental operating system for Cm*, a multi-microprocessor with approximately 50 processors and 3 Mbytes of main memory (in 1979) [5]. StarOS is designed for the support of and experimentation with task forces, software composed of many cooperating, communicating "small" processes, together with supporting code and data. Collectively, the task force components accomplish a single task. Our objective is to determine whether task force software is conducive to achieving the potential benefits of multiprocessors, and to understand the design issues related to operating system facilities that support task forces. A limited version of StarOS has been running since 1977. An adapted and expanded version is now being completed.
In this paper, we first discuss the attributes of task force software and of the Cm* architecture. We then discuss some of the facilities in StarOS that allow development and experimentation with task forces. StarOS itself is presented as an example task force.
1. Introduction
Technological advances have made it attractive to interconnect many less expensive processors and memories to construct a powerful, cost-effective computer. Potential benefits include increased cost-performance resulting from the exploitation of many cheap processors, enhanced reliability in the integrity of data and in the availability of useful processing cycles, and a physically adaptable computer whose capacity can be expanded or reduced by addition or removal of modular components.
It is appropriate to analyze task forces in detail. Processes of a task force are typically small in comparison to counterparts in a uniprocessor multiprogramming system, and there are more of them. The desirability of many small processes, rather than a few larger ones, derives directly from the three potential benefits listed above: To achieve cost-performance or even absolute performance, a computation is decomposed into small parts, each of which is performed by a separate process executing in parallel with the others. This strategy maximizes usage of the available parallelism. Enhanced reliability may result if the task is decomposed such that no one process performs an indispensable function. If an error can be contained so that it results in the destruction of no more than one process, the task might still be completed. In this case, a more reliable implementation results. The third potential benefit of hardware adaptability will be well served if the task force can grow (or shrink) with the addition (or removal) of processor and memory resources. This is particularly easy if data structures are separately locatable and addressable entities whose size or number can vary. Likewise, it is particularly easy to
This work was sponsored by the Defense Advanced Research Projects Agency (DOD), ARPA Order No. 3597, monitored by the Air Force Avionics Laboratory under Contract F33615-78-C-1551. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency or the US Government.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1979 ACM 0-89791-009-5/79/1200/0117 $00.75
do when the task force is composed, in part, of duplicated processes or data.
cope with the problems that arise in large applications designed to exploit the parallelism inherent in distributed systems, whether loosely or tightly coupled. It is our objective to explore facilities that do so.
An individual process in a task force is specialized; it has only a small part of the overall work of the task force to accomplish. Hence, it needs to access relatively little data or code. As a result, the address domain of a process may be small. This suggests that each unit of code and each data structure should be separately addressable so that address domains can be tailored to the requirements of the individual process.
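To make the idea of a tailored address domain concrete, here is a minimal sketch in Python. It is not StarOS code: the `AddressDomain` class, the object names, and the `sorter` process are all hypothetical, chosen only to show a small process that can reach exactly the objects it needs and nothing else.

```python
class AddressDomain:
    """Hypothetical model: the set of objects one small process may address."""

    def __init__(self, objects):
        # objects: mapping from a local name to a separately addressable object
        self._objects = dict(objects)

    def access(self, name):
        # Anything outside the tailored domain is simply not addressable.
        if name not in self._objects:
            raise PermissionError(f"{name!r} is not in this address domain")
        return self._objects[name]


# Separately addressable units that belong to the task force as a whole.
input_block = [3, 1, 2]
result_box = []
log_buffer = []

# The sorter needs only its input and its output, so it is given only those.
sorter_domain = AddressDomain({"input": input_block, "output": result_box})


def sorter(domain):
    """A specialized process: sorts its input into its output object."""
    domain.access("output").extend(sorted(domain.access("input")))


sorter(sorter_domain)
```

Because `log_buffer` was never placed in `sorter_domain`, an attempt to access it from the sorter raises an error rather than silently succeeding, which is the property a small address domain is meant to provide.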
2. Cm* Architecture
The design of an operating system is influenced by the hardware resources it manages. Hence, it is appropriate to sketch the salient aspects of the Cm* architecture. Additional descriptive detail can be found in papers by Fuller and Swan, et al. [5, 15]. Cm* was designed and a prototype implemented at Carnegie-Mellon University. The prototype began running in Spring, 1977. By Fall, 1979 it included 50 processors.
Processes of a task force rely more on other processes than is the case in the typical multiprogramming system. What is performed by a subroutine in the multiprogramming system may be performed by a separate process in the task force. Inter-process communication and synchronization are substantially more frequent. Hence, each process must be able to address some data objects, such as mailboxes and semaphores, for communication and synchronization.
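The contrast between a subroutine call and a separate cooperating process can be illustrated with a small sketch. This uses Python threads and `queue.Queue` objects as stand-ins for processes and mailboxes; the names `squarer`, `requests`, and `replies`, and the use of `None` as a shutdown message, are assumptions made for the example and do not reflect the StarOS mailbox interface.

```python
import queue
import threading


def squarer(requests, replies):
    """A separate process doing what a subroutine call would otherwise do."""
    while True:
        item = requests.get()      # blocks until a message arrives
        if item is None:           # hypothetical shutdown message
            break
        replies.put(item * item)


requests = queue.Queue()   # mailbox: caller -> squarer
replies = queue.Queue()    # mailbox: squarer -> caller

worker = threading.Thread(target=squarer, args=(requests, replies))
worker.start()

results = []
for n in [1, 2, 3]:
    requests.put(n)                # "call" by sending a message
    results.append(replies.get())  # synchronize on the reply

requests.put(None)
worker.join()
```

Each "call" now costs two mailbox operations and a synchronization point, which is why communication and synchronization become much more frequent once subroutines turn into processes.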
The Cm* multi-microprocessor consists of interconnected computer modules, each an autonomous computing engine.¹ In the existing system each computer module is implemented by a DEC LSI-11, a standard LSI-11 bus, memory, and devices. All primary memory in the system is potentially accessible to each processor. Each computer module includes a local switch, the Slocal, which selectively routes processor memory references either to the local memory of the computer module or else onto the Map Bus. The Slocal likewise accepts references to its local memory that emanate from distant processors. Up to fourteen computer modules may be connected to a Map Bus so that they share the use of a single Kmap, a processor responsible for routing memory requests and data between Slocals. Together, the Slocals and the Kmap implement a distributed switch.
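The routing decision made by the Slocal can be sketched as follows. This is an illustrative model only: it assumes an address is a `(module_id, offset)` pair, which is a simplification introduced here, and the class and function names (`ComputerModule`, `Kmap`, `slocal_read`) are hypothetical. Only the limit of fourteen modules per Map Bus and the local-versus-Map-Bus split come from the text.

```python
MAX_MODULES_PER_MAP_BUS = 14  # "up to fourteen computer modules" per Map Bus


class ComputerModule:
    """One processor with its local memory (modelled as a list of words)."""

    def __init__(self, module_id, memory_words):
        self.module_id = module_id
        self.memory = [0] * memory_words


class Kmap:
    """Routes memory requests between the Slocals on one Map Bus."""

    def __init__(self):
        self.modules = {}

    def attach(self, module):
        if len(self.modules) >= MAX_MODULES_PER_MAP_BUS:
            raise ValueError("Map Bus is full")
        self.modules[module.module_id] = module

    def read(self, module_id, offset):
        # The target module's Slocal accepts the reference to its local memory.
        return self.modules[module_id].memory[offset]


def slocal_read(local, kmap, target_id, offset):
    """Return (value, route) for a read issued by the processor of `local`."""
    if target_id == local.module_id:
        return local.memory[offset], "local"
    return kmap.read(target_id, offset), "map bus"
```

A read whose target is the issuing module never leaves that module, while any other read is handed to the Kmap, so the Slocals and the Kmap together behave as one distributed switch.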
Task forces vary along dimensions not even found in multiprogramming systems. For example, the number of processes in a single task force may vary not with the number of functions to be performed, but with available resources. Where several processors are available, input data may be partitioned and processes replicated so that each process performs the same function, but on a restricted portion of data. In summary, the process in a task force is small: small because the function it executes is a small portion of the overall task, and small in the size of its addressing domain. A task force consists of potentially a large number of components, related in a complex structure. The structure and composition of a task force may vary to a considerable extent dynamically. The task force is the unit of responsibility for any functionality other than the most primitive. Hence, it is the unit of accountability and the unit for which major resource scheduling decisions are made.
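A minimal sketch of partitioning input data across replicated processes is given below. It uses a Python thread pool as a stand-in for the processors of a multiprocessor, and a partial sum as the replicated function; the names `partition` and `run_task_force` are hypothetical, and `available_processors` is assumed to be at least 1.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, parts):
    """Split data into `parts` contiguous slices of nearly equal size."""
    size, extra = divmod(len(data), parts)
    slices, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        slices.append(data[start:end])
        start = end
    return slices


def run_task_force(data, available_processors):
    """Sum `data` using one replicated worker per available processor."""
    # Every worker performs the same function (a partial sum),
    # but on its own restricted portion of the data.
    slices = partition(data, available_processors)
    with ThreadPoolExecutor(max_workers=available_processors) as pool:
        partial_sums = list(pool.map(sum, slices))
    return sum(partial_sums)
```

The number of workers is a parameter of the run rather than of the algorithm: `run_task_force(data, 1)` and `run_task_force(data, 4)` compute the same result with a different number of processes.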
The computer modules, Kmap, and Map Bus together comprise a cluster. Clusters are connected via Intercluster Buses running between the Kmaps. A Cm* configuration can have an arbitrary number of clusters, although clusters need not have direct Intercluster Bus connections to every other cluster in a configuration. The number of computer modules in a cluster and the number of clusters in a system may vary. Currently, our Cm* implementation consists of five clusters, 50 processors, and 3 megabytes of primary memory distributed relatively evenly across the clusters and across the computer modules. Synchronous Line Units are provided on several processors for connecting terminal devices or multiplexors. Each cluster is connected to a DEC KL10 processor with high speed "DA Links." For on-line storage of data, each cluster is provided with one or more moving-head disk controllers. An example
We have defined task forces quite generally. For example, a sequence of three processes connected via Unix pipes would constitute a task force [14]. However, such a task force is exceedingly simple: neither its structure, nor its composition, change dynamically; it does not require any but straightforward synchronization based on data passing through the pipes; it exhibits no communication paths or organizational structure for handling errors so that they can be related to the task force as a whole, rather than a single process; and it provides no basis for coordinating the output that appears on the user terminal. While an extremely useful program organization, such structures are not sufficient to
¹ The name Cm* stands for an arbitrary number of Cm's, or Computer Modules. The * is derived from the notation introduced by Kleene for regular expressions.
[Figure: Cm* cluster diagram. Recoverable labels: Intercluster Bus 1, Map Bus, Serial Lines.]