Encapsulation Constructs in Systems Programming Languages

WILLIAM F. APPELBE
University of California at San Diego
and
A. P. RAVN
Copenhagen University, Denmark

This paper investigates the desirable properties of programming language constructs that support encapsulation of environments and abstract data types. These properties are illustrated by using a simple multiuser file system as a model. The requirements for such a file system are outlined; then the model file system design is described by a hierarchy of encapsulated abstract data types and environments. The high-level language constructs necessary to directly implement the model file system design are identified. It is concluded that environment encapsulation and abstract data types must be supported by different constructs, and the desirable properties of such constructs are outlined. A superset of Ada that effectively supports both environments and abstract data types is introduced and used to implement the model file system. The encapsulation constructs of several modern systems programming languages are evaluated. Each of these languages is shown to be insufficient for a direct implementation of the model file system design.

Categories and Subject Descriptors: D.3.2 [Programming Languages]: Language Classification--Ada; D.3.3 [Programming Languages]: Language Constructs--abstract data types; modules; packages; D.4.7 [Operating Systems]: Organization and Design--hierarchical design

General Terms: Design, Languages

Additional Key Words and Phrases: Encapsulation, systems programming languages, file systems

Authors' addresses: W. F. Appelbe, Department of Electrical Engineering and Computer Sciences, University of California at San Diego, La Jolla, CA 92093; A. P. Ravn, DIKU, Copenhagen University, Sigurdsgade 41, Copenhagen, DK-2200 Denmark.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1984 ACM 0164-0925/84/0400-0129 $00.75
ACM Transactions on Programming Languages and Systems, Vol. 6, No. 2, April 1984, Pages 129-158.

1. INTRODUCTION

Traditionally, systems software has been written almost exclusively in assembler or other low-level languages. Recently, specialized systems programming languages (SPLs) have evolved to reduce the high costs of developing and maintaining systems software. A programming language is classified as an SPL when it meets the following criteria:

--An SPL must be a high-level language, that is, it must be machine independent and provide secure data and control abstractions. Thus assembler languages and machine-oriented high-level languages (MOHLs), such as C [6] and BLISS, are not classified as SPLs even though they are widely used for systems programming.
--A programming language must be primarily intended for implementing systems programs for it to be considered an SPL. Thus, general-purpose languages such as PL/I are not classified as SPLs. Also, programming languages such as Alphard [10], whose emphasis is upon verification and abstraction and which generally have not been fully implemented, are excluded.
--An SPL must be implementable on a range of computer environments and usable to generate efficient software systems.

The advantages of using an SPL for implementing system software include

--Reducing overall coding and maintenance effort, because an SPL permits a direct or high-level implementation of the system design.
--Increasing system reliability, because an SPL can provide compile-time and run-time checking for user and systems programs. Exclusive use of an SPL could eliminate much of the need for hardware and software protection mechanisms for systems software security.

Despite the considerable language design effort devoted to SPLs, comparatively little effort has been devoted to comparing the effectiveness of different SPLs. Such comparisons have generally been restricted to simple algorithms and applications (e.g., [15]), and to contrasting the efficiency of SPL programs, rather than the ability of SPLs to support the development of large, reliable software systems.

This article focuses upon the desirable features of SPLs for supporting "programming-in-the-large," that is, the language constructs necessary to support the decomposition of systems into a hierarchy of encapsulated modules and interfaces. These constructs are illustrated by the design and implementation of one major component of a multiuser operating system: the file system. The SPL implementation of such a file system must support

(i) management of dynamically allocated resources, for example, I/O buffers, logical and physical files, pipes, and directories;
(ii) establishing reliable interfaces, so that users can neither maliciously nor unintentionally defeat the resource allocation scheme;
(iii) sharing of resources at the user level;
(iv) a high-level representation of the hardware environment.

These requirements arise in many other components of multiuser operating systems, such as network interfaces and user-level multitasking.

The intent of this paper is not to design either a "better" SPL or a "better" file system. Instead, an existing general-purpose file system is taken as a model, and the difficulties of implementing such a file system directly with existing SPLs are analyzed. A superset of Ada¹ has been used as a basis for implementing the file system, since it is not possible to securely implement the file system directly in Ada.

UNIX² [5] has inspired the model file system, because it provides an adaptable multiuser file system with a simple, but effective, user interface. UNIX is implemented in C, a machine-oriented high-level language. UNIX obtains much of its flexibility at the expense of an insecure implementation. C does not provide type checking and encourages the use of insecure language constructs, such as pointers, which indirectly access arrays and procedures [8]. Thus, the C implementation of the UNIX file system design is difficult to maintain and verify.

Ideally it should be possible to implement an adaptable file system entirely in an SPL, achieving reliability and maintainability without compromising efficiency. The implementation should also directly reflect the design of the file system, so that the implementation can be easily maintained and verified.

A secure implementation of a file system using an SPL would eliminate some of the need for run-time security checks of user programs. Security could be statically guaranteed through the SPL type-checking mechanism. Run-time security checks would still be needed for fault tolerance and also in environments in which user programs could be written in insecure languages.

¹ Ada is a trademark of the Department of Defense.
² UNIX is a trademark of Bell Laboratories.

2. DESIGN OF A MODEL FILE SYSTEM

The specification of any software system must define the external interfaces of the system. The external interfaces of the model file system are

(i) the user interface, modeled on the UNIX I/O system;
(ii) the hardware interface.

The user interface is outlined in Section 2.2. Since techniques for implementing hardware interfaces are not of concern in this paper, it is assumed that the hardware interface is implemented by an I/O manager for each hardware device. The internal details of the implementation of each manager, such as its process structure, scheduling, and interrupt handling, should not affect the theme of file system design. It is assumed that each device I/O manager provides a uniform interface, consisting of the operations read, write, status, open, and close.

The model file system design is based on two distinct encapsulation constructs:

(i) environments, for example, logical I/O and physical I/O;
(ii) abstract data types, for example, files, queues, I/O buffers, and device interfaces.

The properties and rationale for these two constructs are discussed in the following section.


2.1 Requirements for Encapsulation Constructs

An environment is a static encapsulation of a collection of declarations: constants, types, objects, subprograms, and abstract data types. Every environment has an interface and an implementation. The interface, or specification, consists of the declarations exported by the environment. The implementation, or body, consists of the declarations that implement the environment specification. The SPL construct for encapsulating environment declarations is referred to as a package.

Abstract data types define a domain, often referred to as a "type," and a collection of operations that may be performed upon objects of that domain. For example, a file abstract data type would define the file operations, such as read and write, and the domain of file objects. Most high-level languages provide type constructors such as arrays and records for defining new domains, but such constructs do not encapsulate operations. An abstract data type may be regarded as a secure encapsulation mechanism for defining new domains and their operations. The SPL construct for abstract data type declarations is referred to as a class.

A class construct must be able to encapsulate all operations upon class objects, including assignment, copying, comparison, initialization, and termination (or finalization). For example, a secure file declaration must

(i) control the initial state of file objects;
(ii) reclaim system resources, such as I/O buffers, when a file object is deallocated owing to exiting the scope of the file declaration;
(iii) control the comparison of file objects, since such a comparison could either be meaningless or a breach of file security;
(iv) restrict the use of the assignment operator, since such an operation may either create insecure copies or deallocate the system resources associated with a file.

A class construct must also encapsulate all details of the class implementation, so that only the operations defined in the interface are visible.
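The four requirements listed above for a secure file declaration can be approximated in an existing language. The following is a minimal sketch in Python, used here only for illustration since the paper's own notation is an Ada extension; the name SecureFile, its methods, and the 512-byte buffer are all hypothetical. Python binds names by reference and cannot forbid aliasing, so requirement (iv) is only approximated by refusing to copy.

```python
import copy


class SecureFile:
    """Illustrative file abstraction; every name here is hypothetical."""

    def __init__(self, name=""):
        # (i) The class, not the caller, fixes the initial state.
        self._buffer = None
        self._is_open = False
        if name:
            self.open(name)

    def open(self, name):
        self._name = name
        self._buffer = bytearray(512)  # stands in for an I/O buffer
        self._is_open = True

    def close(self):
        # (ii) Return the resources held by the file.
        self._buffer = None
        self._is_open = False

    @property
    def is_open(self):
        return self._is_open

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Termination code: runs when the enclosing "with" scope is exited.
        if self._is_open:
            self.close()
        return False

    def __eq__(self, other):
        # (iii) Comparison is not an exported operation.
        raise TypeError("file objects cannot be compared")

    def __copy__(self):
        # (iv) Copying is not an exported operation.
        raise TypeError("file objects cannot be copied")

    def __deepcopy__(self, memo):
        raise TypeError("file objects cannot be copied")


with SecureFile("log") as f:
    assert f.is_open      # opened by the initialization code
assert not f.is_open      # closed by the termination code
```

In this sketch the protection is only a convention: a determined caller can still reach `_buffer`. A class construct of the kind discussed here would enforce the same restrictions at compile time.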
For example, a secure implementation of a file class must not allow a user to directly access the device associated with the file.

Like environments, abstract data types have an interface and an implementation. The abstract data type interface consists of the set of exported operations upon objects of that type. The body defines the state of an object and the implementation of the operations.

Encapsulation constructs for SPLs should distinctly separate the interface, which specifies the objects and operations exported by the construct, from the implementation. The advantages of separation of interface and implementation are

(i) ease of separate compilation;
(ii) the ability to provide more than one implementation;
(iii) the ability to declare interfaces whose implementations are mutually dependent.

The distinction between package and class constructs is illustrated by their application to the design of hierarchical operating systems. Typically, a hierarchical operating system consists of a series of levels, referred to as virtual machines. Each successive level defines a new interface for higher levels and is implemented using declarations imported from the interfaces of lower levels. Each level's interface is a unique environment or package consisting of a collection of objects, subprograms, and data types. Within each level, some data types, such as page tables, device interfaces, and files, must be encapsulated so that the operations that may be performed upon these data are restricted. Such protected types are implemented as classes. Table I summarizes the distinctions between packages and classes.

Table I. Properties of Classes and Packages

SPL applications
    Class:   Abstract data types, e.g., file, device interface, I/O buffer
    Package: Environments, e.g., physical I/O, logical I/O, root file directory

Number of instances
    Class:   Arbitrary
    Package: Only one

Creation of an instance
    Class:   class A is ...
             The class declaration introduces a new type, A. This type can be used to declare class objects, e.g., a : A, or as a type identifier in other declarations, e.g., type D is array (...) of A
    Package: package E is ...
             The package declaration defines a new environment, E. Environments are statically introduced, use E, and initialized prior to execution of any program that accesses the environment

Exports
    Class:   The class exports operations that are performed upon initialization and termination of class instances. Only explicitly declared operations on instances of an abstract data type may be exported
    Package: Any declarations, such as constants, variables, types, classes, routines, and packages may be exported

Although environments and abstract data types are two distinct encapsulation concepts, recent SPLs only provide a single construct for encapsulation. Ada, Modula-2, and Mesa provide a package construct, whereas CLU, Alphard, Concurrent Pascal, and Pascal Plus provide a class construct. However, it is difficult to implement classes effectively using a package construct, and conversely.

If an SPL includes a package construct, classes can be simulated by a package that exports a type, class_type, and a set of operations for class_type objects. Some SPL versions of package encapsulation, such as Ada, provide special mechanisms to protect the access to exported class_types. However, such protection mechanisms are syntactically cumbersome (Section 4.1) and usually insecure, since they provide no control over operations such as copying and termination. Package encapsulation also does not permit a class_type within a package to have several distinct implementations.

If an SPL includes only a class construct, packages can be simulated if a class can export declarations other than routines. This defeats the security of the class and introduces problems of type equivalence and aliasing. Given two instances a and b of the same class A, are the two exported types a.T and b.T associated with the instances equivalent?
If these types are equivalent, name equivalence is violated, since two distinct names denote the same type. Conversely, if types are equivalent only if the objects that export them are identical, then aliasing implies that type equivalence must be checked at run time. Type equivalence is further complicated by parameterized class declarations, types that are associated with classes in case variant records, and multiple implementations of a class. All these problems can be avoided if classes are secure and cannot export types, and if a separate package construct that can export types is provided by the SPL.

2.2 User Interface for the File System

This paper uses UNIX as a model for the file system requirements and user interface. The UNIX file system has the advantages of flexibility, simplicity, and efficiency, and the disadvantage of insecurity. This insecurity is both

(i) internal, since there is little type checking within the C implementation;
(ii) external, since knowledgeable users can force deadlocks and defeat run-time security checks.

Both internal and external insecurity can be largely overcome if systems and user programs are written in a strongly typed SPL that encourages information hiding.

Although UNIX has been used as a model, no attempt has been made to include all the facilities of the UNIX file system in the design. Only those facilities that are difficult to implement securely in many SPLs have been included. The choice of UNIX for a model file system is not intended to be an issue in this paper. Any other flexible, multiuser file system would be a suitable choice. The requirements for the user interface of such a file system are as follows:

(i) A logical file, referred to simply as a file, consists of a sequence of items, accessed either sequentially or randomly by means of operations such as read, write, and seek. When a file is declared, the user must specify the type of items. Before a file can be accessed, it must be explicitly opened and a file name specified.
A file name specifies a particular physical file, that is, a data area on some device. A file may be closed and reopened with a different file name. A file is implicitly closed when the scope of the file is exited.
(ii) Files are organized into a hierarchy using directories. Thus a file name specifies a directory path to be accessed. A user cannot directly access a directory, but can create and delete directories. The file system must provide secure sharing of files and directories among independent users.
(iii) Devices are visible to the user as intrinsic file names in the directory hierarchy, so that device and file operations are syntactically identical.
(iv) Several different implementations of the file interface may be provided. In particular, the model file system itself defines pipes, which are similar to UNIX pipes. A pipe acts as a FIFO queue: A file read operation removes an item from the head of the queue, and a file write operation inserts an item into the queue. Read operations are blocked while the queue is empty.
(v) Files can appear as components of other type declarations and can be passed as parameters to procedures, allowing file manipulation libraries to be developed.

In the model file system there is only a single type for file items (only files of strings are provided). If user-defined types for file items are introduced, then the SPL implementation of classes must support type parameters (such as generic constructs in Ada). The implementation of user-defined types for file items would also require a mechanism to overcome strong type checking, as physical files are untyped.

The organization of the interfaces for the model file system, using the package and class concepts introduced in Section 2.1, is illustrated in Figure 1.

[Fig. 1. Organization of the user interface for the model file system. The diagram, which includes the Logical IO environment, is not reproduced here. Its notation distinguishes package specifications, class specifications, package bodies, class bodies, and exported/imported declarations.]

The user interface does not provide operations such as searching a directory or changing file protection modes, although these operations are provided by the UNIX file system. These operations were omitted since the goal of the model file system was a simple interface that illustrates encapsulation concepts, rather than a complete interface that closely parallels UNIX.
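The pipe behavior required in item (iv) above (a FIFO queue whose read operation blocks while the queue is empty) can be sketched as follows. This is an illustrative Python fragment, not part of the model file system; the class name Pipe and its method names are assumptions, and a condition variable stands in for whatever synchronization the SPL would supply.

```python
import threading
from collections import deque


class Pipe:
    """Hypothetical pipe: a FIFO queue of string items with blocking reads."""

    def __init__(self):
        self._items = deque()
        self._not_empty = threading.Condition()

    def write(self, item):
        # A write inserts an item at the tail of the queue and wakes a reader.
        with self._not_empty:
            self._items.append(item)
            self._not_empty.notify()

    def read(self):
        # A read removes the item at the head; it blocks while the queue is empty.
        with self._not_empty:
            while not self._items:
                self._not_empty.wait()
            return self._items.popleft()


pipe = Pipe()
writer = threading.Thread(target=lambda: [pipe.write(s) for s in ("first", "second")])
writer.start()
assert pipe.read() == "first"   # items come out in the order they were written
assert pipe.read() == "second"
writer.join()
```

Because the reader and writer share only the read and write operations, the same calling code would work unchanged against any other implementation of the file interface.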

3. IMPLEMENTATION OF THE MODEL FILE SYSTEM

None of the present SPLs of which we are aware can effectively implement an entire model file system that satisfies all the above requirements, because none provide sufficiently powerful encapsulation constructs.³ However, several SPLs, such as Mesa and Ada, do provide extensive high-level support for implementing modular systems software. Ada was chosen as the base language because

(i) it is a "state-of-the-art" SPL that provides strong typing, separation of interfaces and their implementation, encapsulation of environments (Ada packages), and exception handling;
(ii) it has a comprehensive, though informal, reference manual [13].

³ Section 4 discusses the limitations of several current SPLs in detail.


The file system is designed and implemented in Ada, with an extension to support classes. The extension is necessary for security, since there is no mechanism for associating termination routines with type implementations in standard Ada (Section 4.1). Thus it is not possible to implicitly close a file when the scope of the file's declaration is exited and the file is terminated. Also, Ada provides no mechanism to enable an abstract data type, such as a file, to have more than one implementation.

Since we wish to avoid designing "yet another" programming language, we have deliberately tried to keep the extension to Ada as simple as possible. The extension eliminates the need for the cumbersome private and limited private type construct in Ada. The intention is to introduce a simple, secure, direct implementation of the file system and then to show how limitations of existing SPLs, such as Ada, Concurrent Pascal, and Pascal Plus, inhibit an implementation of the file system.

3.1 Language Constructs to Support the Model File System

The SPL implementation of the model file system is written in Ada, with an extension to support class declarations. The syntax of the extension⁴ to Ada, following the grammar and section numbers of the Ada standard [13], is

    3.1  basic_declaration ::= ... | class_specification

    3.9  proper_body ::= ... | class_body

    7.4  class_specification ::=
            class class_identifier [formal_part] is
               {subprogram_specification}
            end [class_identifier];

         class_body ::=
            class body class_name [formal_part] is
               declarative_part
            [begin
               sequence_of_statements     --Initialization
            [exception
               {exception_handler}]]
            [terminate
               sequence_of_statements     --Termination
            [exception
               {exception_handler}]]
            end [class_name];

         class_name ::= class_identifier ['body_identifier]

⁴ private and limited private types will be deleted; generic classes will be implemented but have been omitted above to simplify the syntax.

A class declaration introduces a new type, denoted by class_identifier. Only those operations defined in the class_specification may be applied to objects of that type. A class_body defines a representation for the corresponding class_specification. A class_body declarative part must include declarations for the bodies of each of the operations introduced by the class_specification. The class body may also include a sequence of statements that are executed upon initialization and termination of an instance of the class. For class objects, this occurs when the scope, or declarative region, containing the object declaration is entered or exited, respectively.

Class declarations may include formal parameters, similar to subprogram declarations in Ada. The actual parameters are bound when a class object is instantiated, following the same rules as for subprogram invocation. The scope of class parameters is limited to the initialization code. Thus the parameters are not visible in either subprogram bodies or the termination code. This restriction is necessary because the lifetime of a class object may exceed that of the actual parameters if a dynamically allocated class object is passed local variables as actual parameters. If this restriction were not introduced, complex run-time storage management for classes would be necessary.

A class body name may either be a class_identifier, indicating a unique or default implementation, or class_identifier'body_identifier, indicating an alternative implementation. An object declared as⁵

    class_object : class_identifier['body_identifier][actual_parameter_part];

will be bound to the corresponding body. The actual_parameter_part can only be omitted if the corresponding class declaration has no parameters, or if all parameters have default values. A body_identifier is never directly visible; rather it is an attribute of a class_identifier. As with Ada subprograms, class bodies must be declared in the same scope as the corresponding class specification, and each class specification must have at least one corresponding body. The scope of body_identifiers is the same as the scope of the corresponding class_identifier. Thus, body_identifiers are visible wherever the corresponding class_identifier is visible.

⁵ If a compound data type, such as a record, contains class objects, then any object declaration of that compound type must assign an initial value (class body and actual parameters) to each class component.

Two syntactic forms for operations upon abstract data types have been adopted by SPLs:

(i) Attribute notation, for example, f.open(...);
(ii) Procedural notation, for example, open(f,...).

For simplicity the latter notation is adopted,⁶ as it enables nonunary operations, such as comparing two objects, to be easily represented.

⁶ The choice between attribute and procedural notation is a syntactic issue outside the scope of this paper, which depends upon the detailed scope rules of Ada.

Within the body of a class operation the declarative part of the enclosing class body is visible, so that the implementation of class objects can be accessed as selected components.


For example,

    class body File (file_name : String := "") is
       --The default implementation of File, using Physical_IO
       is_open : Boolean;  --TRUE if the file is open

       procedure open (f : File; status : out Open_Status;
                       name : String; mode : File_Mode) is
       begin
          if f.is_open ...  --Check the current state of f⁷
       end open;

       --other Class Operations
    end File;

⁷ f.is_open is a selected component of f. If attribute notation had been adopted, f would be an implicit parameter of open, and its components would be directly visible, for example, if is_open ....

Within the initialization and termination statements of the class body, the declarative part of the enclosing class body is directly visible, and the class object is denoted by the class name:

    class body File (file_name : String := "") is
       --Class Operations
    begin
       --Set the initial state of the File
       if (file_name = "") then
          is_open := FALSE;
       else
          open (File, status, file_name, READ_WRITE);
       end if;
    terminate
       --Close the file if it is still open
       if is_open then
          close (File);
       end if;
    end File;

Formal parameters of subprograms that are class objects must be passed by reference. The intent of this restriction is to eliminate the problems associated with implicit copying of shared resources. For example, if the read operation were asynchronous and the file implementation declared an I/O buffer whose address was passed to the device driver, then transmitting a file object by copy-restore could create a copy of the file's I/O buffer and play havoc with buffer management. Ada permits an implementation to use either call by value-result or call by reference. Extending this policy to classes would require systems programmers to adopt strategies such as

--allocating all shared resources dynamically and only accessing them through access variables (Ada pointers); this may impose an undesirable overhead of dynamic storage allocation.
--placing all shared resources within a monitor task, and only accessing them through entry calls; this would probably impose a high intertask communication overhead.

Initialization and termination code for classes is not executed for formal class parameters since class parameters are passed by reference. Class instances that are components of dynamically allocated types (access types in Ada) are initialized when an allocator, new, is executed. Termination code must be executed when an implementation reclaims storage for a dynamically allocated object that contains a class.⁸

⁸ Ada does not provide an explicit mechanism for storage reclamation for dynamically allocated objects.

Neither assignment nor the predefined comparisons for equality and inequality are implicitly declared by classes. As a consequence of this, functions that return class types are not permitted in class specifications. For example, it is illegal to declare an operation

    function temp_file return File;  --Creates a temporary file

since the file object returned by a call to temp_file cannot be assigned or used as an actual parameter. Instead, the operation would need to be specified as follows:

    procedure temp_file (source : in out File);

Although assignment and the predefined relational operators for equality and inequality are not exported by a class specification, these operations are permitted within the class implementation. For example,

    procedure assign (source, target : in out File);

    procedure assign (source, target : in out File) is
    begin
       target := source;
    end assign;

Within the class implementation, assignment and equality testing of classes are always by reference. Thus, if x and y are two objects of class C, then x = y will be true if and only if x and y both reference the same object. After the assignment x := y, x will reference the same object as y, and x = y will be true. Assignment by copying and copy comparison can be implemented by assigning or comparing each component within the class. For example,

    procedure copy (source, target : in out File);

    procedure copy (source, target : in out File) is
    begin
       target.state_var := source.state_var;
       --Assign the entire state of the source file to the target file
    end copy;
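The difference between assignment by reference and assignment by copying, as described above, can be seen in a short Python sketch. Python is used only for illustration; FileState, copy_state, and same_state are hypothetical names, and the two fields stand in for whatever state a real file implementation would hold. Python's `is` operator plays the role of the by-reference equality test.

```python
class FileState:
    """Hypothetical stand-in for the internal state of a File object."""

    def __init__(self, name, position=0):
        self.name = name
        self.position = position


def copy_state(source, target):
    """Assignment by copying: transfer each component of the state."""
    target.name = source.name
    target.position = source.position


def same_state(a, b):
    """Copy comparison: compare each component of the state."""
    return a.name == b.name and a.position == b.position


x = FileState("alpha", position=3)
y = FileState("beta")

copy_state(x, y)
assert same_state(x, y)  # equal component by component
assert x is not y        # but still two distinct objects

y = x                    # assignment by reference
assert x is y            # reference equality now holds
```

After the reference assignment, an update made through either name is visible through the other, which is exactly the aliasing that the class implementation is expected to keep under control.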

Assignment by reference can be used to provide secure sharing of dynamically assigned resources [12]. For example, a central pool of I/O buffers is maintained by the model file system. Each open file has an associated I/O buffer, which is assigned to it by the I/O Buffer Manager. By using assignment by reference to allocate I/O buffers, file I/O buffers can be directly accessed by file operations, yet assigned dynamically by the I/O Buffer Manager.

Assignment by reference is efficient and permits resource sharing, but has the disadvantage of potential insecurity by creating an alias and dereferencing a class object. However, since the assignment operator may only be used in the class implementation, the side effects of class assignment can be fully controlled. It is the responsibility of the class implementation to ensure that, when the assignment operation is used, the class objects are properly terminated. For example, a file assignment operation could use a semaphore to keep count of the number of references to a file and only deallocate a file's resources when the reference count was zero.

class File is
   procedure assign (source, target : in out File);
end File;

class body File is
   --Count_Monitor is a task type which provides a semaphore, accessed by
   --entries to increment and decrement_and_test the semaphore.
   usage_count : Count_Monitor;

   procedure deallocate (f : File) is
      --close f and return all global resources such as I/O buffers held by the file
   end deallocate;

   procedure assign (source, target : in out File) is
   begin
      increment (source.usage_count); --the source file gains a reference
      if decrement_and_test (target.usage_count) then --the old target file loses one
         deallocate (target);
      end if;
      target := source; --target and source now reference the same File
   end assign;
begin
   --Initialization code, usage_count is set to 1


terminate
   if decrement_and_test (File.usage_count) then
      deallocate (File);
   end if;
end File;
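The reference-counting scheme can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: CountMonitor stands in for the Count_Monitor task type, FileObject and FileRef are hypothetical names that separate the shared file state from a variable referencing it, and deallocate merely sets a flag.

```python
import threading


class CountMonitor:
    """Hypothetical stand-in for Count_Monitor: a counter guarded by a lock."""

    def __init__(self, initial=1):
        self._count = initial
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self._count += 1

    def decrement_and_test(self):
        """Decrement the count and report whether it reached zero."""
        with self._lock:
            self._count -= 1
            return self._count == 0


class FileObject:
    """The shared file state; usage_count starts at 1 on initialization."""

    def __init__(self, name):
        self.name = name
        self.usage_count = CountMonitor(1)
        self.deallocated = False

    def deallocate(self):
        # Stand-in for closing the file and returning its I/O buffers.
        self.deallocated = True


class FileRef:
    """A variable of class File: it only holds a reference to a FileObject."""

    def __init__(self, name):
        self.obj = FileObject(name)

    def release(self):
        """Termination code: drop this reference, deallocate on the last one."""
        if self.obj.usage_count.decrement_and_test():
            self.obj.deallocate()


def assign(source, target):
    """Make target reference source's file, keeping both counts consistent."""
    source.obj.usage_count.increment()
    if target.obj.usage_count.decrement_and_test():
        target.obj.deallocate()
    target.obj = source.obj


a = FileRef("a")
b = FileRef("b")
old_b = b.obj
assign(a, b)                  # the file formerly referenced by b is deallocated
shared = a.obj
a.release()                   # one reference remains, so the file stays open
still_open = not shared.deallocated
b.release()                   # last reference dropped: resources are returned
```

Incrementing the source count before decrementing the target count keeps the sketch safe when both variables already reference the same file.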

In the case of non-unary-class operations, unless all class actual parameters have identical implementations, the operation is not well defined and a CONSTRAINT_ERROR exception will be raised. For example, suppose the binary operation assign was included in the File class specification, together with the unary-class operations read, write, etc. If two implementations of files were declared, such as File'Sequential and File'Random, then each File implementation would contain a distinct implementation for the assign operation. A call to assign with parameters of two distinct file implementations,

source_file : File'Sequential;
sink_file : File'Random;
begin
   assign (source_file, sink_file);

raises CONSTRAINT_ERROR, since the implementation of source_file is not visible within the implementation of assign associated with sink_file, and conversely. Thus, binary operations should not usually be exported by a class that has multiple implementations. In the above example, assign could be implemented outside the class specification using the unary-class interface operations read, write, open, etc.

If it is necessary to determine the implementation associated with a class object, then the class specification must include an explicit operation that distinguishes among implementations. For example, the File class specification could include the operation

function is_sequential (f : File) return boolean;

--Returns TRUE if f is a sequential file

The class construct introduced above is intended to provide a simple extension to Ada to permit the secure implementation of encapsulated abstract data types. Although the requirements for the class construct, introduced in Section 2.1, can be simply stated, it is not trivial to effectively integrate such a construct into a complex existing SPL such as Ada. In addition to the properties described above, the proposed class construct requires some changes to the Ada language standard for visibility rules, exception handling, and generic units. Nevertheless, without some extension to Ada to support encapsulated abstract data types, it is difficult to implement the model file system effectively. Also, the proposed class construct is simpler in many respects than the Ada private and limited private type constructs which it replaces.

A high-level SPL construct will only be effective if the construct does not impose an unacceptable overhead at compile or run time. The proposed class construct imposes a compile-time overhead comparable to that for Ada subprograms or packages. The run-time overhead is minimal since
--a class specification consists of a set of subprograms, and thus each class body can be represented as a vector of subprogram entries;
--each class object consists of a data area and a reference to a class body; the data area and class body reference can be allocated when the object declaration is elaborated;
--calls to class termination code can be generated by the compiler when the scope of a class object is exited or when dynamically allocated objects are reclaimed.

Thus class objects do not require a run-time support package for storage management or type checking, and the only overhead associated with multiple class bodies is indirect access and checking that all class operands of non-unary-class operations reference identical implementations.

3.2 Logical I/O Interface

The package specification for the user interface of the model file system described in Figure 1, in extended Ada, is

package Logical_IO is
   --Logical_IO defines the abstract data types file and stream
   --Logical_IO is used by both user programs and the user environment,
   --which defines standard_IO
   type File_Mode is (READ, WRITE, READ_WRITE);
   type Open_Status is (GRANTED,
                        BLOCKED,  --open failed: file was in use (write)
                        ILLEGAL); --open failed: security violation
   type File_Security is (STATIC,    --from prior close
                          TEMPORARY, --close will delete file
                          READ, READ_WRITE);
   END_OF_FILE : exception;
   USE_ERROR : exception;

   procedure create_directory (name : String; result : out Boolean);
   procedure delete_directory (name : String; result : out Boolean);

   class File (file_name : String := "") is --default null file_name
      procedure open (f : File; status : out Open_Status; name : String := ""; mode : File_Mode := READ);
      procedure read (f : File; item : out String);
      procedure write (f : File; item : String);
      procedure seek (f : File; index : Natural); --relative to Start_of_File
      procedure close (f : File; secure : File_Security := STATIC);
   end File;
end Logical_IO;


3.3 Logical I/O Implementation

The implementation of the Logical_IO package has four major components:
(i) the specification of the Physical_IO package;
(ii) the implementation of the Physical_IO package;
(iii) the specification and implementation of the file directories;
(iv) the implementations of the file class (I/O Files and Pipes), using (i) and (iii).

In a modular implementation the file class should be distinct from the management of physical storage resources. Thus a separate package, File_Directory, is introduced to
--manage the allocation of physical storage,
--implement the creation and deletion of directories,
--provide subprograms to search directories for the file implementation.

Within the File_Directory the status of a physical file is encapsulated by a file Descriptor class. The Descriptor class exports operations to map logical file addresses to physical addresses. The implementation of I/O Files uses the Directory package to open and close files. When a file is opened the Directory package is called to search for the physical file associated with the logical file name. The search operation returns a Descriptor, which is used by subsequent read and write operations. The implementation of the Directory package is responsible for enforcing policies such as locking a file that is opened for writing against being opened for read or write access by others.

Directories are maintained as files of Descriptor data. Hence the implementation of the Directory package uses the file interface to access directories. The implementation of files relies upon the file directory package interface, and the implementation of the file directory package relies upon the file interface. Such an interdependency of data types is only possible if the interface and implementation of the two data types are distinct. The organization of the Logical_IO implementation is given in Figure 2.
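The locking policy enforced by the directory package, in which a file opened for writing cannot be opened by others, might look roughly like the Python sketch below. The Open_Status values are taken from the Logical_IO specification; the LockTable class, its method names, the example path, and the rule that existing readers also block a new writer are assumptions made for illustration.

```python
from enum import Enum


class OpenStatus(Enum):
    GRANTED = "granted"
    BLOCKED = "blocked"      # open failed: file was in use (write)
    ILLEGAL = "illegal"      # open failed: security violation


class LockTable:
    """Hypothetical directory-side record of which files are open and how."""

    def __init__(self):
        self._readers = {}       # file name -> number of read-only opens
        self._writers = set()    # names of files currently open for writing

    def open(self, name, write):
        if name in self._writers:
            return OpenStatus.BLOCKED            # a writer excludes everyone
        if write and self._readers.get(name, 0) > 0:
            return OpenStatus.BLOCKED            # assumption: readers exclude a writer
        if write:
            self._writers.add(name)
        else:
            self._readers[name] = self._readers.get(name, 0) + 1
        return OpenStatus.GRANTED

    def close(self, name, write):
        if write:
            self._writers.discard(name)
        elif self._readers.get(name, 0) > 0:
            self._readers[name] -= 1


locks = LockTable()
first = locks.open("/usr/log", write=True)     # granted: nobody has the file open
second = locks.open("/usr/log", write=False)   # blocked: the file is open for writing
locks.close("/usr/log", write=True)
third = locks.open("/usr/log", write=False)    # granted again after the writer closes
```

Keeping this table inside the directory package means the file operations only see the returned status, which matches the separation of interface and implementation argued for above.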

3.4 Physical I/O Interface

The UNIX file system adopts the convention that each physical device provides the same interface9 to the higher levels of the file system. The same principle has been adopted in implementing the model file system. The device interface in extended Ada is a class that exports the device I/O operations open, close, read, and write upon physical files, together with status operations: access_mode (READ, WRITE, or READ_WRITE device), status (the status of the last I/O operation), buf_size (the size of the I/O buffer for block I/O devices). Not all devices can provide both the I/O operations read and write. Devices for which either of these I/O operations is not valid, such as write on a READ mode device, must still implement these operations as run-time errors within the corresponding class body. Each physical device is implemented by an instance of the device interface class. Thus every file will perform physical I/O operations using the device interface specified by the file descriptor associated with the file_name. For example, if a file F is opened by the operation open(F, "/DEV/TTY"), then all subsequent reads will use the TTY interface (given that the device interface in the directory entry for /DEV/TTY is the TTY).

9 Actually block and character I/O devices are two distinct categories of interfaces in UNIX. However, in the model file system we assume only one interface category.

Fig. 2. Organization of the Logical_IO implementation. [Diagram not recoverable; legible labels: LOGICAL_IO, DIRECTORY, PHYSICAL_IO, DEVICE_SET, and DEVICE : ARRAY (DEVICE_SET) OF DEVICE_INTERFACE.]

There are three approaches in extended Ada to implementing dynamic binding of a file to a particular device interface in the physical I/O package.

(i) Case Selection. Each device is a distinct package, with similar I/O subprograms declared in its specification:

package Disk_1 is
   procedure read (...) is ...
end Disk_1;

package Disk_2 is
   procedure read (...) is ...
end Disk_2;


package Card is
   procedure read (...) is ...
end Card;

and an enumeration type

type Device_Set is (DISK_1_ID, DISK_2_ID, ..., CARD_ID);

is used in the file implementation to select devices for physical I/O operations, for example,

case device_name is
   when DISK_1_ID => Disk_1.read (...);
   when DISK_2_ID => Disk_2.read (...);
   when CARD_ID => Card.read (...);
end case;

(ii) Dynamic Binding. Each device is a distinct instance of a Device_Interface class. The Device_Interface class has an implementation for each category of physical device. Devices are declared as instances of these distinct Device_Interface implementations in the physical I/O interface:

class Device_Interface is
   procedure read (...) is ...
end Device_Interface;

class body Device_Interface'Disk is ...
class body Device_Interface'Card is ...

type Device_Access is access Device_Interface;
disk_1 : Device_Access := new Device_Interface'Disk (...);
disk_2 : Device_Access := new Device_Interface'Disk (...);
cards : Device_Access := new Device_Interface'Card (...);

and the file implementation declares a variable of type Device_Access, which is assigned to access a particular Device_Interface when the file is opened.

device : Device_Access; --Assigned (e.g., to disk_1) when the file is opened

The device is then accessed by a class operation such as

read (device.all, ...); --device.all is the device interface

(iii) Indexing. As above, each device is a distinct instance of a Device_Interface class, and the Device_Interface class has an implementation for each category of physical device. An enumeration of physical devices

type Device_Set is (DISK_1_ID, DISK_2_ID, ..., CARD_ID);

is used to allocate an array of Device_Interfaces in physical I/O, which are initialized as distinct device implementations:

device : array (Device_Set) of Device_Interface :=
   (DISK_1_ID => Device_Interface'Disk (...),
    DISK_2_ID => Device_Interface'Disk (...),
    CARD_ID => Device_Interface'Card (...));

The file implementation uses an index

device_id : Device_Set; --Initialized when the file is opened

to access the device implementation for each physical I/O operation such as read:

read (device (device_id), ...);

The first approach, case selection, is clumsy and inefficient, and (worst) requires the file implementation case statements to be modified whenever the set of devices, Device_Set, is modified. Case selection is the only approach supported by standard Ada, since both the other approaches rely upon a device interface class. The second approach, dynamic binding, is efficient but uses dynamic allocation and is potentially insecure. Dynamic binding, using assignment by reference, is unnecessary since the device is not accessed anonymously; the file always knows the instance of the device to which it is "bound." The third approach, indexing, is both simple and secure, and is used in the physical I/O package below. It illustrates several important properties of classes:
--An array of Device_Interfaces may be used since classes can be components of other type and class declarations, and class components are initialized when objects containing them are allocated.
--The Device_Interfaces do not need to have a common implementation, since a class may have more than one implementation.
--The Device_Interfaces cannot be modified outside the Physical_IO package, since operations upon class instances such as assignment are not exported.
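The contrast between case selection and indexing can be sketched in Python. This is an illustration under assumptions rather than a translation of the extended Ada: DeviceSet mirrors Device_Set with only three members, the two interface classes and their read results are hypothetical, and a read-only mapping stands in for an array whose assignment is not exported.

```python
from enum import Enum
from types import MappingProxyType


class DeviceSet(Enum):
    DISK_1_ID = 1
    DISK_2_ID = 2
    CARD_ID = 3


class DiskInterface:
    """One implementation of the device interface (hypothetical behaviour)."""

    def __init__(self, unit):
        self.unit = unit

    def read(self, address):
        return f"disk {self.unit} block {address}"


class CardInterface:
    """A second implementation offering the same operation."""

    def read(self, address):
        return f"card {address}"


disk_1 = DiskInterface(1)
disk_2 = DiskInterface(2)
card = CardInterface()

# Read-only view: code outside this module cannot rebind an entry.
DEVICE = MappingProxyType({
    DeviceSet.DISK_1_ID: disk_1,
    DeviceSet.DISK_2_ID: disk_2,
    DeviceSet.CARD_ID: card,
})


def read_by_index(device_id, address):
    """Indexing: adding a device changes only DeviceSet and DEVICE."""
    return DEVICE[device_id].read(address)


def read_by_case(device_id, address):
    """Case selection: every such routine must be edited for a new device."""
    if device_id is DeviceSet.DISK_1_ID:
        return disk_1.read(address)
    if device_id is DeviceSet.DISK_2_ID:
        return disk_2.read(address)
    if device_id is DeviceSet.CARD_ID:
        return card.read(address)
    raise ValueError(device_id)


result = read_by_index(DeviceSet.DISK_2_ID, 40)
```

Both routines return the same values, but only read_by_case has to be edited when DeviceSet grows, which is the maintenance cost attributed to case selection above.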
An alternative SPL approach to providing multiple class implementations is to bind class instances to implementations at load time (e.g., the C/Mesa configuration language [9]) or at run time by a procedure call such as initialize_class (device (disk_1), "code_file_name").

The physical I/O specification in extended Ada is

with IO_Buffer_Manager; --IO_Buffers are parameters to device operations
use IO_Buffer_Manager;
package Physical_IO is
   --Physical_IO defines the device_interface class and the set of devices
   type physical_address is new Natural; --address of data on a device
   NULL_ADDRESS : constant physical_address := ...;
   --Device_set is an enumeration of all devices in the system configuration


   type Device_Set is (DISK_1_ID, DISK_2_ID, ..., CARD_ID);
   type Device_Mode is (READ, WRITE, READ_WRITE);
   type Device_Status is new Natural; --Device status code

   class Device_Interface is
      function access_mode (d : Device_Interface) return Device_Mode;
      function buf_size (d : Device_Interface) return Natural;
      procedure open (d : Device_Interface; write_access : Boolean := FALSE);
      procedure read (d : Device_Interface; from : physical_address; buff : IO_Buffer);
      procedure write (d : Device_Interface; to : physical_address; buff : IO_Buffer);
      function status (d : Device_Interface) return Device_Status;
      procedure close (d : Device_Interface);
   end Device_Interface;

   device : array (Device_Set) of Device_Interface :=
      --Initialize each of the Device Interfaces.
      --The bodies of the interfaces must be declared in the
      --same scope as the class declaration
      (Device_Interface'Disk (...),
       Device_Interface'Disk (...),
       Device_Interface'Card (...));

   invalid_operation : exception;
end Physical_IO;

3.5 Physical I/O Implementation

The body of the Physical_IO package includes the body of each different device category, together with any other components needed to implement low-level I/O. Since the detailed implementation of the I/O devices is not relevant to encapsulation, it has been omitted from the example.

package body Physical_IO is
   --defines the class bodies for each of the device types
   class body Device_Interface'Disk (...) is
      --Disk_Interrupt_Handler encapsulates the hardware interface
      task Disk_Interrupt_Handler is
         entry IO_Complete;
         entry start_IO (...);
         for IO_Complete use at ...;
      end Disk_Interrupt_Handler;
      task body Disk_Interrupt_Handler is ...
      end Disk_Interrupt_Handler;

      Disk_Status : ... --Disk_Status represents the Disk device status

      function access_mode (d : Device_Interface) return Device_Mode is
      begin
         return (READ_WRITE); --all physical disk files are READ_WRITE
      end access_mode;

      function buf_size (d : Device_Interface) return Natural is
      begin
         return (...); --Implementation dependent constant: disk block size
      end buf_size;

      procedure open (d : Device_Interface; write_access : Boolean := FALSE) is
         --Reset disk status
      end open;


      procedure read (d : Device_Interface; from : physical_address; buff : IO_Buffer) is
         --Call the disk interrupt handler entry
      end read;

      procedure write (d : Device_Interface; to : physical_address; buff : IO_Buffer) is
         --Call the disk interrupt handler entry
      end write;

      function status (d : Device_Interface) return Device_Status is
         --Return the current disk status
      end status;

      procedure close (d : Device_Interface) is
         --Reset disk status
      end close;
   begin
      --Initialize the disk device using the class parameters
   end Device_Interface'Disk;

   --Other Device Interfaces
   class body Device_Interface'Card (...) is
      --implementation of operations for Card
   end Device_Interface'Card;
end Physical_IO;

3.6 File Implementation

The model file system provides two distinct implementations of the file class:
(1) I/O Files, referred to simply as files;
(2) Pipes, which maintain FIFO queues of data items.

It is important to distinguish between the concepts of two distinct types of files and two distinct implementations of files. In the case of distinct types of files, the type of a file is statically determined and every file object declaration must specify which file type it uses. In the case of distinct implementations of files, the implementation of a file is dynamically determined and all file operations can be applied to any file implementation. Thus routines that access files can be independent of the files' implementation.

The I/O file implementation uses the Physical_IO package, the IO_Buffer_Manager package, together with directory management routines exported by the Directory package. The Pipe file implementation uses the Queue class, with operations enqueue and dequeue, which are assumed to be declared in a separate FIFO_Queue package. Since the queue operations introduce no new language concepts, their declaration has been omitted.
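The idea of one file class with two implementations can be sketched in Python with an abstract base class. This is an illustration under assumptions, not the extended Ada design: only read and write are modelled, IOFile keeps its items in a list instead of using Physical_IO, and reading an empty Pipe raises an error here although a real pipe might block.

```python
from abc import ABC, abstractmethod
from collections import deque


class File(ABC):
    """The operations every file implementation must provide (a subset)."""

    @abstractmethod
    def write(self, item): ...

    @abstractmethod
    def read(self): ...


class IOFile(File):
    """Stand-in for an I/O file: items are kept in order and read by position."""

    def __init__(self):
        self._items = []
        self._index = 0

    def write(self, item):
        self._items.append(item)

    def read(self):
        if self._index >= len(self._items):
            raise EOFError("END_OF_FILE")
        item = self._items[self._index]
        self._index += 1
        return item


class Pipe(File):
    """Pipe implementation: a FIFO queue whose items are consumed when read."""

    def __init__(self):
        self._queue = deque()

    def write(self, item):
        self._queue.append(item)          # enqueue

    def read(self):
        if not self._queue:
            raise EOFError("END_OF_FILE")
        return self._queue.popleft()      # dequeue


def copy_items(source, sink, count):
    """Written against File only, so it works for any implementation."""
    for _ in range(count):
        sink.write(source.read())


pipe = Pipe()
pipe.write("a")
pipe.write("b")
disk_file = IOFile()
copy_items(pipe, disk_file, 2)
first_item = disk_file.read()
```

The routine copy_items never names an implementation, which is the independence of file-accessing routines described above; it uses only the unary operations read and write.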


The I/O file implementation defines
(i) file status data, for example, is_open : Boolean; --TRUE if the file is open
(ii) exceptions and error handling routines;
(iii) the bodies of the file operations open, close, read, write, and seek;
(iv) the initialization and termination code for file instances.

The File_Directory package and the body of the Logical_IO package containing the implementation of files are as follows:

with Physical_IO; --Descriptor operations return Physical_Addresses and Devices
use Physical_IO;
package File_Directory is
   --File_Directory is called by the file operations open and close and
   --by the Logical_IO subprograms create and delete directory
   class Descriptor is
      --Descriptor encapsulates the directory data for a file.
      --Make_descriptor is called to initialize a descriptor
      --with data read from the directory file.
      --Get_device and get_address are called by the file
      --open, read, and write operations.
      procedure make_descriptor (d : Descriptor; descriptor_data : String);
      function get_device (d : Descriptor) return Device_Set;
      function get_address (d : Descriptor; l : Logical_Address) return Physical_Address;
   end Descriptor;

   procedure create_file (name : String; file_id : Descriptor);
   procedure delete_file (file_id : Descriptor);
   procedure find_file (name : String; file_id : Descriptor);
   function make_dir (name : String) return Boolean;
   function remove_dir (name : String) return Boolean;
end File_Directory;

with Logical_IO; --Each Directory is a File containing Descriptors
use Logical_IO;
package body File_Directory is
   --define the implementation of file directory entries
   class body Descriptor is
      --Descriptor operations and the root descriptor
   end Descriptor;

   ROOT_FILE_NAME : constant String := "/";
   root_directory_file : File (ROOT_FILE_NAME);
   --Accessed by directory subprograms

   procedure create_file (name : String; file_id : Descriptor) is
   end create_file;

   --Bodies of other directory subprograms, delete_file,
   --find_file, make_dir, remove_dir


begin
   --Opens the root_directory_file and initializes file tables, etc.
end File_Directory;

with Physical_IO; with FIFO_Queue; with IO_Buffer_Manager; with File_Directory;
package body Logical_IO is
   use File_Directory;

   procedure create_directory (name : String; result : out Boolean) is
      --Calls make_dir in the File_Directory package
   end create_directory;

   procedure delete_directory (name : String; result : out Boolean) is
      --Calls remove_dir in the File_Directory package
   end delete_directory;

   class body File (file_name : String := "") is
      --The default implementation of File, using Physical_IO
      --all Directory access is through the File_Directory package
      use Physical_IO; use IO_Buffer_Manager;
      --File status data
      is_open : Boolean; --TRUE if the file is open
      physical_status : Descriptor; --Physical state of the file
      device_id : Device_Set;

      procedure open (f : File; status : out Open_Status; name : String := ""; mode : File_Mode := READ) is
         --Calls find_file in the File_Directory package
         --and create_file if the file does not exist
      end open;

      --Bodies of other File operations read,
      --write, seek, close
   begin
      --set the initial state of the File
      if (file_name = "") then
         is_open := FALSE;
      else
         open (File, status, file_name, READ_WRITE);
      end if;
   terminate
      --Close the file if it is still open


      if is_open then
         close (File);
      end if;
   end File;

   class body File'Pipe (file_name : String := "") is
      --The implementation of Pipes, using FIFO Queues
      use FIFO_Queue;
      buffer : Queue; --Queue class is exported by FIFO_Queue

      procedure open (f : File; status : out Open_Status; name : String := ""; mode : File_Mode := READ) is
         --Calls initialize_queue in the FIFO_Queue package
         --The name parameter must be null, and the mode READ_WRITE
      end open;

      --Bodies of other File'Pipe operations read, write, seek, and close
      --Pipe initialization and termination
   end File'Pipe;
begin
   --Initialization for Logical_IO
end Logical_IO;

4. ENCAPSULATION CONSTRUCTS IN SYSTEMS PROGRAMMING LANGUAGES

Facilities for encapsulation, or modularization, have appeared in many languages. Encapsulation constructs can be either dynamic or static. Dynamic encapsulation constructs, such as classes in SIMULA [1] and clusters in CLU [7], are present at run time. Static encapsulation constructs in Ada [4], such as the package, do not need to be represented at run time. Both static encapsulation of environments (packages) and dynamic encapsulation of abstract data types (classes) must be provided in SPLs. Systems such as the model file system can be designed as a hierarchy of packages and classes. Such designs should be directly and effectively implementable by an SPL.

In the following sections the encapsulation constructs provided by several modern SPLs (Ada, Pascal Plus, Concurrent Pascal, Modula-2, Mesa, and CLU) are evaluated. The evaluations of encapsulation are based strictly on the ability of the SPL to implement the model file system specifications directly. Comparisons of language constructs other than encapsulation mechanisms, such as data types, processes, interprocess communication, and exception handling, have been omitted.

Several current SPLs, such as EUCLID, have been omitted from the evaluation below. A comprehensive survey of SPLs is not included because of the large number of specialized programming languages and dialects that have been developed for systems programming. Instead, we have chosen a representative survey of current SPLs, omitting those SPLs whose encapsulation constructs are covered by other SPLs that have been evaluated. Three major issues are raised by the comparison:
(1) How effectively can the language implement package and class encapsulation?
(2) How effectively does the language separate the interface and implementation of encapsulation constructs?
(3) What facilities does the language provide for dynamic binding and object sharing?

4.1 Ada

The principal Ada construct for encapsulation of declarations10 is the package. A class construct can be partially simulated by means of a limited private data type within a package. Thus the class declaration

class C is
   procedure op1;
   procedure opn;
end C;

class body C is
   procedure op1 (...) is
   procedure opn (...) is
begin
   --initialization code for C, c_init
terminate
   --termination code for C
end C;

with class instantiation and operations

c_object : C;
op1 (c_object, ...);

is simulated in Ada by

package C is
   type C_type is limited private;
   procedure op1 (c : C_type ...) is ...
   procedure opn (c : C_type ...) is ...
private
   function c_init ... --initialization code for C
   type C_type is
      record --only record types can have default initialization in Ada
         c_state : ... := c_init (...);
      end record;
end C;

package body C is
   procedure op1 (...) is
   procedure opn (...) is
end C;

and class instantiation and operations are implemented by

c_object : C.C_type;
C.op1 (c_object, ...);

10 Ada also provides task types and generic packages. Tasks can be regarded as a special category of class construct, in which the class initialization code is concurrently executed. Like classes, tasks can only be accessed using operations (entries) declared in the task specification. Generic packages provide a mechanism for parameterizing packages by types, subprograms, values, and objects. Generic packages cannot be used to simulate run-time classes, because instances of generic packages are always elaborated at compile time.

The Ada implementation is syntactically more cumbersome since the package and abstract data type(s) it implements are distinct. A major limitation is that Ada cannot specify operations to be performed on the c_object when it is terminated (i.e., its enclosing scope is exited). Either the user must be responsible for returning c_object's resources by means of a terminate operation or the compiler must automatically generate code to deallocate system resources. The former solution is insecure and the latter solution forces the Ada compiler to be system dependent. Ada provides mechanisms to control task termination. However, the need for control of termination extends to objects other than tasks.

A further limitation of Ada is the lack of semantic precision in defining parameter passing for objects. By using limited private types in Ada, it is not possible to directly ensure that insecure copies of objects are not created (Section 3.1). Ada also provides no mechanism to permit limited private types to have more than a single implementation. Although the language does not explicitly forbid providing more than one body for a given package, there is no mechanism within the language to choose a particular implementation.

Dynamic assignment might be implemented in Ada, without using dynamic allocation, by using unchecked conversion:

generic
   type Source is limited private;
   type Address is access Source;
function assign_access (s : in Source) return Address;

with UNCHECKED_CONVERSION;
function assign_access (s : in Source) return Address is
   function value is new UNCHECKED_CONVERSION (SYSTEM.Address, Address);
begin
   return value (s'Address); --Address is a predefined attribute of any variable
end assign_access;

Dynamic binding is provided by an instance of the generic library function assign_access:

with assign_access;
type Dynamic_File is access File;
function assign_file is new assign_access (File, Dynamic_File);
f : File;
d : Dynamic_File;
d := assign_file (f);

This approach is insecure and implementation dependent, and requires the user to distinguish files that are dynamically assigned by using an access type.

Some of the limitations of the Ada implementation of classes can be overcome by adopting conventions for declaring and using limited private types within packages [11]. However, such programming conventions require discipline and cooperation among programmers and systems designers for the conventions to be effective.

4.2 Pascal Plus

Pascal Plus [14] is an extension of Pascal designed for modular multiprogramming. It provides an envelope mechanism that can model both packages and classes. The syntax of an envelope definition is

envelope [module] envelope_name [parameter_list];
   [declaration_list]
   block

If the module option is used, then one instance of the envelope is defined and declared; that is, the envelope module models a package. If the module option is omitted, the envelope models a class, and instances of the envelope are declared as follows:

instance envelope_id : envelope_name

An envelope exports read (or execute) access to any identifier (or routine) whose name is preceded by "*". The envelope block contains both initialization and termination code for instances of the envelope, separated by the "statement" ***. Although envelopes are a simple, flexible mechanism, they have several disadvantages:
(i) The asterisk notation for exported identifiers does not distinguish the envelope interface from its implementation. Thus separate compilation is difficult, and each instance of an envelope has the same implementation.
(ii) If a strict read-only export rule is enforced, including indirect access using exported pointers, then it is not possible for a manager envelope to allocate resources that can be directly accessed by other envelopes.
(iii) Name equivalence of types is desirable in any SPL. Since an envelope can export (asterisked) type identifiers, different instances of an identical type can have different names.
(iv) Like Ada, Pascal Plus provides no means of dynamic binding without dynamic allocation.
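The envelope's pairing of initialization and termination code around a single point has a loose analogue in Python's generator-based context managers, sketched below. This is an analogy only, not Pascal Plus semantics: the name buffer_envelope, the bytearray resource, and the log list are assumptions, and the yield merely plays the role of the *** separator.

```python
from contextlib import contextmanager

log = []


@contextmanager
def buffer_envelope(size):
    """Hypothetical envelope: set-up before the yield, tear-down after it."""
    buffer = bytearray(size)          # initialization part
    log.append("initialized")
    try:
        yield buffer                  # the enclosing scope runs here
    finally:
        log.append("terminated")      # termination part, runs on scope exit


with buffer_envelope(16) as buf:
    buf[0] = 1
    log.append("in use")
```

The termination code runs when the with block is left, whether normally or through an exception, which is the control over scope exit that Section 4.1 finds missing in Ada.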


4.3 Concurrent Pascal

Concurrent Pascal [2] is an extension of a Pascal subset, Sequential Pascal, with a bold ambition: "The aim of Concurrent Pascal is to do for operating systems what Sequential Pascal has done for compilers: to reduce the programming effort by an order of magnitude" [3, p. xiii]. The encapsulation constructs provided by Concurrent Pascal are processes, classes, and monitors. A Concurrent Pascal class is a restricted implementation of the class concept, and a Concurrent Pascal monitor is a class in which each operation upon a class instance is a critical region. Concurrent Pascal classes have many restrictions:
--Class, process, and monitor type declarations cannot be nested. A Concurrent Pascal compilation consists of a sequence of such system type declarations.
--System type objects such as classes can only be declared as permanent variables, that is, as global variables within other system types.
--Classes cannot be declared as type components of other type declarations.
--Upon initialization, system types such as classes can only be passed constant parameters.

The effect of these restrictions is that all class, monitor, and process objects are statically allocated at load time and all shared data are encapsulated by monitors. Thus, there is no provision for termination code since all class instances are permanent. Concurrent Pascal has no package construct and does not separate the specification and representation of classes. Thus Concurrent Pascal is unsuited to separate compilation. Concurrent Pascal provides neither dynamic allocation (pointers in Pascal) nor dynamic binding for classes. Because of the lack of dynamic binding and globally accessible data, Concurrent Pascal forces inefficient implementations, such as copying data for class operations which write a buffer to a file.
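The notion of a monitor as a class whose every operation is a critical region can be sketched in Python with an explicit lock. This is an illustration, not Concurrent Pascal: the CounterMonitor class and the worker function are hypothetical, and Python requires the lock to be taken by hand where a monitor would supply it implicitly.

```python
import threading


class CounterMonitor:
    """Monitor-style class: every public operation holds the same lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        with self._lock:              # the whole operation is a critical region
            current = self._value
            self._value = current + 1

    def value(self):
        with self._lock:
            return self._value


def worker(monitor, times):
    for _ in range(times):
        monitor.increment()


monitor = CounterMonitor()
threads = [threading.Thread(target=worker, args=(monitor, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = monitor.value()
```

Because the shared counter can only be reached through the locked operations, no update is lost regardless of how the four threads interleave.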
Although Concurrent Pascal's restrictions enhance software reliability, Concurrent Pascal is too restrictive for an effective implementation of systems with dynamic resources such as files and buffers in a multiuser system [12].

4.4 Modula-2

Modula-2 [16] is designed as a systems programming language for minicomputers. Its primary ancestor is Pascal, and its principal encapsulation mechanism is the module construct. Modules are encapsulation mechanisms similar to Ada packages, with the following differences:
(i) Modules have no formal parameters (Ada provides generic parameters).
(ii) Representation and specification of modules are not separable. Instead module interfaces are specified by import/export lists in the module declaration.
Thus Modula-2 modules share the limitations of Ada packages.

4.5 Mesa

Mesa [9] is a systems programming language developed at the Xerox Palo Alto Research Center (Xerox PARC). The language is integrated with a linking loader,


C/Mesa, designed to support the development and maintenance of modular software written in Mesa. The distinctive features of Mesa are
(i) a sophisticated collection of language constructs to provide separate compilation of systems (composed of modules) and the information hiding between them;
(ii) support of procedures as proper data types and "pointer arithmetic";
(iii) powerful exception and error-handling facilities;
(iv) two mechanisms for supporting concurrency: (a) a high-level coroutine mechanism using message ports for transfer of control, (b) dynamically forked processes that synchronize using monitors.
Mesa provides a single encapsulation construct, the module. Mesa modules resemble packages rather than classes because
(i) modules are not proper data types since module variables are not provided;
(ii) module definitions cannot be nested;
(iii) a module may start an instance of another module as a coroutine, but it cannot otherwise control its lifetime.
A module can declare separate instances or interface records of another module. Dynamic binding (pointers to frames) is provided, for example:

    directory
        device_interface: from "device_interface"
            --defines the device operations open, etc.
    device: program
        imports
            disk_1_device: device_interface,    --bound by C/Mesa
            disk_2_device: device_interface,
        exports Physical_IO =
            --defines the types device_id,
            --device_access: type = pointer to frame [device_interface]
            --and link_file_to_device
    begin
        link_file_to_device: public procedure [device_name: device_id]
                returns [device_access] =
            begin
                return [select device_name from
                            disk_1 = disk_1_device;
                            disk_2 = disk_2_device;
                        endcase]
            end
    end
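For readers more familiar with a current language, the binding performed by link_file_to_device in the fragment above can be approximated in Python. This is a sketch only: the class DeviceInterface, its in-memory block store, and the write operation are assumptions made for illustration, and Python object references stand in for Mesa pointers to frames.

```python
class DeviceInterface:
    """Hypothetical stand-in for an instance of the device_interface module."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}          # in-memory substitute for a physical device

    def write(self, physical_file, index, item):
        self.blocks[(physical_file, index)] = item

    def read(self, physical_file, index):
        return self.blocks[(physical_file, index)]


# Two statically created instances, analogous to disk_1_device and disk_2_device.
DISK_1_DEVICE = DeviceInterface("disk_1")
DISK_2_DEVICE = DeviceInterface("disk_2")

_DEVICES = {"disk_1": DISK_1_DEVICE, "disk_2": DISK_2_DEVICE}


def link_file_to_device(device_name):
    """Return a reference to the device instance selected at run time."""
    return _DEVICES[device_name]
```

The caller holds only the returned reference, so the choice of device is made at run time while both implementations are created statically.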

Thus a file can be dynamically bound to a particular device interface module by calling link_file_to_device:

    access ← link_file_to_device[directory.device_name]


and a file read can be implemented by

    item ← access.read[physical_file, index]

Thus Mesa's sophisticated constructs for separate compilation, together with dynamic binding, allow one module to have several static implementations. Mesa provides private data types; hence Mesa can simulate classes by an approach similar to that outlined for Ada. Like Ada it cannot control class termination.

4.6 CLU

CLU [7] was designed to support the use of abstractions in program development. The principal construct for data encapsulation in CLU, the cluster, implements abstract data types. Clusters can be used as type components of other declarations and can be parameterized by types and simple constants, permitting a single definition of a group of related abstract data types. All assignment in CLU is by sharing or reference. Hence dynamic binding is the default in CLU. Although this is efficient and enables resource sharing to be directly implemented, it has two disadvantages:
(i) Assignment by reference creates an alias, which can encourage a style of programming that makes heavy use of side effects and hence is unreliable and difficult to maintain or verify.
(ii) Since all assignment is by reference, all CLU objects are dynamically allocated and, in principle, continue to exist forever. In practice, a CLU implementation requires run-time support for automatic storage reclamation. Such storage reclamation and dynamic allocation may pose an unacceptable overhead for applications such as simple process control systems.
CLU includes the cluster implementation within the cluster specification, and hence each cluster has a unique implementation. Also, cluster declarations cannot be nested and termination code for clusters cannot be specified since CLU objects are never explicitly deallocated.

5. EXTENDING THE MODEL FILE SYSTEM

The user interface of the model file system has several limitations that prevent it from being effective for a large, multiuser operating system. The major restrictions are (i) no hierarchical file directory, (ii) only a single file type, namely files of strings, (iii) no user-oriented file security. Each of these restrictions can be removed by extending the specifications and implementation of the model file system. The extensions can be specified and implemented without any need to restructure the model file system. Implementing the entire extended file system requires a systems programming language that supports
--type parameterization (or Ada generics) of classes, to support user-defined file and stream item types;


--unchecked conversion, to convert device buffer types to file item types;
--concurrent programming, including an interprocess communication mechanism that supports time-outs;
--specification of the hardware interface;
--exception handling.
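Two of these requirements, type parameterization of files and conversion between device buffers and file items, can be illustrated together. The sketch below is in Python rather than an SPL, and every name in it (TypedFile, PAIR, make_pair_file) is hypothetical. The struct module performs a checked conversion, so it only approximates the unchecked conversion the text calls for.

```python
import struct
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class TypedFile(Generic[T]):
    """Hypothetical file of items of type T, stored as fixed-size raw records."""

    def __init__(self, record_size: int,
                 encode: Callable[[T], bytes],
                 decode: Callable[[bytes], T]) -> None:
        self._record_size = record_size
        self._encode = encode
        self._decode = decode
        self._records: List[bytes] = []   # stands in for device buffers

    def write(self, item: T) -> None:
        raw = self._encode(item)          # item type -> buffer type
        if len(raw) != self._record_size:
            raise ValueError("item does not fit the record size")
        self._records.append(raw)

    def read(self, index: int) -> T:
        return self._decode(self._records[index])   # buffer type -> item type


# Record layout for (int, float) pairs: little-endian int32, then float64.
PAIR = struct.Struct("<id")


def make_pair_file() -> TypedFile:
    """Instantiate the generic file for (int, float) items."""
    return TypedFile(PAIR.size, lambda p: PAIR.pack(*p), PAIR.unpack)
```

A file of another item type is obtained by supplying a different record layout, without changing TypedFile itself.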

6. CONCLUSION

No existing high-level systems programming language provides sufficient encapsulation mechanisms to permit a direct, secure, effective implementation of the model file system. Although the model file system is an "artificial" application, it is based on requirements similar to many real software systems. Language constructs sufficient to implement the model file system directly have been outlined. Complex systems, such as the model file system, provide a challenge to language designers to provide better encapsulation mechanisms for developing secure, reliable, adaptable systems software.

REFERENCES

1. BIRTWISTLE, G. Simula Begin. Auerbach, Pennsauken, N.J., 1973.
2. BRINCH HANSEN, P. The programming language Concurrent Pascal. IEEE Trans. Softw. Eng. SE-1 (1975), 199-207.
3. BRINCH HANSEN, P. The Architecture of Concurrent Programs. Prentice-Hall, Englewood Cliffs, N.J., 1977.
4. ICHBIAH, J.D., HELIARD, J.C., ROUBINE, O., BARNES, J.G.P., KRIEG-BRUECKNER, B., AND WICHMANN, B.A. Rationale for the design of the Ada programming language. SIGPLAN Not. 14, 6, part B (June 1979).
5. KERNIGHAN, B.W., AND PLAUGER, P.J. The UNIX programming environment. Softw. Pract. Exper. 9 (1979), 1-15.
6. KERNIGHAN, B.W., AND RITCHIE, D.M. The C Programming Language. Software Series. Prentice-Hall, Englewood Cliffs, N.J., 1978.
7. LISKOV, B.H., ATKINSON, R., BLOOM, T., MOSS, E., SCHAFFERT, J.C., SCHEIFLER, R., AND SNYDER, A. CLU Reference Manual. Lecture Notes in Computer Science, vol. 114. Springer Verlag, New York, 1981.
8. MATETI, P. Pascal versus C: A Subjective Comparison. Lecture Notes in Computer Science, vol. 79. Springer Verlag, New York, 1980, pp. 37-70.
9. MITCHELL, J.G., MAYBURY, W., AND SWEET, R. Mesa language manual. Xerox Palo Alto Research Center, Palo Alto, Calif., 1978.
10. SHAW, M., Ed. Alphard: Form and Content. Springer Verlag, New York, 1981.
11. SHERMAN, M., HISGEN, A., AND ROSENBERG, J. A methodology for programming abstract data types in Ada. In Proceedings of the AdaTec Conference on Ada (Arlington, Va., Oct 6-8), 1982, pp. 66-75.
12. SILBERSCHATZ, A., KIEBURTZ, R.B., AND BERNSTEIN, J.A. Extending Concurrent Pascal to allow dynamic resource management. IEEE Trans. Softw. Eng. SE-3 (1977), 210-217.
13. UNITED STATES DEPARTMENT OF DEFENSE. Military standard Ada programming language. ANSI/MIL-STD 1815A. American National Standards Institute, 1983.
14. WELSH, J., AND BUSTARD, D.W. Pascal-Plus--Another language for modular multiprogramming. Softw. Pract. Exper. 9 (1979), 947-958.
15. WELSH, J., AND LISTER, A. A comparative study of task communication in Ada. Softw. Pract. Exper. 11 (1981), 257-290.
16. WIRTH, N. Programming in Modula-2. Springer Verlag, New York, 1983.

Received June 1981; revised August 1982 and August 1983; accepted August 1983

ACM Transactions on Programming Languages and Systems, Vol. 6, No. 2, April 1984.