System Architecture for Distributed Data Management
Richard Peebles* and Eric Manning
University of Waterloo, Ontario, Canada
*Now on leave at Digital Equipment Corp., Maynard, Mass.
Introduction
Successful implementation of most distributed processing systems hinges on solutions to the problems of data management, some of which arise directly from the nature of distributed architecture, while others carry over from centralized systems, acquiring new importance in their broadened environment. Numerous solutions have been proposed for the most important of these problems.
In a distributed computer system, multiple computers are logically and physically interconnected over "thin-wire" (low-bandwidth) channels and cooperate under decentralized system-wide control to execute application programs. Examples of thin-wire systems are Arpanet, the packet-switched network of the U.S. Defense Communications Agency, and Mininet, a transaction-oriented research network being developed at the University of Waterloo. These may be contrasted with high-bandwidth or "thick-wire" multiprocessor architectures, such as the Honeywell 6080 or the Pluribus IMP.
A practical consequence of thin-wire design is that processing control is in multiple centers. No one processor can coordinate the others; all must cooperate in harmony as a community of equals. The key issue is that interprocess communication is at least an order of magnitude slower when the communicating tasks are in separate computers than it is when they are executing in the same machine. Therefore, no single process can learn the global state of the entire system nor issue control commands quickly enough for efficient operation, so that multiple centers of control are implied.
This definition does not imply that the computers are geographically distributed. On the contrary, the machines might all be located in one room; they could be considered a loosely coupled multiprocessor rather than a network. The choice of machine location is a result of economic and political considerations; that is, the distributed system architecture is amenable to both single-site and widespread machine location.
The debate between centralization and decentralization has thus taken something of a new twist. The old arguments for centralized systems were based on economies of scale in computer architecture, simplification of computer center operation, and the need for an integrated data base. The arguments for economies of scale are highly questionable. Arguments for simplified computer system management are countered by the desire of corporate management to distribute computing power, together with the fact that distributed systems can be operated in one room, and are therefore as easily manageable as centralized systems.
The elegance of the distributed system architecture is that these arguments are independent of the other issues of system design. If you plan for thin-wire communications from the outset, the machines can be located wherever considerations of economy, availability, and management combine to dictate.
However, the advocates of centralized systems still hold one trump card: the need for an integrated data base. This need is stronger now than ever. Nevertheless, an integrated data base doesn't necessarily require a single machine or a thick-wire multiprocessor architecture. On the other hand, imposing an integrated data base on a distributed processor architecture presents several problems.
COMPUTER
Architecture approaches
Distributed data management systems possess a kind of architecture that is analogous to the more generalized architecture of whole computer systems. Various approaches have been taken to the study of distributed data management architecture;[1] we prefer a simple classification (Figure 1) that divides systems into integrated and federated classes.
In the integrated architecture, each component data base management software package "knows about" the existence of all the others. This approach is possible only if the entire system is designed and built from the ground up.
If, on the other hand, existing data base systems must be used, the result is one of two forms of a "federated" architecture. In the simplest case, each of the software data base managers supports the same model and command language. Such a system is said to be homogeneous. However, an integrator must be superimposed, containing the data and procedures required to make the collection of data base managers behave as a single entity. For example, a request for a data item that is not found locally may be routed to one or more remote data base managers by data search integration routines. Problems of integrator design are discussed later.
The third class of architectures is a heterogeneous federation, in which totally disparate data base systems are linked. This type of architecture is clearly the most difficult to support. Both integrators and translators are necessary in heterogeneous federations. When the data base managers support different data models (for example, owner-coupled sets at one site and relations at another), both commands and data must be translated in order to pass between them. If application programs are to use the local data model, then the figure is correct as shown. If they are to use a common data model, then the translator resides between the integrator and the data base manager.
An integrated architecture does not obviate the need for, nor the desirability of, specialization at the system hosts. Specialization of data base content and processing is certain to occur. As more complex data models individually evolve from the common base, the need for translation is reintroduced. However, the common base remains as the vehicle for this translation; the translation problem is, therefore, much simpler than the interface problem in a heterogeneous federation.
Naturally, life is not quite so simple. These three system architectures represent points on a spectrum rather than the only possible alternatives. Within this spectrum, the problems of distributed data management are of two types: fundamental problems, common to all of the architectures; and incidental problems, a result of trying to integrate systems that were built to stand alone. The latter occur in all but the most purely integrated architectures.
January 1978
Fundamental problems
Any data management architecture must address five principal issues: how to provide an integrated data base, where to store data in the system, how to locate data, how to control concurrent access, and how to provide security and integrity. The distributed environment introduces new complications to each of these problems. They are intimately entwined and vary in complexity with network size, data base size, availability requirements, and response time needs. Some design constraints, such as high availability, may be easier to meet in a distributed system, while others, such as response time, may become harder to satisfy. In this short article we can give only a brief description of the nature of these problems and point to some of the work that has been done toward solving them.
Figure 1. Architectures for distributed data management systems classified as (a) integrated, or loosely coupled multiprocessors, (b) homogeneous federation, or (c) heterogeneous federation. [Diagram not reproduced; its legible labels are data base managers (1) through (n), a translator, and a host-host communications network.]
If the distributed system is to be more than a communication channel plus a set of independent systems, then the data base must span more than one processor site. The purpose of the integrator components of Figures 1(b) and 1(c) is to achieve this span. All of the remaining problems discussed in this section must be solved in order to build such a distributed data base, but the first requirement is that the notion of distributed data be expressed in some way.
Data distribution is desirable at both the information structure level and the storage structure level of data modeling. Distribution at the information structure level means that data location is an explicit part of the data model seen by application programs. Suppose, for example, a customer file is partitioned by city of residence. Then an application program can ask for data from the customer file for a particular city, say Omaha.
Distribution at the storage structure level implies the opposite; the location of data is hidden from application programs. Rather, data is allocated to processor sites by storage management algorithms. This requires that mechanisms be built into (or on top of) the data base managers to locate data objects and to allow for their migration.
Information structure distribution should not force storage structure distribution. That is, the Omaha customer file might be moved, temporarily or permanently, to a processor located in Hartford, without affecting the way application programs use the file.
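The separation between the two levels can be illustrated with a small sketch. It assumes a hypothetical `Catalog` class (the class, the site names, and the sample records are inventions for illustration, not part of any system described in this article) that maps a logical partition such as the Omaha customer file to the site that currently stores it. The application names only the city; migrating the partition to Hartford changes the catalog entry but not the application's request.

```python
class Catalog:
    """Hypothetical storage-level catalog: logical partition -> storing site."""

    def __init__(self):
        self._site_of = {}   # partition name -> site name
        self._data = {}      # site name -> {partition name -> records}

    def store(self, partition, site, records):
        self._site_of[partition] = site
        self._data.setdefault(site, {})[partition] = list(records)

    def migrate(self, partition, new_site):
        """Move a partition to another site; callers are unaffected."""
        old_site = self._site_of[partition]
        records = self._data[old_site].pop(partition)
        self._data.setdefault(new_site, {})[partition] = records
        self._site_of[partition] = new_site

    def site_of(self, partition):
        return self._site_of[partition]

    def read(self, partition):
        site = self._site_of[partition]
        return self._data[site][partition]


def customers_in(catalog, city):
    """Application-level request: names the city, never the site."""
    return catalog.read(("customer", city))


catalog = Catalog()
catalog.store(("customer", "Omaha"), "omaha-host", ["Adams", "Baker"])
before = customers_in(catalog, "Omaha")
catalog.migrate(("customer", "Omaha"), "hartford-host")
after = customers_in(catalog, "Omaha")   # same call, same answer
```

The city appears in the application's request (information structure distribution), while the host name appears only inside the catalog (storage structure distribution).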
Where to store data
In a distributed system there is a clear benefit in storing data at the processor where it is most frequently used, since thin-wire communication implies that access to remote data is much slower than access to local data. If the frequency of access is known from all processors for each file or other unit of retrieval, then an optimal allocation of data objects to processors can be defined as one that minimizes the average access time.
But this model is rather naive; several additional factors should be considered as well. For example, you should distinguish between update and retrieval access, since the former requires at least one extra transfer over the network to store the new data. Storage and transmission costs can be considered; and you may be free to adjust communication channel capacity as well as storage location. Storing more than one copy of a data object tends to reduce read-only access times, but also tends to increase the time required for update, since all the copies must be updated consistently. Access patterns may change over time, so that today's optimal allocation may not be optimal tomorrow.
The data allocation problem has been shown to be "polynomial complete,"[2] i.e., execution times of the best known algorithms grow exponentially with the size of the problem. Today's practical applications involve more data than can be optimally allocated by any known algorithm; therefore heuristic techniques are necessary. These are techniques that are known to work well, but have not been proved mathematically to find the optimum solution.
A number of researchers listed in the bibliography have applied mathematical programming optimization models. Certain recent contributions represent the most realistic sets of assumptions. One of these introduces the notion of imperfect knowledge of access statistics, noting that these statistics may change with time.[3] These researchers divide time into periods during which they take the access patterns to be static, and find an optimal static assignment of data to processors. If the access patterns for the next period are known, another assignment can be computed. But if these assignments are not identical, then the method incurs the cost of moving the data into the new configuration. Thus their aggregate model minimizes both access and reconfiguration costs.
Another study developed a very comprehensive model that distinguishes query and update traffic.[4] Updates require more message transmission between the processing host and the storing host than queries do. The problem is how to allocate copies of data objects to processors and also to allocate communication bandwidths, to minimize the combined storage and transmission costs. The technique imposes the constraint that average access time is to be bounded by a parameter supplied by the designer.
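The trade-off between read and update traffic can be made concrete with a toy cost model. The sketch below is not any of the cited models; it assumes, purely for illustration, that a read costs one remote transfer unless the requesting site holds a copy, that an update must be sent to every copy held elsewhere, and that all transfers cost the same. The function names and the traffic figures are hypothetical. The exhaustive search tries every non-empty set of copy sites, so its work doubles with each added site, which is why it is usable only for tiny cases.

```python
from itertools import combinations


def access_cost(copies, reads, updates, remote_cost=1.0):
    """Cost of one object under a given set of copy sites.

    reads[s] and updates[s] are accesses per period originating at site s.
    A read is free if s holds a copy, else it costs one remote transfer.
    An update must be sent to every copy not held at s itself.
    """
    cost = 0.0
    for site, rate in reads.items():
        if site not in copies:
            cost += rate * remote_cost
    for site, rate in updates.items():
        remote_copies = len(copies - {site})
        cost += rate * remote_copies * remote_cost
    return cost


def best_placement(sites, reads, updates):
    """Exhaustive search over all non-empty copy sets (tiny cases only)."""
    best = None
    for k in range(1, len(sites) + 1):
        for combo in combinations(sorted(sites), k):
            cost = access_cost(frozenset(combo), reads, updates)
            if best is None or cost < best[0]:
                best = (cost, frozenset(combo))
    return best


sites = {"A", "B", "C"}
# Read-dominated traffic favors a copy at every site.
read_heavy = best_placement(
    sites, {"A": 50, "B": 40, "C": 30}, {"A": 1, "B": 1, "C": 1})
# Update-dominated traffic favors a single copy.
update_heavy = best_placement(
    sites, {"A": 5, "B": 4, "C": 3}, {"A": 20, "B": 20, "C": 20})
```

Under these assumptions the read-heavy workload is cheapest with three copies and the update-heavy workload with one, which mirrors the qualitative argument in the text.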
Then the average message delay is expressed in terms of the network traffic (using Kleinrock's formula), which in turn depends upon the data objects and capacity assignments. A further constraint is that availability of object j must exceed a parameter A(j). Again heuristic methods must be applied if problems of even moderate size are to be tackled. The investigators programmed an absolute optimizer and a heuristic optimizer. For five processors and 20 files the absolute optimizer was still computing after 20 minutes. The heuristic program found a better solution in 5 minutes.
Clearly, more research is needed in this area. A problem with five processors and 20 files is hardly a very large one in the commercial world, yet this study showed that it sorely taxes current solution techniques that seek an absolute optimum. But researchers must be warned by the polynomial complete character of the data allocation problem; they should seek improved heuristics, because it seems unlikely that absolute optimum-seeking algorithms will be successful.
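For reference, the delay expression usually meant by "Kleinrock's formula" for a message-switched network can be written as below. The article does not define any symbols, so the notation here is an assumption introduced for illustration: $M$ channels, $\lambda_i$ the message rate on channel $i$, $\gamma$ the total rate at which messages enter the network, $1/\mu$ the mean message length, $C_i$ the capacity of channel $i$, and $T_{\max}$ the designer-supplied bound on average delay.

```latex
T \;=\; \sum_{i=1}^{M} \frac{\lambda_i}{\gamma}\cdot\frac{1}{\mu C_i - \lambda_i},
\qquad \text{subject to } T \le T_{\max},\quad \lambda_i < \mu C_i .
```

Each term is the mean delay on one channel weighted by the share of traffic using it. Because the $\lambda_i$ depend on where copies are stored and the $C_i$ are themselves design variables, placement and capacity assignment have to be optimized together, as the text describes.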
Locating data
When data is needed, how is it found in the distributed system? Presumably a request for a data object can originate at any processor; the data base manager at that processor must have some algorithm for finding where it is stored. Several alternative strategies are available for locating a data object, assuming that it has an identifier unique in the network. The choice of strategy will depend on the number of objects from which one is to be found, the fraction of requested objects that are stored at the requesting site, and the dynamics of updating.
The simplest procedure is to encode location information in the identifier. But then moving an object becomes extremely onerous, because every reference to the object must be found and updated, and one can never be sure all references have been corrected. So this solution is almost certainly not acceptable.
Another simple approach is to first look for the object locally (at the site where the request is made) and, if it is not found, to broadcast a request for it to all other sites. This procedure creates an enormous quantity of communication traffic; in a network of N processors, it invokes N-2 data base managers that need not have been involved. Only if broadcasts are very rare will this be tolerable.
A third alternative is to provide each data base manager with a full index that identifies the location of every data object in the network. However, each manager must keep its index somewhere locally; thus the problem is one of storage space, especially when the units of retrieval are small and numerous, such as short files or records within files. Furthermore, every time an object is added, deleted, or moved, every index must be updated.
There are intermediate solutions. For example, full indexes could be kept at only a few service locations. Or if broadcast search is to be used, it may be possible to define "neighbors" for each processor, X, as those processors most likely to hold data requested at X.
An analysis of this problem,[5] comparing costs and access delays for a number of alternative index structures, shows that for update rates of less than 10% of the total access rate, distributed full indexes are better than a centralized full index. On the other hand, the choice reverses for higher update rates. The local index with broadcast search procedure is almost never preferred.
Concurrency control
To maximize the concurrent use of system resources by multiple users, shared access is allowed to the system resources. A data object is "locked" by a user only when he must be assured that it is not in some transient state. As soon as locking of resources is permitted, the possibility of deadlock arises, when two or more users each are trying to reach an object locked by the other. Either prevention or detection and recovery is necessary.
Deadlock control in a single system is well understood,[6] but in a distributed data management system the locking problem is made more difficult by the existence of multiple centers of control. Since in a thin-wire system no one processor is allowed to control all the others, concurrency must be controlled through cooperating algorithms executed at each host. The problem is to ensure that the distributed data base remains consistent despite attempts at concurrent update from different processors. Objects at more than one host, such as duplicate data or structural information, must be concurrently lockable.
There have been many proposed algorithms for maintaining the consistency of multiple copies of data. One algorithm assumes that accesses to an object are either queries of the contents or updates that append new data or overwrite old data.[7] A "time stamp" is recorded for both queries and updates, and the most recent value is used as a stored item's true value. The algorithm achieves "eventual consistency," although at any given time there is no guarantee of consistency. More recently, this algorithm has been expanded to work with general updates, which include modification of old data as well as augmentation and replacement.[8] Two other proposals impose a linear order on resources that are to be locked, and permit lockings only in the sequence defined by this ordering.[9,10] This must be done if deadlock prevention is desired and a lock cannot be preempted.
Another strategy for deadlock prevention with multiple copy objects is to perform multi-step locking. This strategy adds an in-preparation state between the two extremes of locked and unlocked, or available.[11] Locking all copies of the object requires the requesting processor to first issue a PREPARE request, and the data base managers to acknowledge the request. Only when the PREPARE acknowledgments have been received can the requester issue a lock command to each. The sequence of events that can occur is complex.
Another investigation compares centralized and distributed detection schemes.[12] In a centralized scheme, one processor has the responsibility of monitoring potential lockups, departing from true system-wide control, as defined at the beginning of this article. In a distributed procedure, each data base manager periodically broadcasts the local current status to each of the other managers. From this information, all the managers then construct an approximation of the global picture. Both of these schemes involve substantial overhead and are probably not acceptable if the locking granularity (the size of the unit of retrieval) is small.
Further research on both detection and prevention schemes is needed. In particular, far too little is known about the probability of interference and deadlock in concurrent data base accesses. For transaction processing systems there is strong reason to believe that interference is rare, and that elaborate avoidance algorithms would not be economical. Research on this issue is under way.
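The linear-ordering rule for deadlock prevention mentioned in this section can be sketched briefly. The example is a single-machine stand-in using threads, not a reproduction of either cited proposal; the `OrderedLockManager` class, the resource names, and the choice of alphabetical order as the global order are all assumptions made for illustration. In a distributed setting the locks would live at different hosts, but the rule is the same: every requester acquires its locks in the one agreed order, so no cycle of waiting can form.

```python
import threading


class OrderedLockManager:
    """Hypothetical lock table that grants locks only in a fixed global order."""

    def __init__(self, resource_names):
        self._locks = {name: threading.Lock() for name in resource_names}

    def acquire_all(self, wanted):
        """Lock the wanted resources in sorted (global) order; return that order."""
        order = sorted(set(wanted))
        for name in order:
            self._locks[name].acquire()
        return order

    def release_all(self, held):
        for name in reversed(held):
            self._locks[name].release()


def transaction(manager, log, tag, first, second):
    """A transaction that names its two resources in an arbitrary order."""
    held = manager.acquire_all([first, second])
    try:
        log.append((tag, tuple(held)))
    finally:
        manager.release_all(held)


manager = OrderedLockManager(["acct-1", "acct-2"])
log = []
# The two transactions name the same resources in opposite orders, which
# could deadlock if each locked in the order it happened to name them.
threads = [
    threading.Thread(target=transaction,
                     args=(manager, log, "T1", "acct-1", "acct-2")),
    threading.Thread(target=transaction,
                     args=(manager, log, "T2", "acct-2", "acct-1")),
]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)
```

Both transactions finish because each ends up requesting `acct-1` before `acct-2`, whatever order it originally named them in.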
Security and integrity
Data base integrity can be compromised by inadequate concurrency control, erroneous software, security breaches, or by system failure ("crashes"). Concurrency has already been discussed; the correctness of software is beyond the scope of this article.
The security problem can be rendered more or less difficult in a distributed system than in a centralized one, depending upon details of the system architecture and structure of the application. Problems that arise include securing data during transmission and assuring that any site to which data is sent enforces the same security policy as the site that normally holds it. The literature is almost totally devoid of solutions to problems of the latter type.
In a distributed system that is subject to component failure (a likely proposition), correct operation is hard to guarantee. Suppose, for example, that a transaction updates three records, each stored at a separate host. None of the three updates is in effect until all have been completed and acknowledged. Suppose the source of the transaction, having received all three acknowledgments, sends out the message, "OK to put into effect," to the three data base managers, but one of those three goes down before it has put its update into effect. This is closely analogous to the problems of data transmission protocols, where correctness in a certain sense cannot be absolutely guaranteed.[13]
A recent investigation attempts to resolve the problem by writing to "stable storage" an "intentions list" of actions necessary to complete the updating parts of a transaction.[14] The stable storage ensures that write operations either are executed completely or make no change to the stored data; this is true even if a crash occurs in the middle of a write (the investigators suggest an architecture). An intentions list must have an "idempotency property," which guarantees that repeated execution of any subsequence of the intentions list has precisely the same effect as executing it once, provided that those sequences cover the whole list. Thus, if operations are restarted after a crash, it is acceptable to repeat some previously executed writes. Simple write operations have this property. With these tools, single- and multi-machine algorithms can be defined that guarantee the eventual execution of any properly defined transaction. (Users must define the start and end of a transaction. If the end is not signaled before a time-out occurs, the transaction is aborted.)
The fundamental problems of distributed data management have not yet been given adequate attention, but a great deal of research is in progress. Security and integrity control are vitally important if distributed systems are to operate correctly. However, most of the work cited in each of the problem areas we have discussed consists of designs on paper only; implementation and efficiency studies are needed.
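The idempotency property of an intentions list, described above, can be seen in a few lines. This is a minimal sketch and not the cited investigators' design: the `StableStore` class is a stand-in that simply assumes each write is all-or-nothing, the `crash_after` parameter is a hypothetical device for simulating a crash, and the record names and values are made up.

```python
class StableStore:
    """Stand-in for stable storage: each write is assumed all-or-nothing."""

    def __init__(self):
        self.records = {}
        self.intentions = []   # (key, value) writes, logged before applying


def apply_intentions(store, crash_after=None):
    """Carry out the logged writes in order.

    crash_after simulates a crash once that many writes have completed.
    Returns True if the whole list was applied.
    """
    for done, (key, value) in enumerate(store.intentions):
        if crash_after is not None and done == crash_after:
            return False            # crash: the list survives on stable storage
        store.records[key] = value  # a simple overwrite is idempotent
    store.intentions = []           # finished; the list can be discarded
    return True


store = StableStore()
store.records = {"r1": 10, "r2": 20, "r3": 30}
store.intentions = [("r1", 11), ("r2", 21), ("r3", 31)]

finished_first_try = apply_intentions(store, crash_after=2)  # two writes, then crash
state_after_crash = dict(store.records)
finished_on_restart = apply_intentions(store)                # redo from the start
```

On restart the first two writes are repeated harmlessly, because overwriting a record with the same value twice leaves it in the same state as writing it once. An operation such as "add 1 to r1" would not have this property and could not be placed on the list in that form.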
Incidental problems
Unfortunately, none of the generalized data base management packages that are currently available meets the requirements for distributed data management. Therefore, either vendors or users will have to devise "cut and paste" techniques to support distributed systems. For example, where does one express data distribution in a model that adheres to the Codasyl Data Base Task Group (DBTG) recommendation? Few users are prepared to modify their model software to handle such definitions. This means that system integrators must be built on top of the current software until such time as the vendors provide modified software.
Writing such integrators is not trivial. If, for example, a customer file is to be spread over several processors, the procedures that correlate customer identifications to processors must be written by the user. Furthermore, this is only a partial solution. If a DBTG set linking all customers with overdue accounts were desired, then the applications programs would have to recognize that this set would be implemented as several sets, one per host. This violates the desirable property of keeping the network structure invisible to applications.
Additional problems arise because the current operating systems on which data management facilities are built do not support efficient interprocess communication, which must be available for data location algorithms, concurrency control procedures, and crash recovery.
The construction of federated systems is not impossible, but it is likely to be very difficult. This is especially so in the heterogeneous case where data must somehow be translated in format when moved between hosts. This problem is unsolved in general, although a committee of the Codasyl Stored Data Definition and Translation Task Group has recently published a detailed proposal that addresses this issue.[15]
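The kind of user-written integrator described above can be sketched as follows. Everything here is hypothetical: `HostDB` stands in for one host's data base manager, the modulo rule is just one possible way to correlate customer identifications with processors, and the `overdue_set` method imitates a per-host set of customers with overdue accounts. The point is only to show the two jobs the integrator takes on: routing a request to the right host, and presenting several per-host sets as one logical set.

```python
class HostDB:
    """Stand-in for one host's data base manager (hypothetical interface)."""

    def __init__(self, name):
        self.name = name
        self.customers = {}      # customer id -> record

    def overdue_set(self):
        """The per-host set of customers with overdue accounts."""
        return [cid for cid, rec in self.customers.items() if rec["overdue"]]


class Integrator:
    """User-written layer that hides which host holds which customer."""

    def __init__(self, hosts):
        self.hosts = hosts

    def host_for(self, customer_id):
        # Assumed correlation rule: customer number modulo number of hosts.
        return self.hosts[customer_id % len(self.hosts)]

    def put(self, customer_id, record):
        self.host_for(customer_id).customers[customer_id] = record

    def get(self, customer_id):
        return self.host_for(customer_id).customers[customer_id]

    def overdue_customers(self):
        """Present the several per-host sets as one logical set."""
        merged = []
        for host in self.hosts:
            merged.extend(host.overdue_set())
        return sorted(merged)


integrator = Integrator([HostDB("h0"), HostDB("h1"), HostDB("h2")])
for cid, is_overdue in [(100, True), (101, False), (102, True), (103, True)]:
    integrator.put(cid, {"overdue": is_overdue})

overdue_ids = integrator.overdue_customers()
```

If applications call only `get`, `put`, and `overdue_customers`, the network structure stays invisible to them; the cost is that the merging logic must be written and maintained outside the data base package.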
Examples
Few distributed data management systems are operational. Those operated by the airlines and by banks are impressive, but they are "home-built" and do not represent generally available packages. Some manufacturers, such as Digital Equipment Corp. and Prime Computer, Inc., modestly support distributed data management, but it exists chiefly in the form of support for interprocess communication and nontransparent data distribution for homogeneous hosts. We know of no commercial systems that support all of the requirements outlined in the previous sections on fundamental problems. However, many manufacturers are actively studying the problem.
At the University of Waterloo we have been working on the logical design of distributed systems. Our objective is to do transaction processing on distributed data bases through an integrated architecture. Since transaction processing has modest computational requirements, the design is based on minicomputers. We view the data base as a single logical entity, but we assume that accesses to it exhibit geographic locality of reference. This means that the data base can be partitioned into components so that the majority of hits on a given component come from terminals in one geographic region; the network has a processor for each partition. But locality of reference is a statistical property, and infrequent accesses to remote components are required. Furthermore, management is assumed to have occasional need to process the data as a whole. Therefore, communication channels link the computers, providing the basis for a distributed system.
This system, called Mininet, has been described in detail elsewhere.[16] Its design is summarized in the following paragraphs, as an illustration of an integrated architecture.
The support of transaction processing leads naturally to the design of a message-switched operating system. A transaction is handled by a number of small tasks. Each task performs a portion of the computation and invokes the cooperation of other tasks by sending them messages. An example is shown in Figure 2, where the circles represent the tasks that participate in processing a sales transaction, and the lines represent messages sent between tasks.
Messages are passed by the communication nucleus of the system. This consists of a dedicated task, together with the communication subnet and its interface.
The task is called the switch; a copy of it is executed at each host. An application task sends a message by invoking the switch with a SEND command that identifies the receiver. The SEND command has precisely the same format, whether the receiving task is in the same host or in a remote host. The data management function is also implemented with this task structure. This is the basis for transparency of the network structure; the distribution of tasks among hosts of the system can be made invisible to the application code. All knowledge of network structure is concentrated in descriptive tables that are interpreted by the communication nucleus. (Given the transparency of task location, it would therefore be a simple matter to implement "back end" data management machines for Mininet.)
Transparency of data distribution at the storage structure level is provided by cooperating groups of data management tasks. Each member of a group resides in a separate host. A request for data sent to any member of a group may involve all members. They cooperate to find the requested data file, using any of the data search strategies outlined previously, in the section on locating data.
This design has three key features that form the basis of constructing an integrated system on a distributed architecture. Dividing the computations into multiple small tasks distributes the workload over several processors. Transparency of task location in the interprocess communication mechanism allows reconfiguration of hardware and mobility of tasks over the system hosts. And finally, the use of data access task groups provides data location transparency.
The Mininet system is being implemented on PDP-11/45 processors. The communication nucleus is operational, and the data management system is being constructed. Our approach to the fundamental problems outlined in this article will be described in a forthcoming technical report.
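The idea of a SEND command whose form does not depend on where the receiver runs can be illustrated with a short sketch. It is not Mininet code: the `Switch` class, its method names, the task names, and the in-memory `network` dictionary that stands in for the communication subnet are all assumptions made for this example. The descriptive table is modeled as a plain mapping from task name to host name.

```python
class Switch:
    """Hypothetical per-host switch; one copy runs at each host."""

    def __init__(self, host, task_table, network):
        self.host = host
        self.task_table = task_table   # descriptive table: task name -> host name
        self.network = network         # stand-in for the subnet: host name -> Switch
        self.local_tasks = {}          # task name -> handler function
        self.remote_sends = 0
        network[host] = self

    def register(self, task, handler):
        self.local_tasks[task] = handler

    def send(self, receiver, message):
        """SEND has the same form whether the receiver is local or remote."""
        target_host = self.task_table[receiver]
        if target_host == self.host:
            return self.local_tasks[receiver](message)
        self.remote_sends += 1
        return self.network[target_host].deliver(receiver, message)

    def deliver(self, receiver, message):
        return self.local_tasks[receiver](message)


table = {"price_lookup": "host-a", "inventory": "host-b"}
network = {}
a = Switch("host-a", table, network)
b = Switch("host-b", table, network)
a.register("price_lookup", lambda item: ("price", item))
b.register("inventory", lambda item: ("stock", item))

local_reply = a.send("price_lookup", "widget")   # receiver on the same host
remote_reply = a.send("inventory", "widget")     # receiver on another host

# Relocating the inventory task changes only the table and the registration;
# the sending code is untouched.
a.register("inventory", lambda item: ("stock", item))
table["inventory"] = "host-a"
moved_reply = a.send("inventory", "widget")
```

Because the sender never names a host, moving a task is a matter of updating the table, which is the property the text relies on for reconfiguration and for "back end" data management machines.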
Figure 2. Transactions are processed by tasks (indicated by circles). Lines between circles indicate communication paths. Rectangles at bottom are secondary storage units; double rectangle at top is the point-of-sale terminal.
Conclusion
Distributed processing offers exciting possibilities. It promises improved performance, improved system availability, easier adjustment to a growing workload, and some individualized control over machine resources. But wide application awaits the implementation of distributed data management services and solutions to several difficult problems. Most notably, crash recovery has received insufficient attention. Even when these problems are solved, the problem of migrating from current systems to the new architecture must still be addressed.
Nevertheless, the dramatic drop in the cost of computers makes distributed systems inevitable. People will purchase small machines to solve small local problems, and then recognize the advantages of connecting small machines to their current big systems. Data processing management would be wise to plan for growth in this direction by recognizing the requirements of distributed data management, and by demanding integrated systems from vendors.
References
1. M. E. Deppe and J. P. Fry, "Distributed Databases: A Summary of Research," Computer Networks, Vol. 1, No. 2, September 1976, pp. 130-138.
2. K. Eswaran, "Placement of Records in a File and File Allocation in a Computer Network," Proc. IFIP Congress, 1974, pp. 304-307.
3. K. D. Levin and H. L. Morgan, "A Dynamic Model for Distributed Data Bases," Proc. ORSA/TIMS Conference, Spring 1975.
4. S. A. Mahmoud and J. S. Riordan, "Optimal Allocation of Resources in Distributed Information Networks," ACM Transactions on Database Systems, Vol. 1, No. 1, March 1976, pp. 66-78.
5. W. W. Chu, "Performance of File Directory Systems for Data Bases in Star and Distributed Networks," AFIPS Conf. Proc., Vol. 45, 1976 NCC, pp. 577-587.
6. P. Brinch-Hansen, Operating System Principles, Prentice-Hall, 1973.
7. P. R. Johnson and R. H. Thomas, "The Maintenance of Duplicate Databases," ARPA Network Working Group, Request for Comments #677, Network Information Center, Document #31507, January 1975.
8. R. H. Thomas, "A Solution to the Update Problem for Multiple Copy Databases Which Uses Distributed Control," Bolt Beranek and Newman Inc., Report #3340, July 1976.
9. J. Gray, "Locking in a Decentralized Computer System," IBM Corporation, Technical Report TR-RJ1346, February 1974.
10. J. Day, "A Principle of Resilient Sharing of Distributed Resources," Transcript of the Distributed Processing Workshop, Computer Architecture News (ACM), Vol. 5, No. 6, February 1977, p. 12.
11. A. P. Mullery, "The Distributed Control of Multiple Copies of Data," IBM Corporation, Technical Report TR-RC5781, December 1975.
12. S. A. Mahmoud, "Resource Allocation and File Access Control in Distributed Information Networks," PhD dissertation, Carleton University, 1975.
13. C. Sunshine, "Issues in Communication Protocol Design-Formal Correctness," Reprints, ACM Interprocess Communication Workshop, March 1975, p. 118.
14. H. Sturgiss and B. Lampson, "Crash Recovery in a Distributed Data Storage System," Transcript, Distributed Processing Workshop, Computer Architecture News (ACM), Vol. 5, No. 6, February 1977, p. 10.
15. Committee on Data Systems Languages, Stored Data Definition and Translation Task Group, "Stored-Data Description and Data Translation: A Model and Language," Information Systems, Vol. 2, No. 3, 1977.
16. Jacques Labetoulle, Eric Manning, and Richard Peebles, "A Homogeneous Network for Data Sharing," (two-part article), Computer Networks, Vol. 1, Issue 4, 1977.
Bibliography
1. Canaday, R. H., et al., "A Back-End Computer for Data Base Management," CACM, Vol. 17, No. 10, pp. 575-582.
2. Casey, R. G., "Allocation of Copies of a File in an Information Network," AFIPS Conf. Proc., Vol. 40, 1972 SJCC, pp. 617-626.
3. Chandy, K. M., and J. E. Hewes, "File Allocation in Distributed Systems," Proc. Conf. on Modeling and Measurement, Harvard, Massachusetts, March 1976, p. 10.
4. Chen, P. P. S., "Optimal File Allocation in Multi-Level Storage Systems," AFIPS Conf. Proc., Vol. 42, 1973 NCC, pp. 277-282.
5. Chu, W. W., "Optimal File Allocation in a Computer Network," IEEE Trans. on Computers, Vol. C-18, No. 10, October 1969, pp. 885-889.
6. Chu, W. W., and G. Ohlmacher, "Avoiding Deadlock in Distributed Data Bases," Proc. ACM National Conf., November 1974, pp. 156-160.
7. Chupin, J. C., "Control Concepts of a Logical Network Machine for Data Banks," Proc. IFIP Congress, 1974, pp. 291-295.
8. Holler, E., "Files in Computer Networks," Proc. First European Conference on Computer Systems, May 1973, pp. 381-396.
9. Karp, R. M., "Reliability Among Combinational Problems," Complexity of Computations, Plenum Press, 1972.
10. Kleinrock, L., Stochastic Message Flow and Delay, Dover, 1972.
11. Levin, K. D., "Organizing Distributed Databases in Computer Networks," PhD dissertation, University of Pennsylvania, 1974.
12. Metcalf, R., "Strategies for Operating Systems in Computer Networks," Proc. ACM, 1972, pp. 278-281.
13. Morrissey, John, "Distributed Processing Systems - Some Economic and Technical Issues," Transcript, Distributed Processing Workshop, Computer Architecture News (ACM), Vol. 5, No. 6, February 1977, p. 13.
14. Peebles, Richard, and Eric Manning, "A Computer Architecture for Large Distributed Data Bases," Proc., First Conference on Very Large Data Bases, Framingham, Massachusetts, September 1975.
15. Roberts, L. G., and B. D. Wessler, "Computer Network Development to Achieve Resource Sharing," AFIPS Conf. Proc., Vol. 35, 1970 (Spring Joint Computer Conference), pp. 543-549.
16. Whitney, V. K. M., "A Study of Optimal File Assignment and Communication Network Configuration," PhD dissertation, University of Michigan, 1970.

Additional reference material appears in the Proceedings of the Second Berkeley Workshop on Distributed Data Management and Computer Networks, May 25-27, 1977, available from National Technical Information Service.
Richard Peebles, with the Department of Computer Science at the University of Waterloo since 1972, is an active member of the university's Computer Communications Networks Group. His principal research interests are in the area of distributed data base management. In addition, he has worked on computer network performance modeling and on data base design (representation of data semantics). A member of ACM, he serves on the editorial board of the association's Transactions on Database Systems. He received his BSc in physics from McGill University in 1966 and his PhD in computer science from the University of Pennsylvania in 1972.
Eric Manning is director of the Computer Communications Networks Group at the University of Waterloo, where he has also worked as a member of the Department of Computer Science since 1968. An author of numerous articles on computer communications and automated fault diagnosis in digital systems, he has lectured extensively on data switching, distributed processing, and network performance measurement as a member of the Computer Society's Distinguished Visitors Program. He is an editor of the journal Computer Networks. Manning received his BSc and MSc in applied mathematics from the University of Waterloo and his PhD in electrical engineering from the University of Illinois (Urbana).
January 1978