Book Reviews
Technical Editor: Marcin Paprzycki, Dept. of Science and Mathematics, Univ. of Texas of the Permian Basin, Odessa, TX 79762-0001
[email protected]
Two books on distributed real-time systems
reviewed by Janusz Zalewski, University of Central Florida

Distributed Real-Time Systems: Monitoring, Visualization, Debugging and Analysis, by Jeffrey J.P. Tsai, Yaodong Bi, Steve J.H. Yang, and Ross A.W. Smith, 317 pages, $54.95, John Wiley & Sons, New York, 1996, ISBN 0-471-16007-5.

Knowledge-Based Software Development for Real-Time Distributed Systems, by Jeffrey J.P. Tsai and Thomas J. Weigert, 234 pages, $79, World Scientific, Singapore, 1994, ISBN 981-02-1128-7.
Despite the increasing market demand for, and the undoubted spread of, distributed real-time applications, not many books are available on distributed real-time systems. The most recent one, Distributed Real-Time Systems: Monitoring, Visualization, Debugging and Analysis, focuses mostly on the analysis of such systems via their monitoring and visualization, as the subtitle indicates. The authors follow a mostly experimental approach to the analysis of distributed real-time systems (calling it the dynamic approach). Their approach consists of three steps: monitoring the system and collecting relevant data, filtering and visualizing the collected data, and the analysis proper. This contrasts with the theoretical (static) approach, which is based mostly on formal verification techniques, whose treatment in this book is limited to Petri nets.

The book is divided into two parts: Basic Concepts, and Theory and Practice. The Basic Concepts part first presents the fundamental concepts of real-time and distributed systems. The authors list the principal characteristics of distributed real-time systems as continuous operation, stringent timing constraints, asynchronous and multithreaded process interaction, unpredictable communication delays and race conditions, nondeterminism and nonrepeatability, and the lack of a global clock reference.

The book then discusses basic monitoring techniques for such systems: software, hardware, and hybrid. These chapters are the most interesting in this part of the book. Monitoring combines two operations: triggering and recording. Triggering detects predefined events during program execution, and recording collects and stores the data pertinent to those events. Software monitoring techniques use interrupts or embedded code for triggering, and hardware monitoring techniques passively snoop the system paths for predefined combinations of signals. This explains the principal difference between the two approaches: software techniques are invasive and add to the system workload, while hardware techniques are noninvasive and separate from the system workload. Hybrid techniques combine elements of both approaches.

The difficulty with monitoring distributed systems is that not only must each node of the target system be equipped with the appropriate hardware or software devices, but all these devices must also connect to a central location via an existing or dual network. The four design goals of monitoring distributed real-time systems are to

• allow transparent monitoring,
• minimize and predict monitoring overhead,
• improve memory speed and space limitations, and
• minimize the machine dependency of monitoring hardware.

Subsequently, the book briefly discusses several known algorithms for software, hardware, and hybrid monitoring. The authors use a clear hierarchy to describe hardware and hybrid techniques: uniprocessor techniques first, followed by multiprocessor and then distributed techniques. It's a pity that they did not present software techniques in this way, too. A good overview of debugging techniques, as an extension and the actual goal of monitoring, follows the monitoring chapters.

The last chapter of the first part discusses specification techniques. It provides a good, although very brief, overview of theoretical approaches. (However, this chapter is out of place, as we'll see.) The authors list five types of methods widely used for specification and verification: Petri nets, temporal logics, state-transition systems, process algebras, and synchronous languages. They also briefly present their timed versions.

IEEE Concurrency

The Theory and Practice part starts with a chapter on monitoring, in which the authors describe the system they built. The book then covers theoretical approaches: graph-based timing analysis (which the authors describe thoroughly) and timing-constraint Petri nets. This arrangement of topics is confusing; the text would have read more easily had the chapters on specification techniques and monitoring been switched.

Distributed Real-Time Systems: Monitoring, Visualization, Debugging and Analysis seems addressed to software engineers and graduate students interested in performance analysis of real-time distributed systems. However, to my surprise, it does not mention who the intended audience is. The book would be stronger if it focused exclusively on experimental methods of analysis for distributed and real-time systems—that is, emphasizing and extending material from its first part and including the chapters on monitoring and visualization from the second. Theoretical approaches, including Petri nets, although extremely important as methods of static analysis, form a completely separate discipline and do not belong in this book.

Reading this book encouraged me to look at an earlier one by Tsai and Thomas J. Weigert, Knowledge-Based Software Development for Real-Time Distributed Systems. From the perspective of the software life cycle, this earlier book is on software development, while the other mostly contributes to the maintenance phase. This book's essence lies in the application of Frorl (frame and rule-oriented requirements language), developed by Tsai and Weigert, to the formal development of real-time distributed systems. Thus, this book could encompass the material the authors should have omitted from the other book. A brief introductory chapter surveys various aspects of formal software development.
The book then quickly gets to the point by discussing the features of specification languages that are needed to express requirements and by presenting the logic foundations for a specification language. Desirable features include

• freedom from implementation concerns,
• naturalness of representation,
• operational interpretation,
• validity checking,
• verifiability,
• soundness and completeness, and
• ease of modification and construction.

April–June 1997

The book outlines Frorl, which possesses all these features; presents a couple of examples; and then discusses Frorl's logic foundation, which is based on a nonmonotonic variant of Horn-clause logic. Nonmonotonicity of the logic is necessary to support the semantics of inheritance and exception handling.

Next, Tsai and Weigert discuss features of real-time distributed systems that have counterparts in any development methodology. (This discussion is similar to that in the other book reviewed here.) They end this discussion by describing a number of additional modeling constructs that enhance Frorl by expressing the requirements of distributed and real-time systems. For example, added mechanisms for describing real-time processes include those for expressing periodic behavior, specifying timing constraints on sporadic actions, and defining temporal properties.

The authors then present a variant of temporal logic that facilitates describing and reasoning about time-related information. Ordinary temporal logic, as introduced to computer science by Amir Pnueli, does not allow specifying absolute time; it allows only relative orderings between actions. (For more information, read The Temporal Logic of Reactive and Concurrent Systems and its sequel, Temporal Verification of Reactive Systems: Safety—reviewed in the Jan.–Mar. 1997 Concurrency—by Zohar Manna and Pnueli, Springer-Verlag, 1992 and 1996.) The temporal logic used here is rather peculiar: it extends, by adding time, the modal µ-calculus, which is originally a modal logic with labels and a least fixed-point operator.

Next, the book outlines three strategies for analyzing and verifying distributed real-time systems: resolution refutation, model checking, and a graph-based approach. It then discusses, using examples in Frorl, the specification and verification of knowledge-based systems.
This discussion is not very convincing, because knowledge-based systems do not, in principle, react in real time, simply because you can’t guarantee the time that an exhaustive search through their solution space will take. Two important chapters end the book. One discusses the step-by-step transformation of Frorl specifications into computer programs written in a procedural language, and the other introduces techniques for debugging Frorl specifications—both within a conceptual framework of knowledge-based systems. Like the other book, Knowledge-Based Software Development for Real-Time Distributed Systems does not mention the intended audience. However, based on its contents, I would say it is definitely for researchers interested in recent theoretical advances in the specification and verification of real-time systems.
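To illustrate the untimed-versus-timed distinction the authors build on (my example, not taken from the book): plain temporal logic can require that every request is eventually answered, but only a timed extension, of the metric kind sketched here, can bound the delay:

```latex
% Untimed: every request is eventually followed by a response (no deadline).
\Box\,(\mathit{request} \rightarrow \Diamond\,\mathit{response})
% Timed extension: the response must arrive within 5 time units.
\Box\,(\mathit{request} \rightarrow \Diamond_{\le 5}\,\mathit{response})
```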
Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering
reviewed by Michael Mikolajczak, Merrill Lynch, New York
Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering by Ian Foster 379 pages $54.82 Addison-Wesley Reading, Mass. 1995 ISBN 0-201-57594-9
In Designing and Building Parallel Programs, Ian Foster attempts “to provide a practitioner’s guide for students, programmers, engineers, and scientists who wish to design and build efficient and cost-effective programs for parallel and distributed computer systems.” He views parallel programming as an “engineering discipline.” To reinforce that view, he has divided the book into three parts: Concepts, Tools, and Resources.
Concepts
This part starts with an introduction to parallel computers and computation. Foster uses arguments from VLSI theory, application trends, and networking developments to predict a future in which parallelism is essential to programming. Although these arguments are convincing, history has shown that the days of commodity parallel computers and software might be further in the future.

For example, many critics have predicted the demise of single-processor systems, citing the physical limitations of silicon and VLSI complexity theory results. Yet, the single-processor PC is alive and well, breaking performance barriers with the introduction of faster and faster processors. Second, the vast majority of applications on the market do not require as much computational power as Foster implies. Most commercial computationally intensive applications are enterprise-wide databases. Such applications successfully run on high-performance workstations with two to 16 processors. These computers are hardly parallel. Third, the recent explosive growth of networks would seem a natural prelude to development of parallel and distributed applications. In reality, most distributed systems are client-server systems having simple communication patterns and very few processors. Foster correctly asserts that the role of parallel computers will increase. He should, however, warn his readers that such a time might be further away than they think.

In the last section of the introductory chapter, Foster gives several examples of parallel algorithms: 2D finite difference, molecular
dynamics, and search. He clearly describes the algorithms and explains how the abstract concepts introduced previously relate to concrete problems. However, this section is rather more difficult than previous ones; perhaps he could have presented fewer problems. Also, readers would likely benefit from a concise algorithmic description/pseudocode common to all problems.

Chapter 2 introduces the basics of designing parallel algorithms. In particular, the reader encounters two problem-decomposition techniques: domain decomposition and functional decomposition. Although Foster attempts to clearly distinguish them, I got the impression that the two techniques have much in common. For example, in the atmosphere-modeling problem (which I'll discuss more later), functional decomposition and domain decomposition appear very similar: the functional unit modeling the atmosphere likely contains data structures only relevant to the atmosphere. Hence, the two techniques yield the same results. Could he have used a different example?

This chapter then introduces various modes of communication in parallel systems. Foster categorizes communication as global, local, structured, unstructured, and so on. He does not, however, describe broadcasts, multicasts, and other communication terms familiar to students of distributed systems. In a similar vein, in the design methodology, he does not consider the possibility of communication failure. Also, when describing asynchronous communication, he relies on tasks polling for incoming data. This is clearly unnecessary because most of today's parallel applications are multithreaded. Finally, after reading this chapter, we are expected (according to the chapter's preamble) to be able to design communication patterns for parallel applications. Yet, Foster does not go through the design process. Instead, he introduces several commonly used patterns such as divide-and-conquer strategies for parallel summation.
This section should be vastly expanded to fulfill its role. After discussing two other important subjects—agglomeration and the mapping of
tasks to processors—Foster provides several complete examples of parallel-algorithm development. The atmosphere-modeling problem is very complete and lucid. Unfortunately, the partial differential equations specifying atmosphere dynamics shed little light on the problem. Perhaps he should have provided the difference equations for updating each stencil. The floor-plan-optimization and computational chemistry examples are a bit harder to follow; nevertheless, they are very enlightening. Finally, Foster could have provided several alternatives for each problem (in the agglomeration and mapping phases of the design, for example). This would illustrate how various parallel architectures lead to different designs for the same algorithm.

The importance of modular design cannot be emphasized enough, especially for developing programs and libraries for parallel architectures. Hence, Foster's decision to devote all of Chapter 4 to this issue commands my applause. In particular, he introduces the notions of sequential, parallel, and concurrent program composition. Although the subsection on sequential and parallel composition is quite clear and the ScaLapack examples are quite informative, the treatment of concurrent composition is inadequate. The essence of concurrent composition, according to Foster, is to interleave parallel computation in various modules among different processes. A thread of execution in one task might itself require computation that must be performed in parallel. This sharply contrasts with sequential and parallel composition, where a processor can only execute code contained in a single module. Yet, Foster does not introduce threads at all, although they should form the foundation of concurrently composed programs. This omission is starkly evident in Example 4.2. The send operation in the while loop should execute on a different thread, concurrently with other_computation.
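The suggestion can be sketched in a multithreaded style. This is a hypothetical illustration, not code from the book: the queue stands in for the communication channel, and other_computation is a placeholder for the useful work that should overlap the send.

```python
import threading
import queue

# Hypothetical sketch: perform the send on a separate thread so that it
# overlaps with other computation, instead of blocking the single thread
# of control (or polling). `channel` stands in for the inter-task
# communication channel; none of these names come from the book.
channel = queue.Queue()

def other_computation(n):
    # Stand-in for useful work that proceeds while the send is in flight.
    return sum(i * i for i in range(n))

sender = threading.Thread(target=channel.put, args=("partial result",))
sender.start()                        # the send now runs concurrently...
result = other_computation(100_000)   # ...with this computation
sender.join()                         # wait for the send to complete
received = channel.get()
print(received, result)
```

With the send handed off to its own thread, the buffer-overflow and efficiency concerns the reviewer raises next become the runtime's problem rather than the programmer's polling loop.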
Without multithreaded environments, concurrently composed programs will likely be highly inefficient and suffer from buffer overflow. The implementers of CC++, which is introduced in a later chapter, agree with this assessment: threads are a fundamental notion in their language.

This chapter, like the previous chapters, concludes with extended examples. The first is an old favorite: convolution. The presentation is quite lucid, although a brief description of convolution's applicability might help readers not well-versed in electrical engineering. Also, I would have liked to have seen a complete program written in one of the numerous packages mentioned in previous chapters. The tuple-space and matrix-multiplication problems are much more involved; I suppose they're intended for advanced readers.
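The divide-and-conquer parallel summation mentioned earlier as one of Foster's commonly used patterns can be sketched as follows. This is a minimal illustration of the pattern using Python threads, not Foster's code (the book works in CC++, Fortran M, and HPF); the depth bound is my own device to keep the number of threads finite.

```python
import threading

def tree_sum(data, depth=2):
    """Divide-and-conquer summation: split the list, sum the left half on
    a new thread while this thread sums the right half, then combine the
    partial results. `depth` bounds how many levels spawn threads."""
    if depth == 0 or len(data) < 2:
        return sum(data)                 # base case: sequential sum
    mid = len(data) // 2
    left_result = []
    worker = threading.Thread(
        target=lambda: left_result.append(tree_sum(data[:mid], depth - 1)))
    worker.start()
    right = tree_sum(data[mid:], depth - 1)  # this thread takes the right half
    worker.join()
    return left_result[0] + right

print(tree_sum(list(range(10_000))))  # 49995000
```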
Tools
While the Concepts part introduces more theoretical areas of parallel computing, the Tools part forms an extended manual page of commonly used packages and tools. The author describes CC++, Fortran M, HPF, and MPI. Clear descriptions of programming interfaces for each of these, and programming examples at the end of each chapter, will help novices get started quickly. Seasoned professionals will enjoy the discussions of nondeterminism and mapping in CC++ and of the differences and similarities between the two flavors of Fortran. This part refers throughout to concepts introduced earlier. The contrasting sections on modularity, mapping, and performance in Chapters 5 through 8 are particularly informative. For example, Foster contrasts the performance consequences of global pointers in CC++ versus messaging in MPI or data parallelism in HPF.

One small shortcoming of this part is the lack of emphasis on each parallel-programming package's relative importance and popularity. For instance, the use of CC++ has been limited primarily to parallel-computing academic circles because of the scarcity of commercially available compilers and its lackluster performance. On the other hand, MPI, HPF, and FM have enjoyed considerable success in a wide variety of academic research environments. Another minor issue is the omission of other packages available commercially or otherwise. A chapter on packages such as Orbix or Active Messages might have been beneficial. Orbix particularly deserves at least a mention, given its unprecedented success in telecommunications and banking.
Resources
This part contains detailed descriptions of parallel random-number generators and hypercube algorithms. Although the contents of the chapter on hypercube algorithms are standard and need not be elaborated upon, Foster's choice of the parallel random-number generator appears somewhat odd. The list of references is very complete and reinforces my opinion that the book is an extended introduction to the field.
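For context on what a parallel random-number generator must provide (a generic illustration, not the generator Foster presents): each task needs its own reproducible, non-overlapping stream. One standard scheme, leapfrogging, deals the draws of a single sequence out to the tasks round-robin:

```python
import random

def leapfrog_streams(seed, n_streams, n_draws):
    """Split one pseudorandom sequence among n_streams tasks: stream k
    receives draws k, k + n_streams, k + 2*n_streams, ... so the
    per-task streams are disjoint and reproducible from one seed."""
    rng = random.Random(seed)
    sequence = [rng.random() for _ in range(n_streams * n_draws)]
    return [sequence[k::n_streams] for k in range(n_streams)]

streams = leapfrog_streams(42, n_streams=4, n_draws=3)
print(len(streams), [len(s) for s in streams])  # 4 [3, 3, 3]
```

A production generator would jump each stream ahead analytically rather than materialize the whole sequence centrally, but the partitioning idea is the same.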
Uneven yet useful
Designing and Building Parallel Programs is intended to fulfill two functions. First, it is an introduction to parallel computing. This is
clearly shown by the inclusion of the Concepts part, which is a succinct overview of theoretical parallel algorithm design. The book doubles as a handy reference for professionals developing code in any of the parallel programming packages reviewed in the Tools part. Although the book’s overall design and content are well thought out, a number of minor deficiencies take away some of its luster. For example, the level of difficulty is uneven. Most chapters begin with a simple description of the topic. The reader will be very quickly surprised, however, by the examples at the end, whose level of detail is frequently too high for their intended purpose. The set of differential equations in one of the case studies illustrates this problem. Many undergraduate readers are unlikely to appreciate the importance of computing the Fock matrix or to fully understand calculation of convolution; the book describes both in detail. If Foster is aiming at the graduate audience, he should have included more advanced material at the beginning of each chapter and thus given his book a more academic tone, or he should have chosen simpler examples. Many readers will also find surprising the lack of emphasis on the most recent developments in parallel and distributed computing. Languages such as CC++ or HPF have been around for a number of years. The hottest products on the market combine the features of high-performance parallel-programming
libraries and modern object-oriented development systems. A multitude of Corba implementations, for example, possess many features of CC++ and are used successfully by commercial enterprises all over the world. Perhaps Foster should have included a Corba implementation (such as Orbix) in the Tools part.

Distributed computing's importance in the marketplace and research is difficult to overestimate. From simple client-server applications, such as popular Web servers, to complex real-time distributed systems, such applications are virtually everywhere. Yet this book devotes no time to even a simple description of distributed systems and libraries. Foster dismisses this important field as a subdiscipline of parallel computing. Implementing parallel algorithms on a network of workstations introduces many complexities, such as the possibility of failure. You couldn't reliably implement any algorithm presented in the book on such a system. A description of any of the multitude of distributed-programming packages, such as ISIS, would more than adequately familiarize the reader with relevant issues.

Although Designing and Building Parallel Programs might be a few years out of date and a bit choppy in its presentation, it is nevertheless an extremely useful educational tool or reference. It will certainly be an excellent addition to my bookshelf.
New Books

Authentication Systems for Secure Networks, Rolf Oppliger, 186 pp., $59, Artech House, Boston, 1996, ISBN 0-89006-510-1.

Client/Server Programming with Java and CORBA, Robert Orfali and Dan Harkey, 688 pp., $44.95, John Wiley & Sons, New York, 1997, ISBN 0-471-16351-1.

Communication and Computing for Distributed Multimedia Systems, Guojun Lu, 416 pp., $69, Artech House, Boston, 1996, ISBN 0-89006-884-4.

Computer Architecture: Concepts and Evolution, Gerrit A. Blaauw and Frederick P. Brooks Jr., 1,213 pp., $64.51, Addison-Wesley, Reading, Mass., 1997, ISBN 0-201-10557-8.
Concurrent Programming in Java: Design Principles and Patterns, Doug Lea, 352 pp., $34.38, Addison Wesley Longman, Reading, Mass., 1997, ISBN 0-201-69581-2.

Coordinating Distributed Objects: An Actor-Based Approach to Synchronization, Svend Frølund, 195 pp., $32.50, MIT Press, Cambridge, Mass., 1996, ISBN 0-262-06188-0.
Multithreading Applications in Win32: The Complete Guide to Threads, Jim Beveridge and Robert Wiener, 368 pp., $39.95, Addison Wesley Longman, Reading, Mass., 1997, ISBN 0-201-46136-6.

Paradigms for Fast Parallel Approximability, J. Diaz, M.J. Serna, P. Spirakis, and J. Toran, 200 pp., £25.00, Cambridge Univ. Press, Cambridge, UK, 1997, ISBN 0-521-43170-0.
Evolution of Parallel Cellular Machines: The Cellular Programming Approach, M. Sipper, 199 pp., Springer-Verlag, Berlin, 1997, ISBN 3-540-62613-1.
Principles of Transaction Processing, Philip A. Bernstein and Eric Newcomer, 358 pp., $39.95, Morgan Kaufmann, San Francisco, 1997, ISBN 1-55860-415-4.
Local Area Networks: A Client/Server Approach, James E. Goldman, 751 pp., $73.95, John Wiley & Sons, New York, 1996, ISBN 0-471-14162-3.
Programming with POSIX Threads, David R. Butenhof, 384 pp., $34.38, Addison Wesley Longman, Reading, Mass., 1997, ISBN 0-201-63392-2.