Programming in DeltaOS by Examples

CoreTek Systems, Inc.
1107B, CEC Building
6 South Zhongguancun Road
Beijing 100086
People's Republic of China

December 17, 2001
Contents

Part I   An Overview of DeltaOS   3

1 Introduction   4

2 The Why for Real-Time Operating Systems   7
  2.1 Introduction   7
  2.2 What is a Real-Time Operating System   8
  2.3 When does it Make Sense to use a Real-Time Kernel?   8
      2.3.1 An Example   8
      2.3.2 Adding Complications   9
      2.3.3 Real-Time Kernel Solution   11
  2.4 Why to Use a Real-Time Operating System   12
  2.5 Other Market Trends   13
  2.6 Why To Choose DeltaOS   13

3 DeltaCORE Real-Time Kernel   16
  3.1 Introduction   16
      3.1.1 What is DeltaOS?   16
      3.1.2 System Architecture   16
      3.1.3 Integrated Development Environment   18
  3.2 DeltaCORE Real-Time Kernel   18
      3.2.1 Overview   18
      3.2.2 Multitasking Implementation   19
      3.2.3 Overview of System Operations   21
      3.2.4 Task Management   28
      3.2.5 Storage Allocation   32
      3.2.6 Communication, Synchronization and Mutual Exclusion   35
      3.2.7 The Message Queue   36
      3.2.8 Events   38
      3.2.9 Semaphores   39
      3.2.10 Asynchronous Signals   41
      3.2.11 Time Management   42
      3.2.12 Interrupt Service Routines   45
      3.2.13 Tasks Using Other Components   48

Part II   Data Structures in DeltaOS   50

4 Introduction   51

5 Static Data Structures   53
  5.1 Sequential Lists   53
      5.1.1 Specification of Lists   53
      5.1.2 Implementation of Lists   54
  5.2 Stacks   58
      5.2.1 Specification of Stacks   58
      5.2.2 Implementation of Stacks   59
  5.3 Queues   61
      5.3.1 Specification of Queues   62
      5.3.2 Implementation of Queues   63
  5.4 Priority Queues   66
      5.4.1 Specification of Priority Queues   66
      5.4.2 Implementation of Priority Queues   67
  5.5 Search and Sort Algorithms   69
      5.5.1 Search Algorithms   70
      5.5.2 Sort Algorithms   71

6 Generic Data Types   78
  6.1 Template Functions   78
  6.2 Template Classes   79
  6.3 Generic Class of Stacks   79
      6.3.1 Specification of Generic Stacks   79
      6.3.2 Implementation of Generic Stacks   80
  6.4 Generic Search and Sort Algorithms   82
      6.4.1 Generic Search Algorithm   82
      6.4.2 Generic Sort Algorithm   83

7 Dynamic Data Structures   88
  7.1 Safe Arrays   88
      7.1.1 Specification of Dynamic Arrays   88
      7.1.2 Implementation of Dynamic Arrays   89
  7.2 String Class   94
      7.2.1 Specification of String   94
      7.2.2 Implementation of String   96
      7.2.3 Pattern Matching Algorithm   107
  7.3 Integral Sets   109
      7.3.1 Specification of Sets   109
      7.3.2 Implementation of Sets   111
      7.3.3 Applications of Class Set   116
  7.4 Linked Lists   117
      7.4.1 Nodes   118
      7.4.2 Specification of Linked Lists   120
      7.4.3 Implementation of Linked Lists   122
  7.5 Circular Lists   131
      7.5.1 Specification of Circular Lists   132
      7.5.2 Implementation of Circular Lists   132
  7.6 Doubly Linked Lists   134
      7.6.1 Specification of Doubly Linked Lists   135
      7.6.2 Implementation of Doubly Linked Lists   135
  7.7 Iterators   137
      7.7.1 Separation of Data and Control Abstractions   137
      7.7.2 Specification of Iterator   138
      7.7.3 Implementation of Iterator   139
      7.7.4 Deriving List Iterator   139
      7.7.5 Array Iterator   143

8 Nonlinear Structures   146
  8.1 Binary Tree Structures   146
      8.1.1 Specification of Trees   146
      8.1.2 Implementation of Trees   147
      8.1.3 Tree Traversals   149
      8.1.4 Using Tree Scan Algorithms   152
  8.2 Elimination of Recursion   156
      8.2.1 Recursive Programs   156
      8.2.2 Execution Tree of a Recursive Procedure   157
      8.2.3 Computation of a Recursive Procedure for a Principal Call   158
      8.2.4 Call Trees of a Recursive Procedure   159
      8.2.5 Depth of a Recursive Call – Embedded Recursive Calls   159
      8.2.6 Elimination of Formal Parameters and Local Variables   159
      8.2.7 Elimination of Recursion   160
  8.3 Binary Search Trees   172
      8.3.1 Specification of Binary Search Trees   173
      8.3.2 Implementation of Binary Search Trees   175
  8.4 Array-Based Binary Trees   185
      8.4.1 Application: The Tournament Sort   186
      8.4.2 Tree Iterators   190
  8.5 Heap   194
      8.5.1 The Heap as a List   194
      8.5.2 Specification of Heaps   195
      8.5.3 Implementation of Heaps   196
      8.5.4 Application: Heap Sort   203
  8.6 Priority Queues   203
      8.6.1 Specification of Priority Queues   204
      8.6.2 Implementation of Priority Queues   204
  8.7 AVL Trees   205
      8.7.1 Specification of AVLTreeNode Class   206
      8.7.2 Implementation of AVLTreeNode Class   207
      8.7.3 Specification of AVLTree Class   209
      8.7.4 Implementation of AVLTree Class   210
  8.8 Graphs   219
      8.8.1 Definition of Graph   220
      8.8.2 Specification of Graphs   220
      8.8.3 Implementation of Graph   222
  8.9 Hashing and Hash Table Class   235
      8.9.1 Hashing Functions   236
      8.9.2 Collision Resolution   236
      8.9.3 Specification of Hash Table Class   237
      8.9.4 Specification of HashTableIterator   238
      8.9.5 Implementation of Hash Table Class   239
      8.9.6 Implementation of HashTableIterator   242

Part III   Concurrent Programming in DeltaOS   245

9 A Cyclical Executive for Small Systems   246
  9.1 Concept   246
  9.2 Timers   247
  9.3 Inter-Task Communication   248
  9.4 Priorities   248
  9.5 Queueing   248
  9.6 Task Switching   248
  9.7 Instance Data   248
  9.8 Advantages   249
  9.9 Disadvantages   249
  9.10 Comments   250

10 How to Break an Application Down into Tasks   251
  10.1 When Multi-tasking is Required   251
  10.2 Implementing Concurrent Operations as Separate Tasks   251
      10.2.1 Single Task Solution   252
      10.2.2 Three Task Solution   252
  10.3 Design Principles   253
      10.3.1 Keep the Task Count Down   253
      10.3.2 When Multiple Task Instances Make Sense   253
      10.3.3 Implementing Resource Management as a Task   254
      10.3.4 Deferring Interrupt Processing to a Task   254

11 Concurrent Programming in DeltaOS   256
  11.1 Classical Process Coordination Problems   256
      11.1.1 The Bounded-Buffer Problem   256
      11.1.2 The Reader/Writer Problem   258
  11.2 Dining Philosopher Problem   260
      11.2.1 Problem Description   260
      11.2.2 Solutions   261
  11.3 Sieve Algorithm for Prime Numbers   275
      11.3.1 Problem Description   275
      11.3.2 Solution   276
  11.4 Barber Shop Problem   280
      11.4.1 Problem Description   280
      11.4.2 Solution   280
  11.5 Cigarette Smoker Problem   285
      11.5.1 Problem Description   285
      11.5.2 Solution without Deadlock   286
  11.6 Parallel Quick Sorting   291
      11.6.1 Problem Description   291
      11.6.2 First Solution   292
      11.6.3 Second Solution   294
  11.7 Runner and Timer Problem   299
      11.7.1 Problem Description   299
      11.7.2 Purpose   299
      11.7.3 Solution   299

12 Synchronization and Communication   303
  12.1 Concurrency Conditions   303
  12.2 Critical Regions   304
      12.2.1 Specification of Critical Regions   305
      12.2.2 Implementation of Critical Regions   306
  12.3 Conditional Critical Regions   306
      12.3.1 Specification of Conditional Critical Regions   307
      12.3.2 Implementation of Conditional Critical Regions   307
  12.4 Monitors   315
      12.4.1 Definitions and Implementations   315
      12.4.2 Monitor SynStack   316
      12.4.3 Implementation of SynStack   318
      12.4.4 Application of SynStack   322
      12.4.5 Condition as a Class   323
      12.4.6 Implementation of SynStack Again   325
  12.5 Interprocess Communication   329
      12.5.1 Naming   330
      12.5.2 Buffering   332
  12.6 Synchronized Communication Channels   333
      12.6.1 Specification of Channel   334
      12.6.2 Implementation of Channel   334
      12.6.3 Application of Channel   336
  12.7 Asynchronized Communication Buffer   339
      12.7.1 Specification of SynStore   340
      12.7.2 Implementation of SynStore   340

A Glossaries   344

List of Figures

List of Tables
  3.1 System Calls for Task Management   28
  3.2 System Calls for Storage Management   33
  3.3 System Calls for Message Queue   36
  3.4 System Calls for Events   38
  3.5 System Calls for Semaphores   40
  3.6 System Calls for Asynchronous Signals   41
  3.7 System Calls for Time Management   43
  3.8 System Calls for ISR   47
Copyright © 2001 CoreTek Systems, Inc.
Document Title: Concurrent Programming in DeltaOS
Revision Date: June 2001
Copyright
This document and the associated software contain information proprietary to CoreTek Systems, Inc. and may be used only in accordance with the CoreTek Systems agreement under which this package is provided. No part of this document may be copied, reproduced, transmitted, translated, or reduced to any electronic medium or machine-readable form without the prior written consent of CoreTek Systems. CoreTek Systems makes no representation with respect to the contents, and assumes no responsibility for any errors that might appear in this document. CoreTek Systems specifically disclaims any implied warranties of merchantability or fitness for a particular purpose. This publication and the contents hereof are subject to change without notice.
Trademark
The following are trademarks of CoreTek Systems, Inc.:
• DeltaSystem;
• DeltaOS;
• DeltaCORE;
• DeltaFILE;
• DeltaNET;
• LambdaTOOL;
• LambdaMONITOR.
All other products mentioned are the trademarks, service marks, or registered trademarks of their respective holders.
Part I
An Overview of DeltaOS
Chapter 1
Introduction

This is a book about concurrent and real-time programming in the embedded real-time operating system DeltaOS. The ability to write concurrent programs, i.e., programs with components that can execute in parallel, is desirable for many reasons:
• It leads to notational convenience and conceptual elegance in writing operating systems and real-time systems, all of which may have many events occurring concurrently.
• Inherently concurrent algorithms are best expressed with the concurrency explicitly stated; otherwise the structure of the algorithm may be lost.
• Even on single-CPU computers, concurrent programming can reduce program execution time, because lengthy input/output operations and CPU operations can proceed in parallel.
In this book, we explain the concurrent programming facilities in DeltaOS and show how to use them effectively in writing concurrent programs.
This book was written in response to many requests from DeltaOS users for a textbook that teaches DeltaOS programming with examples. The DeltaOS reference manuals provide very few examples to explain the concepts and usage of the system services, leaving a large number of questions for users to answer on their own. There are also very few textbooks for teaching DeltaOS programming techniques. We therefore believe it is our duty to provide such a work to the DeltaOS community.
Organization

This book is organized as follows:
• Part I – An Overview of DeltaOS introduces the basic concepts of DeltaOS:
Chapter 1. Introduction provides a brief introduction to the book.
Chapter 2. The Why for Real-Time Operating Systems explains the major reasons to use a real-time operating system for embedded software development.
Chapter 3. DeltaCORE Real-Time Kernel describes the basic concepts of DeltaOS.
• Part II – Data Structures in DeltaOS
Chapter 5. Static Data Structures presents some static data structures.
Chapter 6. Generic Data Types presents some generic data types.
Chapter 7. Dynamic Data Structures presents a number of dynamic data structures such as safe arrays, strings, and linked lists.
Chapter 8. Nonlinear Structures presents a number of non-linear data structures such as trees, graphs, and heaps.
• Part III – Concurrent Programming in DeltaOS
Chapter 9. A Cyclical Executive for Small Systems presents the concept of, and the code for, a simple cyclical executive.
Chapter 10. How to Break an Application Down into Tasks provides guidelines that may be useful when designing software for real-time multi-tasking systems. Specifically, it addresses how an application is broken down into multiple concurrent tasks.
Chapter 11. Concurrent Programming in DeltaOS presents solutions to a number of classical problems in the field of concurrent and distributed computation.
Chapter 12. Synchronization and Communication uses the DeltaOS primitives to build synchronization and communication objects.
Related Documentation When using the DeltaOS software you might want to have on hand the other manuals of the basic documentation set:
• DeltaSystem Getting Started contains an introduction to DeltaOS in the LambdaTOOL environment, some tutorials, a description of board-support packages, configuration instructions, information on files and directories, and some board-specific information. It also includes introductory material on using the LambdaMONITOR debugger.
• DeltaOS Programmer's Reference contains detailed descriptions of system services, interfaces and drivers, configuration tables, and memory usage.
• DeltaOS System Calls provides a reference of DeltaCORE, DeltaNET, and DeltaFILE system calls and error codes.
• DeltaOS Advanced Topics contains information on how to customize your DeltaOS. It contains sections on using and creating BSPs and on assembly-language programming.
• DeltaOS Application Examples describes the application examples that are provided for you, with tutorials on how to use them.
Chapter 2
The Why for Real-Time Operating Systems

2.1 Introduction
Time-to-market is a primary concern for today's engineer. The world no longer allows the choice of any two out of the following three items: on-time, on-budget, meets-specifications. Today's new products must satisfy all three criteria or the whole project is deemed unsuccessful.
Computer industry changes are occurring so rapidly that it is more like a revolution than an evolutionary process. Hardware platforms are becoming faster and cheaper. Operating systems and networking software are enabling distributed, scalable computers to totally redefine the entire application landscape. Over the past ten years, processor speeds have increased about 60% per year, while the price/performance ratio for workstations and PCs has decreased. This has dramatically changed the way we create and develop applications, and provides opportunities to develop embedded computer systems that perform feats previously deemed impossible.
Consumers are demanding more robust products, which increases the underlying software complexity. The traditional microcontroller approach has been overtaken by consumer demands for more features in new systems at an ever-escalating rate. Shorter product cycles and growing software complexity force producers to focus on "cost-to-market" and "time-to-market" issues. Current embedded systems are designed as full-fledged computer systems that happen to be inside other products. Embedded systems designers must therefore face the same issues as mainstream developers. Cost-to-market and time-to-market pressures make software reuse the critical issue.
Previously, real-time embedded application software had limited feature sets and was designed from scratch. These earlier systems comprised a single process, a single CPU, and a single thread of execution; a static run-time capability was all that was required. The future will be quite different. Market demands now require larger, more full-featured applications built from commercial off-the-shelf components.
One of the most critical "build or buy" decisions facing real-time system designers is whether to purchase an off-the-shelf (commercial) real-time operating system or write your own. A number of signs indicate that commercial RTOS solutions have now won market acceptance in the vast majority of applications.
2.2 What is a Real-Time Operating System
An operating system (OS) is code that provides services to manage a computer system's resources[21, 12, 23, 17, 10]. It separates an application from hardware by treating the hardware as an abstract machine. In its simplest form, an operating system performs only one task. Adding interrupts adds tasks in the form of the interrupt service routines. By adding interrupts and placing them into foreground and background segments, you begin to define the work of a modern operating system.
A real-time operating system (RTOS) is one that performs its job - managing resources - within a given time constraint to meet a system's requirements. There is no specific response time that qualifies an operating system as "real time". Rather, it is up to the target application: failing to meet the specified response time will cause severe consequences. A multitasking RTOS provides code for partitioning a design into multiple tasks, where each task runs independently of the others. Special techniques supply each task with the resources and services needed to guarantee that the application meets its time constraints[26]. A common technique is to have a preemptive, priority-based scheduler for the multitude of tasks. A kernel is the part of a multitasking system responsible for the management of tasks, that is, managing the CPU's time and communication among the tasks. One such kernel is DeltaOS, developed and supported by CoreTek Systems, Inc.
2.3 When does it Make Sense to use a Real-Time Kernel?

2.3.1 An Example
Not all real-time systems gain from using a real-time kernel. Consider a real-time system comprised of three motors and three switches. The switches have two positions, ON and OFF. The switches must be scanned about 10 times per second, and the motors turned on or off as appropriate. A possible solution is shown below.

void main(void)
{
    int i;

    for (;;) {
        for (i = 0; i < 3; i++) {
            if (switchChanged(i))
                changeMotor(i);
        }
    }
}

Note that the code above does not incorporate any timing to ensure that each switch is scanned at least 10 times per second. However, on even basic processors the code will scan much faster than 10 times per second, and since a faster scan rate will do no harm, the above code is presented as an adequate solution.
2.3.2 Adding Complications
Assume now that the motor switches must be polled at a fairly accurate rate of 10 times per second. This can be handled by modifying the code as follows:

for (;;) {
    if (OneTenthSecondIsUp) {
        for (i = 0; i < 3; i++) {
            if (switchChanged(i))
                changeMotor(i);
        }
        OneTenthSecondIsUp = 0;
    }
}

The global OneTenthSecondIsUp is set to 1 by the timer interrupt service routine every 100 milliseconds.

As a further complication, assume now that a pressure gauge must be checked every 50 milliseconds and a valve opened if the pressure is greater than 100 psi. Once opened, the valve must be closed after the pressure drops below 90 psi. This is done by adding the following to the loop.

if (FiftyMsIsUp) {
    switch (valveState) {
    case CLOSED:
        if (pressure() > 100) {
            openValve();
            valveState = OPEN;
        }
        break;
    case OPEN:
        if (pressure() < 90) {
            closeValve();
            valveState = CLOSED;
        }
        break;
    }
    FiftyMsIsUp = 0;
}

The global FiftyMsIsUp is set to 1 by the timer interrupt service routine every fifty milliseconds.

At this point the for (;;) loop is becoming quite large, and should be rewritten as follows.

for (;;) {
    checkMotorSwitches();
    checkPressure();
}

As more "tasks" are added to the system, they would be coded as C functions and a call to each function added to the loop, as has been done above. An inherent problem with this approach is that as function calls are added to the loop, the "round trip" time through the loop increases, and at some point the timing requirement that the switches be scanned 10 times per second may not be met.

As a final complication, assume that the system is connected to a network and that incoming datagrams must be processed. This could be handled by adding yet another function call to the loop.

for (;;) {
    checkMotorSwitches();
    checkPressure();
    checkDatagrams();
}

If the function checkDatagrams is not called at a sufficient rate, datagrams can be lost. To avoid this, a queue must be created so that when the interrupt service routine for incoming datagrams is entered, the datagram is placed into a queue for processing by the function checkDatagrams.

The technique outlined above is referred to as a cyclical or round-robin kernel/executive. It is adequate for a number of applications, and for some, it is the only choice. For example, consider an 8051 microcontroller application with no external memory. In this type of environment there is simply not enough RAM to accommodate a preemptive multi-tasking kernel that requires a private stack for each task. The cyclical executive is small, compact, and easy to use and understand. On the negative side, there are no priorities, busy waiting is employed, and more responsibility (e.g. timer management) is put on the programmer.
2.3.3 Real-Time Kernel Solution
The example above has three main requirements:
• Concurrency (the three separate tasks to be performed)
• Timing (sampling switches and pressure)
• Message queuing (queuing of incoming datagrams)

These requirements are some of the features that a typical real-time kernel offers. The kernel provides concurrency by allowing the creation of independent tasks; timing is provided by a timer management system; and message queuing is built into the kernel's message system. Using a real-time kernel, the above example could be implemented by coding each of the functions checkMotorSwitches, checkPressure, and checkDatagrams as an independent task.

void checkMotorSwitches(void)
{
    int i;

    while (TRUE) {
        pause(100L);
        for (i = 0; i < 3; i++) {
            if (switchChanged(i))
                changeMotor(i);
        }
    }
}

void checkPressure(void)
{
    while (TRUE) {
        pause(50L);
        if (pressure() > 100) {
            openValve();
            while (TRUE) {
                pause(50L);
                if (pressure() < 90) {
                    closeValve();
                    break;
                }
            }
        }
    }
}

void checkDatagrams(void)
{
    typeMsg *msg;

    while (TRUE) {
        status = delta_message_queue_receive(qrid, data_gram_buf,
                                             &size, DELTA_WAIT, timeout);
        msg = isr_msg_buf[0];
        processDataGram(msg);
        freeMsg(msg);
    }
}

Each of the above C functions is started in main (not shown) by making DeltaCORE system calls. The tasks checkMotorSwitches and checkPressure wake up periodically and check conditions. The task checkDatagrams wakes up only when a message is sent to it, since delta_message_queue_receive suspends the task until a message arrives. The message is sent to checkDatagrams by the receive-datagram interrupt service routine. Note the following:
• Busy waiting is eliminated.
• Timer management is no longer a concern of the programmer.
• checkPressure does not have to retain a state variable.
• Queue development and management is no longer a concern of the programmer.

Note also that a typical kernel allows for priorities. So, for example, if the checkPressure task's sample rate of 50 milliseconds is critical, the task could be run at a higher priority, meaning that if the 50-millisecond timer expires while another task is running, that task would be preempted to run checkPressure.

Any real-time system can be coded without a kernel; however, even simple systems can benefit from one. For some applications, timer management alone may be reason enough to use a kernel. Much can be written on this subject, and if there is enough interest, a more complete treatment may be written in the future.
2.4 Why to Use a Real-Time Operating System
Some reasons to use a real-time operating system are that it:
1. Provides a structure for overall system design.
2. Provides a common programming interface to all tasks.
3. Organizes system design into various functional models.
4. Permits independent work by a group of programmers.
5. Reduces development time by performing simple tasks.
6. Helps to manage time-critical functions easily.
7. Simplifies debugging by decomposing the system into many simple tasks.
8. Makes documentation easier to maintain.
9. Lowers long-term maintenance costs because of a modular design.

If the only constant in life is change, then in embedded systems it seems the only constant is the diversity of systems being built, with the number and complexity of applications on the rise. The trend has been for hardware-oriented companies to move away from proprietary software.
2.5 Other Market Trends
As the price of faster and more sophisticated CPUs drops, the impetus is to use these processors to build more sophisticated systems. One way to extend a system's life is to leverage existing code. Since an RTOS separates the main-line code from the hardware, the RTOS supports the transfer and reuse of that same application code on the more advanced CPU.
The predominance of proprietary designs for embedded systems is coming to an end. Development times are decreasing and systems are becoming more complex. In order to become more efficient, most embedded system developers are moving away from proprietary designs to commercially available, out-sourced options. It is not surprising that reliability was by far the most important factor when making design decisions.
The amount of time available to develop an embedded system is decreasing significantly. Completing an embedded system in less than 6 months is not easy; completing one in under 3 months is a real challenge. Such a short development cycle cannot be accomplished simply by throwing more people at the project. The future is bringing shorter development time and also higher complexity. Instead of hiring additional embedded system developers, the developers' efficiency must increase. Not only will development time decrease, but embedded systems are becoming increasingly complex: more complex systems will have to be created in less time than is available today.
2.6 Why To Choose DeltaOS
Embedded systems designers frequently have trouble integrating the different tools available for systems design. The wide variety of tools, with their proprietary and unique interfaces, means that designers inevitably face a grab bag of hardware and software tools to choose from. Many of the common problems developers typically face while the project is in full swing - a terrible time to
deal with surprises - can be avoided by choosing the right tools for the job and understanding each component in an embedded tool set. A typical software development environment includes an assembler or compiler (with linker/locator), a debugger, an operating system, and a text editor. In embedded development, the code cannot run on the host, so a different execution environment must be provided. The target code can run in a software simulation of the target microcontroller unit, or in a hardware environment, such as a simple development board, a stand-alone emulator, or perhaps the target circuit board modified for development.

Good technical support is obviously a key influence on your tool choice. Because a wide variety of tools are often used on a single project, it may not be clear whether a particular product is the culprit when problems arise. Consider how this environment is impacted by a proprietary OS. When choosing tools, such as an OS, check technical support references to understand whether the vendor will help support all design tools when the going gets tough. Is its technical support staff experienced, or does the firm employ new graduates or college interns for support? If the firm is a lifestyle company (a one- or two-person operation), where will your design team be if you need help while technical support is on vacation or sick leave?

Designers often worry too much about tool price instead of securing the best tools for the job. The true project cost includes time to market. Time-to-market addresses many aspects of the marketplace. The best reason to be first with a product is the additional income. This extra income comes from a higher price and more profit sustained for a longer time than the later entrants can achieve. The leader can afford to reduce prices as competitors enter the market, since the development costs are covered by the time the late entrants appear.
The consequences of not being first in the market are severe for those companies in the bottom half of the market. The classic market distribution for an industry: the first company has a 40 percent share of the market; the second, about 25 percent; the third, less than 15 percent; and the fourth, less than 10 percent of the market. As a result, the later market entrants have higher component costs, due to smaller volumes, and higher unit prices for competitive products. The lower prices and higher costs result in lower profits for the product line. Late entrants will also have to divert energy catching up with the leader(s). In addition, lower-tier companies face a loss of stature in the industry.

The ability to expedite both processes and materials and, most importantly, to continually accelerate the entire development process is critical. Shorter product life-cycles translate into shorter design cycles. The desire for new products with more features and performance requires more detailed design and analysis. These engineering requirements preclude taking short-cuts in the design process, yet time-to-market allows less time. Incremental and hierarchical design techniques have become increasingly important. Breaking a design into manageable segments allows us to put more total engineering resources into the design, and it also allows us to exploit concurrent
software and design techniques. Reuse can increase your productivity because there is less reinvention of the same thing. Reuse leverages your resources. By reusing existing components, you can focus on more advanced technology instead of recreating what has already been done. The essence of reuse is the capture and codification of knowledge; there is no way that engineers can reuse an entity without knowing its precise function and the parameters under which it operates. Because an operating system becomes closely associated with the CPU, having the source code, good documentation, and excellent technical support is very important. Using a commercial OS, such as DeltaOS, that is ported, tested, and used on a wide range of microcontrollers and processors is important for reuse of your valuable intellectual property. The growth rate of the real-time industry is directly limited by the productivity of its real-time software developers.
Chapter 3

DeltaCORE Real-Time Kernel

3.1 Introduction

3.1.1 What is DeltaOS?
DeltaOS is a modular, high-performance real-time operating system designed specifically for embedded microprocessors. It provides a complete multitasking environment based on open system standards. DeltaOS is designed to meet three overriding objectives:
• Performance
• Reliability
• Ease-of-use

The result is a fast, deterministic, yet accessible system software solution. The DeltaOS software is supported by an integrated set of cross-development tools that can reside on UNIX-, Windows- or DOS-based computers. These tools can communicate with a target over a serial or TCP/IP network connection.
3.1.2 System Architecture
The DeltaOS software employs a modular architecture. It is built around the DeltaCORE real-time multitasking kernel and a collection of companion software components. Software components are standard building blocks delivered as absolute, position-independent code modules. They are standard parts in the sense that they are unchanged from one application to another. This black box
technique eliminates maintenance by the user and assures reliability, because hundreds of applications execute the same, identical code. Unlike most system software, a software component is not wired down to a piece of hardware. It makes no assumptions about the execution/target environment. Each software component utilizes a user-supplied configuration table that contains application- and hardware-related parameters to configure itself at startup. Every component implements a logical collection of system calls. To the application developer, system calls appear as re-entrant C functions callable from an application. Any combination of components can be incorporated into a system to match the user's real-time design requirements. The DeltaOS components are listed below.

• DeltaCORE Real-time Multitasking Kernel. A field-proven, multitasking kernel that provides a responsive, efficient mechanism for coordinating the activities of the user's real-time system.
• DeltaNET TCP/IP Network Manager. A complete TCP/IP implementation including gateway routing, UDP, ARP, and ICMP protocols; uses a standard socket interface that includes stream, datagram, and raw sockets.
• DeltaFILE File System Manager. Gives efficient access to mass storage devices, both local and on a network. Includes support for CD-ROM devices, MS-DOS compatible floppy disks, and a high-speed proprietary file system.
• ANSI C Standard Library. Provides familiar ANSI C run-time functions such as printf(), scanf(), and so forth, in the target environment.

In addition to these core components, DeltaOS includes the following:
• Networking protocols including SNMP, FTP, Telnet, TFTP, NFS, and STREAMS
• Run-time loader
• User application shell
• Support for C++ applications
• Boot ROMs
• Pre-configured versions of DeltaOS for popular commercial hardware
• DeltaOS templates for custom configurations
• Chip-level device drivers
• Sample applications
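To make the configuration-table idea concrete, the sketch below models a component reading startup parameters from a user-supplied table. The structure, field names, and control-block sizes are invented for this illustration; they are not the actual DeltaCORE definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical component configuration table: each DeltaOS component
 * configures itself at startup from a user-supplied table like this.
 * All field names are illustrative only. */
typedef struct {
    unsigned max_tasks;       /* max tasks active at once */
    unsigned max_queues;      /* max message queues       */
    unsigned max_semaphores;  /* max semaphores           */
    unsigned ticks_per_sec;   /* clock tick frequency     */
} kernel_config_t;

/* Compute the control-block memory a kernel might reserve from such a
 * table, assuming fixed-size control blocks (sizes here are made up). */
size_t config_memory_needed(const kernel_config_t *cfg) {
    const size_t TCB_SIZE = 256, QCB_SIZE = 128, SCB_SIZE = 64;
    return cfg->max_tasks      * TCB_SIZE
         + cfg->max_queues     * QCB_SIZE
         + cfg->max_semaphores * SCB_SIZE;
}
```

Because the component reads everything it needs from the table, the same unmodified code module can serve very different applications and targets.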
3.1.3 Integrated Development Environment
The DeltaOS integrated cross-development environment can reside on a UNIX-, Windows- or DOS-based computer. It includes C and C++ optimizing compilers, a target CPU simulator, a DeltaCORE OS simulator, and a cross-debug solution that supports source- and system-level debugging. The DeltaOS debugging environment centers on the LambdaMONITOR system-level debugger and an optional high-level debugger. The high-level debugger executes on the user's host computer and works in conjunction with the LambdaMONITOR system-level debugger, which runs on a target system. The combination of the LambdaMONITOR debugger and optional host debugger provides a multitasking debug solution that features:
• A sophisticated mouse and window user interface.
• Automatic tracking of program execution through source code files.
• Traces and breaks on high-level language statements.
• Breaks on task state changes and operating system calls.
• Monitoring of language variables and system-level objects such as tasks, queues and semaphores[7].
• Profiling for performance tuning and analysis.
• System and task debug modes.
• The ability to debug optimized code.

The LambdaMONITOR debugger, in addition to acting as a back end for a high-level debugger on the host, can function as a stand-alone target-resident debugger that can accompany the final product to provide a field maintenance capability.
3.2 DeltaCORE Real-Time Kernel

3.2.1 Overview
Discussions in this section focus primarily on concepts relevant to a single-processor system. The DeltaCORE kernel is a real-time, multitasking operating system kernel. As such, it acts as a nucleus of supervisory software that:
• Performs services on demand
• Schedules, manages, and allocates resources
• Generally coordinates multiple, asynchronous activities

The DeltaCORE kernel maintains a highly simplified view of application software, irrespective of the application's inner complexities. To the DeltaCORE kernel, applications consist of three classes of program elements:
• Tasks
• I/O Device Drivers
• Interrupt Service Routines (ISRs)

Tasks, their virtual environment, and ISRs are the primary topics of discussion in this section. The I/O system and device drivers are discussed in another section. A multitasked system is dynamic because task switching is driven by temporal events. In a multitasking system, while tasks are internally synchronous, different tasks can execute asynchronously. A task can be stopped to allow execution to pass to another task at any time.
3.2.2 Multitasking Implementation
Thus, a multitasked implementation closely parallels the real world, which is mainly asynchronous and/or cyclical as far as real-time systems are concerned. Application software for multitasking systems is likely to be far more structured, race-free, maintainable, and reusable. Several DeltaCORE kernel attributes help solve the problems inherent in real-time software development. They include:
• Partitioning of actions into multiple tasks, each capable of executing in parallel (overlapping) with other tasks: the DeltaCORE kernel switches on cue between tasks, thus enabling applications to act asynchronously - in response to the outside world.
• Task prioritization. The DeltaCORE kernel always executes the highest priority task that can run.
• Task preemption. If an action is in progress and a higher priority external event occurs, the event's associated action takes over immediately.
• Powerful, race-free synchronization mechanisms available to applications, which include message queues, semaphores, multiple-wait events, and asynchronous signals.
• Timing functions, such as wake-up, alarm timers, and timeouts for servicing cyclical, external events.
Concept of a Task

From the system's perspective, a task is the smallest unit of execution that can compete on its own for system resources. A task lives in a virtual, insulated environment furnished by the DeltaCORE kernel. Within this space, a task can use system resources or wait for them to become available, if necessary, without explicit concern for other tasks. Resources include the CPU, I/O devices, memory space, and so on.

Conceptually, a task can execute concurrently with, and independently of, other tasks. The DeltaCORE kernel simply switches between different tasks on cue. The cues come by way of system calls to the DeltaCORE kernel. For example, a system call might cause the kernel to stop one task in mid-stream and continue another from its last stopping point. Although each task is a logically separate set of actions, it must coordinate and synchronize itself with actions in other tasks or with ISRs, by calling DeltaCORE system services.

Decomposition Criteria

The decomposition of a complex application into a set of tasks and ISRs is a matter of balance and trade-offs, but one which obviously impacts the degree of parallelism, and therefore efficiency, that can be achieved. Excessive decomposition exacts an inordinate amount of overhead activity in switching between the virtual environments of different tasks. Insufficient decomposition reduces throughput, because actions in each task proceed serially, whether they need to or not. There are no fixed rules for partitioning an application; the strategy used depends on the nature of the application. First of all, if an application involves multiple, independent main jobs (for example, control of N independent robots), then each job should have one or more tasks to itself. Within each job, however, the partitioning into multiple, cooperating tasks requires much more analysis and experience.
The following discussion presents a set of reasonably sufficient criteria whereby a job with multiple actions can be divided into separate tasks. Note that there are no necessary conditions for combining two tasks into one, though combining might result in a loss of efficiency or clarity. By the same token, a task can always be split into two, though perhaps with some loss of efficiency.

Terminology: In this discussion, a job is defined as a group of one or more tasks, and a task is defined as a group of one or more actions. An action (act) is a locus of instruction execution, often a loop. A dependent action (dact) is an action containing one and only one dependent condition; this condition requires the
action to wait until the condition is true, but the condition can only be made true by another dact.

Decomposition Criteria: Given a task with actions A and B, if any one of the following criteria is satisfied, then actions A and B should be in separate tasks:

Time - dact A and dact B are dependent on cyclical conditions that have different frequencies or phases.

Asynchrony - dact A and dact B are dependent on conditions that have no temporal relationships to each other.

Priority - dact A and dact B are dependent on conditions that require a different priority of attention.

Clarity/Maintainability - act A and act B are either functionally or logically removed from each other.

The DeltaCORE kernel imposes essentially no limit on the number of tasks that can coexist in an application. You simply specify in the DeltaCORE Configuration Table the maximum number of tasks expected to be active contemporaneously, and the DeltaCORE kernel allocates sufficient memory for the requisite system data structures to manage that many tasks.
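The Time criterion above can be illustrated with a toy simulation: two cyclic actions with different periods. Folded into one task, both would be forced onto a common rate; modeled as separate tasks, each fires at its own frequency. The periods, tick counts, and structure below are invented purely for illustration.

```c
#include <assert.h>

/* Toy model of the "Time" decomposition criterion: two cyclic actions
 * with different periods, each given its own (simulated) task so it
 * can run at its own frequency. Values are illustrative only. */
typedef struct {
    int period;  /* fire every 'period' ticks */
    int runs;    /* how many times the action has fired */
} cyclic_task_t;

/* Advance the simulated clock by one tick ('now' is the tick number);
 * fire every task whose period divides the current tick. */
void tick(cyclic_task_t *tasks, int ntasks, int now) {
    for (int i = 0; i < ntasks; i++)
        if (now % tasks[i].period == 0)
            tasks[i].runs++;
}
```

Driving ten ticks through a 2-tick task and a 5-tick task shows each running at its own rate, which a single combined loop could not do without awkward sub-scheduling.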
3.2.3 Overview of System Operations
DeltaCORE kernel services can be separated into the following categories:
• Task Management
• Storage Allocation
• Message Queue Services
• Event and Asynchronous Signal Services
• Semaphore Services
• Time Management and Timer Services
• Interrupt Completion Service
• Error Handling Service
• Multiprocessor Support Services
Detailed descriptions of each system call are provided in DeltaOS System Calls. The remainder of this chapter provides more details on the principles of DeltaCORE kernel operation and is highly recommended reading for first-time users of the DeltaCORE kernel.
Task States

A task can be in one of several execution states. A task's state can change only as a result of a system call made to the DeltaCORE kernel by the task itself, or by another task or ISR. From a macroscopic perspective, a multitasked application moves along by virtue of system calls into DeltaCORE, forcing the DeltaCORE kernel to change the states of affected tasks and, possibly as a result, switch from running one task to running another. Therefore, gaining a complete understanding of task states and state transitions is an important step towards using the DeltaCORE kernel properly and fully in the design of multitasked applications.

To the DeltaCORE kernel, a task does not exist either before it is created or after it is deleted. A created task must be started before it can execute. A created-but-unstarted task is therefore in an innocuous, embryonic state. Once started, a task generally resides in one of three states:
• Ready
• Running
• Blocked

A ready task is runnable (not blocked), and waits only for higher priority tasks to release the CPU. Because a task can be started only by a call from a running task, and there can be only one running task at any given instant, a new task always starts in the ready state. A running task is a ready task that has been given use of the CPU. There is always one and only one running task. In general, the running task has the highest priority among all ready tasks, unless the running task's preemption has been turned off. A task becomes blocked only as the result of some deliberate action on the part of the task itself, usually a system call that causes the calling task to wait. Thus, a task cannot go from the ready state to blocked, because only a running task can perform system calls.

State Transitions

Each state transition is described in detail below. Note the following abbreviations:
• E for Running (Executing)
• R for Ready
• B for Blocked

(E → B) A running task (E) becomes blocked when:
1. It requests a message (delta message queue receive/delta message queue vreceive with wait) from an empty message queue; or
2. It waits for an event condition (delta event receive with wait enabled) that is not presently pending; or
3. It requests a semaphore token (delta semaphore obtain with wait) that is not presently available; or
4. It requests memory (delta region get segment with wait) that is not presently available; or
5. It pauses for a time interval (delta timer wakeafter) or until a particular time (delta timer wake when).

(B → R) A blocked task (B) becomes ready when:
• A message arrives at the message queue (delta message queue send, delta message queue urgent/delta message queue vurgent, delta message queue broadcast/delta message queue vbroadcast) where B has been waiting, and B is first in that wait queue; or
• An event is sent to B (delta event send), fulfilling the event condition it has been waiting for; or
• A semaphore token is returned (delta semaphore release), and B is first in that wait queue; or
• Memory returned to the region (delta region return segment) allows a memory segment to be allocated to B; or
• B has been waiting with a timeout option for events, a message, a semaphore, or a memory segment, and that timeout interval expires; or
• B has been delayed, and its delay interval expires or its wakeup time arrives; or
• B is waiting at a message queue, semaphore or memory region, and that queue, semaphore or region is deleted by another task.

(B → E) A blocked task (B) becomes the running task when:
• Any one of the (B → R) conditions occurs, B has higher priority than the last running task, and the last running task has preemption enabled.

(R → E) A ready task (R) becomes running when the last running task (E):
• Blocks; or
• Re-enables preemption, and R has higher priority than E; or
• Has preemption enabled, and E changes its own, or R's, priority so that R now has higher priority than E and all other ready tasks; or
• Runs out of its timeslice, its roundrobin mode is enabled, and R has the same priority as E.

(E → R) The running task (E) becomes a ready task when:
• Any one of the (B → E) conditions occurs for a blocked task (B) as a result of a system call by E or an ISR; or
• Any one of conditions 2-4 of (R → E) occurs.

A fourth, but secondary, state is the suspended state. A suspended task cannot run until it is explicitly resumed. Suspension is very similar to blocking, but there are fundamental differences. First, a task can block only itself, but it can suspend other tasks as well as itself. Second, a blocked task can also be suspended. In this case, the effects are additive - the task must be both unblocked and resumed, in either order, before it can become ready or running.

Note: The task states discussed above should not be confused with the user and supervisor program states that exist on some processors. The latter are hardware states of privilege.

Task Scheduling

The DeltaCORE kernel employs a priority-based, preemptive scheduling algorithm. In general, the DeltaCORE kernel ensures that, at any point in time, the running task is the one with the highest priority among all ready-to-run tasks in the system. However, you can modify DeltaCORE scheduling behavior by selectively enabling and disabling preemption or timeslicing for one or more tasks. Each task has a mode word with two settable bits that can affect scheduling. One bit controls the task's preemptibility. If disabled, then once the task enters the running state, it will stay running even if other tasks of higher priority enter the ready state. A task switch will occur only if the running task blocks, or if it re-enables preemption. A second mode bit controls timeslicing.
If the running task’s timeslice bit is enabled, the DeltaCORE kernel automatically tracks how long the task has been running. When the task exceeds the predetermined timeslice, and other tasks with the same priority are ready to run, the DeltaCORE kernel switches to run one of those tasks. Timeslicing only affects scheduling among equal priority tasks.
Task Priority

A priority must be assigned to each task when it is created. There are 256 priority levels - 255 is the highest, 0 the lowest. Certain priority levels are reserved for use by special DeltaOS tasks. Level 0 is reserved for the IDLE daemon task furnished by the DeltaCORE kernel. Levels 240 - 255 are reserved for a variety of high priority tasks, including the DeltaCORE ROOT. A task's priority, including that of system tasks, can be changed at runtime by calling the delta task set priority system call.

When a task enters the ready state, the DeltaCORE kernel puts it into an indexed ready queue behind tasks of higher or equal priority. All ready queue operations, including insertions and removals, are achieved in fast, constant time. No search loop is needed.

During dispatch, when it is about to exit and return to the application code, the DeltaCORE kernel will normally run the task with the highest priority in the ready queue. If this is the same task that was last running, then the DeltaCORE kernel simply returns to it. Otherwise, the last running task must have either blocked, or one or more ready tasks now have higher priority. In the first (blocked) case, the DeltaCORE kernel will always switch to run the task currently at the top of the indexed ready queue. In the second case, technically known as preemption, the DeltaCORE kernel will also perform a task switch, unless the last running task has its preemption mode disabled, in which case the dispatcher has no choice but to return to it. Note that a running task can only be preempted by a task of higher or equal (if timeslicing is enabled) priority. Therefore, the assignment of priority levels is crucial in any application. A particular ready task cannot run unless all tasks with higher priority are blocked. By the same token, a running task can be preempted at any time, if an interrupt occurs and the attendant ISR unblocks a higher priority task.
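One classic way to achieve the constant-time, search-free ready-queue behavior described above is a bitmap of non-empty priority levels with a FIFO list per level. The sketch below shows the technique only; it is not DeltaCORE's actual data structure, and it uses 8 levels (with a simple count standing in for each FIFO list) instead of 256 to keep the bitmap in one byte.

```c
#include <assert.h>
#include <string.h>

/* Sketch of an indexed ready queue: bit p of the bitmap is set when
 * priority level p has at least one ready task. Illustrative only. */
#define NLEVELS 8

typedef struct {
    unsigned char bitmap;    /* bit p set => level p non-empty */
    int count[NLEVELS];      /* tasks ready at each level      */
} ready_queue_t;

void rq_insert(ready_queue_t *q, int prio) {
    q->count[prio]++;
    q->bitmap |= (unsigned char)(1u << prio);
}

void rq_remove(ready_queue_t *q, int prio) {
    if (q->count[prio] > 0 && --q->count[prio] == 0)
        q->bitmap &= (unsigned char)~(1u << prio);
}

/* Highest non-empty priority level, found by a constant-bound scan of
 * the bitmap (a real kernel would use a find-first-set instruction or
 * lookup table). Returns -1 if nothing is ready. */
int rq_highest(const ready_queue_t *q) {
    for (int p = NLEVELS - 1; p >= 0; p--)
        if (q->bitmap & (1u << p))
            return p;
    return -1;
}
```

Insertion, removal, and finding the top priority all take bounded time regardless of how many tasks are ready, which is what lets the dispatcher avoid any search loop.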
Roundrobin by Timeslicing

In addition to priority, the DeltaCORE kernel can use timeslicing to schedule task execution. However, timesliced (roundrobin) scheduling can be turned on and off on a per-task basis, and is always secondary to priority considerations. You can specify the timeslice quantum in the Configuration Table using the parameter kc ticks2slice. For example, if this value is 6 and the clock frequency (kc ticks2sec) is 60, a full slice is 1/10 second.

Each task carries a timeslice counter, initialized by the DeltaCORE kernel to the timeslice quantum when the task is created. Whenever a clock tick is announced to the DeltaCORE kernel, the DeltaCORE time manager decrements the running task's timeslice counter unless it is already 0. The timeslice counter is meaningless if the task's roundrobin bit or preemption bit is disabled. If the running task's roundrobin bit and preemption bit are enabled and its timeslice counter is 0, two outcomes are possible:

1. If all other presently ready tasks have lower priority, then no special scheduling takes place. The task's timeslice counter stays at zero as long as it stays in the running or ready state.

2. If one or more other tasks of the same priority are ready, the DeltaCORE kernel moves the running task from the running state into the ready state, and re-enters it into the indexed ready queue behind all other ready tasks of the same priority. This forces the DeltaCORE dispatcher to switch from that last running task to the task now at the top of the ready queue. The last running task's timeslice counter is given a full timeslice, in preparation for its next turn to run.

Regardless of whether or not its roundrobin mode bit is enabled, when a task becomes ready from the blocked state, the DeltaCORE kernel always inserts it into the indexed ready queue behind all tasks of higher or equal priority. At the same time, the task's timeslice counter is refreshed with a new, full count.

Note: The preemption mode bit takes precedence over roundrobin scheduling. If the running task has preemption disabled, then it will preclude roundrobin and continue to run. In general, real-time systems rarely require timeslicing, except to ensure that certain tasks will not inadvertently monopolize the CPU. Therefore, the DeltaCORE kernel by default initializes each task with the roundrobin mode disabled. For example, shared priority is often used to prevent mutual preemption among certain tasks, such as those that share non-reentrant critical regions. In such cases, roundrobin should be left disabled for all such related tasks, in order to prevent the DeltaCORE kernel from switching tasks in the midst of such a region.

To maximize efficiency, a task's roundrobin should be left disabled if:

1. It has a priority level to itself, or

2.
It shares its priority level with one or more other tasks, but roundrobin by timeslice among them is not necessary.

Manual Roundrobin

For certain applications, automatic roundrobin by timeslice might not be suitable. However, there might still be a need to perform roundrobin manually - that is, the running task might need to explicitly give up the CPU to other ready tasks of the same priority. The DeltaCORE kernel supports manual roundrobin via the delta timer wakeafter system call with a zero interval. If the running task is the only
ready task at that priority level, then the call simply returns to it. If there are one or more ready tasks at the same priority, then the DeltaCORE kernel will take the calling task from the running state into the ready state, thereby putting it behind all ready tasks of that priority. This forces the DeltaCORE kernel to switch from that last running task to another task of the same priority now at the head of the ready queue.

Dispatch Criteria

Dispatch refers to the exit stage of the DeltaCORE kernel, where it must decide which task to run upon exit; that is, whether it should continue with the running task, or switch to run another ready task. If the DeltaCORE kernel is entered because of a system call from a task, then it will always exit through the dispatcher, in order to catch up with any state transitions that might have been caused by the system call. For example, the calling task might have blocked itself, or made a higher priority blocked task ready. On the other hand, if the DeltaCORE kernel is entered because of a system call by an ISR, then it will not dispatch, but will instead return directly to the calling ISR, to allow the ISR to finish its duties. Because a system call from an ISR might have caused a state transition, such as readying a blocked task, a dispatch must be forced at some point. This is the reason for the I RETURN entry into the DeltaCORE kernel, which is used by an ISR to exit the interrupt service and at the same time allow the DeltaCORE kernel to execute a dispatch.

Objects, Names, and IDs

The DeltaCORE kernel is an object-oriented operating system kernel. Object classes include tasks, memory regions, memory partitions, message queues, and semaphores. Each object is created at runtime and known throughout the system by two identities - a pre-assigned name and a run-time ID.
An object's 32-bit (4 characters, if ASCII) name is user-assigned and passed to the DeltaCORE kernel as input to an Obj CREATE (e.g. delta task create) system call. The DeltaCORE kernel in turn generates and assigns a unique, 32-bit object ID (e.g. Tid) to the new object. Except for Obj IDENT (e.g. delta message queue ident) calls, all system calls that reference an object must use its ID. For example, a task is suspended using its Tid, a message is sent to a message queue using its Qid, and so forth. The run-time ID of an object is of course known to its creator task - it is returned by the Obj CREATE system call. Any other task that knows an object only by its user-assigned name can obtain its ID in one of two ways:

1. Use the system call Obj IDENT once with the object's name as input; the DeltaCORE kernel returns the object's ID, which can then be saved
Name                       Description
delta task create          Create a new task.
delta task ident           Get the ID of a task.
delta task start           Start a new task.
delta task restart         Restart a task.
delta task delete          Delete a task.
delta task suspend         Suspend a task.
delta task resume          Resume a suspended task.
delta task set priority    Change a task's priority.
delta task mode            Change calling task's mode bits.
delta task set note        Set a task's notepad register.
delta task get note        Get a task's notepad register.

Table 3.1: System Calls for Task Management
away.

2. Or, the object ID can be obtained from the parent task in one of several ways. For example, the parent can store the object's ID in a global variable - the Tid for task ABCD can be saved in a global variable named something like ABCD TID, for access by all other tasks.

An object's ID implicitly contains the location, even in a multiprocessor distributed system, of the object's control block (e.g. TCB or QCB), a structure used by the DeltaCORE kernel to manage and operate on the abstract object. Objects are truly dynamic - the binding of a named object to its reference handle is deferred to runtime. By analogy, the DeltaCORE kernel treats objects like files. A file is created by name, but to avoid searching, read and write operations use the file's ID returned by create or open. Thus, delta task create is analogous to File Create, and delta task ident to File Open.

As noted above, an object's name can be any 32-bit integer. However, it is customary to use four-character ASCII names, because ASCII names are more easily remembered, and DeltaOS debug tools will display an object name in ASCII, if possible.
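The four-characters-in-32-bits convention, and the name-to-ID lookup performed by the Obj IDENT family, can be sketched as follows. The packing order, registry structure, and function names here are assumptions for illustration; the real DeltaOS calls and internal tables are certainly different.

```c
#include <assert.h>

/* A four-character ASCII object name packed into a 32-bit integer,
 * plus a toy name->ID lookup in the spirit of the Obj IDENT calls.
 * Packing order and all names here are illustrative assumptions. */
typedef unsigned int u32;

u32 pack_name(const char s[4]) {
    return ((u32)(unsigned char)s[0] << 24) |
           ((u32)(unsigned char)s[1] << 16) |
           ((u32)(unsigned char)s[2] <<  8) |
            (u32)(unsigned char)s[3];
}

typedef struct { u32 name; u32 id; } obj_entry_t;

/* Look up an object's run-time ID by its creation name; returns 0
 * when the name is unknown. As the text notes, IDENT is the only
 * family of calls that accepts a name - every other call takes the
 * ID, so this lookup is done once and the result saved. */
u32 obj_ident(const obj_entry_t *tab, int n, u32 name) {
    for (int i = 0; i < n; i++)
        if (tab[i].name == name)
            return tab[i].id;
    return 0;
}
```

This also shows why ASCII names are convenient: "ABCD" packs to the readily recognizable hex value 0x41424344, which debug tools can unpack for display.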
3.2.4 Task Management

In general, task management provides dynamic creation and deletion of tasks, and control over task attributes. The available system calls in this group are listed in Table 3.1.
Creation of a Task

Task creation refers to two operations. The first is the actual creation of the task by the delta task create call. The second is making the task ready to run by the delta task start call. These two calls work in conjunction so the DeltaCORE kernel can schedule the task for execution and allow the task to compete for other system resources. Refer to DeltaOS System Calls for a description of delta task create and delta task start.

A parent task creates a child task by calling delta task create. The parent task passes the following input parameters to the child task:

• A user-assigned name
• A priority level for scheduling purposes
• Sizes for one or two stacks
• Several flags

Refer to the description of delta task create in DeltaOS System Calls for a description of the preceding parameters.

delta task create acquires and sets up a Task Control Block (TCB) for the child task, then allocates a memory segment (from Region 0) large enough for the task’s stack(s) and any necessary extensions. Extensions are extra memory areas required for optional features, for example:

• A floating point context save area for systems with co-processors
• Memory needed by other system components (such as DeltaFILE, DeltaNET, and so forth) to hold per-task data

This memory segment is linked to the TCB. delta task create returns a task identifier assigned by the DeltaCORE kernel.

The delta task start call must be used to complete the creation. delta task start supplies the starting address of the new task, a mode word that controls its initial execution behavior, and an optional argument list. Once started, the task is ready-to-run, and is scheduled for execution based on its assigned priority.

With two exceptions, all user tasks that form a multitasking application are created dynamically at runtime. One exception is the ROOT task, which is created and started by the DeltaCORE kernel as part of its startup initialization.
After startup, the DeltaCORE kernel simply passes control to the ROOT task. The other exception is the default IDLE task, also provided as part of startup. All other tasks are created by explicit system calls to the DeltaCORE kernel, when needed. In some designs, ROOT can initialize the rest of the application by creating all the other tasks at once. In other systems, ROOT might create a few tasks, which in turn can create a second layer of tasks, which in turn can create a third layer, and so on. The total number of active tasks in your system is limited by the kc ntask specification in the DeltaCORE Configuration Table.

The code segment of a task must be memory resident. It can be in ROM, or loaded into RAM either at startup or at the time of its creation. A task’s data area can be statically assigned, or dynamically requested from the DeltaCORE kernel.

Task Control Block

A task control block (TCB) is a system data structure allocated and maintained by the DeltaCORE kernel for each task after it has been created. A TCB contains everything the kernel needs to know about a task, including its name, priority, remainder of timeslice, and of course its context. Generally, context refers to the state of machine registers. When a task is running, its context is highly dynamic and is the actual contents of these registers. When the task is not running, its context is frozen and kept in the TCB, to be restored the next time it runs.

There are certain overhead structures within a TCB that are used by the DeltaCORE kernel to maintain it in various system-wide queues and structures. For example, a TCB might be in one of several queues – the ready queue, a message wait queue, a semaphore wait queue, or a memory region wait queue. It might additionally be in a timeout queue.

At DeltaCORE kernel startup, a fixed number of TCBs is allocated, reflecting the maximum number of concurrently active tasks specified in the DeltaCORE Configuration Table entry kc ntask. A TCB is allocated to each task when it is created, and is reclaimed for reuse when the task is deleted.

A task’s Tid contains, among other things, the encoded address of the task’s TCB. Thus, for system calls that supply Tid as input, the DeltaCORE kernel can quickly locate the target task’s TCB. By convention, a Tid value of 0 is an alias for the running task. Thus, if 0 is used as the Tid in a system call, the target will be the calling task’s TCB.
Task Mode Word

Each task carries a mode word that can be used to modify scheduling decisions or control its execution environment:

• Preemption Enabled/Disabled – If a task has preemption disabled, then so long as it is ready, the DeltaCORE kernel will continue to run it, even if there are higher priority tasks also ready.

• Roundrobin Enabled/Disabled.

• ASR Enabled/Disabled – Each task can have an Asynchronous Signal Service Routine (ASR), which must be established by the delta signal catch system call. Asynchronous signals behave much like software interrupts. If a task’s ASR is enabled, then a delta signal send system call directed at the task will force it to leave its expected execution path, execute the ASR, and then return to the expected execution path.

• Interrupt Control – Allows interrupts to be disabled while a task is running. On some processors, you can fine-tune interrupt control. Details are provided in the delta task mode() and delta task start() call descriptions in DeltaOS System Calls.

A task’s mode word is set up initially by the delta task start call and can be changed dynamically using the delta task mode call. Some processor versions of DeltaCORE place restrictions on which mode attributes can be changed by delta task mode(). To ensure correct operation of the application, you should avoid direct modification of the CPU control/status register. Use delta task mode for such purposes, so that the DeltaCORE kernel is correctly informed of such changes.

Task Stacks

Each task must have its own stack, or stacks. You declare the size of the stack(s) when you create the task using delta task create(). Details regarding processor-specific use of stacks are provided in the delta task create() call description of DeltaOS System Calls.

Task Memory

The DeltaCORE kernel allocates and maintains a task’s stack(s), but it has no explicit knowledge of a task’s code or data areas. For most applications, application code is memory resident prior to system startup, being either ROM resident or bootloaded. For some systems, a task can be brought into memory just before it is created or started; in which case, memory allocation and/or location sensitivity should be considered.

Death of a Task

A task can terminate itself, or another task. The delta task delete call removes a created task by reclaiming its TCB and returning the stack memory segment to Region 0. The TCB is marked as free, and can be reused by a new task.
The proper reclamation of resources such as segments, buffers, or semaphores should be an important part of task deletion. This is particularly true for dynamic applications, wherein parts of the system can be shut down and/or regenerated on demand. In general, delta task delete should only be used to perform self-deletion. The reason is simple: when used to forcibly delete another task, delta task delete denies that task a chance to perform any necessary cleanup work.

A preferable method is to use the delta task restart call, which forces a task back to its initial entry point. Because delta task restart can pass an optional argument list, the target task can use this to distinguish between a delta task start, a meaningful delta task restart, or a request for self-deletion. In the latter case, the task can return any allocated resources, execute any necessary cleanup code, and then gracefully call delta task delete to delete itself.

A deleted task ceases to exist insofar as the DeltaCORE kernel is concerned, and any references to it, whether by name or by Tid, will evoke an error return.

Notepad Registers

Each task has 16 software notepad 32-bit registers. They are carried in a task’s TCB, and can be set and read using the delta task set note and delta task get note calls, respectively. The purpose of these registers is to provide to each task, in a standard system-wide manner, a set of named variables that can be set and read by other tasks, including by remote tasks on other processor nodes. Eight of these notepad registers are reserved for system use. The remaining eight can be used for any application-specific purpose.

The Idle Task

At startup, the DeltaCORE kernel automatically creates and starts an idle task, named IDLE, whose sole purpose in life is to soak up CPU time when no other task can run. IDLE runs at priority 0 with a stack allocated from Region 0 whose size is equal to kc rootsst. On most processors, IDLE executes only an infinite loop. On some processors, DeltaCORE can be configured to call a user-defined routine when IDLE is executed. This user-defined routine can be used for purposes such as power conservation.

Though simple, IDLE is an important task. It must not be tampered with via delta task delete, delta task suspend, delta task set priority, or delta task mode, unless you have provided an equivalent task to fulfill this necessary idling function.
3.2.5 Storage Allocation
DeltaCORE storage management services provide dynamic allocation of both variable-size segments and fixed-size buffers. The system calls are listed in Table 3.2.

Name                            Description
delta region create             Create a memory region.
delta region ident              Get the ID of a memory region.
delta region delete             Delete a memory region.
delta region get segment        Allocate a segment from a region.
delta region return segment     Return a segment to a region.
delta partition create          Create a partition of buffers.
delta partition ident           Get the ID of a partition.
delta partition delete          Delete a partition of buffers.
delta partition get buffer      Get a buffer from a partition.
delta partition return buffer   Return a buffer to a partition.

Table 3.2: System Calls for Storage Management

Regions and Segments

A memory region is a user-defined, physically contiguous block of memory. Regions can possess distinctive implicit attributes. For example, one can reside in strictly local RAM, another in system-wide accessible RAM. Regions must be mutually disjoint and can otherwise be positioned on any long-word boundary.

Like tasks, regions are dynamic abstract objects managed by the DeltaCORE kernel. A region is created using the delta region create call with the following inputs – its user-assigned name, starting address and length, and unit size. The DeltaCORE system call delta region create returns a region ID (RNid) to the caller. For any other task that knows a region only by name, the delta region ident call can be used to obtain a named region’s RNid.

A segment is a variable-sized piece of memory from a memory region, allocated by the DeltaCORE kernel on the delta region get segment system call. Inputs to delta region get segment include a region ID, a segment size that might be anything, and an option to wait until there is sufficient free memory in the region. The delta region return segment call reclaims an allocated segment and returns it to a region.

A region can be deleted, although this is rarely used in a typical application. For one thing, deletion must be carefully considered, and is allowed by the DeltaCORE kernel only if there are no outstanding segments allocated from it, or if the delete override option was used when the region was created.

Special Region 0

The DeltaCORE kernel requires at least one region in order to function. This special region’s name is RN#0 and its ID is zero (0). The start address and length of this region are specified in the DeltaCORE Configuration Table. During DeltaCORE startup, the DeltaCORE kernel first carves a Data Segment from the beginning of Region 0 for its own data area and control structures such as TCBs, etc. The remaining block of Region 0 is used for task stacks, as well as any user delta region get segment calls.

The DeltaCORE kernel pre-allocates memory for its own use. That is, after startup, the DeltaCORE kernel makes no dynamic demands for memory. However, when the delta task create system call is used to create a new task, the DeltaCORE kernel will internally generate a delta region get segment call to obtain a segment from Region 0 to use as the task’s stack (or stacks in the case of certain processors). Similarly, when delta message queue vcreate is used to create a variable length message queue, the DeltaCORE kernel allocates a segment from Region 0 to store messages pending at the queue.

Note that the DeltaCORE kernel keeps track of each task’s stack segment and each variable length message queue’s message storage segment. When a task or variable length queue is deleted, the DeltaCORE kernel automatically reclaims the segment and returns it to Region 0. Like any memory region, your application can make delta region get segment and delta region return segment system calls to Region 0 to dynamically allocate and return variable-sized memory segments. Region 0, by default, queues any tasks waiting there for segment allocation in FIFO order.

Allocation Algorithm

The DeltaCORE kernel takes a piece at the beginning of the input memory area to use as the region’s control block (RNCB). The size of the RNCB varies, depending on the region size and its unit size parameter, described below.

Each memory region has a unit size parameter, specified as an input to delta region create. This region-specific parameter is the region’s smallest unit of allocation. This unit must be a power of 2, and greater than or equal to 16 bytes. Any segment allocated by delta region get segment is always a size equal to the nearest multiple of unit size. For example, if a region’s unit size is 32 bytes, and a delta region get segment call requests 130 bytes, then a segment with 5 units or 160 bytes will be allocated.
A region’s length cannot be greater than 32,767 times the unit size of the region. The unit size specification has a significant impact on (1) the efficiency of the allocation algorithm, and (2) the size of the region’s RNCB. The larger the unit size, the faster the delta region get segment and delta region return segment execution, and the smaller the RNCB.

The DeltaCORE region manager uses an efficient heap management algorithm. A region’s RNCB holds an allocation map and a heap structure used to manage an ordered list of free segments. By maintaining free segments in order of decreasing size, a delta region get segment call only needs to check the first such segment. If that segment is too small, then allocation is clearly impossible, and the caller can wait, wait with timeout, or return immediately with an error code. If the segment is large enough, then it will be split: one part is returned to the calling task, and the other part is re-entered into the heap structure. If the segment exactly equals the requested segment size, it will not be split.

When delta region return segment returns a segment, the DeltaCORE kernel always tries to merge it with its neighbor segments, if one or both of them happen to be free. Merging is fast, because the neighbor segments can be located without searching. The resulting segment is then re-entered into the heap structure.

Partitions and Buffers

A memory partition is a user-defined, physically contiguous block of memory, divided into a set of equal-sized buffers. Aside from having different buffer sizes, partitions can have distinctive implicit attributes. For example, one can reside in strictly local RAM, another in system-wide accessible RAM. Partitions must be mutually disjoint.

Like regions, partitions are dynamic abstract objects managed by the DeltaCORE kernel. A partition is created using the delta partition create call with the following inputs – its user-assigned name, starting address and length, and buffer size. The system call delta partition create returns a partition ID (PTid) assigned by the DeltaCORE kernel to the caller. For any other task that knows a partition only by name, the delta partition ident call can be used to obtain a named partition’s PTid.

The DeltaCORE kernel takes a small piece at the beginning of the input memory area to use as the partition’s control block (PTCB). The rest of the partition is organized as a pool of equal-sized buffers. Because of this simple organization, the delta partition get buffer and delta partition return buffer system calls are highly efficient.

A partition has the following limits – it must start on a long-word boundary, and its buffer size must be a power of 2, greater than or equal to 4 bytes. Partitions can be deleted, although this is rarely done in a typical application. For one thing, deletion must be carefully considered, and is allowed by the DeltaCORE kernel only if there are no outstanding buffers allocated from it.
Partitions can be used, in a tightly-coupled multiprocessor configuration, for efficient data exchange between processor nodes.
3.2.6 Communication, Synchronization and Mutual Exclusion
A DeltaCORE based application is generally partitioned into a set of tasks and interrupt service routines (ISRs). Conceptually, each task is a thread of independent actions that can execute concurrently with other tasks. However, cooperating tasks need to exchange data, synchronize actions, or share exclusive resources. To service task-to-task as well as ISR-to-task communication, synchronization, and mutual exclusion, the DeltaCORE kernel provides three sets of facilities – message queues, events, and semaphores.

Name                            Description
delta message queue create      Create a message queue.
delta message queue ident       Get the ID of a message queue.
delta message queue delete      Delete a message queue.
delta message queue receive     Get/wait for a message from a queue.
delta message queue send        Post a message at the end of a queue.
delta message queue urgent      Put a message at the head of a queue.
delta message queue broadcast   Broadcast a message to a queue.

Table 3.3: System Calls for Message Queue
3.2.7 The Message Queue
Message queues provide a highly flexible, general-purpose mechanism to implement communication and synchronization. The related system calls are listed in Table 3.3.

Like a task, a message queue is an abstract object, created dynamically using the delta message queue create system call. delta message queue create accepts as input a user-assigned name and several characteristics, including whether tasks waiting for messages there will wait first-in-first-out or by task priority, whether the message queue has a limited length, and whether a set of message buffers will be reserved for its private use.

A queue is not explicitly bound to any task. Logically, one or more tasks can send messages to a queue, and one or more tasks can request messages from it. A message queue therefore serves as a many-to-many communication switching station.

Consider this many-to-1 communication example. A server task can use a message queue as its input request queue. Several client tasks independently send request messages to this queue. The server task waits at this queue for input requests, processes them, and goes back for more – a single queue, single server implementation.

The number of message queues in your system is limited by the kc nqueue specification in the DeltaCORE Configuration Table.

A message queue can be deleted using the delta message queue delete system call. If one or more tasks are waiting there, they will be removed from the wait queue and returned to the ready state. When they run, each task will have returned from its respective delta message queue receive call with an error code (Queue Deleted). On the other hand, if there are messages posted at the queue, then the DeltaCORE kernel will reclaim the message buffers, and all message contents are lost.

The Queue Control Block

Like a Tid, a message queue’s Qid carries the location of the queue’s control block (QCB), even in a multiprocessor configuration. This is an important notion, because using the Qid to reference a message queue totally eliminates the need to search for its control structure. A QCB is allocated to a message queue when it is created, and reclaimed for re-use when it is deleted. This structure contains the queue’s name and ID, wait-queueing method, and message queue length and limit.

Queue Operations

A queue usually has two types of users – sources and sinks. A source posts messages, and can be a task or an ISR. A sink consumes messages, and can be another task or (with certain restrictions) an ISR.

There are three different ways to post a message – delta message queue send, delta message queue urgent, and delta message queue broadcast. When a message arrives at a queue and there is no task waiting, it is copied into a message buffer taken from either the shared pool or (if it has one) the queue’s private free buffer pool. The message buffer is then entered into the message queue. A delta message queue send call puts a message at the end of the message queue; delta message queue urgent inserts a message at the front of the message queue.

When a message arrives at a queue and there are one or more tasks already waiting there, the message will be given to the first task in the wait queue. No message buffer will be used. That task then leaves the queue, and becomes ready to run. The delta message queue broadcast system call broadcasts a message to all tasks waiting at a queue. This provides an efficient method to wake up multiple tasks with a single system call.

There is only one way to request a message from a queue – the delta message queue receive system call. If no message is pending, the task can elect to wait, wait with timeout, or return unconditionally.
If a task elects to wait, it will be queued either in first-in-first-out or in task priority order, depending on the specifications given when the queue was created. If the message queue is non-empty, then the first message in the queue will be returned to the caller. The message buffer that held that message is then released back to the shared pool or to the queue’s private free buffer pool.

Messages and Message Buffers

Messages are fixed length, consisting of four long words. A message’s content is entirely dependent on the application. It can be used to carry data, a pointer to data, a data size, the sender’s Tid, a response queue Qid, or some combination of the above. In the degenerate case where a message is used purely for synchronization, it might carry no information at all.

When a message arrives at a message queue and no task is waiting, the message must be copied into a message buffer that is then entered into the message queue. A DeltaCORE message buffer consists of five long words: four of the long words are the message, and one is a link field that links one message buffer to another.

At startup, the DeltaCORE kernel allocates a shared pool of free message buffers. The size of this pool is equal to the kc nmsgbuf entry in the DeltaCORE Configuration Table. A message queue can be created to use either a pool of buffers shared among many queues or its own private pool of buffers. In the first case, messages arriving at the queue will use free buffers from the shared pool on an as-needed basis. In the second case, a number of free buffers equal to the queue’s maximum length are taken from the shared pool and set aside for the private use of the message queue.

Name                 Description
delta event receive  Get or wait for events.
delta event send     Send events to a task.

Table 3.4: System Calls for Events
3.2.8 Events
The DeltaCORE kernel provides a set of synchronization-by-event facilities. Each task has 32 event flags it can wait on, bit-wise encoded in a 32-bit word. The high 16 bits are reserved for system use; the lower 16 event flags are user-definable.

Two DeltaCORE system calls provide synchronization by events between tasks, and between tasks and ISRs: delta event send is used to send one or more events to another task. With delta event receive, a task can wait for, with or without timeout, or request without waiting, one or more of its own events. One important feature of events is that a task can wait for one event, one of several events (OR), or all of several events (AND).

Event Operations

Events are independent of each other. The delta event receive call permits synchronization to the arrival of one or more events, qualified by an AND or OR condition. If all the required event bits are on (i.e. pending), then the delta event receive call resets them and returns immediately. Otherwise, the task can elect to return immediately, or block until the desired event(s) have been received.

A task or ISR can send one or more events to another task. If the target task is not waiting for any event, or if it is waiting for events other than those being sent, delta event send simply turns the event bit(s) on, which makes the events pending. If the target task is waiting for some or all of the events being sent, then those arriving events that match are used to satisfy the waiting task. The other, non-matching events are made pending, as before. If the requisite event condition is now completely satisfied, the task is unblocked and made ready-to-run; otherwise, the wait continues for the remaining events.

Events Versus Messages

Events differ from messages in the following sense:

• An event can be used to synchronize with a task, but it cannot directly carry any information.

• Topologically, events are sent point to point. That is, they explicitly identify the receiving task. A message, on the other hand, is sent to a message queue. In a multi-receiver case, a message sender does not necessarily know which task will receive the message.

• One delta event receive call can condition the caller to wait for multiple events. delta message queue receive, on the other hand, can only wait for one message from one queue.

• Messages are automatically buffered and queued. Events are neither counted nor queued. If an event is already pending when a second, identical one is sent to the same task, the second event will have no effect.
3.2.9 Semaphores
The DeltaCORE kernel provides a set of familiar semaphore operations. In general, they are most useful as resource tokens in implementing mutual exclusion. The related system calls are listed in Table 3.5.

Like a message queue, a semaphore is an abstract object, created dynamically using the delta semaphore create system call. delta semaphore create accepts as input a user-assigned name, an initial count, and several characteristics, including whether tasks waiting for the semaphore will wait first-in-first-out or by task priority. The initial count parameter should reflect the number of available “tokens” at the semaphore. delta semaphore create assigns a unique ID, the SMid, to each semaphore.

The number of semaphores in your system is limited by the kc nsema4 specification in the DeltaCORE Configuration Table.

Name                      Description
delta semaphore create    Create a semaphore.
delta semaphore ident     Get the ID of a semaphore.
delta semaphore delete    Delete a semaphore.
delta semaphore obtain    Get / wait for a semaphore token.
delta semaphore release   Return a semaphore token.

Table 3.5: System Calls for Semaphores
A semaphore can be deleted using the delta semaphore delete system call. If one or more tasks are waiting there, they will be removed from the wait queue and returned to the ready state. When they run, each task will have returned from its respective delta semaphore obtain call with an error code (Semaphore Deleted).
The Semaphore Control Block

Like a Qid, a semaphore’s SMid carries the location of the semaphore control block (SMCB), even in a multiprocessor configuration. This is an important notion, because using the SMid to reference a semaphore completely eliminates the need to search for its control structure. An SMCB is allocated to a semaphore when it is created, and reclaimed for re-use when it is deleted. This structure contains the semaphore’s name and ID, the token count, and wait-queueing method. It also contains the head and tail of a doubly linked task wait queue.
Semaphore Operations

The DeltaCORE kernel supports the traditional P and V semaphore primitives. The delta semaphore obtain call requests a token. If the semaphore token count is non-zero, then delta semaphore obtain decrements the count and the operation is successful. If the count is zero, then the caller can elect to wait, wait with timeout, or return unconditionally. If a task elects to wait, it will be queued either in first-in-first-out or in task priority order, depending on the specifications given when the semaphore was created.

The delta semaphore release call returns a semaphore token. If no tasks are waiting at the semaphore, then delta semaphore release increments the semaphore token count. If tasks are waiting, then the first task in the semaphore’s wait list is released from the list and made ready to run.
Name                  Description
delta signal catch    Establish a task’s ASR.
delta signal send     Send signals to a task.
delta signal return   Return from an ASR.

Table 3.6: System Calls for Asynchronous Signals
3.2.10 Asynchronous Signals
Each task can optionally have an Asynchronous Signal Service Routine (ASR). The ASR’s purpose is to allow a task to have two asynchronous parts – a main body and an ASR. In essence, just as one task can execute asynchronously from another task, an ASR provides a similar capability within a task. Using signals, one task or ISR can selectively force another task out of its normal locus of execution – that is, from the task’s main body into its ASR. Signals provide a “software interrupt” mechanism.

This asynchronous communications capability is invaluable to many system designs. Without it, workarounds must depend on synchronous services such as messages or events, which, even if possible, suffer a great loss in efficiency. The three related system calls are listed in Table 3.6.

An asynchronous signal is a user-defined condition. Each task has 32 signals, encoded bit-wise in a long word. To receive signals, a task must establish an ASR using the delta signal catch call. The delta signal send call can be used to send one or more asynchronous signals to a task, thereby forcing the task, the next time it is dispatched, to first go to its ASR. At the end of an ASR, a call to delta signal return allows the DeltaCORE kernel to return the task to its original point of execution.

The ASR

A task can have only one active ASR, established using the delta signal catch call. A task’s ASR executes in the task’s context – from the outside, it is not possible to discern whether a task is executing in its main code body or its ASR. The delta signal catch call supplies both the ASR’s starting address and its initial mode of execution. This mode replaces the mode of the task’s main code body as long as the ASR is executing. It is used to control the ASR’s execution behavior, including whether it is preemptible and whether or not further asynchronous signals are accepted. Typically, ASRs execute with asynchronous signals disabled; otherwise, the ASR must be programmed to handle re-entrancy.
The details of how an ASR gains control are processor-specific; this information can be found in the description of delta signal catch in DeltaOS System Calls.
A task can disable and enable its ASR selectively by calling delta task mode. Any signals received while a task’s ASR is disabled are left pending. When re-enabled, an ASR will receive control if there are any pending signals.

Asynchronous Signal Operations

The delta signal send call makes the specified signals pending at the target task, without affecting its state or when it will run. If the target task is not the running task, its ASR takes over only when it is next dispatched to run. If the target is the running task, which is possible only if the signals are sent by the task itself or, more likely, by an ISR, then the running task’s course changes immediately to the ASR.

Signals Versus Events

Despite their resemblance, asynchronous signals are fundamentally different from events, as follows:

• To synchronize to an event, a task must explicitly call delta event receive. delta event send by itself has no effect on the receiving task’s state. By contrast, delta signal send can unilaterally force the receiving task to execute its ASR.

• From the perspective of the receiving task, response to events is synchronous; it occurs only after a successful delta event receive call. Response to signals is asynchronous; it can happen at any point in the task’s execution. Note that, while this involuntary-response behavior is by design, it can be modified to some extent by using delta task mode to disable (i.e. postpone) asynchronous signal processing.
3.2.11
Time Management
Time management provides the following functions:
• Maintain calendar time and date.
• Time out (optionally) a task that is waiting for messages, semaphores, events or segments.
• Wake up or send an alarm to a task after a designated interval or at an appointed time.
• Track the running task's timeslice, and mechanize round-robin scheduling.
These functions depend on periodic timer interrupts, and will not work in the absence of a real-time clock or timer hardware. The explicit time management system calls are:
Name                     Description
delta clock tick         Inform the DeltaCORE kernel of clock tick arrival.
delta clock set          Set time and date.
delta clock get          Get time and date.
delta timer wake after   Wake up task after interval.
delta timer wake when    Wake up task at appointed time.
delta timer fire after   Send events to task after interval.
delta timer fire when    Send events to task at appointed time.
delta timer cancel       Cancel an alarm timer.

Table 3.7: System Calls for Time Management
The Time Unit
The system time unit is a clock tick, defined as the interval between delta clock tick system calls. This call is used to announce to the DeltaCORE kernel the arrival of a clock tick; it is normally called from the real-time clock ISR on each timer interrupt. The frequency of delta clock tick determines the granularity of the system time-base. Obviously, the higher the frequency, the higher the time resolution for timeouts, etc. On the other hand, processing each clock tick takes a small amount of system overhead.

You can specify this clock tick frequency in the DeltaCORE Configuration Table as kc ticks2sec. For example, if this value is specified as 100, the system time manager will interpret 100 delta clock tick system calls to be one second, real-time.

Time and Date
The DeltaCORE kernel maintains true calendar time and date, including perpetual leap year compensation. Two DeltaCORE system calls, delta clock set and delta clock get, allow you to set and obtain the date and time of day. Time resolution is accurate to system time ticks. No elapsed tick counter is included, because this can be easily maintained by your own code. For example, your real-time clock ISR can, in addition to calling delta clock tick on each clock interrupt, increment a 32-bit global counter variable.

Timeouts
Implicitly, the DeltaCORE kernel uses the time manager to provide a timeout facility to other system calls, e.g. delta message queue receive, delta message queue vreceive, delta event receive, delta semaphore obtain, and
delta region get segment. The DeltaCORE kernel uses a proprietary timing structure and algorithm, which, in addition to being efficient, guarantees constant-time operations. Both task entry into and removal from the timeout state are performed in constant time; no search loops are required.

If a task is waiting, say for a message (delta message queue receive) with timeout, and the message arrives in time, then the task is simply removed from the timing structure, given the message, and made ready to run. If the message does not arrive before the time interval expires, then the task is given an error code indicating timeout, and made ready to run.

Timeout is measured in ticks. If kc ticks2sec is 100, and an interval of 50 milliseconds is required, then a value of 5 should be specified. Timeout intervals are 32 bits wide, allowing a maximum of 2^32 ticks. A timeout value of n will expire on the nth forthcoming tick. Because the system call can happen anywhere between two ticks, this implies that the real-time interval will be between n-1 and n ticks.

Absolute Versus Relative Timing
There are two ways a task can specify timing: relative or absolute. Relative timing is specified as an interval, measured in ticks. Absolute timing is specified as an appointed calendar date and time. The system calls delta timer wake after and delta timer fire after accept relative timing specifications. The system calls delta timer wake when and delta timer fire when accept absolute time specifications. Note that absolute timing is affected by any delta clock set calls that change the calendar date and time, whereas relative timings are not. In addition, use of absolute time specifications might require additional time manipulations.

Wakeups Versus Alarms
There are two distinct ways a task can respond to timing. The first way is to go to sleep (i.e. block), and wake up at the desired time.
This synchronous method is supported by the delta timer wake after and delta timer wake when calls. The second way is to set an alarm timer, and then continue running. This asynchronous method is supported by delta timer fire after and delta timer fire when. When the alarm timer goes off, the DeltaCORE kernel will internally call delta event send to send the designated events to the task. Of course, the task must call delta event receive in order to test or wait for the scheduled event. Alarm timers offer several interesting features. First, the calling task can execute while the timer is counting down. Second, a task can arm more than
one alarm timer, each set to go off at a different time, corresponding to multiple expected conditions. This multiple-alarm capability is especially useful in implementing nested timers, a common requirement in more sophisticated communications systems. Third, alarm timers can be cancelled using the delta timer cancel call. In essence, the wakeup mechanism is useful only in timing an entire task; the alarm mechanism can be used to time transactions within a task.

Timeslice
If the running task's mode word has its roundrobin bit and preemptible bit on, then the DeltaCORE kernel will count down the task's assigned timeslice. If it is still running when its timeslice is down to zero, then round-robin scheduling will take place. You can specify the amount of time that constitutes a full timeslice in the DeltaCORE Configuration Table as kc ticks2slice. For instance, if that value is 10, and kc ticks2sec is 100, then a full timeslice is equivalent to about one-tenth of a second. The countdown or consumption of a timeslice is somewhat heuristic in nature, and might not exactly reflect the actual elapsed time a task has been running.
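The tick arithmetic used throughout this section (converting a real-time interval into a tick count under kc ticks2sec) can be sketched in plain C++. The function name and the round-up policy are our own illustration, not a DeltaOS call:

```cpp
#include <cassert>
#include <cstdint>

// Convert a millisecond interval to a timeout value in ticks, given the
// configured tick rate (kc ticks2sec). With kc_ticks2sec = 100 each tick
// is 10 ms, so 50 ms corresponds to a timeout value of 5. We round up so
// the wait is never shorter than requested; recall the text's caveat that
// a value of n actually elapses after between n-1 and n real-time ticks.
std::uint32_t ms_to_ticks(std::uint32_t ms, std::uint32_t kc_ticks2sec) {
    // ticks = ceil(ms * ticks_per_second / 1000)
    std::uint64_t num = static_cast<std::uint64_t>(ms) * kc_ticks2sec;
    return static_cast<std::uint32_t>((num + 999) / 1000);
}
```

For example, ms_to_ticks(50, 100) yields 5, matching the worked example in the Timeouts paragraph above.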
3.2.12
Interrupt Service Routines
Interrupt service routines (ISRs) are critical to any real-time system. On one side, an ISR handles interrupts, and performs whatever minimum action is required to reset a device, read/write some data, etc. On the other side, an ISR might drive one or more tasks, and cause them to respond to, and process, the conditions related to the interrupt.

An ISR's operation should be kept as brief as possible, in order to minimize masking of other interrupts at the same or lower levels. Normally, it simply clears the interrupt condition and performs the necessary physical data transfer. Any additional handling of the data should be deferred to an associated task with the appropriate (software) priority. This task can synchronize its actions to the occurrence of a hardware interrupt by using a message queue, event flags, semaphores, or an ASR.

Interrupt Entry
On ColdFire, PowerPC, MIPS, and x86 processors, interrupts should be directly vectored to the user-supplied ISRs. As early as possible, the ISR should call the I ENTER entry in the DeltaCORE kernel. I ENTER sets an internal flag to indicate that an interrupt is being serviced and then returns to the ISR.
Synchronizing With Tasks
An ISR usually communicates with one or more tasks, either directly, or indirectly as part of its input/output transactions. The nature of this communication is usually to drive a task, forcing it to run and handle the interrupting condition. This is similar to task-to-task communication or synchronization, with two important differences.

First, an ISR is usually a communication/synchronization source: it often needs to return a semaphore, or send a message or an event to a task. An ISR is rarely a communication sink; it cannot wait for a message or an event. Second, a system call made from an ISR will always return immediately to the ISR, without going through the normal DeltaCORE dispatch. For example, even if an ISR sends a message and wakes up a high priority task, the DeltaCORE kernel must nevertheless return first to the ISR. This deferred dispatching is necessary, because the ISR must be allowed to complete.

The DeltaCORE kernel allows an ISR to make any of the synchronization sourcing system calls, including delta message queue send, delta message queue urgent and delta message queue broadcast to post messages to message queues, delta semaphore release to return a semaphore, and delta event send to send events to tasks.

A typical system implementation, for example, can use a message queue for this ISR-to-task communication. A task requests and waits for a message at the queue. An ISR sends a message to the queue, thereby unblocking the task and making it ready to run. The ISR then exits using the I RETURN entry into the DeltaCORE kernel. Among other things, I RETURN causes the DeltaCORE kernel to dispatch the highest priority task, which can be the interrupted running task, or the task just awakened by the ISR. The message, as usual, can be used to carry data or pointers to data, or for synchronization.

In some applications, an ISR might additionally need to dequeue messages from a message queue.
For example, a message queue might be used to hold a chain of commands. Tasks needing service send command messages to the queue. When an ISR finishes one command, it checks to see if the command chain is now empty. If not, it dequeues the next command in the chain and starts it. To support this type of implementation, the DeltaCORE kernel allows an ISR to make delta message queue receive system calls to obtain messages from a queue, and delta semaphore obtain calls to acquire a semaphore. Note, however, that these calls must use the "no-wait" option, so that the call will return whether or not a message or semaphore is available.

System Calls Allowed From an ISR
The restricted subset of DeltaCORE system calls that can be issued from an ISR is listed below. Conditions necessary for the call to be issued from an ISR are in parentheses.

Name                             Description
delta signal send                Send asynchronous signals to a task (local task).
delta event send                 Send events to a task (local task).
delta partition get buffer       Get a buffer from a partition (local partition).
delta partition return buffer    Return a buffer to a partition (local partition).
delta message queue broadcast    Broadcast a message to an ordinary queue (local queue).
delta message queue receive      Get a message from an ordinary message queue (no-wait and local queue).
delta message queue send         Post a message to end of an ordinary message queue (local queue).
delta message queue urgent       Post a message at head of an ordinary message queue (local queue).
delta message queue vbroadcast   Broadcast a variable length message to queue (local queue).
delta message queue vreceive     Get a message from a variable length message queue (no-wait and local queue).
delta message queue vsend        Post a message to end of a variable length message queue (local queue).
delta message queue vurgent      Post a message at head of a variable length message queue (local queue).
delta semaphore obtain           Acquire a semaphore (no-wait and local semaphore).
delta semaphore release          Return a semaphore (local semaphore).
delta task get note              Get a task's software register (local task).
delta task resume                Resume a suspended task (local task).
delta task set note              Set a task's software register (local task).
delta clock get                  Get time and date.
delta clock set                  Set time and date.
delta clock tick                 Announce a clock tick to the DeltaCORE kernel.

Table 3.8: System Calls for ISR

As noted earlier, because an ISR cannot block, a delta message queue receive, delta message queue vreceive, or delta semaphore obtain call from an ISR must use the no-wait, i.e. unconditional return, option. Also, because remote service calls block, the above services can only be called from an ISR if the referenced object is local. All other DeltaCORE system calls are either not meaningful in the context of an ISR, or can be functionally served by another system call. Making calls not listed above from an ISR will lead to dangerous race conditions and unpredictable results.
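The no-wait receive discipline can be modeled in ordinary C++. The following is a simulation of the pattern only; the type and method names are ours, not DeltaOS calls:

```cpp
#include <cassert>
#include <deque>
#include <optional>

// Simulated command queue shared between tasks and an ISR. Tasks enqueue
// commands; the ISR, which must never block, polls with a no-wait receive
// that returns immediately whether or not a command is available.
struct CommandQueue {
    std::deque<int> q;

    // Task side: post a command to the rear of the queue.
    void send(int cmd) { q.push_back(cmd); }

    // ISR side: the analogue of delta message queue receive with the
    // no-wait option; returns std::nullopt instead of blocking.
    std::optional<int> receive_nowait() {
        if (q.empty()) return std::nullopt;
        int cmd = q.front();
        q.pop_front();
        return cmd;
    }
};
```

In a real system the empty-queue result tells the ISR that the command chain is exhausted, so it simply returns via I RETURN instead of starting another transfer.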
3.2.13
Tasks Using Other Components
CoreTek Systems offers many other system components that can be used in systems with the DeltaCORE kernel. While these components are easy to install and use, they require special consideration with respect to their internal resources and multitasking.

During normal operation, components internally allocate and hold resources on behalf of calling tasks. Some resources are held only during execution of a service call. Others are held indefinitely, depending on the state of the task. In the DeltaFILE component, for example, control information is kept whenever files are open. The DeltaCORE service calls delta task restart and delta task delete asynchronously alter the execution path of a task and present special problems relative to management of these resources. The subsections that follow discuss deletion- and restart-related issues in detail and present recommended methods for performing these operations.

Deleting Tasks That Use Components
To avoid permanent loss of component resources, the DeltaCORE kernel does not allow deletion of a task that is holding any such resource. Instead, delta task delete returns an error code, which indicates that the task to be deleted holds one or more resources. The exact conditions under which components hold resources are complex. In general, any task that has made a component service call might be holding resources. But all components provide a facility for returning all of their task-related resources via a single service call. We recommend that these calls be made prior to calling delta task delete. The DeltaFILE and DeltaNET components can hold resources that must be returned before a task can be deleted; these resources are returned by calling close f(0), close(0) and fclose(0), and free(-1), respectively. Obviously, calls to components not in use should be omitted.
Because only the task to be deleted can make the necessary close calls, the simplest way to delete a task is to restart the task, passing arguments to it that indicate that the task should delete itself. (Of course, the task code must be written to check its arguments and behave accordingly.)
Restarting Tasks That Use Components
The DeltaCORE kernel allows a task to be restarted regardless of its current state. Check the sections in this manual for each component to determine its behavior on task restart. It is possible to restart a task while the task is executing code within the components themselves. Consider the following example:
1. Task A makes a DeltaFILE call.
2. While executing DeltaFILE code, task A is preempted by task B.
3. Task B then restarts task A.
In such situations, the DeltaFILE component will correctly return resources as required. However, a file system volume might be left in an inconsistent state. For example, if delta task restart interrupts a create f operation, a file descriptor (FD) might have been allocated but not the directory entry. As a result, an FD could be permanently lost. The DeltaFILE component is aware of this danger, and returns a warning via the delta task restart call. All components are notified of task restarts, so expect such warnings from any of them.
Part II
Data Structures in DeltaOS
Chapter 4
Introduction
For many computer programmers, only the first few years of formal education seem to be formative ones. After that time they become captive to the concepts acquired first. The programming challenges of those early days remain those of the future. The languages and techniques they learned yesterday become the governing criteria for their very basic thought processes of tomorrow. They are slaves not to the technology of the present, but to that of the past. The harm to the computing industry arising from this situation is compounded by the alarming fact that no one seems to leave the computing profession. In few other areas is the Peter Principle so evident: everyone rises to his level of incompetence and then stays there, doing what he is least competent to do.

The area in which this situation has its worst effect is in the choice of data organization to use in setting up particular applications. The choice of ways in which to structure collections of data for use with programs is today overwhelming, not just in theory, but even under constraints of time and space efficiency. It has become generally practicable to organize data according to its logical structure and according to the natural interrelationships among its components; one must still of course take cognizance of performance pragmatics, but logical requirements need no longer be so distorted in deference to physical considerations. Good system support for complex data structures is available today, where it was not before. It is absolutely essential that this new freedom of representation be exploited.

We emphasize this plea with the claim that data organization, more than programming techniques, provides the bounds on the quality of applications produced. It is difficult to write a decent computer program whose data are structured in ways artificial and unnatural to their logical characteristics.
Similarly, it is difficult to write a really bad program if its data are well organized according to their natural structure. The organization of a collection of data dictates to an extreme extent what can and will be done to the data.
With this view of the paramount importance of data, we present here a discussion of various data structures widely used in embedded programming. The general sources most useful in preparation of this part of the book are [15, 16, 2, 25, 27], and [1].
Chapter 5
Static Data Structures
We begin by investigating data structures whose organizational characteristics are invariant throughout their lifetime. Such structures are the ones most commonly used by embedded engineers today, because of widespread support within high-level programming languages. The structures we shall look at in this chapter are all so unchanging in their characteristics that it is possible for compilers to produce highly efficient machine code for accessing their parts when a program is executed.
5.1
Sequential Lists
Shopping items, a bus schedule, a telephone directory, tax tables, and inventory records are examples of lists. In each case, the object is a sequence of items. Many applications involve maintaining a list. For example, a business inventory maintains supply and reordering information, a personnel office creates payroll information for the list of company employees, a compiler stores its keywords in a list of reserved words, and so forth. The basic list operations include inserting a new item at the rear of the list, deleting an item, accessing an item in the list by position, and clearing the list. We also have operations to test whether the list is empty and whether an item is in the list.
5.1.1
Specification of Lists
We now present the specification of the sequential list class SeqList.

const int MaxListSize = 50;
typedef int DataType;
class SeqList
{
private:
    // List storage array and number of current list elements
    DataType listitem[MaxListSize];
    int size;
public:
    // Constructor
    SeqList(void);
    // List access methods
    int ListSize(void) const;
    int ListEmpty(void) const;
    int Find(DataType& item) const;
    DataType GetData(int pos) const;
    // List modification methods
    void Insert(const DataType& item);
    void Delete(const DataType& item);
    DataType DeleteFront(void);
    void ClearList(void);
};
The name DataType is used to represent a general data type. Before including the class from the file, use typedef to equate the name DataType with a specific type. The variable size maintains the current size of the list. Initially size is set to 0. Since a static array is used to implement the list, the constant MaxListSize is the upper bound for the size of the list. An attempt to insert more than MaxListSize elements into the list causes an error message and program termination.
5.1.2
Implementation of Lists
The implementation of the SeqList class uses the array listitem to store the data. The array allocates storage for MaxListSize items of type DataType. The number of elements in the list is maintained in the private data member size, which supports the Insert and Delete operations. The value of size is the focus of the constructor and the methods ListSize, ListEmpty, and ClearList. The constructor sets size to 0.

// Constructor. Set size to 0
SeqList::SeqList(void) : size(0)
{}
5.1. SEQUENTIAL LISTS
55
List Modification Methods
Insert adds a new element at the rear of the list and increases the length by 1. If insertion of an item would exceed the size of the array listitem, the method terminates the program immediately. The parameter item is passed as a reference to a constant. If the size of DataType is large, the use of a reference parameter avoids the inefficient data copying that is required by a call-by-value parameter. The keyword const assures that the actual parameter cannot be modified. This same type of parameter passing is used by the method Delete.

// Insert item at the rear of the list. Terminate the program
// if the list size would exceed MaxListSize.
void SeqList::Insert(const DataType& item)
{
    // Will an insertion exceed the maximum list size allowed?
    if (size+1 > MaxListSize)
    {
        exit(1);
    }
    // Index of rear is current value of size; insert at rear
    listitem[size] = item;
    size++;            // Increment list size
}
The Delete method searches for the first occurrence of the data value item in the list. The function requires the relational equality operator == to be defined for DataType. In some cases, this may require that the user provide a special function that redefines the operator == for DataType. If item is not found, the operation quietly concludes without changing the list. If item is found at index i, it is removed from the list by shifting all elements with indices i+1 to the end of the list left one position.

// Search for item in the list and delete it if found
void SeqList::Delete(const DataType& item)
{
    int i = 0;
    // Search for item
    while (i < size && !(item == listitem[i]))
        i++;
    if (i < size)        // Successful if i < size
    {
        // Shift the tail of the list to the left one position
        while (i < size-1)
        {
            listitem[i] = listitem[i+1];
            i++;
        }
        size--;          // Decrement size
    }
}
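The left-shift step used by Delete can be exercised on a bare array. This is a small self-contained check of the shifting logic, independent of the class:

```cpp
#include <cassert>

// Remove the first occurrence of item from arr[0..n-1] by shifting the
// tail left one position, exactly as SeqList::Delete does. Returns the
// new element count (unchanged if item is absent).
int delete_first(int arr[], int n, int item) {
    int i = 0;
    while (i < n && arr[i] != item) i++;   // locate item
    if (i == n) return n;                  // not found: array untouched
    while (i < n - 1) {                    // shift tail left one position
        arr[i] = arr[i + 1];
        i++;
    }
    return n - 1;
}
```

Note that only the first occurrence is removed; a later duplicate of item stays in the list, just as with SeqList::Delete.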
We may also define a method for deleting the item at the front of the list.

// Delete the element at the front of the list and return its value.
// Terminate the program with an error message if the list is empty.
DataType SeqList::DeleteFront(void)
{
    DataType frontItem;
    // List is empty if size == 0
    if (size == 0)
    {
        exit(1);
    }
    frontItem = listitem[0];   // Get value from position 0
    Delete(frontItem);         // Delete the first item and shift terms
    return frontItem;          // Return the original value
}
The ClearList method is defined by setting size to 0.

// Clear the list by setting size to 0
void SeqList::ClearList(void)
{
    size = 0;
}
List Access Methods
The access method ListSize returns the number of elements in the list.

// Return number of elements in list
int SeqList::ListSize(void) const
{
    return size;
}
The access method Find takes a parameter that serves as the key and sequentially scans the list looking for a match. If the list is empty or no match is found, Find returns 0 (False). If the item is located in the list at index i, Find assigns the data record listitem[i] to the reference parameter item and returns 1 (True). On a match, the assignment of the list element to the parameter is critical in applications involving data records. For instance, assume DataType is a struct with a key field and a value field, and that the == operator tests only the key field. On input, the parameter item may define only the key field. On output, item is assigned both the key and value fields.
// Take item as key and search the list. Return True if item
// is in the list and False otherwise. If found,
// assign the list element to the reference parameter item
int SeqList::Find(DataType& item) const
{
    int i = 0;
    if (ListEmpty())
        return 0;             // Return False when list empty
    while (i < size && !(item == listitem[i]))
        i++;
    if (i < size)
    {
        item = listitem[i];   // Assign list element to item
        return 1;             // Return True
    }
    else
        return 0;             // Return False
}
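The key-only comparison described above can be demonstrated with a small self-contained record type. The struct and function below are our own illustration, not part of the SeqList source:

```cpp
#include <cassert>

// A record whose == operator compares only the key field, as the text
// assumes for DataType.
struct Record {
    int key;
    int value;
};

bool operator==(const Record& a, const Record& b) { return a.key == b.key; }

// Sequential search in the style of SeqList::Find: on a match, copy the
// full stored record (key and value) back into item and return 1 (True).
int find_record(const Record list[], int size, Record& item) {
    for (int i = 0; i < size; i++) {
        if (item == list[i]) {
            item = list[i];   // caller receives key and value fields
            return 1;
        }
    }
    return 0;                 // 0 (False): not found, item unchanged
}
```

The caller fills in only the key of the probe record; after a successful search the value field has been copied from the stored element.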
The GetData method returns the data value at position pos in the list. If pos does not lie in the range 0 to size-1, the program terminates.

// Return the value at position pos in the list. If pos is not a valid
// list position, terminate the program with an error message.
DataType SeqList::GetData(int pos) const
{
    // Terminate program if pos is out of range
    if (pos < 0 || pos >= size)
    {
        exit(1);
    }
    return listitem[pos];
}
List Test Methods
A list test method that checks whether the list is empty is defined as follows:

// Test for an empty list
int SeqList::ListEmpty(void) const
{
    return size == 0;
}
The SeqList class does not provide a method to change the value of an item directly. To make such a change, we must first find the item and retrieve the data record, delete the item, modify the record, and reinsert the new data into the list. Of course, this changes the item's position, because the new item goes at the rear of the list.
5.2
Stacks
A stack is one of the most frequently used and most important data structures. Applications of stacks are vast. For instance, syntax recognition in a compiler is stack based, as is expression evaluation. At a lower level, stacks are used to pass parameters to functions and to make the actual function call to and return from a function. A stack is a list of items that are accessible at only one end of the list. Items are added or deleted from the list only at the top of the stack. A stack structure features operations that add and delete items. A push operation adds an item to the top of the stack. The operation of removing an element from the stack is said to pop the stack. The last item added to the stack is the first one removed. For this reason, a stack is said to have LIFO (last-in/first-out) ordering.
5.2.1
Specification of Stacks
The stack members include a list, an index or pointer to the top of the stack, and the set of stack operations. We use an array to hold the stack elements. As a result, the stack size may not exceed the number of elements in the array, and the stack-full condition is relevant. The declaration of a stack object includes the stack size that defines the maximum number of elements in the list. The size has a default value MaxStackSize = 50. The list (stacklist) and the index (top) are private members. The operations are public. Initially the stack is empty and top = -1. Items enter the array (Push) in increasing order of the indices and come off the stack (Pop) in decreasing order of the indices.

const int MaxStackSize = 50;
typedef int DataType;

class Stack
{
private:
    // Private data members: stack array and top
    DataType stacklist[MaxStackSize];
    int top;
public:
    // Constructor; initialize the top
    Stack(void);
    // Stack modification operations
    void Push(const DataType& item);
    DataType Pop(void);
    void ClearStack(void);
    // Stack access
    DataType Peek(void) const;
    // Stack test methods
    int StackEmpty(void) const;
    int StackFull(void) const;
};  // Array implementation
• The data in the stack is of type DataType, which must be defined using typedef.
• The user is responsible for checking for a full stack before attempting to push an element onto the stack, and for an empty stack before attempting to pop an element from the stack.
• StackEmpty returns 1 (True) if the stack is empty and 0 (False) otherwise. Use StackEmpty to determine whether a Pop operation can be performed.
• StackFull returns 1 (True) if the stack is full and 0 (False) otherwise. Use StackFull to determine whether a Push operation can be performed.
• ClearStack makes the stack empty by setting top = -1. This method allows the stack to be used for another purpose.
5.2.2
Implementation of Stacks
Stack Constructor
The Stack constructor initializes the index top to -1, which is the stack-empty condition.

// Initialize stack top
Stack::Stack(void) : top(-1)
{}
Stack Operations
The two primary stack operations insert (Push) and delete (Pop) an element from the stack. The class also provides the Peek operation, which allows a client to retrieve the data from the item at the top of the stack without actually removing the item. To Push an item onto the stack, increment the index top by 1 and assign the new item to the array stacklist. An attempt to add an item to a full stack causes an error message and program termination.
// Push item onto the stack
void Stack::Push(const DataType& item)
{
    // If stacklist is full, terminate the program
    if (top == MaxStackSize-1)
    {
        exit(1);
    }
    // Increment top and copy item to stacklist
    top++;
    stacklist[top] = item;
}
The Pop operation deletes an item from the stack by first copying the value from the top of the stack to a local variable temp and then decrementing top by 1. The content of variable temp becomes the return value. An attempt to delete an item from an empty stack causes an error message and program termination.

// Pop the stack and return the top element
DataType Stack::Pop(void)
{
    DataType temp;
    // If stack is empty, terminate the program
    if (top == -1)
    {
        exit(1);
    }
    // Record the top element
    temp = stacklist[top];
    // Decrement top and return former top element
    top--;
    return temp;
}
The Peek operation essentially duplicates the definition of Pop, with a single important exception: the index top is not decremented, leaving the stack intact.

// Return the value at the top of the stack
DataType Stack::Peek(void) const
{
    // If the stack is empty, terminate the program
    if (top == -1)
    {
        exit(1);
    }
    return stacklist[top];
}
The ClearStack method resets the top of the stack to -1. This restores the initial condition established by the constructor.

// Clear all items from the stack
void Stack::ClearStack(void)
{
    top = -1;
}
The stack Push and Pop operations involve direct access to the top of the stack and do not depend on the number of elements in the list. Thus, both operations have computing time O(1).

Stack Test Conditions
During execution, stack operations terminate the program when we attempt to access the stack incorrectly; for example, when we attempt to Peek into an empty stack. To protect the integrity of the stack, the class provides operations to test the status of the stack. The function StackEmpty checks whether top is -1. If so, the stack is empty and 1 (True) is returned; otherwise, 0 (False) is returned.

// Test for an empty stack
int Stack::StackEmpty(void) const
{
    // Return the logical value top == -1
    return top == -1;
}
The function StackFull checks whether top is MaxStackSize-1. If so, the stack is full and 1 (True) is returned; otherwise, 0 (False) is returned.

// Test for a full stack
int Stack::StackFull(void) const
{
    // Test the position of top
    return top == MaxStackSize-1;
}
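A short usage sketch of the LIFO discipline: reversing a character string, one of the classic stack exercises. A compact array-based stack with the same push/pop behavior as the class above is folded in so the example stands alone:

```cpp
#include <cassert>
#include <cstring>

// A compact array-based stack mirroring the Stack class discipline,
// specialized to char. Characters pushed in order come back off in
// last-in/first-out order, which reverses a string.
const int StackCapacity = 50;

struct CharStack {
    char items[StackCapacity];
    int top = -1;
    bool empty() const { return top == -1; }
    bool full() const { return top == StackCapacity - 1; }
    void push(char c) { items[++top] = c; }   // caller checks full() first
    char pop() { return items[top--]; }       // caller checks empty() first
};

// Reverse s in place using the stack (s must fit within StackCapacity).
void reverse(char* s) {
    CharStack st;
    for (int i = 0; s[i] != '\0'; i++) st.push(s[i]);
    for (int i = 0; !st.empty(); i++) s[i] = st.pop();
}
```

Pushing the whole string and then popping it back emits the characters in reverse order, which is exactly the LIFO property stated in the introduction to this section.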
5.3
Queues
A queue is a data structure that stores elements in a list and permits data access only at the two ends of the list. An element is inserted at the rear of the list and is deleted from the front of the list. Applications use a queue to store items in their order of occurrence.
Static Structures
Elements are removed from the queue in the same order in which they are stored and hence a queue provides FIFO (first-in/first-out) or FCFS (first-come/first-served) ordering. The orderly serving of store customers and the buffering of printer jobs in a print spooler are classic examples of queues.

A queue includes a list and specific references to the front and rear positions. These positions are used to insert and delete items in the queue. Like a stack, a queue stores items of generic type DataType. Like a stack, an abstract queue does not limit the number of entries. However, if an array is used to implement the list, a "queue full" condition can occur.

Queues are used extensively in computer modelling, such as the simulation of teller lines in a bank. Multiuser operating systems maintain queues of programs waiting to execute and of jobs waiting to print.
5.3.1
Specification of Queues
The Queue class is implemented by using an array to hold the list of items and by defining variables that maintain the front and rear positions. Since an array is used to implement the list, our class contains a method QFull to test whether the array is filled. This method will be eliminated when a linked list implementation of a queue is used.

typedef int DataType;
// maximum size of a queue list
const int MaxQSize = 50;

class Queue
{
private:
    // Queue array and its parameters
    int front, rear, count;
    DataType qlist[MaxQSize];
public:
    // Constructor
    Queue(void);          // initialize integer data members
    // Queue modification operations
    void QInsert(const DataType& item);
    DataType QDelete(void);
    void ClearQueue(void);
    // Queue access
    DataType QFront(void) const;
    // Queue test methods
    int QLength(void) const;
    int QEmpty(void) const;
    int QFull(void) const;
};
• The generic data type DataType allows a queue to handle different data types.
• The Queue class contains a list (qlist) whose maximum size is determined by the constant MaxQSize.
• The data member count records the number of elements in the queue. This value also determines whether the queue is empty or full.
• QInsert takes an element item of type DataType and inserts it at the rear of the queue, and QDelete removes and returns the element at the front of the queue. The method QFront returns the value of the item at the front of the queue, which allows us to "peek" at the next element that will be deleted.
• The programmer should test QEmpty before deleting an item and QFull before inserting a new member if there is a chance the queue is empty or full. If the preconditions for QInsert or QDelete are violated, the program will terminate.
5.3.2
Implementation of Queues
Our implementation of queues introduces a circular model. Rather than shifting items left when an element is deleted, the queue elements are logically arranged in a circle. The variable front is always the location of the first element of the queue, and it advances to the right around the circle as deletions occur. The variable rear is the location where the next insertion occurs. After an insertion, rear moves circularly to the right. A variable count maintains a record of the number of elements in the queue; if count equals MaxQSize, the queue is full. The circular motion is implemented using remaindering:

• Move rear forward: rear = (rear + 1) % MaxQSize;
• Move front forward: front = (front + 1) % MaxQSize;

Queue Constructor
The Queue constructor initializes the data items front, rear, and count to 0. This establishes an empty queue.
// Initialize queue front, rear, count
Queue::Queue(void) : front(0), rear(0), count(0)
{}
Queue Operations
A queue allows only a limited set of operations that add a new item (QInsert) or remove an item (QDelete). The class also provides QFront, which allows us to peek at the first element in the queue. For some applications, this "peek" operation allows us to determine whether an item should be removed from the list.

Before the insertion process begins, the index rear points at the next available position in the list. The new item is placed into this location and the queue count is increased by 1.

qlist[rear] = item;
count++;

After placing the element in the list, the rear index must be updated to point at the next location. Since we are using a circular queue model, the insert may occur at the end of the array (qlist[MaxQSize-1]), with rear repositioned to the front of the list. The calculation is done using the remainder operator "%".

// Insert item into the queue
void Queue::QInsert(const DataType& item)
{
    // Terminate if queue is full
    if (count == MaxQSize)
        exit(1);
    // Increment count, assign item to qlist and update rear
    count++;
    qlist[rear] = item;
    rear = (rear+1) % MaxQSize;
}
The QDelete operation removes an item from the front of the queue, a position that is referenced by the index front. We start the deletion process by copying the value into a temporary variable and decrementing the queue count.

item = qlist[front];
count--;

In our circular model, front must be repositioned to the next element in the list by using the remainder operator "%".

front = (front + 1) % MaxQSize;
The value from the temporary location becomes the return value.

// Delete element from front of queue and return its value
DataType Queue::QDelete(void)
{
    DataType temp;
    // If qlist is empty, terminate the program
    if (count == 0)
        exit(1);
    // Record value at the front of the queue
    temp = qlist[front];
    // Decrement count, advance front and return former front
    count--;
    front = (front+1) % MaxQSize;
    return temp;
}
The QFront operation essentially duplicates the definition of QDelete with a single important exception: the index front is not advanced and count is not decremented, leaving the queue intact.

// Return value of the first entry
DataType Queue::QFront(void) const
{
    return qlist[front];
}
In addition, a function for measuring the length of the queue is provided.

// Return number of queue elements
int Queue::QLength(void) const
{
    return count;
}
We also need a function for resetting the queue by setting the variables front, rear, and count to 0.

// Clear the queue by resetting count, front and rear to 0
void Queue::ClearQueue(void)
{
    count = 0;
    front = 0;
    rear = 0;
}
Queue Test Conditions
Again, two test conditions are provided: one checks whether the queue is empty; the other checks whether the queue is full.

// Test for an empty queue
int Queue::QEmpty(void) const
{
    // Return the logical value count == 0
    return count == 0;
}

// Test for a full queue
int Queue::QFull(void) const
{
    // Return the logical value count == MaxQSize
    return count == MaxQSize;
}
5.4
Priority Queues
As discussed, a queue is a data structure that provides a FIFO ordering of elements: the queue removes the "oldest" item from the list. Applications often require a modified version of queue storage in which the item of highest priority is removed from the list. This structure, called a priority queue, has two operations, PInsert and PDelete. PInsert simply inserts a data element into the list, and PDelete removes the most important (highest priority) element from the list, as measured by some external criterion that distinguishes the elements in the list. In most applications, elements in a priority queue are key-value pairs in which the key specifies the priority level. For instance, in an operating system, each task has a task descriptor and a priority level that serves as the key. When deleting an item from a priority queue, there may be several elements in the list with the same priority level. In that case, we could require that these items be treated like a queue. This would have the effect of serving items of the same priority in the order of their arrival.
5.4.1
Specification of Priority Queues
In this section, we provide a variety of implementations for a priority queue. In each case, a list object is allocated to store the items. We use the count parameter and the list access methods to insert and delete items. We store items in an array whose elements are of generic type DataType. The storing of items in an array requires the class to provide a PQFull method.

#include <stdlib.h>
typedef int DataType;
// maximum size of the priority queue array
const int MaxPQSize = 50;

class PQueue
{
private:
    // priority queue array and count
    int count;
    DataType pqlist[MaxPQSize];
public:
    // constructor
    PQueue(void);
    // priority queue modification operations
    void PQInsert(const DataType& item);
    DataType PQDelete(void);
    void ClearPQ(void);
    // priority queue test methods
    int PQEmpty(void) const;
    int PQFull(void) const;
    int PQLength(void) const;
};
• The constant MaxPQSize determines the size of the array pqlist.
• The PQInsert method simply inserts item into the list. The specification makes no assumptions about where the element is placed in the list.
• The PQDelete method removes the element of highest priority from the list. We assume that the highest priority element is the one with the smallest value, as determined by the less than comparison operator "<".

The PQDelete method scans pqlist for the minimum value, copies the rear element into its place, and decrements the count:

// delete the element of smallest value and return it
DataType PQueue::PQDelete(void)
{
    DataType min;
    int i, minindex = 0;
    if (count > 0)
    {
        // find the minimum value and its index in pqlist
        min = pqlist[0];        // assume pqlist[0] is the minimum
        // visit remaining elements, updating minimum and index
        for (i = 1; i < count; i++)
            if (pqlist[i] < min)
            {
                // new minimum is pqlist[i]. new minindex is i
                min = pqlist[i];
                minindex = i;
            }
        // move rear element to minindex and decrement count
        pqlist[minindex] = pqlist[count-1];
        count--;
    }
    // pqlist is empty, terminate the program
    else
    {
        // Deleting from an empty priority queue!
        exit(1);
    }
    // return minimum value
    return min;
}
Other Utilities
The data member count contains the number of elements in the list. This value is used in the implementation of PQLength, PQEmpty, and PQFull.

// return number of list elements
int PQueue::PQLength(void) const
{
    return count;
}

// test for an empty priority queue
int PQueue::PQEmpty(void) const
{
    return count == 0;
}

// test for a full priority queue
int PQueue::PQFull(void) const
{
    return count == MaxPQSize;
}

// clear the priority queue by resetting count to 0
void PQueue::ClearPQ(void)
{
    count = 0;
}
5.5
Search and Sort Algorithms
Arrays and records are built-in data structures in most programming languages. We use these structures to develop important algorithms throughout the book.
An array is a fundamental data structure for lists. For many applications, we use search and sort utilities to find an item in an array-based list and to order the data.
5.5.1
Search Algorithms
Sequential Search
A sequential search looks for an item in a list using a target value called the key. The algorithm begins at the first element of the list and traverses the items, comparing each with the key. The scan continues until the key is found or the list is exhausted. If the key is found, the function returns the index of the matched element in the list; otherwise, the value -1 is returned. The function SeqSearch requires three parameters: the list address, the number of elements, and the key.

typedef int DataType;

// Search the n element array list for a match with key
// using the sequential search. Return the index of the
// matching array element, or -1 if a match does not occur
int SeqSearch(DataType list[], int n, DataType key)
{
    for (int i = 0; i < n; i++)
        if (list[i] == key)
            return i;       // Return index of the matching item
    return -1;              // Search failed. Return -1
}
The sequential search algorithm applies to any array for which the operator “==” is defined for the item type. Binary Search The sequential search applies to any list. If the list is ordered, an algorithm, called the binary search, provides an improved search technique. The following steps describe the algorithm. Assume that the list is stored as an array. The indices at the ends of the list are low = 0 and high = n - 1 where n is the number of elements in the array. 1. Compute the index of the array’s midpoint: mid = (low+high) / 2;
2. Compare the value at this midpoint with the key. If a match occurs, return the index mid to locate the key.
if (A[mid] == key) return mid;
If A[mid] < key, a match must lie in the index range mid+1 . . . high, the right half of the original list. This is true because the list is ordered. The new boundaries are low = mid + 1 and high. If key < A[mid], a match must lie in the index range low . . . mid-1. The new boundaries are low and high = mid - 1.
The algorithm refines the location of a match by halving the length of the interval in which key can exist and then executing the same search algorithm on the smaller sublist. Eventually, if the key is not in the list, low will exceed high and the algorithm returns the failure indicator -1 (match not found). The function uses the generic type DataType, which must support both the equality ("==") and the less than ("<") operators. The loop terminates when a match is found or when the interval becomes empty (low > high).

// Search a sorted array for a match with key
// using the binary search. Return the index of the
// matching array element, or -1 if a match does not occur
int BinSearch(DataType list[], int low, int high, DataType key)
{
    int mid;
    DataType midvalue;
    while (low <= high)
    {
        mid = (low + high) / 2;
        midvalue = list[mid];
        if (key == midvalue)
            return mid;             // have a match. return its index
        else if (key < midvalue)
            high = mid - 1;         // search the lower sublist
        else
            low = mid + 1;          // search the upper sublist
    }
    return -1;                      // match not found
}

5.5.2
Sort Algorithms

Exchange Sort
The exchange sort compares each element a[i] with each element that follows it and exchanges the values whenever a[j] < a[i]. The Swap function exchanges the values of its two parameters.

// Swap values of x and y. Used by all the sorting algorithms here
template <class T>
void Swap(T &x, T &y)
{
    T temp;
    temp = x;
    x = y;
    y = temp;
}

// Sort an array of n elements of type T using the
// exchange sort algorithm
template <class T>
void ExchangeSort(T a[], int size)
{
    int i, j;
    // Make size-1 passes through the data
    for (i = 0; i < size-1; i++)
        // Locate least of a[i]...a[size-1] at a[i]
        for (j = i+1; j < size; j++)
            if (a[j] < a[i])
                Swap(a[j], a[i]);
}
Selection Sort
We assume that the n data items are stored in an array A and make n - 1 passes over the list. On pass 0 (the first pass), we select the smallest element in the list and exchange it with A[0], the first element in the list. After completing pass 0, the front of the list (A[0]) is ordered and the tail (A[1] to A[n-1]) remains unordered. Pass 1 looks at the unordered tail of the list and selects the smallest element, which is then stored in A[1]. Pass 2 locates the smallest element in the sublist A[2] to A[n-1] and exchanges it with A[2]. The process continues through n - 1 passes, at which point the tail of the list is reduced to a single element (the largest in the list) and the entire array is sorted.

// Sort an array of n elements of type T using the
// selection sort algorithm
template <class T>
void SelectionSort(T A[], int n)
{
    // Index of smallest item in each pass
    int smallIndex;
    int i, j;
    // Sort A[0]..A[n-2], and A[n-1] is in place
    for (i = 0; i < n-1; i++)
    {
        // Start the scan at index i; set smallIndex to i
        smallIndex = i;
        // j scans the sublist A[i+1]..A[n-1]
        for (j = i+1; j < n; j++)
            // Update smallIndex if smaller element is found
            if (A[j] < A[smallIndex])
                smallIndex = j;
        // When finished, place smallest item in A[i]
        Swap(A[i], A[smallIndex]);
    }
}
Bubble Sort
For an array A with n elements, the bubble sort requires up to n - 1 passes. For each pass, we compare adjacent elements and exchange their values when the first element is greater than the second element. At the end of each pass, the largest element has "bubbled up" to the top of the current sublist. For instance, after pass 0 is complete, the tail of the list (A[n-1]) is sorted and the front of the list remains unordered.
Let's look at the details of the passes. In the process, we maintain a record of the last index involved in an exchange. The variable lastExchangeIndex is used for this purpose and is set to 0 at the start of each pass. Pass 0 compares adjacent elements (A[0], A[1]), (A[1], A[2]), . . ., (A[n-2], A[n-1]). For each pair (A[i], A[i+1]), exchange the values if A[i+1] < A[i] and update lastExchangeIndex to i. At the end of the pass, the largest element is in A[n-1], and the value lastExchangeIndex indicates that all elements in the tail of the list from A[lastExchangeIndex] to A[n-1] are in sorted order. For subsequent passes, we compare adjacent elements in the sublist A[0] to A[lastExchangeIndex]. The process terminates when lastExchangeIndex = 0. The algorithm makes a maximum of n - 1 passes.

// BubbleSort is passed an array A and list count n. It
// sorts the data by making a series of passes as long as
// lastExchangeIndex > 0
template <class T>
void BubbleSort(T A[], int n)
{
    int i, j;
    // Index of last exchange
    int lastExchangeIndex;
    // i is the index of the last element in the sublist
    i = n-1;
    // Continue the process until no exchanges are made
    while (i > 0)
    {
        // Start lastExchangeIndex at 0
        lastExchangeIndex = 0;
        // Scan the sublist A[0] to A[i]
        for (j = 0; j < i; j++)
            // Exchange a pair and update lastExchangeIndex
            if (A[j+1] < A[j])
            {
                Swap(A[j], A[j+1]);
                lastExchangeIndex = j;
            }
        // Set i to the index of the last exchange. Continue sorting
        // the sublist A[0] to A[i]
        i = lastExchangeIndex;
    }
}
Insertion Sort
The function InsertSort is passed an array A and the size of the list n. Let's look at pass i (1 ≤ i ≤ n-1). The sublist A[0] to A[i-1] is already sorted in ascending order. The pass inserts A[i] into this sorted sublist. Let A[i] be the TARGET and move down the list, comparing the TARGET with items A[i-1], A[i-2], and so forth. Stop the scan at the first element that is less than or equal to TARGET or at the beginning of the list (j = 0). As we move down the list, slide each element one position to the right (A[j] = A[j-1]). When the correct location for TARGET is found, insert it at location j.

[The InsertSort implementation, together with the intervening material on quicksort, strings and pattern matching, and bit sets, is missing from this extract.]
Insert Operations
The insert operations create a new node with a given data value. The node is then placed in the list at the current location or immediately after the current location.

Inserting A Node: InsertFront
Inserting a node at the front of a linked list requires us to reassign a value to the pointer front, since the list has a new front. The problem of maintaining the head of the list is fundamental to list management. If you lose the head, you lose the list! Before the insertion, front identifies the first node in the list. After the insertion, the new node occupies the front of the list and the previous front occupies the second position. The function InsertFront takes the new data value and inserts it in a node at the front of the list. Since front is a data member of the class, it is updated directly by the operation.

// Insert item at front of list
template <class T>
void LinkedList<T>::InsertFront(const T& item)
{
    // call Reset if the list is not empty
    if (front != NULL)
        Reset();
    InsertAt(item);     // inserts at front
}
Inserting A Node: InsertRear
Placing a node at the rear of a list requires an initial test to determine whether the list is empty. If it is, the new node becomes both the front and the rear of the list. For
a non-empty list, the data member rear already identifies the last node. The insertion is executed by first creating a new node (GetNode) and then inserting it after the rear node (InsertAfter). Finally, rear, currPtr, position, and size are updated.

// Insert item at rear of list
template <class T>
void LinkedList<T>::InsertRear(const T& item)
{
    Node<T> *newNode;
    prevPtr = rear;
    newNode = GetNode(item);        // create the new node
    if (rear == NULL)               // if list empty, insert at front
        front = rear = newNode;
    else
    {
        rear->InsertAfter(newNode);
        rear = newNode;
    }
    currPtr = rear;
    position = size;
    size++;
}
Inserting A Node: InsertAt
The InsertAt method places the new node at the current location, immediately before the current node. The current position is set to the new node. This operation is used when creating an ordered list.

// Insert item at the current list position
template <class T>
void LinkedList<T>::InsertAt(const T& item)
{
    Node<T> *newNode;
    // two cases: inserting at the front or inside the list
    if (prevPtr == NULL)
    {
        // inserting at the front of the list. also places
        // node into an empty list
        newNode = GetNode(item,front);
        front = newNode;
    }
    else
    {
        // inserting inside the list. place node after prevPtr
        newNode = GetNode(item);
        prevPtr->InsertAfter(newNode);
    }
    // if prevPtr == rear, we are inserting into an empty list
    // or at the rear of a non-empty list; update rear and position
    if (prevPtr == rear)
    {
        rear = newNode;
        position = size;
    }
    // update currPtr and increment the list size
    currPtr = newNode;
    size++;
}
Inserting A Node: InsertAfter
The InsertAfter operation places the new node after the current position and assigns currPtr to the new node. The operation serves the same purpose as the InsertAfter method in the Node class.

// Insert item after the current list position
template <class T>
void LinkedList<T>::InsertAfter(const T& item)
{
    Node<T> *p;
    p = GetNode(item);
    if (front == NULL)
    {
        // inserting into an empty list
        front = currPtr = rear = p;
        position = 0;
    }
    else
    {
        // inserting after last node of list
        if (currPtr == NULL)
            currPtr = prevPtr;
        currPtr->InsertAfter(p);
        if (currPtr == rear)
        {
            rear = p;
            position = size;
        }
        else
            position++;
        prevPtr = currPtr;
        currPtr = p;
    }
    size++;     // increment list size
}
Dynamic Data Structures
Delete Operations
The delete operations remove a node from the list.

Deleting A Node: DeleteFront
The DeleteFront method removes the first node in the list.

// Delete the node at the front of list
template <class T>
T LinkedList<T>::DeleteFront(void)
{
    T item;
    Reset();
    if (front == NULL)
        exit(1);
    item = currPtr->data;
    DeleteAt();
    return item;
}
Deleting A Node: DeleteAt
DeleteAt removes the node at the current position.

// Delete the node at the current list position
template <class T>
void LinkedList<T>::DeleteAt(void)
{
    Node<T> *p;
    // error if empty list or at end of list
    if (currPtr == NULL)
        return;
    // deletion must occur at front node or inside the list
    if (prevPtr == NULL)
    {
        // save address of front and unlink it. if this
        // is the last node, front becomes NULL
        p = front;
        front = front->NextNode();
    }
    else
        // unlink interior node after prevPtr. save address
        p = prevPtr->DeleteAfter();
    // if rear is deleted, new rear is prevPtr and position
    // is decremented; otherwise, position is the same.
    // if p was the last node, rear = NULL and position = -1
    if (p == rear)
    {
        rear = prevPtr;
        position--;
    }
    // move currPtr past deleted node. if p is the last node
    // in the list, currPtr becomes NULL
    currPtr = p->NextNode();
    // free the node and decrement the list size
    FreeNode(p);
    size--;
}
Concatenation of Two Linked Lists
The function ConcatLinks concatenates two lists by sequencing through the nodes in the second list and inserting each of its values at the rear of the first list. The function scans the second list and extracts the data value from each node using the method Data. The value is then used to append a new node at the rear of the first list using the method InsertRear.

template <class T>
void ConcatLinks(LinkedList<T>& L1, LinkedList<T>& L2)
{
    // reset both lists to the front
    L1.Reset();
    L2.Reset();
    // traverse L2. insert each data value at the rear of L1
    while (!L2.EndOfList())
    {
        L1.InsertRear(L2.Data());
        L2.Next();
    }
}
7.5
Circular Lists
A NULL-terminated linked list is a sequence of nodes that begins with a head node and ends with a NULL pointer field. In this section, we develop an alternative model for a list, called a circular linked list, which simplifies the design and coding of sequential list algorithms. Many professional programmers use the circular model to implement linked lists.
An empty circular list contains a node, which has an uninitialized data field. This node is called the header and initially points to itself. The role of the header is to point to the first “real” node in the list and hence the header is often referred to as a sentinel node. In the circular model of a linked list, an empty list actually contains one node, and the NULL pointer is never used. As nodes are added to the list, the last node points to the header node. We can think of a circular linked list as being a bracelet with the header node serving as a clasp. The header ties together the real nodes in the list.
7.5.1
Specification of Circular Lists
We now declare the CNode class, which builds nodes for a circular list. The class provides a default constructor that allows an uninitialized data field. This constructor is used to create the header. The class is similar to the Node class. In fact, all the class members have the same name and the same function.

template <class T>
class CNode
{
private:
    // circular link to the next node
    CNode<T> *next;
public:
    // data is public
    T data;
    // constructors
    CNode(void);
    CNode(const T& item);
    // list modification methods
    void InsertAfter(CNode<T> *p);
    CNode<T> *DeleteAfter(void);
    // obtain the address of the next node
    CNode<T> *NextNode(void) const;
};
7.5.2
Implementation of Circular Lists
Constructors The constructors initialize a node by having it point to itself, so each node can serve as a header for an empty list. Conveniently, “itself” is the pointer this and hence for the default constructor it becomes
// constructor that creates an empty list and
// leaves the data uninitialized. use for header
template <class T>
CNode<T>::CNode(void)
{
    // initialize the node so it points to itself
    next = this;
}
A second constructor takes a parameter and uses it to initialize the data field.

// constructor that creates an empty list and initializes data
template <class T>
CNode<T>::CNode(const T& item)
{
    // set node to point to itself and initialize data
    next = this;
    data = item;
}
Neither constructor requires a parameter specifying an initial value for the next field. All required alterations of the next field are accomplished by using the InsertAfter and DeleteAfter methods.

Circular Node Operations
The circular node class provides the NextNode method, which is used to traverse a list. Like the Node class method, NextNode returns the pointer value next.

// return pointer to the next node
template <class T>
CNode<T> *CNode<T>::NextNode(void) const
{
    return next;
}
InsertAfter adds node p immediately after the current object. No special algorithm is required to load a node at the front of a list, since we merely call InsertAfter on the header node. The presence of a sentinel or header node eliminates the technical special cases of the head that haunt list processing.

// insert node p after the current one
template <class T>
void CNode<T>::InsertAfter(CNode<T> *p)
{
    // p points to the successor of the current node, and the
    // current node points to p
    p->next = next;
    next = p;
}
The removal of a node from the list is handled by the DeleteAfter method. DeleteAfter removes the node immediately following the current node and then returns a pointer to the deleted node. If next == this, there are no other nodes in the list, and a node should not delete itself. In this case, the operation returns the value NULL.

// delete the node following current and return its address
template <class T>
CNode<T> *CNode<T>::DeleteAfter(void)
{
    // save address of node to be deleted
    CNode<T> *tempPtr = next;
    // if next is the address of the current object (this), we are
    // pointing to ourself. we don't delete ourself! return NULL
    if (next == this)
        return NULL;
    // current node points to successor of tempPtr
    next = tempPtr->next;
    // return the pointer to the unlinked node
    return tempPtr;
}
7.6
Doubly Linked Lists
The scanning of either a NULL-terminated or a circular linked list occurs from left to right. The circular list is more flexible, allowing a scan to begin at any location in the list and continue to the starting position. Both lists have limitations, since they do not allow the user to retrace steps and scan backward in the list. They handle inefficiently even the simple task of deleting a node p, since we must traverse the list to find the pointer to the node preceding p. With some applications, the user wants to access a list in reverse order. For instance, a baseball manager maintains a list of players ordered by batting average from lowest to highest. To measure the hitting proficiency of the players for the batting title, the list must be traversed in reverse. This can be done using a stack, but the algorithm is not very convenient. In cases where we need to access nodes in either direction, a doubly linked list is helpful. A node in a doubly linked list contains two pointers and the data field. Doubly linked nodes extend a circular list to create a powerful and flexible list handling structure.
7.6.1
Specification of Doubly Linked Lists
The class DNode is a node handling class for circular doubly linked lists. The data members are similar to those of the singly linked CNode class, except that two pointers, left and right, are used. There are two insert operations, one for each direction, and a delete operation, which removes the current node from the list. The value of a private pointer is returned by the functions NextNodeRight and NextNodeLeft.

template <class T>
class DNode
{
private:
    // circular links to the left and right
    DNode<T> *left;
    DNode<T> *right;
public:
    // data is public
    T data;
    // constructors
    DNode(void);
    DNode(const T& item);
    // list modification methods
    void InsertRight(DNode<T> *p);
    void InsertLeft(DNode<T> *p);
    DNode<T> *DeleteNode(void);
    // obtain address of the next node to the left or right
    DNode<T> *NextNodeRight(void) const;
    DNode<T> *NextNodeLeft(void) const;
};
7.6.2
Implementation of Doubly Linked Lists
Constructors
A constructor creates an empty list by assigning the node address this to both left and right. If a parameter item is passed to the constructor, the node's data member is initialized to item.

// constructor that creates an empty list and
// leaves the data uninitialized. use for header
template <class T>
DNode<T>::DNode(void)
{
    // initialize the node so it points to itself
    left = right = this;
}

// constructor that creates an empty list and initializes data
template <class T>
DNode<T>::DNode(const T& item)
{
    // set node to point to itself and initialize data
    left = right = this;
    data = item;
}
List Operations
To insert node p to the right of the current node, four pointers must be assigned.

// insert node p to the right of the current node
template <class T>
void DNode<T>::InsertRight(DNode<T> *p)
{
    // link p to its successor on the right
    p->right = right;
    right->left = p;
    // link p to the current node on its left
    p->left = this;
    right = p;
}
The InsertLeft method exchanges right with left in the algorithm for InsertRight.

    // insert a node p to the left of the current node
    template <class T>
    void DNode<T>::InsertLeft(DNode<T> *p)
    {
        // link p to its successor on the left
        p->left = left;
        left->right = p;
        // link p to the current node on its right
        p->right = this;
        left = p;
    }
To delete the current node, two pointers must be changed. The method returns a pointer to the deleted node.

    // unlink the current node from the list and return its address
    template <class T>
    DNode<T> *DNode<T>::DeleteNode(void)
    {
        // node to the left must be linked to current node's right
        left->right = right;
        // node to the right must be linked to current node's left
        right->left = left;
        // return the address of the current node
        return this;
    }
We also need operations that return the next node on the left and the next node on the right of the current node.

    // return pointer to the next node on the right
    template <class T>
    DNode<T> *DNode<T>::NextNodeRight(void) const
    {
        return right;
    }

    // return pointer to the next node on the left
    template <class T>
    DNode<T> *DNode<T>::NextNodeLeft(void) const
    {
        return left;
    }
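As a usage sketch, the following self-contained program compresses the DNode class developed above into inline form and builds a three-element circular list around a header node. It is an illustration of the interface, not the text's exact source files:

```cpp
#include <vector>

// compact, inline restatement of the DNode class developed above
template <class T>
class DNode
{
    DNode<T> *left, *right;
public:
    T data;
    DNode(void) { left = right = this; }
    DNode(const T& item) { left = right = this; data = item; }
    void InsertRight(DNode<T> *p)
    {
        p->right = right; right->left = p;  // link p to its right neighbor
        p->left = this;  right = p;         // link p to the current node
    }
    void InsertLeft(DNode<T> *p)
    {
        p->left = left;  left->right = p;
        p->right = this; left = p;
    }
    DNode<T> *NextNodeRight(void) const { return right; }
    DNode<T> *NextNodeLeft(void) const  { return left; }
};

// scan the circular list clockwise from the header, collecting data
std::vector<int> ScanRight(DNode<int>& header)
{
    std::vector<int> v;
    for (DNode<int> *p = header.NextNodeRight(); p != &header;
         p = p->NextNodeRight())
        v.push_back(p->data);
    return v;
}
```

Because the links are circular, the same loop written with NextNodeLeft would produce the values in reverse order, with no stack required.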
7.7 Iterators

7.7.1 Separation of Data and Control Abstractions
Many list handling algorithms assume that we can scan the items and take some action on each. A class derived from List provides methods to add and delete data values. In general, it does not provide methods that are explicitly used to scan the list. Rather, it assumes that some external process will traverse the list and maintain a record of the current position in the list. For an array or a SeqList object L, we can traverse the list using a loop and a position index. For a SeqList object L, the method GetData accesses the data value. For binary trees, hash tables, and dictionaries, list traversal is more complex. For instance, tree traversal is recursive and can be done using a recursive inorder, preorder, or postorder scan. These scanning methods can be added to a binary tree maintenance class. However, a recursive function does not allow the client to stop the traversal process, perform other tasks, and then continue the iteration. As we shall see, an iterative traversal can be done by maintaining tree node pointers on a stack. The tree class would then need to contain an iterative implementation for each traversal order, even though a client may not perform a tree traversal at all, or may consistently use only one traversal order. It is preferable to separate the data abstraction from the control abstraction.

A solution to the problem of list traversal is to create an Iterator class whose job is to traverse the elements of a data structure such as a linked list or tree. An iterator is initialized to point at the start of the list (head, root, etc.). The iterator provides methods Next() and EndOfList() that allow us to move through the list. The iterator object maintains a record of the state of the iteration between calls to Next. With an iterator, the client can stop the scanning process, examine the contents of a data item, and perform other tasks as well. The client is given a tool to traverse the list without having to maintain the underlying indices or pointers. By having a class include an iterator as a friend, we can associate a traversal object with the class and give the iterator access to the items in the list. Implementation of the iterator methods uses the underlying structure of the list. In this section, we give a general discussion of iterators. Using virtual functions, we declare an abstract class that provides a common interface for all iterator operations, although the derived iterator classes differ in implementation.
7.7.2 Specification of Iterator
We define the abstract Iterator class as a template for a general list iterator. Every iterator we develop in the remainder of the text is derived from this class.

    template <class T>
    class Iterator
    {
        protected:
            // indicates whether iterator has reached end of list.
            // must be maintained by the derived class
            int iterationComplete;
        public:
            // constructor
            Iterator(void);

            // required iterator methods
            virtual void Next(void) = 0;
            virtual void Reset(void) = 0;

            // data retrieval/modification method
            virtual T& Data(void) = 0;

            // test for end of list
            virtual int EndOfList(void) const;
    };
An iterator is a list traversal tool. Its basic methods are Reset (set to the first data element), Next (set position at next item), and EndOfList (identify end of list). The function Data accesses the data value of the current list element.
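The payoff of the abstract interface is that an algorithm can be written once against Iterator and reused with any derived iterator. The following self-contained sketch restates the interface and adds a hypothetical vector-based iterator (not one of the text's classes) to demonstrate the idea:

```cpp
#include <vector>
#include <cstddef>

// restatement of the abstract Iterator interface from Section 7.7.2
template <class T>
class Iterator
{
protected:
    int iterationComplete;
public:
    Iterator(void): iterationComplete(0) {}
    virtual void Next(void) = 0;
    virtual void Reset(void) = 0;
    virtual T& Data(void) = 0;
    virtual int EndOfList(void) const { return iterationComplete; }
};

// hypothetical concrete iterator over a std::vector, for illustration
template <class T>
class VectorIterator: public Iterator<T>
{
    std::vector<T>& v;
    std::size_t pos;
public:
    VectorIterator(std::vector<T>& vec): v(vec), pos(0) { Reset(); }
    virtual void Reset(void)
    {
        pos = 0;
        this->iterationComplete = v.empty();
    }
    virtual void Next(void)
    {
        if (++pos >= v.size())
            this->iterationComplete = 1;
    }
    virtual T& Data(void) { return v[pos]; }
};

// generic algorithm written against the abstract interface only;
// it works with ANY list type that supplies a derived iterator
template <class T>
T Sum(Iterator<T>& it)
{
    T total = T();
    for (it.Reset(); !it.EndOfList(); it.Next())
        total = total + it.Data();
    return total;
}
```

Sum knows nothing about vectors; handing it a tree or linked-list iterator derived from the same base would require no change to its code. This is the separation of data and control abstractions in action.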
7.7.3 Implementation of Iterator
The abstract class has a single data member, iterationComplete, that must be maintained by Reset and Next in each derived class. Only the constructor and the method EndOfList are implemented in the abstract class.

    #include "iterator.h"

    // constructor. sets iterationComplete to 0 (False)
    template <class T>
    Iterator<T>::Iterator(void): iterationComplete(0)
    {}
The EndOfList method simply returns the value of iterationComplete. The data value is set to 1 (true) by the derived method Reset if the list is empty. The derived class method Next must set iterationComplete to true when Next would advance past the end of the list.

    // return the value of iterationComplete.
    template <class T>
    int Iterator<T>::EndOfList(void) const
    {
        return iterationComplete;
    }
7.7.4 Deriving List Iterator
SeqList has been used extensively in this book and served as the basis for the design of the abstract List class. Because of its importance, we begin by deriving SeqListIterator. The iterator maintains a pointer listPtr that points at the SeqList object currently being scanned. Since SeqListIterator is a friend of the derived class SeqList, it is allowed access to the private members of SeqList.

Specification of Deriving List Iterator

    #include "iterator.h"
    #include "list.h"
    #include "link.h"

    template <class T>
    class SeqListIterator;

    template <class T>
    class SeqList: public List<T>
    {
        protected:
            // linked list object, available to a derived class
            LinkedList<T> llist;
        public:
            // constructor
            SeqList(void);

            // list access methods
            virtual int Find(T& item);
            T GetData(int pos);

            // list modification methods
            virtual void Insert(const T& item);
            virtual void Delete(const T& item);
            T DeleteFront(void);
            virtual void ClearList(void);

            // SeqListIterator needs access to llist
            friend class SeqListIterator<T>;
    };

    // SeqListIterator derived from the abstract class Iterator
    template <class T>
    class SeqListIterator: public Iterator<T>
    {
        private:
            // maintain a local pointer to the SeqList we are traversing
            SeqList<T> *listPtr;
            // must maintain previous and current positions as we
            // traverse the list
            Node<T> *prevPtr, *currPtr;
        public:
            // constructor
            SeqListIterator(SeqList<T>& lst);

            // traversal methods we must define
            virtual void Next(void);
            virtual void Reset(void);

            // data retrieval/modification method we must define
            virtual T& Data(void);

            // reset iterator to traverse a new list
            void SetList(SeqList<T>& lst);
    };
The iterator implements the virtual functions Next, Reset, and Data, which were declared pure virtual in the base Iterator class. The method SetList is specific to the SeqListIterator class and allows the client to make a runtime assignment of the iterator to another SeqList object.

Implementation of Deriving List Iterator

When the iterator is created by the constructor, it is bound to a specific SeqList, and all of its operations apply to that list. The iterator maintains a pointer to the SeqList object. After attaching the iterator to the list, we initialize iterationComplete and set the current position to the front of the list.

    // constructor. initialize base class and SeqList pointer
    template <class T>
    SeqListIterator<T>::SeqListIterator(SeqList<T>& lst):
        Iterator<T>(), listPtr(&lst)
    {
        // account for the fact that the list could be empty
        iterationComplete = listPtr->llist.ListEmpty();
        // position the iterator at the front of the list
        Reset();
    }
The movement from item to item is provided by the method Next. The scanning process continues until the current position reaches the end of the list. This condition is flagged by the integer value iterationComplete, which must be maintained by Next.

    // advance to the next list element
    template <class T>
    void SeqListIterator<T>::Next(void)
    {
        // if currPtr is NULL, we are at end of list
        if (currPtr == NULL)
            return;
        // move prevPtr/currPtr forward one node
        prevPtr = currPtr;
        currPtr = currPtr->NextNode();
        // if we have arrived at end of linked list, signal that
        // iteration is complete
        if (currPtr == NULL)
            iterationComplete = 1;
    }
Reset restores the initial state of the iterator by initializing iterationComplete and setting the pointers prevPtr and currPtr to their positions at the front of the list. The SeqListIterator class is also a friend of the LinkedList class and thus has access to the data member front.

    // move to the beginning of the list
    template <class T>
    void SeqListIterator<T>::Reset(void)
    {
        // reassign the state of the iteration
        iterationComplete = listPtr->llist.ListEmpty();
        // if the list is empty, return
        if (listPtr->llist.front == NULL)
            return;
        // move list traversal mechanism to the first node
        prevPtr = NULL;
        currPtr = listPtr->llist.front;
    }
The iterator gains access to the data value in the current list element with the method Data. The function returns the data value of the item by using currPtr to access the data member of a LinkedList node. If the list is empty or the iterator is at the end of the list, a call to Data terminates the program.

    // return the data value in the current list element
    template <class T>
    T& SeqListIterator<T>::Data(void)
    {
        // error if list is empty or the traversal has completed
        if (listPtr->llist.ListEmpty() || currPtr == NULL)
        {
            cerr << "Data: invalid reference!" << endl;
            exit(1);
        }
        return currPtr->data;
    }
The SetList method is the runtime equivalent of the constructor. A new SeqList object lst is passed as a parameter, and the iterator now traverses lst.

    // iterator now traverses lst. reassign listPtr and call Reset
    template <class T>
    void SeqListIterator<T>::SetList(SeqList<T>& lst)
    {
        listPtr = &lst;
        // position traversal at 1st data value in new list
        Reset();
    }
7.7.5 Array Iterator
When looking to bind iterators to list classes, we may overlook the Array class because of its ready access to the index operator. In fact, an Array iterator is a useful abstraction. By initializing the iterator to begin and end at particular elements, the use of indices is eliminated from the application. Furthermore, multiple iterators can simultaneously traverse the same array.

Specification of Array Iterator

    #include "iterator.h"
    #include "tarray.h"

    template <class T>
    class ArrayIterator: public Iterator<T>
    {
        private:
            // current location, starting and ending points
            int currentIndex;
            int startIndex;
            int finishIndex;
            // address of the Array object that we must traverse
            Array<T> *arr;
        public:
            // constructor
            ArrayIterator(Array<T>& A, int start=0, int finish=-1);

            // standard iterator operations required by base class
            virtual void Next(void);
            virtual void Reset(void);
            virtual T& Data(void);
    };
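Before turning to the implementation, the idea can be tried out in a self-contained sketch. Here a plain pointer/length pair stands in for the text's Array class; the names mirror, but do not reproduce, the specification above:

```cpp
// simplified, self-contained sketch of the ArrayIterator idea.
// a plain pointer/length pair replaces the book's Array class.
template <class T>
class RangeIterator
{
    T* arr;
    int currentIndex, startIndex, finishIndex;
    int iterationComplete;
public:
    // start defaults to 0; finish == -1 means "through the last element"
    RangeIterator(T* A, int n, int start = 0, int finish = -1)
        : arr(A), startIndex(start)
    {
        finishIndex = (finish != -1) ? finish : n - 1;
        Reset();
    }
    void Reset(void)
    {
        currentIndex = startIndex;
        iterationComplete = (currentIndex > finishIndex);
    }
    void Next(void)
    {
        if (++currentIndex > finishIndex)
            iterationComplete = 1;
    }
    T& Data(void) { return arr[currentIndex]; }
    int EndOfList(void) const { return iterationComplete; }
};

// iterators can traverse the same array over different subranges,
// so the application never manipulates indices directly
int SumRange(int* a, int n, int start, int finish)
{
    int total = 0;
    for (RangeIterator<int> it(a, n, start, finish); !it.EndOfList();
         it.Next())
        total += it.Data();
    return total;
}
```

Note that two RangeIterator objects over the same array are independent, which is the "multiple simultaneous traversals" property claimed above.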
The constructor binds an Array object to the iterator and initializes the starting and finishing indices of the array. The starting value defaults to 0, which sets an iterator at the first array element. The finishing index has a default value of -1, indicating that the client accepts the index of the last item in the array as the upper bound. At any point in the iteration, currentIndex is the index of the current array element. The index is given the initial value startIndex. The ArrayIterator has the minimum set of public member functions that override the pure virtual functions in the base class.

Implementation of Array Iterator

The constructor sets up the initial state of the iterator. It binds the iterator to the array and initializes the three indices. If startIndex and finishIndex use the default values (0 and -1), the iterator ranges over the entire array.

    #include "arriter.h"

    // constructor. initialize the base class and data members
    template <class T>
    ArrayIterator<T>::ArrayIterator(Array<T>& A, int start, int finish):
        arr(&A)
    {
        // last available array index
        int ilast = A.ListSize() - 1;
        // initialize index values. if finish == -1,
        // traverse whole array
        currentIndex = startIndex = start;
        finishIndex = finish != -1 ? finish : ilast;
        // indices must be in range of array
        if (!((startIndex >= 0 && startIndex <= finishIndex) &&
              (finishIndex >= 0 && finishIndex <= ilast)))
        {
            cerr << "ArrayIterator: invalid index range!" << endl;
            exit(1);
        }
    }

Preorder Traversal

A preorder scan visits a node first and then recursively descends through its left and right subtrees. We refer to this traversal as NLR (node, left, right).

    // Preorder recursive scan of the nodes in a tree.
    template <class T>
    void Preorder (TreeNode<T> *t, void visit(T& item))
    {
        // the recursive scan terminates on an empty subtree
        if (t != NULL)
        {
            visit(t->data);               // visit the node
            Preorder(t->Left(), visit);   // descend left
            Preorder(t->Right(), visit);  // descend right
        }
    }
Inorder Traversal

An inorder scan begins its action at a node by first descending to its left subtree so that it can scan the nodes in that subtree. After recursively descending through the nodes in that subtree, the scan takes its second action at the node and uses the data value. The traversal completes its action at the node by performing a recursive scan of the right subtree. The order of operations in the inorder traversal follows:
1. Traverse the left subtree.
2. Visit the node.
3. Traverse the right subtree.

We refer to this traversal as LNR (left, node, right).

    // Inorder recursive scan of the nodes in a tree.
    template <class T>
    void Inorder (TreeNode<T> *t, void visit(T& item))
    {
        // the recursive scan terminates on an empty subtree
        if (t != NULL)
        {
            Inorder(t->Left(), visit);    // descend left
            visit(t->data);               // visit the node
            Inorder(t->Right(), visit);   // descend right
        }
    }
Postorder Traversal

The postorder scan delays the visit to a node until after recursive descents of the left and right subtrees. The order of operations produces an LRN scan (left, right, node).

1. Traverse the left subtree.
2. Traverse the right subtree.
3. Visit the node.

    // Postorder recursive scan of the nodes in a tree.
    template <class T>
    void Postorder (TreeNode<T> *t, void visit(T& item))
    {
        // the recursive scan terminates on an empty subtree
        if (t != NULL)
        {
            Postorder(t->Left(), visit);  // descend left
            Postorder(t->Right(), visit); // descend right
            visit(t->data);               // visit the node
        }
    }
Clearly, the prefixes pre, in, and post indicate when the visit occurs at a node. In each case, we descend the left subtree before the right subtree. There are three more possible algorithms that select the right subtree before the left subtree.
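The three orders can be compared side by side with a small self-contained program; the struct below is a compressed stand-in for the text's TreeNode class, and the vector collects the visits instead of printing them:

```cpp
#include <vector>

// minimal binary tree node for illustration (a compressed stand-in
// for the text's TreeNode class with Left()/Right() accessors)
struct BTNode
{
    int data;
    BTNode *left, *right;
    BTNode(int d, BTNode* l = 0, BTNode* r = 0)
        : data(d), left(l), right(r) {}
};

// NLR: visit, then left subtree, then right subtree
void Preorder(BTNode* t, std::vector<int>& out)
{
    if (t != 0)
    {
        out.push_back(t->data);
        Preorder(t->left, out);
        Preorder(t->right, out);
    }
}

// LNR: left subtree, visit, right subtree
void Inorder(BTNode* t, std::vector<int>& out)
{
    if (t != 0)
    {
        Inorder(t->left, out);
        out.push_back(t->data);
        Inorder(t->right, out);
    }
}

// LRN: left subtree, right subtree, visit
void Postorder(BTNode* t, std::vector<int>& out)
{
    if (t != 0)
    {
        Postorder(t->left, out);
        Postorder(t->right, out);
        out.push_back(t->data);
    }
}
```

On the tree with root 2 and children 1 and 3, the three scans yield 2-1-3, 1-2-3, and 1-3-2 respectively; for a binary search tree, the inorder scan lists the values in sorted order.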
8.1.4 Using Tree Scan Algorithms
The recursive tree traversal algorithms are the basis for many tree applications. They provide orderly access to the nodes and their data values. In this section, we illustrate the use of the traversal algorithms to count the number of leaf nodes and to compute the depth of a tree. In each case, we must use a scanning strategy to visit each node.

Application: Visiting Tree Nodes

Many applications merely want to scan the nodes of a binary tree without concern for the order of the traversal. In these cases, the client is free to select any of the scan algorithms. In this application, the function CountLeaf traverses the tree to count the number of leaf nodes. A reference parameter count is incremented each time a leaf node is identified.

    // the function uses a postorder scan. a visit
    // tests whether the node is a leaf node
    template <class T>
    void CountLeaf (TreeNode<T> *t, int& count)
    {
        // use postorder descent
        if (t != NULL)
        {
            CountLeaf(t->Left(), count);   // descend left
            CountLeaf(t->Right(), count);  // descend right
            // check if t is a leaf node (no descendants).
            // if so, increment the variable count
            if (t->Left() == NULL && t->Right() == NULL)
                count++;
        }
    }
The Depth function uses a postorder scan to compute the depth of a binary tree. At each node, it computes the depth of the left and right subtrees. The resulting depth of the node is 1 more than the maximum depth of its subtrees.

    // the function uses a postorder scan. it computes the
    // depth of the left and right subtrees of a node and
    // returns the depth of the tree as
    // 1 + max(depthLeft, depthRight). the depth
    // of an empty tree is -1
    template <class T>
    int Depth(TreeNode<T> *t)
    {
        int depthLeft, depthRight, depthval;

        if (t == NULL)
            depthval = -1;
        else
        {
            depthLeft = Depth(t->Left());
            depthRight = Depth(t->Right());
            depthval = 1 + (depthLeft > depthRight ? depthLeft : depthRight);
        }
        return depthval;
    }
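A self-contained check of the two applications, again with a minimal stand-in for the text's TreeNode class:

```cpp
// minimal tree node (stand-in for the book's TreeNode class)
struct TNode
{
    int data;
    TNode *left, *right;
    TNode(int d, TNode* l = 0, TNode* r = 0)
        : data(d), left(l), right(r) {}
};

// postorder scan: both subtrees are counted before the node is tested
void CountLeaf(TNode* t, int& count)
{
    if (t != 0)
    {
        CountLeaf(t->left, count);
        CountLeaf(t->right, count);
        if (t->left == 0 && t->right == 0)
            count++;                 // no descendants: a leaf
    }
}

// depth of the empty tree is -1; otherwise 1 + max of subtree depths
int Depth(TNode* t)
{
    if (t == 0)
        return -1;
    int dl = Depth(t->left);
    int dr = Depth(t->right);
    return 1 + (dl > dr ? dl : dr);
}
```

For the tree with root 1, left child 2 (whose left child is 4), and right child 3, the leaf count is 2 (nodes 4 and 3) and the depth is 2 (the path 1-2-4).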
Application: Copying and Deleting Trees

Utility functions to copy and delete an entire tree introduce new concepts and prepare us for the development of a tree class that requires a destructor and a copy constructor. The function CopyTree takes an initial tree and creates a duplicate version. The DeleteTree routine removes each node in the tree, including the root, and deallocates the memory for the nodes.

The function CopyTree uses a postorder scan to visit the nodes of the tree. The traversal order assures that we move to the greatest depth in the tree and then implement a visit operation, which creates a node for the new tree. The CopyTree function builds the new tree from the bottom up. It first creates the children and then links them to their parent as the parent is being created. CopyTree is designed as a function that returns a pointer to the newly created node. This return value is used by the parent when it creates its own node and attaches its children. The function returns the root to the calling program.

    // create duplicate of tree t; return the new root
    template <class T>
    TreeNode<T> *CopyTree(TreeNode<T> *t)
    {
        // variable newnode points at each new node that is
        // created by a call to GetTreeNode and later attached to
        // the new tree. newlptr and newrptr point to the children of
        // newnode and are passed as parameters to GetTreeNode
        TreeNode<T> *newlptr, *newrptr, *newnode;

        // stop the recursive scan when we arrive at an empty tree
        if (t == NULL)
            return NULL;

        // CopyTree builds a new tree by scanning the nodes of t.
        // at each node in t, CopyTree checks for a left child. if
        // present, it makes a copy of the left child; otherwise it
        // uses NULL. the algorithm similarly checks for a right
        // child. CopyTree then builds a copy of the node using
        // GetTreeNode and appends the copies of the children to it.
        if (t->Left() != NULL)
            newlptr = CopyTree(t->Left());
        else
            newlptr = NULL;

        if (t->Right() != NULL)
            newrptr = CopyTree(t->Right());
        else
            newrptr = NULL;

        // build new tree from the bottom up by building the two
        // children and then building the parent
        newnode = GetTreeNode(t->data, newlptr, newrptr);

        // return a pointer to the newly created node
        return newnode;
    }
When an application uses a dynamic structure such as a tree, the programmer is responsible for deallocating the memory occupied by the tree. For a general binary tree, we design the DeleteTree function, which uses a postorder scan of the nodes. The ordering ensures that we first visit the children of a node before deleting the node (parent). The visit calls FreeTreeNode to delete the node.

    // use the postorder scan algorithm to traverse the nodes in
    // the tree and delete each node as the visit operation.
    template <class T>
    void DeleteTree(TreeNode<T> *t)
    {
        if (t != NULL)
        {
            DeleteTree(t->Left());
            DeleteTree(t->Right());
            FreeTreeNode(t);
        }
    }
A more general tree clearing routine deletes the nodes and resets the root. The function ClearTree calls DeleteTree to deallocate the nodes and then assigns the root pointer to NULL.

    // call the function DeleteTree to deallocate the nodes. then
    // set the root pointer back to NULL
    template <class T>
    void ClearTree(TreeNode<T>* &t)
    {
        DeleteTree(t);
        t = NULL;   // root now NULL
    }
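The same bottom-up copy and postorder delete can be exercised in a self-contained sketch, with new/delete standing in for the text's GetTreeNode/FreeTreeNode allocation routines:

```cpp
// minimal node type for the sketch
struct TreeN
{
    int data;
    TreeN *left, *right;
    TreeN(int d, TreeN* l = 0, TreeN* r = 0)
        : data(d), left(l), right(r) {}
};

// postorder copy: build both children first, then their parent
TreeN* CopyTree(TreeN* t)
{
    if (t == 0)
        return 0;
    TreeN* l = CopyTree(t->left);
    TreeN* r = CopyTree(t->right);
    return new TreeN(t->data, l, r);    // GetTreeNode equivalent
}

// postorder delete: children are freed before their parent
void DeleteTree(TreeN* t)
{
    if (t != 0)
    {
        DeleteTree(t->left);
        DeleteTree(t->right);
        delete t;                       // FreeTreeNode equivalent
    }
}

// clear: delete the nodes, then reset the root pointer
void ClearTree(TreeN*& t)
{
    DeleteTree(t);
    t = 0;
}
```

Because the copy allocates fresh nodes, clearing the original tree leaves the duplicate intact, which is exactly the behavior a copy constructor and destructor pair must provide.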
Breadth-First Scan or Level Scan

A breadth-first, or level, scan cannot recursively descend into the subtrees; it must visit all the nodes (siblings) on one level and then descend to the next level. Rather than a recursive descent, we develop an iterative algorithm that uses a queue to hold the items. For each node, we insert any non-NULL left and right children in the queue. This assures us that the set of siblings will be visited in order at the next level in the tree.

    // traverse the tree level by level and visit each node
    template <class T>
    void LevelScan(TreeNode<T> *t, void visit(T& item))
    {
        // store siblings of each node in a queue so that they are
        // visited in order at the next level of the tree
        Queue< TreeNode<T> * > Q;
        TreeNode<T> *p;

        // initialize the queue by inserting the root in the queue
        Q.QInsert(t);

        // continue the iterative process until the queue is empty
        while (!Q.QEmpty())
        {
            // delete front node from queue and execute visit function
            p = Q.QDelete();
            visit(p->data);
            // if a left child exists, insert it in the queue
            if (p->Left() != NULL)
                Q.QInsert(p->Left());
            // if a right child exists, insert it next to its sibling
            if (p->Right() != NULL)
                Q.QInsert(p->Right());
        }
    }
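A self-contained version of the level scan, with std::queue standing in for the text's Queue class, makes the level-by-level visiting order visible:

```cpp
#include <queue>
#include <vector>

// minimal node type for the sketch
struct LNode
{
    int data;
    LNode *left, *right;
    LNode(int d, LNode* l = 0, LNode* r = 0)
        : data(d), left(l), right(r) {}
};

// visit nodes level by level, left to right, using a queue
std::vector<int> LevelScan(LNode* t)
{
    std::vector<int> visited;
    if (t == 0)
        return visited;
    std::queue<LNode*> q;
    q.push(t);                       // seed the queue with the root
    while (!q.empty())
    {
        LNode* p = q.front();
        q.pop();
        visited.push_back(p->data);  // "visit" the front node
        if (p->left != 0)            // children join the queue, so the
            q.push(p->left);         // next level is scanned in order
        if (p->right != 0)
            q.push(p->right);
    }
    return visited;
}
```

For the tree with root 1, second level 2 and 3, and third level 4 and 5 under node 2, the scan produces 1, 2, 3, 4, 5: one full level before the next.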
Similarly, an inorder iterative scan algorithm can be developed by deploying a stack.

    // return address of the last node on the left branch from t,
    // stacking nodes on the way. used for iterative inorder scan.
    template <class T>
    TreeNode<T> *GoFarLeft(TreeNode<T> *t, Stack< TreeNode<T> * >& S)
    {
        // if t is NULL, return NULL
        if (t == NULL)
            return NULL;
        // go as far left in the tree t as possible, stacking each
        // node address on S until a node is found with a NULL
        // left pointer. return a pointer to that node
        while (t->Left() != NULL)
        {
            S.Push(t);
            t = t->Left();
        }
        return t;
    }

    // Inorder iterative scan
    template <class T>
    void Inorder_I(TreeNode<T> *t, void visit(T& c))
    {
        // stack to hold node addresses on a left branch
        Stack< TreeNode<T> * > S;

        // get the address of the last node on the left branch from t.
        // this is the first node in the scan
        t = GoFarLeft(t, S);

        // continue until t is NULL
        while (t != NULL)
        {
            // we have gone left. visit the node
            visit(t->data);
            // if there is a right subtree, move right and go far left,
            // stacking up nodes on the left subtree
            if (t->Right() != NULL)
                t = GoFarLeft(t->Right(), S);
            // there is no right subtree, but there are nodes we have
            // stacked that we must process. pop and continue
            else if (!S.StackEmpty())
                t = S.Pop();    // move up the tree
            // no more right branches or stacked nodes to process
            else
                t = NULL;       // we are done
        }
    }
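The stack-driven inorder scan can likewise be tested in self-contained form, with std::stack standing in for the text's Stack class:

```cpp
#include <stack>
#include <vector>

// minimal node type for the sketch
struct INode
{
    int data;
    INode *left, *right;
    INode(int d, INode* l = 0, INode* r = 0)
        : data(d), left(l), right(r) {}
};

// push nodes while sliding down the left branch from t;
// return the leftmost node reached
INode* GoFarLeft(INode* t, std::stack<INode*>& s)
{
    if (t == 0)
        return 0;
    while (t->left != 0)
    {
        s.push(t);
        t = t->left;
    }
    return t;
}

// iterative inorder scan driven by an explicit stack
std::vector<int> InorderIter(INode* t)
{
    std::vector<int> out;
    std::stack<INode*> s;
    t = GoFarLeft(t, s);             // first node of the scan
    while (t != 0)
    {
        out.push_back(t->data);      // visit the node
        if (t->right != 0)           // descend right, then far left
            t = GoFarLeft(t->right, s);
        else if (!s.empty())         // back up to a stacked ancestor
        {
            t = s.top();
            s.pop();
        }
        else
            t = 0;                   // traversal complete
    }
    return out;
}
```

The explicit stack plays exactly the role the runtime call stack plays in the recursive Inorder; this equivalence is the theme of the recursion-elimination techniques in the next section.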
8.2 Elimination of Recursion

8.2.1 Recursive Programs
An iterative program is constructed from primitive instructions with tests, loops, and non-recursive procedures. A recursive program is a program that contains one or more recursive procedures. A procedure p is said to be recursive if its execution can invoke one or more further calls of p. These calls are said to be recursive calls. On the other hand, a procedure call occurring when the procedure is not already being executed is termed a principal call. We distinguish between

• Simple recursions, where all the recursive calls appear in the body of the same procedure.

• Simultaneous recursions, where recursive calls are invoked by the execution of other procedure calls in the body of the recursive procedure. For example, the body of a procedure p contains a call to a procedure q, which in turn contains a call to procedure p.

A recursive program is an extension of an iterative one, although it does not allow us to compute any more functions, since a recursive program can always be translated into an equivalent iterative one. It follows that, whatever the problem to be solved, we can choose to construct either a recursive or an iterative program. If the problem lends itself naturally to a recursive decomposition, the recursive program can simply reproduce the chosen decomposition, and is consequently clearer and easier to prove correct than an equivalent iterative program. Recursive programming nevertheless has its disadvantages:

• Certain programming languages (FORTRAN, COBOL, and assembly languages, for instance) do not permit recursion.

• It is often costly in terms of the execution time and storage space required, especially in the real-time and embedded applications targeted by DeltaOS.

These disadvantages can be minimized by transforming a given recursive program into an iterative program. The method of program transformation will be illustrated in this section. Such a method is preferable to directly constructing an iterative program, especially when the iterative program turns out to be complicated and therefore difficult to prove correct.
Indeed, if the recursive program has been proven correct, and if the transformations preserve correctness, then the iterative program thus obtained is guaranteed to be correct. It is good practice to convert recursive programs into equivalent iterative ones when writing real-time and embedded applications, since the resulting program behaves deterministically, in particular with a predictable stack requirement.
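A minimal illustration of the point: the two functions below compute the same value, but the iterative one executes in constant stack space, which is what a real-time system prefers.

```cpp
// recursive form: one stack frame per level of recursion
unsigned long FactRec(unsigned n)
{
    if (n <= 1)
        return 1;                   // base case
    return n * FactRec(n - 1);      // recursive call
}

// equivalent iterative form: constant stack usage, the
// depth of the computation no longer depends on n
unsigned long FactIter(unsigned n)
{
    unsigned long result = 1;
    while (n > 1)
    {
        result *= n;
        --n;
    }
    return result;
}
```

Proving that the two agree for all n is exactly the kind of correctness-preserving argument the transformation methods of this section are designed to systematize.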
8.2.2 Execution Tree of a Recursive Procedure
It is sometimes convenient to represent the sequence of recursive calls and instructions executed via a call to a recursive procedure by a tree, constructed as follows:

• The call in question is the label of the root of the tree.

• The leaves of the tree are labelled with sequences of instructions, and the other vertices with procedure calls.

• If the call in question does not invoke a recursive call, then the root leads only to a leaf, which is labelled with the associated sequence of instructions.

On the other hand, if at least one recursive call is invoked, the root will lead to the following (from left to right):

• A leaf labelled with the sequence of instructions executed before the first recursive call.

• A vertex labelled with the first recursive call, linked to the corresponding subtree of calls.

• A leaf labelled with the sequence of instructions executed after the first recursive call, and before the following call.

• A vertex labelled with the following recursive call, connected to the tree of corresponding calls.

• ...

• A leaf labelled with the sequence of instructions executed when all the recursive calls invoked by the call in question have ended.
8.2.3 Computation of a Recursive Procedure for a Principal Call
Let us consider

1. A recursive procedure whose tests do not modify the values of the variables and parameters.

2. A principal call of this procedure.

3. The execution tree for this principal call.

The computation sequence of the procedure for the principal call can be determined in the following way:

1. Traverse all the paths of the tree, beginning at the root. Replace, in each successor of a vertex corresponding to a call:

(a) The variable parameters by the corresponding actual parameters.

(b) The value parameters, and the variables declared in the body of the procedure, by variables not yet used. Initializations of the variables associated with value parameters should be added.

2. Execute the leaves of this transformed execution tree from left to right.
8.2.4 Call Trees of a Recursive Procedure
We are often not interested in the vertices of the execution tree labelled with sequences of instructions (which do not contain a recursive call). The name call tree is given to a tree obtained from the execution tree, retaining only the vertices labelled with procedure calls.
8.2.5 Depth of a Recursive Call – Embedded Recursive Calls
The depth of a recursive call is the length of the path which connects the root to the vertex associated with the call in the call tree. A principal call is therefore always at depth 0. Two calls are embedded if one is invoked by the execution of the other, or equivalently, if there is a path linking them in the call tree.

Elimination of Recursion in General Cases

The complete elimination of recursion in the general case takes place in two stages:

1. Elimination of formal parameters and local variables.

2. Transformation of a recursive procedure with no parameters or local variables into an equivalent iterative procedure.
8.2.6 Elimination of Formal Parameters and Local Variables
In this section, we discuss methods for eliminating formal parameters and local variables.

Elimination of Variable Parameters

We shall consider only the case where:

• The actual parameters of the recursive calls are the same as the corresponding formal parameters.

• The actual parameters of the principal call can be accessed from anywhere in the body of the procedure.

In this case, for each occurrence of a formal variable parameter, we can simply substitute the corresponding actual parameter of the principal call.
Elimination of Value Parameters

We eliminate value parameters by saving their values with the help of a stack and then replacing them with global variables. Let P be a recursive procedure of the form:

    void P(T y)
    {
        ......
        P(u);
        ......
    }

We obtain a procedure equivalent to P by replacing each recursive call P(e) by the sequence:

    PUSH(y); y = e; P(y); POP(y);

We can then write P in the form of a procedure without parameters, manipulating global variables, according to the following definition:

    T yg;

    void P(void)
    {
        ......
        yg = u;
        P();
        ......
    }

where yg is the global variable which replaces the value parameter y. The new body of procedure P is obtained by substituting for each occurrence of the value parameter y the corresponding global variable yg, and by replacing each recursive call P(e) by the recursive call P(). The principal call P(u) is replaced by the sequence yg = u; P();.

Elimination of Local Variables

The transformation of local variables into global variables is achieved in the same way as that of value parameters. A value parameter may in fact be considered as an initialized local variable. Some optimization can be considered at this stage: if the value of a local variable is never used after any recursive call, it is unnecessary to save that value.
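A concrete application of the transformation, using a hypothetical procedure Digits (not from the text) with std::stack playing the role of PUSH/POP: the value parameter n becomes the global ng, and each recursive call is bracketed by a push and a pop.

```cpp
#include <stack>
#include <vector>

// original form: a procedure with a value parameter n that appends
// the decimal digits of n, most significant first
void Digits(unsigned n, std::vector<int>& out)
{
    if (n >= 10)
        Digits(n / 10, out);   // recursive call with new argument
    out.push_back(n % 10);     // work done AFTER the recursive call
}

// transformed form: the value parameter n is replaced by the
// global ng, saved and restored around each recursive call
static unsigned ng;                    // global replacing parameter n
static std::stack<unsigned> valStack;  // saves ng across calls

void DigitsG(std::vector<int>& out)
{
    if (ng >= 10)
    {
        valStack.push(ng);     // PUSH(y)
        ng = ng / 10;          // y = e
        DigitsG(out);          // recursive call P()
        ng = valStack.top();   // POP(y)
        valStack.pop();
    }
    out.push_back(ng % 10);
}
```

Note that DigitsG is still recursive; this is only the first stage of the elimination. The parameterless, globals-only form is what the second stage (Section 8.2.7) then converts into a loop.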
8.2.7 Elimination of Recursion
Having used the results described in the last section, we can now solve the problem of eliminating recursion in a procedure which has no parameters or local variables. This can be done using some straightforward rules. Syntactically, recursive functions can be classified into two categories: tail recursion and non-tail recursion.
Tail Recursion Removal

If, in the body of a procedure, a recursive call is placed in such a way that its execution is never followed by the execution of another instruction of the procedure, the call is known as a tail recursive call. The execution of such a call terminates the execution of the body of the procedure. A procedure in which all the recursive calls are tail recursive calls is called a tail recursive procedure. A tail recursive function has the form:

    T P(M m)
    {
        if (cond)
        {
            A;
            return P(n);
        }
        else
        {
            return B;
        }
    }

where the list of statements A and the statement return B do not contain direct or indirect recursive calls of P.

Primitive Recursive Functions and General Recursive Functions

Mathematically speaking, a tail recursive function is also called a primitive recursive function. It is a well-known result that a primitive recursive function is equivalent to an iterative program [6] without the introduction of an auxiliary protocol stack (for storing return addresses) or a parameter value stack. So, mathematically, we can classify the computable functions into two categories: the primitive recursive functions and the general recursive functions.

Recursive Form of IsInIdentList

The following function IsInIdentList, defined on a class IdentList, checks whether an identifier id occurs in an identifier list. We first present a tail recursive version of the program, and then give its non-recursive equivalent.

    int IdentList::IsInIdentList(const Identifier &id)
    {
        Identifier id1;
        IdentList il;

        il.head = head;
        il.tail = tail;
        if (!il.head)
            return 0;
        else
        {
            il.HeadIdentList(id1);
            if (id1 == id)
                return 1;
            else
            {
                il.RestIdentList(il);
                return il.IsInIdentList(id);
            }
        }
    }

Non-Recursive Form of IsInIdentList

A recursion-removed version of IsInIdentList is given as follows:

    int IdentList::IsInIdentList(const Identifier &id)
    {
        Identifier id1;
        IdentList il;

        il.head = head;
        il.tail = tail;
        while (il.head)
        {
            il.HeadIdentList(id1);
            if (id1 == id)
                return 1;
            il.RestIdentList(il);
        }
        return 0;
    }

Most advanced compilers can perform tail recursion removal automatically, so we may not need to eliminate tail recursion ourselves. Elimination of tail recursion is simple and shortens execution time quite considerably; we have registered gains of 20 to 25 percent for the procedures in PowerEpsilon. This type of elimination is therefore definitely interesting in its own right.

Linear Recursive Functions

A linear recursive function has the following form:

    T P(M m)
    {
        if (cond)
        {
            A;
            l = P(n);
            B;
            return C;
        }
        else
        {
            D;
            return E;
        }
    }

where the lists of statements A, B, and D, and the statements return C and return E, do not contain direct or indirect recursive calls of P. There are a number of methods for the transformation of linear recursive functions into tail recursive functions. Pioneering work on these methods was done in 1966 by Cooper [3], who introduced the techniques of operand commutation and function inversion. The technique of re-bracketing was described by Darlington and Burstall in 1973 [5].
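The re-bracketing (accumulator) technique can be shown on a small example, not taken from the text: a linear recursive sum is first rewritten in tail recursive form and then as a loop.

```cpp
// linear recursive form: the addition is "pending" after the call
unsigned long SumTo(unsigned n)
{
    if (n == 0)
        return 0;
    return n + SumTo(n - 1);        // work remains after the call
}

// tail recursive form obtained by re-bracketing: the running total
// travels down in an accumulator, so nothing is pending afterward
unsigned long SumToAcc(unsigned n, unsigned long acc)
{
    if (n == 0)
        return acc;
    return SumToAcc(n - 1, acc + n);  // tail call
}

// the tail call is then trivially replaced by a loop
unsigned long SumToIter(unsigned n)
{
    unsigned long acc = 0;
    while (n != 0)
    {
        acc = acc + n;
        --n;
    }
    return acc;
}
```

The transformation relies on the associativity of +, which lets the pending additions be re-bracketed so that they are performed on the way down instead of on the way back up.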
8.2. ELIMINATION OF RECURSION
Function Inversion

The techniques of re-bracketing and operand commutation require rather stringent conditions. The technique based on function inversion, on the other hand, is more general. Starting from the argument on termination, the technique of function inversion reconstructs the parameter values of all the incarnations and performs the "pending" operations upon them until the initial parameter value is recovered. The only prerequisite is that there is an inversion function for reversing parameter values.

Recursive Form of RestoreTermList

For instance, we have the following linear recursive function RestoreTermList:

    void TermList::RestoreTermList(void)
    {
        TermList tl;

        if (head) {
            tl.head = head->next;
            tl.tail = tail;
            tl.RestoreTermList();
            stk.pop(head->elem);
        }
    }

This is definitely not a tail recursive function, since the statement stk.pop(head->elem) follows the recursive call tl.RestoreTermList().

Non-Recursive Form of RestoreTermList

We can introduce an inverse function ReverseTermList to reverse the order of the term list tl being processed. A recursion-removed version of the function RestoreTermList is given as follows:

    void TermList::RestoreTermList(void)
    {
        Term t;
        TermList tl, tll;

        tl.head = head;
        tl.tail = tail;
        tl.ReverseTermList(tll);
        tl = tll;
        while (tll.head) {
            tll.HeadTermList(t);
            stk.pop(t);
            tll.RestTermList(tll);
        }
        tl.ReverseTermList(tll);
    }

Function Inversion by Introducing Stacks

A general inversion function can be found for any recursive function by introducing stacks for saving and restoring information before and after each recursive call.

Non-Linear Recursive Function

By no means all non-linear recursive functions can be transformed into repetitive form without introducing auxiliary protocol stacks. This can be seen from the following program scheme:

    T P(M m)
    {
        if cond { A; s = P(p); t = P(q); B; return C; }
        else { D; return E; }
    }

where the lists of statements A, B and D, and the statements return C and return E, contain no direct or indirect recursive calls of P. Paterson and Hewitt [19], and also Strong [24], showed in 1970 that this program scheme cannot (without further restrictions) be transformed into a repetitive form without introducing the protocol stack.

Functional Embedding

Embedding is a general principle long known in mathematics: if the original formulation of a problem does not lead to a solution straightforwardly, one tries to solve a more general problem that includes the original one as a special case.

Arithmetization of the Flow of Control

For special nested recursions it is possible to analyze the flow of control (that is, the order of application of the operations involved) and to find a repetitive form which yields the same flow. Certain "arithmetization" functions, which map all the relevant information about the flow and the parameter values one-to-one onto a closed interval of natural numbers, play a crucial role. Conversely, the necessary information can be recovered from the values of this interval.

Recursion Removal for One Principal Recursive Call

The simplest case of non-tail recursion is that in which each branch of the computation structure contains at most one principal recursive call. To handle this case, code is inserted at the beginning of the procedure or function which initializes a term stack; the stack will be used to hold the values of parameters and local variables. Each recursive call is then replaced by a set of instructions which do the following:

1. Each recursive call is replaced by a jump to the beginning of the body of the procedure or function.

2. A label is added to the statement immediately following the unconditional jump.

Finally, we need to precede the last end statement of the procedure or function by code which does the following:

1. Insert code for deciding whether the term stack is empty or not, and place a label finish before the code.

2. If the term stack is empty, we simply end the procedure or function.

3. If the stack is not empty, pop the value from the top of the term stack and jump to the beginning of the procedure or function.

4. A jump to the label finish is generated when the end of the procedure or function is reached.

    void P(void)
    {
        if E1 { A1; P; B1; }
        else if E2 { A2; P; B2; }
        ......
        else if En { An; P; Bn; }
        else { D }
    }

Once the transformation to iterative form has been accomplished, one can often simplify the program even further, thereby producing even more gains in efficiency.
    void P(void)
    {
      start:
        while E {
            if E1 { A1; PUSH(1); }
            else if E2 { A2; PUSH(2); }
            ......
            else if En { An; PUSH(n); }
        }
        D;
      finish:
        if (!IS_EMPTY_STACK) {
            POP(i);
            if (i == 1) { B1; goto finish; }
            else if (i == 2) { B2; goto finish; }
            ......
            else if (i == n) { Bn; goto finish; }
        }
    }

where a label is stored to distinguish the different computation branches inside the program.

Recursion Removal for Multiple Principal Recursive Calls

We now present an investigation of recursion removal for multiple principal recursive calls. A set of principal rules for this kind of transformation is given as follows:

1. At the beginning of the procedure or function, code is inserted which initializes a term stack and a label stack. The former will be used to hold the values of parameters and local variables. The latter will be used to save the return address for each recursive call.

2. The label start is attached to the first executable statement.

Each recursive call is then replaced by a set of instructions which do the following:

1. Create the i-th new label, i, and store i in the label stack. The value i of the label will be used as the return address.

2. Each recursive call is replaced by a jump to the beginning of the body of the procedure or function.

3. A label is added to the statement immediately following the unconditional jump.

These steps are sufficient to remove all recursive calls from the procedure or function. Finally, we need to precede the last end statement of the procedure or function by code which does the following:

1. Insert code for deciding whether the label stack is empty or not, and place a label finish before the code.

2. If the label stack is empty, we simply end the procedure or function.

3. If the stack is not empty, take the return label from the top of the label stack and execute a branch to this label.

4. A jump to the label finish is generated when the end of the procedure or function is reached.

By following these rules carefully one can take any recursive program and produce a program which works in exactly the same way, yet which uses only iteration to control the flow of the program. On many compilers the resulting program will be much more efficient than its recursive version.

    void P(void)
    {
        if E1 { A1; P; B1; P; C1; }
        else if E2 { A2; P; B2; P; C2; }
        ......
        else if En { An; P; Bn; P; Cn; }
        else { D }
    }

Once the transformation to iterative form has been accomplished, one can often simplify the program even further, thereby producing even more gains in efficiency.

    void P(void)
    {
      start:
        while E {
            if E1 { A1; PUSH(1, 1); }
            else if E2 { A2; PUSH(2, 1); }
            ......
            else if En { An; PUSH(n, 1); }
        }
        D;
      finish:
        if (!IS_EMPTY_STACK) {
            POP(i, j);
            if (i == 1) {
                if (j == 1) { B1; PUSH(1, 2); goto start; }
                else { C1; goto finish; }
            }
            else if (i == 2) {
                if (j == 1) { B2; PUSH(2, 2); goto start; }
                else { C2; goto finish; }
            }
            ......
            else if (i == n) {
                if (j == 1) { Bn; PUSH(n, 2); goto start; }
                else { Cn; goto finish; }
            }
        }
    }

In this general recursion-removal scheme, a two-dimensional stack for storing return addresses (labels) is required: the first component distinguishes the different computation branches in the program, and the second distinguishes the different principal calls in each branch. Fortunately, in the implementation of PowerEpsilon, we may use the term stack deployed for saving parameters and local variables to eliminate this redundancy. For example, the term pp1 in the procedure Subst can be used to distinguish the different branches (label i) in an IF-THEN-ELSE statement, and the subcomponent rank in pp1 can be used for storing the label j in each conditional branch, since the rank is never used in any compound term.

Recursive Form of Postorder

The recursive version of the function Postorder is defined as follows:

    // postorder recursive scan of the nodes in a tree.
    template <class T>
    void Postorder(TreeNode<T> *t, void visit(T& item))
    {
        // the recursive scan terminates on an empty subtree
        if (t != NULL) {
            Postorder(t->Left(), visit);    // descend left
            Postorder(t->Right(), visit);   // descend right
            visit(t->data);                 // visit the node
        }
    }

Non-Recursive Form of Postorder

By applying the recursion-removal rules described above, we obtain a non-recursive version of Postorder as follows:
    // postorder scan of the nodes in a tree.
    template <class T>
    void Postorder2(TreeNode<T> *t, void visit(T& item))
    {
        TreeNode<T> *current;
        Stack< TreeNode<T> * > S;
        Stack<int> LabS;
        int i;

        current = t;
        // the scan terminates on an empty subtree
      beginloop:
        while (t != NULL) {
            S.Push(t);
            t = t->Left();
            LabS.Push(1);
        }
      exitloop:
        if (!S.StackEmpty()) {
            i = LabS.Pop();
            if (i == 1) {
                t = S.Pop();
                S.Push(t);
                t = t->Right();
                LabS.Push(2);
                goto beginloop;
            }
            else {
                t = S.Pop();
                visit(t->data);   // visit the node
                goto exitloop;
            }
        }
    }
By reordering the structure of the code following the label exitloop, we obtain the following program:

    // postorder scan of the nodes in a tree.
    template <class T>
    void Postorder3(TreeNode<T> *t, void visit(T& item))
    {
        TreeNode<T> *current;
        Stack< TreeNode<T> * > S;
        Stack<int> LabS;
        int i;

        current = t;
        // the scan terminates on an empty subtree
      beginloop:
        while (t != NULL) {
            S.Push(t);
            t = t->Left();
            LabS.Push(1);
        }
        while (!S.StackEmpty() && (LabS.Top() == 2)) {
            i = LabS.Pop();
            t = S.Pop();
            visit(t->data);   // visit the node
        }
        if (!S.StackEmpty()) {
            i = LabS.Pop();
            t = S.Pop();
            S.Push(t);
            t = t->Right();
            LabS.Push(2);
            goto beginloop;
        }
    }

As we can see, the resulting program is no longer easy to read and understand, because of its complicated structure.
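The explicit-stack transformation can be checked against the recursive scan on a small tree. The following is a minimal, self-contained sketch (plain structs and std::stack rather than the book's TreeNode and Stack classes; the helper names are ours) that mirrors the two-phase scheme of Postorder3:

```cpp
#include <cassert>
#include <stack>
#include <string>

struct Node { char data; Node *left, *right; };

// recursive postorder: append each visited value to out
void PostRec(Node *t, std::string &out)
{
    if (t != nullptr) {
        PostRec(t->left, out);
        PostRec(t->right, out);
        out += t->data;
    }
}

// iterative postorder with an explicit node stack and a label stack:
// label 1 = left subtree in progress, label 2 = right subtree in progress
void PostIter(Node *t, std::string &out)
{
    std::stack<Node*> S;
    std::stack<int> lab;
    while (true) {
        while (t != nullptr) {                   // descend left, label 1
            S.push(t); lab.push(1); t = t->left;
        }
        while (!S.empty() && lab.top() == 2) {   // both subtrees done: visit
            lab.pop(); out += S.top()->data; S.pop();
        }
        if (S.empty()) break;
        lab.pop(); lab.push(2);                  // switch to the right subtree
        t = S.top()->right;
    }
}

// build a sample tree: A(B(D, E), C); both scans should give "DEBCA"
std::string RunPost(bool iterative)
{
    Node D{'D', nullptr, nullptr}, E{'E', nullptr, nullptr};
    Node B{'B', &D, &E}, C{'C', nullptr, nullptr}, A{'A', &B, &C};
    std::string out;
    if (iterative) PostIter(&A, out); else PostRec(&A, out);
    return out;
}
```

The label stack plays exactly the role of LabS in the text: it records whether a stacked node is still waiting for its right subtree or is ready to be visited.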
Procedures with Mutual Recursive Calls

The transformation described in the last section can easily be generalized to the case of mutually recursive procedures. A set of procedures with mutual recursive calls can be combined into one by the following steps:

• The bodies of the procedures are combined, with the principal procedure call placed at the beginning.

• Each recursive call is, as before, replaced by a jump to the corresponding part of the combined body.

• Tests t1, t2, ... are set up to distinguish between the cases: the end of the principal call, when execution is finished, or the end of a recursive call, when a jump is made to the instructions executed immediately after that recursive call.

The other transformation steps are almost the same as before.
Example

Consider two recursively defined procedures f and g of the form:
    void f(void)
    {
        if condf { Af; f; Bf; g; Cf; }
        else { Ef; }
    }

    void g(void)
    {
        if condg { Ag; g; Bg; f; Cg; }
        else { Eg; }
    }

Assume that f is the principal procedure call. A new procedure fg is defined with all recursive calls removed.

    void fg(void)
    {
        INIT_STACK;
      Af_start:
        if condf { Af; PUSH(11); goto Af_start; }
        Ef;
        goto Final;
      Ag_start:
        if condg { Ag; PUSH(21); goto Ag_start; }
        Eg;
        goto Final;
      Final:
        if (!IS_EMPTY_STACK) {
            POP(l);
            if (l == 11) { Bf; PUSH(12); goto Ag_start; }
            else if (l == 21) { Bg; PUSH(22); goto Af_start; }
            else if (l == 12) { Cf; goto Final; }
            else if (l == 22) { Cg; goto Final; }
        }
    }

Further transformation is carried out by replacing goto statements with while-loop constructs.

    void fg(void)
    {
        INIT_STACK;
      Af_start:
        while condf { Af; PUSH(11); }
        Ef;
        goto Final;
      Ag_start:
        while condg { Ag; PUSH(21); }
        Eg;
        goto Final;
      Final:
        if (!IS_EMPTY_STACK) {
            POP(l);
            if (l == 11) { Bf; PUSH(12); goto Ag_start; }
            else if (l == 21) { Bg; PUSH(22); goto Af_start; }
            else if (l == 12) { Cf; goto Final; }
            else if (l == 22) { Cg; goto Final; }
        }
    }

In this iterative procedure, the first two while loops produce the sequences of executions of Af and Ag corresponding to the initial parts of the executions of the recursive procedures. In the same way, the final two branches produce the sequences of executions of Cf and Cg corresponding to the final parts. It can then be written as:

    void fg(void)
    {
        INIT_STACK;
      Af_start:
        while condf { Af; PUSH(11); }
        Ef;
        goto Final;
      Ag_start:
        while condg { Ag; PUSH(21); }
        Eg;
        goto Final;
      Final:
        while ((!IS_EMPTY_STACK) && ((TOP == 12) || (TOP == 22))) {
            if (TOP == 12) { POP; Cf; }
            else { POP; Cg; }
        }
        if (!IS_EMPTY_STACK) {
            POP(l);
            if (l == 11) { Bf; PUSH(12); goto Ag_start; }
            else if (l == 21) { Bg; PUSH(22); goto Af_start; }
        }
    }
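As a tiny self-contained illustration (our own example, not from the text), the mutually recursive pair below can be merged into a single loop. Because the mutual calls here happen to be tail calls, no label stack is needed at all; the "which procedure am I in" flag does the work of the tests t1, t2 described above:

```cpp
#include <cassert>

// mutually recursive definitions
bool IsOdd(unsigned n);

bool IsEven(unsigned n) { return n == 0 ? true  : IsOdd(n - 1); }
bool IsOdd (unsigned n) { return n == 0 ? false : IsEven(n - 1); }

// combined iterative form: the two bodies are merged and each
// mutual (tail) call becomes a jump back to the top of the loop
bool IsEvenIter(unsigned n)
{
    bool inEven = true;          // which "procedure" we are currently in
    while (n != 0) {
        n -= 1;
        inEven = !inEven;        // IsEven calls IsOdd, IsOdd calls IsEven
    }
    return inEven;               // base case of whichever body we ended in
}
```

When the mutual calls are not tail calls, the label stack of the fg construction above becomes necessary, exactly as in the single-procedure case.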
8.3 Binary Search Trees
A general binary tree can hold a large collection of data and yet provide very fast access as we add, remove, or find an item. Building collection classes is one of the most important applications of a tree. We are familiar with the problems involved in building a general collection class from the SeqList class and its implementations with an array or with a linked list. The SeqList class features the Find method, which is implemented with a sequential search. For a linear structure, this algorithm is O(n), which is inefficient for a large collection. In general, tree structures provide significantly improved searching performance, since the path to any data value is no greater than the depth of the tree. Searching performance is maximized with a complete binary tree, O(log2 n). For instance, with a list of 10,000 elements, the expected number of comparisons to find an element using the sequential search is 5,000. The same search on a complete tree would require no more than 14 comparisons.

A binary tree offers great potential as an implementation structure for a list. To store elements in a tree for efficient access, we must design a search structure that identifies a path to an element. The structure, called a binary search tree, orders the elements by means of the relational operator "<".

        // copy the left branch of tree t and assign its root to newlptr
        if (t->left != NULL)
            newlptr = CopyTree(t->left);
        else
            newlptr = NULL;
        // copy the right branch of tree t and assign its root to newrptr
        if (t->right != NULL)
            newrptr = CopyTree(t->right);
        else
            newrptr = NULL;

        // allocate storage for the current root node and assign its data value
        // and pointers to its subtrees. return its pointer
        newNode = GetTreeNode(t->data, newlptr, newrptr);
        return newNode;
    }

    // delete the tree stored by the current object
    template <class T>
    void BinSTree<T>::DeleteTree(TreeNode<T> *t)
    {
        // if the current root node is not NULL, delete its left subtree,
        // its right subtree, and then the node itself
        if (t != NULL) {
            DeleteTree(t->left);
            DeleteTree(t->right);
            FreeTreeNode(t);
        }
    }

    // search for a data item in the tree. if found, return its node
    // address and a pointer to its parent; otherwise, return NULL
    template <class T>
    TreeNode<T> *BinSTree<T>::FindNode(const T& item,
                                       TreeNode<T>* & parent) const
    {
        // cycle t through the tree starting with root
        TreeNode<T> *t = root;

        // the parent of the root is NULL
        parent = NULL;

        // terminate on an empty subtree
        while (t != NULL) {
            // stop on a match
            if (item == t->data)
                break;
            else {
                // update the parent pointer and move right or left
                parent = t;
                if (item < t->data)
                    t = t->left;
                else
                    t = t->right;
            }
        }
        // return pointer to node; NULL if not found
        return t;
    }
Constructor, Destructor, and Assignment

The class contains a constructor that initializes the data members. A copy constructor and an overloaded assignment operator use the private CopyTree method to create a new binary search tree for the current object. The algorithms for CopyTree and DeleteTree are developed as for the TreeNode class. The overloaded assignment operator copies the right-hand-side object to the current object. After checking that the object is not being assigned to itself, the function clears the current tree and uses CopyTree to create a duplicate of the right-hand side (rhs). The pointer current is assigned the root pointer, the list size is copied, and a reference to the current object is returned.
    // constructor. initialize root, current to NULL, size to 0
    template <class T>
    BinSTree<T>::BinSTree(void): root(NULL), current(NULL), size(0)
    {}

    // copy constructor
    template <class T>
    BinSTree<T>::BinSTree(const BinSTree<T>& tree)
    {
        // copy tree to the current object. assign current and size
        root = CopyTree(tree.root);
        current = root;
        size = tree.size;
    }

    // destructor
    template <class T>
    BinSTree<T>::~BinSTree(void)
    {
        // just call ClearList
        ClearList();
    }

    // assignment operator
    template <class T>
    BinSTree<T>& BinSTree<T>::operator= (const BinSTree<T>& rhs)
    {
        // can't copy a tree to itself
        if (this == &rhs)
            return *this;

        // clear current tree. copy new tree into current object
        ClearList();
        root = CopyTree(rhs.root);

        // assign current to root and set the tree size
        current = root;
        size = rhs.size;

        // return reference to current object
        return *this;
    }
List Operations

The Find and Insert methods start at the root and traverse a unique path through the tree. Using the definition of a binary search tree, the algorithm traverses the right subtree when the key or new item is greater than or equal to the value of the current node. Otherwise, the algorithm traverses the left subtree.

The Find Operation

The Find operation uses the private member function FindNode, which takes a key and traverses a path down the tree. The operation returns a pointer to the matching node and a pointer to its parent. If the match occurs at the root, the parent pointer is NULL. With Find, we are only interested in assigning the current location to the matching node and assigning the data from the node to the reference parameter item. The Find method returns True (1) or False (0) to indicate whether the search was successful. Find requires that the relational operators "==" and "<" be defined for the data type T.

    // search for item in the tree. if found, assign the node data to item
    template <class T>
    int BinSTree<T>::Find(T& item)
    {
        TreeNode<T> *parent;

        // search the tree; assign the matching node to current
        current = FindNode(item, parent);

        // if item found, assign its data to item and return True
        if (current != NULL) {
            item = current->data;
            return 1;
        }
        else
            // item not found in the tree. return False
            return 0;
    }
The Insert Operation

The Insert method takes a new data item and searches the tree to add the item in the correct location. The function iteratively scans the path of left and right subtrees until it locates the insertion point. For each step in the path, the algorithm maintains a record of the current node (called t) and the parent of the current node (called parent). The process terminates when we identify an empty subtree (t == NULL), which indicates that we have found the location to add the new item. At this location, the new node is inserted as a child of the parent.
    // insert item into the search tree
    template <class T>
    void BinSTree<T>::Insert(const T& item)
    {
        // t is the current node in the traversal, parent the previous node
        TreeNode<T> *t = root, *parent = NULL, *newNode;

        // terminate on an empty subtree
        while (t != NULL) {
            // update the parent pointer. then go left or right
            parent = t;
            if (item < t->data)
                t = t->left;
            else
                t = t->right;
        }

        // create the new leaf node
        newNode = GetTreeNode(item, NULL, NULL);

        // if parent is NULL, insert as root node
        if (parent == NULL)
            root = newNode;
        // if item < parent->data, insert as left child
        else if (item < parent->data)
            parent->left = newNode;
        // if item >= parent->data, insert as right child
        else
            parent->right = newNode;

        // assign current as address of new node and increment size
        current = newNode;
        size++;
    }
The Delete Operation

The Delete operation removes a node with a given key from the tree. The deletion is performed by first using the utility method FindNode, which locates the node in the tree along with a pointer to its parent. If the item is not found in the list, the Delete operation returns immediately. Deleting a node from the tree requires a series of tests to determine how the children of the node are to be reattached to the tree. The subtrees have to be reconnected in such a way that the binary search tree structure is preserved.

The call to the FindNode function returns a pointer DNodePtr that identifies the node D that is to be deleted, and a second pointer PNodePtr that identifies the parent P of the deleted node. The Delete method sets out to find a replacement node R that will connect to the parent and thus take the place of the deleted node. The variable RNodePtr identifies the replacement node R.

The algorithm for finding a replacement must consider four cases that depend on the number of children attached to the node. Note that when the parent is NULL, the root is being deleted. This situation is covered by our cases, with the additional factor that the root must be updated. Since BinSTree is a friend of the TreeNode class, we have access to the private members left and right.

• Situation A: Node D has no children; it is a leaf node. Update the parent node to have an empty subtree. The update can be accomplished by setting RNodePtr to NULL. When we attach the NULL replacement node, the parent points to NULL.

• Situation B: Node D has a left child but no right child. Attach the left subtree of D to the parent. The update is accomplished by setting RNodePtr to the left child of D and then attaching node R to the parent.

• Situation C: Node D has a right child but no left child. Attach the right subtree of D to the parent. The update is accomplished by setting RNodePtr to the right child of D and then attaching node R to the parent.

• Situation D: Deleting a node with two children. A node with two children has elements in its subtrees that are less than, and greater than or equal to, its key value. The algorithm must select a replacement node that maintains the correct ordering among the items. We need a strategy to select a replacement node from the remaining pool of nodes, and the resulting tree must satisfy the binary search tree condition. We use a max-min principle: select as the replacement node R the rightmost node in the left subtree. This is the largest node whose data value is less than that of the node to be deleted. Unlink this node R from the tree, connect its left subtree to its parent, and then connect R in the place of the deleted node. A simple algorithm is used to find the rightmost node in the left subtree:

1. Since the replacement node R is less than the deleted node D, descend to the left subtree of D.

2. Since R is the largest of the nodes in the left subtree, locate it by descending down the path of right subtrees. During the descent, keep track of the predecessor node, which is called PofRNodePtr (parent of the replacement node). The descent down the path of right subtrees distinguishes two cases.
– If the right subtree is empty, the current location is the replacement node R, and PofRNodePtr is the deleted node D. To update, we attach the right subtree of D as the right subtree of R and attach the parent of the deleted node, P, to R.

– If the right subtree is not empty, the scan ends with a leaf node or a node with only a left subtree. In either case, unlink the node R from the tree and relink the children of R to the parent node PofRNodePtr. In each case, the right child of node PofRNodePtr is reset with the statement PofRNodePtr->right = RNodePtr->left.

(a) R is a leaf node. Unlink it from the tree. Since RNodePtr->left is NULL, the statement sets the right child of PofRNodePtr to NULL.

(b) R has a left subtree. The statement attaches this subtree as the right child of PofRNodePtr.

The algorithm finishes by substituting node R for the deleted node. First, attach the children of D as the children of R; R replaces D as the root of the subtree formed by D. Then complete the link to the parent node P.

    // if item is in the tree, delete it
    template <class T>
    void BinSTree<T>::Delete(const T& item)
    {
        // DNodePtr = pointer to node D that is deleted
        // PNodePtr = pointer to parent P of node D
        // RNodePtr = pointer to node R that replaces D
        TreeNode<T> *DNodePtr, *PNodePtr, *RNodePtr;

        // search for a node containing data value item. obtain its
        // node address and that of its parent
        if ((DNodePtr = FindNode(item, PNodePtr)) == NULL)
            return;

        // if D has a NULL pointer, the
        // replacement node is the one on the other branch
        if (DNodePtr->right == NULL)
            RNodePtr = DNodePtr->left;
        else if (DNodePtr->left == NULL)
            RNodePtr = DNodePtr->right;
        // both pointers of DNodePtr are non-NULL
        else {
            // find and unlink the replacement node for D.
            // starting on the left branch of node D,
            // find the node whose data value is the largest of all
            // nodes whose values are less than the value in D.
            // unlink the node from the tree.

            // PofRNodePtr = pointer to parent of replacement node
            TreeNode<T> *PofRNodePtr = DNodePtr;

            // first possible replacement is the left child of D
            RNodePtr = DNodePtr->left;

            // descend down the right subtree of the left child of D,
            // keeping a record of the current node and its parent.
            // when we stop, we have found the replacement
            while (RNodePtr->right != NULL) {
                PofRNodePtr = RNodePtr;
                RNodePtr = RNodePtr->right;
            }

            if (PofRNodePtr == DNodePtr)
                // left child of deleted node is the replacement.
                // assign right subtree of D to R
                RNodePtr->right = DNodePtr->right;
            else {
                // we moved at least one node down a right branch.
                // delete the replacement node from the tree by assigning
                // its left branch to its parent
                PofRNodePtr->right = RNodePtr->left;

                // put the replacement node in place of DNodePtr
                RNodePtr->left = DNodePtr->left;
                RNodePtr->right = DNodePtr->right;
            }
        }

        // complete the link to the parent node.
        // if deleting the root node, assign the new root
        if (PNodePtr == NULL)
            root = RNodePtr;
        // attach R to the correct branch of P
        else if (DNodePtr->data < PNodePtr->data)
            PNodePtr->left = RNodePtr;
        else
            PNodePtr->right = RNodePtr;

        // delete the node from memory and decrement list size
        FreeTreeNode(DNodePtr);
        size--;
    }
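The situation-D replacement choice (the rightmost node of the left subtree) can be checked by confirming that the inorder sequence stays sorted after a deletion. The sketch below is a stripped-down, free-function version of the same algorithm (our own helper names, raw int nodes instead of the templated class):

```cpp
#include <cassert>
#include <vector>

struct Node { int data; Node *left, *right; };

Node *Make(int d, Node *l = nullptr, Node *r = nullptr)
{
    return new Node{d, l, r};
}

// delete the node holding item, using the rightmost node of the
// left subtree as the replacement when both children are present
Node *Erase(Node *root, int item)
{
    Node *parent = nullptr, *D = root;
    while (D != nullptr && D->data != item) {   // locate D and its parent
        parent = D;
        D = (item < D->data) ? D->left : D->right;
    }
    if (D == nullptr) return root;              // item not found
    Node *R;
    if (D->right == nullptr)      R = D->left;  // situations A and B
    else if (D->left == nullptr)  R = D->right; // situation C
    else {                                      // situation D: two children
        Node *PofR = D;
        R = D->left;
        while (R->right != nullptr) { PofR = R; R = R->right; }
        if (PofR != D) {                        // unlink R, relink its left subtree
            PofR->right = R->left;
            R->left = D->left;
        }
        R->right = D->right;
    }
    if (parent == nullptr)        root = R;     // deleted the root
    else if (item < parent->data) parent->left = R;
    else                          parent->right = R;
    delete D;
    return root;
}

// inorder collect: for a binary search tree this must come out sorted
void Inorder(Node *t, std::vector<int> &out)
{
    if (t != nullptr) {
        Inorder(t->left, out);
        out.push_back(t->data);
        Inorder(t->right, out);
    }
}

// delete a two-child node and confirm the inorder order is preserved
bool DemoOk()
{
    Node *root = Make(50,
                      Make(30, Make(20), Make(40, Make(35), Make(45))),
                      Make(70));
    root = Erase(root, 30);                     // 30 has two children
    std::vector<int> out;
    Inorder(root, out);
    return out == std::vector<int>{20, 35, 40, 45, 50, 70};
}
```

Here deleting 30 promotes 20 (the rightmost node of 30's left subtree) into its place, and the inorder walk stays sorted, which is exactly the invariant the four situations are designed to protect.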
The Clear Operation

The ClearList method deletes all the nodes in the tree, sets root and current to NULL, and sets size to 0.

    // delete all the nodes in the tree. current now NULL and size is 0
    template <class T>
    void BinSTree<T>::ClearList(void)
    {
        DeleteTree(root);
        root = NULL;
        current = NULL;
        size = 0;
    }
Tree Update Methods

After a Find, the user may wish to update data fields in the current node. For this purpose, we provide an Update method that takes a data value as a parameter. If a current node is defined, Update compares the current node's value with the data value and, if they are equal, performs an update of the node. If the current node is undefined or the data items do not match, the new data value is inserted into the tree.

    // if current node is defined and its data value matches item,
    // assign node value to item; otherwise, insert item in tree
    template <class T>
    void BinSTree<T>::Update(const T& item)
    {
        if (current != NULL && current->data == item)
            current->data = item;
        else
            Insert(item);
    }
Other Utilities

Other utilities include a Boolean function ListEmpty, for checking whether a binary search tree is empty or not, and a function ListSize, for measuring the size of a given binary search tree.

    // indicate whether the tree is empty
    template <class T>
    int BinSTree<T>::ListEmpty(void) const
    {
        return root == NULL;
    }
    // return the number of data items in the tree
    template <class T>
    int BinSTree<T>::ListSize(void) const
    {
        return size;
    }
The method GetRoot gives us access to the tree root.

    // return the address of the root node
    template <class T>
    TreeNode<T> *BinSTree<T>::GetRoot(void) const
    {
        return root;
    }
8.4 Array-Based Binary Trees
In the previous sections, we used tree nodes to build binary trees. Each entry has a data field, and left and right pointer fields that identify the left and right subtrees of the node. An empty tree is represented by a NULL pointer. Insertions and deletions are done by dynamically allocating nodes and assigning pointer fields. This representation handles trees ranging from degenerate to complete trees. In this section, we introduce a sequential (array) representation of a tree that uses an array entry to store the data and indices to identify the nodes. We derive a very powerful relationship between an array and a complete binary tree, a relationship that finds applications with heaps and priority queues.

Recall that a complete binary tree of depth n contains all possible nodes through level n-1, and nodes at level n are placed left to right with no gaps. An array A is a sequential list whose items can represent the nodes of a complete binary tree, with root A[0]; first-level children A[1] and A[2]; second-level children A[3], A[4], A[5], and A[6]; and so forth. The root has index 0, and the remaining nodes are assigned indices in level order.

Although arrays find a natural tree representation, there is no straightforward representation of a general binary tree as an array. The problem lies with possible missing nodes that must correspond to unused array elements. A degenerate tree with only right subtrees would be even less efficient. The power of array-based trees becomes manifest when direct access to node data is required. There are simple index calculations that identify the children and the parent of a node. For each node A[i] in an N-element array, the indices of the child nodes and of the parent are computed by the following formulas:
Item A[i]    Left child index is     2*i+1    (undefined if 2*i+1 ≥ N)
Item A[i]    Right child index is    2*i+2    (undefined if 2*i+2 ≥ N)
Item A[i]    Parent index is         (i-1)/2  (undefined if i = 0)

8.4.1 Application: The Tournament Sort
Description Binary trees find important applications as decision trees in which each node represent a branch with two possible outcomes. One such application uses the tree to maintain a record of players competing in a single-elimination tournament. Each non-leaf node corresponds to the winner of a contest between two players. The leaf nodes give a list of the original players and their pairing in the tournament. A tournament tree can be used to sort a list of N items. In the process we develop an efficient algorithm that makes full use of an array-based tree. Set up an array-based tree to hold N items as leaf nodes in the bottom row of the tree. The elements are stored at level k where 2k ≥ N. Assuming that we are sorting the list in ascending order, we compare each pair of elements and store the smaller (winner) in the parent node. The process continues until we have the smallest element (winner) in the root node. Tournament Sort Algorithm To implement the tournament sort, we define a DataNode type and create an array-based tree of its objects. The data type has members that hold a data item, the location of the item in the bottom row of the tree, and a flag that indicates if the item is still active in the tournament. The operator “= n int bottomRowSize; // number of nodes in the complete tree // whose last row has bottomRowSize entries int treesize; // starting index of the bottom row int loadindex; int i, j; // call PowerOfTwo to determine the needed size of // the bottom row of the tree. bottomRowSize = PowerOfTwo(n);
8.4. ARRAY-BASED BINARY TREES
// compute the size of the tree and dynamically // allocate its nodes treesize = 2 * bottomRowSize - 1; loadindex = bottomRowSize-1; tree = new DataNode[treesize]; // copy the array a to the tree of DataNode objects j = 0; for (i = loadindex; i < treesize; i++) { item.index = i; if (j < n) { item.active = 1; // set flag to active item.data = a[j++]; } else item.active = 0; tree[i] = item; // assign object to tree } // make initial set of comparisons to find smallest item i = loadindex; while (i > 0) { j = i; while (j < 2*i) { // process pairs of competitors // have a match. compare value tree[j] with its // competitor tree[j+1]. assign the winner to // the parent position if (!tree[j+1].active || tree[j] < tree[j+1]) tree[(j-1)/2] = tree[j]; else tree[(j-1)/2] = tree[j + 1]; j += 2; // go to next pair of competitors } // move up to the next level for competition among // the winners of the previous matches i = (i-1)/2; } // handle other n-1 elements. copy winner from the root // to the array. make the winner inactive. update tree // by allowing the winner’s competitor to re-enter the // tournamemt for (i = 0; i < n-1; i++) { a[i] = tree[0].data; tree[tree[0].index].active = 0; UpdateTree(tree,tree[0].index);
Nonlinear Structures
    }

    // copy the largest value to the array
    a[n-1] = tree[0].data;
}
8.4.2 Tree Iterators
We have observed the power of iterators for the traversal of linear structures such as arrays and sequential lists. The scanning of nodes in a tree is more difficult, since a tree is a nonlinear structure and there is no single traversal order. The TreeNode utilities provide preorder, inorder and postorder algorithms, which allow us to scan a tree recursively and execute a function at each node. The problem with each of these traversal methods is that there is no escape from the recursive process until it completes. We cannot stop the scan, examine the contents of a node, perform various operations on the data, and then continue the scan at another node of the tree. By using an iterator, a client has a tool to scan the nodes of a tree as if they form a linear list, without the burdensome details of the underlying scanning algorithm.

The Inorder Iterator

We have developed the abstract Iterator class to establish a set of basic list traversal methods. The Iterator class gives a common format for traversal methods independent of the implementation details in a derived class. In this section, we use this base class to derive an inorder binary tree iterator. The inorder scan, when applied to a binary search tree, visits the data in increasing order and is a useful tool. Preorder, level order, and postorder iterators can be defined in a similar way.

Specification of Inorder Iterator

InorderIterator follows the general iterator pattern. The method EndOfList is implemented in the base class Iterator. The constructor initializes the base class and locates the first inorder node using GoFarLeft.

#include "treenode.h"
#include "iterator.h"
#include "stack.h"

// inorder iterator for a binary tree; uses the Iterator base class
template <class T>
class InorderIterator : public Iterator<T>
{
    private:
        // we maintain a stack of TreeNode addresses
        Stack< TreeNode<T> * > S;
        // tree root and current node
        TreeNode<T> *root, *current;

        // traverse a left path. used by Next
        TreeNode<T> *GoFarLeft(TreeNode<T> *t);

    public:
        // constructor
        InorderIterator(TreeNode<T> *tree);

        // implementation of basic traversal operations
        virtual void Next(void);
        virtual void Reset(void);
        virtual T& Data(void);

        // assign a new tree list to the iterator
        void SetTree(TreeNode<T> *tree);
};
Implementation of Inorder Iterator

An iterator inorder traversal emulates the recursive scan by using a stack to hold node addresses. Start at the root node and traverse the chain of left subtrees, placing on a stack the pointers to each of the nodes in the chain. The process stops at a node with a NULL left pointer. This becomes the first node that is visited in the inorder scan. The method GoFarLeft starts at node address t and stacks the addresses of all nodes in the path until a NULL pointer is found. Calling GoFarLeft with t == root locates the first node to be visited.

// return the address of the last node on the left branch from t,
// stacking node addresses as we go
template <class T>
TreeNode<T> *InorderIterator<T>::GoFarLeft(TreeNode<T> *t)
{
    // if t is NULL, return NULL
    if (t == NULL)
        return NULL;

    // go as far left in the tree t as possible, stacking each
    // node address on S until a node is found with a NULL
    // left pointer. return a pointer to that node
    while (t->Left() != NULL)
    {
        S.Push(t);
        t = t->Left();
    }
    return t;
}
After initializing the base class, the constructor sets the data member root to the root of the binary search tree. The first node in the inorder traversal is obtained by calling GoFarLeft with root as its parameter. The return value is assigned to the TreeNode pointer current.

// initialize iterationComplete. the base class sets it to 0, but
// the tree may be empty. the first node in the traversal is the
// farthest node left
template <class T>
InorderIterator<T>::InorderIterator(TreeNode<T> *tree):
    Iterator<T>(), root(tree)
{
    iterationComplete = (root == NULL);
    current = GoFarLeft(root);
}
Before the first call to Next, current already points to the first node in an inorder scan. Next implements the inorder traversal from node to node.

template <class T>
void InorderIterator<T>::Next(void)
{
    // error if we have already visited all the nodes
    if (iterationComplete)
        return;

    // we have visited node current.
    // if it has a right subtree, move right and go far left,
    // stacking up nodes on the left subtree
    if (current->Right() != NULL)
        current = GoFarLeft(current->Right());
    // no right subtree, but there are stacked nodes we must
    // process. pop the stack into current
    else if (!S.StackEmpty())
        current = S.Pop();          // move up the tree
    // no right branch of the current node and no stacked nodes.
    // the traversal is complete
    else
        iterationComplete = 1;
}
The Reset method is essentially the same as the constructor, except that it first clears the stack.

// reset the traversal to the first tree node
template <class T>
void InorderIterator<T>::Reset(void)
{
    // clear out the stack of node addresses
    S.ClearStack();

    // reassign iterationComplete and assign current the address
    // of the first node in the inorder scan
    iterationComplete = (root == NULL);
    current = GoFarLeft(root);      // go back to 1st node inorder
}
Once a node is located, the method Data is used for retrieving the information stored in the node.

// return the current data value
template <class T>
T& InorderIterator<T>::Data(void)
{
    // error if the tree is empty or the traversal is complete
    if (root == NULL || iterationComplete)
    {
        cerr << "Data: invalid reference" << endl;
        exit(1);
    }
    return current->data;
}
The method SetTree can be used to perform the whole initialization process for a new tree.

template <class T>
void InorderIterator<T>::SetTree(TreeNode<T> *tree)
{
    // clear out the stack of node addresses
    S.ClearStack();

    // assign the new tree root. initialize iterationComplete and
    // assign current the address of the first node in the scan
    root = tree;
    iterationComplete = (root == NULL);
    current = GoFarLeft(root);      // go to 1st node inorder
}
Application: Tree Sort

#include "bstree.h"
#include "treeiter.h"

// use a binary search tree to sort an array
template <class T>
void TreeSort(T arr[], int n)
{
    // binary search tree in which the array data is placed
    BinSTree<T> sortTree;
    int i;

    // insert each array element into the search tree
    for (i = 0; i < n; i++)
        sortTree.Insert(arr[i]);

    // declare an inorder iterator for sortTree
    InorderIterator<T> treeSortIter(sortTree.GetRoot());

    // traverse the tree inorder. assign each element back to arr
    i = 0;
    while (!treeSortIter.EndOfList())
    {
        arr[i++] = treeSortIter.Data();
        treeSortIter.Next();
    }
}
8.5 Heap
Array-based trees find an important application with heaps, which are complete binary trees with an ordering relationship between each parent and its children. For a maximum heap, the value of a parent is greater than or equal to the value of each of its children. For a minimum heap, the value of the parent is less than or equal to the value of each of its children. This book develops minimum heaps.
8.5.1 The Heap as a List
A heap is a list with an ordering that stores a set of data in a complete binary tree. The ordering, called heap order, states that each node in the heap has a value that is less than or equal to the values of its children. With this ordering, the root has the smallest data value. As an abstract list structure, a heap allows us to add and delete items. The insertion process does not assume that the new item occupies a specific location but only requires that heap order is maintained. A deletion, however, removes the smallest item (the root) from the list. A heap is used in those applications where the client wants direct access to the minimum element. As a list, the heap does not provide a Find operation, and direct access to a list element is read-only. All heap handling algorithms are responsible for updating the tree and maintaining heap order. A heap is a very efficient list management structure that takes full advantage of its complete binary tree structure. For each insertion and deletion, the heap restores its order by scanning only a short path from the root down to the end of the tree. A heap finds important applications in the implementation of a priority queue and in sorting a list of elements. Rather than using a slower sorting
algorithm, we can insert the elements into a heap and order them by repeatedly deleting the root node. This gives rise to the extremely fast heap sort. We discuss the internal organization of a heap in our development of the heap class. The algorithms to add and remove items are presented in the implementation of the Insert and Delete methods.
8.5.2 Specification of Heaps
Like any linear or nonlinear list, a heap has operations that insert and delete data items and that return status information, such as the size of the list. These methods and the array that holds the array-based binary tree are encapsulated in the class Heap.

template <class T>
class Heap
{
    private:
        // hlist points at the array, which can be allocated by
        // the constructor (inArray == 0) or passed as a
        // parameter (inArray == 1)
        T *hlist;
        int inArray;

        // maximum number allowed and current size of the heap
        int maxheapsize;
        int heapsize;       // identifies the end of the list

        // error message utility function
        void error(char errmsg[]);

        // utility functions for Delete/Insert to restore the heap
        void FilterDown(int i);
        void FilterUp(int i);

    public:
        // constructors/destructor
        Heap(int maxsize);              // create empty heap
        Heap(T arr[], int n);           // "heapify" arr
        Heap(const Heap<T>& H);         // copy constructor
        ~Heap(void);                    // destructor

        // overloaded operators: "=", "[]", "T*"
        Heap<T>& operator= (const Heap<T>& rhs);
        const T& operator[] (int i);

        // list methods
        int ListSize(void) const;
        int ListEmpty(void) const;
        int ListFull(void) const;

        void Insert(const T& item);
        T Delete(void);
        void ClearList(void);
};
• The first constructor takes a size parameter and uses it to dynamically allocate memory for the array. The resulting heap is initially empty, and new elements are added with the Insert method. A destructor, copy constructor, and assignment operator support the use of dynamic memory. A second constructor takes an array as a parameter and binds the heap to the array. The constructor orders the array as a heap. In this way, a client can impose a heap structure on an existing array and take advantage of heap properties.

• The overloaded index operator "[]" allows a client to access a heap object as an array. Since the operator returns a constant reference, access is read-only.

• The methods ListEmpty, ListSize, and ListFull return information on the current size of the heap.

• The Delete method always removes the first (smallest) item in the heap. Insert places an element in the list and maintains heap order.
8.5.3 Implementation of Heaps
We provide a thorough discussion of the heap insertion and deletion operations and the utility methods FilterUp and FilterDown. The utility methods are responsible for updating the heap when it is created or modified.

Error Handling

This operation has a null body; a detailed implementation is left to the user.

// error message utility; the null body leaves the response
// to an error up to the user
template <class T>
void Heap<T>::error(char errmsg[])
{}
Constructors, Destructor and Assignment Operator

The first constructor takes a size parameter and uses it to dynamically allocate memory for the array. The resulting heap is initially empty.

template <class T>
Heap<T>::Heap(int size)
{
    if (size ListEmpty(); }

template <class T>
int PQueue<T>::PQLength(void) const
{
    return ptrHeap->ListSize();
}

template <class T>
void PQueue<T>::ClearPQ(void)
{
    ptrHeap->ClearList();
}
8.7 AVL Trees
Binary search trees are designed for rapid access to data. Ideally, the tree is reasonably balanced and has height approximately O(log2 n). With some data, however, a binary search tree can be degenerate. In that case, the height is
O(n) and access to the data is significantly slowed. In this section, we develop a modified tree class that gives us the power of binary search trees without the worst-case behavior. We develop AVL trees, in which every node is height-balanced. By this we mean that for each node in an AVL tree, the difference in height of its two subtrees is at most 1. In the AVLTree class, new versions of the insert and delete methods ensure that the nodes remain height-balanced. In this section, we begin by defining the AVLTreeNode class and then use these objects to design an AVLTree class. Our focus is on the AVL tree methods Insert and Delete. The algorithms for these methods require careful design to ensure that each node in the new tree remains height-balanced.
8.7.1 Specification of AVLTreeNode Class
AVL trees have a representation that is similar to binary search trees. The operations are identical except for Insert and Delete, which must constantly monitor the relative heights of the left and right subtrees of a node. To maintain this information, we extend our definition of a TreeNode object to include a balanceFactor field. The value of the field is the difference between the heights of the right and left subtrees:

balanceFactor = height(right subtree) - height(left subtree)

If balanceFactor is negative, the node is "heavy on the left", since the height of the left subtree is greater than the height of the right subtree. With a positive balanceFactor, the node is "heavy on the right". A height-balanced node has a balanceFactor of 0. In an AVL tree, balanceFactor must fall in the range -1 to 1.

By using inheritance, we can derive the AVLTreeNode class from the TreeNode base class. An AVLTreeNode object inherits the fields of a TreeNode and appends the additional balanceFactor field. The two data members left and right in TreeNode are protected, so that AVLTreeNode and other derived classes have direct access to them.

template <class T>
class AVLTree;

// inherits the TreeNode class
template <class T>
class AVLTreeNode : public TreeNode<T>
{
    private:
        // additional data member needed by AVLTreeNode
        int balanceFactor;

        // used by AVLTree class methods to allow assignment
        // to a TreeNode pointer without casting
        AVLTreeNode<T>* & Left(void);
        AVLTreeNode<T>* & Right(void);

    public:
        // constructor
        AVLTreeNode(const T& item, AVLTreeNode<T> *lptr = NULL,
                    AVLTreeNode<T> *rptr = NULL, int balfac = 0);

        // methods that return left/right TreeNode pointers as
        // AVLTreeNode pointers; handles casting for the client
        AVLTreeNode<T>* Left(void) const;
        AVLTreeNode<T>* Right(void) const;

        // method to access the new data field
        int GetBalanceFactor(void);

        // AVLTree methods need access to left and right
        friend class AVLTree<T>;
};
• The data member balanceFactor is private, since only the AVL insert and delete operations should update its value.

• In the constructor, the parameters include data for the underlying TreeNode structure along with the default parameter balfac = 0.

• The client may access the pointer fields using Left and Right. A new definition of these methods is necessary since the return value is a pointer to the larger AVLTreeNode structure.

• We derive the AVLTree class from the BinSTree class and, in doing so, reuse the base class destructor and ClearList. These base methods destroy nodes by executing the delete operator. In each case, the pointer refers to an AVLTreeNode object, not a TreeNode object. Because the destructor in the node base class TreeNode is virtual, dynamic binding is used when delete is called, and it correctly destroys an AVLTreeNode object.
8.7.2 Implementation of AVLTreeNode Class
The constructor for the AVLTreeNode class calls the base class constructor and initializes balanceFactor.

// constructor; initialize balanceFactor and the base class.
// the default pointer value NULL initializes the node as a leaf
template <class T>
AVLTreeNode<T>::AVLTreeNode(const T& item, AVLTreeNode<T> *lptr,
                            AVLTreeNode<T> *rptr, int balfac):
    TreeNode<T>(item, lptr, rptr), balanceFactor(balfac)
{}
The methods Left and Right in the AVLTreeNode class simplify client access to the pointer fields. An attempt to access the left child with the base class method Left returns a pointer to a TreeNode. A type conversion would be needed to return a pointer to the larger node structure. Rather than forcing the repeated type conversion of pointers, we define methods Left and Right for the AVLTreeNode class that return AVLTreeNode pointer values.

// return a reference to left after casting it to an
// AVLTreeNode pointer. used to change left
template <class T>
AVLTreeNode<T>* & AVLTreeNode<T>::Left(void)
{
    return (AVLTreeNode<T> *&)left;
}

// return a reference to right after casting it to an
// AVLTreeNode pointer. used to change right
template <class T>
AVLTreeNode<T>* & AVLTreeNode<T>::Right(void)
{
    return (AVLTreeNode<T> *&)right;
}

// return left after casting it to an AVLTreeNode pointer
template <class T>
AVLTreeNode<T>* AVLTreeNode<T>::Left(void) const
{
    return (AVLTreeNode<T> *)left;
}

// return right after casting it to an AVLTreeNode pointer
template <class T>
AVLTreeNode<T>* AVLTreeNode<T>::Right(void) const
{
    return (AVLTreeNode<T> *)right;
}
Finally, we need the function GetBalanceFactor to obtain the private member balanceFactor.

template <class T>
int AVLTreeNode<T>::GetBalanceFactor(void)
{
    return balanceFactor;
}
8.7.3 Specification of AVLTree Class
An AVL tree provides a list structure that is similar to a binary search tree, with the added condition that the tree remains height-balanced after each insertion and deletion. Since an AVL tree is an extended binary search tree, we use inheritance to derive an AVLTree class from the BinSTree class. The Insert and Delete methods must be overridden to meet the AVL conditions. In addition, we define the copy constructor and the overloaded assignment operator in the derived class, since we are building trees with the larger node structure.

// constants to indicate the balance factor of a node
const int leftheavy = -1;
const int balanced = 0;
const int rightheavy = 1;

// derived search tree class
template <class T>
class AVLTree : public BinSTree<T>
{
    private:
        // memory allocation
        AVLTreeNode<T> *GetAVLTreeNode(const T& item,
                AVLTreeNode<T> *lptr, AVLTreeNode<T> *rptr);

        // used by the copy constructor and assignment operator
        AVLTreeNode<T> *CopyTree(AVLTreeNode<T> *t);

        // used by the Insert and Delete methods to re-establish
        // the AVL conditions after a node is added to or deleted
        // from a subtree
        void SingleRotateLeft(AVLTreeNode<T>* &p);
        void SingleRotateRight(AVLTreeNode<T>* &p);
        void DoubleRotateLeft(AVLTreeNode<T>* &p);
        void DoubleRotateRight(AVLTreeNode<T>* &p);
        void UpdateLeftTree(AVLTreeNode<T>* &tree,
                int &reviseBalanceFactor);
        void UpdateRightTree(AVLTreeNode<T>* &tree,
                int &reviseBalanceFactor);

        // class specific versions of the general Insert and
        // Delete methods
        void AVLInsert(AVLTreeNode<T>* &tree,
                AVLTreeNode<T>* newNode, int &reviseBalanceFactor);
        void AVLDelete(AVLTreeNode<T>* &tree,
                AVLTreeNode<T>* newNode, int &reviseBalanceFactor);

    public:
        // constructors, destructor
        AVLTree(void);
        AVLTree(const AVLTree<T>& tree);

        // assignment operator
        AVLTree<T>& operator= (const AVLTree<T>& tree);

        // standard list handling methods
        virtual void Insert(const T& item);
        virtual void Delete(const T& item);
};
• The constants leftheavy, balanced, and rightheavy are used by the insertion and deletion algorithms to describe the balance factor of a node.

• The method GetAVLTreeNode handles node allocation for the class. By default, the balanceFactor of a new node is set to 0.

• This class defines a new CopyTree function for use with the copy constructor and overloaded assignment operator. Although the algorithm is identical to CopyTree in BinSTree, the function correctly creates the larger AVLTreeNode objects as it builds the new tree.

• The functions AVLInsert and AVLDelete implement the Insert and Delete methods, respectively. The private methods such as SingleRotateLeft are used in the implementation of AVLInsert and AVLDelete. We declare the public Insert and Delete methods as virtual functions that override those in the BinSTree base class. Except for these tree modification methods, we inherit all of the other search tree operations from BinSTree.
8.7.4 Implementation of AVLTree Class
Constructor, Destructor and Assignment Operator

The constructors and assignment operator of AVLTree are defined as follows:
template <class T>
AVLTree<T>::AVLTree(void): BinSTree<T>()
{}

template <class T>
AVLTree<T>::AVLTree(const AVLTree<T>& tree)
{
    root = (TreeNode<T> *)CopyTree((AVLTreeNode<T> *)tree.root);
    current = root;
    size = tree.size;
}

template <class T>
AVLTree<T>& AVLTree<T>::operator= (const AVLTree<T>& tree)
{
    ClearList();
    root = (TreeNode<T> *)CopyTree((AVLTreeNode<T> *)tree.root);
    current = root;
    size = tree.size;
    return *this;
}

// Delete has an empty body in this version; finding the node
// with value item and deleting it is left unimplemented
template <class T>
void AVLTree<T>::Delete(const T& item)
{}
Memory Allocation for AVLTree

The AVLTree class is derived from BinSTree and inherits most of its operations. We provide separate memory allocation and copy methods, since we must create the larger AVLTreeNode objects, as illustrated by GetAVLTreeNode.

// allocate an AVLTreeNode; return NULL on a memory
// allocation error
template <class T>
AVLTreeNode<T> *AVLTree<T>::GetAVLTreeNode(const T& item,
        AVLTreeNode<T> *lptr, AVLTreeNode<T> *rptr)
{
    AVLTreeNode<T> *p;

    p = new AVLTreeNode<T> (item, lptr, rptr);
    if (p == NULL)
        return NULL;
    return p;
}

template <class T>
AVLTreeNode<T> *AVLTree<T>::CopyTree(AVLTreeNode<T> *t)
{
    AVLTreeNode<T> *newlptr, *newrptr, *newNode;

    if (t == NULL)
        return NULL;

    if (t->Left() != NULL)
        newlptr = CopyTree(t->Left());
    else
        newlptr = NULL;

    if (t->Right() != NULL)
        newrptr = CopyTree(t->Right());
    else
        newrptr = NULL;

    newNode = GetAVLTreeNode(t->data, newlptr, newrptr);
    return newNode;
}
Base class methods are sufficient to remove the larger AVLTreeNode objects. The DeleteTree method from the BinSTree class makes use of the virtual destructor in the TreeNode class.

AVLTree Insert Methods

The power of an AVL tree is its ability to create and maintain height-balanced trees. This power is made possible by the AVL insertion and deletion algorithms. In our AVLTree class, we describe an Insert method that overrides the same operation in the BinSTree base class. The actual implementation of Insert uses the recursive method AVLInsert to store the new element. We first give the C++ code for Insert and then focus on the recursive method AVLInsert.

// insert a new node using the basic List operation and format
template <class T>
void AVLTree<T>::Insert(const T& item)
{
    // declare AVL tree node pointers. using base class method
    // GetRoot, cast to the larger node and assign the root pointer
    AVLTreeNode<T> *treeRoot = (AVLTreeNode<T> *)GetRoot(),
                   *newNode;

    // flag used by AVLInsert to rebalance nodes
    int reviseBalanceFactor = 0;
    // get a new AVL tree node with empty pointer fields
    newNode = GetAVLTreeNode(item, NULL, NULL);

    // call the recursive routine to actually insert the element
    AVLInsert(treeRoot, newNode, reviseBalanceFactor);

    // assign new values to the data members root, size and
    // current in the base class
    root = treeRoot;
    current = newNode;
    size++;
}
The heart of the insertion algorithm is the recursive method AVLInsert. Like its counterpart in BinSTree, it traverses the left subtree if item < node value and the right subtree if item ≥ node value. This private function has a parameter tree, which maintains a record of the current node in the scan, the new node to insert into the tree, and a flag called reviseBalanceFactor. As we scan the left or right subtree of a node, the flag notifies us if any balanceFactor in the subtree has been changed. If so, we must check that AVL height balance is preserved. If the insertion of the new node disrupts the equilibrium of the tree and distorts a balance factor, we must re-establish the AVL equilibrium.

AVL Insert Algorithm

The insert process resembles that used by the binary search tree. We recursively scan down a path of left and right children until an empty subtree is identified and then tentatively insert the new node at that location. During the process, we visit each node in the search path from the root to the new entry. Since the process is recursive, we have access to the nodes in reverse order and can update the balance factor of a parent node after learning the effect of adding the new item in one of its subtrees.

At each node in the search path, we determine if an update is necessary. We are confronted with three possible situations. In two cases, the node maintains AVL balance and no restructuring of subtrees is necessary; only the balanceFactor of the node must be updated. The third case unbalances the tree and requires us to perform a single or double rotation of nodes to rebalance the tree.

Case 1: A node on the search path is initially balanced (balanceFactor = 0). After adding a new item in a subtree, the node becomes heavy on the left or the right, depending on which of its subtrees stores the new item. We update balanceFactor to -1 if the item is stored in the left subtree and to 1 if it is stored in the right subtree.
Case 2: A node on the path is weighted to the left or right subtree and the new item is stored in the other (lighter) subtree. The node then becomes balanced.
Case 3: A node on the path is weighted to the left or right subtree and the new item is positioned in the same (heavier) subtree. The resulting node violates the AVL condition, since balanceFactor is outside the range -1 ... 1. The algorithm directs us to rotate nodes to restore height balance.

template <class T>
void AVLTree<T>::AVLInsert(AVLTreeNode<T>* &tree,
        AVLTreeNode<T>* newNode, int& reviseBalanceFactor)
{
    // flag indicating that a change of the node's balanceFactor
    // will occur
    int rebalanceCurrNode;

    // the scan reaches an empty tree; time to insert the new node
    if (tree == NULL)
    {
        // update the parent to point at newNode
        tree = newNode;
        // assign balanceFactor = 0 to the new node
        tree->balanceFactor = balanced;
        // broadcast the message; a balanceFactor value is modified
        reviseBalanceFactor = 1;
    }
    // recursively move left if new data < current data
    else if (newNode->data < tree->data)
    {
        AVLInsert(tree->Left(), newNode, rebalanceCurrNode);
        // check if balanceFactor must be updated
        if (rebalanceCurrNode)
        {
            // case 3: went left from a node that is already heavy
            // on the left. violates the AVL condition; rotate
            if (tree->balanceFactor == leftheavy)
                UpdateLeftTree(tree, reviseBalanceFactor);
            // case 1: moving left from a balanced node. resulting
            // node will be heavy on the left
            else if (tree->balanceFactor == balanced)
            {
                tree->balanceFactor = leftheavy;
                reviseBalanceFactor = 1;
            }
            // case 2: scanning left from a node heavy on the
            // right. the node will be balanced
            else
            {
                tree->balanceFactor = balanced;
                reviseBalanceFactor = 0;
            }
        }
        else
            // no balancing occurs; do not inform previous nodes
            reviseBalanceFactor = 0;
    }
    // otherwise recursively move right
    else
    {
        AVLInsert(tree->Right(), newNode, rebalanceCurrNode);
        // check if balanceFactor must be updated
        if (rebalanceCurrNode)
        {
            // case 2: node becomes balanced
            if (tree->balanceFactor == leftheavy)
            {
                // scanning the right subtree. node heavy on the
                // left; the node will become balanced
                tree->balanceFactor = balanced;
                reviseBalanceFactor = 0;
            }
            // case 1: node is initially balanced
            else if (tree->balanceFactor == balanced)
            {
                // node is balanced; will become heavy on the right
                tree->balanceFactor = rightheavy;
                reviseBalanceFactor = 1;
            }
            else
                // case 3: scanning right from a node already heavy
                // on the right. this violates the AVL condition
                // and rotations are needed
                UpdateRightTree(tree, reviseBalanceFactor);
        }
        else
            reviseBalanceFactor = 0;
    }
}
AVLInsert identifies the Case 3 situations that cause a node to violate the AVL condition. The insert uses the methods UpdateLeftTree and UpdateRightTree to carry out the rebalancing. These private functions select the appropriate single or double rotation to balance a node and then set the flag reviseBalanceFactor to 0 (False) to notify the parent that the subtree is balanced. We give the code before describing specific details of the rotations.

Rotations

Rotations are necessary when the parent node P becomes unbalanced. A single right rotation occurs when both the parent node (P) and the left child (LC) become heavy on the left after inserting the node at position X. We rotate
the nodes so that LC replaces the parent, which becomes a right child. In the process, we take the nodes in the right subtree of LC (ST) and attach them as the left subtree of P. This maintains the ordering, since the nodes in ST are greater than or equal to LC but less than P. The rotation balances both the parent and the left child.

// rotate clockwise about node p; make lc the new pivot
template <class T>
void AVLTree<T>::SingleRotateRight(AVLTreeNode<T>* &p)
{
    // the left subtree of p is heavy
    AVLTreeNode<T> *lc;

    // assign the left subtree to lc
    lc = p->Left();

    // update the balance factors for the parent and left child
    p->balanceFactor = balanced;
    lc->balanceFactor = balanced;

    // the right subtree of lc cannot remain attached to lc;
    // reattach it as the left subtree of p
    p->Left() = lc->Right();

    // rotate p (the larger node) into the right subtree of lc.
    // make lc the pivot node
    lc->Right() = p;
    p = lc;
}
Double Right Rotation

A double rotation occurs when the parent node (P) becomes heavy on the left and the left child (LC) becomes heavy on the right. NP is the root of the heavy right subtree of LC. We rotate the nodes so that NP replaces the parent node.

// double rotation right about node p
template <class T>
void AVLTree<T>::DoubleRotateRight(AVLTreeNode<T>* &p)
{
    // two subtrees that are rotated
    AVLTreeNode<T> *lc, *np;

    // in the tree, node(lc) < nodes(np) < node(p)
    lc = p->Left();         // lc is the left child of the parent
    np = lc->Right();       // np is the right child of lc

    // update the balance factors for p, lc, and np
    if (np->balanceFactor == rightheavy)
    {
        p->balanceFactor = balanced;
        lc->balanceFactor = leftheavy;
    }
    else if (np->balanceFactor == balanced)
    {
        p->balanceFactor = balanced;
        lc->balanceFactor = balanced;
    }
    else
    {
        p->balanceFactor = rightheavy;
        lc->balanceFactor = balanced;
    }
    np->balanceFactor = balanced;

    // before np replaces the parent p, take care of the subtrees:
    // detach the old children and attach the new children
    lc->Right() = np->Left();
    np->Left() = lc;
    p->Left() = np->Right();
    np->Right() = p;
    p = np;
}
Similarly, we may define SingleRotateLeft and DoubleRotateLeft.

template <class T>
void AVLTree<T>::SingleRotateLeft(AVLTreeNode<T>* &p)
{
    AVLTreeNode<T> *rc;

    rc = p->Right();
    p->balanceFactor = balanced;
    rc->balanceFactor = balanced;
    p->Right() = rc->Left();
    rc->Left() = p;
    p = rc;
}

template <class T>
void AVLTree<T>::DoubleRotateLeft(AVLTreeNode<T>* &p)
{
    AVLTreeNode<T> *rc, *np;

    rc = p->Right();
    np = rc->Left();
    if (np->balanceFactor == leftheavy)
    {
        p->balanceFactor = balanced;
        rc->balanceFactor = rightheavy;
    }
    else if (np->balanceFactor == balanced)
    {
        p->balanceFactor = balanced;
        rc->balanceFactor = balanced;
    }
    else
    {
        p->balanceFactor = leftheavy;
        rc->balanceFactor = balanced;
    }
    np->balanceFactor = balanced;
    rc->Left() = np->Right();
    np->Right() = rc;
    p->Right() = np->Left();
    np->Left() = p;
    p = np;
}
The private methods UpdateLeftTree and UpdateRightTree are then coded as follows:

template <class T>
void AVLTree<T>::UpdateLeftTree(AVLTreeNode<T>* &p,
        int &reviseBalanceFactor)
{
    AVLTreeNode<T> *lc;

    lc = p->Left();
    // the left child is also heavy on the left
    if (lc->balanceFactor == leftheavy)
    {
        SingleRotateRight(p);       // need a single rotation
        reviseBalanceFactor = 0;
    }
    // is the left child heavy on the right?
    else if (lc->balanceFactor == rightheavy)
    {
        // make a double rotation
        DoubleRotateRight(p);
        // the root is now balanced
        reviseBalanceFactor = 0;
    }
}

template <class T>
void AVLTree<T>::UpdateRightTree(AVLTreeNode<T>* &p,
        int &reviseBalanceFactor)
{
    AVLTreeNode<T> *rc;

    rc = p->Right();
    if (rc->balanceFactor == rightheavy)
    {
        SingleRotateLeft(p);
        reviseBalanceFactor = 0;
    }
    else if (rc->balanceFactor == leftheavy)
    {
        DoubleRotateLeft(p);
        reviseBalanceFactor = 0;
    }
}
8.8 Graphs
There are many real-life problems that can be abstracted as problems concerning sets of discrete objects and binary relations on them. For example, consider a series of public-opinion polls conducted to determine the popularity of the presidential candidates. In each poll, voters' opinions were sought on two of the candidates, and a favorite between the two was determined. The results of the polls will be interpreted as follows. Candidate a is considered to be running ahead of candidate b if:

1. Candidate a was ahead of candidate b in a poll conducted between them; or

2. Candidate a was ahead of candidate c in a poll, and candidate c was ahead of candidate b in another poll; or

3. Candidate a was ahead of candidate c, and candidate c was ahead of candidate d, and candidate d was ahead of candidate b in three separate polls, and so on.

Given two candidates, we might want to know whether one of them is running ahead of the other. Let S = {a, b, c, . . .} be the set of candidates and R be a binary relation on S such that (a, b) is in R if a poll between a and b was conducted and a was chosen the favorite candidate. A binary relation can be represented in graphical form, where candidate a is more popular than candidate e if there is a sequence of ordered pairs in R leading from a to e. One probably would agree that the graphical representation of the binary relation R is quite useful in comparing the popularity of two candidates, since there must be "a sequence of arrows" leading from the point corresponding to the more popular candidate to that corresponding to the less popular one.

As another example, consider a number of cities connected by highways. Given a map of the highways, we might want to determine whether there is a highway route between two cities on the map. Also consider all the board positions in a chess game. We might want to know whether a given board position can be reached from some other board position through a sequence of legal moves.
It is clear that both of these examples are again concerned with
discrete objects and binary relations on them. In the example highway map, let S = {a, b, c, . . .} be the set of cities and R be a binary relation on S such that (a, b) is in R if there is a highway from city a to city b. In the chess game example, let S = {a, b, c, . . .} be the set of all board positions and R be a binary relation on S such that (a, b) is in R if board position a can be transformed into board position b in one legal move. Furthermore, in both of these cases as in the example on the popularity of the presidential candidates, we want to know whether, for given a and b in S, there exist c, d, e, . . ., h in S such that {(a, c), (c, d), (d, e), . . ., (h, b)} ⊆ R. Indeed, in many problems dealing with discrete objects and binary relations, a graphical representation of the objects and the binary relations on them is a very convenient form of representation. This leads us naturally to a study of the theory of graphs.
8.8.1 Definition of Graph
The first recorded evidence of the use of graphs dates back to 1736, when Euler used them to solve the now classical Koenigsberg bridge problem. In the town of Koenigsberg (in Eastern Prussia) the river Pregel flows around the island Kneiphof and then divides into two.

Basic Definitions and Terminology

Definition 8.1 A graph, G, consists of two sets V and E. V is a finite non-empty set of vertices. E is a set of pairs of vertices; these pairs are called edges. V(G) and E(G) represent the sets of vertices and edges of graph G. We also write G = (V, E) to represent a graph.

In an undirected graph the pair of vertices representing any edge is unordered. Thus, the pairs (v1, v2) and (v2, v1) represent the same edge. In a directed graph each edge is represented by a directed pair <v1, v2>; v1 is the tail and v2 the head of the edge. Therefore <v1, v2> and <v2, v1> represent two different edges. If the vertices represent cities and the edges represent a highway system, movement can occur in both directions between the cities. If the edges represent a communication system, there may be a flow of information from one node to another but not in the reverse direction. In this case, G becomes a directed graph.
8.8.2 Specification of Graphs
Representing Graphs

There are a variety of representations for the vertices V and edges E in a digraph.
• A simple technique stores the members of V as a sequential list V0, V1, . . ., Vm−1. The edges of the graph are identified by an m × m square matrix, called an adjacency matrix, in which row i and column j correspond to vertices Vi and Vj, respectively. Each entry (i, j) in the matrix gives the weight of the edge Eij = (Vi, Vj), or 0 if the edge does not exist. For an unweighted digraph, an entry in the adjacency matrix has boolean values 0 and 1 to indicate whether the pair corresponds to an edge in the graph.

• Another representation of a graph associates with each vertex a linked list of its adjacent vertices. The list identifies the neighbors of the vertex. This dynamic model stores information on precisely the edges that actually belong to the graph. For a weighted digraph, each node in the linked list contains a weight field.

In this section, we develop a Graph class that assumes the adjacency-matrix representation of edges. We use a static model of a graph that assumes an upper bound on the number of vertices. The use of matrices simplifies the class implementation and allows us to focus on a variety of graph algorithms.

#include "stack.h"
#include "pqueue.h"
#include "queue.h"
#include "seqlist2.h"

const int MaxGraphSize = 25;

template <class T>
class VertexIterator;

template <class T>
class Graph
{
    private:
        // Key data including list of vertices, adjacency matrix
        // and current size (number of vertices) of the graph
        SeqList<T> vertexList;
        int edge[MaxGraphSize][MaxGraphSize];
        int graphsize;

        // Methods to find vertex and identify position in list
        int FindVertex(SeqList<T> &L, const T& vertex);
        int GetVertexPos(const T& vertex);
    public:
        // Constructor
        Graph(void);
        // Graph test methods
        int GraphEmpty(void) const;
        int GraphFull(void) const;
        // Data access methods
        int NumberOfVertices(void) const;
        int GetWeight(const T& vertex1, const T& vertex2);
        SeqList<T>& GetNeighbors(const T& vertex);
        // Graph modification methods
        void InsertVertex(const T& vertex);
        void InsertEdge(const T& vertex1, const T& vertex2, int weight);
        void DeleteVertex(const T& vertex);
        void DeleteEdge(const T& vertex1, const T& vertex2);
        // Utility methods
        int MinimumPath(const T& sVertex, const T& eVertex);
        SeqList<T>& DepthFirstSearch(const T& beginVertex);
        SeqList<T>& BreadthFirstSearch(const T& beginVertex);
        // Iterator used to scan the vertices
        friend class VertexIterator<T>;
};
• The data members for the class include the vertices that are stored in a sequential list, the edges that are represented by a two-dimensional integer matrix, and a data value graphsize, which counts the number of vertices. The value graphsize is returned by the method NumberOfVertices. • The utility method FindVertex checks whether a vertex already exists in list L. This is used by the search methods. The method GetVertexPos translates a vertex to a position in vertexList. The position corresponds to a row or column index in the adjacency matrix. • VertexIterator is derived from the SeqListIterator class and enables the user to scan the vertices. The iterator simplifies applications.
8.8.3 Implementation of Graph
Constructor of Graph

The constructor for the Graph class is responsible for initializing the MaxGraphSize × MaxGraphSize adjacency matrix and setting graphsize to 0. The constructor sets each entry in the matrix to 0 to indicate that there are no edges.

// constructor; initializes entries in the adjacency matrix
// to 0 and sets the graphsize to 0
template <class T>
Graph<T>::Graph(void)
{
    for (int i = 0; i < MaxGraphSize; i++)
        for (int j = 0; j < MaxGraphSize; j++)
            edge[i][j] = 0;
    graphsize = 0;
}
Counting Graph Components

The private data member graphsize maintains the size of the vertex list. Its value can be accessed with the method NumberOfVertices.

template <class T>
int Graph<T>::NumberOfVertices(void) const
{
    return graphsize;
}
The method GraphEmpty indicates whether graphsize is 0.

template <class T>
int Graph<T>::GraphEmpty(void) const
{
    return graphsize == 0;
}
Accessing Graph Components

The components of a graph are contained in the vertex list and the adjacency matrix. For the vertices, a vertex iterator allows us to scan the items in the vertex list. The iterator is a friend of the Graph class so that it has access to vertexList. The graph iterator is inherited from the SeqListIterator class.

template <class T>
class VertexIterator: public SeqListIterator<T>
{
    public:
        VertexIterator(Graph<T>& G);
};
The constructor simply initializes the base class to traverse the private data member vertexList. The following code gives the implementation of this constructor:

template <class T>
VertexIterator<T>::VertexIterator(Graph<T>& G):
    SeqListIterator<T>(G.vertexList)
{}
A list iterator is also used to implement the function GetVertexPos, which scans the vertex list and returns the position of a given vertex in the list. The following code searches for the vertex:
template <class T>
int Graph<T>::GetVertexPos(const T& vertex)
{
    SeqListIterator<T> liter(vertexList);
    int pos = 0;

    while (!liter.EndOfList() && liter.Data() != vertex)
    {
        pos++;
        liter.Next();
    }
    if (liter.EndOfList())
        pos = -1;
    return pos;
}
For each edge, the method GetWeight returns the weight of the edge connecting vertex1 and vertex2. The method uses GetVertexPos to get the positions of the two vertices in the list, and hence the row-and-column entry in the adjacency matrix. If either vertex is not in the list, the method returns -1.

template <class T>
int Graph<T>::GetWeight(const T& vertex1, const T& vertex2)
{
    int pos1 = GetVertexPos(vertex1), pos2 = GetVertexPos(vertex2);

    if (pos1 == -1 || pos2 == -1)
        return -1;
    return edge[pos1][pos2];
}
For a vertex parameter, the method GetNeighbors creates a list of all adjacent vertices. The method scans the adjacency matrix and identifies the list of nodes VE such that (V, VE) is an edge. The list is returned as a parameter and can be scanned using a SeqList iterator. If the vertex has no neighbors, the method returns an empty list.

// return the list of all adjacent vertices
template <class T>
SeqList<T>& Graph<T>::GetNeighbors(const T& vertex)
{
    SeqList<T> *L;
    SeqListIterator<T> viter(vertexList);

    // allocate a SeqList
    L = new SeqList<T>;
    // look up pos in the list to identify the row in the adjacency matrix
    int pos = GetVertexPos(vertex);

    // if vertex is not in the list of vertices, terminate
    if (pos == -1)
        return *L;    // return empty list

    // scan the row of the adjacency matrix and include all vertices
    // having a non-zero weighted edge from vertex
    for (int i = 0; i < graphsize; i++)
    {
        if (edge[pos][i] > 0)
            L->Insert(viter.Data());
        viter.Next();
    }
    return *L;
}
Updating Vertices and Edges

The method InsertVertex adds a vertex to the vertex list, provided the graph is not already full.

template <class T>
void Graph<T>::InsertVertex(const T& vertex)
{
    if (graphsize + 1 > MaxGraphSize)
        return;
    vertexList.Insert(vertex);
    graphsize++;
}

To insert an edge, we use GetVertexPos to check that both vertex1 and vertex2 are in the vertex list. If either item is not found, the method returns without inserting the edge.
Once the positions pos1 and pos2 of the vertices are established, InsertEdge places the weight of the edge in entry (pos1, pos2) of the adjacency matrix.

template <class T>
void Graph<T>::InsertEdge(const T& vertex1, const T& vertex2, int weight)
{
    int pos1 = GetVertexPos(vertex1), pos2 = GetVertexPos(vertex2);

    if (pos1 == -1 || pos2 == -1)
        return;
    edge[pos1][pos2] = weight;
}
The Graph class provides the method DeleteVertex to remove a vertex from the graph. If the vertex is not in the list, the method returns without any effect. If it is present, however, we must also delete all edges that create a link with the deleted vertex. There are three regions in the adjacency matrix that must be adjusted:

• Region I: shift the columns one position to the left.
• Region II: shift the rows up and the columns to the left.
• Region III: shift the rows one position up.

// delete a vertex from the vertex list and update the adjacency
// matrix to remove all edges linked to the vertex
template <class T>
void Graph<T>::DeleteVertex(const T& vertex)
{
    // get the position in the vertex list
    int pos = GetVertexPos(vertex);
    int row, col;

    // if the vertex is not present, terminate
    if (pos == -1)
        return;

    // delete the vertex and decrement graphsize
    vertexList.Delete(vertex);
    graphsize--;

    // the adjacency matrix is partitioned into three regions;
    // note that the old entries extend to index graphsize, since
    // graphsize has already been decremented
    for (row = 0; row < pos; row++)                 // region I
        for (col = pos + 1; col <= graphsize; col++)
            edge[row][col-1] = edge[row][col];
    for (row = pos + 1; row <= graphsize; row++)    // region II
        for (col = pos + 1; col <= graphsize; col++)
            edge[row-1][col-1] = edge[row][col];
    for (row = pos + 1; row <= graphsize; row++)    // region III
        for (col = 0; col < pos; col++)
            edge[row-1][col] = edge[row][col];
}
To delete an edge, we simply remove the link between the two vertices. After testing that the vertices are in vertexList, the method DeleteEdge assigns a 0 weight to the edge and leaves the other edges unchanged.
template <class T>
void Graph<T>::DeleteEdge(const T& vertex1, const T& vertex2)
{
    int pos1 = GetVertexPos(vertex1), pos2 = GetVertexPos(vertex2);

    if (pos1 == -1 || pos2 == -1)
        return;
    edge[pos1][pos2] = 0;
}
We use the method FindVertex to test whether a given vertex is in the vertex list L.

template <class T>
int Graph<T>::FindVertex(SeqList<T> &L, const T& vertex)
{
    SeqListIterator<T> iter(L);
    int ret = 0;

    while (!iter.EndOfList())
    {
        if (iter.Data() == vertex)
        {
            ret = 1;
            break;
        }
        iter.Next();
    }
    return ret;
}
Graph Traversals

When scanning a nonlinear structure, we must develop a strategy to access the nodes and to mark them once they are visited. Binary trees define a series of scanning methods that have counterparts with graphs. The preorder binary tree scan uses the strategy of visiting a node and then descending into its subtrees. For graphs, the depth-first search is a generalization of the preorder traversal. The starting vertex is passed as a parameter and becomes the first vertex that is visited. As we move down a path until we reach a "dead end", we store adjacent vertices on a stack so that we can return and search down other paths if unvisited vertices remain. The set of visited vertices consists of all vertices that are reachable from the starting vertex.

Trees feature a level-order scan that starts at the root and visits the nodes level by level down to the depth of the tree. With a graph, the breadth-first search uses a similar strategy by starting at an initial vertex and visiting each of its adjacent vertices. The scan continues to the next level of adjacent vertices, and
so forth, until we reach the end of a path. The algorithm uses a queue to store the adjacent vertices as we scan from level to level in the graph.

Depth-First Search

The search algorithm uses a list L to maintain a record of visited vertices and a stack S to store adjacent vertices. After placing the initial vertex on the stack, we begin an iterative process that pops a vertex from the stack and then visits the vertex. We terminate when the stack is empty and return the list of visited vertices. For each step, use the following strategy: pop a vertex V from the stack and check list L to see whether V has been visited. If not, we are visiting a new vertex and use the opportunity to get the list of its adjacent vertices. We insert V into list L so that it is not revisited, and conclude by pushing onto the stack those adjacent vertices of V that are not already in L. The algorithm is implemented as follows:

// from a starting vertex, return the depth-first scanning list
template <class T>
SeqList<T>& Graph<T>::DepthFirstSearch(const T& beginVertex)
{
    // stack to temporarily hold waiting vertices
    Stack<T> S;
    // L is the list of nodes in the scan; adjL holds the
    // neighbors of the current vertex. L is created in
    // dynamic memory so a reference to it can be returned
    SeqList<T> *L, adjL;
    // iteradjL is used to traverse the neighbor lists
    SeqListIterator<T> iteradjL(adjL);
    T vertex;

    // initialize the return list; push the starting vertex
    L = new SeqList<T>;
    S.Push(beginVertex);

    // the scan continues until the stack is empty
    while (!S.StackEmpty())
    {
        // pop the next vertex
        vertex = S.Pop();
        // check whether it is already in L
        if (!FindVertex(*L, vertex))
        {
            // if not, put it in L and get all adjacent vertices
            (*L).Insert(vertex);
            adjL = GetNeighbors(vertex);
            // set the iterator to the current adjL
            iteradjL.SetList(adjL);
            // scan the list of neighbors; push on the stack if not in L
            for (iteradjL.Reset(); !iteradjL.EndOfList(); iteradjL.Next())
                if (!FindVertex(*L, iteradjL.Data()))
                    S.Push(iteradjL.Data());
        }
    }
    // return the depth-first scan list
    return *L;
}
Breadth-First Search

The breadth-first search uses a queue, as does the level-order scan of a binary tree. The algorithm uses the techniques developed for the depth-first search, but vertices are placed in a queue instead of a stack. An iterative process removes vertices from the queue until it is empty. Delete a vertex V from the queue and check whether V is in the list of visited nodes. If V is not in L, we have a new vertex, which must be added to L. At the same time, we get all neighbors of V and insert them in the queue provided they are not in the list of visited nodes. The algorithm is implemented as follows:

template <class T>
SeqList<T>& Graph<T>::BreadthFirstSearch(const T& beginVertex)
{
    Queue<T> Q;
    SeqList<T> *L, adjL;
    SeqListIterator<T> iteradjL(adjL);
    T vertex;

    L = new SeqList<T>;
    Q.QInsert(beginVertex);      // initialize the queue

    while (!Q.QEmpty())
    {
        // remove a vertex from the queue
        vertex = Q.QDelete();
        // if vertex is not in L, add it
        if (!FindVertex(*L, vertex))
        {
            (*L).Insert(vertex);
            // get the list of neighbors of vertex
            adjL = GetNeighbors(vertex);
            iteradjL.SetList(adjL);
            // insert all neighbors of vertex into the queue
            // if they are not already there
            for (iteradjL.Reset(); !iteradjL.EndOfList(); iteradjL.Next())
                if (!FindVertex(*L, iteradjL.Data()))
                    Q.QInsert(iteradjL.Data());
        }
    }
    return *L;
}
Minimum Path

The depth-first and breadth-first searches identify vertices that are on a path from the initial node, but the algorithms do not attempt to optimize the movement from vertex to vertex and identify a minimum path, a problem that we now address. Many applications want to select a path that requires a minimum "cost" as measured by the accumulated weights on a path between vertices. To find a minimum path between two nodes, we introduce a new class, called PathInfo, whose objects specify a pair of vertices and a data value that measures the cumulative cost of travelling on a path between them. The PathInfo objects are stored in a priority queue that provides us with access to the object in the queue having minimum cost.

template <class T>
struct PathInfo
{
    T startV, endV;
    int cost;
};

// order PathInfo objects by cost, so the priority queue
// delivers the minimum-cost object first
template <class T>
int operator< (const PathInfo<T>& a, const PathInfo<T>& b)
{
    return a.cost < b.cost;
}

8.9 Hashing and Hash Table Class

The iterator method SearchNextNode scans forward from bucket cb for the next non-empty bucket in the table.

template <class T>
void HashTableIterator<T>::SearchNextNode(int cb)
{
    // if cb > table size, terminate the search; the scan is over
    if (cb > hashTable->numBuckets)
        return;
    // otherwise search from the current list to the end of the
    // table for a non-empty bucket and update private data members
    for (int i = cb; i < hashTable->numBuckets; i++)
        if (!hashTable->buckets[i].ListEmpty())
        {
            // before returning, set the currentBucket index to i
            // and currBucketPtr to the new non-empty list
            currBucketPtr = &hashTable->buckets[i];
            currBucketPtr->Reset();
            currentBucket = i;
            return;
        }
}
The constructor initializes the Iterator base class and sets the private pointer
variable hashTable to the address of the hash table. It calls SearchNextNode with parameter 0 to locate the first non-empty list.

// constructor; initializes both the base class and hashTable.
// SearchNextNode identifies the first non-empty bucket in the table
template <class T>
HashTableIterator<T>::HashTableIterator(HashTable<T>& ht):
    Iterator<T>(), hashTable(&ht)
{
    SearchNextNode(0);
}
Next advances one element forward in the current list. If we reach the end of the list, SearchNextNode updates the iterator so that it traverses the next non-empty bucket.

// move to the next data item in the table
template <class T>
void HashTableIterator<T>::Next(void)
{
    // using the current list, move to the next node or end of list
    currBucketPtr->Next();
    // at end of list, call SearchNextNode to identify the next
    // non-empty bucket in the table
    if (currBucketPtr->EndOfList())
        SearchNextNode(++currentBucket);
    // set the iterationComplete flag by checking whether the
    // currentBucket index is past the end of the table
    iterationComplete = currentBucket == -1;
}

template <class T>
void HashTableIterator<T>::Reset(void)
{
    iterationComplete = hashTable->ListEmpty();
    SearchNextNode(0);
}

template <class T>
T& HashTableIterator<T>::Data(void)
{
    return (*currBucketPtr).Data();
}

template <class T>
void HashTableIterator<T>::SetList(HashTable<T>& lst)
{
    hashTable = &lst;
    Reset();
}
Part III
Concurrent Programming in DeltaOS
Chapter 9
A Cyclical Executive for Small Systems

A simple cyclical executive is adequate for a variety of small, non-critical real-time applications. This chapter presents the concept and the code for the executive.
9.1 Concept
This technique is known by various names, including executive, cyclical executive, round-robin executive, round-robin kernel, super loop, and probably others. The concept is shown below.

void main(void)
{
    for (;;)
    {
        task0();
        task1();
        task2();
        /* and so on... */
    }
}

Each "task" (a C function) is called in turn. For this technique to work, the following rules must be followed:

• Tasks may not employ busy waiting.
• Tasks must do their work quickly and return to the main loop so that other tasks can run.
• If necessary, tasks must save their place by using a state variable.
The executive provides no services to the tasks. For example, there are no priorities, no timers, no inter-task communication mechanisms, no queuing, no task switching, and no suspension mechanisms provided.
9.2 Timers
The timer ISR for the executive is shown below.

void TimerIsr(void)
{
    Tics++;
}

The timer ISR has one function, and that is to bump a global counter (an unsigned integer). Tasks use this counter to implement timers. For this book we will assume that the hardware timer has been programmed to generate an interrupt every 10 milliseconds (a typical value for many real-time systems).

void readAnalogDataTask(void)
{
    static int state = 0, timer;

    switch (state)
    {
        case 0:
            timer = Tics + 10;
            state = 1;
            break;
        case 1:
            if (Tics == timer)
            {
                readAndProcessAnalogData();
                timer = Tics + 10;
            }
            break;
    }
}

The task reads an analog channel about every 100 milliseconds (every 10 tics). It implements its own timer by adding to "Tics" the number of timer tics to wait, and then checking "Tics" against the timeout value each time the task is called. Note that the "if" test checks for ==, not >=, because if the value of "timer" added to "Tics" causes a wraparound, then ">=" will succeed on the first try, which is incorrect. This also means that the round-trip time through the loop must not exceed 10 milliseconds (the hardware timer interval); otherwise, by the time the timeout test is performed, the value of "Tics" may have gone past the timeout value. Note also the use of the state variable. This is necessary because a timer must be started the first time the function is called.
9.3 Inter-Task Communication
Inter-task communication is done through global data. A signal is simply a global flag. A message with data is a global data structure plus a flag that signals the receiving task that data is available. It is up to the receiving task to check for any data that it may receive and to reset the flag after the data is read.

void displayDataTask(void)
{
    if (DisplayData.flag == 1)
    {
        displayOnScreen(&DisplayData);
        DisplayData.flag = 0;
    }
}

Note that the receiver of the data must set the flag back to 0; otherwise, the flag will stay set.
9.4 Priorities
There are no priorities. Tasks are called in their turn in round robin fashion. High priority items are handled inside ISR’s. For example, if a line must be toggled at a very accurate 10 millisecond rate, then the appropriate code is added to the timer ISR.
9.5 Queueing
There are no message queues. If your application needs queuing, consider using a kernel rather than implementing queues here. The idea with the cyclical executive is to keep things simple.
9.6 Task Switching
The concept of task switching is not part of the cyclical executive. If a task is at a point where it must save its place and return to the executive loop, then it must save its place in a static or global state variable. The next time the task is called, the task can resume where it left off by switching on the state variable.
9.7 Instance Data
Consider a system that has 5 analog channels, each with a different sample rate, gain, IO address, etc. Each channel must be sampled every "n" milliseconds, where "n" is different for each channel. One solution is to have a separate task for each channel and call each from the executive loop.
channel0Task();
channel1Task();
channel2Task();
channel3Task();
channel4Task();

Another approach is to code one generic task that acquires its sample rate, IO address, etc. from a C structure passed to it as an argument.

genericChannelTask(&channel0);
genericChannelTask(&channel1);
genericChannelTask(&channel2);
genericChannelTask(&channel3);
genericChannelTask(&channel4);

The benefit gained is that only one function needs to be written.
9.8 Advantages
The cyclical executive has the following advantages. • Easy to use and understand. • With the exception of interrupts, the system is completely deterministic. Tasks are always called in the same order. The worst case loop latency can be computed. • Minimal code and data requirements. • Because tasks cannot be preempted, time dependent errors cannot occur. • Only one stack is required. Preemptive systems require a stack for each task.
9.9 Disadvantages
The cyclical executive has the following disadvantages. • More responsibility is placed on the programmer. • Coding is more difficult than with other models. • No priorities. • No queues. • A form of busy waiting is employed, since tasks are called regardless of whether they have work to do or not.
9.10 Comments
For small systems, especially single chip micro-controllers, the cyclical executive is a good choice. Larger applications that require priorities and queuing may require a standard real-time kernel.
Chapter 10
How to Break an Application Down into Tasks

This chapter provides some guidelines that may be useful when designing software for real-time multi-tasking systems. Specifically, it addresses how an application is broken down into multiple concurrent tasks.
10.1 When Multi-tasking is Required
Multi-tasking is generally required in the following situations. • When operations must occur concurrently. • When a common resource (like a printer) needs to be managed. • When it is desirable to defer interrupt processing. These topics are addressed in detail below.
10.2 Implementing Concurrent Operations as Separate Tasks
Consider a simple data acquisition system with the following requirements:

1. Read data every 100 ms.
2. Display averaged data once per second.
3. Wait for and respond to keyboard input.

This application could be implemented as one task or as three tasks. Both the single-task and the three-task solutions are shown below.
10.2.1 Single Task Solution
void doAllTask(void)
{
    int i;

    for (i = 0; ; i++)
    {
        pause(100L);
        if (kbhit())
            processKey();
        readAndAverageData();
        if (i == 9)
        {
            displayAverageData();
            i = 0;
        }
    }
}
10.2.2 Three Task Solution
void dataAcquisitionTask(void)
{
    while (TRUE)
    {
        pause(100L);
        readAndAverageData();
    }
}

void keyboardTask(void)
{
    while (TRUE)
    {
        pause(100L);
        if (kbhit())
            processKey();
    }
}

void displayTask(void)
{
    while (TRUE)
    {
        pause(1000L);
        displayAverageData();
    }
}
The question is: which approach is better? For this case, the answer is the three-task solution, for the following reasons: three tasks will not significantly add to system overhead; the code is probably easier to understand; future modifications are more easily implemented; and each task can be debugged separately. More tasks are not always the desired solution, however. In general, the task count should be kept to a minimum.
10.3 Design Principles
10.3.1 Keep the Task Count Down
Task proliferation should be avoided. Consider a system with 32 RS-232 channels, each channel being connected to a simple CRT terminal, and consider the handling of incoming data only. Each RS-232 channel has an incoming queue associated with it, and incoming characters are placed into their respective queues by the receive ISR. Let us further assume that the system is simply to perform the typical tasks that a CRT terminal requires: echo characters to the screen, backspace, delete, etc. One may be tempted to create a generic task and instantiate it 32 times, once for each channel. Each instance would wake up every 100 ms and check the incoming data queue associated with its channel. This would have the effect of creating 32 task control blocks, 32 stacks, and 32 on-going timers. A better approach might be to create a single task that wakes up every 100 ms and services all 32 queues. This approach would cut down on overhead, run faster, and make debugging easier. In our previous example, we chose the three-task approach instead of the single task. Note, however, the differences here.

• 32 tasks and 32 timers can begin to add significant overhead to the system.
• The tasks are identical, making the single-task approach easy to implement.
• The single-task approach simplifies development and debugging.
10.3.2 When Multiple Task Instances Make Sense
Multiple task instances can make sense when requirements become more complex. Consider the example given above with the requirement that each RS-232 line’s incoming queue must be polled at a different rate. Now the single task approach becomes quite difficult in that the single task must start 32 different timers, and when each timer expires, match it to the appropriate buffer. Furthermore, one of the benefits of using the single task was the avoidance of 32 pending timers, but with this new requirement we need 32 pending timers even
with one task. A simpler approach is to create a generic task and instantiate it 32 times.

void rs232Task(void)
{
    typeRS232Data * d;

    d = (typeRS232Data *) ActiveTcb->ptr;
    while (TRUE)
    {
        pause(d->pauseValueInMs);
        processLine(d);
    }
}

When each task instance is created, the "ptr" field of its TCB is pointed to a data structure that contains data specific to that task instance (instance data). In this way, each task instance pauses for a different interval.
10.3.3 Implementing Resource Management as a Task
If tasks simply write to a printer whenever they so desire, interleaved printout can result. A solution to this problem is to create a separate server task whose job is to process print requests (messages) from other tasks. If all tasks follow the rule, printing will be accomplished in an orderly fashion, with interleaving eliminated. In general, a server task should be created whenever tasks contend for a resource. Although a semaphore could be used to guard the resource instead, that approach can cause problems, such as a task blocking other tasks for long periods while it holds the semaphore.
10.3.4 Deferring Interrupt Processing to a Task
It is generally desirable to get in and out of an ISR as quickly as possible. For this and other reasons, it is sometimes desirable to defer ISR processing to a task. This can be done by creating a task that processes a message from the ISR, as shown below.

void taskToHandleInterrupt(void)
{
    typeMsg * msg;

    status = delta_message_queue_receive (
        qrid,
        isr_msg_buf,
        &size,
        DELTA_WAIT,
        timeout
    );
    msg = isr_msg_buf[0];
    processIsrMsg(msg);
    freeMsg(msg);
}

The task above remains dormant until the ISR is entered and the message sent. One could argue that the ISR processing could be performed within the ISR itself, but sometimes this is not easily done. Consider an ISR that is subject to "bursts": 5 interrupts in a row may occur very quickly, followed by a period of inactivity. During a burst, the ISR may not have time to process one interrupt before another occurs. In this case, the ISR would be re-entered (assuming interrupts were enabled while inside the ISR), and as a result the handling would become more complex. Deferring interrupt processing to a task solves this problem. Since messages are queued by the multi-tasking kernel, burst processing is not an issue: messages are queued to the handler task and each is handled in its turn. Of course, this assumes that the kernel message system is fast enough to send each message before another interrupt occurs.
Chapter 11
Concurrent Programming in DeltaOS

It is fun to play with DeltaOS through a small program that demonstrates many of its features.
11.1 Classical Process Coordination Problems
In this section we present a number of different synchronization problems that are important mainly because they are examples of a large class of concurrency-control problems. These problems are used in testing nearly every newly proposed synchronization scheme. Semaphores are used for synchronization in our solutions.
11.1.1 The Bounded-Buffer Problem
The bounded-buffer problem is commonly used to illustrate the power of synchronization primitives. We present here a general structure of this scheme, without committing ourselves to any particular implementation. We assume that the pool consists of n buffers, each capable of holding one item. The xmutex semaphore provides mutual exclusion for accesses to the buffer pool. The xfull and xempty semaphores count the number of full and empty buffers, respectively.

const int MaxBufferSize = 50;

template <class T>
class BBuffer
{
    private:
        T bufferlist[MaxBufferSize];
256
11.1. CLASSICAL PROCESS COORDINATE PROBLEMS
int count, in, out; PSemaphore xmutex, xfull, xempty; public: // Constructor (default) BBuffer(void); // Data retrieval and storage operations void Receive(T& i); void Send(T i); }; template BBuffer::BBuffer(void) { count = 0; in = 0; out = -1; xfull.Create("FULL", SM_FIFO, 0); xempty.Create("EMPT", SM_FIFO, MaxBufferSize); xmutex.Create("MUTX", SM_FIFO, 1); } // Retrieve item if initialized template void BBuffer::Receive(T& i) { unsigned long time_out = 0; // If the channel is not full, just wait xfull.P(SM_WAIT, time_out); // Enter the critical region xmutex.P(SM_WAIT, time_out); i = bufferlist[out]; out = (out + 1) % MaxBufferSize; count--; xmutex.V(); // Set xempty as true xempty.V(); } // Put item in storage template
257
258
Concurrent Programming
void BBuffer::Send(T i) { unsigned long time_out = 0; // If the channel is not empty, just wait xempty.P(SM_WAIT, time_out); // Enter the critical region xmutex.P(SM_WAIT, time_out); bufferlist[in] = i; in = (in + 1) % MaxBufferSize; count++; xmutex.V(); // Set xfull as true xfull.V(); }
Note the symmetry between Receive and Send. We can interpret this code as the producer producing full buffers for the consumer, or as the consumer producing empty buffers for the producer.
11.1.2
The Reader/Writer Problem
This problem was first posed by Courtois et al. [4]. A data object (such as a file or record) is to be shared among several concurrent processes. Some of these processes may want only to read the content of the shared object, while others may want to update (that is, read and write) the shared object. We distinguish between these two types of processes by referring to those processes which are only interested in reading as readers and to the rest as writers. Obviously, if two readers access the shared data objects simultaneously, no adverse effects will result. However, if a writer and some other process (either a reader or a writer) access the shared object simultaneously, chaos may ensue. In order to ensure that these difficulties do not arise, we require that the writers have exclusive access to the shared object. This synchronization problem is referred to as the readers/writers problem. The readers/writers problem has several variations, all involving priorities. The simplest one, referred to as the first readers/writers problem, requires that no reader will be kept waiting unless a writer has already obtained permission to use the shared object. In other words, no reader should wait for other readers to finish simply because a writer is waiting. The second readers/writers problem requires that once a writer is ready, it performs its write as soon as possible. In other words, if a writer is waiting to access the objects, no new readers may start reading.
In this solution to the first readers/writers problem, the reader processes share the following specification.

template <class T>
class ReaderWriter
{
private:
    PSemaphore xmutex, wrt;
    int readcount;
public:
    // Constructor (default)
    ReaderWriter(void);
    // Data retrieval and storage operations
    void Write(T i);
    void Read(T& i);
};
The semaphores xmutex and wrt are initialized to 1, while readcount is initialized to 0. The semaphore wrt is common to both the reader and writer processes. The xmutex semaphore is used to ensure mutual exclusion when the variable readcount is updated. The readcount keeps track of how many processes are currently reading the object. The semaphore wrt functions as a mutual exclusion semaphore for the writers. It is also used by the first/last reader that enters/exits the critical section. It is not used by readers who enter or exit while other readers are in their critical section.

template <class T>
ReaderWriter<T>::ReaderWriter(void)
{
    xmutex.Create("MUTX", 1, DELTA_COUNTING_SEMAPHORE|DELTA_PRIORITY, 10);
    wrt.Create("WRIT", 1, DELTA_COUNTING_SEMAPHORE|DELTA_PRIORITY, 10);
    readcount = 0;
}
The general structure of a reader process is:

template <class T>
void ReaderWriter<T>::Read(T& i)
{
    unsigned long time_out = 0;
    xmutex.Obtain(DELTA_WAIT, time_out);
    readcount = readcount + 1;
    if (readcount == 1)
        wrt.Obtain(DELTA_WAIT, time_out);
    xmutex.Release();
    // Reading is performed here
    xmutex.Obtain(DELTA_WAIT, time_out);
    readcount = readcount - 1;
    if (readcount == 0)
        wrt.Release();
    xmutex.Release();
}
while the general structure of a writer process is:

template <class T>
void ReaderWriter<T>::Write(T i)
{
    unsigned long time_out = 0;
    wrt.Obtain(DELTA_WAIT, time_out);
    // Writing is performed here
    wrt.Release();
}
Note that if a writer is in the critical section and n readers are waiting, then one reader is queued on wrt, while n − 1 readers are queued on xmutex. Also observe that when a writer executes wrt.Release(), we may resume the execution of either the waiting readers or a single waiting writer; the selection is up to the scheduler. There are many solutions to this problem, and they are more or less easy to design or to understand depending on the properties of the language in which they are presented. Solutions have been proposed based on monitors [13], critical regions [11], or control modules [22]. Note that the solution proposed here does not avoid starvation of writers, whether the number of readers is finite or not. This means that a different algorithm must be developed depending on whether we want to give readers or writers priority, or to have no priority at all (the fair solution).
11.2
Dining Philosophers Problem
11.2.1
Problem Description
The Mortal Dining Philosophers
This problem is an adaptation of the one posed by E. W. Dijkstra [9]. Five philosophers spend their lives eating spaghetti and thinking. They eat at a circular table in a dining room. The table has five chairs around it, and chair number i has been assigned to philosopher number i (0 ≤ i ≤ 4). Five forks have also been laid out on the table so that there is precisely one fork between every two adjacent chairs; consequently, there is one fork to the left of each chair and one to its right. Fork number i is to the right of chair number i. In order to be able to eat, a philosopher must enter the dining room and sit in the chair assigned to him. A philosopher needs two forks to eat (the forks placed to the left and right of his chair). If he cannot get both forks immediately, then he must wait until he can get them. The forks are picked up one at a time, with the left fork being picked up first. When a philosopher has finished eating (after a finite amount of time), he puts the forks down and leaves the room.
The dining philosophers problem has been studied extensively in the computer science literature. It is used as a benchmark to check the appropriateness of concurrent programming facilities and of proof techniques for concurrent programs. It is interesting because, despite its apparent simplicity, it illustrates many of the problems encountered in concurrent programming, such as shared resources and deadlock. The forks are the resources shared by the philosophers, who represent the concurrent processes.
Purpose
This is a good exercise for better understanding the resource allocation problem in a concurrent and distributed environment.
11.2.2
Solutions
Solution with Possibility of Deadlock
The five philosophers are implemented as tasks and the five forks are implemented as semaphores. On creation, each philosopher is given an identification. Each philosopher is mortal and passes on to the next world soon after having eaten 100,000 times (about three times a day for 90 years).
A variation of the above problem is to allow a philosopher to sit in any chair. This variation will result in a smaller average waiting time for eating. It can be implemented by declaring a new task that is called by every philosopher to request a chair (preferably one with free forks); on leaving the dining room, a philosopher informs this task that the chair is vacant.
In the solution given, no individual philosopher will be blocked indefinitely from eating, i.e., starve, because the philosophers pick up the forks in first-in first-out order (the queueing discipline for tasks of the same priority).
However, there is a possibility of deadlock in the solution given above: for example, each philosopher picks up one fork and waits to get the other fork so that he can start eating. Assuming that all the philosophers are obstinate and that none of them will give up his fork until he gets the other fork and has eaten, everything will be in a state of suspension and all the philosophers will starve.
Root Task
The task Init is created and started by the DeltaCORE kernel as part of its startup initialization. After startup, the DeltaCORE kernel simply passes control to the task Init.
/****************************************************************/
/*                                                              */
/* root : Code for the ROOT task                                */
/*                                                              */
/* NOTE : This executes as the 'ROOT' task. It deletes itself   */
/*        when finished.                                        */
/*                                                              */
/****************************************************************/
delta_task Init(delta_task_argument argument)
{
    int i;
    delta_status_code status;
    delta_time_of_day time;
    delta_time_of_day time_buffer;

    set_outputfunc((outputfunc_ptr)disp_char);
    screen_init();

    time_buffer.year   = 2001;  /* year, A.D.      */
    time_buffer.month  = 6;     /* month, 1 -> 12  */
    time_buffer.day    = 7;     /* day, 1 -> 31    */
    time_buffer.hour   = 9;     /* hour, 0 -> 23   */
    time_buffer.minute = 38;    /* minute, 0 -> 59 */
    time_buffer.second = 0;     /* second, 0 -> 59 */
    time_buffer.ticks  = 20;    /* ticks           */
    delta_clock_set(&time_buffer);

    status = delta_clock_get(DELTA_CLOCK_GET_TOD, &time);
    printf("\n****Task wake test start on %d/%d/%d, %d:%d:%d:%d.****\n\r",
           time.year, time.month, time.day, time.hour,
           time.minute, time.second, time.ticks);

    /*------------------------------------------------------------------*/
    /* Create application tasks and start them.                         */
    /*------------------------------------------------------------------*/
    for (i = 0; i < 5; ++i) {
        Task_name[i] = delta_build_name('P', 'H', 'L', ' ');
        status = delta_task_create(Task_name[i], 10,
                                   DELTA_MINIMUM_STACK_SIZE * 2,
                                   DELTA_DEFAULT_MODES,
                                   DELTA_DEFAULT_ATTRIBUTES, &tid[i]);
        if (status == DELTA_SUCCESSFUL)
            status = delta_task_start(tid[i], phil, i);
    }

    /*------------------------------------------------------------------*/
    /* Create five semaphores as forks.                                 */
    /*------------------------------------------------------------------*/
    for (i = 0; i < 5; ++i) {
        Sem_name[i] = delta_build_name('F', 'O', 'K', ' ');
        status = delta_semaphore_create(Sem_name[i], 1,
                                        DELTA_BINARY_SEMAPHORE|DELTA_FIFO,
                                        5, &forkid[i]);
        if (status != DELTA_SUCCESSFUL) {
            printf("Semaphore creation failed\n");
        }
    }

    /*------------------------------------------------------------------*/
    /* Delete self. If deletion fails, call k_fatal.                    */
    /*------------------------------------------------------------------*/
    delta_task_delete(DELTA_SELF);
}
Philosopher Tasks
/***********************************************************************/
/*                                                                     */
/* phil                                                                */
/*                                                                     */
/* NOTE : This is a parameterized procedure for generating five tasks  */
/*                                                                     */
/***********************************************************************/
delta_task phil(delta_task_argument targ)
{
    unsigned long timeout = 0;
    int lfork, rfork;
    unsigned long msgbuf[4];
    int i;

    i = targ;
    lfork = i;
    rfork = (i + 1) % 5;
    msgbuf[0] = (unsigned long) "Fork_Ok";
    while (1) {
        /*--------------------------------------------------------------*/
        /* Send message to the standard output device                   */
        /*--------------------------------------------------------------*/
        printf("Philosopher %i Enter Room!\n", i);
        delta_semaphore_obtain(forkid[lfork], DELTA_WAIT, timeout);
        /* Rest for 15 ticks before picking up the other fork           */
        delta_task_wake_after(15);
        delta_semaphore_obtain(forkid[rfork], DELTA_WAIT, timeout);
        /*--------------------------------------------------------------*/
        /* Send message to the standard output device                   */
        /*--------------------------------------------------------------*/
        printf("Philosopher %i Eating\n", i);
        delta_semaphore_release(forkid[lfork]);
        delta_semaphore_release(forkid[rfork]);
        /*--------------------------------------------------------------*/
        /* Send message to the standard output device                   */
        /*--------------------------------------------------------------*/
        printf("Philosopher %i Leave Room\n", i);
        /* Sleep for 50 ticks before attempting to reenter the room     */
        delta_task_wake_after(50);
    }
}
A Deadlock Free Solution
Deadlock can be avoided in several ways. For example, a philosopher may pick up his two forks only when both are available. Alternatively, one could add another task, called host, that makes sure there are at most four philosophers in the dining room at any given time; each philosopher must request permission to enter the room from the host and must inform him on leaving. More simply, we may admit at most four philosophers to the room, so that at least one philosopher is always able to pick up both forks and finish eating. With this change there is no possibility of deadlock: at least one philosopher in the room will be able to eat, and since they all eat for a finite time, he will leave and some other philosopher will be able to eat.
Root Task
/***********************************************************************/
/*                                                                     */
/* root : Code for the ROOT task                                       */
/*                                                                     */
/* NOTE : This executes as the 'ROOT' task. It deletes itself          */
/*        when finished.                                               */
/*                                                                     */
/***********************************************************************/
delta_task Init(delta_task_argument argument)
{
    unsigned long tid[5];
    int i;
    int j;
    delta_status_code status;
    delta_time_of_day time;
    delta_time_of_day time_buffer;

    set_outputfunc((outputfunc_ptr)disp_char);
    screen_init();

    time_buffer.year   = 2001;  /* year, A.D.      */
    time_buffer.month  = 6;     /* month, 1 -> 12  */
    time_buffer.day    = 7;     /* day, 1 -> 31    */
    time_buffer.hour   = 9;     /* hour, 0 -> 23   */
    time_buffer.minute = 38;    /* minute, 0 -> 59 */
    time_buffer.second = 0;     /* second, 0 -> 59 */
    time_buffer.ticks  = 20;    /* ticks           */
    delta_clock_set(&time_buffer);

    status = delta_clock_get(DELTA_CLOCK_GET_TOD, &time);
    printf("\n****Task wake test start on %d/%d/%d, %d:%d:%d:%d.****\n\r",
           time.year, time.month, time.day, time.hour,
           time.minute, time.second, time.ticks);

    /*------------------------------------------------------------------*/
    /* Create application tasks and start them.                         */
    /*------------------------------------------------------------------*/
    for (i = 0; i < 5; ++i) {
        Task_name[i] = delta_build_name('P', 'H', 'L', ' ');
        status = delta_task_create(Task_name[i], 10,
                                   DELTA_MINIMUM_STACK_SIZE * 2,
                                   DELTA_DEFAULT_MODES,
                                   DELTA_DEFAULT_ATTRIBUTES, &tid[i]);
        if (status == DELTA_SUCCESSFUL)
            status = delta_task_start(tid[i], phil, i);
    }

    /*------------------------------------------------------------------*/
    /* Create five semaphores as forks.                                 */
    /*------------------------------------------------------------------*/
    for (i = 0; i < 5; ++i) {
        Sem_name[i] = delta_build_name('S', 'E', 'M', ' ');
        status = delta_semaphore_create(Sem_name[i], 1,
                                        DELTA_BINARY_SEMAPHORE|DELTA_FIFO,
                                        5, &forkid[i]);
        if (status != DELTA_SUCCESSFUL) {
            printf("Semaphore creation failed\n");
        }
    }

    /*------------------------------------------------------------------*/
    /* Create the counting semaphore that admits at most four           */
    /* philosophers to the room.                                        */
    /*------------------------------------------------------------------*/
    Sem_name[1] = delta_build_name('W', 'A', 'I', 'T');
    status = delta_semaphore_create(Sem_name[1], 4,
                                    DELTA_COUNTING_SEMAPHORE|DELTA_FIFO,
                                    5, &roomwait);
    if (status != DELTA_SUCCESSFUL) {
        printf("Semaphore creation failed\n");
    }

    /*------------------------------------------------------------------*/
    /* Delete self. If deletion fails, call k_fatal.                    */
    /*------------------------------------------------------------------*/
    delta_task_delete(DELTA_SELF);
}
Philosopher Tasks
/***********************************************************************/
/*                                                                     */
/* phil                                                                */
/*                                                                     */
/* NOTE : This is a parameterized procedure for generating five tasks  */
/*                                                                     */
/***********************************************************************/
delta_task phil(delta_task_argument targ)
{
    unsigned long timeout = 0;
    int lfork, rfork;
    unsigned long msgbuf[4];
    int i;

    i = targ;
    lfork = i;
    rfork = (i + 1) % 5;
    msgbuf[0] = (unsigned long) "Fork_Ok";
    while (1) {
        delta_semaphore_obtain(roomwait, DELTA_WAIT, timeout);
        /*--------------------------------------------------------------*/
        /* Send message to the standard output device                   */
        /*--------------------------------------------------------------*/
        printf("Philosopher %i Enter Room!\n", i);
        delta_semaphore_obtain(forkid[lfork], DELTA_WAIT, timeout);
        /* Rest for 15 ticks before picking up the other fork           */
        delta_task_wake_after(15);
        delta_semaphore_obtain(forkid[rfork], DELTA_WAIT, timeout);
        /*--------------------------------------------------------------*/
        /* Send message to the standard output device                   */
        /*--------------------------------------------------------------*/
        printf("Philosopher %i Eating\n", i);
        delta_semaphore_release(forkid[lfork]);
        delta_semaphore_release(forkid[rfork]);
        delta_semaphore_release(roomwait);
        /*--------------------------------------------------------------*/
        /* Send message to the standard output device                   */
        /*--------------------------------------------------------------*/
        printf("Philosopher %i Leave Room\n", i);
        /* Sleep for 50 ticks before attempting to reenter the room     */
        delta_task_wake_after(50);
    }
}
Solution with Message Queues
In the previous solution we used semaphores to represent the forks. Alternatively, we may use message queues for the same purpose. The size of each queue is limited to one message: sending a message to a queue means putting down the fork, and receiving a message from a queue means picking up the fork.
Root Task
/***********************************************************************/
/*                                                                     */
/* root : Code for the ROOT task                                       */
/*                                                                     */
/* NOTE : This executes as the 'ROOT' task. It suspends itself         */
/*        when finished.                                               */
/*                                                                     */
/***********************************************************************/
delta_task Init(delta_task_argument argument)
{
    unsigned long msg_buf[4];
    int i;
    delta_status_code status;
    delta_time_of_day time;
    delta_time_of_day time_buffer;

    set_outputfunc((outputfunc_ptr)disp_char);
    screen_init();

    /*------------------------------------------------------------------*/
    /* Set the date and time and start the system clock running.        */
    /*------------------------------------------------------------------*/
    time_buffer.year   = 2001;  /* year, A.D.      */
    time_buffer.month  = 6;     /* month, 1 -> 12  */
    time_buffer.day    = 7;     /* day, 1 -> 31    */
    time_buffer.hour   = 9;     /* hour, 0 -> 23   */
    time_buffer.minute = 38;    /* minute, 0 -> 59 */
    time_buffer.second = 0;     /* second, 0 -> 59 */
    time_buffer.ticks  = 20;    /* ticks           */
    delta_clock_set(&time_buffer);

    status = delta_clock_get(DELTA_CLOCK_GET_TOD, &time);
    printf("\n****Task wake test start on %d/%d/%d, %d:%d:%d:%d.****\n\r",
           time.year, time.month, time.day, time.hour,
           time.minute, time.second, time.ticks);

    /*------------------------------------------------------------------*/
    /* Create application tasks and start them.                         */
    /*------------------------------------------------------------------*/
    for (i = 0; i < 5; ++i) {
        Task_name[i] = delta_build_name('P', 'H', 'L', ' ');
        status = delta_task_create(Task_name[i], 10,
                                   DELTA_MINIMUM_STACK_SIZE * 2,
                                   DELTA_DEFAULT_MODES,
                                   DELTA_DEFAULT_ATTRIBUTES, &tid[i]);
        if (status == DELTA_SUCCESSFUL)
            status = delta_task_start(tid[i], phil, i);
    }

    /*------------------------------------------------------------------*/
    /* Create five message queues as forks, and put one "fork" message  */
    /* in each queue.                                                   */
    /*------------------------------------------------------------------*/
    for (i = 0; i < 5; ++i) {
        Queue_name[i] = delta_build_name('F', 'O', 'R', 'K');
        status = delta_message_queue_create(Queue_name[i], 1, 20,
                                            DELTA_PRIORITY, &forkid[i]);
    }
    msg_buf[0] = (unsigned long) "Fork Ok";
    for (i = 0; i < 5; ++i) {
        status = delta_message_queue_send(forkid[i], msg_buf, 20);
    }

    Sem_name[1] = delta_build_name('S', 'E', 'M', '1');
    status = delta_semaphore_create(Sem_name[1], 4,
                                    DELTA_COUNTING_SEMAPHORE|DELTA_FIFO,
                                    5, &roomwait);
    if (status != DELTA_SUCCESSFUL) {
        printf("Semaphore creation failed\n");
    }

    /*------------------------------------------------------------------*/
    /* Suspend self.                                                    */
    /*------------------------------------------------------------------*/
    delta_task_suspend(DELTA_SELF);
}
Philosopher Tasks
/***********************************************************************/
/*                                                                     */
/* phil                                                                */
/*                                                                     */
/* NOTE : Runs as the task body of the five philosophers               */
/*                                                                     */
/***********************************************************************/
delta_task phil(delta_task_argument targ)
{
    unsigned long timeout = 0;
    int lfork, rfork;
    unsigned long msgbuf[4];
    int i;
    unsigned32 size;
    delta_status_code status;

    i = targ;
    lfork = i;
    rfork = (i + 1) % 5;
    msgbuf[0] = (unsigned long) "Fork_Ok";
    while (1) {
        delta_semaphore_obtain(roomwait, DELTA_WAIT, timeout);
        /*--------------------------------------------------------------*/
        /* Send message to the standard output device                   */
        /*--------------------------------------------------------------*/
        printf("Philosopher %i Enter Room!\n", i);
        status = delta_message_queue_receive(forkid[lfork], msgbuf, &size,
                                             DELTA_DEFAULT_OPTIONS, timeout);
        /* Rest for 15 ticks before picking up the other fork           */
        delta_task_wake_after(15);
        status = delta_message_queue_receive(forkid[rfork], msgbuf, &size,
                                             DELTA_DEFAULT_OPTIONS, timeout);
        /*--------------------------------------------------------------*/
        /* Send message to the standard output device                   */
        /*--------------------------------------------------------------*/
        printf("Philosopher %i Eating\n", i);
        status = delta_message_queue_send(forkid[lfork], msgbuf, 20);
        status = delta_message_queue_send(forkid[rfork], msgbuf, 20);
        delta_semaphore_release(roomwait);
        /*--------------------------------------------------------------*/
        /* Send message to the standard output device                   */
        /*--------------------------------------------------------------*/
        printf("Philosopher %i Leave Room\n", i);
        /* Sleep for 50 ticks before attempting to reenter the room     */
        delta_task_wake_after(50);
    }
}
Solution with Events
We can also provide a solution using events. In this case, the five forks are represented as five tasks: sending event 0x01 to a fork task means picking up that fork, and sending event 0x02 means putting it down.
Root Task
/***********************************************************************/
/*                                                                     */
/* root : Code for the ROOT task                                       */
/*                                                                     */
/* NOTE : This executes as the 'ROOT' task. It suspends itself         */
/*        when finished.                                               */
/*                                                                     */
/***********************************************************************/
delta_task Init(delta_task_argument argument)
{
    int i;
    delta_status_code status;
    delta_time_of_day time;
    delta_time_of_day time_buffer;

    set_outputfunc((outputfunc_ptr)disp_char);
    screen_init();

    /*------------------------------------------------------------------*/
    /* Set the date and time and start the system clock running.        */
    /*------------------------------------------------------------------*/
    time_buffer.year   = 2001;  /* year, A.D.      */
    time_buffer.month  = 6;     /* month, 1 -> 12  */
    time_buffer.day    = 7;     /* day, 1 -> 31    */
    time_buffer.hour   = 9;     /* hour, 0 -> 23   */
    time_buffer.minute = 38;    /* minute, 0 -> 59 */
    time_buffer.second = 0;     /* second, 0 -> 59 */
    time_buffer.ticks  = 20;    /* ticks           */
    delta_clock_set(&time_buffer);

    status = delta_clock_get(DELTA_CLOCK_GET_TOD, &time);
    printf("\n****Task wake test start on %d/%d/%d, %d:%d:%d:%d.****\n\r",
           time.year, time.month, time.day, time.hour,
           time.minute, time.second, time.ticks);

    /*------------------------------------------------------------------*/
    /* Create the philosopher tasks and start them.                     */
    /*------------------------------------------------------------------*/
    for (i = 0; i < 5; ++i) {
        Task_name[i] = delta_build_name('P', 'H', 'L', ' ');
        status = delta_task_create(Task_name[i], 10,
                                   DELTA_MINIMUM_STACK_SIZE * 2,
                                   DELTA_DEFAULT_MODES,
                                   DELTA_DEFAULT_ATTRIBUTES, &tid[i]);
        if (status == DELTA_SUCCESSFUL)
            status = delta_task_start(tid[i], phil, i);
    }

    /*------------------------------------------------------------------*/
    /* Create the fork tasks and start them.                            */
    /*------------------------------------------------------------------*/
    for (i = 0; i < 5; ++i) {
        Fork_name[i] = delta_build_name('F', 'O', 'R', 'K');
        status = delta_task_create(Fork_name[i], 10,
                                   DELTA_MINIMUM_STACK_SIZE * 2,
                                   DELTA_DEFAULT_MODES,
                                   DELTA_DEFAULT_ATTRIBUTES, &fid[i]);
        if (status == DELTA_SUCCESSFUL)
            status = delta_task_start(fid[i], fork, 0);
    }

    Sem_name[1] = delta_build_name('W', 'A', 'I', 'T');
    status = delta_semaphore_create(Sem_name[1], 4,
                                    DELTA_COUNTING_SEMAPHORE|DELTA_FIFO,
                                    5, &roomwait);
    if (status != DELTA_SUCCESSFUL) {
        printf("Semaphore creation failed\n");
    }

    /*------------------------------------------------------------------*/
    /* Suspend self.                                                    */
    /*------------------------------------------------------------------*/
    delta_task_suspend(DELTA_SELF);
}
Philosopher Tasks
/***********************************************************************/
/*                                                                     */
/* phil                                                                */
/*                                                                     */
/* NOTE : Runs as the task body of the five philosophers               */
/*                                                                     */
/***********************************************************************/
delta_task phil(delta_task_argument i)
{
    unsigned long timeout = 0;
    delta_status_code status;

    while (1) {
        delta_semaphore_obtain(roomwait, DELTA_WAIT, timeout);
        /*--------------------------------------------------------------*/
        /* Send message to the standard output device                   */
        /*--------------------------------------------------------------*/
        printf("Philosopher %i Enter Room!\n", i);
        /* Pick up fork in left hand                                    */
        status = delta_event_send(fid[i], 0x01);
        /* Pick up fork in right hand                                   */
        status = delta_event_send(fid[(i + 1) % 5], 0x01);
        /*--------------------------------------------------------------*/
        /* Send message to the standard output device                   */
        /*--------------------------------------------------------------*/
        printf("Philosopher %i Eating\n", i);
        /* Put down fork in left hand                                   */
        status = delta_event_send(fid[i], 0x02);
        delta_task_wake_after(5);
        /* Put down fork in right hand                                  */
        status = delta_event_send(fid[(i + 1) % 5], 0x02);
        /*--------------------------------------------------------------*/
        /* Send message to the standard output device                   */
        /*--------------------------------------------------------------*/
        printf("Philosopher %i Leave Room\n", i);
        delta_semaphore_release(roomwait);
        /* Sleep for 50 ticks before attempting to reenter the room     */
        delta_task_wake_after(50);
    }
}
Fork Tasks
/***********************************************************************/
/*                                                                     */
/* fork                                                                */
/*                                                                     */
/* NOTE : Runs as the task body of the five forks                      */
/*                                                                     */
/***********************************************************************/
delta_task fork(delta_task_argument targ)
{
    unsigned long timeout = 0;
    unsigned long revent;

    while (1) {
        /* Wait until the fork is picked up                             */
        delta_event_receive(0x01, DELTA_EVENT_ALL, timeout, &revent);
        /* Wait until the fork is put down                              */
        delta_event_receive(0x02, DELTA_EVENT_ALL, timeout, &revent);
    }
}
11.3
Sieve Algorithm for Prime Numbers
11.3.1
Problem Description
Sieve Algorithm for Prime Numbers
The problem is to print all primes in ascending order, using a sequence of processes, prime, in which each process inputs a prime from its predecessor and prints it. The process then inputs an ascending stream of numbers from its predecessor and passes them on to its successor, suppressing any that are multiples of its own prime. This beautiful solution was contributed by David Gries.
Purpose
This is a good exercise for understanding the memory allocation approach used by DeltaCORE. You can create as many tasks as available memory allows. For instance, on a standard PC with 4 MB of RAM, we have been able to create about 3,000 tasks; with each task standing for a prime number, roughly the first 3,000 prime numbers were printed out. Tasks can be created until memory runs out and the system crashes. A big challenge is to create more than 3,000 tasks with the same configuration.
11.3.2
Solution
Root Task
The task Init is created and started by the DeltaCORE kernel as part of its startup initialization. After startup, the DeltaCORE kernel simply passes control to the task Init.
/***********************************************************************/
/*                                                                     */
/* root : Code for the ROOT task                                       */
/*                                                                     */
/* NOTE : This executes as the 'ROOT' task. It deletes itself          */
/*        when finished.                                               */
/*                                                                     */
/***********************************************************************/
delta_task Init(delta_task_argument argument)
{
    /* tid[], qid[], Task_name[], Queue_name[] and g_i are globals      */
    /* shared with the prime tasks.                                     */
    int i;
    delta_status_code status;
    delta_time_of_day time;
    delta_time_of_day time_buffer;

    set_outputfunc((outputfunc_ptr)disp_char);
    screen_init();

    time_buffer.year   = 2001;  /* year, A.D.      */
    time_buffer.month  = 6;     /* month, 1 -> 12  */
    time_buffer.day    = 7;     /* day, 1 -> 31    */
    time_buffer.hour   = 9;     /* hour, 0 -> 23   */
    time_buffer.minute = 38;    /* minute, 0 -> 59 */
    time_buffer.second = 0;     /* second, 0 -> 59 */
    time_buffer.ticks  = 20;    /* ticks           */
    delta_clock_set(&time_buffer);

    status = delta_clock_get(DELTA_CLOCK_GET_TOD, &time);
    printf("\n****Task wake test start on %d/%d/%d, %d:%d:%d:%d.****\n\r",
           time.year, time.month, time.day, time.hour,
           time.minute, time.second, time.ticks);

    /*------------------------------------------------------------------*/
    /* Create the first prime task and its input queue, then start it.  */
    /*------------------------------------------------------------------*/
    Task_name[0] = delta_build_name('P', 'R', 'I', 'M');
    status = delta_task_create(Task_name[0], 10,
                               DELTA_MINIMUM_STACK_SIZE * 2,
                               DELTA_DEFAULT_MODES,
                               DELTA_DEFAULT_ATTRIBUTES, &tid[0]);
    Queue_name[0] = delta_build_name('Q', 'U', 'E', 'U');
    status = delta_message_queue_create(Queue_name[0], 1, 20,
                                        DELTA_PRIORITY, &qid[0]);
    status = delta_task_start(tid[0], prime0, 0);

    /*------------------------------------------------------------------*/
    /* Delete self. If deletion fails, call k_fatal.                    */
    /*------------------------------------------------------------------*/
    delta_task_delete(DELTA_SELF);
}
Starting Task
The starting task prime0 starts from the prime number n = 2, prints it, creates its successor, and then passes the ascending stream of numbers onto its successor, suppressing any that are multiples of n.
/***********************************************************************/
/*                                                                     */
/* prime0 : Start from n = 2, print it, and pass the number stream     */
/*          onto the successor, suppressing multiples of n.            */
/*                                                                     */
/***********************************************************************/
delta_task prime0(delta_task_argument argument)
{
    unsigned long msgbuf[4];
    unsigned long qwid;
    delta_status_code status;
    long m, n;

    /*------------------------------------------------------------------*/
    /* Send message to the standard output device                       */
    /*------------------------------------------------------------------*/
    n = 2;
    printf("\nTask 0 Output Prime Number: %ld", n);
    g_i++;
    m = n;

    /* Create the successor task and its input queue                    */
    Task_name[g_i] = delta_build_name('P', 'R', 'I', 'M');
    status = delta_task_create(Task_name[g_i], 10,
                               DELTA_MINIMUM_STACK_SIZE * 2,
                               DELTA_DEFAULT_MODES,
                               DELTA_DEFAULT_ATTRIBUTES, &tid[g_i]);
    Queue_name[g_i] = delta_build_name('Q', 'U', 'E', 'U');
    status = delta_message_queue_create(Queue_name[g_i], 1, 20,
                                        DELTA_PRIORITY, &qid[g_i]);
    qwid = qid[g_i];
    /* The successor's argument is its own index, which selects its     */
    /* input queue.                                                     */
    status = delta_task_start(tid[g_i], prime, g_i);

    while (1) {
        m = m + 1;
        if (m % n != 0) {
            msgbuf[0] = m;
            status = delta_message_queue_send(qwid, msgbuf, 20);
        }
        delta_task_wake_after(1);
    }
}
Subsequent Tasks Tasks of this kind, when created, dynamically form a task chain. Each of them stands for a prime number n received from its predecessor and acts as a filter: it removes the multiples of n and passes the rest of the number stream on to the next task.
/***********************************************************************/
/*                                                                     */
/* prime : Input a sequence of numbers from the predecessor, print the */
/*         first one, say n, and pass the rest of them on to the       */
/*         successor, suppressing any that are multiples of the        */
/*         original n.                                                 */
/*                                                                     */
/***********************************************************************/
void prime(unsigned long i, unsigned long qrid)
{
    unsigned long qwid;
    unsigned long targ[4];
    unsigned long rc;
    unsigned long timeout = 0;
    unsigned long msgbuf[4];
    unsigned long size;
    delta_status_code status;
    long m, n;

    status = delta_message_queue_receive(qrid, msgbuf, &size,
                                         DELTA_WAIT, timeout);
    n = msgbuf[0];
    g_i++;
    printf("Prime Number : %i\n", n);
    Task_name[g_i] = delta_build_name('P', 'R', 'I', 'M');
    status = delta_task_create(Task_name[g_i], 10,
                               DELTA_MINIMUM_STACK_SIZE * 2,
                               DELTA_DEFAULT_MODES,
                               DELTA_DEFAULT_ATTRIBUTES, &tid[g_i]);
    Queue_name[g_i] = delta_build_name('Q', 'U', 'E', 'U');
    status = delta_message_queue_create(Queue_name[g_i], 1, 20,
                                        DELTA_PRIORITY, &qid[g_i]);
    qwid = qid[g_i];            /* queue to the successor task */
    targ[0] = i + 1;
    targ[1] = qwid;
    status = delta_task_start(tid[g_i], prime, 0);

    while (1) {
        status = delta_message_queue_receive(qrid, msgbuf, &size,
                                             DELTA_WAIT, timeout);
        m = msgbuf[0];
        if (m % n != 0) {
            msgbuf[0] = m;
            status = delta_message_queue_send(qwid, msgbuf, 20);
        }
        delta_task_wake_after(1);
    }
}
11.4 Barber Shop Problem
11.4.1 Problem Description
Barber Shop Problem A barbershop consists of a waiting room with n chairs and a barber room containing the barber chair. If there are no customers to be served, the barber goes to sleep. If a customer enters the barbershop and all chairs are occupied, then the customer leaves the shop. If the barber is busy but chairs are available, then the customer sits in one of the free chairs. If the barber is asleep, the customer wakes the barber up. This problem was also proposed by E. W. Dijkstra [8].
Purpose This is a good exercise for understanding the scheduling approach deployed in DeltaCORE.
11.4.2 Solution
There are three kinds of tasks defined in this solution.
• The Init task is used to create the whole system.
• The barber task is used to represent the barber.
• The custom task is used to represent the customers.
Root Task The task Init is created and started by the DeltaCORE kernel as part of its startup initialization. After startup, the DeltaCORE kernel simply passes control to the task Init.
/***********************************************************************/
/*                                                                     */
/* root : Code for the ROOT task                                       */
/*                                                                     */
/* NOTE : This executes as the 'ROOT' task.                            */
/*                                                                     */
/***********************************************************************/
delta_task Init(delta_task_argument argument)
{
    void *dummy;
    void *data_ptr;
    unsigned long rc;
    unsigned long timeout = 0;
    int i;
    delta_status_code status;
    delta_time_of_day time;
    delta_time_of_day time_buffer;

    set_outputfunc((outputfunc_ptr)disp_char);
    screen_init();

    /*-------------------------------------------------------------------*/
    /* Set date to June 7, 2001, time to 9:38 AM, and start the system   */
    /* clock running.                                                    */
    /*-------------------------------------------------------------------*/
    time_buffer.year   = 2001;  /* year, A.D.       */
    time_buffer.month  = 6;     /* month, 1 -> 12   */
    time_buffer.day    = 7;     /* day, 1 -> 31     */
    time_buffer.hour   = 9;     /* hour, 0 -> 23    */
    time_buffer.minute = 38;    /* minute, 0 -> 59  */
    time_buffer.second = 0;     /* second, 0 -> 59  */
    time_buffer.ticks  = 20;    /* ticks within the current second */

    delta_clock_set(&time_buffer);
    status = delta_clock_get(DELTA_CLOCK_GET_TOD, &time);
    printf("\n****Task wake test start on %d/%d/%d, %d:%d:%d:%d.****\n\r",
           time.year, time.month, time.day,
           time.hour, time.minute, time.second, time.ticks);

    /*-------------------------------------------------------------------*/
    /* Create application barber task and start it.                      */
    /*-------------------------------------------------------------------*/
    Sem_name[0] = delta_build_name('S', 'E', 'M', '0');
    status = delta_semaphore_create(Sem_name[0], MAXBARB,
                                    DELTA_COUNTING_SEMAPHORE|DELTA_FIFO,
                                    5, &barbers);
    if (status != DELTA_SUCCESSFUL) {
        printf("Semaphore creation failed\n");
    }
    for (i = 0; i < MAXBARB; i++) {
        delta_semaphore_obtain(barbers, DELTA_NO_WAIT, timeout);
    }
    Sem_name[1] = delta_build_name('S', 'E', 'M', '1');
    status = delta_semaphore_create(Sem_name[1], MAXCUST,
                                    DELTA_COUNTING_SEMAPHORE|DELTA_FIFO,
                                    5, &customers);
    if (status != DELTA_SUCCESSFUL) {
        printf("Semaphore creation failed\n");
    }
    for (i = 0; i < MAXCUST; i++) {
        delta_semaphore_obtain(customers, DELTA_NO_WAIT, timeout);
    }
    Sem_name[2] = delta_build_name('M', 'U', 'T', 'X');
    status = delta_semaphore_create(Sem_name[2], 1,
                                    DELTA_BINARY_SEMAPHORE|DELTA_FIFO,
                                    5, &mutex);
    if (status != DELTA_SUCCESSFUL) {
        printf("Semaphore creation failed\n");
    }
    for (i = 0; i < MAXBARB; i++) {
        Task_name[i] = delta_build_name('B', 'A', 'R', ' ');
        status = delta_task_create(Task_name[i], 10,
                                   DELTA_MINIMUM_STACK_SIZE * 2,
                                   DELTA_DEFAULT_MODES,
                                   DELTA_DEFAULT_ATTRIBUTES, &btid[i]);
        if (status == DELTA_SUCCESSFUL)
            status = delta_task_start(btid[i], barber, i);
    }

    for (i = 0; i < MAXCUST; i++) {
        Cust_name[i] = delta_build_name('C', 'U', 'S', ' ');
        status = delta_task_create(Cust_name[i], 10,
                                   DELTA_MINIMUM_STACK_SIZE * 2,
                                   DELTA_DEFAULT_MODES,
                                   DELTA_DEFAULT_ATTRIBUTES, &ctid[i]);
        if (status == DELTA_SUCCESSFUL)
            status = delta_task_start(ctid[i], customer, i);
    }
    delta_task_suspend(DELTA_SELF);
}
Barber Task
/***********************************************************************/
/*                                                                     */
/* barber :                                                            */
/*                                                                     */
/* NOTE : Runs as the task of the barber                               */
/*                                                                     */
/***********************************************************************/
delta_task barber(delta_task_argument btarg)
{
    unsigned long timeout = 0;
    unsigned long i;

    i = btarg;
    while (1) {
        /* go to sleep if # of customers is 0 */
        delta_semaphore_obtain(customers, DELTA_WAIT, timeout);
        /* acquire access to 'waiting' */
        delta_semaphore_obtain(mutex, DELTA_WAIT, timeout);
        /* decrement count of waiting customers */
        waiting = waiting - 1;
        /* one barber is now ready to cut hair */
        delta_semaphore_release(barbers);
        /* release 'waiting' */
        delta_semaphore_release(mutex);
        /* cut hair */
        printf("barber %i cut hair!\n", i);
        delta_task_wake_after(1);
    }
}
Customer Tasks
/***********************************************************************/
/*                                                                     */
/* custom :                                                            */
/*                                                                     */
/* NOTE : Runs as a task of the customers                              */
/*                                                                     */
/***********************************************************************/
delta_task customer(delta_task_argument ctarg)
{
    unsigned long timeout;
    unsigned long i;

    i = ctarg;
    timeout = 0;
    while (1) {
        /* enter the critical region */
        delta_semaphore_obtain(mutex, DELTA_WAIT, timeout);
        if (waiting < CHAIRS) {
            /* increment count of waiting customers */
            waiting = waiting + 1;
            /* wake up barber if necessary */
            delta_semaphore_release(customers);
            /* release access to 'waiting' */
            delta_semaphore_release(mutex);
            /* go to sleep if # of free barbers is 0 */
            delta_semaphore_obtain(barbers, DELTA_WAIT, timeout);
            /* be seated and be serviced */
            printf("customer %i is seated and serviced!\n", i);
        }
        else {
            /* shop is full; do not wait */
            delta_semaphore_release(mutex);
        }
        delta_task_wake_after(1);
    }
}
11.5 Cigarette Smoker Problem
11.5.1 Problem Description
Cigarette Smoker Problem Consider a system with three smoker processes and one agent process. Each smoker continuously makes a cigarette and smokes it. But to make a cigarette, three ingredients are needed: tobacco, paper, and matches. One of the processes has paper, another tobacco and the third has matches. The agent has an infinite supply of all three. The agent places two of the ingredients on the table. The smoker who has the remaining ingredient can then make and smoke a cigarette, signaling the agent upon completion. The agent then puts out another two of the three ingredients and the cycle repeats. The problem can be found in [20] and [18].
Purpose This is a good exercise for gaining a better understanding of the resource allocation problem in a concurrent and distributed environment, using synchronization mechanisms such as semaphores and events provided by DeltaCORE.
11.5.2 Solution without Deadlock
Root Task The task Init is created and started by the DeltaCORE kernel as part of its startup initialization. After startup, the DeltaCORE kernel simply passes control to the task Init.
/***********************************************************************/
/*                                                                     */
/* root : Code for the ROOT task                                       */
/*                                                                     */
/* NOTE : This executes as the 'ROOT' task.  It deletes itself         */
/*        when finished.                                               */
/*                                                                     */
/***********************************************************************/
delta_task Init(delta_task_argument argument)
{
    delta_status_code status;
    delta_time_of_day time;
    delta_time_of_day time_buffer;

    set_outputfunc((outputfunc_ptr)disp_char);
    screen_init();

    time_buffer.year   = 2001;  /* year, A.D.       */
    time_buffer.month  = 6;     /* month, 1 -> 12   */
    time_buffer.day    = 7;     /* day, 1 -> 31     */
    time_buffer.hour   = 9;     /* hour, 0 -> 23    */
    time_buffer.minute = 38;    /* minute, 0 -> 59  */
    time_buffer.second = 0;     /* second, 0 -> 59  */
    time_buffer.ticks  = 20;    /* ticks within the current second */
    delta_clock_set(&time_buffer);
    status = delta_clock_get(DELTA_CLOCK_GET_TOD, &time);
    printf("\n****Task wake test start on %d/%d/%d, %d:%d:%d:%d.****\n\r",
           time.year, time.month, time.day,
           time.hour, time.minute, time.second, time.ticks);

    /*-------------------------------------------------------------------*/
    /* Create application tasks and start them.                          */
    /*-------------------------------------------------------------------*/
    Task_name[1] = delta_build_name('S', 'M', 'K', 'T');
    status = delta_task_create(Task_name[1], 10,
                               DELTA_MINIMUM_STACK_SIZE * 2,
                               DELTA_DEFAULT_MODES,
                               DELTA_DEFAULT_ATTRIBUTES, &smktid);
    Task_name[2] = delta_build_name('S', 'M', 'K', 'P');
    status = delta_task_create(Task_name[2], 10,
                               DELTA_MINIMUM_STACK_SIZE * 2,
                               DELTA_DEFAULT_MODES,
                               DELTA_DEFAULT_ATTRIBUTES, &smkpid);
    Task_name[3] = delta_build_name('S', 'M', 'K', 'M');
    status = delta_task_create(Task_name[3], 10,
                               DELTA_MINIMUM_STACK_SIZE * 2,
                               DELTA_DEFAULT_MODES,
                               DELTA_DEFAULT_ATTRIBUTES, &smkmid);
    Task_name[4] = delta_build_name('A', 'G', 'N', 'T');
    status = delta_task_create(Task_name[4], 10,
                               DELTA_MINIMUM_STACK_SIZE * 2,
                               DELTA_DEFAULT_MODES,
                               DELTA_DEFAULT_ATTRIBUTES, &agntid);

    status = delta_task_start(smktid, smokert, 0);
    status = delta_task_start(smkpid, smokerp, 0);
    status = delta_task_start(smkmid, smokerm, 0);
    status = delta_task_start(agntid, agent, 0);
    /*-------------------------------------------------------------------*/
    /* Create semaphores as tobacco, paper and match.                    */
    /*-------------------------------------------------------------------*/
    Sem_name[1] = delta_build_name('T', 'O', 'B', 'C');
    status = delta_semaphore_create(Sem_name[1], 1,
                                    DELTA_BINARY_SEMAPHORE|DELTA_FIFO,
                                    5, &tsmid);
    if (status != DELTA_SUCCESSFUL) {
        printf("Semaphore creation failed\n");
    }
    Sem_name[2] = delta_build_name('P', 'A', 'P', 'R');
    status = delta_semaphore_create(Sem_name[2], 1,
                                    DELTA_BINARY_SEMAPHORE|DELTA_FIFO,
                                    5, &psmid);
    if (status != DELTA_SUCCESSFUL) {
        printf("Semaphore creation failed\n");
    }
    Sem_name[3] = delta_build_name('M', 'A', 'T', 'C');
    status = delta_semaphore_create(Sem_name[3], 1,
                                    DELTA_BINARY_SEMAPHORE|DELTA_FIFO,
                                    5, &msmid);
    if (status != DELTA_SUCCESSFUL) {
        printf("Semaphore creation failed\n");
    }

    /*-------------------------------------------------------------------*/
    /* Delete self.  If deletion fails, call k_fatal.                    */
    /*-------------------------------------------------------------------*/
    delta_task_delete(DELTA_SELF);
}
Function Definitions
/***********************************************************************/
/*                                                                     */
/* smokert :                                                           */
/*                                                                     */
/* NOTE : Runs as task 'SMKT'.                                         */
/*                                                                     */
/***********************************************************************/
void smokert(void)
{
    unsigned long timeout = 0;
    delta_status_code status;

    while (1) {
        printf("Smoker with Tobacco Try to Smoke!\n");
        /* Pick Up Paper */
        delta_semaphore_obtain(psmid, DELTA_WAIT, timeout);
        /* Pick Up Match */
        delta_semaphore_obtain(msmid, DELTA_WAIT, timeout);
        printf("Smoker with Tobacco Smoke!\n");
        status = delta_event_send(agntid, 0x01);
        printf("Smoker with Tobacco Stop Smoke!\n");
        /* sleep for 5 ticks before trying again */
        delta_task_wake_after(5);
    }
}

/***********************************************************************/
/*                                                                     */
/* smokerp :                                                           */
/*                                                                     */
/* NOTE : Runs as task 'SMKP'.                                         */
/*                                                                     */
/***********************************************************************/
void smokerp(void)
{
    unsigned long timeout = 0;
    delta_status_code status;

    while (1) {
        printf("Smoker with Paper Try to Smoke!\n");
        /* Pick Up Tobacco */
        delta_semaphore_obtain(tsmid, DELTA_WAIT, timeout);
        /* Pick Up Match */
        delta_semaphore_obtain(msmid, DELTA_WAIT, timeout);
        printf("Smoker with Paper Smoke!\n");
        status = delta_event_send(agntid, 0x02);
        printf("Smoker with Paper Stop Smoke!\n");
        /* sleep for 5 ticks before trying again */
        delta_task_wake_after(5);
    }
}

/***********************************************************************/
/*                                                                     */
/* smokerm :                                                           */
/*                                                                     */
/* NOTE : Runs as task 'SMKM'.                                         */
/*                                                                     */
/***********************************************************************/
void smokerm(void)
{
    unsigned long timeout = 0;
    delta_status_code status;

    while (1) {
        printf("Smoker with Match Try to Smoke!\n");
        /* Pick Up Paper */
        delta_semaphore_obtain(psmid, DELTA_WAIT, timeout);
        /* Pick Up Tobacco */
        delta_semaphore_obtain(tsmid, DELTA_WAIT, timeout);
        printf("Smoker with Match Smoke!\n");
        status = delta_event_send(agntid, 0x04);
        printf("Smoker with Match Stop Smoke!\n");
        /* sleep for 5 ticks before trying again */
        delta_task_wake_after(5);
    }
}

/***********************************************************************/
/*                                                                     */
/* agent :                                                             */
/*                                                                     */
/* NOTE : Runs as task 'AGNT'.                                         */
/*                                                                     */
/***********************************************************************/
void agent(void)
{
    unsigned long timeout = 0;
    delta_status_code status;
    delta_id revent;

    while (1) {
        status = delta_event_receive(0x07, DELTA_EVENT_ANY,
                                     timeout, &revent);
        if (revent == 0x01) {
            /* Smoker with Tobacco Done */
            delta_semaphore_release(psmid);
            delta_semaphore_release(msmid);
        }
        else if (revent == 0x02) {
            /* Smoker with Paper Done */
            delta_semaphore_release(tsmid);
            delta_semaphore_release(msmid);
        }
        else if (revent == 0x03) {
            /* Smoker with Tobacco and Smoker with Paper Done */
            delta_semaphore_release(psmid);
            delta_semaphore_release(msmid);
        }
        else if (revent == 0x04) {
            /* Smoker with Match Done */
            delta_semaphore_release(psmid);
            delta_semaphore_release(tsmid);
        }
        else if (revent == 0x05) {
            /* Smoker with Tobacco and Smoker with Match Done */
            delta_semaphore_release(psmid);
            delta_semaphore_release(tsmid);
        }
        else if (revent == 0x06) {
            /* Smoker with Paper and Smoker with Match Done */
            delta_semaphore_release(tsmid);
            delta_semaphore_release(msmid);
        }
        else {
            /* Smoker with Tobacco, Smoker with Paper and
               Smoker with Match Done */
            delta_semaphore_release(psmid);
            delta_semaphore_release(msmid);
        }
    }
}
11.6 Parallel Quick Sorting
11.6.1 Problem Description
In Chapter 5, we presented a quicksort algorithm, QuickSort. Its two recursive calls can be executed in parallel because they operate on disjoint slices of the element list, so it would be nice to be able to specify that the two recursive calls to QuickSort are to be executed in parallel. In this section, we provide two solutions for parallel quick sorting.
11.6.2 First Solution
The QuickSort function is rewritten by replacing each recursive call with the creation and starting of a new task, and by replacing each return statement with a suspend command delta_task_suspend(0).
void QuickSort(delta_task_argument targ);

// QuickSort accepts an array and two range parameters
delta_task QuickSort(delta_task_argument targ)
{
    // local variables holding the mid index of the range,
    // its value intlist[mid] and the scanning indices
    int pivot;
    int scanUp, scanDown;
    int mid;
    int low, high;
    delta_id qsortid1, qsortid2;
    delta_status_code status;
    delta_task_argument targ1[3], targ2[3];

    low = targ[0];
    high = targ[1];

    // if the range is not at least two elements, notify the
    // root task and suspend
    if (high - low < 1) {
        delta_event_send(targ[2], 0x01);
        delta_task_suspend(0);
    }

    // get the mid index and its value pivot, then move the
    // pivot element to the low end of the range
    mid = (low + high) / 2;
    pivot = intlist[mid];
    intlist[mid] = intlist[low];
    intlist[low] = pivot;

    // scan from both ends, exchanging out-of-place elements
    scanUp = low + 1;
    scanDown = high;
    do {
        while (scanUp <= scanDown && intlist[scanUp] <= pivot)
            scanUp++;
        while (pivot < intlist[scanDown])
            scanDown--;
        if (scanUp < scanDown) {
            int tmp = intlist[scanUp];
            intlist[scanUp] = intlist[scanDown];
            intlist[scanDown] = tmp;
        }
    } while (scanUp < scanDown);

    // put the pivot into its final position
    intlist[low] = intlist[scanDown];
    intlist[scanDown] = pivot;

    // sort the two disjoint subranges in parallel: in place of the
    // two recursive calls, create and start a new QuickSort task
    // for each of them
    targ1[0] = low;          targ1[1] = scanDown - 1;  targ1[2] = targ[2];
    targ2[0] = scanDown + 1; targ2[1] = high;          targ2[2] = targ[2];
    status = delta_task_create(delta_build_name('Q', 'S', 'R', 'T'), 10,
                               DELTA_MINIMUM_STACK_SIZE * 2,
                               DELTA_DEFAULT_MODES,
                               DELTA_DEFAULT_ATTRIBUTES, &qsortid1);
    status = delta_task_start(qsortid1, QuickSort, &targ1);
    status = delta_task_create(delta_build_name('Q', 'S', 'R', 'T'), 10,
                               DELTA_MINIMUM_STACK_SIZE * 2,
                               DELTA_DEFAULT_MODES,
                               DELTA_DEFAULT_ATTRIBUTES, &qsortid2);
    status = delta_task_start(qsortid2, QuickSort, &targ2);
    delta_task_suspend(0);
}

Root Task The task Init is created and started by the DeltaCORE kernel as part of its startup initialization.
delta_task Init(delta_task_argument argument)
{
    unsigned long targ[4];
    unsigned long timeout = 0;
    delta_id qsortid, rootid, revent;
    int i;
    delta_status_code status;
    delta_time_of_day time;
    delta_time_of_day time_buffer;

    set_outputfunc((outputfunc_ptr)disp_char);
    screen_init();

    time_buffer.year   = 2001;  /* year, A.D.       */
    time_buffer.month  = 6;     /* month, 1 -> 12   */
    time_buffer.day    = 7;     /* day, 1 -> 31     */
    time_buffer.hour   = 9;     /* hour, 0 -> 23    */
    time_buffer.minute = 38;    /* minute, 0 -> 59  */
    time_buffer.second = 0;     /* second, 0 -> 59  */
    time_buffer.ticks  = 20;
    delta_clock_set(&time_buffer);
    status = delta_clock_get(DELTA_CLOCK_GET_TOD, &time);
    printf("\n****Task wake test start on %d/%d/%d, %d:%d:%d:%d.****\n\r",
           time.year, time.month, time.day,
           time.hour, time.minute, time.second, time.ticks);

    /*-------------------------------------------------------------------*/
    /* Create application tasks and start them.                          */
    /*-------------------------------------------------------------------*/
    delta_task_ident("INIT", &rootid);
    targ[0] = glow;
    targ[1] = ghigh;
    targ[2] = rootid;
    targ[3] = 1;
    Task_name[0] = delta_build_name('Q', 'S', 'R', 'T');
    status = delta_task_create(Task_name[0], 10,
                               DELTA_MINIMUM_STACK_SIZE * 2,
                               DELTA_DEFAULT_MODES,
                               DELTA_DEFAULT_ATTRIBUTES, &qsortid);
    if (status == DELTA_SUCCESSFUL)
        status = delta_task_start(qsortid, QuickSort, &targ);
    delta_event_receive(0x01, DELTA_EVENT_ALL, timeout, &revent);
    printf("The results are given as follows:\n");
    for (i = glow; i <= ghigh; i++)
        printf("%d ", intlist[i]);
    delta_task_delete(DELTA_SELF);
}

11.7 Runner and Timer Problem

Root Task The task Init is created and started by the DeltaCORE kernel as part of its startup initialization.
delta_task Init(delta_task_argument argument)
{
    delta_status_code status;
    delta_time_of_day time;
    delta_time_of_day time_buffer;

    set_outputfunc((outputfunc_ptr)disp_char);
    screen_init();

    time_buffer.year   = 2001;  /* year, A.D.       */
    time_buffer.month  = 6;     /* month, 1 -> 12   */
    time_buffer.day    = 7;     /* day, 1 -> 31     */
    time_buffer.hour   = 9;     /* hour, 0 -> 23    */
    time_buffer.minute = 38;    /* minute, 0 -> 59  */
    time_buffer.second = 0;     /* second, 0 -> 59  */
    time_buffer.ticks  = 20;
    delta_clock_set(&time_buffer);
    status = delta_clock_get(DELTA_CLOCK_GET_TOD, &time);
    printf("\n****Task wake test start on %d/%d/%d, %d:%d:%d:%d.****\n\r",
           time.year, time.month, time.day,
           time.hour, time.minute, time.second, time.ticks);

    /*---------------------------------------------------------------------*/
    /* Create application tasks and start them.                            */
    /*---------------------------------------------------------------------*/
    #define SUPVSTACKSIZE 2048
    Task_name[0] = delta_build_name('R', 'U', 'N', 'R');
    PRunner.Create((char*)Task_name[0],
                   task_pri, task_stack, task_mode, task_attr);
    if (status == DELTA_SUCCESSFUL)
        PRunner.Start(0);
    Task_name[1] = delta_build_name('T', 'I', 'M', 'E');
    PTimer.Create((char*)Task_name[1],
                  task_pri, task_stack, task_mode, task_attr);
    if (status == DELTA_SUCCESSFUL)
        PTimer.Start(0);

    /*-----------------------------------------------------------------*/
    /* Delete self.  If deletion fails, call k_fatal.                  */
    /*-----------------------------------------------------------------*/
    delta_task_delete(DELTA_SELF);
}
The task Runner counts the running distance and issues an ASR signal once it has run 100 meters. It then suspends itself.
/*********************************************************/
/*                                                       */
/* Runner :                                              */
/*                                                       */
/*********************************************************/
void PR::body(delta_task_argument n)
{
    deltaEvent PEvent;
    deltaSignal PSignal;
    delta_status_code status;

    metercount = 0;
    PEvent.Send(PTimer, 0x01);
    while (1) {
        metercount = metercount + 1;
        printf("Runner was running %i meters\n", metercount);
        if (metercount >= 100) {
            PSignal.Send(PTimer, 0x0001);
            delta_task_suspend(DELTA_SELF);
            metercount = 0;
        }
    }
}
The task Timer continuously counts the time. It is interrupted by Runner once Runner has run 100 meters; control is then automatically transferred to its ASR, Watcher.
/***********************************************************************/
/*                                                                     */
/* Timer :                                                             */
/*                                                                     */
/***********************************************************************/
void PT::body(delta_task_argument n)
{
    delta_interval timeout = 0;
    delta_event_set revent;
    deltaEvent PEvent;
    deltaSignal PSignal;
    delta_status_code status;
    delta_event_set event_in = 0x01;

    timecount = 0;
    status = PSignal.Catch(Watcher, DELTA_ASR);
    PEvent.Receive((unsigned int)event_in, (delta_event_set&)revent);
    while (1) {
        delta_task_wake_after(1);
        timecount = timecount + 1;
    }
}
The routine Watcher is an ASR attached to the Timer task. It performs the following two actions:
• Print out the time period used by Runner to run 100 meters.
• Suspend the Timer task.
delta_asr Watcher(delta_signal_set signal)
{
    printf("Runner was running in %i ticks\n", timecount);
    delta_task_suspend(DELTA_SELF);
    timecount = 0;
}
Chapter 12
Synchronization and Communication
In the previous chapter, we presented a number of examples for solving synchronization problems using mechanisms such as semaphores and message queues provided by DeltaOS. We will now present some more complicated structures for synchronization and communication.
12.1 Concurrency Conditions
When can two statements in a program be executed concurrently and still produce the same results? Before we answer this question, let us first define some notation:
• R(Si ) = { a1 , a2 , . . ., am }, the read set for Si , is the set of all variables whose values are referenced in statement Si during its execution.
• W(Si ) = { b1 , b2 , . . ., bn }, the write set for Si , is the set of all variables whose values are changed (written) by the execution of statement Si .
The following three conditions must be satisfied for two successive statements S1 and S2 to be executed concurrently and still produce the same result. These conditions were first stated by Bernstein and are commonly known as Bernstein's conditions.
1. R(S1 ) ∩ W(S2 ) = {}.
2. W(S1 ) ∩ R(S2 ) = {}.
3. W(S1 ) ∩ W(S2 ) = {}.
12.2 Critical Regions
Semaphores can be effectively used to solve the critical section problem, as well as arbitrary synchronization schemes. Let us review this solution to the critical section problem. All processes share a semaphore variable mutex, which is initialized to 1. Each process must execute mutex.P() before entering the critical section, and mutex.V() afterwards. If this sequence is not observed, two processes may be in their critical sections simultaneously, resulting in time-dependent errors. Let us examine the various difficulties that may result. Note that these difficulties will arise even if only a single process is not well behaved. This situation may be the result of an honest programming error or an uncooperative programmer.
• Suppose that a process interchanges the operations on the semaphore mutex. That is, it executes
    mutex.V();
    ...
    Critical section
    ...
    mutex.P();
In this situation, several processes may be executing in their critical sections simultaneously, violating the mutual exclusion requirement. This time-dependent error may be discovered only if several processes are simultaneously active in their critical sections. Note that this situation may not always be reproducible.
• Suppose that a process replaces mutex.V() with mutex.P(). That is, it executes
    mutex.P();
    ...
    Critical section
    ...
    mutex.P();
In this case, a deadlock will occur.
• Suppose that a process omits the mutex.P() or the mutex.V() or both. In this case, either mutual exclusion is violated or a deadlock will occur.
The examples above illustrate that time-dependent errors can be easily generated when semaphores are used to solve the critical section problem. To overcome this deficiency, Brinch Hansen and Tony Hoare introduced a new language construct, the critical region [11]. A variable v of type T, which is to be shared among many processes, can be declared:
    var v : shared T;
The variable v can be accessed only inside a region statement of the following form:
    region v do S;
This construct means that while statement S is being executed, no other process can access the variable v. Thus, if the two statements,
    region v do S1;
    region v do S2;
are executed concurrently in distinct sequential processes, the result will be equivalent to the sequential execution "S1 followed by S2" or "S2 followed by S1". The critical-region construct guards against some simple errors associated with the semaphore solution to the critical section problem which may be made by a programmer. Note that it does not necessarily eliminate time-dependent errors, but rather reduces their number. If errors occur in the logic of the program, reproducing a particular sequence of events may not be simple.
12.2.1 Specification of Critical Regions
Let us now illustrate how the critical region construct could be implemented in terms of DeltaOS. For each declaration
    var v : shared T;
we define a semaphore mutex initialized to 1.
#include

typedef void (*SHARE_FUNCT)();

class CriticalRegion {
private:
    deltaSemaphore mutex;
public:
    // Constructors
    CriticalRegion(void);
    void DoRegion(SHARE_FUNCT ShareAddr);
};
12.2.2 Implementation of Critical Regions
For each statement
    region v do S;
we will have the following code:
    mutex.P();
    S;
    mutex.V();
Clearly, mutual exclusion is preserved as required by the semantics of the critical region statement.
#include "crtregion.h"

// Default initialize condition
CriticalRegion::CriticalRegion(void)
{
    mutex.Create("MUTX", 1,
                 DELTA_COUNTING_SEMAPHORE|DELTA_PRIORITY, 10);
}

// Enter the critical region
void CriticalRegion::DoRegion(SHARE_FUNCT ShareAddr)
{
    unsigned long time_out = 0;

    mutex.Obtain(DELTA_WAIT, time_out);
    ShareAddr();
    mutex.Release();
}
12.3 Conditional Critical Regions
The critical region construct can be effectively used to solve the critical section problem. It cannot, however, be used to solve some general synchronization problems. For this reason the conditional critical region was introduced by Hoare. The major difference between the critical region and the conditional critical region constructs is in the region statement, which now has the form:
    region v when B do S;
where B is a boolean expression. As before, regions referring to the same shared variable exclude each other in time. Now, however, when a process enters the critical region, the boolean expression B is evaluated. If the expression is true, statement S is executed. If it is false, the process relinquishes the mutual exclusion and is delayed until B becomes true and no other process is in the region associated with v.
12.3.1 Specification of Conditional Critical Regions
Let us illustrate how the conditional critical region could be implemented. With each shared variable x, the following variables are associated:
    deltaSemaphore xmutex, xwait;
    int xcount, xtemp;
Mutually exclusive access to the critical section is provided by xmutex. If a process cannot enter the critical section because the boolean condition B is false, it waits on the xwait semaphore. We keep track of the number of processes waiting on xwait with xcount. When a process leaves the critical section, it may have changed the value of some boolean condition B that prevented another process from entering the critical section. Accordingly, we must trace through the queue of processes waiting on xwait, allowing each process to test its boolean condition. We use xtemp to determine when we have allowed each process to test its condition. Accordingly, xmutex is initialized to 1, while xwait, xcount, and xtemp are initialized to 0. A statement
    region x when B do S;
can be implemented by a procedure DoCondRegion.
#include

typedef void (*SHARE_FUNCT)();
typedef int (*BOOL_FUNCT)();

class CondCriticalRegion {
private:
    deltaSemaphore xmutex, xwait;
    int xcount, xtemp;
public:
    // constructors
    CondCriticalRegion(void);
    void DoCondRegion(BOOL_FUNCT BoolAddr, SHARE_FUNCT ShareAddr);
};
12.3.2 Implementation of Conditional Critical Regions
The implementation assumes a FIFO ordering in the queueing of processes for a semaphore.
// Default initialize condition
CondCriticalRegion::CondCriticalRegion(void)
{
    xmutex.Create("MUTX", 1,
                  DELTA_COUNTING_SEMAPHORE|DELTA_PRIORITY, 10);
    xwait.Create("XWAT", 0,
                 DELTA_COUNTING_SEMAPHORE|DELTA_PRIORITY, 10);
    xcount = 0;
    xtemp = 0;
}

// Enter the conditional critical region
void CondCriticalRegion::DoCondRegion(BOOL_FUNCT BoolAddr,
                                      SHARE_FUNCT ShareAddr)
{
    unsigned long time_out = 0;

    xmutex.Obtain(DELTA_WAIT, time_out);
    if (!BoolAddr()) {
        xcount = xcount + 1;
        xmutex.Release();
        xwait.Obtain(DELTA_WAIT, time_out);
        while (!BoolAddr()) {
            xtemp = xtemp + 1;
            if (xtemp < xcount)
                xwait.Release();
            else
                xmutex.Release();
            xwait.Obtain(DELTA_WAIT, time_out);
        }
        xcount = xcount - 1;
    }
    ShareAddr();
    if (xcount > 0) {
        xtemp = 0;
        xwait.Release();
    }
    else
        xmutex.Release();
}
Note that this implementation requires the reevaluation of the expression B for all waiting processes every time a process leaves the critical section. If several processes are delayed, waiting for their respective boolean expressions to become true, this reevaluation overhead may result in inefficient code. There are various optimization methods that can be used to reduce this overhead.
Hoare's construct allows processes to be delayed only at the beginning of a critical region. There are, however, circumstances where synchronization conditions must be placed anywhere within the critical region. This observation led Brinch Hansen to the following region construct:
    region v do
    begin
        S1;
        await(B);
        S2;
    end;
When a process enters the region, it executes statement S1 (S1 may be null). It then evaluates B. If B is true, S2 is executed. If B is false, the process relinquishes mutual exclusion and is delayed until B becomes true and no other process is in the region associated with v.
We illustrate this new construct by coding a variant of the second readers/writers problem. The second readers/writers problem requires that once a writer is ready, it may write as soon as possible. Thus a reader can enter its critical section only if there are no writers either waiting or in their critical sections. The reader-writer class is declared as follows:
#include

class ReaderWriter2 {
private:
    int busy, nreaders, nwriters;
    deltaSemaphore xmutex, xwait;
    int xcount, xtemp;

    int OpenReadBool(void);
    void OpenReadFunc1(void);
    void OpenReadFunc2(void);
    int CloseReadBool(void);
    void CloseReadFunc1(void);
    void CloseReadFunc2(void);
    int OpenWriteBool(void);
    void OpenWriteFunc1(void);
    void OpenWriteFunc2(void);
    int CloseWriteBool(void);
    void CloseWriteFunc1(void);
    void CloseWriteFunc2(void);
public:
    // constructors
    ReaderWriter2(void);

    void OpenRead(void);
    void CloseRead(void);
    void OpenWrite(void);
    void CloseWrite(void);
};
ReaderWriter2 is implemented as follows: // Default initialize condition ReaderWriter2::ReaderWriter2(void) { xmutex.Create("MUTX", 1, DELTA_COUNTING_SEMAPHORE|DELTA_PRIORITY, 10); xwait.Create("XWAT", 0, DELTA_COUNTING_SEMAPHORE|DELTA_PRIORITY, 10); xcount = 0; xtemp = 0; busy = 0; nreaders = 0; nwriters = 0; } int ReaderWriter2::OpenReadBool(void) { return (nwriters == 0); } void ReaderWriter2::OpenReadFunc1(void) {} void ReaderWriter2::OpenReadFunc2(void) { nreaders = nreaders + 1; } // Enter the conditional critical region void ReaderWriter2::OpenRead(void) { unsigned long time_out = 0;
12.3. CONDITIONAL CRITICAL REGIONS
xmutex.Obtain(DELTA_WAIT,time_out); OpenReadFunc1() ; if (!OpenReadBool()) { xcount = xcount + 1; xmutex.Release(); xwait.Obtain(DELTA_WAIT,time_out); while (!OpenReadBool()) { xtemp = xtemp + 1; if (xtemp < xcount) xwait.Release(); else xmutex.Release(); xwait.Obtain(DELTA_WAIT,time_out); } xcount = xcount - 1; }; OpenReadFunc2(); if (xcount > 0) { xtemp = 1; xwait.Release(); } else xmutex.Release(); } int ReaderWriter2::CloseReadBool(void) { return 1; } void ReaderWriter2::CloseReadFunc1(void) { nreaders = nreaders - 1; } void ReaderWriter2::CloseReadFunc2(void) {} // Enter the conditional critical region void ReaderWriter2::CloseRead(void) { unsigned long time_out = 0; xmutex.Obtain(DELTA_WAIT,time_out);
    CloseReadFunc1();
    if (!CloseReadBool()) {
        xcount = xcount + 1;
        xmutex.Release();
        xwait.Obtain(DELTA_WAIT, time_out);
        while (!CloseReadBool()) {
            xtemp = xtemp + 1;
            if (xtemp < xcount)
                xwait.Release();
            else
                xmutex.Release();
            xwait.Obtain(DELTA_WAIT, time_out);
        }
        xcount = xcount - 1;
    }
    CloseReadFunc2();
    if (xcount > 0) {
        xtemp = 1;
        xwait.Release();
    }
    else
        xmutex.Release();
}

int ReaderWriter2::OpenWriteBool(void)
{
    return (!busy && (nreaders == 0));
}

void ReaderWriter2::OpenWriteFunc1(void)
{
    nwriters = nwriters + 1;
}

void ReaderWriter2::OpenWriteFunc2(void)
{
    busy = 1;
}

// Enter the conditional critical region
void ReaderWriter2::OpenWrite(void)
{
    unsigned long time_out = 0;
    xmutex.Obtain(DELTA_WAIT, time_out);
    OpenWriteFunc1();
    if (!OpenWriteBool()) {
        xcount = xcount + 1;
        xmutex.Release();
        xwait.Obtain(DELTA_WAIT, time_out);
        while (!OpenWriteBool()) {
            xtemp = xtemp + 1;
            if (xtemp < xcount)
                xwait.Release();
            else
                xmutex.Release();
            xwait.Obtain(DELTA_WAIT, time_out);
        }
        xcount = xcount - 1;
    }
    OpenWriteFunc2();
    if (xcount > 0) {
        xtemp = 1;
        xwait.Release();
    }
    else
        xmutex.Release();
}

int ReaderWriter2::CloseWriteBool(void)
{
    return 1;
}

void ReaderWriter2::CloseWriteFunc1(void)
{
    nwriters = nwriters - 1;
    busy = 0;
}

void ReaderWriter2::CloseWriteFunc2(void) {}

// Enter the conditional critical region
void ReaderWriter2::CloseWrite(void)
{
    unsigned long time_out = 0;

    xmutex.Obtain(DELTA_WAIT, time_out);
    CloseWriteFunc1();
    if (!CloseWriteBool()) {
        xcount = xcount + 1;
        xmutex.Release();
        xwait.Obtain(DELTA_WAIT, time_out);
        while (!CloseWriteBool()) {
            xtemp = xtemp + 1;
            if (xtemp < xcount)
                xwait.Release();
            else
                xmutex.Release();
            xwait.Obtain(DELTA_WAIT, time_out);
        }
        xcount = xcount - 1;
    }
    CloseWriteFunc2();
    if (xcount > 0) {
        xtemp = 1;
        xwait.Release();
    }
    else
        xmutex.Release();
}
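For readers more familiar with standard C++ than with DeltaOS semaphores, the same "S1; await(B); S2;" pattern can be sketched with a std::condition_variable. This is an illustrative sketch only, not part of DeltaOS; the class name ConditionalRegion and its execute method are invented here:

```cpp
#include <mutex>
#include <condition_variable>

// Illustrative sketch of Brinch Hansen's conditional critical region,
// expressed with standard C++ primitives. All names are made up.
class ConditionalRegion {
public:
    // Runs "S1; await(B); S2;" under mutual exclusion.
    template <class F1, class Pred, class F2>
    void execute(F1 s1, Pred pred, F2 s2) {
        std::unique_lock<std::mutex> lock(m_);
        s1();                   // S1 executes under mutual exclusion
        cv_.wait(lock, pred);   // await(B): releases exclusion while B is false
        s2();                   // S2 executes once B holds
        cv_.notify_all();       // let other delayed processes reevaluate B
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
};
```

A call such as r.execute(s1, pred, s2) blocks inside the region until pred() becomes true, exactly as await(B) does; notify_all plays the role of reevaluating the conditions of the delayed processes.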
A reader process must invoke the operations of an instance rw of the ReaderWriter2 class only in the following sequence:

rw.OpenRead();
... read file ...
rw.CloseRead();

A writer process must invoke the operations in the sequence:

rw.OpenWrite();
... write file ...
rw.CloseWrite();

However, the class concept alone cannot guarantee that such sequences will be observed. In particular,

• A process might operate on the file without first gaining access permission to it (by a direct call to read or to write the file);
• A process might never release the file once it has been granted access to it;

• A process might attempt to release a file that it never requested;

• A process might request the same file twice (without first releasing it);

and so on. These difficulties can be overcome by using a synchronization mechanism called the monitor [13].
12.4 Monitors
12.4.1 Definitions and Implementations
A monitor is a mechanism that allows the safe and effective sharing of abstract data types among several tasks. The main semantic difference between monitors and classes is that the monitor assures mutual exclusion; that is, only one task at a time can be active within the monitor. This property is guaranteed by the monitor itself, so the programmer need not explicitly code this synchronization constraint.

A monitor may also declare condition variables. A condition x supports two operations: x.SWAIT(), which suspends the invoking process on x, and x.SSIGNAL(), which resumes exactly one process suspended on x (and has no effect if no process is suspended).

Now suppose that when the x.SSIGNAL() operation is invoked by a process P, there is a suspended process Q associated with condition x. Clearly, if the suspended process Q is allowed to resume its execution, the signaling process P must wait; otherwise, both P and Q would be active simultaneously within the monitor. Note, however, that conceptually both processes can continue with their execution. Two possibilities exist:

1. P waits until Q either leaves the monitor or waits for another condition;

2. Q waits until P either leaves the monitor or waits for another condition.

There are reasonable arguments in favor of adopting either (1) or (2). Since P was already executing in the monitor, choice (2) seems more reasonable. However, if we allow process P to continue, the "logical" condition for which Q was waiting may no longer hold by the time Q is resumed.

We will now consider a possible implementation of the monitor mechanism. For each monitor, a semaphore mutex (initialized to 1) is provided. mutex.P must be executed before entering the monitor, and mutex.V after leaving it. Since a signaling process must wait until the resumed process either leaves or waits, an additional semaphore, next, initialized to 0, is introduced, on which the signaling processes may suspend themselves. An integer variable nextcount counts the number of processes suspended on next. Thus each external procedure F is replaced by:
mutex.P();
    ...
    body of F;
    ...
if (nextcount > 0)
    next.V();
else
    mutex.V();

Mutual exclusion within a monitor is thus ensured.

We can now describe how condition variables are implemented. For each condition x we introduce a semaphore xsem and an integer variable xcount, both initialized to 0. The operation x.SWAIT() can now be implemented as:

xcount = xcount + 1;
if (nextcount > 0)
    next.V();
else
    mutex.V();
xsem.P(SM_WAIT, time_out);
xcount = xcount - 1;
The operation x.SSIGNAL() can be implemented as:

if (xcount > 0) {
    nextcount = nextcount + 1;
    xsem.V();
    next.P(SM_WAIT, time_out);
    nextcount = nextcount - 1;
}

This implementation is applicable to both definitions (1) and (2) of monitors. In some cases, however, the generality of the implementation is unnecessary and a significant improvement in efficiency is possible.

Note that if a monitor M1 calls another monitor M2, the mutual exclusion in M1 is not released while executing in M2. This fact has two consequences:

• Any process calling M1 will be blocked outside of M1 on mutex during this time period.

• If the process enters a condition queue in M2, a deadlock may occur.

This problem is called the nested monitor calls problem.
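As a point of comparison, the mutex/next/xsem bookkeeping above is essentially what a std::condition_variable provides directly, with "signal and continue" semantics (choice (2)) plus a predicate re-check loop. The following sketch, with invented names and no DeltaOS calls, shows a one-slot monitor in standard C++:

```cpp
#include <mutex>
#include <condition_variable>

// Illustrative one-slot monitor in standard C++ (names are made up).
// The lock acquired on entry plays the role of "mutex"; wait/notify_all
// play the roles of SWAIT/SSIGNAL.
class MonitorCell {
public:
    void put(int v) {
        std::unique_lock<std::mutex> lock(m_);     // enter the monitor
        cond_.wait(lock, [&]{ return !full_; });   // SWAIT until the slot is free
        item_ = v;
        full_ = true;
        cond_.notify_all();                        // SSIGNAL: wake a waiting getter
    }

    int get() {
        std::unique_lock<std::mutex> lock(m_);
        cond_.wait(lock, [&]{ return full_; });    // SWAIT until the slot is full
        full_ = false;
        cond_.notify_all();                        // SSIGNAL: wake a waiting putter
        return item_;
    }

private:
    std::mutex m_;
    std::condition_variable cond_;
    int item_ = 0;
    bool full_ = false;
};
```

Because the waiter re-checks its predicate after being resumed, the "logical condition may no longer hold" hazard of choice (2) is handled by the loop inside wait.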
12.4.2 Monitor SynStack
We now present a monitor-based implementation of stacks using the method described above.
Specification of SynStack

SynStack is an abstract data type in which only one task at a time can be active. If the stack is full, a task that wants to insert an item must wait until another task deletes an item from the stack. If the stack is empty, a task that wants to delete an item must wait until another task inserts an item into the stack.
#include

const int MaxStackSize = 100;

template <class T>
class SynStack
{
private:
    // Private data members, stack array, and top
    T stacklist[MaxStackSize];
    int top;

    // For each condition x, we need a semaphore xsem and an integer
    // xcount, both initialized to 0
    deltaSemaphore xsem;
    int xcount;
    deltaSemaphore mutex, next;
    int nextcount;

    void SWAIT(void);
    void SSIGNAL(void);

public:
    // Constructor; initialize the top
    SynStack();
    ~SynStack();

    // Stack modification operations
    void Push(const T& item);
    T Pop(void);
    void ClearStack(void);

    // Stack access
    T Peer(void);

    // Stack test methods
    int StackEmpty(void);
    int StackFull(void);
};
12.4.3 Implementation of SynStack
Stack Initialization

The SynStack constructor initializes the index top to -1, which denotes an empty stack, and creates the semaphores necessary for synchronization control.

#include "synstack.h"

// Initialize stack top and create the semaphores
template <class T>
SynStack<T>::SynStack(void) : top(-1)
{
    xsem.Create("XSEM", 0, DELTA_COUNTING_SEMAPHORE|DELTA_FIFO, 10);
    mutex.Create("MUTX", 1, DELTA_COUNTING_SEMAPHORE|DELTA_FIFO, 10);
    next.Create("NEXT", 0, DELTA_COUNTING_SEMAPHORE|DELTA_FIFO, 10);
    xcount = 0;
    nextcount = 0;
}

// Delete the semaphores
template <class T>
SynStack<T>::~SynStack(void)
{
    xsem.Delete();
    mutex.Delete();
    next.Delete();
}

template <class T>
void SynStack<T>::SWAIT(void)
{
    unsigned long time_out = 0;

    xcount = xcount + 1;
    if (nextcount > 0)
        next.Release();
    else
        mutex.Release();
    xsem.Obtain(DELTA_WAIT, time_out);
    xcount = xcount - 1;
}

template <class T>
void SynStack<T>::SSIGNAL(void)
{
    unsigned long time_out = 0;

    if (xcount > 0) {
        nextcount = nextcount + 1;
        xsem.Release();
        next.Obtain(DELTA_WAIT, time_out);
        nextcount = nextcount - 1;
    }
}

template <class T>
void SynStack<T>::Push(const T& item)
{
    unsigned long time_out = 0;

    mutex.Obtain(DELTA_WAIT, time_out);
    // If the stack is full, wait on "xsem"
    if (top == MaxStackSize-1) {
        SWAIT();
    }
    // Increment top and copy item to stacklist
    top++;
    stacklist[top] = item;
    SSIGNAL();
    if (nextcount > 0)
        next.Release();
    else
        mutex.Release();
}

// Pop the stack and return the top element
template <class T>
T SynStack<T>::Pop(void)
{
    T temp;
    unsigned long time_out = 0;

    mutex.Obtain(DELTA_WAIT, time_out);
    // If the stack is empty, wait on "xsem"
    if (top == -1) {
        SWAIT();
    }
    // Record the top element
    temp = stacklist[top];
    // Decrement top and return former top element
    top--;
    SSIGNAL();
    if (nextcount > 0)
        next.Release();
    else
        mutex.Release();
    return temp;
}

// Return the value at the top of the stack
template <class T>
T SynStack<T>::Peer(void)
{
    T temp;
    unsigned long time_out = 0;

    mutex.Obtain(DELTA_WAIT, time_out);
    // If the stack is empty, wait on "xsem"
    if (top == -1) {
        SWAIT();
    }
    temp = stacklist[top];
    SSIGNAL();
    if (nextcount > 0)
        next.Release();
    else
        mutex.Release();
    return temp;
}

// Test for empty stack
template <class T>
int SynStack<T>::StackEmpty(void)
{
    int temp;
    unsigned long time_out = 0;

    mutex.Obtain(DELTA_WAIT, time_out);
    // Return the logical value (top == -1)
    temp = (top == -1);
    if (nextcount > 0)
        next.Release();
    else
        mutex.Release();
    return temp;
}

// Test for full stack
template <class T>
int SynStack<T>::StackFull(void)
{
    int temp;
    unsigned long time_out = 0;

    mutex.Obtain(DELTA_WAIT, time_out);
    // Return the logical value (top == MaxStackSize-1)
    temp = (top == MaxStackSize - 1);
    if (nextcount > 0)
        next.Release();
    else
        mutex.Release();
    return temp;
}

// Clear all items from the stack
template <class T>
void SynStack<T>::ClearStack(void)
{
    unsigned long time_out = 0;
    mutex.Obtain(DELTA_WAIT, time_out);
    top = -1;
    if (nextcount > 0)
        next.Release();
    else
        mutex.Release();
}
12.4.4 Application of SynStack
We now present a solution to the dining philosophers problem using SynStack with MaxStackSize = 1. Instead of using semaphores as the means of synchronization control, we deploy the monitor-based stack to represent the shared forks.

SynStack<int> forkid[5];
Pushing an item onto the stack thus means picking up a fork, and popping an item from the stack means putting down a fork.

delta_task phil(delta_task_argument targ)
{
    unsigned long timeout = 0;
    int lfork, rfork;
    int i;

    i = targ;
    lfork = i;
    rfork = (i + 1) % 5;
    while (1) {
        delta_semaphore_obtain(roomwait, DELTA_WAIT, timeout);
        /* Send message to the standard output device */
        printf("Philosopher %i Enter Room!\n", i);
        forkid[lfork].Push(0);
        /* Rest for 15 ticks before picking up the other fork */
        delta_task_wake_after(15);
        forkid[rfork].Push(0);
        /* Send message to the standard output device */
        printf("Philosopher %i Eating\n", i);
        forkid[lfork].Pop();
        forkid[rfork].Pop();
        delta_semaphore_release(roomwait);
        /* Send message to the standard output device */
        printf("Philosopher %i Leave Room\n", i);
        /* Sleep for 50 ticks before attempting to reenter the room */
        delta_task_wake_after(50);
    }
}
12.4.5 Condition as a Class
Specification of Condition

#include

class Condition
{
private:
    // For each condition x, we need a semaphore xsem and an integer
    // xcount, both initialized to 0
    deltaSemaphore xsem;
    int xcount;
    deltaSemaphore mutex, next;

public:
    int nextcount;

    // Constructors
    Condition();
    Condition(deltaSemaphore& m, deltaSemaphore& n);
    ~Condition();

    void SWAIT(void);
    void SSIGNAL(void);
};
Implementation of Condition

#include "condition.h"

// Default constructor: initialize the condition
Condition::Condition(void)
{
    // Semaphore initialized to 0
    xsem.Create("XSEM", 0,
                DELTA_COUNTING_SEMAPHORE|DELTA_PRIORITY, 10);
    // Initialize "xcount" to 0
    xcount = 0;
}

Condition::~Condition(void)
{
    xsem.Delete();
}

// Initialize the condition with the monitor's mutex and next semaphores
Condition::Condition(deltaSemaphore& m, deltaSemaphore& n)
{
    mutex = m;
    next = n;
}

// The calling task waits on the condition until another task resumes it
// by issuing SSIGNAL.
void Condition::SWAIT(void)
{
    unsigned long time_out = 0;

    xcount = xcount + 1;
    if (nextcount > 0)
        next.Release();
    else
        mutex.Release();
    xsem.Obtain(DELTA_WAIT, time_out);
    xcount = xcount - 1;
}

// Resume exactly one task that is waiting on the condition.
void Condition::SSIGNAL(void)
{
    unsigned long time_out = 0;

    if (xcount > 0) {
        nextcount = nextcount + 1;
        xsem.Release();
        next.Obtain(DELTA_WAIT, time_out);
        nextcount = nextcount - 1;
    }
}
12.4.6 Implementation of SynStack Again
We can now redesign the SynStack class. The improvements are as follows:

1. The semaphore xsem and the integer xcount are hidden inside Condition.

2. The entry and exit code sequences are grouped into the procedures Prologue and Epilogue.

This yields a much cleaner version of SynStack.

#include "condition.h"

const int MaxStackSize = 1;

template <class T>
class SynStack
{
private:
    // Private data members, stack array, and top
    T stacklist[MaxStackSize];
    int top;
    deltaSemaphore mutex, next;
    Condition cond;

    void Prologue();
    void Epilogue();

public:
    // Constructor; initialize the top
    SynStack();

    // Stack modification operations
    void Push(const T& item);
    T Pop(void);
    void ClearStack(void);

    // Stack access
    T Peer(void);

    // Stack test methods
    int StackEmpty(void);
    int StackFull(void);
};

The implementation is then rewritten as follows:

// Initialize stack top, create the semaphores, and bind them to the
// condition object
template <class T>
SynStack<T>::SynStack(void) : top(-1)
{
    mutex.Create("MUTX", 1, DELTA_COUNTING_SEMAPHORE|DELTA_FIFO, 10);
    next.Create("NEXT", 0, DELTA_COUNTING_SEMAPHORE|DELTA_FIFO, 10);
    cond = Condition(mutex, next);
    cond.nextcount = 0;
}

template <class T>
void SynStack<T>::Prologue()
{
    unsigned long time_out = 0;

    mutex.Obtain(DELTA_WAIT, time_out);
}

template <class T>
void SynStack<T>::Epilogue()
{
    if (cond.nextcount > 0)
        next.Release();
    else
        mutex.Release();
}

template <class T>
void SynStack<T>::Push(const T& item)
{
    Prologue();
    // If the stack is full, wait on the condition
    if (top == MaxStackSize-1) {
        cond.SWAIT();
    }
    // Increment top and copy item to stacklist
    top++;
    stacklist[top] = item;
    cond.SSIGNAL();
    Epilogue();
}

// Pop the stack and return the top element
template <class T>
T SynStack<T>::Pop(void)
{
    T temp;

    Prologue();
    // If the stack is empty, wait on the condition
    if (top == -1) {
        cond.SWAIT();
    }
    // Record the top element
    temp = stacklist[top];
    // Decrement top and return former top element
    top--;
    cond.SSIGNAL();
    Epilogue();
    return temp;
}

// Return the value at the top of the stack
template <class T>
T SynStack<T>::Peer(void)
{
    T temp;

    Prologue();
    // If the stack is empty, wait on the condition
    if (top == -1) {
        cond.SWAIT();
    }
    temp = stacklist[top];
    cond.SSIGNAL();
    Epilogue();
    return temp;
}

// Test for empty stack
template <class T>
int SynStack<T>::StackEmpty(void)
{
    int temp;

    Prologue();
    // Return the logical value (top == -1)
    temp = (top == -1);
    Epilogue();
    return temp;
}

// Test for full stack
template <class T>
int SynStack<T>::StackFull(void)
{
    int temp;

    Prologue();
    // Return the logical value (top == MaxStackSize-1)
    temp = (top == MaxStackSize - 1);
    Epilogue();
    return temp;
}

// Clear all items from the stack
template <class T>
void SynStack<T>::ClearStack(void)
{
    Prologue();
    top = -1;
    Epilogue();
}
12.5 Interprocess Communication
Many of the problems we have described, and more, are presented as synchronization problems. In a larger sense, however, they are simple instances of the general problem of allowing cooperating processes to communicate. In this section, we are concerned with that general problem of interprocess communication.

Principally, there exist two complementary communication schemes: shared memory and message systems. Shared-memory systems require communicating processes to share some variables; the processes exchange information through the use of these shared variables. For example, the bounded-buffer scheme could be used for this purpose. In a shared-memory scheme the responsibility for providing communication rests with the application programmers; the operating system only needs to provide the shared memory. The message-system scheme allows the processes to exchange messages; here the responsibility for providing communication rests with the operating system itself. Obviously, these two schemes are not mutually exclusive, and they could be used simultaneously within a single operating system. In this section, we focus primarily on message systems, since the shared-memory scheme is basically application oriented.

The function of a message system is to allow processes to communicate with each other without resorting to shared variables. An interprocess communication facility basically provides two operations: send(message) and receive(message). If processes P and Q want to communicate, they must send messages to and receive messages from each other. In order to do so, a communication link must exist between them. This link can be implemented in a variety of ways. We are not concerned here with the physical implementation of a link (such as shared memory or a hardware bus), but rather with its logical implementation and logical properties. Some basic implementation questions are:

• How are links established?

• Can a link be associated with more than two processes?

• How many links can there be between every pair of processes?

• What is the capacity of a link? That is, does the link have some buffer space? If so, how much?

• What is the size of messages? Can the link accommodate variable-size or fixed-size messages?

• Is a link unidirectional or bidirectional? That is, if a link exists between P and Q, can messages flow in only one direction (such as only from P to Q) or in both directions?
The definition of unidirectional must be stated carefully, since a link may be associated with more than two processes. Thus we say that a link is unidirectional only if each process connected to the link can either send or receive, but not both, and each link has at least one receiver process connected to it.

In addition, there are several methods for logically implementing a link and the send/receive operations:

• Direct or indirect communication.

• Send to a process or to a mailbox.

• Symmetric or asymmetric communication.

• Automatic or explicit buffering.

• Send by copy or send by reference.

• Fixed-size or variable-size messages.

In the following, we elaborate on these types of message systems.
12.5.1 Naming
In this section we consider the first three issues concerning the logical implementation of a link. Primarily, communication between two processes can be either direct or indirect.

Direct Communication

In the direct communication discipline, each process that wants to send or receive a message must explicitly name the recipient or sender of the communication. In this scheme, the send and receive primitives are defined as follows:

• send(P, message). Send a message to process P.

• receive(Q, message). Receive a message from process Q.

A communication link in this scheme has the following properties:

• A link is established automatically between every pair of processes that want to communicate. The processes need only know each other's identity to communicate.

• A link is associated with exactly two processes.

• Between each pair of communicating processes, there exists exactly one link.

• The link is bidirectional.

In DeltaOS, events are provided as the means of direct communication between processes.
Indirect Communication

With indirect communication, messages are sent to and received from mailboxes (also referred to as ports). A mailbox can be viewed abstractly as an object into which messages may be placed by processes and from which messages may be removed. Each mailbox has a unique identification that distinguishes it. In this scheme, a process may communicate with some other process via a number of different mailboxes. Two processes may communicate only if they have a shared mailbox. The send and receive primitives are defined as follows:

• send(A, message). Send a message to mailbox A.

• receive(A, message). Receive a message from mailbox A.

In this scheme, a communication link has the following properties:

• A link is established between a pair of processes only if they have a shared mailbox.

• A link may be associated with more than two processes.

• Between each pair of communicating processes there may be a number of different links, each corresponding to one mailbox.

• The link may be either unidirectional or bidirectional.

Now suppose that processes P1, P2, and P3 all share mailbox A. Process P1 sends a message to A, while P2 and P3 each execute a receive from A. Who will receive the message sent by P1? This question can be resolved in a variety of ways:

• Allow a link to be associated with at most two processes.

• Allow at most one process at a time to execute a receive operation.

• Allow the system to select the receiver arbitrarily (that is, either P2 or P3, but not both, will receive the message). The system may identify the receiver to the sender.

Ownership of Mailboxes

A mailbox may be owned either by a process or by the system. If the mailbox is owned by a process (that is, the mailbox is attached to or defined as part of the process), then we distinguish between the owner (who can only receive messages through this mailbox) and the users of the mailbox (who can only send messages to it).
Since each mailbox has a unique owner, there can be no confusion about who should receive a message sent to this mailbox. When a process that owns a mailbox terminates, the mailbox disappears. Any process that subsequently sends a message to this mailbox must be notified that the mailbox no longer exists (a form of exception handling).
There are a number of ways to designate the owner and users of a particular mailbox. One possibility is to allow a process to declare variables of type Mailbox. The process that declares a mailbox is its owner. Any other process that knows the name of this mailbox can use it. It is also possible to declare a shared mailbox and then externally declare the identity of the owner.

On the other hand, a mailbox that is owned by the operating system has an existence of its own. It is independent and not "attached" to any particular process. The operating system provides a mechanism that allows a process to:

• Create a new mailbox.

• Send and receive messages through the mailbox.

• Destroy a mailbox.
12.5.2 Buffering
In this section we consider two more issues concerning the logical implementation of a link: capacity and message properties.

Capacity

A link has some capacity that determines the number of messages that can temporarily reside in it. This property can be viewed as a queue of messages attached to the link. Basically, there are three ways such a queue can be implemented:

• Zero capacity. The queue has maximum length 0; thus the link cannot have any message waiting in it. In this case, the sender must wait until the recipient receives the message. The two processes must be synchronized for a message transfer to take place. This synchronization is called a rendezvous.

• Bounded capacity. The queue has finite length n; thus at most n messages can reside in it. If the queue is not full when a new message is sent, the message is placed in the queue (either by copying it or by keeping a pointer to it), and the sender can continue execution without waiting. If the link is full, the sender must be delayed until space is available in the queue.

• Unbounded capacity. The queue has potentially infinite length; thus any number of messages can wait in it. The sender is never delayed.

The zero-capacity case is sometimes referred to as a message system with no buffering; the other cases provide automatic buffering. We note that, in the non-zero-capacity cases, a process does not know whether a message has arrived at its destination after the send operation is
completed. If this information is crucial for the computation, the sender must explicitly communicate with the receiver to find out whether the message was received.

Messages

Messages sent by a process can be of three varieties:

• Fixed-size messages.

• Variable-size messages.

• Typed messages.

If only fixed-size messages can be sent, the physical implementation is straightforward. This restriction, however, makes the task of programming more difficult. On the other hand, variable-size messages require a more complex physical implementation, but programming becomes simpler.
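The bounded-capacity case above can be sketched in standard C++ as follows. BoundedMailbox is an invented illustrative name, not a DeltaOS facility (in DeltaOS itself, message queues provide this kind of buffered link): senders block while the queue holds capacity messages, and receivers block while it is empty.

```cpp
#include <cstddef>
#include <queue>
#include <mutex>
#include <condition_variable>

// Illustrative bounded-capacity link (names are made up).
template <class T>
class BoundedMailbox {
public:
    explicit BoundedMailbox(std::size_t capacity) : capacity_(capacity) {}

    // Blocks while the link is full (bounded capacity).
    void send(const T& msg) {
        std::unique_lock<std::mutex> lock(m_);
        notFull_.wait(lock, [&]{ return q_.size() < capacity_; });
        q_.push(msg);
        notEmpty_.notify_one();
    }

    // Blocks while the link is empty.
    T receive() {
        std::unique_lock<std::mutex> lock(m_);
        notEmpty_.wait(lock, [&]{ return !q_.empty(); });
        T msg = q_.front();
        q_.pop();
        notFull_.notify_one();
        return msg;
    }

private:
    std::size_t capacity_;
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable notFull_, notEmpty_;
};
```

Setting the capacity to 0 is not expressible with this queue-based sketch; the zero-capacity (rendezvous) case is exactly what the Channel class of the next section implements.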
12.6 Synchronized Communication Channels
Communicating Sequential Processes (CSP) is a language framework for concurrent programming [14]. The following concepts are central to the language:

• A CSP program consists of a fixed number of sequential processes that are mutually disjoint in address spaces.

• Communication and synchronization are accomplished through the input and output constructs.

• The sequential control structures are based on Dijkstra's guarded commands.

Communication in CSP occurs when one process names a second as the destination for output and the second process names the first as the source for input. The output values are copied from the first process to the second (the message is sent and received). Messages are typed, and the types of the output message and the input message must match for communication to take place. Transfer of information occurs only when both the source and the destination processes have invoked the output and input commands, respectively. Therefore, either the source or the destination process may be suspended until the other process is ready with the corresponding output or input. This I/O facility serves both as a communication mechanism and as a synchronization tool.

We can simulate CSP in the framework of DeltaOS by introducing a special class, Channel, representing a communication channel. Communication between two processes occurs when both processes invoke the same communication channel for input/output operations.
12.6.1 Specification of Channel
Five semaphores are used for synchronization control. xmutex provides mutual exclusion for access to the internal data slot. The xempty and xfull semaphores count the number of empty and full data slots (0 or 1), respectively. Two additional semaphores, acceptin and acceptout, are used to synchronize the inputting and outputting processes.

#include

template <class T>
class Channel
{
private:
    T item;    // object holds a data value
    deltaSemaphore xmutex, xfull, xempty, acceptin, acceptout;

public:
    // constructor (default)
    Channel(void);
    ~Channel(void);

    // data retrieval and storage operations
    void Read(T& i);
    void Write(T i);
};
12.6.2 Implementation of Channel
#include "chan.h"

// Create the semaphores; the data value is not initialized
template <class T>
Channel<T>::Channel(void)
{
    xfull.Create("FULL", 0, DELTA_COUNTING_SEMAPHORE|DELTA_FIFO, 10);
    xempty.Create("EMPT", 1, DELTA_COUNTING_SEMAPHORE|DELTA_FIFO, 10);
    xmutex.Create("MUTX", 1, DELTA_COUNTING_SEMAPHORE|DELTA_FIFO, 10);
    acceptin.Create("ACPI", 0, DELTA_COUNTING_SEMAPHORE|DELTA_FIFO, 10);
    acceptout.Create("ACPO", 0, DELTA_COUNTING_SEMAPHORE|DELTA_FIFO, 10);
}

// Delete the semaphores
template <class T>
Channel<T>::~Channel(void)
{
    xfull.Delete();
    xempty.Delete();
    xmutex.Delete();
    acceptin.Delete();
    acceptout.Delete();
}

// Retrieve the item from the channel
template <class T>
void Channel<T>::Read(T& i)
{
    unsigned long time_out = 0;

    // If there is a writing task waiting on entering the channel,
    // release it.
    acceptin.Release();
    // If the channel is not full (no writing task has sent a message),
    // just wait.
    xfull.Obtain(DELTA_WAIT, time_out);
    // Enter the critical region.
    xmutex.Obtain(DELTA_WAIT, time_out);
    // Read the message
    i = item;
    // Leave the critical region.
    xmutex.Release();
    // Set xempty as true.
    xempty.Release();
    // If there is a writing task waiting on leaving the channel,
    // release it.
    acceptout.Release();
}

// Put an item into the channel
template <class T>
void Channel<T>::Write(T i)
{
    unsigned long time_out = 0;

    // Wait for a reading task to accept starting the writing action.
    acceptin.Obtain(DELTA_WAIT, time_out);
    // If the channel is not empty, just wait
    xempty.Obtain(DELTA_WAIT, time_out);
    // Enter the critical region
    xmutex.Obtain(DELTA_WAIT, time_out);
    item = i;
    // Leave the critical region
    xmutex.Release();
    // Set xfull as true
    xfull.Release();
    // Wait for a reading task to accept ending the writing action.
    acceptout.Obtain(DELTA_WAIT, time_out);
}
12.6.3 Application of Channel
Again, we will use communication channels to solve the dining philosophers problem. The forks are represented by tasks, and the channels are used for synchronization purposes only.

/***********************************************************************/
/*                                                                     */
/*  root: Sets up the evaluation program execution.                    */
/*                                                                     */
/*  NOTE: Executes as task 'ROOT'.                                     */
/*                                                                     */
/***********************************************************************/
delta_task Init(delta_task_argument argument)
{
    int i;
    delta_status_code status;
    delta_time_of_day time;
    delta_time_of_day time_buffer;

    set_outputfunc((outputfunc_ptr)disp_char);
    screen_init();
    time_buffer.year = 2001;   /* year, A.D.      */
    time_buffer.month = 6;     /* month, 1 -> 12  */
    time_buffer.day = 7;       /* day, 1 -> 31    */
    time_buffer.hour = 9;      /* hour, 0 -> 23   */
    time_buffer.minute = 38;   /* minute, 0 -> 59 */
    time_buffer.second = 0;    /* second, 0 -> 59 */
    time_buffer.ticks = 20;    /* ticks           */
    delta_clock_set(&time_buffer);
    status = delta_clock_get(DELTA_CLOCK_GET_TOD, &time);
    printf("\n****Task wake test start on %d/%d/%d, %d:%d:%d:%d.****\n\r",
           time.year, time.month, time.day,
           time.hour, time.minute, time.second, time.ticks);

    /* Create application tasks and start them. */
    for (i = 0; i < 5; ++i) {
        Task_name[i] = delta_build_name('P', 'H', 'L', ' ');
        status = delta_task_create(Task_name[i], 10,
                                   DELTA_MINIMUM_STACK_SIZE * 2,
                                   DELTA_DEFAULT_MODES,
                                   DELTA_DEFAULT_ATTRIBUTES,
                                   &tid[i]);
        if (status == DELTA_SUCCESSFUL)
            status = delta_task_start(tid[i], phil, i);
    }
    for (i = 0; i < 5; ++i) {
        Task_name[i] = delta_build_name('F', 'O', 'K', ' ');
        status = delta_task_create(Task_name[i], 10,
                                   DELTA_MINIMUM_STACK_SIZE * 2,
                                   DELTA_DEFAULT_MODES,
                                   DELTA_DEFAULT_ATTRIBUTES,
                                   &fid[i]);
        if (status == DELTA_SUCCESSFUL)
            status = delta_task_start(fid[i], fork, i);
    }
    Sem_name[1] = delta_build_name('W', 'A', 'I', 'T');
    status = delta_semaphore_create(Sem_name[1], 1,
                                    DELTA_BINARY_SEMAPHORE|DELTA_FIFO,
                                    5, &roomwait);
    if (status != DELTA_SUCCESSFUL) {
        printf("Semaphore creation failed\n");
    }

    /* Delete self. If deletion fails, call k_fatal. */
    delta_task_delete(DELTA_SELF);
}

/***********************************************************************/
/*                                                                     */
/*  phil :                                                             */
/*                                                                     */
/*  NOTE : Runs as the task of the five philosophers                   */
/*                                                                     */
/***********************************************************************/
delta_task phil(delta_task_argument targ)
{
    unsigned long timeout = 0;
    int PickupFork = 0;
    int PutdownFork = 1;
    int i;

    i = targ;
    while (1) {
        delta_semaphore_obtain(roomwait, DELTA_WAIT, timeout);
        /* Send message to the standard output device */
        printf("Philosopher %i Enter Room!\n", i);
        forkchan[i].Write(PickupFork);
        delta_task_wake_after(5);
        forkchan[(i + 1) % 5].Write(PickupFork);
12.7. ASYNCHRONIZED COMMUNICATION BUFFER
339
/* Send message to the standard output device */ /*-----------------------------------------------------------------*/ printf("Philosopher %i Eating\n", i); forkchan[i].Write(PutdownFork); delta_task_wake_after(5); forkchan[(i + 1) % 5].Write(PutdownFork); /*-----------------------------------------------------------------*/ /* Send message to the standard output device */ /*-----------------------------------------------------------------*/ printf("Philosopher %i Leave Room\n", i); delta_semaphore_release(roomwait); /* sleep for 50 ticks before attempting to reenter room again */ delta_task_wake_after(50); }; }
Each fork is picked up and put down by the philosophers sitting on either side of it.

/*********************************************************/
/*                                                       */
/*  fork :                                               */
/*                                                       */
/*  NOTE : Runs as the task of each of the five forks    */
/*                                                       */
/*********************************************************/
delta_task fork(delta_task_argument targ)
{
    int PickupFork;
    int PutdownFork;
    int i;

    i = targ;
    while (1) {
        forkchan[i].Read(PickupFork);
        forkchan[i].Read(PutdownFork);
    }
}
12.7 Asynchronous Communication Buffer
A Channel is a structure for synchronized communication: both the reading and the writing task must be ready before a communication link is set up. In some cases, however, the reading or the writing task need not wait for the other party to be ready. We therefore define an asynchronous communication structure, SynStore.
12.7.1 Specification of SynStore
#include
template <class T>
class SynStore
{
private:
    T item;                     // object holds a data value
    int haveValue;              // flag set when item is initialized

    // For each condition x, we need a semaphore xsem and an integer
    // xcount, both initialized to 0
    deltaSemaphore xsem;
    int xcount;
    deltaSemaphore mutex, next;
    int nextcount;
    void SWAIT(void);
    void SSIGNAL(void);

public:
    // constructor (default)
    SynStore(void);
    ~SynStore(void);
    // data retrieval and storage operations
    T GetElement(void);
    void PutElement(T x);
};
12.7.2 Implementation of SynStore
template <class T>
SynStore<T>::SynStore(void) : haveValue(0)
{
    xsem.Create("XSEM", 0, DELTA_COUNTING_SEMAPHORE|DELTA_PRIORITY, 10);
    mutex.Create("MUTX", 1, DELTA_COUNTING_SEMAPHORE|DELTA_PRIORITY, 10);
    next.Create("NEXT", 0, DELTA_COUNTING_SEMAPHORE|DELTA_PRIORITY, 10);
    xcount = 0;
    nextcount = 0;
}

template <class T>
SynStore<T>::~SynStore(void)
{
    xsem.Delete();
    mutex.Delete();
    next.Delete();
}

template <class T>
void SynStore<T>::SWAIT(void)
{
    unsigned long time_out = 0;

    xcount = xcount + 1;
    if (nextcount > 0)
        next.Release();
    else
        mutex.Release();
    xsem.Obtain(DELTA_WAIT, time_out);
    xcount = xcount - 1;
}

template <class T>
void SynStore<T>::SSIGNAL(void)
{
    unsigned long time_out = 0;

    if (xcount > 0) {
        nextcount = nextcount + 1;
        xsem.Release();
        next.Obtain(DELTA_WAIT, time_out);
        nextcount = nextcount - 1;
    }
}
// retrieve item if initialized
template <class T>
T SynStore<T>::GetElement(void)
{
    unsigned long time_out = 0;
    T temp;

    mutex.Obtain(DELTA_WAIT, time_out);
    // wait if client tries to access uninitialized data
    if (haveValue == 0) {
        SWAIT();
    }
    haveValue--;              // haveValue is FALSE
    temp = item;              // return item from storage
    SSIGNAL();
    if (nextcount > 0)
        next.Release();
    else
        mutex.Release();
    return temp;
}

// put item in storage
template <class T>
void SynStore<T>::PutElement(T x)
{
    unsigned long time_out = 0;

    mutex.Obtain(DELTA_WAIT, time_out);
    // if the buffer is full, wait on "xsem"
    if (haveValue == 1) {
        SWAIT();
    }
    haveValue++;              // haveValue is TRUE
    item = x;                 // store x
    SSIGNAL();
    if (nextcount > 0)
        next.Release();
    else
        mutex.Release();
}
Appendix A
Glossaries

Active Task Refers to the task that is currently running. A multi-tasking kernel needs to know the identity of the currently running task so that it knows where to save that task's context when it performs a task switch. For example, a global variable called ActiveTCB might point to the TCB of the active task. The kernel would save the active task's context on the active task's stack (i.e., the current stack).
Application Another name for program.
Assert, Asserted Designates the active state of an electrical signal. For example, raising a normally 0 volt CPU input control pin to 5 volts asserts the line. Similarly, pulling a normally 5 volt CPU input control pin down to 0 volts asserts the line. In either case, the line is said to be asserted. When an input pin is asserted, it performs its function. For example, asserting the interrupt pin on the 8086 generates an interrupt.
Block, Blocked Task A task is blocked when it attempts to acquire a kernel entity that is not available. The task is suspended until the desired entity becomes available. A task blocks, to give a few examples, when it attempts to acquire a semaphore that is unavailable, when it waits for a message that is not in its queue, while it waits for a pause to expire, or, in a time-slicing environment, when its time-slice has expired.
Boot The procedure of starting a computer by loading the OS kernel is known as booting the system. On most computer systems, there is a small piece of code, stored in ROM, known as the bootstrap program or bootstrap loader. This code is able to locate the kernel, load it into main memory, and start its execution.
Break, Breakpoint A program breaks when a breakpoint is reached. A breakpoint is an instruction address at which a debugger has been commanded to stop program execution and enter the debugger so that the user can enter debugger commands. At some point after breaking, the user typically executes a debugger command like "continue", which continues program execution from the point at which the program was stopped.
Bus If one or more devices use a common set of wires, the connection is called a bus. In slightly more formal terms, a bus is a set of wires and a rigidly defined protocol that specifies a set of messages that can be sent on the wires.
Busy Waiting The process by which code repetitively checks for a condition. The following code exhibits busy waiting.

while (TRUE) {
    if (switchDown) {
        lampOn();
        break;
    }
}

The code stays in the loop until the condition is met. Contrast with event driven code.
Condition Variable The condition variable is a mechanism for synchronization, provided by the condition construct. A programmer who needs to write his own tailor-made synchronization scheme can define one or more variables of type condition:

condition x, y;

The only operations that can be invoked on a condition variable are WAIT and SIGNAL; thus a condition variable can be viewed as an abstract data type that provides those two operations. The operation x.WAIT; means that the process invoking it is suspended until another process invokes x.SIGNAL; The x.SIGNAL operation resumes exactly one suspended process. If no process is suspended, the SIGNAL operation has no effect; that is, the state of x is as if the operation had never been executed.
Context, Context Switching A CPU’s context typically refers to all its registers (including IP and SP) and status register(s). When a task switch occurs, its context is saved, and the context of the next task to run is restored. The saving of the current task’s context and the restoring of the context of the next task to run is called a context switch. (Special Note: In a more complete sense context also includes the task’s local variables and its subroutine nesting information (for example, the task may be 7 subroutine calls deep when the context switch is made, and when the task is eventually resumed the task’s code must be able to return from all 7 subroutines). The local variables and subroutine nesting level are preserved by giving each task its own private stack when the task is created, since subroutine return information and local variables are stored on the stack.)
Controller A controller is a collection of electronics that can operate a port, a bus, or a device. A serial-port controller is an example of a simple device controller. It is a single chip in the computer that controls the signals on the wires of a serial port.
Cooperative Scheduling A group of tasks run cooperatively when each must voluntarily relinquish control so that other tasks can run. In other words, a task will run forever, thus starving other tasks, unless it voluntarily gives up control. Control can be relinquished explicitly by a yield, pause, or suspend; or implicitly by waiting for an event. Contrast with priority based preemptive scheduling and time-slicing.
Crash A system has crashed when the CPU has halted due to a catastrophic error; however, the word is also used in the same sense as hung.
Cyclical Executive, Cyclical Kernel, Cyclical Scheduling Cyclical (also referred to as round robin) scheduling is done without a kernel per se by simply calling one "task" (typically a C function) after another in an infinite loop as shown below.

void main(void)
{
    for (;;) {
        task0();
        task1();
        task2();
        /* and so on... */
    }
}

In this scenario, each task must do its work quickly, save its state if necessary, and return to the main loop. The advantages of this approach are small size, no kernel required, only one stack, and ease of understanding and control. The disadvantages are no priorities, tasks must retain their states, more responsibility is placed on the programmer, and the possibility of excessive loop latency when there are many tasks.
Deadlock The term can have various meanings, but typically, a deadlock is a condition in which a task will remain forever suspended waiting for a resource that it can never acquire. Consider a system comprised of a keyboard and a display, with a separate semaphore for each. In order for a task to interact with the user it
must acquire the “console” (keyboard and display) and therefore must acquire both the keyboard and the display semaphores. If more than one task decides to interact with the user at the same time a condition can arise where taskA acquires the keyboard semaphore and taskB acquires the display semaphore. Now taskA will wait forever for the display semaphore and taskB will wait forever for the keyboard semaphore. The solution in this example is to treat the keyboard and display as a single resource (console). Deadlocks can occur for a variety of reasons, but the end result is typically the same: the task will wait forever for a resource that it can never acquire.
Debugger A debugger is a software program used to break program execution at various locations in an application program after which the user is presented with a debugger command prompt that will allow him to enter debugger commands that will allow for setting breakpoints, displaying or changing memory, single stepping, and so forth.
Devices, Device Drivers Computers operate a great many kinds of devices. General types include storage devices (disks, tapes), transmission devices (network cards, modems), and human-interface devices (screen, keyboard, mouse). The device drivers present a uniform device-access interface to the I/O subsystem, much as system calls provide a standard interface between the application and the operating system.
Direct Memory Access (DMA) Many computers avoid burdening the main CPU with programmed I/O (PIO) by offloading some of this work to a special-purpose processor called a directmemory-access (DMA) controller. To initiate a DMA transfer, the host writes a DMA command block into memory. This block contains a pointer to the source of a transfer, a pointer to the destination of transfer, and a count of the number of bytes to be transferred. The CPU writes the address of this command block to the DMA controller, then goes on with other work. The DMA controller then proceeds to operate the memory bus directly, placing addresses on the bus to perform transfers without the help of the main CPU.
Differential Timer Management A method of managing timers such that only one timer count is decremented regardless of the number of pending timers. Consider 3 pending timers with durations of 50 ms, 20 ms, and 5 ms respectively. The timers are sorted and stored in a list with the shortest timer first: 5, 20, 50. Then each timer duration is replaced with the difference between it and the preceding duration, i.e., 5, 20, 50 is replaced with 5, (20-5), (50-20), which equals 5, 15, 30. This allows only the first duration (5 ms in this example) to be decremented. Thus, when the first duration of 5 decrements down to 0, only 15 ms more are required to meet the 20 ms duration, and when the 30 ms count is started, 5 + 15 = 20 ms have already expired, so only a count of 30 ms is required. This technique is attractive because the timer counts are typically decremented inside the timer ISR, and time spent inside any ISR should be kept to a minimum.
Download Refers to the transfer of executable code from a host to a target, typically using an RS-232 serial line. The target must have resident software (e.g., EPROM) that can read the incoming data, translate it if necessary (the file format may be ASCII hex for example) and load and run the code. If the target board has no resident software, then an ICE is required. The ICE connects to the host via RS-232 typically and accepts the download and loads the code into memory.
Driver The software that communicates between a hardware peripheral and the rest of the system. Often called a “device driver.” In DeltaOS , a device driver often refers only to those drivers which have a UNIX-like interface.
Dynamic Priorities Priorities that can be changed at run-time. Contrast with fixed priorities.
Event, Event Driven Code Code that remains dormant until an event occurs. The following is an example.

while (TRUE) {
    msg = waitMsg(MOTOR_ON);
    freeMsg(msg);
    turnMotorOn();
}

The function waitMsg suspends the current thread of execution until the message MOTOR_ON is received. The receipt of the message is the event that causes the task to be scheduled and eventually run. Contrast with busy waiting.
Exception A software interrupt generated by the CPU when an error or fault is detected.
Fairness A concept which dictates that if a task runs too long, its priority is lowered so that other tasks can get their fair share of the CPU. This concept is repugnant to real-time principles and is used chiefly in multi-user systems with the intent of disallowing one user to monopolize the CPU for long compiles and the like.
Fault A fault is an exception that is generated when the current instruction needs help before it can be executed. For example, suppose that in a virtual memory system a memory access is made to a page that is not in memory. In this case, a fault is generated which vectors to an ISR that will read in the page. A fault differs from other exceptions in the way it returns from interrupt: control is returned to the instruction that caused the fault, not the following instruction as with a normal interrupt. This allows the instruction to access the memory again, and this time succeed.
Fixed Block Memory Allocation A memory allocation/deallocation scheme in which a number of memory pools of fixed-size blocks is created. For example, 3 pools are created as follows: pool 1 contains 100 64-byte blocks, pool 2 contains 50 256-byte blocks, and pool 3 contains 10 1K-byte blocks. When a user wants memory he takes and returns blocks from/to a pool. For example, if the user needed a 256-byte block of memory he would take a block from pool 2. Similarly, if he needed a 100-byte block he would still take a block from pool 2, even though bytes are wasted. The advantages are no fragmentation and high speed.
Fixed Priorities Priorities that are set once, typically at compile time, and unalterable at runtime. Contrast with dynamic priorities.
Fragmentation In a kernel or language which allows for memory allocation and deallocation, the available free memory can eventually become fragmented, i.e., non-contiguous, if memory is allocated from one large pool. For example after allocating and deallocating many different size blocks of memory, there may be 12K available, but it is not contiguous, and therefore, if software needs 12K it cannot use it, even though it is available. Contrast with fixed block memory allocation.
Hang, Hung A system is hung when it does not respond to external input (e.g. keyboard) but has not completely crashed. A hung system is typically spinning in an endless loop.
Hard Real-Time Hard real-time refers to the strict definition of real-time. See real-time.
Hardware Interrupt Latency The time it takes for the ISR to be entered, once the processor interrupt pin is asserted.
Hook, Chain Intercept an interrupt by saving the current interrupt vector and writing a new interrupt vector. When the new interrupt service routine is finished it calls the old interrupt service routine before returning.
Host See target.
In-Circuit Emulator, ICE An electronic tool that allows for debugging beyond the capabilities of a standard software debugger. An ICE is essentially a hardware box with a 2 to 3 foot cable attached to it. At the end of the cable is a multi-pin connector connected to a CPU processor chip which is identical to the processor on the target board. The target processor is removed from your target board, and the connector plugged in. The ICE allows for trapping the following types of activities: read/write to an address range, read/write a specific value from an address range, setting a breakpoint in EPROM, mapping emulator memory to target board memory, and other similar features. It also turns the host (a PC for example) into a debug station with supplied software. This allows for debugging of the target board even though the target does not have a keyboard, screen, or disk. For more information, see articles 1, 2, and 3.
Idle Task A kernel-owned task that is created and scheduled during system initialization. The idle task has the lowest possible priority and runs only when no other tasks are scheduled; when any other task is scheduled, the idle task is preempted. It is typically a "do nothing" tight loop whose only purpose is to run when no other tasks run. It is a convenience mechanism: no special kernel code is required to handle the case of all tasks being idle, because the kernel sees the idle task like any other task. The key is that the idle task has a lower priority than any other task, so even though it is scheduled to run and occupies a place in the ready queue, it will not run until all other tasks are idle.
Instantiate, Instantiation To start multiple instances of a task.
Intercept Identical to hook except that the old ISR is not called before returning.
Interrupt, Interrupt Vector Most CPU’s have a pin designated as the “interrupt” pin. When this pin is asserted the CPU
1. halts,
2. saves the current instruction pointer and the CPU flags on the stack, and
3. jumps to the location of an interrupt service routine (ISR).

How the address of the ISR is determined is CPU dependent. Some CPUs have multiple interrupt pins, and each pin refers to a reserved memory location where the ISR address can be found. For example, consider an imaginary 16-bit CPU with 8 interrupt pins I0 through I7, which refer to interrupt numbers 0 through 7, and an interrupt vector table starting at address 0. The addresses of the ISRs for interrupt pins I0 through I7 would be located at addresses 0, 2, 4, 6, 8, 10, 12, and 14. The dedicated interrupt pin approach works fine until more than 8 interrupts are desired. The 8086 handles this problem by delegating the handling of interrupt signals to a special interrupt controller chip such as the 8259A. The 8086 has only one interrupt pin. When the interrupt pin is asserted, the 8086 determines which device generated the interrupt by reading an interrupt number from the interrupt controller. The interrupt number is converted to an interrupt vector table index by multiplying it by 4, because each entry in the 8086 interrupt vector table requires 4 bytes (2 bytes for the CS register and 2 bytes for the IP register). Although a single 8259A interrupt controller handles only 8 incoming interrupt lines, 8259As can be cascaded to handle more interrupts.
Interrupt Controller The following explanation uses the 8259A as an example interrupt controller. Sometimes many interrupts are required. Most CPUs with dedicated interrupt pins have no more than 8 such pins, so no more than 8 incoming interrupts can be handled. A way around this problem is to handle interrupts with an interrupt controller chip. Typically interrupt controller chips can be cascaded, so that even though a single interrupt controller chip handles only 8 incoming interrupts, a cascade of 4 can handle 32. The typical arrangement is for the hardware device to signal an interrupt by asserting one of its output pins, which is directly connected to one of the interrupt controller's interrupt pins. The interrupt controller then signals the CPU by asserting the one and only CPU interrupt pin. The CPU then begins a special interrupt bus cycle with the interrupt controller and obtains the interrupt number, which is converted into an index into an interrupt vector table, enabling the CPU to jump to the proper ISR for the interrupt. If other interrupts are asserted while the CPU is handling an interrupt, the interrupt controller waits until it receives a signal from the CPU called an EOI (end of interrupt) before interrupting the processor again. The EOI signal is generally asserted with a software IO output instruction. Note, however, that if more than one interrupt is generated on a
single interrupt controller pin while the CPU is processing an interrupt, then all but the first waiting interrupt will be lost. For more information see interrupts.
Interrupt Latency Time The execution time of the interrupt handler. Interrupts may be enabled or disabled inside the ISR.
Interrupt Latency The time interval between the last line executed in the interrupted thread and the first line executed in the interrupt service routine (ISR) attached to this interrupt.
Interrupt Dispatch Latency The time interval between the last line executed in the interrupt service routine and the first line executed in the thread scheduled to run.
Interrupt Service Routine, ISR A routine that is invoked when an interrupt occurs.
Interrupts and CPU Flags Register, IRET Instruction It is necessary to save the flags on entering an ISR and restore them on exit; otherwise, the ISR may change the flags, which can cause problems. The 8086 automatically saves the flags on the stack prior to calling the ISR, and restores them before returning via the IRET instruction. The reason for saving the flags can be understood by considering the following code fragment.

; Assume that AX = 0.
CMP AX,0
JE LABEL7

The CMP instruction executes, and because AX = 0, the Z bit in the flags register is set by the CPU. Say that after the compare instruction an interrupt occurs, and the ISR executes the following instruction.

; Assume BX = 1.
CMP BX,0

This instruction clears the Z bit because BX is not 0. The ISR eventually returns to the instruction JE LABEL7. If a standard RET instruction is used instead of IRET, the flags are not restored, and JE LABEL7 will not perform the jump as it should because the Z bit was cleared in the ISR. For this reason flags must be saved on entering an ISR and restored prior to returning from an ISR.
Interrupt Vector Table, Interrupt Service Routine (ISR), Software Interrupt Some CPUs handle interrupts with an interrupt vector table. On the 8086, the interrupt vector table is a 1K-long table starting at address 0, which supports 256 interrupts numbered 0 through 255. The numbers 0 through 255 are referred to as "vector numbers" or "interrupt numbers". The table contains the addresses of interrupt service routines, which are simply subroutines that are called when the interrupt occurs. The only difference between an interrupt service routine and a subroutine is that on entry to the ISR the CPU flags are stored on the stack in addition to the return address; for an explanation of why the flags must be saved, see Interrupts and CPU Flags Register, IRET Instruction. This means that an interrupt service routine must return with a special return instruction (IRET on the 8086) that pops the CPU flags off the stack and restores them before returning. On the 8086, an address is specified by a segment and an offset, and therefore each interrupt vector table entry uses 4 bytes: 2 bytes for the CS register and 2 bytes for the IP register (IP is the lower word in the table entry and CS is the higher word). An interrupt service routine can be invoked by software using the 8086 INT instruction, which in addition to the return address pushes the CPU flags onto the stack. An interrupt service routine can also be invoked by software using a standard CALL instruction as long as the CPU flags are first pushed onto the stack. For details of how an interrupt is generated and serviced, see interrupts.
Kernel, Multi-Tasking Kernel, Real-Time Kernel Software that provides interfaces between application software and hardware and in general controls and programs all system hardware. A multi-tasking kernel provides multi-tasking in the form of threads, processes, or both. A real-time kernel is a multi-tasking kernel with predictable worst-case task switch times and priorities, so that high priority events can be serviced successfully.
The kernel is to software what the CPU is to hardware. The name of the DeltaOS kernel is DeltaCORE.
LAN (Local Area Network) A network used within a confined physical area, in contrast to a WAN (Wide Area Network). Ethernet is a LAN.
Latency The time between when something happens and when its response is generated. This is often critical in real-time applications.
Message A data structure that is passed to a task by placing a pointer to the structure into a queue. The queue is a doubly linked list that is known to the task. Kernel functions are supplied to remove messages from the queue.
Message Aliasing A technique by which the sender of the message changes the “sender” field in the message structure before sending it so that the receiver of the message will reply to a task other than the actual sender.
Messages vs. Mail Messages are queued, mail is not. New mail data may optionally over-write old, which is advantageous for certain applications. Consider taskA that updates a display and receives the latest data on a periodic basis from taskB. Assume further that the system is busy to the point that taskB sends taskA two data updates before taskA can resume execution. At this point the old data is useless, since taskA only displays the latest value. In this situation, queuing (using messages) is undesirable, and mail with over-write enabled is preferred.
Microkernel This method structures the operating system by removing all nonessential components from the kernel, and implementing them as system and user-level programs. The result is a smaller kernel. There is little consensus regarding which
services should remain in the kernel and which should be implemented in user space. In general, however, microkernels provide minimal process (thread) and memory management, in addition to a communication facility. The main function of a microkernel is to provide a communication facility between the client program and the various services that are also running in user space. Communication is provided by message passing. For example, if the client program wishes to access a file, it must interact with the file server. The client program and the service never interact directly; rather, they communicate indirectly by exchanging messages with the microkernel. The benefits of the microkernel approach include the ease of extending the operating system: all new services are added in user space and consequently do not require modification of the kernel. When the kernel does have to be modified, the changes tend to be fewer, because the microkernel is smaller. The resulting operating system is also easier to port from one hardware design to another. The microkernel further provides more security and reliability, since most services run as user processes (threads) rather than kernel processes. If a service fails, the rest of the operating system remains untouched.
Monitor Refers to a kernel, or a very minimal kernel used for debugging, typically written by the programmer for a particular project, or supplied on ROM from the board manufacturer.
Multi-Processing A term that describes an environment in which 2 or more CPU’s are used to distribute the load of a single application.
Multi-Tasking A term that describes an environment in which a single computer is used to share time with 2 or more tasks. Contrast with multi-user. For more information and implementation details see articles 1, and 2.
Multi-User A term that describes a single computer that is connected to multiple users, each with his own keyboard and display. Also referred to as a time-sharing system. Contrast with multi-tasking.
Mutual Exclusion At least one resource must be held in a non-sharable mode; that is, only one process (thread) at a time can use the resource. If another process (thread) requests that resource, the requesting process (thread) must be delayed until the resource has been released.
Mutexes Mutual exclusion locks (mutexes) are binary semaphores used to prevent multiple threads from simultaneously executing critical sections of code that access shared data (that is, mutexes serialize the execution of threads). All mutexes must be global. After a successful call to mutex_lock(), any other thread that tries to lock the same mutex blocks until the owner thread unlocks it via mutex_unlock(). Mutexes can synchronize threads within the same process or in different processes; in the latter case the mutexes must be allocated in writable memory shared among the cooperating processes. Mutexes are either intra-process or inter-process, depending upon the argument passed implicitly or explicitly at the initialization of the mutex. A statically allocated mutex does not need to be explicitly initialized; by default it is initialized with all zeros and its scope is the calling process. For inter-process synchronization, a mutex needs to be allocated in memory shared between the processes. Since the memory for such a mutex must be allocated dynamically, the mutex must be explicitly initialized using mutex_init() with the appropriate attribute that indicates inter-process use.
Non-Reentrant, Reentrant Code Non-reentrant code cannot be interrupted and then reentered. Reentrancy can occur, for example, when a function is running and an interrupt occurs which transfers control to an interrupt service routine, and the interrupt service routine calls the function that was interrupted. If the function is not reentrant, the preceding scenario will cause problems, typically a crash. The problem typically occurs when the function makes use of global data. To illustrate, consider a function that performs disk accesses and uses a global variable called “NextSectorToRead”, which is set according to a parameter passed to the function. Consider the following scenario. The function is called with a parameter of 5, and “NextSectorToRead” is set to 5. The function is then interrupted and control is transferred to an ISR which calls the function with a parameter of 7, which results in “NextSectorToRead” being set to 7. The sector is read, control
is returned to the ISR, the ISR performs a return from interrupt, and the function resumes with an erroneous value of 7 instead of 5 in “NextSectorToRead”. This problem can be avoided by using only stack (local) variables.
Page A unit of memory, typically around 4K, which is used as the smallest granule of memory in virtual memory systems.
Platform A hardware or software architecture. Also refers to an operating system, in which case the hardware may or may not be implied. For example, when a program is said to “run on the DeltaOS platform” it means that the program has been compiled into the language of a processor (e.g., Intel x86) and that the program communicates with the DeltaOS operating system. The terms platform and environment are often used interchangeably.
Polling A controller sets its busy bit while it is working and clears the busy bit when it is ready to accept the next command. Polling (busy waiting) is the host’s repeated reading of the busy bit until that bit becomes clear.
Port The device communicates with the machine via a connection point termed a port (for example, a serial port).
POSIX An IEEE standards committee (Portable Operating System Interface - the final X is meaningless). The original POSIX standard (1003.1) defined a user interface to the UNIX operating system. POSIX has expanded its standards efforts into many other areas, including real-time (1003.1b and 1003.1c). While the UNIX standard has been very successful, it remains uncertain to what extent the other standards efforts will succeed.
APPENDIX A. GLOSSARIES
Preemption - Causes of Preemption occurs when a dormant higher priority task becomes ready to run. A dormant higher priority task can become “ready to run” in a variety of ways. If time-slicing is enabled, (which is rarely the case in properly designed real-time systems), preemption occurs when the task’s time-slice has expired. Preemption can also occur when a message is sent to a task of a higher priority than the active task. Consider the following. The active task, taskB, invokes waitMsg to wait for a particular message named START. Since the message START is not in taskB’s message queue (i.e., no task has sent taskB a message named START), taskB will be suspended until such time as the message START is received. Once taskB is suspended, the highest priority waiting task, (the task at the front of the ready queue), is removed from the ready queue, (call it taskA which is of a lower priority than taskB) and it becomes the new active task. Now if taskA sends taskB a message named START, the kernel will suspend taskA, and resume taskB since taskB has a higher priority and it now has a reason to run (it received the message that it was waiting for). Task preemption can also occur via interrupt. If a message is sent from within an ISR to a task with a higher priority than the active task, (which was interrupted to run the ISR), then on exiting the ISR, control should not return to the interrupted task, but rather to the higher priority task to which the message was sent.
Preemptive, Preemption Preemption occurs when the current thread of execution is stopped and execution is resumed at a different location. Preemption can occur under software (kernel) or hardware (interrupt) control. See priority based preemptive scheduling.
Priority A number attached to a task that determines when the task will run. See also scheduling.
Priority based Preemptive Scheduling A multi-tasking scheduling policy by which a higher priority task will preempt a lower priority task when the higher priority task has work to do. Contrast with cooperative scheduling and time-sliced scheduling.
Priority Inversion An inappropriately named condition created as follows. Consider three tasks: taskA, taskB, and taskC, taskA having a high priority, taskB having a middle priority, and taskC having a low priority. taskA and taskB are suspended waiting for a timer to expire, and therefore taskC runs. taskC then acquires semaphoreA, and is subsequently preempted by taskA before it releases semaphoreA. taskA then attempts to acquire semaphoreA, and blocks since the semaphore is in use by taskC. taskB now runs, because it is of a higher priority than taskC and its timer has expired. Now, taskB can run until it decides to give up control, creating a condition where a high priority task (taskA) cannot run because a lower priority task has a resource that it needs. taskA will remain blocked until the lower priority task runs and releases the semaphore. This situation can be avoided by proper coding, or by using a server task instead of a semaphore.
Process A process is a single executable module that runs concurrently with other executable modules. For example, in a multi-tasking environment that supports processes, like OS/2, a word processor, an internet browser, and a database are separate processes and can run concurrently. Processes are separate executable, loadable modules, as opposed to threads, which are not loadable. Multiple threads of execution may occur within a process. For example, from within a database application, a user may start both a spell check and a time-consuming sort. In order to continue to accept further input from the user, the active thread could start two other concurrent threads of execution, one for the spell check and one for the sort. Contrast this with multiple .EXE files (processes) like a word processor, a database, and an internet browser multi-tasking under OS/2, for example.
Program Space The range of memory that a program can address.
Protection Violation Through a scheme involving special hardware and kernel software, tasks can be restricted to access only certain portions of memory. If a program attempts to jump to a code location outside its restricted area, or if it attempts to access data memory outside its restricted area, a protection violation occurs. The protection violation generates an interrupt to an interrupt service routine. The interrupt service routine will typically terminate the task that caused the violation and
schedule the next ready task. When the interrupt service routine returns, it returns to the next ready to run task. Special hardware is needed because each code or data access must be checked. This is essentially done through special hardware tables that kernel software can write to. For example, when a task is created, the kernel assigns a hardware table to the task (each task is assigned a different hardware table). The kernel then writes into this table the lower limit and the upper limit of memory that the task is allowed to access. When a task starts, a special hardware register is set to point to the hardware table for the task so that the special memory management hardware (which may be on the CPU chip, or external to it) will know which table to use when making access checks. Each time the task accesses memory, the memory management hardware checks the requested memory access against the allowable limits, and generates a protection violation if necessary.
Real-Time In a strict sense, real-time refers to applications that have a time critical nature. Consider a data acquisition and control program for an automobile engine. Assume that the data must be collected and processed once each revolution of the engine shaft. This means that data must be read and processed before the shaft rotates another revolution, otherwise the sampling rate will be compromised and inaccurate calculations may result. Contrast this with a program that prints payroll checks. The speed at which computations are made has no bearing on the accuracy of the results. Payroll checks will be generated with perfect results regardless of how long it takes to compute net pay and deductions. See also Hard Real-Time and Soft Real-Time.
Real-Time Kernel A real-time kernel is a set of software calls that provide for the creation of independent tasks, timer management, inter-task communication, memory management, and resource management.
Real-Time Operating System A real-time kernel plus command line interpreter, file system, and other typical OS utilities.
Relinquish, Voluntarily Relinquish Control The process by which the active task voluntarily gives up control of the CPU by notifying the kernel of its wish to do so. A task can voluntarily relinquish control explicitly with a yield, pause, or suspend; or implicitly with waitMsg, waitMail, etc. and the actual system calls will vary from kernel to kernel. When the active task relinquishes control, the task referenced by the node at the front of the ready queue becomes the new active task.
Ready Queue The ready queue is a doubly linked list of pointers to TCB’s of tasks that are waiting to run. When the currently active task relinquishes control, either voluntarily, or involuntarily, (for example by an interrupt service routine which schedules a higher priority task), the kernel chooses the task at the front of the ready queue as the next task to run.
Real-Time Responsive Describes a task switching policy that is consistent with real-time requirements. Specifically, whenever it is determined that a higher priority task needs to run, it runs immediately. Contrast this with a policy in which the higher priority task is scheduled when it is determined that it must run, but the active lower priority task continues to run until a system scheduler task interrupts it and then switches to the higher priority task.
Resource A resource is a general term used to describe a physical device or software data structure that can only be accessed by one task at a time. Examples of physical devices include printer, screen, disk, keyboard, tape, etc. If, for example, access to the printer is not managed, various tasks can print to the printer and interleave their printout. This problem is typically handled by a server task whose job is to accept messages from various tasks to print files. In this way access to the printer is serialized and files are printed in an orderly fashion. Consider a software data structure that contains data and the date and time at which the data was written. If tasks are allowed to read and write to this structure at random, then one task may read the structure during the time that another task is updating the structure, with the result that the data may not be time stamped correctly. This type of problem may also be solved by creating a server task to manage access to the structure.
Resume To run a suspended task again, from the point at which it was suspended. Formally, the process by which the active task is suspended and a context switch is performed to the previously suspended task.
Round Robin, Round Robin Scheduling A multi-tasking scheduling policy in which all tasks are run in turn, one after the other. When all tasks have run, the cycle is repeated. Note that various scheduling policies are run in round robin fashion, e.g., cooperative scheduling, time-sliced scheduling, and cyclical scheduling. The terms cyclical and round robin scheduling are used interchangeably.
Dynamic Scheduling A scheduling mechanism that allows for creating and scheduling new tasks and changing task priorities during execution. See also scheduling and static scheduling.
Static Scheduling A scheduling mechanism in which all tasks and task priorities are described and bound at compile-time; they cannot be changed during execution. See also scheduling and dynamic scheduling.
Schedule, Scheduling A task can be in any of the following states: 1. dormant (inactive, usually waiting for an event - a message for example); 2. waiting to run (a task that was non-existent or dormant has been scheduled to run); 3. running (on a single CPU system, only one task at a time can be in the running state). A dormant task is scheduled by notifying the kernel/OS that the task is now ready to run. There are various scheduling policies: see cooperative scheduling, time-sliced scheduling, cyclical scheduling, and priority based preemptive scheduling.
Semaphore This term has several meanings. In its simplest form, it is a global flag - a byte of global memory in a multi-tasking system that has the value 1 or 0. In multi-tasking systems, however, a simple global flag used in the normal manner can cause time dependent errors unless the flag is handled by a special test-and-set instruction like the 80x86 XCHG instruction. A semaphore is typically used to manage a resource in a multi-tasking environment - a printer for example. See XCHG for a code example. In another form, the semaphore may have a count associated with it, instead of the simple flag values of 1 or 0. When the semaphore is acquired, its count is made available to the caller, and can be used for various purposes. For example, consider a system that has three printers numbered 1, 2, and 3. A semaphore could be created with a count of 3. Each time the semaphore is acquired its count is decremented. When the count is 0, the semaphore is not available. When the semaphore is released, its count is incremented. In this way a pool of 3 printers can be shared among multiple tasks. The caller uses the semaphore count value returned to it to determine which numbered resource (e.g., printer 1, 2, or 3) it is allowed to use. Numerous variations on this theme can be created. Compare also with server task. The advantage of a server task is that it does not block the calling task if the resource is not available, thus avoiding priority inversion. In general, semaphores should be avoided; furthermore, they are unnecessary in a message based OS. The server task approach to resource management is preferred over semaphores as it is cleaner and less error prone.
Serial Line Analyzer, Datascope A piece of test equipment that is connected in series with a serial line for the purpose of displaying and monitoring two way serial communications. The term usually refers only to RS-232 serial lines.
Server Task A task dedicated to the management of a specific resource. A server task accepts requests in the form of messages from other tasks. For example, a server task could be created to manage access to a single printer in a multi-tasking environment. Tasks that require printing send the server a message that contains the name of the file to print. Note that the server task approach does not block the calling task like a semaphore will when the resource is unavailable; instead the message is queued to the server task, and the requesting task continues executing. This is an important consideration when blocking is a concern.
Single Step A debugger command that allows an application program to execute one line of the program, which can either be a single assembly language instruction, or a single high level language instruction. There are typically two distinct single step commands - one that will single step “into” subroutine calls, and one that will step “across” them (i.e., enter the routine, but do not show its instructions executed on the screen). The latter command is useful, otherwise many unwanted levels of subroutines would be entered while single stepping.
Soft Real-Time Refers to applications that are not of a time critical nature. Contrast with hard real-time. See also real-time.
Stress Testing The process by which a software system is put under heavy load and demanding conditions in an attempt to make it fail.
Subroutine Nesting, Subroutine Nesting Level Refers to the scenario in which subroutineA calls subroutineB which calls subroutineC ... which calls subroutineN. This is relevant for real-time systems because if a task is 7 subroutines deep, and the task is preempted, the return addresses must be preserved so that when the task is later resumed, it can return back up the chain. The most practical way of preserving subroutine nesting is by assigning each task its own stack. This way when a task is resumed, the SP is first set to the task’s stack.
Suspend, Suspension The immediate cessation of a thread, preceded by the saving of its context.
Synchronization Sometimes one task must wait for another task to finish before it can proceed. Consider a data acquisition application with 2 tasks: taskA that acquires data and taskB that displays data. taskB cannot display new data until taskA fills in a global data structure with all the new data values. taskA and taskB are typically synchronized as follows. taskB waits for a message from taskA, and
since no message is available, taskB suspends. When taskA runs and updates the data structure, it sends a message to taskB which schedules taskB to run, at which time it displays the new data, then again waits for a message, and the scenario is repeated.
Target, Target Board, Host Refers to the situation where two computers are involved in the software development process - one computer to develop software (edit, compile, link, etc.), referred to as the host, and one computer to run the software, referred to as the target. The target is the actual product on which the software is to run. In most common situations, the host and target are the same. For example, a word processor is developed on the PC and runs as a product on the PC. However, for various real-time and embedded systems, this is not the case. Consider a small single board computer that controls a robot arm. The software cannot be developed on this board because it has no keyboard, display, or disk, and therefore, a host computer, like a PC, is used to develop the software. At some point, when it is believed that the software is ready for testing, the software is compiled and linked to form an executable file, and the file is downloaded to the target, typically over an RS-232 serial connection. Debugging on the target is typically done with an ICE.
Task A thread of execution that can be suspended, and later resumed.
Task Collisions, Time Dependent Error Data corruption due to preemption. Consider a preemptive multi-tasking environment in which tasks use a global flag to gain access to a printer. The following code will not have the desired effect.

tryAgain: CMP flag,1      ; Printer available?
          JE  tryAgain    ; No, then try again.
          MOV flag,1      ; Printer was available. Set it as busy.
          ;...use printer...
          MOV flag,0      ; Set printer as available.

The problem is that after the CMP instruction, the current task can be preempted to run another task which also checks the flag, sees that it is available, sets the flag, does some printing, and is then preempted to run the original task, which resumes at the JE instruction. Since the 80x86 flags register (not to be confused
with the flag variable), which contains the status of the last CMP instruction, is restored with the original task’s context, the JE will fall through, since the flag was 0 when the original task tested it. The original task will now acquire the flag and interleave its printing with the other task’s printing. This problem can be overcome by using a read-modify-write instruction like XCHG or by managing the printer with a server task. Time dependent errors can also occur when preemption is used with non-reentrant code. Consider a graphics library that is non-reentrant. If taskA calls a non-reentrant graphics function and is then preempted to run taskB which calls the same function, errors will result. Sometimes the non-reentrancy problem is not so obvious. Consider a non-reentrant floating point software library that is called invisibly by the compiler. For example, the statement y = 5.72 * x; would make a call to the floating point software multiplication function. If the current task were preempted while executing this statement, and the next task to run also performed a multiply, then errors can again result. Note that floating point time-dependent errors could still occur even if a floating point chip were used instead of a floating point software library. An error could occur if the floating point chip’s context is not saved by the kernel.
Task Control Block, TCB A task control block is a data structure that contains information about the task. For example, task name, start address, a pointer to the task’s instance data structure, stack pointer, stack top, stack bottom, task number, message queue, etc.
Task Instance, Generic Task, Instance Data A single generic task can be created to handle multiple instances of a device. Consider 5 analog channels that must be read and processed every “n” milliseconds, the number “n” being different for each channel. A generic task can be created that is channel independent through the use of instance data. Instance data is a C structure that in this case contains information specific to each channel, i.e., the IO port address of the channel, the sample interval in milliseconds, etc. The task code is written so that it references all channel specific data through the instance data structure. When the task is created, a pointer to the task’s task control block is returned. One of the fields of the task control block is a user available void pointer named “ptr”. After creating the task, the “ptr” field is set to point to an instance data structure that contains the sample rate, IO port address, etc. for that task instance.

for (i = 0; i < 5; i++) {
    tcb = makeTask(genericAnalogTask, i);
    tcb->ptr = &InstanceDataArray[i];
    startTask(tcb);
}

The code above shows how 5 instances of the same task are created, each given a unique identity via tcb->ptr, which points to the instance data.

void genericAnalogTask(void)
{
    typeAnalogData *d;

    d = (typeAnalogData *) ActiveTcb->ptr;
    while (TRUE) {
        pause(d->sampleIntervalInMs);
        readAndProcessChannel(d->channelIOPortAddress);
    }
}

The code above shows what the generic task might look like. For further information see the example.
Thread A thread is a task that runs concurrently with other tasks within a single executable file (e.g., within a single MS-DOS EXE file). Unlike processes, threads have access to common data through global variables.
Thread of Execution A sequence of CPU instructions that can be or have been executed.
Thread (Context) Switch Switching the CPU to another thread requires saving the state of the old thread and loading the saved state for the new thread. This is known as a context switch. The context of a thread is represented in the TCB of a thread; it includes the value of the CPU registers, the thread state, and memory management information.
Thread Switch Latency The time interval between the last instruction of the currently running thread before giving up the processor (voluntary or involuntary) and the first instruction in the next ready thread.
Time-Slice, Time-Slicing, Time-Sliced Scheduling, Time-Sharing Each task is allotted a certain number of time units, typically milliseconds, during which it has exclusive control of the processor. After the time-slice has expired, the task is preempted and the next task at the same priority, for which time-slicing is enabled, runs. This continues in a round-robin fashion. This means that a group of time-sliced high priority tasks will starve other lower priority tasks. For example, in a 10 task system, there are 3 high priority tasks and 7 normal priority tasks. The 3 high priority tasks have time-slicing enabled. As each high priority task’s time-slice expires, the next high priority task is run for its time-slice, and so on. The high priority tasks run forever, (each in its turn until its time-slice expires), and the low priority tasks will never run. If, however, all the high priority tasks wait for an event, (for example each pauses for 100 ms), then lower priority tasks can run. The behavior of tasks in a time-slicing environment is kernel dependent; the behavior outlined above is only one of many possibilities, but is typically the way time-sliced tasks should behave in a real-time system. Some kernels may implement the concept of fairness to handle this situation; however, fairness is repugnant to real-time principles as it compromises the concept of priority.
Trace An ICE command that will save the most recent “n” instructions executed. The trace can also be conditional, e.g., trace only those instructions that access memory between 0 and 1023.
Trace Buffer A buffer in ICE memory that stores the last “n” instructions executed. It is useful while debugging as it shows a history of what has occurred.
Virtual Memory A technique in which a large memory space is simulated with a small amount of RAM, a disk, and special paging hardware. For example, a virtual 1 megabyte address space can be simulated with 64K of RAM, a 2 megabyte disk, and paging hardware. The paging hardware translates all memory accesses through a page table. If the page that contains the memory access is not loaded into memory a fault is generated which results in the least used page being written to disk and the desired page being read from disk and loaded into memory over the least used page.
Watchdog Timers A technique to execute a subroutine at some time in the future. Typically used to provide timeouts. For example, a task may wait for input from a device but set up a watchdog timer to cancel the wait after some period of time.
XCHG (80x86 XCHG Instruction), Protected Flag, Read-Modify-Write The 80x86 XCHG instruction can be used in a single processor, preemptive multitasking environment to create a flag that is shared between different tasks. A simple global byte cannot be used as a flag in this type of environment since tasks may collide when attempting to acquire the flag. The XCHG instruction is typically used as follows:

tryAgain: MOV  AL,1
          XCHG AL,flag
          CMP  AL,0
          JNE  tryAgain

flag is a byte in RAM. If the flag is 1, then the XCHG instruction will place a 1 into AL. If AL is 1 after the XCHG, then the loop must continue until finally AL is 0. When AL is 0, the following is known: (1) the flag was 0 and therefore available, and (2) the flag is now 1 due to the exchange, meaning that the flag has been acquired. The effect is that with a single instruction, a task can check, and possibly acquire, the flag, thus eliminating the possibility of a collision with another task. When using multiple processors with the flag in shared RAM, the above solution will not work: although the instruction is indivisible for the processor that executes it, it is not indivisible with respect to the shared RAM data bus, and therefore the shared RAM bus arbitration logic must provide a method by which one processor can lock out the others while the XCHG executes. This is typically done with the 80x86 LOCK prefix as shown below.

tryAgain: MOV  AL,1
          LOCK XCHG AL,flag
          CMP  AL,0
          JNE  tryAgain

Note that the LOCK prefix provides information only; it is up to the shared RAM bus arbitration logic to incorporate the LOCK* pin into its logic. Newer
processors like the 80386 assert LOCK implicitly for the XCHG instruction, so the explicit LOCK prefix is unnecessary.
Yield, Yielding The process by which a task voluntarily relinquishes control (via a kernel call) so that other tasks can run; the task will run again in its turn (according to its priority).