A Real-Time Execution Performance Agent Interface for Confidence-Based Scheduling

by

Sam Siewert

B.S., University of Notre Dame, 1989
M.S., University of Colorado, 1993

A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Department of Computer Science

2000

Copyright © 2000 Sam Siewert, All Rights Reserved

This thesis entitled: A Real-Time Execution Performance Agent Interface for Confidence-Based Scheduling written by Sam Siewert has been approved for the Department of Computer Science

____________________________ Professor Gary J. Nutt, Advisor

____________________________ Professor Ren Su

Date__________________

The final copy of this thesis has been examined by the signatories, and we find that both the content and the form meet acceptable presentation standards of scholarly work in the above-mentioned discipline.


Siewert, Sam (Ph.D., Computer Science)
A Real-Time Execution Performance Agent Interface for Confidence-Based Scheduling
Thesis directed by Professor Gary Nutt

Abstract

The use of microprocessors and software to build real-time applications is expanding from traditional domains such as digital control, data acquisition, robotics, and digital switching to include emerging domains like multimedia, virtual reality, optical navigation, and audio processing. These emerging real-time application domains require much more bandwidth and processing capability than traditional real-time systems applications. At the same time, the performance and complexity of microprocessor and I/O architectures are rapidly evolving to meet these new application demands (e.g., super-scalar, pipelined architectures with multi-level caches and burst-transfer I/O buses). Finally, the complexity of typical real-time system algorithms is increasing to include functions such as image processing, rule-based fault protection, and intelligent sensor processing.

The foundation of real-time systems theory is the recognition that bandwidth and processing resources will always be constrained (a more demanding application always exists that can make use of increased resources as they become available). Given this reality, the question is how an engineer formally ensures, given resource constraints, that the system will not only function correctly but also meet its timing deadlines. Since the introduction of Liu and Layland's rate-monotonic analysis and the development of the formal theory of hard real-time systems, significant progress has been made on extending this theory and developing an engineering process for it. The problem is that current hard real-time theory and process assume full reliability and constrain systems more than necessary by requiring either deterministic use of resources or worst-case models of such usage.

Real-time systems engineering requires translation of requirements into a system meeting cost, performance, and reliability objectives. If deadline performance were the only consideration in the engineering process, and there were no cost or reliability requirements, then current hard real-time theory would generally be sufficient. In reality, though, cost and reliability must be considered, especially since emerging application domains may be more sensitive to cost and reliability than traditional hard real-time domains. Typically, a direct trade can be made between cost and reliability for a given performance level. Three main problems exist with the application of current hard real-time theory to systems requiring a balance of cost, reliability, and performance. First, there is no formal approach for the design of systems for less than full reliability. Second, the assumptions and constraints of applying hard real-time theory severely limit performance. Finally, safe mixing of hard and soft real-time execution is not supported. Without a better framework for implementing mixed hard and soft real-time requirements, the engineer must either adapt hard real-time theory on a case-by-case basis or risk implementing a best effort system which provides no formal assurance of performance. Soft real-time quality-of-service frameworks are also an option. However, not only are these approaches not fully mature, but, more fundamentally, they do not address the concept of mixed hard and soft real-time processing, nor is it clear that any of these approaches provide concretely measurable reliability. In this thesis we present an alternative framework for the implementation of real-time systems which accommodates mixed hard and soft real-time processing with measurable reliability by providing a confidence-based scheduling and execution fault handling framework. This framework, called the RT EPA (real-time execution performance agent), provides a more natural and less constraining approach to translating both timing and functional requirements into a working system. The RT EPA framework is based on an extension to deadline monotonic theory. The RT EPA has been evaluated with simulated loading and an optical navigation test-bed, and the RT EPA monitoring module will be flown on an upcoming NASA space telescope in late 2001. The significance of this work is that it directly addresses the shortcomings in the current process for handling reliability and provides measurable reliability and performance feedback during the implementation, systems integration, and maintenance phases of the real-time systems engineering process.


Acknowledgements

I would especially like to thank the following people, who believed in the merits of this research, helped remove roadblocks, provided expert insight, and gave me moral support and encouragement along the way.

Prof. Gary Nutt, Computer Science Department, Dissertation Advisor – Prof. Nutt was the perfect advisor in terms of letting me drive the research while also challenging me to get to the heart of it, gently reminding me to keep on track, and helping me get past roadblocks preventing progress. Much of the theory in this dissertation was derived almost three years prior to completing the proof-of-concept experiments, and Prof. Nutt was extremely patient and supportive while I struggled with technological aspects of the experiments such as high-bandwidth transfer for high frame rates. I think that the complexity and realistic nature of experiments like RACE has made this a better thesis, and I appreciate the support and patience Prof. Nutt provided.

Dr. George Rieke, University of Arizona Steward Observatory, MIPS Principal Investigator on the Space-based Infrared Telescope Facility (SIRTF) – Dr. Rieke was very trusting and supportive of my efforts to use methods from this research to improve data processing performance on the Multi-band Imaging Photometer for SIRTF (MIPS) instrument. At the point in the project when there was SIRTF program concern that the MIPS instrument software would not work, due to apparently random timeouts in exposure processing, Dr. Rieke supported my efforts completely without unnecessary questioning. I believe this is due to his insight and uncommon ability to appreciate technology in many different disciplines.

Ball Aerospace – Ball supported my completion of this research by providing me time off to finish writing and by allowing me to use information in this thesis from the SIRTF/MIPS real-time scheduling work completed under contract 960785 to the NASA Jet Propulsion Laboratory.

Elaine Hansen, Director of the Colorado Space Grant Consortium – Elaine Hansen provided me with research opportunities on NASA Jet Propulsion Laboratory (JPL) projects, funding for basic research and for presentations at conferences on the subject of real-time automation, and excellent review of and insight into many of the early ideas which led to completion of this research. Along the way I also learned the practical aspects of engineering real-time systems by working for Director Hansen to build ground and flight software for a Shuttle payload operations system which flew on STS-85 in 1997.

Prof. Ren Su, Chairman of the Electrical Engineering Department – Prof. Su provided me the opportunity to teach fundamentals of real-time systems while completing this research, which was motivational and helped me focus on the current state of practice in real-time embedded systems, on directions for my research, and on the infrastructure needed to complete experiments.


Dr. Richard Doyle, Manager of Autonomy Technology Programs and Information and Computing Technologies Research at the NASA Jet Propulsion Laboratory – Dr. Doyle provided support to me for basic research on real-time automation through the Space Grant College. Working with members of his group at JPL, including Dr. Steve Chien and Dr. Dennis Decoste, I was able to formulate some of the basic concepts which inspired the RT EPA framework for real-time data processing and control. Dr. Doyle also encouraged me to present early research work at the International Space Artificial Intelligence and Robotics Applications Symposium, where I was able to share concepts with many researchers from NASA and the robotics industry, which helped me to understand these application domains much better.


CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS

1 INTRODUCTION
  1.1 RESEARCH SCOPE
  1.2 COMPARISON TO EXISTING APPROACHES
  1.3 PROBLEM STATEMENT
  1.4 PROPOSED SOLUTION
  1.5 EVALUATION
  1.6 SUMMARY OF RESEARCH RESULTS
    1.6.1 Theoretical Results
    1.6.2 Framework Prototype Implementation
    1.6.3 Proof-of-Concept Test Results
  1.7 SIGNIFICANCE

2 PROBLEM STATEMENT
  2.1 SYSTEM TIMING ISSUES
    2.1.1 Release Variance Due to Contention and Interference
    2.1.2 Dispatch and Preemption Variance Due to System Overhead
    2.1.3 Algorithm Execution Variance Due to Non-uniform Loading in a Single Release
    2.1.4 Architectural Execution Variance Due to Micro-parallelism and Memory Hierarchy
    2.1.5 Input/Output Variance Due to Shared Resource Contention and Transfer Modes
    2.1.6 System End-to-End Latency and Jitter
  2.2 ENVIRONMENTAL EVENT RATE VARIANCE DUE TO NATURE OF EVENTS AND MODELING DIFFICULTY
  2.3 CHARACTERISTICS OF EMERGING REAL-TIME APPLICATIONS
    2.3.1 Loading characteristics of purely continuous media
    2.3.2 Loading characteristics of purely event-driven processing
    2.3.3 Loading characteristics of purely digital control applications
    2.3.4 Loading characteristics of mixed processing
    2.3.5 Loading characteristics of mixed event-driven and digital control
    2.3.6 Loading characteristics of mixed real-time applications in general

3 RELATED RESEARCH
  3.1 HARD REAL-TIME RESEARCH RELATED TO RT EPA
  3.2 SOFT REAL-TIME RESEARCH RELATED TO RT EPA AND CONFIDENCE-BASED SCHEDULING
  3.3 EXECUTION FRAMEWORKS SIMILAR TO RT EPA

4 SCHEDULING EPOCHS
  4.1 MULTIPLE ON-LINE SCHEDULING EPOCH CONCEPT DEFINITION
    4.1.1 Admission and Scheduling Within an Epoch
    4.1.2 Active epoch policy
  4.2 EQUIVALENCE OF EDF AND MULTI-EPOCH SCHEDULING IN THE LIMIT
  4.3 APPLICATION OF MULTI-EPOCH SCHEDULING
    4.3.1 SIRTF/MIPS Multi-Epoch Scheduling Example
    4.3.2 Multi-epoch Scheduling Compared to Multi-level Scheduling

5 REAL-TIME EXECUTION PERFORMANCE AGENT FRAMEWORK
  5.1 DESIGN OVERVIEW
    5.1.1 Pipeline Time Consistency and Data Consistency
    5.1.2 Pipeline Admission and Control
  5.2 RT EPA TRADITIONAL HARD REAL-TIME FEATURES
  5.3 RT EPA SOFT REAL-TIME FEATURES
  5.4 RT EPA BEST EFFORT FEATURES
  5.5 RT EPA DATA PROCESSING PIPELINE FEATURES
  5.6 RT EPA IMPLEMENTATION
    5.6.1 RT EPA Service and Configuration API
      5.6.1.1 RT EPA System Initialization and Shutdown
      5.6.1.2 RT EPA Service (Thread) Admission and Dismissal
      5.6.1.3 RT EPA Task Control
      5.6.1.4 RT EPA Release and Pipeline Control
      5.6.1.5 RT EPA Performance Monitoring
      5.6.1.6 RT EPA Execution Model Utilities
      5.6.1.7 RT EPA Information Utilities
      5.6.1.8 RT EPA Control Block
        5.6.1.8.1 RT EPA CB Negotiated Service
        5.6.1.8.2 RT EPA CB Release and Deadline Specification
        5.6.1.8.3 RT EPA CB On-Line Statistics and Event Tags
        5.6.1.8.4 RT EPA CB On Demand or Periodic Server Computed Performance Statistics
      5.6.1.9 RT EPA Service Negotiation and Configuration Example
      5.6.1.10 RT EPA Admission Request and Service Specification
        5.6.1.10.1 Service Type
        5.6.1.10.2 Interference Assumption
        5.6.1.10.3 Execution Model
        5.6.1.10.4 Termination Deadline Miss Policy
        5.6.1.10.5 Release Period and Deadline Specification
      5.6.1.11 Expected Performance Feedback
        5.6.1.11.1 Global Performance Parameters Update API Functions
        5.6.1.11.2 Deadline Performance API Functions
        5.6.1.11.3 Execution Performance API Functions
        5.6.1.11.4 Release Performance API Functions
      5.6.1.12 RT EPA Task Activation and Execution Specification
        5.6.1.12.1 Service Execution Entry Point and Soft Deadline Miss Callback
        5.6.1.12.2 Service Release Complete Isochronal Callback
        5.6.1.12.3 Release Type and Event Specification
        5.6.1.12.4 Service On-Line Model Size
      5.6.1.13 Service Performance Monitoring Specification
    5.6.2 RT EPA Kernel-Level Monitoring and Control
      5.6.2.1 Event Release Wrapper Code
        5.6.2.1.1 ISR Release Wrapper Code
        5.6.2.1.2 RT EPA Event Release Wrapper Code
      5.6.2.2 Dispatch and Preempt Event Code
      5.6.2.3 Release Frequency
      5.6.2.4 Execution Time
      5.6.2.5 Response Time
      5.6.2.6 Deadline Miss Management
        5.6.2.6.1 Terminate Execution that would Exceed Hard Deadline
        5.6.2.6.2 Hard Deadline Miss Restart Policy
        5.6.2.6.3 Termination Deadline Miss Dismissal Policy
    5.6.3 Performance Monitoring and Re-negotiation
      5.6.3.1 Soft Deadline Confidence
      5.6.3.2 Hard Deadline Confidence

6 THE CONFIDENCE-BASED SCHEDULING FORMULATION
  6.1 RT EPA CBDM CONCEPT
  6.2 CBDM DEADLINE CONFIDENCE FROM EXECUTION CONFIDENCE INTERVALS
  6.3 CBDM ADMISSION TEST EXAMPLE

7 EVALUATION METHOD
  7.1 RT EPA PSEUDO LOADING EVALUATION
  7.2 SIRTF/MIPS VIDEO PROCESSING RT EPA MONITORING EVALUATION
  7.3 SIRTF/MIPS VIDEO PROCESSING RT EPA EPOCH EVALUATION
  7.4 DIGITAL VIDEO PIPELINE TEST-BED
    7.4.1 NTSC Digital Video Decoder DMA Micro-coding
    7.4.2 RT EPA Digital Video Processing Pipeline
  7.5 RACE OPTICAL NAVIGATION AND CONTROL EXPERIMENT
    7.5.1 RACE Mechanical System Overview
    7.5.2 RACE Electronics System Description
    7.5.3 RACE RT EPA Command, Control, and Telemetry Services
      7.5.3.1 Frame-based Processing and Control Sequencing
      7.5.3.2 Frame Display Compression/Formatting Algorithm
      7.5.3.3 Optical Navigation Algorithm
      7.5.3.4 RACE Control Algorithm
      7.5.3.5 State Telemetry Link Algorithm
      7.5.3.6 Grayscale Frame Link Algorithm
      7.5.3.7 NTSC Camera Tilt and Pan Control Algorithm
    7.5.4 RACE RT EPA Software System
  7.6 ROBOTIC TEST-BED
  7.7 ROBOTICS TEST-BED INCONCLUSIVE RESULTS

8 EXPERIMENTAL RESULTS
  8.1 RT EPA EXPERIMENTATION GOALS
  8.2 RT EPA PSEUDO LOADING TESTS
    8.2.1 Pseudo Load Marginal Task Set Negotiation and Re-negotiation Testing (Goal 2)
  8.3 SIRTF/MIPS VIDEO PROCESSING RT EPA MONITORING EVALUATION
    8.3.1 SIRTF/MIPS RT EPA DM Priority Assignment
    8.3.2 MIPS Exposure-Start Reference Timing Model
    8.3.3 SIRTF/MIPS Exposure Steady-State Reference Timing Model
    8.3.4 SIRTF/MIPS SUR Mode Steady-State ME Results
    8.3.5 SIRTF/MIPS Raw Mode Steady-State Results
    8.3.6 SIRTF/MIPS Video Processing RT EPA Epoch Evaluation
  8.4 DIGITAL VIDEO PIPELINE TEST-BED RESULTS
  8.5 RACE RESULTS
    8.5.1 RACE Marginal Task Set Experiment (Goal 1)
    8.5.2 RACE Nominal Configuration Results
      8.5.2.1 Bt878 Video Frame Buffer Service
      8.5.2.2 Frame Display Formatting and Compression Service
      8.5.2.3 Optical Navigation Ranging and Centroid Location Service
      8.5.2.4 RACE Vehicle Ramp Distance Control
      8.5.2.5 RACE Vehicle Telemetry Processing
      8.5.2.6 RACE Video Frame Link Processing
      8.5.2.7 RACE Camera Control
    8.5.3 RACE RT EPA Initial Service Negotiation and Re-negotiation (Goal 2)
    8.5.4 RACE Release Phasing Control Demonstration (Goal 3a and 3b)
    8.5.5 RACE Protection of System from Unbounded Overruns (Goal 5)
      8.5.5.1 Example of Unanticipated Contention for I/O and CPU Resources
      8.5.5.2 RT EPA Protection from Period/Execution Jitter Due to Misconfiguration (Goal 5)

9 SIGNIFICANCE

10 PLANS FOR FUTURE RESEARCH

11 CONCLUSION

REFERENCES

APPENDIX A: RT EPA SOURCE CODE API SPECIFICATION

APPENDIX B: LOADING ANALYSIS FOR IMAGE CENTROID CALCULATION WITH VARIANCE DUE TO CACHE MISSES
  11.1 ARCHITECTURE PERFORMANCE ASSUMPTIONS
  11.2 33 MHZ RAD6000 ANALYSIS
  11.3 CENTROID COMPUTATION TIME MODEL
    11.3.1 Algorithm Description
    11.3.2 Load-Store RISC Pseudo-code Instructions to Implement for X-bar and Y-bar
  11.4 OVERALL EXPECTED CACHE HIT RATE
  11.5 CENTROID CPI ESTIMATIONS
  11.6 ALGORITHM COMPLEXITY
  11.7 TIME TO COMPUTE ARRAY CENTROID
  11.8 EXAMPLE FOR 1024X1024 ARRAY
  11.9 GENERAL RESULT

APPENDIX C: UNMODELED INTERFERENCE CAUSES SEVERAL TERMINATION DEADLINE MISSES

APPENDIX D: RACE INITIAL SCHEDULING AND CONFIGURATION ADMISSION RESULTS

APPENDIX E: VIDEO PIPELINE TEST RESULTS (WITHOUT ISOCHRONOUS OUTPUT)

APPENDIX F: VIDEO PIPELINE TEST RESULTS (WITH ISOCHRONOUS OUTPUT)


TABLES

TABLE 1: MIXED BEST EFFORT, SOFT, AND HARD REAL-TIME APPLICATION EXAMPLE
TABLE 2: ENVIRONMENTAL EVENT-RATE TYPES WITH APPLICATION EXAMPLES OF EACH
TABLE 3: SINGLE EPOCH DESIGN OF THE SIRTF/MIPS VIDEO PROCESSING WITH 16 KWORD FIFOS
TABLE 4: SINGLE EPOCH DESIGN OF THE SIRTF/MIPS VIDEO PROCESSING WITH 4 KWORD FIFOS
TABLE 5: MULTIPLE EPOCH DESIGN OF THE SIRTF/MIPS STEADY-STATE VIDEO PROCESSING
TABLE 6: RT EPA DEADLINE MANAGEMENT SUMMARY
TABLE 7: PSEUDO SOURCE/SINK EXPERIMENT TASK SET DESCRIPTION
TABLE 8A: EPOCH 1 OF THE SIRTF/MIPS VIDEO PROCESSING
TABLE 8B: EPOCH 2 OF THE SIRTF/MIPS VIDEO PROCESSING
TABLE 8C: EPOCH 3 OF THE SIRTF/MIPS VIDEO PROCESSING
TABLE 9: DIGITAL VIDEO PIPELINE SERVICES
TABLE 10: RACE TASK SET DESCRIPTION
TABLE 11: 5 DOF ROBOTIC EXPERIMENT TASK SET DESCRIPTION
TABLE 12: PSEUDO LOADING MARGINAL TASK SET DESCRIPTION (TIMER RELEASED)
TABLE 13: PSEUDO LOADING ACTUAL MARGINAL TASK SET PERFORMANCE (TIMER RELEASED)
TABLE 14: PSEUDO LOADING MARGINAL TASK SET DESCRIPTION (TIMER RELEASED)
TABLE 15: PSEUDO LOADING ACTUAL MARGINAL TASK SET PERFORMANCE (TIMER RELEASED)
TABLE 16: RT EPA EXECUTION JITTER IN SIRTF/MIPS SI FRAME PROCESSING RELEASES
TABLE 17: RT EPA EXECUTION JITTER IN SIRTF/MIPS SI OPTIMIZED FRAME PROCESSING RELEASES
TABLE 18: SIRTF/MIPS DM PRIORITY ASSIGNMENTS
TABLE 19: MIPS SUR C0F2N2 EXPOSURE START VMETRO TIME TAGS
TABLE 20: MIPS RAW C0F1N2 EXPOSURE START VMETRO TIME TAGS
TABLE 21: SUR C0F2N2 STEADY-STATE EXPOSURE TIME TAGS
TABLE 22: RAW C0F1N2 STEADY-STATE EXPOSURE TIME TAGS
TABLE 23: DIGITAL VIDEO PIPELINE MARGINAL TASK SET DESCRIPTION
TABLE 24: ACTUAL DIGITAL VIDEO TASK SET PERFORMANCE
TABLE 25: RACE SOURCE/SINK PIPELINE TASK SET DESCRIPTION
TABLE 26: RACE STANDARD PIPELINE PHASING AND RELEASE FREQUENCIES
TABLE 27: RACE SOFT AND TERMINATION DEADLINE ASSIGNMENT
TABLE 28: INITIAL RACE SOURCE/SINK PIPELINE TASK SERVICE DESCRIPTION
TABLE 29: RACE SOURCE/SINK ACTUAL PERFORMANCE


FIGURES

FIGURE 1: RT EPA CONFIDENCE-BASED SCHEDULING UPPER-BOUND
FIGURE 2: THE RT EPA UTILITY ASSUMPTION
FIGURE 3: END-TO-END JITTER FROM EVENT RELEASE TO RESPONSE
FIGURE 4: CONTINUOUS MEDIA DIGITAL VIDEO PIPELINE
FIGURE 5: PURELY EVENT-DRIVEN REAL-TIME PROCESSING
FIGURE 6: REAL-TIME DIGITAL CONTROL PROCESSING
FIGURE 7: MIXED CONTINUOUS MEDIA AND EVENT-DRIVEN REAL-TIME PROCESSING
FIGURE 8: MIXED DIGITAL CONTROL AND EVENT-DRIVEN REAL-TIME PROCESSING
FIGURE 9: FEATURE SPACE OF SYSTEM I/O AND CPU REQUIREMENTS BY APPLICATION TYPE
FIGURE 10: MULTIPLE EPOCHS OF SCHEDULING ACTIVE SIMULTANEOUSLY
FIGURE 11: IN-KERNEL PIPE WITH FILTER STAGE AND DEVICE INTERFACE MODULES
FIGURE 12: EXECUTION EVENTS AND DESIRED RESPONSE SHOWING UTILITY
FIGURE 13: EPA PSEUDO LOADING PIPELINE
FIGURE 14: SIRTF/MIPS DUAL STREAM PIPELINE
FIGURE 15: BASIC DIGITAL VIDEO RT EPA PIPELINE
FIGURE 16 A AND B: RACE SYSTEM SIDE-VIEW (A) AND FRONTAL-VIEW (B)
FIGURE 17: RACE VEHICLE AND GROUND CONTROL SYSTEM ELECTRONICS
FIGURE 18: RACE ELECTRONICS
FIGURE 19 A AND B: TARGET WIDTH DISTRIBUTION FOR ALL SCAN-LINES – CLOSE (A) AND FAR (B)
FIGURE 20: RACE EPA PIPELINE
FIGURE 21 A AND B: 5 DOF DEAD-RECKONING ROBOT (A, LEFT), POSITION FEEDBACK ROBOT (B, RIGHT)
FIGURE 22: MIPS MODE HTG READY EXPOSURE-START HW/SW SYNCHRONIZATION WINDOW
FIGURE 23: MIPS EXPOSURE START WORST CASE DELAY (CASE A)
FIGURE 24: MIPS EXPOSURE START BEST CASE DELAY (CASE B)
FIGURE 25: SUR C0F2NN FIRST DCE DATA COLLECTION AND PRODUCTION EVENT TIMING MODEL
FIGURE 26: SUR C0F2NN DCE 2 TO N DATA COLLECTION AND PRODUCTION EVENT TIMING MODEL
FIGURE 27: FIRST DCE RAW C0F1NN DATA COLLECTION AND PRODUCTION EVENT TIMING MODEL
FIGURE 28: DCE 2 TO N RAW C0F1NN DATA COLLECTION AND PRODUCTION EVENT TIMING MODEL
FIGURE 29 A AND B: RACE FRAME COMPRESSION (A) AND RESPONSE JITTER (B)
FIGURE 30 A AND B: RACE FRAME LINK EXECUTION (A) AND RESPONSE JITTER (B)
FIGURE 31 A AND B: RACE FRAME LINK EXECUTION (A) AND RESPONSE JITTER (B) WITH ISOCHRONAL OUTPUT CONTROL
FIGURE 32: BT878 VIDEO RELEASE JITTER
FIGURE 33 A AND B: BT878 VIDEO EXECUTION (A) AND RESPONSE JITTER (B)
FIGURE 34: RACE FRAME DISPLAY SERVICE RELEASE PERIOD JITTER
FIGURE 35 A AND B: FRAME DISPLAY SERVICE EXECUTION (A) AND RESPONSE (B) LATENCY AND JITTER
FIGURE 36: OPTICAL NAVIGATION EVENT RELEASE PERIOD JITTER
FIGURE 37 A AND B: OPTICAL NAVIGATION EXECUTION (A) AND RESPONSE JITTER (B)
FIGURE 38: RAMP CONTROL RELEASE PERIOD JITTER
FIGURE 39 A AND B: RAMP CONTROL EXECUTION (A) AND RESPONSE (B) JITTER
FIGURE 40: RACE TELEMETRY RELEASE PERIOD JITTER
FIGURE 41 A AND B: RACE TELEMETRY EXECUTION (A) AND RESPONSE (B) JITTER
FIGURE 42: RACE FRAME LINK RELEASE PERIOD JITTER
FIGURE 43 A AND B: RACE FRAME LINK EXECUTION (A) AND RESPONSE (B) JITTER
FIGURE 44: RACE CAMERA CONTROL PERIOD RELEASE JITTER
FIGURE 45 A AND B: RACE CAMERA CONTROL EXECUTION (A) AND RESPONSE (B) JITTER
FIGURE 46 A AND B: BEFORE AND AFTER PHASING CONTROL
FIGURE 47: FRAME LINK TERMINATION DEADLINE MISS CONTROL
FIGURE 48: MISCONFIGURATION EXECUTION VARIANCE EXAMPLE


1 Introduction

The range of real-time applications is expanding from traditional domains such as digital control, data acquisition, and digital switching to include emerging domains like multimedia, virtual reality, optical navigation, and speech recognition, all of which expand the input/output range and frequency of such systems and therefore impose higher bandwidth and processing requirements. At the same time, the complexity of microprocessors, typical algorithms, and input/output architectures is also increasing in an attempt to provide performance that can handle these bandwidth and processing demands. The foundation of real-time systems theory is the recognition that bandwidth and processing resources will always be constrained (a more demanding application always exists that can make use of increased resources); given this reality, the question is how an engineer formally ensures that, given the resource constraints, the system will not only function correctly but also meet its timing deadlines. Since the introduction of the formal theory of real-time systems, perhaps best marked by the work of Liu and Layland on rate-monotonic analysis [LiuLay73], significant progress has been made on the engineering process for hard real-time systems using rate-monotonic theory. Improvements have been made to the basic theory and, perhaps more importantly, to the real-time systems engineering process.

To engineer a real-time microprocessor-based system, the engineer must ultimately be able to translate requirements into a system which meets cost, performance, and reliability objectives. For a system with timing requirements, where deadlines must be met (the definition of a real-time system), this means that in addition to correct function, the system must also provide correct function by a deadline, with the required reliability, and within cost constraints. Engineering a system requires a process and methods to measure the quality and success of each step in that process. Traditional steps in the engineering process include analysis, design, implementation, unit testing, integration, systems testing, and maintenance. Depending upon the application, the materials, performance requirements, cost constraints, and required reliability, this process must have figures of merit that provide meaningful feedback. The current state of practice in real-time systems is to perform rate monotonic analysis (RMA) using estimated worst-case execution times and release periods, design the functional code, and then implement it in a priority preemptive multitasking operating system or interrupt-driven executive environment [Bu91]. The problem is that RMA assumes 100% reliability and requires significant resource margin (approximately 20-30% of the central processing unit resource). Finally, it is not clear how such a system will react in an overload scenario. The research presented in this thesis provides a way to implement not only timing performance requirements but also cost and reliability requirements, by providing an implementation framework for mixed hard and soft real-time services.

In real-time systems, most services are provided by event-released tasks: tasks providing a service that is dispatched for execution from an interrupt. Safely scheduling event-released tasks to complete execution by relative real-time deadlines is most challenging when processor loading approaches full utility (i.e., 100% central processing unit utilization). Liu and Layland established interesting bounds on this problem with the RMA least upper bound (provable safe utility < 0.7) and the ideal theoretical upper bound of full utility provided by the Earliest Deadline First (EDF) dynamic priority algorithm [LiuLay73]. The derivation of the RMA least upper bound treats all tasks in the set as having equally hard deadlines and is pessimistic since it requires a worst-case release and a worst-case execution assumption. Task sets with process utility loads below the RMA least upper bound can easily be scheduled, and, as Liu and Layland point out, the EDF ideal is not achievable given the impracticality of the dynamic priority assignment required. What is perhaps most interesting about Liu and Layland's work is that it provides the bounds for a marginally safe region of real-time scheduling using the fixed priority preemptive scheduling method – i.e., those task sets with loading between the RMA least upper bound and full utility (provable safe utility < 0.7 < marginally safe utility < full utility) – a marginal task set. Due to pessimistic execution time and period assumptions, tasks can be scheduled successfully within this region of marginally safe utility. The problem is that the safety of meeting deadlines for such a system cannot be mathematically guaranteed, nor can it even be well estimated from a reliability perspective with RMA. Ideally, from an engineering viewpoint, it would be beneficial to be able to schedule a marginal task set such that the reliability in meeting deadlines could be specified for each task, so that the range of reliability includes guaranteed service, soft real-time service (the number of missed deadlines over a period is fully predictable), and best effort tasks.

A formulation and implementation of such a programming framework, called the RT EPA (Real-Time Execution Performance Agent), is introduced in this thesis. The RT EPA can be applied to typical real-time applications such as continuous media, digital control, and event processing. A confidence-interval mathematical formulation for task admission is used to provide a range of scheduling reliability within the RT EPA framework, which enforces policy and provides an intuitive way to build processing pipelines and to schedule execution of task sets based on desired deadline reliability performance. An evaluation of the RT EPA framework is presented based on simulated applications, an optical navigation test-bed called RACE (Rail-guided, Air-powered Controls Experiment), and a video processing system for the SIRTF/MIPS (Space-based Infrared Telescope Facility / Multi-band Imaging Photometer for SIRTF) instrument. The results show that, given minimal programmer specification, the RT EPA applications performed according to expectation for guaranteed performance requirements, within specified tolerances for soft real-time requirements, and also supported best effort performance for a marginal task set. The simulated applications and the SIRTF/MIPS instrument application demonstrate the viability of the theory, the implementation concept, and the ability to solve difficult scheduling problems that include a mix of hard, soft, and best effort tasks in a single real-time application.

This thesis introduces a novel real-time scheduling method and real-time programming framework, along with examples of how it can be used. The examples provide evidence that this approach is a valuable new way to implement real-time applications which include high loading and mixed hard and soft real-time tasks. The thesis defines the scope of real-time execution addressed; defines the problems with existing hard and soft real-time frameworks; provides a proposed solution to those problems; presents results of testing an implementation of the proposed solution; and finally describes the significance of the results.
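For reference, the two bounds discussed above come from Liu and Layland's analysis [LiuLay73]. For n periodic tasks with worst-case execution times C_i and periods T_i, fixed-priority rate-monotonic scheduling is guaranteed feasible below the least upper bound, while EDF is feasible up to full utility:

```latex
% Processor utilization for n periodic tasks with execution times C_i and periods T_i:
U \;=\; \sum_{i=1}^{n} \frac{C_i}{T_i}

% RMA least upper bound (sufficient condition for fixed-priority, rate-monotonic scheduling):
U \;\le\; n\,\bigl(2^{1/n} - 1\bigr) \;\xrightarrow{\;n \to \infty\;}\; \ln 2 \approx 0.69

% EDF theoretical bound (full utility):
U \;\le\; 1
```

Task sets whose utilization falls between roughly 0.69 and 1.0 are the marginal task sets this thesis targets.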

1.1 Research Scope

The purpose of the RT EPA research is to validate extensions to Deadline Monotonic (DM) theory for mixed hard and soft services in multiphase in-kernel pipelines. The threads of execution in the RT EPA are trusted modules that are loaded into kernel space and dynamically bound to kernel symbols. This is based upon the concept that such modules can be tested in protected memory spaces for correctness and, once thoroughly tested, loaded into kernel space for better performance (elimination of system call traps) and for better real-time scheduling control (kernel-level threads). So, the research includes building an RT EPA framework according to the design presented in Section 5 and evaluating this framework with synthetic loads, an optical navigation test-bed, and an actual NASA space-based telescope. Evaluating the RT EPA framework with applications that include marginal services (thread sets) is of fundamental importance. Furthermore, the applications must include both event-driven thread releases and time-based releases. The ideal application test of the RT EPA should include mixed hard and soft real-time services released both by external events and by an internal clock to provide a significant validation of the framework.

Scheduling event-released tasks (tasks released by interrupts generated from external events) to complete execution by relative real-time deadlines is not difficult when the processor on which these tasks are executed is under-loaded. An under-loaded system is considered to have processor loading less than the RMA least upper bound (approximately 70%). Constrained processor resources make the goal of safely scheduling tasks for real-time execution much more difficult as loading demands approach full utility. As previously mentioned, Liu and Layland set out a basic theory for safe and practical hard real-time processor scheduling utility when they formulated RMA. It is important to note that they also compared this RMA least upper bound to the ideal theoretical full-utility upper bound provided by the Earliest Deadline First (EDF) dynamic priority algorithm (the algorithm most typically used in real-time pipelines – see Section 1.2). They derived the RMA least upper bound, which asymptotes to 70 percent utility with an increasing number of tasks in a set. Furthermore, they established that it was not safe to assume real-time deadlines could be met with the EDF theoretical upper bound, due to the impracticality of dynamic priority assignment (i.e., other real-time pipeline implementations have questionable real-time safety – discussed in Section 1.2). The derivation of the RMA least upper bound treats all tasks in the set as having equally hard deadlines. The RMA least upper bound is pessimistic for the following reasons: 1) overestimation of interference (the necessary and sufficient test derived since then is fully accurate in terms of interference), 2) a worst-case release assumption (the highest release frequency is assumed for quasi-periodic releases), and 3) a worst-case execution assumption (the longest possible execution time is assumed) [LiuLay73]. The RMA least upper bound is perhaps slightly optimistic since it does not include the overhead of context switching, but this overhead is typically insignificant (e.g., 100 microseconds for release execution times in the millisecond range, so less than 10 percent, and it can be included in the release execution time). Task sets with process utility loads below the RMA least upper bound can easily be scheduled by a priority preemptive system with the highest priority assigned to tasks with the shortest release period.

Perhaps what is most interesting about Liu and Layland's work is that it provides bounds for an interesting region of real-time scheduling – those task sets with loading between the RMA least upper bound and full utility. Tasks can be scheduled successfully to some level of utility between the RMA least upper bound and the ideal EDF full utility given the margin afforded by the RMA pessimistic assumptions – the problem is that the safety of meeting deadlines for such a system cannot be mathematically guaranteed, nor can it even be well estimated from a reliability perspective. Task sets which fall between the RMA least upper bound and full utilization are referred to as marginal task sets in this thesis and are the motivation for the research completed. Figure 1 graphically shows the marginal thread scheduling region that is the primary region of theoretical interest and therefore also the goal for experimental validation. Validation of the RT EPA in the under-loaded region shows that the framework is functionally correct, but does not test the features for providing an execution environment for mixed reliable hard and soft real-time services.

Two major new theories for real-time scheduling of the central processing unit (CPU) are presented here and validated in an RT EPA implementation: 1) Confidence-Based Deadline Monotonic (CBDM) scheduling, and 2) Multi-Epoch (ME) scheduling theory. These two new theories are implemented in the RT EPA framework to provide the service admission test and scheduling policy and, overall, the capability to reliably schedule mixed hard and soft services in marginally loaded systems. System safety in the RT EPA is provided by on-line monitoring of releases and their deadlines such that release over-runs are controlled and the system may associate an action with such an over-run to safe the system (e.g., disconnect actuators or other devices which may be damaged by further execution and/or enter a fail-safe mode such as switching automatically to a back-up protection control system).

Figure 1: RT EPA Confidence-Based Scheduling Upper-Bound

[Figure 1 shows a CPU utilization scale from 0.0 (no loading) to 1.0 (the EDF theoretical bound): thread sets below the RMA least upper bound at approximately 0.7 are under-loaded, thread sets between the RMA bound and full utility are marginal, and the RT EPA CBDM upper bound lies within this marginal region.]
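To give a concrete sense of the confidence-based idea ahead of the full CBDM development in Chapter 6, the sketch below derives a per-release execution budget from observed execution times at a requested confidence level. The function name, the microsecond units, and the normal execution-time model are assumptions made for this illustration only; they are not taken from the RT EPA implementation.

```c
#include <math.h>

/* Illustrative sketch (not the RT EPA API): given n observed execution times for a
 * service, compute an execution budget that a single release should complete within
 * at the requested one-sided confidence level, assuming a normal model.
 * z_conf is the standard normal quantile, e.g. 1.28 for ~90%, 2.33 for ~99%.        */
double exec_budget_us(const double *exec_times_us, int n, double z_conf)
{
    double sum = 0.0, sumsq = 0.0;
    for (int i = 0; i < n; i++) {
        sum   += exec_times_us[i];
        sumsq += exec_times_us[i] * exec_times_us[i];
    }
    double mean = sum / n;
    double var  = (sumsq - n * mean * mean) / (n - 1);   /* sample variance (n >= 2) */
    return mean + z_conf * sqrt(var);
}
```

A budget computed this way can stand in for the worst-case execution time in a fixed-priority feasibility test; accepting a confidence below 1.0 is what permits admitting thread sets whose worst-case utilization would exceed the RMA least upper bound.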

The RT EPA is a framework for implementing mixed hard and soft real-time services (each implemented with a thread of computation) with on-line monitoring, control, and admission with quantifiable performance. The scope of the RT EPA research is summarized as follows:

1) Establishing an admission test and sufficient upper bound for thread sets with specified deadline confidence requirements. This upper bound is thread-set specific and based on required confidences, but will always lie between the RMA sufficient least upper bound and the theoretical EDF bound of full utility.

2) The termination deadline must be less than the thread release period so that the CBDM admission test can bound interference (i.e., no overruns beyond the release period are allowed). In other words, multiple active releases of the same service are not allowed.

3) Thread priorities are not specified to the RT EPA, but the RT EPA CBDM policy requires fixed priority preemptive scheduling within a single scheduling epoch.

4) The RT EPA assumes that response utility is non-negative between release and the termination deadline and has a maximum value on that interval. This utility assumption inherently includes both hard and soft real-time services, which are best described by the utility provided by such a service completing before a deadline relative to release.

5) Multiple scheduling epochs are investigated and shown to be viable and beneficial. An on-line multi-epoch RT EPA capability provides limited-scope dynamic prioritization – i.e., priorities are fixed within a single epoch, but change between epochs. Thread sets are then admitted to one of n epochs rather than to a system (i.e., a single epoch). The research presented shows how multi-epoch priorities and admission tests remove pessimism inherent in the RMA critical instant assumption; it also shows that with an increasing number of epochs, the EDF theoretical ideal can be approached, but never reached due to overhead associated with on-line multi-epoch management.

6) Deadline miss and overrun policy includes traditional hard real-time full system safing (e.g., switching to backup control and/or disabling actuators) as well as soft real-time policies to: (a) terminate an overrunning release and restart it for the next release (allow a dropout), (b) allow overrun of a soft deadline to proceed, noted and with an application callback (soft overrun), or (c) dismiss a thread from the on-line set when it overruns and perform an application reconfiguration callback (a sketch of these policies follows below).

Why should the RT EPA support both hard and soft services? This goal was derived from the fact that many systems include the concept of both hard and soft services with respect to completing a service release by a deadline and the relative utility in doing so. The RT EPA utility assumption is based on the goal for the RT EPA to support a mixed set of hard and soft real-time services together, as shown in Figure 2.
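The soft real-time policies in item 6 can be pictured as a per-service selection handled by the monitoring agent when a deadline passes. The enum, struct, and function names below are hypothetical, chosen only for illustration; they are not the RT EPA API documented in Appendix A.

```c
/* Hypothetical illustration of the three soft real-time miss policies in item 6;
 * all names are invented for this sketch and are not the RT EPA API.            */
typedef enum {
    MISS_RESTART,   /* (a) terminate the overrunning release, re-arm for the next release */
    MISS_CONTINUE,  /* (b) let the release continue past its soft deadline, note via callback */
    MISS_DISMISS    /* (c) dismiss the thread from the on-line set, request reconfiguration */
} miss_policy_t;

typedef struct {
    miss_policy_t policy;
    void (*miss_callback)(int service_id);    /* application notification hook */
} miss_spec_t;

/* Invoked by the monitoring agent when a service release passes a deadline. */
static void handle_deadline_miss(int service_id, const miss_spec_t *spec)
{
    switch (spec->policy) {
    case MISS_RESTART:
        /* abort this release; the service remains admitted for its next release */
        break;
    case MISS_CONTINUE:
        if (spec->miss_callback)
            spec->miss_callback(service_id);  /* soft overrun is noted, execution proceeds */
        break;
    case MISS_DISMISS:
        if (spec->miss_callback)
            spec->miss_callback(service_id);  /* reconfiguration hook */
        /* remove the service from the admitted on-line thread set */
        break;
    }
}
```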


Figure 2: The RT EPA Utility Assumption

[Figure 2 plots utility versus time from release: a hard real-time bounding curve holds utility at 1.0 from release until the deadline (hard or futility point), while a soft real-time isochronous utility curve and a soft real-time diminishing utility curve lie within that bound, with the soft deadline marking where overrun is allowed.]

Stated simply, the hard real-time utility curve is assumed to produce full system utility for a response any time after release but before the deadline – at the deadline, hard real-time utility goes negative, meaning that continued execution causes "negative utility" or harm. RT EPA soft real-time utility is considered to be any piecewise continuous function inside this hard real-time utility bound. All deadlines are considered relative to the time the event is released. The RT EPA supports two deadline concepts: 1) a soft deadline, which indicates an application notion of a diminishing utility boundary, and 2) a termination deadline. For hard real-time services the termination deadline has the traditional definition of full system failure, and for soft real-time services it simply marks the point at which continued execution will no longer lead to any system or service utility. The RT EPA does not consider a service which has a utility curve that never diminishes to zero to be a real-time service, but does accommodate such a service by providing a best effort scheduling class. Note that at the point where utility becomes zero it is futile to continue in the case of a soft service, and it may actually be even more harmful to continue beyond this point for a hard service – either way, continuing beyond the zero-utility point means that the system is being loaded with work that has no value.
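One way to write this assumption down, consistent with Figure 2 (the symbols here are introduced only for illustration and are not taken from the thesis), is as a bounding curve for a release at time r with soft deadline d_s and termination deadline d_t:

```latex
% Hard real-time bounding curve for a release at time r with termination deadline d_t:
U_{\mathrm{hard}}(t) =
\begin{cases}
  1,      & r \le t < d_t \\
  \le 0,  & t \ge d_t
\end{cases}

% A soft real-time utility curve is any piecewise continuous function within this bound,
% non-increasing after the soft deadline d_s and reaching zero by d_t:
0 \le U_{\mathrm{soft}}(t) \le U_{\mathrm{hard}}(t) \text{ for } r \le t < d_t, \qquad
U_{\mathrm{soft}}(t) \text{ non-increasing for } t \ge d_s, \qquad
U_{\mathrm{soft}}(t) = 0 \text{ for } t \ge d_t .
```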

1.2 Comparison to Existing Approaches

Traditionally, if an application requires service time assurances, there are three approaches: best-effort systems, hard real-time systems, and QoS (Quality of Service) soft real-time systems. Best-effort systems rely on adequate resources always being available for a task set, and can make no guarantees when resources are even temporarily overloaded. Hard real-time systems require


that the application provide resource bounds to ensure margin (e.g. release worst-case execution time and period), so that scheduling feasibility can be mathematically assured for the worst case; task sets are only admitted when completion for all tasks can be guaranteed by hard deadlines in all circumstances. Typically, QoS systems provide resource reservation or specification of levels of service such that each task within the system set will have a performance bound; however, it is not clear how an abstract level or reservation translates into deadline reliability. Work has been completed to provide improved translations between service levels and reliability [BraNu98], but the RT EPA is the only framework that provides a direct probability-theoretic relationship between execution model confidence and deadline reliability.

The goal of directly associating model confidence and deadline reliability is a traditional engineering approach. For example, a solid mechanics engineer is able to test material used as members in a system and obtain a confidence in stress/strain performance which is then ultimately translated into overall structural system reliability. The mapping between QoS levels and actual performance is weak since it is not linked in any way to actual variances in the system, but rather requires the service negotiator to estimate needs up front – this is much like estimating the capital needed to complete a construction project. In such QoS systems resources are reserved and protected from misuse [MerSav94], but the allocation of resources requires high predictability in service demands. Reserving more resources is always beneficial, but clearly not always possible, so what is not clear about QoS is how to translate real-time system performance requirements into resource requests. Most QoS systems have addressed this problem by providing for on-line re-negotiation of service levels [BraNu98], [JonRos97], [NuBra99]. Iterative request methods are a possibility, and have been investigated, but having a good estimate up front based on a mathematical model of loading, including execution time variance and release period variance, would give a better idea of worst-case and average-case resource needs and provide good bounds for such negotiation. Unless the underlying QoS levels intuitively map mathematical execution models into levels of service, this negotiation will not be straightforward – this is the case for existing QoS methods. The RT EPA confidence-based models, provided initially and continually refined on-line, offer a simple method for refining resource allocations and renegotiating service levels which is concrete by comparison (grounded in probability theory).

The RT EPA sets up processing pipelines using trusted modules. This is based upon the current best practice available to balance safety with the need for efficiency and resource control. A number of efforts have been made in previous research to construct real-time in-kernel pipeline processing frameworks [Gov91], [Co94], [Fal94], [MosPet96]. In all cases these frameworks either employed an EDF service admission policy or, in the case of Fall's work, best effort scheduling. EDF is a fully dynamic priority assignment policy, which makes it difficult to prove in advance that all services will meet desired deadlines. By comparison, the RT EPA employs an extension to Deadline Monotonic (DM) scheduling, which has been proven to be a safe and optimal policy for hard real-time systems [Au93].
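
For reference, one common statement of the DM feasibility test of Audsley et al. [Au93] that the RT EPA extends is the iterative response-time calculation sketched below; the notation is conventional (C_i worst-case execution time, T_i period, D_i relative deadline, hp(i) the set of higher priority threads) and is a restatement of the standard test rather than the CBDM formulation, in which fixed worst-case execution times are replaced by execution times known to a specified confidence.

    \[
    R_i^{(0)} = C_i, \qquad
    R_i^{(k+1)} = C_i + \sum_{j \in hp(i)} \left\lceil \frac{R_i^{(k)}}{T_j} \right\rceil C_j ,
    \]

iterated until R_i^{(k+1)} = R_i^{(k)}; thread i is admitted if the converged response time satisfies R_i <= D_i <= T_i.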
Govindan’s in-kernel pipe work was the earliest; it functioned by mapping devices directly into application-level memory, essentially defeating memory protection for pipelined applications in this environment. Since Govindan’s work, the concept of


trusted kernel modules has become the accepted way to preserve memory protection for applications while still providing greater efficiency and control of the resources needed by particular threads/services in a real-time system. A trusted module is dynamically linked into the kernel address space with a privileged command to load the object code. This concept of loadable modules is currently supported in Wind River Systems VxWorks 6.x, Sun Microsystems Solaris 2.x, and in the Linux 2.4.x operating system. All three of these popular operating systems also include memory protection domains for applications in addition to the trusted modules, such that the RT EPA can provide a system call interface for non-real-time control of services, additional best effort applications, and off-line initialization. Work subsequent to Govindan has made use of trusted kernel modules rather than opening up a hole into the kernel address space for all applications. The efficacy of this approach has been investigated in detail in the SPIN operating system [Be95], and it is widely accepted as a safe compromise for services that need improved efficiency and resource control while maintaining overall system safety for applications. Based on this related research history on in-kernel pipeline frameworks, the RT EPA employs the trusted module mechanism and policy.
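
As a concrete illustration of the loadable-module mechanism mentioned above, the following is a minimal sketch of a dynamically loadable kernel module in the Linux 2.4 style; it is not the RT EPA's module interface, and a real pipeline stage would additionally register its source, processing, and sink entry points with the framework.

    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/init.h>

    /* Called when the module is linked into the kernel address space
     * (e.g. by insmod, a privileged command). */
    static int __init stage_init(void)
    {
        printk(KERN_INFO "pipeline stage: loaded into kernel address space\n");
        return 0;
    }

    /* Called when the module is unloaded. */
    static void __exit stage_exit(void)
    {
        printk(KERN_INFO "pipeline stage: unloaded\n");
    }

    module_init(stage_init);
    module_exit(stage_exit);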

1.3 Problem Statement

Ideally, a programmer should be able to take a set of real-time performance requirements and map them into service requests directly. For example, consider the characteristics and real-time requirements for command and control of a remote digital video camera as summarized in Table 1. This example is a typical real-time digital system and includes three important real-time application types: 1) continuous media (e.g. video or audio), 2) digital control, and 3) event-driven automation (e.g. fault protection) [Si96]. Furthermore, this example application includes all three types of real-time processing in a mixed service application – hard real-time, soft real-time, and best-effort – which is also typical of many emerging real-time applications. A survey of real-time application types is provided in Section 2.

It should be noted that variances in execution and release frequency are also typical in this example. For example, execution variance can stem from complex algorithms (e.g. data-driven compression algorithms) and complex yet overall very efficient microprocessor architectures (e.g. pipelined super-scalar with L1/L2 cache). The use of real-time systems for applications with complex algorithms and microprocessors still requires traditional hard real-time control functions in addition to softer real-time functions when the application provides a mixture of services (as is true in this example). The largest execution variance in the Table 1 example comes from the video acquisition and compression, typical of memory reference streams that require a cache larger than the L1/L2 cache. This scenario is typical of video compression processing since data-driven compression algorithms and buffer copies normally produce high cache miss rates and therefore pipeline stalls (this is discussed in more detail in Section 2). What is most notable is that less than guaranteed reliability is also acceptable on two of the largest loads. This is because from a user viewpoint, it is often


acceptable to drop a video frame occasionally at the 20 frames/sec rate and yet still have acceptable quality. The deadline reliability for frame processing in this example is that, on average, there should be less than one frame dropout every 5 seconds, and the probability of two frames dropping out in succession should be less than 1/10000. While the frame processing has the most significant execution variance, the service interval on fault safing varies the most in this example. Let us assume that the limits violation monitor in this example contains a consecutive out-of-limits count that must be exceeded before a safing request is made (this may vary from 2 to 20 in this example) – only one safing request is made for multiple violations. Furthermore, the reliability on telemetry is such that there will be no more than one telemetry dropout every 5 seconds on average, and the probability of two dropouts in a row is less than 1/400.

These are typical engineering specifications for any type of system, but this is typically not the view taken in hard real-time scheduling, nor with QoS scheduling, which does not allow for specification of the reliability in meeting deadlines with a probability. This example is intended to illustrate the problems with translating real-time system requirements into traditional hard real-time system implementations or into QoS service level specifications. The example is not complete, but it highlights the difficulty with which system requirements must be realized in terms of RMA or QoS frameworks. The difficulty of realizing systems like this example using traditional RMA is well recognized by publications which provide guides to applying real-time theory [BriRoy99], [Laplante93]. Despite excellent references for making the mapping between requirements and the implementation, the task is arduous and soft requirements are still not addressed. The QoS frameworks have not evolved enough to provide engineering guides like RMA, and really only provide examples of how a particular QoS framework was used to implement an example application [BraNu98], [MerSav94], [NiLam96]. Loïc Briand [BriRoy99] states, “This book is the result of years of accumulated frustration” in reference to his book written with Daniel Roy to provide a prescriptive guide on how to apply RMA to design a system to meet real-time requirements.

The intent of the RT EPA framework is to provide a translation from real-time system requirements to code which is as direct as possible. Specification of service interval and deadline are required for QoS, RMA, and the RT EPA, but the RT EPA is the only framework which provides for direct specification of service execution time confidence and desired reliability in each service meeting its deadline. The goal of the RT EPA is therefore to bypass prescriptions for mapping requirements to theory and then to implementation, by providing a direct mapping between requirements and implementation through theory encapsulated in the framework rather than in the application.
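
These dropout requirements translate directly into per-release deadline reliabilities; the arithmetic below is a simple illustration that assumes deadline misses on successive releases are independent.

    At 20 frames/sec there are 100 frame releases every 5 seconds, so
    \[
    p_{miss} < \tfrac{1}{100} = 0.01 \;\Rightarrow\; \text{deadline reliability} > 99\%,
    \qquad
    p_{miss}^2 < \tfrac{1}{10000} \;\Rightarrow\; p_{miss} < 0.01 .
    \]
    At a 250 msec telemetry interval there are 20 releases every 5 seconds, so
    \[
    p_{miss} < \tfrac{1}{20} = 0.05 \;\Rightarrow\; \text{deadline reliability} > 95\%,
    \qquad
    p_{miss}^2 < \tfrac{1}{400} \;\Rightarrow\; p_{miss} < 0.05 .
    \]

These values correspond to the 99% and 95% deadline reliabilities listed for the video and telemetry services in Table 1.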


Table 1: Mixed Best Effort, Soft, and Hard Real-Time Application Example

Service                                          | Service Interval (msec) | Execution Variance (msec) | Deadline (msec) | Worst-case Utility | Best-case Utility | Deadline Reliability
Camera platform stability and position control   | 50                      | 5 +/- 1                   | 50              | 0.12               | 0.08              | 100% (hard)
Video source acquisition and frame compression   | 50                      | 20 +/- 10                 | 50              | 0.6                | 0.2               | 99% (soft)
Camera fault detection                           | 250                     | 12.5 +/- 2.5              | 100             | 0.06               | 0.04              | 100% (hard)
Camera state telemetry acquisition and transport | 250                     | 20 +/- 5                  | 250             | 0.10               | 0.06              | 95% (soft)
Camera fault safing                              | 500 to 5000             | 10 +/- 5                  | 1000            | 0.03               | 0.001             | 100% (hard)
Camera command processing                        | 500                     | 10 +/- 2                  | 500             | 0.024              | 0.016             | 100% (hard)
Memory scrubbing                                 | 12000                   | 500 +/- 100               | 12000           | 0.05               | 0.033             | best-effort
TOTAL                                            |                         |                           |                 | 0.984              | 0.43              |

The example in Table 1 cannot be scheduled according to standard RMA and DM admission tests, and is an example of a marginal task set. So, the implementation choices with current real-time systems technology are to schedule this system according to RMA priority policy despite not clearly meeting admission criteria, schedule the system using best effort, or schedule the system using QoS levels based on estimates of the resources that should be reserved for each service. None of these approaches allows an engineer to have confidence that the system can meet the required deadlines with the required reliabilities. Examples of requirements such as these and marginal task sets like this are the motivation behind the RT EPA.
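
To see why the task set is marginal, interpret the utility columns of Table 1 as CPU utilizations (execution time divided by service interval, which is consistent with the tabulated values) and apply the classic RMA sufficient utilization bound as a rough check; this is only illustrative, since the fault detection service has a deadline shorter than its period and therefore properly requires a DM-style test.

    \[
    U_{worst} = \sum_i \frac{C_i^{max}}{T_i} = 0.984
    \;>\; n\,(2^{1/n} - 1)\big|_{n=7} \approx 0.729,
    \qquad
    U_{best} = 0.43 \;<\; 0.729 .
    \]

The worst-case load fails the sufficient bound and approaches full utilization, while the best-case load passes comfortably; the actual load lies somewhere in between, which is precisely the gap the confidence-based admission test is intended to quantify.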

1.4 Proposed Solution

This thesis describes an interface and scheduler that provide an on-line intelligent service negotiation and execution control mechanism called the Real-Time Execution Performance Agent (RT EPA). The RT EPA ensures execution of algorithms based on required reliability and confidence in meeting deadlines rather than on priorities or an abstract derivative thereof (such as QoS levels). The basis for the RT EPA is confidence-based scheduling, an extension to DM scheduling employing stochastic execution models. Furthermore, the EPA provides predictable and safe execution of hard real-time safety critical tasks in addition to predictable execution of soft


real-time tasks. Hard real-time tasks are still scheduled safely since all tasks are protected from each other in terms of interference, and each task has quantifiable assurances of processor resource availability – i.e. it is possible to predict the order in which tasks will fail in an overload situation. Hard real-time tasks are protected from soft real-time tasks, which may occasionally overrun deadlines, through strict enforcement of termination deadlines for all tasks (this strictly limits maximum interference). By analogy, the EPA provides a balancing capability much like the everyday ability people have to walk without tripping (hard real-time) while chewing gum (soft real-time) and contemplating how to build a better career (best effort). Perhaps a less safe example is the emerging habit of talking on a cell phone while driving – the critical task of driving must be fire-walled from the interfering service demands of the phone for safe use while driving (e.g. no dialing except at stoplights, and use of a hands-free microphone and speaker). The RT EPA performs this fire-walling by executing tasks in specific execution reliability and confidence spaces and monitoring actual execution times to determine when resources must be adjusted because execution variances exceed the originally negotiated requests for service.

Since continuous media and digital control applications (two of the three main application types) both include pipeline processing of inputs to outputs (a pipeline is a logical stream with input, filtering stages, and output), the RT EPA has also been designed to make the construction of such real-time pipelines, along with more event-driven task releases, a process in which timing and data flow can be specified together. The specification of real-time data-flow processing has been the subject of research on in-kernel pipelines, which have also been implemented in conjunction with QoS scheduling [Gov91], [Co94], [Fal94], [MosPet96]. A more recent example of continuing work on in-kernel pipelines is Microsoft DirectX, which, along with the rich history of pipeline research, clearly establishes the importance of pipelines to efficient and reliable real-time streaming applications with device sources and sinks [McCart00]. While pipelined real-time services are typical of continuous media and digital control (two of the most prevalent real-time applications – see Section 2.5), the RT EPA mechanism is intended to provide time-critical pipelined and non-pipelined applications with quantifiable assurance of system response using a simple extension to the DM scheduling algorithm [Au93]. In addition, the RT EPA provides an admission interface and execution control which allow applications to monitor and control real-time performance of tasks and processing pipelines on-line. So, the RT EPA significantly extends existing work on in-kernel pipelines as well as QoS soft real-time scheduling in general through the confidence-based scheduling admission test and the on-line monitoring and control provided by the RT EPA. Most importantly, given the RT EPA interface, a developer who needs to manage a mixture of hard and soft real-time now has a framework for developing applications with quantifiable reliability, execution-model failure fire-walls, negotiable service, and on-line monitoring.
This insight into execution performance, together with intuitive, quantifiable service negotiation in terms of the probability of meeting or missing deadlines, not only makes the job of implementing such systems easier, but also allows a programmer to implement applications with a level of reliability that no other existing method provides.
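
To make the style of interface concrete, the sketch below shows what a per-service request to an RT EPA-like admission call might contain; all names and fields here are hypothetical illustrations derived from the parameters discussed in this chapter (period, soft and termination deadlines, execution model confidence, required deadline reliability, and overrun policy), not the actual RT EPA programming interface.

    /* Hypothetical per-service request; names and fields are illustrative only. */
    typedef enum {
        OVERRUN_DROPOUT,        /* terminate the release, restart next period   */
        OVERRUN_SOFT_CALLBACK,  /* let a soft overrun proceed, notify the app   */
        OVERRUN_DISMISS,        /* remove the service, reconfiguration callback */
        OVERRUN_SYSTEM_SAFE     /* hard failure: safe the whole system          */
    } overrun_policy_t;

    typedef struct {
        unsigned int period_usec;          /* release interval                  */
        unsigned int soft_deadline_usec;   /* diminishing-utility boundary      */
        unsigned int term_deadline_usec;   /* termination deadline (<= period)  */
        unsigned int expected_exec_usec;   /* expected execution time           */
        double       exec_confidence;      /* e.g. 0.99 confidence on the bound */
        double       deadline_reliability; /* required, e.g. 0.95, 1.0 for hard */
        overrun_policy_t on_overrun;
    } service_request_t;

    /* A hypothetical admission call would accept a service_request_t and
     * either admit the thread (returning a service handle) or reject it
     * when the confidence-based test cannot be satisfied. */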


1.5 Evaluation

The RT EPA mechanism has been implemented as an extension to the Wind River Systems VxWorks micro-kernel [WRS97]. It has been tested with applications including: 1) pseudo task loads, 2) monitoring capabilities on the SIRTF/MIPS instrument incorporating continuous media and digital control pipelines, 3) a 5 degree-of-freedom (DOF) robotic arm, 4) a video acquisition and compression pipeline, and 5) an optical navigation test-bed incorporating real-time video processing and digital control. The RT EPA has demonstrated the ability to increase the reliability and predictability of systems that require both hard real-time execution control and flexible soft real-time processing. The experiments demonstrate the viability of the confidence-based scheduling formulation and the RT EPA implementation for negotiated, monitored, on-line quantifiable service assurance.

Three of the test-beds (pseudo loading, 5 DOF robot, and video compression pipeline) are applications designed to evaluate the RT EPA and serve no other purpose. Since it could be argued that such applications are “toy examples” and are not indicative of realistic applications, portions of the RT EPA were evaluated with the SIRTF/MIPS (Space-based Infrared Telescope Facility / Multi-band Imaging Photometer for SIRTF) instrument. The code for this application includes over 45 thousand lines of C code in addition to the VxWorks Real-Time Operating System (RTOS). The SIRTF/MIPS instrument has three real-time digital detectors with hard real-time continuous media processing pipeline deadlines, a complicated video compression method, closed loop digital thermal control, a PowerPC pipelined microprocessor with L1 cache, many soft real-time requirements such as command handling and telemetry processing, as well as best effort tasks such as memory scrubbing. Furthermore, SIRTF/MIPS is an example of a mixed application with continuous media (three concurrent video sources), digital control (thermal control), event-driven quasi-periodic tasks (command and telemetry by request), and totally non-real-time maintenance processing (memory scrubbing).

While the MIPS instrument is a unique application, many similar examples exist for more commercial applications – virtual reality, multimedia, and in-situ interactive real-time applications such as vehicle navigation systems, digital flight management systems for aircraft, and satellite-based internet and digital media services. These types of applications, similar to SIRTF/MIPS, all include complex data- and event-driven algorithms, mixed service requirements, and complex high performance microprocessor architectures. Finally, during development, the SIRTF/MIPS software was timing out on missed deadlines with RMA priorities; the RT EPA was incorporated into the system, enabling it to meet all deadlines within existing specifications. Without the use of the RT EPA monitoring capabilities the SIRTF/MIPS instrument was not able to operate correctly.

1.6 Summary of Research Results

The thesis presents three basic types of major results: 1) theoretical formulation, 2) prototype software, and 3) software/hardware test-beds to demonstrate goals for the framework and validate the theory.


1.6.1 Theoretical Results

The theoretical results include three major new real-time theories:
1) An engineering view of real-time scheduling that inherently includes mixed hard and soft services with quantifiable reliability and confidence in the system and specification of desired service performance (compared to abstract levels of service).
2) Confidence-based thread admission and monitoring reducing RMA pessimism in execution time bounds and how they translate into the ability to respond by deadlines.
3) Evidence that thread sets can be admitted to multiple on-line epochs with priority changes between epochs, but fixed within an epoch.

The confidence-based scheduling admission test is derived by modifying the DM admission test to include execution confidence intervals and deadline confidence. After multi-epoch theory is presented by example, the multi-epoch theory is shown to approach the EDF ideal upper bound of full utility.

1.6.2 Framework Prototype Implementation

The RT EPA prototype framework provides:
1) On-line confidence-based deadline monotonic (CBDM) admission of threads.
2) On-line monitoring of thread/service performance in terms of missed deadlines, reliability, and a confidence model.
3) A periodic server for performance monitoring and service re-negotiation.
4) Pipelining of data processing between source and sink interfaces to enable distribution of processing load over time and high level sequencing and control of processing stages.

1.6.3 Proof-of-Concept Test Results

Three applications were tested with the RT EPA prototype:
1) Pseudo software loads (demonstrating basic features and characteristics of CBDM and negotiation/re-negotiation).
2) Use of the on-line kernel monitoring in the RT EPA to identify service epochs and apply this theory to solve a real-world scheduling problem on a space-based telescope.
3) Use of the full RT EPA to demonstrate all research goals with an optical navigation test-bed including multiple RT EPA pipelines for digital video processing and control of an air-powered vehicle.

1.7 Significance

The significance of confidence-based scheduling and the RT EPA is that this approach provides a reliable and quantifiable performance framework for mixed hard and soft real-time applications. This thesis provides a detailed explanation of the RT EPA and confidence-based scheduling concepts, a comparison to other soft real-time scheduling methods, the theoretical background of confidence-based scheduling, the mathematical formulation for confidence-based scheduling, an example RT EPA implementation and results which validate the usefulness of the framework. The set of applications requiring this type of performance negotiation support from an


operating system is increasing with the emergence of virtual reality environments [Nu95], continuous media [Co94], multimedia [Ste95], digital control, and “shared-control” automation [Bru93], [SiNu96]. Furthermore, in addition to providing scheduling for mixed hard and soft services, the RT EPA facility allows an application developer to construct a set of real-time kernel modules that manage an input (source) device; apply sequential processing to the input stream (pipeline stages); control individual processing stage behavior through parameters obtained from a user-space application; provide performance feedback to the controlling application; and manage the output (sink) device. This type of pipeline construction, in combination with the RT EPA on-line admission testing based on requested deadline reliability and on-line performance monitoring, makes implementation of typical continuous media, digital control, and event-driven real-time systems much simpler than with hard real-time, QoS, or best-effort approaches. The RT EPA, with the CBDM and ME formulations for scheduling, provides a powerful framework for a broad set of applications requiring timely services with measurable quality in the ability to meet deadlines.


2 Problem Statement

Real-time software systems must be functionally correct and must produce correct results by deadlines relative to the occurrence of a stream of events (in the context of this thesis, the occurrence of events is referred to as “the event-based release of a thread of execution”). Put more idiomatically, if real-time software produces a functionally/mathematically correct result that is inconsistent with the event occurrences, then it is incorrect; and conversely, if such a system produces a functionally/mathematically incorrect result on time, then it is also wrong. To be correct, the application must produce a correct answer at the right time. Usually producing the correct answer before a deadline is sufficient, but in the case of isochronal applications the result must be produced at only one particular time; that is, the result must not be produced too early or too late, but on time.

Unfortunately, many complex real-world systems such as the SIRTF/MIPS video processing application and the RACE application studied in this thesis experience very occasional jitter in response, such that the end result is a timing fault which is exhibited somewhat rarely, but frequently enough to be a problem. The response jitter can ultimately be shown to stem from execution jitter, input/output jitter, or both. In the case of the RACE results presented in Section 8.4, execution jitter was seen for the frame display task that was as high as 7.7% of the average execution time on a Pentium processor with L1/L2 cache (Figure 35). Worse yet, the SIRTF/MIPS video processing application experienced execution jitter in video frame slice compression validation processing of 55% as measured by the SIRTF/MIPS RT EPA monitor (Table 16). This 55% jitter rarely happened, but regularly caused timeouts when it did – the result was apparently random timeout faults. Full results for both applications are presented in Section 8, but the point is that the variance can cause very occasional timing faults and requires either very pessimistic assumptions of resource demands (high margin) or on-line monitoring and control for occasional glitches, as is provided by the RT EPA.

The problem of processing events in real-time can be categorized according to the system boundary (i.e. elements that are controlled parts of the system under construction rather than the environment in which the system must operate). By this definition we have only two domains to consider – the system and the environment. Furthermore, we will be applying an admission test which will ensure that there is sufficient CPU resource to provide the throughput and turnaround required for the system, the most basic timing issues. So, what is of most significant concern is how variances in the environment and the system itself will affect performance. Ideally, the environment would be modeled as an event source which initiates service releases in a purely periodic manner with no jitter or latency in this source. Likewise, the system would ideally be modeled as responding to the periodic demands by releasing services (threads of execution) to handle the event and produce outputs with no jitter in the response. In the case of the environment the latency would ideally be a constant determined for the environment by the physics of the detector (e.g. speed of light, sound, electrical transmission and sensing of physical phenomena).
In the case of the system the latency of response would be purely determined by a deterministic worst-case dispatch and execution time on the CPU given all demands upon this shared resource. For numerous reasons to be discussed in this section, both the environment and system are not


ideal. Therefore, given the system/environment boundary model proposed here, the two categories of timing issues that must be considered are: 1) environmental latency and variance and 2) system latency and variance. The first category is addressed to some extent in Section 4 of this thesis, but a comprehensive treatment is not provided because modeling of the environment is beyond the scope of this thesis. Some research into variance in event rates due to sensors and detection was completed [SiNu96]. System timing latency and variance is the category of principal interest in this thesis because it is a more tractable problem and because most systems are deployed in controlled environments which require a model specific to that environment. The problem of deploying systems in uncontrolled environments is ultimately not tractable (it requires an accurate world model), and many excellent sources for modeling environmental event rates for more well-defined and somewhat controlled environments already exist [Tin93], [Tin94], [Tom87], [Sprunt88], [Sprunt89], [Kl94]. However, it should be noted that the operational environment for the system must be reasonably well modeled in order to make good use of the RT EPA framework. The RT EPA does address the issue of environmental variance in a limited scope through the service epochs presented in Section 4; however, this assumes that a good model of environmental event rates and modes exists – as already noted, deriving an event rate and mode model goes beyond the scope of this investigation and the capabilities of the RT EPA. Finally, a number of real-time system application domains are examined with respect to the impact of the environmental and system variance characteristic of each domain.

2.1 System Timing Issues

Ultimately system latency can be shown to be the sum of the latencies for the following: 1) release, 2) dispatch/preempt, 3) input, 4) execution, and 5) output, as is evident in Figure 3 and well noted [BriRoy99]. Likewise, there are four types of system variance (or jitter) that are of interest with respect to the ability to meet deadlines relative to events: 1) release jitter, 2) dispatch/preempt jitter, 3) execution jitter, and 4) I/O jitter. These variances have been noted in related research as well as in the work presented here [Ste95], [Fle95], [Bru93], [Tör95]. As already noted in the problem statement boundary definition, given the system scope of this investigation, real-world events themselves are not considered to have significant latency or jitter from the system view alone. This is despite the fact that real-world events may be aperiodic (e.g. events generated based upon the whim of an operator, or environmental events that are hard to predict). Likewise, latency due to the speed of sound and light could actually be considered, but this goes beyond the boundaries of the system as defined here. Typically an environmental model can be formed so that real-world events are assumed to have a maximum event inter-arrival rate for a particular mode of service, leading to the concept of service epochs discussed in Section 4.

In this thesis we consider all deadlines to be relative to the occurrence of events in the real world once they have already been detected by the system (e.g. a sensor/transducer has transformed a physical event into an electrical assertion interrupting the CPU). Ultimately all that really matters is the response latency and jitter which result end-to-end from the time of the event-related interrupt assertion to the time of the system response output assertion. Simply stated, each event release must have a response before a deadline relative to that event release, and the ability to meet the relative deadline will be determined by the sum of the latencies. Likewise, the ability to


reliably meet the deadline on a periodic basis will be determined by the end-to-end jitter.

Typically latency is a matter of resource capabilities such as bus bandwidth, CPU speed, CPI (Clocks Per Instruction), and network bandwidth. In addition, availability of the resource is also an issue and contributes to latency and jitter due to interference and the overhead time to preempt lower priority users of the resource. Jitter, however, is much more complicated to assess since it is due to several variable factors including contention for resources (interference), system hazards (e.g. cache misses and pipeline stalls), and system inaccuracy (e.g. clock jitter and device interrupt assertion jitter). By far the most significant jitter is due to competition for resources, which is the fundamental consideration in hard real-time scheduling [LiuLay73] (the two best examples are network/bus contention and CPU interference). The secondary source of jitter is typically due to system hazards and, in general, the contribution due to system inaccuracies should be minor unless the system has hardware deficiencies for real-time application. Examining each source of system timing variance in detail, there are multiple sources of latency and jitter in each of the four phases identified here.

1) Release variance – Due to I/O contention (bus grant) and CPU interference (higher priority threads), the time from when an external event occurs to when the servicing thread becomes ready to run is variable – this is the time between the external event and when the initial release is made (i.e. the service is granted the CPU resource for the first time). Furthermore, over the time from initial release to response completion there may be interference and contention that will require preemption and re-dispatch. For example, a higher priority thread may very likely become ready to run during a release that is in progress or, in the case of I/O contention, a bus grant may be revoked due to the bus grant arbitration scheme.

2) Dispatch/Preempt variance – The context switch time is dependent upon save/restore requirements and potentially kernel overhead for interrupt servicing, checking kernel events, and maintaining services, depending upon the event release interrupt priority relative to other interrupting sources (e.g. checking semaphore take lists, updating virtual clocks, and dealing with floating-point tasks).

3) Execution variance – Execution variance has two major sub-components: architectural and algorithmic sources of variance [BriRoy99].

a) Algorithm execution variance – This variance is due to non-uniform computational loading over time resulting from data driven applications. For example, content-based digital video compression like change-only digital pixel stream compression has execution time and I/O bandwidth needs that are proportional to the scene change rate. If more pixels are changing due to motion in the scene, then the compression algorithm will have to produce more change packets per frame – in the best case there is no scene change and nothing needs to be generated.


b) Architectural execution variance – This variance is due to the complexity of microprocessor architecture features, such as multi-level caches, which greatly increase average throughput but make individual release execution time difficult to predict. Examples of these features include pipelining, super-scalar ALUs, the L1/L2 cache memory hierarchy, and branch prediction [HePa90]. Hennessy and Patterson describe these features and the modeling of execution times in great detail.

4) Input/Output variance – This is variance in I/O due to the complexity of peripheral bus architectures and multiplexed access to these I/O interfaces by multiple threads of execution. Multiplexed access to buses requires mutual exclusion, which can cause priority inversion, and bus grant arbitration may also result in response output jitter due to buffering. These variances are distinguished from CPU architecture variances since they are purely I/O related. For example, a PCI bus controller may be programmed to establish the bus grant latency for each device through systems software. Scheduling the bus is beyond the scope of this thesis and therefore all research was completed on a system with a significantly under-utilized bus (maximum of 20% of available bandwidth). The RT EPA handles environmental timing variance through the use of multiple epochs (ME), as addressed in Section 4.

2.1.1 Release Variance Due to Contention and Interference

Release jitter is due to the possibility of variable phasing between interfering threads and the currently executing thread already released; between releases, a given service may encounter different levels of interference by higher priority threads prior to release. In some cases it is possible to reduce interference by synchronizing the services. Liu and Layland addressed this issue with the critical instant assumption, which means that they made no assumption about the phasing of threads. This is the safest assumption, but can be overly pessimistic. The synchronizing features of the RT EPA for pipelining can greatly reduce interference due to bad phasing.

2.1.2 Dispatch and Preemption Variance Due to System Overhead

A thread will potentially need to be involved in one or more context switches, depending on interference that might have added overhead to the response time. Since this overhead is a function of interference, it will experience jitter if the interference has jitter in addition to any variances in the context switch time itself. The isochronal feature of the RT EPA can be used to control end-to-end jitter in pipelines by providing buffer holds between stages or at the end of the pipeline.


2.1.3 Algorithm Execution Variance Due to Non-uniform Loading In a Single Release

With regard to release execution time, this thesis addresses the challenge of dealing with systems that have execution jitter, and therefore impose non-uniform loading over time, by introducing the concept of multi-epoch scheduling. The SIRTF/MIPS application provides an example of how real-time scheduling problems can be solved by analyzing the scheduling feasibility of task releases in two or more separate scheduling windows or epochs rather than assessing utilization over the longest single inter-release period as does RMA [LiuLay73]. Likewise, DM assesses scheduling feasibility against each thread's deadline (i.e. by iterative calculation of the interference from all threads of higher priority than the one under test). The problem of taking an existing singly threaded application and redistributing it into multiple threads is not addressed in this thesis because that is a source code optimization problem; however, the research completed here shows a distinct advantage to decomposing real-time systems into larger numbers of threads which each have smaller release execution times in order to enable multi-epoch solutions. The concept of analyzing loading over multiple co-existent scheduling epochs is clearly demonstrated by example and made possible the successful real-time scheduling of video processing on the SIRTF/MIPS instrument. The results of the multi-epoch solution are described in Section 8.3.

2.1.4 Architectural Execution Variance Due to Micro-parallelism and Memory Hierarchy

Modern microprocessor architecture features which make real-time execution difficult to predict include:

1) Pipelining, branch prediction and super-scalar instruction execution hazards can cause CPI variances. For a super-scalar pipelined system the ideal rate of execution will be less than 1 clock per instruction. When the pipeline stalls due to a hazard, the CPI may rise to a worst-case value where no micro-parallelism is employed and one clock is required for each stage of execution (fetch, decode, execute, write-back). So, due to hazards, the CPI at any time may easily vary between 0.5 and 4 on modern microprocessor architectures such as the PowerPC and Pentium processors. An average CPI may be assumed, of course, but the prediction of hazards is extremely difficult except in an average sense for a specific body of code [HePa90].

2) Memory hierarchy features contribute to execution variance. At the highest level, the L1/L2 set associative write-back and write-through caches will experience misses based upon the specific application memory reference stream. At the next level in the hierarchy, use of virtual memory will result in occasional page faults and significant delays during page swapping. It is not typical for embedded systems to use virtual memory for this reason, but the goal of the RT EPA is to handle these types of occasional variances in timing. On Unix systems, pages can be hardwired into memory for a real-time application.


3) I/O optimizations can lead to variances in timing. Direct memory access (DMA) transfers save CPU cycles, but the transfer and interrupt rate are driven by the bus mastering I/O device rather than the application on the CPU. Burst transfers typical of DMA across buses also impose long periods of bus utilization which may hold off single word requests for many bus cycles.

Pipelining a CPU with micro-parallelism essentially reduces the CPI to 1.0 (increasing efficiency) as long as the pipeline is not stalled by a hazard preventing the CPU control from safely continuing parallel instruction fetch, decode, execution, and register write-back on each clock cycle [HePa90]. Furthermore, super-scalar CPUs use micro-parallelism to provide CPIs of 0.5 or less by not only providing parallel execution of CPU cycle phases, but by providing parallel execution within each phase – i.e. multi-instruction fetch, multi-decode, multi-path execution, and multi-path write-back [HePa90]. There is no standard form of micro-parallelism, so the efficiencies gained by a particular pipelined super-scalar architecture depend upon the exact nature of the micro-parallelism and the hazards associated with execution of arbitrary instruction sequences. From a real-time execution perspective, the efficiencies gained must in essence be ignored, since a given instruction sequence could produce a high frequency of pipeline hazards and, in the worst case, a continuously stalled pipeline, increasing the CPI from 1.0 or less to a maximum of 4.0 or more depending upon the number of CPU cycle phases.

Complex memory hierarchies including L1/L2 caches speed up processing overall (and enable pipelined/super-scalar execution rates by providing cached memory references in a single clock), but at the cost of predictability of execution time for a given instruction sequence [HePa90]. The execution variance introduced by caches is due to the complexity of predicting cache hit/miss rates for a given thread release (i.e. it is difficult to predict a memory reference trace in advance for a reasonably complex algorithm) [HePa90]. Cache hit/miss rates can be predicted satisfactorily for trivial algorithms (such as copying a block of bytes from one location in memory to another), but many algorithms are more complex and there are also context switch interactions that are hard to predict. Examples of more complex, data-driven algorithms include data compression, image processing, searching, sorting, root-solving, variable-step-size integration, and matrix inversion, all of which are far more variable in complexity and memory reference streams. Cache misses require main memory fetches that are much slower than cached hits. Furthermore, a cache miss stalls the processor pipeline until the required data or instruction is fetched. Likewise, if the memory hierarchy includes virtual memory to provide working space greater than main memory, then a page fault will greatly increase execution time while pages are swapped from disk to main memory. (Usually virtual memory is considered unacceptable for real-time applications given the huge difference in access time – potentially from nanoseconds to milliseconds.) From the traditional RM/DM real-time perspective, the efficiencies gained by caching must be ignored, and instead it is assumed that all execution times will be worst case.
So, every memory access must be considered a cache miss requiring main memory access times and associated processor pipeline stalling – or, a super-scalar architecture with a CPI of 0.5 must be de-rated to a CPI of 4 or more. Likewise, the flexibility afforded by virtual memory is normally disabled in real-time systems either by omission as is the


case with VxWorks, or by wiring pages used by real-time threads of execution so that they can never be swapped out to disk (an option on Unix systems such as Linux). Pipeline hazard reduction decreases overall CPU pipeline stall frequency and thereby increases overall efficiency. However, once again, these features cannot be relied upon in a real-time system since methods such as branch prediction are based purely on probability and therefore cannot provide guaranteed pipeline stall control – i.e. branch prediction improves overall efficiency, but does not improve execution time predictability.
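
As a simple quantitative illustration of the CPI effect just described (the instruction count and clock rate are hypothetical, chosen only to show the scale of the variance), consider a release of 10^6 instructions on an assumed 400 MHz processor:

    \[
    t_{exec} = \frac{N_{instr} \cdot CPI}{f_{clk}}:
    \qquad
    \frac{10^6 \times 0.5}{400\ \text{MHz}} = 1.25\ \text{msec}
    \quad \text{versus} \quad
    \frac{10^6 \times 4.0}{400\ \text{MHz}} = 10\ \text{msec},
    \]

an eight-fold variation in execution time for the same instruction sequence, depending only on how often the pipeline stalls and the caches miss.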

2.1.5 Input/Output Variance Due to Shared Resource Contention and Transfer Modes

Optimizations for efficient microprocessor I/O, such as DMA transfers, increase overall I/O bandwidth and reduce CPU cycle-stealing, but the increases in bandwidth and the decreases in cycle-stealing are hard to guarantee. Likewise, buses are a shared resource, and shared bus access introduces unpredictability of I/O resource contention between multiple threads of execution. If the bus is underutilized, then the I/O effects may be negligible; but a multi-threaded application which requires multi-threaded bus access needs to access the shared resource with mutual exclusion. The priority-inversion problems are well-known, and while there has been progress in minimizing the possibility and duration of inversions, there is no way to avoid the problem completely [ShaRaj90]. The priority inheritance protocol prevents unbounded inversions, but can lead to chaining of temporary priority amplification, which is a problem for hard and soft real-time systems. The potential for chaining can be limited by setting a priority amplification ceiling according to either the priority ceiling protocol [ShaRaj90] or the highest locker protocol [Klein93]. In either case this is still not a complete solution, since a complex system may not be able to be analyzed to guarantee that the ceiling is sufficient. The full implications of I/O resource contention are beyond the scope of this thesis since the focus here is CPU utilization and contention.
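
As an illustration of how bounded priority inversion is typically obtained in practice, the sketch below guards a shared bus controller with a VxWorks mutual-exclusion semaphore created with the priority inheritance option; this is a generic example of the protocol discussed above, not the RT EPA's own I/O mechanism, and the device programming step is only a placeholder.

    #include <vxWorks.h>
    #include <semLib.h>

    static SEM_ID busMutex;

    void busMutexInit(void)
    {
        /* Mutual-exclusion semaphore: priority-ordered queueing with
         * priority inheritance to bound (but not eliminate) inversions. */
        busMutex = semMCreate(SEM_Q_PRIORITY | SEM_INVERSION_SAFE);
    }

    void busDeviceProgram(void)
    {
        semTake(busMutex, WAIT_FOREVER);
        /* Critical section: program the shared bus controller here.
         * A ceiling or highest-locker protocol would further limit the
         * chaining of temporary priority amplification noted above. */
        semGive(busMutex);
    }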

2.1.6 System End-to-End Latency and Jitter

Ultimately each contribution to latency and jitter leads to an overall system latency and jitter. Namely, the latency from event release to output response is the response time, which must be less than the relative deadline. Furthermore, the jitter in each phase of the response leads to overall response jitter. Figure 3 depicts response latency with the minimal sum of the various jitter components as well as with the maximal summed jitter. Any given response will have an overall latency within this bound.
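
In the notation of this section (the symbol names below are illustrative, and the decomposition simply restates Figure 3), the end-to-end requirement for each release can be summarized as:

    \[
    T_{response} = L_{event} + L_{release} + I_{interference} + L_{dispatch}
                 + L_{input} + C_{exec} + L_{output} \;\le\; D ,
    \]

where each term carries its own jitter, so the response jitter is bounded by the sum of the component jitters and the admission test must hold for the largest value of each term that is covered with the required confidence.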


Figure 3: End-to-end Jitter from Event Release to Response (two timelines from a real-world event to real-world actuation, showing event latency, release interference, dispatch, preemption, execution interference, execution time, and output latency; the difference between the minimal and maximal timelines is the response jitter)

2.2 Environmental Event Rate Variance Due to Nature of Events and Modeling Difficulty

Event sources can be classified as follows:
1) Aperiodic – Events having no predictable release characteristics (e.g. system faults).
2) Quasi-periodic or bursty – When a bursty source is active, it tends to be periodic, but when it will become active is difficult to predict (e.g. user interaction with a virtual world object through a data glove).
3) Periodic – Events which are completely predictable in terms of inter-release frequency (e.g. video frame digitization rate through a multiplexed A/D converter).

Traditional hard real-time scheduling policies, including rate monotonic (RM) and deadline monotonic (DM), require periodicity. Therefore, aperiodic sources are modeled as periodic by determining a maximum worst-case release frequency for them and assuming this frequency as their period for admission tests. For a rare event source that is high frequency when it does emerge, significant CPU is wasted reserving resources that are normally not needed. The RT EPA addresses this problem by providing an interface for on-line admission and reconfiguration to support occasional modes which can be negotiated on-line by the controlling application – for example, a burst of faults can trigger a real-time safing mode. It is quite possible to leave RT EPA tasks memory resident (i.e. fully allocated, but inactive) so that transitions between such modes simply require executing activation and deactivation sequences. In the SIRTF/MIPS application this concept of multiple execution modes was extended such that admission and execution control is provided in two distinct modes on-line, with dynamic priority assignment between executions in a given mode (an epoch).


2.3 Characteristics of emerging real-time applications

In "Dynamically Negotiated Resource Management for Data Intensive Application Suites" [Nu97], it is noted that emergent real-time applications require a range of execution performances from hard real-time to soft real-time, with the degenerate case being best effort. Ideally a programming framework for CPU scheduling should provide a simple interface for specifying the full range of desired execution performance in a real-time system. Currently, there are three types of CPU scheduling frameworks: 1) Hard real-time priority preemptive, 2) Soft real-time quality of service, and 3) Non-real-time. The reason that a better framework for a range of required realtime execution performance would be beneficial is because in addition to traditional hard real-time application domains such as digital control and continuous media (e.g. digital video and audio) there are many emerging systems which have soft real-time requirements (e.g. multi-media entertainment systems) and even more interesting, mixed domains (e.g. virtual reality). These emerging application domains have created a need to fill the gap between best effort scheduling and RMA or DM hard real-time priority preemptive scheduling with an approach that is more reliability oriented than QoS methods. Furthermore, due to more user interaction in mixed applications such as virtual reality, there is a need for more flexibility and more dynamic mixed hard/soft real-time systems which provide an interface for reconfiguration, negotiation for service, and on-line monitoring. In the following Sections (2.6 – 2.8), the traditional domains are reviewed with respect to their real-time characteristics and the case is made for the need for more configuration control on-line by analyzing the characteristics of emerging domains (2.9-2.11).

2.3.1 Loading characteristics of purely continuous media

Continuous media applications must be isochronal end-to-end so that the output data is neither too early nor too late – either case causes output jitter and will result in poor end-user media quality. Continuous media such as digital video and audio are not a new application domain, but the popularity and importance of these applications has increased due to multi-media applications and the proliferation of digital internetworking. The loading characteristics of these systems are periodic since they are driven by frame rates determined by human-computer interaction principles (e.g. motion picture frame rates). What is interesting about continuous media applications is that, due to typically high bandwidth requirements for networked applications, most video systems include compression pipelines. So, with video pipelines, there can be a large amount of execution jitter stemming from data-driven compression for network transport and tradeoffs to maximize performance so that the overall application is neither CPU nor I/O bound (as depicted in Figure 4). The execution jitter due to data-driven compression algorithms is also exacerbated by jitter due to the high cache miss rates associated with large frame buffer manipulations. A worst case for frame processing and compression algorithms can be extremely pessimistic. For example, take a simple algorithm which determines the pixel brightness centroid in a 1024x1024 image (a typical algorithm used in optical navigation). A detailed analysis of such a scenario is provided in Appendix B, but in short, the execution time can vary from 70 to 129.5 milliseconds per frame given a PowerPC 750 microprocessor and its architectural characteristics [HePa90]. Furthermore, if there are multiple pipelines in a multi-media application, I/O resource contention may also cause


additional output jitter due to demands made on the bus by devices such as memory-mapped frame-grabbers and soundcards. To summarize the characteristics of continuous media, the period jitter is low and the execution jitter may be high due to compression and manipulation of large frame arrays.

Figure 4: Continuous Media Digital Video Pipeline (source side: a HW/SW frame grabber, frame compression, and a network device, with a source flow control application using an API to a local pipeline agent; sink side: a network device, frame decompression, and a video adapter, with a sink flow control application and local pipeline agent; end-to-end timing on the order of +/- milliseconds)
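
The data-driven execution variance described above is easy to see in code; the following is a simplified, illustrative sketch of a change-only pixel compression step (not the SIRTF/MIPS or RACE algorithm) whose per-frame work is proportional to the number of pixels that changed since the previous frame.

    #include <stddef.h>

    /* Emit one packed (index, value) word per changed pixel.  The loop
     * always touches every pixel, but the packet-generation work and the
     * output bandwidth scale with the scene change rate, which is the
     * source of algorithmic execution and I/O jitter. */
    size_t compress_change_only(const unsigned char *prev,
                                const unsigned char *cur,
                                size_t npixels,
                                unsigned long *out)
    {
        size_t i;
        size_t nchanged = 0;

        for (i = 0; i < npixels; i++) {
            if (cur[i] != prev[i]) {
                out[nchanged++] = ((unsigned long)i << 8) | cur[i];
            }
        }
        return nchanged;  /* best case 0 packets, worst case npixels */
    }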

2.3.2 Loading characteristics of purely event-driven processing

Fault-handling is one of the best examples of purely event-driven processing. Normally, once a fault is detected, the earlier a response is generated, the better. It is very difficult to predict fault rates in advance, and there will be aperiodic faults as well as the possibility of bursty faults, depending upon the nature of the fault detection and sensor interface processing software and hardware. Typically, hard real-time fault protection systems are designed to handle a particular maximum fault rate and often will safe the system completely if the rate becomes higher than the design maximum (i.e. if there is a fault in the fault protection system itself, such as a fault queue overflow). The RT EPA has been designed to provide an interface with on-line admission and reconfiguration to support occasional task load sets which can be negotiated for admission in advance and then brought on-line by the controlling application – for example, a burst of faults can trigger a real-time safing mode. It is quite possible to leave an already admitted RT EPA task set memory resident (i.e. fully allocated, admitted as a separate set from the currently active set of tasks, but inactive) so that transitions between such modes simply require executing activation and deactivation sequences for the set (activation and deactivation of services are part of the RT EPA application programmer's interface). Figure 5 depicts such a scenario, and a brief sketch of such a mode transition follows the figure. After such a scenario it would be possible, assuming a system recovery was possible, to reactivate the nominal task set.


Figure 5: Purely Event-Driven Real-Time Processing (environment events reach the sensors and sensor electronics, then sensor processing, fault identification, and fault handling, which acts on the faulty effector)
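
As a minimal sketch of the mode transition just described, the fragment below deactivates the nominal, pre-admitted service set and activates a memory-resident safing set when a fault burst is detected; the function and parameter names are hypothetical illustrations, not the actual RT EPA application programmer's interface.

    /* Hypothetical set-level activation calls; not the real RT EPA API. */
    extern int rtepa_set_deactivate(int set_id);
    extern int rtepa_set_activate(int set_id);

    void on_fault_burst(int nominal_set_id, int safing_set_id)
    {
        /* Nominal threads remain memory resident but are made inactive. */
        rtepa_set_deactivate(nominal_set_id);

        /* The safing set was admitted in advance, so activation does not
         * require a new admission test at this critical moment. */
        rtepa_set_activate(safing_set_id);
    }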

2.3.3 Loading characteristics of purely digital control applications

Digital control applications must be isochronal end-to-end so that the output data is neither too early nor too late, since either case causes output jitter and will result in decreased stability of the control loop. Digital control applications are periodic, sensitive to jitter and/or dropouts, and typically have low execution variance for simple digital control applications like thermal control or other types of large-time-constant, single-variable control problems (depicted in Figure 6). However, applications such as spacecraft attitude determination and control have many more variances, but still require the same sort of end-to-end stability in the input/output rates. Attitude determination and control may require significant sensor filtering (e.g. an extended Kalman filter) on the sensor interface and may require high load computations such as orbit determination and momentum management to produce a single actuator output. To summarize the characteristics of digital control, the execution jitter tends to be low, but output jitter must be as minimal as possible to prevent timing instability. Regularity of output can be much more important than consistency of sample inputs – i.e. many digital control applications are robust to occasional stale sample input, but are often sensitive to output jitter which may cause actuators to adjust control points non-uniformly over time and lead to instability.


Figure 6: Real-Time Digital Control Processing
[Figure: sensors and effectors in the environment connect through sensor electronics and sensor processing to a digital control law, whose outputs drive actuator processing and actuator electronics.]

2.3.4 Loading characteristics of mixed processing

Mixed applications which include some combination of continuous media and event processing are becoming more prevalent with more sophisticated user interaction interfaces and applications. For example, virtual reality (VR) systems include frame-based rendering of scenes as well as event-driven user input (depicted in Figure 7). In general, the rendered scene only needs to be updated in the video buffer when there is a change in observer perspective (e.g. the observer changes viewpoint or moves in the VR world). So, rendering is, on the one hand, a frame-based service, but on the other hand more event driven than video processing, since it is inherently a change-only service (video may be designed to be change-only, but is not so by nature as VR rendering is). Another characteristic of rendering is that scene complexity drives the execution time dramatically. A VR rendering algorithm can therefore control execution variance by rendering every scene, no matter how complex in terms of the world model, with a fixed number of polygons; this may not be desirable, however, so there may be execution variance in frame rendering times due to variations in the number of polygons required. Furthermore, it is possible to design the rendering to trade off frame rate against the number of polygons rendered to balance update rates and scene quality – for example, if one moves quickly through a VR world, then the execution variance can be controlled by reducing the quality of the frame rendering in favor of a higher update rate (half the number of polygons and twice the frame rate). Making these types of trades is not the subject of this research, but providing a framework to reliably schedule releases of VR tasks and control execution is an issue no matter how these trades are made.


Figure 7: Mixed Continuous Media and Event-Driven Real-Time Processing
[Figure: two VR control applications connected over network devices, each with a VR world rendering/model, a data glove for event-driven input, and a graphics accelerator for display; end-to-end timing is on the order of +/- milliseconds.]

2.3.5 Loading characteristics of mixed event-driven and digital control

Mixed applications which include some combination of digital control and event processing are becoming more prevalent with more sophisticated semi-autonomous robotic systems that include shared control of a device by combining user interaction with autonomous agents [Bru93]. For example, a semi-autonomous robotic system is typically commanded at a very high level, such as providing a navigational way-point and then allowing the system to autonomously achieve the goal with digital control to handle environmental perturbations and agent software to handle faults. A perfect example is a robotic rover which must deal with obstacles, control speed on varying surfaces and inclines, and navigate to a user-defined way-point. All modern satellite systems fit this application class as well, since they include hard real-time digital control for attitude determination and control, but also event-driven automation for fault detection and safing. Semi-autonomous systems are becoming more and more prevalent in space systems and robotics due to the high cost of tele-operations and the infeasibility of tele-operation in some circumstances (e.g. large latency in communications with planetary rovers, underwater vehicles, and robotic manipulators) [Bru93] [Fle95]. On-line re-planning is often performed for rovers, whereby a rover may encounter an unmapped obstacle and require an update to its obstacle model and on-line re-planning of its route. So, it would be typical for such an application to include a more deliberative planning agent function and a more reactive agent function which provides immediate avoidance of an obstacle (depicted in Figure 8).


Figure 8: Mixed Digital Control and Event-Driven Real-Time Processing
[Figure: a deliberative agent (soft real-time planning, +/- minutes) and an interactive agent (soft real-time management, +/- sub-seconds) interface through the API to a reactive agent providing hard real-time software digital control (+/- milliseconds), which drives a hard real-time hardware digital controller (+/- microseconds) acting on the environment at the HW/SW boundary.]

2.3.6 Loading characteristics of mixed real-time applications in general

In summary, mixed applications provide services with different requirements on deadline reliability and for various environment event releases, with a range of system execution demands and variances in release frequency. These environmental event rates and system loading demands are summarized in Table 2 (required reliability would be a third dimension, not shown). Within the system space alone, it is also clear that there is a range of application demands upon the combination of I/O and CPU resources. This feature space is characterized by Figure 9.

Table 2: Environmental Event-Rate Types with Application Examples of Each

                  High Execution Variance   Medium Execution Variance   Low Execution Variance
  Periodic        Digital video             Digital audio               Digital control and packet telemetry
  Quasi-Periodic  Obstacle avoidance        VR scene rendering          VR user input (e.g. data glove)
  Aperiodic       On-line replanning        Fault handling              Command processing

It is not hard to imagine applications which include combinations of these environmental and system characteristics. For example, a remotely piloted vehicle with video and audio streaming to a VR interface for semi-autonomous, operator-shared control would have digital video, digital audio, digital control, telemetry, obstacle avoidance, on-line re-planning, fault handling, and command processing on the vehicle processor, and likewise digital video, digital audio, scene rendering, high-bandwidth user input, telemetry, and command processing within the VR control environment.


Figure 9: Feature Space of System I/O and CPU Requirements By Application Type
[Figure: application types plotted by I/O bandwidth versus CPU loading: data acquisition (buffer size and rate, DMA words/sec), graphics display and continuous media processing (frame X by Y and rate), rendering (N polygons and rate), digital control (N sensors/actuators and rate), and numerical simulation/modeling (N elements and rate).]

As already noted, digital video, audio, and control are periodic, and the range of execution variances stems from algorithm complexity and data manipulation size (larger data units are more likely to experience cache misses and pipeline stalls). For quasi-periodic applications, the environment drives the release rate, and this is typically bursty, depending for example upon motion in the environment: if a rover moves through an environment quickly, then obstacle avoidance runs at a higher rate, and depending upon the complexity of the environment and the algorithm (local or more global), the execution variance can be high. Likewise, if an avatar moves through a VR world quickly, then the frame update rate and scene change rate increase. Aperiodic event sources are exceptions to normal processing; faults clearly fit this definition, as does the associated fault handling, which may also require on-line re-planning for highly autonomous systems such as a planetary rover. Command processing is driven entirely by a mission and/or user input, which is clearly aperiodic in nature, although a maximum command rate may be imposed by servicing commands at a fixed periodic rate. More examples could be filled in, but Table 2 is sufficient to establish that applications do exist which span this range of release and execution demands upon systems.


3 Related Research

Most of the related research focuses on either admission tests and scheduling policy or on processing frameworks given an admission test and policy. The most extensive frameworks which compare to the RT EPA include the Chorus micro-kernel QoS work based on DM scheduling by Coulson et al [Co94], the RT-Mach QoS resource reservation framework research at Carnegie Mellon University [MerSav94], and work by Jeffay et al at the University of North Carolina to build frameworks that deal with release period and loading variance [JefGod99]. In all cases, the application domain focus for these frameworks is continuous media rather than mixed applications. Soft real-time frameworks relate to the RT EPA in that they also accommodate execution and release period variances as well as marginal task sets, but these approaches ultimately require mapping tasks onto abstract levels of service [BraNu98]. It is not clear how soft real-time QoS methods can support mixed hard and soft real-time applications, since they inherently optimize and/or control loading to maximize service quality rather than controlling deadline reliability as the RT EPA does. Finally, the Pfair method provides an approach to scheduling releases in multiple windows much like the multi-epoch scheduling investigated with the RT EPA [Baruah97]; however, Pfair reduces these windows to very small slices within a single release period rather than the coarser-grained windows investigated with the RT EPA. A clear distinction of the RT EPA research is that related research focuses on a particular application type, such as continuous media or digital control, in isolation, not on mixed services and mixed deadline reliability requirements as the RT EPA research does. Furthermore, it is not clear at all that any of the related research attempts to take the deadline reliability view of scheduling a mixed set of services as the RT EPA research presented here does.

3.1 Hard Real-Time Research Related to RT EPA

As noted in the introduction, hard real-time theory relates to the RT EPA since it defines the least upper bound for the marginal task sets that the RT EPA has been designed to handle in terms of deadline reliability and execution variance control. The work of Liu and Layland clearly defines the least upper bound for fixed priority and dynamic priority scheduling and therefore defines limits on the potential capabilities of the RT EPA. Furthermore, the RT EPA directly extends the DM equations developed by Audsley and Burns [Au93] to incorporate execution variance and reliability directly into the DM admission test. The RT EPA and CBDM goal is to minimize the pessimism in two classic hard real-time assumptions:
1) Bounds on execution for admission testing must be worst-case.
2) No assumptions regarding the phasing of service releases may be made.
Furthermore, the RT EPA provides the same admission test as DM for guaranteed service requests, but allows such hard real-time services to coexist with soft real-time services by firewalling guaranteed services from the effects of soft overruns and dropouts.
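For reference, the classic iterative DM response-time test that the CBDM admission test builds on can be sketched in standard notation (assumed here, not quoted from [Au93]: C is execution time, T period, D deadline, and hp(i) the set of services with higher DM priority than service i); CBDM substitutes a confidence-based execution bound for the worst-case C:

\[
w_i^{(n+1)} \;=\; C_i \;+\; \sum_{j \in hp(i)} \left\lceil \frac{w_i^{(n)}}{T_j} \right\rceil C_j ,
\qquad \text{admit service } i \text{ if the iteration converges with } w_i \le D_i .
\]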


3.2 Soft Real-Time Research Related to RT EPA and Confidence-based Scheduling

The RT EPA has been designed to perform on-line admission testing, monitoring, interference fire-walling, and on-line negotiation for soft real-time applications and therefore research in soft real-time QoS is related. Potentially related QoS frameworks include: 1) RT-Mach processor capacity reserves [MerSav94], 2) Rialto [JonRos97], 3) the DQM middle-ware [BraNu98], 4) SMART [NiLam96], and 5) MMOSS [Fan95]. What is inherently different is that while all of these soft real-time execution frameworks provide methods to schedule marginal task sets at the cost of missing occasional deadlines, they do so without a direct specification of the reliability desired for each task. Furthermore, it is not clear how an application would negotiate for hard real-time guarantees mixed with soft real-time deadline reliability as is possible with the RT EPA. Like QoS methods, the RT EPA also provides an interface for negotiation of a service level – for the comparable QoS systems these service levels are abstractly quantified in terms of benefit and resource requirements, but for the RT EPA service levels are clearly quantified in terms of reliability [NuBra99]. It is assumed that the application designer will make decisions as to what the benefit is of negotiation for a particular reliability with the RT EPA, however, not only does the RT EPA ensure that it can meet requested reliability before admitting a task, it returns the reliability level possible whether a task is successfully admitted or not so that higher or lower reliabilities can be negotiated. Finally, QoS methods do protect one service level from another similar to the termination deadline fire-walling provided by the RT EPA.

3.3 Execution Frameworks Similar to RT EPA

A number of pipeline mechanisms for continuous media have been developed [Gov91], [Co94], [Fal94]. However, most common implementations include application-level processing with device buffers mapped from kernel space into user space rather than an in-kernel mechanism for executing trusted modules loaded into kernel space. Likewise, these memory-mapped implementations also employ user-level threads with split-level scheduling or bindings of user threads onto kernel threads. The splice mechanism is most relevant since it operates in kernel space using loadable modules or simple streaming as the RT EPA does, and was shown to have up to a 55% performance improvement [Fal94]. However, splice does not provide a configuration and on-line control interface for scheduling like the RT EPA does. Many examples of periodic hard real-time digital control streams exist [Kl94], but no general mechanism for reliable real-time control of pipelines is known to exist. Research on process control requirements for digital control indicates that parametric control of a number of processing pipelines within a general operating system environment would be useful for sophisticated industrial applications. Finally, many real-time semi-autonomous and "shared control" projects are in progress [Bru93] [Fle95], including applications where occasional missed deadlines would not be catastrophic [Pa96].


4 Scheduling Epochs

On-line monitoring of execution not only provides feedback on deadline performance, but gives insight into the load distribution on the system. The traditional hard real-time approach to scheduling is to analyze loading over the longest period for all tasks in the system, since the deadline guarantees are based on estimating worst-case interference along with actual resource demands by any given task in the system. The DM approach somewhat improves upon the RMA least upper bound computation since its iterative interference formulation lends itself more readily to on-line admission than a simple bound. With DM, an admission test must use iteration to compute utility and interference for each period (or deadline) in the system, at the cost of algorithm complexity in the admission test compared to the quickly computed RMA least upper bound. With either approach, the basic assumption is that there is one basic steady-state release model for the system, and if the system has different modes of execution (i.e. vastly different task sets and releases), then each mode is analyzed separately for scheduling feasibility. A very good example of this is the Space Shuttle PASS (Primary Avionics Software System), which has a high, medium, and low rate executive, and each of these threads of execution runs different software based on a major and minor flight software mode for the Shuttle phase of flight (e.g. re-entry and ascent are totally different modes) [Carlow84].

4.1 Multiple On-line Scheduling Epoch Concept Definition

In contrast to the traditional RMA and DM hard real-time view of releases is the extreme case of the Pfair algorithm, which reduces the window of scheduling feasibility not just to a window smaller than the longest overall period, but actually to a window shorter than even the shortest release period. The goal of Pfair is to ensure that all tasks make progress at a steady rate proportional to the weight of the task (utility) [Baruah97]. To do this, the Pfair scheduler must slice a release up into many smaller releases with intermediate pseudo deadlines. Multi-epoch scheduling is at a much higher level of granularity, but lies between the Pfair extreme and the RMA on-line mode extremes by considering the possibility that multiple modes or scheduling epochs can be active at the same time, but only one of the epochs can be released at any given point in time.

Figure 10: Multiple Epochs of Scheduling Active Simultaneously
[Figure: services s1, s2, and s3 scheduled across two interleaved epochs with releases e1-r1 through e1-r3 and e2-r1 through e2-r3; s2 & s3 are admitted to all e1 epoch releases with unique negotiation, and s1 & s3 are admitted to all e2 epoch releases with unique negotiation.]

While the multi-epoch capability has not yet been implemented in the RT EPA, the framework could be extended to provide specification of epochs and admission of threads to each epoch with virtual management of the resources available to each epoch. As already noted and apparent in Figure 10, the major restriction on the proposed RT EPA support for multi-epoch scheduling is that the epochs do not overlap, i.e. that they are mutually exclusive in time. Much like Pfair, a release which originally executes over a period spanning two or more epochs can be decomposed into separate releases in each of the epochs. This capability is supported by the RT EPA with on-line monitoring, which identifies highly loaded and relatively less loaded sub-periods in the longest overall release period. Furthermore, the RT EPA pipeline framework for decomposing single releases into multiple synchronized releases would require that pipelines run completely in a single epoch – pipelining between epochs would be indirectly possible through careful configuration by the application.

4.1.1 Admission and Scheduling Within an Epoch

The RT EPA concept for epochs assumes that normally multiple services would be released in each epoch, but that epochs themselves would be released with known phasing such that there is no need to consider the possibility of two epochs being released in a critical instant – epoch phasing must be fully specified to the RT EPA, and the RT EPA admits services to an epoch just as it does to a system. The concept of multiple epochs is most useful when there are clear event release epochs, which in turn imply unique service epochs for them. It is possible to divide service releases into logical epochs even when event releases do not naturally fall into epochs, but this simply introduces more RT EPA management overhead. Therefore, admission to an epoch, just like admission to a system, requires (see the hypothetical sketch below):
1) Service code to run
2) Release frequency within the epoch, resource needs (Cexp), and deadline requirements
3) Desired service quality
4) Event or time release source of the epoch
For each service admitted into an epoch, the RT EPA must assign resource usage priority and monitor actual usage and control overruns just as it does for a single system. So, multiple epochs do greatly increase management overhead. If a service is admitted to more than one epoch, then it will have multiple release and execution contexts, and therefore if viewed from the system level (over all active epochs) it actually has dynamic priority, but fixed priority within a single epoch. Furthermore, the simplest and perhaps most useful application of epochs is to subdivide the scheduling problem and allocate independent execution sequences to one of the epochs, with epoch release requiring no dispatch or preemption of another epoch because all epochs are released by other epochs.
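Since the multi-epoch capability has not been implemented, the following is only a hypothetical sketch of what an epoch specification covering the four items above might look like; the struct, its field names, and the array bound are illustrative and not part of the RT EPA API (SEM_ID is the VxWorks semaphore type, and r_time follows the RT EPA's unsigned long microsecond convention).

#include <semLib.h>                 /* VxWorks SEM_ID */

typedef unsigned long r_time;       /* RT EPA relative time in microseconds (see Section 5) */

#define MAX_EPOCH_SERVICES 16       /* illustrative bound */

/* Hypothetical epoch descriptor: not part of the RT EPA API. */
struct rtepa_epoch_spec
{
    SEM_ID  release_event;                      /* event that releases the epoch                   */
    r_time  epoch_deadline;                     /* epoch active until the longest admitted period  */
    int     n_services;                         /* number of services admitted to this epoch       */
    int     service_rtid[MAX_EPOCH_SERVICES];   /* RT EPA ids of the admitted service code         */
    r_time  service_T[MAX_EPOCH_SERVICES];      /* release period within the epoch                 */
    r_time  service_Cexp[MAX_EPOCH_SERVICES];   /* expected execution time (resource need)         */
    double  service_conf[MAX_EPOCH_SERVICES];   /* desired service quality as deadline confidence  */
};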

4.1.2 Active epoch policy

Just like a service in a system, an epoch must have a policy for becoming the active epoch (equivalent of a service being dispatched). Since epoch overhead is significant and since it is


envisioned that most applications will have limited (although significant) need for multiple epochs, the proposed RT EPA active epoch policy is simple:
1) The epoch must be based upon an event that is part of the normal system event release stream.
2) The epoch is active to a deadline equivalent to the longest period service admitted to the epoch.
3) Epochs are dispatched and preempted by priority assigned according to the epoch release period.
Due to the overhead of multiple levels of dispatch and preemption and the overhead of adjusting service priorities for the currently active epoch, it is envisioned that most systems would make use of only a few epochs at most. Furthermore, in most cases epochs will never need to preempt each other if the simplest phasing rule is applied, namely that completion of epoch e1 releases epoch e2.

4.2 Equivalence of EDF and Multi-Epoch Scheduling in the Limit

Within each epoch every service has a unique priority based upon its deadline within that particular epoch. The service with the shortest deadline has the highest priority within the epoch by DM and CBDM policy as applied to the epoch concept. In the limit, a system could be decomposed into as many epochs as there are event releases, so that only one service is released in each epoch. The priority assigned to a service would be adjusted according to the active epoch and the active service within that epoch, even though there is only one service. Other epochs pending at the time a new epoch is realized would require demotion of their services so that the current epoch can be handled. If the active epoch policy is to make the shortest epoch active, then by definition its single service has the earliest deadline (in fact the only deadline) in that epoch and is assigned the highest priority within the epoch; thus the earliest-deadline service always has the highest priority in the system, and since the active epoch is always the shortest one, this is the definition of EDF. Of course, this multi-epoch limit suffers the same problem as EDF: the on-line identification of epochs would impose so much overhead when reduced to this fine granularity that it is infeasible. Given the multi-epoch framework and policy, if all epochs are decomposed to a single service, then in essence this specifies that all service priorities be adjusted with each service release and that the epoch/service with the shortest deadline be dispatched, so on every service release the shortest-deadline service is effectively dispatched.

4.3 Application of Multi-Epoch Scheduling

During the RT EPA research, the question arose as to whether it would be possible to take this concept of scheduling feasibility analysis for modes and use the RT EPA monitoring capability to identify multiple modes and redistribute releases. For example, a video processing application could be viewed as having a data acquisition mode and a data processing mode; generally the time to switch from one mode to the other would be very short. Once modes are identified and characterized, the RT EPA can help an application developer redistribute releases in order to take


a single mode which cannot be scheduled reliably and turn it into two modes, both on-line at the same time, that can be scheduled reliably. In essence, we have on-line switching between modes at a high frequency (compared to, e.g., Shuttle software which switches only a few times during a mission), and we use the RT EPA monitoring to identify the modes and reorder releases. Ultimately it is envisioned that at the granularity of threads, the RT EPA can facilitate on-line redistribution of releases to improve scheduling.

4.3.1 SIRTF/MIPS Multi-Epoch Scheduling Example

The importance of scheduling epochs is clearly demonstrated by the RT EPA monitoring experiments with the SIRTF/MIPS software. The SIRTF/MIPS multi-epoch decomposition was needed to solve a loading and deadline reliability problem, but it is important to note that epochs as applied to SIRTF/MIPS were only released by other epochs (there was no concept of epoch preemption and dispatch) – all scheduling epochs in SIRTF/MIPS were released by the completion of another epoch. The MIPS epochs included:
e0: The instrument ready state, where telemetry is gathered and processed, the clocking hardware is ready to start clocking out exposure data, and the ADC channels are ready to digitize the data.
e1: An exposure start event ends e0 (ready) and starts e1 exposure start sequencing and clocking synchronization.
e2: An exposure collection cycle start event ends e1 and starts e2 data collection and processing for the steady state prior to detector electronics reset to avoid saturation.
e3: A programmed detector reset event ends e2 and starts e3 data processing completion (no data ready). The e3 epoch is ended by one of two events – either a return to the ready state and e0, or a return to data collection and e2.
The segmentation of the SIRTF/MIPS scheduling into these four service epochs made it possible to analyze the scheduling feasibility of each independently (e2 and e3 timing are summarized in Table 5). Ultimately only one priority was adjusted between epochs and only one allocation change was made between epochs to provide system-level scheduling feasibility, but without this dynamic priority adjustment and this reallocation of loading, the SIRTF/MIPS instrumentation software would never have been able to meet deadlines for the most CPU-intensive exposures. The RT EPA kernel-level monitoring capabilities were used to identify the loading in each epoch and to determine priorities in each epoch. Initially, exposures could not be scheduled reliably because execution variances were causing processing to miss deadlines and time out. The results of this experiment are provided in Section 8. Each epoch identified for SIRTF/MIPS was clearly related to a real-world event release epoch. One interesting fact is that the hardware characteristics were designed to provide better granularity in processing events. First, a 16 Kword FIFO was proposed for the detector data digitization sources, which led to a highest-frequency data-ready event period of 262 milliseconds, as shown in Table 3.


Table 3: Single Epoch Design of the SIRTF/MIPS Video Processing With 16 Kword FIFOs

  task ID  Description                      Release Period (msecs)
  Epoch    Compressed Frame Set             2621.44
  1        Si FIFO driver (SOURCE)          262.144
  2        Ge FIFO driver (SOURCE)          917
  3        Science FIFO driver (SINK)       131.072
  4        Ge Processing and Compression    917
  5        Si Processing and Compression    262.144
  6        Science Grouping                 1048.576

Note that the Science FIFO service rate has a period of 131 milliseconds, but that is a sink event rate rather than a data-ready source event rate. Processing can only start when data is ready, so the source event rates are what drive the overall pipeline, and the data-ready event rate would have been a 262 msec period for the 16 Kword FIFO. In order to better distribute loading and releases over time, it was decided that a 4 Kword FIFO be used instead of a 16 Kword FIFO. This reduced the source event period to 65 milliseconds. So, this allowed for design of a pipeline with a fundamental 65 millisecond release period driven by the source. Shorter releases provided for better distribution of interface servicing in the system as a whole (summarized in Table 4).

Table 4: Single Epoch Design of the SIRTF/MIPS Video Processing With 4 Kword FIFOs

  task ID  Description                      Release Period (msecs)
  Epoch    Compressed Frame Set             2621.44
  1        Si FIFO driver                   65.536
  2        Ge FIFO driver                   233.02
  3        Science Link FIFO driver         32.768
  4        Ge Processing and Compression    131.072
  5        Si Processing and Compression    65.536
  6        Science Grouping                 1048.576

Epochs e2 and e3 coexist during steady-state processing of exposure data to compress it for downlink to the ground. The e0 and e1 epochs are only active during the initialization of an exposure, and the system always returns to e0 once an exposure is finished. So, the instrument receives an exposure command, transitions from e0 to e1 and then to e2, and alternates between e2 and e3 until the exposure is finished with e2, which finally returns to e0. In the case of SIRTF/MIPS, the scheduling epochs are coexistent but mutually exclusive, and the release phasing is well known. This is the ideal situation for multi-epoch scheduling.
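The epoch sequencing just described can be summarized by a small state machine; the following is only an illustrative sketch (the enum, function, and flag names are hypothetical and not part of the flight software):

/* Hypothetical sketch of the SIRTF/MIPS epoch sequencing described above. */
enum mips_epoch { E0_READY, E1_EXPOSURE_START, E2_COLLECT, E3_PROCESS };

enum mips_epoch next_epoch(enum mips_epoch current,
                           int exposure_commanded,
                           int collection_cycle_start,
                           int detector_reset,
                           int exposure_finished)
{
    switch (current) {
    case E0_READY:
        return exposure_commanded ? E1_EXPOSURE_START : E0_READY;
    case E1_EXPOSURE_START:
        return collection_cycle_start ? E2_COLLECT : E1_EXPOSURE_START;
    case E2_COLLECT:
        if (exposure_finished)
            return E0_READY;                          /* exposure ends from e2     */
        return detector_reset ? E3_PROCESS : E2_COLLECT;
    case E3_PROCESS:
        if (exposure_finished)
            return E0_READY;                          /* e3 may also return to e0  */
        return collection_cycle_start ? E2_COLLECT : E3_PROCESS;
    }
    return current;
}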


Table 5: Multiple Epoch Design of the SIRTF/MIPS Steady-State Video Processing

  task ID  Description                                                        Release Period (msecs)
  Epoch 2  Reset and Initial Sample 1/8th Frame Period – No data available    589.824
  1        Si FIFO driver                                                     inactive
  2        Ge FIFO driver                                                     233.02
  3        Science Link FIFO driver                                           32.768
  4        Ge Processing and Compression                                      131.072
  5        Si Compression                                                     589.824
  6        Science Grouping                                                   589.824

  Epoch 3  Sample Frame Period – Data available                               2031.616
  1        Si FIFO driver                                                     65.536
  2        Ge FIFO driver                                                     233.02
  3        Science Link FIFO driver                                           inactive
  4        Ge Processing and Compression                                      131.072
  5        Si Compression                                                     589.824
  6        Science Grouping                                                   1048.576

4.3.2 Multi-epoch Scheduling Compared to Multi-level Scheduling

Multi-epoch scheduling differs from multi-level scheduling, in which a multi-threaded task is scheduled and in turn schedules its own thread set during its release; multi-epoch scheduling requires that the software have only one scheduling policy and mechanism overall. The multiple epochs are simply sub-periods of a larger period related to releases, and the leverage in the multi-epoch view of the longest period in the system is that it provides a method for analyzing how to adjust releases and the relative phasing of releases when the application does in fact have such control. The RMA critical instant assumption pessimistically assumes all releases might be simultaneous and have no predictable relative phasing, when in reality multiple hardware interfaces can and likely will be synchronized, and software events may be directly correlated to hardware events – e.g. processing is applied every tenth frame. The RT EPA facilitates multi-epoch scheduling by providing on-line monitoring and task set activation and deactivation, but it is envisioned that the RT EPA could be further extended to actually admit task sets to one or more epochs and then admit the epochs themselves. It is interesting to note that this concept is further facilitated by many short-execution task releases rather than a small number of long-execution releases – i.e. more, smaller releases can be interleaved and redistributed more easily. Overall, the idea of multi-epoch scheduling is to provide a method to control loading distribution to prevent missing deadlines due to transient overloads arising from execution variance.


5 Real-Time Execution Performance Agent Framework

The RT EPA provides a framework for both hard and soft real-time threads. The definition of hard real-time is well understood and universally accepted, but the definition of soft real-time is not yet universally understood and accepted. The RT EPA implements the traditional hard real-time admission policy and the underlying priority-preemptive priority assignment policy, and will safe the entire system when a hard real-time termination deadline is missed. For soft real-time, the RT EPA allows for bounded overruns and for service dropouts. Since most real-time applications are driven by events and data/resource availability, the RT EPA provides methods for setting up phased data processing pipelines between a source and a sink. Finally, the RT EPA has been implemented as VxWorks kernel extensions and application code, but can be ported to most modern operating systems and microprocessors which provide priority-preemptive scheduling of multiple threads, kernel loadable modules for access to kernel task state and context switching, access to a real-time clock with microsecond or better accuracy, asynchronous real-time signaling (e.g. POSIX 1003.1b compliant) [POSIX93], priority inheritance and priority ceiling protocol mutual exclusion semaphores, and binary signaling semaphores.

5.1 Design Overview

This basic in-kernel pipeline design is similar to the splice mechanism [Fal94], but the RT EPA API, performance monitoring, and execution control are much different. Each RT EPA module, shown in Figure 11, is implemented as a kernel thread configured and controlled through the EPA and scheduled by the CBDM (Confidence-Based Deadline Monotonic) algorithm.

Figure 11: In-Kernel Pipe with Filter Stage and Device Interface Modules
[Figure: an application uses a system call / kernel API to configure the Execution-Performance Agent, which controls an in-kernel pipe of device interface and pipe-stage filter modules between a source device and a sink device at the HW/SW interface.]

The controlling application executes as a normal user thread. The RT EPA mechanism is efficient due to removal of overhead associated with protection domain crossings between device


and processing buffers, and reliable due to kernel thread scheduling (compared to split-level scheduling of user threads). In the case of an RT EPA implementation in a single user/protection mode operating system like VxWorks, the in-kernel execution is of little significance, but this does not affect the basic design. The RT EPA API provides configuration and execution flexibility on-line, with performance-oriented "reliable" execution (in terms of the expected number of missed soft deadlines and missed termination deadlines).

5.1.1 Pipeline Time Consistency and Data Consistency

The RT EPA may be used to admit and schedule any set of asynchronous services, but pipelines are typical for continuous media and digital control real-time applications. The advantages of designing a real-time data processing application as a pipeline include: 1) testability of individual stages, 2) on-line configuration control of the pipeline processing stages, 3) stage super-frequencies and sub-frequencies, 4) buffer holds for isochronal output, and 5) stage load distribution and potential for parallel processing. These are all advantages of the pipeline approach compared to, for example, a single interrupt-driven executive that provides equivalent processing in a single thread. However, the most fundamental advantage of pipelines is that they provide flexibility in terms of time and data consistency. Requirements for time and data consistency may vary, and the RT EPA provides a simple way of configuring a pipeline to meet varying requirements. For example, a digital control application may have strict time consistency requirements, but fairly relaxed data consistency requirements. Often for digital control it is more important that outputs be made to system actuators on a very regular time interval; since sensors are usually over-sampled compared to outputs, the consistency between sample data and the produced outputs is less important. If an output stage sometimes uses the three most recent samples and other times uses the two most recent and an older sample, just missing the arrival of the third, that is not particularly a problem (i.e. it does not affect stability or responsiveness), but if there is output jitter, stability may be compromised [Tin94]. In other cases, it may be more important to have a fully synchronized pipeline such that each stage is released upon data availability from the preceding stage. The RT EPA supports both data and time consistency requirements through the pipeline configuration API.

5.1.2 Pipeline Admission and Control

The RT EPA API is intended to allow an application to specify desired service and adjust performance both for periodic isochronal pipelines and for non-isochronal application execution. As demonstrated in the RACE and SIRTF/MIPS examples, as well as explored theoretically, many scenarios exist for on-line RT EPA service re-negotiation for continuous media, digital control, etc. [Si96]. For example, a continuous media application might initially negotiate reliable service for a video pipeline with a frame rate of 30 fps, and later renegotiate on-line for 15 fps so that an audio pipeline may also be executed. An application loading pipeline stages or other real-time service threads must at least specify the following parameters for a service epoch:
1) Service type for the particular service or pipeline:
   i) Execution model;
   ii) Off-line execution samples for Cexpected;
2) Input event source (source must exist as a stage or device interface).
The application must also provide, and can control on-line during a service epoch, these additional parameters:
5) Desired termination and soft deadlines with confidence for reliable service;
6) Delay time for output response (earlier responses are held by the EPA);
7) Release period (expected minimum inter-arrival time for aperiodics).
The exact details of the parameter types and the API are discussed completely in the following sections and can also be found in the RT EPA application code specification in Appendix A. The approach for scheduling RT EPA thread execution is based on the EPA interface to the fixed-priority DM scheduling policy and admission test, called the EPA-DM approach here. The EPA-DM approach supports reliable soft deadlines given pipeline stage execution times in terms of an execution time confidence interval instead of a deterministic worst-case execution time (WCET). Also noteworthy, the RT EPA facility uses two protection domains: one for user code and one for operating system code. However, the RT EPA facility allows trusted module code to be executed in the kernel protection domain. We have focused on the functionality of the architecture, relying on the existence of other technology such as that used in the "SPIN" operating system [Be95] to provide compile-time safety checking. The negotiation control provided by the RT EPA is envisioned to support isochronal event-driven applications which can employ and control these pipelines for guaranteed or reliable execution performance.

5.2 RT EPA Traditional Hard Real-Time Features

The simplest traditional definition of a hard real-time system is that continuing execution beyond a hard real-time deadline is not only futile, but may even have negative utility – i.e. it may damage the system even more than simply safing it would. As such, if an RT EPA thread is designated to have a guaranteed deadline instead of one of the other two options (reliable and best-effort), then when such a deadline is missed, the RT EPA first calls an application-specific system safing callback provided at initialization time and then deactivates all threads under RT EPA control, fully halting the system in the safed state. So, the Dterm for a hard real-time thread in an RT EPA controlled system is a traditional hard deadline that not only has service implications, but actually results in termination of all services and safing of the whole system.

5.3 RT EPA Soft Real-Time Features

The RT EPA defines soft real-time in three ways: 1) tasks may have bounded overruns on a given release, 2) tasks may have occasional failures to complete a release, and 3) release period may jitter. This is based upon the definition adopted for the RT EPA which is that a soft real-time


task has a utility function [Au93] such that it is worthwhile to continue services and keep the system running despite deadline overruns, misses, and release jitter.

5.4 RT EPA Best Effort Features

The RT EPA provides best effort scheduling for services such as EDAC memory scrubbers, slack-stealing diagnostics, and any number of other services that have no processing deadlines at all. These types of threads/services still must register with the RT EPA and be admitted and activated so that it is ensured that they do not interfere with the guaranteed and reliable services. In fact, any execution outside of the RT EPA admitted thread set, other than system overhead (context switching and interrupt service routines), will completely defeat the RT EPA. This is normally prevented by the RT EPA option to demote all other tasks at initialization. The best effort tasks may be assigned an importance level by the order in which they are admitted (i.e. a best effort task admitted earlier than another best effort task will preempt it when both happen to be ready to run).

5.5 RT EPA Data Processing Pipeline Features

In addition to hard and soft real-time features, the RT EPA provides pipelining and isochronal output features to simplify digital control and continuous media application implementations. For pipelining, when services are activated, they may provide a release-complete callback (or simply provide NULL if they do not want this callback). Furthermore, each service may be event-released by a binary semaphore, and therefore it is simple to provide a pipeline of services with phased execution by performing semaphore gives in the release completion callback. If the pipeline has requirements to execute a stage every N cycles instead of at the fundamental pipeline frequency, then the application simply needs to provide a callback which gives the semaphore every N releases rather than every release. Note that this also supports re-negotiation. For example, an application might process video frame data for both local and remote display – the local display capability might be 30 fps, whereas due to the limitations of the communication medium with the remote host, the remote capability might have to be downgraded to 10 fps compressed. Finally, when a release completion callback is provided, it is also necessary to specify whether that callback should be made isochronal or not. An isochronal callback requires that the RT EPA wait to make the callback until the specified Tout time after release (and therefore if Tout is less than the current response time, the serviceReleaseCompleteCallback will be called immediately; otherwise, a timer will be set and the callback will be made after the specified time). This can be particularly useful for digital control processing pipelines which are sensitive to output jitter (i.e. stability may be affected by jitter in actuation of control devices) and also for continuous media processing pipelines where early frame presentation can lead to overall poor quality of service.
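A minimal sketch of the every-Nth-release pattern described above (the callback signature, counter, and semaphore names are assumptions; only semGive is the actual VxWorks call):

#include <semLib.h>

/* Hypothetical sketch: a release-completion callback that event-releases the
 * downstream pipeline stage only every Nth completion of this stage, e.g.
 * N = 3 to derive a 10 fps remote stream from a 30 fps local pipeline.
 */
static SEM_ID downstream_stage_sem;     /* created with semBCreate at start-up */
static unsigned int N = 3;              /* sub-frequency divisor               */
static unsigned int completions = 0;

void stage_release_complete_callback(void)
{
    completions++;
    if ((completions % N) == 0)
        semGive(downstream_stage_sem);  /* release the next stage's semaphore  */
}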

5.6 RT EPA Implementation

The RT EPA provides a service negotiation and control interface, deadline management in terms of execution time model confidence, and specification of release events for pipelining inputs to outputs, and it returns admission results which indicate not just acceptance or rejection but, if accepted, what reliability can be provided given the current thread set. Once the RT EPA brings a thread set


on line, it provides monitoring of actual execution performance and firewalls threads from occasional overruns by soft real-time threads. These RT EPA capabilities are provided by three basic components: 1) The service and configuration API, 2) The kernel-level monitoring and control, and 3) The RT EPA server.

5.6.1 RT EPA Service and Configuration API

The RT EPA service negotiation and configuration API is the primary interface that the real-time system application uses in order to admit threads and bring the system on line with a specific pipeline configuration, and then to obtain performance information once the system is on-line for the purpose of re-negotiation. This section reviews the major API functions and explains their use and required arguments.

5.6.1.1 RT EPA System Initialization and Shutdown

These two API functions are used to start up and shut down the entire RT EPA system. The initialization requires the application to provide a callback function to safe the entire system in case a guaranteed service termination deadline is missed. The initialization mask specifies basic features, including whether a performance monitoring server (an RT EPA service) should be started, whether standard system tasks in VxWorks should be demoted, and finally whether an active idle task should be admitted best effort. The idle task will cause preemption when there are no processor demands – otherwise task completions when there are no other demands will not be visible. If the mask includes a performance monitoring server, then a monitoring period must also be specified. The shutdown includes a mask to specify whether VxWorks system task priorities should be restored; otherwise this function simply deactivates all active RT EPA tasks and completely releases all resources under control of the RT EPA. The initialization function includes specification of a system safing callback, a mask for specification of the type of on-line monitoring desired, and a monitoring period:

int rtepaInitialize(FUNCPTR safing_callback, int init_mask, r_time monitor_period);

The shutdown function simply specifies with a shutdown mask whether the system should be fully restored to the boot-up system task configuration or not:

int rtepaShutdown(int shutdown_mask);

5.6.1.2 RT EPA Service (Thread) Admission and Dismissal

RT EPA service/thread admission is discussed in detail in Section 6.3. The admission does not activate the service, but makes activation possible assuming the service can be admitted. Dismissal of a task causes deactivation and deletion of the service as well as release of all


resources associated with it. This service must be re-admitted if it is ever to be activated again in the future. The service admission and dismissal functions include:

int rtepaTaskAdmit(int *rtid,
                   enum task_control tc_type,
                   enum interference_assumption interference,
                   enum exec_model exec_model,
                   union model_type *modelPtr,
                   enum hard_miss_policy miss_control,
                   r_time Dsoft,
                   r_time Dterm,
                   r_time Texp,
                   double *SoftConf,
                   double *TermConf,
                   char *name);

int rtepaTaskDismiss(int rtid);

5.6.1.3 RT EPA Task Control

The RT EPA includes task control functions which are only valid for previously admitted services. These functions include:

int rtepaTaskActivate(int rtid,
                      FUNCPTR entryPt,
                      FUNCPTR serviceDsoftMissCallback,
                      FUNCPTR serviceReleaseCompleteCallback,
                      enum release_complete complete_control,
                      int stackBytes,
                      enum release_type release_type,
                      union release_method release_method,
                      uint Nonline);

int rtepaTaskSuspend(int rtid);
int rtepaTaskResume(int rtid);
int rtepaTaskDelete(int rtid);
int rtepaIDFromTaskID(WIND_TCB *tcbptr);
int rtepaInTaskSet(int tid);


5.6.1.4 RT EPA Release and Pipeline Control

Three major functions are provided by the RT EPA API for pipeline control configuration. First, a function to specify a source interrupt release:

int rtepaPCIx86IRQReleaseEventInitialize(int rtid,
                                         SEM_ID event_semaphore,
                                         unsigned char x86irq,
                                         FUNCPTR isr_entry_pt);

Second, a function for specifying release of processing stages between the source and sink interfaces:

void rtepaPipelineSeq(int src_rtid,
                      int sink_rtid,
                      int sink_release_freq,
                      int sink_release_offset,
                      SEM_ID sink_release_sem);
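A hedged usage sketch (the rtid values are assumed to come from earlier rtepaTaskAdmit/rtepaTaskActivate calls, and the interpretation of sink_release_freq as an every-Nth-source-release divisor is an assumption):

#include <semLib.h>

/* Hypothetical configuration sketch: sequence a compression stage (sink)
 * behind a frame-grabber driver stage (source) so that the sink is released
 * on every second source completion through its release semaphore.
 */
void configure_video_pipeline(int grabber_rtid, int compress_rtid)
{
    SEM_ID compress_release_sem = semBCreate(SEM_Q_FIFO, SEM_EMPTY);

    rtepaPipelineSeq(grabber_rtid,          /* src_rtid                           */
                     compress_rtid,         /* sink_rtid                          */
                     2,                     /* sink_release_freq: every 2nd cycle */
                     0,                     /* sink_release_offset                */
                     compress_release_sem); /* sink_release_sem                   */
}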

Third and finally, a function for specifying whether a service should provide isochronal output:

void rtepaSetIsochronousOutput(int rtid, r_time Tout);

5.6.1.5 RT EPA Performance Monitoring

The RT EPA provides a callback specification for renegotiation so that an application can handle performance failures:

int rtepaRegisterPerfMon(int rtid, FUNCPTR renegotiation_callback, int monitor_mask);

Furthermore, the RT EPA can be configured to update performance data for a single service using rtepaPerfMonUpdateService, or performance data for all services can be updated on demand through the rtepaPerfMonUpdateAll function:

int rtepaPerfMonUpdateAll(void);
int rtepaPerfMonUpdateService(int rtid);

Finally, the following specific performance criteria can be computed on demand for a given service:

r_time rtepaPerfMonDtermFromNegotiatedConf(int rtid);
r_time rtepaPerfMonDsoftFromNegotiatedConf(int rtid);
double rtepaPerfMonConfInDterm(int rtid);
double rtepaPerfMonConfInDsoft(int rtid);
double rtepaPerfMonDtermReliability(int rtid);
double rtepaPerfMonDsoftReliability(int rtid);
r_time rtepaPerfMonCexp(int rtid);
r_time rtepaPerfMonChigh(int rtid);
r_time rtepaPerfMonClow(int rtid);
r_time rtepaPerfMonRTexp(int rtid);
r_time rtepaPerfMonRhigh(int rtid);
r_time rtepaPerfMonRlow(int rtid);

5.6.1.6 RT EPA Execution Model Utilities

The RT EPA provides utility functions to save and load execution models from actual on-line service, including:

int rtepaLoadModelFromArray(r_time *sample_array, r_time *sample_src, int n);
int rtepaTaskSaveCactexec(int rtid, char *name);
int rtepaTaskLoadCactexec(r_time *model_array, char *name);

5.6.1.7 RT EPA Information Utilities

In order to facilitate performance data collection and analysis, the RT EPA provides print functions which summarize on-line performance, including:

int rtepaTaskPrintPerformance(int rtid);
int rtepaTaskPrintActuals(int rtid);
int rtepaTaskPrintCompare(int rtid);

5.6.1.8 RT EPA Control Block

The RT EPA maintains a data structure associated with each service/thread that has been admitted and is therefore monitored and controlled by the RT EPA. This data structure can be indexed by the rtid, which is a handle for the service/thread much like a VxWorks task id or Unix PID. The RT EPA CB (Control Block) is defined as follows (note that not all fields are shown here, just those of interest, since some fields are only used internally by the RT EPA):

struct rtepa_control_block
{
    /* Service type */
    enum task_control tc_type;
    enum interference_assumption interference_type;
    enum exec_model exec_model;
    union model_type model;

    /* Release and deadline specification */
    enum release_type release_type;
    union release_method release_method;
    r_time Dsoft; r_time Dterm; r_time Texp; r_time Tout;
    enum hard_miss_policy HardMissAction;
    FUNCPTR serviceDsoftMissCallback;
    FUNCPTR serviceReleaseCompleteCallback;
    FUNCPTR entryPt;
    char name[MAX_NAME];
    int stackBytes;
    char Stack[MAX_STACK+1];
    int RTEPA_id;
    int sched_tid;
    WIND_TCB sched_tcb; WIND_TCB *sched_tcbptr;
    int assigned_prio;

    /* On-line state and performance updated on every release and/or dispatch/preemption in kernel */
    int RTState; int ReleaseState; int ExecState;
    r_time Cexp; r_time Clow; r_time Chigh;
    ULONG prev_release_ticks; UINT32 prev_release_jiffies;
    ULONG last_release_ticks[MAX_MODEL]; UINT32 last_release_jiffies[MAX_MODEL];
    ULONG last_complete_ticks[MAX_MODEL]; UINT32 last_complete_jiffies[MAX_MODEL];
    ULONG last_dispatch_ticks; UINT32 last_dispatch_jiffies;
    ULONG last_preempt_ticks; UINT32 last_preempt_jiffies;
    ULONG app_release_ticks[MAX_MODEL]; UINT32 app_release_jiffies[MAX_MODEL];
    ULONG app_complete_ticks[MAX_MODEL]; UINT32 app_complete_jiffies[MAX_MODEL];
    uint Nstart; uint Nact; uint N; uint Nonline;
    r_time Cactcomp[MAX_MODEL]; r_time Cactexec[MAX_MODEL]; r_time Tact[MAX_MODEL];
    uint Npreempts; uint Ndispatches;
    uint SoftMissCnt; uint HardMissCnt; uint HardMissReleasesTerminatedCnt;
    r_time HardMissCactcomp[MAX_MODEL]; r_time SoftMissCactcomp[MAX_MODEL];
    uint ReleaseCnt; uint CompleteCnt;
    uint ReleaseError; uint CompleteError; uint ExecError;

    /* On demand or periodic server state and performance */
    r_time Cexpactcomp; r_time Clowactcomp; r_time Chighactcomp;
    r_time Cexpactexec; r_time Clowactexec; r_time Chighactexec;
    r_time Texpact;
    double HardReliability; double SoftReliability;
    r_time ActConfDsoft; r_time ActConfDhard;
};


5.6.1.8.1 RT EPA CB Negotiated Service

The task control is defined by enum task_control, which may be {guaranteed, reliable, besteffort}. The interference assumption used in admission is enum interference_assumption and may be {worstcase, highconf, lowconf, expected}. The execution model may be either {normal, distfree}. Finally, the union model_type is defined as:

union model_type
{
    struct normal_model normal_model;
    struct distfree_model distfree_model;
    struct worst_case_model worst_case_model;
};

The worst-case model is simply the worst-case execution time, r_time Cwc; alternatively the model may be a normal model or a distribution-free model defined by the structures:

struct normal_model
{   /* for normal distribution supplied model */
    r_time Cmu;
    r_time Csigma;
    double HighConf;
    double LowConf;
    double Zphigh;
    double Zplow;
    r_time Ntrials;
};

struct distfree_model
{
    r_time Csample[MAX_MODEL];
    double HighConf;
    double LowConf;
    r_time Ntrials;
};

5.6.1.8.2 RT EPA CB Release and Deadline Specification

All services must specify how they will be released with enum release_type release_type, which may be {external_event, single, internal_timer}. Depending upon the release type specified, the union release_method release_method must be populated with either a VxWorks semaphore or timer identifier based on this union, defined as:

union release_method
{
    SEM_ID release_sem;
    timer_t release_itimer;
};

The semaphore should be a standard VxWorks binary semaphore created with semBCreate, and the timer should be a standard POSIX 1003.1b compliant timer [POSIX93] created with timer_create.
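A short hedged sketch of populating the release method union for the two release styles (variable names are illustrative; the RT EPA header providing union release_method is assumed to be included, and timer period/event configuration beyond creation is omitted):

#include <semLib.h>   /* VxWorks semBCreate */
#include <time.h>     /* POSIX 1003.1b timer_create */

/* Assumes the RT EPA header defining union release_method is included. */
union release_method event_release;   /* for an event-released service */
union release_method timer_release;   /* for a timer-released service  */

void setup_release_methods(void)
{
    /* Event release: a standard VxWorks binary semaphore */
    event_release.release_sem = semBCreate(SEM_Q_FIFO, SEM_EMPTY);

    /* Timer release: a standard POSIX 1003.1b timer */
    timer_create(CLOCK_REALTIME, NULL, &timer_release.release_itimer);
}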


The soft deadline r_time Dsoft specifies the allowable overrun deadline; if it is overrun, the RT EPA calls the FUNCPTR serviceDsoftMissCallback if it is not NULL. The r_time Texp specifies the service release period, and if FUNCPTR serviceReleaseCompleteCallback is not NULL, it will be called at the end of every release. The parameter r_time Dterm specifies the termination deadline; if it is exceeded, the RT EPA will take action according to the enum hard_miss_policy HardMissAction, which may be either {restart, dismissal}. The release code entry point, name, stack size, and stack memory must be specified by FUNCPTR entryPt, name, stackBytes, and the pointer Stack. Finally, the RT EPA creates the unique RTEPA_id handle for every service and maintains an association with the VxWorks sched_tid, WIND_TCB sched_tcb, WIND_TCB *sched_tcbptr, and int assigned_prio.

5.6.1.8.3 RT EPA CB On-Line Statistics and Event Tags

The RT EPA kernel-level monitor tracks the service state in three ways:
1) RTState, which is {0=RT_STATE_NONE, 1=RT_STATE_ADMITTED, 2=RT_ACTIVATED, 3=RT_RESTARTED, 4=RT_DISMISSED, 5=RT_SUSPENDED}.
2) ReleaseState, which is {0=RELEASE_NONE, 1=PEND_RELEASE, 2=RELEASED, 3=RELEASE_COMPLETED}.
3) ExecState, which is {0=EXEC_STATE_NONE, 1=EXEC_STATE_DISPATCHED, 2=EXEC_STATE_PREEMPTED}.
The remaining on-line kernel-updated parameters are self-explanatory; however, it should be noted that the time stamps are all obtained by reading the real-time clock. Typically, the real-time clock is a count-up register state machine clocked by a real-time oscillator which generates an interrupt and resets itself when the register value is equal to the period count. As such, the real-time clock has a frequency (the oscillator frequency) and a period (the number of oscillations before it resets). This is why the time stamps are composed of ticks and jiffies. The ticks are the interrupt count since the operating system was booted and the jiffies are the value of the count-up register. All other RT EPA times are relative times as defined by the type r_time, which in the VxWorks implementation is an unsigned long integer number of microseconds. So, relative times such as Cexp are accurate to a microsecond and have a maximum value of 4294 seconds. It is not anticipated that relative times will exceed 4294 seconds, but absolute times may. As a default, the real-time clock period is set such that the tick period is one millisecond and the jiffies must provide microsecond or better resolution. So, the time stamps have a maximum value of 1193 hours. This again seems reasonable since most systems will require going off-line more frequently than every 49 days. At the very least, these ranges are more than sufficient for the research completed here.

5.6.1.8.4 RT EPA CB On Demand or Periodic Server Computed Performance Statistics

The parameters Cexpactexec, Clowactexec and Chighactexec are respectively the average execution time computed from all on-line samples, the minimum execution time from all on-line
Execution time is for a single release and only includes the sum of all times between the thread dispatches and preemptions over each release. The parameters Cexpactcomp, Clowactcomp, and Chighactcomp are respectively the average, minimum, and maximum response times (time from interrupt-driven release until the final processor yield by the thread). Texpact, Tlowact, and Thighact are respectively the average, minimum, and maximum release periods. The period is the time from event release to event release. The parameters HardReliability and SoftReliability are computed as one minus the HardMissCnt and SoftMissCnt values divided by the total number of service completions. Finally, the parameters ActConfDsoft and ActConfDhard are computed from on-line response times by finding the confidence interval which contains the desired Dsoft and Dterm. The most significant parameters for re-negotiation are HardReliability, SoftReliability, ActConfDsoft, and ActConfDhard, since a service that is missing deadlines can compare the actual confidence in the desired deadline with that requested and then either accept a lower confidence or reconfigure to reduce resource demands. The reliability is computed from all samples, not just those buffered in the on-line model. The difference in sample counts can be large since the MAX_MODEL parameter is 1000 by default. So, the reliability provides a statistically much more significant indication of the ability to meet deadlines given the current service configuration; however, it does not provide an interval, just confidence in one particular deadline.

5.6.1.9 RT EPA Service Negotiation and Configuration Example

The RT EPA service negotiation and configuration API is the primary interface that the real-time system application uses in order to admit threads and bring the system on line. This API is predominately used by the application start-up code, which itself is not considered to have any real-time requirements since the system is by definition not yet on line. Typically this is acceptable since all threads may be configured before enabling hardware interfaces and associated interrupts, which will quickly transition the system from off-line to on-line. So, as an example, a series of event-released data-pipelined threads could be admitted, and the final action of the start-up code would be to enable the source hardware and associated interrupts, put the system on-line, and leave the application under the monitoring and control of the RT EPA from that point on. The following C code segment is an example for a single service system:

    rtepaInitialize(
        (FUNCPTR) service_hard_realtime_safing_callback,
        (PERFORMANCE_MON | DEMOTE_OTHER_TASKS | CREATE_IDLE_TASK),
        active_monitoring_period
    );

    data_ready_event = semBCreate(SEM_Q_FIFO, SEM_EMPTY);

    rtepaPCIx86IRQEventInitialize(data_ready_event, irq, (FUNCPTR) service_isr);

    service_execution_model_initialization();

    if( (test = rtepaTaskAdmit(
             &rtid[0],
             service_type_is_reliable,
             interference_assumption_is_worstcase,
             execution_model_is_normal,
             hard_deadline_miss_policy_is_restart,
             &service_execution_model[0],
             Dsoft[0], Dterm[0], T[0],
             &SoftConf, &TermConf,
             "tService1") ) != ERROR )
    {
        printf("Service1 task %d can be scheduled\n", rtid[0]);
    }
    else
    {
        printf("Service1 task admission error\n");
        return ERROR;
    }

    event_released_type_info.release_sem = data_ready_event;

    rtepaTaskActivate(
        rtid[0],
        (FUNCPTR) service_entry_point,
        (FUNCPTR) service_soft_deadline_miss_callback,
        (FUNCPTR) service_release_complete_callback,
        service_complete_type_is_not_isochronous,
        service_tout_is_zero,
        service_stack_size,
        event_released_type_info,
        online_model_size);

    if( rtepaRegisterPerfMon(
            rtid[0],
            (FUNCPTR) service_renegotiation_callback,
            (ACT_EXEC | ACT_RESP | ACT_FREQ | ACT_HRD_CONF | ACT_SFT_CONF) ) == ERROR )
    {
        printf("Service1 performance monitoring error\n");
    }

    service_source_activate();

The key to the RT EPA initial service negotiation is that it is all off-line, since there may be significant processing required to initially perform admission tests on a large number of threads and to initialize the RT EPA itself. The processing required for re-negotiation will be much less significant and, given a well designed system, infrequent.

5.6.1.10 RT EPA Admission Request and Service Specification

The RT EPA admission request and service specification are made with the single API function rtepaTaskAdmit. The admission request and service specification must be made together
since admission is contingent upon the type of service requested. This is the standard interface through which all services are established before a system is taken on-line. An application can also renegotiate admission through this interface during run time in one of three ways: 1) best effort, 2) with a dedicated re-negotiation service the application must establish in advance, or 3) during previously negotiated time as a soft deadline fault handling procedure. Typically, service re-negotiation while the system is on line will be a combination of fault handling and either best-effort or dedicated-service re-negotiation. Once a service has been successfully negotiated, faults in actual service due to poor execution modeling up front, a poor event-rate model, or programming error will be handled in real-time by the RT EPA as far as protecting other services from overrun interference, but it is up to the service fault handling and the application to handle re-negotiation. The RT EPA on-line model can provide significant help to the application with re-negotiation in terms of missed deadline frequencies, high and low execution times, expected execution time, and response time. For example, a service might be I/O bound, in which case the deadline miss frequency may be high, but the execution times are as expected. This means that the processing resources were sufficient and the execution model was good, but the I/O bandwidth was insufficient to support the frequency and input/output block size of the service.

5.6.1.10.1 Service Type

The service type argument provided to rtepaTaskAdmit must be either guaranteed, reliable, or besteffort. If the service type negotiated is guaranteed, this has a system implication: namely, missing a guaranteed termination deadline means that the RT EPA will call the system safing callback and then terminate all services and itself. As discussed previously, this implements the traditional notion of a hard real-time service. This service type should be negotiated only in circumstances where missing the service deadline will truly result in damage (negative utility) to the system as a whole rather than just to that particular service. Most services are soft, can occasionally be missed without catastrophic system consequence, and should negotiate for reliable service. Reliable means that the RT EPA will allow a soft deadline overrun up to the specified termination deadline, which in reality is a system firewall against uncontrolled overrun interference. When the thread/service overruns its soft deadline, the RT EPA will execute a soft overrun callback in that release context (these callbacks should have much shorter expected execution times than the difference between the soft and termination deadlines). A good example of a missed soft deadline callback action is to renegotiate the service or release frequency, or to reconfigure the data processing if possible to reduce loading.

5.6.1.10.2 Interference Assumption

The interference assumption is fundamental to the overall performance of the RT EPA controlled system. The options for this are worstcase, highconf, lowconf, or expected. For worstcase, the scheduling admission test assumes that all interfering threads may execute up to their termination deadlines and therefore maximum potential interference is assumed for that service, but not for all services/threads, just those that execute to the deadline.
It is important to note that the RT EPA allows the engineer to configure the admission test to have a specific interference assumption for each thread – so, if a service is hard real-time, then it is recommended that the worst-case interference also be assumed for this particular thread. For a soft real-time thread it is possible to assume worst-case interference, which means that any release jitter,
execution jitter, or completion jitter is wholly due to that thread's characteristics (i.e., this might be a beneficial way to localize timing variances). However, since the thread is itself soft real-time, it is pessimistic to assume worst-case interference; therefore three other soft options are provided: highconf, lowconf, and expected. The expected assumption is typically advised for soft real-time threads, since the interference time for all higher priority threads is taken to be their expected execution times, the most likely scenario. This will lead to some release jitter and potential soft overruns and even termination deadline misses, but typically will meet the thread/service requirements. Some services may not require hard real-time performance, but may need very high confidence in meeting deadlines. For this situation, an intermediate option, high confidence, a level between worst-case and expected, can be specified; this assumes that all interfering threads will execute up to their high-confidence execution time interval. The final option of low confidence does not seem very useful other than for completeness and for providing an optimistic interference assumption.

5.6.1.10.3 Execution Model

An execution model must be provided to the RT EPA for each service admitted. Two types of models are supported: a normal distribution model and a distribution-free model, which is simply a set of trials. Providing the model may seem burdensome, but it is necessary to accurately test scheduling feasibility; the RT EPA can also build this model off-line, and it can then be provided as an input for subsequent on-line execution. However, this is only true for distribution-free modeling. A distribution-free model simply takes a set of trials, which are actual execution times, and sorts them in order to compute a confidence interval. One drawback of the distribution-free model is that its accuracy is directly proportional to the size of the model. So, for example, if you want to know the execution time to a 99% confidence interval, you must provide at least 100 trials, and for 99.9% you must provide 1000 trials. The RT EPA currently has an on-line maximum model size to support 99.9% confidence (i.e., 1000 trials per service).

5.6.1.10.4 Termination Deadline Miss Policy

The hard deadline miss policy may be either restart or dismissal. In the case of restart, the RT EPA terminates the current release of a service when it attempts to execute beyond the termination deadline, but all future releases are unaffected. Either way a thread is never allowed to overrun its termination deadline, and therefore overruns and interference are ultimately fully bounded. If the policy selected is dismissal, then the application will have to completely renegotiate admission.

5.6.1.10.5 Release Period and Deadline Specification

The soft and termination deadlines as well as the expected or desired release period must be specified for admission. The deadlines should be based upon system requirements, and the release period definition depends upon the release type, which is actually specified with the task activation interface. The specification of the soft and termination deadlines allows the RT EPA to compute deadline confidences based upon the execution model; these may be retrieved from the admission test by providing a double precision storage location for each, or ignored by providing a null pointer. The deadline confidences may also be retrieved at any time with the rtepaPerfReport API
function call. The confidence is always based on the current set of threads admitted. So, additional admissions will not invalidate the original negotiation, but will erode any margin.

Since threads may be released either by events, by time, or by single request, the period has a meaning specific to each circumstance, but either way admission is based on the same period. In the case of an event released thread, the period is the expected release period due to the external event (e.g. a data ready ISR gives a semaphore). If the actual event rate is higher, then the RT EPA will not release the thread until the specified period is met or exceeded, and period jitter and potential service dropouts will result.

5.6.1.11 Expected Performance Feedback

It is possible to obtain performance information at any time, but the actual computation of performance is done either during the caller's negotiated service time or by a performance service optionally admitted at RT EPA initialization (or it is best effort). If the RT EPA is initialized to perform periodic performance monitoring, then the performance API function calls will simply return the last globally computed value for the parameter; otherwise the function will compute and then return the value. The following C code API function calls may be made to obtain the latest performance information from the RT EPA based on on-line monitoring:

5.6.1.11.1 Global Performance Parameters Update API Functions

The global performance update functions update all performance parameters on demand with:

    int rtepaPerfMonUpdateAll(void);

and likewise for a single service with:

    int rtepaPerfMonUpdateService(int rtid);

5.6.1.11.2 Deadline Performance API Functions

The deadline performance functions each return, on demand, a particular performance value related to service deadlines. These functions include:

    r_time rtepaPerfMonDtermFromNegotiatedConf(int rtid);
    r_time rtepaPerfMonDsoftFromNegotiatedConf(int rtid);
    double rtepaPerfMonConfInDterm(int rtid);
    double rtepaPerfMonConfInDsoft(int rtid);
    double rtepaPerfMonDtermReliability(int rtid);
    double rtepaPerfMonDsoftReliability(int rtid);

5.6.1.11.3 Execution Performance API Functions

The execution performance functions return estimates of execution times from actual monitoring on demand and include:

    r_time rtepaPerfMonCexp(int rtid);
    r_time rtepaPerfMonChigh(int rtid);
    r_time rtepaPerfMonClow(int rtid);

5.6.1.11.4 Release Performance API Functions

The release performance functions return service release performance on demand and include:

    r_time rtepaPerfMonRTexp(int rtid);
    r_time rtepaPerfMonRThigh(int rtid);
    r_time rtepaPerfMonRTlow(int rtid);
    r_time rtepaPerfMonTexp(int rtid);
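A minimal usage sketch of these calls is shown below; the function wrapper, variable names, and printed report format are assumptions for illustration, and only the API functions listed above are used.

    /* Illustrative sketch: querying on-line performance for one admitted
     * service (rtid obtained from rtepaTaskAdmit); assumes the RT EPA API
     * header is included. r_time is an unsigned long count of microseconds. */
    #include <stdio.h>

    void example_report_service_performance(int rtid)
    {
        rtepaPerfMonUpdateService(rtid);   /* refresh this service's parameters */

        printf("Cexp=%lu Chigh=%lu Clow=%lu (usecs)\n",
               rtepaPerfMonCexp(rtid),
               rtepaPerfMonChigh(rtid),
               rtepaPerfMonClow(rtid));

        printf("conf(Dsoft)=%f conf(Dterm)=%f soft rel=%f term rel=%f\n",
               rtepaPerfMonConfInDsoft(rtid),
               rtepaPerfMonConfInDterm(rtid),
               rtepaPerfMonDsoftReliability(rtid),
               rtepaPerfMonDtermReliability(rtid));
    }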

5.6.1.12 RT EPA Task Activation and Execution Specification

The RT EPA provides an API for task activation and for specification of callbacks for soft deadline misses and for normal completion.

5.6.1.12.1 Service Execution Entry Point and Soft Deadline Miss Callback

The soft deadline miss callback affords the service the opportunity to reconfigure (e.g. frequency or algorithm complexity) before its termination deadline. Typically, the time remaining before the termination deadline after a soft deadline miss will be short, but most likely sufficient for a service to at the very least change frequency and/or handle related faults.

5.6.1.12.2 Service Release Complete Isochronal Callback

In order to simplify implementation of isochronal pipelines, which require regular output rates for applications like digital control (stability is affected by actuator output regularity) and continuous media, the RT EPA API provides a release completion callback. This callback is made by the RT EPA according to a specified delay of up to one period, immediately preceding the release period end. This allows for de-coupling of execution completion and response output so that isochronal outputs can be guaranteed despite jitter in the phase between release and completion.

5.6.1.12.3 Release Type and Event Specification

The RT EPA provides for three types of thread releases: 1) event released, 2) time released, and 3) single release. In reality, all threads could be considered event released, since a time released thread is really released by a clock event and a single release is released by a request event, but these types are useful from a practical standpoint depending upon the application's needs for sequencing execution. For example, the data source in the system might need to be polled, in which case a time released thread can be set up on this interface so it can periodically check interface status and service it as needed. Alternatively, a data source may provide interrupts when data is ready, in which case it is easiest to associate a semaphore with the interface ISR so that the RT EPA can release the interface servicing thread based on interrupts. In this case the semaphore, externally created, must be provided to both the RT EPA and the ISR. Finally, the single release thread provides a good option for exceptional events such as an out-of-band user request (e.g. dump diagnostic information while the system is on-line) or for non-critical fault handling. In this case, the application will admit the thread for the single release, and the RT EPA will automatically dismiss it upon release completion. Critical fault handling typically should be handled by a real-time periodic monitor at a reserved service level.


5.6.1.12.4 Service On-Line Model Size

The on-line model size directly drives the ability to estimate confidence intervals. Distribution-free confidence interval accuracy is derived from the number of samples, and the confidence possible is 1.0 - (1/N), so that, for example, given an on-line model size of 100, confidence interval accuracy is limited to 99%. The current maximum on-line model size is 1000, providing accuracy to 99.9%.

5.6.1.13 Service Performance Monitoring Specification

The RT EPA has a negotiation monitoring capability that may be scheduled just like any other RT EPA task (so it actually is self monitoring). This task periodically checks the on-line model of execution time, response time, thread release frequencies, soft deadline confidence, and hard deadline confidence. The frequency with which the RT EPA monitor runs is of course dependent upon system requirements and available resources, but typically it is a much lower frequency than execution control, which runs at the same frequency as the aggregate frequency of all RT EPA threads. This periodic monitoring capability simply reduces data collected on a context switch basis and compares on-line performance to the negotiated service and provided execution model. When there are discrepancies, the RT EPA monitoring task executes a callback function on behalf of the application thread during its service time; typically this can be a soft real-time thread itself, however the callbacks should be short and typically involve system reconfiguration, for example changing the release frequency, reconfiguring the algorithm for a release, or eliminating the thread from the active set (a sketch of such a callback follows). The performance monitoring provides updates to the expected C and T parameters as well as re-estimating confidence in Dterm and Dsoft. The API simply provides specification of the monitoring rate and interface.
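The sketch below, referenced above, illustrates one possible shape for such a monitoring callback; the (rtid, mask) signature and the two application helpers are assumed conventions for illustration, not part of the RT EPA API. The mask bits are those used with rtepaRegisterPerfMon in the earlier example.

    /* Illustrative sketch of a re-negotiation callback (hypothetical). */
    extern void request_degraded_release_period(int rtid); /* hypothetical helper */
    extern void flag_execution_model_fault(int rtid);      /* hypothetical helper */

    void example_renegotiation_callback(int rtid, int below_negotiated_mask)
    {
        /* Response times eroding soft-deadline confidence: degrade gracefully
         * by asking the application to renegotiate a lower release frequency. */
        if (below_negotiated_mask & (ACT_SFT_CONF | ACT_RESP))
            request_degraded_release_period(rtid);

        /* Measured execution time worse than the supplied model: the model is
         * suspect and should be re-characterized off-line. */
        if (below_negotiated_mask & ACT_EXEC)
            flag_execution_model_fault(rtid);
    }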

5.6.2 RT EPA Kernel-Level Monitoring and Control

The kernel-level monitoring and control provided by the RT EPA is fundamental to tracking event performance and to controlling releases. The kernel monitor captures dispatch and preemption times as well as completion time, and detects missed soft deadlines and hard deadlines. From this basic information capture, the RT EPA performance monitor and the service release wrapper code can detect release faults and prevent hard deadline overruns. The wrapper code around releases is minimal, but required, since the wrapper actually provides the missed hard deadline termination; this wrapper code executes in the same context as the service. All RT EPA guaranteed and reliable services are intended to execute in kernel space.

5.6.2.1 Event Release Wrapper Code

The RT EPA supports event release of pipeline service threads from interrupts or by internal events such as completion of processing by another pipeline service. The association of the release to the service is made using rtepaPCIx86IRQReleaseEventInitialize for a hardware interrupt release or using rtepaPipelineSeq for a software pipeline event release. The RT EPA takes the service code and wraps it with a function that provides the generic event release capability, over-run control, stage sequencing, and maintenance of the service release status. The best way to describe the RT EPA event release wrapper is to examine each part in detail, starting with release of the source device interface service code, typically released by a hardware interrupt,
and ending with the sink device interface code, which must meet overall pipeline deadline and isochronal output requirements. The source device interface service code must be associated with the source interrupt as a first step in specifying an in-kernel pipeline.

5.6.2.1.1 ISR Release Wrapper Code

A pseudo code specification of the interrupt service routine release wrapper code is given here (please refer to Appendix A for the full source code API specification).

    void rtepaInterruptEventReleaseHandler(int rtid)
    {
        time_stamp_isr_release(rtid);

        /* call the application's registered ISR for the device */
        (*rtepa_int_event_release_table[rtid].app_isr)();

        update_release_timing_model(rtid);

        RTEPA_CB[rtid].ReleaseState = RELEASED;
        RTEPA_CB[rtid].ReleaseCnt++;

        /* release the wrapped service thread */
        semGive(rtepa_int_event_release_table[rtid].event_semaphore);
    }

5.6.2.1.2 RT EPA Event Release Wrapper Code

A pseudo code specification of the event release wrapper code is given here (please refer to Appendix A for the full source code API specification).

    void event_released_rtepa_task(int rtid)
    {
        if(never_released(rtid) && no_hard_misses(rtid))
            event_released_rtepa_task_init(rtid);

        RTEPA_CB[rtid].ReleaseState = PEND_RELEASE;

        while(1)
        {
            /* block until the release event (semaphore) arrives */
            semTake(RTEPA_CB[rtid].release_method.release_sem, WAIT_FOREVER);

            /* arm the termination-deadline watchdog for this release */
            release_watchdog_timer_settime(rtid);

            /* execute the service release */
            (*RTEPA_CB[rtid].entryPt)();

            release_watchdog_timer_cancel(rtid);

            /* hold isochronal output until just before the period end */
            if(RTEPA_CB[rtid].complete_type == isochronous)
            {
                delay_as_needed(rtid);
            }

            /* release any downstream pipeline stages */
            for(i=0; i < RTEPA_CB[rtid].NStages; i++)
            {
                handle_next_stage_release(rtid, i);
            }

            if(RTEPA_CB[rtid].serviceReleaseCompleteCallback != NULL)
                (*RTEPA_CB[rtid].serviceReleaseCompleteCallback)();

            RTEPA_CB[rtid].ReleaseState = RELEASE_COMPLETED;
        }
    }

5.6.2.2 Dispatch and Preempt Event Code

A pseudo code specification of the dispatch and preempt kernel event code is given here.

    void RTEPA_KernelMonitor(WIND_TCB *preempted_TCB, WIND_TCB *dispatched_TCB)
    {
        /**** RTEPA TASK DISPATCH ****/
        if((rtid = rtepaInTaskSet(dispatched_TCB)) != ERROR)
        {
            RTEPA_CB[rtid].ExecState = EXEC_STATE_DISPATCHED;
            RTEPA_CB[rtid].Ndispatches++;
            update_dispatch_statistics(rtid);
        }

        /**** RTEPA TASK PREEMPTION ****/
        if((rtid = rtepaInTaskSet(preempted_TCB)) != ERROR)
        {
            RTEPA_CB[rtid].ExecState = EXEC_STATE_PREEMPTED;
            RTEPA_CB[rtid].Npreempts++;
            update_preempt_statistics(rtid);

            /******** CASE 1: Release Completed ********/
            if(RTEPA_CB[rtid].ReleaseState == RELEASE_COMPLETED)
            {
                record_completion_time(rtid);
                compute_cpu_time_for_release(rtid);
                compute_time_from_release_to_complete(rtid);
                update_deadline_performance_model(rtid);
                RTEPA_CB[rtid].CompleteCnt++;
                RTEPA_CB[rtid].ReleaseState = PEND_RELEASE;
            }
            /******** CASE 2: Release In-Progress Preempted ********/
            else
            {
                RTEPA_CB[rtid].Ninterferences++;
                update_cpu_time_and_response_time_models(rtid);
            }
        }
    }
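As a point of reference only (the thesis does not state the exact hook mechanism here), a monitor with this (old TCB, new TCB) parameter pair matches the VxWorks context switch hook calling convention, so one plausible way to install it is sketched below; this is an assumption about the integration, not a statement of how the RT EPA is actually wired in.

    /* Illustrative sketch only: installing the kernel monitor as a VxWorks
     * task switch hook, which is called with the outgoing and incoming
     * task control blocks on every context switch. */
    #include <vxWorks.h>
    #include <taskLib.h>
    #include <taskHookLib.h>

    extern void RTEPA_KernelMonitor(WIND_TCB *preempted_TCB, WIND_TCB *dispatched_TCB);

    STATUS example_install_kernel_monitor(void)
    {
        return taskSwitchHookAdd((FUNCPTR) RTEPA_KernelMonitor);
    }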


5.6.2.3 Release Frequency

The period between releases is tracked by the RT EPA so that release jitter can be detected. The release period is taken from a release event, which must be specified in terms of either a timer or a semaphore associated with an external event (e.g. an interrupt). The jitter between releases will be due to clock jitter or interrupt source jitter and is expected to be low; however, if an event rate is assumed to be periodic and in fact it is not, this will be easily detected through this period jitter. It is important to note that if release jitter becomes significant, this most likely indicates that the environmental model for the event rate is faulty, and this will not necessarily be localized to the current thread's deadline performance, but will cause more or less interference than is expected, potentially causing other services to have deadline failures. Therefore, tracking this period allows the application to monitor event rates and handle system level faults.

5.6.2.4 Execution Time

The kernel monitor computes execution time by detecting operating system scheduler dispatches and preemptions. The execution time of course does not include interference time when a release is preempted and must be dispatched again. The execution time will always be less than the response time due to release latency and due to the potential for release execution interference.

5.6.2.5 Response Time

The response time is the best figure of merit for real-time system performance, since the response time must be less than the relative deadlines and is an aggregate measure of end-to-end latency and jitter. The RT EPA determines response time through the use of an event descriptor table and kernel-level completion detection. The event descriptor table is time stamped when the release timer expires or when the event release interrupt is asserted. Ultimately all event releases must be tied to a hardware interrupt either directly or indirectly. Therefore, the RT EPA provides an ISR registration API function which wraps a traditional ISR entry point with the required RT EPA event descriptor table updates. For timer releases, the normal RT EPA wrapper function provides the event descriptor table update. When a service actually completes, the response time is computed by subtracting the event descriptor table release time (the time the real world event was first detected by the system) from the kernel-level release completion time.

5.6.2.6 Deadline Miss Management

Management of missed deadlines is provided by the RT EPA for both soft and termination deadlines as well as traditional hard deadlines. Table 6 summarizes the RT EPA action in each of the three cases.


Table 6: RT EPA Deadline Management Summary

Deadline Miss Type | RT EPA Action
Soft | a) The soft deadline miss is noted in the kernel-level monitoring for that service; b) the registered callback for a soft deadline miss is called at the next event release.
Termination | a) The termination deadline miss is noted in the kernel-level monitoring for that service; b) the termination timer asynchronously terminates the current release if the miss policy is restart, and otherwise the thread is deactivated and dismissed from the current set of on-line threads.
Hard (same as Termination, but for the guaranteed service level rather than reliable or best-effort) | a) The system safing callback for a guaranteed service deadline miss is called; b) the RT EPA is completely shut down.

5.6.2.6.1 Terminate Execution that would Exceed Hard Deadline

A fundamental feature of the RT EPA design, and a requirement in order to preserve the integrity of confidence-based scheduling, is that every thread release must have a termination deadline. The RT EPA provides support for both soft and hard real-time thread releases, so if a soft real-time thread were, for example, allowed to overrun indefinitely, it would introduce unbounded interference to the system. Several policies for thread deadline overrun control were considered. First, the RT EPA provides for specification of a soft deadline which a thread release is allowed to overrun, and the RT EPA simply provides an application-specific callback mechanism so that the application control may decide how to handle this execution fault. For example, if a thread is missing soft deadlines, then the application may want to renegotiate the thread release frequency for a degraded mode. Second, the RT EPA provides for specification of a harder termination deadline; if a thread attempts to overrun it, the current release of that thread will be terminated, but future releases will still be made. As an option, the RT EPA allows for specification of the policy on hard deadline misses to be either restart or dismissal. The default is restart, and therefore future releases of the thread will continue to be made; if the policy selected is dismissal, then the thread will be completely removed from the currently admitted set of on-line threads, and any future execution of this thread would require re-negotiation by the application to readmit the thread.

5.6.2.6.2 Hard Deadline Miss Restart Policy

The RT EPA hard deadline miss restart policy works by setting up a timing watchdog on the current release of each thread; if the thread does not complete its release and yield the processor prior to expiration of this watchdog timer, then the RT EPA will asynchronously terminate the release, but will enable future releases of the same thread from the thread's normal entry point. This is done in the RT EPA by wrapping all thread release entry points with the following C code:
    while(1)
    {
        if(RTEPA_CB[rtid].FirstRelease)
            event_released_rtepa_task_init(rtid);

        RTEPA_CB[rtid].ReleaseState = PEND_RELEASE;

        /* Wait for event release */
        semTake(RTEPA_CB[rtid].release_method.release_sem, WAIT_FOREVER);

        /* Handle previous release soft deadline miss */
        if( (RTEPA_CB[rtid].ReleaseOutcome == SOFT_MISS) &&
            (RTEPA_CB[rtid].SoftMissCallback != NULL))
            (*RTEPA_CB[rtid].SoftMissCallback)();

        /* Set termination watchdog */
        timer_settime( RTEPA_CB[rtid].Dterm_itimer,
                       RTEPA_CB[rtid].flags,
                       &(RTEPA_CB[rtid].dterm_itime),
                       &(RTEPA_CB[rtid].last_dterm_itime) );

        /******** Release execution ********/
        (*RTEPA_CB[rtid].entryPt)();

        /* Cancel termination watchdog */
        timer_cancel(RTEPA_CB[rtid].Dterm_itimer);

        if(RTEPA_CB[rtid].ReleaseCompleteCallback != NULL)
            (*RTEPA_CB[rtid].ReleaseCompleteCallback)();

        RTEPA_CB[rtid].ReleaseState = RELEASE_COMPLETED;
    }

The watchdog timer will be canceled prior to the termination deadline in the typical case.

5.6.2.6.3 Termination Deadline Miss Dismissal Policy

The RT EPA provides two policies for missed termination deadlines: 1) the current release is terminated, but all future releases are unaffected, or 2) the service is deactivated and the thread is dismissed from the on-line thread set.

5.6.3 Performance Monitoring and Re-negotiation

RT EPA performance monitoring can be accomplished in two ways: 1) active monitoring as a service or 2) passive monitoring during the execution time of any particular service. In the case of active monitoring, the RT EPA is initialized with the performance monitoring option enabled and a frequency specified. The RT EPA has an internal execution model for this specialized service, and it actually admits the service just like any other service. In this case, performance of all threads in the system is computed periodically, and the last computed performance is available through a low-cost referencing function call to any requester. Furthermore, the performance monitor will make callbacks to any service which registers with it during system initialization, at the end of each service release, to indicate performance that is below the negotiated level. This callback has an event mask for each performance parameter: ACT_EXEC = expected actual execution time, ACT_RESP = expected actual response time, ACT_FREQ = expected actual release period, ACT_HRD_CONF = expected termination deadline confidence, and ACT_SFT_CONF = expected actual soft deadline confidence. In the passive mode it is up to the services to poll the performance through the rtepaCurrentPerf API function call, which computes performance parameters given the current on-line model and returns a mask indicating which parameters are not at or better than negotiated levels.

5.6.3.1 Soft Deadline Confidence

The soft deadline confidence is computed by performance monitoring by computing the confidence in the response time being less than the soft deadline; this can be an expensive computation for the on-line distribution-free model. As an alternative, it is possible to request the soft deadline reliability, which is simply a computation based upon the number of missed deadlines out of the total number of completions since the RT EPA went on-line.

5.6.3.2 Hard Deadline Confidence

The hard deadline confidence is computed by performance monitoring by computing the confidence in the response time being less than the termination deadline; this can be an expensive computation for the on-line distribution-free model. As an alternative, it is possible to request the termination deadline reliability, which is simply a computation based upon the number of missed deadlines out of the total number of completions since the RT EPA went on-line.
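To make the cost of the distribution-free computation concrete, the sketch below shows one straightforward way such a confidence could be evaluated from buffered response-time samples; the function name, the qsort-based approach, and the sample buffer layout are assumptions for illustration, not the RT EPA's internal implementation.

    /* Illustrative sketch: fraction of the N most recent response-time samples
     * that met a deadline. Sorting (O(N log N)) is what makes this relatively
     * expensive when invoked frequently on a 1000-entry on-line model. */
    #include <stdlib.h>

    static int cmp_r_time(const void *a, const void *b)
    {
        unsigned long x = *(const unsigned long *)a;
        unsigned long y = *(const unsigned long *)b;
        return (x > y) - (x < y);
    }

    double example_distfree_deadline_confidence(unsigned long *resp_samples,
                                                int n, unsigned long deadline)
    {
        int i, met = 0;

        qsort(resp_samples, n, sizeof(unsigned long), cmp_r_time);

        /* samples are ascending, so stop at the first one past the deadline */
        for (i = 0; i < n && resp_samples[i] <= deadline; i++)
            met++;

        return (double)met / (double)n;   /* resolution limited to 1 - 1/n */
    }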


6 The Confidence-Based Scheduling Formulation

The RT EPA on-line admission test is provided by a mathematical formulation which is an extension of the DM (Deadline Monotonic) equations developed by Audsley and Burns at the University of York. The DM equations have been extended to handle expected execution time and the RT EPA Dterm upper bound on overruns. The expected execution time is calculated from an execution model which provides a confidence interval for execution, which in turn allows derivation of the deadline confidence for the service.
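For reference, the baseline deadline monotonic feasibility test being extended is well established in the literature (Audsley et al.); a generic statement of it is sketched below. The thesis's CBDM extension substitutes confidence-interval execution times for the interfering C terms and bounds each thread's interference by its termination deadline, so the exact extended form differs from this baseline.

\[
\forall i:\;\; C_i + I_i \le D_i,
\qquad
I_i \;=\; \sum_{j \in hp(i)} \left\lceil \frac{D_i}{T_j} \right\rceil C_j
\]

Here $hp(i)$ denotes the set of threads with higher (shorter-deadline) priority than thread $i$, $C_j$ is the execution time assumed for interfering thread $j$, $T_j$ is its period, and $D_i$ is the deadline being tested for thread $i$.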

6.1 RT EPA CBDM Concept

The concept of RT EPA CBDM thread scheduling for services and pipeline stages is based upon a definition of soft and termination deadlines in terms of utility and potential damage to the system controlled by the application [Bu91]. The concept is best understood by examining Figure 12, which shows response time utility and damage in relation to soft and termination deadlines as well as early responses.

Figure 12: Execution Events and Desired Response Showing Utility (figure labels: release start time; earliest possible, earliest desired, desired optimal, and latest desired response; termination; response utility and response damage; time utility curve over the desired response interval; computation time distribution with best-case execution, Cexpected, and WCET; hold early response; Rmin and Ropt buffered responses; Clow/Dsoft signal; Chigh/Dterm signal and abort; response failure: dropout degradation)

The RT EPA design provides callback registration for the controlling application; the RT EPA executes the registered callbacks when either the soft or termination deadline is missed, and specifically aborts any thread not completed by its termination deadline. The RT EPA allows execution beyond the soft deadline, which is a bounded overrun. Signaled controlling applications can handle deadline misses according to specific performance goals, using the RT EPA interface for renegotiation of service. For applications where missed termination deadline damage is catastrophic (i.e. the termination deadline is a "hard deadline"), the service and/or pipeline must be configured for guaranteed service rather than reliable service. In this case, the entire system will be safed. The
key to the extended DM formulation for CBDM (Confidence-Based Deadline Monotonic Scheduling) is that overruns are always bounded and therefore interference in the system also ultimately has a mathematical upper bound. An extension to the well established DM scheduling policy and scheduling feasibility test is used in the RT EPA due to its ability to handle execution where deadline does not equal period, and because of the iterative nature of the admission algorithm, which provides for better determination of which threads can and cannot be scheduled rather than treating the thread set monolithically [Au93]. The RMA least upper bound simply provides scheduling feasibility for the entire set and no indication of what subsets may be able to be scheduled. Given the DM basis of CBDM, it is possible to renegotiate based on specific thread admission failures. This capability is often valuable for the applications to be supported. One major drawback of the traditional DM scheduling policy is that to provide a guarantee, the WCET of each pipeline stage or service thread must be known along with the release period. The CBDM extension, on the other hand, provides an option for reliability-oriented applications, where occasional soft and termination deadline failures are not catastrophic but simply result in degraded performance: the reliable service option provides quantifiable deadline assurance given expected execution time. Despite the ability to opt for no guarantee, this reliable test and execution control does not just provide best effort execution. Instead, a compromise is provided based on the concept of execution time confidence intervals and the RT EPA execution control combined with the CBDM admission test based on expected execution times and bounded overruns. An example of the CBDM admission test is given here with a simple two-thread scenario. The CBDM scheduling feasibility test eases restriction on the DM admission requirements to allow threads to be admitted with only expected execution times (in terms of an execution confidence interval), rather than requiring deterministic WCET. The expected time is based on off-line determination of the execution time confidence interval. Knowledge of expected time can be refined on-line by the RT EPA kernel-level monitoring features each time a thread is run. By easing restriction on the WCET admission requirement, more complex processing can be incorporated, and pessimistic WCET with conservative assumptions (e.g. cache misses and pipeline stalls) need not reduce the utility of performance-oriented pipelines which can tolerate occasional missed deadlines (especially when the probability of misses is quantified).
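Independently of the thesis's own two-thread example (which is not reproduced here), the iterative nature of a DM-style admission test can be illustrated with the following hedged sketch: threads sorted by deadline are checked one at a time against the interference of the already-admitted, higher-priority set, so a failure identifies the specific thread that cannot be admitted. All names, types, and the exact feasibility check are illustrative assumptions.

    /* Illustrative sketch of an iterative DM-style admission loop (not the
     * RT EPA's actual code): threads, pre-sorted by ascending deadline, are
     * admitted individually rather than rejecting the whole set at once. */
    typedef struct
    {
        unsigned long C;   /* execution time assumed for interference, usecs */
        unsigned long D;   /* relative deadline, usecs                       */
        unsigned long T;   /* release period, usecs                          */
    } example_thread_model;

    int example_iterative_admission(example_thread_model *set, int n)
    {
        int i, j;

        for (i = 0; i < n; i++)
        {
            /* interference from already-admitted higher priority threads */
            unsigned long interference = 0;

            for (j = 0; j < i; j++)
                interference += ((set[i].D + set[j].T - 1) / set[j].T) * set[j].C;

            if (set[i].C + interference > set[i].D)
                return i;   /* thread i cannot be admitted with this model */
        }
        return -1;          /* all n threads pass the feasibility check */
    }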

6.2 CBDM Deadline Confidence from Execution Confidence Intervals

CBDM provides an extended version of the DM scheduling feasibility tests, which consider computation time and interference for a thread set, and therefore the viability of a response time less than a deadline. Fundamental to the CBDM extension is that release latency is due predominately to interference by higher priority threads, but execution latency is due to both interference and execution jitter. The DM equations account for interference, but do not account for execution jitter; rather, the DM equations assume worst-case execution time only. The basic DM scheduling formulas are extended by CBDM to return the expected number of missed soft and termination deadlines to the controlling application, given the provided execution model, which quantitatively incorporates execution jitter. Latency and jitter in dispatch/preemption is not considered here. For
this capability, when a module is loaded, the computation time must be provided with a sufficient sample set for distribution-free confidence estimates, or with an assumed distribution and a smaller sample set of execution times measured off-line. From this, the computation time used in the scheduling feasibility tests is computed based upon the desired confidence for meeting soft and termination deadlines. All interfering threads are pessimistically assumed to run to their termination deadline, where they either will have completed or are aborted. For example, for thread i, let C(i) = expected execution time; Dsoft(i) = soft deadline; Dterm(i) = termination deadline; and T(i) = period; with the DM condition that C(i) <= Dsoft(i) <= Dterm(i) <= T(i).

Task | Frequency (Hz) | Period (msec) | Rationale
... | ... | ... | Rsp must sum to less than 200 msecs including overhead = 36+63+62+37 = 198
tCmdNormal | 5 | 200 | (same as above)
tSiFIFODrv | 15.25878 | 65.536 | Si half-full FIFO rate for 524.288 msec frame time
tSiDp | 15.25878 | 65.536 | Si slice rate for 524.288 msec frame time
tMIPSExpMgr | 1.430615 | 699 | Ge Only subgroup production rate + Si raw subgroup production rate (2096/3)
tSiHtrMon | 1 | 1000 | Si heater control must run every second and is released by ADC semaphore
tADC | 1 | 1000 | Analog data collection must run every second and should complete BEFORE limits or telemetry so that data is not stale
tLIM | 1 | 1000 | Checked every second after ADC
tDIAG | 1 | 1000 | Handles commands (max rate once a second)
tIM | 1 | 1000 | Handles commands (max rate once a second)
tCKSUM | 1 | 1000 | Performs check sum (max rate once a second)
tSdmSend | 1 | 1000 | Handles SDM send requests (max rate is Ge only every MIPS second)
tTLM | 0.25 | 4000 | Telemetry collection (rate is every 4 seconds, but deadline is half to prevent stale data)
tPKT | 0.25 | 4000 | Packet builder (rate is every 4 seconds, but deadline is half to prevent stale data)
tIRSExpMgr | 0.5 | 2000 | Maximum group production rate for IRS is every 2 seconds (inactive in MIPS modes)
tShell | 0.1 | 10000 | The shell has no real hard deadline
tLogTask | 0.1 | 10000 | The LogTask has no real hard deadline
tSCRUB | 0.01 | 100000 | The memory scrubber is best effort
Deadline (msec) column: 10, 30, 32, 35, 36, 37, 60, 61, 62, 63, 65, 66, 700, 701, 702, 900, 906, 907, 909, 910, 2000, 2001, 2002, 10000, 10000, 100000
priority column: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51, 54, 57, 60, 63, 66, 69, 72, 75, 78

8.3.2 MIPS Exposure-Start Reference Timing Model

The reference exposure-start timing model shows the relative order of expected events and times as well as actuals on the Rad6k microprocessor. In all of the MIPS FSW observing modes (photometry, sky-scan, super-resolution, total power, and spectral-energy-distribution), the MIPS FSW, HTG, and TPG state machines are in a ready state. The ready state is defined by the following CE hardware and software conditions: 1) the HTG state machine is executing the ready image cycle (2 MIPS seconds), is asserting the TPG reset address lines, and is producing HTG frames with valid engineering analog and digital status data, but invalid Ge detector science; 2) the TPG is slaved to the HTG and under its control through the address line interface between the HTG and TPG and is executing the Si detector reset timing pattern; 3) the Ge DP task (tGeDP) and the Ge FIFO driver task (tGeFIFODrv) are acquiring frames from the IOB Ge FIFO and extracting the analog and digital status engineering data from them. The ready state of the CE hardware and software provides for Ge detector telemetry collection by the HTG between exposures, and therefore an exposure is started by synchronously reconfiguring the HTG for the exposure while it is already running in the ready state; the exposure configuration and data acquisition will start on the next HTG IC. The synchronization is achieved through a combination of hardware support (HTG synchronous latching of double buffered registers into the state machine prior to each IC start) and software support, whereby the MIPS Detector Command task (tDetCmd) writes out the HTG register commands for the exposure within a timing window that is not too close to an HTG IC boundary. If the MIPS FSW encounters the HTG ready IC in the "Unsafe Hold-Off Region", then it waits 4 frames plus a patchable additional delay in order to synchronize with the HTG on the next ready IC. This synchronous exposure command window is depicted in Figure 22.

Figure 22: MIPS Mode HTG Ready Exposure-Start HW/SW Synchronization Window (figure labels: Safe Synchronous Exposure-Start Window; Unsafe Hold-Off Region; 1573, 2097, and 2621 msec markers; Si Bst, Si Rst, and Si Smpl frames with Ge frames; HTG Ready IC followed by HTG Exposure IC)

The FSW scheme for synchronizing with the HTG results in two timing models for both SUR and Raw MIPS exposure starts. These two cases are referred to as Start Case A and Start Case B. Start Case A, the worst case exposure delay, is the result of an asynchronous command arriving inside the Unsafe Hold-Off Region. In this case, the MIPS FSW must delay the exposure start because the HTG image cycle is too near the ready image cycle end for the FSW to safely transition the state machine to exposure settings (i.e. race conditions between the state machine and the data processing software would otherwise cause data processing and configuration errors). So, in Start Case A, the software iteratively delays until the unsafe region has passed, but it does complete the science generation response and Si DP initialization in the meantime, and finally synchronizes the exposure start near the beginning of the next image cycle. Therefore, in Start Case A, after command receipt, the exposure image cycle will not start for at least one full image cycle plus some portion of the hold-off time (between 2097 and 2855 msecs, assuming that pc_ge_dp_WriteSafeICFrameCnt is maintained at the current setting of 4 frames). For Start Case A, the resynchronization will occur between 756 and 1282 msecs after the exposure command is received (756 being the hold-off time + the write safe wait time; the worst case of 1282 includes 2 frames of error in this iterative synchronization approach). An example of Start Case A is shown in Figure 23.

Figure 23: MIPS Exposure Start Worst Case Delay (Case A) (figure labels: Worst Case Exposure Start Delay; Unsafe Hold-Off Region (524 msecs); Safe Synchronous Exposure-Start Window; 1573, 2097, and 2621 msec markers; HTG Exposure IC, HTG Ready IC, and HTG Ready IC resynch; Write Safe Wait (234 msecs); Exp Cmd Received; HTG Resynch)

Start Case B, the best case exposure start from the standpoint of minimum delay to exposure start, occurs when the exposure command is received just before the Unsafe Hold-Off Region. In this case, it is safe to reconfigure the HTG state machine since sufficient time exists prior to the end of the current image cycle. If the command is processed before entry into the unsafe region, then the exposure image cycle will occur somewhere between 524 msecs and 2097 msecs after exposure command receipt. Furthermore, the resynchronization will occur as quickly as 20 msecs from command receipt, but definitely before 524 msecs. This exposure start scenario is depicted in Figure 24.


The MIPS FSW scheme for synchronizing software configuration, control, and data processing with the state machine is based on several key system design facts (an illustrative sketch follows Figure 24):

1) The HTG state machine is already running an image cycle prior to commanded exposure starts, asynchronously with respect to command processing (this enables telemetry acquisition when not taking exposure data).

2) The HTG state machine does provide double-buffering of key control and configuration registers, but the synchronous latching of these registers into the state machine must still NOT be commanded by the MIPS FSW too close to an image cycle boundary (i.e. no closer than a partial frame from the boundary, or less than 131 msecs).

3) The MIPS FSW has knowledge of where the HTG state machine is in the image cycle to +/- 2 HTG frames (264 msecs), based upon the fact that the HTG FIFO driver is always holding a partial frame (due to the modulus between the size of frames, 1152 words, and the size of a half-full FIFO, 2048 words) and based upon the frequency of the GeDP task releases and processing time.

4) The DetCmd task, as noted in Table 19, has a deadline of 62 msecs, and therefore the time from release between checking the HTG state in GeDP and actually completing the software-hardware synchronization can be a good portion of this time.

Given facts 1 to 4 above, it was determined that a hold-off region of 4 HTG frames was sufficient worst case. While the MIPS FSW HTG synchronization does cause exposure start jitter of up to 2 MIPS seconds, it guarantees that the software data processing, the HTG state machine, and the software control of the state machine are fully synchronized no matter when an exposure start command is received and processed.

Figure 24: MIPS Exposure Start Best Case Delay (Case B) (figure labels: Unsafe Hold-Off Region (524 msecs); Safe Synchronous Exposure-Start Window; 1573 and 2097 msec markers; HTG Exposure IC and HTG Ready IC; SUR/Raw Exp Cmd; HTG Resynch)
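The sketch below, referenced above, illustrates the kind of frame-count arithmetic implied by these facts; the function, macro names, and the idea of passing the image cycle length in frames are assumptions made for the illustration and are not taken from the flight software.

    /* Illustrative sketch (not flight code): deciding whether an exposure
     * command received at the current estimated HTG frame position falls
     * inside the 4-frame unsafe hold-off region at the end of the ready
     * image cycle, allowing for the +/- 2 frame position uncertainty. */
    #define HOLDOFF_FRAMES        4
    #define FRAME_POSITION_ERROR  2

    int example_in_unsafe_holdoff(int current_frame, int frames_per_ic)
    {
        /* be conservative: assume the HTG may be ahead of our estimate */
        int worst_case_frame = current_frame + FRAME_POSITION_ERROR;

        return (worst_case_frame >= (frames_per_ic - HOLDOFF_FRAMES));
    }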

The MIPS exposure start sequence and timing are very similar for both Raw and SUR (Sample Up-the-Ramp) exposures; however, they do differ slightly, so Table 19 enumerates the
order of expected events for an SUR exposure start, and Table 20 enumerates expected events for a Raw exposure start. Both tables include actual measurements made on the flight microprocessor system along with tolerances derived from this model validation. While the deadlines are generous and therefore allow significant jitter in events, the ordering should not change except as noted for Start Case A and B due to the variations possible in software synchronization with the detector timing generator hardware. Furthermore, please note that the model includes validation of both Start Case A and Start Case B. It is important that the regression tester be aware of the two start cases and account for the possibility by verifying both cases or at least one of the possible cases.

Table 19: MIPS SUR C0F2N2 Exposure Start Vmetro Time Tags

C Code Marker / I/O Board Event | Bus Tag {Port, Data} | Case A (msecs) | Case B (msecs) | Deadline A or B (msecs)
SUR Exposure Command Received (collection trigger) | 0xC434, 0x5523 | 0.0 | 0.0 | N/A
EXP_PARSED_AND_INIT | 0x904, 0x9801 | 2.335 | 2.247 | 3.0
EXP_SI_SUBGRP_MQ_CHECKED | 0x904, 0x9802 | 2.552 | 2.359 | 3.4
EXP_GE_SUBGRP_MQ_CHECKED | 0x904, 0x9803 | 2.585 | 2.385 | 3.8
EXP_SI_FIFO_RESET_DONE | 0x904, 0x9804 | 3.661 | 2.473 | 4.2
EXP_INTS_CONNECTED | 0x904, 0x9805 | 3.705 | 3.517 | 4.6
EXP_SUR_DP_SETUP_DONE | 0x904, 0x9806 | 3.766 | 3.580 | 5.0
EXP_FIRST_SDM_ALLOC_DONE | 0x904, 0x9808 | 3.949 | 3.772 | 5.4
EXP_START_RECORDED | 0x904, 0x9809 | 4.106 | 4.042 | 5.8
EXP_CMD_RSP_SENT | 0x904, 0x980a | 4.417 | 4.348 | 6.2
EXP_FIRST_GRP_TIMEOUT_SET | 0x904, 0x980c | 4.449 | 4.379 | 6.6
EXP_SEMFLUSH_DONE | 0x904, 0x980b | 4.472 | 4.405 | 7.0
EXP_HTG_START_CALLED | 0x904, 0x980d | 4.847 | 4.686 | 7.4
EXP_HTG_START_COMPLETED | 0x904, 0x980e | 555.56 | 27.291 | 1282 or 80.0 (4 frame delay if cmd received in unsafe zone) (2 orders)
IOB Event: Sci Data Generation Command Response (Earlier response if exposure start delay required) | 0xC330, 0xAAF3 | 5.891 | 28.41 | 8.0 or 100.0 (2 orders)
SIDP_EXP_INIT_DONE | 0x904, 0x0001 | 23.417 | 45.772 | 25.0 or 60.0 (2 orders)
GEDP_EXP_IC_STARTED | 0x904, 0x0802 | 2378 | 1099 | 2855 or 2097 (2 orders)


Table 20: MIPS Raw C0F1N2 Exposure Start Vmetro Time Tags

C Code Marker / I/O Board Event | Bus Tag {Port, Data} | Case A (msecs) | Case B (msecs) | Deadline A or B (msecs)
SUR Exposure Command Received (collection trigger) | 0xC434, 0x5524 | 0.0 | 0.0 | N/A
EXP_PARSED_AND_INIT | 0x904, 0x9801 | 2.330 | 2.273 | 3.0
EXP_SI_SUBGRP_MQ_CHECKED | 0x904, 0x9802 | 2.437 | 2.381 | 3.4
EXP_GE_SUBGRP_MQ_CHECKED | 0x904, 0x9803 | 2.573 | 2.409 | 3.8
EXP_SI_FIFO_RESET_DONE | 0x904, 0x9804 | 3.684 | 3.584 | 4.2
EXP_INTS_CONNECTED | 0x904, 0x9805 | 3.735 | 3.637 | 4.6
EXP_SUR_DP_SETUP_DONE | 0x904, 0x9807 | 3.792 | 3.694 | 5.0
EXP_FIRST_SDM_ALLOC_DONE | 0x904, 0x9808 | 3.983 | 3.886 | 5.4
EXP_START_RECORDED | 0x904, 0x9809 | 4.146 | 4.043 | 5.8
EXP_CMD_RSP_SENT | 0x904, 0x980a | 4.459 | 4.357 | 6.2
EXP_FIRST_GRP_TIMEOUT_SET | 0x904, 0x980c | 4.490 | 4.389 | 6.6
EXP_SEMFLUSH_DONE | 0x904, 0x980b | 4.513 | 4.415 | 7.0
EXP_HTG_START_CALLED | 0x904, 0x980d | 4.896 | 4.825 | 7.4
EXP_HTG_START_COMPLETED | 0x904, 0x980e | 1217 | 27.184 | 1282 or 80.0 (4 frame delay if cmd received in unsafe zone) (2 orders)
IOB Event: Sci Data Generation Command Response (Earlier response if exposure start delay required) | 0xC330, 0xAAF3 | 5.922 | 28.303 | 8.0 or 100.0 (2 orders)
SIDP_EXP_INIT_DONE | 0x904, 0x0001 | 27.872 | 45.747 | 25.0 or 60.0 (2 orders)
GEDP_EXP_IC_STARTED | 0x904, 0x0802 | 2494 | 1414 | 2855 or 2097 (2 orders)

8.3.3 SIRTF/MIPS Exposure Steady-State Reference Timing Model

The SIRTF/MIPS exposure steady-state compression mode processing, known as SUR, is based upon the instrument data collection and detector timing control hardware. Figure 25 shows how frames are collected from both the Ge and Si detectors and how this relates to software data production events for compressed Ge down-link data (Ge Only Groups) and compressed Si data (SUR Groups). The first two Si frames of an exposure are detector voltage boost frames (Bst) and do not produce data. Furthermore, the third frame of an exposure is a reset frame (Rst) and also produces no data. All subsequent frames produce data until the detector saturates, at which time a DCE (Data Collection Event) is complete, and the hardware and software continue in the steady-state as shown in Figure 26. From this point on, the DCE shown in Figure 26 repeats until the number of iterations commanded is achieved; then the exposure is terminated and the instrument returns to its ready state. The most significant feature of these timing requirements is that during steady-state exposure processing, a brief period of no data collection (534 milliseconds) exists on a periodic basis. As we will see in Section 8.3.4, this is key to the success of the ME decomposition used in the SIRTF/MIPS scheduling and load distribution.

Figure 25: SUR C0F2Nn First DCE Data Collection and Production Event Timing Model (figure labels: 3669 msec first DCE; Ge Only Grp and SUR Grp with Slope Sub and Ge Only Subs; Si Bst, Si Rst, Si Rst/Smpl, and Si Smpl frames with Ge frames)

Figure 26: SUR C0F2Nn DCE 2 to n Data Collection and Production Event Timing Model (figure labels: 3669 msec DCE 2 to n; SUR Combined Group and Ge Only Group with Slope/Diff Subgrp and Ge Only Subs; Si Rst, Si Rst/Smpl, and Si Smpl frames with Ge frames)


The SIRTF/MIPS raw exposure steady-state processing is based upon the instrument data collection and detector timing control hardware, just as in the compressed SUR processing mode. Figure 27 shows the first DCE with boost and reset frames, and Figure 28 shows all subsequent DCEs with initial reset frames.

Figure 27: First DCE Raw C0F1Nn Data Collection and Production Event Timing Model (figure labels: 2621 msec first DCE; Raw Grp with Si Raw Sub and Ge Only subgroups; Si Bst, Si Rst, Si Rst/Smpl, and Si Smpl frames with Ge frames)

Figure 28: DCE 2 to n Raw C0F1Nn Data Collection and Production Event Timing Model (figure labels: 2621 msec DCE 2 to n; Raw Group with Si Raw Subgrp and Ge Only subgroups; Si Rst, Si Rst/Smpl, and Si Smpl frames with Ge frames)

8.3.4 SIRTF/MIPS SUR Mode Steady-State ME Results

Table 21 shows how the ME decomposed SUR processing with epochs e1 and e2 as described in Section 4.3 meets all deadlines. The deadlines in Table 21 are based upon the data collection
for an exposure such that the processing will not fall behind collection. If processing was allowed to fall behind the collection rate, eventually buffer overflows in the pipeline would result and corrupt the science data.

Table 21: SUR C0F2N2 Steady-state Exposure Time Tags

C Code Marker / I/O Board Event | Bus Tag {Port, Data} | Test (msecs) | Deadline (msecs)
SUR Exposure HTG Resynch Commanded | 0xC610,0xA580 | 0.0 | N/A
GEDP_EXP_IC_STARTED | 0x904, 0x0802 | 815.03 | 2097
GEDP_EXP_SUBGROUP_SENT (1/1) | 0x904, 0x0803 | 2251 | 4717
MEMDP_EXP_GEONLY_GRP_SENT (1/2) | 0x904,0x1004 | 2264 | 5766
GEDP_EXP_SUBGROUP_SENT (1/2) | 0x904, 0x0803 | 3293 | 5766
SIDP_EXP_SUR_FRM_PROCESSED (1/2) | 0x904, 0x0004 | 3730 | 5766
SIDP_EXP_SUR_FRM_PROCESSED (2/2) | 0x904, 0x0004 | 4241 | 6290
GEDP_EXP_SUBGROUP_SENT (2/2) | 0x904, 0x0803 | 4333 | 7338
GEDP_IC_STARTED | 0x904, 0x080F | 4461 | 6290
GEDP_EXP_IC_STARTED | 0x904, 0x0802 | 4461 | 6290
SIDP_EXP_SI_DIFF_COMPUTED | 0x904, 0x0007 | 4516 | 7338
SIDP_EXP_SI_SLOPE_COMPUTED | 0x904, 0x0006 | 4694 | 7338
SIDP_EXP_SUR_SUBGRP_SENT (1/1) | 0x904, 0x0008 | 4694 | 7338
MEMDP_EXP_COMBINED_GRP_SENT (1/2) | 0x904,0x1006 | 4740 | 7338
GEDP_EXP_AUTOREADY_SENT | 0x904, 0x0804 | 4990 | 9435
SIDP_EXP_SUR_FRM_PROCESSED (1/5) | 0x904, 0x0004 | 5810 | 9435
GEDP_EXP_SUBGROUP_SENT (1/1) | 0x904, 0x0803 | 5902 | 9435
MEMDP_EXP_GEONLY_GRP_SENT (2/2) | 0x904,0x1004 | 5915 | 9435
SIDP_EXP_SUR_FRM_PROCESSED (2/5) | 0x904, 0x0004 | 6339 | 9435
SIDP_EXP_SUR_FRM_PROCESSED (3/5) | 0x904, 0x0004 | 6856 | 9435
GEDP_EXP_SUBGROUP_SENT (2/3) | 0x904, 0x0803 | 6948 | 9435
SIDP_EXP_SUR_FRM_PROCESSED (4/5) | 0x904, 0x0004 | 7380 | 9435
SIDP_EXP_SUR_FRM_PROCESSED (5/5) | 0x904, 0x0004 | 7904 | 9959
GEDP_EXP_SUBGROUP_SENT (3/3) | 0x904, 0x0803 | 7997 | 11007
GEDP_EXP_END | 0x904, 0x0806 | 7997 | 9435
GEDP_EXP_FULLREADY_SENT | 0x904, 0x0805 | 8005 | 9435
GEDP_IC_STARTED | 0x904, 0x080F | 8124 | 9435
SIDP_EXP_SI_DIFF_COMPUTED | 0x904, 0x0007 | 8198 | 11007
SIDP_EXP_SI_SLOPE_COMPUTED | 0x904, 0x0006 | 8379 | 11007
MEMDP_EXP_COMBINED_GRP_SENT (2/2) | 0x904, 0x1006 | 8396 | 11007


Table 22: Raw C0F1N2 Steady-state Exposure Time Tags

C Code Marker / I/O Board Event                          Bus Tag Test {Port, Data}   Time (msecs)   Deadline (msecs)
SUR Exposure HTG Resynch Commanded (collection trigger)  0xC610, 0xA580              0.0            N/A
GEDP_EXP_IC_STARTED                                      0x904, 0x0802               1550           2097
GEDP_EXP_SUBGROUP_SENT (1/2)                             0x904, 0x0803               2988           4718
SIDP_EXP_RAW_FRM_PROCESSED (1/2)                         0x904, 0x0002               3406           4718
SIDP_EXP_RAW_FRM_PROCESSED (2/2)                         0x904, 0x0002               3933           5242
SIDP_EXP_RAW_SUBGRP_SENT                                 0x904, 0x0005               3933           6290
GEDP_EXP_SUBGROUP_SENT (2/2)                             0x904, 0x0803               4035           6290
MEMDP_EXP_RAW_GRP_SENT (1/2)                             0x904, 0x1005               4048           6290
GEDP_IC_STARTED                                          0x904, 0x080F               4163           4718
GEDP_EXP_IC_STARTED                                      0x904, 0x0802               4163           4718
GEDP_EXP_AUTOREADY_SENT                                  0x904, 0x0804               4690           7339
SIDP_EXP_RAW_FRM_PROCESSED (1/4)                         0x904, 0x0002               4973           7339
SIDP_EXP_RAW_FRM_PROCESSED (2/4)                         0x904, 0x0002               5495           7339
GEDP_EXP_SUBGROUP_SENT (1/2)                             0x904, 0x0803               5601           7339
SIDP_EXP_RAW_FRM_PROCESSED (3/4)                         0x904, 0x0002               6018           7339
SIDP_EXP_RAW_SUBGRP_SENT                                 0x904, 0x0005               6539           8911
SIDP_EXP_RAW_FRM_PROCESSED (4/4)                         0x904, 0x0002               6539           7863
GEDP_EXP_SUBGROUP_SENT (2/2)                             0x904, 0x0803               6644           8911
GEDP_EXP_END                                             0x904, 0x0806               6645           7339
GEDP_EXP_FULLREADY_SENT                                  0x904, 0x0805               6652           7339
MEMDP_EXP_RAW_GRP_SENT (2/2)                             0x904, 0x1005               6665           8911

8.3.5 SIRTF/MIPS Raw Mode Steady-State Results

Table 22 shows how the ME decomposed raw processing with epochs e1 and e2 as described in Section 4.3 meets all deadlines. The deadlines in Table 22 are based upon the data collection for an exposure such that the processing will not fall behind collection. If processing was allowed to fall behind the collection rate, eventually buffer overflows in the pipeline would result and corrupt the science data.

8.3.6 SIRTF/MIPS Video Processing RT EPA Epoch Evaluation

The importance of scheduling epochs is clearly demonstrated by the RT EPA monitoring experiments with the SIRTF/MIPS instrument video processing software. Without the epoch analysis and redistribution of releases, this instrument would not have been able to meet its requirements for real-time processing at all. Furthermore, without dynamic adjustment of priority to enable full synchronization of hardware and software during the exposure start epoch of the MIPS software, the system would never have succeeded in synchronizing data processing with data production by the hardware. The MIPS software, as noted in Section 8.2, is an excellent example of a marginal task set that, by RMA, should not be schedulable safely; yet by reorganizing the software into multiple scheduling epochs, and by exploiting the fact that it is highly improbable that worst-case executions and releases will cause the maximum interference case (i.e. a timeout is still possible, but highly unlikely), the system has been operating for many months without a single missed deadline.

8.4 Digital Video Pipeline Test-bed Results

The digital video pipeline experiment can be characterized as shown in Table 23. This experiment was a preliminary test of the video processing capabilities ultimately used in the RACE test-bed. It was done to provide a simple example in this thesis and to test basic capabilities of the RT EPA.

Table 23: Digital Video Pipeline Marginal Task Set Description

task      Soft Conf  Hard Conf  T (µsec)  Dsoft (µsec)  Dterm (µsec)  Cexp (µsec)  Exp Util  WCET (µsec)  WC Util
tBtvid    1.0        1.0        33333     20000         33333         100          0.003     1200         0.036
tFrmDisp  0.5        0.9        200000    100000        150000        58000        0.290     60000        0.300
tFrmLnk   0.5        0.8        333333    300000        333333        50000        0.150     56000        0.168
Total                                                                              0.443                  0.504

The processor is under-loaded, so the results of this test simply show that the RT EPA can schedule a non-stressful thread set just as well as an RMA priority-preemptive policy. The pipeline includes successive release from the source interrupt to tBtvid, from tBtvid completion to tFrmDisp release, and finally from tFrmDisp completion to tFrmLnk. Given pipeline sequencing like this, the next stage is fully synchronized with the previous one, and therefore the data is fully consistent through the pipeline. However, the response jitter from the previous stage directly drives the release jitter in the current stage. From the point of release, the only additional jitter is then due to response jitter in that stage, but the overall pipeline sink output jitter is the summation of latency and jitter through all stages.

Figure 29 A and B: Frame Compression Response Jitter (A) and Frame Link Response Jitter (B)
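As a concrete illustration of how such a confidence-based service request might be expressed through the Appendix A API, the following sketch admits the tFrmDisp service with the Table 23 deadlines, period, and expected execution time. It is a sketch only: the normal-model spread (Csigma), the number of trials, the return-value convention, and the interpretation of the returned confidence values are assumptions for illustration, not the validated test code.

/* Sketch only: admitting the tFrmDisp service from Table 23 through the
 * rtepa.h interface.  Csigma, Ntrials, and the return convention checked
 * here are illustrative assumptions. */
#include "rtepa.h"

int admit_frame_display(void)
{
    int rtid;
    double soft_conf = 0.5;     /* requested soft-deadline confidence      */
    double term_conf = 0.9;     /* requested termination-deadline confidence */
    union model_type model;

    model.normal_model.Cmu      = 58000;  /* expected C from Table 23 (usec) */
    model.normal_model.Csigma   = 1000;   /* assumed execution jitter (usec) */
    model.normal_model.HighConf = 0.9;
    model.normal_model.LowConf  = 0.5;
    model.normal_model.Ntrials  = 1000;

    /* CBDM admission: Dsoft = 100 ms, Dterm = 150 ms, T = 200 ms for tFrmDisp */
    if (rtepaTaskAdmit(&rtid, reliable, expected, normal, &model, restart,
                       100000, 150000, 200000,
                       &soft_conf, &term_conf, "tFrmDisp") != 0)
    {
        return -1;  /* rejected: requested confidence not achievable */
    }

    /* soft_conf and term_conf now hold the confidence the RT EPA can deliver */
    return rtid;
}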

After running this configuration, the RT EPA on-line monitoring results, summarized in Table 24, showed that the application easily meets all negotiated service levels, which, given the under-loading, is not surprising. The results do show, however, that the RT EPA correctly computes service level capability that exceeds what was requested.

Table 24: Actual Digital Video Task Set Performance

t   Online Model Size   Tact (sec)   Cact-low, Cact-high, Cact-exp (µsec)   Npreempt   Dsoft Conf act   Dsoft Online Model (µsec)   Dterm Conf act   Dterm Online Model (µsec)
1   1000                0.033        30, 1073, 60                           14         1.0              260                         1.0              260
2   1000                0.331        56966, 59620, 58400                    76         1.0              60000                       1.0              60000
3   1000                0.331        40911, 55790, 48600                    2809       1.0              101000                      1.0              120000

More interesting than the trivial service levels presented here (much more interesting marginal results are presented with the RACE test-bed) is how the RT EPA isochronal output feature can be used to eliminate the tFrmLnk jitter in the pipeline (Figures 30a and 30b show the response without control). Being able to remove jitter in the last stage of the pipeline, prior to output to a sink device, is a key feature of the RT EPA pipelining capabilities.

Figure 30 A and B: Frame Link Execution Jitter (A) and Response Jitter Without Isochrony Control (B)

Figures 31a and 31b show the response jitter filtering effect of the RT EPA isochronal output control feature. In Figure 31a it can be seen that there is up to 6 milliseconds of video processing execution jitter, which, without control, directly leads to similar response jitter in Figures 30a and 30b. The RT EPA mechanism can produce isochronal output, as long as the given stage is meeting or exceeding specified deadlines, by holding (buffering) results before they are passed on to either the next stage or the output device. The mechanism is described in detail in Section 5.6.1.
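A minimal sketch of how a stage might be activated with this hold behavior through the Appendix A interface is shown below. The stack size, on-line model depth, release semaphore source, and the assumption that passing isochronous for the complete_control argument selects the hold mechanism are all illustrative rather than the validated configuration.

/* Sketch only: activating the frame link stage with isochronal output so
 * completions are held to the output boundary.  Stack size, model depth,
 * and the release source are illustrative assumptions. */
#include "rtepa.h"

extern void frameLinkEntry(void);          /* pipeline stage body (application) */

int activate_frame_link_isochronous(int rtid, SEM_ID releaseSem)
{
    union release_method rel;

    rel.release_sem = releaseSem;          /* released by previous stage completion */

    return rtepaTaskActivate(rtid,
                             (FUNCPTR)frameLinkEntry,
                             (FUNCPTR)0,       /* no soft-miss callback        */
                             (FUNCPTR)0,       /* no completion callback       */
                             isochronous,      /* hold output to the deadline  */
                             16384,            /* stack bytes (assumed)        */
                             external_event,   /* event release                */
                             rel,
                             1000);            /* on-line model size           */
}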


Figure 31 A and B: RACE Frame Link Execution (A) and Response Jitter (B) With Isochronal Output Control

8.5 RACE Results

The RACE results satisfy all experimentation goals for the RT EPA. The following is a summary of results which exhibit the five goals for the RT EPA.

8.5.1 RACE Marginal Task Set Experiment (Goal 1)

The first experiment goal, to implement a marginal task set, was met by the RACE experiment. The RACE thread set was rejected based upon the execution model by both the RMA least upper bound test and the DM admission test, but admitted by the RT EPA CBDM test. The thread set and expected CPU loading are summarized in Table 25. The expected execution times in RACE lead to a loading of approximately 95%; it is the low confidence required on many of the threads that allows the thread set to be admitted by CBDM despite the high average loading. No matter how one interprets execution time, in all cases the loading is above the RMA least upper bound of 72.05% for 9 threads. The results of the CBDM admission test may be found in Appendix D. Furthermore, the RT EPA overrun control also makes this otherwise marginal thread set feasible, since overruns will be terminated and interference therefore controlled. The CBDM admission test computes the utility each thread imposes over its termination deadline period and the interference expected over that period from other threads. The reason that CBDM works well is based on the relative independence of execution jitter: an overrun for thread A may have 0.1% probability and an overrun for thread B 0.1% as well, making the probability that A will interfere up to its worst-case time while B also executes for its worst-case time less than one chance in a million. The S array taken from the RT EPA on-line admission test, which accounts for utility and interference over each thread deadline, is included in Table 25. Intuitively, the threads with the largest S values have the highest probability of missing a deadline; a high S value thread with high confidence is the most likely point of failure in maintaining negotiated service. The tasks tNet and tExc are VxWorks tasks. The tNet task handles TCP/IP packet transmission and tExc handles VxWorks scheduling and operating system resource management. The tExc task is high frequency, but very low loading, so tExc cannot be demoted below highest priority; much like interrupt servicing, it provides a fairly constant level of background overhead of approximately 5% on the RACE RT EPA test-bed. The RT EPA kernel monitoring takes place during tExc time and the RT EPA release control takes place during the actual execution time of each release, so the RT EPA is accounted for here as well.

Table 25: RACE Source/Sink Pipeline Task Set Description

id  Name      Low Conf  High Conf  T (msec)  Cexp (µsec)  Exp Util  Clow (µsec)  Low Util  Cwc (µsec)  WC Util  S
0   tBtvid    1.0       1.0        33.333    64           0.002     0            0         1200        0.036    0.02
1   tFrmDisp  0.5       0.9        100.00    38772        0.388     36126        0.361     40075       0.400    0.82
2   tOpnav    0.9       0.99       66.67     20906        0.314     19545        0.293     23072       0.346    0.96
3   tRACECtl  1.0       1.0        66.67     190          0.003     0            0         1272        0.020    0.97
4   tTlmLnk   0.2       0.5        200.00    384          0.002     0            0         1392        0.007    0.77
5   tFrmLnk   0.5       0.8        500.00    55362        0.111     50083        0.100     58045       0.116    0.91
6   tCamCtl   1.0       1.0        200.0     610          0.003     317          0.001     1530        0.008    0.89
7   tNet      0.5       0.25       100.00    8000         0.080     4000         0.040     10000       0.100    N/A
8   tExc      1.0       1.0        1.0       50           0.050     25           0.050     50          0.050    N/A
Total                                                     0.953                  0.845                 1.073
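To make the flavor of this test concrete, the following sketch computes S values with a deadline-monotonic sufficient check that charges each thread its execution estimate at the requested confidence plus the interference of shorter-deadline threads over its termination deadline. It is an illustration of the idea only; the structure name and function are hypothetical, and the exact CBDM formulation is the one developed earlier in the thesis and evaluated in Appendix D.

/* Sketch only: a DM-style sufficient check in the spirit of CBDM.  Each
 * thread's S value is its demand over its termination deadline, using the
 * execution estimate at its requested confidence, plus the interference
 * expected from shorter-deadline threads over that window. */
#include <math.h>

typedef struct {
    double C;       /* execution estimate at requested confidence (usec) */
    double T;       /* release period (usec)                             */
    double Dterm;   /* termination deadline (usec)                       */
} cbdm_thread_t;

/* Threads must be ordered by increasing Dterm (deadline-monotonic order). */
int cbdm_sufficient_test(const cbdm_thread_t *th, int n, double *S)
{
    int i, j, feasible = 1;

    for (i = 0; i < n; i++) {
        double demand = th[i].C;                      /* own demand        */
        for (j = 0; j < i; j++)                       /* interference      */
            demand += ceil(th[i].Dterm / th[j].T) * th[j].C;
        S[i] = demand / th[i].Dterm;                  /* demand over Dterm */
        if (S[i] > 1.0)
            feasible = 0;                             /* deadline at risk  */
    }
    return feasible;
}

Computed this way, the threads with the largest S values are the first to be flagged, consistent with the observation below that tOpnav and tRACECtl are the most exposed services.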

If we now look closely at plots of the RACE RT EPA kernel monitoring results, we see that tOpnav and tRACECtl, which have the highest S values, do in fact exhibit the highest response jitter. This is because these two threads have not only high utility, but also high interference from tBtvid and tFrmDisp. What is interesting to note at this point is that there is no assumption about phasing of the loads at all; so, even with a loading somewhere between 0.814 and 0.951, there is still pessimism in the critical instant assumption. The RT EPA pipelining control for this thread set is summarized in Table 26. The pipeline control configuration is important because it can control jitter and can reduce interference effects with phasing. To see this, we look at plots of thread release period, execution, and response jitter.

Table 26: RACE Standard Pipeline Phasing and Release Frequencies

rtid  Task      Released by  Frequency  Offset
0     tBtvid    interrupt    30 Hz      0
1     tFrmDisp  tBtvid       10 Hz      0
2     tOpnav    tBtvid       15 Hz      1
3     tRACECtl  tOpnav       15 Hz      0
4     tTlmLnk   tOpnav       5 Hz       0
5     tFrmLnk   tFrmDisp     2 Hz       0
6     tCamCtl   tOpnav       5 Hz       0
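The Table 26 release graph can be expressed directly with the rtepaPipelineSeq() call from Appendix A. The sketch below is illustrative only: the semaphore creation is an assumption, and whether the frequency argument is an absolute rate in Hz (as listed in Table 26) or a sub-frequency divisor of the source stage is not specified in this excerpt.

/* Sketch only: wiring the Table 26 release graph with rtepaPipelineSeq().
 * Per-stage binary semaphores are created here for illustration; the
 * meaning of the frequency argument is an assumption in this sketch. */
#include "rtepa.h"
#include "semLib.h"

void race_configure_pipeline(int btvid, int frmDisp, int opnav,
                             int raceCtl, int tlmLnk, int frmLnk, int camCtl)
{
    SEM_ID semFrmDisp = semBCreate(SEM_Q_PRIORITY, SEM_EMPTY);
    SEM_ID semOpnav   = semBCreate(SEM_Q_PRIORITY, SEM_EMPTY);
    SEM_ID semRaceCtl = semBCreate(SEM_Q_PRIORITY, SEM_EMPTY);
    SEM_ID semTlmLnk  = semBCreate(SEM_Q_PRIORITY, SEM_EMPTY);
    SEM_ID semFrmLnk  = semBCreate(SEM_Q_PRIORITY, SEM_EMPTY);
    SEM_ID semCamCtl  = semBCreate(SEM_Q_PRIORITY, SEM_EMPTY);

    /* tBtvid (30 Hz interrupt source) releases the two pipeline heads */
    rtepaPipelineSeq(btvid,   frmDisp, 10, 0, semFrmDisp);  /* 10 Hz, offset 0 */
    rtepaPipelineSeq(btvid,   opnav,   15, 1, semOpnav);    /* 15 Hz, offset 1 */

    /* tOpnav completion drives control, telemetry, and camera services */
    rtepaPipelineSeq(opnav,   raceCtl, 15, 0, semRaceCtl);
    rtepaPipelineSeq(opnav,   tlmLnk,   5, 0, semTlmLnk);
    rtepaPipelineSeq(opnav,   camCtl,   5, 0, semCamCtl);

    /* tFrmDisp completion drives the 2 Hz frame link stage */
    rtepaPipelineSeq(frmDisp, frmLnk,   2, 0, semFrmLnk);
}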

8.5.2 RACE Nominal Configuration Results

The following Sections 8.5.2.1 to 8.5.2.7 summarize the jitter in each RACE thread release. Each release is made either by completion of another RACE thread or by source interrupt, as summarized in Table 26. Another important characteristic of CBDM, inherited from DM, is that the underlying priorities are assigned such that the highest priority is given to the shortest-deadline thread. This is summarized for RACE in Table 27.



Table 27: RACE Soft and Termination Deadline Assignment

rtid  Task      Dsoft (µsec)  Dterm (µsec)  RT EPA priority
0     tBtvid    33333         33333         0
1     tFrmDisp  42000         50000         1
2     tOpnav    64000         66000         2
3     tRACECtl  64600         66600         3
4     tTlmLnk   100000        200000        4
5     tFrmLnk   300000        500000        5
6     tCamCtl   360150        1000000       6
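The underlying priority assignment of Table 27 is plain deadline-monotonic ordering. A minimal sketch is shown below; the dm_entry_t structure and function are introduced only for illustration and are not part of the RT EPA interface.

/* Sketch only: deadline-monotonic priority assignment as used for Table 27.
 * Services are sorted by ascending termination deadline and given RT EPA
 * priorities 0..n-1 in that order. */
#include <stdlib.h>

typedef struct { int rtid; unsigned long Dterm; int prio; } dm_entry_t;

static int cmp_dterm(const void *a, const void *b)
{
    const dm_entry_t *x = (const dm_entry_t *)a;
    const dm_entry_t *y = (const dm_entry_t *)b;
    return (x->Dterm > y->Dterm) - (x->Dterm < y->Dterm);
}

void dm_assign_priorities(dm_entry_t *svc, int n)
{
    int i;
    qsort(svc, n, sizeof(dm_entry_t), cmp_dterm);
    for (i = 0; i < n; i++)
        svc[i].prio = i;   /* 0 = highest RT EPA priority (shortest Dterm) */
}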

8.5.2.1 Bt878 Video Frame Buffer Service

The Bt878 Video task simply processes a DMA interrupt event, sets the frame buffer pointer to the current frame, and sequences any tasks according to the RT EPA pipelining specification. Since the task is released by the Bt878 hardware interrupt, the period jitter is extremely low except in rare cases where the processing is coincident with tExc kernel resource management (e.g. the 1 msec virtual clock). The effects of a collision with tExc are also evident as clock read dropouts in the RT EPA kernel monitoring (an unfortunate side effect, which however is only a problem for dispatch times less than 100 microseconds, and rare). The very occasional tExc interference for tasks with execution releases on the order of the context switch time is a problem that is ignored in this thesis since, as is evident in the Bt878 Video results, it happens much less frequently than 1% of the time. What this means is that 99% of the time the time-stamping accuracy is in fact good to a microsecond, but occasionally it is only good to +/- 100 microseconds, and therefore releases like this one will experience on-line monitoring clock read jitter and system interference. This interference can be accounted for by admitting tExc as an RT EPA task, but never activating it, since it is activated by the operating system. It was found to be insignificant here either way.

Figure 32: Bt878 Video Release Period Jitter
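A sketch of how this interrupt release might be bound through the Appendix A interface follows. The IRQ line, the ISR body, and the assumption that the RT EPA-installed wrapper gives the event semaphore after invoking the application ISR are all illustrative.

/* Sketch only: binding the Bt878 DMA-complete interrupt to the tBtvid
 * release.  BT878_IRQ_ASSUMED is a placeholder; the real IRQ comes from
 * the PCI configuration of the capture board.  It is assumed here that
 * the RT EPA wrapper gives the event semaphore after calling the ISR. */
#include "rtepa.h"
#include "semLib.h"

#define BT878_IRQ_ASSUMED 10            /* hypothetical IRQ line */

static SEM_ID btvidReleaseSem;

static void btvidIsr(void)
{
    /* acknowledge the Bt878 DMA interrupt here (device-specific register
       access); release of tBtvid is then driven by the event semaphore */
}

int bind_btvid_release(int btvidRtid)
{
    btvidReleaseSem = semBCreate(SEM_Q_PRIORITY, SEM_EMPTY);
    return rtepaPCIx86IRQReleaseEventInitialize(btvidRtid,
                                                btvidReleaseSem,
                                                BT878_IRQ_ASSUMED,
                                                (FUNCPTR)btvidIsr);
}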

The execution and response jitter in the Bt878 Video thread are minimal, but are evident in Figures 33 A and B. The occasional tExc interferences are again evident as execution and response time dropouts.


Figure 33 A and B: Bt878 Video Execution (A) and Response (B) Jitter

8.5.2.2 Frame Display Formatting and Compression Service

Since the Frame Display and Compression service for RACE is released by the completion of the Bt878 Video task, which has response jitter of approximately +/- 200 microseconds and very little interference from tExc or tBtvid, the release period jitter is low – once again approximately +/- 200 microseconds. However, being a pipeline release, if one compares the release jitter in the Bt878 Video and Frame Display services, it is clear that variance in the release is higher in this completion-dependent release than in a pure hardware event release.

Figure 34: RACE Frame Display Service Release Period Jitter

Given the much more significant processing in the Frame Display service, the execution jitter is much more significant – approximately +/- 2 milliseconds. Since the Frame Display algorithms are determinate – driven entirely by the number of pixels rather than data-driven with variance in algorithm complexity – the execution jitter can only be explained by architectural variance. This is a logical deduction, especially when the nature of pixel-by-pixel processing is considered with respect to the L1/L2 cache and the probability of misses and therefore pipeline stalls. Large memory traverses will increase the probability of such variance, so not only does this release impose much more loading, it brings out architectural variance. It is apparent from Figures 35 A and B that the execution jitter directly results in equivalent response jitter and that in general the response has a small amount of additional latency with similar jitter.

Figure 35 A and B: Frame Display Service Execution (A) and Response (B) Latency and Jitter

In a few instances the execution time exceeds the response time; this is again due to tExc clock interference, since of course response time must always be greater than execution time (for the majority of samples the data is as expected). Execution times are in general slightly less than 39 msecs and response times are in general at or slightly above 39 msecs.

8.5.2.3 Optical Navigation Ranging and Centroid Location Service

Since the RACE Optical Navigation service is in a separate, asynchronously executing pipeline from Frame Display and Compression, the RACE pipelining configuration was set up to create phasing and event releases that minimize the jitter and interference imposed on Optical Navigation by Frame Display. No data consistency is required between the frames that are displayed for the operator and the frames that are used for navigation, since the display frames are really just to give the operator a vague idea of the RACE positioning and are ultimately displayed at a much lower frequency than 10 Hz due to network bandwidth limitations. So, tOpnav is released directly by tBtvid at half the rate, since it is in a different pipeline. The period jitter is minimal due to the low-jitter release source. However, the response jitter is clearly bimodal due to interference from the Frame Display service (Figures 37 A and B).

Figure 36: Optical Navigation Event Release Period Jitter


Figures 37 A and B clearly show that while the Opnav execution jitter is low, the response jitter is high due to interference from the frame display processing, and it is clearly bimodal since latency is added when interference exists and otherwise response is more immediate.

Figure 37 A and B: Optical Navigation Execution (A) and Response (B) Jitter

8.5.2.4 RACE Vehicle Ramp Distance Control

The RACE Ramp Control service has significant interference from both the frame display and Opnav services. In addition, it is released without isochronal control by Opnav, and the overall effect of interference and previous-stage jitter leads to a tri-modal release jitter as seen in Figure 38. This release jitter could be significantly filtered by specifying that Opnav produce isochronal output, but typically this is not needed until a stage actually produces sink device output, since processing is not typically sensitive to the jitter but digital control devices typically are. Either way, the RT EPA can control jitter that is due to staging, but it cannot control jitter due to interference.

Figure 38: Ramp Control Release Period Jitter

Figures 39 A and B show that the Ramp Control service itself does not have significant or frequent execution jitter and likewise does not have much response jitter relative to release.


Figure 39 A and B: Ramp Control Execution (A) and Response (B) Jitter

8.5.2.5 RACE Vehicle Telemetry Processing

The lowest-priority services in the system will suffer the most release jitter due to interference. In Figure 40 we see approximately 2 msecs of jitter around the 200 msec period worst-case. This is still fairly minimal jitter despite heavy interference. To understand release jitter, the pipeline configuration must be considered carefully. Looking at Figure 20, we see that the telemetry service is released by a global mechanism based upon frame events, so the jitter we see here is completely the result of interference rather than of previous-stage jitter.

Figure 40: RACE Telemetry Release Period Jitter

Figures 41 A and B show minimal execution jitter, but more significant response jitter. Since the telemetry service execution time is sub-millisecond and many of the RACE ISR times are in the hundreds of microseconds, the response jitter is most likely due to ISR interference rather than task interference. The RT EPA currently considers ISR time to be insignificant, but it does have an effect, and perhaps future work should address ISR time as well as task execution time.


Figure 41 A and B: RACE Telemetry Execution (A) and Response (B) Jitter

8.5.2.6 RACE Video Frame Link Processing

The RACE video link service release jitter is again bimodal, although not more than a millisecond. This service is globally released by the base source interrupt event, so the jitter is purely a result of interference, most likely at interrupt level.

Figure 42: RACE Frame Link Release Period Jitter

Figures 43 A and B show that the execution and response jitter are minimal for this service.

Figure 43 A and B: RACE Frame Link Execution (A) and Response (B) Jitter

8.5.2.7 RACE Camera Control

Figure 44 shows that the camera control service suffers from significant release jitter due to interference by other services despite being globally released.

Figure 44: RACE Camera Control Period Release Jitter

Perhaps more interesting than the high release jitter for camera control is the very high response jitter (Figure 45 B) despite relatively low execution jitter (Figure 45 A). This demonstrates that the camera control service is being interfered with: execution times are 30 to 120 microseconds, yet response times vary between 200 and 10000 microseconds, an order of magnitude greater dispersion in response compared to execution.

Figure 45 A and B: RACE Camera Control Execution (A) and Response (B) Jitter

8.5.3 RACE RT EPA Initial Service Negotiation and Re-negotiation (Goal 2)

Based upon the initial RACE RT EPA marginal task set configuration tested in Section 8.5.2 (Table 28), the service negotiation is now refined using the on-line models derived from initial execution time estimates based on worst-case observations. The tNet and tExc VxWorks system tasks are ignored here, since during RACE testing it was found that tExc imposes insignificant loading on the CPU (less than 1% worst-case) and since tNet was treated as a best-effort task.


Table 28: Initial RACE Source/Sink Pipeline Task Service Description

rtid  Name      Soft Conf  Hard Conf  Dsoft (msecs)  Dterm (msecs)  T (msecs)  Cwc (µsec)  DM Util (%)  RM Util (%)
0     tBtvid    1.0        1.0        40             50             33.333     1000        02.00        03.00
1     tFrmDisp  0.5        0.9        50             50             100.000    10000       20.00        10.00
2     tOpnav    0.9        0.99       66             66             66.67      28000       42.42        42.42
3     tRACECtl  1.0        1.0        66             66             66.67      1200        01.80        01.80
4     tTlmLnk   0.2        0.5        150            200            200.00     2000        01.00        01.00
5     tFrmLnk   0.5        0.8        400            500            500.00     60000       30.00        30.00
6     tCamCtl   1.0        1.0        1000           1000           1000.00    2500        00.75        00.75
Total                                                                                      87.97        88.55

Re-negotiation for tighter deadlines at the same confidence level is one possible service re-negotiation; another is to keep the desired deadlines and accept higher than requested confidence. That was the approach taken with the RACE experiment, and the results are summarized in Table 29. In contrast, in the pseudo-loading experiment the deadlines were iteratively shortened until the desired confidence and the actual reliability converged (Section 8.2.1). The raw data on which these results are based was collected by the RT EPA monitor and is included in Appendix D. This example shows very well how pessimistic observed worst-case execution times are, and how actual reliability can be observed over large sample sizes (2300 33.33 msec periods for this data).
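The Appendix A excerpt exposes no single renegotiation call, so the sketch below renegotiates by dismissing a service and re-admitting it with the same deadlines and the observed expected execution time substituted into the model. The spread assumed for the new model, the return conventions, and the helper name are illustrative assumptions only.

/* Sketch only: one way to re-negotiate after observing actual execution.
 * The service is dismissed and re-admitted with deadlines kept and the
 * observed expected execution time; conventions are assumed. */
#include "rtepa.h"

int renegotiate_keep_deadlines(int rtid, r_time Cexp_observed,
                               r_time Dsoft, r_time Dterm, r_time T,
                               const char *name)
{
    double softConf = 1.0, termConf = 1.0;   /* accept whatever is achievable */
    union model_type model;
    int newRtid;

    rtepaTaskPrintCompare(rtid);             /* log requested vs. actual      */
    rtepaTaskDismiss(rtid);

    model.normal_model.Cmu      = Cexp_observed;
    model.normal_model.Csigma   = Cexp_observed / 10;   /* assumed spread */
    model.normal_model.HighConf = 1.0;
    model.normal_model.LowConf  = 1.0;
    model.normal_model.Ntrials  = 1000;

    if (rtepaTaskAdmit(&newRtid, reliable, expected, normal, &model, restart,
                       Dsoft, Dterm, T, &softConf, &termConf,
                       (char *)name) != 0)
        return -1;
    return newRtid;
}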

Table 29: RACE Source/Sink Actual Performance

rtid  Name      Soft Rel.  Hard Rel.  Dsoft (msecs)  Dterm (msecs)  T (msecs)  Cexp (µsec)  DM Utility (%)  RM Utility (%)
0     tBtvid    1.0        1.0        40             50             33.333     77.9         01.56           02.34
1     tFrmDisp  1.0        1.0        50             50             100.000    9840         19.68           09.84
2     tOpnav    1.0        1.0        66             66             66.67      21743        32.94           32.94
3     tRACECtl  1.0        1.0        66             66             66.67      1200         01.80           01.80
4     tTlmLnk   1.0        1.0        150            200            200.00     2000         01.00           01.00
5     tFrmLnk   1.0        1.0        400            500            500.00     55028        11.01           11.01
6     tCamCtl   1.0        1.0        1000           1000           1000.00    583          00.06           00.06
Total                                                                                       68.5            58.99

8.5.4 RACE Release Phasing Control Demonstration (Goal 3a and 3b)

Experimental goal 3a, demonstration of stage-to-stage release phasing control, is shown by Figures 38 and 39b from Section 8.5.2.4. In that experiment, stages were set up to release each other, and the effect of the previous stage's jitter causing release period jitter in the next stage is apparent in Figure 38. Figure 39b shows that despite period jitter, a given stage may still have low response jitter, since response times are always taken relative to the release and the only contribution to response jitter for a stage is therefore its execution jitter. Experimental goal 3b, demonstration of stage-to-stage isochronal release phasing control, is shown by Figures 46a and 46b. In Figure 46a, the video link response jitter is uncontrolled, such that the jitter from this stage will result in sink output or next-stage release period jitter.


However, in Figure 46b, the isochronal hold output control feature of the RT EPA was enabled, and the result is that response jitter is greatly reduced.

Figure 46 A and B: Before and After Phasing Control (Video Link Response Jitter Without and With Isochrony Control)

8.5.5 RACE Protection of System from Unbounded Overruns (Goal 5)

The RT EPA bounds the interference due to an overrun to the specified termination deadline. Over that period it is possible to assume that the release will attempt to use either the full resources of the period (worst-case assumption) or its typical resource demands, but either way it has not completed by the termination deadline due to one of the following conditions (a sketch of the termination watchdog concept follows the list):

1) lack of I/O resources
2) lack of CPU resources
3) lack of both I/O and CPU resources
4) atypical execution jitter due to algorithmic or architectural variance
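The control block in Appendix A carries a per-service Dterm_itimer watchdog for exactly this purpose. The following is a minimal conceptual sketch of that idea using a POSIX one-shot timer armed at release; the default-signal choice, the handler body, and the function names are assumptions, and this is not the in-kernel code path.

/* Sketch only: the termination-deadline watchdog idea.  A one-shot POSIX
 * timer is armed at release and disarmed at completion; on expiry the
 * handler would apply the configured hard_miss_policy (restart or
 * dismissal) to bound interference to Dterm. */
#include <time.h>
#include <signal.h>

static timer_t dtermTimer;

static void dtermExpired(int sig)
{
    /* overrun: terminate or restart the release per the configured policy */
    (void)sig;
}

int arm_dterm_watchdog(unsigned long Dterm_usec)
{
    struct itimerspec its;
    struct sigaction act;

    act.sa_handler = dtermExpired;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    sigaction(SIGALRM, &act, NULL);

    /* NULL sigevent: the timer delivers the default SIGALRM on expiry */
    if (timer_create(CLOCK_REALTIME, NULL, &dtermTimer) != 0)
        return -1;

    its.it_value.tv_sec     = Dterm_usec / 1000000;
    its.it_value.tv_nsec    = (Dterm_usec % 1000000) * 1000;
    its.it_interval.tv_sec  = 0;          /* one shot per release */
    its.it_interval.tv_nsec = 0;
    return timer_settime(dtermTimer, 0, &its, NULL);
}

/* On completion before Dterm, the timer would be disarmed by setting an
   all-zero itimerspec with timer_settime(). */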

8.5.5.1 Example of Unanticipated Contention for I/O and CPU Resources

In order to demonstrate the RT EPA's ability to protect the system from occasional or malfunctioning service termination deadline overruns, the RACE test-bed was run and an artificial interference was introduced by requesting a color frame dump without going through task admission. This unaccounted-for interference to the frame link and camera control tasks resulted in overruns for both of those services. The output log for that case is contained in Appendix C. Figure 47 shows that two missed deadlines occurred due to the unaccounted-for interference, but that the system continued to function after those isolated misses. The interference assumed by the RT EPA over the miss was configured as the expected execution time for the thread. Looking carefully at the releases around the miss, we see that while the execution time of the miss was much higher than normal, due to the nature of TCP/IP packetization and interference by the dump to the same channel, the dropout due to restarting caused the overall interference to average out close to the expected execution time. This test was more complicated than simple CPU interference, since the frame link thread and the interfering frame dump request were competing not only for the CPU, but also for the Ethernet interface. Despite this complication, the RT EPA was able to control the overrun.


Figure 47: Frame Link Termination Deadline Miss Control

8.5.5.2 RT EPA Protection from Period/Execution Jitter Due to Misconfiguration (Goal 5)

Similarly, if a particular task were to malfunction or be misconfigured, the RT EPA protects other services from the misconfigured task, which rather than occasionally missing deadlines may continually miss deadlines while misconfigured. Such a case is contained in Appendix D. It should be noted that the deadline confidence for the termination deadline dropped below the requested 0.9 to 0.85 due to the period of misconfiguration. Furthermore, the actual reliability in the deadline was 0.71. If the misconfiguration had been allowed to continue, eventually the confidence would have dropped to zero if all actual execution times exceeded the desired deadline. The reliability is based on the number of missed deadlines over all samples taken, and the confidence is based on the number of samples out of all on-line samples that are within the deadline. Since the misconfiguration was allowed to persist for approximately 150 releases with an on-line model size of 100, the computation of the confidence is straightforward. Furthermore, since the number of samples was less than the on-line model size (523 samples) and the initial model was a normal model instead of distribution-free, this explains why the reliability was lower than the confidence (all initial values in the model are set to zero unless a distribution-free model is loaded). What is also very interesting is that after the misconfiguration, there is an execution and response time hysteresis. This is most likely due to a newly evolved L1/L2 cache reference stream and/or dispatch context after the period of higher loading, since the hysteresis exists in both the execution time and the response time. This particular task has almost no interference since it is one of the shortest-deadline and therefore highest-priority tasks in the system.
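A small sketch of the two measures as just defined is given below. The function and parameter names are illustrative, loosely following the control block fields in Appendix A, and the zero-seeded on-line model entries are what allows reliability to read lower than confidence early in a run.

/* Sketch only: reliability over every sample taken, confidence over the
 * current on-line model window.  Zero-seeded entries in a freshly loaded
 * normal model count as within-deadline, which is why confidence can read
 * higher than reliability until the window fills. */
double rtepa_reliability(unsigned long missedDeadlines, unsigned long totalSamples)
{
    if (totalSamples == 0)
        return 1.0;
    return 1.0 - ((double)missedDeadlines / (double)totalSamples);
}

double rtepa_confidence(const unsigned long *Cactcomp, unsigned long Nonline,
                        unsigned long Dterm)
{
    unsigned long i, within = 0;

    if (Nonline == 0)
        return 1.0;
    for (i = 0; i < Nonline; i++)
        if (Cactcomp[i] <= Dterm)
            within++;
    return (double)within / (double)Nonline;
}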


Figure 48: Misconfiguration Execution Variance Example (Misconfiguration of Frame Display: Execution Hysteresis and Response)

In this case, a useful extension to the RT EPA would be to provide for a restart policy on occasional misses with a secondary dismissal policy for miss trends. This is not currently a feature of the RT EPA, but would be a simple extension.


9 Significance

The significance of confidence-based scheduling and the RT EPA is that this approach provides a reliable and quantifiable performance framework for mixed hard and soft real-time applications. The examples presented show the ability to specify desired reliability, the RT EPA's capability to monitor performance on-line and protect tasks from other poorly modeled tasks, and the ability to renegotiate reliability with the RT EPA through iterative refinement of requests based on actual execution performance. Furthermore, the thesis reports on future work planned to extend and broaden the examples to which the RT EPA can be applied.

The set of applications requiring this type of performance negotiation support from an operating system is increasing with the emergence of virtual reality environments, continuous media, multimedia, digital control, and shared-control automation [Bru93][SiNu96]. The RT EPA real-time scheduling framework supports a broad spectrum of contemporary applications, ranging from virtual environments to semi-autonomous digital control systems, because it supports reliability and on-line monitoring and control of execution [Si96]. Furthermore, in addition to confidence-based scheduling, the RT EPA facility allows an application developer to construct a set of real-time kernel modules that manage an input (source) device; apply sequential processing to the input stream (pipeline stages); control individual processing stage behavior through parameters obtained from a user-space application; provide performance feedback to the controlling application; and manage the output (sink) device. This type of pipeline construction, in combination with the RT EPA on-line admission testing based on requested deadline reliability and on-line performance monitoring, makes implementation of typical continuous media, digital control, and event-driven real-time systems much simpler than with hard real-time, QoS, or best-effort systems.

In general, the RT EPA provides a real-time scheduling framework which can handle multi-service applications including continuous media processing, digital control, and event-oriented processing. Without such an interface to a reliable scheduler for mixed services with quantifiable performance, such applications can be built using hard real-time methods such as RMA, which waste resources to provide guaranteed service. Or, they can be built using soft real-time approaches which provide abstract levels of service, but no quantifiable reliability assurances, no way of fire-walling services from execution/release variances, and no on-line monitoring methods which provide insight into actual performance. The RT EPA provides deadline reliabilities given execution models and on-line refinement, providing a real-time reliability framework for the first time.


10 Plans for Future Research

Future research for the RT EPA includes extension of the API and formulation to include resources in addition to the CPU (e.g. I/O), more direct support for resource usage epochs, and further demonstration of the RT EPA's capabilities with additional test-beds exhibiting latency and jitter characteristics not demonstrated here already (e.g. high algorithmic execution jitter). Specifically, goals for future RT EPA research include:

1. Admission test modification to reduce the pessimism of the critical instant assumption for pipelines which specify synchronized release of stages and phasing of those releases. This can greatly reduce the interference in such pipelines and therefore lead to scheduling feasibility that ultimately is much higher than not only the RMA least upper bound, but also the current CBDM admission bound.

2. The admission test algorithm used is an extension of the DM sufficient test and is O(n^2) for n services. Since the CBDM test is only sufficient, it is pessimistic in terms of accounting for partial interference. Several other admission tests which have greater complexity, but are in fact necessary and sufficient, could be considered for confidence-based extension, including the scheduling point [Lehoc87] and completion tests [Jos86]. The CBDM test formulated here was selected for simplicity despite not being necessary and sufficient. More evaluation of the possibility of extending a necessary and sufficient test using expected and reliable execution estimates could lead to a better (less pessimistic) on-line admission test.

3. Extend the RT EPA to formally model the demands for I/O resources and the scheduling of these resources to meet data transport deadlines. In the current implementation, an I/O-bound RT EPA service will have longer than anticipated response times based on interference and execution jitter alone, which can be accounted for in the deadline confidence negotiation, but this is not directly formalized by the RT EPA.

4. Admission of services to service epochs such that scheduling feasibility is checked in two or more minor periods over one system major period, with control by the RT EPA such that one epoch is protected from the other and independent on-line models can be derived for each epoch.

5. In this research the critical instant assumption and the WCET were both shown to be overly pessimistic, but the critical instant assumption appears more significant than WCET pessimism due to jitter. A test-bed which has high algorithmic/architectural execution jitter and less significant phasing impact would demonstrate the jitter control more dramatically.

Beyond these specific goals to establish and validate the resource management and negotiation concepts introduced by the research presented here, porting the RT EPA to additional kernels would establish the viability of viewing the RT EPA as kernel-ware which can support mixed hard/soft real-time applications on a variety of platforms. This will require providing some system-specific API functions (e.g. for associating interrupts with event releases) and ensuring that basic capabilities are portable (e.g. event time-stamping to microsecond accuracy).


11 Conclusion

Experiments were implemented using both the RT EPA and user-level applications to compare performance. The RT EPA not only improved throughput compared to a hard real-time scheduling admission and prioritization policy, it also provided reliable configuration, monitoring, and control through the confidence-based scheduling policy. The fundamental aspect of the RT EPA performance control is the CBDM approach for admitting threads for reliable execution. Thus, the RT EPA was evaluated in terms of how well the three example pipelines were able to meet expected and desired performance in terms of missed deadlines. These experiments were also evaluated in terms of real-time parameters such as exposure timeouts, video stream dropouts, latency, and control system overshoot, in order to evaluate the reliability afforded by the RT EPA to applications. The experiments were run individually and simultaneously to evaluate use of the RT EPA mechanism for complex real-time applications involving multimedia and interaction, with multiple hard and soft real-time requirements. The RT EPA provides a framework for on-line service admission, monitoring, and control as demonstrated here, and was used to establish the theory of CBDM, multiple scheduling epochs, and negotiation for reliable service in terms of the expected number of missed/made deadlines. Overall, the RT EPA theory, prototype framework, and validating experiments introduce an engineering-oriented process for implementing timing reliability requirements in real-time systems.


References

[Au93] Audsley, N., Burns, A., and Wellings, A., "Deadline Monotonic Scheduling Theory and Application", Control Engineering Practice, Vol. 1, pp. 71-8, 1993.
[Baruah97] Baruah, S., Gehrke, J., Plaxton, C., Stoica, I., Abdel-Wahab, H., Jeffay, K., "Fair on-line scheduling of a dynamic set of tasks on a single resource", Information Processing Letters, 64(1), pp. 43-51, October 1997.
[Be95] Bershad, B., Fiuczynski, M., Savage, S., Becker, D., et al., "Extensibility, Safety and Performance in the SPIN Operating System", Association for Computing Machinery, SIGOPS '95, Colorado, December 1995.
[Bra99] Brandt, Scott A., "Soft Real-Time Processing with Dynamic QoS Level Resource Management", Ph.D. dissertation, Department of Computer Science, University of Colorado, 1999.
[BraNu98] Brandt, S., Nutt, G., Berk, T., and Mankovich, J., "A Dynamic Quality of Service Middleware Agent for Mediating Application Resource Usage", Proceedings of the 19th IEEE Real-Time Systems Symposium, pp. 307-317, December 1998.
[BriRoy99] Briand, Loïc and Roy, Daniel, Meeting Deadlines in Hard Real-Time Systems – The Rate Monotonic Approach, IEEE Computer Society Press, 1999.
[Bru93] Brunner, B., Hirzinger, G., Landzettel, K., and Heindl, J., "Multisensory shared autonomy and tele-sensor-programming - key issues in the space robot technology experiment ROTEX", IROS '93 International Conference on Intelligent Robots and Systems, Yokohama, Japan, July 1993.
[Bu91] Burns, A., "Scheduling Hard Real-Time Systems: A Review", Software Engineering Journal, May 1991.
[Carlow84] Carlow, Gene, "Architecture of the Space Shuttle Primary Avionics Software System", Communications of the Association for Computing Machinery, Vol. 27, No. 9, September 1984.
[Connex98] Connexant Corp., "Bt878/879 Single-Chip Video and Broadcast Audio Capture for the PCI Bus", manual printed originally by Rockwell Semiconductor Systems, March 1998 (available from www.connexant.com).
[Co94] Coulson, G., Blair, G., and Robin, P., "Micro-kernel Support for Continuous Media in Distributed Systems", Computer Networks and ISDN Systems, No. 26, pp. 1323-1341, 1994.
[Ether96] ISO/IEC Standard 8802/3, "Information Technology – Local and Metropolitan Area Networks – Carrier Sense Multiple Access with Collision Detection (CSMA/CD) Access Method and Physical Layer Specifications", 1996 (supersedes IEEE 802.3), IEEE, New York, NY, 1996.
[Fal94] Fall, K., and Pasquale, J., "Improving Continuous-Media Playback Performance With In-Kernel Data Paths", Proceedings of the IEEE International Conference on Multimedia Computing and Systems (ICMCS), pp. 100-109, Boston, MA, June 1994.
[Fan95] Fan, C., "Realizing a Soft Real-Time Framework for Supporting Distributed Multimedia Applications", Proceedings of the 5th IEEE Workshop on the Future Trends of Distributed Computing Systems, pp. 128-134, August 1995.
[Fle95] Fleischer, S., Rock, S., Lee, M., "Underwater Vehicle Control from a Virtual Environment Interface", Association for Computing Machinery, 1995 Symposium on Interactive 3D Graphics, Monterey, CA, 1995.
[Gov91] Govindan, R., and Anderson, D., "Scheduling and IPC Mechanisms for Continuous Media", 13th ACM Symposium on Operating Systems Principles, 1991.
[HePa90] Hennessy, J., Patterson, D., Computer Architecture – A Quantitative Approach, Morgan Kaufmann, 1990.
[JefGod99] Jeffay, K., and Goddard, S., "A Theory of Rate-Based Execution", Proceedings of the 20th IEEE Real-Time Systems Symposium, pp. 304-314, Phoenix, AZ, December 1999.
[JefSton91] Jeffay, K., Stone, D., Poirier, D., "YARTOS – Kernel Support for Efficient, Predictable, Real-Time Systems", Proceedings of the Joint IEEE Workshop on Real-Time Operating Systems and Software, Atlanta, Georgia, May 1991.
[JonRos97] Jones, M., Rosu, D., Rosu, M., "CPU Reservations and Time Constraints: Efficient Predictable Scheduling of Independent Activities", Proceedings of the 16th ACM Symposium on Operating Systems Principles, October 1997.
[Jos86] Joseph, M., Pandia, P., "Finding Response Times in a Real-Time System", The Computer Journal, British Computing Society, Vol. 29, No. 5, pp. 390-395, October 1986.
[Kl93] Klein, M., Ralya, T., Pollak, B., et al., A Practitioner's Handbook for Real-Time Analysis: Guide to Rate Monotonic Analysis for Real-Time Systems, Kluwer Academic Publishers, Boston, 1993.
[Kl94] Klein, M., Lehoczky, J., and Rajkumar, R., "Rate-Monotonic Analysis for Real-Time Industrial Computing", IEEE Computer, January 1994.
[Lehoc87] Lehoczky, J., Sha, L., Ding, Y., "The Rate Monotonic Scheduling Algorithm: Exact Characterization and Average Case Behavior", Tech. Report, Department of Statistics, Carnegie-Mellon University, Pittsburgh, PA, 1987.
[LiuLay73] Liu, C., and Layland, J., "Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment", Journal of the Association for Computing Machinery, Vol. 20, No. 1, pp. 46-61, January 1973.
[Laplante93] Laplante, P.A., Real-Time Systems Design and Analysis – An Engineer's Handbook, IEEE Press, New York, 1993.
[McCart00] McCartney, C., "DirectX Display Drivers – DirectX 7.0 and Beyond", Windows Hardware Engineering Conference, New Orleans, April 25, 2000.
[MerSav94] Mercer, C., Savage, S., Tokuda, H., "Processor Capacity Reserves: Operating System Support for Multimedia Applications", IEEE International Conference on Multimedia Computing and Systems, Boston, MA, May 1994.
[MosPet96] Mosberger, D. and Peterson, L., "Making Paths Explicit in the Scout Operating System", Second Symposium on Operating Systems Design and Implementation, 1996.
[NiLam96] Nieh, J., and Lam, M., "The Design, Implementation and Evaluation of SMART: A Scheduler for Multimedia Applications", Proceedings of the Sixteenth ACM Symposium on Operating Systems Principles, October 1997.
[Nu95] Nutt, G., Antell, J., Brandt, S., Gantz, C., Griff, A., Mankovich, J., "Software Support for a Virtual Planning Room", Technical Report CU-CS-800-95, Dept. of Computer Science, University of Colorado, Boulder, December 1995.
[NuBra99] Nutt, G., Brandt, S., Griff, A., Siewert, S., Berk, T., Humphrey, M., "Dynamically Negotiated Resource Management for Data Intensive Application Suites", IEEE Transactions on Knowledge and Data Engineering, Vol. 12, No. 1, pp. 78-95, January/February 2000.
[Pa96] Paulos, E., and Canny, J., "Delivering Real Reality to the World Wide Web via Telerobotics", IEEE International Conference on Robotics and Automation.
[POSIX93] ISO Standard 1003.1b, "Standard for Information Technology – Portable Operating System Interface (POSIX), Part 1: System Application Program Interface (API), Realtime Extension [C Language]", IEEE, New York, NY, 1993.
[Red98] Redell, Ola, "Global Scheduling in Distributed Real-Time Computer Systems – An Automatic Control Perspective", Technical Report, Department of Machine Design, Royal Institute of Technology, Stockholm, Sweden, March 1998.
[ShaRaj90] Sha, L., Rajkumar, R., and Lehoczky, J., "Priority Inheritance Protocols: An Approach to Real-Time Synchronization", IEEE Transactions on Computers, 39(9), pp. 1175-1185, September 1990.
[Si96] Siewert, S., "Operating System Support for Parametric Control of Isochronous and Sporadic Execution in Multiple Time Frames", Ph.D. dissertation proposal, University of Colorado Boulder, 1996.
[SiNu96] Siewert, S., and Nutt, G., "A Space Systems Testbed for Situated Agent Observability and Interaction", Second ASCE Specialty Conference on Robotics for Challenging Environments, Albuquerque, New Mexico, June 1996.
[Sprunt89] Sprunt, B., Lehoczky, J., Sha, L., "Aperiodic Task Scheduling for Hard Real-Time Systems", Journal of Real-Time Systems, Vol. 1, pp. 27-60, 1989.
[Sprunt88] Sprunt, B., Sha, L., Lehoczky, J., "Exploiting Unused Periodic Time for Aperiodic Service Using the Extended Priority Exchange Algorithm", Proceedings of the 9th Real-Time Systems Symposium, IEEE, Huntsville, Alabama, pp. 251-258, December 1988.
[Stank88] Stankovic, J., Ramamritham, K., "Hard Real-Time Systems, Tutorial", IEEE Computer Society Press, Washington, D.C., 1988.
[Ste95] Steinmetz, R., and Wolf, L., "Evaluation of a CPU Scheduling Mechanism for Synchronized Multimedia Streams", in Quantitative Evaluation of Computing and Communication Systems, Beilner, H. and Bause, F., eds., Lecture Notes in Computer Science, No. 977, Springer-Verlag, Berlin, 1995.
[TatTez94] Nakajima, Tatsuo and Tezuka, Hiroshi, "A Continuous Media Application Supporting Dynamic QoS Control on Real-Time Mach", Association for Computing Machinery, Multimedia '94, San Francisco, California, 1994.
[Tin93] Tindell, K., Burns, A., Davis, R., "Scheduling Hard Real-Time Multi-Media Disk Traffic", University of York, UK, Computer Science Report YCS 204, 1993.
[Tin94] Tindell, K., Clark, J., "Holistic Schedulability Analysis for Distributed Hard Real-Time Systems", Microprocessors and Microprogramming, Vol. 40, No. 2-3, pp. 117-134, April 1994.
[TinBur92] Tindell, K., Burns, A., and Wellings, A., "Allocating Hard Real-Time Tasks: An NP-hard Problem Made Easy", Journal of Real-Time Systems, Vol. 4, pp. 145-165, Kluwer Academic Publishers, 1992.
[To90] Tokuda, H., Nakajima, T., and Rao, P., "Real-Time Mach: Towards a Predictable Real-Time System", Proceedings of the USENIX Mach Workshop, October 1990.
[Tom87] Tomayko, J., "Computers in Spaceflight: The NASA Experience", Encyclopedia of Computer Science and Technology, Vol. 18, Suppl. 3, Marcel Dekker, New York, pp. 44-47, 1987.
[Tör95] Törngren, M., "On the Modelling of Distributed Real-Time Control Systems", Proceedings of the 13th IFAC Workshop on Distributed Computer Control Systems, Toulouse-Blagnac, France, September 1995.
[Whi99] Whitaker, J., Video and Television Engineering, 3rd Ed., McGraw-Hill Companies Inc., New York, NY, April 1999.
[WRS97] Wind River Systems, "VxWorks Reference Manual 5.3.1", Ed. 1, part # DOC-12068-ZD-00, February 21, 1997.
[WriSte94] Wright, G., Stevens, R., TCP/IP Illustrated, Volume 2 – The Implementation, Addison Wesley Publishing Company, October 1994.


Appendix A

RT EPA Source Code API Specification

rtepa.h

#ifndef _d_rtepa_h_
#define _d_rtepa_h_

#include
#include
#include
#include
#include



/* RTEPA internal time used for relative times including C, D, and T has a
   maximum value of 4 billion microsecs, 4000 seconds, or about 1 hour.
   This seems quite reasonable for release computation time C, deadline
   relative to release D, and release period T. */
#define r_time unsigned long int
#define MAXRTIME _ARCH_UINT_MAX
#define uint unsigned long int
#define MAXUINT _ARCH_UINT_MAX
#define MAXINT _ARCH_INT_MAX
#define MAX_DISPATCH_HISTORY 1000
#define RTEPA_HIGHEST_PRIO 1

#define TASK_ADMISSION_REQUEST 0
#define TASK_REMOVE_REQUEST 1
#define TASK_UPDATE_REQUEST 2
#define TASK_PERFORMANCE_REQUEST 3

#define MAX_MODEL 1000
#define MAX_TASKS 10
#define UNUSED_ID -1
#define MAX_NAME 128

#define QTABLE_SIZE 5000
#define MICROSECS_PER_SEC 1000000
#define NANOSECS_PER_MICROSEC 1000
#define MAX_STACK 32768
#define NO_STAGE_RELEASE -1


/* RT State: 0=RT_STATE_NONE, 1=RT_STATE_ADMITTED, 2=RT_ACTIVATED, 4=RT_SUSPENDED */ #define RT_STATE_NONE 0 #define RT_STATE_ADMITTED 1 #define RT_ACTIVATED 2 #define RT_SUSPENDED 3 #define RT_TERMINATED 3 /* Execution State: 0=EXEC_STATE_NONE, 1=EXEC_STATE_PEND_RELEASE, 2=EXEC_STATE_DISPATCHED, 3=EXEC_STATE_PREEMPTED, 4=EXEC_STATE_COMPLETED */ #define EXEC_STATE_NONE 0 #define EXEC_STATE_PEND_RELEASE 1 #define EXEC_STATE_DISPATCHED 2 #define EXEC_STATE_PREEMPTED 3 #define EXEC_STATE_COMPLETED 4 /* Release State: 0=RELEASE_NONE, 1=PEND_RELEASE, 2=RELEASED, 3=RELEASE_COMPLETED */ #define RELEASE_NONE 0 #define PEND_RELEASE 1 #define RELEASED 2 #define RELEASE_COMPLETED 3 #define DEMOTE_OTHER_TASKS 1 enum exec_model {normal, distfree}; enum task_control {guaranteed, reliable, besteffort}; enum interference_assumption {worstcase, highconf, lowconf, expected}; enum release_type {external_event, single, internal_timer}; enum hard_miss_policy {restart, dismissal}; enum release_complete {isochronous, anytime}; union release_method { /* release specification parameters */ SEM_ID release_sem; /* external source */ /* OR */ timer_t release_itimer; /* internal source */ }; struct normal_model { /* for normal distribution supplied model */ r_time Cmu; r_time Csigma; double HighConf; /* determines Zphigh unit normal dist quantile */ double LowConf; /* determines Zplow unit normal dist quantile */ double Zphigh; /* computed */



double Zplow; /* computed */ r_time Ntrials; }; struct distfree_model { /* for distribution free supplied model */ r_time Csample[MAX_MODEL]; double HighConf; double LowConf; r_time Ntrials; }; struct worst_case_model { /* for worst-case supplied model */ r_time Cwc; }; union model_type { struct normal_model normal_model; /* OR */ struct distfree_model distfree_model; /* OR */ struct worst_case_model worst_case_model; };

struct rtepa_interupt_event_release { FUNCPTR app_isr; SEM_ID event_semaphore; int rtid; };

/* All times are considered microseconds with limit of 4K seconds */ struct rtepa_control_block { /* The main entry point is wrapped by the RTEPA with either a signal handler or a semTake loop such that entry point is called on specified release event. VxWorks task body is kept resident until RT EPA task is removed.



RT EPA on-line stats are computed during kernel context switches. Performance parameters are computed on demand and kept in the CB. */ /***************************** supplied ***************************/ /***************************** required for admission */

/**** Service type */ enum task_control tc_type; enum interference_assumption interference_type; enum exec_model exec_model; union model_type model; /**** Release and deadline specification */ enum release_type release_type; union release_method release_method; r_time Dsoft; /* release relative early soft deadline in microsecs */ r_time Dterm; /* release relative hard deadline where execution is terminated by rtepa in microsecs */ r_time Texp; /* period for release expected in microsecs */ enum hard_miss_policy HardMissActon; FUNCPTR serviceDsoftMissCallback; FUNCPTR serviceReleaseCompleteCallback; /***************************** required for activation */ FUNCPTR entryPt; char name[MAX_NAME]; int stackBytes; char Stack[MAX_STACK+1]; /************************** maintained by rtepa ********************/ int RTEPA_id; int sched_tid; WIND_TCB sched_tcb; WIND_TCB *sched_tcbptr; int assigned_prio; /***************************** optional pipeline I/O ctl source -- int --> stage_0 -- semGive(next) --> stage_1 ... --> sink */ /* Output control */ enum release_complete complete_type; r_time Tout; ULONG Tout_ticks; ULONG Tout_jiffies;


    /* Number of stages being sequenced */
    int NStages;
    /* Pipe stage sequencing of next thread */
    int next_stage_rtid[MAX_TASKS];
    int next_stage_event_releases[MAX_TASKS];
    int next_stage_activation[MAX_TASKS];
    /* Pipe stage sequencing next stage sub-frequency */
    uint next_stage_cycle_freq[MAX_TASKS];
    /* Pipe stage sequencing next stage phasing offset */
    uint next_stage_cycle_offset[MAX_TASKS];
    /***************************** optional pipeline I/O ctl */

    /**** INTERNAL USE ONLY */
    timer_t Dterm_itimer;  /* watchdog timer for termination deadline */
    struct itimerspec dterm_itime;
    struct itimerspec last_dterm_itime;
    struct sigevent   TermEvent;
    struct sigaction  TermAction;
    int flags;

    /************************** On-line Model */
    int ExecState;
    int RTState;
    int ReleaseState;

    /* computed from supplied model */
    r_time Cexp;
    r_time Clow;
    r_time Chigh;

    /* Based on the RT clock frequency and interrupt period the exact time
       can be derived to the accuracy of the oscillator using the number of
       interrupt ticks and portion of a tick (jiffies). */

    /* Event release record */
    ULONG  prev_release_ticks;
    UINT32 prev_release_jiffies;
    ULONG  last_release_ticks[MAX_MODEL];
    UINT32 last_release_jiffies[MAX_MODEL];
    ULONG  last_complete_ticks[MAX_MODEL];
    UINT32 last_complete_jiffies[MAX_MODEL];

    /* Dispatch and preempt time records for current release */


    ULONG  last_dispatch_ticks;
    UINT32 last_dispatch_jiffies;
    ULONG  last_preempt_ticks;
    UINT32 last_preempt_jiffies;

    /* App times */
    ULONG  app_release_ticks[MAX_MODEL];
    UINT32 app_release_jiffies[MAX_MODEL];
    ULONG  app_complete_ticks[MAX_MODEL];
    UINT32 app_complete_jiffies[MAX_MODEL];

    uint Nstart;   /* current on-line model starting index */
    uint Nact;     /* current on-line model complete index */
    uint N;        /* total completions sampled */
    uint Nonline;  /* desired on-line model size */

    r_time Cactcomp[MAX_MODEL];  /* history of actual completion times */
    r_time Cactexec[MAX_MODEL];  /* history of actual execution times */
    r_time Tact[MAX_MODEL];      /* history of actual release periods */

    /* Statistics */
    uint Npreempts;
    uint Ninterferences;
    uint Ndispatches;
    uint SoftMissCnt;
    uint HardMissCnt;
    uint HardMissTerm;
    r_time HardMissCactcomp[MAX_MODEL];  /* history of actual completion times */
    r_time SoftMissCactcomp[MAX_MODEL];  /* history of actual completion times */
    uint ReleaseCnt;
    uint CompleteCnt;
    uint ReleaseError;
    uint CompleteError;
    uint ExecError;

    /* On demand performance model */
    /* Model expectation */
    r_time Cexpactcomp;
    r_time Clowactcomp;
    r_time Chighactcomp;
    r_time Cexpactexec;
    r_time Clowactexec;
    r_time Chighactexec;
    r_time Texpact;
    double HardReliability;
    double SoftReliability;
    r_time ActConfDsoft;
    r_time ActConfDhard;


};

int rtepaInitialize(FUNCPTR safing_callback, int init_mask, r_time monitor_period);
int rtepaShutdown(int shutdown_mask);

int rtepaTaskAdmit(int *rtid,
                   enum task_control tc_type,
                   enum interference_assumption interference,
                   enum exec_model exec_model,
                   union model_type *modelPtr,
                   enum hard_miss_policy miss_control,
                   r_time Dsoft, r_time Dterm, r_time Texp,
                   double *SoftConf, double *TermConf,
                   char *name);
int rtepaTaskDismiss(int rtid);

int rtepaTaskActivate(int rtid,
                      FUNCPTR entryPt,
                      FUNCPTR serviceDsoftMissCallback,
                      FUNCPTR serviceReleaseCompleteCallback,
                      enum release_complete complete_control,
                      int stackBytes,
                      enum release_type release_type,
                      union release_method release_method,
                      uint Nonline);
int rtepaTaskSuspend(int rtid);
int rtepaTaskResume(int rtid);
int rtepaTaskDelete(int rtid);

int rtepaTaskPrintPerformance(int rtid);
int rtepaIDFromTaskID(WIND_TCB *tcbptr);
int rtepaInTaskSet(int tid);
int rtepaTaskPrintActuals(int rtid);
int rtepaTaskPrintCompare(int rtid);

int rtepaPCIx86IRQReleaseEventInitialize(int rtid, SEM_ID event_semaphore,
                                         unsigned char x86irq,
                                         FUNCPTR isr_entry_pt);

void rtepaPipelineSeq(int src_rtid, int sink_rtid, int sink_release_freq,
                      int sink_release_offset, SEM_ID sink_release_sem);


int rtepaRegisterPerfMon(int rtid, FUNCPTR renegotiation_callback, int monitor_mask);
int rtepaPerfMonUpdateAll(void);
int rtepaPerfMonUpdateService(int rtid);

r_time rtepaPerfMonDtermFromNegotiatedConf(int rtid);
r_time rtepaPerfMonDsoftFromNegotiatedConf(int rtid);
double rtepaPerfMonConfInDterm(int rtid);
double rtepaPerfMonConfInDsoft(int rtid);
double rtepaPerfMonDtermReliability(int rtid);
double rtepaPerfMonDsoftReliability(int rtid);

r_time rtepaPerfMonCexp(int rtid);
r_time rtepaPerfMonChigh(int rtid);
r_time rtepaPerfMonClow(int rtid);
r_time rtepaPerfMonRTexp(int rtid);
r_time rtepaPerfMonRhigh(int rtid);
r_time rtepaPerfMonRlow(int rtid);

int rtepaLoadModelFromArray(r_time *sample_array, r_time *sample_src, int n);
int rtepaTaskSaveCactexec(int rtid, char *name);
int rtepaTaskLoadCactexec(r_time *model_array, char *name);

void rtepaSetIsochronousOutput(int rtid, r_time Tout);

#endif
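The listing above defines only the calling interface. As an illustration of how the admission and activation calls fit together, the following hypothetical sketch admits and activates a single service released by an external semaphore event. The header name, all parameter values, the NULL callbacks, and the assumption that the calls return OK on success are illustrative only; the prototypes, enum constants, and model fields are those declared above.

#include "vxWorks.h"
#include "semLib.h"
#include "rtepa.h"   /* assumed name for the interface listed above */

/* Hypothetical service entry point, called once per release */
static int frameService(void)
{
    /* ... per-release application work ... */
    return OK;
}

/* Hypothetical safing callback for unrecoverable execution faults */
static int safingCallback(void)
{
    return OK;
}

void exampleAdmitAndActivate(void)
{
    union model_type     model;
    union release_method rel;
    double SoftConf, TermConf;
    int    rtid;

    /* Off-line execution time model: normal distribution (fields from the
       normal_model structure above); Cmu and Csigma in microseconds, the
       confidences select the Clow/Chigh quantiles.  Values are illustrative. */
    model.normal_model.Cmu      = 10000;
    model.normal_model.Csigma   = 1000;
    model.normal_model.HighConf = 0.99;
    model.normal_model.LowConf  = 0.90;
    model.normal_model.Ntrials  = 1000;  /* off-line trials behind the model */

    /* External release event source (e.g. given by a device ISR) */
    rel.release_sem = semBCreate(SEM_Q_FIFO, SEM_EMPTY);

    /* init_mask of 0 and a 1-second monitor period are illustrative */
    rtepaInitialize((FUNCPTR)safingCallback, 0, 1000000);

    if (rtepaTaskAdmit(&rtid, reliable, highconf, normal, &model, dismissal,
                       50000 /* Dsoft */, 66000 /* Dterm */, 66666 /* Texp */,
                       &SoftConf, &TermConf, "frameSvc") == OK)
    {
        /* NULL callbacks assumed to mean "none"; stack size and on-line
           model depth (Nonline) are illustrative */
        rtepaTaskActivate(rtid, (FUNCPTR)frameService, NULL, NULL,
                          anytime, 8192, external_event, rel, 100);
    }
}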


Appendix B

Loading Analysis for Image Centroid Calculation with Variance Due to Cache Misses

Centroid Calculation Performance Comparison (X2000 132 MHz PowerPC 750 and 33 MHz RAD6000)

11.1 Architecture Performance Assumptions

Pipeline (4 Cycles):
1) INSTRUCTION FETCH
2) INSTRUCTION DECODE
3) EXECUTION
4) WRITE-BACK

Instructions are cached and fetched in ONE clock by each execution unit.
Instructions are decoded in ONE clock by each execution unit.
Execution takes ONE clock (note that registers are loaded by instruction).
Write-back to cache takes ONE clock.

Any cache miss is assumed to stall the pipeline. For superscalar architectures, adjust for the number of execution units.

X2000 132 MHz PowerPC 750

Clock:          132 MHz, 7.58 nanoseconds
L1 Cache:       32 Kbyte 8-way set associative data and instruction
L2 Cache:       NONE
ALU:            pipelined, superscalar, CPI=0.33 best case mixed FP and integer,
                CPI=0.5 integer only, CPI=2 worst case if both integer pipelines
                are stalled
Data local-bus: 64 bits
I/O bus:        32 bits

11.2 33 MHz RAD6000 Analysis

Clock:          33 MHz, 30.3 nanoseconds
L1 Cache:       8 Kbyte 8-way set associative data and instruction
L2 Cache:       NONE
ALU:            pipelined, CPI=1.0 best case, CPI=4.0 if pipeline is stalled
Data local-bus: 32 bits
I/O bus:        32 bits

Note: Neither cache is large enough to hold a DMA-transferred frame, so references to memory containing the frame buffer will always cause a cache miss as the frame is traversed; that is, there is no good way to keep the frame values cached, only the temporary variables used in the calculations, in both cases.

11.3 Centroid Computation Time Model

Supplying initial time models to the RT EPA can be done by analyzing code or by off-line experimental runs. If the code is not yet implemented, and several algorithms are under consideration, the method outlined here to approximate computation time may be useful. The key is to determine the complexity of the algorithm, the native architecture instructions required for the algorithm, and the effect of the code on the native architecture pipeline. The best execution model will always be one based upon actual execution, but a method of approximation is useful during application design.
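When the code does exist, the off-line route reduces to collecting execution-time samples and summarizing them into the supplied model. A minimal sketch of that step is shown below; the function name, the use of double instead of r_time, and the assumption of at least two samples are illustrative, but the Cmu and Csigma values it produces correspond to the normal_model fields declared in Appendix A.

#include <math.h>

/* Summarize off-line execution time samples (microseconds) into the mean
   and sample standard deviation used by the normal distribution model.
   Assumes n >= 2. */
void buildNormalModel(const double *Csample, int n, double *Cmu, double *Csigma)
{
    double sum = 0.0, sumsq = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        sum   += Csample[i];
        sumsq += Csample[i] * Csample[i];
    }
    *Cmu    = sum / n;
    *Csigma = sqrt((sumsq - n * (*Cmu) * (*Cmu)) / (n - 1));
}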

11.3.1 Algorithm Description

x-bar = sum(x * m) / M, where x-bar is the weighted-mean coordinate, m is the increment mass (or brightness), and M is the total image mass.

Brute-force processing of each frame of image data takes at least:

1 million multiplies (for each of x and y)
1 million adds (for total brightness)
1 divide (for each of x and y)
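For reference, a C-level rendering of this brute-force computation is sketched below; the function and variable names, the 8-bit pixel type, and the 64-bit accumulators are illustrative assumptions, and the load-store pseudo-code in the next subsection corresponds to its inner loops.

/* Brute-force weighted-mean (centroid) of an n = rows*cols pixel frame.
   The frame is assumed to have been DMA-transferred into memory, so each
   frame[] reference is expected to miss in cache. */
void centroid(const unsigned char *frame, int rows, int cols,
              double *Xbar, double *Ybar)
{
    unsigned long long M = 0, Xsum = 0, Ysum = 0;  /* 64-bit to avoid overflow */
    int x, y;

    for (y = 0; y < rows; y++) {
        for (x = 0; x < cols; x++) {
            unsigned long long m = frame[y * cols + x]; /* pixel brightness */
            M    += m;                                  /* total image mass */
            Xsum += (unsigned long long)x * m;          /* sum(x*m)         */
            Ysum += (unsigned long long)y * m;          /* sum(y*m)         */
        }
    }
    if (M > 0) {
        *Xbar = (double)Xsum / (double)M;
        *Ybar = (double)Ysum / (double)M;
    }
}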

11.3.2 Load-Store RISC Pseudo-code Instructions to Implement X-bar and Y-bar

TB_LOOP:
    load  r1, M            -- cache hit on total M
    load  r2, m[r0]        -- cache miss to load pixel brightness
    iadd  r1,r2,r3
    store r3, M            -- cache hit
    incr  r0
    jne   r0, TB_LOOP

    zero  r0
    zero  r31              -- DX = 1 AND DY = 1
Y_LOOP:
X_LOOP:
    imul  r0,r31,r30       -- doubly dimensioned array index calculation
    load  r1, m[r30]       -- cache miss to load pixel brightness
    imul  r0,r1,r29        -- (x*m)
    load  r28, Xsum        -- cache hit
    iadd  r28,r29,r27      -- sum(x*m)
    store r27, Xsum        -- cache hit
    imul  r31,r1,r29       -- (y*m)
    load  r28, Ysum        -- cache hit
    iadd  r28,r29,r27      -- sum(y*m)
    store r27, Ysum        -- cache hit
    incr  r0
    jne   r0, X_LOOP

    zero  r0
    incr  r31
    jne   r31, Y_LOOP

    load  r0, M
    load  r1, Xsum
    load  r2, Ysum
    idiv  r1,r0,r3
    idiv  r2,r0,r4
    store r3, Xbar
    store r4, Ybar

11.4 Overall Expected Cache Hit Rate

Variables assumed cached include: X-bar, Y-bar, x, y, M. Given large array sizes that cannot be cached after DMA transfer, it is assumed that each pixel sample m reference causes a cache miss every time. Therefore, analyzing the load-store reduced-instruction-set (RISC) pseudo-code above, one gets the following cache hit rates for each significant section of code:

TB_LOOP: 0.66 hit rate (2 of the 3 memory references hit)
Y_LOOP, X_LOOP: 0.8 hit rate (4 of the 5 memory references hit)

Both loops have n iterations, where n is the number of pixels, so the overall expected cache hit rate is:

(0.66 + 0.8)/2 = 0.73 centroid computation hit rate

11.5 Centroid CPI Estimations

CPIr6k = (0.73 * 1.0) + (0.27 * 4.0) = 0.73 + 1.08 = 1.81
CPIppc = (0.73 * 0.5) + (0.27 * 2.0) = 0.365 + 0.54 = 0.905

11.6 Algorithm Complexity

M:    6n instructions (the TB_LOOP body is 6 instructions per pixel)
X, Y: 12n instructions (the X_LOOP body is 12 instructions per pixel)

Nf = 18n, where n is the number of pixels and Nf is the number of instructions per frame.

The final calculation and the outer loop overhead are not significant.


11.7 Time to Compute Array Centroid

Tf = (Nf * CPI) * Tclk

Tf-r6k = (18n * 1.81) * 30.3 nanoseconds * (1 sec / 1e+9 nsecs)
Tf-ppc = (18n * 0.905) * 7.58 nanoseconds * (1 sec / 1e+9 nsecs)

11.8 Example for 1024x1024 Array

Tf-r6k = (18 * 1024 * 1024 * 1.81) * 30.3 nanoseconds * (1 sec / 1e+9 nsecs) = 1.035 seconds
Tf-ppc = (18 * 1024 * 1024 * 0.905) * 7.58 nanoseconds * (1 sec / 1e+9 nsecs) = 0.1295 seconds
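The arithmetic above can be reproduced with a short check program; the sketch below is hypothetical and included only to make the estimate repeatable for other frame sizes or CPI assumptions.

#include <stdio.h>

/* Tf = (Nf * CPI) * Tclk, with Nf = 18n instructions per frame */
static double frame_time_sec(double n_pixels, double cpi, double tclk_ns)
{
    double Nf = 18.0 * n_pixels;          /* instructions per frame */
    return (Nf * cpi * tclk_ns) / 1.0e9;  /* nanoseconds to seconds */
}

int main(void)
{
    double n = 1024.0 * 1024.0;  /* 1024x1024 frame */

    printf("RAD6000:     %.4f sec\n", frame_time_sec(n, 1.81, 30.3));   /* ~1.035  */
    printf("PowerPC 750: %.4f sec\n", frame_time_sec(n, 0.905, 7.58));  /* ~0.1295 */
    return 0;
}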

11.9 General Result

Tf-r6k / Tf-ppc = (1.81 * 30.3) / (0.905 * 7.58)
Tf-r6k = 8 * Tf-ppc

Time to compute the centroid on the RAD6000 is 8x as long as it takes on the PPC 750.


Appendix C

Unmodeled Interference Causes Several Termination Deadline Misses

Script started on Tue Jun 27 09:01:15 2000 -> ld < rtepaLib.o Undefined symbols: _rtepaTaskDismiss Warning: object module may not be usable because of undefined symbols. value = 655440 = 0xa0050 -> setout Original setup: sin=3, sout=3, serr=3 All being remapped to your virtual terminal... You should see this message now!!! value = 35 = 0x23 = '#' = precis + 0x3 -> start_race microseconds_per_tick = 9.998491e+02, microseconds_per_jiffy = 4.190483e-01 Intel NB controller PCI concurrency enable = 0x8 Modified Intel NB controller PCI concurrency enable = 0x8 Intel NB controller PCI latency timer = 0x40 Modified Intel NB controller PCI latency timer = 0x40 Intel NB controller PCI Cmd Reg = 0x6 Modified Intel NB controller PCI Cmd Reg = 0x6 Intel NB controller PCI ARB CTL = 0x80 PCI 2.1 Compliant Intel NB controller PCI ARB CTL = 0x80 Intel SB controller latency control = 0x3 PCI 2.1 Compliant Intel SB controller latency control = 0x3 Intel SB controller IRQ Routing Reg = 0xb808080 Modified Intel SB controller IRQ Routing Reg = 0x6808080 Intel SB controller APIC Addr Reg = 0x0 BAR 0 testval=0xe2001008 before any write BAR 0 MMIO testval=0xfffff008 BAR 1 testval=0x0 before any write BAR 1 not implemented BAR 2 testval=0x0 before any write BAR 2 not implemented BAR 3 testval=0x0 before any write BAR 3 not implemented BAR 4 testval=0x0 before any write BAR 4 not implemented BAR 5 testval=0x0 before any write BAR 5 not implemented Found Bt878 configured for IRQ 11 Bt878 Allowable PCI bus latency = 0x40 Bt878 PCI bus min grant = 0x10 Bt878 PCI bus max latency = 0x28 Modified Bt878 Allowable PCI bus latency = 0xff mmio DSTATUS testval = 0xa6 **** VIDEO PRESENT **** DECODING EVEN FIELD **** PLL OUT OF LOCK


**** LUMA ADC OVERFLOW mmio INTSTATUS testval = 0xe300022e I2C RACK DMA_MC_SKIP DMA_MC_JUMP DMA_MC_SYNC DMA DISABLED EVEN FIELD VIDEO PRESENT CHANGE DETECTED LUMA/CHROMA OVERFLOW DETECTED mmio CAPTURE_CNT = 0x0 mmio DMA PC = 0x447fedc mmio INTSTATUS testval = 0xe300022e I2C RACK DMA_MC_SKIP DMA_MC_JUMP DMA_MC_SYNC DMA DISABLED EVEN FIELD VIDEO PRESENT CHANGE DETECTED LUMA/CHROMA OVERFLOW DETECTED mmio CAPTURE_CNT = 0x0 mmio DMA PC = 0x447fedc Timing Gen Ctl Reg = 0x0 Configured NTSC Setting INPUT_REG = 0x79 Set mux Loaded MC mmio INTSTATUS testval = 0xeb000204 I2C RACK DMA_MC_SKIP DMA_MC_JUMP DMA_MC_SYNC DMA ENABLED EVEN FIELD mmio CAPTURE_CNT = 0x0 mmio DMA PC = 0x37d610 Brightness was 128 Setting INPUT_REG = 0x19 Starting video Video startedScam Servo Driver Serial Interface Driver /tyCo/0 intialized and opened with status=0 OOPIC Servo Driver Serial Interface OOPIC driver /tyCo/1 intialized and opened with status=0 Entry pointer passed in = 0x383d3c, and assigned = 0x383d3c RTEPA stack base = 0x1c0f14c RTEPA_CB[1].dterm_itime.it_value.tv_sec = 0 RTEPA_CB[1].dterm_itime.it_value.tv_nsec = 50001000 Created RTEPA task 1 with tcbptr=0x1c172e8 Entry pointer passed in = 0x38444c, and assigned = 0x38444c RTEPA stack base = 0x1c25028 RTEPA_CB[2].dterm_itime.it_value.tv_sec = 0 RTEPA_CB[2].dterm_itime.it_value.tv_nsec = 66000000 Created RTEPA task 2 with tcbptr=0x1c2d1c4 Entry pointer passed in = 0x383ae4, and assigned = 0x383ae4 RTEPA stack base = 0x1c3af04 Created RTEPA task 3 with tcbptr=0x1c430a0


RTEPA_CB[3].dterm_itime.it_value.tv_sec = 0 RTEPA_CB[3].dterm_itime.it_value.tv_nsec = 66600000 Entry pointer passed in = 0x38390c, and assigned = 0x38390c RTEPA stack base = 0x1c50de0 Created RTEPA task 4 with tcbptr=0x1c58f7c RTEPA_CB[4].dterm_itime.it_value.tv_sec = 0 RTEPA_CB[4].dterm_itime.it_value.tv_nsec = 200000000 Entry pointer passed in = 0x3839fc, and assigned = 0x3839fc RTEPA stack base = 0x1c66cbc Created RTEPA task 5 with tcbptr=0x1c6ee58 RTEPA_CB[5].dterm_itime.it_value.tv_sec = 0 RTEPA_CB[5].dterm_itime.it_value.tv_nsec = 500000000 Entry pointer passed in = 0x383b68, and assigned = 0x383b68 RTEPA stack base = 0x1c7cb98 Created RTEPA task 6 with tcbptr=0x1c84d34 ******** RACE system fully activated ******** RTEPA_CB[6].dterm_itime.it_value.tv_sec = 1 RTEPA_CB[6].dterm_itime.it_value.tv_nsec = 0 value = 46 = 0x2e = '.' = s_B + 0x6 -> prio_dump_frame(5) value = 32989940 = 0x1f762f4 -> ****************** MISSED HARD DEADLINE [rtid=5, release=47] ****************** RESTARTING ****************** MISSED HARD DEADLINE [rtid=6, release=148] ****************** RESTARTING -> stop_race Actual pipeline sequencing rtid=0 completed 2294 times, activated next stage @ 192 => next_stage_rtid=1 released 701 times [specified freq = 3, offset = 0, expected releases = 700] rtid=0 completed 2294 times, activated next stage @ 223 => next_stage_rtid=2 released 1036 times [specified freq = 2, offset = 5, expected releases = 1035] rtid=0 completed 2294 times, activated next stage @ 252 => next_stage_rtid=3 released 1022 times [specified freq = 2, offset = 0, expected releases = 1021] rtid=0 completed 2294 times, activated next stage @ 282 => next_stage_rtid=4 released 336 times [specified freq = 6, offset = 0, expected releases = 335] rtid=0 completed 2294 times, activated next stage @ 330 => next_stage_rtid=5 released 66 times [specified freq = 30, offset = 0, expected releases = 65] rtid=0 completed 2294 times, activated next stage @ 350 => next_stage_rtid=6 released 195 times [specified freq = 10, offset = 0, expected releases = 194] ******** Performance Summary for rtid=0, prio=1, tcbptr=0x1c0127c ******** Dispatch parameters Dsoft=40000, Dterm=50000, Texp=33333, Cexp=100 ******** Initial model ********


High Conf = 1.000000 Low Conf = 1.000000 Cexp = 100 Chigh = 200 Clow = 200 ******** On-line model ******** Dhard from actual dist free confidence interval =479 Dsoft from actual dist free confidence interval =479 Confidence in supplied Dhard based on exec time=0.999000 Confidence in supplied Dsoft based on exec time=0.999000 Confidence in supplied Dhard based on complete time=0.999000 Confidence in supplied Dsoft based on complete time=0.999000 N samples =2294 Start sample index =294 Last sample index =294 ReleaseCnt=2294 CompleteCnt=2294 Npreempts=2307 Ninterferences=13 Ndispatches=2307 Texpact=33334 Cexpactexec=135 Clowactexec=0 Chighactexec=1178 Cexpactcomp=135 Clowactcomp=0 Chighactcomp=1178 ******** Deadline performance ******** SoftMissCnt=0 HardMissCnt=0 HardMissTerm=0 ******** Execution performance ******** SoftReliability=1.000000 HardReliability=1.000000 ******** Execution errors ******** ReleaseError=0 CompleteError=0 ExecError=0

********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f683d4 Total dispatches/preemptions=36576 gv_rtepa_dispatch_cnt=9823 gv_rtepa_preempt_cnt=9823

******** Performance Summary for rtid=1, prio=2, tcbptr=0x1c17158 ******** Dispatch parameters Dsoft=50000, Dterm=50001, Texp=100000, Cexp=10000


******** Initial model ******** High Conf = 0.900000 Low Conf = 0.500000 Cexp = 10000 Chigh = 10164 Clow = 10067 ******** On-line model Dhard from actual dist Dsoft from actual dist Confidence in supplied Confidence in supplied Confidence in supplied Confidence in supplied N samples =701 Start sample index =0 Last sample index =701 ReleaseCnt=701 CompleteCnt=701 Npreempts=1403 Ninterferences=702 Ndispatches=1403 Texpact=100104 Cexpactexec=299846 Clowactexec=37558 Chighactexec=182941525 Cexpactcomp=299846 Clowactcomp=37558 Chighactcomp=182941525

******** free confidence interval =39471 free confidence interval =38865 Dhard based on exec time=0.999000 Dsoft based on exec time=0.999000 Dhard based on complete time=0.999000 Dsoft based on complete time=0.999000

******** Deadline performance ******** SoftMissCnt=0 HardMissCnt=0 HardMissTerm=0 ******** Execution performance ******** SoftReliability=1.000000 HardReliability=1.000000 ******** Execution errors ******** ReleaseError=0 CompleteError=0 ExecError=0

********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f683d4 Total dispatches/preemptions=36756 gv_rtepa_dispatch_cnt=9823 gv_rtepa_preempt_cnt=9823

******** Performance Summary for rtid=2, prio=3, tcbptr=0x1c2d034 ******** Dispatch parameters Dsoft=66000, Dterm=66000, Texp=66666, Cexp=10000


******** Initial model ******** High Conf = 0.990000 Low Conf = 0.900000 Cexp = 10000 Chigh = 10257 Clow = 10164 ******** On-line model Dhard from actual dist Dsoft from actual dist Confidence in supplied Confidence in supplied Confidence in supplied Confidence in supplied N samples =1036 Start sample index =36 Last sample index =36 ReleaseCnt=1036 CompleteCnt=1036 Npreempts=1038 Ninterferences=2 Ndispatches=1038 Texpact=66669 Cexpactexec=21407 Clowactexec=0 Chighactexec=26974 Cexpactcomp=21407 Clowactcomp=0 Chighactcomp=26974

******** free confidence interval =61693 free confidence interval =60990 Dhard based on exec time=0.999000 Dsoft based on exec time=0.999000 Dhard based on complete time=0.998000 Dsoft based on complete time=0.998000

******** Deadline performance ******** SoftMissCnt=1 SoftMiss C[0] = 0 HardMissCnt=1 HardMissTerm=0 HardMiss C[0] = 0 ******** Execution performance ******** SoftReliability=0.999035 HardReliability=0.999035 ******** Execution errors ******** ReleaseError=0 CompleteError=0 ExecError=0

********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f683d4 Total dispatches/preemptions=36920 gv_rtepa_dispatch_cnt=9823 gv_rtepa_preempt_cnt=9823

******** Performance Summary for rtid=3, prio=4, tcbptr=0x1c42f10


******** Dispatch parameters Dsoft=66600, Dterm=66600, Texp=66667, Cexp=500 ******** Initial model ******** High Conf = 1.000000 Low Conf = 1.000000 Cexp = 500 Chigh = 1500 Clow = 1500 ******** On-line model Dhard from actual dist Dsoft from actual dist Confidence in supplied Confidence in supplied Confidence in supplied Confidence in supplied N samples =1022 Start sample index =22 Last sample index =22 ReleaseCnt=1022 CompleteCnt=1022 Npreempts=1025 Ninterferences=3 Ndispatches=1025 Texpact=66669 Cexpactexec=184 Clowactexec=0 Chighactexec=1214 Cexpactcomp=184 Clowactcomp=0 Chighactcomp=1214

******** free confidence interval =64748 free confidence interval =64748 Dhard based on exec time=0.999000 Dsoft based on exec time=0.999000 Dhard based on complete time=0.999000 Dsoft based on complete time=0.999000

******** Deadline performance ******** SoftMissCnt=0 HardMissCnt=0 HardMissTerm=0 ******** Execution performance ******** SoftReliability=1.000000 HardReliability=1.000000 ******** Execution errors ******** ReleaseError=0 CompleteError=0 ExecError=0

********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f683d4 Total dispatches/preemptions=37096 gv_rtepa_dispatch_cnt=9823 gv_rtepa_preempt_cnt=9823


******** Performance Summary for rtid=4, prio=5, tcbptr=0x1c58dec ******** Dispatch parameters Dsoft=150000, Dterm=200000, Texp=200000, Cexp=500 ******** Initial model ******** High Conf = 0.500000 Low Conf = 0.200000 Cexp = 500 Chigh = 567 Clow = 525 ******** On-line model Dhard from actual dist Dsoft from actual dist Confidence in supplied Confidence in supplied Confidence in supplied Confidence in supplied N samples =322 Start sample index =0 Last sample index =322 ReleaseCnt=336 CompleteCnt=322 Npreempts=376 Ninterferences=54 Ndispatches=376 Texpact=200212 Cexpactexec=577772 Clowactexec=0 Chighactexec=185913477 Cexpactcomp=577772 Clowactcomp=0 Chighactcomp=185913477

******** free confidence interval =0 free confidence interval =0 Dhard based on exec time=0.999000 Dsoft based on exec time=0.999000 Dhard based on complete time=0.999000 Dsoft based on complete time=0.999000

******** Deadline performance ******** SoftMissCnt=0 HardMissCnt=0 HardMissTerm=0 ******** Execution performance ******** SoftReliability=1.000000 HardReliability=1.000000 ******** Execution errors ******** ReleaseError=0 CompleteError=80 ExecError=0

********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f683d4 Total dispatches/preemptions=37256 gv_rtepa_dispatch_cnt=9823 gv_rtepa_preempt_cnt=9823


******** Performance Summary for rtid=5, prio=6, tcbptr=0x1c6ecc8 ******** Dispatch parameters Dsoft=400000, Dterm=500000, Texp=500000, Cexp=60000 ******** Initial model ******** High Conf = 0.800000 Low Conf = 0.500000 Cexp = 60000 Chigh = 60128 Clow = 60067 ******** On-line model Dhard from actual dist Dsoft from actual dist Confidence in supplied Confidence in supplied Confidence in supplied Confidence in supplied N samples =60 Start sample index =0 Last sample index =60 ReleaseCnt=66 CompleteCnt=60 Npreempts=2908 Ninterferences=2848 Ndispatches=2908 Texpact=992740 Cexpactexec=3171435 Clowactexec=36983 Chighactexec=186960233 Cexpactcomp=3171435 Clowactcomp=36983 Chighactcomp=186960233

******** free confidence interval =0 free confidence interval =0 Dhard based on exec time=0.999000 Dsoft based on exec time=0.999000 Dhard based on complete time=0.998000 Dsoft based on complete time=0.994000

******** Deadline performance ******** SoftMissCnt=6 SoftMiss C[0] = 0 SoftMiss C[1] = 473495 SoftMiss C[2] = 498358 SoftMiss C[3] = 991332 SoftMiss C[4] = 490276 SoftMiss C[5] = 566849 HardMissCnt=2 HardMissTerm=1 HardMiss C[0] = 0 HardMiss C[1] = 991332 ******** Execution performance ******** SoftReliability=0.900000 HardReliability=0.966667 ******** Execution errors ******** ReleaseError=0 CompleteError=31 ExecError=0


********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f683d4 Total dispatches/preemptions=37450 gv_rtepa_dispatch_cnt=9823 gv_rtepa_preempt_cnt=9823

******** Performance Summary for rtid=6, prio=7, tcbptr=0x1c84ba4 ******** Dispatch parameters Dsoft=1000000, Dterm=1000000, Texp=1000000, Cexp=500 ******** Initial model ******** High Conf = 1.000000 Low Conf = 1.000000 Cexp = 500 Chigh = 1500 Clow = 1500 ******** On-line model Dhard from actual dist Dsoft from actual dist Confidence in supplied Confidence in supplied Confidence in supplied Confidence in supplied N samples =126 Start sample index =0 Last sample index =126 ReleaseCnt=195 CompleteCnt=126 Npreempts=766 Ninterferences=640 Ndispatches=766 Texpact=338948 Cexpactexec=1491956 Clowactexec=352 Chighactexec=187915109 Cexpactcomp=1491956 Clowactcomp=352 Chighactcomp=187915109

******** free confidence interval =299940 free confidence interval =299940 Dhard based on exec time=0.999000 Dsoft based on exec time=0.999000 Dhard based on complete time=0.999000 Dsoft based on complete time=0.999000

******** Deadline performance ******** SoftMissCnt=0 HardMissCnt=0 HardMissTerm=1 ******** Execution performance ******** SoftReliability=1.000000 HardReliability=1.000000 ******** Execution errors ******** ReleaseError=0 CompleteError=44


ExecError=0

********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f683d4 Total dispatches/preemptions=37628 gv_rtepa_dispatch_cnt=9823 gv_rtepa_preempt_cnt=9823

Canceling timer for task 0 Suspended task 0

Canceling timer for task 1 Suspended task 1

Canceling timer for task 2 Suspended task 2

Canceling timer for task 3 Suspended task 3

Canceling timer for task 4 Suspended task 4

Canceling timer for task 5 Suspended task 5

Canceling timer for task 6 Suspended task 6 Deleted task 0 Deleted task 1 Deleted task 2 Deleted task 3 Deleted task 4 Deleted task 5 Deleted task 6 value = 0 = 0x0 -> exit thinker exit thinker exit script done on Tue Jun 27 09:02:56 2000


Appendix D RACE Initial Scheduling and Configuration Admission Results ********************Admit test [Ntasks = 7] **** Thread 0 => D[0]=50000 Util=0.021160, Intf=0.000000 for thread 0 S[0]=0.021160 **** Thread 0 can be scheduled safely **** Thread 1 => D[1]=50001 i=1, j=0, D[1]=50001, D[0]=50000, T[0]=33333, C[0]=1058 Nfull=1 for 1 from 0 Cfull=1058 for 1 from 0 Ifull=1058 for 1 from 0 Npart=1 for 1 from 0 Cpart=1058 for 1 from 0 Ipart=1058 for 1 from 0 Int=2116 for 1 from 0 Util=0.783504, Intf=0.042319 for thread 1 S[1]=0.825823 **** Thread 1 can be scheduled safely **** Thread 2 => D[2]=66000 i=2, j=0, D[2]=66000, D[0]=50000, T[0]=33333, C[0]=1058 Nfull=1 for 2 from 0 Cfull=1058 for 2 from 0 Ifull=1058 for 2 from 0 Npart=1 for 2 from 0 Cpart=1058 for 2 from 0 Ipart=1058 for 2 from 0 Int=2116 for 2 from 0 i=2, j=1, D[2]=66000, D[1]=50001, T[1]=100000, C[1]=39176 Nfull=1 for 2 from 1 Cfull=39176 for 2 from 1 Ifull=39176 for 2 from 1 Npart=0 for 2 from 1 Cpart=39176 for 2 from 1 Ipart=0 for 2 from 1 Int=41292 for 2 from 1 Util=0.334348, Intf=0.625636 for thread 2 S[2]=0.959985 **** Thread 2 can be scheduled safely **** Thread 3 => D[3]=66600 i=3, j=0, D[3]=66600, D[0]=50000, T[0]=33333, C[0]=1058 Nfull=1 for 3 from 0 Cfull=1058 for 3 from 0 Ifull=1058 for 3 from 0 Npart=1 for 3 from 0 Cpart=1058 for 3 from 0 Ipart=1058 for 3 from 0 Int=2116 for 3 from 0 i=3, j=1, D[3]=66600, D[1]=50001, T[1]=100000, C[1]=39176


Nfull=1 for 3 from 1 Cfull=39176 for 3 from 1 Ifull=39176 for 3 from 1 Npart=0 for 3 from 1 Cpart=39176 for 3 from 1 Ipart=0 for 3 from 1 Int=41292 for 3 from 1 i=3, j=2, D[3]=66600, D[2]=66000, T[2]=66666, C[2]=22067 Nfull=1 for 3 from 2 Cfull=22067 for 3 from 2 Ifull=22067 for 3 from 2 Npart=0 for 3 from 2 Cpart=22067 for 3 from 2 Ipart=0 for 3 from 2 Int=63359 for 3 from 2 Util=0.017973, Intf=0.951336 for thread 3 S[3]=0.969309 **** Thread 3 can be scheduled safely **** Thread 4 => D[4]=200000 i=4, j=0, D[4]=200000, D[0]=50000, T[0]=33333, C[0]=1058 Nfull=5 for 4 from 0 Cfull=1058 for 4 from 0 Ifull=5290 for 4 from 0 Npart=2 for 4 from 0 Cpart=2 for 4 from 0 Ipart=4 for 4 from 0 Int=5294 for 4 from 0 i=4, j=1, D[4]=200000, D[1]=50001, T[1]=100000, C[1]=39176 Nfull=2 for 4 from 1 Cfull=39176 for 4 from 1 Ifull=78352 for 4 from 1 Npart=0 for 4 from 1 Cpart=0 for 4 from 1 Ipart=0 for 4 from 1 Int=83646 for 4 from 1 i=4, j=2, D[4]=200000, D[2]=66000, T[2]=66666, C[2]=22067 Nfull=3 for 4 from 2 Cfull=22067 for 4 from 2 Ifull=66201 for 4 from 2 Npart=1 for 4 from 2 Cpart=2 for 4 from 2 Ipart=2 for 4 from 2 Int=149849 for 4 from 2 i=4, j=3, D[4]=200000, D[3]=66600, T[3]=66667, C[3]=1197 Nfull=3 for 4 from 3 Cfull=1197 for 4 from 3 Ifull=3591 for 4 from 3 Npart=0 for 4 from 3 Cpart=1197 for 4 from 3 Ipart=0 for 4 from 3 Int=153440 for 4 from 3 Util=0.001865, Intf=0.767200 for thread 4 S[4]=0.769065 **** Thread 4 can be scheduled safely **** Thread 5 => D[5]=500000


i=5, j=0, D[5]=500000, D[0]=50000, T[0]=33333, C[0]=1058 Nfull=14 for 5 from 0 Cfull=1058 for 5 from 0 Ifull=14812 for 5 from 0 Npart=2 for 5 from 0 Cpart=5 for 5 from 0 Ipart=10 for 5 from 0 Int=14822 for 5 from 0 i=5, j=1, D[5]=500000, D[1]=50001, T[1]=100000, C[1]=39176 Nfull=5 for 5 from 1 Cfull=39176 for 5 from 1 Ifull=195880 for 5 from 1 Npart=0 for 5 from 1 Cpart=0 for 5 from 1 Ipart=0 for 5 from 1 Int=210702 for 5 from 1 i=5, j=2, D[5]=500000, D[2]=66000, T[2]=66666, C[2]=22067 Nfull=7 for 5 from 2 Cfull=22067 for 5 from 2 Ifull=154469 for 5 from 2 Npart=1 for 5 from 2 Cpart=22067 for 5 from 2 Ipart=22067 for 5 from 2 Int=387238 for 5 from 2 i=5, j=3, D[5]=500000, D[3]=66600, T[3]=66667, C[3]=1197 Nfull=7 for 5 from 3 Cfull=1197 for 5 from 3 Ifull=8379 for 5 from 3 Npart=1 for 5 from 3 Cpart=1197 for 5 from 3 Ipart=1197 for 5 from 3 Int=396814 for 5 from 3 i=5, j=4, D[5]=500000, D[4]=200000, T[4]=200000, C[4]=373 Nfull=2 for 5 from 4 Cfull=373 for 5 from 4 Ifull=746 for 5 from 4 Npart=1 for 5 from 4 Cpart=373 for 5 from 4 Ipart=373 for 5 from 4 Int=397933 for 5 from 4 Util=0.114170, Intf=0.795866 for thread 5 S[5]=0.910036 **** Thread 5 can be scheduled safely **** Thread 6 => D[6]=1000000 i=6, j=0, D[6]=1000000, D[0]=50000, T[0]=33333, C[0]=1058 Nfull=29 for 6 from 0 Cfull=1058 for 6 from 0 Ifull=30682 for 6 from 0 Npart=2 for 6 from 0 Cpart=10 for 6 from 0 Ipart=20 for 6 from 0 Int=30702 for 6 from 0 i=6, j=1, D[6]=1000000, D[1]=50001, T[1]=100000, C[1]=39176 Nfull=10 for 6 from 1 Cfull=39176 for 6 from 1 Ifull=391760 for 6 from 1


Npart=0 for 6 from 1 Cpart=0 for 6 from 1 Ipart=0 for 6 from 1 Int=422462 for 6 from 1 i=6, j=2, D[6]=1000000, D[2]=66000, T[2]=66666, C[2]=22067 Nfull=15 for 6 from 2 Cfull=22067 for 6 from 2 Ifull=331005 for 6 from 2 Npart=1 for 6 from 2 Cpart=10 for 6 from 2 Ipart=10 for 6 from 2 Int=753477 for 6 from 2 i=6, j=3, D[6]=1000000, D[3]=66600, T[3]=66667, C[3]=1197 Nfull=15 for 6 from 3 Cfull=1197 for 6 from 3 Ifull=17955 for 6 from 3 Npart=0 for 6 from 3 Cpart=1197 for 6 from 3 Ipart=0 for 6 from 3 Int=771432 for 6 from 3 i=6, j=4, D[6]=1000000, D[4]=200000, T[4]=200000, C[4]=373 Nfull=5 for 6 from 4 Cfull=373 for 6 from 4 Ifull=1865 for 6 from 4 Npart=0 for 6 from 4 Cpart=0 for 6 from 4 Ipart=0 for 6 from 4 Int=773297 for 6 from 4 i=6, j=5, D[6]=1000000, D[5]=500000, T[5]=500000, C[5]=57085 Nfull=2 for 6 from 5 Cfull=57085 for 6 from 5 Ifull=114170 for 6 from 5 Npart=0 for 6 from 5 Cpart=0 for 6 from 5 Ipart=0 for 6 from 5 Int=887467 for 6 from 5 Util=0.001698, Intf=0.887467 for thread 6 S[6]=0.889165 **** Thread 6 can be scheduled safely


Appendix E Video Pipeline Test Results (Without Isochronous Output) -> ld < rtepaLib.o value = 808160 = 0xc54e0 -> setout Original setup: sin=3, sout=3, serr=3 All being remapped to your virtual terminal... You should see this message now!!! value = 35 = 0x23 = '#' = precis + 0x3 -> start_vpipe(-0_ __ _0) microseconds_per_tick = 9.998491e+02, microseconds_per_jiffy = 4.190483e-01 Warning: failure to demote system task Intel NB controller PCI concurrency enable = 0x8 Modified Intel NB controller PCI concurrency enable = 0x8 Intel NB controller PCI latency timer = 0x40 Modified Intel NB controller PCI latency timer = 0x40 Intel NB controller PCI Cmd Reg = 0x6 Modified Intel NB controller PCI Cmd Reg = 0x6 Intel NB controller PCI ARB CTL = 0x80 PCI 2.1 Compliant Intel NB controller PCI ARB CTL = 0x80 Intel SB controller latency control = 0x3 PCI 2.1 Compliant Intel SB controller latency control = 0x3 Intel SB controller IRQ Routing Reg = 0xb808080 Modified Intel SB controller IRQ Routing Reg = 0x6808080 Intel SB controller APIC Addr Reg = 0x0 BAR 0 testval=0xe2001008 before any write BAR 0 MMIO testval=0xfffff008 BAR 1 testval=0x0 before any write BAR 1 not implemented BAR 2 testval=0x0 before any write BAR 2 not implemented BAR 3 testval=0x0 before any write BAR 3 not implemented BAR 4 testval=0x0 before any write BAR 4 not implemented BAR 5 testval=0x0 before any write BAR 5 not implemented Found Bt878 configured for IRQ 11 Bt878 Allowable PCI bus latency = 0x40 Bt878 PCI bus min grant = 0x10 Bt878 PCI bus max latency = 0x28 Modified Bt878 Allowable PCI bus latency = 0xff mmio DSTATUS testval = 0x86 **** VIDEO PRESENT **** DECODING ODD FIELD **** PLL OUT OF LOCK **** LUMA ADC OVERFLOW mmio INTSTATUS testval = 0x200022e I2C RACK DMA DISABLED ODD FIELD VIDEO PRESENT CHANGE DETECTED


LUMA/CHROMA OVERFLOW DETECTED mmio CAPTURE_CNT = 0x0 mmio DMA PC = 0xfffffffc

******** RTEPA tid = 0 Number of RTEPA tasks = 1 LowConf = 1.000000, Zplow = 10.000000, HighConf = 1.000000, Zphigh = 10.000000 interference=2, Cmu = 100, Csigma = 100, Clow = 200, Chigh = 200, Dsoft = 20000, Dterm = 33333 RTEPA_Cterm[0]=200 RTEPA_Dterm[0]=33333

********************Admit test [Ntasks = 1] **** Thread 0 => D[0]=20000 Util=0.010000, Intf=0.000000 for thread 0 S[0]=0.010000 **** Thread 0 can be scheduled safely

********************Admit test [Ntasks = 1] **** Thread 0 => D[0]=33333 Util=0.006000, Intf=0.000000 for thread 0 S[0]=0.006000 **** Thread 0 can be scheduled safely

Btvid task 0 can be scheduled by more sufficient

******** RTEPA tid = 1 Number of RTEPA tasks = 2 LowConf = 0.500000, Zplow = 0.674490, HighConf = 0.900000, Zphigh = 1.644855 interference=2, Cmu = 10000, Csigma = 1000, Clow = 10067, Chigh = 10164, Dsoft = 100000, Dterm = 150000 RTEPA_Cterm[1]=10067 RTEPA_Dterm[1]=150000

********************Admit test [Ntasks = 2] **** Thread 0 => D[0]=20000 Util=0.010000, Intf=0.000000 for thread 0 S[0]=0.010000 **** Thread 0 can be scheduled safely **** Thread 1 => D[1]=100000 i=1, j=0, D[1]=100000, D[0]=20000, T[0]=33333, C[0]=200 Nfull=3 for 1 from 0 Cfull=200 for 1 from 0 Ifull=600 for 1 from 0 Npart=1 for 1 from 0


Cpart=1 for 1 from 0 Ipart=1 for 1 from 0 Int=601 for 1 from 0 Util=0.100670, Intf=0.006010 for thread 1 S[1]=0.106680 **** Thread 1 can be scheduled safely

********************Admit test [Ntasks = 2] **** Thread 0 => D[0]=33333 Util=0.006000, Intf=0.000000 for thread 0 S[0]=0.006000 **** Thread 0 can be scheduled safely **** Thread 1 => D[1]=150000 i=1, j=0, D[1]=150000, D[0]=33333, T[0]=33333, C[0]=200 Nfull=4 for 1 from 0 Cfull=200 for 1 from 0 Ifull=800 for 1 from 0 Npart=1 for 1 from 0 Cpart=200 for 1 from 0 Ipart=200 for 1 from 0 Int=1000 for 1 from 0 Util=0.067760, Intf=0.006667 for thread 1 S[1]=0.074427 **** Thread 1 can be scheduled safely

Frame compress task 1 can be scheduled by more sufficient

******** RTEPA tid = 2 Number of RTEPA tasks = 3 LowConf = 0.500000, Zplow = 0.674490, HighConf = 0.800000, Zphigh = 1.281552 interference=2, Cmu = 10000, Csigma = 1000, Clow = 10067, Chigh = 10128, Dsoft = 150000, Dterm = 180000 RTEPA_Cterm[2]=10067 RTEPA_Dterm[2]=180000

********************Admit test [Ntasks = 3] **** Thread 0 => D[0]=20000 Util=0.010000, Intf=0.000000 for thread 0 S[0]=0.010000 **** Thread 0 can be scheduled safely **** Thread 1 => D[1]=100000 i=1, j=0, D[1]=100000, D[0]=20000, T[0]=33333, C[0]=200 Nfull=3 for 1 from 0 Cfull=200 for 1 from 0 Ifull=600 for 1 from 0 Npart=1 for 1 from 0 Cpart=1 for 1 from 0 Ipart=1 for 1 from 0


Int=601 for 1 from 0 Util=0.100670, Intf=0.006010 for thread 1 S[1]=0.106680 **** Thread 1 can be scheduled safely **** Thread 2 => D[2]=150000 i=2, j=0, D[2]=150000, D[0]=20000, T[0]=33333, C[0]=200 Nfull=4 for 2 from 0 Cfull=200 for 2 from 0 Ifull=800 for 2 from 0 Npart=1 for 2 from 0 Cpart=200 for 2 from 0 Ipart=200 for 2 from 0 Int=1000 for 2 from 0 i=2, j=1, D[2]=150000, D[1]=100000, T[1]=200000, C[1]=10067 Nfull=1 for 2 from 1 Cfull=10067 for 2 from 1 Ifull=10067 for 2 from 1 Npart=0 for 2 from 1 Cpart=10067 for 2 from 1 Ipart=0 for 2 from 1 Int=11067 for 2 from 1 Util=0.067113, Intf=0.073780 for thread 2 S[2]=0.140893 **** Thread 2 can be scheduled safely

********************Admit test [Ntasks = 3] **** Thread 0 => D[0]=33333 Util=0.006000, Intf=0.000000 for thread 0 S[0]=0.006000 **** Thread 0 can be scheduled safely **** Thread 1 => D[1]=150000 i=1, j=0, D[1]=150000, D[0]=33333, T[0]=33333, C[0]=200 Nfull=4 for 1 from 0 Cfull=200 for 1 from 0 Ifull=800 for 1 from 0 Npart=1 for 1 from 0 Cpart=200 for 1 from 0 Ipart=200 for 1 from 0 Int=1000 for 1 from 0 Util=0.067760, Intf=0.006667 for thread 1 S[1]=0.074427 **** Thread 1 can be scheduled safely **** Thread 2 => D[2]=180000 i=2, j=0, D[2]=180000, D[0]=33333, T[0]=33333, C[0]=200 Nfull=5 for 2 from 0 Cfull=200 for 2 from 0 Ifull=1000 for 2 from 0 Npart=1 for 2 from 0 Cpart=200 for 2 from 0 Ipart=200 for 2 from 0 Int=1200 for 2 from 0 i=2, j=1, D[2]=180000, D[1]=150000, T[1]=200000, C[1]=10164


Nfull=1 for 2 from 1 Cfull=10164 for 2 from 1 Ifull=10164 for 2 from 1 Npart=0 for 2 from 1 Cpart=10164 for 2 from 1 Ipart=0 for 2 from 1 Int=11364 for 2 from 1 Util=0.056267, Intf=0.063133 for thread 2 S[2]=0.119400 **** Thread 2 can be scheduled safely

Frame TLM task 2 can be scheduled by more sufficient Entry pointer passed in = 0x3827ac, and assigned = 0x3827ac RTEPA stack base = 0x1bf91f8 RTEPA_CB[0].dterm_itime.it_value.tv_sec = Created RTEPA 0task RTEPA_CB[0].dterm_itime.it_value.tv_nsec = 033333000 with tcbptr=0x1c01394 mmio INTSTATUS testval = 0x300022e I2C RACK DMA DISABLED EVEN FIELD VIDEO PRESENT CHANGE DETECTED LUMA/CHROMA OVERFLOW DETECTED mmio CAPTURE_CNT = 0x0 mmio DMA PC = 0xfffffffc Timing Gen Ctl Reg = 0x0 Configured NTSC Setting INPUT_REG = 0x79 Set mux Loaded MC mmio INTSTATUS testval = 0x8b000204 I2C RACK DMA_MC_SYNC DMA ENABLED EVEN FIELD mmio CAPTURE_CNT = 0x0 mmio DMA PC = 0x37b830 Brightness was 128 Setting INPUT_REG = 0x19 Starting video Video startedOOPIC Servo Driver Serial Interface OOPIC driver /tyCo/1 intialized and opened with status=0 Entry pointer passed in = 0x381da4, and assigned = 0x381da4 RTEPA stack base = 0x1c0f0e0 RTEPA_CB[1].dterm_itime.it_value.tv_sec = 0 RTEPA_CB[1].dterm_itime.it_value.tv_nsec = 150000000 Created RTEPA task 1 with tcbptr=0x1c1727c Entry pointer passed in = 0x381c1c, and assigned = 0x381c1c RTEPA stack base = 0x1c24fc8 RTEPA_CB[2].dterm_itime.it_value.tv_sec = 0 RTEPA_CB[2].dterm_itime.it_value.tv_nsec = 180000000 Created RTEPA task 2 with tcbptr=0x1c2d164 ******** VIDEO system fully activated ******** value = 47 = 0x2f = '/' = s_B + 0x7 -> stop_vpipe Actual pipeline sequencing


rtid=0 completed 851 times, activated next stage @ 190 => next_stage_rtid=1 released 67 times [specified freq = 10, offset = 0, expected releases = 66] Actual pipeline sequencing rtid=1 completed 67 times, activated next stage @ 4 => next_stage_rtid=2 released 64 times [specified freq = 1, offset = 0, expected releases = 63]

******** Performance Summary for rtid=0, prio=1, tcbptr=0x1c01204 ******** Dispatch parameters Dsoft=20000, Dterm=33333, Texp=33333, Cexp=100 ******** Initial model ******** High Conf = 1.000000 Low Conf = 1.000000 Cexp = 100 Chigh = 200 Clow = 200 ******** On-line model Dhard from actual dist Dsoft from actual dist Confidence in supplied Confidence in supplied Confidence in supplied Confidence in supplied N samples =851 Start sample index =1 Last sample index =851 ReleaseCnt=851 CompleteCnt=851 Npreempts=861 Ninterferences=10 Ndispatches=861 Texpact=33197 Cexpactexec=57 Clowactexec=0 Chighactexec=1081 Cexpactcomp=57 Clowactcomp=0 Chighactcomp=1081

******** free confidence interval =1232 free confidence interval =1232 Dhard based on exec time=0.999000 Dsoft based on exec time=0.999000 Dhard based on complete time=0.999000 Dsoft based on complete time=0.999000

******** Deadline performance ******** SoftMissCnt=0 HardMissCnt=0 HardMissTerm=0 ******** Execution performance ******** SoftReliability=1.000000 HardReliability=1.000000 ******** Execution errors ******** ReleaseError=0 CompleteError=0 ExecError=0


********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f76384 Total dispatches/preemptions=16418 gv_rtepa_dispatch_cnt=3550 gv_rtepa_preempt_cnt=3550

******** Performance Summary for rtid=1, prio=2, tcbptr=0x1c170ec ******** Dispatch parameters Dsoft=100000, Dterm=150000, Texp=200000, Cexp=10000 ******** Initial model ******** High Conf = 0.900000 Low Conf = 0.500000 Cexp = 10000 Chigh = 10164 Clow = 10067 ******** On-line model Dhard from actual dist Dsoft from actual dist Confidence in supplied Confidence in supplied Confidence in supplied Confidence in supplied N samples =67 Start sample index =1 Last sample index =67 ReleaseCnt=67 CompleteCnt=67 Npreempts=135 Ninterferences=68 Ndispatches=135 Texpact=328703 Cexpactexec=57948 Clowactexec=0 Chighactexec=59877 Cexpactcomp=57948 Clowactcomp=0 Chighactcomp=59877

******** free confidence interval =0 free confidence interval =0 Dhard based on exec time=0.999000 Dsoft based on exec time=0.999000 Dhard based on complete time=0.999000 Dsoft based on complete time=0.999000

******** Deadline performance ******** SoftMissCnt=0 HardMissCnt=0 HardMissTerm=0 ******** Execution performance ******** SoftReliability=1.000000 HardReliability=1.000000 ******** Execution errors ******** ReleaseError=0 CompleteError=0


ExecError=0

********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f76384 Total dispatches/preemptions=16592 gv_rtepa_dispatch_cnt=3550 gv_rtepa_preempt_cnt=3550

******** Performance Summary for rtid=2, prio=3, tcbptr=0x1c2cfd4 ******** Dispatch parameters Dsoft=150000, Dterm=180000, Texp=200001, Cexp=10000 ******** Initial model ******** High Conf = 0.800000 Low Conf = 0.500000 Cexp = 10000 Chigh = 10128 Clow = 10067 ******** On-line model Dhard from actual dist Dsoft from actual dist Confidence in supplied Confidence in supplied Confidence in supplied Confidence in supplied N samples =64 Start sample index =1 Last sample index =64 ReleaseCnt=64 CompleteCnt=64 Npreempts=2554 Ninterferences=2490 Ndispatches=2554 Texpact=328468 Cexpactexec=51795 Clowactexec=0 Chighactexec=57625 Cexpactcomp=51795 Clowactcomp=0 Chighactcomp=57625

******** free confidence interval =0 free confidence interval =0 Dhard based on exec time=0.999000 Dsoft based on exec time=0.999000 Dhard based on complete time=0.999000 Dsoft based on complete time=0.999000

******** Deadline performance ******** SoftMissCnt=0 HardMissCnt=0 HardMissTerm=0 ******** Execution performance ******** SoftReliability=1.000000 HardReliability=1.000000 ******** Execution errors ******** ReleaseError=0


CompleteError=0 ExecError=0

********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f76384 Total dispatches/preemptions=16766 gv_rtepa_dispatch_cnt=3550 gv_rtepa_preempt_cnt=3550

Canceling timer for task 0 Suspended task 0

Canceling timer for task 1 Suspended task 1

Canceling timer for task 2 Suspended task 2 Deleted task 0 Deleted task 1 Deleted task 2 value = -771751424 = 0xd2000200 -> exit thinker exit thinker exit script done on Wed Jun 28 21:55:25 2000


Appendix F Video Pipeline Test Results (With Isochronous Output) -> ld < rtepaLib.o value = 807560 = 0xc5288 -> setout Original setup: sin=3, sout=3, serr=3 All being remapped to your virtual terminal... You should see this message now!!! value = 35 = 0x23 = '#' = precis + 0x3 -> start_vpipe(10_ _) microseconds_per_tick = 9.998491e+02, microseconds_per_jiffy = 4.190483e-01 Warning: failure to demote system task Intel NB controller PCI concurrency enable = 0x8 Modified Intel NB controller PCI concurrency enable = 0x8 Intel NB controller PCI latency timer = 0x40 Modified Intel NB controller PCI latency timer = 0x40 Intel NB controller PCI Cmd Reg = 0x6 Modified Intel NB controller PCI Cmd Reg = 0x6 Intel NB controller PCI ARB CTL = 0x80 PCI 2.1 Compliant Intel NB controller PCI ARB CTL = 0x80 Intel SB controller latency control = 0x3 PCI 2.1 Compliant Intel SB controller latency control = 0x3 Intel SB controller IRQ Routing Reg = 0xb808080 Modified Intel SB controller IRQ Routing Reg = 0x6808080 Intel SB controller APIC Addr Reg = 0x0 BAR 0 testval=0xe2001008 before any write BAR 0 MMIO testval=0xfffff008 BAR 1 testval=0x0 before any write BAR 1 not implemented BAR 2 testval=0x0 before any write BAR 2 not implemented BAR 3 testval=0x0 before any write BAR 3 not implemented BAR 4 testval=0x0 before any write BAR 4 not implemented BAR 5 testval=0x0 before any write BAR 5 not implemented Found Bt878 configured for IRQ 11 Bt878 Allowable PCI bus latency = 0x40 Bt878 PCI bus min grant = 0x10 Bt878 PCI bus max latency = 0x28 Modified Bt878 Allowable PCI bus latency = 0xff mmio DSTATUS testval = 0xa6 **** VIDEO PRESENT **** DECODING EVEN FIELD **** PLL OUT OF LOCK **** LUMA ADC OVERFLOW mmio INTSTATUS testval = 0x300022e I2C RACK DMA DISABLED EVEN FIELD VIDEO PRESENT CHANGE DETECTED


LUMA/CHROMA OVERFLOW DETECTED mmio CAPTURE_CNT = 0x0 mmio DMA PC = 0xfffffffc

******** RTEPA tid = 0 Number of RTEPA tasks = 1 LowConf = 1.000000, Zplow = 10.000000, HighConf = 1.000000, Zphigh = 10.000000 interference=2, Cmu = 100, Csigma = 100, Clow = 200, Chigh = 200, Dsoft = 20000, Dterm = 33333 RTEPA_Cterm[0]=200 RTEPA_Dterm[0]=33333

********************Admit test [Ntasks = 1] **** Thread 0 => D[0]=20000 Util=0.010000, Intf=0.000000 for thread 0 S[0]=0.010000 **** Thread 0 can be scheduled safely

********************Admit test [Ntasks = 1] **** Thread 0 => D[0]=33333 Util=0.006000, Intf=0.000000 for thread 0 S[0]=0.006000 **** Thread 0 can be scheduled safely

Btvid task 0 can be scheduled by more sufficient

******** RTEPA tid = 1 Number of RTEPA tasks = 2 LowConf = 0.500000, Zplow = 0.674490, HighConf = 0.900000, Zphigh = 1.644855 interference=2, Cmu = 10000, Csigma = 1000, Clow = 10067, Chigh = 10164, Dsoft = 100000, Dterm = 150000 RTEPA_Cterm[1]=10067 RTEPA_Dterm[1]=150000

********************Admit test [Ntasks = 2] **** Thread 0 => D[0]=20000 Util=0.010000, Intf=0.000000 for thread 0 S[0]=0.010000 **** Thread 0 can be scheduled safely **** Thread 1 => D[1]=100000 i=1, j=0, D[1]=100000, D[0]=20000, T[0]=33333, C[0]=200 Nfull=3 for 1 from 0 Cfull=200 for 1 from 0 Ifull=600 for 1 from 0 Npart=1 for 1 from 0


Cpart=1 for 1 from 0 Ipart=1 for 1 from 0 Int=601 for 1 from 0 Util=0.100670, Intf=0.006010 for thread 1 S[1]=0.106680 **** Thread 1 can be scheduled safely

********************Admit test [Ntasks = 2] **** Thread 0 => D[0]=33333 Util=0.006000, Intf=0.000000 for thread 0 S[0]=0.006000 **** Thread 0 can be scheduled safely **** Thread 1 => D[1]=150000 i=1, j=0, D[1]=150000, D[0]=33333, T[0]=33333, C[0]=200 Nfull=4 for 1 from 0 Cfull=200 for 1 from 0 Ifull=800 for 1 from 0 Npart=1 for 1 from 0 Cpart=200 for 1 from 0 Ipart=200 for 1 from 0 Int=1000 for 1 from 0 Util=0.067760, Intf=0.006667 for thread 1 S[1]=0.074427 **** Thread 1 can be scheduled safely

Frame compress task 1 can be scheduled by more sufficient

******** RTEPA tid = 2 Number of RTEPA tasks = 3 LowConf = 0.500000, Zplow = 0.674490, HighConf = 0.800000, Zphigh = 1.281552 interference=2, Cmu = 10000, Csigma = 1000, Clow = 10067, Chigh = 10128, Dsoft = 150000, Dterm = 180000 RTEPA_Cterm[2]=10067 RTEPA_Dterm[2]=180000

********************Admit test [Ntasks = 3] **** Thread 0 => D[0]=20000 Util=0.010000, Intf=0.000000 for thread 0 S[0]=0.010000 **** Thread 0 can be scheduled safely **** Thread 1 => D[1]=100000 i=1, j=0, D[1]=100000, D[0]=20000, T[0]=33333, C[0]=200 Nfull=3 for 1 from 0 Cfull=200 for 1 from 0 Ifull=600 for 1 from 0 Npart=1 for 1 from 0 Cpart=1 for 1 from 0 Ipart=1 for 1 from 0


Int=601 for 1 from 0 Util=0.100670, Intf=0.006010 for thread 1 S[1]=0.106680 **** Thread 1 can be scheduled safely **** Thread 2 => D[2]=150000 i=2, j=0, D[2]=150000, D[0]=20000, T[0]=33333, C[0]=200 Nfull=4 for 2 from 0 Cfull=200 for 2 from 0 Ifull=800 for 2 from 0 Npart=1 for 2 from 0 Cpart=200 for 2 from 0 Ipart=200 for 2 from 0 Int=1000 for 2 from 0 i=2, j=1, D[2]=150000, D[1]=100000, T[1]=200000, C[1]=10067 Nfull=1 for 2 from 1 Cfull=10067 for 2 from 1 Ifull=10067 for 2 from 1 Npart=0 for 2 from 1 Cpart=10067 for 2 from 1 Ipart=0 for 2 from 1 Int=11067 for 2 from 1 Util=0.067113, Intf=0.073780 for thread 2 S[2]=0.140893 **** Thread 2 can be scheduled safely

********************Admit test [Ntasks = 3] **** Thread 0 => D[0]=33333 Util=0.006000, Intf=0.000000 for thread 0 S[0]=0.006000 **** Thread 0 can be scheduled safely **** Thread 1 => D[1]=150000 i=1, j=0, D[1]=150000, D[0]=33333, T[0]=33333, C[0]=200 Nfull=4 for 1 from 0 Cfull=200 for 1 from 0 Ifull=800 for 1 from 0 Npart=1 for 1 from 0 Cpart=200 for 1 from 0 Ipart=200 for 1 from 0 Int=1000 for 1 from 0 Util=0.067760, Intf=0.006667 for thread 1 S[1]=0.074427 **** Thread 1 can be scheduled safely **** Thread 2 => D[2]=180000 i=2, j=0, D[2]=180000, D[0]=33333, T[0]=33333, C[0]=200 Nfull=5 for 2 from 0 Cfull=200 for 2 from 0 Ifull=1000 for 2 from 0 Npart=1 for 2 from 0 Cpart=200 for 2 from 0 Ipart=200 for 2 from 0 Int=1200 for 2 from 0 i=2, j=1, D[2]=180000, D[1]=150000, T[1]=200000, C[1]=10164


Nfull=1 for 2 from 1 Cfull=10164 for 2 from 1 Ifull=10164 for 2 from 1 Npart=0 for 2 from 1 Cpart=10164 for 2 from 1 Ipart=0 for 2 from 1 Int=11364 for 2 from 1 Util=0.056267, Intf=0.063133 for thread 2 S[2]=0.119400 **** Thread 2 can be scheduled safely

Frame TLM task 2 can be scheduled by more sufficient Entry pointer passed in = 0x3827ac, and assigned = 0x3827ac RTEPA stack base = 0x1bf91f8 RTEPA_CB[0].dterm_itime.it_value.tv_sec = Created RTEP0A task RTEPA_CB[0].dterm_itime.it_value.tv_nsec = 033333000 with tcbptr=0x 1c01394 mmio INTSTATUS testval = 0x200022e I2C RACK DMA DISABLED ODD FIELD VIDEO PRESENT CHANGE DETECTED LUMA/CHROMA OVERFLOW DETECTED mmio CAPTURE_CNT = 0x0 mmio DMA PC = 0xfffffffc Timing Gen Ctl Reg = 0x0 Configured NTSC Setting INPUT_REG = 0x79 Set mux Loaded MC mmio INTSTATUS testval = 0x8a000204 I2C RACK DMA_MC_SYNC DMA ENABLED ODD FIELD mmio CAPTURE_CNT = 0x0 mmio DMA PC = 0x37b830 Brightness was 128 Setting INPUT_REG = 0x19 Starting video Video startedOOPIC Servo Driver Serial Interface OOPIC driver /tyCo/1 intialized and opened with status=0 Entry pointer passed in = 0x381da4, and assigned = 0x381da4 RTEPA stack base = 0x1c0f0e0 RTEPA_CB[1].dterm_itime.it_value.tv_sec = 0 RTEPA_CB[1].dterm_itime.it_value.tv_nsec = 150000000 Created RTEPA task 1 with tcbptr=0x1c1727c Entry pointer passed in = 0x381c1c, and assigned = 0x381c1c RTEPA stack base = 0x1c24fc8 RTEPA_CB[2].dterm_itime.it_value.tv_sec = 0 RTEPA_CB[2].dterm_itime.it_value.tv_nsec = 180000000 Created RTEPA task 2 with tcbptr=0x1c2d164 ******** VIDEO system fully activated ******** value = 47 = 0x2f = '/' = s_B + 0x7 -> stop_vpipe


Actual pipeline sequencing rtid=0 completed 766 times, activated next stage @ 190 => next_stage_rtid=1 released 58 times [specified freq = 10, offset = 0, expected releases = 57] Actual pipeline sequencing rtid=1 completed 58 times, activated next stage @ 4 => next_stage_rtid=2 released 55 times [specified freq = 1, offset = 0, expected releases = 54] ******** Performance Summary for rtid=0, prio=1, tcbptr=0x1c01204 ******** Dispatch parameters Dsoft=20000, Dterm=33333, Texp=33333, Cexp=100 ******** Initial model ******** High Conf = 1.000000 Low Conf = 1.000000 Cexp = 100 Chigh = 200 Clow = 200 ******** On-line model Dhard from actual dist Dsoft from actual dist Confidence in supplied Confidence in supplied Confidence in supplied Confidence in supplied N samples =766 Start sample index =1 Last sample index =766 ReleaseCnt=766 CompleteCnt=766 Npreempts=779 Ninterferences=13 Ndispatches=779 Texpact=33308 Cexpactexec=53 Clowactexec=0 Chighactexec=1065 Cexpactcomp=53 Clowactcomp=0 Chighactcomp=1065

******** free confidence interval =222 free confidence interval =222 Dhard based on exec time=0.999000 Dsoft based on exec time=0.999000 Dhard based on complete time=0.999000 Dsoft based on complete time=0.999000

******** Deadline performance ******** SoftMissCnt=0 HardMissCnt=0 HardMissTerm=0 ******** Execution performance ******** SoftReliability=1.000000 HardReliability=1.000000 ******** Execution errors ******** ReleaseError=0 CompleteError=0 ExecError=0


********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f76384 Total dispatches/preemptions=14764 gv_rtepa_dispatch_cnt=3192 gv_rtepa_preempt_cnt=3192

******** Performance Summary for rtid=1, prio=2, tcbptr=0x1c170ec ******** Dispatch parameters Dsoft=100000, Dterm=150000, Texp=200000, Cexp=10000 ******** Initial model ******** High Conf = 0.900000 Low Conf = 0.500000 Cexp = 10000 Chigh = 10164 Clow = 10067 ******** On-line model Dhard from actual dist Dsoft from actual dist Confidence in supplied Confidence in supplied Confidence in supplied Confidence in supplied N samples =58 Start sample index =1 Last sample index =58 ReleaseCnt=58 CompleteCnt=58 Npreempts=117 Ninterferences=59 Ndispatches=117 Texpact=327931 Cexpactexec=57782 Clowactexec=0 Chighactexec=59763 Cexpactcomp=57782 Clowactcomp=0 Chighactcomp=59763

******** free confidence interval =0 free confidence interval =0 Dhard based on exec time=0.999000 Dsoft based on exec time=0.999000 Dhard based on complete time=0.999000 Dsoft based on complete time=0.999000

******** Deadline performance ******** SoftMissCnt=0 HardMissCnt=0 HardMissTerm=0 ******** Execution performance ******** SoftReliability=1.000000 HardReliability=1.000000 ******** Execution errors ******** ReleaseError=0 CompleteError=0


ExecError=0

********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f76384 Total dispatches/preemptions=14924 gv_rtepa_dispatch_cnt=3192 gv_rtepa_preempt_cnt=3192

******** Performance Summary for rtid=2, prio=3, tcbptr=0x1c2cfd4 ******** Dispatch parameters Dsoft=150000, Dterm=180000, Texp=200001, Cexp=10000 ******** Initial model ******** High Conf = 0.800000 Low Conf = 0.500000 Cexp = 10000 Chigh = 10128 Clow = 10067 ******** On-line model Dhard from actual dist Dsoft from actual dist Confidence in supplied Confidence in supplied Confidence in supplied Confidence in supplied N samples =55 Start sample index =1 Last sample index =55 ReleaseCnt=55 CompleteCnt=55 Npreempts=2296 Ninterferences=2241 Ndispatches=2296 Texpact=327613 Cexpactexec=52025 Clowactexec=0 Chighactexec=56895 Cexpactcomp=52025 Clowactcomp=0 Chighactcomp=56895

******** free confidence interval =0 free confidence interval =0 Dhard based on exec time=0.999000 Dsoft based on exec time=0.999000 Dhard based on complete time=0.999000 Dsoft based on complete time=0.999000

******** Deadline performance ******** SoftMissCnt=0 HardMissCnt=0 HardMissTerm=0 ******** Execution performance ******** SoftReliability=1.000000 HardReliability=1.000000 ******** Execution errors ******** ReleaseError=0


CompleteError=0 ExecError=0

********General info ******** gv_last_preempted_tid=0x1f7e7cc gv_last_dispatched_tid=0x1f76384 Total dispatches/preemptions=15100 gv_rtepa_dispatch_cnt=3192 gv_rtepa_preempt_cnt=3192

Canceling timer for task 0 Suspended task 0

Canceling timer for task 1 Suspended task 1

Canceling timer for task 2 Suspended task 2 Deleted task 0 Deleted task 1 Deleted task 2 value = -771751420 = 0xd2000204 -> exit thinker exit thinker exit script done on Thu Jun 29 00:02:22 2000

