Predictable Assembly of Substation Automation Systems: An Experiment Report, 2nd Edition

CMU/SEI-2002-TR-031
ESC-TR-2002-031

Scott Hissam, John Hudak, James Ivers, Mark Klein, Magnus Larsson, Gabriel Moreno, Linda Northrop, Daniel Plakosh, Judith Stafford, Kurt Wallnau, William Wood

Initial Publication: September 2002
Revised: July 2003

Predictable Assembly from Certifiable Components Initiative

Technical Report
Software Engineering Institute
Carnegie Mellon University
Pittsburgh, PA 15213-3890

Unlimited distribution subject to the copyright.

This report was prepared for the SEI Joint Program Office, HQ ESC/DIB, 5 Eglin Street, Hanscom AFB, MA 01731-2116. The ideas and findings in this report should not be construed as an official DoD position. It is published in the interest of scientific and technical information exchange.

FOR THE COMMANDER

Christos Scondras
Chief of Programs, XPK

This work is sponsored by the U.S. Department of Defense. The Software Engineering Institute is a federally funded research and development center sponsored by the U.S. Department of Defense. Copyright 2003 by Carnegie Mellon University. Requests for permission to reproduce this document or to prepare derivative works of this document should be addressed to the SEI Licensing Agent.

NO WARRANTY

THIS CARNEGIE MELLON UNIVERSITY AND SOFTWARE ENGINEERING INSTITUTE MATERIAL IS FURNISHED ON AN "AS-IS" BASIS. CARNEGIE MELLON UNIVERSITY MAKES NO WARRANTIES OF ANY KIND, EITHER EXPRESSED OR IMPLIED, AS TO ANY MATTER INCLUDING, BUT NOT LIMITED TO, WARRANTY OF FITNESS FOR PURPOSE OR MERCHANTABILITY, EXCLUSIVITY, OR RESULTS OBTAINED FROM USE OF THE MATERIAL. CARNEGIE MELLON UNIVERSITY DOES NOT MAKE ANY WARRANTY OF ANY KIND WITH RESPECT TO FREEDOM FROM PATENT, TRADEMARK, OR COPYRIGHT INFRINGEMENT.

This work was created in the performance of Federal Government Contract Number F19628-00-C-0003 with Carnegie Mellon University for the operation of the Software Engineering Institute, a federally funded research and development center. The Government of the United States has a royalty-free government-purpose license to use, duplicate, or disclose the work, in whole or in part and in any manner, and to have or permit others to do so, for government purposes pursuant to the copyright license under the clause at 252.227-7013.

Use of any trademarks in this report is not intended in any way to infringe on the rights of the trademark holder.

For information about purchasing paper copies of SEI reports, please visit the publications portion of our Web site (http://www.sei.cmu.edu/publications/pubweb.html).

Table of Contents

Acknowledgements
Preface to 2nd Edition
Executive Summary
Abstract

1  Introduction
   1.1  Background
   1.2  Approach
   1.3  Objective of This Report
   1.4  Audience for This Report
   1.5  Structure of This Report

2  Model Problems for Substation Automation
   2.1  Three Model Problems and Three PECTs
   2.2  The IEC 61850 Domain Model
   2.3  Lab Environment and the SEI Switch

3  Prediction-Enabled Component Technology
   3.1  The Theory of PECT
   3.2  The Structure of PECT
        3.2.1  Component Technology
        3.2.2  Prediction Enablers
        3.2.3  PECT = Component Technology + Analysis Technology
   3.3  The Pin Style
        3.3.1  Components and Pins
        3.3.2  Assemblies and Environments
        3.3.3  Behavior: Reaction and Interaction
        3.3.4  Hierarchical Assembly
   3.4  Pin Subset in SAS Model Solutions
   3.5  SAS Software Controller in Pin

4  PECT Process and Workflows
   4.1  The PECT Development Process
        4.1.1  Deliverables
        4.1.2  Workers, Activities, and Artifacts
   4.2  Workflows
        4.2.1  Definition Phase
        4.2.2  Co-Refinement Phase
        4.2.3  Validation Phase
        4.2.4  Packaging Phase

5  Co-Refinement of the Controller PECT
   5.1  What Is Co-Refinement?
   5.2  How Co-Refinement Proceeds
   5.3  Key Ideas for the Controller PECT
   5.4  Co-Refinement of the Controller PECT
        5.4.1  Initial Conditions
        5.4.2  First Iteration: Worst-Case Latency (λW)
        5.4.3  Second Iteration: Average Latency (λA)
        5.4.4  Third Iteration: Worst-Case Latency + Blocking (λWB)
        5.4.5  Fourth Iteration: Average Latency + Blocking (λAB)
        5.4.6  Fifth Iteration: Asynchronous Interactions (λABA)

6  Empirical Validation
   6.1  What Is Empirical Validation?
   6.2  Introduction to Statistical Labels
        6.2.1  Component Labels
        6.2.2  Property Theory Labels
   6.3  Workflow for Empirical Validation
        6.3.1  Define Validation Goal
        6.3.2  Define Measures
        6.3.3  Define Sampling Procedure
        6.3.4  Develop Measurement Infrastructure
        6.3.5  Collect Validation Data
        6.3.6  Analyze Results

7  Safety and Liveness Analysis in the Controller PECT
   7.1  Model Checking
   7.2  Process
   7.3  Building the Model
        7.3.1  Determining the Scope of Problem Analysis
        7.3.2  Producing a Pin Model of the Problem
        7.3.3  Translating the Pin Model into a State Machine
        7.3.4  Translating the State Machine into NuSMV
   7.4  Analyzing the Model
        7.4.1  Types of System Properties
        7.4.2  Temporal Logic
        7.4.3  CSWI Claims

8  Discussion
   8.1  Results on Method
        8.1.1  Interfaces Between Specialized Skills
        8.1.2  Infrastructure Complexity
        8.1.3  Design Space: Rules for Inclusion and Exclusion
        8.1.4  Confidence Intervals for Formal Theories
   8.2  Results on Model Solutions

9  Next Steps
   9.1  PECT Infrastructure
        9.1.1  Measurement and Validation Environment
        9.1.2  Core Pin Language and Assembly Environment
        9.1.3  Real-Time Component Runtime Environment
        9.1.4  Model Checking (Analysis) Environments
   9.2  PECT Method
        9.2.1  PECT Empirical Validation Guide
        9.2.2  Certification Locale and Foundations for Trust

Appendix A  λABA Property Theory
   A.1  Essential Concepts of λABA Property Theory
   A.2  λABA Predictor
   A.3  Pin-λABA Interpretation
   A.4  Syntax-Directed Interpretation

Appendix B  λABA Empirical Validation
   B.1  Introduction
   B.2  Empirical Validation
   B.3  Conclusions
   B.4  StInt Program

Appendix C  NuSMV Model of CSWI

Appendix D  Switch Schematic

Acronym List
Bibliography

List of Figures

Figure 1:  Three PECTs
Figure 2:  Key Concepts of the IEC 61850 Standard for a SAS Model Problem
Figure 3:  SEI Switch Abstraction
Figure 4:  Interpretations on Constructive Assemblies and Analysis Views
Figure 5:  The Four Environments of a PECT
Figure 6:  Pin Notation
Figure 7:  A Simple Assembly
Figure 8:  Reactions Specify Component Behavior
Figure 9:  SAS Software Controller in Pin
Figure 10: An Overview of the PECT Development Method
Figure 11: Definition Phase Workflow
Figure 12: Co-Refinement Workflow
Figure 13: Empirical Validation Workflow
Figure 14: Clock Component Initiating a Sequence of Interactions
Figure 15: Expanded Workflow for Empirical Validation
Figure 16: Controller System Diagram
Figure 17: CSWI Pin Specification
Figure 18: CSWI State Machine
Figure 19: NuSMV Counterexample
Figure 20: Interfaces Among PECT Development Skills
Figure 21: Results of Spot Validation of Operator PECT Latency Theory
Figure 22: Timeline Showing Task Phasing and Hyper-Period
Figure 23: Elements of the λABA Analytic Model
Figure 24: Blocking Example
Figure 25: Prediction Pseudocode
Figure 26: Pseudocode of getNextEvent and getNextPeriod
Figure 27: Pseudocode of advanceClock
Figure 28: Task Statechart Diagram
Figure 29: Pseudocode of montecarlo
Figure 30: Simple Constructive Assembly
Figure 31: Timeline for a Sink Pin with Interactions in the Middle of Its Execution
Figure 32: Timeline for a Sink Pin with Interactions at the End of Its Execution
Figure 33: Assembly with Synchronous Pins and Non-Linear Topology
Figure 34: Taxonomy of Sink Pins in Pin
Figure 35: Constructive Assembly with Asynchronous Connections
Figure 36: Synchronous Connections Following Asynchronous Connections
Figure 37: Asynchronous Connections Following Synchronous Connections
Figure 38: Asynchronous Connections and Reentrant Synchronous Connections
Figure 39: Multiple Synchronous Connections Problem
Figure 40: Multiple Asynchronous Connections Rewrite Rule Using Proxy Components
Figure 41: Pin Component Enter and Leave Events
Figure 42: Time Measurements on Component Sink and Source Reactions
Figure 43: Measuring the Execution Time of a Pin Component
Figure 44: A Simple Linear Interaction
Figure 45: A Simple Multilinear Interaction
Figure 46: A Simple One-to-Many Interaction
Figure 47: A Simple Many-to-One Interaction
Figure 48: A Simple Many-to-Many Interaction
Figure 49: Measuring the Latency for a Simple Assembly
Figure 50: Execution Calculation Made by the Synthetic Component
Figure 51: General Assembly Topology Description for Benchmarking Synthetic Component Classes
Figure 52: General Assembly Topology for Benchmarking Synthetic Components
Figure 53: Example Measures for Component Execution Time
Figure 54: Variation Points for One Assembly Design Space
Figure 55: Example of a Randomly Selected Assembly
Figure 56: Topology of an Example of a Randomly Selected Assembly
Figure 57: Measurement Infrastructure Tool Workflow
Figure 58: Histogram of AVGMRE
Figure 59: Histogram of LOG(AVGMRE)
Figure 60: Distribution-Free Calculation Using StInt
Figure 61: SEI Switch: Full Schematic
Figure 62: SEI Switch: Upper Right Quadrant
Figure 63: SEI Switch: Lower Right Quadrant
Figure 64: SEI Switch: Upper Left Quadrant
Figure 65: SEI Switch: Lower Left Quadrant

List of Tables

Table 1:  IEC 61850 LNs as Components of the Controller PECT
Table 2:  PECT Development Activities
Table 3:  Distribution-Free Tolerance Interval for λABA
Table 4:  Areas of Expertise Needed to Build SAS Operator and Controller PECTs
Table 5:  Merge Operation Example
Table 6:  Merging the Sequence in Figure 37
Table 7:  Merging the Sequences in Figure 38
Table 8:  Mapping Pin to λABA
Table 9:  Syntax-Directed Definition for the Analytic Interpretation
Table 10: Variation Points for All Assembly Design Spaces
Table 11: Predicted Latency for Synthetic5's Job
Table 12: Observed Average Latency for an Assembly
Table 13: Measurement Infrastructure Tool Suite
Table 14: Hardware Characteristics
Table 15: Software Characteristics
Table 16: Processes (Running Tasks)
Table 17: Recorded Actual Execution Time Per Synthetic Pin Component
Table 18: Recorded Predicted and Actual Latencies Per Task
Table 19: Descriptive Statistics for Job Assembly Measurements
Table 20: Results from the Shapiro-Wilk Normality Test on AVGMRE
Table 21: Results from the Shapiro-Wilk Normality Test on LOG(AVGMRE)
Table 22: Results from the Initial Calculation for a One-Sided Distribution-Free Tolerance Interval
Table 23: Final Statistical Label

Acknowledgements

The authors would like to thank Otto Preiss and Holger Hoffman from Asea Brown Boveri, Ltd. (ABB) for providing invaluable expertise on substation automation. We also wish to thank Ivica Crnkovic and Mälardalen University for their continuing support of our work in predictable assembly.


Preface to 2nd Edition

Our motivation for producing a second edition of this report is primarily to provide a proper conclusion to the experiment in developing a PECT for substation automation systems. The first edition ended the story prematurely—it reported a PECT that satisfied requirements on the reliability of predictions, but which nonetheless exhibited a sufficient level of random error to leave us unsatisfied with the result, both in terms of the PECT itself, and in the methods used to specify and document the PECT. The major revisions found in this new edition are limited to Appendices A and B, which document the λABA reasoning framework and its empirical validation, respectively. The authors have resisted, wherever possible, revising this report to reflect the most recent developments in our approach to PECT-building. This is explained by our preference for an editorial process that has a good chance of terminating. Nonetheless, where practical, we indicate more recent thinking in footnotes.


Executive Summary

Background

The Predictable Assembly from Certifiable Components (PACC) Initiative at the Software Engineering Institute (SEI) is developing methods and technologies for predictable assembly. A software development activity that builds systems from components is predictable if the runtime behavior of an assembly of components can be predicted from known properties of components and their patterns of interactions (connections), and if these predictions can be objectively validated. A component is certifiable if these known properties can be obtained or validated by independent third parties. The SEI's approach to predictable assembly is through prediction-enabled component technology (PECT). At the highest level, PECT is a scheme for systematic and repeatable integration of software component technology, software architecture technology, and design analysis and verification technology. This scheme is not simply technological; it also includes the processes needed to design and validate the predictive powers of a PECT. A PECT, then, is a packaging of engineering methods and a supporting technical infrastructure that, together, enable predictable assembly from certifiable components.

PACC became an SEI initiative on October 1, 2002. Prior to that, from 1999 to 2001, PACC was considered an exploratory research project. The objective of such a project at the SEI is to develop an understanding of a problem area and proposed solutions for it. During the third year of exploratory research (2001-2002), the SEI, in partnership with an industrial sponsor, the Asea Brown Boveri, Ltd. Corporate Research Center (ABB/CRC), undertook the prototyping effort described in this report.

Objective

The objective of this prototyping effort was not to develop a PECT solution for a particular design problem per se, but rather to explore as many aspects of PECT as was practical within a limited time frame. Our primary concerns were methodological and practical: Could we define a process for developing and validating PECTs? Would the process be practical? What are the risks to transitioning PECT to practice? What are the technical limits of PECT, and how might they be overcome? On more than one occasion, our prototyping effort was diverted to explore these and similar questions.

SEI is a service mark of Carnegie Mellon University.

1. In this report, the term predictable assembly means predictable assembly from certifiable components.

Approach

To focus and ground our research, we selected as a prototype problem the substation automation system (SAS), an application area in the domain of power generation, transmission, and management. This problem area was chosen because it is well bounded, well defined, and representative of the broader class of critical infrastructure systems. The SEI defined three model problems in the predictable assembly of SAS: the assembly of an operator station, a primary equipment controller, and an integrated operator/controller. The nominal objective of the prototype was to develop three PECTs, one for each model problem.

Results of This Work

This report describes the results of an experimental application of PECT to SAS, focusing on the primary equipment PECT, the SAS switch controller. A low-power, high-speed switch was developed by the SEI to simulate primary equipment. Controller assemblies were developed in accordance with the International Electrotechnical Commission (IEC) 61850 standard for substation controllers, and laboratory scenarios were developed for external switch control and over-current protection. A family of latency models and their interpretations (λ*) were developed to predict point-to-point, intra-assembly (intra-controller) latency. One member of this family, average case latency with blocking and asynchrony (λABA), was empirically validated.

Primary results. The primary results of this work were methodological. We developed an overall process model for the design, development, and validation of PECTs. We studied two key processes in depth: co-refinement and empirical validation. Technically, co-refinement is the process of defining an interpretation from component model to analysis model. Intuitively, it is a negotiation that results in a component model that is expressive enough to span an intended application area, but restrictive enough to ensure that analysis is tractable. Technically, empirical validation defines measurement procedures for obtaining component measures and assembly measures, and for designing the experiments to compare the predicted and observed assembly measures. Intuitively, it is a means of attaching meaningful statistical labels to components and the predictive powers of PECT analysis.

Secondary results. The secondary results of this work were technological. We developed a measurement and validation infrastructure and gained valuable experience in the construction of a laboratory environment for empirical validation. More significantly, we specified an analytically extensible component model. This model defines the basics of component interface and connection topology (assembly) and is extended by formal interpretations to one or more analysis models. An analysis model supports compositional reasoning about assembly properties PA and defines the component properties PC required to predict PA. Each interpretation may impose additional constraints on components and their topologies. In return, each analysis model guarantees that assemblies of components satisfying these constraints will be analyzable, and hence be predictable by construction, with respect to PA.

Although the work emphasized empirical predictability, formal approaches to predictability were also explored. Specifically, safety conditions for the prototype (hardware) switch and IEC 61850 (software) controller dealing with the over-current protection components were expressed in branching time temporal logic. State machines for these components were developed, and the safety and liveness conditions were verified using the SMV model checker.

Tertiary results. The tertiary results of this work were the PECTs themselves. A PECT was developed for the operator interface and controller. The operator PECT was developed on the Microsoft .NET environment. Latency predictions for operator interfaces were accurate (< 3% magnitude of relative error [MRE]) but not technically demanding. More interesting was the controller PECT, which was substantially more complex and demanding. Controller assemblies introduced threading, contention, priority-based scheduling, and periodicity not encountered in the operator PECT. The original normative requirement for controller latency prediction was a 90% confidence interval that 80% of predictions would not exceed an upper bound of 5% MRE. This norm was arbitrary and not particularly demanding. Nevertheless, the results were encouraging. λABA predictions match observations (expressed as an MRE) to within 0.5% for 80% of all predictions, with greater than 99% confidence that 0.5% is indeed the upper bound. Although the requirements imposed on the property theory (λABA) were relatively lax, and although the property theory itself was significantly restricted, the PECT exhibited many interdependencies among component technology, components, property theories, property-theory-imposed well-formedness rules, and use scenarios. Several inconsistencies in the PECT itself were discovered during the empirical validation presented in the first edition of this report. This, in turn, prompted improvements in the way PECTs are specified and documented, and led to the revalidation of λABA reported in this edition.

2. SMV is a symbolic model-checking tool developed at Carnegie Mellon University. For more information about it, see .

Next Steps

The results of the exploratory prototype are encouraging. However, as expected, there are aspects of PECT that require further elaboration, and some that have yet to be explored. This report identifies two areas requiring attention. First, a technology infrastructure for specifying and deploying components and their assemblies must be constructed; the substation automation prototypes relied on manual processes for much of this. We developed a substantial suite of tools to support empirical validation, but greater automation is still required to generate the massive data sets required to adequately validate even moderately complex analysis models. Second, the methods and technologies needed to support independent (third-party) measurement and certification of components and PECTs must be established; the substation automation prototypes relied on in situ measurements. Means must be established for acquiring measurements in third-party environments, establishing and maintaining trust in assertions about those measurements, and extrapolating those assertions to different, possibly heterogeneous development and deployment environments.

Abstract

The Predictable Assembly from Certifiable Components (PACC) Initiative at the Software Engineering Institute (SEI) is developing methods and technologies for predictable assembly. A software development activity that builds systems from components is predictable if the runtime behavior of an assembly of components can be predicted from known properties of components and their patterns of interactions (connections), and if these predictions can be objectively validated. A component is certifiable if these known properties can be obtained or validated by independent third parties. The SEI's technical approach to PACC rests on prediction-enabled component technology (PECT). At the highest level, PECT is a scheme for systematic and repeatable integration of software component technology, software architecture technology, and design analysis and verification technology.

This report describes the results of an exploratory PECT prototype for substation automation, an application area in the domain of power generation, transmission, and management. This report focuses primarily on the methodological aspects of PECT; the prototype itself was only a means to expose and illustrate the PECT method.


1  Introduction

The Predictable Assembly from Certifiable Components (PACC) Initiative at the Software Engineering Institute (SEI) is developing methods and technologies for predictable assembly. A software development activity that builds systems from components is predictable if the runtime behavior of an assembly of components can be predicted from known properties of components and their patterns of interactions (connections), and if these predictions can be objectively validated. A component is certifiable if these known properties can be obtained or validated by independent third parties. The SEI is not alone in pursuing the objectives of PACC, although academic and commercial terminology may obscure this fact for those not deeply familiar with the subject.

The SEI technical approach to PACC rests on prediction-enabled component technology (PECT). At the highest level, PECT is a scheme for systematic and repeatable integration of software component technology, software architecture technology, and design analysis and verification technology. The results of this integration are engineering methods and a supporting technical infrastructure that, together, enable predictable assembly from certifiable components.

This report describes an exploratory prototype of PECT. We emphasize the word exploratory, since the objective of the work was to touch on as many methodological and technological aspects of PECT as possible, using the broad constraints imposed by an industrially significant problem area as a guide. We selected the problem area of substation automation, an application area in the domain of power generation, transmission, and management, because it is well bounded, well defined, and representative of the broader class of critical infrastructure systems.

3. For more information on this initiative, see .

4. We will henceforth use the term predictable assembly to mean predictable assembly from certifiable components.

1.1  Background

In 2000, the SEI initiated a feasibility study in predictable assembly. The study was undertaken to

•  identify the fundamental science and engineering challenges posed by predictable assembly and software component certification

•  determine if the need to address these challenges was widespread and of economic or strategic interest to the U.S. Department of Defense (DoD) and to the software industry as a whole

•  ascertain whether technological and methodological elements to address these challenges existed, or could be created with reasonable probability and effort

In 2001, the SEI formed a collaboration with the Asea Brown Boveri, Ltd. Corporate Research Center (ABB/CRC). This collaboration was established to undertake a two-year feasibility study of predictable assembly in an industrial setting. The application area was in power generation and transmission systems, specifically, substation automation systems (SASs). Typically, an SAS is implemented as a distributed system composed of 20 to 100 computing elements (personal computers and/or other electronic computing devices) executing a mix of soft- and hard-real-time applications. An SAS is characterized by stringent performance and reliability requirements.

1.2  Approach

The SEI/ABB joint feasibility study was intended to address two broad feasibility questions:

1. Can a practical technology infrastructure be developed to automate significant aspects of predictable assembly?

2. Can an engineering method be developed as a basis for the systematic improvement of an industry-wide capability in predictable assembly?

To conduct this feasibility exploration, the SEI and ABB defined a series of model problems. Briefly, a model problem is a simplification of a more complex one such that a solution to it can be extrapolated to that of a real problem. In 2001 and 2002, three model SAS problems were specified; in 2002 and 2003, they were extended to include the domain of industrial robotics.

1.3  Objective of This Report

This report consolidates the results and discoveries of the SEI/ABB investigation in predictable assembly. It is, in effect, a retrospective on an extended laboratory experiment. Our intent is that this report serve as a basis for continued research within the SEI's PACC Initiative. Our intent is also that this report, when combined with related SEI publications on predictable assembly and PECT, will provide a valuable resource for other applied research in this field.

1.4  Audience for This Report

The audience for this report is computer scientists and software engineering researchers with an interest in predictable assembly from certifiable components (by this or any other name). We assume that readers have technical backgrounds that are both diverse and deep, spanning areas of software architecture, software component technology, probability and measurement theory, real-time systems, and model checking, to name the major areas reflected in this report. Therefore, this report is not meant to serve as a general introduction to predictable assembly, component technology, or PECT.

1.5  Structure of This Report

An overview of substation automation and the details of the SAS model problems are described in Chapter 2. In Chapter 3, we provide an overview of the technical concepts of PECT. The key workflows of a process for developing and validating a PECT are summarized in Chapter 4. In Chapter 5, we summarize the chronology of a key development activity in the design of a PECT, called co-refinement, the result of which was a controller PECT for predicting point-to-point latency for any controller assembly. The techniques used to empirically validate the SAS-controller PECT are outlined in Chapter 6. In Chapter 7, the focus shifts from empirical to formal approaches to predictable assembly, where we describe the use of model checking to verify safety and liveness conditions of controller assemblies using temporal logic. Chapter 8 extracts the key results of the work to date, emphasizing the benefits and risks of the PECT approach to predictable assembly. The plan for the next phase of our investigation is outlined in Chapter 9. More details on the work are provided in appendices of this report. Appendix A describes the λABA analysis model for predicting controller latency and its interpretation; this is the model that emerged from co-refinement. A detailed account of the empirical validation of λABA is provided in Appendix B. In Appendix C, we provide the temporal-logic specifications and state model used to reason about controller safety conditions. Appendix D provides a schematic for the switch hardware developed for the prototype. The Acronym List on page 145 defines the acronyms used in this report.


2  Model Problems for Substation Automation

The following description of an SAS is excerpted from a paper by Preiss and Wegmann [Preiss 01]:

    Networks (power grids) of different topologies are responsible to transport energy over short or long distances and finally distribute it to end-consumers (such as households and companies). The nodes in such a network are called substations .... Substations may be manned or unmanned depending on the importance of the station and also on its degree of automation. Substations are controlled by Substation Automation Systems (SAS). Since unplanned network outages can be disastrous, an SAS is composed of all the electronic equipment that is needed to continuously control, monitor, and protect the network. This covers all the high voltage equipment outside the substation (overhead lines, cables, etc.) as well as those inside the substation (transformers, circuit breakers, etc.).

This combination of challenging and concrete requirements means that an SAS provides a good basis for model problems in predictable assembly. We outline the basic structure of the SAS model problems in Section 2.1. Background information on the International Electrotechnical Commission (IEC) 61850 standard, the SAS domain model used to develop the controller PECT prototype, is provided in Section 2.2. An overview of the hardware and software used in the operator and controller prototypes is provided in Section 2.3.

2.1  Three Model Problems and Three PECTs

The scale of an SAS depends on the cost, size, and criticality of the primary equipment that it manages. For the model problems described in this report, we envisage a manned SAS that has an operator workstation and one or more controllers, each of which controls some primary equipment (for example, breakers, switches, and transformers). The operator has a graphical display console for observing and interacting with primary equipment. For the purpose of this report, we are not concerned with details such as the allocation of controllers to bays or the interaction of multiple controllers (e.g., bay interlocking). Instead, we assume a single operator console, a single primary switch, and a single controller for that switch. From this general structure, we defined three model problems, each leading to a PECT as its model solution.


The overall scheme is depicted in Figure 1. The SEI developed a low-power, high-speed switch (labeled "SEI Switch" in Figure 1) to play the role of primary equipment. One model problem was to develop a PECT for predictably assembling SEI switch controllers, a second model problem was to develop a PECT for predictably assembling operator interfaces, and a third model problem was to develop a higher order PECT for composing controller and operator assemblies into an overall SAS application. This PECT would allow operators to monitor and control the state of the switch.

Figure 1: Three PECTs (the SAS PECT spans a controller PECT on Microsoft Win32/IEC 61850, whose :CSWI, :XCBR, :TCTR, :TVTR, :MMXU, and :PIOC components connect to the SEI Switch through I/O channels, and an operator PECT on Microsoft .NET/AIP, whose :HMI components communicate with the controller over a LAN)

The PECTs were intended to enable the prediction of the end-to-end latency of operations within an operator assembly, within a controller assembly, and over a combined operator/controller assembly. Nominal requirements were established for the accuracy and reliability of predictions and were expressed as statistical tolerance intervals:

•  controller PECT: 99% confidence that 80% of latency predictions will have a magnitude of relative error (MRE) < 5%

•  operator PECT: 95% confidence that 80% of latency predictions will have MRE < 10%

•  SAS PECT: 95% confidence that 80% of latency predictions will have MRE < 10%

It must be emphasized that these requirements were not selected with any consideration for the practicality of or resemblance to operational requirements. For example, the controller PECT was to be developed on Windows 2000, and we understood that it is problematic to achieve the stated tolerance interval on that platform (given that Windows 2000 is not a real-time operating system). Instead, the above norms were selected as a plot device for our study of empirical validation.

5. The statistical intervals selected for these model problems were derived from Preiss and Wegmann's work [Preiss 01].
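
To make the MRE and tolerance-interval criteria concrete, the following Python sketch (our own illustration, not the report's StInt tool; the latency pairs are invented) computes the magnitude of relative error of individual predictions and a distribution-free, binomial-based confidence that at least 80% of predictions fall under a given MRE bound. It also hints at why large validation samples are needed: even when every observed MRE is under the bound, five observations cannot support a 99% confidence claim.

    from math import comb

    def mre(predicted, observed):
        # magnitude of relative error (MRE) of a single latency prediction
        return abs(predicted - observed) / observed

    def confidence_at_least(errors, bound, proportion):
        # distribution-free (binomial) confidence that at least `proportion` of the
        # population of MREs lies below `bound`, given that k of the n sampled MREs do
        n = len(errors)
        k = sum(1 for e in errors if e < bound)
        return sum(comb(n, j) * proportion**j * (1 - proportion)**(n - j)
                   for j in range(k))

    # illustrative predicted/observed latency pairs (microseconds), not measured data
    pairs = [(102.0, 100.0), (55.0, 54.0), (210.0, 205.0), (98.0, 100.0), (31.0, 30.0)]
    errors = [mre(p, o) for p, o in pairs]
    print([round(e, 4) for e in errors])                      # every MRE is below 5%
    print(round(confidence_at_least(errors, 0.05, 0.80), 3))  # ~0.672: far below 99% with only n = 5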

Although the model problems were purposely kept quite abstract, two technology decisions were made in the interest of reducing the gap between the model solutions and their ultimate deployment environments. The operator PECT was developed for the ABB Aspect Integration Platform (AIP). The AIP is a development and deployment environment for automation systems; it uses OPC to enable operator interfaces to communicate with controllers. The controller PECT was developed using the IEC 61850 standard for substation automation. This report focuses on the controller PECT, since that model problem posed the most challenges.

Although the latency prediction enabled by the controller PECT is quite flexible in defining the end points of an "end-to-end" latency prediction, two specific controller scenarios were defined:

1. switch control latency. In this scenario, we are interested in predicting the end-to-end latency of an operation to manually "throw" the SEI switch.

2. over-current protection latency. In this scenario, we are interested in predicting the time required to detect an over-current condition and automatically "throw" the switch.

These scenarios were selected in part because they correspond to actual SAS functionality, as defined in IEC 61850.

2.2  The IEC 61850 Domain Model

The IEC 61850 standard defines a domain model of the functionality of substation automation in a form that is readily adopted for use with software component technology [Ivers 02]. Figure 2 depicts, in the Unified Modeling Language (UML), the IEC 61850 concepts that are most important for SAS model problems. An SAS consists of a physical system and a logical system. The physical system comprises primary equipment, such as switches, breakers, and transformers, and secondary equipment, such as intelligent electronic devices (IEDs) or other controller hardware. The logical system comprises the data and software functions for Supervisory Control and Data Acquisition (SCADA), Energy Management Systems (EMSs), and other substation applications.

The IEC 61850 standard defines a variety of functions and the quality attributes required for each, including reliability, accuracy, and latency. Each function is defined in terms of one or more logical nodes (LNs). The standard defines many LNs; each one is a primitive building block from which many SAS functions may be composed. Conceptually, then, there is a close correspondence between LNs and software components. Although it is possible, and perhaps reasonable, to deploy several LNs as a single component, we chose to maintain a one-to-one correspondence between LNs and the software components that implement them.

6. OPC stands for OLE (Object Linking and Embedding) for Process Control.

Figure 2: Key Concepts of the IEC 61850 Standard for a SAS Model Problem (a UML class diagram relating the Substation Automation System, its Logical System of Functions realized by Logical Nodes (LNs), its Physical System of Devices, the Physical Devices (PDs) or IEDs that LNs run on, High Voltage Devices, Logical Connections (LCs), and the Physical Connections (PCs) they map to)
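
The sketch below is our own simplified rendering of these relationships in code (the class names follow Figure 2, but the field names and the grouping of TCTR, PIOC, and XCBR under an over-current protection function are our assumptions): a function is realized by logical nodes, and each logical node runs on a physical device.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class PhysicalDevice:              # a physical device (PD) or IED: secondary equipment
        name: str

    @dataclass
    class LogicalNode:                 # primitive building block of SAS functions
        name: str                      # e.g., "TCTR", "PIOC", "XCBR"
        runs_on: PhysicalDevice

    @dataclass
    class Function:                    # an SAS function, realized by one or more LNs
        name: str
        logical_nodes: List[LogicalNode] = field(default_factory=list)

    ied = PhysicalDevice("controller PC")
    protection = Function("over-current protection",
                          [LogicalNode("TCTR", ied),
                           LogicalNode("PIOC", ied),
                           LogicalNode("XCBR", ied)])
    print([ln.name for ln in protection.logical_nodes])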

The IEC 61850 components implemented for the controller PECT are summarized in Table 1. Only rudimentary functionality was implemented. (Other components that were developed but are not shown here include an OPC gateway for communicating with the operator PECT and a clock component for delivering periodic stimulus to controller components.)

Table 1: IEC 61850 LNs as Components of the Controller PECT

  IEC 61850 LN   Summary of Function
  TCTR           Calculates current
  TVTR           Calculates voltage
  MMXU           Calculates power
  CSWI           Provides safety rules for the controller (e.g., select before use)
  XCBR           Acts as the controller interface
  PIOC           Provides over-current protection
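
As a deliberately simplified illustration of how such primitive LNs compose into a function (the actual components implement far richer IEC 61850 behavior, and these function names and signatures are our own), the sketch below derives MMXU's power value from the current and voltage calculated by TCTR and TVTR.

    def tctr_current(sampled_amps):
        # TCTR: calculates current from its sampled input
        return sampled_amps

    def tvtr_voltage(sampled_volts):
        # TVTR: calculates voltage from its sampled input
        return sampled_volts

    def mmxu_power(current_amps, voltage_volts):
        # MMXU: calculates power from the TCTR and TVTR outputs
        return current_amps * voltage_volts

    print(mmxu_power(tctr_current(2.0), tvtr_voltage(120.0)), "watts")   # 240.0 watts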

2.3  Lab Environment and the SEI Switch

The hardware and software platforms used for the operator and controller PECTs were conventional, Intel-based personal computers running Microsoft's Windows 2000 operating system. The operator PECT was developed for Microsoft .NET in C#. The controller PECT used VenturCom's RTX real-time extensions package to support priority-based scheduling. Other details of the hardware and software configuration for the controller PECT are described in Table 14 and Table 15 on page 120. The SEI also developed a low-power, high-speed switch to facilitate the prototype development. An abstraction of this switch is depicted in Figure 3; the actual schematic is shown in Figure 61 on page 139.

Figure 3: SEI Switch Abstraction (a software switch in series with a microprocessor-controlled hardware switch; the controller drives two digital inputs, switch select and switch trip, and reads five outputs, three analog and two digital, reporting switch selected, current output, voltage output, software switch trip, and hardware switch trip)

The SEI switch communicates with the software controller using two digital input channels for commands and five output channels (two digital and three analog) for status and values. Digital input is provided for the controller to select the switch and set its position to “open” or “close.” Analog output provides positive feedback on switch selection status and reports on the switch’s current, voltage, and position. The switch is implemented as a serial composition of two high-speed electronic switches, which are also known as TRIACs. One TRIAC represents the primary equipment being controlled (labeled “Software Switch” in Figure 3); the other is controlled internally on the SEI switch and used for over-current protection test scenarios (labeled “Hardware Switch” in Figure 3). The hardware switch is controlled by a microprocessor that monitors current and time. If the current exceeds a value for a specified duration (both are configurable), the microprocessor “throws” the hardware switch, representing a failure of the PIOC (see Table 1) component to protect the primary equipment.

7. See for details about this software package.
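
The trip rule itself is easy to state in code. The sketch below is our own illustration, not the SEI switch firmware; the threshold and duration values are invented, but it shows how the two configurable parameters interact.

    class HardwareTrip:
        def __init__(self, current_limit_amps=5.0, max_duration_ms=100.0):
            self.current_limit = current_limit_amps   # configurable current threshold
            self.max_duration = max_duration_ms       # configurable duration threshold
            self.over_since = None                    # start of the current over-current episode
            self.tripped = False

        def sample(self, now_ms, current_amps):
            # called for every current sample; returns True once the switch has tripped
            if self.tripped:
                return True
            if current_amps <= self.current_limit:
                self.over_since = None                # condition cleared: reset the timer
                return False
            if self.over_since is None:
                self.over_since = now_ms              # over-current episode begins
            if now_ms - self.over_since >= self.max_duration:
                self.tripped = True                   # persisted too long: throw the hardware switch
            return self.tripped

    trip = HardwareTrip()
    print([trip.sample(t, 6.0) for t in (0, 50, 100)])   # [False, False, True]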

The software switch is operated by a software controller running on the Windows 2000 platform through a data acquisition card that is connected to the analog and digital lines. This controller is shown in detail in Figure 9 on page 23.


3  Prediction-Enabled Component Technology

Our approach to predictable assembly is to develop and use prediction-enabled component technologies (PECTs). We formulated the basic concepts of PECT prior to undertaking our SAS effort [Hissam 02], but refined and expanded them substantially as a result of the work reported here. In this chapter, we first outline the theory of PECT in Section 3.1 and then describe the structure of a PECT implementation in Section 3.2. Next, we provide an overview of Pin, the core component model we use to build PECTs, in Section 3.3. Not all the Pin features described in this report were implemented in the model solutions; those that were are summarized in Section 3.4. Section 3.5 depicts the SAS software controller developed in Pin.

In the following discussion, we adopt the terminology of architecture documentation used by Clements and associates to describe PECT [Clements 02b]. You should consult their work if the meanings of these terms are not self-evident. Our only modification to this terminology is that, in some situations, we prefer the term assembly to view for its connotations of compositionality.

3.1  The Theory of PECT

Component composition, in the literature, almost universally assumes an underlying component-and-connector viewtype. Correspondingly, a PECT has a central component-and-connector viewtype, called the constructive viewtype, that models runtime components and their interaction topologies. In general, we denote a view in this viewtype as a constructive assembly. The constructive viewtype has one or more analysis viewtypes, called analysis views, associated with it. Each analysis viewtype provides a basis for compositional reasoning about some runtime behavior, for example, latency, security, safety, liveness, or reliability.

8. As defined by Clements and associates [Clements 02b].

9. In previous documents, we referred to analysis views as analysis assemblies. View seems preferable to assembly here, because assembly has a stronger connotation of components and connectors than can be justified in the general case of views that support behavioral analysis technology. However, in this case study, the term analysis assembly turns out to be appropriate.

In PECT, analysis views are derived automatically from constructive assemblies by means of a syntactic transformation, called an interpretation (see Figure 4). An interpretation is a partial function from constructive assemblies to analysis views. Three interpretations are shown in Figure 4: ILATENCY, ISAFETY, and IAVAIL.

Figure 4: Interpretations on Constructive Assemblies and Analysis Views (a single constructive assembly maps under ILATENCY, ISAFETY, and IAVAIL to a latency analysis view, a safety analysis view, and an availability analysis view, respectively)

Interpretations may impose constraints on constructive assemblies beyond those specified in the constructive viewtype, for example, restrictions on allowed topologies of component interactions or on component behavior. Interpretations may also require analysis-specific component information that is not defined in the constructive viewtype. For example, a latency analysis viewtype might require the priority assignment of component execution threads. An interpretation is consistent if each constructive assembly is related to, at most, one analysis view under that interpretation. An interpretation is valid if there is empirical or formal justification for the claim that behaviors predicted in the analysis view will be manifested by the assembly of components specified in the constructive view. A PECT, then, consists (in part) of a set of consistent and valid interpretations.
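
The following sketch (our own illustration, not the report's tooling; all names are hypothetical) treats an interpretation in exactly this way: a partial function that either produces a latency analysis view from a constructive assembly or is undefined when an analysis-specific constraint, here a missing thread priority, is violated.

    from dataclasses import dataclass
    from typing import List, Optional, Tuple

    @dataclass
    class Component:
        name: str
        exec_time_ms: float
        priority: Optional[int] = None      # analysis-specific property needed by the latency theory

    @dataclass
    class Assembly:                         # constructive view: components plus connections
        components: List[Component]
        connections: List[Tuple[str, str]]  # (source component, sink component) pairs

    @dataclass
    class LatencyTask:                      # one element of the latency analysis view
        name: str
        cost_ms: float
        priority: int

    def interpret_latency(assembly: Assembly) -> Optional[List[LatencyTask]]:
        # ILATENCY: return a latency analysis view, or None where the interpretation is undefined
        if any(c.priority is None for c in assembly.components):
            return None                     # constraint violated: the partial function has no value here
        return [LatencyTask(c.name, c.exec_time_ms, c.priority) for c in assembly.components]

    a = Assembly([Component("clock", 0.1, 10), Component("CSWI", 2.0, 5)], [("clock", "CSWI")])
    print(interpret_latency(a))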

3.2  The Structure of PECT

Figure 5 depicts the four different environments in PECT and their conceptual dependencies in terms of a UML class diagram. As implied by its name, a PECT consists of a component technology that has been extended with one or more prediction-enabling technologies. We do not claim to have discovered a universally acceptable definition of these technologies; instead, we define them relative to PECT. The assembly and runtime environments are provided by a component technology discussed in Section 3.2.1, while the analysis and validation environments are provided by a prediction-enabling technology discussed in Section 3.2.2.

Figure 5: The Four Environments of a PECT (the component technology's assembly environment and runtime environment (container) are extended by prediction enablers: one or more analysis environments attached through interpretation plug-ins, and one or more validation environments attached through observation plug-ins, with confidence flowing from the validation environment to the analysis environment)

3.2.1 Component Technology

A component technology comprises an assembly environment and a runtime environment. In current literature and commercial products, a component runtime environment is often referred to as a container. We prefer our own terminology here.

• The assembly environment supports development-time activities such as selecting components, configuring component properties, and connecting components into assemblies. An assembly environment supports pure composition if those operations are the only ones required to develop applications. Pure composition involves developing applications entirely from components and connectors.

• The runtime environment manages the execution of components (e.g., their execution schedules and life cycles), manages resources shared by components, and provides services that allow components to interact with the external world.

Assemblies are deployed in their runtime environment, that is, by a mechanism either native or external to the component technology. Not shown in Figure 5 is the fact that both of the above environments are specialized to support a particular constructive model.10 The constructive model specifies the types of components in an assembly, their type-specific runtime behavior, and the constraints on their connection topology [Bachmann 00]. In our view, this concept of constructive model is nearly (if not exactly) identical to that of architectural style, as that term has been used in current literature [Gardiner 00], [Bass 98], [Clements 02b]. Therefore, the Pin constructive model, or, more ponderously, the Pin constructive component model, is also sometimes referred to as the Pin style. Pin is described in Section 3.3.

10. Our use of the term constructive model supersedes the earlier, and more ambiguous, term component model.

3.2.2 Prediction Enablers

Prediction enablers comprise an analysis environment and a validation environment:

• The analysis environment supports reasoning about the runtime behavior of a system. Simplistically, reasoning is compositional if it supports "divide and conquer" analysis.

• The validation environment is an infrastructure for controlled, experimental validation of predictions of assembly behavior made in the analysis environment.

The association between the analysis and validation environments shown in Figure 5 reflects a logical dependency between the theoretical model underlying a particular form of design analysis and the experimental apparatus needed to experimentally validate the theory.

3.2.3 PECT = Component Technology + Analysis Technology

We enable reasoning in a component technology by extending its assembly environment with one or more analysis environments. This extension is implemented via a plug-in interface. An analysis environment provides an implementation that satisfies the plug-in, which, in turn, allows users of an assembly environment to reason about as many assembly properties as there are plug-ins. We enable the validation of analysis technology by extending the component runtime with one or more validation environments. This extension is also implemented by means of a plug-in interface, although in this case, we do not envisage multiple validation environments operating concurrently (but you never know). The observation mechanism is based on execution traces where observable trace events (e.g., procedure calls or message exchanges) have been annotated with property-theory-specific information.
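As a rough illustration of the plug-in relationship (the interfaces below are hypothetical and are not the actual SAS implementation), an assembly environment might host one analysis plug-in per analyzable assembly property:

    from abc import ABC, abstractmethod

    class AnalysisPlugin(ABC):
        """Interface an analysis environment implements to plug into the assembly environment."""

        @abstractmethod
        def interpret(self, constructive_assembly):
            """Transform a constructive assembly into this plug-in's analysis view."""

        @abstractmethod
        def predict(self, analysis_view):
            """Return a predicted assembly property (e.g., average latency)."""

    class AssemblyEnvironment:
        def __init__(self):
            self.plugins = {}                      # one plug-in per assembly property

        def register(self, property_name, plugin):
            self.plugins[property_name] = plugin

        def analyze(self, property_name, constructive_assembly):
            plugin = self.plugins[property_name]
            view = plugin.interpret(constructive_assembly)
            return plugin.predict(view)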

3.3 The Pin Style11

Pin allows us to emphasize the compositional potential of software component technology. This may sound tautological at first, but, with a little study, it becomes apparent that components and composition do not always go hand in hand. Consider, for example, the amount of "glue" code that must often be developed, in conventional programming languages, to integrate components [Wallnau 02].12 With Pin, we can explore the potential for pure composition. It is our view that breakthroughs in programmer productivity and system quality require higher level programming abstractions, and that pure composition has many of the qualities we might want such abstractions to possess, especially when it is "prediction enabled." The Pin style (hereafter referred to as simply Pin) adopts the hardware metaphor that components interact with their environment through a set of pins. Components in Pin are specified as a set of pins and pin behaviors; assemblies are specified as a set of connections among pins. Pin defines a number of connectors, each specialized to a particular type of pin and pin-to-pin interaction scheme. In this report, we provide a high-level overview of the graphical notation for Pin and the connectors used in SAS PECTs. A more formal and detailed treatment is provided by Ivers and associates [Ivers 02].13 We describe the notation of components and pins in Section 3.3.1, assemblies and environments in Section 3.3.2, the behavioral specification of components and the composition of component behavior in Section 3.3.3, and assemblies of assemblies in Section 3.3.4.

11. What is referred to here as the Pin Style, or elsewhere as the Pin Component Model, or often just Pin, has become progressively more formal over time. Pin is now (at the time of the 2nd edition) the metamodel of the Construction and Composition Language (CCL). This language is discussed in a forthcoming technical note (Wallnau, K. & Ivers, J. Snapshot of CCL: A Language for Predictable Assembly. Pittsburgh, PA: Software Engineering Institute, to be published).

3.3.1 Components and Pins

A component interacts with the external world exclusively through its pins; there are no other communication paths to or from a component. There are two types of pins: sink pins and source pins. A component receives communication (stimuli) from the environment via its sink pins and sends communication (responses) to the environment via its source pins. Figure 6 depicts the graphic notation we use. As suggested by the different shapes given to pin heads in Figure 6, there are different types of source and sink pins. These types and their notations are defined in the following sections.

Figure 6: Pin Notation (a component C with sink pins s1 through s6 along one edge and source pins r1 and [r2] along the other; the pin-head shapes and thread suffixes distinguish the pin types defined in the following sections)

12. By "glue" code, we mean something messy and ad hoc that integrators cobble together as needed. While connectors are also a form of "glue," we typically think of connectors as being well defined and provided by a compiler or infrastructure.

13. A more recent treatment of this topic since the 1st edition can be found in a short paper that outlines an algorithm for the variant (near subset) of UML statecharts used by CCL [Ivers 03].

Components are denoted by cj, sink pins are denoted by sj (for stimulus), and source pins are denoted by rj (for response). Where the distinction between sink and source pin is irrelevant, we use pj (for pin). In all cases, we omit subscripts where they are not required. Component names, which must be unique, are also used to qualify pin names. The expression c.p denotes pin p of component c. We use capitalized names to denote the predicates defined on pins. For example, SomeCondition(c.p) is true if pin c.p satisfies SomeCondition, and is false otherwise. We use lowercase names to denote properties of pins. For example, someProperty(c.p) denotes the value of pin c.p's someProperty property.

3.3.1.1 Pin Data Interface

All pins have a data interface, denoted by interface(c.p). This corresponds to what is usually called the signature of a method and is defined as a sequence of formal arguments, each having a mode and a type, where mode is one of {In, Out, InOut}, and type is one of {Boolean, SByte, UByte, SWord, UWord, SDWord, UDWord, SDouble, Float, String}. For simplicity, we do not define the representation of these types in this report.
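A minimal sketch of how such a data interface could be represented (hypothetical names; illustrative only, not the Pin implementation):

    from dataclasses import dataclass
    from enum import Enum

    class Mode(Enum):
        IN = "In"
        OUT = "Out"
        INOUT = "InOut"

    PIN_TYPES = {"Boolean", "SByte", "UByte", "SWord", "UWord",
                 "SDWord", "UDWord", "SDouble", "Float", "String"}

    @dataclass
    class FormalArgument:
        mode: Mode
        type_name: str

        def __post_init__(self):
            assert self.type_name in PIN_TYPES, "unknown pin data type"

    # interface(c.p) is then an ordered sequence of formal arguments
    example_interface = [FormalArgument(Mode.IN, "UWord"),
                         FormalArgument(Mode.OUT, "Float")]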

3.3.1.2 Source Pins

There are two types of source pins: asynchronous and synchronous. Informally, an asynchronous source pin represents the ability of a component to make a request, where the component does not wait (or block) for the request to be satisfied. A synchronous source pin also represents the ability to make a request by a component, but in this case, the component must wait for the request to be satisfied. Asynchronous source pins are graphically denoted with the º pin head, and synchronous ones use the > pin head. In Figure 6, c.r1 is an asynchronous source pin, while c.r2 is a synchronous source pin. It is reasonable to think of asynchronous source pins as message sends and synchronous source pins as procedure calls, but you shouldn’t put too much faith in this gross interpretation; in fact, that is not how these pins were implemented in the SAS model solutions. Nonetheless, arguments in an asynchronous pin data interface must be of mode In. If Mandatory(c.r), then c.r is a mandatory source pin; otherwise it is optional. Optional source pins need not be connected in an assembly, while mandatory source pins must be connected. This models the distinction between uses (mandatory) and calls (optional), respectively. A component c0 uses sink pin c1.s if the behavior of c0 depends on the correct behavior of c1.s; otherwise it only calls c1.s. Graphically, we distinguish optional source pins from mandatory ones by enclosing the names of optional sources in square brackets ([ ]). So, c.[r2] is an optional source pin in Figure 6.

3.3.1.3 Sink Pins

There are two types of sink pins: asynchronous and synchronous, referring to a component's ability to receive asynchronous or synchronous communication, respectively. As with source pins, the data interface of an asynchronous sink pin may include only arguments of mode In. Asynchronous and synchronous sink pins are graphically denoted with pin heads º and >, respectively. In Figure 6, c.s1 is asynchronous, while c.sk, 2 ≤ k ≤ 6, are synchronous.

If Threaded(c.s), then c.s has its own thread of control, and threadId(c.s) denotes its identity. Threads represent units of concurrent execution and may be implemented by operating system threads, processes, tasks, and so forth. Graphically, tj = threadId(c.s) is denoted as a suffix :tj on the sink pin name. In Figure 6, sink pins c.s2 and c.s3 are unthreaded. Threads may be shared by sink pins. So, in Figure 6, c.s5 and c.s6 share thread t3, but c.s1 and c.s4 have their own threads. Note that asynchronous sink pins must be threaded, that is, Asynchronous(c.s) ⇒ Threaded(c.s).

If Mutex(c.s), then c.s is called a mutex sink, and only one caller may be active on c.s at any given time—in effect, the caller must obtain the semaphore for c.s. Conversely, if ¬Mutex(c.s), then c.s is called a reentrant sink and is never guarded by a semaphore. Note that even a reentrant c.s might force a caller to wait while c.s synchronizes on an internal (to c.s) resource. This characteristic pertains only to synchronous sink pins. Mutex sinks are represented by >|. In Figure 6, Mutex(c.sk), where 3 ≤ k ≤ 6. Note that Threaded(c.s) ∧ ¬Asynchronous(c.s) ⇒ Mutex(c.s).
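Purely as an illustration (hypothetical representation, not the SAS tooling), the two sink-pin constraints just stated can be checked mechanically:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class SinkPin:
        name: str
        asynchronous: bool
        thread_id: Optional[str]      # None means the sink pin is unthreaded
        mutex: bool

        def threaded(self) -> bool:
            return self.thread_id is not None

    def well_formed(s: SinkPin) -> bool:
        # Asynchronous(c.s) implies Threaded(c.s)
        if s.asynchronous and not s.threaded():
            return False
        # Threaded(c.s) and not Asynchronous(c.s) implies Mutex(c.s)
        if s.threaded() and not s.asynchronous and not s.mutex:
            return False
        return True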

3.3.2 Assemblies and Environments

An assembly is defined as a set of components and their connections. We denote an assembly as aj, and a.c denotes the component c in the scope of assembly a. An assembly is graphically depicted as a box enclosing a set of connected components, which visually reinforces the notion that assemblies are composed of components. Figure 7 depicts a simple assembly to illustrate key points in the following discussion. A connection, or connector, is established between two components when some source pin ci.r is connected to some sink pin cj.s, i ≠ j (this inequality is assumed in further discussion). Components ci and cj must be part of the same assembly if they are to be composed. In text, we denote the connection as a(ci.r k cj.s), although we usually omit the a() where doing so will not cause confusion. A connection ci.r k cj.s requires that ci.r and cj.s be mutually conformant. The rule for mutual conformance is simple: both pins must be synchronous, or both must be asynchronous, and their data interfaces must have the same argument types, modes, and positions. Graphically, we denote a connection as a double-headed lollipop, with the lollipop heads circumscribing the connected pins. In Figure 7, c0.r k c2.s is a connection.
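A small sketch of the mutual-conformance rule (hypothetical types; illustrative only):

    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class Pin:
        synchronous: bool
        interface: List[Tuple[str, str]]   # ordered (mode, type) pairs

    def mutually_conformant(source: Pin, sink: Pin) -> bool:
        """Both pins must be synchronous or both asynchronous, and their data
        interfaces must agree in argument modes, types, and positions."""
        if source.synchronous != sink.synchronous:
            return False
        return source.interface == sink.interface

    # an asynchronous source with one In/UWord argument can connect to an
    # asynchronous sink with the same signature
    assert mutually_conformant(Pin(False, [("In", "UWord")]),
                               Pin(False, [("In", "UWord")]))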

Figure 7: A Simple Assembly (assembly a0 of environment type E0, in which the environment services CLOCK and CONSOLE are attached through environment junctions and components C0 through C3 are connected through their pins)

Each assembly has an associated environment type, which is denoted by the :Ej suffix of an assembly name. Environment types play two crucial roles in Pin. First, they define the services (specified as sink and source pins) that components may use. Graphically, these services are attached to the assembly by means of environment junctions, drawn as small black boxes. For example, in Figure 7, a runtime environment associated with assembly a0 provides two services: the source pin CLOCK and the sink pin CONSOLE. The environment's services can be thought of as being implemented by an environment-provided component with a single sink pin (a0.CONSOLE) and source pin (a0.CLOCK). We denote a connection with an environment service in an analogous way to that defined earlier, that is, as a0.CLOCK k a0.c0.s1 and a0.c3.r1 k a0.CONSOLE. To make these expressions easier to read, we allow assembly names to distribute over k, so, instead of the above, we could write a0(CLOCK k c0.s1) and a0(c3.r1 k CONSOLE). Again, we might omit a0 if doing so does not cause confusion. Second, environment types supply the interaction models implemented by connectors. For example, consider the interaction topology in Figure 7 that includes

• CLOCK k c0.s1, c3.r1 k CONSOLE, c2.r2 k c0.s2

• c0.r k {c1.s, c2.s}

• {c1.r, c2.r1} k c3.s

where we use sets {c.p1, c.p2, ...} to denote multiple sinks and sources in 1:N and N:1 compositions, respectively. The semantics of 1:1 composition (the first bulleted item, above) is likely to be clear intuitively, but what about the 1:N asynchronous composition (the second bulleted item)? How many connectors are there? Is a broadcast or a sequence of unicasts represented? If it is a sequence, what is its order? What is the semantics of message buffering—first in, first out (FIFO) or last in, last out (LIFO)? What is the capacity of the message buffers? Analogously, what is the interaction order of the N:1 synchronous composition? We may want different answers to these questions in different component runtime environments. Ivers and associates give a detailed treatment of how these semantics are defined [Ivers 02].

3.3.3 Behavior: Reaction and Interaction

In PECT, the semantics of composition is defined by analysis views. That is, we define the semantics of an assembly as the observable behavior of an assembly. Informally, the Pin syntax describes what an assembly looks like, while the semantics describes what it does at runtime. We say that Pin is semantically extensible since different syntactic elements of Pin may be annotated14 with information that is used to construct, via interpretations, the semantics of each assembly in the Pin style. Nonetheless, one semantics is of such utility to predictable assembly that it is built into Pin— the behavior that is specifiable in a process algebra such as Hoare’s CSP [Hoare 85], Magee and Kramer’s FSP [Magee 99], or Milner’s CCS [Milner 89] and π-Calculus [Milner 99]. We have chosen, at this time, to use CSP as the behavior specification language for Pin. In this report, we describe only the essence of our approach. For complete details on it, see A Basis for Composition Language CL [Ivers 02]. We first treat the specification of component behavior in the form of reactions and then treat the composition behavior in the form of interactions.

3.3.3.1 Component Behavior: Reactions

Components have intrinsic behavior, as defined by their implementations. We model the behavior of a component in terms of reactions. A reaction is a CSP process that relates one or more sink pins to one or more source pins, indicating how the component reacts to the stimulation of its sink pins. The general form is a process that looks something like R = s → r → R, where s is a sink pin that can be stimulated, r is a source pin that is stimulated in response to an interaction on s, and R is the CSP process that describes this pattern of behavior. Note that reactions are defined within the scope of a component, so the usual denotation of c.s, c.r, c.R and so on is superfluous. Reactions also reflect the thread structure of a component. For example, all behavior implemented by a common thread is modeled as a single reaction. This allows analyses to take into account the actual degree of threading, and potential concurrency errors, of the component implementation. A component’s behavior as a whole is specified as the CSP parallel (indicated by the || symbol) or interleaved (indicated by the ||| symbol) composition of its reactions, depending on which better models the actual interaction among the component’s threads.

14. The annotation mechanism or syntax is not described in this report.

The gist of reaction rules is depicted in Figure 8.

Figure 8: Reactions Specify Component Behavior (sink pins s1 through s4 are related to source pins r1 through r4 by three reactions, R1, R2, and R3)

In this example, there are three reactions: R1, R2, and R3. The ovals are used to illustrate which pins are related by each reaction; for example, R1 is shown as relating sink pins s1 and s2 to source pins r1 and r2. R1 represents the behavior of a single thread t that is used to process sinks s1 and s2. A definition of R1 could be R1 = (s1 → r1 → R1) □ (s2 → r2 → R1), where □ denotes external choice. Reactions allow us to specify the causal dependencies among behaviors in an assembly of components. The most basic causal dependency is the dependency chain, as illustrated in the above reaction. In fact, this is the only causal dependency we required for the latency analysis in SAS model solutions. More complex behaviors, such as coordination among reactions or changes in behavior based on accumulated state information, can also be modeled.
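As an illustration of how reactions can feed analysis (a hypothetical representation, not CCL or the SAS tools), a reaction can be recorded as a table from each sink pin to the source pins it stimulates, from which the sink-to-source dependency chains are read off:

    # Reaction R1 = (s1 -> r1 -> R1) [] (s2 -> r2 -> R1), recorded as a table:
    # keys are sink pins; values are the source pins stimulated in response.
    R1 = {"s1": ["r1"], "s2": ["r2"]}

    def dependency_chains(reaction):
        """Enumerate the basic causal dependencies (sink -> source) of a reaction."""
        return [(sink, src) for sink, sources in reaction.items() for src in sources]

    print(dependency_chains(R1))   # [('s1', 'r1'), ('s2', 'r2')]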

3.3.3.2 Assembly Behavior: Interactions

Up to this point, we have been using the k operator in an informal manner. Although we never made the claim, the reader might infer that k has some sort of algebraic properties and associated composition semantics. A simple algebraic model for k might be ⟨R, { k_a^n • a^n ∈ E }⟩, where R denotes reactions and, in place of the single operator k, we have a set of operators k_a^n, one for each (nth) interaction scheme a^n defined for an environment E, where the operators k_a^n may be of arbitrary arity.

So, for example, from Figure 7, the 1:N interaction a0(c0.r kº {c1.s, c2.s}) denotes the syntactic composition of three components c0, c1, and c2, in assembly a0, on pins c0.r, c1.s, and c2.s. The interaction scheme for this composition is º, as defined in environment type E0. We cannot tell whether kº is a binary (unicast) or n-ary (broadcast) operator. This and other aspects of the semantics of this interaction must be formally specified.

Such specifications are not trivial, but there are specification patterns that can simplify them. One objective of A Basis for Composition Language CL [Ivers 02] is to produce a general approach for specifying the composition semantics for arbitrary interaction schemes (connectors) for an arbitrary environment type E. We hasten to add here that end users of PECTs never see these semantic complexities, any more than users of modern programming languages see the complexity of type checking or code generators.

3.3.4 Hierarchical Assembly

So far we have described a component model that is quite flat: components can interact only with components in the same assembly and only with a single runtime environment associated with that assembly. Pin will not scale to interesting problems without introducing some form of hierarchical composition. In fact, the algebraic model ⟨R, { k_a^n • a^n ∈ E }⟩ has a hierarchical meaning, because R3 = R1 k R2 (for some arbitrary binary composition operator k and reactions R1, R2, and R3) induces a hierarchy that is rooted at R3 and composes R1 and R2.

Hierarchical assembly in Pin always involves treating an assembly as a component. That is, the assembly has an interface defined in terms of source and sink pins. The correspondence between component pins and assembly pins is established by means of assembly junctions. Two forms of junctions are defined: null junctions and gateway junctions. We note at the outset that hierarchical composition introduces many subtle complexities not generally addressed by component technology. We do not discuss these subtle issues in this report because our focus is on the controller assembly only.

3.4 Pin Subset in SAS Model Solutions

The preceding description of Pin is more general and complete than the version of Pin used in the SAS PECTs—many of the generalities were a result of our experience. The following summarizes those aspects of Pin that were actually used:

• The operator PECT used Microsoft .NET as its component runtime, while the controller PECT used Microsoft Windows 2000 augmented with support for real-time, priority-based scheduling. Differences between these runtimes were reflected in the latency theories used, but not in composition semantics.

• The assembly environments for the operator and controller PECTs were conventional text editors operating on component and assembly descriptions written in Extensible Markup Language (XML).

• The operator and controller PECTs had an analysis environment in the limited sense that latency prediction was automated, and both had validation environments based on traces of time-stamped pin activations.

• All the pin types described in Section 3.3.1 were implemented. However, components were restricted to one thread of control that was shared by all asynchronous sink pins. Further, synchronous sink pins were not threaded.

• Sink pins were annotated (in XML) with execution time. Components were annotated with a single execution priority. Callers on synchronous, mutexed sink pins would acquire the priority level of the called component.

• The notion of assemblies was implicit—although assemblies were units of deployment, they were not named entities. Environment types did not appear at all. Environment services, such as periodic timers, were implemented as environment-provided components.

• OPC gateway junctions were used to (hierarchically) compose an assembly of the operator PECT with an assembly of the controller PECT; analysis hierarchies using null junctions were not used.

• The CSP reactions were specified, on paper, for each controller component. These reactions were then used to manually construct the models used in the temporal-logic model checking, as described in Chapter 7 and Appendix C, and the ordering of interactions over 1:N connection topologies for controller latency prediction.

• There were no formal compositional semantics defined for the connectors.

3.5 SAS Software Controller in Pin

Figure 9 depicts the SAS software controller developed in Pin for this model problem. The names and functional roles played by the components in Figure 9 (i.e., CSWI, TCTR, TVTR, MMXU, XCBR, and PIOC) are defined by the IEC 61850 standard. SwMonSource and SwMonSink manage the interface to the hardware switch, Clock is a component that is (logically) bundled with the λABA latency model, and OPCGateway allows us to create an assembly of assemblies—one that allows the operator and controller assemblies to interact using the OPC protocol.

Figure 9: SAS Software Controller in Pin15 (the controller assembly connects the CLOCK environment service with the TCTR, TVTR, CSWI, MMXU, PIOC, XCBR, SwMonSource, SwMonSink, and OPCGateway components through their numbered sink and source pins)

15. Each component in this figure had at most one thread of execution that was shared among all asynchronous sink pins. Those threads are not depicted in this figure for the sake of simplicity.

4 PECT Process and Workflows

4.1 The PECT Development Process

We develop and validate a PECT using a process that consists of four fundamental phases:

1. Definition
2. Co-Refinement
3. Validation
4. Packaging

During the Definition phase, we define the functional requirements, the assembly property to analyze, and the statistical goals for the PECT. Once those goals have been set, we create a component model, an analysis model, and a measurement framework through an iterative process known as co-refinement. A PECT instance, which integrates the component and analysis models, emerges from co-refinement. During the Validation phase, we validate the PECT instance against the defined goals. If it meets the goals sufficiently, the PECT is packaged and released. If the goals are not met, co-refinement is repeated until they are. Figure 10 shows a brief overview of the PECT process. As the arrows indicate, this process is highly iterative.

Figure 10: An Overview of the PECT Development Method (precondition: an identified business need for predictable assembly and a business case for PECT; the Define, Co-Refine, Validate, and Package activities iterate, returning to co-refinement when validation results are insufficient; postcondition: a PECT)

In the following sections, we translate the PECT process concretely into (using Kruchten's terms) deliverables, workers, activities, artifacts, and workflows [Kruchten 00]. Deliverables constitute the primary output of the PECT process. Workers are those involved in performing the PECT process. Artifacts describe what is produced during the PECT process. Activities describe how the PECT process is accomplished, and workflows provide insight into when in the process they occur.

4.1.1 Deliverables

The outcome of the process is a PECT that consists of

• a component technology

• one or more analysis views and their associated interpretations

• a validation environment or other means of establishing trust in analysis and predictions

• validation data and statistical labels

The delivered PECT has the following characteristics:

• zero programming assembly: Components can be assembled into applications without additional programming. If additional code has to be provided, the code must be encapsulated in PECT-compliant components.

• automatic interpretation: To perform predictions in an execution environment where the end assemblies are unknown, the PECT must be able to interpret a constructive assembly into an analysis view automatically. The analysis view is then used to perform the desired prediction.

• objective trust: Objective trust is achieved by validating the prediction theory using a wide variety of assemblies. Statistical means can be used to calculate how many assemblies have to be executed to meet the statistical goals set for the prediction theory (a rough illustration follows this list).
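As a rough, hypothetical illustration of that last point (the statistical design actually used is described in Chapter 6 and Appendix B), a normal-approximation estimate of how many sample assemblies are needed to pin down a mean prediction error might look like this:

    import math

    def required_sample_size(sigma: float, margin: float, z: float = 1.96) -> int:
        """Assemblies needed so that the sample mean lies within +/- margin of the
        true mean with roughly 95% confidence (z = 1.96), assuming errors are
        approximately normal and sigma is a reasonable estimate of their spread."""
        return math.ceil((z * sigma / margin) ** 2)

    # e.g., 2.0 ms spread in observed prediction error, desired margin of 0.5 ms
    print(required_sample_size(sigma=2.0, margin=0.5))   # 62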

4.1.2 Workers, Activities, and Artifacts

In the Rational Unified Process (RUP), an activity is generally assigned to a specific worker and has a duration of a few hours to a few days [Kruchten 00]; we relax this stipulation somewhat and allow multiple workers to contribute to an activity, where that activity has an indeterminate duration. As with the RUP, each activity has a clear purpose, usually centered on creating or modifying artifacts—the tangible products of the PECT development method. Table 2 depicts the major artifacts of the PECT development process that are the result of workers performing activities. The fact that each artifact is assigned to just one worker should be interpreted not as an exclusive assignment, but rather as an assignment of primary responsibility. In general, multiple workers are required for most artifacts. For example, defining the requirements for a PECT is primarily the responsibility of the customer, but setting the appropriate requirements requires the contributions of the PECT designer, attribute specialist, and component model specialist.

Table 2: PECT Development Activities

Activity | Workers | Artifacts
Define functional requirements | customer | PECT requirements
Define assembly property | customer | assembly property definition
Define statistical goals | customer | normative confidence labels
Define component model | component model specialist | constructive viewtype
Define analysis model | attribute specialist | analysis viewtype
Define property theory | attribute specialist | property theory and its validation concept
Validate theory empirically | measurement specialist | experiment design, statistical labels
Develop analysis interpretations | PECT designer | component and assembly description schemas; interpretations
Develop component measurement apparatus | component developer, measurement specialist | test components (synthetic, actual); testbench for component and assembly measurement
Develop assembly measurement apparatus | measurement specialist | assembly generator and analysis tools
Create validation sample | component developer, application assembler | any additional assembly components; sample assembly set
Obtain component measurements | measurement specialist | component measurements (labels)
Obtain assembly measurements | measurement specialist | assembly measurements and analysis (labels)
Deploy PECT | PECT designer | packaged PECT

Below, we describe workers whose roles are not implicitly clear by their names alone:

• component model specialist: defines the constructive viewtype used in the PECT. This person may or may not be the designer of a proprietary component technology. Either way, if the component technology has already been defined, this specialist will have in-depth knowledge about it.

• attribute specialist: defines and helps to validate both the property theory and the analysis model. This person must know the theories behind the chosen attribute (emergent property) that the PECT is engineered to predict. For example, if the attribute is latency, the attribute specialist would likely have in-depth knowledge about real-time performance and how to analyze for performance.

• measurement specialist: develops both the component and assembly measurement apparatus and obtains the component and assembly measurements. This person must know how to measure and empirically validate the property theories in the PECT and would typically have broad knowledge of statistical methods.

• PECT designer: unifies the constructive and analysis models through co-refinement (as manifested in the analysis interfaces) and then packages and deploys the PECT. This person holds the holistic view of the PECT and communicates with the other workers to keep the various parts of the PECT development synchronized.

In addition to those workers actually involved in the PECT process, two other roles are key to its successful execution:

• system specialist: has in-depth knowledge about the execution environment that usually affects the behavior of the components residing in the PECT. For example, the PECT may need to be designed to restrict the use of the operating system to meet the set goals.

• domain expert: knows how the PECT will be used from the end customer's perspective and provides input about existing standards, methods, and user profiles. This role is important, because it helps to ensure that the PECT's design will fit the system's intended usage.

4.2 Workflows

The workflows in this section are elaborations of the PECT overview workflow shown in Figure 10 on page 26.

4.2.1 Definition Phase

The workflow shown in Figure 11 is an elaboration of the Define process shown in Figure 10. The Definition phase has two parallel paths—one to define the functional requirements and another to gather the necessary requirements for the property of interest and set the PECT's prediction goals.

Figure 11: Definition Phase Workflow (precondition: an identified business need for predictable assembly; Define Functional Requirements proceeds in parallel with Define Assembly Property followed by Define Statistical Goals; postcondition: identified property and tolerance requirements)

Statistical goals should be based on the business need. Because producing a PECT takes effort, it should be only as general as necessary to fill that need.

4.2.2 Co-Refinement Phase

During the Co-Refinement phase shown in Figure 12, we define the constructive and analysis viewtypes, and the interpretation that integrates them. We also define a plausible and possibly high-level scheme for validating predictions based on the analysis viewtype. The workflow depicted suggests a process that is somewhat more structured than what occurs in actual practice. The constructive and analysis viewtypes are depicted as being developed in parallel, but they are, in fact, mutually constraining, as explained and illustrated in Chapter 5. In our experience, considering validation places useful constraints on the development of the analysis viewtype, so we placed "Define Validation Concept" at the same level of activity in the workflow as "Define Constructive Viewtype" and "Define Analysis Viewtype." A similar argument can be made for placing "Define Interpretation" alongside "Define Validation Concept," although our experience suggests that defining the interpretation is usually the closing step in a co-refinement iteration.

Figure 12: Co-Refinement Workflow (precondition: identified property and tolerance requirements; Define Constructive Viewtype, Define Analysis Viewtype, and Define Validation Concept lead to Define Interpretation, iterating while the result is not tractable, general, and interpretable; postcondition: analysis viewtype, interpretation, and validation concepts)

The iteration within the co-refinement workflow must not be confused with the iteration that results from the empirical validation shown in Figure 13. Within co-refinement, the condition for iteration is an assessment of whether the resulting constructive assembly is sufficiently general to handle an anticipated range of assemblies and whether the property theory underlying the analysis viewtype will, with reasonable computation and interpretation effort, yield predictions for all these assemblies.

4.2.3 Validation Phase

After co-refinement, the resulting PECT is validated against its requirements. The definition workflow shown in Figure 11 identifies functional requirements and tolerance requirements. Functional requirements define the range of assemblies that must be “spanned” by the constructive model and its interpretations; the tolerance requirements refer to the desired quality of predictions based on the analysis viewtypes supported by the PECT. The workflow shown in Figure 13 is biased toward the validation of analysis viewtypes whose underlying property theories are empirical (i.e., based in measurement) rather than formal (i.e., based in logic). As this workflow is currently defined, it addresses the validation of both functional requirements and statistical goals.

Figure 13: Empirical Validation Workflow (precondition: an analysis view and interpretation, plus statistical norms for predictions; Define Validation Goal, Define Validation Process, Collect Validation Data, and Analyze Results are performed in sequence; postcondition: validation data and labels)

There are four general aspects to carrying out an empirical validation study:

1. Define the validation goal. The goal is a normative or informative statement about the predictive power of the PECT. Typically, a goal is stated in terms of the probability that a prediction will lie within some accuracy bounds, with some stated confidence level (a rough illustrative check follows this list).

2. Define the validation process. Validating the accuracy of a prediction is analogous to validating a scientific theory. That is, behaviors are observed systematically under controlled circumstances, and this observed behavior is then compared to predicted behavior. The key word is systematic, since validation results should be, above all else, objective and hence repeatable.

3. Collect validation data. This step involves conducting the actual validation experiment. Important in this activity are good laboratory techniques—for example, keeping good notes and records so that anomalies can be studied and results can be repeated.

4. Analyze the results. The purpose of the analyses is twofold: first, to objectively and reliably describe the predictive powers of a PECT; second, to provide analysis data to support additional co-refinement, should normative goals fail to be satisfied.

This workflow is expanded considerably in Chapter 6.
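A purely illustrative sketch of the kind of check implied by the first aspect above (hypothetical data and function names, not the actual SAS validation apparatus): it computes the fraction of predictions whose relative error falls within a stated bound.

    def within_bound(predicted: float, observed: float, rel_bound: float = 0.10) -> bool:
        """True if the prediction is within +/- rel_bound of the observed value."""
        return abs(predicted - observed) <= rel_bound * abs(observed)

    def fraction_meeting_goal(pairs, rel_bound: float = 0.10) -> float:
        """pairs: (predicted, observed) assembly-property measurements."""
        hits = sum(1 for p, o in pairs if within_bound(p, o, rel_bound))
        return hits / len(pairs)

    # hypothetical latency predictions vs. measurements, in milliseconds
    sample = [(10.2, 10.0), (5.1, 5.4), (20.0, 19.5)]
    print(fraction_meeting_goal(sample))   # 1.0, i.e., all within 10%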

4.2.4 Packaging Phase

The workflow for packaging is a lacuna at this time. The objectives of packaging are twofold. One objective is to ready the technology for deployment, including the development of installation support, documentation, and so forth. A second and more fundamental objective is to design automation support for the PECT to minimize the property-theory-specific expertise required by PECT users to make effective use of analysis viewtypes supported by the PECT. This area requires further work.

5 Co-Refinement of the Controller PECT

5.1 What Is Co-Refinement?

PECT creation is a process of iterative negotiation between the constructive and analysis points of view. We call this process co-refinement. Co-refinement does not necessarily follow a rigorously defined process, but rather it is channeled by a set of forces that guide co-refinement as the two points of view (models) converge into a PECT. The constructive point of view pushes for assembly generality; the more assemblies that can be represented in the constructive model, the better. The analysis point of view pushes for predictability, which implies constraining the constructive model to adhere to the assumptions of the property theories that enable analysis and prediction. These forces tend to act in opposition to each other.

Practicability, tractability, and interpretability also weigh in as important forces and act as moderators. Practicability pushes for a level of generality that is necessary (but no more than necessary) to represent some class of realistic system; it acts as a bound on generality. Tractability is concerned with the level of difficulty in solving the analysis model; it acts as a bound on predictability. For example, it might not be viable to require a supercomputer to solve the analysis model. Interpretability ensures that automatic translation from the constructive model to the analysis model is possible. This possibility implies the ability to formally specify both models and to translate the constructive representation into the analysis representation. Validatability is concerned with the level of difficulty in obtaining empirical validation of the analysis model.

5.2 How Co-Refinement Proceeds

The co-refinement process starts with an initial, not necessarily formal, description of the “languages” for the constructive and analysis models. The elements of the constructive model language are influenced by the PECT’s target domains and how systems in those domains will be structured, developed, deployed, and sustained (evolved) over time. The elements of the analysis model language are simply a subset of some property theory that is suitable for analyzing the behavior of interest.


Initially, each model is independently subjected to its own starting criteria. The constructive model16 has to be expressive enough to describe the types of components and interactions expected in the domains that will use the PECT. However, initially, the constructive model might not be interpretable vis-a-vis the analysis model, and the analysis model might not be general enough to account for all aspects of the constructive model. Given a starting point for the constructive and analysis models, the forces guide co-refinement as the initial models converge into final models that define the PECT. While there is no precise set of steps for determining the sequence of intermediaries from the initial models to the final models, there are some rules of thumb.

Generality increases monotonically. After the first iteration, the PECT designer should have reasonable confidence that the constructive model is interpretable and that the analysis model is tractable. Therefore, the dominant force in co-refinement at this point is moving the constructive model to the desired state of generality. The only reason to reduce generality would be if the designer's confidence in interpretability proved to be unwarranted.

Interpretability increases monotonically. While co-refinement might not start with interpretability, it must end with it. This leads us to believe that interpretability must monotonically increase as co-refinement proceeds. This means that, during co-refinement, the languages describing the constructive and analysis models, and the interpretation rules, become more formal. In addition, a translator for the constructive model is developed iteratively as co-refinement proceeds.

Validatability is an invariant. Validatability is a defining characteristic of a PECT. Unlike generality and interpretability, however, it is important that all analysis models be validatable, at least in principle, for each iteration of co-refinement. By in principle, we mean that there must be a plausible means for obtaining or validating component properties, and for falsifying predictions made in the analysis model. While plausibility is not an objective quality, it is one that has a reasonable intuitive meaning.

Using these rules, co-refinement drives for increasing generality in the constructive model until all the following stopping conditions are met:

• The practicability boundary is reached—striving for more generality is not worth it.

• The interpretability boundary is reached—an automatic translation becomes impossible.

• The tractability boundary is reached—because of its complexity, solving the analysis model is too costly.

This iterative scheme was depicted in Figure 12 on page 30.

16. The terms constructive model and constructive model language are used interchangeably in this report.

5.3 Key Ideas for the Controller PECT

Performance is important in the intended domain for this PECT—power station control. Sensor inputs are gathered at various rates, inputs are coalesced, and actions are taken depending on the input values. Computations must adhere to both worst-case and average-case latency requirements. Typically, systems are multiprocessing on a single processor or on multiprocessors, connections can be both synchronous and asynchronous, and some data stores must be updated atomically.

The property theory for the controller PECT is based on rate monotonic analysis (RMA) [Klein 93]. RMA is well suited for predicting worst-case latency for a collection of processes on a single processor in a mostly (but not solely) deterministic situation; events occur mostly periodically, and process execution times are bounded and do not vary dramatically. RMA requires information such as process periods, execution times, and priorities; whether mutually exclusive access to shared resources is needed; whether there are intervals of nonpreemptability; and whether each process has a unique priority. In general, RMA strives to account for any factor that can contribute to latency—the time interval from when an event occurs until it has been completely processed.

5.4 Co-Refinement of the Controller PECT

This section describes the starting state (initial condition) for each model and then walks through the five iterations of the co-refinement process, leading to λABA, the property theory whose validation is the subject of Appendix B. For each iteration, constraints on the constructive model and model semantics (i.e., what is predicted) are described. The five iterations are

• predicting worst-case latency (λW)

• predicting average-case latency (λA)

• introducing mutual exclusion and worst-case blocking (λWB)

• predicting average-case blocking (λAB)

• introducing asynchrony (λABA)

It should be noted that this is an a posteriori examination of a co-refinement experience. A priori, you would not expect to know the specifics of each iteration or how many iterations are required.

5.4.1 Initial Conditions

The initial constructive model was relatively simple. It consisted of a collection of components, where each component had a set of source and sink pins. Since concurrency was known to be important, each component could have either no threads or one thread associated with it.17 Since some components had to be activated periodically, there was a special source pin that was denoted as periodic. Naturally, these components were threaded. Sink pins could be either reentrant or non-reentrant. Both synchronous and asynchronous connections were allowed.

The analysis model is based on a collection of real-time analysis techniques known as RMA for predicting worst-case latency. Latency is the measure of how long it takes to service an event. For example, we consider reaching a voltage sensor's read time as the occurrence of an event. Servicing the event comprises all the computations that are necessary to respond to the event, such as acquiring input from the sensor, converting the input to a number with some physical meaning, filtering noise, combining the input with other data, testing the data against some important physical conditions, and then issuing some output. These computations take place on a single processor that is also servicing other events. Latency is how long it takes from the voltage sensor's read time (not necessarily when the sensor is actually read) to the completion of the response to the event. The initial analysis model used a subset of the modeling capabilities of RMA including

• considering only periodic events, that is, those that occur at regular intervals

• each periodic event is allocated its own process (or thread)

• each process is assumed to have a constant, unique priority

• the variability of the execution time for any given component is bounded

• the worst-case latency of all events is less than or equal to the period

• only reentrant, synchronous connections are allowed

• no blocking is allowed (sources of blocking include critical sections and nonpreemptable sections)

5.4.2 First Iteration: Worst-Case Latency (λW)

Tractability and interpretability were the dominant forces during this iteration. They caused us to both extend the constructive model with the annotations necessary to provide the analysis model with the needed information and to constrain the constructive model to conform to the restrictions of the initial conditions of the analysis model. The annotations included

• specifying execution times as properties

• specifying unique priorities for each thread. This turned out to be a significant restriction, because Windows imposed severe limitations on the number of priority levels.

17. This has since been generalized. See Section 3.3.1.

Modifications to the constructive model involved mainly the thread model. Since periodic invocations were important, we decided to extend the constructive model with an explicit notion of time using an environment component, the clock component. This extension eliminated the need for a periodic source pin. Also, since the analysis model was not considering blocking during this iteration, we disallowed threaded components, because they would have resulted in non-reentrant components and hence blocking. The constructive model was also constrained to allow only synchronous calls, conforming to this restriction of the analysis model. With these restrictions to the analysis model, the worst-case latency for each process could be calculated using the following fixed-point calculation:

L_{n+1} = \sum_{j=1}^{i-1} \left\lceil \frac{L_n}{T_j} \right\rceil C_j + C_i

This formula can be used to compute the worst-case latency Li for the ith process. Ci is the execution time of the ith process; Ti is the period of the ith process. Processes 1 through i-1 are assumed to be all the processes whose priorities are higher than that of process i. The iterative calculation starts by using Ci as the first guess for the worst-case latency by setting L1 to Ci. Then it computes Ln+1 using the formula above. It continues until Ln = Ln+1, at which point the fixed point has been reached, and Ln+1 is the worst-case latency.

The interpretation from the constructive model to this analysis model (the above formula) is straightforward. Execution time (associated with sink pins), periods (associated with clock components), and priority (associated with threads) all map directly to the formula. The only complication occurred for the case in which a clock component initiates a sequence of component calls, as shown in Figure 14.

Figure 14: Clock Component Initiating a Sequence of Interactions (in assembly a0:E0, the CLOCK service stimulates component C1, which leads to a chain of calls through C2 and C3)

In this case, the period of the clock corresponds directly to the period of process i in the analysis model, the priority of component ci corresponds to the priority of process i in the analysis model, and the sum of the execution times of components c1, c2, c3, and so on, corresponds to the execution time of process i.
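A minimal sketch of this fixed-point calculation for the mapping just described (illustrative only; hypothetical names, and it assumes the ceiling form of the standard RMA recurrence and a schedulable process set so that the iteration terminates):

    import math

    def worst_case_latency(i: int, C: list, T: list) -> float:
        """Worst-case latency of process i (0-based), given execution times C and
        periods T for processes 0..i ordered from highest to lowest priority.
        Assumes the process set is schedulable; otherwise the loop would not converge."""
        L = C[i]                       # first guess: the process's own execution time
        while True:
            # preemption by all higher priority processes during an interval of length L
            L_next = sum(math.ceil(L / T[j]) * C[j] for j in range(i)) + C[i]
            if L_next == L:            # fixed point reached
                return L
            L = L_next

    # hypothetical processes: clock periods and summed component execution times (ms)
    C = [2, 3, 5]      # execution times, highest priority first
    T = [10, 20, 50]   # periods
    print(worst_case_latency(2, C, T))   # 10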


At this point, we did not yet have an automatic translation from the constructive to the analysis model, but we did have a high degree of confidence that such a translation could be created.18

5.4.3 Second Iteration: Average Latency (λA)

For the second iteration, practicability started to exert influence, mainly in terms of the need of the constructive model to satisfy average-case latency requirements in addition to worst-case latency requirements. Another facet of practicability influencing this iteration was the realization that the zero-phasing19 assumption of the analysis model was not realizable in the constructive model. Therefore, during this iteration, the constructive model did not change; the analysis model evolved to predict average-case latency and to handle non-zero-phasing.

Zero-phasing is significant because a key idea of RMA is that worst-case latency occurs for process i when it becomes ready to execute at exactly the same time that all higher priority processes do. The fixed-point formula shown in the previous section computes worst-case latency assuming zero-phasing. If the constructive model does not in fact adhere to zero-phasing, the prediction might produce a result that is too pessimistic. Accounting for non-zero-phasing requires only minor changes to the fixed-point formula shown in the previous section. These changes were made during this iteration.

A more substantive change to the analysis model was required to calculate average latency. The critical observation was that the execution profile of process i repeats itself. The interval of repetition (known as the hyper-period) is equal to the least common multiple (LCM) of T1, T2, ..., Ti. Every NPi periods, process i's performance behavior repeats, where NPi = LCM(T1, T2, ..., Ti)/Ti. Therefore, the latency for the first instance of process i should be the same as the latency for the (1+NPi)th instance. To calculate the average latency for process i, we simply compute the average of the NPi instances of process i during a hyper-period. The fixed-point calculation above had to be generalized to work across the hyper-period instead of just stopping after the first instance of the job.

At this point, we performed a "spot validation" of the interpretation and analysis model. That is, we constructed a small experiment, consisting of a small sample (fewer than five simple assemblies), to see if we were predicting average latency accurately. The results were encouraging.
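A small sketch of the hyper-period bookkeeping described above (illustrative only; hypothetical names; requires Python 3.9+ for math.lcm):

    import math
    from functools import reduce

    def hyper_period(periods):
        """Least common multiple of the periods T1..Ti."""
        return reduce(math.lcm, periods)

    def instances_in_hyper_period(periods, i):
        """NPi = LCM(T1, ..., Ti) / Ti: the number of instances of process i whose
        latencies are averaged to obtain its average-case latency."""
        return hyper_period(periods[: i + 1]) // periods[i]

    T = [10, 20, 50]                        # hypothetical periods in milliseconds
    print(hyper_period(T))                  # 100
    print(instances_in_hyper_period(T, 2))  # 100 / 50 = 2 instances per hyper-period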

18. Although rewrite rules were possible, we chose not to implement them, because the generality of this approach to interpretation was not clear.

19. We say two or more events are zero-phased in time when they happen at the same moment (i.e., there is no time delay between them). In the context of the controller PECT, zero-phasing means that process i will become ready to execute at the exact moment all higher priority processes become ready to execute (the critical instant).

5.4.4 Third Iteration: Worst-Case Latency + Blocking (λWB)

During this iteration, practicability continued to be a dominant influence. The controller domain required non-reentrant execution within components to ensure data consistency. During the first two iterations, the only influences on process latency (in theory) were its own execution time and preemption from higher priority processes. With non-reentrancy, a higher priority process might have to wait while a lower priority process is executing a non-reentrant component. This effect is known as blocking. It prolongs latency and must be accounted for in the analysis model.

To account for blocking in the analysis model, a blocking term is simply added to the expression on the right side of the worst-case latency equation shown on page 39. However, the real issue is how to control the magnitude of this term. To do so, we impose a restriction on the constructive model that is based on another key result of RMA. One of the synchronization protocols discovered by RMA is the priority ceiling protocol. That protocol exhibits an important property known as the blocked-at-most-once property, which says that a process that executes in several critical sections can be blocked in, at most, one critical section.

The priority ceiling protocol requires that when more than one component interacts with a non-reentrant sink pin, the thread for that sink pin must execute at a priority level that is at least as high as the priority of the effective thread of any of the calling components. This priority level is known as the ceiling priority. Note that for this priority assignment to emulate the priority ceiling protocol, the thread that is assigned this priority cannot suspend for any reason, including input/output (I/O).

The prediction obtained using the priority ceiling is a worst-case prediction with respect to blocking, because a blocking term is used for every job in the hyper-period, regardless of whether the job is actually blocked during its execution. Assemblies must honor the priority ceiling to be well formed with respect to λWB. Since the constructive model has no concept of priority, it is the interpretation rather than the constructive model that enforces this topological constraint.
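Illustratively (this is the standard RMA form and not necessarily the exact expression used in the controller PECT), adding a blocking term B_i for process i to the fixed-point recurrence of Section 5.4.2 gives an expression of the form

    L_{n+1} = B_i + \sum_{j=1}^{i-1} \left\lceil \frac{L_n}{T_j} \right\rceil C_j + C_i

where, by the blocked-at-most-once property, B_i can be bounded by a single critical section under the priority ceiling restriction described above.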

5.4.5 Fourth Iteration: Average Latency + Blocking (λAB)

The goal of this iteration is to calculate average latency in the presence of blocking, taking into account the possibility that blocking will occur only for a portion of the jobs in a hyper-period. The initial challenge for this iteration was deriving a variation of the fixed-point calculation that could also determine when blocking did and did not occur. To make this determination, the notion of a subtask was introduced. A subtask is a portion of a component’s computation distinguished by its priority; different subtasks have different priorities.


At this point, a purely analytic approach became unwieldy—in fact, nearly intractable. However, the analysis model yielded to a hybrid simulation-based approach [Schwetman 78]. In a hybrid simulation approach, an analysis model is used to compute values that define initial or test conditions in a simulator. Details of the hybrid simulation are provided in Appendix A. At this stage, a formal interpretation was defined. Although we knew that the average-case latency theories were, in principle, validatable, it was by no means clear that the earlier spot validation would scale from λA to λAB. The PECT was not yet sufficiently general, since it did not accommodate assemblies with asynchronous pins. This led to the final iteration.

5.4.6

Fifth Iteration: Asynchronous Interactions (λABA)

The previous iterations had all involved adding new terms and features to the analysis model, and introducing or removing constraints on the constructive model to accommodate those additions. At this point, it was clear that the final and necessary step of the co-refinement process would be to relax the prohibition against asynchronous pins. It was unclear how this constraint could be relaxed, or how much its relaxation would affect the analysis model or simulation. Our great fear was that it would result in some kind of fundamental discontinuity, effectively demonstrating that the basic RMA approach was insufficient, or that it would vastly complicate the simulator (loss of tractability).

Our fears proved to be unfounded, however. We made the fortuitous discovery that we could, through the application of rewrite rules, transform a constructive assembly with an arbitrary topology of asynchronous interactions into a behaviorally equivalent assembly (from the perspective of latency computation) with a more restrictive topology, comparable to one containing synchronous pins only. These restrictions simplified the interpretation and preserved the hybrid simulation model. Thus, we were able to introduce asynchrony to the latency model strictly through rewriting and interpretation.


6

Empirical Validation

6.1

What Is Empirical Validation?

There are two classes of property theory: those based in logic and those based in measurement. Logical property theories, such as those found in model checking and other forms of program verification, involve demonstrative reasoning: the validity of a property is mathematically demonstrated, or proven, in a way that bars contradiction. Empirical property theories, such as λABA, involve plausible reasoning: the validity of a property theory is established inductively, based on observations.

The fact that empirical theories are based on observations and measurement introduces uncertainty. The property theory itself may abstract aspects of an implementation that influence the assembly-level property. This, in turn, introduces variability and some degree of non-determinacy, for we cannot give a systematic accounting of an abstracted system aspect without undoing the effect of its abstraction.

Empirical validation is the means by which we validate empirical property theories. The validation does not yield a simple “yes” or “no.” Instead, it produces statistical evidence of a property theory’s effectiveness and of the component property measures on which the validation’s predictions are based. Before describing the workflow for empirical validation in Section 6.3, we provide a brief overview of the statistical models used to express statistical evidence.

6.2

Introduction to Statistical Labels

Rigorous observation takes the form of measurement, and measurement invariably introduces error. Thus, measured component properties are described not as discrete values, but as probability density functions over a range of values. As was the case with component properties, measured assembly properties are described using probability density functions over a range of values.

Likewise, determining the effectiveness of a property theory also involves measurement, this time in a more classical application of the scientific method. The property theory is a hypothesis; it is tested by comparing assembly properties predicted by the hypothesized property theory with observed assembly properties.

We refer to the statistical descriptors of component properties and property theory effectiveness as statistical labels. Our position is also that all empirical properties and property theories can be described in a uniform way, using the same, or very similar, statistical models, resulting in standard labels.

6.2.1

Component Labels

The measurement and subsequent description of empirical component properties do not introduce novel methodological challenges. Objective measurement requires a measurement object (the component), a measurement scale for the component property of interest (e.g., time, in seconds), and a measurement apparatus (e.g., the Microsoft Windows’ high-performance clock). Measurement is conducted using the apparatus within some controlled environment and with control exerted over all independent variables of the dependent (measured) property.

Component Property Estimators

In general, the measurand Y, or property of interest, is a function of N values: Y = f(X1, X2, …, XN) [Taylor 94]. For latency, these values include execution time, blocking time, and period. We would like to know the true value of Y, component latency. Of course, the true value is not obtainable, as the following definition makes clear:

True Value: the mean (µ) that would result from an infinite number of measurements of the same measurand carried out under repeatable conditions, assuming no systematic error.

Because we cannot, even in principle, know the true value of µ, we must use an estimator for it, one that is produced by statistical methods. For example, we take sample observations of X and use their average x̄ as the estimator of µ, a population parameter. The uncertainty associated with this estimator is expressed as the deviation s such that the true—and unknown—value of µ will fall within some interval x̄ ± ks with some specified confidence. The factor k is known as the coverage factor. When k=1, x̄ ± ks yields a 68% confidence interval (one standard deviation). That is, we have 0.68 confidence that this interval contains µ. Typically, we compute the 0.95 confidence interval (k=2), which yields higher confidence but a larger bound. This confidence interval expresses a fundamental component measure; it is fundamental because x̄ ± ks is how a component property is modeled in any property theory parameterized by that component property.
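As a concrete illustration of this fundamental label, the following Python sketch computes x̄ ± ks from a set of latency observations. The sample values are hypothetical, and treating s as the standard uncertainty of the mean (standard deviation divided by the square root of the sample size) is an assumption about the convention intended here:

import statistics

def component_label(samples, k=2):
    # Estimate a measured property as (xbar, k*s), the fundamental label.
    n = len(samples)
    xbar = statistics.mean(samples)
    s = statistics.stdev(samples) / n ** 0.5   # standard uncertainty of the mean
    return xbar, k * s

# Hypothetical latency observations, in milliseconds
latencies = [49.2, 50.1, 50.8, 49.7, 50.3, 51.0, 49.9, 50.4]
xbar, half_width = component_label(latencies, k=2)
print(f"latency = {xbar:.2f} ms +/- {half_width:.2f} ms (k=2)")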

Component Property Intervals

It might be useful to have other descriptors, or labels, for component properties. One such label is the tolerance interval. For example, in the case of component latency, the consumer might want to know which latency interval ±x ms contains a specified proportion p of all executions of component c. For instance, a tolerance interval with p=0.95 might state that there is a 0.95 probability that any given component’s execution will have a latency of 50ms ± 17ms, where ±17ms is the computed tolerance interval.

Another potentially useful label is the conceptual, but not mathematical,20 inverse of the above tolerance interval. This label is the confidence interval on the probability of satisfying some property specification. Conceptually, this label is the inverse of the tolerance interval, since we specify the latency interval here and use it to compute the probability that any particular execution will fall within this interval. For example, we might specify the latency interval ±5ms and use it to calculate that there is a 0.64 probability that any given component’s execution will fall within the interval 50ms ± 5ms.
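A minimal sketch of this second label, using simple counting over measured executions (the data and the specified interval are hypothetical, and the report's actual computation may be parametric rather than this direct empirical estimate):

from math import sqrt

def prob_within(samples, center, half_width, z=1.96):
    # Estimate the probability that an execution lands in center +/- half_width,
    # with a normal-approximation 95% confidence margin on that proportion.
    hits = sum(1 for x in samples if abs(x - center) <= half_width)
    n = len(samples)
    p_hat = hits / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin

latencies = [49.2, 50.1, 55.8, 44.7, 50.3, 51.0, 49.9, 50.4, 47.2, 52.9]
p, m = prob_within(latencies, center=50.0, half_width=5.0)
print(f"P(latency within 50 +/- 5 ms) ~= {p:.2f} +/- {m:.2f}")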

6.2.2

Property Theory Labels

Measuring and describing the effectiveness of property theories is methodologically more challenging than measuring component properties. As noted earlier, at the limit, this measurement process reduces to theory falsification as it is practiced in the empirical sciences; in that sense, at least, no new ground is broken. Our concern is with characterizing (labeling) the quality of a property theory. Such a characterization translates into labeling the accuracy of the theory’s predictions and determining how often they are accurate.

One-Tail Inferential Intervals

We use inferential statistical models to characterize how effective a property theory is likely to be for future predictions. For this purpose, we use confidence and tolerance intervals. For property theories, we are usually interested in the MRE between predicted and observed values. For latency, this equation is used:

MRE = |a.λ − a.λ′| / a.λ′

where a.λ′ is the measured assembly latency and a.λ is the predicted latency.21

A normative confidence interval will describe the probability that the MRE for a particular prediction will lie within a specified MRE interval. Notice that the lower bound for an MRE interval represents situations where predictions are better than the mean MRE of the statistical sample (i.e., in our experiment, for N=30 assemblies). While we might want to know how frequently our predictions are better than the mean, we might be concerned only when the predictions are worse than the mean. In this case, we use a one-tail interval, or bound, instead of a two-tail interval. The one-tail tolerance interval for λABA is summarized in Table 3.

20. Different equations are used to compute conceptual and mathematical intervals.

21. We note in passing that a.λ′ is described using the fundamental label for components. That is, for statistical analysis, the assembly is treated as a component, and the estimator for assembly latency is described by an interval obtained using a coverage factor k=2.

Table 3: Distribution-Free Tolerance Interval for λABA

Part of Interval | Description
N = 156 | sample size
γ = 0.9929 | confidence level
p = 0.80 | population proportion
µMRE = 0.0051 | mean MRE
UB = 0.01 | upper bound on MRE

The tolerance interval for λABA can be interpreted as saying that 80% of latency predictions will not exceed an MRE of roughly 1%, and that the manner in which that interval is constructed yields greater than 99% confidence that this upper bound is correct. As with measures of component properties, one- and two-tail normative confidence intervals can be computed by specifying interval bounds and computing p, the probability that any particular prediction will satisfy these specified bounds.
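One standard way to construct such a distribution-free (nonparametric) bound is to take an order statistic of the observed MREs and compute, from the binomial distribution, the confidence that at least a proportion p of the population lies at or below it. The sketch below shows that construction with synthetic stand-in data; it is offered as an illustration of the idea, not as the exact procedure used to produce Table 3 (see Appendix B for those details):

import numpy as np
from scipy.stats import binom

def one_sided_tolerance_bounds(values, p=0.80):
    # For each order statistic r, return (r, bound, confidence), where confidence
    # is P(Binomial(n, p) <= r - 1): the chance that at least a proportion p of
    # the population lies at or below the r-th smallest observation.
    xs = sorted(values)
    n = len(xs)
    return [(r, xs[r - 1], binom.cdf(r - 1, n, p)) for r in range(1, n + 1)]

rng = np.random.default_rng(0)
mre_samples = rng.uniform(0.0, 0.02, size=156)   # synthetic stand-in for observed MREs
for r, bound, conf in one_sided_tolerance_bounds(mre_samples, p=0.80):
    if conf >= 0.99:                             # smallest bound meeting the confidence goal
        print(f"r={r}: MRE upper bound = {bound:.4f} with confidence {conf:.4f}")
        break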

Linear Correlation

Linear correlation is a descriptive statistic: it describes the strength of correlation between two data sets and is not directly useful for drawing inferences about future data sets. A consumer might be interested in linear correlation analysis as a descriptor of previous experimental validations of a property theory’s accuracy. We characterize that accuracy using linear correlation analysis, which allows us to assess the strength of the linear relation between two variables—in our case, predicted and observed assembly latency. The result of such analysis is the coefficient of determination, 0 ≤ R2 ≤ 1, where 0 represents no relation and 1 represents a perfect linear relation. In a perfect prediction model, predicted and observed latency would be identical; therefore, the goal for the model builder is a linear relation. For λABA, a distribution-free linear correlation was used (Spearman rank correlation). The resulting R2 = 0.998; the associated significance test indicates there is less than a 0.001 probability that the reported correlation could have been achieved by chance.
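Both descriptors are easy to compute once predicted and observed assembly latencies are available; the sketch below uses placeholder values rather than data from the experiment:

import numpy as np
from scipy.stats import spearmanr

predicted = np.array([10.2, 20.5, 31.1, 40.8, 52.0])   # hypothetical a.λ values
observed = np.array([10.0, 20.7, 30.9, 41.0, 51.5])    # hypothetical a.λ′ values

# Magnitude of relative error for each prediction
mre = np.abs(predicted - observed) / observed

# Spearman rank correlation between predicted and observed latencies;
# squaring the coefficient gives a coefficient of determination
rho, p_value = spearmanr(predicted, observed)
print(f"mean MRE = {mre.mean():.4f}, R^2 = {rho ** 2:.4f}, p = {p_value:.3g}")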

6.3

Workflow for Empirical Validation

The workflow in Figure 15 expands the first two steps of the four-step validation process illustrated in Figure 13 on page 32. Figure 13’s workflow is reproduced in the left portion of Figure 15; the first two steps of Figure 13 have been replaced by their expansions, resulting in a six-step workflow: define the validation goal, define measures, define the sampling procedure, develop the measurement infrastructure, collect validation data, and analyze results.

Figure 15: Expanded Workflow for Empirical Validation

In the remainder of this section, we describe the six steps shown in Figure 15.

6.3.1

Define Validation Goal

The overall objective of empirical validation is to assign a statistical label to a property theory that describes the effectiveness of that theory and to design the means of producing trusted component labels in support of that theory. Two types of goals yielding different types of statistical labels are possible: normative and informative. Normative goals express a prediction requirement that has to be met. For example, predictions might be required to be, on average, accurate to within 0.5%. An informative goal is free of norms; it is up to the validator to describe the effectiveness of predictions. For example, the validator might compute the upper bound of a confidence interval for 60, 70, 80, and 90% of predictions, assuming varying levels of confidence.

Part of defining the validation outcome involves selecting the statistical models used for labels. As discussed in Section 6.2.2, we have used the MRE as a basis for the empirical validation of property theories and two forms of intervals for descriptive and inferential labels. Those choices are not a necessary consequence of the initial “customer” objective; rather, they are the measurement specialist’s choices of the best labels for a particular validation effort. As we learn more about empirical validation, we anticipate a broadened repertoire of statistical models.

For λABA, the design problem was to develop a controller PECT that would predict latency with an upper bound of a tolerance interval for MRE ≤ 0.05, for a population p = 0.80, with a confidence level γ = 0.99. Thus, the empirical validation for λABA was dominated by normative goals.

6.3.2

Define Measures

It is not always obvious what exactly must be measured to validate a property theory, or which measurement units are best to express those measures. Indeed, frequently there are several possible measurement units, depending on whether a direct or indirect approach is desired, or on the degree of accuracy or stability required. A good discussion of measures and measurements can be found in the seminal work of Fenton and Pfleeger [Fenton 97].

From λABA, we have identified the property of interest—latency—and defined it as the total time a task requires to perform its work. The constituents of this total time—execution time and blocking time—also had to be expressed in terms of the Pin component model and measurement units. We also had to know whether there was a technical basis for measurement at all, which, surprisingly perhaps, was not trivial.

Translating the notion of time to Pin was conceptually straightforward. Sink and source pins provided convenient “hooks” for hanging instrumentation. Each pin can be instrumented to record the moment of its activation (e.g., when a sink pin receives a request or when a source pin sends a message or makes a request) and the moment of its deactivation (e.g., when the sink pin satisfies its request or when a source pin completes its message send or has its request satisfied). These four measurement points permit a variety of time durations to be recorded on any assembly.

Defining the appropriate units of time depends on the available instruments. For λABA, we chose to use the Microsoft high-performance clock, which records times in nanoseconds. Before making this selection, though, we had to establish that that clock could be used to measure component latency. To do so, we first had to demonstrate that the latency of a process, as recorded by the clock, was equal, within measurement tolerance, to the sum of system time and wait time for a process, as recorded by the central processing unit (CPU) clock. Once we had established this correlation, which itself required a validation activity, we were satisfied with our defined measures.
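To make the four measurement points concrete, the sketch below timestamps a sink pin's activation and deactivation using a high-resolution clock (Python's time.perf_counter_ns, which on Windows is typically backed by the same high-performance counter). The class and pin names are illustrative only and are not part of the actual Pin runtime:

import time

class InstrumentedSinkPin:
    # Illustrative wrapper that records pin activation/deactivation timestamps.

    def __init__(self, name, handler):
        self.name = name
        self.handler = handler        # the component's reaction for this pin
        self.trace = []               # (pin, activate_ns, deactivate_ns) records

    def receive(self, request):
        activate = time.perf_counter_ns()      # sink pin receives a request
        result = self.handler(request)
        deactivate = time.perf_counter_ns()    # request has been satisfied
        self.trace.append((self.name, activate, deactivate))
        return result

# A hypothetical reaction that just consumes a little CPU time
pin = InstrumentedSinkPin("opsel", handler=lambda req: sum(range(10_000)))
pin.receive("on")
name, t0, t1 = pin.trace[0]
print(f"{name}: observed latency = {(t1 - t0) / 1e6:.3f} ms")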


6.3.3

Define Sampling Procedure

Defining the validation experiment eventually requires that we select a set of components and assemblies for measurement. This can be viewed as selecting components and assemblies from a population, with one important proviso: if we expect a PECT to work for all future assemblies, we must assume that not all of its components and assemblies exist. So a question arises as to how to select from a population of nonexistent entities.

The three main approaches to selecting samples from a population are random, convenience, and judgment sampling. Random sampling is useful when samples can truly be selected randomly. Convenience sampling occurs when the sample is taken while considering some a priori knowledge. Judgment sampling occurs when the sample is selected explicitly and sufficient domain knowledge exists to judge a sample set that represents the whole population. Of the three methods, random sampling provides the best statistical data, but it might be difficult to obtain a truly random set of samples. On the other hand, convenience and judgment sampling are more practical and easier to perform, but the resulting statistical data might be biased and therefore not fully representative.

For λABA, random sampling was used to collect latency measures from a set of synthetic components—components whose functionality consists of consuming CPU resources. We used a combination of judgment and random sampling to collect assembly latency measures, and we used judgment sampling to define variation points for assemblies. Doing so created an assembly design space for constructing viable assemblies. Then, from that design space, we selected assemblies randomly. For example, in the controller PECT, no realistic assemblies have more than 50 components. Thus, that was an upper bound for one variation point in the assembly design space that we defined.

In an analytic study, we must define the design space and select a random sample from it. Our approach to validating λABA was to exploit the “pure composition” aspects of Pin to define various dimensions of variation—the number of input and output connectors, the number of instances of each type of component, and so forth—and thereby sketch the outlines of the design space. Using these variations, we generated the pseudorandom assemblies.

We have made only tentative steps in understanding how to analytically characterize the design space and draw a random sample from it. At this point, we have only two conjectures. The first is that a component model that provides well-defined and restricted rules for allowable patterns of component interaction is likely to be better suited for statistical analysis than a wholly unconstrained component model. The second conjecture is that product line settings may augment a component model’s structure-oriented rules with rules governing semantic variation (i.e., product feature variation [Clements 02a]).
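A minimal sketch of drawing pseudorandom assembly specifications from such a design space, where each dimension is a small finite set of values for one variation point; the variation points and ranges shown are illustrative, and the dimensions actually used are documented in Section B.2.3:

import random

# Hypothetical variation points defining an assembly design space
DESIGN_SPACE = {
    "num_components": range(2, 51),        # no realistic assembly exceeds 50 components
    "num_source_pins": range(1, 5),
    "num_sink_pins": range(1, 5),
    "clock_period_ms": [5, 10, 20, 50, 100],
}

def sample_assembly_spec(rng):
    # Pick one coordinate from each dimension of the design space.
    return {dim: rng.choice(list(values)) for dim, values in DESIGN_SPACE.items()}

rng = random.Random(42)
specs = [sample_assembly_spec(rng) for _ in range(30)]   # e.g., a sample of 30 assemblies
print(specs[0])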


Section B.2.3 on page 107 describes the details of the sampling procedure for both components and assemblies used in the empirical validation of the controller PECT.

6.3.4

Develop Measurement Infrastructure

As with any experimental process, a measurement infrastructure must be developed to collect data. Where objective measurements are taken, as with λABA, this infrastructure must be sufficiently well constructed to produce reliable and repeatable measurements. For λABA, we developed a shared-memory mechanism for capturing traces of pin activation and deactivation. We also developed a distributed trace mechanism based on the User Datagram Protocol (UDP) for measuring higher order SAS assemblies (i.e., an assembly of operator and controller subassemblies).

The measurement infrastructure must also be sufficiently well documented to allow the results to be repeated, possibly by disinterested third parties not associated with the original validation experiment. Just what should be documented has much to do with the property of interest. Often, it is clear which environmental factors have a direct correlation to a property, for instance, latency. Latency for a component will vary from environment to environment, based on processor speed. Other environmental characteristics could have an indirect correlation to a property—which may not be immediately obvious—for example, the compiler (or compiler flags) used to produce the component. Describing the test environment becomes increasingly important when trying to establish causal relationships between the runtime environment and observed behavior, and is especially critical when a discrepancy exists between the predictions and observations. Trying to find a correlation between such environmental characteristics and the behaviors of components and assemblies can be difficult. Altering the environment to limit or prevent unexpected behaviors is one viable approach. However, a description of the characteristics that were present during the validation may hold clues as to what was and was not considered to be a possible source of error.

For λABA, the environmental characteristics that were recorded for the validation included platform hardware characteristics (e.g., processor, bus, and memory speed), software characteristics (e.g., operating system and compiler versions), and residual processes (e.g., those background tasks running at the time of the validation). These characteristics, along with a summary of the overall measurement and analysis infrastructure developed for λABA, are discussed in Appendix B.
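Purely as an illustration of the distributed case, a pin trace record could be shipped to a collector over UDP roughly as follows; the collector address, record layout, and function name are hypothetical and do not describe the report's actual implementation, which was part of the component runtime:

import json
import socket
import time

COLLECTOR = ("192.0.2.10", 9000)    # hypothetical trace collector address

def emit_trace(component, pin, event):
    # Send one pin activation/deactivation record to the trace collector.
    record = {
        "component": component,
        "pin": pin,
        "event": event,                        # "activate" or "deactivate"
        "timestamp_ns": time.perf_counter_ns(),
    }
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(json.dumps(record).encode("utf-8"), COLLECTOR)

emit_trace("CSWI", "opsel", "activate")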

6.3.5

Collect Validation Data

This step more or less speaks for itself. The only issue that requires attention is that a realistic validation experiment will generate a large volume of data and may require significant computing resources and time to do so. All generated data should be archived so that it can be subjected to historical analysis.

6.3.6

Analyze Results

As noted earlier, the primary objective of empirical validation is to assign statistical labels to components and property theories. The computation of these labels should be straightforward, given an adequate definition of goals, measures, and sampling procedures. However, this does not mean that the only utility of empirical validation is the assignment of labels. In particular, the data generated by the validation experiment can be an invaluable resource for improving the quality of the property theory. This proved to be a source of particularly valuable lessons in the case of λABA. The data produced by validating the property theory uncovered several subtle and not so subtle inconsistencies in the PECT implementation. It is the repair of these inconsistencies that led to a second edition of this report.


7

Safety and Liveness Analysis in the Controller PECT

This report is focused on an empirical property theory (λABA) and its validation. Equally vital to PECT are formal property theories. In this chapter, we illustrate the main ideas of model checking, one approach to formally verifying safety and liveness properties of components and assemblies. In that approach, a system model is extracted from component reactions and their compositions in an assembly. Safety and liveness conditions are expressed in a modal (temporal) logic and tested by an exhaustive search of the state space described by the system model. We describe only the modeling and verification of temporal claims against a single component in the controller assembly. We defer a description of composed controller models and verifications to a follow-on report. A formal description of composition in Pin is provided by Ivers and associates [Ivers 02].

7.1

Model Checking

Model checking is a formal method for verifying finite-state concurrent systems. It is based on algorithms for system verification that operate on a system model usually expressed in some form of state-transition representation. This system model is then checked against a set of desired properties expressed in a modal (temporal) logic. Model checking is a desirable verification approach, because it is automated and exhaustive. State-transition representations, in essence finite state machines, are intuitive to the average engineer. Temporal logics, such as computational tree logic (CTL) and linear temporal logic (LTL), are not as intuitive, but can be mastered with practice [Prior 62], [Pnueli 85]. Verification of the system is accomplished by “checking” the model—exhaustively searching the state space described by the system model and, for each state, checking whether the desired properties have been satisfied.


Characteristics of system models that favor model checking over other behavioral verification techniques include

• ongoing I/O behavior (in so-called “reactive” systems)

• concurrency (not a single flow of control)

• control intensive (minimal data manipulation)

The controller assembly has each of the above characteristics, indicating that model checking is likely to be a good fit. The next section addresses how we apply model checking in the context of a PECT.

7.2

Process

The goal of our analysis is to determine whether the assembly of components used to implement the controller satisfies various safety and liveness properties. Therefore, we begin by producing a model of the software components used to implement the controller. In this example, we focus on the CSWI component of the SEI switch controller (see Figure 9 on page 23).

First, we build a Pin model of the CSWI component, basing the model on the implemented component, rather than its requirements. This is an important distinction; we care about what the implemented component actually does, not what it should do. Essentially, we want to ensure a strong correspondence between the model and the component implementation. Extraction of the model from the implementation is one way to accomplish this, even when performed manually, as in this example. (Alternatively, we could have built the model and generated the implementation; however, since we already had the latter, that approach was not useful.)

Next, we use a model checker to analyze the CSWI model. However, the Pin model uses a formal language—CSP [Hoare 85]—that is not understood by the model checker used in this exercise—NuSMV [Cimatti 00]. Consequently, we must define an interpretation of Pin to NuSMV. In this exercise, the interpretation is performed manually, although there is no obvious reason why it cannot be automated.

Once the CSWI model is expressed in a form that is understood by the model checker, we formalize the safety and liveness properties we want to analyze and use the model checker to determine whether the model satisfies them. Any failure that is uncovered by the tool must then be analyzed. Most failures include a counterexample—an execution trace in which the desired property is not satisfied.


A failure could mean one of three things:

1. There is an error in the CSWI component model. If the counterexample is not a possible trace of the implemented component, the formal component model must be corrected.

2. There is an error in the formalization of the safety or liveness property. If the counterexample does not violate the desired property we meant to check, the property has not been formalized correctly.

3. The model does not satisfy the property. If the counterexample is a possible trace of the implemented component and fails to satisfy the desired property, we have a real error, and the component implementation must be repaired.

Any type of failure can mean that we iterate back through the steps of this process. Modeling and analysis are often iterative processes that conclude only when a statement can be made regarding whether the component satisfies its required safety and liveness properties. The steps of this process are elaborated in the following sections.

7.3

Building the Model

We assume that a Pin model of an assembly is available, such as the one shown in Figure 9 on page 23. As noted, the model checker we used, NuSMV, does not accept input in the formal notation used by Pin. Consequently, we translate our Pin model into a formal notation that the model checker will accept. The steps to translating the model include (corresponding to Sections 7.3.1-7.3.4, respectively)

1. Determine the scope of problem analysis.

2. Produce a Pin model of the problem.

3. Translate the Pin model into a state machine.

4. Translate the state machine into NuSMV.

These steps are elaborated in the following sections.

7.3.1

Determining the Scope of Problem Analysis

The context diagram in Figure 16 shows the controller assembly in Pin and highlights the region of that assembly used for our verification. The CSWI component contains the logic for interfacing to the operator and carrying out the selection and operation protocol with the XCBR component. The XCBR component controls the signals sent to the physical switch, and contains protection logic (via PIOC) and the logic needed to accept commands (i.e., select, open, and close) from the operator (via CSWI).


Figure 16: Controller System Diagram

As shown in Figure 16, the operator has two inputs to the controller—opsel and oppos—that indicate requests to select or position the switch. The CSWI component can send two outputs to the XCBR component—sbosel and sbopos—the effect of which is to select or position the switch. Before continuing, CSWI always waits until it receives a reply from XCBR indicating that the task has been performed. XCBR has two outputs to the physical switch where a selection or position change actually occurs—swsel and swpos. Two inputs from the physical switch—sosel and sopos—provide acknowledgement feedback to XCBR, through the respective sbo channels, and ultimately to the operator (perhaps in the form of indicator lights, etc.). In this discussion, we focus on modeling the behavior of the controller assembly’s CSWI component.

7.3.2

Producing a Pin Model of the Problem

CSWI has two sink pins—opsel and oppos—each of which is a synchronous mutex pin. Additionally, each sink pin is threaded, and the two sink pins are handled by the same thread. CSWI also has two synchronous source pins—sbosel and sbopos.


Because CSWI only has one thread of control, it is formally modeled in CSP as a single reaction. This specification is shown in Figure 17. Note that the leading underscore character (_) in event names indicates the initiation of an interaction on a pin of that name; the absence of a leading underscore indicates the completion of an interaction.

CSWI = _opsel.on  -> _sbosel!on  -> sbosel -> opsel -> Selected
    [] _opsel.off -> _sbosel!off -> sbosel -> opsel -> CSWI

Selected = _opsel.on  -> _sbosel!on  -> sbosel -> opsel -> Selected
        [] _opsel.off -> _sbosel!off -> sbosel -> opsel -> CSWI
        [] _oppos?x   -> _sbopos!x   -> sbopos -> _sbosel!off -> sbosel -> oppos -> CSWI

Figure 17: CSWI Pin Specification

7.3.3

Translating the Pin Model into a State Machine

While the Pin model of CSWI is based on CSP, NuSMV (our target model checker) requires input in the form of a finite state machine. As shown in Figure 18, we translated the Pin model to a simple state machine as an intermediate representation. The transitions in that diagram correspond to CSP events in the Pin model. The names associated with these transitions, and subsequently in the NuSMV model, are based on the names of CSP events in the Pin model. For example, the Pin model defines the operator selection event as _opsel, with a data parameter of either on or off, written as _opsel.on or _opsel.off, respectively. To preserve traceability to event names in the Pin model, the state machine modifies this convention slightly by changing _opsel.on to _opsel_on or _opsel.off to _opsel_off. This transformation is necessitated by the syntax of NuSMV, which uses the period (.) operator for a different purpose. Consequently, periods (as well as exclamation points [!] and question marks [?]) were changed to underscores.


Figure 18: CSWI State Machine

Seven states compose CSWI: waiting, selecting, selected, opening, closing, deselecting, and unselecting. The states and major transitions are described below.

• waiting state: CSWI starts in the waiting state and transitions to another state only when an _opsel_on or _opsel_off event occurs. An _opsel_on event, for example, indicates that the operator is selecting a switch and results in a transition to the selecting state. As part of this transition, CSWI initiates an _sbosel_on event. The “event1/event2” convention indicates that an occurrence of event1 not only triggers a transition, but also results in the generation of event2.22 This convention is how we reflect the behavior found in the Pin reaction specification.

• selecting state: CSWI remains in the selecting state until it receives an acknowledgement that the selection has been accomplished: an sbosel output is received, and an opsel input is generated. CSWI then transitions to the selected state.

• selected state: CSWI waits in the selected state until one of the following events occurs:
  - An _opsel_off event is received, indicating the operator action of deselecting the switch and causing a transition to the unselecting state.
  - An _oppos_open event is received, indicating the operator action of requesting to open the switch and causing a transition to the opening state.
  - An _oppos_close event is received, indicating the operator action of requesting to close the switch and causing a transition to the closing state.
  - An _opsel_on event is received, indicating the operator action of selecting the switch and causing a transition back to the selecting state.

• opening state: If CSWI is in the opening state and an sbopos output is received, CSWI transitions to the deselecting state and generates an _sbosel_off event to deselect the switch, since the positioning is complete. If the acknowledgement is not received, the system continues to wait in the opening state.

• closing state: If CSWI is in the closing state and an sbopos output is received, CSWI transitions to the deselecting state and generates an _sbosel_off event to deselect the switch, since the positioning is complete. If the acknowledgement is not received, the system continues to wait in the closing state.

• deselecting state: If CSWI is in the deselecting state and an sbosel output is received, CSWI transitions to the waiting state and generates an oppos input.

• unselecting state: If CSWI is in the unselecting state and an sbosel output is received, CSWI transitions to the waiting state and generates an opsel input.

22. This follows the convention of Harel’s statecharts, which are reflected in UML [Rumbaugh 00], where “event2” is called “action.”

Although they are similar, the deselecting and unselecting states are both needed; the select and operate actions require different acknowledgements to be generated, even though the transition to each is caused by the same thing—an sbosel output.

7.3.4

Translating the State Machine into NuSMV

This section discusses the translation of the model from the state machine represented in Figure 18 to the NuSMV textual model shown in Appendix C. It was done in a straightforward manner following the steps listed below. Note that since the Pin model indicates that CSWI has a single thread of control, there is no need to use multiple processes in the NuSMV model.

1. The transitions in the state machine (e.g., _opsel_on and _oppos_open) appear in the NuSMV model as symbolic names associated with the input variable, which can be assigned any of the input values shown in the state diagram. The input variable can also have the value “none,” indicating that no inputs currently exist. The input variable is declared as an IVAR,23 directing NuSMV to consider all combinations of input sequences when analyzing claims.

2. The outputs in the state machine (e.g., _sbosel_on and _sbopos_open) appear in the NuSMV model as symbolic names associated with the output variable, which is declared as a VAR.24 As with input, output can have the value “none,” indicating that no outputs currently exist. The output variable is changed only within the NuSMV model. The other VAR declaration is the state variable, which takes on the state names from the state machine (e.g., waiting and selecting).

3. There is a case expression relating the behavior of the output variable at the next state to the current values of the state and input variables. For example, the first rule states that if the current state is waiting and the current input is _opsel_on, the next value of the output is _sbosel_on. Hence, each output is delayed by a single tick from the condition causing it. The last rule in the case expression (1: none) indicates that if none of the other rules in the case expression apply (e.g., if the state is waiting and the input is _oppos_open), the output variable will default to “none.”

4. The other case expression relates the changes in the state variable to the current state and the input variable. For example, if the state is waiting and the input is _opsel_on, the next state will become selecting. The last rule (1: state) indicates that if none of the other rules apply, the next state remains equal to the current state.

23. IVAR is a keyword from the NuSMV tool. It’s short for “input variable.”

24. VAR is a keyword from the NuSMV tool. It’s short for “variable.”
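To make the case analysis concrete without reproducing the NuSMV syntax here (the complete model appears in Appendix C), the same next-state and next-output rules can be written as a lookup table keyed by (state, input). The entries below are transcribed from the state machine description above; this Python rendering is illustrative only, and note that in the NuSMV model each output appears one tick after its triggering condition, whereas here it is returned immediately:

# (current state, input) -> (next state, output generated by the transition)
CSWI_TRANSITIONS = {
    ("waiting",     "_opsel_on"):    ("selecting",   "_sbosel_on"),
    ("waiting",     "_opsel_off"):   ("unselecting", "_sbosel_off"),
    ("selecting",   "sbosel"):       ("selected",    "opsel"),
    ("selected",    "_opsel_on"):    ("selecting",   "_sbosel_on"),
    ("selected",    "_opsel_off"):   ("unselecting", "_sbosel_off"),
    ("selected",    "_oppos_open"):  ("opening",     "_sbopos_open"),
    ("selected",    "_oppos_close"): ("closing",     "_sbopos_close"),
    ("opening",     "sbopos"):       ("deselecting", "_sbosel_off"),
    ("closing",     "sbopos"):       ("deselecting", "_sbosel_off"),
    ("deselecting", "sbosel"):       ("waiting",     "oppos"),
    ("unselecting", "sbosel"):       ("waiting",     "opsel"),
}

def step(state, inp):
    # Unmatched (state, input) pairs leave the state unchanged and produce no
    # output, mirroring the "1: ..." default rules in the NuSMV case expressions.
    return CSWI_TRANSITIONS.get((state, inp), (state, "none"))

state = "waiting"
for inp in ["_opsel_on", "sbosel", "_oppos_open", "sbopos", "sbosel"]:
    state, output = step(state, inp)
    print(inp, "->", state, output)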

7.4

Analyzing the Model

Once we have a model of CSWI in a form that NuSMV understands, we can apply model checking to determine whether the model (and, therefore, the component’s implementation) satisfies various system properties. For example, this type of analysis can determine whether deadlock is possible. In this section, we begin by discussing different types of system properties that we can analyze using model checking. We then introduce temporal logic, a formal language commonly used to express system properties. Finally, we show examples of properties expressed in temporal logic that we used in analyzing the CSWI model.

7.4.1

Types of System Properties

The types of system properties evaluated by model checkers are typically classified as safety or liveness properties. Although other taxonomies have been proposed by Naumovich and Clarke [Naumovich 00], we continue to use the safety and liveness classification. Intuitively, a safety property specifies that “bad things” cannot happen, and a liveness property specifies that “good things” eventually happen [Lamport 77].


To illustrate these points, consider a software application in which two or more concurrent processes must access the same critical section of code. A condition that must hold is mutual exclusion, which means that two (or more) concurrent processes cannot enter and execute a common critical section of code simultaneously. This condition can be specified as the safety property “two processes cannot be in the same critical section at the same time.” Another condition that is usually desirable in concurrent processes is starvation freedom, which means that eventually each process will enter the critical section. This can be specified as the liveness property “whenever process P1 wants to enter the critical section, it will eventually do so.”
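In CTL, whose operators are introduced below, these two conditions could be written, for example, as the following claims, where crit1 and crit2 are hypothetical propositions meaning that P1 or P2 is in the critical section and want1 means that P1 has requested entry:

AG ¬(crit1 ∧ crit2)        (mutual exclusion)
AG (want1 → AF crit1)      (starvation freedom for P1)

These standard formulations are shown only to anchor the notation introduced next.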

7.4.2

Temporal Logic

Temporal logic is frequently used to formally define safety and liveness properties. Two well-known logics used in verification are LTL and CTL. In LTL, time is viewed as a linear sequence of time instants, with each time instant having exactly one successor. The semantic intuition of LTL expressions (where a and b are primitive propositions) is as follows:

- □a
  This is read as “henceforth (or always) a” and means that statement a is true now and in all future time instants.

- ◊a
  This is read as “eventually a” and means that a is true either now or at some time instant in the future.

- ○a
  This is read as “in the next state or at the next time instant, a will be true.”

- a U b
  This is read as “a is true until b becomes true.”

In CTL, time is viewed as a branching tree of time instants, each with one or more possible successors. A path is a particular linear sequence of time instants (a path in the tree) in which each successive time instant is one of the possible successors of the current time instant. The semantic intuition of CTL expressions (where p and q are primitive propositions) is as follows:

- AG p
  Along all paths, p holds globally (henceforth).

- EG p
  At least one path exists where p holds globally.

- AF p
  Along all paths, p holds at some time instant in the future (eventually).

- EF p
  A path exists where p holds at some time instant in the future.

- AX p
  Along all paths, p holds in the next time instant.

- A[p U q]
  Along all paths, p holds until q holds.

- E[p U q]
  A path exists where p holds until q holds.

Both LTL and CTL include the usual a ∧ b, a ∨ b, ¬a, a → b logic expressions. While LTL expressions appear to be a subset of CTL, their expressive power differs. It is possible to express properties in LTL that cannot be expressed in CTL, and vice versa. The NuSMV model checker allows claims to be expressed in either CTL or LTL; however, CTL was used as the primary specification language in this experiment.

7.4.3

CSWI Claims

The desired system properties analyzed with respect to the CSWI model include safety and liveness properties. The specific behavioral characteristics of each property are often first expressed in a natural English language format and then translated into CTL (or LTL). These temporal logic expressions are often called claims. Examples of claims for CSWI are discussed in the following sections.

7.4.3.1

Example Liveness Claim

A number of liveness claims were constructed for CSWI, the most obvious of which investigate its operational correctness. Those claims follow the pattern of “if an operator action occurs, a signal is conveyed to the appropriate entity.” In this example, the entity would be XCBR. A specific desired liveness property is “if the operator selects the switch, the ‘select before operate’ signal is sent to the XCBR.” Translating this into the appropriate CTL expression results in

AG ((state = waiting) ∧ (input = _opsel_on) -> AX (output = _sbosel_on))

The above claim is read as “for all paths globally, whenever CSWI is waiting and the operator select comes on, the sbo select signal will come on.”


7.4.3.2

Example Safety Claim

Safety properties are logical expressions of conditions that should never occur, presumably because the resultant action could endanger a person or harm equipment. One of the operational characteristics of the controller assembly is the enforcement of the “select before operate” protocol. This can be rephrased as the desired safety property “under human control, the switch cannot be operated unless it is first selected.” Translating this into the appropriate CTL expression results in

¬E [¬(output = _sbosel_on) U (output = _sbopos_open)]

The above statement is read as “it is impossible for the switch to be opened (_sbopos_open) before it has been selected (_sbosel_on).”

7.4.3.3

Example Failed Claim

The CSWI model does not satisfy the following safety claim:

AG (input = _opsel_on -> AG output = none)

This is actually a good thing, because we don’t want the safety claim to be true. The claim asserts that whenever the operator selects the switch, nothing will ever happen (i.e., no output will be generated). When we run this claim in NuSMV, it confirms that the model does not satisfy the claim, and it also presents a counterexample showing a particular trace in which the claim is not satisfied. The counterexample is shown in Figure 19.

-- specification AG (input = _opsel_on -> AG output = none)
-- is false
-- as demonstrated by the following execution sequence
State 1.1:
  input = none
  output = none
  state = waiting
State 1.2:
  input = _opsel_on
State 1.3:
  input = _oppos_close
  output = _sbosel_on
  state = selecting

Figure 19: NuSMV Counterexample


The counterexample represents three time instants (1.1, 1.2, and 1.3), which have the following meanings:

• 1.1: CSWI begins in the initial state, waiting, and there is no input or output.

• 1.2: An input of _opsel_on is received, indicating that the operator selects the switch.

• 1.3: An output of _sbosel_on is generated, and CSWI transitions to the selecting state.

In the final time instant, 1.3, CSWI generates an output, violating the claim that nothing will happen after an operator selects the switch. The NuSMV model and the claims verified against it are contained in Appendix C.


8

Discussion

The basic structure and concepts of a PECT had been prototyped before the SAS model problems were defined [Hissam 02]. Nonetheless, the SAS problems, although simple by the standards of production systems, were complex enough to reveal previously unappreciated, or underappreciated, subtleties of PECT. We discuss some of those subtleties, with an emphasis on those aspects of PECT that were shown to require further study. In Section 8.1, we discuss methodological results, while in Section 8.2 we briefly touch on the model solutions.

8.1

Results on Method

8.1.1

Interfaces Between Specialized Skills

One striking observation is that designing, implementing, and validating a PECT requires a combination of specialized expertise. Some indication of this can be gleaned from the different workers identified in Chapter 4. Table 4 focuses not on workers, but on their skills, and in particular those skills needed to build the SAS PECT. To some extent, these skills are coherent with the workers identified earlier. However, it is significant that we often found it necessary for a single worker to straddle several skills, suggesting, as is anticipated in the RUP, that a single person will fill many roles, perhaps simultaneously. Note that some divergence from the general set of workers and their implied skills outlined in Chapter 4 is expected, given our focus on research rather than production.

Table 4: Areas of Expertise Needed to Build SAS Operator and Controller PECTs

Area of Expertise | Use of Expertise in SAS Model Solutions
component technology | Design the Pin component model, application program interface (API), runtime environment, and deployment model.
component-based development | Implement the component technology and define and implement conformant SAS components (including their behavioral models) and their SAS assemblies.
systems programming | Implement real-time extensions to Windows 2000, observation mechanisms for validation environments, and low-level features of the component runtime environment.
measurement, validation, experimental design, statistical analysis | Define measurement scheme and statistical analysis and labeling approach for component execution time and latency property theories. Conduct component measurement and empirical validation.
performance modeling | Define the latency property theory based on the theory underlying RMA.
model checking | Develop model-checker-specific state models and define, develop, and check temporal claims of controller safety and liveness conditions.
language design and formal semantics (CSP) | Design the specification language(s) for Pin assemblies and the formal compositional semantics of connectors between component (CSP) behaviors.
electronics | Design, develop, and calibrate test hardware to simulate switch control. This custom hardware was used to empirically validate the controller latency theory.
substation automation | Define the model problems and evaluation criteria such that the PECT reflects a representative class of design and engineering problems.

As discussed in Chapter 5, co-refinement is a means by which actors possessing expertise in two domains, each with specialized conceptual schemes and vocabularies, can nonetheless negotiate mutual satisfaction of a shared concern (practicability), while ensuring satisfaction of their unique concerns (generality and tractability). Co-refinement is a negotiation process that establishes a shared conceptual interface between those actors. Note that although the interface is conceptual, it serves an analogous role to the more familiar notion of software interface: it helps to make explicit, and therefore manage, the dependencies in a system.


Figure 20: Interfaces Among PECT Development Skills

This metaphor of negotiation is apt and can be applied elsewhere in the PECT development process. Consider, for example, the interface between the skills depicted in Figure 20. Above or beneath each skill, we placed, in brackets ([]), a one-word descriptor for the PECT design concern that motivates the application of that skill. The associations are labeled with a one- or two-word descriptor of a shared concern. In Figure 20, we show a shared concern between component technology and measurement validation—“certifiable.” This concern covers issues such as whether the measured property can be validated by an independent party, whether the


measures fall within required confidence or tolerance intervals, or whether the level of systematic and random measurement error can be quantified. Measurement and validation also share a concern with the domain specialist—whether the assemblies used for empirical validation are representative of real design problems. It is unreasonable to expect a single person to possess the range of expertise reflected in Figure 20. A transitionable PECT development method must provide a clear distribution of these forms of expertise across different workers and processes that define a separation of the skills across activities.

8.1.2

Infrastructure Complexity

In his television series Cosmos, Carl Sagan observed that to make an apple pie from scratch, you must first invent the universe. Before we can assemble components and predict their aggregate properties, we must create a PECT infrastructure that includes the collection of programs exclusive of application-level (e.g., controller-level) components and their assemblies. The view of PECT depicted in Figure 5 on page 13, The Four Environments of a PECT, is a good guide to what constitutes infrastructure. Nonetheless, that structure is guilty of oversimplification. The practicality of PECT rests, to some extent, on the cost of developing and maintaining its infrastructure. We must therefore strive to understand which part of a PECT infrastructure’s complexity is inherently related to PECT, and which part arises from concerns that have already been addressed by available technology or is inherent to a problem domain such as SAS.

One approach to obtaining this understanding is to examine Figure 5 from the perspective of a software development environment (SDE). Clearly, more is required of an SDE than the assembly and analysis environments shown in Figure 5. An SDE would also include facilities for configuration and version management, and a repository of components and assemblies, for example. The components themselves will likely be implemented in a conventional programming language such as C#, C++, or Java. This suggests that code browsing, compiling, linking, and debugging facilities are also required in a PECT. Still, none of this is part of a PECT-specific infrastructure, and it is all readily available—for example, in Microsoft’s Visual Studio and Borland’s C++ development environments. What remains is to ensure that a PECT infrastructure is integrated with an existing SDE. While not trivial, this is not a substantial risk to PECT.

Another way to look at Figure 5, however, is from the perspective of a language-specific SDE (LS-SDE). A language comprises its syntax and semantics. In PECT, the syntax of the language is defined by the constructive model [Ivers 02]. However, the language used to specify (constructive) assemblies for any PECT is not necessarily simple in structure; its syntax and semantics are dependent on an open-ended set of environment types and analysis views, only some of which may be applicable in any given application.


This added complexity must be laid directly on the PECT doorstep—it cannot be waved off as a simple matter of SDE tooling. Some of the infrastructure needed to support such a semantically extensible LS-SDE can be regarded as a one-time investment—once developed, a constructive model and its infrastructure can be used repeatedly for a family of applications [van Ommering 02]. This suggests that a PECT may be more easily justified in the context of a product line. Although product lines require a substantial organizational commitment, they are justified by business considerations [Clements 02a], and PECT may in fact make product lines more effective.

On the other hand, there is not likely to be an infinite supply of useful constructive models. For example, Pin bears a strong resemblance to other component languages, notably Darwin [Magee 93], Wright [Allen 97], and the emerging UML V2.0 standard.25 There is hope, then, for convergence on an industry-standard component language.

25. UML V2.0 is currently a work in progress. See for more information about it.

8.1.3

Design Space: Rules for Inclusion and Exclusion

We have adopted the idea of a PECT design space as a basis for validating empirical property theories—in this report, latency. The term design space has a metaphorical connotation, but it is given concrete meaning in empirical validation through its definition as a set of discrete variation points. That is, we define design space as an N-dimensional Cartesian coordinate system,26 where coordinates on each dimension take values from a (small) finite set representing one possible variation at that variation point, or on that dimension. From this design space, we select the sample of assemblies to use for the statistical analysis of predicted versus observed assembly behaviors.

The coordinate system for this design space defines the set of assemblies that are included within the scope of a property theory. That is, the coordinate system defines inclusion rules for assemblies. That definition reflects a positive statement of where a property theory must be valid, in the sense that any sample taken from this design space will be representative of the class of systems to which this PECT will be applied. We want the predictions to be valid for all members of this class. There is evidently a close connection between this concept of design space and that of variation points in product line architecture (e.g., Theil and Hein’s work [Theil 02]), and in fact we use the term variation point to emphasize this connection.

In practice, however, we must also define rules for exclusion from the design space. That is, we will almost certainly want to characterize not just where we think the property theory should be valid, but also where we know it is not valid. One way to approach this is to discover the analysis limits for each variation point. The analysis limits of a variation point define the greatest lower bound and/or the least upper bound for the set of values defined for each variation point.27 More simply, each variation point can be tested to destruction to find its analysis limits. This is analogous to the approach taken by Gorton and Liu [Gorton 02]; however, their measurement object is a large-scale commercial off-the-shelf (COTS) component (an Enterprise JavaBeans server) and not the property theory used to predict performance.

In general, there has been insufficient work in software engineering research regarding the empirical validation of design theories. PECT provides a conceptual vocabulary and technical means for making much needed progress in that area.

26. To be more precise, the coordinate system under discussion defines sufficient conditions for the assemblies of a class of products; the necessary conditions are defined by the constructive model and its interpretation under the property theory being validated. For example, the interpretation of the controller latency theory λABA restricts assembly topologies to only those that satisfy the priority ceiling constraint. All assemblies must satisfy these purely syntactic, and therefore necessary, conditions.

27. We assume that the set of values for each variation point is partially ordered. This assumption seems reasonable if there is a correspondence between the variation point and some property theory parameter. Only those variation points will be of interest when testing that theory’s failure.

8.1.4 Confidence Intervals for Formal Theories

If there has been insufficient work in the empirical validation of design theories based on observation (and, therefore, measurement), there has been almost no work in the empirical validation of theories based on formal logic. (Barnett and Schulte's work is an exception [Barnett 01].) Indeed, the two notions seem to be mutually exclusive, which may account for the lack of attention to this subject. Nonetheless, assertions about safety and liveness properties of assemblies (as discussed in Chapter 7) are made with respect to a model of the components' behaviors and their interactions. We can have 100% confidence that a claim is satisfied or falsified by a model. However, there are two more substantial questions:

1. Can we have objective confidence that the model corresponds to reality?
2. Can we have confidence that the model is sufficient to justify the claims?

The answer to question 1 depends, to some extent, on how the correspondence (the hypothesized "satisfies" relation) between the model and reality is established in the first place. It might be established through automated (implying formal) translation. Two forms of automated translation are possible: generating components from models (e.g., Sharygina's work [Sharygina 02]) and generating models from components (e.g., Havelund and Pressburger's work [Havelund 00]). The first form is common in practice, while the second is more of a research topic. In both cases, our confidence in the satisfaction claim is proportional, if not equal, to our confidence in the correctness of the translator implementation. This does not lead to infinite regress, however, since we have an empirical means of establishing this confidence through the testing of the translator. If, however, the correspondence between the model and component is manual, there is no basis for objective confidence in the hypothesized "satisfies" relation. The implication is that we must use automation to generate models and/or components in our approach to PECT.

The answer to question 2 is more subtle and rests on equivalence relations over models. Asking whether a model satisfies an implementation (we now use this term in place of reality) is just a more specific form of the question of whether two models are equivalent for some purpose. Sometimes, but by no means always, the more concrete model is said to refine, or be a refinement of, the more abstract model; the equivalence relation is then called refinement. A property proven to hold for a model will also hold for its refinements.28 However, there are many different ways of defining refinement. Roscoe defines a hierarchy of three forms of refinement for the CSP process algebra—trace, trace failures, and failures-divergence refinement [Roscoe 98]; Davies (citing work by Reed [Reed 88]) describes a lattice of nine refinement equivalences for CSP [Davies 93]. Not to be outdone, van Glabbeek [van Glabbeek 01] identifies 13 equivalence relations, not all based on CSP traces. In each case, different notions of equivalence can be used to justify different claims.

We can draw two inferences from the discussion of question 2. First, we should leave the definition of equivalence relations over formal models—and the theorem provers based on them—to the experts; we should apply the fruits of their efforts in PECT. Second, a PECT infrastructure must be flexible enough to accommodate a variety of formal theories, each of which provides a notion of semantic equivalence sufficient to make the kinds of specification claims required.

28. This is actually too strong a statement; not all properties are preserved under refinement—deadlock freedom, for example.

8.2 Results on Model Solutions

Three model problems were posed: one leading to an operator-level PECT, one to a controller-level PECT, and one to a higher order PECT for assemblies of assemblies. In the final analysis, only the operator and controller PECTs were developed. While a higher order operator/controller assembly was demonstrated, no property theory was developed for higher order assemblies. The operator-level PECT was developed without incident. However, it became apparent during its development that the operator interface was quite simple—in fact, only reentrant sink pins were required. Moreover, operator display latency is completely dominated in any higher order assembly—the time required to interact with the display and perform data validation on operator input is minor (often less than a single scheduling quantum), compared with network and controller latency.


Nonetheless, we developed a simple latency property theory for the operator PECT that proved to be, under the highly constrained constructive model used, highly accurate. A summary of the property theory’s performance is shown in Figure 21. We do not summarize the results of this performance as a confidence or tolerance interval, since the simplicity of the PECT led us to focus our validation efforts elsewhere.

[Figure 21 is a chart of predicted versus measured latency (in msec, 0 to 1000) as the number of components varies over 1, 5, 10, 20, 50, and 100. The plotted series are "Predicted start -> end," "C1 to End, after," and "C1 to C1, after."]

Figure 21: Results of Spot Validation of Operator PECT Latency Theory

The controller-level PECT presented considerably more methodological and technological challenges, most of which have been discussed already. To reiterate the main results, λABA produces stable and accurate predictions—a 99% confidence interval for an upper bound of 1% MRE for 8 out of 10 predictions. A full report on the empirical validation of λABA is provided in Appendix B.

The λABA validation exercise was more interesting for its methodological implications than for its validation of a specific latency model. The initial validation data revealed a significant quotient of systematic and random error. Several experiments were required to stabilize the model, each of which exposed different sources of experimental error. The first three served to remove errors in the measurement infrastructure itself—the instrumentation was developed in tandem with the validation work and was therefore itself a work in progress. The last two runs served to remove inconsistencies between the model and the controller runtime implementation. In other words, the empirical validation exposed an inconsistency between model assumptions and the runtime implementation; moreover, the raw data produced by the validation allowed us to quickly identify the sources of error.
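Here, MRE denotes the magnitude of relative error of a prediction, conventionally computed as |measured − predicted| / measured; an upper bound of 1% MRE therefore means that a prediction falls within 1% of the corresponding measured latency. Appendix B gives the definition actually used in the validation.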


9 Next Steps

As noted in the introduction to this report, the objective for this work was exploration—of the technical concepts of PECT and of the methods for developing and validating a PECT. As with any exploration, a body of knowledge was generated that must now be consolidated. Improved understanding leads to new, and more challenging, questions that must be explored. In this chapter, we summarize the most important areas of consolidation and exploration. Motivating these priorities is our plan to extend the work reported here during the next year to accommodate more rigorous feasibility requirements and, working with our industry partner, ABB, to incorporate feasibility requirements drawn from the domains of substation automation and industrial robotics.

9.1 PECT Infrastructure

The infrastructure of a PECT is only a means to an end—predictable assembly from certifiable components. However, it is a necessary means, and our ability to “scale up” the PECT approach requires some up-front investment in that infrastructure. Some of this investment has already occurred as a by-product of producing SAS model solutions and their PECTs; more needs to be done.

9.1.1 Measurement and Validation Environment

A substantial suite of tools was developed to measure component execution time, automatically generate assembly samples, interpret those assemblies to instantiate and operate on analysis views, capture traces of assembly execution, and perform statistical analysis of the timed execution traces versus their predicted traces. Most of the tools in this tool suite were used for both the operator and controller PECTs, suggesting that those tools may well form the core of a common measurement and validation environment for future PECT development.

The future work required is consolidative in nature; it involves integrating, refining, and ruggedizing the existing tool set. Integration will achieve the end-to-end automation that is required to generate the large data sets necessary to empirically validate a substantial design space. This automation will require more attention to the interfaces defined between tools and to data interchange syntax and semantics. A good foundation for data interchange has been established by using XML to externalize component assemblies, but this externalization must be made consistent with an evolving constructive model and its specification language.


9.1.2 Core Pin Language and Assembly Environment

One of the tenets of the PECT approach is that complexity can be "packaged" and hidden from end consumers. Still, there will be some minimum, irreducible complexity exposed by each analysis view. If this complexity is to be manageable in the aggregate, we must strive to eliminate as many forms of unnecessary complexity as possible. To (mis)appropriate a now famous political aphorism, "it's the packaging, stupid!"

The process used in the SAS model solutions for creating and deploying assemblies of components was tedious and error prone. Component properties and assemblies of components were encoded directly in XML, and although the connector "wiring" was generated automatically from XML, there is no disguising the awkwardness of the representation and process. A visual composition environment (e.g., see the work of Plakosh and associates—specifically Chapter 5 [Plakosh 99]) is a form of packaging that can dramatically simplify, and make more pleasurable, the process of attributing component behavior and of developing, analyzing, and deploying assemblies. As discussed in Section 8.1.2 on page 67, a (visual) PECT assembly environment is a form of LS-SDE—in this case, one that emphasizes pure composition.

The future work needed in this area is a mix of exploration and consolidation; it requires a definition of a core language for the Pin component model (see the work of Ivers and associates for a start at this [Ivers 02]) and the development of a visual environment to support it (a small matter of programming). By core, we mean a graphical and textual syntax (including a syntax for reaction rules) and syntax-directed externalization.

9.1.3 Real-Time Component Runtime Environment

One pleasant surprise in developing the SAS model solutions was that the component runtime environments for the operator and controller were rather simple to develop. Only the controller environment posed a challenge, due to the artificial circumstance of our building a priority-based real-time PECT on an operating system (Windows 2000) with very limited support for priority-based scheduling. Even so, the operator and controller component runtime environments emerged largely from a restriction of component freedom to only a limited set of environment (runtime) interfaces.

Still, for the next stage of PECT development, more care must be taken when selecting an underlying operating system and specifying the environment types for hierarchical assemblies (see Section 3.3.2 on page 17)—the latter supports distributed assemblies. A definition of these environment types will also be essential to the compositional semantics of environment-specific connectors added to the core Pin component model.


9.1.4 Model Checking (Analysis) Environments

The SAS model problems were focused primarily on theories of latency. For PECT to be successful, it must be applicable to other behavioral theories. Compositional theories of reliability for components and assemblies are also of interest to the software industry. Empirical theories for compositional reliability are possible, but generalized, falsifiable theories have proven to be elusive. See the work of Mason [Mason 02] and Stafford and McGregor [Stafford 02] for an entree to this topic.

Formal proofs of correctness have traditionally been thought of only as a means of obtaining reliability, not for predicting it. However, as discussed in Section 8.1.4 on page 69, there may be an empirical dimension to formal theories of correctness after all. Complete proofs of correctness remain problematic, but within recent years, substantial progress has been made in improving the practicality of technologies for so-called partial proofs of correctness. In particular, proofs based on the exhaustive model checking of specific safety and liveness properties specified in a temporal logic are becoming almost mainstream. For recent developments in model checking of control systems, see Sharygina's work [Sharygina 02]; for details on process algebra and model checking of concurrent Java programs, see the work of Magee and Kramer [Magee 99]; and for a more theoretical perspective on model checking, see the work of Clarke and associates [Clarke 99]. These results and those related to them are finding their way into practice. For example, Microsoft is using compositional verification to certify third-party device drivers [Ball 02].29

Our principal motivation for using the CSP process algebra to specify models of component behavior (i.e., reactions) is that models of assembly behavior can be composed for analysis automatically. Our concrete next step is to develop back ends to generate lower level model representations for use in formal model checkers such as SMV30 [Cimatti 00], Failures-Divergence Refinement [Gardiner 00], and SPIN [Holzman 97], probably in that order. Although this is not trivial, it is fairly simple. More complexity is involved in the question of compositional reduction and abstraction, and other techniques to minimize the state space of composed assembly behaviors. Our approach is to defer as many of these details to the back-end analysis tool as is practical and to introduce such techniques in the Pin infrastructure only as needed.

29. For more information, see .

30. SMV is a symbolic model-checking tool developed at Carnegie Mellon University. For more information about it, see .

9.2 PECT Method

This report has outlined in a broad way the processes involved in developing and validating a PECT. The general area of certification is ripe for consolidation into a PECT validation guide and further exploration into the certification locale (i.e., the environment where the certification will be done). The next steps in these areas are outlined below.


9.2.1 PECT Empirical Validation Guide

How is a validation experiment constructed? How are measurements collected, and how is systematic and random error quantified? How is systematic error removed? And how is the remaining error propagated? How many samples are required to achieve the desired statistical confidence? Which form of confidence interval is most appropriate? How is the assembly population defined, and how is the sample (or frame) of that population selected? What inferences can be drawn given this sampling procedure? What statistical tools can be used to analyze unexpected results?

Most of these questions are reflected in the workflows discussed for empirical validation in Chapter 6 and illustrated in detail in Appendix B. However, our workflows are descriptive rather than prescriptive—they describe what steps should be performed, but do not, in most cases, specify how. Any prescriptions provided are quite general. As a result, the skills needed to validate a PECT are not well contained within the current method; a more prescriptive guide for laboratory validation work is a necessary consolidation effort. The objective of a prescriptive guide would be to describe the validation process and identify (if not describe) the experimental and statistical foundations necessary for conducting a sound validation. Elements of such a guide are outlined by Moreno and associates [Moreno 02], but many more elements are needed.
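As an illustration of the kind of prescription such a guide would contain, the following sketch computes a two-sided confidence interval for the mean magnitude of relative error from a sample of validation runs, using a normal approximation. It is illustrative only: the MRE values are hypothetical, and the choice of interval form (normal, t-based, or tolerance interval) and the required sample size are exactly the questions the guide would have to answer.

import java.util.Arrays;

// Sketch: a normal-approximation confidence interval for mean MRE.
// The z value of 1.96 corresponds to roughly 95% confidence.
public class MreConfidenceInterval {
    public static void main(String[] args) {
        // Hypothetical MRE values (|measured - predicted| / measured) from 10 runs.
        double[] mre = {0.004, 0.007, 0.012, 0.006, 0.009, 0.005, 0.011, 0.008, 0.010, 0.006};

        double mean = Arrays.stream(mre).average().orElse(0.0);
        double variance = Arrays.stream(mre)
                .map(x -> (x - mean) * (x - mean))
                .sum() / (mre.length - 1);
        double stdErr = Math.sqrt(variance / mre.length);

        double z = 1.96; // ~95% two-sided, normal approximation
        System.out.printf("mean MRE = %.4f, 95%% CI = [%.4f, %.4f]%n",
                mean, mean - z * stdErr, mean + z * stdErr);
    }
}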

9.2.2 Certification Locale and Foundations for Trust

Our work so far has focused on establishing component measures in a laboratory setting. We have not yet addressed issues of how these measures can be established independently. Of equal importance to the ability of third parties to validate the predictive power of a PECT is their ability to assign trusted and objective properties to components. One aspect is the potential of insurance underwriting (e.g., see the work of Li and associates [Li 02]) as a basis for a weaker notion of trust, which we may call confidence. This is, however, just one of a set of aspects required to establish what amounts to a social network of trust. We plan to explore these aspects in the context of our next round of industry model problems.


Appendix A λABA Property Theory

An integral part of a PECT is its reasoning framework, which comprises three elements: a property theory that allows reasoning about a specific quality attribute using models of the system; an automated reasoning procedure that maps assemblies to elements of the property theory (interpretation) and produces predictions about the assemblies; and a validation procedure that is used to assess the validity of the predictions. This appendix describes the first two elements of the λABA reasoning framework; Appendix B shows how empirical evidence is used to validate the predictions. λABA can be used to predict the average latency of arbitrary execution paths through an assembly composed of concurrent periodic tasks with bounded execution times running on a single processor. The essential concepts of the λABA property theory are presented in Section A.1, and the algorithms used for prediction are discussed in Section A.2. Section A.3 describes the Pin-to-λABA interpretation, and Section A.4 gives an (abstract) syntax-directed translation scheme for this interpretation.

A.1 Essential Concepts of λABA Property Theory

A task is the unit of concurrency in the system; it executes periodically and has a fixed period. Initially, let's assume that a task has two other properties: priority and CPU time (sometimes referred to as execution time). We refer to each complete execution of a task as a job. The job latency is the amount of time it takes from the moment the task is ready to run to the moment it finishes executing.

In Figure 22, there are three tasks that are ready to run at time zero. The timeline shows six jobs for task T1, each one with a latency of one unit of time. It shows four jobs for task T2, of which the first, second, and fourth have a latency of three units of time, and the third has a latency of two units of time. Finally, the timeline shows one job for task T3 that has a latency of 14 units of time. The latency of different jobs of the same task may differ due to the phasing of tasks and their priorities. In Figure 22, for example, as a result of delays, T2 latency varies from two to three units of time. The start of a task is sometimes delayed because of a higher priority task (e.g., the start of the first job of T2 is delayed by T1). A delay can also occur when a task is already running and is preempted by a higher priority task (e.g., in its second job, T2 is preempted by T1).


[Figure 22 shows an execution timeline from time 0 to 18 for three tasks: T1 (priority 100, period 3, CPU time 1), T2 (priority 75, period 5, CPU time 2), and T3 (priority 50, period 15, CPU time 3). The timeline marks the start of each task period and distinguishes, for each job, whether the start of execution was delayed by a higher priority task and whether the execution was preempted by a higher priority task. After one hyper-period, the pattern of preemption repeats.]

Figure 22: Timeline Showing Task Phasing and Hyper-Period

Given N tasks Ti, we define the hyper-period31 (HP) as the least common multiple (LCM) of the periods of all tasks. As shown in Figure 22, the hyper-period is the amount of time until the pattern of execution for a set of tasks repeats. We only need to analyze a hyper-period, because it covers all the possible latencies that a task will exhibit. After a hyper-period, the pattern of latencies repeats, meaning that, for each task, the number of latency predictions can be computed as follows:

NP = HP / Ti.period
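A small sketch of these two computations, using the task periods from Figure 22 (3, 5, and 15), is shown below; the helper names are illustrative and not part of the λABA implementation.

// Sketch: hyper-period (LCM of all periods) and number of latency predictions per task.
public class HyperPeriodSketch {
    static long gcd(long a, long b) { return b == 0 ? a : gcd(b, a % b); }
    static long lcm(long a, long b) { return a / gcd(a, b) * b; }

    public static void main(String[] args) {
        long[] periods = {3, 5, 15}; // T1, T2, T3 from Figure 22

        long hp = 1;
        for (long p : periods) {
            hp = lcm(hp, p); // hyper-period = LCM of all task periods
        }
        System.out.println("HP = " + hp); // 15

        for (long p : periods) {
            // number of latency predictions per hyper-period for each task
            System.out.println("period " + p + ": NP = " + (hp / p)); // 5, 3, 1
        }
    }
}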

Besides the period, a task Ti has a property, Ti.offset, that indicates the arrival time of the task's first job. In Figure 22, all tasks have offset zero, but in most assemblies analyzed using λABA, tasks are ready to run their first job at different times after the first job of the first task starts. In this case, the tasks have different offsets. Period and offset are required, fixed-value properties of a task.

31. Sometimes also referred to as cycle time in the real-time literature.


Each task Ti consists of a sequence of subtasks, Ti.subtasks, that perform the actual computation. A subtask corresponds to the execution that takes place when a sink pin of a component is activated. Each subtask sj has a priority, sj.priority, and an execution time, sj.C. These concepts and their relations are depicted in Figure 23. The total execution time of a task can be calculated as the sum of the execution times of its subtasks.
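A minimal sketch of this data model, written in the Java-like style of the pseudocode figures that follow, might look like this; the class and field names mirror the properties named above but are not the report's actual implementation.

import java.util.List;

// Sketch of the λABA analytic model elements: a task is a sequence of subtasks.
class Subtask {
    int priority;   // s_j.priority
    double c;       // s_j.C, execution time

    Subtask(int priority, double c) { this.priority = priority; this.c = c; }
}

class Task {
    double period;          // T_i.period
    double offset;          // T_i.offset, arrival time of the first job
    List<Subtask> subtasks; // T_i.subtasks, executed sequentially

    Task(double period, double offset, List<Subtask> subtasks) {
        this.period = period;
        this.offset = offset;
        this.subtasks = subtasks;
    }

    // Total execution time of the task = sum of its subtasks' execution times.
    double totalExecutionTime() {
        return subtasks.stream().mapToDouble(s -> s.c).sum();
    }
}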

Figure 23: Elements of the λABA Analytic Model

The following are the major assumptions of the λABA property theory:

•	A task is the only unit of concurrency. In fact, tasks are partitioned into subtasks, but subtasks within a task run sequentially, not concurrently.

•	A subtask corresponds to the activation of a component and, because different components may have different priorities, each subtask runs at a different priority. Therefore, the same task can run at different priorities depending on which subtask of that task is running at the moment.32

•	A task can be preempted only by a higher priority task.33

•	If no higher priority task is ready, a running task runs to completion.

•	Subtasks do not yield the CPU until they complete their execution, i.e., tasks do not suspend themselves.

•	Subtasks do not block on I/O.

λABA does not address the situation where tasks can be blocked waiting on an I/O request, but it can handle blocking on a semaphore to enter a critical section. If two or more tasks share a critical section, that section is modeled as a subtask in each task. In Figure 24, the shared critical section is subtask C. Task A consists of subtasks A1, A2, C, and D; task B consists of subtasks B1, C, and D.

32. Priorities are fixed and pre-defined by the assembly developer; priorities are not assigned rate monotonically.

33. Based on Figure 23, one can argue that tasks do not have priority. Indeed, when we say that a task T is running at priority X, we implicitly mean that the subtask of T that is currently running has priority X. Also, when we say task T1 preempts task T2, it is the current subtask of T1 that is preempting the current subtask of T2.

Figure 24: Blocking Example

The priority of critical section C must be assigned according to the priority ceiling protocol.34 That is, its priority has to be higher than the highest priority of all the subtasks immediately preceding the critical section. For example, if the priority of A2 is 2 and the priority of B1 is 4, the priority of C has to be at least 5. In Section A.3, we will see how the interpretation handles priorities.

34. The use of priority ceiling has several virtues, including the "blocked at most once" property of assemblies that adhere to this restriction. Nonetheless, there are alternatives, for example allowing priority inheritance if that is supported by the runtime environment.
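A sketch of one way to assign the critical-section priority consistently with this rule and the example above follows; the helper is illustrative and is not part of the λABA interpretation described in Section A.3.

import java.util.List;

// Sketch: assign a shared critical-section subtask a priority strictly higher
// than every subtask that immediately precedes it in any task that uses it.
public class CeilingPrioritySketch {
    static int ceilingPriority(List<Integer> predecessorPriorities) {
        int highest = predecessorPriorities.stream().max(Integer::compare).orElse(0);
        return highest + 1; // e.g., predecessors with priorities 2 and 4 yield 5
    }

    public static void main(String[] args) {
        // A2 (priority 2) precedes C in task A; B1 (priority 4) precedes C in task B.
        System.out.println(ceilingPriority(List.of(2, 4))); // 5
    }
}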

A.2 λABA Predictor

As explained in Chapter 5, the initial λ* property theories were purely analytic. When we wanted to know not only how the latency of a task would be affected by blocking, but also when blocking would occur, the amount of information in the model made the formulation of a purely analytic model quite awkward. Thus, we developed a simulator that uses the tasks, subtasks, and their properties to simulate the execution of one hyper-period, keeping track of the state of all the tasks while advancing time.


Some analytic computations are used by the simulator, thus making λABA a hybrid analytic-simulation theory.35 For example, the end of the hyper-period is computed as

HPEnd = max(i=1..N)(Ti.offset) + LCM(i=1..N)(Ti.period)

Also, the arrival time of the next job of task Ti at or after time t is computed as

ai(t) = ceiling((t − Ti.offset) / Ti.period) × Ti.period + Ti.offset
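For example (with hypothetical values), a task with Ti.offset = 1 and Ti.period = 5 queried at t = 12 yields ai(12) = ceiling((12 − 1) / 5) × 5 + 1 = 3 × 5 + 1 = 16, the start of its next period at or after time 12.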

The simulation is event driven (commonly referred to as discrete-event simulation) and controlled by a main loop that advances the time from one scheduling event to the next. A scheduling event is a point in time when a task must be stopped, started, or resumed.

Figure 25 shows the pseudocode of the predict function. It takes as input a vector of tasks; each task has a period, an offset, and an ordered list of subtasks. Tasks also have two dynamic properties that are relevant when we run the prediction: state and current subtask. The state indicates whether the task is ready to run, running, or sleeping. In Figure 22, for example, task T2 is ready from time 0 to 1, running from 1 to 3, sleeping from 3 to 5, running from 5 to 6, ready from 6 to 7, running from 7 to 8, sleeping from 8 to 10, and so forth. The current subtask points to the subtask of that task that will execute if the task has the CPU. Each subtask has a priority and an execution time. The predict function simulates the execution of the tasks through a loop that advances time from instant zero to the end of the hyper-period. In each iteration of the loop, we inspect the state of all tasks to determine the next scheduling event for each task (see Figure 26 for the pseudocode of getNextEvent). Then, based on time, priority, and type, one event is selected to be handled next. Tasks are told to advance their internal clock to the point in time when the selected event starts, and they update their internal states accordingly (see Figure 27 for the pseudocode of advanceClock). Upon an event, each task knows what it has to do next: start, run, or stop. At any point, only one task has the CPU; when this task advances its internal clock, it assumes it has executed for that amount of time.

35. Models can be analytic, simulation based, or a hybrid of both. An analytic model is based on mathematical relations among variables and intended to be solved mathematically. A simulation model is a computer program that reproduces, to some level of detail, the behavior of a real system. Simulation models are generally used instead of analytic models when it is impossible, or very difficult, to create an analytic model. λABA is a hybrid analytic-simulation model.


predict(tasks) {
    // end of hyper-period: latest offset plus LCM of all periods (see HPEnd above)
    hyperPeriodEnd = maxOffset(tasks) + lcmOfPeriods(tasks)

    // put all tasks in the sleeping queue
    for (int i = 0; i < tasks.length; i++) {
        tasks[i].state = SLEEPING
        tasks[i].currentSubtask = tasks[i].subtasks[0]
    }

    t = 0                 // time in the execution timeline
    taskWithCPU = null

    while (t < hyperPeriodEnd) {
        // Each task can be: ready to run, running, or sleeping (has executed
        // all subtasks and is waiting for the next period).
        // Check what the next event of each task would be.
        Vector events
        for (int i = 0; i < tasks.length; i++) {
            events[i] = getNextEvent(tasks[i], t)
        }
        sort events vector by timeItHappens and priority (in this order)

        // The event in events[0] is the next to occur (ordering by time and
        // priority). It will indeed be the next if:
        //   - it's START or STOP
        //   - it's RUN and has higher priority than the priority of the subtask
        //     currently running. If it has lower priority, we check the
        //     following events to see if any of them is able to preempt the
        //     currently running subtask.
        j = 0
        while (events[j].type == RUN &&
               events[j].priority < taskWithCPU.currentSubtask.priority) {
            j++
        }

        // The next event to happen is events[j]; "advance the clock" accordingly
        delta = events[j].timeItHappens - t

        // For each task, check what happens when we advance the clock
        for (int i = 0; i < tasks.length; i++) {
            advanceClock(tasks[i], t, delta, taskWithCPU)
        }
        t = t + delta

        if (events[j].type == STOP) {
            taskWithCPU = null
        } else if (events[j].type == RUN) {
            taskWithCPU = events[j].task
        }
    }
}

Figure 25: Prediction Pseudocode


Event getNextEvent(task, currentTime) {
    if (task.state == SLEEPING) {
        // task is sleeping: next thing is to start when the next period arrives
        nextEvent.type = START
        nextEvent.priority = priority of the 1st subtask
        nextEvent.timeItHappens = getNextPeriod(task, currentTime)
    } else if (task.state == READY) {
        // task is ready to run: next thing is to run as soon as now
        nextEvent.type = RUN
        nextEvent.priority = task.currentSubtask.priority
        nextEvent.timeItHappens = currentTime   // task is ready to run now
    } else if (task.state == RUNNING) {
        // task is running: next thing is to stop when the current subtask ends
        nextEvent.type = STOP
        nextEvent.priority = task.currentSubtask.priority
        nextEvent.timeItHappens = currentTime + amount of time to the end of current subtask
    }
    return nextEvent
}

Time getNextPeriod(task, currentTime) {
    nextPeriod = task.offset +
                 ceiling((currentTime - task.offset) / task.period) * task.period
    return nextPeriod
}

Figure 26: Pseudocode of getNextEvent and getNextPeriod

Figure 28 is a statechart diagram, using UML notation, that represents a task as a composite state. There are two concurrent substates: one indicates whether the task is ready to run, running, or sleeping, and the other indicates whether or not the task has the CPU.

When predict ends, we have recorded the exact moment when each subtask started and finished, and hence we know how long it took to run each job of each task. That would give us the average latency of the task, accounting for the execution times and delays due to blocking and preemption. However, in reality the execution time of a subtask is not an exact number as we have used it so far. It comes from the execution time of a previously measured, certified component and consists of a mean value and a respective standard deviation. Therefore, in order to gain statistical certainty that average job latency is correctly estimated, we should run the prediction several times, each time picking a value for the subtask execution time that is bounded by the respective standard deviation. Thus, the definitive prediction comes from the average of the results of many executions of predict, each execution using random execution times for subtasks. This procedure follows the Monte Carlo simulation method, as illustrated in the pseudocode in Figure 29. The inputs to the montecarlo function are a vector of tasks and the number of iterations. In each iteration, the function creates a clone of each task and its subtasks, with the single difference that the execution time of each subtask is reassigned to a random value following a normal distribution.

To this point, we have seen how the property theory handles tasks and subtasks in a hybrid analytic-simulation model to predict average job latency. What we have not seen so far is how to obtain tasks and subtasks from a Pin assembly. That is the subject of the next section.

advanceClock(task, currentTime, delta, taskWithCPU) {
    if (task.state == SLEEPING) {
        nextPeriod = getNextPeriod(task, currentTime)
        if (currentTime + delta == nextPeriod) {
            task.state = READY
        }
    } else {
        if (task.state == READY && task == taskWithCPU) {
            task.state = RUNNING
        }
        if (task.state == RUNNING) {
            if (task == taskWithCPU) {
                if (currentTime + delta == time when current subtask is complete) {
                    if (task.currentSubtask == task.lastSubtask) {
                        // finished task
                        task.state = SLEEPING
                        task.currentSubtask = task.subtasks[0]
                        record execution time for the job just completed
                    } else {
                        // There are more subtasks to run within this task. The state
                        // changes to READY, and when the next subtask can be scheduled
                        // to run, the state will change to RUNNING again.
                        task.currentSubtask = task.nextSubtask
                        task.state = READY
                    }
                }
            } else {
                // Task was running but lost the CPU to a higher priority task
                task.state = READY
            }
        }
    }
}

Figure 27: Pseudocode of advanceClock


Figure 28: Task Statechart Diagram


montecarlo(tasks, numberOfIterations) { for (sample = 1; sample
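A minimal sketch of the Monte Carlo driver described above—clone the task set, reassign each subtask's execution time to a value drawn from a normal distribution, run the predictor, and average the resulting job latencies—might look like the following. The names are illustrative only; predictOneHyperPeriod is a placeholder standing in for the predict function of Figure 25, not the report's actual code.

import java.util.Random;

// Sketch: Monte Carlo averaging over repeated runs of a latency predictor,
// with each run using a randomly drawn subtask execution time.
public class MonteCarloSketch {
    public static void main(String[] args) {
        Random rng = new Random();
        int iterations = 1000;

        // Hypothetical measured execution time for one subtask: mean 2.0, std dev 0.1.
        double mean = 2.0, stdDev = 0.1;

        double sum = 0.0;
        for (int sample = 0; sample < iterations; sample++) {
            // In the full driver, every subtask of every cloned task gets its own draw.
            double drawnExecutionTime = mean + stdDev * rng.nextGaussian();
            sum += predictOneHyperPeriod(drawnExecutionTime);
        }
        System.out.println("average predicted job latency = " + sum / iterations);
    }

    // Placeholder for the predict function: a single task with one subtask and no
    // preemption simply has job latency equal to its execution time.
    static double predictOneHyperPeriod(double executionTime) {
        return executionTime;
    }
}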
