discrete event simulation with application to computer

DISCRETE EVENT SIMULATION WITH APPLICATION TO COMPUTER AND COMMUNICATION SYSTEMS PERFORMANCE AND DEPENDABILITY
ECE 557, Spring 2015, Duke University
Kishor Trivedi, [email protected], Room 203, Hudson Hall
Copyright © 2015 Kishor S. Trivedi

Some nice quotes
• LECTURE: The art of transmitting information from the notes of the lecturer to the notes of the students without passing through the minds of either
• CONFERENCE: The confusion of one man multiplied by the number present

Logistics of the course
• 6-8 homeworks; no late homeworks accepted. Homeworks will be a mixture of paper-and-pencil problems, problems that involve writing a simulation program, and problems that use the SHARPE or SPNP software packages
• Must work individually on all homeworks
• 1 project
• Grading: 60% HW, 40% project
• Class webpage: www.ee.duke.edu/~ktrivedi/ECE5557
• TA: Zheng Zheng; e-mail: [email protected]
• TA Office Hours: Thursday, 1-4 pm

Three Segments & Two Addenda
1. Segments
   1. Discrete Event Simulation (class notes including Chapters 10 & 11 of the bluebook + DOE + case studies)
   2. DTMC, CTMC, PFQN (Ch. 7, 8, 9 of the bluebook), SPN/GSPN/SRN (Ch. 8 of the bluebook)
   3. MRGP, MRSPN, FSPN would be very nice to cover as well, but this is unlikely due to time constraints
   Bluebook: Trivedi, Probability and Statistics with Reliability, Queuing and Computer Science Applications, John Wiley, 2001
2. Addenda
   1. Project: each student can choose an application to study and the simulation package to use. Start the project early and do it in phases
   2. Applications to networks, computing, smart grid, etc. will appear throughout the course

Outline of the Segment on Simulation
• Introduction to Simulation
• Statistical Analysis of Input Data: Chap10a, Chap10b, Chap10c, Chap10d
• Random Variate Generation: Chap3r and additional reading
• Output Analysis: Chap10e
• Case Studies: NASA Satellite Prob. Rep. data, Bugzilla reports
• Simulation Packages: Module 5
• Applications of Simulation: Module 6
• Regression and ANOVA: Chap11a, Chap11b
• Design of Experiments (DOE)
• Case Studies: Software aging and rejuvenation, software fault tolerance using environmental diversity
• Case Studies: Cloud, uncertainty propagation, smart grid, healthcare
• Model Validation and Verification: Module 7
• Variance Reduction Techniques: Module 8

Why do we need Statistics?
• Making sense out of measurement data
  - NASA satellite data; Bugzilla reports for open source software; software aging data
• Developing/solving a simulation model [SPNP, CSIM]
  - Input data analysis
  - Simulation output data analysis
• While developing an analytic model [e.g., using SHARPE]
  - Input data analysis
  - Propagating parametric uncertainty through an analytic model
• Online control [e.g., software rejuvenation or performance control]

Need to Model Random Phenomena
• Random phenomena in a computer/networking/cloud/web environment/VANET
  - Arrival of jobs/messages/requests
  - Execution (transmission/processing) time of jobs/messages/requests
  - Memory requirement of jobs/messages/requests
  - Failure times or times to repair of components/resources/system/service
• How to quantify randomness?
  - Use probabilistic models
• How to estimate these quantifiers?
  - Use statistical techniques on measurement data

Modeling Random Phenomena
[Figure: flow chart. Understanding the system and measurement data from a real system feed statistical analysis, which yields the model input parameters. The probability model (PM) produces model outputs, which are compared with output measurement data from a real system for model validation.]
PM: Structural or white/grey box (Chaps 1-9); or Empirical or black box (Chaps 10-11)

Examples
• Performance and reliability analysis of composed web services [Sato/Trivedi ICSOC 2007 paper]
• Availability/reliability analysis of SIP (Session Initiation Protocol) on IBM WebSphere on BladeCenter [Trivedi et al. PRDC 2008 paper; ISSRE 2010 paper]
• Determining mean response time/availability/power in cloud [Ghosh et al. FGCS 2013, IEEE-TCC and IEEE-TSC 2014]
• Transmitting safety messages in a VANET [Yin et al. papers]
• Deciding the rejuvenation schedule of a system to clean its environment of any software aging effects [many papers]
• Propagating parametric uncertainty through probability models [Mishra et al., ISSRE 2011]

MEASURES TO BE EVALUATED
• Dependability
  - Reliability: R(t), system MTTF
  - Availability: steady-state, transient
  - Downtime
  - Security, safety
  "Does it work, and for how long?"
• Pure (failure-free) performance
  - Throughput, blocking probability, response time
  "Given that it works, how well does it work?"
• Composite performance and dependability (performability)
  "How much work will be done (lost) in a given interval, including the effects of failure/repair/contention?"

Dependability: An umbrella term
• Laprie: Trustworthiness of a computer system such that reliance can justifiably be placed on the service it delivers
Dependability tree:
• Attributes: Availability, Reliability, Safety, Maintainability
• Means: Fault Prevention/Avoidance, Fault Removal, Fault Tolerance, Fault Forecasting
• Threats: Faults, Errors, Failures

Extended Dependability and Security tree
• Threats: Faults/Vulnerabilities, Errors/Atomic attacks, Failures/Intrusions
• Attributes: Confidentiality, Integrity, Availability, Reliability, Safety, Maintainability (Confidentiality, Integrity and Availability make up Security)
• Means: Fault/Intrusion Prevention, Fault/Intrusion Detection, Fault/Intrusion Tolerance, Fault/Vulnerability Removal, Fault/Intrusion Forecasting

Metrics and Methods
• Metrics: Performance (utilizations, throughput, goodput, response time, blocking probability), Reliability, Availability, Safety, Security, Power, Performability, Survivability, Resilience
  - System-oriented vs. user-oriented metrics
• Methods of evaluation
  - Evaluation vs. prediction vs. bottleneck detection vs. optimization

Software (Program) Performance Evaluation
• Worst-case vs. average case
• Data-structure-oriented (Ch 2,7) vs. control-structure-oriented (Ch 2,3,4,5,7,8)
• Sequential vs. concurrent (single-threaded vs. multi-threaded)
• Restricted (structured) (Ch 1-5) vs. unrestricted transfer of control (Ch 7)
• Unlimited (hardware) resources vs. limited resources
• Software architecture: modules, their characteristics (execution time) and interactions (branching, looping)
• Business process flows, composed web services (similar to programs)
• Metrics: completion time & response time (mean, variance & dist.)
• Measurements, probability models (simulation vs. analytic), or a combination
• Analytic models: non-state-space (directed acyclic task precedence graph); state-space: DTMC, SMP, CTMC, SPN; hierarchical

System Performance Evaluation
• Workload: traffic arrival process, service time distributions, pattern of resource requests
• Hardware and software architecture
• Resource contention, scheduling & allocation
• Concurrency, synchronization, distributed processing
• Timeliness (may have to meet deadlines)
• Metrics: throughput, goodput, loss (blocking) probability, response time or delay (mean, variance & dist. (Sec 9.6))
• Low-level (CPU cache, memory interference: Ch. 7)
• System-level (CPU-I/O, multiprocessing: Ch. 8,9)
• Network-level (protocols, handoff in wireless: Ch. 7,8)
• Measurements, models (simulation or analytic), or a combination
• Analytic models: DTMC (Ch 7), CTMC (Ch 8), PFQN (Ch. 9), SPN (Ch 8); NPFQN: hierarchical (Ch 9), approximation (Ch 9)

Software Reliability Evaluation
• Black-box (measurements + statistical inference) vs. architecture-based approach (structural models)
• The black-box approach is called software reliability growth modeling (Ch. 3, 5, 8, 10)
• Black-box approaches treat software as a monolithic whole, considering only its interactions with the external environment, without an attempt to model its internal structure
• With the growing emphasis on reuse, the software development process is moving toward component-based software design
• A white-box approach may be better for analyzing a system with many software components and how they fit together

Software Architecture
• Software behavior describes the manner in which different components interact
• May include information about the execution time of each component
• A control flow graph is used to represent the architecture
• Sequential program architecture can be modeled by
  - Discrete Time Markov Chain (DTMC; Ch 7)
  - Continuous Time Markov Chain (CTMC; Ch 8)
  - Semi-Markov Process (SMP)
  - Markov Regenerative Process (MRGP)
• Parallel program architecture can be modeled by
  - Stochastic Petri Net (Ch 8)

System Reliability/Availability
• Fault load: fault types, failure rates, repair/recovery procedures, delay time distributions and imperfect coverage for recovery steps
• Hardware and software architecture
• Minimum resource requirements
• Performance/reliability interdependence
• Metrics: reliability, availability, system MTTF, downtime
• Low-level (physics of failures, chip level)
• System-level (CPU-I/O, multiprocessing: Ch 1,3,4,5,6,8,9)
• Software and hardware combined together (Ch 8)
• Network-level
• Measurements, models (simulation or analytic), or a combination
• Analytic model types: RBD (Ch 1,3,4,5,6), FTREE (Ch 1,3,4,5), CTMC (Ch 8), SPN (Ch 8), hierarchical (Ch 8)

Evaluation vs. Optimization
• Evaluation of a system: computation of desired metrics given a set of parameters
• Sensitivity analysis
  - Parametric (Blake et al., Sigmetrics 1988)
  - Bottleneck analysis (Sato & Trivedi ICSOC07, Rubens et al. IEEE-TR 2012)
  - Reliability importance (Fricks, RAMS 2003)
• Optimization (Ch. 11 in 1st ed. white book)
  - Static: linear, nonlinear, geometric, integer, multiobjective; constrained or unconstrained
  - Dynamic: dynamic programming, Markov decision process, semi-Markov decision process
  - Simulated annealing, evolutionary programming

PURPOSE OF EVALUATION
• Understanding a system
  - Observation
    · Operational environment
    · Controlled environment
  - Reasoning
    · A probability model is a convenient abstraction
• Predicting the behavior of a system
  - Need a model

PURPOSE OF EVALUATION (Contd.)
Famous quotes bring out the difficulty of prediction based on models:
• "All Models are Wrong; Some Models are Useful" George Box and Albert Einstein
• "Prediction is fine as long as it is not about the future" Mark Twain

Methods of EVALUATION
• Measurement-based
  - More accurate, most expensive
  - Not always possible or cost-effective during system design
  - Statistical techniques are very important here
  - An empirical model can be formulated via regression, machine learning, etc.
• (Structural) model-based

Methods of EVALUATION (Contd.)
• Model-based: less accurate, less expensive
1. Discrete-event simulation vs. analytic solution
2. State-space methods (Ch. 7,8) vs. non-state-space methods (Ch. 1-5,9)
3. Hybrid: simulation + analytic (SPNP)
4. State space + non-state space (SHARPE)

Methods of EVALUATION (Contd.)
• Measurements + models
• Models need
  - Input parameters that are estimated from measurements
  - To be validated against measurements
• Measurements should be guided by models
Vaidyanathan & Trivedi IEEE TDSC, 2005; Hsueh, Iyer & Trivedi IEEE TC, 1988; Gokhale et al., Perf. Eval. 2005; Trivedi et al., PRDC 2008; ISSRE 2010

QUANTITATIVE EVALUATION TAXONOMY
[Figure: taxonomy tree; its recoverable labels are "Closed-form solution" and "Numerical solution using a tool"]

Notes
• Both measurements & simulations imply statistical analysis of outputs
  - Statistical inference (Ch 10)
  - Hypothesis testing (Ch 10)
  - Regression (linear, nonlinear) (Ch 11)
  - Design of experiments (not in bluebook)
  - Trend detection (Ch 11)
  - Analysis of variance (Ch 11)
• Distribution-driven simulation requires generation of random deviates (variates) (Ch. 3, 4, 5)
• Probability and statistics are different but highly intertwined
  - Probability models need inputs that generally come from measurement data (followed by statistical inference)
  - Statistics in turn uses probability theory to derive formulas

Introduction to Simulation

MODULE 1


What is Simulation?
• An experiment on a system model to empirically determine its characteristics
• A model solution method that mimics or emulates the behavior of a system over time
• Involves generation and observation of an artificial history of the system under study
• Inferences concerning the dynamic behavior of the real system are then drawn from the response of the model

Computer Simulation
• Involves modeling an actual or theoretical system, executing the model (an experiment) on a digital computer, and (statistically) analyzing the execution output
• The current state of the physical system is represented by state variables (program variables)
• State variables are modified to mimic the evolution of the physical system over time

What is a Model?
• A model is a representation of the system under study, developed through the techniques of
  - Abstraction, that is, discarding unimportant details to make a model as simple as possible, and as complex as necessary
  - Decomposition, that is, divide and conquer
  - Idealization (e.g., relaxing unimportant constraints)
• All three techniques aim at complexity reduction
• More art involved than science
• Physical or mathematical (abstract, formal) models

What is a Model? (Contd.)
• Frequently models have random inputs, consisting of a set of sequences of random variables with specified distributions
• Non-determinism can also be introduced by some random operational decisions represented in the model
• Correspondingly, such models have a random output with an unknown distribution
• In such cases the goal is to estimate certain characteristics of the output distributions

Model Solution Types
Model solution:
• Transient (terminating)
• Steady-state (non-terminating)

Model Solutions (Transient)
Transient model solution:
• Analytic
  - Fully-symbolic solution
  - Semi-symbolic solution
  - Numerical solution
• Simulation (terminating)

Model Solutions (Steady State)
Steady-state model solution:
• Analytic
  - Symbolic solution
  - Numerical solution
• Simulation (steady-state)

Nature of Model Solutions
• Fully symbolic: closed-form solution of an analytic model by hand or via Mathematica (Matlab?, others?)
  - Exact [Example will follow]
• Numerical solution of an analytic model using one of many packages such as SHARPE or SPNP
  - Numerical errors (round-off, truncation, convergence) [Example will follow]
• Semi-symbolic (semi-numerical) (transient) solution: symbolic in t (see SHARPE cdf in exponomial form); note that for the steady-state case, there is no semi-symbolic solution [Example will follow]
• Simulative solution
  - Statistical (or sampling) errors (a finite number of paths traversed out of (possibly) infinitely many paths) [Example will follow]

Fully Symbolic Transient Solution (by hand)


Markov Reliability Model With Repair
• Consider a 2-component parallel system where we disallow repair from the system-down state
• Note that state 0 is an absorbing state. The state diagram is given in the following figure
• This is a reliability model with repair. We need to resort to Markov chains

Markov Reliability Model With Repair (Contd.)
[Figure: state diagram with states 2, 1, 0. Transitions: 2 → 1 at rate 2λ, 1 → 0 at rate λ, 1 → 2 at rate μ. State 0 is the absorbing state.]
• The Markov chain has an absorbing state
• In the steady state, the system will be in state 0 with probability 1
• Hence steady-state analysis will yield a trivial answer; transient analysis is of interest. States 1 and 2 are transient states

Markov Reliability Model With Repair (Contd.)
• Assume that the initial state of the Markov chain is 2, that is, π2(0) = 1 and πk(0) = 0 for k = 0, 1
• Then the system of differential equations is written based on: rate of buildup = rate of flow in - rate of flow out, for each state

Markov Reliability Model With Repair (Contd.)
\[
\begin{aligned}
\frac{d\pi_2(t)}{dt} &= -2\lambda\,\pi_2(t) + \mu\,\pi_1(t) \\
\frac{d\pi_1(t)}{dt} &= 2\lambda\,\pi_2(t) - (\lambda + \mu)\,\pi_1(t) \\
\frac{d\pi_0(t)}{dt} &= \lambda\,\pi_1(t)
\end{aligned}
\]

Markov Reliability Model With Repair (Contd.)
Using the technique of Laplace transform, we can reduce the above system to:
\[
\begin{aligned}
s\,\bar{\pi}_2(s) - 1 &= -2\lambda\,\bar{\pi}_2(s) + \mu\,\bar{\pi}_1(s) \\
s\,\bar{\pi}_1(s) &= 2\lambda\,\bar{\pi}_2(s) - (\lambda + \mu)\,\bar{\pi}_1(s) \\
s\,\bar{\pi}_0(s) &= \lambda\,\bar{\pi}_1(s)
\end{aligned}
\]
where
\[
\bar{\pi}(s) = \int_0^{\infty} e^{-st}\,\pi(t)\,dt
\]

Markov Reliability Model With Repair (Contd.)
Solving for \(\bar{\pi}_0(s)\), we get:
\[
\bar{\pi}_0(s) = \frac{2\lambda^2}{s\,[\,s^2 + (3\lambda + \mu)\,s + 2\lambda^2\,]}
\]
• After an inversion, we obtain π0(t), the probability that no components are operating at time t ≥ 0. For this purpose, we carry out a partial fraction expansion.
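The partial fraction expansion mentioned above can be written out as follows. This is a worked sketch, not taken from the slides: it assumes the quadratic factor is written as (s + α1)(s + α2), so that α1 + α2 = 3λ + μ and α1 α2 = 2λ².

```latex
\[
\bar{\pi}_0(s) = \frac{2\lambda^2}{s\,(s+\alpha_1)(s+\alpha_2)}
             = \frac{1}{s}
             + \frac{2\lambda^2}{\alpha_1(\alpha_1-\alpha_2)}\cdot\frac{1}{s+\alpha_1}
             - \frac{2\lambda^2}{\alpha_2(\alpha_1-\alpha_2)}\cdot\frac{1}{s+\alpha_2}
\]
% The coefficient of 1/s is 2\lambda^2/(\alpha_1\alpha_2) = 1.
% Inverting term by term:
\[
\pi_0(t) = 1 - \frac{2\lambda^2}{\alpha_1-\alpha_2}
           \left( \frac{e^{-\alpha_2 t}}{\alpha_2} - \frac{e^{-\alpha_1 t}}{\alpha_1} \right)
\]
```

At t = 0 the bracket equals (α1 - α2)/(α1 α2), so π0(0) = 0, which agrees with the initial condition that the chain starts in state 2.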

Markov Reliability Model With Repair (Contd.)
Inverting the transform, we get
\[
R(t) = 1 - \pi_0(t) = \frac{2\lambda^2}{\alpha_1 - \alpha_2}
       \left( \frac{e^{-\alpha_2 t}}{\alpha_2} - \frac{e^{-\alpha_1 t}}{\alpha_1} \right)
\]
where
\[
\alpha_1, \alpha_2 = \frac{(3\lambda + \mu) \pm \sqrt{\lambda^2 + 6\lambda\mu + \mu^2}}{2}
\]

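Fully Symbolic Transient Solution (by hand)
[Figure: state diagram with states 2, 1, 0. Transitions: 2 → 1 at rate 2λ, 1 → 0 at rate λ, 1 → 2 at rate μ.]
\[
R(t) = 1 - \pi_0(t) = \frac{2\lambda^2}{\alpha_1 - \alpha_2}
       \left( \frac{e^{-\alpha_2 t}}{\alpha_2} - \frac{e^{-\alpha_1 t}}{\alpha_1} \right),
\qquad
\alpha_1, \alpha_2 = \frac{(3\lambda + \mu) \pm \sqrt{\lambda^2 + 6\lambda\mu + \mu^2}}{2}
\]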
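To make the closed-form result concrete, here is a minimal Java sketch that evaluates R(t) for given λ and μ. The class and method names are illustrative and not from the course material. The MTTF expression (3λ + μ)/(2λ²) is not stated on the slides; it follows from integrating R(t) over [0, ∞). The smaller root α2 is computed as 2λ²/α1 (the product of the roots is 2λ²), an implementation choice that avoids cancellation when μ is much larger than λ.

```java
// Evaluates the closed-form reliability of the two-component parallel
// system with repair (state 0 absorbing). Names are illustrative only.
final class ClosedFormReliability {

    // R(t) = 2*lambda^2/(a1 - a2) * (exp(-a2*t)/a2 - exp(-a1*t)/a1)
    static double reliability(double t, double lambda, double mu) {
        double sum = 3 * lambda + mu;
        double root = Math.sqrt(lambda * lambda + 6 * lambda * mu + mu * mu);
        double a1 = (sum + root) / 2;          // larger root
        double a2 = 2 * lambda * lambda / a1;  // smaller root, via a1*a2 = 2*lambda^2
        double c = 2 * lambda * lambda / (a1 - a2);
        return c * (Math.exp(-a2 * t) / a2 - Math.exp(-a1 * t) / a1);
    }

    // Mean time to failure: integral of R(t) from 0 to infinity.
    static double mttf(double lambda, double mu) {
        return (3 * lambda + mu) / (2 * lambda * lambda);
    }

    public static void main(String[] args) {
        // Parameter values taken from the SHARPE examples in these notes.
        double lambda = 1.0 / 1000;
        double mu = 1.0;
        for (double t = 0; t <= 1000; t += 250) {
            System.out.printf("R(%.0f) = %.6f%n", t, reliability(t, lambda, mu));
        }
        System.out.printf("MTTF = %.1f%n", mttf(lambda, mu));
    }
}
```

With μ = 0 (no repair) the formula reduces to R(t) = 2e^{-λt} - e^{-2λt}, the familiar reliability of two independent components in parallel, which is a convenient sanity check.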

Fully Symbolic Closed-form Transient Solution in Mathematica
[Figure: Mathematica screenshot for the model with absorbing state 0]

Fully Symbolic Transient Solution
• What are the fundamental limits of this approach?
  - Finding the roots of a polynomial in a fully symbolic fashion
  - In general, closed-form roots in radicals exist only for polynomials of degree up to four (Abel-Ruffini theorem)

Semi-Symbolic (semi-numerical) (transient) solution in SHARPE (textual input)

    bind lambda 1/1000
    bind mu 1/1

    markov semi
    2 1 2*lambda
    1 0 lambda
    1 2 mu
    end
    * Initial Probabilities assigned:
    2 1
    1 0
    0 0
    end

    echo ******************************************
    echo ********* Outputs asked for the model: semi **************
    cdf(semi,0)
    end

Semi-Symbolic (semi-numerical) (transient) solution in SHARPE


Semi-Symbolic Transient Solution
• What are the limits of this approach?
  - Only a full-matrix method is known
  - When the roots are close to each other, numerical instability occurs

Numerical Transient solution in SHARPE (textual input)

    bind lambda 1/1000
    bind mu 1/1

    markov numeric
    2 1 2*lambda
    1 0 lambda
    1 2 mu
    end
    * Initial Probabilities defined:
    2 1
    1 0
    0 0
    end

    echo ************************************************************************
    echo ********* Outputs asked for the model: numeric **************
    func Reliability(t) 1-tvalue(t;numeric)
    loop t,1,991,10
    expr Reliability(t)
    end
    var MTTAb mean(numeric, 0)
    expr MTTAb
    end

Numerical Transient solution in SHARPE (textual input)


Numerical Transient Solution
• What are the limits of this approach?
  - Sparse matrix storage and sparsity-preserving algorithms enable very large Markov models to be solved
  - Stiffness of Markov models will slow down the solution
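As a rough illustration of what a numerical transient solution involves, the sketch below integrates the three differential equations of the reliability model with a classical fourth-order Runge-Kutta scheme. This is an assumption made for illustration: it is not the algorithm SHARPE uses, and the class and method names are hypothetical. An explicit method like this needs a step size small relative to 1/μ, which is one way to see why stiffness (μ much larger than λ) slows a solver down.

```java
// Numerical transient solution of the 3-state reliability model using
// classical RK4. Illustrative only; not the method used by SHARPE.
final class TransientRk4 {

    // Right-hand side of the differential equations; p = {pi0, pi1, pi2}.
    static double[] derivative(double[] p, double lambda, double mu) {
        return new double[] {
            lambda * p[1],
            2 * lambda * p[2] - (lambda + mu) * p[1],
            -2 * lambda * p[2] + mu * p[1]
        };
    }

    // Returns p + h * k, element by element.
    static double[] advance(double[] p, double h, double[] k) {
        double[] out = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            out[i] = p[i] + h * k[i];
        }
        return out;
    }

    // State probabilities at time tEnd, starting in state 2 at time 0.
    static double[] solve(double tEnd, int steps, double lambda, double mu) {
        double h = tEnd / steps;
        double[] p = {0.0, 0.0, 1.0};
        for (int n = 0; n < steps; n++) {
            double[] k1 = derivative(p, lambda, mu);
            double[] k2 = derivative(advance(p, h / 2, k1), lambda, mu);
            double[] k3 = derivative(advance(p, h / 2, k2), lambda, mu);
            double[] k4 = derivative(advance(p, h, k3), lambda, mu);
            for (int i = 0; i < p.length; i++) {
                p[i] += h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]);
            }
        }
        return p;
    }

    static double reliability(double t, int steps, double lambda, double mu) {
        return 1.0 - solve(t, steps, lambda, mu)[0];
    }

    public static void main(String[] args) {
        // lambda and mu as in the SHARPE input; step size h = 0.01.
        System.out.println(reliability(1000.0, 100_000, 1.0 / 1000, 1.0));
    }
}
```

The step count is a tuning parameter: with μ = 1 a step of 0.01 keeps the explicit scheme well inside its stability region, at the cost of 100,000 steps to reach t = 1000.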

Symbolic Solution (Steady state)
[Figure: state diagrams of the two-component availability model (states 2, 1, 0) with non-shared repair and with shared repair, together with the closed-form expression for 1 - A_nonshared]

Steady-state balance equations
• For any state: rate of flow in = rate of flow out
• Consider the shared case:
\[
\begin{aligned}
2\lambda\,\pi_2 &= \mu\,\pi_1 \\
(\lambda + \mu)\,\pi_1 &= 2\lambda\,\pi_2 + \mu\,\pi_0 \\
\lambda\,\pi_1 &= \mu\,\pi_0
\end{aligned}
\]
• π_i: steady-state probability that the system is in state i, that is, \(\pi_i = \lim_{t \to \infty} P(X(t) = i)\)

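Steady-state balance equations (Contd.)
• Hence
\[
\pi_1 = \frac{\mu}{\lambda}\,\pi_0, \qquad \pi_2 = \frac{\mu}{2\lambda}\,\pi_1
\]
• Since
\[
\pi_0 + \pi_1 + \pi_2 = 1
\]
• We have
\[
\pi_0 \left[ 1 + \frac{\mu}{\lambda} + \frac{\mu^2}{2\lambda^2} \right] = 1
\qquad \text{or} \qquad
\pi_0 = \frac{1}{1 + \dfrac{\mu}{\lambda} + \dfrac{\mu^2}{2\lambda^2}}
\]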
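The closed-form steady-state result for the shared-repair case can be evaluated directly. The following Java sketch is illustrative (the class and method names are assumptions); it computes the three probabilities from the balance equations and converts the unavailability π0 into downtime in minutes per year using the factor 60*8760 that appears in the SHARPE input for this model.

```java
// Steady-state probabilities of the shared-repair availability model,
// computed from the closed-form solution of the balance equations.
final class SharedRepairSteadyState {

    // Returns {pi0, pi1, pi2}.
    static double[] probabilities(double lambda, double mu) {
        double r = mu / lambda;
        double pi0 = 1.0 / (1.0 + r + r * r / 2.0);
        double pi1 = r * pi0;        // from lambda*pi1 = mu*pi0
        double pi2 = r / 2.0 * pi1;  // from 2*lambda*pi2 = mu*pi1
        return new double[] {pi0, pi1, pi2};
    }

    // Expected downtime in minutes per year (8760 hours per year).
    static double downtimeMinutesPerYear(double lambda, double mu) {
        return 60 * 8760 * probabilities(lambda, mu)[0];
    }

    public static void main(String[] args) {
        double mu = 1.0;
        // Same sweep as the SHARPE loop: lambda = 10^-j for j = 2, 2.5, ..., 5.
        for (double j = 2; j <= 5; j += 0.5) {
            double lambda = Math.pow(10, -j);
            System.out.printf("lambda = %.2e  downtime = %.6f min/yr%n",
                    lambda, downtimeMinutesPerYear(lambda, mu));
        }
    }
}
```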

Symbolic steady-state Solution
• What are the limits of this approach?

Numerical Solution (steady state)


Shared Case

    markov shared
    2 1 2*lambda
    * Could be also written
    * 2 1 2/MTTF
    1 0 lambda
    1 2 mu
    0 1 mu
    end

    bind
    mu 1
    lambda 0.1
    end

    var U prob(shared,0)
    var downtime 60*8760*U
    loop j, 2, 5, 0.5
    bind lambda 1.0*10^-j
    expr downtime
    end
    end

Markov Availability Model (Contd.)

Numerical steady-state Solution
• What are the limits of this approach?
  - Sparse matrix storage and sparsity-preserving solution methods are known (mostly iterative methods)
  - Some iterative methods are guaranteed to always converge (power method), while others (though faster on average) may fail to converge (SOR, GS)

Simulation of Markov Model
[Figure: state diagrams of the Markov models with states 2, 1, 0, including the reliability model in which state 0 is the absorbing state]
• Useful steps to follow:
1. Simulation flow chart
2. Random variate generation (see Module 3)
3. Write simulation code (Java, C, C++, others)
4. Interpret results
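The steps above can be sketched for the reliability model with absorbing state 0. This is a minimal illustration under stated assumptions, not the course's reference code: the class and method names are hypothetical, the seed is fixed only to make runs repeatable, and the analytic MTTF (3λ + μ)/(2λ²) used for comparison comes from solving the model by hand rather than from the slides.

```java
import java.util.Random;

// Monte Carlo simulation of the 2-component reliability model with repair.
// Each replication follows the chain from state 2 until it is absorbed in 0.
final class MarkovReliabilitySim {

    // Exponential variate with the given rate (inverse-transform method).
    static double expVariate(Random rng, double rate) {
        return -Math.log(1.0 - rng.nextDouble()) / rate;
    }

    // One replication: time to reach state 0 when starting in state 2.
    static double timeToAbsorption(Random rng, double lambda, double mu) {
        int state = 2;
        double clock = 0.0;
        while (state != 0) {
            if (state == 2) {
                clock += expVariate(rng, 2 * lambda); // first component failure
                state = 1;
            } else {
                double total = lambda + mu;           // failure races with repair
                clock += expVariate(rng, total);
                state = (rng.nextDouble() < lambda / total) ? 0 : 2;
            }
        }
        return clock;
    }

    // Sample mean of the time to absorption over independent replications.
    static double estimateMttf(long seed, int runs, double lambda, double mu) {
        Random rng = new Random(seed);
        double sum = 0.0;
        for (int i = 0; i < runs; i++) {
            sum += timeToAbsorption(rng, lambda, mu);
        }
        return sum / runs;
    }

    public static void main(String[] args) {
        double lambda = 1.0 / 1000;
        double mu = 1.0;
        double estimate = estimateMttf(12345L, 2000, lambda, mu);
        double analytic = (3 * lambda + mu) / (2 * lambda * lambda);
        System.out.printf("simulated MTTF = %.1f, analytic MTTF = %.1f%n",
                estimate, analytic);
    }
}
```

The estimate carries a sampling error that shrinks roughly as one over the square root of the number of replications, so interpreting the result (step 4) would normally include a confidence interval.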

Simulation Flow chart
[Figure: simulation flow chart]

System flow chart
[Figure: system flow chart]

Random Variate Generation: Java Example

    import java.util.Random;

    public class homework1a {
        // Create the object before every simulation run to guarantee a new seed.
        // Uniform random generator; static so that the static method below can use it.
        static Random generator = new Random();
        . . .
        // Returns an exponentially distributed variate with rate f.
        private static double generateRandomVariate(double f) {
            double u = generator.nextDouble();   // uniform on [0, 1)
            double x = -(Math.log(1 - u)) / f;   // inverse-transform method
            return x;
        }
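The Java method above is an instance of the inverse-transform method for the exponential distribution. The short derivation below is added for clarity; it assumes that the parameter f is the rate of the distribution, which is how the code uses it.

```latex
\[
F(x) = 1 - e^{-f x}, \quad x \ge 0
\]
% Set F(x) = u with u uniform on [0, 1) and solve for x:
\[
u = 1 - e^{-f x}
\;\Longrightarrow\;
e^{-f x} = 1 - u
\;\Longrightarrow\;
x = -\frac{\ln(1 - u)}{f}
\]
```

Because u lies in [0, 1), the argument 1 - u lies in (0, 1], so the logarithm is always defined and x is non-negative.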

Random Variate Generation C++ Example //initialize variables ... while (t
