Reliability Assessment of WEB Applications
V. S. Alagar, O. Ormandjieva
Department of Computer Science, Concordia University
Montreal, Quebec H3G 1M8, Canada
Phone: +1(514) 848-7810
alagar, [email protected]
February 18, 2002
Abstract
The paper discusses a formal approach for specifying time-dependent Web applications and proposes a Markov model for reliability prediction. Measures for predicting reliability are calculated from the formal architectural specification and system configuration descriptions.
Keywords: reliability prediction, software measurement, Markov model.
1 Introduction
The reliability of a software system is defined in [IEEE90] as the ability to perform the required functionality under stated conditions for a specified period of time. In this paper the software system under discussion is a Web-based system. The Web is a large and complex distributed system whose heterogeneous components interact in various ways to achieve the result of an application. Often, the performance of an application initiated at a site is rated as good if the server at that site is robust and links are not broken. Such a rating gives a subjective qualitative assessment, but does not provide a scientific quantitative measurement of the reliability of the site. This paper proposes a methodology for assessing the quality of Web application components through reliability prediction, when a formal model of the Web application can be specified in an object-oriented formalism.
Many techniques exist to test and statistically analyze traditional software. However, these methods cannot be readily applied to a Web environment. In a recent paper Kallepalli and Tian [KT2001] have surveyed the characteristics of Web applications and usage and proposed a statistical testing method for Web applications. Their approach relies on usage and failure information collected in the log files. Web failure is defined as the inability to correctly deliver information or documents required by Web users. Based on this definition of failure, they classify types of failures and provide a method for testing source or content failures. We complement their work by offering a formal time-constrained model of the Web on which testing and reliability analysis can be done.
This work is supported by grants from Natural Sciences and Engineering Research Council, Canada and Concordia University Graduate Fellowships.
Quality assurance and reliability assessment for Web applications should focus on the prevention of Web failures or the reduction of chances for such failures. Consequently, we contend that early reliability assessment is necessary for the reduction of testing efforts, and for ensuring a level of operational reliability. We propose an early analysis on the formal architecture model of the Web application. This analysis uses a Markov model, which can adapt to changing system configurations that satisfy the architectural design. We may view Web applications as Markov systems, in which state changes occur with certain probabilities. From the Markov model of an application, we can calculate a predictive measurement of reliability. Markov matrices for individual Web components can be constructed from log files, the source used for statistical testing in [KT2001]. We give methods for calculating Markov matrices for synchronously interacting Web components from the Markov matrices of individual components. The synchrony hypothesis is that two Web components that interact on a shared message change their status (states and associated information) simultaneously. In a typical application, several Web components collaborate to achieve a task. It is important to assess the reliability of every collaboration in an application. We provide a method to compute the reliability of the whole system from the reliability measures of the collaborations in the system.
2 Web and Markov Models: Basic Concepts
The Web is a large network of interconnected components. Conceptually, it is a graph where each vertex (node) is a computer system providing an interface to the other nodes in the network. A Web application is multi-layered, with the user at the top layer and the information source at the bottom layer. A user interacts with the nodes in the Web through a browser, which has links to the home pages at the interfaces of the nodes in the network. The server at a node provides the services for controlled navigation for accessing and retrieving information from the information sources at that node. We model the Web components as User, Browser, and Server classes. Objects, instantiated from the classes, interact in a meaningful way through messages. The behavior of objects in a class is captured by a hierarchical labeled transition system with a finite number of states. A state represents an operational high-level unit. A transition between two states may be labeled by a message shared by objects of different classes, or by an event internal to the object. A state, when complex, is itself a hierarchical labeled transition system, with its substates and transitions defined by the buttons and whistles specific to that state. That is, a transition from one substate of an object to another is implicitly labeled by the link name on the page associated with the first substate.
Some Web applications may put time constraints on the navigation paths within their systems. This is typical when secure information or time-varying information is to be made available. For instance, consider the home page of a hypothetical on-line brokerage system HOBS. A user may be able to reach the home page of HOBS using a browser. However, the user must be authenticated to get services at the site. Once authenticated, the user may be authorized to perform one or more business activities at each state of the server object.
A state may include secure information such as user-id, account-number, and account-balance. Typical states of the HOBS system where time may play a role are Stock Trading, Mutual Fund Trading, Account Overview, Positions, Quotes and Research, Financial Planning, and Services. The architecture of the HOBS design may allow the user to explore the substates of a state or change to a different state as specified by the links on the current page. For instance, the hierarchy of states rooted at the state Stock Trading may impose timing constraints on the information displayed at its substates. After reviewing an order at one of its substates, the user may be allowed to change, review or cancel the order.
If the activity at the state review is not completed within a certain amount of time, the backward transition may be disabled. There are two reasons for this: (1) the page contains secure information, and (2) the information, such as a stock price, is a time-dependent value. Another instance where time plays a role is when the user fails to interact for a certain period of time in a state. The system, after waiting for a period of time, may force a new login session. These instances illustrate the failure of the system to deliver the information requested by the user. However, this type of failure is not a fault of the system. The behavior of the system deviates from the behavior expected by the user, yet the system behaves according to the time-constrained functionality imposed by the system requirements. In order to model such applications and their reliability, our formal model includes time constraints.
2.1 Markov models
Markov models are one of the most powerful tools available to engineers and scientists for analyzing complex systems. Analysis of Markov models yields results for both the time-dependent evolution of the system and the steady state properties of the system. The Markov property states that given the current state of the system, the future evolution of the system is independent of its history. The Markov model of a Web component may be represented by a state diagram. The states represent the stages in the Web component that are observable to the users, and the transitions between states have assigned probabilities. The probabilities are calculated from the usage and related failure information collected in the log file maintained for the Web site. We may use this data as initial transition probabilities. An algebraic representation of a Markov model is a matrix, called the transition matrix, in which the rows and columns correspond to the states, and the entry $p_{ij}$ in the $i$-th row, $j$-th column is the transition probability for being in state $j$ at the stage following state $i$. We use the transition matrix representation in the reliability calculation algorithms.
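As a minimal sketch of how initial transition probabilities might be estimated from usage data, the Python fragment below counts observed state-to-state transitions and normalizes each row. The state names and the `observed` list are hypothetical; the paper does not prescribe a log format, and this is only one plausible estimator.

```python
from collections import Counter


def transition_matrix(states, observed):
    """Estimate p_ij as (# transitions i -> j) / (# transitions leaving i)."""
    counts = Counter(observed)
    matrix = []
    for i in states:
        row_total = sum(counts[(i, j)] for j in states)
        # A state that is never left in the log gets an all-zero row.
        matrix.append([counts[(i, j)] / row_total if row_total else 0.0
                       for j in states])
    return matrix


# Hypothetical (from_state, to_state) pairs extracted from a log file.
states = ["home", "login", "trade"]
observed = [("home", "login"), ("home", "login"), ("home", "home"),
            ("login", "trade"), ("login", "home"), ("trade", "home")]
P = transition_matrix(states, observed)
```

Row `i` of `P` is the estimated distribution over next states when the component is in state `states[i]`.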
2.2 Discussion
Initial transition probabilities, obtained from various sources including log files and the subjective opinions of experts, cannot be used for predicting the reliability of the system. We contend that the reliability should be calculated from the steady state of the Markov system. A steady state or equilibrium state is one in which the probability of being in a state before and after transitions is the same as time progresses. Computing the steady state vector for the transition matrix of a large system is hard. However, as in our approach, when the system is modularly constructed it seems possible to partition the system into smaller components, which might reduce the complexity of computing steady state vectors. The formal model of the Web that we discuss in the next section is based on timed labeled transition system semantics. From the state machine description of a Web component, it is possible to construct the Markov machine corresponding to that model.
The organization of this paper is as follows. A formal model of the Web is given in Section 3. Section 4 formally describes the method of modeling the Web application as a Markov system. Section 5 presents the reliability prediction measures. Section 6 concludes the paper with a discussion on our ongoing research directions.
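The steady state vector discussed in Section 2.2 satisfies $\pi = \pi P$ for a transition matrix $P$. The sketch below approximates it by repeated averaging, under the assumption that the chain is irreducible. It iterates on the "lazy" chain $(I + P)/2$, which has the same stationary vector as $P$ but also converges when $P$ is periodic. The function name and the example matrix are illustrative and are not taken from the paper.

```python
def steady_state(P, tol=1e-12, max_iter=100_000):
    """Approximate pi with pi = pi P for an irreducible chain."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        # One step of the chain: step = pi P.
        step = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        # Averaging with the current vector is one step of (I + P) / 2.
        new = [(a + b) / 2 for a, b in zip(pi, step)]
        if max(abs(a - b) for a, b in zip(pi, new)) < tol:
            return new
        pi = new
    return pi


# Hypothetical two-state component.
P = [[0.5, 0.5],
     [0.25, 0.75]]
pi = steady_state(P)  # approximately [1/3, 2/3]
```

For large systems a direct linear solve would usually be preferable; the iteration is shown only because it needs no external library.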
3 Formal Model
The Web is a reactive system, characterized by the following two important properties:
- stimulus synchronization: the Web (process) always reacts to a stimulus from its environment;
- response synchronization: the time elapsed between a stimulus and its response is acceptable to the relative dynamics of the environment, so that the environment is still receptive to the response.
In addition, certain timing constraints are inherent in the design of many Web components. Hence, we may characterize the Web as a real-time reactive system. Real-time constraints are strictly enforced in security-related information browsing and retrieval. When security is related to safety, such real-time constraints are hard requirements, in the sense that violating them would lead to dire consequences. In characterizing the Web as a real-time reactive system, we have taken a stronger view than the traditional one, where the Web is regarded as an interactive system. For many applications such a soft view is sufficient. However, when time constraints and secure transactions are part of the Web application, ours is a more appropriate characterization. The major distinction between the two views is in the available synchronization mechanism: an interactive system will wait for an input from its environment, whereas a reactive system is fully responsible for synchronization with its environment. That is the underlying reason why certain stages in secure transactions cannot be accessed through backward navigation.
A Timed Reactive Object-oriented Model for the development of real-time reactive systems is discussed by Alagar et al. [AAM98]. We model the Web using this formalism. Abstractly, we model a Web component as a class parameterized with port types. A port type is associated with a signature, a finite set of messages that can occur at a port of that type. We use the notation e! to emphasize that e is an output message, and write e? to emphasize that e is an input message. A class may include attributes of two kinds: port identifiers, and data types such as integer, set, list, and queue. An object of the class A[L], where L is the list of port types, is created by instantiating each port type in L by a finite number of ports and assigning the ports to the object. Any message defined for a port type can be received or sent through any port of that type.
For instance, A1[p1, p2 : P; q1, q2, q3 : Q] and A2[r1 : P; s1, s2 : Q] are two objects of the class A[P, Q]. The object A1 has two ports p1 and p2 of type @P, and three ports q1, q2, q3 of type @Q. Both p1 and p2 can receive or send messages of type @P; the ports q1, q2, and q3 can receive and send messages of type @Q. Sometimes the port parameters of objects are omitted in our discussion below. Messages may also have parameters with basic types. An incarnation of an object Ai is a copy of Ai with a name different from the name of any other object in the system, and with its port types renamed, if necessary. Several incarnations of the same object can be created and distinguished by their ids. Taking ids to be positive integers, A1[1] and A1[2] are two distinct incarnations of the object A1. Every incarnation of an object retains the same port interfaces. For instance, A1[1][a1, a2 : P; b1, b2, b3 : Q] and A1[2][p1, p2 : P; q1, q2, q3 : Q] are two distinct incarnations of the object A1[p1, p2 : P; q1, q2, q3 : Q]. The contexts and behavior of incarnations of an object Ai are in general independent. The context for the incarnation Ai[k] is defined by the set of applications in which it can participate. Hence the context of an incarnation effectively determines the objects with which it can interact and the messages it can use in such an interaction. For instance, the incarnations A1[1][a1, a2 : P; b1, b2, b3 : Q] and A1[2][p1, p2 : P; q1, q2, q3 : Q] can be plugged into two distinct
configurations for two distinct applications in a system. In the rest of the paper we use the term object to mean incarnation as well.
The behavior of objects in a class is specified by a finite state machine, augmented with state hierarchy, logical assertions and timing constraints for transitions. A complex state is an encapsulation of a state hierarchy, and hence another finite state machine, with an initial state, and which can include other complex states. In our model, Web objects communicate using a synchronous message passing mechanism. An external event in the system is either an input or an output event, which can only occur at an instance of a specific port type. Events label the transitions between states. Logical assertions on the attributes specify a port condition, an enabling condition, and a post condition on each transition. Local clocks are defined to enforce time constraints associated with a transition. Both time constraints and functionality are encapsulated in an object.
An abstract model of a Web system is specified as a collection of interacting Web components. A Web component is an object instantiated from a generic class. A pair of objects in this collection interact synchronously through shared messages. These messages occur at compatible ports. Two ports in a system are compatible if the set of input messages at one port is equal to the set of output messages at the other port. A port link connects two compatible ports. A port link is an abstraction of the communication mechanism between the objects associated with the ports. Since the signatures of ports are well-defined, the port links effectively determine the set of all valid messages that can be exchanged among the objects in a subsystem.
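The compatibility rule for ports can be illustrated with a short sketch. It assumes that a signature is written as a set of strings ending in `!` (output) or `?` (input), and it reads the rule symmetrically, so that each port's inputs must equal the other port's outputs. The signatures used are the ones the paper gives for the User, Browser and Server port types in Section 3.2; the helper names are hypothetical.

```python
def inputs(signature):
    """Names of the input messages (written e?) in a signature."""
    return {m[:-1] for m in signature if m.endswith("?")}


def outputs(signature):
    """Names of the output messages (written e!) in a signature."""
    return {m[:-1] for m in signature if m.endswith("!")}


def compatible(sig_a, sig_b):
    """Assumed symmetric reading: inputs of each port equal outputs of the other."""
    return (inputs(sig_a) == outputs(sig_b)
            and inputs(sig_b) == outputs(sig_a))


C = {"Get!", "Exit!"}              # User port type @C
P = {"Get?", "Exit?"}              # Browser port type @P
G = {"Permit!", "StopPermit!"}     # Browser port type @G
S = {"Permit?", "StopPermit?"}     # Server port type @S

user_browser_ok = compatible(C, P)     # True
browser_server_ok = compatible(G, S)   # True
user_server_ok = compatible(C, S)      # False
```

Under this reading a port link between a User port and a Server port would be rejected, which matches the configurations in Figure 5 where users reach servers only through a browser.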
3.1 Operational Semantics
Web objects communicate through messages. A message from an object to another object in the system is called a signal and is represented by a tuple $\langle e_i, p_i, t_i \rangle$, denoting that the event $e_i$ occurs at time $t_i$, at a port $p_i$. The status of an object at any time $t_i$ is the tuple $(\theta, \vec{a}, R)$, where the current state $\theta$ is a simple state, $\vec{a}$ is the assignment vector for attributes, and $R$ is the vector of outstanding reactions. A computational step of an object occurs when the object with status $(\theta, \vec{a}, R)$ receives a signal $\langle e_i, p_i, t_i \rangle$ and there exists a transition specification that can change its status. A computation of an object A is a sequence, possibly infinite, of alternating statuses and signals,

$$OS_0 \xrightarrow{\langle e_0, p_0, t_0 \rangle} OS_1 \xrightarrow{\langle e_1, p_1, t_1 \rangle} \cdots$$

Typically, the Web system is non-terminating; consequently, a computation is in general an infinite sequence. The set of all computations of an object A is denoted by Comp(A). The computation of the Web system is an infinite sequence of system statuses and signals that effect status changes [AAM98]. A period is a finite subsequence of the Web computation such that it starts with some initial state and finishes with its next appearance in the computation sequence.
3.2 A Simple Model of the Web
We abstract the multi-layered architecture of Web applications into three Web components: User, Browser, and Server. This abstraction, although simple, is quite expressive and sufficient to illustrate the reliability calculation. Extending our approach to more complex and detailed models is not difficult. In our model, we assume that several users (clients) may use a browser independently and concurrently to access information from a server. For simplicity, we assume that one browser is associated with a server. Once again, this restriction is only for the sake of simplicity of exposition, and can be generalized.

Figure 1: Class Diagram for User, Browser and Server Entities.
  User: attribute cr : @C; port type @C with events : Set = {Get!,Exit!}
  Browser: attribute inSet : Set[@P,PSet]; port type @P with events : Set = {Get?,Exit?}; port type @G with events : Set = {Permit!,StopPermit!}
  Server: port type @S with events : Set = {Permit?,StopPermit?}

A user
chooses the server of his choice and initiates a request to a server. That is, the user sends a message to the corresponding browser, which then commands the server to allow the connection. When the last user requesting access to a server disconnects, the browser commands the server to close. During this period, the user-browser-server interaction must work without fault. The security (expressed as a safety property) requires that the operation of the system satisfies certain timing constraints, that the server remains open, and that it provides the requested information (not violating time constraints) during every period of transaction.
A high-level class structure diagram of the model in UML-based notation is shown in Figure 1. The User class has one port type, @C, with signature {Get!, Exit!}. The Browser class has two port types, @P with signature {Get?, Exit?}, and @G with signature {Permit!, StopPermit!}. The Server class has one port type, @S, with signature {Permit?, StopPermit?}. The figure shows that a port type, modeled as a class, has an aggregation relationship with the class for which it is intended. An association relationship between compatible port types is shown. A port identifier is declared as a variable of type @C in the User class, and a variable of type Set is introduced in the Browser class. Time constraints and functionality of objects of classes are described in statechart diagrams. A formal specification includes structural and behavioral information.
User Model
The statechart diagram for User is shown in Figure 2(a). The significant states of a User object are idle, toAccess, access, leave. At any instant, a user is in one of these states. In the idle state, a user has not initiated any request. To access the server, the user, in state idle, sends the event Get to the browser it uses and changes state to toAccess. In state toAccess, the attribute cr is set to pid, the identifier of the port where Get occurs.
This transition is the constraining transition for two time constraints, labeled TCvar1 and TCvar2. Within 2 to 4 units of time of outputting the request (specified by TCvar1), the user accesses the server. That is, the user changes his state to access by initiating the internal event In. The state leave is reached when the user has retrieved the information requested, and this happens within 6 units of time (specified by TCvar2) from the instant the user requested access to the server. The user sends the message Exit to the browser and reaches the initial state. The formal specification of the User class is shown in Figure 2(b).
(a) User Statechart
  States: S1: idle, S2: toAccess, S3: access, and leave.
  Legible transition labels: "Get / cr' = pid && TCvar1 = 0 AND TCvar2 = 0"; "Exit [pid = cr && true && ...]"; "Out".
(b) User class Specification
  Legible post-conditions: cr' = pid; true; true; true.
Figure 2: User Class
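The two time constraints of the User model can be illustrated by checking the event times of one period. The sketch below is based only on the prose description: In occurs 2 to 4 time units after Get (TCvar1), and the state leave is reached within 6 time units of Get (TCvar2). It assumes that the internal event Out is the one that reaches leave and that the bounds are inclusive; neither detail is confirmed by the legible parts of Figure 2, and the function name is hypothetical.

```python
def user_period_ok(times):
    """times: dict mapping the event names Get, In, Out to occurrence times."""
    since_get_in = times["In"] - times["Get"]
    since_get_out = times["Out"] - times["Get"]
    # TCvar1: In occurs 2 to 4 time units after Get (bounds assumed inclusive).
    tcvar1_ok = 2 <= since_get_in <= 4
    # TCvar2: leave is reached (assumed via Out) within 6 units of Get,
    # and not before the user has entered the access state.
    tcvar2_ok = since_get_in <= since_get_out <= 6
    return tcvar1_ok and tcvar2_ok


on_time = user_period_ok({"Get": 0, "In": 3, "Out": 5})       # True
too_early = user_period_ok({"Get": 10, "In": 11, "Out": 13})  # False
too_late = user_period_ok({"Get": 0, "In": 3, "Out": 7})      # False
```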
(a) Browser Statechart
  States: C1: idle, C2: activate, C3: monitor, C4: deactivate.
  Transition labels:
    Get / inSet' = insert(pid,inSet) && TCvar1 = 0
    Get [NOT(member(pid,inSet)) && true] / inSet' = insert(pid,inSet)
    Permit [true && true && TCvar1 > 0 AND TCvar1 < 1]
    Exit [member(pid,inSet) && size(inSet) = 1] / inSet' = delete(pid,inSet) && TCvar2 = 0
    Exit [member(pid,inSet) && size(inSet) > 1] / inSet' = delete(pid,inSet)
    StopPermit [true && true && TCvar2 > 0 AND TCvar2 < 1]

(b) Browser class Specification
Class Browser [@P, @Y]
  Events: Permit!@Y, Get?@P, StopPermit!@Y, Exit?@P
  States: *idle, activate, deactivate, monitor
  Attributes: inSet:PSet
  Traits: Set[@P,PSet]
  Attribute-Function: activate -> {inSet}; deactivate -> {inSet}; monitor -> {inSet}; idle -> {};
  Transition-Specifications:
    R1: <activate,monitor>; Permit(true); true => true;
    R2: <activate,activate>; Get(NOT(member(pid,inSet))); true => inSet' = insert(pid,inSet);
    R3: <deactivate,idle>; StopPermit(true); true => true;
    R4: <monitor,deactivate>; Exit(member(pid,inSet)); size(inSet) = 1 => inSet' = delete(pid,inSet);
    R5: <monitor,monitor>; Exit(member(pid,inSet)); size(inSet) > 1 => inSet' = delete(pid,inSet);
    R6: <monitor,monitor>; Get(!(member(pid,inSet))); true => inSet' = insert(pid,inSet);
    R7: <idle,activate>; Get(true); true => inSet' = insert(pid,inSet);
  Time-Constraints:
    TCvar1: R7, Permit, [0, 1], {};
    TCvar2: R4, StopPermit, [0, 1], {};
end
Figure 3: Browser class
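The Browser transition specifications R1 to R7 can be exercised with a small untimed simulation. This is only a sketch: it ignores the time constraints TCvar1 and TCvar2, evaluates the enabling condition on size(inSet) before the deletion takes effect, and uses hypothetical Python names (`Browser`, `fire`, `in_set`) that do not come from the paper.

```python
class Browser:
    """Untimed sketch of Browser transitions R1 to R7 (time constraints ignored)."""

    def __init__(self):
        self.state = "idle"
        self.in_set = set()

    def fire(self, event, pid=None):
        state = self.state
        if event == "Get" and state == "idle":                        # R7
            self.in_set.add(pid)
            self.state = "activate"
        elif (event == "Get" and state in ("activate", "monitor")
              and pid not in self.in_set):                            # R2, R6
            self.in_set.add(pid)
        elif event == "Permit" and state == "activate":               # R1
            self.state = "monitor"
        elif event == "Exit" and state == "monitor" and pid in self.in_set:
            if len(self.in_set) == 1:                                 # R4: last user leaves
                self.state = "deactivate"
            self.in_set.remove(pid)                                   # R4 and R5
        elif event == "StopPermit" and state == "deactivate":         # R3
            self.state = "idle"
        else:
            raise ValueError(f"{event} is not enabled in state {state}")
        return self.state


browser = Browser()
trace = [("Get", "p1"), ("Get", "p2"), ("Permit", None),
         ("Exit", "p1"), ("Exit", "p2"), ("StopPermit", None)]
visited = [browser.fire(event, pid) for event, pid in trace]
```

With two users, the browser stays in monitor after the first Exit and moves to deactivate only when the last user leaves.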
(a) Server Statechart
  States: G1: idle, G2: toOpen, G3: opened, G4: toClose.
  Transition labels:
    Permit / true && TCvar1 = 0
    Allow [true && true && TCvar1 > 0 AND TCvar1 < 1]
    StopPermit / true && TCvar2 = 0
    DisAllow [true && true && TCvar2 > 1 AND TCvar2 < 2]

(b) Server class Specification
Class Server [@S]
  Events: Permit?@S, Allow, DisAllow, StopPermit?@S
  States: *Idle, toClose, toOpen, opened
  Attributes:
  Traits:
  Attribute-Function: Idle -> {}; toOpen -> {}; opened -> {}; toClose -> {};
  Transition-Specifications:
    R1: <Idle,toOpen>; Permit(true); true => true;
    R2: <toOpen,opened>; Allow(true); true => true;
    R3: <toClose,Idle>; DisAllow(true); true => true;
    R4: <opened,toClose>; StopPermit(true); true => true;
  Time-Constraints:
    TCvar1: R1, Allow, [0, 1], {};
    TCvar2: R4, DisAllow, [1, 2], {};
end
Figure 4: Server class
Browser Model
The statechart diagram for Browser is shown in Figure 3(a). A Browser object can be in one of four states: idle, activate, monitor, deactivate. In its initial state idle the object receives the Get message from a user object. In response, it synchronously changes its state to activate and includes the identifier of the port where the message was received in its attribute inSet. In activate state, the browser object may either receive the Get message from another user object or may send the message Permit to the server object associated with it. In the former case, it adds the pid where the message was received to its attribute inSet, and stays in the same state. In the latter case, it changes its state to monitor within 1 time unit from the instant it received the first Get message. In state monitor three possible situations arise:
1. The object receives the Get message from another user object. The response is identical to its response for the Get message in state activate.
2. The object receives the Exit message from a user while other users remain in inSet. In response, it removes the user object from inSet and stays in the same state.
3. The object receives the Exit message from the only user in inSet. In response, it removes the user object from inSet, which becomes empty (signifying that there are no more users), and changes its state to deactivate.
Within 1 and 2 units of time of reaching deactivate state, it sends the message StopPermit to the server and changes its state to idle.
Server Model
The statechart diagram for Server is shown in Figure 4(a). A Server object can be in one of four states: idle, toOpen, opened, toClose. Initially the server is in idle state. Upon receiving the event Permit, it changes its state synchronously with the Browser object and goes to toOpen state. Within one unit of time of receiving
the Permit event, the Server object initiates the internal event Allow and reaches the state opened. It stays in that state until receiving the event StopPermit from the Browser object, upon which it moves to toClose state. Between 1 and 2 units of time after receiving StopPermit, the Server returns to idle state from toClose state. The formal specification is shown in Figure 4(b).
(a) Collaboration Diagram - Simple System
  Objects and ports: user1 : User (port @C1 : @C); Browser1 : Browser (ports @P1 : @P, @G1 : @G); Server1 : Server (port @S1 : @S).

(b) System Specification - Simple System
SCS UserServerBrowser
  Includes:
  Instantiate:
    Server1::Server[@S:1];
    User1::User[@C:1];
    Browser1::Browser[@P:2, @G:1];
  Configure:
    Browser1.@G1:@G <-> Server1.@S1:@S;
    Browser1.@P1:@P <-> User1.@C1:@C;
end

(c) Collaboration Diagram - Complex System
  Objects and ports: user1 to user5 : User (ports @C1 to @C6 : @C); Browser1, Browser2 : Browser (ports @P1 to @P6 : @P, @G1, @G2 : @G); Server1, Server2 : Server (ports @S1, @S2 : @S).

(d) System Specification - Complex System
SCS UserServerBrowser
  Includes:
  Instantiate:
    Server1::Server[@S:1];
    Server2::Server[@S:1];
    User1::User[@C:1];
    User2::User[@C:1];
    User3::User[@C:2];
    User4::User[@C:1];
    User5::User[@C:1];
    Browser1::Browser[@P:3, @G:1];
    Browser2::Browser[@P:3, @G:1];
  Configure:
    Browser1.@G1:@G <-> Server1.@S1:@S;
    Browser2.@G2:@G <-> Server2.@S2:@S;
    Browser1.@P1:@P <-> User1.@C1:@C;
    Browser1.@P2:@P <-> User2.@C2:@C;
    Browser1.@P3:@P <-> User3.@C3:@C;
    Browser2.@P4:@P <-> User3.@C4:@C;
    Browser2.@P5:@P <-> User4.@C5:@C;
    Browser2.@P6:@P <-> User5.@C6:@C;
end
Figure 5: System Configuration Models and Specifications
A system configuration specification defines objects instantiated from the three classes and their interactions. Figure 5(a) is a collaboration diagram in UML style for a linear system with one user object, one browser object, and one server object. The formal specification for this system is shown in Figure 5(b). A more complex, non-linear system consisting of five users, two browsers and two servers is shown in Figure 5(c). In this configuration, user3 is allowed to access both browsers, while the other user objects interact with only one browser. The formal specification for this subsystem configuration is shown in Figure 5(d). In both specifications, there is no included subsystem. In the Instantiate section, objects are created, and in the Configure section compatible ports of objects are linked. The behavior of a system configuration specification can be simulated by applying the operational semantics to the system starting in some initial configuration.
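One way to work with a configuration is to read the Configure section as an undirected graph whose vertices are objects and whose edges are port links. The sketch below encodes the links of Figure 5(d) at the object level (port names are dropped for brevity) and finds the users linked to more than one browser. The helper name `partners` is hypothetical.

```python
# Object-level view of the port links in the Configure section of Figure 5(d).
links = [
    ("Browser1", "Server1"), ("Browser2", "Server2"),
    ("Browser1", "User1"), ("Browser1", "User2"), ("Browser1", "User3"),
    ("Browser2", "User3"), ("Browser2", "User4"), ("Browser2", "User5"),
]


def partners(links):
    """Map each object to the set of objects it is linked with."""
    table = {}
    for a, b in links:
        table.setdefault(a, set()).add(b)
        table.setdefault(b, set()).add(a)
    return table


table = partners(links)
# Users that interact with more than one browser.
shared_users = sorted(name for name, linked in table.items()
                      if name.startswith("User") and len(linked) > 1)
```

Here `shared_users` contains only `User3`, the one user object described as having access to both browsers.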
4 Markov Models
We construct the Markov model of a Web system in three steps. In the first step we construct the Markov models for Web objects. In the second step we construct the Markov models for every pair of interacting objects in the system configuration specification. Finally, in the third step we construct the Markov model for the fully configured system.
4.1 Step 1: Markov Models for Objects
We associate with each Web object in the architecture another finite state machine, called its Markov model. The states in the Markov model of an object are the states of the object in the formal design. A transition between two states in the Markov model is defined only if there exists at least one transition between those states in the statechart of the object. For instance, the states and transitions of the User object Markov model are the same as those in the User statechart except for labels and constraints. In the absence of statistical information gathered by experts on usage and failure, we assume that all the external events have equal probability in each state. For the transition from state $i$ to state $j$ in the Markov model, a fixed probability $p_{ij}$ of going into state $j$ at the next time step is calculated as follows:
1. The initial probabilities for all the transitions in the state machine of the reactive object are calculated. The algorithm for calculating such probabilities for a state is based on the following assumptions: (1) all external events that can happen at the state have the same probability; (2) all internal events that can happen at the state have the same probability; and (3) these two probabilities are in general different.
2. In case there is more than one transition $\{l_1, \ldots, l_n\}$ of the same type (shared/internal) from state $i$ to state $j$, these transitions are substituted by one whose probability is

$$P = 1 - (1 - P\{l_1\}) \cdots (1 - P\{l_n\})$$

3. The probabilities of all the transitions for a state have to sum to 1.
The Markov models and transition probabilities for User, Browser, and Server objects are shown in Figure 6.
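The rule for merging several transitions of the same type between the same pair of states, followed by the requirement that each row sums to 1, can be sketched as follows. The function names and the sample values are illustrative only.

```python
from math import prod


def combine(probabilities):
    """P = 1 - (1 - P{l_1}) ... (1 - P{l_n}) for parallel transitions l_1..l_n."""
    return 1 - prod(1 - p for p in probabilities)


def normalize(row):
    """Rescale the entries of one row so that they sum to 1."""
    total = sum(row)
    return [p / total for p in row]


merged = combine([0.5, 0.5])     # 0.75
row = normalize([1.0, 3.0])      # [0.25, 0.75]
```

A single transition is left unchanged by `combine`, since $1 - (1 - p) = p$.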
4.2 Step 2: Markov Model for Object Pairs
The interaction between two objects is due to shared events. We compute the state machine for an interacting pair of objects and compute the Markov model with transition probabilities from the transitions at each state of the product machine.
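(a) User Class: transitions S1 -> S2 ($p_{12}$), S2 -> S3 ($p_{23}$), S3 -> S4 ($p_{34}$), S4 -> S1 ($p_{41}$).

        S1    S2    S3    S4
  S1    0     1     0     0
  S2    0     0     1     0
  S3    0     0     0     1
  S4    1     0     0     0

(b) Server Class: transitions G1 -> G2 ($p_{12}$), G2 -> G3 ($p_{23}$), G3 -> G4 ($p_{34}$), G4 -> G1 ($p_{41}$).

        G1    G2    G3    G4
  G1    0     1     0     0
  G2    0     0     1     0
  G3    0     0     0     1
  G4    1     0     0     0

(c) Browser Class: transitions C1 -> C2 ($p_{12}$), C2 -> C2 ($p_{22}$), C2 -> C3 ($p_{23}$), C3 -> C3 ($p_{33}$), C3 -> C4 ($p_{34}$), C4 -> C1 ($p_{41}$).

        C1    C2    C3    C4
  C1    0     1     0     0
  C2    0     1/2   1/2   0
  C3    0     0     3/4   1/4
  C4    1     0     0     0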
Figure 6: Markov Models
Algorithm for Transition Matrix for the Synchronous Product Machine
Let $E_1$ and $E_2$ be the sets of internal events in the statecharts P and Q of interacting objects, and let $F$ denote the set of shared events. Let $M_1$ and $M_2$ be the transition matrices for P and Q. Let R be the synchronized product machine of P and Q. Algorithm SPM computes the transition matrix $M$ of R by first computing the synchronous product machine R, and next determining the transition probabilities for transitions in each state of R. If all the transitions in a state are labeled by internal events, or if all of them are labeled by shared events, the probabilities are obtained by normalizing the probabilities in their respective machines. However, if both internal events and shared events occur at the state, the probabilities for the shared events are calculated first, and the remaining measure is distributed to transitions labeled by internal events.
Algorithm SPM

Step 1. $p = 1$; // row sum
Step 2. $x_1 = \{e \mid e$ is a shared event occurring at state $i$ (P) and at state $j$ (Q)$\}$
        $x_2 = \{e \mid e$ is an internal event occurring at state $i$ (P)$\} \cup \{e \mid e$ is an internal event occurring at state $j$ (Q)$\}$
Step 3. If $x_1 \neq \emptyset$ // calculate probabilities for transitions due to shared events
        then $NF_1$ (Normalization Factor) $= 0$; $set_1 = \emptyset$;
  Step 3.1 For each event $e \in x_1$ find the (set of) states $i'$ (P) and $j'$ (Q) such that $i \xrightarrow{e} i'$, $j \xrightarrow{e} j'$
  Step 3.2 $y = y \cup \{i', j'\}$, if $\{i', j'\} \notin y$
  Step 3.3 $NF_1 = NF_1 + M_1[i, i'] \, M_2[j, j']$
  Step 3.4 $M[(i, j), (i', j')] = M_1[i, i'] \, M_2[j, j']$
  Step 3.5 $set_1 = set_1 \cup (i', j')$
Step 4. If $x_2 \neq \emptyset$ // calculate probabilities for transitions due to internal events
        then $NF_2$ (Normalization Factor) $= 0$; $set_2 = \emptyset$;
  Step 4.1 For each event $e \in x_2$, if $e \in M_1$ then
             find the state $i'$ ($M_1$) such that $i \xrightarrow{e} i'$;
             $y = y \cup \{i', j\}$ if $\{i', j\} \notin y$;
             $M[(i, j), (i', j)] = M_1[i, i']$; $set_2 = set_2 \cup (i', j)$; $NF_2 = NF_2 + M[(i, j), (i', j)]$;
           else
             find the state $j'$ ($M_2$) such that $j \xrightarrow{e} j'$;
             $y = y \cup \{i, j'\}$ if $\{i, j'\} \notin y$;
             $M[(i, j), (i, j')] = M_2[j, j']$; $set_2 = set_2 \cup (i, j')$; $NF_2 = NF_2 + M[(i, j), (i, j')]$;
Step 5. If $x_1 = \emptyset \wedge x_2 = \emptyset$, the $(i, j)$ row is deleted from M
Step 6. If $x_1 = \emptyset \wedge x_2 \neq \emptyset$
        For each $(i', j') \in set_2$ do $M[(i, j), (i', j')] = M[(i, j), (i', j')] / NF_2$
Step 7. If $x_1 \neq \emptyset \wedge x_2 = \emptyset$
        For each $(i', j') \in set_1$ do $M[(i, j), (i', j')] = M[(i, j), (i', j')] / NF_1$
Step 8. If $x_1 \neq \emptyset \wedge x_2 \neq \emptyset$
        For each $(i', j') \in set_2$ do $M[(i, j), (i', j')] = (1 - NF_1) \, M[(i, j), (i', j')] / NF_2$
Step 9. Fill in the matrix M with 0 where there are no entries

The Markov model and transition probability matrix for the synchronous product of User and Browser objects are shown in Figure 7.

[Figure 7 state diagram: the cycle (S1,C1) -> (S2,C2) -> (S2,C3) -> (S3,C3) -> (S4,C3) -> (S1,C4) -> (S1,C1), with transition probabilities P12, P23, P34, P45, P56, P61.]

         S1,C1  S2,C2  S2,C3  S3,C3  S4,C3  S1,C4
S1,C1      0      1      0      0      0      0
S2,C2      0      0      1      0      0      0
S2,C3      0      0      0      1      0      0
S3,C3      0      0      0      0      1      0
S4,C3      0      0      0      0      0      1
S1,C4      1      0      0      0      0      0

Figure 7: Markov Model and State Transition Matrix for Synchronous Product of User and Browser
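The per-state part of Algorithm SPM (Steps 2 to 8) can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function name `spm_row`, the dictionary representation of the outgoing transitions, and the event name `think` are assumptions made for the example, and all probabilities are assumed to be strictly positive. Shared and internal contributions are accumulated separately so that the rescaling in Step 8 touches only the internal transitions.

```python
def spm_row(i, j, p_out, q_out, shared):
    """Row of the product transition matrix M for the product state (i, j).

    p_out and q_out map each event enabled at state i of P (resp. state j of Q)
    to a pair (next_state, probability). Returns {(i', j'): probability}, or
    None when no transition is possible and the row is deleted (Step 5).
    """
    # Step 3: a shared event fires in both machines at once.
    shared_part = {}
    for e, (i2, prob_p) in p_out.items():
        if e in shared and e in q_out:
            j2, prob_q = q_out[e]
            shared_part[(i2, j2)] = shared_part.get((i2, j2), 0.0) + prob_p * prob_q

    # Step 4: an internal event moves one machine and leaves the other in place.
    internal_part = {}
    for e, (i2, prob_p) in p_out.items():
        if e not in shared:
            internal_part[(i2, j)] = internal_part.get((i2, j), 0.0) + prob_p
    for e, (j2, prob_q) in q_out.items():
        if e not in shared:
            internal_part[(i, j2)] = internal_part.get((i, j2), 0.0) + prob_q

    nf1 = sum(shared_part.values())    # normalization factor NF_1
    nf2 = sum(internal_part.values())  # normalization factor NF_2

    if not shared_part and not internal_part:   # Step 5: delete the row
        return None
    if not shared_part:                         # Step 6: internal events only
        return {t: p / nf2 for t, p in internal_part.items()}
    if not internal_part:                       # Step 7: shared events only
        return {t: p / nf1 for t, p in shared_part.items()}
    # Step 8: shared probabilities are kept, the remaining measure (1 - NF_1)
    # is distributed over the internal transitions.
    row = dict(shared_part)
    for t, p in internal_part.items():
        row[t] = row.get(t, 0.0) + (1.0 - nf1) * p / nf2
    return row


# Hypothetical example: the user shares "Get" with the browser and also has
# an internal event "think".
example_row = spm_row(
    "S1", "C1",
    {"Get": ("S2", 0.5), "think": ("S1", 0.5)},
    {"Get": ("C2", 1.0)},
    shared={"Get"},
)
```

In this example the shared transition keeps probability 0.5 and the single internal transition receives the remaining 0.5, so the row sums to one.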
4.3 Step 3: Markov Model for a System
A partitioning method given in [O2002] is the basis of our discussion in this section. A system configuration, when partitioned, produces two types of subsystem components: (1) a linear subsystem configuration, as shown in Figure 8(a), and (2) a non-linear subsystem configuration, as shown in Figure 9(a).
4.4 Case 1: Linear System
In a linear system, objects synchronize in the past. If $o_1, \ldots, o_n$ are objects in the linear system and $M_1, \ldots, M_n$ are respectively their transition matrices, then the transition matrix $M$ of the linear system is computed as follows, where $\otimes$ denotes the synchronous product computed by Algorithm SPM:
1. Compute $M = M_1 \otimes M_2$ (Apply Algorithm SPM)
2. For $j = 3$ to $n$ compute $M = M \otimes M_j$ (Apply Algorithm SPM)
The Markov model and the transition matrix for the linear system (Figure 5(a)) are shown in Figure 8(b).
[Figure 8(a), Linear Architecture: User, Browser, and Server objects connected by two "synchronize" links.]
[Figure 8(b), Markov Model: state diagram over the states S1C1G1, S2C2G1, S2C3G2, S2C3G3, S3C3G3, S4C3G3, S1C4G3, S1C1G4 with transition labels P12, P23, P34, P45, P56, P67, P78, P89, together with the 8 x 8 transition matrix, whose entries are all 0 or 1.]
Figure 8: User-Browser-Server Linear Model
4.5 Case 2: Non-linear System
In a non-linear system, as in Figure 9(a), several objects interact with an object. These interactions may be initiated at different times. The synchronous product machine dynamically changes as and when users join or leave the system, and hence the transition probability matrices also change, and should be recomputed. For one scenario, the synchronous product of a non-linear system and its corresponding Markov model are illustrated in Figure 9(b). In the state machine diagram, interpret each $S_i$ as a vector, with number of components equal to the number of users in the system.

For simplicity of discussion we assume that users join the system one at a time, but do not leave the system. Let $0, k_1, k_2, \ldots, k_{n-1}$ be the intervals of successive arrivals of users. That is, for $j > 1$ the $j$-th user joins the system $k_{j-1}$ ($> 0$) time units after the $(j-1)$-st user joined the system. Let $M_1$ be the transition matrix of the Markov model for the linear system composed from $User_1$, $Browser_1$, $Server_1$, and $M_1^k$ be its transition matrix at time step $k$. The transition matrix $M_{(n)}$ for $n \geq 2$ users interacting with one browser and one server is calculated as follows:

$M_{(2)} = M_1^{k_1} \oplus M_1$
$M_{(3)} = [M_{(2)}]^{k_2} \oplus M_1$
$\vdots$
$M_{(j)} = [M_{(j-1)}]^{k_{j-1}} \oplus M_1, \quad 2 \leq j \leq n$

In the above calculation, the symbol $\oplus$ denotes the direct product operator for matrices. The justification for direct product computation is based on the observations:
[Figure 9(a), Non-Linear Architecture: User 1, User 2, User 3, ..., User n interacting with one Browser and one Server.]
[Figure 9(b), Markov Model: state diagram over the states S1C1G1, S2C2G1, S2C3G2, S2C3G3, S3C3G3, S4C3G3, S1C3G3, S1C4G3, S1C1G4 with transition labels P12, P22, P23, P33, P34, P44, P45, P55, P56, P66, P67, P68, P74, P77, P89, P91, together with the 9 x 9 transition matrix.]
Figure 9: User-Browser-Server Non-Linear Model
1. the event Get? from a new User object can come when the current system configuration is in any one of its states in which the Browser object can receive it (i.e., Get? is not time constrained); that is, $M_1$ for the new User-Browser-Server interaction is independent of $[M_{(j-1)}]^{k_{j-1}}$, and
2. when a new user joins the system there will be a three-fold increase in the size of the transition matrix.
For the non-linear system in Figure 5(c), assume that users join the system, one at a time, at times 0, 2, 4, 5. So, $k_1 = 2$, $k_2 = 2$, $k_3 = 1$. The transition matrices of the system at different time points are shown below:

At time 0 or 1 (1 user): $M_1$
At time 2 (2 users): $M_{(2)} = M_1^2 \oplus M_1$
At time 4 (3 users): $M_{(3)} = [M_{(2)}]^2 \oplus M_1$
At time 5 (4 users): $M_{(4)} = [M_{(3)}] \oplus M_1$
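The recurrence $M_{(j)} = [M_{(j-1)}]^{k_{j-1}} \oplus M_1$ can be sketched in Python as below. The paper does not spell out the direct product operator; the sketch assumes, purely for illustration, that it is the Kronecker product of the two matrices, which is one common reading for independently evolving chains. The helper names (`mat_mul`, `mat_pow`, `kron`, `system_matrix`) and the two-state matrix `m1` are hypothetical.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_pow(m, k):
    """k-th power of a square matrix (k >= 0): the k-step transition matrix."""
    n = len(m)
    result = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    for _ in range(k):
        result = mat_mul(result, m)
    return result


def kron(a, b):
    """Kronecker product, ASSUMED here to be the paper's direct product."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def system_matrix(m1, gaps):
    """Apply M_(j) = [M_(j-1)]^(k_(j-1)) (+) M_1 for gaps = [k_1, ..., k_(n-1)]."""
    m = m1
    for k in gaps:
        m = kron(mat_pow(m, k), m1)
    return m


# Hypothetical single-user matrix with two states that alternate.
m1 = [[0, 1], [1, 0]]
m2 = system_matrix(m1, [2])        # second user joins after k_1 = 2 steps
m4 = system_matrix(m1, [2, 2, 1])  # arrivals at times 0, 2, 4, 5
```

Under this assumption every row of the resulting matrix still sums to one, so the result remains a valid transition matrix.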
Let us consider the general case when $r > 1$ users simultaneously join the system, say when there are $j-1$ users in the system. It is easy to see that the transition matrix for the new configuration with $r+j-1$ users is $M_{(r+j-1)} = [M_{(j-1)}]^{k_{j-1}} \oplus M_1^{(r)}$, where $M_1^{(r)}$ is the direct product $M_1 \oplus M_1 \oplus \ldots \oplus M_1$, taken $r$ times.

When $r$ users leave the system, the transition matrix is computed as follows: Let there be $j$ ($\geq 2$) users in the system when $r$ ($1 < r \leq j$) users leave. If $r = j$, then the transition matrix is not defined. If $r < j$, there are $j-r$ users left in the system. If $d$ is the interval of time that elapsed between the latest time when there were $j-r$ users in the system and the current instant, then the new transition probability matrix is $[M_{(j-r)}]^d$. The rationale is that the transition probability matrix $M_{(j-r)}$ for $j-r$ users has evolved over $d$ time steps.

5 Reliability Measures
The reliability prediction for a system configuration composed from $k$ reactive objects is defined as the level of certainty quantified by the source excess entropy:

$$\mathit{Reliability}(\mathit{Subsystem}) = \sum_{i=1}^{k} H_i - H$$

where $H = -\sum_i v_i \sum_j p_{ij} \log p_{ij}$ is the level of uncertainty of the Markov system corresponding to a subsystem; $v_i$ is the steady state distribution vector for the corresponding Markov system and the $p_{ij}$ values are the transition probabilities. $H_i$ is the level of uncertainty in the Markov system corresponding to a reactive object. For a transition matrix $P$ the steady state distribution vector $v$ satisfies the property $vP = v$. The level of uncertainty $H$ is related exponentially to the number of paths that are "statistically typical" of the Markov system. Thus, a higher entropy value implies that more sequences must be generated in order to accurately describe the asymptotic behavior of the Markov system.

We illustrate the calculation of our reliability measure on two configurations of the case study, shown in Figures 8 and 9.

$$\mathit{Reliability}(\mathit{Figure\ 8}) = H_{User} + H_{Server} + H_{Browser} - H_{Figure\ 8}$$

where $H_{User} = H_{Server} = H_{Figure\ 8} = 0$. For calculating $H_{Browser}$ we need the steady state vector of the Browser: $v_{Browser} = \{0.125, 0.25, 0.5, 0.125\}$. Then, $H_{Browser} = 0.25 + 0.15 = 0.4$. Therefore, $\mathit{Reliability}(\mathit{Figure\ 8}) = 0.4$.

We calculate the reliability for Figure 9 at time step $k = 0$:

$$\mathit{Reliability}(\mathit{Figure\ 9}) = H_{User} + H_{Server} + H_{Browser} - H_{Figure\ 9}$$

where $H_{User} = H_{Server} = 0$, $H_{Browser} = 0.4$, and

$$H_{Figure\ 9} = -\left[(v_2 + v_3 + v_4 + v_5 + v_7) \log\frac{1}{2} + v_6 \left(\frac{3}{4}\log\frac{3}{4} + \frac{1}{4}\log\frac{1}{4}\right)\right] > 0.$$
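The entropy-based measure can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' tool: the function names are hypothetical, the chain is assumed to be irreducible so that the steady state vector is unique, the logarithm base (2 by default) is an assumption because the paper does not state it, and the steady state vector is approximated by iterating the "lazy" chain $(P + I)/2$, which has the same fixed point $vP = v$ but also converges for periodic chains such as the cycle of Figure 8.

```python
import math


def stationary(p, iterations=2000):
    """Approximate the steady state vector v with v P = v by fixed-point iteration."""
    n = len(p)
    v = [1.0 / n] * n
    for _ in range(iterations):
        # One step of the lazy chain (P + I) / 2.
        v = [0.5 * v[j] + 0.5 * sum(v[i] * p[i][j] for i in range(n))
             for j in range(n)]
    return v


def entropy(p, base=2):
    """H = -sum_i v_i sum_j p_ij log p_ij; zero entries contribute nothing."""
    v = stationary(p)
    return -sum(v[i] * p[i][j] * math.log(p[i][j], base)
                for i in range(len(p)) for j in range(len(p))
                if p[i][j] > 0)


def reliability(component_matrices, subsystem_matrix, base=2):
    """Reliability(Subsystem) = sum_i H_i - H."""
    return (sum(entropy(m, base) for m in component_matrices)
            - entropy(subsystem_matrix, base))


# Hypothetical matrices: a deterministic 3-cycle and a two-state chain in
# which both moves are equally likely from either state.
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
coin = [[0.5, 0.5], [0.5, 0.5]]
```

A deterministic matrix such as `cycle` has zero entropy, which agrees with the values $H_{User} = H_{Server} = H_{Figure\ 8} = 0$ used in the text.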
Therefore, $\mathit{Reliability}(\mathit{Figure\ 8}) > \mathit{Reliability}(\mathit{Figure\ 9})$. The above measurement data, collected on two different configurations for the case study given above, tests the consistency of the reliability measures.

The reliability prediction for a system is defined as the least reliability measure value among its $m$ subsystems:

$$\mathit{Reliability}(\mathit{System}) = \min_{1 \leq i \leq m} \{\mathit{Reliability}(\mathit{Subsystem}_i)\}$$
We chose the minimum value due to the safety-critical character of real-time reactive systems. A higher value of the reliability measure implies less uncertainty present in the model, and thus a higher level of software reliability.

The Markov model of a configured system changes when the system undergoes change. Calculating the Markov matrix for the reconfigured system allows the systems to be compared based on reliability prediction. If the system configuration $C_{j-1}$ changes to the configuration $C_j$, we need to calculate the reliability of the configuration $C_j$ and compare it with the reliability of the configuration $C_{j-1}$:

$$\mathit{Reliability}(C_{j-1}) = \min_{1 \leq i \leq m} \{\mathit{Reliability}(S_i)\},$$

where $S_i$ is a subsystem of $C_{j-1}$, and

$$\mathit{Reliability}(C_j) = \min_{1 \leq i \leq m} \{\mathit{Reliability}(S'_i)\},$$

where $S'_i$ is a subsystem of $C_j$. If $\mathit{Reliability}(C_j) \geq \mathit{Reliability}(C_{j-1})$, then the uncertainty present in the reconfigured system is no greater than the uncertainty that existed in the current system, and the reliability measurement allows the reconfigured system to be deployed. However, if $\mathit{Reliability}(C_j) < \mathit{Reliability}(C_{j-1})$, then there is more uncertainty present in the reconfiguration. This suggests determining the subsystem(s) of $C_j$ that are responsible for lowering the overall reliability.
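The comparison of two configurations can be sketched as a few lines of Python. The function names, the dictionary representation, and the subsystem names and values in the example are hypothetical and serve only to illustrate the minimum rule and the deployment check described above.

```python
def system_reliability(subsystems):
    """Reliability(System): the least reliability value among the subsystems.

    subsystems maps a subsystem name to its reliability measure.
    """
    if not subsystems:
        raise ValueError("a configuration needs at least one subsystem")
    return min(subsystems.values())


def weakest_subsystems(subsystems):
    """Names of the subsystems that determine the overall reliability."""
    lowest = system_reliability(subsystems)
    return sorted(name for name, value in subsystems.items() if value == lowest)


def may_deploy(current, reconfigured):
    """True when Reliability(C_j) >= Reliability(C_(j-1))."""
    return system_reliability(reconfigured) >= system_reliability(current)


# Hypothetical reliability values for two configurations.
c_prev = {"linear": 0.4, "non_linear": 0.3}
c_next = {"linear": 0.4, "non_linear": 0.2}
```

When `may_deploy` returns `False`, `weakest_subsystems` applied to the reconfigured system points at the subsystem(s) that lower the overall value.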
6 Conclusions and Research Directions
The main result of this paper is a formal approach to calculate the reliability of a time-dependent Web application. The Web model discussed in the paper is simple, yet representative of the different Web layers. The model can be generalized to include more Web components: a browser linked to several servers; users interacting with Agents, who in turn interact with browsers/servers; and servers protected by firewalls, and hence a model of firewall will have to be included as well.
In a practical setting, the number of Web components and their interactions will be large. There are also other factors such as resource constraints, load factor, and communication complexity. From a reliability point of view, we require a good formal model which takes these factors into account. In the formal model proposed in this paper the load factor and communication delays can be brought in as synchronization constraints, resources can be modeled within each class (such as the Set in the Browser class), and timing constraints may be imposed on database transactions.

Calculation of transition probabilities for large evolving configurations involves multiplying fairly large matrices. The density of the transition probability matrix of a system depends on the number of transitions in the product machine, and the matrix might be sparse due to synchronization constraints. The sparsity of the matrix and the availability of very fast powering and multiplication algorithms for matrices may be used to speed up reliability calculation for changing configurations.

One of our goals is to empirically evaluate the reliability model. This is one aspect of our ongoing study in metrics and measurements for real-time reactive systems.
References

[AAM98] V.S. Alagar, R. Achuthan, D. Muthiayen. TROMLAB: A Software Development Environment for Real-Time Reactive Systems. Technical Report (first version 1996, revised 1998), Concordia University, Montreal, Canada.

[IEEE90] IEEE Standard Glossary of Software Engineering Terminology. IEEE Std 610.12-1990.

[KT2001] Chaitanya Kallepalli, Jeff Tian. Measuring and Modeling Usage and Reliability for Statistical Web Testing. IEEE Transactions on Software Engineering, Vol. 27, No. 11, Nov. 2001, pp. 1023–1036.

[O2002] Olga Ormandjieva. Quality Measurement for Real-Time Reactive Systems. Ph.D. thesis, Department of Computer Science, Concordia University, Montreal, Canada, January 2002.