An Operational Semantics for Java
Eva Coscia and Gianna Reggio
Dipartimento di Informatica e Scienze dell'Informazione
Università di Genova
Via Dodecaneso, 35 - Genova 16146 - Italy
{coseva, [email protected]
http://www.disi.unige.it
Version 0.1, September 1, 1998

1 Introduction

We have developed a formal semantics for a significant subset of the Java language as part of a larger project aimed at a development method for some classes of reactive concurrent systems. By development we mean the process that starts from an informal description of a system, given by a customer to an implementor, and transforms this description through a series of activities until code in a programming language (Java, in our case) is produced. In our approach, we want to integrate formal methods with expressive and adequate notations (graphical, for example) and with supporting tools; the result is a semi-formal method, in which not everything is formal, i.e., has mathematical foundations, but a methodical and precise way to perform every phase of the process is described (see [1]). The method should cover every phase of the development, from a requirement description to an encoding in a programming language. Given a detailed design specification, this last phase should be automatic or, at least, quasi-automatic; this implies that we must propose "standard" implementations for the most common communication mechanisms, concurrency control strategies and so on, established in the design phase.

We chose Java as the language for the implementation step of the method, since it is a new and interesting language, developed for real industrial applications, which combines features from object-oriented languages, like C++, with some simplifications. The result is a clear and quite simple language that is also spreading widely in the academic world. There exist some works by other authors on the formal semantics of some parts of the Java language but, for our purpose, we need to give our own semantics to the whole Java language. The reasons are essentially two:

- deeply understanding Java, even if, w.r.t. other languages, it is much easier to use and has a clear informal description (see [4]); the activity of giving a semantics to Java obviously leads us to become familiar with its features;
- having a formal basis for checking the correctness of the final step of the method, which transforms a design specification into Java code.

Furthermore, the semantics must cover a significant part and not only a subset of the Java language. Our method is addressed to a large class of concurrent-reactive systems, so we need to deeply understand the power and the features of the whole language, to know how concurrency, interactivity and distribution can be managed and realized. We also want to develop a (semi)automatic translation from system specifications to Java code, and this code-generation phase needs a precise notion of equality between Java programs (see [2]). Thus, we need a semantics that is simple and intuitive and gives a formal counterpart to the Java specification in [4], by using the simplest mathematical foundations and tools. An operational semantics
seems to be the correct way to obtain a formal but simple and natural description of Java. Our semantics is given in the SOS style of Plotkin ([7]). In our approach, multi-threaded Java programs are modelled by a labelled transition system (lts); given the lts, we can associate with each initial state a transition tree, that is, a labelled tree whose nodes are decorated by states and whose edges are decorated by labels; moreover, between two nodes decorated by $s$ and $s'$, respectively, there exists an edge labelled by $l$ iff $s \xrightarrow{l} s'$. Thus, an edge of the tree describes the execution of an atomic step in the computation, and the branching of the tree models the non-determinism of the program. Inductive rules describe the transitions of the model and give the semantics of the language in a simple and intuitive way. Since it is not possible to simply associate a result with a multi-threaded Java program, it is necessary to find a correct observational semantics and, consequently, an equivalence notion to test whether two programs are equivalent. We are currently working on this equivalence notion, taking advantage of the model defined by the operational semantics: as we will see in the following, a labelled transition system not only gives an operational semantics but also describes what information about a program execution is observable from outside. We have analysed the most important communication means implemented in Java; in sec. 6 we give a semantics of some of these mechanisms, such as communication with an external user by standard input-output, read/write operations on files, use of sockets, etc. All the data exchanged by these operations are observable by an external user and, thus, must be described by our semantics: transitions associated with communication operations are labelled with the communicated data.
Once we have described how the model associated with a Java program gives us information about interactions with the external environment, the equivalence relation is intuitive: roughly speaking, two programs are equal if they allow the same observations. Obviously, we need a semantics that is as simple as possible and that abstracts from some aspects of the language, for example the priorities associated with threads. Whenever it seems reasonable and correct, we avoid describing language aspects that could make the presentation of the semantics more "tricky" and harder to understand. In [2] we aim to prove, once a good and sensible equality notion has been developed, that the model associated by the semantics presented in this work and the model associated by a more "abstract" one are equal if restricted to some subclasses of Java programs. These classes are characterized by interesting properties, i.e., properties required of all the Java programs that are implementations of "real" systems and, thus, properties guaranteed by programs developed by our method.

The Java language has not reached a "stable" point in its development, since many features are continuously added, yielding new releases. We would like to follow a modular approach in order to minimize the changes in the semantics due to new releases of the language. A good way, in our opinion, is to choose an initial subset of Java (called Java1) that we consider interesting and useful enough, and to give it a semantics. This first sub-language is interesting since it contains all the object-oriented Java features. The main restriction is that Java1 programs are just programs without synchronization or I/O operations. When these other features are considered, they result in a richer sub-language (Java2) whose semantics is given by modifying the preceding one as little as possible. Java2 introduces the mechanisms for thread synchronization.
Finally, in Java3, we also consider I/O operations using files and streams. In this work we omit only the Java features that are peculiar to distributed programming, like Remote Method Invocation and, consequently, Object Serialization, and the mechanisms for interactivity through graphical interfaces, like event handling.
A comparison with other semantics
If we compare our semantics for Java with the work of other authors, we can say that we essentially focus on the dynamic semantics of multi-threaded Java. In the case of purely sequential programs and for the static part of the semantics, we just apply the ideas given in [8] and other works, even if here we extend the type system and the functions enriching syntactic terms with type information, as introduced in [8], to cover the whole Java sub-languages we have chosen. For a more detailed comparison of our static semantics with the one in [8], see sec. 4.2.2. Our pragmatic approach to Java semantics is also different from the one adopted in [6, 5], where the use of event structures allows the semantics to get rid of many details about the execution order of certain
operations, while we chose to be as descriptive as possible. While in [6, 5] the semantics of operations on working and main memories is given by axiomatizing the ordering rules given in [4], we explicitly describe such interaction in a detailed way, subsequently proving (see Appendix B) that such rules are satisfied. Besides, in [6, 5], the operational semantics does not give any information about what can be observed from a program execution and, consequently, about what could be a sensible notion of semantic equivalence between Java programs, which is essential from our point of view and which we are currently exploring (see [2]).
2 Static and dynamic semantics for Java

As we said above, we need to develop a Java semantics giving a precise relation describing program equivalence. It is worth stressing that our interest is not in the static semantics of Java, i.e., in a semantics that statically analyses a Java program and calculates information about well-formedness and types of terms. We need a semantics that describes the execution and gives a model for each Java program; we can call it the dynamic semantics of a Java program. However, as we will see in the following, a dynamic semantics for Java cannot be given without some information collected by a previous static analysis. Developing a static semantics from scratch is beyond the scope of our work, so we refer to [8], which is mainly focused on the type semantics of Java. More precisely, we consider an extended sub-language of Java that, w.r.t. the one in [8], also contains local variables, class methods and fields, the super expression and thread synchronization.
An operational, observational, structural approach to dynamic semantics
We need a semantics that is simple and intuitive and gives a formal counterpart to the Java specification in [4], by using the simplest mathematical foundations and tools. The reason is that we want to give a formal description of Java that can be easily understood and discussed by many people, especially by the users of our development method; we also need a formal basis for proving the correctness of the coding step of the method and, perhaps, a formal support for the use of tools within the method.

A Java program is a sequence of classes; each class consists of a declarative part, describing its superclass and its components (fields and methods), and an implementation part, describing the bodies of the methods. A particular class contains a special method (main) that is automatically executed when the program starts. Our interest is in the semantics of the execution of this method, so we can separate the computational part (consisting of the main method and the other method bodies) from the declarative part of the program (consisting of class and variable declarations), which contains static information that cannot be modified during the execution of the program. This declarative part (called the environment) must be enriched with the declarations of all the standard predefined classes of Java, described in [4]. The computational part (called the program) needs some type information that is necessary, for example, to evaluate field access expressions and method calls (see 3.1). So we assume that we have a function $\mathcal{A}$ that applies the type system rules to calculate the information that must be added to each term. Then, the semantics of the computational part takes this annotated code and returns a transition system modelling the program (see fig. 1). While the static semantics is just an extension of the one developed by other authors, the dynamic one is largely different from the one in [8].
This semantics can be described as operational, observational and structural. We have chosen an operational approach, in the style of Plotkin's SOS [7], that models a Java program by a labelled transition system (LTS for short), that is, a triple $(\mathit{States}, \mathit{Labels}, \to)$ where $\mathit{States}$ is the set of the meaningful intermediate states of the computations of the program, $\mathit{Labels}$ is the set of the labels describing the information exchanged with the external environment during the computation, and ${\to} \subseteq \mathit{States} \times \mathit{Labels} \times \mathit{States}$ is the transition relation describing how the computation proceeds: $s \xrightarrow{l} s'$ means that the
program in the state $s$ may perform a transition, exchanging with the external environment the information described by $l$, and ending in the state $s'$.

[Figure 1: Static and Dynamic Semantics. The diagram splits a Java program into a declarative part ($\rho$) and a computational part ($P$); the static semantics yields the annotated program $P_A$, the dynamic semantics maps $P_A$ to an LTS $T$, and observational equivalence relates $T$ to the LTS $T'$ obtained from another annotated program $P'_A$.]

We need a very precise semantics that gives a fully detailed description of a program execution. The transitions of the LTS model not just the execution of single statements, but also the low-level actions that are involved in the execution of complex statements. For example, entering a synchronized block implies the execution of locking operations that do not correspond to Java statements. The choice of the information put on the labels of the system is crucial for the semantics, since it also determines what is observable from outside the program and, consequently, whether two programs are equivalent. Observing the data exchanged in input-output operations performed over files and sockets seems to be a good choice.

From our point of view, the Java support for non-sequential programming is one of the most interesting features of this language, mainly because our method is addressed to the development of concurrent systems. So, we want a semantics that puts particular emphasis on describing the role of separate concurrent components inside a Java program. In a multi-threaded Java program we can identify several computation
flows, each one associated with a "particular" object, that is, an instance of a thread class (by thread class we mean the class Thread and all its subclasses). These objects have a particular run method, whose execution runs concurrently with the other ones, so we can call these objects the active components of the program, in contrast with the passive components (classes and instances of non-thread classes), which do not perform an independent activity and whose state is modified only by the action of the active components. Our semantics reflects this structure. The active components are modelled by labelled transition systems. The activity of the whole program is then defined by means of the concurrent behaviour of the active components and by the status of the passive ones (see fig. 2). In our structural approach, classes have a semantics too, since classes, as well as objects, have states (given by the values of the class fields) and methods. Moreover, an explicit description of the class states is useful to record the identities of the instances created for a class.
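To make the LTS triple introduced in this section more concrete, here is a minimal sketch of a finite labelled transition system in Java. The class names (Transition, Lts, LtsDemo), the string encoding of states and labels, and the use of bounded label traces as the observation are all assumptions made for illustration; in particular, trace comparison is only a stand-in, since the equivalence notion itself is left open in the text.

```java
// Hypothetical sketch: a finite LTS (States, Labels, ->) with string states and labels.
class Transition {
    final String source;
    final String label;
    final String target;

    Transition(String source, String label, String target) {
        this.source = source;
        this.label = label;
        this.target = target;
    }
}

class Lts {
    private final java.util.List<Transition> transitions = new java.util.ArrayList<>();

    // Records that source --label--> target belongs to the transition relation.
    void add(String source, String label, String target) {
        transitions.add(new Transition(source, label, target));
    }

    // Transitions leaving a state; more than one of them models non-determinism.
    java.util.List<Transition> outgoing(String state) {
        java.util.List<Transition> result = new java.util.ArrayList<>();
        for (Transition t : transitions) {
            if (t.source.equals(state)) {
                result.add(t);
            }
        }
        return result;
    }

    // Label sequences of at most `depth` steps readable along the transition tree.
    java.util.Set<java.util.List<String>> traces(String state, int depth) {
        java.util.Set<java.util.List<String>> result = new java.util.HashSet<>();
        result.add(new java.util.ArrayList<String>()); // the empty trace
        if (depth == 0) {
            return result;
        }
        for (Transition t : outgoing(state)) {
            for (java.util.List<String> tail : traces(t.target, depth - 1)) {
                java.util.List<String> trace = new java.util.ArrayList<>();
                trace.add(t.label);
                trace.addAll(tail);
                result.add(trace);
            }
        }
        return result;
    }
}

class LtsDemo {
    // Two active components, one emitting "a" and one emitting "b", in either order.
    static Lts twoPrinters() {
        Lts lts = new Lts();
        lts.add("s0", "out(a)", "s1");
        lts.add("s0", "out(b)", "s2");
        lts.add("s1", "out(b)", "s3");
        lts.add("s2", "out(a)", "s3");
        return lts;
    }
}
```

In this toy system the initial state s0 has two outgoing edges, so its transition tree branches, which is how the interleaving of the two components shows up in the model.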
[Figure 2: A structured semantics. The diagram shows a Java program made of active components, each modelled by an LTS, and passive components.]
3 Peculiar aspects of Java

Even if Java is quite a simple language, which borrows many aspects and features from well-known OO programming languages, it is necessary to pay attention to some peculiarities that heavily impact the semantics we present in the following. First, we briefly describe how expressions that access fields and invoke methods are evaluated. The mechanism is not simple, since Java has instance and class fields and methods with overriding; late binding is allowed only for instance methods. Then, we describe how a main memory shared among the concurrent components of a program can be managed; some low-level actions between this main memory and the local working memories are forced to occur in certain orders.
3.1 Field Access and Method Invocation
In order to justify the choice of the information that is annotated to terms by the static semantics, we briefly describe how a field value or a method code can be accessed in Java and which restrictions we impose to simplify the semantics.
Field access semantics

Since we do not consider interfaces in this sub-language, an object cannot have an instance field f multiply inherited from a superclass and an interface. While in Java we can use expressions like C.f and I.f to access two different incarnations of the instance field f inherited from the superclass C and from the interface I, in Java1 we assume that a subclass can only hide a field f of its superclass, and that the only way to access the hidden field is by super.f. A field access expression may access a field of an object, a reference to which is the value of either an expression or the special keyword super. The result of the expression is computed as follows:

- if the field is a class field, then the result is the class variable in the class that is the type of the expression;
- if the field is an instance field, then the result is the specified instance variable, having the same type as the evaluated expression, in the object referenced by the value of the expression.

It is important to note that the field is determined only by the type of the expression, not by the class of the object referenced at run time. If a field f of a class S is hidden in a subclass T, then any object of class T has two fields named f. If the evaluation of an expression e leads to an object of class T but the type of e is S, then the value of the field access e.f is the value of the f declared in S, which is hidden by the declaration in T. There is no dynamic look-up for field access in Java; late binding is only available for instance methods, not for instance field access.
A field f is hidden in a subclass by the declaration of a field with the same name; it can be accessed in any subclass instance by the field access expression super.f. So, an object can have two fields with the same name. In order to access the right one, the object has two distinct identifiers for the field f: simply f and super.f. Note that super.super.f is not a well-formed Java term.
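The static resolution of field accesses described above can be observed directly in ordinary Java. The following sketch reuses the class names S and T from the text; the demo class and its method names are made up for illustration, and the field initializers are plain Java rather than the restricted Java1 syntax.

```java
class S {
    int f = 1;
}

class T extends S {
    int f = 2; // hides S.f: every T object now carries two fields named f

    int hiddenF() {
        return super.f; // the hidden field, reachable from inside T
    }
}

class FieldHidingDemo {
    static int viaStaticTypeS() {
        S e = new T(); // static type S, run-time class T
        return e.f;    // the field is chosen from the static type: S.f
    }

    static int viaStaticTypeT() {
        T e = new T();
        return e.f;    // static type T: T.f
    }

    static int viaSuper() {
        return new T().hiddenF();
    }
}
```

Here viaStaticTypeS() yields 1 even though the object is a T, while viaStaticTypeT() yields 2 and viaSuper() yields 1, matching the absence of dynamic look-up for fields.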
Method call semantics

Determining the method invoked by a method invocation expression is more complicated than accessing a field, because of the possibility of method overloading and instance method overriding. The main problems are to figure out which class has to be considered for the definition of the method and which is the most specific method, since there may be several method declarations for the same method name. Given the environment, together with a class to start from, a method name and the typed list of values for the method call, the static semantics automatically determines the most specific applicable method. This information is annotated to the terms representing method calls. The remaining problem is to figure out the correct class in which to start searching for the method. Java uses static method binding for class methods, while it allows dynamic method binding for instance methods. The generic form of a method invocation expression is:

MethodInvocation:
    this.MethodName(ArgumentList_opt)
    Primary.Identifier(ArgumentList_opt)
    super.Identifier(ArgumentList_opt)

The method determination involves several steps, schematized as follows:

1. The type of the expression preceding the method name is calculated and used to identify the compile-time class.
2. This compile-time class is searched for method declarations and the most specific one (see footnote 1) is chosen.

These two steps can be performed at compile time, so this information can be annotated to the term by the static semantics.

3. If the most specific method declaration is static (a class method), then overriding is not allowed and the most specific method is the one to be invoked. Otherwise, if the most specific method is an instance method, then the expression is evaluated and its value is the target object. If the method invocation is of the form super.Identifier(ArgumentList_opt), then the target object is the value of this. Then a dynamic method lookup is used:
   (a) Starting from the class of the target object, or from its superclass if the method invocation used the keyword super, the class is searched for a declaration of the method with the given number and types of parameters and the given return type.
   (b) If no method is found, then the search continues in the superclass of the class, until a method declaration is found.

The last two steps are performed at run time, so the right choice of the method can be completed by the dynamic semantics.

It is worth noting that field hiding and method overriding behave differently. Hidden fields are kept in every object of the given class, and each one can be accessed using, for example, a cast expression. On the contrary, overridden methods cannot be called on instances: only the last declaration of an overridden method can be used, and there is no way to refer to the other ones in the superclass hierarchy.

Footnote 1: A method declaration is applicable to a method invocation iff the number of parameters in the declaration equals the number of argument expressions in the method invocation and the type of each argument can be converted to the type of the corresponding parameter. A method declaration is more specific than another one if any invocation handled by the first method can be passed to the other one without any compile-time type error (see also Sect. ??).
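The split between compile-time and run-time steps of method determination can be seen in a small plain-Java example. All class and method names below (Animal, Dog, Printer, DispatchDemo) are invented for illustration and do not come from the text.

```java
class Animal {
    String name() { return "animal"; }                // instance method: may be overridden
    static String kind() { return "Animal.kind"; }    // class method: bound statically
}

class Dog extends Animal {
    @Override
    String name() { return "dog"; }                   // overrides Animal.name
    static String kind() { return "Dog.kind"; }       // hides, does not override, Animal.kind
    String superName() { return super.name(); }       // lookup starts in the superclass
}

class Printer {
    String show(Object o) { return "show(Object)"; }
    String show(String s) { return "show(String)"; }  // more specific than show(Object)
}

class DispatchDemo {
    // Steps 3(a)-(b): the run-time class Dog is searched first.
    static String overriding() {
        Animal a = new Dog();
        return a.name();
    }

    // Class methods are selected from the compile-time class only.
    static String classMethod() {
        return Animal.kind();
    }

    // Steps 1-2: overloading is resolved from the static argument type Object.
    static String overloadStatic() {
        Object o = "text";
        return new Printer().show(o);
    }

    // With a String argument both declarations apply; the most specific is chosen.
    static String overloadSpecific() {
        return new Printer().show("text");
    }

    // The target object is `this`, but the search starts above Dog.
    static String viaSuper() {
        return new Dog().superName();
    }
}
```

Note that overloadStatic() selects show(Object) although the run-time value is a String: the choice among overloaded declarations is fixed before execution, and only the overriding lookup is dynamic.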
[Figure 3: Working and main memory interaction. The diagram shows the thread execution engine, the working memory and the main memory, connected by the actions use, assign, load, store, read, write, lock and unlock.]
3.2 Main memory and working memory
Java supports concurrent programming by means of threads that independently execute code. Such code operates on Java values and objects residing in a shared main memory. In order to make memory access more efficient, [4] says that every thread owns a private working memory in which it keeps working copies of all the variables it uses or assigns. When a thread uses a local copy of a variable, it performs a sequence of actions, some of which, for example when it assigns or uses a value, directly operate on variables. These actions are low-level actions that do not correspond to Java statements. The master copies of variables are kept in main memory. Some rules (in [4]) describe when a thread is permitted to update its own local copy of a variable. Moreover, Java implements monitors that allow one single thread at a time to execute protected parts of code. For that purpose, for each object and class there exists a lock kept in the main memory. Every thread has to compete with the other ones in order to acquire a lock.

Different actions are performed by the thread execution engine, by the working memory and by the main memory (just tee, wm and mm, respectively, in the following) (see fig. 3). A tee can use and assign a variable; the wm can load a value from the mm or store it into the mm, which, conversely, has the capability to send a value to the wm by a read action or to receive a value from the wm by a write action. Every read is tightly coupled to a load. Every write is tightly coupled to a store. Moreover, both tee and mm can perform (in a tightly coupled way) lock and unlock actions. All the rules describing how these actions can be interleaved are reported in Appendix B. Giving a precise model of this part of Java is sometimes hard and "tricky", because of the great number of rules involved.
Moreover, these "low-level" actions are a bit different from the rest of the semantics, as they describe neither the execution of Java statements nor interactions among objects, but some mechanisms related to the behaviour of the Java Virtual Machine. However, in [5] it is proved that it is not possible to abstract from them, since this asynchronous communication is actually observable. In [2], we propose a restriction to Java programs characterized by "good" properties, for which it is really possible to abstract from this interaction. This allows us to give, for this restricted set of Java programs, a simpler and more natural semantics that leaves out the working memories, the main memory and all the related rules. As a positive consequence, the observational semantics and the equivalence notion become simpler and more intuitive.
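The interplay of use, assign, load, store, read and write can be illustrated with a deliberately simplified, single-threaded simulation. This is only a toy model under strong assumptions: the classes MainMemory, WorkingMemory and MemoryDemo are hypothetical, locks are left out, each coupled read/load and store/write pair is collapsed into one method call, and none of the ordering rules of Appendix B is enforced.

```java
// Hypothetical toy model of the main memory (mm) holding the master copies.
class MainMemory {
    private final java.util.Map<String, Integer> master = new java.util.HashMap<>();

    int read(String var) {               // mm sends a value towards a working memory
        return master.getOrDefault(var, 0);
    }

    void write(String var, int value) {  // mm receives a value from a working memory
        master.put(var, value);
    }
}

// Hypothetical toy model of one thread's working memory (wm).
class WorkingMemory {
    private final java.util.Map<String, Integer> copies = new java.util.HashMap<>();
    private final MainMemory mm;

    WorkingMemory(MainMemory mm) {
        this.mm = mm;
    }

    void load(String var) {              // coupled with a read performed by mm
        copies.put(var, mm.read(var));
    }

    void store(String var) {             // coupled with a write performed by mm
        mm.write(var, copies.get(var));
    }

    int use(String var) {                // thread execution engine reads the working copy
        return copies.get(var);
    }

    void assign(String var, int value) { // thread execution engine updates the working copy
        copies.put(var, value);
    }
}

class MemoryDemo {
    // Returns what the second thread sees: before the store, after the store
    // but before reloading, and after reloading.
    static int[] staleRead() {
        MainMemory mm = new MainMemory();
        mm.write("x", 0);
        WorkingMemory wm1 = new WorkingMemory(mm);
        WorkingMemory wm2 = new WorkingMemory(mm);
        wm1.load("x");
        wm2.load("x");
        wm1.assign("x", 5);
        int before = wm2.use("x");
        wm1.store("x");
        int stale = wm2.use("x");
        wm2.load("x");
        int fresh = wm2.use("x");
        return new int[] { before, stale, fresh };
    }
}
```

The second working memory keeps seeing the old value until it performs a new load, which gives an intuition for why this asynchronous communication can be observable and why abstracting from it requires restrictions on programs.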
4 Java1

Java1 is the subset of Java consisting of the following constructs: class declarations and inheritance, class fields and methods, instance fields and methods, primitive types, object creation by new, multi-threaded programming and thread management, plus the usual statements (assignment, conditional, return and concatenation). Method synchronization, arrays and interfaces are the main features that are
not included in Java1.
4.1 Syntax
For simplicity, the complete syntax of Java1, as given in [4], is reported in Appendix 9.1.2. We found that some simplifications do not change the expressivity of Java but help to make the semantics easier. First of all, since we consider the "declarative" part to be completely disjoint from the computational one, the assignment expressions used for the initialization of variables and fields are not allowed. Furthermore, class field access is allowed only by using class names and not instance references. Then we assume that the this expression is always explicitly indicated when accessing instance fields or methods (so we do not have an assignment of the form f = e, but this.f = e). Finally, the execution of a method call is rendered by substituting the method invocation statement with the body of the method. When the method returns, the computation goes on by executing the statement following the method invocation. If the return instruction within a method body were not the last one, it would be necessary to skip all the code between the return and the end of the body. So, for simplicity, we assume that the return statement is always the last statement in a method body.
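As an illustration of these simplifications, the sketch below shows the same small class twice: once in ordinary Java, and once rewritten so that it respects the restrictions listed above. The class names CounterPlain and CounterJava1 are invented, and the comparison and arithmetic expressions are used on the assumption that they are available in the sub-language, which the text does not state.

```java
// Ordinary Java style, outside the restrictions assumed for Java1.
class CounterPlain {
    static int bumps = 0;       // initializer inside a declaration
    int value = 0;              // initializer inside a declaration

    int bump(int by) {
        if (by < 0) {
            return value;       // a return that is not the last statement
        }
        value = value + by;     // implicit this
        bumps = bumps + 1;      // class field accessed without the class name
        return value;
    }
}

// The same behaviour written under the restrictions described in the text.
class CounterJava1 {
    static int bumps;           // no initializer: the default value 0 is used
    int value;                  // no initializer

    int bump(int by) {
        if (by < 0) {
            // nothing to do in this branch
        } else {
            this.value = this.value + by;                 // explicit this
            CounterJava1.bumps = CounterJava1.bumps + 1;  // class field via class name
        }
        return this.value;      // the single return is the last statement
    }
}
```

Both classes return the same values for the same calls, which suggests that the restrictions cost some convenience but, at least in cases like this one, no expressive power.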
4.2 Static Semantics
The compile-time (static) semantics is kept separate from the dynamic one. The first one looks for errors that can be detected by a sound type system before executing the program. In this phase it is also possible to enrich the program, by annotating some terms with the type information that will become necessary later, at run time, when the dynamic semantics evaluates, for example, field access or method invocation expressions.
4.2.1 The type system and the annotation function
In Java1 it is possible to keep separate the type information (deduced by the declarations of variables, classes and methods) from the code. The environment contains type information, while the program contains the code.
Environment

The environment is a sequence of class declarations. A class declaration contains the definition of the inheritance relation for the class plus the field and method declarations. Method declarations contain access modifiers and parameter types. In the following, $\mathit{ENV}$ is the set of all possible environments as defined in the appendix; it is ranged over by metavariable $\rho$.

Programs

The program is a sequence of class bodies, which, in turn, are sequences of method bodies. Each method body contains the method identifier, the types of the arguments and of the result (which together uniquely identify a method), plus the block of method code with its local environment. $\mathit{PROG}$ is the set of Java1 programs as defined in the appendix; it is ranged over by metavariable $p$.

The static semantics, which assigns types to terms, needs only the information in the environment. In a similar way, we would like to use only the program for giving the dynamic semantics, discarding the environment. Unfortunately, some information calculated by the static semantics is used by the dynamic one too. The solution (see fig. 1) is to give the dynamic semantics on a "transformed" program, in which the original code is replaced by the equivalent annotated one, obtained by adding such type information to some terms. The enrichment of terms with type information is performed by the function $\mathcal{A} : \mathit{Java1} \times \mathit{ENV} \to \mathit{Java1}_A$. A type can be associated with each $\mathit{Java1}_A$ term by a type system described in terms of an inference system, which proves assertions of the form $\rho \vdash t : T$, with the meaning that the term $t$ has type $T$ in the environment $\rho$; at the same time, these rules define the function $\mathcal{A}$.
4.2.2 Type Rules
Type rules are given in table 2.
Auxiliary definitions

We define the following syntactic and semantic sets, together with the metavariables that range over them in the type system and in the following:

- $\mathit{CL\_ids}$ is the set of class names; ranged over by metavariable $C$;
- $\mathit{TYPE} = \mathit{CL\_ids} \cup \{\mathrm{int}, \mathrm{bool}, \mathrm{char}\}$ is the set of types as defined in the appendix; ranged over by metavariable $T$;
- $\mathit{FIELD}$ is the set of field names; ranged over by metavariable $f$;
- $\mathit{METHODS}$ is the set of method names; ranged over by metavariable $m$;
- $\mathit{IDENTIFIER}$ is the set of identifiers; ranged over by metavariable $id$;
- $\mathit{STATEMENT}$ is the set of Java1 statements as defined in the appendix; ranged over by metavariable $stm$; skip; stm and stm; skip are simply denoted by stm, and {skip} is abbreviated to skip;
- $\mathit{INTEGER}$ is the set of integer values; ranged over by metavariable $i$;
- $\mathit{BOOL} = \{\mathrm{true}, \mathrm{false}\}$ is the set of boolean values; ranged over by metavariable $b$;
- $\mathit{CHAR}$ is the set of char values; ranged over by metavariable $c$;
- $\mathit{PRIMITIVE} = \mathit{INTEGER} \cup \mathit{BOOL} \cup \mathit{CHAR}$ is the set of primitive type values; ranged over by metavariable $pv$.

The main difference w.r.t. [8] is that we consider local variables too. Thus we also need to consider the local environment when calculating the type of terms within a method block. So, the function $\mathcal{A}$, when calculated on a method block, uses an environment of the form $\rho_l; \rho$, where $\rho_l$ is the local environment of the block. The name of a local variable can appear both in $\rho_l$ and in $\rho$, so we define a table lookup function $\rho(x)$, which returns the type of $x$ in $\rho$. If $x$ is declared twice, then the type of the first declaration is returned. In the following we use metavariables $mod_i, mod'_i \in \{st, nst\}$ to indicate field and method modifiers, and $\mathit{classdecs}$ as an abbreviation for a list of field and method declarations $\{ mod_1\, f_1 : T_1, \ldots, mod_n\, f_n : T_n,\; mod'_1\, m_1 : MT_1, \ldots, mod'_k\, m_k : MT_k \}$.
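Given an environment $\rho$, with a unique declaration for every identifier, $\rho(x)$ is defined as follows:

- $\rho(x) = \mathit{undef}$ iff $\rho = x_1 : T_1 \ldots x_n : T_n$ and $\forall i = 1, \ldots, n,\; x \neq x_i$;
- $\rho(x) = T$ iff $\rho = \rho', x : T, \rho''$ and $\rho'(x) = \mathit{undef}$;
- $\rho(C) = C\ \mathrm{ext}\ C'\ \mathit{classdecs}$ iff $\rho = \rho', C\ \mathrm{ext}\ C'\ \mathit{classdecs}, \rho''$.

We also give some other definitions for auxiliary functions used to extract information from the environment:

- for a method type $MT : T_1 \times \ldots \times T_n \to T$, the argument types are defined by $\mathit{Args}(MT) = T_1 \times \ldots \times T_n$, and the result type is $\mathit{Res}(MT) = T$;
- given an environment $\rho$, a class identifier $C$ and a field name $f$, $\mathit{FDec} : \mathit{ENV} \times \mathit{CL\_ids} \times \mathit{FIELD} \to \mathit{CL\_ids}$ is such that $\mathit{FDec}(\rho, C, f)$ returns the nearest superclass of $C$ (possibly $C$ itself) which contains a declaration of $f$, together with the type of $f$ and its access modifier ($st$ or $nst$). If $\rho(C) = C\ \mathrm{ext}\ C' \{ mod_1\, f_1 : T_1, \ldots, mod_n\, f_n : T_n,\; mod'_1\, m_1 : MT_1, \ldots, mod'_k\, m_k : MT_k \}$, then
  - $\mathit{FDec}(\rho, C, f) = (C, T_j, mod_j)$ iff $f = f_j$;
  - $\mathit{FDec}(\rho, C, f) = \mathit{FDec}(\rho, C', f)$ iff $f \neq f_j$, $\forall j \in \{1, \ldots, n\}$.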
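The recursive definition of FDec above can be read as a walk up the superclass chain. The following Java sketch is one possible rendering; the representation of the environment as a map of class declarations, the class names (FieldDecl, ClassDecl, Environment, FDecDemo) and the use of null for the undefined case are all assumptions made for illustration.

```java
// Hypothetical data model for field declarations: modifier is "st" or "nst".
class FieldDecl {
    final String name;
    final String type;
    final String modifier;

    FieldDecl(String name, String type, String modifier) {
        this.name = name;
        this.type = type;
        this.modifier = modifier;
    }
}

// Hypothetical class declaration "C ext C' { fields }"; methods are omitted here.
class ClassDecl {
    final String name;
    final String superName; // null only for Object
    final java.util.List<FieldDecl> fields;

    ClassDecl(String name, String superName, FieldDecl... fields) {
        this.name = name;
        this.superName = superName;
        this.fields = java.util.Arrays.asList(fields);
    }
}

class Environment {
    private final java.util.Map<String, ClassDecl> classes = new java.util.HashMap<>();

    void declare(ClassDecl c) {
        classes.put(c.name, c);
    }

    // Returns {declaring class, field type, modifier}, or null when FDec is undefined.
    String[] fdec(String className, String field) {
        ClassDecl c = classes.get(className);
        if (c == null) {
            return null; // unknown class
        }
        for (FieldDecl d : c.fields) {
            if (d.name.equals(field)) {
                return new String[] { c.name, d.type, d.modifier }; // case f = f_j
            }
        }
        // case f != f_j for every declared field: continue in the superclass
        return c.superName == null ? null : fdec(c.superName, field);
    }
}

class FDecDemo {
    static Environment sample() {
        Environment rho = new Environment();
        rho.declare(new ClassDecl("Object", null));
        rho.declare(new ClassDecl("S", "Object", new FieldDecl("f", "int", "nst")));
        rho.declare(new ClassDecl("T", "S",
                new FieldDecl("f", "bool", "nst"), new FieldDecl("g", "int", "st")));
        rho.declare(new ClassDecl("U", "T"));
        return rho;
    }
}
```

With this sample environment, looking up f from U stops at T, where f is redeclared, so the hidden declaration in S is found only when the search starts from S itself.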
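\[ \rho \vdash C \sqsubseteq C, \quad \rho \vdash C \sqsubseteq C' \qquad \text{if } \rho = \rho', C\ \mathrm{ext}\ C' \ldots, \rho'' \]

\[ \frac{\rho \vdash C \sqsubseteq C' \qquad \rho \vdash C' \sqsubseteq C''}{\rho \vdash C \sqsubseteq C''} \qquad\qquad \rho \vdash \mathrm{Object} \sqsubseteq \mathrm{Object} \]

\[ \frac{\rho \vdash C \sqsubseteq C}{\rho \vdash C \sqsubseteq \mathrm{Object}} \qquad \frac{\rho \vdash C \sqsubseteq C'}{\rho \vdash C\ \mathrm{wdn}\ C'} \qquad \frac{\rho \vdash C\ \mathrm{wdn}\ C}{\rho \vdash \mathrm{nil}\ \mathrm{wdn}\ C} \]

Table 1: Subclass and widening relations

- $\mathit{MDecs} : \mathit{CL\_ids} \times \mathit{METHODS} \to \mathit{Set}(\mathit{CL\_ids} \times \mathit{CL\_ids})$ returns all the classes and the signatures of the declarations of a method. With $C'$ the superclass of $C$ and $m\_name_j : MT_j$, $j \in \{1, \ldots, l\}$, the methods declared in $C$:
\[ \mathit{MDecs}(\rho, C, m\_name) = \{ (C, MT_j) \mid m\_name = m\_name_j \} \cup \{ (C'', MT'') \mid (C'', MT'') \in \mathit{MDecs}(\rho, C', m\_name) \text{ and } \forall j \in \{1, \ldots, l\},\; m\_name = m\_name_j \Rightarrow \mathit{Args}(MT_j) \neq \mathit{Args}(MT'') \} \]

Given an environment $\rho$, variable types $T$ and $T_i$, $i \in \{1, \ldots, n+1\}$, and an identifier $m\_name$, the most special declaration is defined as follows: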
- $\mathit{ApplMeths}(\rho, m\_name, T, T_1 \times \ldots \times T_n) = \{ (T', MT') \mid (T', MT') \in \mathit{MDecs}(\rho, T, m\_name) \text{ and } MT' = T'_1 \times \ldots \times T'_n \to T'_{n+1} \text{ and } \rho \vdash T_i\ \mathrm{wdn}\ T'_i \text{ for } i \in \{1, \ldots, n\} \}$;
- $(T, T_1 \times \ldots \times T_n \to T_{n+1})$ is more special than $(T', T'_1 \times \ldots \times T'_n \to T'_{n+1})$ iff $\rho \vdash T\ \mathrm{wdn}\ T'$ and $\forall i \in \{1, \ldots, n\},\; \rho \vdash T_i\ \mathrm{wdn}\ T'_i$;
- $\mathit{MostSpec}(\rho, m\_name, T, T_1 \times \ldots \times T_n) = \{ (T', MT') \in \mathit{ApplMeths}(\rho, m\_name, T, T_1 \times \ldots \times T_n) \mid \text{if } (T'', MT'') \in \mathit{ApplMeths}(\rho, m\_name, T, T_1 \times \ldots \times T_n) \text{ and } (T'', MT'') \text{ is more special than } (T', MT') \text{ then } T'' = T' \text{ and } MT'' = MT' \}$.

A value of type $T$ can be assigned to a variable of type $T'$ if $T$ can be widened to type $T'$ (written $T\ \mathrm{wdn}\ T'$, where wdn is the widening relation). The definitions of the subclass relation $\sqsubseteq$ and of the widening relation among types are in table 1. The well-formedness relation on environments requires that a class, as well as a variable, is declared just once, that every class has a superclass, that within a class declaration two methods cannot have the same name and the same arguments, and that the widening and subclass relations are acyclic. The well-formedness relation is not formally given (see [8]).
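The definitions of widening, applicability and MostSpec can be prototyped over class names as follows. This is a sketch under several assumptions: types are plain strings, a signature records only its declaring class and parameter types, the candidate declarations are passed in directly instead of being computed by MDecs, and the names Sig, TypeTable and MostSpecDemo are invented.

```java
// Hypothetical signature: declaring class plus parameter types (result type omitted).
class Sig {
    final String declaringClass;
    final java.util.List<String> params;

    Sig(String declaringClass, String... params) {
        this.declaringClass = declaringClass;
        this.params = java.util.Arrays.asList(params);
    }

    boolean same(Sig other) {
        return declaringClass.equals(other.declaringClass) && params.equals(other.params);
    }
}

class TypeTable {
    // Maps each declared class to its direct superclass (null for Object).
    private final java.util.Map<String, String> superOf = new java.util.HashMap<>();

    void declare(String cls, String superCls) {
        superOf.put(cls, superCls);
    }

    boolean isClass(String t) {
        return superOf.containsKey(t);
    }

    // Subclass relation: reflexive and transitive closure of "ext".
    boolean subclass(String c, String d) {
        for (String k = c; k != null; k = superOf.get(k)) {
            if (k.equals(d)) return true;
        }
        return false;
    }

    // Widening: a type to itself, nil to any class, a class to its superclasses.
    boolean widens(String from, String to) {
        if (from.equals(to)) return true;
        if (from.equals("nil")) return isClass(to);
        return isClass(from) && isClass(to) && subclass(from, to);
    }

    // A declaration is applicable if every argument type widens to its parameter type.
    boolean applicable(Sig s, java.util.List<String> args) {
        if (s.params.size() != args.size()) return false;
        for (int i = 0; i < args.size(); i++) {
            if (!widens(args.get(i), s.params.get(i))) return false;
        }
        return true;
    }

    // a is more special than b: class and parameters of a widen to those of b.
    boolean moreSpecial(Sig a, Sig b) {
        return widens(a.declaringClass, b.declaringClass) && applicable(b, a.params);
    }

    // Applicable declarations with no strictly more special applicable competitor.
    java.util.List<Sig> mostSpec(java.util.List<Sig> decls, java.util.List<String> args) {
        java.util.List<Sig> candidates = new java.util.ArrayList<>();
        for (Sig s : decls) {
            if (applicable(s, args)) candidates.add(s);
        }
        java.util.List<Sig> result = new java.util.ArrayList<>();
        for (Sig s : candidates) {
            boolean minimal = true;
            for (Sig t : candidates) {
                if (moreSpecial(t, s) && !t.same(s)) minimal = false;
            }
            if (minimal) result.add(s);
        }
        return result;
    }
}

class MostSpecDemo {
    static TypeTable table() {
        TypeTable t = new TypeTable();
        t.declare("Object", null);
        t.declare("A", "Object");
        t.declare("B", "A");
        return t;
    }
}
```

Because MostSpec is a set, it may contain more than one element: with declarations m(A, B) and m(B, A) and arguments of types (B, B), neither declaration is more special than the other, so both survive.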
Comments to type rules

- Base: no information is annotated on primitive values and variables.
- BaseClass: the type of a class identifier is the class name; no additional information is annotated to the term.
- Ass: an expression of type $T$ can be assigned to a variable of type $T'$ if $T$ can be widened to $T'$; the assignment expression has void type.
- Ret1, Ret2: a return statement has void type if it does not return any expression, otherwise it has the type of the returned expression.
- Seq, Cond: a statement sequence has the type of its last statement; a conditional statement has void type, since both statement lists have void type.
- New: an object creation expression has the type of the class of the new object; the information needed to create a new instance of a given class is recorded in the class itself, so the annotation function does not add information to a creation term. $\Gamma \vdash C \sqsubseteq C$ indicates that $C$ is defined as a class in the environment.
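Base
$$\Gamma \vdash \mathsf{null} : \mathsf{nil}; \quad \Gamma \vdash \mathsf{true} : \mathsf{bool}; \quad \Gamma \vdash \mathsf{false} : \mathsf{bool}; \quad \Gamma \vdash i : \mathsf{integer}; \quad \Gamma \vdash c : \mathsf{char}; \quad \Gamma \vdash id : \Gamma(id)$$
$$\mathcal{A}\{(z, \Gamma)\} = z \qquad \text{if } z \text{ is } i \text{ or } c \text{ or } id \text{ or } \mathsf{false} \text{ or } \mathsf{true} \text{ or } \mathsf{null}$$

BaseClass
$$\Gamma \vdash C : C \qquad \mathcal{A}\{(C, \Gamma)\} = C \qquad \text{if } \Gamma(C) = C\ \mathsf{ext}\ C' \ldots$$

Ass
$$\frac{\Gamma \vdash t : T \qquad \Gamma \vdash e : T'}{\Gamma \vdash t = e : \mathsf{void}} \qquad \mathcal{A}\{(t = e, \Gamma)\} = (\mathcal{A}\{(t, \Gamma)\} = \mathcal{A}\{(e, \Gamma)\}) \qquad \text{if } \Gamma \vdash T'\ \mathsf{wdn}\ T$$

Ret1
$$\Gamma \vdash \mathsf{return} : \mathsf{void} \qquad \mathcal{A}\{(\mathsf{return}, \Gamma)\} = \mathsf{return}$$

Ret2
$$\frac{\Gamma \vdash e : T}{\Gamma \vdash \mathsf{return}\ e : T} \qquad \mathcal{A}\{(\mathsf{return}\ e, \Gamma)\} = \mathsf{return}\ \mathcal{A}\{(e, \Gamma)\}$$

Seq
$$\frac{\Gamma \vdash stm : \mathsf{void} \qquad \Gamma \vdash stm' : T}{\Gamma \vdash stm; stm' : T} \qquad \mathcal{A}\{(stm; stm', \Gamma)\} = \mathcal{A}\{(stm, \Gamma)\}; \mathcal{A}\{(stm', \Gamma)\}$$

New
$$\Gamma \vdash \mathsf{new}\ C : C \qquad \mathcal{A}\{(\mathsf{new}\ C, \Gamma)\} = \mathsf{new}\ C \qquad \text{if } \Gamma \vdash C \sqsubseteq C$$

Cond
$$\frac{\Gamma \vdash e : \mathsf{bool} \qquad \Gamma \vdash stm : \mathsf{void} \qquad \Gamma \vdash stm' : \mathsf{void}}{\Gamma \vdash \mathsf{if}\ e\ stm\ \mathsf{else}\ stm' : \mathsf{void}}$$
$$\mathcal{A}\{(\mathsf{if}\ e\ stm\ \mathsf{else}\ stm', \Gamma)\} = \mathsf{if}\ \mathcal{A}\{(e, \Gamma)\}\ \mathcal{A}\{(stm, \Gamma)\}\ \mathsf{else}\ \mathcal{A}\{(stm', \Gamma)\}$$

InstanceFieldAcc
$$\frac{\Gamma \vdash t : T}{\Gamma \vdash t.f : T'} \qquad \mathcal{A}\{(t.f, \Gamma)\} = \mathcal{A}\{(t, \Gamma)\}.[C, \mathsf{nst}]f \qquad \text{if } \mathit{FDec}(\Gamma, T, f) = (C, T', \mathsf{nst})$$

ClassFieldAcc
$$\frac{\Gamma \vdash C : C}{\Gamma \vdash C.f : T} \qquad \mathcal{A}\{(C.f, \Gamma)\} = C.[C', \mathsf{st}]f \qquad \text{if } \mathit{FDec}(\Gamma, C, f) = (C', T, \mathsf{st})$$

MethCall
$$\frac{\Gamma \vdash e_i : T_i \quad i = 1, \ldots, n}{\Gamma \vdash e_1.m\_name(e_2, \ldots, e_n) : \mathit{Res}(MT)}$$
$$\mathcal{A}\{(e_1.m\_name(e_2, \ldots, e_n), \Gamma)\} = \mathcal{A}\{(e_1, \Gamma)\}.[T, \mathit{Args}(MT), mod]m\_name(\mathcal{A}\{(e_2, \Gamma)\}, \ldots, \mathcal{A}\{(e_n, \Gamma)\})$$
if $n \geq 1$ and $\mathit{MostSpec}(\Gamma, m\_name, T_1, T_2 \times \cdots \times T_n) = \{(T, MT, mod)\}$.

MethBody
$$\frac{\Gamma \vdash \mathsf{this} : C \qquad \Gamma_l, \Gamma, z_1 : T_1 \ldots z_n : T_n \vdash stm[z_1/x_1 \ldots z_n/x_n] : T'}{\Gamma \vdash mBody : T_1 \times \cdots \times T_n \to T}$$
$$\mathcal{A}\{(mBody, \Gamma)\} = m\_name(T\ x_1 : T_1 \ldots x_n : T_n)\ \mathcal{A}\{(\{[\Gamma_l]\ stm\}, \Gamma_l, \Gamma)\}$$
if $\Gamma \vdash T'\ \mathsf{wdn}\ T$, $mBody = m\_name(T\ x_1 : T_1, \ldots, x_n : T_n)\{[\Gamma_l]\ stm\}$, $x_i \neq \mathsf{this}$ for $i \in \{1, \ldots, n\}$, and $z_1, \ldots, z_n$ are new variables in $\Gamma_l, \Gamma$.

ClassBody
$$\frac{\Gamma, \mathsf{this} : C, \mathsf{super} : S \vdash mBody_i : MT_i \quad i = 1, \ldots, m}{\Gamma \vdash cBody : \Gamma(C)}$$
$$\mathcal{A}\{(cBody, \Gamma)\} = C\ \mathsf{ext}\ C' \{\mathcal{A}\{(mBody_1, \Gamma, \mathsf{this} : C, \mathsf{super} : S)\} \ldots \mathcal{A}\{(mBody_m, \Gamma, \mathsf{this} : C, \mathsf{super} : S)\}\}$$
if $\Gamma \vdash \diamond$, $n \geq 0$, $k \geq 0$, $m \geq 0$, $\Gamma(C) = C\ \mathsf{ext}\ C' \ldots$, $cBody = C\ \mathsf{ext}\ C' \{mBody_1, \ldots, mBody_m\}$, $mBody_i = m_i\ \mathsf{is}\ implem_i$, $\Gamma(\mathsf{this}) = \mathit{undef}$, $\Gamma(\mathsf{super}) = \mathit{undef}$, and $\mathit{Super}(C, \Gamma) = S$.

ProgramBody
$$\frac{\Gamma \vdash cBody_i : \Gamma(C_i) \quad i = 1, \ldots, n}{\Gamma \vdash p\ \diamond} \qquad \mathcal{A}\{(p, \Gamma)\} = \mathcal{A}\{(cBody_1, \Gamma)\} \ldots \mathcal{A}\{(cBody_n, \Gamma)\}$$
if $n \geq 0$, $C_1, \ldots, C_n$ are all the classes defined in $p$, $p = cBody_1, \ldots, cBody_n$, and $cBody_i = C_i\ \mathsf{ext} \ldots$

Table 2: Type System for Java1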
- InstanceFieldAcc: a term $t$ must have a class type $T$ that defines, or inherits from a superclass, the field $f$ with type $T'$ (the type of the field access expression); a field access through an expression is correct only if the field is declared to be non-static (nst). $\mathit{FDec}(\Gamma, C, f)$ returns the class from which $C$ (the class of the instance) inherits field $f$, and the type of the field; the annotation function adds the information about the class in which the field is declared and about the access modifier.
- ClassFieldAcc: similar to the previous case, but a field access through a class identifier is correct only if the field in the class (or inherited by the class) is declared to be static (st). $\mathit{FDec}(\Gamma, C, f)$ returns the class from which $C$ inherits field $f$, and the type of the field; the annotation function adds the information about the class in which the field is declared and about the access modifier. Recall that a class field can be accessed only through class names, not through instance identifiers, so a class field access expression has the form C.f.
- MethCall: a method can be applied to its parameters if the types of the actual parameters can be widened to the types of the corresponding formal ones; the annotation function adds to a method call term a triple $(C, \mathit{Args}(MT), mod)$, where $C$ is the class, $\mathit{Args}(MT)$ are the argument types and $mod$ is the access modifier of the most specific method.
- MethBody: the annotation function adds the local environment of the method to the program environment $\Gamma$.
- ClassBody: a class is well-formed if it gives the implementation of all the methods in its declaration. The annotation function adds the type of super to the local environment; the types of this and super can be calculated by applying the Base rule.
- Program: a program is well-formed if it contains a well-typed class body for each class in its declaration.

We think that it is possible to give an operational semantics to Java1 that does not need this static annotating semantics. Indeed, it is possible to give rules that "calculate" the added type information "on the fly", by means of the information stored in the global environment. In that case it would not be necessary to use two separate syntaxes for simple and annotated terms, although, as a drawback, the environment could not be discarded in the dynamic semantics. This approach "mixes" the rules defining the type system with the rules of the operational semantics: the premises and side conditions of one rule of the "mixed" semantics are the union of the premises and side conditions of a single operational rule and of one or more type system rules. However, it is easy to see that this approach makes it more difficult to prove the correctness of the semantic rules, since it is less modular, whereas following the approach in [8] we can check separately the correctness of the association of types to terms and the correctness of the operational semantics.

The separate approach has other positive aspects. The first is that the annotations do not heavily transform the program, since it is not necessary to calculate type information for every term, but only the information used for field accesses and method invocations. The second is that the evaluation of the annotated program no longer refers to declarations, so the environment is used only in the static semantics and forgotten in the run-time one [8].

In [8], JavaS is a subset of Java that lacks some important features that are included in Java1, namely class variables and methods, local variables, super and concurrency. The information annotating the terms of Java1A is more complex than that used in JavaSE (used in [8]), since in Java1 we also consider local variables, class fields and class methods. For example, when evaluating a method call, it is necessary to know whether the most specific method calculated at compile time is a class method or an instance method; in the first case, the information about the class of the most specific method must also be available. Moreover, Java1 also has the super construct, which can be used both in field access and in method invocation expressions.
4.3 Semantics
We have chosen an operational approach that models a Java program by a labeled transition system (LTS for short), that is a triple $(\mathit{States}, \mathit{Labels}, \to)$, where $\mathit{States}$ is the set of the meaningful intermediate states of the computations of the program, $\mathit{Labels}$ is the set of the labels describing the information exchanged with the external environment during the computation, and $\to\ \subseteq \mathit{States} \times \mathit{Labels} \times \mathit{States}$ is the transition relation describing how the computation proceeds: $s \xrightarrow{l} s'$ means that the program in state $s$ may perform a transition, with the information exchange with the external environment described by $l$, ending in state $s'$.

The transitions of the LTS model not only the execution of each single statement, but also the low level actions involved in the execution of complex statements. For example, entering a synchronized block implies the execution of locking operations that do not correspond to Java statements.

In a multi-threaded Java program, instances of a thread class execute code concurrently, i.e., they change their status and access and modify objects. We therefore call these objects the active components of the program, in contrast with the passive components, which do not perform an independent activity and whose state is modified only by the actions of the active components. The activity of the whole program results from combining the activities of the active components. The state of the program, i.e., the value of its components at different times along the computation, results from combining the values of class variables and of thread and non-thread object variables. As a consequence, we model active components by LTSs and passive objects, essentially, as memory functions. The LTS of the whole program is defined by combining the models of the active and passive components.

We also need to describe the interaction between the local memories and the main memory. The local memory is part of each thread's state, while the main memory is "distributed" among the objects and classes. In order to properly model the actions of the threads' execution engines, the working memories and the main memory, we use a main memory manager MmMng, whose states are modeled by a stack structure that records occurrences of some of these actions.
For example, a thread's working memory is allowed to load a value for a variable iff there exists a corresponding read action performed by the MmMng and no previous load action is already coupled with that read.

Note that the transition relations of the LTSs defined in the following are annotated with a metavariable $p$ denoting the program being modeled. The program contains the bodies of the methods, and thus it is needed at run time to give the dynamic semantics; since it remains unchanged during the evaluation of the code, it is used to label the semantic arrow.

We use the following sets of semantic values for the identifiers of (active and passive) components:

- TH_ids: the set of thread identifiers, ranged over by the metavariable tid;
- OB_ids: the set of (non-thread) object identifiers, ranged over by the metavariable oid;
- COMP_ids = TH_ids ∪ OB_ids ∪ CL_ids: the set of generic component identifiers, ranged over by the metavariable gid;
- NonCL_ids = TH_ids ∪ OB_ids: the set of non-class identifiers, ranged over by the metavariable ncid.

Thread and object identifiers also carry information about the class they are instances of; this information can be extracted by the function $\mathit{Class}() : \mathit{NonCL\_ids} \to \mathit{CL\_ids}$. The set of all possible values for an identifier is then $\mathit{VALUE} = \mathit{COMP\_ids} \cup \mathit{PRIMITIVE} \cup \{\mathsf{null}\}$, ranged over by the metavariable $v$; null denotes a reference to a special "null" object. We define different LTSs modelling threads, memory managers and the whole program.
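As a generic illustration of the notion of labeled transition system used throughout this section, the sketch below represents the transition relation as a function from a state to its outgoing (label, next state) pairs and explores the reachable states. The toy counter system and the names `step` and `reachable` are assumptions introduced only for this example; they do not correspond to any LTS defined in this paper.

```python
def step(state):
    """Toy transition relation: a counter that can 'inc' up to 2
    and then 'reset'. Returns a list of (label, next_state) pairs."""
    if state < 2:
        return [("inc", state + 1)]
    return [("reset", 0)]


def reachable(initial, step_fn):
    """Set of states reachable from initial through any sequence
    of transitions of the LTS described by step_fn."""
    seen = {initial}
    frontier = [initial]
    while frontier:
        s = frontier.pop()
        for _label, t in step_fn(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen
```

The same shape (a state, a label, a successor state) underlies the thread, memory manager and program relations, although their states and labels are structured tuples rather than integers.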
4.4 Thread Semantics
The semantics of threads is given by the LTS $\mathit{T\_LTS} = (\mathit{T\_states}^1, \mathit{T\_labels}^1, \dashrightarrow_{p}^{1})$. The state of a thread at each instant of the program execution can be modeled by the code it has to execute and by the values of its local and global variables. Furthermore, it is necessary to know the status of the thread, which can be started, stopped or suspended. Each thread state contains the information describing an intermediate state: the identifier and the class, the code to be executed, the values of the local variables, the contents of its working memory and the values of its instance fields. In the following, we use $f : D \to_T C$ to denote a total function from $D$ to $C$, and $f : D \to_P C$ to denote a partial one.
$$\mathit{T\_states}^1 = \mathit{TH\_ids} \times \mathit{STATEMENT} \times \mathit{Status} \times \mathit{W\_MEMORIES} \times \mathit{MEMORIES} \times \mathit{Stacks},$$

where:

- W_MEMORIES is the set of working memories, that is, of total functions $wm : (\mathit{COMP\_ids} \times \mathit{FIELD} \times \mathit{TYPE}) \to_T \mathit{Wm\_Values}$, with $\mathit{Wm\_Values} = \{\mathit{undef}\} \cup (\mathit{VALUE} \times Ss)$, where $Ss = \{A, S, L\}$. Wm_Values is the set of elements returned by evaluating a variable in the local working memory; each element contains the value of the variable and the status of the value, which is L if the variable has been loaded but not stored, A if it has been assigned and not yet stored, and S if it has been stored. emp_wm, denoting the empty working memory, is the constant function always returning undef.
- MEMORIES is the set of memories, that is, of partial functions $m : \mathit{FIELD} \times \mathit{TYPE} \to_P \mathit{VALUE}$; $m[(f, T) \to v]$ denotes the update of $m$ that associates $v$ to $(f, T)$ and behaves as $m$ on the other pairs. We use the same notation for update operations on other functions.
- Stacks is the set of stacks of local memories, ranged over by the metavariable lmS, where a local memory is an element of the set Locals and denotes a partial function $lm : \mathit{IDENTIFIER} \to \mathit{VALUE}$; em_mS denotes the empty stack of local memories.
- Status = {New, PreStop, Run, Stop, Susp} is the set of all the possible statuses of a thread:
  - New: not yet started; the thread has been created but not started by some other thread.
  - PreStop: stopped but not started; the thread has been stopped while it was waiting to be started.
  - Run: runnable; the thread has already been started and is executing its run method.
  - Stop: the thread has terminated the execution of its run method, or another thread has stopped it.
  - Susp: suspended; the thread has been suspended by some other thread while it was executing, and it remains in this status until another thread resumes it.
$$\mathit{T\_labels}^1 = \mathit{LowLevelLabs} \cup \mathit{MCallLabs} \cup \mathit{ThreadControlLabs} \cup \{\tau\}$$

There are four different kinds of labels, describing respectively low level actions (LowLevelLabs), method calls (MCallLabs), thread state control actions (ThreadControlLabs) and a generic internal activity. The labels are n-tuples, but we write $label(x_1, \ldots, x_n)$ instead of $(label, x_1, \ldots, x_n)$. We briefly analyse every kind of label (in the following, $\mathit{VarArgs} = \mathit{COMP\_ids} \times \mathit{TYPE} \times \mathit{FIELD}$).

$\mathit{LowLevelLabs} = \{\mathit{Load}, \mathit{Store}\} \times \mathit{VarArgs} \times \mathit{VALUE}$. The assign and use actions are internal actions performed by the tee (thread execution engine), so there is no information to send outside; load and store, on the contrary, have a corresponding label:

- Load(gid, T, f, v): a thread loads the value v of a field f of type T from an object gid into the wm.
- Store(gid, T, f, v): a thread stores a value v of the field f of type T for an object gid in the MmMng.

$\mathit{MCallLabs} = (\{\mathit{New}\} \times \mathit{CL\_ids} \times \mathit{NonCL\_ids}) \cup (\{\mathit{CallO}\} \times \mathit{NonCL\_ids} \times \mathit{METHODS} \times \mathit{Args} \times \mathit{MET\_str}) \cup (\{\mathit{CallC}\} \times \mathit{CL\_ids} \times \mathit{METHODS} \times \mathit{Args}) \cup (\{\mathit{SelfCall}\} \times \mathit{METHODS} \times \mathit{Args})$:

- New(C, ncid): an instance creation call to class C, which returns the identifier ncid;
- CallO(ncid, m_name, args, mst): a call to method m_name of the object ncid with the argument types in args, which returns the method declaration structure mst; $\mathit{MET\_str} = \mathit{STATEMENT} \times \mathit{Locals}$ denotes the set of such method structures;
- CallC(C, m_name, args): a call to method m_name of the class C with the argument types in args; the correct method declaration structure is not part of the information exchange described by this label, since it can be extracted from the program using the C and args information given by the static semantics;
- SelfCall(m_name, args): a call to method m_name of the thread itself with the argument types in args;
$\mathit{ThreadControlLabs} = \{\mathit{Send}, \mathit{Receive}\} \times \{\mathit{StartT}, \mathit{Stop}, \mathit{Suspend}, \mathit{Resume}\} \times \mathit{TH\_ids}$:

- Send(StartT, tid), Send(Stop, tid), Send(Suspend, tid), Send(Resume, tid): a component starts, stops, suspends or resumes a thread tid.
- Receive(StartT, tid), Receive(Stop, tid), Receive(Suspend, tid), Receive(Resume, tid): the thread tid changes status.

Internal action: $\tau$ (usually omitted in the rules below).
The transition relation $\dashrightarrow_{p}^{1}\ : \mathit{T\_states}^1 \times \mathit{T\_labels}^1 \to \mathit{T\_states}^1$ is defined by the inductive rules below.

In the rules, all the components of the thread states that are not meaningful are omitted from the configurations. For example, $\langle stm, lmS \rangle$ stands for $\langle tid, stm, tSt, wm, m, lmS \rangle$.

Auxiliary operations: we define some operations to access memory stacks and local memories.

- Top, Pop and Push are the usual operations on stacks.
- $\mathit{CurEnt} : \mathit{Stacks} \to \mathit{COMP\_ids}$, characterized by $\mathit{CurEnt}(lmS) = \mathit{Top}(lmS)(\mathsf{this})$, returns the current object of the local memory on the top of the stack.
- $\mathit{Eval} : \mathit{Stacks} \times \mathit{IDENTIFIER} \to \mathit{VALUE}$, characterized by $\mathit{Eval}(lmS, id) = \mathit{Eval}(\mathit{Pop}(lmS), id)$ if $\mathit{Top}(lmS)(id) = \mathit{undef}$ and $\mathit{Eval}(lmS, id) = \mathit{Top}(lmS)(id)$ otherwise, returns the value of a local variable id, taken from the topmost local memory of the stack lmS in which id is defined.
- $\mathit{ModifyMem} : \mathit{VarArgs} \times \mathit{MEMORIES} \to \mathit{MEMORIES}$ is such that $\mathit{ModifyMem}(f, C, v, m) = m[(f, C) \to v]$.
- $\mathit{MethodDec} : \mathit{METHODS} \times \mathit{Args} \times \mathit{TYPE} \times \mathit{PROG} \to \mathit{MET\_str}$ returns, given the method name and signature, the correct method structure, which contains the statements and the local environment for a method call.
- $\mathit{Code} : \mathit{MET\_str} \to \mathit{STATEMENT}$ is such that $\mathit{Code}(mst)$ returns the code in the method declaration mst.
- $\mathit{Loc} : \mathit{MET\_str} \to \mathit{Locals}$ is such that $\mathit{Loc}(mst, v\_list, gid)$ returns the local memory for the method described by mst; v_list and gid are used to initialize, respectively, the values of the formal parameters and of the this reference.

When evaluating method calls and assignments, it is necessary to know when a term is ground. We distinguish between r-ground terms (terms that occur on the right-hand side of an assignment and cannot be rewritten) and l-ground terms (terms on the left-hand side). A term $t$ is said to be:

- r-ground if $t$ is a primitive value or $t \equiv gid$ for some identifier gid;
- l-ground if $t$ is r-ground, or $t \equiv id$ for some identifier id, or $t \equiv y.f$ for some reference variable y and field name f.

Furthermore, we say that a list of terms is r-ground if every element is an r-ground term.
In the following, some rules have a name, which is used in Appendix B to refer to them easily when proving that this semantics satisfies the rules described in Chap. 17 of [4].
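The stack operations Top, Pop, Push, CurEnt and Eval described among the auxiliary operations can be sketched as follows. This is only an illustration: the stack is assumed to be a Python list whose last element is the top, local memories are assumed to be dictionaries, and an identifier that is undefined in every local memory is reported with an exception, a case the formal definition leaves unspecified.

```python
def top(stack):
    return stack[-1]


def pop(stack):
    # Return a new stack without the top element (the input is not mutated).
    return stack[:-1]


def push(lm, stack):
    # Return a new stack with the local memory lm on top.
    return stack + [lm]


def cur_ent(stack):
    """CurEnt: the value bound to 'this' in the topmost local memory."""
    return top(stack)["this"]


def eval_id(stack, ident):
    """Eval: look ident up in the topmost local memory, falling back
    to the rest of the stack when it is undefined there."""
    if not stack:
        raise KeyError(ident)  # assumption: not covered by the definition
    lm = top(stack)
    if ident in lm:
        return lm[ident]
    return eval_id(pop(stack), ident)
```

For instance, with `stack = [{"this": "o1", "x": 1}, {"this": "o2", "y": 2}]`, the identifier `y` is found in the top local memory, while `x` is found only after falling back to the memory below it.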
Expressions Semantics
The evaluation of an expression leads to a value. An expression is evaluated from left to right.

Field Access

From now on, when the identifier and the status are omitted, we assume that their values are, respectively, tid and Run; the working memory and other components are also omitted in some states, whenever they are not meaningful.

- Evaluation of a local identifier:
$$\langle id, lmS \rangle \dashrightarrow_{p}^{1} \langle \mathit{Eval}(lmS, id), lmS \rangle$$
- Evaluation of this:
$$\langle \mathsf{this}, \mathit{Run} \rangle \dashrightarrow_{p}^{1} \langle \mathit{CurEnt}(lmS), \mathit{Run} \rangle$$
- Evaluation of the expression on which a field is accessed:
$$\frac{\langle e, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle e', wm', m', lmS' \rangle}{\langle e.[T, mod]f, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle e'.[T, mod]f, wm', m', lmS' \rangle}$$
- Evaluation of a field of an object: the thread looks for its value in the working memory; this value cannot be undefined.
$$\langle gid.[T, mod]f, wm \rangle \dashrightarrow_{p}^{1} \langle v, wm \rangle \qquad \text{if } \exists x.\ wm(gid, T, f) = (v, x)$$
- Evaluation of a field accessed through super; the value cannot be undefined.
$$\langle \mathsf{super}.[T, mod]f, wm, lmS \rangle \dashrightarrow_{p}^{1} \langle v, wm, lmS \rangle \qquad \text{if } \mathit{CurEnt}(lmS) = gid,\ \exists x.\ wm(gid, T, f) = (v, x)$$

Creation Expression

- A thread receives the reference to a new object:
$$\langle \mathsf{new}\ C \rangle \overset{\mathit{New}(C,\, nl)}{\dashrightarrow}{}_{p}^{1} \langle nl \rangle$$

Method Call Expressions

A method call is handled by pushing the corresponding code on the current code stack; moreover, a new local memory is pushed onto the stack. In the following, $v$ is an r-ground value, v_list is a list of r-ground terms and $args \in \mathit{Args}$ is a list of the types of the arguments of a method, while info is an abbreviation for $(T, args, mod)$. A method declaration structure, returned by MethodDec, contains both the code and the local memory for the method.

- First of all, the expression that receives the method call is evaluated:
$$\frac{\langle e, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle e', wm', m', lmS' \rangle}{\langle e.[info]m\_name(p\_list), wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle e'.[info]m\_name(p\_list), wm', m', lmS' \rangle}$$
- Then the parameters are evaluated:
$$\frac{\langle e_1, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle e'_1, wm', m', lmS' \rangle}{\langle gid.[info]m\_name(v_1, \ldots, v_i, e_1, \ldots, e_n), wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle gid.[info]m\_name(v_1, \ldots, v_i, e'_1, \ldots, e_n), wm', m', lmS' \rangle}$$
if $v_1, \ldots, v_i$ are r-ground.
- When the method is static, i.e., there is no dynamic binding, it is not necessary to know the class of the object, and the method code is calculated using the annotations on the program.
$$\langle gid.[T, args, \mathsf{st}]m\_name(v\_list), lmS \rangle \overset{\mathit{CallC}(T,\, m\_name,\, args)}{\dashrightarrow}{}_{p}^{1} \langle \{stm\}, \mathit{Push}(lm, lmS) \rangle$$
if $\mathit{MethodDec}(m\_name, args, T, p) = mst$, $stm = \mathit{Code}(mst)$, $lm = \mathit{Loc}(mst, v\_list, gid)$.
- If the method is not static, it is necessary to ask the object, whose class is not known, to send the correct code for the method.
$$\langle ncid.[T, args, \mathsf{nst}]m\_name(v\_list), lmS \rangle \overset{\mathit{CallO}(ncid,\, m\_name,\, args,\, mst)}{\dashrightarrow}{}_{p}^{1} \langle \{stm\}, \mathit{Push}(lm, lmS) \rangle$$
if $ncid \neq tid$, $stm = \mathit{Code}(mst)$, $lm = \mathit{Loc}(mst, v\_list, ncid)$.
- A self method call; it is not just an internal action because, when synchronization is introduced (see Java2), it is necessary to distinguish this action from the other internal ones.
$$\langle tid.[T, args, \mathsf{nst}]m\_name(v\_list), lmS \rangle \overset{\mathit{SelfCall}(m\_name,\, args)}{\dashrightarrow}{}_{p}^{1} \langle \{stm\}, \mathit{Push}(lm, lmS) \rangle$$
if $mst = \mathit{MethodDec}(m\_name, args, C, p)$, $C = \mathit{Class}(tid)$, $stm = \mathit{Code}(mst)$, $lm = \mathit{Loc}(mst, v\_list, tid)$.
- A call to a method of the superclass; there is only one rule, since the static and the dynamic type of the expression are the same. The value of the current object, that is, the object whose method is executed, is the value of this in the new local memory pushed on the stack.
$$\langle \mathsf{super}.[T, args, mod]m\_name(v\_list), lmS \rangle \overset{\mathit{CallC}(T,\, m\_name,\, args)}{\dashrightarrow}{}_{p}^{1} \langle \{stm\}, \mathit{Push}(lm, lmS) \rangle$$
if $gid = \mathit{CurEnt}(lmS)$, $\mathit{MethodDec}(m\_name, args, T, p) = mst$, $stm = \mathit{Code}(mst)$, $lm = \mathit{Loc}(mst, v\_list, gid)$.
Statements Semantics
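Assignment

Assignments are evaluated from left to right. The left-hand side is evaluated until it is l-ground; then the right-hand side is evaluated, until a primitive value or a reference to an object is obtained.

- These rules ensure that expressions like x or gid.f are not further evaluated when they occur on the left-hand side of an assignment.
$$\frac{\langle e_1, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle e'_1, wm', m', lmS' \rangle}{\langle e_1 = e_2, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle e'_1 = e_2, wm', m', lmS' \rangle} \qquad \text{if } e_1 \text{ is not l-ground}$$
$$\frac{\langle e, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle e', wm', m', lmS' \rangle}{\langle l = e, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle l = e', wm', m', lmS' \rangle} \qquad \text{if } l \text{ is l-ground}$$
- The assignment of a local identifier:
$$\langle id = v, lmS \rangle \dashrightarrow_{p}^{1} \langle \mathsf{skip}, \mathit{Push}(lm[id \to v], \mathit{Pop}(lmS)) \rangle \qquad \text{if } lm = \mathit{Top}(lmS)$$
- TAssign: the assignment of a component field modifies its value in the wm.
$$\langle gid.[T]f = v, wm \rangle \dashrightarrow_{p}^{1} \langle \mathsf{skip}, wm[(gid, T, f) \to (v, A)] \rangle$$
- The assignment of a field accessed through super modifies its value in the wm.
$$\langle \mathsf{super}.[T]f = v, wm, lmS \rangle \dashrightarrow_{p}^{1} \langle \mathsf{skip}, wm[(\mathit{CurEnt}(lmS), T, f) \to (v, A)], lmS \rangle$$

Statements concatenation
$$\frac{\langle stm_1, tSt, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle stm'_1, tSt', wm', m', lmS' \rangle}{\langle stm_1; stm_2, tSt, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle stm'_1; stm_2, tSt', wm', m', lmS' \rangle}$$

if - else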
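$$\frac{\langle e, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle e', wm', m', lmS' \rangle}{\langle \mathsf{if}\ e\ stm\ \mathsf{else}\ stm', wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle \mathsf{if}\ e'\ stm\ \mathsf{else}\ stm', wm', m', lmS' \rangle}$$
$$\langle \mathsf{if}\ \mathsf{true}\ stm\ \mathsf{else}\ stm' \rangle \dashrightarrow_{p}^{1} \langle stm \rangle \qquad\qquad \langle \mathsf{if}\ \mathsf{false}\ stm\ \mathsf{else}\ stm' \rangle \dashrightarrow_{p}^{1} \langle stm' \rangle$$

Return
$$\frac{\langle e, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle e', wm', m', lmS' \rangle}{\langle \mathsf{return}\ e, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle \mathsf{return}\ e', wm', m', lmS' \rangle}$$

- The value of the expression is returned and the local memory is popped from the stack; if a method invocation is used as a statement and returns a value, then this value is ignored: we assume that $v; stm = stm$.
$$\langle \mathsf{return}\ v, lmS \rangle \dashrightarrow_{p}^{1} \langle v, \mathit{Pop}(lmS) \rangle \qquad \text{if } v \text{ is r-ground},\ \mathit{Pop}(lmS) \neq em\_mS$$
- No value is returned if there is no expression to be evaluated; the local memory is popped from the stack.
$$\langle \mathsf{return}, lmS \rangle \dashrightarrow_{p}^{1} \langle \mathsf{skip}, \mathit{Pop}(lmS) \rangle$$

Block
$$\frac{\langle stm, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle stm', wm', m', lmS' \rangle}{\langle \{stm\}, wm, m, lmS \rangle \overset{l}{\dashrightarrow}_{p}^{1} \langle \{stm'\}, wm', m', lmS' \rangle}$$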
Low level actions

- TLoad1: the thread may load a value for a variable if the variable is undefined in the wm or its current value has already been stored; the wm is modified.
$$\langle wm \rangle \overset{\mathit{Load}(gid,\, T,\, f,\, v,\, tid)}{\dashrightarrow}{}_{p}^{1} \langle wm[(gid, T, f) \to (v, L)] \rangle$$
if $\exists v'.\ wm(gid, T, f) = (v', S)$ or $wm(gid, T, f) = \mathit{undef}$.
- TStore: the thread may store a value if the variable has been assigned before the store action; the wm is modified.
$$\langle wm \rangle \overset{\mathit{Store}(gid,\, T,\, f,\, v,\, tid)}{\dashrightarrow}{}_{p}^{1} \langle wm[(gid, T, f) \to (v, S)] \rangle$$
if $wm(gid, T, f) = (v, A)$.
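The guards of the TAssign, TLoad1 and TStore rules can be sketched as functions on a working memory. In this illustration the working memory is assumed to be a dictionary from (gid, T, f) keys to (value, status) pairs, a missing key plays the role of undef, and a return value of `None` means that the transition is not enabled; the function names are hypothetical.

```python
def assign(wm, key, value):
    """TAssign: record an assigned, not yet stored value (status A)."""
    new = dict(wm)
    new[key] = (value, "A")
    return new


def load(wm, key, value):
    """TLoad1: enabled only if the entry is undefined or already stored."""
    entry = wm.get(key)
    if entry is None or entry[1] == "S":
        new = dict(wm)
        new[key] = (value, "L")
        return new
    return None  # transition not enabled


def store(wm, key):
    """TStore: enabled only if the entry has been assigned (status A).
    Returns the updated working memory and the stored value."""
    entry = wm.get(key)
    if entry is None or entry[1] != "A":
        return None  # transition not enabled
    value = entry[0]
    new = dict(wm)
    new[key] = (value, "S")
    return new, value
```

Under these assumptions a freshly loaded value cannot be loaded again or stored until it has been assigned, which mirrors the side conditions of the two rules.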
Thread control statements semantics
For simplicity, we omit the annotations of these method calls (besides, they are instance methods, so static information is not needed to resolve them). In the following, a runnable thread is a thread that has already been started and can execute its run method; a pre-stopped thread is one that received a stop signal before being started.

Start

caller

- A thread tries to start itself: the call is ignored.
$$\langle tid, tid.\mathsf{start}() \rangle \dashrightarrow_{p}^{1} \langle tid, \mathsf{skip} \rangle$$
- A thread sends a start message to another one.
$$\langle tid, t'_{id}.\mathsf{start}() \rangle \overset{\mathit{Send}(\mathit{StartT},\, t'_{id})}{\dashrightarrow}{}_{p}^{1} \langle tid, \mathsf{skip} \rangle \qquad \text{if } t'_{id} \neq tid$$
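receiver: a thread is started.

- A started or stopped thread does not change its state.
$$\langle tid, tSt \rangle \overset{\mathit{Receive}(\mathit{StartT},\, tid)}{\dashrightarrow}{}_{p}^{1} \langle tid, tSt \rangle \qquad \text{if } tSt \notin \{\mathit{New}, \mathit{PreStop}\}$$
- A non-started but pre-stopped thread stops immediately.
$$\langle tid, \mathit{PreStop} \rangle \overset{\mathit{Receive}(\mathit{StartT},\, tid)}{\dashrightarrow}{}_{p}^{1} \langle tid, \mathit{Stop} \rangle$$
- A non-started thread becomes runnable.
$$\langle tid, no\_stat, \mathit{New}, em\_mS \rangle \overset{\mathit{Receive}(\mathit{StartT},\, tid)}{\dashrightarrow}{}_{p}^{1} \langle tid, stm, \mathit{Run}, lm \rangle$$
if $mst = \mathit{MethodDec}(\mathsf{run}, emptyl, \mathit{Class}(tid), p)$, $stm = \mathit{Code}(mst)$, $lm = \mathit{Loc}(mst, emptyl, tid)$, where emptyl is the empty list of parameters.

Stop

caller

- A thread stops itself.
$$\langle tid, tid.\mathsf{stop}(), \mathit{Run} \rangle \dashrightarrow_{p}^{1} \langle tid, \mathsf{skip}, \mathit{Stop} \rangle$$
- A thread sends a stop message to another thread.
$$\langle tid, t'_{id}.\mathsf{stop}() \rangle \overset{\mathit{Send}(\mathit{Stop},\, t'_{id})}{\dashrightarrow}{}_{p}^{1} \langle tid, \mathsf{skip} \rangle \qquad \text{if } t'_{id} \neq tid$$

receiver

- A non-started thread becomes pre-stopped.
$$\langle tid, no\_stat, \mathit{New} \rangle \overset{\mathit{Receive}(\mathit{Stop},\, tid)}{\dashrightarrow}{}_{p}^{1} \langle tid, no\_stat, \mathit{PreStop} \rangle$$
- A pre-stopped thread does not change its state.
$$\langle tid, no\_stat, \mathit{PreStop} \rangle \overset{\mathit{Receive}(\mathit{Stop},\, tid)}{\dashrightarrow}{}_{p}^{1} \langle tid, no\_stat, \mathit{PreStop} \rangle$$
- A started or stopped thread is stopped.
$$\langle tid, stm, tSt \rangle \overset{\mathit{Receive}(\mathit{Stop},\, tid)}{\dashrightarrow}{}_{p}^{1} \langle tid, no\_stat, \mathit{Stop} \rangle \qquad \text{if } tSt \notin \{\mathit{PreStop}, \mathit{New}\}$$

Suspend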
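caller

- A thread suspends itself.
$$\langle tid, tid.\mathsf{suspend}(), \mathit{Run} \rangle \dashrightarrow_{p}^{1} \langle tid, \mathsf{skip}, \mathit{Susp} \rangle$$
- A thread sends a suspend message to another thread.
$$\langle tid, t'_{id}.\mathsf{suspend}() \rangle \overset{\mathit{Send}(\mathit{Suspend},\, t'_{id})}{\dashrightarrow}{}_{p}^{1} \langle tid, \mathsf{skip} \rangle \qquad \text{if } t'_{id} \neq tid$$

receiver

- A non-running thread does not change its state.
$$\langle tid, tSt \rangle \overset{\mathit{Receive}(\mathit{Suspend},\, tid)}{\dashrightarrow}{}_{p}^{1} \langle tid, tSt \rangle \qquad \text{if } tSt \neq \mathit{Run}$$
- A running thread becomes suspended.
$$\langle tid, \mathit{Run} \rangle \overset{\mathit{Receive}(\mathit{Suspend},\, tid)}{\dashrightarrow}{}_{p}^{1} \langle tid, \mathit{Susp} \rangle$$

Resume

caller

- A thread ignores a resume message for itself.
$$\langle tid, tid.\mathsf{resume}() \rangle \dashrightarrow_{p}^{1} \langle tid, \mathsf{skip} \rangle$$
- A thread sends a resume message to another thread.
$$\langle tid, t'_{id}.\mathsf{resume}() \rangle \overset{\mathit{Send}(\mathit{Resume},\, t'_{id})}{\dashrightarrow}{}_{p}^{1} \langle tid, \mathsf{skip} \rangle \qquad \text{if } t'_{id} \neq tid$$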
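receiver

- A non-suspended thread does not change its state.
$$\langle tid, tSt \rangle \overset{\mathit{Receive}(\mathit{Resume},\, tid)}{\dashrightarrow}{}_{p}^{1} \langle tid, tSt \rangle \qquad \text{if } tSt \neq \mathit{Susp}$$
- A suspended thread becomes running again.
$$\langle tid, \mathit{Susp} \rangle \overset{\mathit{Receive}(\mathit{Resume},\, tid)}{\dashrightarrow}{}_{p}^{1} \langle tid, \mathit{Run} \rangle$$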
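The receiver rules for Start, Stop, Suspend and Resume determine how the status of a thread changes on each Receive label. The following minimal sketch collects them in one function; it tracks only the status component of the thread state (code and local memories are ignored), and the function name `receive` is an assumption of this illustration.

```python
def receive(status, signal):
    """Status of a thread after Receive(signal, tid), following the
    receiver rules; only the Status component is modeled."""
    if signal == "StartT":
        # New -> Run, PreStop -> Stop, every other status is unchanged.
        return {"New": "Run", "PreStop": "Stop"}.get(status, status)
    if signal == "Stop":
        # New and PreStop lead to PreStop, every other status to Stop.
        return "PreStop" if status in ("New", "PreStop") else "Stop"
    if signal == "Suspend":
        return "Susp" if status == "Run" else status
    if signal == "Resume":
        return "Run" if status == "Susp" else status
    raise ValueError("unknown signal: " + signal)
```

For example, a thread that is stopped before being started goes from New to PreStop, and a later start moves it directly to Stop without ever running.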
4.5 Main Memory Manager semantics
The Main Memory Manager (MmMng) models the behaviour of the main memory, which can perform low-level actions on variables, as depicted in fig. 3. Since the MmMng may asynchronously read a value for a thread and send it to the wm, or write a value received from the wm, it is not just a passive component of the program: it has capabilities of its own that need to be modeled by an LTS. Essentially, the internal state of the MmMng is a stack of elements describing the occurrences of the read actions performed by the MmMng and of the store actions performed by the threads. Depending on the elements in the stack, the MmMng may:

- modify the value of an object, by a write action, if the corresponding store is in its state;
- send a value to a thread, if the corresponding read is in its state.

The semantics of the MmMng is given by $\mathit{MmMng\_LTS} = (\mathit{MmMng\_states}^1, \mathit{MmMng\_labels}^1, \dashrightarrow_{p}^{1})$.

$\mathit{MmMng\_states}^1$ is the set of stacks of Actions elements, which are n-tuples in $\{R, S\} \times \mathit{ActArgs}$, where we use the short notation $\mathit{ActArgs} = \mathit{VarArgs} \times \mathit{VALUE} \times \mathit{TH\_ids}$. We write $a_1 : a_2 : \ldots$ instead of $a_1 a_2 \ldots$ to make the separation of the elements in the stack clearer. We use the following operations on $\mathit{MmMng\_states}^1$:

- $\mathit{In} : \mathit{Actions} \times \mathit{MmMng\_states}^1$ is a predicate that checks whether an action is in the stack;
- $\mathit{Cancel} : \mathit{Actions} \times \mathit{MmMng\_states}^1 \to \mathit{MmMng\_states}^1$ cancels the oldest occurrence of an action in the stack: $\mathit{Cancel}(a, Mm) = Mm_1 : Mm_2$ if $Mm = Mm_1 : a : Mm_2$ and $\neg \mathit{In}(a, Mm_1)$; $\mathit{Cancel}(a, Mm) = \mathit{undef}$ otherwise;
- $\mathit{RecCancel} : \mathit{ActArgs} \times \mathit{MmMng\_states}^1 \to \mathit{MmMng\_states}^1$ cancels every element of the stack (if any) that differs from the action parameter at most in the value of the variable. If $Mm = a' : Mm_1$, then $\mathit{RecCancel}(a, Mm) = \mathit{RecCancel}(a, Mm_1)$ if $\mathit{Similar}(a, a')$, and $\mathit{RecCancel}(a, Mm) = a' : \mathit{RecCancel}(a, Mm_1)$ otherwise, where $\mathit{Similar}((a\_name, var\_args, v, tid), (a\_name, var\_args, v', tid))$ holds, a_name ranges over $\{R, S\}$ and var_args ranges over VarArgs;
- $\mathit{Push} : \mathit{Actions} \times \mathit{MmMng\_states}^1 \to \mathit{MmMng\_states}^1$ pushes an element on the top of the stack: $\mathit{Push}(a, Mm) = Mm : a$.

$\mathit{MmMng\_labels}^1 = \{\mathit{Write}, \mathit{Read}, \mathit{Loaded}, \mathit{Stored}\} \times \mathit{ActArgs}$: Write and Read describe the writing and the reading of a value, while Loaded and Stored describe the operations that cancel elements from the stack or introduce elements into it.
Transition relation
The transition relation $\dashrightarrow_{p}^{1}$ is defined by the rules below.

- MmMngRead: the MmMng may perform a read action on an object field for a thread only if there is no previous store action for this field and thread that still has to be written in the object.
$$Mm \overset{\mathit{Read}(gid,\, T,\, f,\, v,\, tid)}{\dashrightarrow}{}_{p}^{1} \mathit{Push}(R(gid, T, f, v, tid), Mm) \qquad \text{if } \nexists v'.\ \mathit{In}(S(gid, T, f, v', tid), Mm)$$
- MmMngWrite: a write action cancels the oldest store action for the same variable and thread.
$$Mm \overset{\mathit{Write}(gid,\, T,\, f,\, v,\, tid)}{\dashrightarrow}{}_{p}^{1} \mathit{Cancel}(S(gid, T, f, v, tid), Mm)$$
- MmMngLoaded: a load into some thread's wm cancels all the read actions for the given variable and thread.
$$Mm \overset{\mathit{Loaded}(gid,\, T,\, f,\, v,\, tid)}{\dashrightarrow}{}_{p}^{1} \mathit{RecCancel}(R(gid, T, f, v, tid), Mm) \qquad \text{if } \mathit{In}(R(gid, T, f, v, tid), Mm)$$
- MmMngStored: a value for a given variable and thread is stored into the stack, and all the read actions for the same variable and thread are cancelled.
$$Mm \overset{\mathit{Stored}(gid,\, T,\, f,\, v,\, tid)}{\dashrightarrow}{}_{p}^{1} \mathit{Push}(S(gid, T, f, v, tid), Mm') \qquad \text{if } Mm' = \mathit{RecCancel}(R(gid, T, f, v, tid), Mm)$$
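The stack operations of the MmMng and its four transitions can be sketched as list manipulations. In this illustration the stack is assumed to be a Python list with the oldest element first, actions are tuples (name, gid, T, f, v, tid), a return value of `None` means that the operation is undefined or the transition is not enabled, and all function names are hypothetical.

```python
def similar(a, b):
    """Same action name, variable and thread; the value may differ."""
    return a[:4] == b[:4] and a[5] == b[5]


def cancel(a, mm):
    """Remove the oldest occurrence of a, or return None if absent."""
    if a not in mm:
        return None
    i = mm.index(a)
    return mm[:i] + mm[i + 1:]


def rec_cancel(a, mm):
    """Remove every element similar to a."""
    return [b for b in mm if not similar(a, b)]


def push(a, mm):
    return mm + [a]


def read(mm, args):
    """MmMngRead: enabled only if no store for the same variable
    and thread is still pending in the stack."""
    s = ("S",) + args
    if any(similar(s, b) for b in mm):
        return None
    return push(("R",) + args, mm)


def write(mm, args):
    """MmMngWrite: cancel the oldest matching store action."""
    return cancel(("S",) + args, mm)


def loaded(mm, args):
    """MmMngLoaded: enabled only if the read is in the stack."""
    r = ("R",) + args
    return rec_cancel(r, mm) if r in mm else None


def stored(mm, args):
    """MmMngStored: drop the similar reads, then push the store."""
    return push(("S",) + args, rec_cancel(("R",) + args, mm))
```

Here `args` stands for a tuple (gid, T, f, v, tid); for instance, after a `stored` for a variable and thread, a further `read` of the same variable for that thread is not enabled until the corresponding `write` has cancelled the store.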
4.6 Classes and objects states
We have decided to describe classes explicitly in the program states as well, since class fields can be accessed by any instance of the class but there exists exactly one incarnation (see footnote 2) of each of them: in order to preserve its consistency, its value can be read and changed only by messages sent to the class. Class and object states are simpler than thread states, since they lack the information related to code execution and just contain information on the identity (identifier and class name for an object, or just the class name for a class state) and on the values of the instance fields (for an object) or class fields (for a class). $\mathit{O\_states}$ and $\mathit{C\_states}$ denote, respectively, the sets of states of objects and classes; $\mathit{G\_states} = \mathit{Pass\_states}^1 \cup \mathit{T\_states}^1$, where $\mathit{Pass\_states}^1 = \mathit{O\_states} \cup \mathit{C\_states}$ is the set of all the passive component states.
Object State

$\mathit{O\_states} = \mathit{OB\_ids} \times \mathit{MEMORIES}$. An object state $\langle oid, m \rangle$ consists of:

- oid: the object identifier (plus the class information);
- m: the memory.

Class State

$\mathit{C\_states} = \mathit{CL\_ids} \times \mathit{Set}(\mathit{OB\_ids}) \times \mathit{MEMORIES} \times \mathit{MEMORIES}$. A class state $\langle cid, S_{id}, lm_i, m \rangle$ consists of:

- cid: the class identifier;
- $S_{id}$: the set of the identifiers of the previously created objects of this class;
- $lm_i$: the initial memory of an instance of the class, used for creating new instances;
- m: the memory.

From now on, we use the following variables: $tc \in \mathit{T\_states}^1$; $cc \in \mathit{C\_states}$; $oc \in \mathit{O\_states}$; $gc \in \mathit{G\_states}$; $pc \in \mathit{O\_states} \cup \mathit{C\_states}$; $nc \in \mathit{O\_states} \cup \mathit{T\_states}^1$; $Mm \in \mathit{MmMng\_states}^1$.
4.7 Program Semantics
The semantics of programs is given by means of the labelled transition system $P\_LTS = (P\_states^{1}, P\_labels^{1}, \longrightarrow^{1}_{p})$.

\[ P\_states^{1} = Set(T\_states^{1} \cup Pass\_states) \times MmMng\_states^{1} \]
A program state consists of a set of thread and passive component states (that is, a set of object and class states) plus a memory manager. We use the short notation $gc_1 \mid \ldots \mid gc_n \mid Mm$ for $\langle \{gc_1, \ldots, gc_n\}, Mm \rangle$. Sometimes we also use the short notation $gc_1 \mid \ldots \mid gc_j \mid S \mid Mm$ (where $gc_i \in T\_states^{1} \cup G\_states$ and $S \in Set(T\_states^{1} \cup G\_states)$ denotes a generic set of both active and passive components) when we want to highlight only some components, grouping together the other active and passive ones.

² It is part of the Java terminology, as used in [4].
Auxiliary operations

Auxiliary operations on program states are:
- $IdInfo : O\_states \to OB\_ids$, characterized by $IdInfo(\langle oid, m \rangle) = oid$, returns the identifier of the component (analogously for class and thread states);
- $Mem : O\_states \to MEMORIES$, s.t. $Mem(\langle oid, m \rangle) = m$, returns the memory of the component (analogously for class and thread states);
- $Status : T\_states^{1} \to Status$, characterized by $Status(\langle g_{id}, stm, tSt, m, lmS \rangle) = tSt$;
- $OidSet : C\_states \to Set(OB\_ids)$, characterized by $OidSet(\langle cid, S_{id}, lmi, m \rangle) = S_{id}$, returns the set of instance identifiers kept in a class state;
- $NewO : C\_states \to O\_states$, characterized by: if $cc = \langle C, S_{id}, lmi, mC \rangle$ then

\[ NewO(cc) = \begin{cases} \langle g_{id}, lmi \rangle & \text{if } C \text{ is not a thread class} \\ \langle g_{id}, no\_stat, New, emp\_wm, lmi, em\_mS \rangle & \text{otherwise} \end{cases} \qquad \text{with } Class(g_{id}) = C \]
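To make the shape of object and class states concrete, here is a minimal sketch in Java, under stated assumptions. The class names `ObjectState` and `ClassState` and the method `newInstance` are hypothetical; memories are modelled as maps from field names to values; only the non-thread branch of NewO is covered; and the freshness check on the new identifier is folded into the same method.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of an object state <oid, m>. */
final class ObjectState {
    final String oid;                  // IdInfo
    final Map<String, Object> memory;  // Mem

    ObjectState(String oid, Map<String, Object> memory) {
        this.oid = oid;
        this.memory = memory;
    }
}

/** Hypothetical model of a class state <cid, Sid, lmi, m>. */
final class ClassState {
    final String cid;
    final Set<String> instanceIds = new HashSet<>();          // Sid (OidSet)
    final Map<String, Object> initialMemory;                  // lmi
    final Map<String, Object> memory = new HashMap<>();       // m (class fields)

    ClassState(String cid, Map<String, Object> initialMemory) {
        this.cid = cid;
        this.initialMemory = new HashMap<>(initialMemory);
    }

    /**
     * Sketch of NewO for a non-thread class: the fresh identifier is recorded
     * in Sid and the new object starts from a private copy of lmi.
     */
    ObjectState newInstance(String ncid) {
        if (!instanceIds.add(ncid)) {
            throw new IllegalStateException("identifier already in use: " + ncid);
        }
        return new ObjectState(ncid, new HashMap<>(initialMemory));
    }
}
```

Copying `initialMemory` for each new object keeps the instances independent of each other and of the class state, which is the property the separate lmi component is there to provide.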
\[ P\_labels^{1} = \{\tau\} \]
The labels describe the information made available to an external observer at each step of the computation, i.e., every kind of interaction a Java program can have with the external environment. Since Java1 does not contain I-O operations, there is only one label, describing transitions without information exchange with the environment (it is usually omitted).
Transition relation: $\longrightarrow^{1}_{p}$ is defined by the rules below.

Internal Action

\[ \frac{tc \xrightarrow{\ \ }^{1}_{p} tc'}{tc \mid S \longrightarrow^{1}_{p} tc' \mid S} \]

New

PNew -

\[ \frac{tc \xrightarrow{New(C, nc_{id})}^{1}_{p} tc'}{tc \mid \langle C, lmi, S_{id}, m \rangle \mid S \longrightarrow^{1}_{p} tc' \mid \langle C, lmi, S_{id} \cup \{nc_{id}\}, m \rangle \mid NewO(nc_{id}, \langle C, lmi, S_{id}, m \rangle) \mid S} \quad \text{if } nc_{id} \notin S_{id} \]

Method Call

- A thread starts an instance method call;

\[ \frac{tc \xrightarrow{CallO(nc_{id},\, m\_name,\, args,\, MethodDec(m\_name, args, Class(oc), p))}^{1}_{p} tc'}{tc \mid oc \mid S \longrightarrow^{1}_{p} tc' \mid oc \mid S} \quad \text{if } IdInfo(oc) = nc_{id} \]

- A thread starts a class method call or a self method call; these two labels will be more useful in Java2, when a call to a synchronized method cannot be treated as an internal action of the thread.

\[ \frac{tc \xrightarrow{CallC(C,\, m\_name,\, args)}^{1}_{p} tc'}{tc \mid S \longrightarrow^{1}_{p} tc' \mid S} \]

- A thread starts a self method call;

\[ \frac{tc \xrightarrow{SelfCall(m\_name,\, args)}^{1}_{p} tc'}{tc \mid S \longrightarrow^{1}_{p} tc' \mid S} \]
Low level actions

PLoad - A thread loads a value that has been previously read; the read is cancelled from the MmMng.

\[ \frac{tc \xrightarrow{Load(g_{id},T,f,v,t_{id})}^{1}_{p} tc' \qquad Mm \xrightarrow{Loaded(g_{id},T,f,v,t_{id})}^{1}_{p} Mm'}{tc \mid Mm \mid S \longrightarrow^{1}_{p} tc' \mid Mm' \mid S} \]

PStore - A thread stores a value; the store is added to the MmMng.

\[ \frac{tc \xrightarrow{Store(g_{id},T,f,v,t_{id})}^{1}_{p} tc' \qquad Mm \xrightarrow{Stored(g_{id},T,f,v,t_{id})}^{1}_{p} Mm'}{tc \mid Mm \mid S \longrightarrow^{1}_{p} tc' \mid Mm' \mid S} \]

- A read action of the mm is added to the MmMng;

\[ \frac{Mm \xrightarrow{Read(g_{id},T,f,v,t_{id})}^{1}_{p} Mm'}{gc \mid Mm \mid S \longrightarrow^{1}_{p} gc \mid Mm' \mid S} \quad \text{if } IdInfo(gc) = g_{id},\ Mem(g_{id}) = m,\ v = m(T, f) \]

PWrite - A write action modifies the field value of an object;

\[ \frac{Mm \xrightarrow{Write(g_{id},T,f,v,t_{id})}^{1}_{p} Mm'}{gc \mid Mm \mid S \longrightarrow^{1}_{p} gc' \mid Mm' \mid S} \quad \text{if } IdInfo(gc) = g_{id},\ Mem(g_{id}) = m,\ gc' = ChangeMem(gc, ModifyMem(f, T, v, m)) \]

where $ChangeMem(gc, m)$ returns a component that differs from $gc$ only in the memory, which in the new component is $m$.

Thread Control

- A thread sends a message to a different thread, which reacts to it.

\[ \frac{tc_1 \xrightarrow{Send(l,\, t_{id2})}^{1}_{p} tc_1' \qquad tc_2 \xrightarrow{Receive(l,\, t_{id2})}^{1}_{p} tc_2'}{tc_1 \mid tc_2 \mid S \longrightarrow^{1}_{p} tc_1' \mid tc_2' \mid S} \]
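Block

\[ \frac{\Gamma_l; \Gamma \vdash stm : T}{\Gamma \vdash \{[\Gamma_l]\, stm\} : T} \qquad A\{(\{[\Gamma_l]\, stm\}, \Gamma)\} = \{[\Gamma_l]\, A\{(stm, \Gamma_l; \Gamma)\}\} \]

SyncStat

\[ \frac{\Gamma_l; \Gamma \vdash stm : T}{\Gamma \vdash synchronized(e)\{[\Gamma_l]\, stm\} : T} \qquad A\{(synchronized(e)\{[\Gamma_l]\, stm\}, \Gamma)\} = synchronized(A\{(e, \Gamma)\})\{A\{([\Gamma_l]\, stm, \Gamma)\}\} \]

Table 3: New rules for the type system of Java2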
5 Java2

Java2 is an enrichment of Java1 obtained by considering method synchronization and thus the methods wait, join, notify, notifyAll. Another mechanism for synchronization is a particular statement, the synchronized statement, used to ensure that a block of statements is executed by only one thread at a time.
5.1 Syntax and Static Semantics
The statement syntax of Java2 is the one of Java1 enriched by:
    Statement: SynchronizedStatement
    SynchronizedStatement: synchronized(Expression) Block

The FieldModifier rule is replaced by

    FieldModifier: static, synchronized
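The two syntactic forms added in Java2 look as follows in ordinary Java source. This is only an illustrative sketch: the classes `Counter` and `SyncSyntaxDemo` and all their members are hypothetical names chosen for the example.

```java
/** Hypothetical class showing the synchronized modifier and the synchronized statement. */
final class Counter {
    private static int created;   // class field, guarded by the lock of the class
    private int value;            // instance field, guarded by the lock of the object

    // Static synchronized method: the lock of the class Counter is taken.
    static synchronized void noteCreated() { created++; }

    static synchronized int created() { return created; }

    // Non-static synchronized method: the lock of the receiver object is taken.
    synchronized void increment() { value++; }

    int get() {
        // SynchronizedStatement: synchronized(Expression) Block
        synchronized (this) {
            return value;
        }
    }
}

final class SyncSyntaxDemo {
    /** Runs the given number of threads, each incrementing a shared counter perThread times. */
    static int run(int threads, int perThread) {
        Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }
}
```

Because `increment` is synchronized, no update is lost: `SyncSyntaxDemo.run(4, 10000)` returns 40000.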
The type inference system is enriched by rules for blocks and for the synchronized statement, reported in Table 3. The rule SyncStat only adds the local environment to the block. The rule MethodCall is left unchanged, but now we can have info = synchronized (abbreviated to sy in the annotations, just as st stands for static).
5.2 Semantics
Java2 differs from Java1 in that a method can be declared as synchronized. This means that only one thread at a time can execute such a method. In order to prevent multiple calls to synchronized methods from different threads, a lock is associated with each object or class (for synchronized class methods). Any thread must acquire this lock before starting the execution of a synchronized method. If an object has more than one synchronized method, then only one of its methods may be executed at a given time. A thread can obtain the lock for the same object many times; if it performs n locking operations on a lock, then it has to perform n unlocking operations before relinquishing it. The MmMng keeps track of the lock actions performed by threads, so that it is possible to check the correct matching between lock and unlock actions.
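The "n lock operations need n unlock operations" behaviour can be observed directly in Java. The sketch below swaps in `java.util.concurrent.locks.ReentrantLock` for the built-in object lock, because that library class exposes its hold count; it is an illustration of the counting discipline, not of the monitor mechanism itself. The class name `ReentrancyDemo` is hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

final class ReentrancyDemo {
    /**
     * The current thread acquires the same lock n times and then releases it
     * n times; the hold count observed after each step is returned.
     */
    static int[] lockAndUnlock(int n) {
        ReentrantLock lock = new ReentrantLock();
        int[] counts = new int[2 * n];
        for (int i = 0; i < n; i++) {
            lock.lock();
            counts[i] = lock.getHoldCount();
        }
        for (int i = 0; i < n; i++) {
            lock.unlock();
            counts[n + i] = lock.getHoldCount();
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints [1, 2, 3, 2, 1, 0]
        System.out.println(Arrays.toString(lockAndUnlock(3)));
    }
}
```

The lock becomes available to other threads only when the count returns to 0, which corresponds to the last of the n unlock actions.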
5.2.1 Thread Semantics
The semantics of threads in Java2 is given by $T\_LTS2 = (T\_states^{2}, T\_labels^{2}, \xrightarrow{\ \ }^{2}_{p})$.

T_states 2

The configurations are modified only in the status component, which can assume different values. The new set of status elements is

\[ Status^{2} = Status \cup (\{Awaken\} \times COMP\_ids) \cup (\{Relin\} \times List(COMP\_ids \times INTEGER) \times Status) \cup (\{Wait\} \times COMP\_ids \times INTEGER) \]

- $Awaken(g_{id})$ says the thread has been notified but it is still not running; it has to acquire the lock for $g_{id}$ again.
- $Relin(ll, s)$ says the thread has to relinquish the locks in $ll$ before becoming $s$.
- $Wait(g_{id}, n)$ says the thread has been suspended on object $g_{id}$ while executing a synchronized method; it has to perform $n$ lock actions before running again.

T_labels 2

Some new labels are added to the set $T\_labels^{1}$:

\[ T\_labels^{2} = T\_labels^{1} \cup (\{Wait, Notify, Notified, Lock, Unlock\} \times TH\_ids \times COMP\_ids) \cup (\{NotifyA\} \times COMP\_ids) \]

- $Wait(t_{id}, g_{id})$: $t_{id}$ executes the wait statement in the body of a method of $g_{id}$ and becomes waiting for it;
- $Notify(t_{id}, t'_{id})$, $NotifyA(t_{id}, g_{id})$: $t_{id}$ executes the notify and notifyAll statements; it sends a signal, respectively, to $t'_{id}$ or to every thread waiting for the lock on $g_{id}$;
- $Notified(t_{id}, t'_{id})$: $t_{id}$ communicates that it received the notify signal from $t'_{id}$ and that it is no longer waiting for an object;
- $Lock(t_{id}, g_{id})$: $t_{id}$ performs a lock action on the lock of $g_{id}$;
- $Unlock(t_{id}, g_{id})$: $t_{id}$ performs an unlock action on the lock of $g_{id}$.
Transition relation
The transition relation $\xrightarrow{\ \ }^{2}_{p}$ is defined by the rules below. The transitions defined for Java1 are still transitions for Java2, except for method calls and loads. A call to a synchronized method can occur only if the thread obtains the lock. A load inside a synchronized block has to be distinguished from the loads occurring in non-synchronized blocks, since it can occur only if the corresponding read has occurred before, inside the same synchronized block. Method structures returned after a method call are slightly modified to indicate whether the method is synchronized or not: $MET\_str = STATEMENT \times Locals \times Bool$. A new predicate $IsSynch$ on $MET\_str$ is characterized by: $IsSynch((stm, lm, true))$.
\[ \frac{tc \xrightarrow{\,l\,}^{1}_{p} tc'}{tc \xrightarrow{\,l\,}^{2}_{p} tc'} \quad \text{if } l \notin \{CallO(g_{id}, m\_name, args, mst),\ CallC(T, m\_name, args),\ SelfCall(T, m\_name, args),\ Load(\ldots)\} \]

Low level actions

TLoadS - A load action within a synchronized block needs to be represented by a new label.

\[ \frac{\langle stm \rangle \xrightarrow{Load(g_{id},T,f,v,t_{id})}^{1}_{p} \langle stm' \rangle}{\langle \{stm\}_{g'_{id}} \rangle \xrightarrow{SLoad(g_{id},T,f,v,t_{id})}^{2}_{p} \langle \{stm'\}_{g'_{id}} \rangle} \]

Method Calls

When the method is synchronized, the method invocation is performed tightly coupled with a lock action. We use the auxiliary function $SynEnt : STATEMENT \to COMP\_ids \cup \{undef\}$, characterized by

\[ SynEnt(\{stm\}) = undef \qquad SynEnt(\{stm\}_{g_{id}}) = g_{id} \]

which returns the lock acquired for the execution of a method. Remember that, when omitted, the thread identifier is implicitly assumed to be $t_{id}$.

TLock1 - Static and synchronized method: the class reference is annotated on the code block; a lock action on the lock of the class occurs together with the method call, which is an internal action and does not appear on the label.

\[ \frac{\langle g_{id}.[T, args, st]\, m\_name(v\_list),\ wm,\ lmS \rangle \xrightarrow{CallC(T,\, m\_name,\, args)}^{1}_{p} \langle \{stm'\},\ wm,\ lmS' \rangle}{\langle g_{id}.[T, args, st, sy]\, m\_name(v\_list),\ wm,\ lmS \rangle \xrightarrow{Lock(t_{id},\, T)}^{2}_{p} \langle \{stm'\}_{T},\ emp\_wm,\ lmS' \rangle} \]
- Static and non-synchronized method: nothing changes w.r.t. Java1; the transition becomes an internal action of the thread.

\[ \frac{\langle g_{id}.[T, args, st]\, m\_name(v\_list),\ lmS \rangle \xrightarrow{CallC(T,\, m\_name,\, args)}^{1}_{p} tc'}{\langle g_{id}.[T, args, st]\, m\_name(v\_list),\ lmS \rangle \xrightarrow{\ \ }^{2}_{p} tc'} \]

TLock2 - Non-static and synchronized method: a lock action on the lock of the object occurs together with the method call.

\[ \frac{\langle nc_{id}.[T, args, nst]\, m\_name(v\_list),\ wm,\ lmS \rangle \xrightarrow{CallO(nc_{id},\, m\_name,\, args,\, mst)}^{1}_{p} \langle \{stm\},\ wm,\ lmS' \rangle}{\langle nc_{id}.[T, args, nst]\, m\_name(v\_list),\ wm,\ lmS \rangle \xrightarrow{Lock(t_{id},\, nc_{id})}^{2}_{p} \langle \{stm\}_{nc_{id}},\ emp\_wm,\ lmS' \rangle} \quad \text{if } IsSynch(Code(mst)) \]

- Non-static and non-synchronized method: the transition becomes an internal action of the thread.

\[ \frac{\langle nc_{id}.[T, args, nst]\, m\_name(v\_list),\ lmS \rangle \xrightarrow{CallO(nc_{id},\, m\_name,\, args,\, mst)}^{1}_{p} tc'}{\langle nc_{id}.[T, args, nst]\, m\_name(v\_list),\ lmS \rangle \xrightarrow{\ \ }^{2}_{p} tc'} \quad \text{if } \neg IsSynch(Code(mst)) \]

A self method call:

TLock3 - If the method is not static and is synchronized, then a lock operation on the lock of the thread itself occurs.

\[ \frac{\langle t_{id}.[T, args, nst]\, m\_name(v\_list),\ wm,\ lmS \rangle \xrightarrow{SelfCall(T,\, m\_name,\, args)}^{1}_{p} \langle \{stm\},\ wm,\ lmS' \rangle}{\langle t_{id}.[T, args, nst]\, m\_name(v\_list),\ wm,\ lmS \rangle \xrightarrow{Lock(t_{id},\, t_{id})}^{2}_{p} \langle \{stm\}_{t_{id}},\ emp\_wm,\ lmS' \rangle} \quad \text{if } mst = MethodDec(m\_name, args, T, p),\ IsSynch(Code(mst)) \]

- If the method is not static and not synchronized, then the transition becomes an internal action of the thread.

\[ \frac{\langle t_{id}.[T, args, nst]\, m\_name(v\_list),\ lmS \rangle \xrightarrow{SelfCall(m\_name,\, args)}^{1}_{p} tc'}{\langle t_{id}.[T, args, nst]\, m\_name(v\_list),\ lmS \rangle \xrightarrow{\ \ }^{2}_{p} tc'} \quad \text{if } mst = MethodDec(m\_name, args, T, p),\ \neg IsSynch(Code(mst)) \]

Analogous rules hold if the method is static and synchronized, or static and not synchronized.

Synchronized

- A synchronized statement is executed by first evaluating the expression;

\[ \frac{\langle e \rangle \xrightarrow{\,l\,}^{2}_{p} \langle e' \rangle}{\langle synchronized(e)\{[L]\, stm\} \rangle \xrightarrow{\,l\,}^{2}_{p} \langle synchronized(e')\{[L]\, stm\} \rangle} \]

TLock4 - The executing thread takes the lock associated with the value of the expression and then executes the block; a new environment, local to the block, is pushed onto the stack;

\[ \langle synchronized(g_{id})\{[L]\, stm\},\ wm,\ lmS \rangle \xrightarrow{Lock(t_{id},\, g_{id})}^{2}_{p} \langle \{stm\}_{g_{id}},\ emp\_wm,\ Push(L, lmS) \rangle \]
Wait

- A thread becomes waiting for an object and must relinquish, n times, the $g_{id}$ lock it owns. The number of unlock operations to perform on $g_{id}$ is recorded in the MmMng component.

\[ \langle g_{id}.wait(),\ Run \rangle \xrightarrow{Wait(g_{id},\, n)}^{2}_{p} \langle skip,\ Relin((g_{id}, n), Wait(g_{id}, n)) \rangle \]

Notify, Notify All

- A thread can send a notify signal to one of the threads waiting for an object;

\[ \langle g_{id}.notify() \rangle \xrightarrow{Notify(t'_{id},\, g_{id})}^{2}_{p} \langle skip \rangle \]

- A thread can send a notifyAll signal to every thread waiting for an object;

\[ \langle g_{id}.notifyAll() \rangle \xrightarrow{NotifyA(g_{id})}^{2}_{p} \langle skip \rangle \]

- A thread can receive a notify signal, becoming awaken;

\[ \langle Wait(g_{id}, n) \rangle \xrightarrow{Notified(t_{id},\, g_{id})}^{2}_{p} \langle Awaken(g_{id}, n) \rangle \]

Lock

After a thread becomes awaken, it tries to acquire the lock again.

TLock5 -

\[ \langle t_{id},\ Awaken(g_{id}, n) \rangle \xrightarrow{Lock(t_{id},\, g_{id})}^{2}_{p} \langle t_{id},\ Awaken(g_{id}, n - 1) \rangle \quad \text{if } n > 1 \]

TLock6 -

\[ \langle t_{id},\ Awaken(g_{id}, 1),\ wm \rangle \xrightarrow{Lock(t_{id},\, g_{id})}^{2}_{p} \langle t_{id},\ Run,\ emp\_wm \rangle \]

Join

- A thread executes a join() statement for a thread and becomes waiting;

\[ \langle t_{id},\ t'_{id}.join(),\ Run,\ wm,\ m,\ lmS \rangle \xrightarrow{\ \ }^{2}_{p} \langle t_{id},\ skip,\ Wait(t'_{id}, 1),\ wm,\ m,\ lmS \rangle \]

Stop

- When a thread stops, it relinquishes every lock it owns and then sends a notifyAll signal to all the threads waiting for it;

\[ \frac{tc \xrightarrow{Receive(Stop,\, t_{id})}^{1}_{p} tc'}{tc \xrightarrow{Receive(Stop,\, t_{id})}^{2}_{p} tc'} \quad \text{if } Status(tc) \in \{PreStop, New\} \]

\[ \frac{\langle stm,\ tSt \rangle \xrightarrow{Receive(Stop,\, t_{id})}^{1}_{p} \langle no\_stat,\ Stop \rangle}{\langle stm,\ tSt \rangle \xrightarrow{Receive(Stop,\, t_{id},\, LocksList)}^{2}_{p} \langle no\_stat,\ Relin(LocksList, Stop) \rangle} \quad \text{if } tSt \notin \{PreStop, New\} \]

TUnlock1 - A thread relinquishes the locks it owns;

\[ \langle Relin((g_{id}, n), ll, tSt),\ wm \rangle \xrightarrow{Unlock(t_{id},\, g_{id})}^{2}_{p} \langle Relin((g_{id}, n - 1), ll, tSt),\ wm \rangle \quad \text{if } n > 2 \text{ and } \forall g_{id}, f, T, v.\ wm(g_{id}, T, f) = (v, ss) \Rightarrow ss \neq A \]

TUnlock2 -

\[ \langle Relin((g_{id}, 1), ll, tSt),\ wm \rangle \xrightarrow{Unlock(t_{id},\, g_{id})}^{2}_{p} \langle Relin(ll, tSt),\ wm \rangle \quad \text{if } \forall g_{id}, f, T, v.\ wm(g_{id}, T, f) = (v, ss) \Rightarrow ss \neq A \]

\[ \langle Relin(emptyl, Stop),\ wm \rangle \xrightarrow{NotifyA(t_{id})}^{2}_{p} \langle Stop,\ emp\_wm \rangle \]

TUnlock3 -

\[ \langle Relin((g_{id}, 1), Wait(g_{id}, m)),\ wm \rangle \xrightarrow{Unlock(t_{id},\, g_{id})}^{2}_{p} \langle Wait(g_{id}, m),\ wm \rangle \quad \text{if } n > 1 \text{ and } \forall g_{id}, f, T, v.\ wm(g_{id}, T, f) = (v, ss) \Rightarrow ss \neq A \]

Unlock

TUnlock4 -

\[ \langle \{skip\}_{g_{id}},\ wm \rangle \xrightarrow{Unlock(t_{id},\, g_{id})}^{2}_{p} \langle skip,\ wm \rangle \quad \text{if } \forall g_{id}, f, T, v.\ wm(g_{id}, T, f) = (v, ss) \Rightarrow ss \neq A \]
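The path a thread status follows around a wait() call (Run, Relin, Wait, Awaken, and Run again) can be traced with a small state machine. The sketch below is an illustration under stated assumptions: the class name `WaitingThread` and its methods are hypothetical, a single object is involved, and the side conditions on the working memory wm are left out.

```java
/** Hypothetical model of the status component of one thread around a wait() call. */
final class WaitingThread {
    enum Kind { RUN, RELIN, WAIT, AWAKEN }

    Kind kind = Kind.RUN;
    String gid;        // the object the thread waits on
    int toRelease;     // unlock actions still to perform while in Relin
    int toReacquire;   // lock actions still to perform once awaken

    /** Wait rule: Run becomes Relin((gid, n), Wait(gid, n)). */
    void doWait(String gid, int n) {
        require(kind == Kind.RUN && n >= 1);
        this.gid = gid;
        toRelease = n;
        toReacquire = n;
        kind = Kind.RELIN;
    }

    /** Unlock rules: one unlock action; the last one leads to Wait(gid, n). */
    void unlock() {
        require(kind == Kind.RELIN);
        if (--toRelease == 0) {
            kind = Kind.WAIT;
        }
    }

    /** Notified rule: Wait(gid, n) becomes Awaken(gid, n). */
    void notified() {
        require(kind == Kind.WAIT);
        kind = Kind.AWAKEN;
    }

    /** TLock5 and TLock6: one lock action; the last one leads back to Run. */
    void lock() {
        require(kind == Kind.AWAKEN);
        if (--toReacquire == 0) {
            kind = Kind.RUN;
        }
    }

    private static void require(boolean enabled) {
        if (!enabled) {
            throw new IllegalStateException("transition not enabled");
        }
    }
}
```

For a thread that holds the lock twice, the sequence `doWait("o", 2)`, two `unlock()` calls, `notified()`, and two `lock()` calls brings the status back to RUN; any other order throws, mirroring a transition that is not enabled.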
5.3 Main Memory Manager semantics
The semantics of MmMng is given by the lts $MmMng\_LTS2 = (MmMng\_states^{2}, MmMng\_labels^{2}, \xrightarrow{\ \ }^{2}_{p})$.

MmMng_states 2

The states of the MmMng are still stacks of elements, but the elements now belong to the new set Actions, which is enriched by new elements in $\{L\} \times COMP\_ids \times TH\_ids$ describing the lock occurrences. In the following, the metavariable $a$ ranges over Actions.

\[ MmMng\_labels^{2} = MmMng\_labels^{1} \cup (\{Locked, Unlocked\} \times COMP\_ids \times TH\_ids) \]

There are new labels describing the lock and unlock actions:
- $Locked(t_{id}, g_{id})$: a lock action $L(g_{id}, t_{id})$ is added to the MmMng;
- $Unlocked(t_{id}, g_{id})$: the presence of the corresponding lock action $L(g_{id}, t_{id})$ is checked, and that action is cancelled from the MmMng.
Transition relation
$\xrightarrow{\ \ }^{2}_{p}$ is defined by the rules below.

Old rules: the rules for $\xrightarrow{\ \ }^{1}_{p}$ are still valid.

\[ \frac{Mm \xrightarrow{\,l\,}^{1}_{p} Mm'}{Mm \xrightarrow{\,l\,}^{2}_{p} Mm'} \]

SLoaded

MmMngSLoaded - If a thread requires to load a value and the thread is executing a synchronized block (and, consequently, there exists a lock action for that thread), then the corresponding read action must have been performed after the lock action.

\[ Mm \xrightarrow{SLoaded(g_{id},T,f,v,t_{id})}^{2}_{p} RecCancel(R(g_{id},T,f,t_{id}),\ Mm_1 : L(t_{id}, g_{id}) : Mm_2) : Mm_3 \quad \text{if } \exists g'_{id}, Mm_1, Mm_2, Mm_3.\ Mm = Mm_1 : L(t_{id}, g'_{id}) : Mm_2 : R(g_{id},T,f,v,t_{id}) : Mm_3 \]

Locked

MmMngLocked - The lock action is recorded. There exists no other lock action for the same object.

\[ Mm \xrightarrow{Locked(t_{id},\, g_{id})}^{2}_{p} Push(L(t_{id}, g_{id}), Mm) \quad \text{if } \forall t'_{id}.\ t'_{id} \neq t_{id} \Rightarrow \neg In(L(t'_{id}, g_{id}), Mm) \]

Unlocked

MmMngUnlocked - The corresponding lock action is deleted if there is no store action following the lock action that is still in the stack.

\[ Mm \xrightarrow{Unlocked(t_{id},\, g_{id})}^{2}_{p} Cancel(L(t_{id}, g_{id}), Mm) \quad \text{if } \nexists v, f, T, g'_{id}, Mm_1, Mm_2.\ Mm = Mm_1 ; L(g_{id}, t_{id}) ; Mm_2 \wedge In(S(g'_{id}, T, f, v, t_{id}), Mm_2) \]
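The bookkeeping performed by the Locked and Unlocked rules can be sketched as a small table of lock actions. This is an illustration under stated assumptions: the class name `LockStack` and its methods are hypothetical, the condition on pending store actions in the Unlocked rule is omitted, and `lockNum` plays the role of counting the occurrences of one thread's lock actions on one component.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the lock actions L(tid, gid) kept by the MmMng. */
final class LockStack {
    // Each entry is the pair {tid, gid}; later entries are more recent.
    private final List<String[]> locks = new ArrayList<>();

    /** Locked: enabled only if no other thread holds a lock action on gid. */
    boolean locked(String tid, String gid) {
        for (String[] l : locks) {
            if (l[1].equals(gid) && !l[0].equals(tid)) {
                return false;
            }
        }
        locks.add(new String[] {tid, gid});
        return true;
    }

    /** Unlocked: cancels one lock action of tid on gid, if there is one. */
    boolean unlocked(String tid, String gid) {
        for (int i = locks.size() - 1; i >= 0; i--) {
            String[] l = locks.get(i);
            if (l[0].equals(tid) && l[1].equals(gid)) {
                locks.remove(i);
                return true;
            }
        }
        return false;
    }

    /** Number of lock actions of tid on gid currently recorded. */
    int lockNum(String tid, String gid) {
        int n = 0;
        for (String[] l : locks) {
            if (l[0].equals(tid) && l[1].equals(gid)) {
                n++;
            }
        }
        return n;
    }
}
```

A thread that already owns the lock may lock again, while a second thread is refused until every lock action of the owner has been cancelled.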
5.4 Program Semantics
The semantics of Java2 programs is given by the lts $P\_LTS2 = (P\_states^{1}, P\_labels^{1}, \longrightarrow^{2}_{p})$.

P_states: The passive component states are unchanged, and the program states are defined as for the semantics of Java1.

P_labels: Labels for the LTS of a program are left unchanged w.r.t. the Java1 semantics.
Transition relation: we extend the relation $\longrightarrow^{1}_{p}$, as defined for Java1, to the new states.

Method Call

Non-synchronized method calls are internal actions. Synchronized method calls are labelled by a lock label, and are then treated by the PLock rule.

Low level actions

PSLoad - A thread performs a load operation while in a synchronized block, and a value is returned from wm;

\[ \frac{tc \xrightarrow{SLoad(g_{id},T,f,v,t_{id})}^{2}_{p} tc' \qquad Mm \xrightarrow{SLoaded(g_{id},T,f,v,t_{id})}^{2}_{p} Mm'}{tc \mid Mm \mid S \longrightarrow^{2}_{p} tc' \mid Mm' \mid S} \]

Lock and Unlock actions

PLock - A thread performs a lock operation on a lock that is not assigned or that it already owns;

\[ \frac{tc \xrightarrow{Lock(t_{id},\, g_{id})}^{2}_{p} tc' \qquad Mm \xrightarrow{Locked(t_{id},\, g_{id})}^{2}_{p} Mm'}{tc \mid Mm \mid S \longrightarrow^{2}_{p} tc' \mid Mm' \mid S} \]

PUnlock - A thread performs an unlock operation;

\[ \frac{tc \xrightarrow{Unlock(t_{id},\, g_{id})}^{2}_{p} tc' \qquad Mm \xrightarrow{Unlocked(t_{id},\, g_{id})}^{2}_{p} Mm'}{tc \mid Mm \mid S \longrightarrow^{2}_{p} tc' \mid Mm' \mid S} \]

Wait

- A thread starts the unlock of all the locks it owns before becoming waiting;

\[ \frac{tc \xrightarrow{Wait(g_{id},\, n)}^{2}_{p} tc'}{tc \mid Mm \mid S \longrightarrow^{2}_{p} tc' \mid Mm \mid S} \quad \text{if } IdInfo(tc) = t_{id},\ n = LockNum(t_{id}, g_{id}, Mm) \]

where $LockNum(t_{id}, g_{id}, Mm)$ returns the number of occurrences of $L(g_{id}, t_{id})$ in $Mm$.

Notify, Notify All

- A thread sends a notify signal to one thread waiting for a passive component; the thread becomes awaken;

\[ \frac{tc_1 \xrightarrow{Notify(t_{id2},\, g_{id})}^{2}_{p} tc_1' \qquad tc_2 \xrightarrow{Notified(t_{id2},\, g_{id})}^{2}_{p} tc_2'}{tc_1 \mid tc_2 \mid S \longrightarrow^{2}_{p} tc_1' \mid tc_2' \mid S} \]
- A thread sends a notifyAll signal for an object or class, and all the threads waiting for it become awaken again;

\[ \frac{tc \xrightarrow{NotifyA(g_{id})}^{2}_{p} tc' \qquad tc_i \xrightarrow{Notified(t_{id\,i},\, g_{id})}^{2}_{p} tc_i' \quad i = 1 \ldots n}{tc \mid tc_1 \mid \ldots \mid tc_n \mid Tset_1 \mid S \longrightarrow^{2}_{p} tc' \mid tc_1' \mid \ldots \mid tc_n' \mid Tset_1 \mid S} \quad \text{if } NoWait(Tset_1, g_{id}) \]

$NoWait(Tset, g_{id})$ is true if there is no thread in $Tset$ waiting for $g_{id}$.³

Stop

- When a thread is stopped, it must relinquish every lock it has;

\[ \frac{tc_1 \xrightarrow{Send(Stop,\, t_{id1})}^{2}_{p} tc_1' \qquad tc_2 \xrightarrow{Receive(Stop,\, t_{id1},\, LocksList)}^{2}_{p} tc_2'}{tc_1 \mid tc_2 \mid S \longrightarrow^{2}_{p} tc_1' \mid tc_2' \mid S} \quad \text{if } LocksList = LocksOwned(t_{id1}, Mm) \]

- All the other statements within a synchronized method are executed;

\[ \frac{tc \xrightarrow{\,l\,}^{2}_{p} tc'}{tc \mid S \longrightarrow^{2}_{p} tc' \mid S} \quad \text{if } l \notin \{CallO(\ldots),\ CallC(\ldots),\ SelfCall(\ldots),\ Notif(\ldots),\ Stop(\ldots),\ Lock(\ldots),\ Unlock(\ldots)\} \]

³ In our compositional approach to the semantics, all the information on the state of the threads used in a transition of the program should be given by the labels of the thread transitions. To be precise, in this case there should be a transition performed by each thread that communicates whether it is waiting for a given object or not. For simplicity, we just use a predicate on a thread set that checks whether its elements are threads that are not waiting for the lock of the object.
MEAN                          RESTRICTIONS                      OPERATIONS
Files                         applications using local files    read-write
Other applet's method call    applets, not applications         read-write
Displaying HTML pages         applets, not applications         read
Reading from a URL            none                              read
Connection to a URL           none                              read and (sometimes) write
Sockets                       none                              read and write
Remote Method Invocation      ....                              ...

Table 4: Communication in Java
6 Java3

Java3 is the extension of Java2 obtained by introducing some standard classes of Java that allow a program to communicate with the external environment. Obviously, we picked only some of these classes, trying to reconcile two needs: a language powerful enough to program reactive, distributed systems, which presumably need a wide range of communication mechanisms (interaction with external users, communication by channels among different applications, broadcasting communications, I-O operations over files, ...), and a simple and concise semantics of the Java sublanguage we use in system development. The semantics of the I-O mechanisms implemented in Java is of particular interest with respect to the observations that can be made on the model of a program, since the information exchange that occurs during I-O operations is visible outside the program and will be used to determine the equivalence notion among programs: two programs are equivalent iff the same information exchange with the external environment is observed in their models.
6.1 Considerations on Communication in Java
Java programs can communicate and interact with external entities in different ways. A first classification separates the interactions by means of a graphical interface from the interactions consisting of reading and writing operations on an external information resource. In the first case, a program can interact with an external user who sends information by, for example, clicking on a button or typing text in a field. The program can process the occurrences of these events and react by changing the graphical output. In the other case, a program can bring in information from an external source or send out information to an external destination. The information can be located anywhere (on a disk, in memory, on a host on the network, ...) and it can be of any type: text, images, sounds. For the moment, we set graphical interfaces aside and focus on the other cases.

In Table 4 we give a schematic classification of some mechanisms for connecting to an external resource, considering which access restrictions apply and which operations are allowed. In the first column we list the main mechanisms a Java program has to communicate with the outside, that is: using files, asking the browser to display an HTML page, using a URL simply to read a document or to establish a connection with a resource, calling methods of other applets, opening sockets to connect to remote processes, calling methods of remote objects.

The fundamental concept of Java I/O is the stream. A stream is an ordered sequence of data that has a source (for input streams) or a destination (for output streams). To read data from or write data to an external entity, which could be a file, a URL, a socket or something else, a program opens a stream on that entity. Java defines several classes implementing streams that operate on different data types. No matter where the information is coming from or going to, and no matter what type of data is being read or written, operations on streams are pretty much the same.
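The remark that stream operations are the same whatever the source or destination can be illustrated with a copy routine written only against `java.io.InputStream` and `java.io.OutputStream`. This is an illustrative sketch: the class `StreamCopy` and its methods are hypothetical names, and in-memory byte array streams stand in for files or sockets.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class StreamCopy {
    /** Copies one byte at a time until end of stream; returns the number of bytes copied. */
    static int copy(InputStream in, OutputStream out) {
        try {
            int copied = 0;
            int b;
            while ((b = in.read()) != -1) {   // read() returns -1 at end of stream
                out.write(b);
                copied++;
            }
            return copied;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Round-trips an ASCII string through in-memory streams. */
    static String demo(String text) {
        InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copy(in, out);
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }
}
```

The same `copy` method works unchanged if the arguments are a `FileInputStream` and the output stream of a socket, since it depends only on the two abstract stream classes.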
It is worth noting that external resources can change dynamically (for example, files can be read or written by other processes, and CGI programs can be modified or moved), since Java programs do not have complete control over them. More specifically:
- a file can be read, written or deleted from outside the Java program by other processes; there is no way for the program to have exclusive access to a file;
- a URL connection may connect to a file, an HTML page, a CGI program; a priori, we do not know what kind of entity is linked to the connection;
- a socket is a TCP connection (UDP communication with sockets is also supported) with a given port on a given machine.

From these observations, we can deduce that the "real" content of a file (or a socket, or an HTML page) cannot be modelled inside the state of the program, since it is known only outside the program. Besides, we need components of the program states that describe streams, sockets and file identifiers, in order to resolve the expression evaluation and to identify the entities that exchange data with the program.

As we said above, giving a complete semantics of I-O in Java requires a long (and in some parts repetitive) work, and it is beyond the scope of our work. We are interested in giving the semantics of a sublanguage of Java that is powerful enough to implement a wide class of reactive-concurrent systems. From our point of view, it is not interesting to consider all the classes that allow I-O operations in Java; we want to focus on the most relevant ones. We have to adopt some pragmatic criteria in the choice of the classes, considering that the systems developed by our method probably need:
- files, for
  - standard input-output operations; it is the simplest mechanism to interact with an external user (we do not consider graphical interfaces for the moment);
  - data persistency (files are used to store the result of the computation of a part of a complex system);
- sockets, as the main mechanism to communicate with processes residing on different hosts; it is the basic and simplest mechanism for distributed programming in Java (Remote Method Invocation will be considered in planned future work).
6.2 Files in Java
A file is an external "physical" entity, but Java provides a way to see it locally as an object of the program. Operations for receiving data from and sending data to external resources are implemented as calls to methods of the corresponding objects. So, if we are interested in observing the flow of data between a Java program and an external entity, we can simply enrich the transition labels with the information about the data exchanged during these method calls. However, for the sake of clarity, notice that the objects having reading and writing methods represent streams rather than files. A stream is a communication channel between a source and a destination. In Java it is possible to read from or to write to a stream acting as a common channel between the program and an external file. Input streams allow reading from a source; output streams allow writing to a destination. Java also differentiates between character and byte streams. Obviously, we take into consideration just a very small subset of these classes, compared with the wide hierarchy of classes representing different kinds of streams in Java.
6.2.1 Streams for Standard Input and Output
The standard input, generally associated with the keyboard, is accessed by means of the field in of the class System, which is an instance of the class InputStream. The method read, which reads bytes from the source, is the most important one. This method is overloaded in the class to allow the reading of a bunch of bytes rather than just one. For simplicity, we only consider the reading of a single byte at a time. The only instances of InputStream we consider are the standard input (which we assume to be in the initial configuration of the program, so that it is not explicitly created as an instance of InputStream) and, as we will see in the following, the input streams associated with sockets (which are implicitly created by a method of the Socket class). So, we do not need a constructor for the class InputStream. This is the simplified version of the InputStream class used in Java3:
    public abstract class java.io.InputStream extends java.lang.Object
        // Methods
        public void close();
        public int read(byte b);
The standard output, generally associated with the screen, is accessed by means of the field out of the class System, which is an instance of the class PrintStream. The method write, which writes bytes to the destination, is the most important one. This method is overloaded in the class to allow the writing of a bunch of bytes or of a complete line. For simplicity, we only consider the writing of a single byte at a time and, for the reasons described above, there is no constructor for that class. These simplifications make the classes PrintStream and OutputStream (the superclass of PrintStream) essentially the same. So, in Java3 we use only OutputStream, and we suppose that System.out has type OutputStream. This is the simplified version of the OutputStream class used in Java3:

    public class java.io.OutputStream
        // Methods
        public void close();
        public void write(byte b);
6.2.2 Streams to Read and Write Files
FileInputStream and FileOutputStream represent input and output streams on a file that lives on the native file system. For simplicity, we assume that it is possible to create a file stream only by using a file name (and not with File and FileDescriptor objects). Again, we only consider read and write methods that manage one byte at a time. These are the simplified versions of the classes FileInputStream and FileOutputStream used in Java3:
    public class FileInputStream extends InputStream
        // Constructors
        public FileInputStream(String path);
        // Methods
        public void close();
        public int read(byte b);   // simplified to a single byte

    public class FileOutputStream extends OutputStream
        // Constructors
        public FileOutputStream(String path);
        // Methods
        public void write(byte b);
        public void close();
It is clear that, for a Java program, performing read or write operations on the standard I-O files is the same as performing them on streams on files.
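As an illustration of byte-at-a-time file access, the sketch below writes a few bytes with `FileOutputStream` and reads them back with `FileInputStream`, both created from a file name as assumed in Java3. Note that it uses the actual signatures of the standard library (`write(int)` and the parameterless `read()`, which returns the byte as an int or -1 at end of file), not the simplified Java3 ones. The class name `FileByteDemo` and the temporary file prefix are hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class FileByteDemo {
    /** Writes the given byte values to a temporary file, then reads them back one by one. */
    static int[] roundTrip(int... bytes) {
        try {
            File tmp = File.createTempFile("java3-demo", ".bin");
            tmp.deleteOnExit();

            // Output stream on the file, created from its path name.
            try (FileOutputStream out = new FileOutputStream(tmp.getPath())) {
                for (int b : bytes) {
                    out.write(b);            // writes the low-order 8 bits of b
                }
            }

            // Input stream on the same file.
            int[] result = new int[bytes.length];
            try (FileInputStream in = new FileInputStream(tmp.getPath())) {
                for (int i = 0; i < result.length; i++) {
                    result[i] = in.read();   // value in 0..255, or -1 at end of file
                }
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each `write` and `read` call here is one exchange of a single byte with the external environment, which is the granularity at which the Java3 labels observe I-O.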
Figure 4: A socket connection request (diagram: a Client program and a Server program exchanging SocketRequest(host, port) and ConnectionExtablished(host, port))
6.3 Sockets in Java
At the core of Java's networking support are the classes Socket and DatagramSocket, which define channels for communication between processes over an IP network. A new socket is created by specifying a host, either by name or with an IP address, and a port number on the host. The class Socket is used to create TCP connections over an IP network (while DatagramSocket manages UDP connections, which we disregard for the moment). A process on the remote host must be listening on the specified port number for incoming connection requests. In Java, this can be done using the class ServerSocket, whose main activity is to listen on a port for connection requests from a client and eventually return a new socket on a new port to allow such a connection. In Figure 4 we report the two parts of a socket connection. On one side, a client performs a connection request by requiring the creation of a new socket connected to a remote port on a specified host, for example using this code:

Client Code:
    Socket myS;
    InputStream in;
    OutputStream out;
    ....
    myS = new Socket("myServer", 3000);
    in = myS.getInputStream();
    out = myS.getOutputStream();
The new operation requires that the client program performs an interaction with the external environment (see Figure 4), as a connection request must be sent to the server at the given port on the given host. If
- the host does not exist,
- the port does not exist or no Java server is listening on it, or
- new connections are no longer accepted by the server,
then the corresponding exception is thrown; otherwise, a socket bound to a local port of the client is returned. Two streams can be used to read and write over the socket. On the other side, the server program uses the class ServerSocket for creating a server-side socket.

Server Code:
    ServerSocket serverS;
    Socket clientS;
    ...
    serverS = new ServerSocket(3000);
    clientS = serverS.accept();

Figure 5: A client-server socket schema (diagram labels: Client1 ... Clientk, Server, ServerS, sockets S1 ... Sn and Sk, ports port, port1 ... portN, streams IN and OUT)
The accept method waits until a client requests a socket connection on port 3000 on the server. When the connection is accepted and established, the method returns a Socket object that is bound to a new port (see Figure 5). These are the simplified versions of the ServerSocket and Socket classes used in Java3:

    public class ServerSocket extends Object
        ServerSocket(int port);
        Socket accept();
        void close();
        int getLocalPort();   // returns the port on which the server socket is listening

    public class Socket extends Object
        Socket(String host, int port);
        void close();
        int getLocalPort();   // returns the local port to which the socket is bound
        int getPort();        // returns the remote port to which the socket is connected
        InputStream getInputStream();
        OutputStream getOutputStream();
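The client and server fragments above can be combined into one self-contained experiment on the local machine. The following is a sketch under stated assumptions: the class name `LoopbackEcho` is hypothetical, the server side runs in a second thread of the same program, the port is chosen by the system (port 0) instead of the fixed port 3000, and the loopback address replaces the host name "myServer".

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class LoopbackEcho {
    /** Sends one byte to a one-shot local server and returns the byte echoed back. */
    static int exchange(int b) {
        InetAddress local = InetAddress.getLoopbackAddress();
        try (ServerSocket serverS = new ServerSocket(0, 1, local)) {
            // Server side: accept one connection, read one byte, write it back.
            Thread server = new Thread(() -> {
                try (Socket clientS = serverS.accept()) {
                    int received = clientS.getInputStream().read();
                    OutputStream reply = clientS.getOutputStream();
                    reply.write(received);
                    reply.flush();
                } catch (IOException e) {
                    // Sketch only: on failure the client sees end of stream or a timeout.
                }
            });
            server.start();

            // Client side: connect, send one byte, wait for the answer.
            int answer;
            try (Socket myS = new Socket(local, serverS.getLocalPort())) {
                myS.setSoTimeout(5000);
                InputStream in = myS.getInputStream();
                OutputStream out = myS.getOutputStream();
                out.write(b);
                out.flush();
                answer = in.read();
            }
            try {
                server.join(5000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return answer;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The socket returned by `accept` on the server side and the socket created with `new Socket(...)` on the client side are the two ends of the same connection; each end has its own input and output stream.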
7 Syntax and Static Semantics

The syntax of Java3 is the syntax of Java2 enriched by the class declarations and class bodies of InputStream, PrintStream, FileInputStream, FileOutputStream, Socket and ServerSocket. Thus Java3 has the same statements as Java2. The difference is that every program is implicitly enriched by the method code of the standard classes above. The inference rules for term typing and annotation are the same as those given for Java2. In the following rules for the dynamic semantics we omit the annotations on terms, since they are not meaningful there.
8 Thread Semantics
The introduction of the I-O mechanism of Java does not modify thread states, but it requires new labels and rules that extend the definition of the transition relation.
So, in Java3 a thread is modelled by $\mathit{T\_LTS}^3 = (\mathit{T\_states}^2, \mathit{T\_labels}^3, \stackrel{}{--\!\!>}_{p}^{3})$. We define the syntactic sets:
- InputStream_ids: set of identifiers for instances of InputStream, ranged over by instid.
- PrintStream_ids: set of identifiers for instances of PrintStream, ranged over by outstid.
- FileInputStream_ids: set of identifiers for instances of FileInputStream, ranged over by fsinid.
- FileOutputStream_ids: set of identifiers for instances of FileOutputStream, ranged over by fsoutid.
- Stream_ids $=$ InputStream_ids $\cup$ PrintStream_ids $\cup$ FileInputStream_ids $\cup$ FileOutputStream_ids: set of stream names, ranged over by stid.
- Strings: set of strings that represent an external file or socket, ranged over by str.
- Socket_ids: set of socket names, ranged over by skid.
- ServerSocket_ids: set of server socket names, ranged over by Sskid.
We also use the metavariable genIstid, which ranges over InputStream_ids $\cup$ FileInputStream_ids, and genOstid, which ranges over PrintStream_ids $\cup$ FileOutputStream_ids.
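Thread Labels
$\mathit{T\_labels}^3 = \mathit{T\_labels}^2$
  $\cup\ (\{New\} \times \{FileInputStream, FileOutputStream\} \times \mathit{Strings} \times \mathit{Stream\_ids})$
  $\cup\ (\{NewServerSocket\} \times INTEGER \times \mathit{ServerSocket\_ids})$
  $\cup\ (\{NewSocket\} \times \mathit{Strings} \times INTEGER \times \mathit{Socket\_ids})$
  $\cup\ (\{Accept\} \times \mathit{ServerSocket\_ids} \times \mathit{Socket\_ids})$
  $\cup\ (\{ReadS, WriteS\} \times \mathit{Stream\_ids} \times \mathit{Bytes})$
  $\cup\ (\{GetInputStream, GetOutputStream\} \times \mathit{Stream\_ids} \times \mathit{Socket\_ids})$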
We briefly analyse every label:
- New(FC, fname, stid): a new stream stid of class FC over file fname is created (FC $\in$ {FileInputStream, FileOutputStream});
- NewServerSocket(i, Sskid): a new ServerSocket Sskid on port i is created;
- NewSocket(host, port, skid): a new Socket skid is created, connected to port port of host host;
- Accept(Sskid, skid): a ServerSocket Sskid accepts a connection and returns a new socket skid;
- ReadS(stid, b), WriteS(stid, b): a byte b is read from (written to) stream stid;
- GetInputStream(instid, skid), GetOutputStream(outstid, skid): the input (output) stream associated with socket skid is returned.
The transition relation is defined by the rules below (recall that we omit, whenever this causes no confusion, the state components that are not meaningful; for example, in the following $\langle stm \rangle$ stands for $\langle tid, stm, tSt, m, lmS \rangle$).

New stream: a thread calls the new method of class FileInputStream.
$$\langle \mathtt{new\ FileInputStream}(fname) \rangle \stackrel{New(FileInputStream,\, fname,\, stid)}{--\!\!>}_{p}^{3} \langle stid \rangle$$
Analogous for FileOutputStream.

Read: a thread reads a byte from an InputStream or FileInputStream instance.
$$\langle stid.\mathtt{read}() \rangle \stackrel{ReadS(stid,\, b)}{--\!\!>}_{p}^{3} \langle b \rangle$$
Write: a thread writes a byte to an OutputStream or FileOutputStream instance.
$$\langle stid.\mathtt{writeS}(b) \rangle \stackrel{WriteS(stid,\, b)}{--\!\!>}_{p}^{3} \langle \mathtt{skip} \rangle$$

NewServerSocket: a thread calls the new method to create a server socket listening on a given port.
$$\langle \mathtt{new\ ServerSocket}(port) \rangle \stackrel{NewServerSocket(port,\, Sskid)}{--\!\!>}_{p}^{3} \langle Sskid \rangle$$

Accept: a server socket Sskid accepts a connection request and returns a new socket.
$$\langle Sskid.\mathtt{accept}() \rangle \stackrel{Accept(Sskid,\, skid)}{--\!\!>}_{p}^{3} \langle skid \rangle$$

NewSocket: a thread tries to connect to the server. A socket connected to a socket on the server's host is returned.
$$\langle \mathtt{new\ Socket}(host, port) \rangle \stackrel{NewSocket(host,\, port,\, skid)}{--\!\!>}_{p}^{3} \langle skid \rangle$$

GetInputStream:
$$\langle skid.\mathtt{getInputStream}() \rangle \stackrel{GetInputStream(instid,\, skid)}{--\!\!>}_{p}^{3} \langle instid \rangle$$
GetOutputStream is analogous.
8.1 Memory Manager Semantics
It is left unchanged w.r.t. Java2.
9 Program Semantics
The program states in Java3 can also contain components modelling streams and sockets. The labels must now describe the information exchanged with the external environment because of I-O operations. In Java3 a program is modelled by $\mathit{P\_LTS}^3 = (\mathit{P\_states}^3, \mathit{P\_labels}^3, \longrightarrow_{p}^{3})$.
9.1 Program states
The description of an intermediate state in the computation of a Java3 program is enriched by elements describing objects of class streams and sockets.
9.1.1 Stream State
The streams are passive components, as they do not perform independent activities on their own. We model them simply by describing their states at each moment of the computation. A stream state must contain:
- an identifier for the stream;
- a reference to the entity it is attached to (file, socket, ...);
- a status (open-closed) bit.
$\mathit{Stream\_States} = \mathit{Stream\_ids} \times \mathit{Ext\_Resource} \times BOOL$, where
- Stream_ids is the set of stream identifiers, ranged over by the metavariable stid;
- Ext_Resource $=$ Strings $\cup$ Socket_ids is the set of file names or socket identifiers, ranged over by the metavariable res.
$\langle stid, str, b \rangle$ represents a state for the stream stid associated with the external entity identified by the name str, which is open or not depending on the value of b. For simplicity, the standard input and output streams are modeled like streams to files, where the name of the resource is IN for standard input and OUT for standard output. We define the extractors:
- $Resource : \mathit{Stream\_States} \to \mathit{Ext\_Resource}$, defined by $Resource(\langle stid, str, b \rangle) = str$, which returns the reference to the object with which the stream is associated;
- $Closed : \mathit{Stream\_States} \to BOOL$, defined by $Closed(\langle stid, str, b \rangle) = b$, which returns the status bit;
- $IdInfo : \mathit{Stream\_States} \to \mathit{Stream\_ids}$, defined by $IdInfo(\langle stid, str, b \rangle) = stid$;
- $Class : \mathit{Stream\_ids} \to \mathit{CL\_ids}$: stream identifiers (like the thread and object identifiers introduced in Java1) also contain information about the class they are instances of.
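A stream state and its extractors can be pictured as an immutable value with three fields. The sketch below is only an illustration under stated assumptions: identifiers and resources are represented as plain strings, the status bit is read as "true means open" (as in the transition rules, where a new stream is created with status true), and the helper close is hypothetical.

```java
public class StreamStateSketch {
    // A stream state <stid, res, b>: identifier, external resource, status bit.
    public static final class StreamState {
        public final String stid;     // stream identifier
        public final String resource; // file name or socket identifier (IN / OUT for the standard streams)
        public final boolean open;    // status bit b; assumed here: true means open

        public StreamState(String stid, String resource, boolean open) {
            this.stid = stid;
            this.resource = resource;
            this.open = open;
        }
    }

    // Extractor Resource: the entity the stream is attached to.
    public static String resource(StreamState s) { return s.resource; }

    // Extractor returning the status bit.
    public static boolean statusBit(StreamState s) { return s.open; }

    // Extractor IdInfo: the stream identifier.
    public static String idInfo(StreamState s) { return s.stid; }

    // Hypothetical helper: closing yields a new state with the status bit cleared.
    public static StreamState close(StreamState s) {
        return new StreamState(s.stid, s.resource, false);
    }
}
```

Since streams are passive components, nothing in the sketch acts on its own; every change of state is the effect of a program transition.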
9.1.2 Socket State
A ServerSocket object waits for connection requests on a port of the server host. A Socket object is bound to a local port on a network host and is connected to another socket bound to a port on the server host. Since just one socket can be connected to a single port, a socket can be uniquely identified by a port number. The state of a socket must contain two port numbers: the local port and the remote port to which the socket is connected. In the following, port is a metavariable ranging over INTEGER, used to denote a port number.

The state for ServerSocket instances contains:
- a status (open-closed) bit;
- a local port number.
$\mathit{Ser\_Socket\_States} = \mathit{ServerSocket\_ids} \times BOOL \times INTEGER$, where ServerSocket_ids is the set of server socket identifiers, ranged over by the metavariable Sskid. $\langle Sskid, b, port \rangle$ represents a state for the server socket Sskid listening on the server port port; b is true if the server socket is open. The function $Port : \mathit{Ser\_Socket\_States} \to INTEGER$, defined by $Port(\langle Sskid, b, port \rangle) = port$, returns the port to which the server socket is bound.

The state for Socket instances contains:
- input and output stream references;
- a status (open-closed) bit;
- a local port number;
- a port number identifying the socket on the other end of the connection.
$\mathit{Socket\_States} = \mathit{Socket\_ids} \times \mathit{Stream\_ids} \times \mathit{Stream\_ids} \times BOOL \times INTEGER \times INTEGER$, where Socket_ids is the set of socket identifiers, ranged over by the metavariable skid. $\langle skid, stid_{IN}, stid_{OUT}, b, port_l, port_r \rangle$ represents a state for the socket skid whose input stream is $stid_{IN}$ and whose output stream is $stid_{OUT}$; $port_l$ and $port_r$ identify, respectively, the local and the remote port; b is true if the socket is open. We define the following functions to extract components from a socket state:
- $LocalPort : \mathit{Socket\_States} \to INTEGER$, defined by $LocalPort(\langle skid, stid_{IN}, stid_{OUT}, b, port_l, port_r \rangle) = port_l$;
- $RemotePort : \mathit{Socket\_States} \to INTEGER$, defined by $RemotePort(\langle skid, stid_{IN}, stid_{OUT}, b, port_l, port_r \rangle) = port_r$;
- $InputStr : \mathit{Socket\_States} \to \mathit{Stream\_ids}$, defined by $InputStr(\langle skid, stid_{IN}, stid_{OUT}, b, port_l, port_r \rangle) = stid_{IN}$;
- $OutputStr : \mathit{Socket\_States} \to \mathit{Stream\_ids}$, defined by $OutputStr(\langle skid, stid_{IN}, stid_{OUT}, b, port_l, port_r \rangle) = stid_{OUT}$.

It is worth noting that the classes InputStream, OutputStream, ..., Socket, ... can be modeled by a simpler state than the one used in Java2, since these classes have neither class fields nor instance fields and, consequently, the memory and initializer components can be discarded. Thus, the I-O classes are modeled by pairs in $\mathit{I\text{-}OClasses} = \mathit{CL\_ids} \times Set(\mathit{OB\_ids})$. A class state $\langle cid, Sid \rangle$ consists of:
- cid: the class identifier;
- Sid: the set of identifiers of the previously created objects of this class.
Now we can define program states for Java3.
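$$\mathit{P\_states}^3 = Set(\mathit{T\_states}^2 \cup \mathit{G\_states} \cup \mathit{Ser\_Socket\_States} \cup \mathit{Socket\_States} \cup \mathit{I\text{-}OClasses}) \times \mathit{MmMng\_states}^2$$
where I-OClasses contains the components for the classes InputStream, OutputStream, ...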
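As an illustration of the socket states of Sect. 9.1.2, which are among the components of a program state, the following sketch represents a socket state as an immutable value. Several things are assumptions of the sketch and not part of the semantics: identifiers are plain strings, a stream reference that has not been created yet is represented by null, and the helper peers (each socket's remote port is the other's local port) is a hypothetical way of reading "connected to another socket".

```java
public class SocketStateSketch {
    // A socket state <skid, stidIN, stidOUT, b, portl, portr>.
    public static final class SocketState {
        public final String skid;
        public final String inputStr;  // null until getInputStream is called
        public final String outputStr; // null until getOutputStream is called
        public final boolean open;
        public final int localPort;
        public final int remotePort;

        public SocketState(String skid, String inputStr, String outputStr,
                           boolean open, int localPort, int remotePort) {
            this.skid = skid;
            this.inputStr = inputStr;
            this.outputStr = outputStr;
            this.open = open;
            this.localPort = localPort;
            this.remotePort = remotePort;
        }

        // Effect of a getInputStream call on the socket component:
        // only the input stream reference changes.
        public SocketState withInputStr(String stid) {
            return new SocketState(skid, stid, outputStr, open, localPort, remotePort);
        }
    }

    // Hypothetical helper: two open sockets are the two ends of one connection
    // when each one's remote port is the other's local port.
    public static boolean peers(SocketState a, SocketState b) {
        return a.open && b.open
            && a.localPort == b.remotePort
            && a.remotePort == b.localPort;
    }
}
```

Keeping the state immutable mirrors the way the rules work: a transition never updates a component in place, it replaces it by a new one in the resulting program state.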
Program labels
$\mathit{P\_labels}^3 = \{\ \}$
  $\cup\ (\{NEW\_IN\_FILESTR, NEW\_OUT\_FILESTR\} \times \mathit{Strings})$
  $\cup\ (\{READ, WRITE\} \times \mathit{Stream\_ids} \times \mathit{Strings} \times \mathit{Bytes})$
  $\cup\ (\{ACC\_CONN, REQ\_CONN\} \times INTEGER \times \mathit{Strings} \times INTEGER \times INTEGER \times \mathit{Socket\_ids})$

We briefly analyse every label:
- NEW_IN_FILESTR(fname), NEW_OUT_FILESTR(fname): a new input (output) stream associated with the file fname is created;
- READ(stid, str, b), WRITE(stid, str, b): a byte b is read from (written to) a stream stid associated with the file or socket str;
- ACC_CONN(port, host, $port_{rem}$, $port_{loc}$, skid): a socket connection with port port on host host, coming from port $port_{rem}$, is accepted; a new socket skid on port $port_{loc}$ is returned;
- REQ_CONN(port, host, $port_{rem}$, $port_{loc}$, skid): a socket connection is requested from port $port_{rem}$ to port port on host host; a new socket skid on port $port_{loc}$ is returned.
Transition Relation
$\longrightarrow_{p}^{3}\ : \mathit{P\_states}^3 \times \mathit{P\_labels}^3 \to \mathit{P\_states}^3$ is defined by the inductive rules below.
New Stream: there is an interaction with the external environment, since the system has to create the file if it does not exist. A new component, modeling the new stream, is added to the program state.
$$\frac{tc \stackrel{New(FileInputStream,\, fname,\, instid)}{--\!\!>}_{p}^{3} tc'}{tc \mid \langle FileInputStream, Sid \rangle \mid S \xrightarrow{NEW\_IN\_FILESTR(fname)}{}_{p}^{3}\ tc' \mid \langle FileInputStream, Sid \cup \{instid\} \rangle \mid \langle instid, fname, true \rangle \mid S}$$
if $instid \notin Sid$, $Class(instid) = FileInputStream$.

Read: a byte is read from the external file str (which is not a socket) by means of the input stream stid. The stream must be open.
$$\frac{tc \stackrel{ReadS(stid,\, b)}{--\!\!>}_{p}^{3} tc'}{tc \mid \langle stid, str, true \rangle \mid S \xrightarrow{READ(stid,\, str,\, b)}{}_{p}^{3}\ tc' \mid \langle stid, str, true \rangle \mid S}$$
If a byte is read from a socket, the socket must be open.
$$\frac{tc \stackrel{ReadS(stid,\, b)}{--\!\!>}_{p}^{3} tc'}{tc \mid \langle stid, skid, true \rangle \mid skcomp \mid S \xrightarrow{READ(stid,\, skid,\, b)}{}_{p}^{3}\ tc' \mid \langle stid, skid, true \rangle \mid skcomp \mid S}$$
if $IdInfo(skcomp) = skid$, $Open(skcomp)$.

Write: a byte is written to the external resource str (which is not a socket) by means of the output stream stid.
$$\frac{tc \stackrel{WriteS(stid,\, b)}{--\!\!>}_{p}^{3} tc'}{tc \mid \langle stid, str, true \rangle \mid S \xrightarrow{WRITE(stid,\, str,\, b)}{}_{p}^{3}\ tc' \mid \langle stid, str, true \rangle \mid S}$$
If the byte is written to a socket, the socket must be open.
$$\frac{tc \stackrel{WriteS(stid,\, b)}{--\!\!>}_{p}^{3} tc'}{tc \mid \langle stid, skid, true \rangle \mid skcomp \mid S \xrightarrow{WRITE(stid,\, skid,\, b)}{}_{p}^{3}\ tc' \mid \langle stid, skid, true \rangle \mid skcomp \mid S}$$
if $IdInfo(skcomp) = skid$, $Open(skcomp)$.

NewServerSocket: a new server socket listening on a given port is created.
$$\frac{tc \stackrel{NewServerSocket(port,\, Sskid)}{--\!\!>}_{p}^{3} tc'}{tc \mid \langle ServerSocket, Sid \rangle \mid S \longrightarrow_{p}^{3} tc' \mid \langle ServerSocket, Sid \cup \{Sskid\} \rangle \mid \langle Sskid, true, port \rangle \mid S}$$
if $Class(Sskid) = ServerSocket$, $Sskid \notin Sid$.

NewSocket: a new socket object is created, connected to a socket at the other end of the link. Note that the input and output streams for the socket are created by the getInputStream and getOutputStream calls.
$$\frac{tc \stackrel{NewSocket(host,\, port,\, skid)}{--\!\!>}_{p}^{3} tc'}{tc \mid \langle Socket, Sid \rangle \mid S \xrightarrow{REQ\_CONN(port,\, host,\, port_{rem},\, port_{loc},\, skid)}{}_{p}^{3}\ tc' \mid \langle Socket, Sid \cup \{skid\} \rangle \mid skcomp \mid S}$$
if $skid \notin Sid$, $Class(skid) = Socket$, $skcomp = \langle skid, \_, \_, true, port_{loc}, port_{rem} \rangle$.

Accept: the returned socket is connected to a different port.
$$\frac{tc \stackrel{Accept(Sskid,\, skid)}{--\!\!>}_{p}^{3} tc'}{tc \mid \langle Socket, Sid \rangle \mid Sskcomp \mid S \xrightarrow{ACC\_CONN(port,\, host,\, port_{rem},\, port_{loc},\, skid)}{}_{p}^{3}\ tc' \mid \langle Socket, Sid \cup \{skid\} \rangle \mid Sskcomp \mid skcomp \mid S}$$
if $skid \notin Sid$, $Class(skid) = Socket$, $IdInfo(Sskcomp) = Sskid$, $Port(Sskcomp) = port$, $skcomp = \langle skid, \_, \_, true, port_{loc}, port_{rem} \rangle$.

Get Input Stream from Socket: a new input stream is created. The corresponding reference is put in the socket state.
$$\frac{tc \stackrel{GetInputStream(stid,\, skid)}{--\!\!>}_{p}^{3} tc'}{tc \mid skcomp \mid \langle InputStream, Sid \rangle \mid S \longrightarrow_{p}^{3} tc' \mid skcomp' \mid \langle InputStream, Sid \cup \{stid\} \rangle \mid \langle stid, skid, true \rangle \mid S}$$
if $stid \notin Sid$, $Class(stid) = InputStream$, $InputStr(skcomp') = stid$.

Get Output Stream from Socket: analogous.
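The way the New Stream and Read rules act on the program state can be sketched as operations on a small mutable table. This is only an illustration under stated assumptions: the class ProgramStateSketch and its method names are hypothetical, identifiers are plain strings, only the FileInputStream class state and the stream states are kept, and a rule that is not applicable is signalled by returning false or null.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ProgramStateSketch {
    // Sid of the class state <FileInputStream, Sid>.
    private final Set<String> fileInputStreamIds = new HashSet<>();
    // Stream states <stid, res, b>, keyed by stid.
    private final Map<String, String> resourceOf = new HashMap<>();
    private final Map<String, Boolean> openOf = new HashMap<>();

    // New Stream rule: applicable only if instid is not already in Sid.
    // On success the identifier is added to Sid and <instid, fname, true> is added.
    public boolean newFileInputStream(String instid, String fname) {
        if (fileInputStreamIds.contains(instid)) {
            return false; // side condition "instid not in Sid" fails
        }
        fileInputStreamIds.add(instid);
        resourceOf.put(instid, fname);
        openOf.put(instid, true);
        return true;
    }

    // Read rule: the program label READ(stid, str, b) is possible only when
    // the stream state is <stid, str, true>; the state itself is unchanged.
    public String readLabel(String stid, int b) {
        if (!Boolean.TRUE.equals(openOf.get(stid))) {
            return null; // unknown or closed stream: no transition
        }
        return "READ(" + stid + "," + resourceOf.get(stid) + "," + b + ")";
    }

    // Hypothetical helper: clears the status bit of a stream.
    public void close(String stid) {
        if (openOf.containsKey(stid)) {
            openOf.put(stid, false);
        }
    }
}
```

For example, after newFileInputStream("in1", "data.txt") a call to readLabel("in1", 7) yields the label READ(in1,data.txt,7), while after close("in1") no read label can be produced.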
A Java1 syntax
We use the special notation for grammars introduced in [4]. The definition of a nonterminal is introduced by the name of the nonterminal followed by a colon; the alternative right-hand sides for the nonterminal follow on succeeding lines. The subscripted suffix opt, which may appear after either a terminal or a nonterminal, indicates an optional symbol. For example:

BreakStatement:
    break Identifier_opt

is a convenient abbreviation for:

BreakStatement:
    break
    break Identifier
Below is the syntax of programs and environments.

Env:
    StandardEnv
    Env; Decl
Decl:
    ClassDeclaration
    VariableDeclaration
ClassDeclaration:
    ClassInhDec { ClassDeclarations_opt }
ClassInhDec:
    class ClassName ext ClassName
Type:
    ClassName
    bool
    char
    int
ClassDeclarations:
    ClassDeclarations ClassDecl
    ClassDecl
ClassDecl:
    ClassMemberDeclaration
ClassMemberDeclaration:
    FieldDeclaration
    MethodDeclaration
FieldDeclaration:
    FieldModifier_opt VariableDeclaration
VariableDeclaration:
    Type VariableDeclaratorId
VariableDeclaratorId:
    Identifier
FieldModifier:
    static
MethodDeclaration:
    MethodHeader
MethodHeader:
    MethodModifier_opt ResultType MethodDeclarator
ResultType:
    Type
    void
MethodDeclarator:
    Identifier(FormalParameterList_opt)
FormalParameterList:
    FormalParameterList, FormalParameter
FormalParameter:
    Type VariableDeclaratorId
MethodModifier:
    static, nonstatic

Programs

Programs:
    ClassBodies_opt
ClassBodies:
    ClassBody_opt
    ClassBodies ClassBody
ClassBody:
    ClassInhDec MethodBodies
MethodBodies:
    MethodBody_opt
    MethodBodies MethodBody
MethodBody:
    Identifier(ResultType TypeList) Block ;
TypeList:
    Type
    TypeList Type
Block:
    { BlockStatements }
BlockStatements:
    BlockStatement
    [VariableDeclarations] BlockStatements
BlockStatement:
    Statement
VariableDeclarations:
    VariableDeclaration_opt
    VariableDeclarations VariableDeclaration

Expressions (a subset of Primary Expressions for Java)

Primary:
    Literal
    this
    (Expression)
    ClassInstanceCreationExpression
    FieldAccess
    MethodInvocation
Literal:
    IntegerLiteral
    FloatingPointLiteral
    BooleanLiteral
    CharacterLiteral
    StringLiteral
    NullLiteral
ClassInstanceCreationExpression:
    new ClassType(ArgumentList_opt)
ArgumentList:
    AssignmentExpression
    ArgumentList, AssignmentExpression
AssignmentExpression:
    ConditionalExpression (footnote 4)
    Assignment
Assignment:
    LeftHandSide = AssignmentExpression
LeftHandSide:
    ExpressionName
    FieldAccess
    ArrayAccess
FieldAccess:
    Primary.Expression
    super.Identifier
MethodInvocation:
    MethodName(ArgumentList_opt)
    Primary.Identifier(ArgumentList_opt)
    super.Identifier(ArgumentList_opt)
Name:
    SimpleName
    QualifiedName
SimpleName:
    Identifier
QualifiedName:
    Name.Identifier

Statements

Statement:
    EmptyStatement
    ClassInstanceCreationExpression
    Assignment
    MethodInvocation
    IfThenElseStatement
    ReturnStatement
EmptyStatement:
    ;
IfThenElseStatement:
    if (Expression) Statement else Statement
ReturnStatement:
    return Expression_opt
Footnote 4: see Appendix for a complete description.

B Low level actions rules
In this appendix, we prove that the rules in [4] are correctly formalized in our semantics.

B.1 How to prove rules

In [4] there are essentially three kinds of rules:
- rules on actions performed by the thread execution engine (tee) on its own; it is proved that these rules are satisfied by the lts associated with threads;
- rules on actions performed by MmMng on its own; it is proved that these rules are satisfied by the lts associated with MmMng;
- rules on the execution order of actions performed by tee and MmMng; it is proved that these rules are satisfied by the lts associated with the complete program (see below).
For simplicity, below we refer to the rules for low-level actions in the semantics of threads, MmMng and program by their names.
B.2 The Signature
The following rules also involve the occurrence of actions like assign, use and read, which do not correspond to a label in the LTS of, respectively, the threads and the MmMng. We introduce predicates checking whether an assign, a use or a read has been performed in the previous state:
- [Assigned(gid, f, v, x)] is satisfied if the state is the final state of the transition performing the assignment of value v to field f in object gid; analogously for Used.
- [Read(gid, f, v, x)] is satisfied if the state is the final state of the transition performing the reading of value v of field f in object gid.
Moreover, it is necessary to define two predicates UnlockNum and LockNum for counting the previous occurrences of lock and unlock in a path on the LTS of the MmMng. [UnlockNum(x, gid, tid) = n] is valid on a path if there are exactly n occurrences of unlock actions performed by thread tid on lock gid along the path ending in the current state. Analogously, we can define the validity of LockNum.
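The counting predicates UnlockNum and LockNum amount to counting matching labels along a finite path. A minimal sketch, assuming that a path is given as a list of actions and that lock and thread identifiers are plain strings (the class LockCountSketch and its members are hypothetical names):

```java
import java.util.List;

public class LockCountSketch {
    public enum Kind { LOCK, UNLOCK, OTHER }

    // One action occurrence on the path: its kind, the lock gid and the thread tid.
    public static final class Action {
        public final Kind kind;
        public final String gid;
        public final String tid;

        public Action(Kind kind, String gid, String tid) {
            this.kind = kind;
            this.gid = gid;
            this.tid = tid;
        }
    }

    // Number of actions of the given kind performed by tid on gid along the path.
    private static int count(List<Action> path, Kind kind, String gid, String tid) {
        int n = 0;
        for (Action a : path) {
            if (a.kind == kind && a.gid.equals(gid) && a.tid.equals(tid)) {
                n++;
            }
        }
        return n;
    }

    // LockNum(x, gid, tid) for the state x reached at the end of the path.
    public static int lockNum(List<Action> path, String gid, String tid) {
        return count(path, Kind.LOCK, gid, tid);
    }

    // UnlockNum(x, gid, tid) for the state x reached at the end of the path.
    public static int unlockNum(List<Action> path, String gid, String tid) {
        return count(path, Kind.UNLOCK, gid, tid);
    }
}
```

The predicate [UnlockNum(x, gid, tid) = n] then corresponds to the test unlockNum(path, gid, tid) == n on the path ending in x.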
B.3 Rules to be proved
Except for the first one, the rules in [4] are re-formulated by means of temporal formulas that are proved to be true on each path of an lts (see Appendix C for a description of the branching-time temporal logic used in the following). Atomic temporal formulas express properties of a state of a transition or of a label. The occurrence of an action in a path is expressed as the presence of a transition with the corresponding label at some point of the path. In the following, by actions we always mean low-level actions as described above. Recall that, in our semantics, the value of a non-local variable is a reference to an object field. So we use (gid, f) for a given variable (ignoring the type, for simplicity). The first three rules are related to thread actions, so they are proved on paths of the lts modelling threads.
[V0]: Actions performed by a thread, as well as actions performed by MmMng, are totally ordered. That is, for any two actions performed by a thread (MmMng), one action precedes the other.
Proof: Trivial. QED

[V1]: An assign or use is permitted only when dictated by the execution of the Java code.
Proof: Trivial. QED

[V2]: a store action by T on V must intervene between an assign by T of V and a subsequent load by T on V:
$$([Assigned(gid, f, v, x)] \wedge \Diamond \langle Load(gid, f, v) \rangle) \Rightarrow \langle l \neq Load(gid, f, v) \rangle \,\mathcal{U}\, \langle Store(gid, f, v, tid) \rangle$$
Proof: From [TLoad]: $\langle Load(gid, f, v) \rangle \Rightarrow [wm(gid, f) = (v', S) \vee undef]$, while from [TAssigned]: $[Assigned(gid, f, v, x)] \Rightarrow [wm(gid, f) = (v', A)]$. From (P4), a store action must occur in between. QED
[V3]: an assign action by T on V must intervene between a load or store by T of V and a subsequent store by T on V:
$$((\langle Load(gid, f, v) \rangle \vee \langle Store(gid, f, v, tid) \rangle) \wedge \Diamond \langle Store(gid, f, v, tid) \rangle) \Rightarrow \langle l \neq Store(gid, f, v, tid) \rangle \,\mathcal{U}\, [Assigned(gid, f, v, x)]$$
Proof: [TStore]: a store action occurs when the value status is A and, after the store, the value status is S; the status must become A again before another store occurs but, by [P4], an assign is the only way to make the value status A again. Analogously, from [TLoad] a load can only occur if the status is not A and then, before a store, an assign must occur in between. QED

The following rules are quite different, since they express properties of the program that relate the occurrence of an action performed by a component (the tee, for example) or by the whole program to the occurrences of one or more actions performed by other components (i.e., the MmMng). Thus we need a different kind of temporal formula, able to relate the activity of two components. For example, we want to express that if a component performs a given transition, then another one will eventually perform (or has already performed) another activity. Using the temporal logic presented above, we have formulas like
$$\exists\, tc, tc', S, Mm, l.\ [x = tc \mid S \mid Mm] \wedge [x = tc' \mid S \mid Mm \wedge tc \xrightarrow{\,l\,} tc']$$
to express the fact that within a certain program transition there is a component tc that performs a transition labelled by l. From now on, such a formula will be abbreviated by the short notation $\langle tid : l \rangle$, where tid is the identifier of the component tc ($IdInfo(tc) = tid$). Analogous abbreviations can be defined for the other components. Moreover, in the following, we use the notation $\varphi_1 \xrightarrow{[Pn]} \varphi_2$ for "$\varphi_2$ follows from $\varphi_1$ by applying property Pn".
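The status discipline used in the proofs of [V2] and [V3] (rules [TStore] and [TLoad], property P4) can be pictured as a small state machine over the status of one working copy. The sketch below is an illustration under stated assumptions: the class WorkingCopySketch is hypothetical, the value itself is ignored, a load is allowed whenever the status is not A (the wording used in the proof of [V3]), and an action that is not enabled simply returns false.

```java
public class WorkingCopySketch {
    // Status of one variable in a thread's working memory.
    public enum Status { UNDEF, A, S, L }

    private Status status = Status.UNDEF; // a new thread starts with undef

    public Status status() {
        return status;
    }

    // assign: always enabled; by P4 it is the only way to reach status A.
    public void assign() {
        status = Status.A;
    }

    // store ([TStore]): enabled only when the status is A; afterwards it is S.
    public boolean store() {
        if (status != Status.A) {
            return false;
        }
        status = Status.S;
        return true;
    }

    // load ([TLoad]): enabled only when the status is not A; afterwards it is L.
    public boolean load() {
        if (status == Status.A) {
            return false;
        }
        status = Status.L;
        return true;
    }
}
```

With this reading, a load right after an assign is refused until a store has occurred ([V2]), and two stores in a row are refused unless an assign occurs in between ([V3]).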
Execution Order
The following constraints require that some occurrences of actions are uniquely paired with other ones. In order to model this pairing, we can add an index to labels and to elements of MmMng; this allows us to identify each action occurrence uniquely. Such an index will be omitted when unnecessary, as we did in our semantic rules.
[E1]: each lock or unlock action is performed jointly by some thread and the MmMng.
$$\langle tc : Lock(gid, tid) \rangle \Leftrightarrow \langle Mm : Locked(gid, tid) \rangle$$
$$\langle tc : Unlock(gid, tid) \rangle \Leftrightarrow \langle Mm : Unlocked(gid, tid) \rangle$$
Proof: Trivially from rules [PLock] and [PUnlock]. QED

[E2]: each load action by a thread is uniquely paired with a read action by the MmMng such that the load action follows the read action:
$$\langle tc : Load(gid, f, v)^{n} \rangle \Rightarrow \Diamond_p [Mm : Read(gid, f, v, tid, x)^{n}]$$
Proof: A load action can be performed by a thread iff the MmMng has the corresponding read stored.
$$\langle tc : Load(gid, f, v)^{n} \rangle \xrightarrow{[PLoad]} \langle Mm : Loaded(gid, f, v)^{n} \rangle \xrightarrow{[P5]} \Diamond_p [Mm : Read(gid, f, v, tid, x)^{n}]$$
QED
[E3]: each store action by a thread is uniquely paired with a write action by the main memory such that the write action follows the store action:
$$\langle Mm : Write(gid, f, v)^{n} \rangle \Rightarrow \Diamond_p \langle tc : Store(gid, f, v)^{n} \rangle$$
Proof: A write action can be performed by the MmMng only if it has the corresponding store stored.
$$\langle Mm : Write(gid, f, v, tid)^{n} \rangle \xrightarrow{[MmWrite]} [Mm : In(S(gid, f, v)^{n}, x)] \xrightarrow{[P8]} \Diamond_p \langle Mm : Stored(gid, f, v, tid)^{n} \rangle \xrightarrow{[PStore]} \Diamond_p \langle tc : Store(gid, f, v)^{n} \rangle$$
QED
[V4]: After a thread is created, it must perform an assign (or load) action on a variable before performing a (use or) store action on that variable.
$$(\langle New(T, tid) \rangle \wedge \Diamond \langle tid : Store(gid, f, v, tid) \rangle) \Rightarrow \langle tid : l \neq Store(gid, f, v, tid) \rangle \,\mathcal{U}\, [tid : Assigned(gid, f, v, x)]$$
Analogous for use.
Proof: [PNew]: a newly created thread has an empty wm (the value and the status are undef for each variable); [TStore]: a store occurs if the variable status is A; [P4]: an assign must occur between the creation and the store, to make the variable status A. QED
[V5]: A new variable is created only in MmMng and is not initially in any thread's working memory.
$$\langle New(T, tid) \rangle \Rightarrow [tid : \forall gid, f.\ WorkingMemory(x)(gid, f) = undef]$$
Proof: Obvious by [PNew]. QED

[V6]: For every load action performed by a thread T on its working copy of a variable V, there must be a corresponding preceding read action by MmMng on the master copy of V, and the load action must put into wm the data transmitted by the corresponding read action.
$$\langle tc : Load(gid, f, v) \rangle \Rightarrow \Diamond_p [Mm : Read(gid, f, v, tid, x)]$$
Proof: See the proof of [E2]. QED
[V7]: For every store action performed by a thread T on its working copy of a variable V, there must be a corresponding following write action by the main memory on the master copy of V, and the write action must put into the master copy the data transmitted by the corresponding store.
$$\langle Store(gid, f, v, tid) \rangle \Rightarrow \Diamond \langle Mm : Write(gid, f, v, tid) \rangle$$
Proof: how to prove this rule is still open.

[V8]: Let action A be a load or a store by thread T on variable V, and let action P be the corresponding read or write by the main memory. Similarly, let action B be some other load or store by thread T on the same variable V and let action Q be the corresponding read or write by main memory on V. If A precedes B, then P must precede Q. For clarity, this property is expressed by more than one formula:
1) $(\langle tc : Load(gid, f, v_1)^{n_1} \rangle \wedge \Diamond \langle tc : Store(gid, f, v_2, tid)^{n_2} \rangle) \Rightarrow \Diamond_p ([Mm : Read(gid, f, v_1, x)] \wedge \Diamond \langle Mm : Write(gid, f, v_2, tid) \rangle)$
2) $(\langle tc : Store(gid, f, v_1, tid)^{n_1} \rangle \wedge \Diamond \langle tc : l = Load(gid, f, v_2, tid)^{n_2} \rangle) \Rightarrow \Diamond (\langle Mm : Write(gid, f, v, tid) \rangle \wedge \Diamond [Mm : Read(gid, f, v, x)])$
and so on.

Proof: We start by proving 1). If the premise is satisfied, we can assume:
[A] $\langle tc : Load(gid, f, v_1, tid)^{n_1} \rangle$ and [B] $\langle tc : Stored(gid, f, v_2)^{n_2} \rangle$.
Then, from [A],
$$\langle tc : Load(gid, f, v_1, tid)^{n_1} \rangle \xrightarrow{[PLoad]} \langle Mm : Loaded(gid, f, v, tid)^{n_1} \rangle \xrightarrow{[P5]} \Diamond_p [Mm : Read(gid, f, v, x)]$$
while from [B] the occurrence of the corresponding write action must follow the store action,
$$\langle tc : Stored(gid, f, v_2, tid)^{n_2} \rangle \xrightarrow{[E3]} \Diamond \langle Mm : Write(gid, f, v, tid) \rangle$$
and so
$$A \wedge B \Rightarrow \Diamond_p ([Mm : Read(gid, f, v, x)] \wedge \Diamond \langle Mm : Write(gid, f, v, tid) \rangle)$$
Then we prove 2). If the premise is satisfied, we can assume:
[A] $\langle tc : Stored(gid, f, v_1)^{n_1} \rangle$ and [B] $\langle tc : Load(gid, f, v_2)^{n_2} \rangle$;
we want to prove that
$$\langle Mm : Write(gid, f, v_1, tid)^{n_1} \rangle \Rightarrow \neg \Diamond_p [Read(gid, f, v_2, x)^{n_2}]$$
but
$$\langle Mm : Write(gid, f, v_1, tid)^{n_1} \rangle \xrightarrow{[MmWrite;\, PLoad]} [Mm : In(S(gid, f, v_1, tid)^{n_1}, x)] \,\mathcal{S}\, \langle Store(gid, f, v_1, tid)^{n_1} \rangle \ \text{and}\ [Mm : In(S(gid, f, v_1)^{n_1}, x)]$$
but
$$[Mm : In(S(gid, f, v_1)^{n_1}, x)] \xrightarrow{[MmRead]} [\nexists v_2.\ Read(gid, f, v_2, x)^{n_2}]$$
QED
[L1]: a lock action by T on L may occur only if, for every thread S other than T, the number of preceding unlock actions by S on L equals the number of preceding lock actions by S on L.
$$\langle tc : Lock(gid, tid) \rangle \Rightarrow [Mm : UnlockNum(x, gid, tid') = LockNum(x, gid, tid')]$$
Proof:
$$\langle tc : Lock(gid, tid) \rangle \xrightarrow{[PLock]} \langle Mm : Locked(gid, tid) \rangle \xrightarrow{[MmLock]} [Mm : \neg In(L(gid, tid'), x) \wedge tid' \neq tid]$$
It is trivial to prove that

Lemma B.1 $[Mm : \neg In(L(gid, tid'), x)] \Leftrightarrow [UnlockNum(x, gid, tid') = LockNum(x, gid, tid')]$.

We use $\varphi$ to denote $[Mm : \neg In(L(gid, tid'), x)] \Leftrightarrow [UnlockNum(x, gid, tid') = LockNum(x, gid, tid')]$, A for $[Mm : \neg In(L(gid, tid'), x)]$ and B for $[UnlockNum(x, gid, tid') = LockNum(x, gid, tid')]$. The proof is by induction on the length of paths. For a path with just the initial state, it is obvious, since the MmMng component is empty. If $\varphi$ is true on a path of length n, we prove that every transition leads to another state in which $\varphi$ is true, by cases on the transition label:
- a transition labeled by $Unlock(gid, tid)$ is not possible ([MmUnlock]);
- $\varphi \wedge \langle Mm : l \neq Lock(gid, tid) \rangle \Rightarrow \varphi$: obvious;
- $\varphi \wedge \langle Mm : Lock(gid, tid) \rangle \Rightarrow \varphi$: $\langle Mm : Lock(gid, tid) \rangle \xrightarrow{[MmLock]} [Mm : \neg In(L(gid, tid'), x)] \wedge [Mm : In(L(gid, tid'), x)]$ and then $[Mm : UnlockNum(x, gid, tid') = LockNum(x, gid, tid') - 1]$.
QED
[L2]: an unlock action by thread T on lock L may occur only if the number of preceding unlock actions by T on L is strictly less than the number of preceding lock actions by T on L.
$$\langle tc : Unlock(gid, tid) \rangle \Rightarrow [Mm : UnlockNum(x, gid, tid) < LockNum(x, gid, tid)]$$
Proof:
$$\langle tc : Unlock(gid, tid) \rangle \xrightarrow{[PUnlock]} \langle Mm : Unlocked(gid, tid) \rangle \xrightarrow{[MmUnlocked]} [Mm : In(L(gid, tid), x)]$$
From Lemma B.1, $[Mm : In(L(gid, tid), x)]$ implies $[tc : UnlockNum(x, gid, tid) \neq LockNum(x, gid, tid)]$. Now we want to prove that (Lemma A2)
$$[Mm : UnlockNum(x, gid, tid') \leq LockNum(x, gid, tid')]$$
Its proof is similar to that of Lemma B.1. Then
$$[Mm : In(L(gid, tid), x)] \xrightarrow{[LemA1;\, LemA2]} [tc : UnlockNum(x, gid, tid) < LockNum(x, gid, tid)]$$
QED
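Rules [L1] and [L2] can be checked mechanically on a finite trace of lock and unlock actions. The sketch below is only an illustration under stated assumptions: the class LockRulesSketch is hypothetical, the trace concerns a single lock L, threads are named by strings, and for each thread the difference LockNum - UnlockNum is kept instead of the two counters.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LockRulesSketch {
    // One step of the trace: a lock or an unlock of L by thread tid.
    public static final class Step {
        public final boolean isLock;
        public final String tid;

        Step(boolean isLock, String tid) {
            this.isLock = isLock;
            this.tid = tid;
        }
    }

    public static Step lock(String tid) { return new Step(true, tid); }

    public static Step unlock(String tid) { return new Step(false, tid); }

    // Returns the index of the first step violating [L1] or [L2], or -1 if none does.
    public static int firstViolation(List<Step> trace) {
        // For each thread: LockNum - UnlockNum over the steps seen so far.
        Map<String, Integer> held = new HashMap<>();
        for (int i = 0; i < trace.size(); i++) {
            Step s = trace.get(i);
            if (s.isLock) {
                // [L1]: every other thread must have UnlockNum = LockNum.
                for (Map.Entry<String, Integer> e : held.entrySet()) {
                    if (!e.getKey().equals(s.tid) && e.getValue() != 0) {
                        return i;
                    }
                }
                held.merge(s.tid, 1, Integer::sum);
            } else {
                // [L2]: the unlocking thread must have UnlockNum < LockNum.
                if (held.getOrDefault(s.tid, 0) <= 0) {
                    return i;
                }
                held.merge(s.tid, -1, Integer::sum);
            }
        }
        return -1;
    }
}
```

A thread may lock L several times in a row, since [L1] only constrains the other threads; a second thread can lock L only after the first one has performed as many unlocks as locks.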
[LV1]: between an assign action by T on V and a subsequent unlock action by T on L, a store action by T on V must intervene; moreover, the write action corresponding to that store must precede the unlock action.
$$([tc : Assigned(gid, f, v, x)] \wedge \Diamond \langle tc : Unlock(gid', tid) \rangle) \Rightarrow \langle tc : l \neq Unlock(gid', tid) \rangle \,\mathcal{U}\, (\langle tc : Write(gid, f, v, tid) \rangle \wedge \langle tc : l \neq Write(gid, f, v, tid) \rangle \,\mathcal{U}\, \langle tid : Store(gid, f, v, tid) \rangle)$$
Proof:
$$[tc : Assigned(gid, f, v, x)] \xrightarrow{[TAssign]} [tc : wm(gid, f) = (v, A)]$$
while
$$\langle tc : Unlock(gid', f, v) \rangle \xrightarrow{[TUnlock]} [tc : wm(gid, f) \neq (v, A)]$$
but, by (P4), an assigned value changes its state only after a store action:
$$[tc : Assigned(gid, f, v, x)] \wedge \Diamond \langle tc : Unlock(gid', tid) \rangle \xrightarrow{[P4]} \langle tc : l \neq Unlock(gid', tid) \rangle \,\mathcal{U}\, \langle tc : Store(gid, f, v, tid) \rangle$$
Then an unlock cannot occur if a store action has not yet been written into MmMng:
$$\langle tc : Unlock(gid', tid) \rangle \xrightarrow{[PUnlock]} \langle Mm : Unlocked(gid', tid) \rangle \xrightarrow{[MmUnlock]} [Mm : \neg In(S(gid, f, v, tid), x)] \xrightarrow{[P1]} \langle tc : l \neq Unlock(gid, tid) \rangle \,\mathcal{U}\, \langle Mm : Write(gid, f, v, tid) \rangle$$
but, from [E3], $\langle Mm : Write(gid, f, v, tid) \rangle \Rightarrow \Diamond_p \langle tc : Store(gid, f, v, tid) \rangle$, and then the rule follows. QED

[LV2]: between a lock action by T on L and a subsequent use or store action by T on a variable V, an assign or load action on V must intervene; moreover, if it is a load action, then the read action corresponding to that load must follow the lock action.
$$(\langle tc : Lock(gid, tid) \rangle \wedge \Diamond [tc : Used(gid, f, v, x)]) \Rightarrow [tc : \neg Used(gid, f, v, x)] \,\mathcal{U}\, (\langle tc : SLoad(gid, f, v) \rangle \vee [tc : Assigned(gid, f, v, x)]) \wedge \langle tc : l \neq SLoad(gid, f, v) \rangle \wedge [tc : \neg Assigned(gid, f, v, x)] \,\mathcal{U}\, [Mm : Read(gid, f, v, tid, x)]$$
Proof:
$$\langle tc : Lock(gid, tid) \rangle \xrightarrow{[TLock]} [tc : wm(gid, f) = undef]$$
while
$$\langle tc : Use(gid, f) \rangle \xrightarrow{[TUse]} [tc : \exists v, ss.\ wm(gid, f) = (v, ss)]$$
Then, by (P4): $[tc : \neg Used(gid, f, v, x)] \,\mathcal{U}\, (\langle tc : SLoad(gid, f, v) \rangle \vee [tc : Assigned(gid, f, v, x)])$. Moreover
$$\langle tc : Lock(gid, tid) \rangle \wedge \Diamond [tc : Used(gid, f, v, x)] \xrightarrow{[PSLoad;\, MmSLoaded]} \langle tc : l \neq SLoad(gid, f, v) \rangle \,\mathcal{U}\, [Mm : Read(gid, f, v, tid, x)]$$
Then the rule follows. QED
B.4 Auxiliary properties
Below, we prove some properties that we used in the proofs above.
P1: a store action in the MmMng can be removed only by a write action.
P2: the MmMng memory can be modified only by a write action.
P3: a lock action in the MmMng can be removed only by an unlock action performed by MmMng.
P4: a variable status changes from X (X $\neq$ S) to S only after a store action, from X (X $\neq$ L) to L only after a load action, and from X to A only after an assign action.
P5: a value can be loaded from MmMng only if MmMng has previously performed the corresponding read action on that value.
$$\langle Mm : Loaded(gid, f, v, tid)^{n} \rangle \Rightarrow \Diamond_p [Mm : Read(gid, f, v, tid, x)^{n}]$$
P6: if there is a lock action in the MmMng, then that lock action was performed and no corresponding unlock has happened since.
$$[Mm : In(L(gid, tid), x)] \Rightarrow (\neg \langle tc : Unlock(gid, tid) \rangle) \,\mathcal{S}\, \langle tc : Lock(gid, tid) \rangle$$
P7: if there is a read action in the MmMng, then the MmMng performed that read action before.
$$[Mm : In(R(gid, f, v, tid)^{n}, x)] \Rightarrow \Diamond_p [Mm : Read(gid, f, v, tid, x)^{n}]$$
P8: if there is a store action in the MmMng, then the MmMng performed an action labeled by Stored before.
$$[Mm : In(S(gid, f, v, tid)^{n}, x)] \Rightarrow \Diamond_p \langle Mm : Stored(gid, f, v, tid)^{n} \rangle$$
C A branching-time temporal logic .1 Introduction
In this work, we need a tool for speci ng properties of the models that are associated with Java programs by our semantics; such models can be represented as trees, whose paths describe program executions. It is obvious that in our case a temporal logic must be preferred to modal or multi-modal logics as it is interpreted over paths (in transition systems) and consequently allows to express properties over the executions of a program. Linear temporal logics refer to single paths (thus a formula is satis ed by a set of paths i it is satis ed by each path in the set), while branching-time temporal logics refer to sets of paths thus taking the branching structure of the behaviour into account. The temporal logic we chose is the one in [3], where the logic state-formulae are the basic building blocks; they are ordinary 1st order formulae and describe the properties of the model under investigation at a given instant in time. Temporal formulae are obtained from state formulae by using temporal combinators (such as \henceforth", \eventually", \at the next instant", : : :), together with classical propositional connectives and quanti ers. It is a branching-time logic, instead of the simpler linear one, because it allows to express, in a natural way, properties about the choices available at a given moment of time. Actually, the logic presented here is more simpler and more natural than the one in [3] using futuretime operators only, since we also included the usual operators referring to the past (such as "since", "last-time", "sometime", : : :). In [3] the logic is a tool for reasoning over dynamic elements, i.e., entities that can evolve in time, such as, for example, processes or concurrent/reactive systems. Dynamic elements, in turn, can contain other dynamic entities. When reasoning over such systems, the logic must be able to express properties over dierent dynamic sorts and, consequently dierent transition predicates. 
Moreover, temporal formulae are given depending on the signature, that is, on the sorts and on the operation and predicate symbols. In this work, however, we use the temporal logic in a very particular context. We need only one dynamic sort, namely the sort of program states, with which are automatically associated the sort of the labels of the program and the transition predicate that gives the semantics of the program. Furthermore, we simply need a tool for reasoning about the validity of some properties, expressed by a fixed set of predicates, over the computations of a given program.
In the following, we give the definition of dynamic structure, the set of temporal formulae depending on a signature, and the notion of validity of these formulae on a structure, in the restricted case of a signature with only one dynamic sort (omitted for simplicity). Then we present a modified set of formulae that are used to highlight the role of a component in a transition of the whole program. This requires a generalized definition of labelled transition system, where transitions also carry their own derivation proof. Finally, we give the particular signature used in Sect. B to express the properties over low-level actions.
.2 Dynamic Structures
First, we briefly report the main definitions about first-order structures. A many-sorted signature is a triple $\Sigma = (S, OP, PR)$ where:
- $S$ is the set of sorts;
- $OP$ is a family of sets $\{OP_{w,s}\}_{w \in S^*, s \in S}$ of operation symbols;
- $PR$ is a family of sets $\{PR_w\}_{w \in S^+}$ of predicate symbols.
A first-order structure is a triple $A = (\{A_s\}_{s \in S}, \{Op^A\}_{Op \in OP}, \{Pr^A\}_{Pr \in PR})$, consisting of the carriers, the interpretation of the operation symbols and the interpretation of the predicate symbols. An lts can be represented by a first-order structure on a signature with two special sorts, states and labels, whose elements are the states and labels of the lts, and a predicate $\to^A : states \times labels \times states$ representing the transition relation. The first-order structures corresponding to ltss are called dynamic structures and are formally defined as follows.
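The representation of an lts as a structure with two carriers and a ternary transition predicate can be sketched in a few lines. This is only an illustration under simple assumptions: carriers are finite sets, and the names `LTS`, `successors` and the toy states and labels are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LTS:
    """A finite lts: two carriers and a transition predicate given as a set of triples."""
    states: frozenset
    labels: frozenset
    trans: frozenset  # set of (state, label, state) triples

    def __post_init__(self):
        # The predicate must be typed states x labels x states.
        for (d, l, d2) in self.trans:
            if d not in self.states or d2 not in self.states or l not in self.labels:
                raise ValueError(f"ill-typed transition: {(d, l, d2)}")

    def successors(self, d):
        """All (label, target) pairs reachable from state d in one step."""
        return {(l, d2) for (d1, l, d2) in self.trans if d1 == d}


# Hypothetical toy system: a lock followed by an unlock.
toy = LTS(
    states=frozenset({"s0", "s1", "s2"}),
    labels=frozenset({"lock", "unlock"}),
    trans=frozenset({("s0", "lock", "s1"), ("s1", "unlock", "s2")}),
)
```

Here `toy.successors("s0")` contains the single pair `("lock", "s1")`, while `"s2"` has no successors.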
Def. .1 A dynamic signature $D\Sigma$ is a pair $(\Sigma, DS)$, where:
- $\Sigma = (STATE, OP, PR)$ is a signature;
- $DS \subseteq STATE$ is the set of the dynamic sorts, i.e. sorts corresponding to dynamic systems (states of ltss);
- for all $ds \in DS$ there exist a sort $lab(ds) \in STATE - DS$ (the sort of the labels) and a predicate $\_ \xrightarrow{\_} \_ : ds \times lab(ds) \times ds \in PR$ (the transition predicate).
A dynamic structure on $D\Sigma$ (shortly a $D\Sigma$-dynamic structure) is just a $\Sigma$-first-order structure; the term structure $T_{D\Sigma}(X)$ is just $T_\Sigma(X)$, where $X$ is a sort assignment on $D\Sigma$. Recall that we consider a simplified version of the logic, where a dynamic signature $D\Sigma$ is a pair $(\Sigma, DS)$ with $DS = \{state\}$. So, in the following we simply use $\Sigma$ instead of $D\Sigma$.
.3 The Logic
The formulae of the logic are given depending on the signature. In our case, we will fix a particular signature (see ??) before using the logic.
$PATH(A)$ denotes the set of the paths for the dynamic systems, i.e. the set of all sequences of transitions having one of the forms (1), ..., (4) below:
(1) $\dots\ d_{-2}\ l_{-2}\ d_{-1}\ l_{-1}\ d_0\ l_0\ d_1\ l_1\ d_2\ l_2\ \dots$
(2) $d_0\ l_0\ d_1\ l_1\ d_2\ l_2\ \dots$
(3) $\dots\ d_{-2}\ l_{-2}\ d_{-1}\ l_{-1}\ d_0$
(4) $d_0\ l_0\ d_1\ l_1\ d_2\ l_2\ \dots\ d_n$, with $n \geq 0$
where for all integers $i$, $d_i \in A_{state}$, $l_i \in A_{label}$ and $(d_i, l_i, d_{i+1}) \in \to^A$. Notice that both a single state $d$ and a single transition $d\ l\ d'$ may be a path.
If $\sigma$ has form (3) or (4) it is said to be right-bounded, while if it has form (2) or (4) it is said to be left-bounded. If $\sigma$ is right-bounded, then $LastS(\sigma)$ denotes the last state of $\sigma$; analogously, if $\sigma$ is left-bounded, $FirstS(\sigma)$ denotes the first state of $\sigma$, and $FirstL(\sigma)$ denotes the first label of $\sigma$, if it exists, i.e. if $\sigma$ is not just a state.
$\sigma \in PATH(A)$ is right-maximal (left-maximal) iff either $\sigma$ is not right-bounded (left-bounded) or there do not exist $l$, $d'$ s.t. $(LastS(\sigma), l, d') \in \to^A$ ($(d', l, FirstS(\sigma)) \in \to^A$).
A composition operation is defined on paths:
$$\sigma \cdot \sigma' =_{def} \begin{cases} \dots\ d_{n-1}\ l_{n-1}\ d_n\ l_n\ d_{n+1}\ l_{n+1}\ \dots & \text{if } \sigma = \dots\ d_{n-1}\ l_{n-1}\ d_n \text{ and } \sigma' = d_n\ l_n\ d_{n+1}\ l_{n+1}\ \dots \\ \text{undefined} & \text{otherwise} \end{cases}$$
A pointed path is a pair $\langle \sigma_p, \sigma_f \rangle$ s.t. $\sigma_p$ is left-maximal and right-bounded, $\sigma_f$ is right-maximal and left-bounded and $LastS(\sigma_p) = FirstS(\sigma_f)$; it represents a complete behaviour for the dynamic system in the state $LastS(\sigma_p)$, which coincides with $FirstS(\sigma_f)$, where $\sigma_p$ is the past and $\sigma_f$ the future.
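The path operations above can be tried out on finite paths. The following is a minimal sketch, assuming paths of form (4) only, encoded as tuples `(d0, l0, d1, ..., dn)`; the function names and the toy transition relation are hypothetical, and infinite or left-unbounded paths are out of scope.

```python
def is_path(p, trans):
    """p alternates states and labels, starts and ends with a state, and every step is in trans."""
    if len(p) % 2 == 0:
        return False
    return all((p[i], p[i + 1], p[i + 2]) in trans for i in range(0, len(p) - 1, 2))


def first_s(p):
    return p[0]


def last_s(p):
    return p[-1]


def first_l(p):
    """First label, or None when the path is just a state."""
    return p[1] if len(p) > 1 else None


def compose(p, q):
    """Path composition; returns None where the definition says 'undefined'."""
    if last_s(p) != first_s(q):
        return None
    return p + q[1:]


def right_maximal(p, trans):
    return not any(d == last_s(p) for (d, _, _) in trans)


def left_maximal(p, trans):
    return not any(d2 == first_s(p) for (_, _, d2) in trans)


def is_pointed(past, future, trans):
    """Check the three conditions on a pointed path <past, future>."""
    return (is_path(past, trans) and is_path(future, trans)
            and left_maximal(past, trans) and right_maximal(future, trans)
            and last_s(past) == first_s(future))


# Hypothetical toy transition relation.
TRANS = {("s0", "lock", "s1"), ("s1", "unlock", "s2")}
PAST = ("s0", "lock", "s1")
FUTURE = ("s1", "unlock", "s2")
```

With these definitions `is_pointed(PAST, FUTURE, TRANS)` holds, `compose(PAST, FUTURE)` is the full run from `s0` to `s2`, and `compose(FUTURE, PAST)` is undefined (`None`).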
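Def. .2 The set $F_\Sigma(X)$ of dynamic formulae and the auxiliary set $P_\Sigma(X)$ of path formulae (on $\Sigma$ and $X$) are inductively defined as follows (where $t_1, \dots, t_n$ denote terms of appropriate sort and we assume that sorts are respected):
dynamic formulae
- $Pr(t_1, \dots, t_n) \in F_\Sigma(X)$ if $Pr \in PR$
- $t_1 = t_2 \in F_\Sigma(X)$
- $\neg\phi_1,\ \phi_1 \supset \phi_2 \in F_\Sigma(X)$ if $\phi_1, \phi_2 \in F_\Sigma(X)$
- $\forall x . \phi \in F_\Sigma(X)$ if $\phi \in F_\Sigma(X)$, $x \in X$
- $\triangle(t, \pi) \in F_\Sigma(X)$ if $t \in T_{D\Sigma}(X)_{state}$, $\pi \in P_\Sigma(X)$
path formulae
- $[\lambda x . \phi] \in P_\Sigma(X)$ if $x \in X_{state}$, $\phi \in F_\Sigma(X)$
- $\langle \lambda x . \phi \rangle \in P_\Sigma(X)$ if $x \in X_{label}$, $\phi \in F_\Sigma(X)$
- $\neg\pi_1,\ \pi_1 \supset \pi_2 \in P_\Sigma(X)$ if $\pi_1, \pi_2 \in P_\Sigma(X)$
- $\forall x . \pi \in P_\Sigma(X)$ if $\pi \in P_\Sigma(X)$, $x \in X$
- $\pi_1\ U\ \pi_2 \in P_\Sigma(X)$, $\pi_1\ S\ \pi_2 \in P_\Sigma(X)$ if $\pi_1, \pi_2 \in P_\Sigma(X)$
- $\bigcirc \pi \in P_\Sigma(X)$ if $\pi \in P_\Sigma(X)$.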
The symbols $[\ ]$ and $\langle\ \rangle$ that appear in path formulae are just brackets and do not represent modalities. The formulae of our logic include the usual ones of many-sorted first-order logic with equality; they also include formulae built with the transition predicate. Notice that path formulae are just an ingredient, though an important one, for building the temporal formulae. The formula $\triangle(t, \pi)$ can be read as "for every path $\langle \sigma_p, \sigma_f \rangle$ pointed in the state denoted by $t$, the path formula $\pi$ holds on $\langle \sigma_p, \sigma_f \rangle$". We anchor these formulae to states, following the ideas in [?]. The difference is that we do not model a single program but, in general, a type of programs, so there is not a single initial state but several of them, hence the need for an explicit reference to states (through terms) in the formulae built with $\triangle$. The formula $[\lambda x . \phi]$ holds on the pointed path $\langle \sigma_p, \sigma_f \rangle$ whenever $\phi$ holds at the first state of $\sigma_f$, which is also the last state of $\sigma_p$; while the formula $\langle \lambda x . \phi \rangle$ holds on the pointed path $\langle \sigma_p, \sigma_f \rangle$ if $\sigma_f$ is not just a single state and $\phi$ holds at the first label of $\sigma_f$. Finally, $\bigcirc$, $U$ and $S$ are the so-called next, (future) until and (past) since combinators.
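Def. .3 (Semantics of formulae) Let $A$ be a $\Sigma$-dynamic structure and $V$ an evaluation of the variables in $X$ in $A$ (that is, a family of functions $\{V_s : X_s \to A_s\}_{s \in S}$); then we define by multiple induction:
- the validity of a path formula $\pi$ on a pointed path $\langle \sigma_p, \sigma_f \rangle$ in $A$ w.r.t. $V$ (written $A, V, \langle \sigma_p, \sigma_f \rangle \models \pi$),
- the validity of a formula $\phi$ in $A$ w.r.t. $V$ (written $A, V \models \phi$),
as follows ($t^{A,V}$ denotes the interpretation of term $t$ w.r.t. $A$ and $V$):
path formulae
- $A, V, \langle \sigma_p, \sigma_f \rangle \models [\lambda x . \phi]$ iff $A, V[FirstS(\sigma_f)/x] \models \phi$
- $A, V, \langle \sigma_p, \sigma_f \rangle \models \langle \lambda x . \phi \rangle$ iff $FirstL(\sigma_f)$ is defined and $A, V[FirstL(\sigma_f)/x] \models \phi$
- $A, V, \langle \sigma_p, \sigma_f \rangle \models \pi_1\ U\ \pi_2$ iff there exist $\sigma_1$, $\sigma_2$ s.t. $\sigma_f = \sigma_1 \cdot \sigma_2$, $A, V, \langle \sigma_p \cdot \sigma_1, \sigma_2 \rangle \models \pi_2$ and for each $\sigma_1'$, $\sigma_1''$ s.t. $\sigma_1 = \sigma_1' \cdot \sigma_1''$ and $\sigma_1' \neq \sigma_1$, $A, V, \langle \sigma_p \cdot \sigma_1', \sigma_1'' \cdot \sigma_2 \rangle \models \pi_1$
- $A, V, \langle \sigma_p, \sigma_f \rangle \models \pi_1\ S\ \pi_2$ iff there exist $\sigma_1$, $\sigma_2$ s.t. $\sigma_p = \sigma_1 \cdot \sigma_2$, $A, V, \langle \sigma_1, \sigma_2 \cdot \sigma_f \rangle \models \pi_2$ and for each $\sigma_2'$, $\sigma_2''$ s.t. $\sigma_2 = \sigma_2' \cdot \sigma_2''$ and $\sigma_2'' \neq \sigma_2$, $A, V, \langle \sigma_1 \cdot \sigma_2', \sigma_2'' \cdot \sigma_f \rangle \models \pi_1$
- $A, V, \langle \sigma_p, \sigma_f \rangle \models \bigcirc \pi$ iff $\sigma_f = s\ l\ \sigma'$ and $A, V, \langle \sigma_p \cdot (s\ l\ FirstS(\sigma')), \sigma' \rangle \models \pi$
- $A, V, \langle \sigma_p, \sigma_f \rangle \models \neg\pi$ iff $A, V, \langle \sigma_p, \sigma_f \rangle \not\models \pi$
- $A, V, \langle \sigma_p, \sigma_f \rangle \models \pi_1 \supset \pi_2$ iff either $A, V, \langle \sigma_p, \sigma_f \rangle \not\models \pi_1$ or $A, V, \langle \sigma_p, \sigma_f \rangle \models \pi_2$
- $A, V, \langle \sigma_p, \sigma_f \rangle \models \forall x . \pi_1$ iff for each $v \in A_s$, with $s$ the sort of $x$, $A, V[v/x], \langle \sigma_p, \sigma_f \rangle \models \pi_1$
formulae
- $A, V \models Pr(t_1, \dots, t_n)$ iff $(t_1^{A,V}, \dots, t_n^{A,V}) \in Pr^A$
- $A, V \models t_1 = t_2$ iff $t_1^{A,V} = t_2^{A,V}$
- $A, V \models \neg\phi$ iff $A, V \not\models \phi$
- $A, V \models \phi_1 \supset \phi_2$ iff either $A, V \not\models \phi_1$ or $A, V \models \phi_2$
- $A, V \models \forall x . \phi$ iff for each $v \in A_s$, with $s$ the sort of $x$, $A, V[v/x] \models \phi$
- $A, V \models \triangle(t, \pi)$ iff for each $\langle \sigma_p, \sigma_f \rangle$ s.t. $FirstS(\sigma_f) = t^{A,V}$, $A, V, \langle \sigma_p, \sigma_f \rangle \models \pi$
$\phi$ is valid in $A$ (written $A \models \phi$) iff $A, V \models \phi$ for all evaluations $V$.
In the above definitions we have used a minimal set of combinators; in practice, however, it is convenient to use other, derived, combinators; we list below those that we shall use in this paper, together with their semantics.
- true, false, $\vee$, $\wedge$, $\exists$ and $\equiv$, defined in the usual way
- $\Diamond \pi =_{def} true\ U\ \pi$ (eventually $\pi$): $A, V, \langle \sigma_p, \sigma_f \rangle \models \Diamond \pi$ iff there exist $\sigma_1$, $\sigma_2$ s.t. $\sigma_f = \sigma_1 \cdot \sigma_2$ and $A, V, \langle \sigma_p \cdot \sigma_1, \sigma_2 \rangle \models \pi$
- $\Diamond_p \pi =_{def} true\ S\ \pi$ (some time in the past $\pi$): $A, V, \langle \sigma_p, \sigma_f \rangle \models \Diamond_p \pi$ iff there exist $\sigma_1$, $\sigma_2$ s.t. $\sigma_p = \sigma_1 \cdot \sigma_2$ and $A, V, \langle \sigma_1, \sigma_2 \cdot \sigma_f \rangle \models \pi$
- $\Box \pi =_{def} \neg \Diamond \neg \pi$ (always $\pi$): $A, V, \langle \sigma_p, \sigma_f \rangle \models \Box \pi$ iff for all $\sigma_1$, $\sigma_2$, if $\sigma_f = \sigma_1 \cdot \sigma_2$ then $A, V, \langle \sigma_p \cdot \sigma_1, \sigma_2 \rangle \models \pi$
- $\Box_p \pi =_{def} \neg \Diamond_p \neg \pi$ (always in the past $\pi$): $A, V, \langle \sigma_p, \sigma_f \rangle \models \Box_p \pi$ iff for all $\sigma_1$, $\sigma_2$, if $\sigma_p = \sigma_1 \cdot \sigma_2$ then $A, V, \langle \sigma_1, \sigma_2 \cdot \sigma_f \rangle \models \pi$
Whenever $\phi$ has no free variables of dynamic sort except $x$, $[\lambda x . \phi]$ is abbreviated to $[\phi]$; moreover $[s = t]$ is abbreviated to $[t]$. Analogously, $\langle \lambda x . \phi \rangle$ and $\langle l = t \rangle$ are abbreviated to $\langle \phi \rangle$ and $\langle t \rangle$ respectively.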
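To make the clauses for the temporal combinators concrete, here is a small evaluator restricted to a single finite path. It is a sketch under strong assumptions, not the logic itself: a pointed path is encoded as a full path `(d0, l0, d1, ..., dn)` plus the index `i` of the current state (so the past is everything up to `d_i` and the future everything from `d_i`), state and label formulae are plain Python predicates, and quantifiers, maximality and the anchoring operator over all pointed paths are left out. All function names are hypothetical.

```python
def states(path):
    return path[0::2]


def labels(path):
    return path[1::2]


def state(pred):
    """[lambda x . phi]: phi holds at the current state."""
    return lambda path, i: pred(states(path)[i])


def label(pred):
    """<lambda x . phi>: the future is not just a state and phi holds at its first label."""
    return lambda path, i: i < len(labels(path)) and pred(labels(path)[i])


def neg(f):
    return lambda path, i: not f(path, i)


def implies(f, g):
    return lambda path, i: (not f(path, i)) or g(path, i)


def nxt(f):
    """Next: move the point one transition forward."""
    return lambda path, i: i + 1 < len(states(path)) and f(path, i + 1)


def until(f, g):
    """f U g: g holds at some point k >= i, and f holds at every point in [i, k)."""
    def holds(path, i):
        n = len(states(path))
        return any(g(path, k) and all(f(path, j) for j in range(i, k))
                   for k in range(i, n))
    return holds


def since(f, g):
    """f S g: g holds at some point k <= i, and f holds at every point in (k, i]."""
    def holds(path, i):
        return any(g(path, k) and all(f(path, j) for j in range(k + 1, i + 1))
                   for k in range(0, i + 1))
    return holds


def TRUE(path, i):
    return True


# Derived combinators, defined exactly as in the list above.
def eventually(f):
    return until(TRUE, f)


def sometime_past(f):
    return since(TRUE, f)


def always(f):
    return neg(eventually(neg(f)))


def always_past(f):
    return neg(sometime_past(neg(f)))


# Hypothetical run used for experiments.
RUN = ("s0", "lock", "s1", "unlock", "s2")
```

For instance, `eventually(label(lambda l: l == "unlock"))(RUN, 0)` holds, whereas `always(state(lambda s: s != "s1"))(RUN, 0)` does not, since the run passes through `s1`.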
.4 Temporal formulae with players
In Sect. B we have to express properties of the execution of the whole program that relate the activity of one component with the activity of other ones (e.g. a rule may require that if a thread performs a lock action, the memory manager performs a lock action too). Thus we need a temporal logic that allows us to specify the role played by a component inside a transition of the program. For example, we need a formula expressing that, when the whole program performs the transition $s \xrightarrow{l} s'$, the component $t$ inside state $s$ performs the transition $t \xrightarrow{l_t} t'$.
Since the transition relation of the program is given by inductive rules, it is possible to associate with a transition its proof in this inductive system. Such a proof describes how a transition of the whole program results from combining transitions of its components. Given an lts $(States, Labels, \to)$, we can define the Generalized Labelled Transition System (glts) as the 4-tuple $(States, Labels, \to, Proofs)$, where $Proofs$ is the set of proofs and $\to : States \times Labels \times States \times Proofs$ is defined as follows: $(s, l, s', P) \in \to$ iff $s \xrightarrow{l} s'$ and $P$ is its proof in the inductive system defining $\to$. (If $(s, l, s', P) \in \to$, we write $s \xrightarrow{l[P]} s'$.)
As a consequence, the definition of path for a glts is slightly different from the one in Sect. .3. $GPATH(A)$ denotes the set of the paths (of the only dynamic sort), i.e. the set of all sequences of transitions having one of the forms (1), ..., (4) below:
(1) $\dots\ d_{-2}\ l_{-2}\ p_{-2}\ d_{-1}\ l_{-1}\ p_{-1}\ d_0\ l_0\ p_0\ d_1\ l_1\ p_1\ d_2\ l_2\ p_2\ \dots$
(2) $d_0\ l_0\ p_0\ d_1\ l_1\ p_1\ d_2\ l_2\ p_2\ \dots$
(3) $\dots\ d_{-2}\ l_{-2}\ p_{-2}\ d_{-1}\ l_{-1}\ p_{-1}\ d_0$
(4) $d_0\ l_0\ p_0\ d_1\ l_1\ p_1\ d_2\ l_2\ p_2\ \dots\ d_n$, with $n \geq 0$
where for all integers $i$, $d_i \in A_{state}$, $l_i \in A_{label}$, $p_i \in Proofs^A$ and $(d_i, l_i, d_{i+1}, p_i) \in \to^A$. Notice that both a single state $d$ and a single transition $d\ l\ p\ d'$ may be a path.
Actually, we do not need complete proofs for a given transition. We only need to record the last rule applied and its premises. Elements of $Proofs$ are pairs $(N, prems)$ where $N$ is the name of a metarule, while $prems$ is the set of transitions that are the premises of the applied rule. For example, if a transition rule in the generalized system has the form
$$\frac{T_1 \xrightarrow{L_1[P_1]} T_1' \qquad T_2 \xrightarrow{L_2[P_2]} T_2'}{T_1 | T_2 \xrightarrow{l[P_1; P_2]} T_1' | T_2'}$$
we can simplify by naming the rule above $R_1$ and writing
$$t_1 | t_2 \xrightarrow{l[R_1, \{t_1 \xrightarrow{l_1} t_1',\ t_2 \xrightarrow{l_2} t_2'\}]} t_1' | t_2'$$
instead of
$$t_1 | t_2 \xrightarrow{l[P_1; P_2]} t_1' | t_2'.$$
The formula $\langle tid : l_t \rangle$ is satisfied if the next transition of the program results from combining a transition performed by a component $t$ s.t. $IdInfo(t) = tid$ and labelled by $l_t$ with the activity of other components. Formally, it holds on a path $s\ l[P]\ s' \dots$ if $P$ is a proof of the form
$$(R_n, \{gc_1 \xrightarrow{l_1} gc_1', \dots, gc_n \xrightarrow{l_n} gc_n'\})$$
and there exists $i \in \{1, \dots, n\}$ s.t. $IdInfo(gc_i) = tid$ and $l_i = l_t$.
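The check behind a player formula can be illustrated on a finite generalized path. The sketch below assumes paths of the form `(d0, l0, p0, d1, ...)`, proofs stored as a rule name plus a set of premise transitions, and a purely hypothetical encoding in which a component is a string `"tid:rest"` so that `id_info` can stand in for IdInfo; none of these names come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """A component transition gc --label--> gc'."""
    source: str
    label: str
    target: str


@dataclass(frozen=True)
class Proof:
    """(N, prems): the last rule applied and the set of its premises."""
    rule: str
    prems: frozenset


def id_info(component):
    # Hypothetical stand-in for IdInfo: components are written "tid:rest".
    return component.split(":")[0]


def player(tid, lt):
    """<tid : lt> on a finite generalized path, evaluated at transition number i."""
    def holds(gpath, i=0):
        k = 3 * i + 2  # position of proof p_i in (d0, l0, p0, d1, l1, p1, ...)
        if k >= len(gpath):
            return False  # no next transition, so the formula fails
        return any(id_info(step.source) == tid and step.label == lt
                   for step in gpath[k].prems)
    return holds


# Hypothetical program step: thread t1 and the memory manager mm both perform "lock".
P0 = Proof("R1", frozenset({
    Step("t1:running", "lock", "t1:locked"),
    Step("mm:free", "lock", "mm:held"),
}))
GPATH = ("s0", "l", P0, "s1")
```

On this path `player("t1", "lock")(GPATH)` holds, while `player("t1", "unlock")(GPATH)` and `player("t2", "lock")(GPATH)` do not.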
Contents
1 Introduction
2 Static and dynamic semantics for Java
3 Peculiar aspects of Java
  3.1 Field Access and Method Invocation
  3.2 Main memory and working memory
4 Java1
  4.1 Syntax
  4.2 Static Semantics
    4.2.1 The type system and the annotation function
    4.2.2 Type Rules
  4.3 Semantics
  4.4 Thread Semantics
  4.5 Main Memory Manager semantics
  4.6 Classes and objects states
  4.7 Program Semantics
5 Java2
  5.1 Syntax and Static Semantics
  5.2 Semantics
    5.2.1 Thread Semantics
  5.3 Main Memory Manager semantics
  5.4 Program Semantics
6 Java3
  6.1 Considerations on Communication in Java
  6.2 Files in Java
    6.2.1 Streams for Standard Input and Output
    6.2.2 Streams to Read and Write Files
  6.3 Sockets in Java
7 Syntax and Static Semantics
8 Thread Semantics
  8.1 Memory Manager Semantics
9 Program Semantics
  9.1 Program states
    9.1.1 Stream State
    9.1.2 Socket State
A Java1 syntax
B Low level actions rules
  B.1 How to prove rules
  B.2 The Signature
  B.3 Rules to be proved
  B.4 Auxiliary properties
C A branching-time temporal logic
  .1 Introduction
  .2 Dynamic Structures
  .3 The Logic
  .4 Temporal formulae with players
References
[1] E. Astesiano and G. Reggio. Formalism and Method. In M. Bidoit and M. Dauchet, editors, Proc. TAPSOFT'97, pages 93-114, Berlin, 1997. Springer Verlag.
[2] E. Coscia and G. Reggio. An equivalence notion for Java programs. Technical report, DISI, 1998. In preparation.
[3] G. Costa and G. Reggio. Specification of abstract dynamic data types: a temporal logic approach. TCS, (173), 1987. To appear.
[4] J. Gosling, B. Joy, and G. Steele. The Java Language Specification. The Java Series. Addison Wesley, 1996.
[5] P. Cenciarelli, A. Knapp, B. Reus, and M. Wirsing. An event-based structural operational semantics of multi-threaded Java. To appear in Formal Syntax and Semantics of Java. Springer, 1998.
[6] P. Cenciarelli, A. Knapp, B. Reus, and M. Wirsing. From sequential to multi-threaded Java: An event based operational semantics. In M. Johnson, editor, Algebraic Methodology and Software Technology, AMAST '97. Springer, July 1997.
[7] G. D. Plotkin. A structural approach to operational semantics (lecture notes). Technical report DAIMI FN-19, Aarhus University, 1981.
[8] S. Drossopoulou and S. Eisenbach. Java is type safe - probably. In 11th European Conference on Object Oriented Programming, June 1997.