validity contracts for software transactions

VALIDITY CONTRACTS FOR SOFTWARE TRANSACTIONS

BY

QUAN HOANG NGUYEN

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTERS BY RESEARCH

IN THE SCHOOL OF

COMPUTER SCIENCE AND ENGINEERING

THE UNIVERSITY OF NEW SOUTH WALES

August 2009

© Quan Hoang Nguyen 2009

ORIGINALITY STATEMENT

‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project’s design and conception or in style, presentation and linguistic expression is acknowledged.’

Signed ........................................    Date ........................................

ABSTRACT

Software Transactional Memory is a promising approach to concurrent programming, freeing programmers from error-prone concurrency control decisions that are complicated and not composable. But few such systems address the consistency of transactional objects.

In this thesis, I propose a contract-based transactional programming model toward more secure transactional software. In this general model, a validity contract specifies both the requirements and the effects of a transaction. Validity contracts bring numerous benefits, including reasoning about and verifying transactional programs, detecting and resolving transactional conflicts, automating object revalidation and easing program debugging.

I introduce an ownership-based framework, AVID, derived from the general model, which uses object ownership as a mechanism for specifying and reasoning about validity contracts. I have specified a formal type system and implemented a prototype type checker to support static checking. I have also built a transactional library framework for AVID, based on the existing Java DSTM2 framework, for expressing transactions and validity contracts.

Experimental results on a multi-core system show that contracts add little overhead to the original STM. I find that contract-aware contention management yields significant speedups in some cases. The results suggest compiler-directed optimization for tuning contract-based transactional programs. My further work will investigate the application of transaction contracts to various aspects of TM research, such as hardware support and open nesting.

ACKNOWLEDGEMENTS

First and foremost, I would like to thank my supervisors, Prof. Jingling Xue and A/Prof. John M. Potter, for their helpful guidance and support in shaping this dissertation. I also thank Dr. Yi Lu for explaining and discussing numerous research questions in the early stage of my research. I am very grateful for the ARC grants, and for the scholarship and tutorship of the School of Computer Science.

Many thanks to Dr. Bernhard Scholz for introducing me to the opportunity to study at UNSW. I would like to thank Lian Li and Lin Gao for their collaboration during my study. I would also like to thank Li Wang, a visiting student, for some collaborative work.

Last but not least, I would like to thank the many people in our lab’s lunch group, Tom, Andy and Angelo to name a few, for the variety of topics they raised at lunchtime. Many discussions on our current work were interesting and exciting, and the off-research topics were very entertaining.

Contents

List of Figures
List of Tables

1 Introduction
    1.1 Background
        1.1.1 Concurrent Programming
        1.1.2 Software Transactional Memory
        1.1.3 Object-oriented Programming
        1.1.4 Validity
    1.2 Thesis Statement
    1.3 List of Contributions
    1.4 Outline of this Dissertation

2 Background
    2.1 Concurrency
        2.1.1 Lock-based Mechanisms
        2.1.2 Subtleties of Lock-based Synchronization
        2.1.3 Non-blocking Mechanisms
    2.2 Transaction Processing
        2.2.1 Transactional Memory
            2.2.1.1 Updating Policy
            2.2.1.2 Conflict Detection
            2.2.1.3 Contention Management
            2.2.1.4 Version Management
            2.2.1.5 Correctness Criteria
    2.3 Contract Theory
        2.3.1 The Notion of Contract
        2.3.2 Design by Contract
        2.3.3 DbC Elements
            2.3.3.1 Pre-condition/post-condition
            2.3.3.2 Class Invariant
            2.3.3.3 Check Instruction
        2.3.4 The Benefits of DbC
        2.3.5 Previous Work
    2.4 Ownership
        2.4.1 Ownership Types
            2.4.1.1 Ownership Generic Java
        2.4.2 Ownership-based Effect Systems
        2.4.3 Ownership Types and Invariants
        2.4.4 Applications of Ownership
    2.5 Concluding Remarks

3 Contract-Aware Transactional Memory
    3.1 Transactions
    3.2 General Model
        3.2.1 Validity Contracts for Transactions
        3.2.2 General Contract-based STM Framework
            3.2.2.1 Execution Model
            3.2.2.2 Validity Requirement
        3.2.3 Transaction Validation vs. Object Revalidation
        3.2.4 Validity Subcontract
        3.2.5 Correctness of Contracts
        3.2.6 Contracts and Optimizations
            3.2.6.1 Validity-based Conflict Detection
            3.2.6.2 Contract-based Contention Management
            3.2.6.3 Benign Conflicts and Safe Escapes
    3.3 AVID: Ownership-based STM
        3.3.1 Ownership-based Contract
        3.3.2 Programming Language
        3.3.3 Motivating Example
    3.4 AVID Type System
    3.5 Dynamic Semantics and Properties
    3.6 Extensions for Static Non-interference
    3.7 Concluding Remarks

4 Type Checker
    4.1 Requirements
    4.2 Design and Implementation
        4.2.1 Highlights of OGJ Type Checker
        4.2.2 Design and Engineering of AVID
    4.3 Type Checking
        4.3.1 Testing with Field Access and Assignment
        4.3.2 Testing with Method Parameters
        4.3.3 Testing with Inheritance
        4.3.4 Testing with Contract
        4.3.5 Testing with Atomic Blocks and Subcontract
        4.3.6 An Integration Test
    4.4 Concluding Remarks

5 The AVID STM Library
    5.1 Design and Implementation
        5.1.1 Annotation of Ownership and Contract
        5.1.2 Transaction Creation
        5.1.3 Transactional Objects
        5.1.4 Conflict Detection and Resolution
        5.1.5 Contract-aware Validation
        5.1.6 Usability and Extensibility
    5.2 Examples of the Library
        5.2.1 Debugging with Contracts and Checks
        5.2.2 Pair Manager Example
    5.3 Performance Evaluation
        5.3.1 Evaluation Methodologies
        5.3.2 Benchmark Settings and Testing Environment
        5.3.3 Efficiency of Read Escapes
        5.3.4 Contract-based Contention Managers
        5.3.5 Evaluation of Execution Overheads
    5.4 Concluding Remarks

6 Discussion and Review
    6.1 Design by Contract
    6.2 Invariant and Exception Handling
        6.2.1 Object Invariants vs. Data Races
    6.3 Ownership
        6.3.1 Ownership and Invariants
        6.3.2 Runtime Ownership
    6.4 Transactional memory
        6.4.1 Conflict Detection and Resolution
        6.4.2 Correctness Criteria for TM
        6.4.3 Irreversible Transactions
        6.4.4 Transaction Nesting
        6.4.5 Composability
        6.4.6 Other Related Work
    6.5 Limitations of AVID Framework
        6.5.1 STM-related Overheads
        6.5.2 Expressiveness of Ownership Types
    6.6 Concluding Remarks

7 Conclusions
    7.1 Summary of Contributions
    7.2 Future Directions
    7.3 Concluding Remarks

Bibliography

Appendices

A Further Type-Checking Experiments
    A.1 Testing with Method Argument
    A.2 Testing with Inheritance
    A.3 Testing with Atomic Expression and Subcontracting

List of Figures

1.1 Example of Customer and Account
2.1 Ownership and Encapsulation
3.1 Meaning of Transaction’s Validity Contract
3.2 Java STM System Overview
3.3 Semantics of contract-attached atomic expression
3.4 Overview of AVID STM System
3.5 Account and Customer example in AVID abstract syntax
3.6 Example of ownership and validity contracts using AVID API
4.1 Examples of books in ownership types and in OGJ
4.2 Ownership and Subclasses in OGJ
4.3 Contracted Atomic Statement and Contracted Method Declaration
4.4 Testing with field assignment: (a) source code, and (b) compilation result
4.5 Testing with parameter mismatch: (a) source code, (b) compilation result
4.6 Testing with inherited method calls: (a) code, (b) compilation result
4.7 Testing with parameter mismatch of inherited method: compilation result
4.8 Testing with inherited attribute
4.9 Testing with where clause and validity subcontract: code, compilation result
4.10 Testing with nested atomic blocks
4.11 Example of illegal manipulation of list via reference
5.1 Transaction Creation in DSTM2 and in AVID
5.2 Contracted Transaction
5.3 Transactional Node with Ownership
5.4 Throughput txns/second for a List with 3 update mix rates
5.5 Throughput txns/second for an RBTree with 3 update mix rates
5.6 Validity checks for RBTree with shadow and obstruction-free adapters
5.7 Compare Contract-based and Aggressive CMs for List and RBTree when mix=50
5.8 Throughputs of single-threaded execution of three versions of List (transactional, synchronized, and sequential List), with five update mixes; the log-scaled Y-axis
5.9 Comparison of lock-based and transaction versions of List, with five mix updates
5.10 Throughputs of single-threaded execution of three versions of RBTree (transactional, synchronized, and sequential List), with five update mixes; the log-scale Y-axis
5.11 Comparison of lock-based RBTree with five mix updates
A.1 Test 2 with parameter mismatch: (a) source code, (b) command line
A.2 Test 3 with inheritance and parameter mismatch: (a) source, (b) command line
A.3 Example of inheritance and single dispatch: command line
A.4 Example of atomic expression

List of Tables

3.1 Contracted-based Contention Managers
3.2 Abstract Syntax
3.3 Extended Syntax for Type System
3.4 Rules for Validity Contract and Subcontract
3.5 Rules for Context and Constraint
3.6 Program, Class and Method Rules
3.7 Expression Rules
3.8 Rules for Type, Subtype, Binding and Abstraction
3.9 Lookup Functions
3.10 Extended Syntax for Dynamic Semantics
3.11 Auxiliary Definitions for Dynamic Properties
3.12 Small-Step Operational Semantics
3.13 Extended Abstract Syntax for Disjointness Specification
3.14 Modified Rules for Fine-grained Effect Specification
3.15 Rules for Disjoint Contexts and Contract Non-interference

Listings

2.1 Example of check instruction
4.1 Shell script "runcommand.sh" for running AVID checker
4.2 An example of manipulating a list
5.1 A Transactional Pair
5.2 Storage of Pairs
5.3 PairManager
5.4 PairChecker

Chapter 1

Introduction

If we knew what it was we were doing, it would not be called research, would it?
-- Albert Einstein

This dissertation addresses the tasks of reasoning about and maintaining object validity in object-oriented concurrent applications. In particular, I concentrate on the development of a contract-based language, a compiler and a library framework to ease concurrent programming and to preserve system validity. In this introductory chapter, I start with the current needs of concurrent programming in the multi-core era and the promise of transactional memory to ease the programming task. Then I discuss the requirements for maintaining system validity. I then describe my contract-based approach to addressing object validity in concurrent programs. Section 1.2 presents my thesis statement, and Section 1.3 provides a list of the contributions made by this thesis. Finally, Section 1.4 gives an outline of this dissertation.

1.1 Background

1.1.1 Concurrent Programming

With the ubiquity of multi-core systems, there is a huge demand for software developers to exploit the available concurrency (128). But concurrent programming is notoriously difficult, since it requires careful coordination between threads that access shared memory locations. Pessimistic synchronization mechanisms, such as locks, semaphores and monitors, provide a means to synchronize memory and coordinate threads; yet, despite their current market domination, they achieve performance only in a cumbersome and error-prone manner. The major disadvantage of lock-based synchronization is the difficulty of reasoning about locks. This difficulty arises for at least two reasons: the need to manage locks explicitly and the challenge of composing lock-based code. Moreover, many concurrent programming techniques, such as Edsger Dijkstra’s semaphore (41), are very challenging to program with. All these difficulties are magnified by both obvious defects, such as deadlocks and livelocks, and latent defects, such as race conditions, which arise from the nondeterministic nature of concurrent programs. Even worse, adding resources (e.g. a faster CPU, disk or network, or more memory) can make a seemingly stable system unstable, because latent defects show up more quickly.

In this dawn of multi-core systems, the programmability of concurrent software is the dominant concern. New programming mechanisms are needed to harness the upcoming generation of massively parallel computer architectures, which will comprise hundreds of cores per processor. Thus, a more programmable concurrency mechanism is needed to take advantage of multi-core processing power.

1.1.2 Software Transactional Memory

Software Transactional Memory (STM) systems (122) promise to ease programming tasks for concurrent application developers. STM hides the complexity of concurrency control from programmers and enables safe composition of scalable applications. STM offers a programming model inspired by the ACID properties (Atomicity, Consistency, Isolation and Durability) of database transactions, which play a major role in software systems. An STM system guarantees atomicity for code wrapped inside an atomic block: the code is executed transactionally and, if the transaction fails, its effects are discarded. STM also provides isolation, guaranteeing that no transaction can observe the partial effects of another. There are, however, few STMs (58) that pay attention to data invariants, that is, to the consistency guarantee.
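The atomicity guarantee described above can be illustrated with a minimal sketch. It is not how DSTM2 or any particular STM is implemented: the class AtomicCell and its method atomic are hypothetical names introduced here, and the sketch assumes the shared state is a single immutable value that the transaction body maps to a new value. The body runs against a snapshot; the result is published with one compare-and-set, so a body that throws leaves no visible effect.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical helper for illustration only; not the API of any STM library.
final class AtomicCell<T> {
    private final AtomicReference<T> committed;

    AtomicCell(T initial) {
        this.committed = new AtomicReference<>(initial);
    }

    T read() {
        return committed.get();
    }

    // Runs body against a snapshot and publishes the result in a single step.
    // If body throws, nothing is published, so its effects are discarded.
    // If another thread committed in the meantime, body is retried.
    void atomic(UnaryOperator<T> body) {
        while (true) {
            T snapshot = committed.get();
            T result = body.apply(snapshot);          // may throw: transaction fails
            if (committed.compareAndSet(snapshot, result)) {
                return;                               // commit succeeded
            }
            // Conflict: another commit intervened; retry with a fresh snapshot.
        }
    }
}

final class AtomicBlockDemo {
    public static void main(String[] args) {
        AtomicCell<Integer> balance = new AtomicCell<>(100);
        balance.atomic(b -> b - 30);                  // commits normally
        try {
            balance.atomic(b -> {
                if (b - 500 < 0) {
                    throw new IllegalStateException("insufficient funds");
                }
                return b - 500;
            });
        } catch (IllegalStateException e) {
            // The failed transaction published nothing.
        }
        System.out.println(balance.read());
    }
}
```

Note that nothing in this sketch checks a data invariant on the committed value; that gap is the consistency concern raised at the end of the paragraph.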

1.1.3 Object-oriented Programming

Object-oriented programming (OOP) provides basic concepts such as encapsulation, inheritance and polymorphism, the practice of which brings numerous advantages for expressing a programmer’s design clearly. It models software in the way that humans describe real-world objects, using software objects in clear program structures. It encapsulates attributes and operations (behaviours) into objects. Inheritance relationships permit the derivation of new classes of objects that absorb the characteristics of existing classes and add unique characteristics of their own. OOP provides information hiding, in which objects communicate with each other across interfaces without depending on the underlying implementation details of other objects. Because the internal functioning of objects is decoupled from their external interface, minor changes to the internal implementation during upgrading or maintenance are easier to make without affecting clients. Lastly, reusing existing classes through inheritance and composition is made easy, so developers can build new solutions by extending existing ones.

In this thesis, I mainly focus on the world of object-oriented programming. Even though the discussion throughout this dissertation is about the Java programming language, the concepts can be applied to similar languages such as C#.

1.1.4 Validity

Object invariants describe essential properties of objects that must be maintained to guarantee the expected behaviour of the program execution. Any violation of an object invariant is undesirable, as it violates program logic and can cause programs to crash. Yet reasoning about and maintaining object invariants in concurrent programs is difficult. The difficulties come from arbitrary object aliases, mutable object state, methods’ side effects and the sophisticated mixture of multiple threads and shared mutable state typical of concurrent O-O programs.

Incorporating lock-based synchronization primitives, such as monitors, mutexes and semaphores, into programs may resolve latent defects (data races). Yet they lead to other, harder-to-solve problems such as deadlocks, lock convoying and priority inversion (63). At a higher level of abstraction, using locks leads to problems such as the inheritance anomaly and the loss of abstraction and code reusability, to name just a few. In general, these synchronization mechanisms complicate reasoning about and maintenance of concurrent code and object validity.

To illustrate the challenges of maintaining object validity, Figure 1.1 shows an example of a bank customer having an account. In this example, if a reference to the customer’s account a is leaked, for example via a call to getAccount(), then any manipulation through that reference is outside the customer’s control. The alias makes it very hard to reason about the invariant of the customer, which depends on the account. Even worse, if such a manipulation (e.g. withdraw) on the leaked reference occurs concurrently with the client’s operation, it results in a race condition on the account. In this situation, no matter how carefully the Customer code is synchronized, neither a consistent state of the account nor the customer’s invariant is guaranteed.
Even worse, suppose now that all three operations of the account are protected with lock-based synchronization (Java monitors) to prevent the potential race condition. Unfortunately, deadlock can occur if two customers attempt to simultaneously transfer money to each other.

    class Account {
        private int amount = 0;
        // inv: amount ≥ 0

        int balance() { return amount; }
        void deposit(int x) { amount += x; }
        void withdraw(int x) { amount -= x; }
    }

    class Customer {
        private Account a;
        private String id;
        // inv: a ≠ null ∧ a.amount ≥ 10

        void transfer(int amt, String custID) {
            Customer b = ....; // lookup custID
            a.withdraw(amt);
            b.deposit(amt);
        }
        Account getAccount() { return a; }
    }

Figure 1.1: Example of Customer and Account
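The aliasing problem described for Figure 1.1 can be reproduced even without threads. The following is a simplified, single-threaded variant of the figure, written as a sketch: the constructors and the checkInvariant() helper are assumptions added for illustration and are not part of the original example. It shows that once getAccount() leaks the reference, code outside Customer can break the customer’s invariant without calling any Customer method.

```java
// Simplified, single-threaded variant of Figure 1.1. The constructors and
// checkInvariant() are hypothetical additions for illustration.
final class Account {
    private int amount;

    Account(int initial) { amount = initial; }

    int balance() { return amount; }
    void deposit(int x) { amount += x; }
    void withdraw(int x) { amount -= x; }
}

final class Customer {
    private final Account a;

    Customer(Account a) { this.a = a; }

    // inv: a != null && a.balance() >= 10
    boolean checkInvariant() { return a != null && a.balance() >= 10; }

    // Leaks an alias to the internal account.
    Account getAccount() { return a; }
}

final class LeakDemo {
    public static void main(String[] args) {
        Customer c = new Customer(new Account(50));
        System.out.println(c.checkInvariant());   // invariant holds here

        Account alias = c.getAccount();           // the reference escapes
        alias.withdraw(45);                       // no Customer method is involved

        System.out.println(c.checkInvariant());   // invariant no longer holds
    }
}
```

With several threads, the same withdrawal through the alias could additionally interleave with a transfer, which is the race condition discussed above.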

1.2 Thesis Statement

To address the validity challenge, this thesis introduces a new approach to consistency in concurrent applications in an inherently trackable manner. This work focuses on the use of invariants for validity guarantees in the context of software transactions. By introducing a new contract-based programming model, AVID, the system provides the consistency required for ACID transactions. Validity contracts for methods and transactions specify assumptions about, and effects on, object invariants. The key idea is to facilitate modular reasoning about object validity and to preserve the validity of objects at runtime. In this programming model, all complex details of the underlying execution mechanisms (e.g., locks or transactions; the implementation details of the STM library) are hidden from the programmer. By using a structured form of validity contract, the AVID system is able to reduce the number of validity checks compared with what other approaches would require if they attempted to perform similar checks.

Furthermore, validity contracts open up a number of opportunities for optimization. First, validity contracts serve to detect transactional conflicts at a higher level of abstraction (e.g., validity interference) than reads and writes. In STM, conflict detection determines when two transactions cannot both safely commit, in order to guarantee serializability. Specifically, when an object read access is not covered by the transaction’s validity contract, that access can be safely ignored when checking for conflicts with other transactions’ writes to the same object.

Second, validity contracts can be used for conflict resolution. Contention management (CM) was originally proposed for conflict resolution to guarantee progress (livelock avoidance) in obstruction-free STMs (65; 71). CM, an out-of-band mechanism (provided by separate user-defined modules), can be tuned to increase throughput dramatically in high-contention workloads (51; 119). With validity contracts, I have developed a set of contract-based contention managers with different safety-driven strategies to ensure liveness.

Third, validity checks can be deduced from contracts and scheduled at transaction commit to preserve object validity. Transactional validation ensures that a transaction never views or makes use of inconsistent data, while object revalidation ensures that transactions commit only valid data. As in (58), validity checks in this work should not contain side effects or I/O operations. I employ a notion of subcontracting to justify the validity requirements and effects of operations inside a transaction’s body. Validity contracts and subcontracting greatly help to reduce the number of validity checks while ensuring safety.
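The first optimization above, ignoring reads that the validity contract does not cover, can be sketched as follows. This is a deliberately simplified model and not AVID’s actual data structures: objects are represented by plain strings, a contract is reduced to the set of objects whose validity the transaction relies on, and the names Txn, ConflictDetector, readWriteConflict and validityConflict are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model for illustration; AVID's real representation may differ.
final class Txn {
    final Set<String> readSet = new HashSet<>();
    final Set<String> writeSet = new HashSet<>();
    // Objects whose validity this transaction's contract relies on.
    final Set<String> contract = new HashSet<>();

    Txn requires(String obj) { contract.add(obj); return this; }
    Txn reads(String obj)    { readSet.add(obj);  return this; }
    Txn writes(String obj)   { writeSet.add(obj); return this; }
}

final class ConflictDetector {
    // Conventional rule: any read-write or write-write overlap is a conflict.
    static boolean readWriteConflict(Txn t, Txn writer) {
        for (String obj : writer.writeSet) {
            if (t.readSet.contains(obj) || t.writeSet.contains(obj)) {
                return true;
            }
        }
        return false;
    }

    // Contract-aware rule: a read that t's contract does not cover is ignored.
    static boolean validityConflict(Txn t, Txn writer) {
        for (String obj : writer.writeSet) {
            if (t.writeSet.contains(obj)) {
                return true;                       // write-write still conflicts
            }
            if (t.readSet.contains(obj) && t.contract.contains(obj)) {
                return true;                       // covered read conflicts
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Txn t1 = new Txn().requires("account").reads("account").reads("log");
        Txn t2 = new Txn().writes("log");
        Txn t3 = new Txn().writes("account");

        System.out.println(readWriteConflict(t1, t2));   // plain read-write overlap
        System.out.println(validityConflict(t1, t2));    // uncovered read is ignored
        System.out.println(validityConflict(t1, t3));    // covered read still counts
    }
}
```

In this model the read of "log" by t1 is treated as benign because t1’s contract does not depend on it, whereas the read of "account" is covered and therefore still conflicts with a concurrent write.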

1.3 List of Contributions

In summary, this dissertation makes five primary contributions:

1. Programming with Validity Contracts. This thesis proposes a general approach to validity contracts for software transactions to enhance program safety. The benefits of validity contracts include modularity and composability of transactional code. They are also useful for detecting and scheduling conflicting transactions. Moreover, they are useful for software assurance by automating object revalidation via deduced assertions.

2. Static Checking. Based on my general model, I integrate ownership mechanisms to support static reasoning about transactional code. To achieve that, my approach involves a language design, a type system and a static type checker for modular reasoning about object validity and validity contracts.

3. Feasibility of Ownership and Contracts. This work is the first to introduce a practical way to express ownership-based validity contracts by using Java annotations. The approach enables practical experimentation with ownership-based validity contracts on existing Java STM implementations.

4. A Contract-based Transactional Framework. I have developed a Java-based STM framework that supports linear-nesting contract-based conflict detection, automatic validity checks and rollbacks. It was implemented on top of DSTM2 (70) and SXM (65). In particular, I have developed six contention managers utilising validity contracts and a read-escape optimization for exploiting benign conflicts.

5. Experiments with New Framework. I ran experiments on my AVID framework with concurrent data structures and a number of micro-benchmarks on an UltraSPARC T2 Plus server. Experimental results show that my approach adds little overhead. Contract-based contention managers can help gain significant speed-ups in many cases, and a read-escape strategy gave some slight speed-ups. Validity checks are also used to illustrate the debugging of buggy versions of simple programs.

1.4 Outline of this Dissertation

This dissertation comprises seven chapters. In addition to this introduction, the dissertation is organised into the following chapters.

• Chapter 2 covers some background on concurrent programming, contracts and object ownership. The concurrent programming part considers lock-based mechanisms, non-blocking mechanisms and then transactional memory. In the next part, I give an overview of design by contract together with some related work. The final part introduces object ownership and language-based advances for expressing ownership types and side effects.

• Chapter 3 introduces a general model of transactional memory in which each transaction has a contract. The model describes validity contracts both in a general form and as expressed using the object ownership mechanism. Section 3.3 introduces the three major modules developed in this work and then describes the formal treatment, including the language syntax, a type system and the dynamic semantics of contract-aware STM. An example is given to show how programmers can specify ownership, validity contracts and explicit transactions in AVID.

• Chapter 4 covers the design and implementation details of the AVID type checker. It explains the ideas behind the implementation of the checker. Subsequently, the chapter gives a number of experiments and test cases for the type checker. The experiments demonstrate the usefulness of static checking in detecting program errors.

• Chapter 5 describes the key ideas of the implementation of the AVID library framework. By extending an existing framework with Java annotations, we are able to specify ownership and contracts in an elegant way. The Java annotation and reflection facility allows us to retrieve runtime information about contracts to support automatic object revalidation. Section 5.3 covers a number of experimental evaluations of the AVID library framework. They include performance evaluation using standard data structure benchmarks. The evaluation also includes contract-based optimizations.

• Chapter 6 reviews the work described in this dissertation from the perspective of existing work. Subsequently, it covers both the benefits and the disadvantages of validity contracts. The discussion includes possible improvements and extensions to my current work.

• Finally, Chapter 7 summarizes the work presented in the dissertation, including the contributions listed in Section 1.3. Then it discusses some possible future investigations.


Chapter 2

Background

The work in this dissertation is founded on the concepts of object validity and transactional memory. To support the discussions in later chapters, this chapter first reviews existing work on concurrent programming, including transactional memory. The second part of the chapter gives an overview of language-based approaches to object validity, including recent advances in type and effect systems.

2.1 Concurrency

This section gives an overview of concurrent programming, its challenges and needs, in the context of the history of the computer market. The rest of the section then describes existing concurrent programming mechanisms.

Concurrent programming is notoriously hard because threads and the synchronization mechanisms on shared data must be managed explicitly. The non-deterministic nature of concurrent programs often leads to undetected latent defects in delivered software. One such defect is the data race, which involves concurrent write-write or read-write accesses to unprotected shared data. The difficulty of concurrent programming, and the difficulty of avoiding such latent defects in concurrent code, reduced interest in concurrent programming in the past. For decades, these problems resulted in the market domination of sequential programming and sequential computers, in which these shortcomings do not exist.

The computer architecture market shifted in 2005, when parallel architectures became the mainstream product (1; 3). This shift was caused by rising concerns about heat dissipation and power consumption in chip designs, and by the limitations of instruction-level parallelism (ILP) in out-of-order execution of programs. The multi-core era began with the first mass-produced Intel dual-core processors (2). Multi-core processors place several identical cores on a single integrated circuit die (known as a chip multiprocessor or CMP); this design differs from previous multi-processor machines, which had only one processor per die.

The shift to multi-core processors has raised more concerns about the ability of concurrent programming to exploit the power of parallel hardware (55). As the upcoming generation of massively parallel computers will consist of up to hundreds of cores per processor (82), the concerns target both the performance and the programmability of the solutions. Ease and reliability of concurrent programming for such architectures is still an open problem, even though many approaches were proposed well before the widespread release of CMPs (63; 122).

There is also a common misconception that more cores automatically give a program more speedup. In fact, speedup depends on the concurrency available in the program; speedup linear in the number of available cores is often unachievable, regardless of the parallelizing techniques used. Theoretically, by Amdahl's Law (10), the speedup of a parallelized program is bounded by the inverse of the sequential fraction of the program; the speedup of an algorithm parallelized across N processors is also bounded by N. In practice, many factors significantly reduce a program's performance, such as synchronization overheads and imbalance among parallelizable task chunks.

There are two sorts of available concurrency in applications: coarse-grained and fine-grained. Coarse-grained concurrency exists at the business level, such as transactions in client-server J2EE applications, or at the application level, such as virtualization, multiple application processes or multiple Java Virtual Machine (JVM) instances. Fine-grained concurrency exists at the data structure, loop or thread level. At the data structure level, Java concurrency utilities and Transactional Memory are commonly applied.

The rest of this section gives a brief introduction to existing solutions for writing concurrent programs. In general, lock-based mechanisms incur high overheads (especially in read-dominated workloads) and are non-modular (and thus non-composable) because of their pessimistic locking. Non-blocking mechanisms, with their optimistic approach, reduce overheads in read-dominated workloads and support modularity, but algorithms using them are often more challenging to program than those using pessimistic locking.
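The Amdahl's Law bound mentioned earlier in this section can be written out explicitly. The symbols below are notation assumed here for illustration and are not taken from the thesis: s is the sequential fraction of the program, N the number of processors and S(N) the resulting speedup. The numbers in the worked line are likewise illustrative only.

```latex
% Amdahl's Law with sequential fraction s (0 < s <= 1) on N processors
S(N) = \frac{1}{s + \dfrac{1 - s}{N}},
\qquad
S(N) \le \min\!\left(N,\ \frac{1}{s}\right)

% Worked example with assumed values s = 0.1 and N = 64
S(64) = \frac{1}{0.1 + 0.9/64} = \frac{1}{0.1140625} \approx 8.77
```

Even with 64 processors, a program that is 10% sequential stays below the limit 1/s = 10, which is why linear speedup is rarely observed.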

2.1.1 Lock-based Mechanisms

In lock-based mechanisms, a thread acquires a lock protecting a shared resource before accessing the resource. While it holds the lock, the thread prevents other threads from accessing the same resource until it releases the lock. There are a number of lock-based mechanisms, which are introduced below. There also exist various thread waiting strategies, such as spinlocks, ticket counters, backoff and wait, exponential backoff and wait, and so forth (33).

Locks

The best-known type of synchronization to date is the lock (85). Locks are also known as mutual exclusion locks or mutex locks. A thread acquires a lock before accessing a shared resource to prevent other threads from accessing the resource. Once a lock is obtained by a thread, other threads requesting the same lock block until the lock is released by the owning thread.

Condition Variables

A thread waiting on a condition variable blocks until a condition becomes true before it can proceed. In general, condition variables can be tied to any arbitrary condition, such as isEmpty == true or x == 5. In contrast, locks are tied to one condition: the availability of the lock.
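As an illustration of how a lock and a condition variable cooperate, the following is a minimal sketch using ReentrantLock and Condition from java.util.concurrent.locks. The class name SimpleBlockingQueue and its methods are hypothetical and chosen only for this example; the condition being waited on is "the queue is not empty".

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class: an unbounded queue guarded by one mutex lock
// and one condition variable that stands for "the queue is not empty".
class SimpleBlockingQueue<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Deque<T> items = new ArrayDeque<>();

    void put(T item) {
        lock.lock();                  // acquire the mutex before touching shared state
        try {
            items.addLast(item);
            notEmpty.signal();        // wake one thread waiting for "not empty"
        } finally {
            lock.unlock();            // always release, even if an exception is thrown
        }
    }

    T take() {
        lock.lock();
        try {
            while (items.isEmpty()) { // re-check the condition after every wake-up
                // Releases the lock while waiting and re-acquires it before returning.
                // The uninterruptible variant is used only to keep the sketch short.
                notEmpty.awaitUninterruptibly();
            }
            return items.removeFirst();
        } finally {
            lock.unlock();
        }
    }
}
```

The lock is tied only to its own availability, while the condition variable is tied to an application-level predicate, here items.isEmpty() being false.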

Monitors

Monitors (75) are abstract data types, i.e., a combination of data structures and operations, that allow only one thread at a time to execute over shared resources. In this respect monitors are similar to locks. The required protections are enforced by the compiler implementing the monitor. Unlike locks, monitors avoid many problems such as inconsistent lock-resource association, deadlocks due to failed unlocking and incorrect extents of synchronized behaviour. Monitors are the primary synchronization mechanism in the Java programming language and are used together with condition variables. When a synchronized method exits, its monitor is released automatically, independent of program flow, which allows exceptions to be handled in synchronized Java methods while avoiding deadlocks.
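A small sketch of a Java monitor follows. The classes MonitorCounter and MonitorDemo are hypothetical names introduced only for this illustration. Each synchronized method runs under the object's intrinsic monitor, and the monitor is released when the method returns or throws.

```java
// Hypothetical example class: every synchronized method runs under the
// object's intrinsic monitor, so at most one thread executes inside at a time.
class MonitorCounter {
    private int value = 0;

    synchronized void increment() {
        value++;   // the monitor is released automatically on return or exception
    }

    synchronized int get() {
        return value;
    }
}

class MonitorDemo {
    // Starts the given number of threads, each incrementing a shared counter,
    // waits for all of them and returns the final count.
    static int countWithThreads(int threads, int incrementsPerThread) {
        MonitorCounter counter = new MonitorCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

Because the programmer never writes an explicit unlock, the failed-unlocking problem described above cannot arise in this style.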

Semaphores

Semaphores, sometimes referred to as counting semaphores, are non-negative counters. Any thread can decrement the counter to lock the semaphore, but attempting to decrement it below zero causes the calling thread to wait for another thread to unlock it first. A major problem with semaphores is the lack of ownership, in the sense that a thread that never locked a semaphore may unlock it. This can lead to unexpected program behaviour.
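The ownership problem can be seen with java.util.concurrent.Semaphore, which does not check that a release matches an earlier acquire. The sketch below is illustrative only; the class and method names are hypothetical, and the calls are made from a single thread for brevity, although the stray release could come from any thread.

```java
import java.util.concurrent.Semaphore;

class SemaphoreDemo {
    // Returns the number of permits available after one stray release.
    static int permitsAfterStrayRelease() {
        Semaphore sem = new Semaphore(1);        // one permit: used like a lock
        boolean first = sem.tryAcquire();        // succeeds, counter drops to 0
        boolean second = sem.tryAcquire();       // fails, counter cannot go below 0
        if (!first || second) {
            throw new IllegalStateException("unexpected semaphore state");
        }
        sem.release();                           // matching release, counter is 1
        sem.release();                           // stray release, counter grows to 2
        return sem.availablePermits();
    }
}
```

After the stray release two threads could hold the "lock" at once, which is the kind of unexpected behaviour the paragraph above warns about.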


2.1.2 Subtleties of Lock-based Synchronization

There are some subtleties with lock-based approaches, which vary in degree and relate to lock management (e.g. coverage and granularity) and to the side-effects of locks.

Lock Coverage

There are three sorts of problems in terms of lock coverage: insufficient synchronization, holding the wrong locks and over-synchronization. It is not uncommon for code written by novice programmers to mix all three. Insufficient synchronization and incorrect locks can cause subtle data races that are very hard to find. When a shared object is insufficiently synchronized, a thread can access the object without holding the lock that protects it (and hence the problem can occur even while another thread is holding the lock). In a very similar problem, a thread accesses a shared object after acquiring a lock that does not protect the object (thus, it is not synchronized with another thread holding the correct lock). Over-synchronization, on the other hand, does not cause race conditions. Instead, redundant synchronization leads to runtime overheads (a reduction in the degree of concurrency) and a greater chance of deadlock.

Lock Granularity

Coarse-grained locking protects larger amounts of data. It is easier to implement and safer, but exhibits poorer parallelism. In contrast, fine-grained locking protects fairly small pieces of data, leading to improved parallelism. But fine-grained locks are difficult to reason about and can lead to more subtle race conditions and greater lock management overhead.

Side-effects of Locks

There are several classical problems with locks, such as deadlock, priority inversion, lock convoy and starvation. A deadlock occurs when two or more threads wait, in a cycle, for locks held by each other. Priority inversion happens when a lower-priority task delays a higher-priority task by holding a lock needed by the higher-priority task. Lock convoy occurs when multiple threads of equal priority contend repeatedly for the same lock; each time a thread fails to acquire the lock, it forces a context switch. Locks are also subject to starvation, in which a thread never gets sufficient resources to complete its task.

In addition to the above, locks can cause other problems. A fundamental difficulty with locks is that it is very complex to extend their behavior. Locks do not provide isolation: modifications to shared resources are immediately seen by other threads. To overcome the lack of isolation, the critical sections protected by a lock usually have to be extended to all locations that access the shared data. In addition, locks are vulnerable to failures and faults: if one thread dies, blocks, halts or goes into an infinite loop while holding a lock, other threads waiting for the lock may wait forever.
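The circular wait behind a deadlock can be made concrete with two accounts that each carry their own lock. A common remedy, which the text above does not discuss and which is shown here only as an assumed illustration, is to acquire locks in a fixed global order. The classes LockedAccount and Transfers are hypothetical, and the sketch assumes that account ids are distinct.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class: each account is protected by its own lock.
class LockedAccount {
    final int id;                     // used only to define a global lock order
    int balance;
    final ReentrantLock lock = new ReentrantLock();

    LockedAccount(int id, int balance) {
        this.id = id;
        this.balance = balance;
    }
}

class Transfers {
    // Always locks the account with the smaller id first. Two concurrent
    // transfers a -> b and b -> a then request the locks in the same order,
    // so they cannot end up waiting for each other in a cycle.
    static void transfer(LockedAccount from, LockedAccount to, int amount) {
        LockedAccount first = from.id < to.id ? from : to;
        LockedAccount second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }
}
```

If each transfer instead locked its own source account first, a transfer from a to b and a simultaneous transfer from b to a could each hold one lock and wait forever for the other.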

2.1.3 Non-blocking Mechanisms

Non-blocking mechanisms are an alternative to lock-based mechanisms that avoid some of the problems associated with locks (62). Many proposals for non-blocking mechanisms have been made in the last decade. In traditional lock-based mechanisms, a thread blocks a shared resource by holding a lock protecting that resource, whereas non-blocking algorithms allow multiple threads to read and write shared data concurrently without blocking each other (using neither locking nor mutual exclusion).

Generally speaking, non-blocking mechanisms ensure one of the following progress properties. The wait-free property guarantees that every thread completes its task within a bounded number of steps. The lock-free property ensures that at least one thread makes progress within a bounded number of steps. The obstruction-free property ensures that a thread completes within a bounded number of steps, provided that from some point on it executes in isolation. All wait-free algorithms are lock-free and all lock-free algorithms are obstruction-free.

Non-blocking algorithms use non-blocking atomic primitives, such as compare-and-swap (CAS), load-linked and store-conditional (LL-SC), and test-and-set. A CAS operation compares the content of a memory location to a given value and, if they are equal, writes a new given value to that location. Although non-blocking atomic operations existed before other synchronization primitives, their use was considered too challenging and has been relegated to the implementation of synchronization primitives. Non-blocking algorithms do not use locks and so do not incur lock-related problems; they are therefore often considered easier to employ in building scalable software systems. As they do not block, non-blocking algorithms are also expected to perform better than lock-based mechanisms.
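The CAS operation described above is available in Java through the compareAndSet method of the java.util.concurrent.atomic classes. The following minimal sketch shows the usual retry loop of a lock-free counter; the class name CasCounter is hypothetical and used only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class: a counter that never takes a lock.
class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Lock-free increment: read the current value, compute the new one and
    // publish it with CAS; retry if another thread changed the value meanwhile.
    int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;          // CAS succeeded, the update is visible
            }
            // CAS failed because the value no longer equals 'current'; try again
        }
    }

    int get() {
        return value.get();
    }
}
```

A failed CAS means that some other thread succeeded, so the system as a whole makes progress even though an individual thread may retry; this is the lock-free property.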
After the first STM was proposed (122), it was widely accepted that non-blocking atomic operations were the only way to build such systems (98), and non-blocking frameworks dominated STM research for several years. In fact, practical evaluation has shown that non-blocking algorithms do not perform as well as highly optimized blocking versions (64). In 2005, Robert Ennals's controversial paper (12) and later work (37; 42) indicated that lock-based systems yield the fastest STM models, much faster than non-blocking systems. Moreover, non-blocking algorithms are very challenging to program compared with conventional blocking algorithms (69). Even so, a significant number of TM researchers remain opposed to lock-based STM designs because of the inherent problems that locks create. The next section covers transactional memory in more depth, including its history, properties and implementations.

2.2 Transaction Processing

Database management systems and their associated transaction management systems have demonstrated their efficiency for a large class of important database software systems, such as accountancy, inventory management, production planning, and so forth. Such systems avoid losing information in the case of a system crash by halting immediately and relying on state-of-the-art database recovery mechanisms to recover all previously committed work.

The ACID properties of database transactions, namely atomicity, consistency, isolation and durability, play a major role in software systems. Atomicity requires that either all memory access operations of a transaction complete successfully or none of them does. Consistency requires that the changes made by a transaction leave the data consistent with respect to a number of well-defined data integrity constraints. Isolation requires that the result of each transaction is correct, as if it were running alone on the system, regardless of other concurrent transactions. Durability requires that after a transaction commits, its modifications to the data are permanent and available to all subsequent transactions, even after a system crash.

2.2.1 Transactional Memory

Transactional memory (TM) is founded on the concept of data transactions. The first TM proposal was made by Herlihy and Moss in 1993, showing a concrete way to implement transactions in hardware (72). With it, TM solidified as a practical means of handling concurrency. The history of transactional memory goes back to Lomet's work on atomic operations in programming languages in 1977 (93), and to the later work of Knight (84) in 1986 on integrating data transactions into the Lisp language.

Transactional memory is founded on three of the ACID properties of database systems: atomicity, consistency and isolation, known as ACI (93; 103). Durability is usually not a concern in TM implementations since in-memory data does not outlive the program; in contrast, database systems rely on disk storage for data durability. The principle behind TM is that it simplifies concurrent programming (9). TM presented a new programming concept which not only solved thread coordination and memory synchronization problems, but also promised to do so in an elegant and programmable way.

Since then, there have been a large number of TM implementations, classified as hardware transactional memory (HTM), software transactional memory (STM) or hybrid transactional memory (HyTM). I will survey STM in more depth, as it is more closely related to my work; interested readers can find more work on HTM and HyTM in the literature. TM systems are also classified by characteristics such as updating policy, conflict detection, contention management for obstruction-free STMs, version management and correctness criteria.

Hardware Transactional Memory

The first TM proposal (72) was introduced by Herlihy and Moss in 1993 as an alternative means of concurrency control. Numerous HTM systems (11; 54; 103; 104; 114) have followed. The system in (114) was proposed for virtualizing transactional memory. The work in (104) extended the model of (103) with nested transactions. Recently, (36) reported early experience with commercial hardware transactional memory.

Software Transactional Memory

The first software-only implementation of TM was developed by Shavit and Touitou in 1995 (123). It raised interest in TM research dramatically, given the lack of TM hardware. In the following years, a number of important STM contributions were made, such as a polymorphic contention manager (52), unbounded transactional memory (71), direct updating (6; 60; 117) and lock-based (blocking) systems (12; 37; 42). Tim Harris observed that TM is composable (59). That is, unlike all previous concurrent programming mechanisms, TM allows independent and isolated transactions to be composed into larger transactions without any change to the semantics of the original transactions.

STM systems differ in transaction granularity, that is, the type of storage unit they handle. Systems implemented in O-O languages, such as Java, usually have object granularity (52; 71; 98). Other systems have word granularity (47; 57) or block granularity, depending on whether they reference single words or blocks of words. The system in (38) supports all three types of granularity. In object-granularity systems, the objects are simply extended with a field that records the metadata, allowing quick access to it. Systems with word or block granularity maintain a separate table that records the metadata, so access is slower.

Hybrid Transactional Memory

HyTM implementations combine HTM with STM to maintain the flexibility of STM (e.g. supporting unbounded transactions and transactional variables) while inheriting the performance scalability of a hardware mechanism. Many hybrid TM implementations (23; 35; 92; 116; 129) have now been proposed and evaluated.

2.2.1.1 Updating Policy

All TM systems must implement an updating policy to control how transactions perform and commit their writes to global memory. In effect, updating policies also have a direct impact on transactional reads. There are two types of updating: direct updating and deferred updating. Many early STM systems implemented a deferred updating policy (57; 71), and more recently some direct updating systems have been proposed (6; 60; 117). Numerous TM research papers have investigated which updating policy gives better performance.

With deferred updating, a transaction makes a local copy of global memory and performs its reads and writes on the local copy. On commit, the transaction copies the local values to global memory. On abort, the transaction performs no write action; no restoration is required, since it has not written anything to global memory. Existing STM systems such as DSTM (71) and Harris and Fraser's WSTM (57) use deferred updates.

With direct updating, a transaction makes a local backup of the original state of the global memory it modifies. The transaction updates global memory directly on its writes and performs no write action on commit (every write is already in place). On abort, the transaction restores global memory by copying the local backup values back. In direct-update systems (6; 60; 117), eliminating the working copy may be more efficient because it avoids cloning each modified object. However, direct-update systems must record the original value of each modified memory location so that it can be restored if the transaction aborts. Direct-update STM systems must also prevent a transaction from reading locations modified by other uncommitted transactions, thereby reducing the potential for concurrent execution. Direct-update STM systems use locks to prevent multiple transactions from updating an object concurrently and are thus exposed to lock-based problems.
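The difference between the two policies can be sketched in a few lines of Java. This is a single-threaded illustration of the bookkeeping only: it models global memory as an int array, performs no conflict detection or locking, and does not reproduce the design of DSTM, WSTM or any other cited system. The class names DeferredTx and DirectTx are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Deferred updating: writes go to a private buffer and reach global memory
// only on commit, so an abort has nothing to restore.
class DeferredTx {
    private final int[] memory;
    private final Map<Integer, Integer> writeBuffer = new HashMap<>();

    DeferredTx(int[] memory) { this.memory = memory; }

    int read(int addr) {
        Integer buffered = writeBuffer.get(addr);   // a transaction sees its own writes
        return buffered != null ? buffered : memory[addr];
    }

    void write(int addr, int v) { writeBuffer.put(addr, v); }

    void commit() {
        writeBuffer.forEach((addr, v) -> memory[addr] = v);
        writeBuffer.clear();
    }

    void abort() { writeBuffer.clear(); }
}

// Direct updating: writes hit global memory at once, and an undo log keeps
// the original values so that an abort can restore them.
class DirectTx {
    private final int[] memory;
    private final Map<Integer, Integer> undoLog = new HashMap<>();

    DirectTx(int[] memory) { this.memory = memory; }

    int read(int addr) { return memory[addr]; }

    void write(int addr, int v) {
        undoLog.putIfAbsent(addr, memory[addr]);    // remember the original value once
        memory[addr] = v;
    }

    void commit() { undoLog.clear(); }              // writes are already in place

    void abort() {
        undoLog.forEach((addr, old) -> memory[addr] = old);
        undoLog.clear();
    }
}
```

In the sketch, commit is the expensive step for DeferredTx and abort is the expensive step for DirectTx, which mirrors the trade-off described above.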
Deferred-update STM systems typically use non-blocking data structures, which prevent a failed thread from obstructing other threads. Direct-update STM systems, in turn, provide forward-progress guarantees to an application by detecting and aborting failed or blocked threads.

2.2.1.2 Conflict Detection

Conflict detection is the process of identifying conflicting accesses (of which at least one is a write) by two or more transactions to the same memory location. Generally, there are two types of conflict detection: early detection and late detection (87). With early conflict detection, a TM system detects a conflict immediately when a transaction tries to open a memory location for reading or writing. Early detection helps a transaction avoid unnecessary computation that would subsequently be discarded due to a conflict. Late conflict detection, on the other hand, refers to detecting conflicts at commit time (prior to commit) or at various points within the transaction's lifetime. Late detection can avoid unnecessary aborts because the conflicting transaction may itself abort due to a conflict with a third transaction, whereas early detection forces the conflict to be resolved immediately, typically by an abort. Often, conflict detection strongly depends on the updating policy of a TM system.

Conflict detection also distinguishes between visible and invisible readers. Invisible readers require less book-keeping and induce fewer cache misses, but require that read-write conflicts be noticed by the reader. Visible readers allow such conflicts to be noticed by writers as well. A variety of strategies for conflict detection have been implemented (121). A number of conflict detection and validation strategies are covered in (125), including an experimental evaluation of these strategies.

2.2.1.3 Contention Management

Contention management (CM) was originally proposed for conflict resolution, to guarantee progress (livelock avoidance) in the obstruction-free STMs DSTM (71), ASTM (96) and SXM (65). CM is an out-of-band mechanism that users can provide as a separate module for a TM system. There has been a considerable amount of research on contention management to determine which of the conflicting transactions should be aborted (51; 53; 119). CM can be tuned to increase throughput dramatically in high-contention workloads (51; 119).

For example, consider a number of policies for resolving a conflict between two transactions: the attacking transaction A and the victim transaction B. The simplest policy is Aggressive (120), in which the attacker immediately aborts the victim. With Greedy (53), transaction A aborts transaction B if B is waiting; otherwise, A waits until B commits, aborts or starts waiting. With Timestamp, if A is not older than B, A waits for a series of fixed intervals. The Karma (120) and Polka (119) contention managers increase a transaction's priority every time the transaction successfully acquires a transactional object. Transaction A makes a number of attempts equal to the difference between the priorities of the two transactions. Karma uses a constant backoff between attempts, whereas Polka uses an exponential random backoff between successive attempts. The Eruption contention manager maintains the number num of transactional objects successfully acquired and sets a transaction's initial priority to zero. When a conflict occurs, if B has a higher priority than A, the contention manager of A adds num to B's priority and then puts A to sleep for an exponential random backoff; otherwise, the contention manager aborts B. Eruption thus increases the priority of a transaction behind which other transactions are waiting.

2.2.1.4 Version Management

Version management refers to the way a TM system controls the versions of memory objects. Most TM systems use a single-version scheme, in which every memory datum has a single version. In contrast, a few systems, e.g., LSA-STM (115) and JVSTM (21), implement multi-version schemes that store several copies of shared data together with associated version information.

2.2.1.5 Correctness Criteria

There are several correctness criteria in the literature addressing the safety of TM systems. These criteria include linearizability, serializability and opacity.

Linearizability

Linearizability (73; 91) is a safety property in which every transaction should appear as if it took place at some single, unique point in time during its lifespan. A recent work has extended the definition to take aborted transactions into account. Generally speaking, linearizability would suffice if only the end result of a transaction counted. However, since every TM transaction is an internal part of an application, the result of every operation performed inside a transaction is important and accessible to a user. Thus, serializability and its derivatives are more appropriate correctness criteria, as recommended by (73).

Serializability

Serializability (111; 121) is one of the most commonly required properties of database transactions. A history H of transactions is serializable if all committed transactions in H issue the same operations and receive the same responses as in some sequential history S (with no concurrency between transactions) that consists only of the transactions committed in H. A strict form of serializability, with real-time order (121), requires the effects of committed transactions to appear in memory as if each transaction executed instantaneously, in some order. Another derivative of serializability is 1-copy serializability (7), in which transactions create multiple (local or shared) versions of shared objects for their use, while hiding this fact from the user.


Opacity

Opacity (50) was recently proposed as a more appropriate correctness criterion. Intuitively, opacity guarantees that the effects of every committed transaction appear as if they happened at some single, indivisible point during the transaction's lifetime. Operations performed by any aborted transaction must never be visible to other transactions (including live ones). Every transaction always observes a consistent state of the system. As expounded in (50), most TM systems, such as DSTM, ASTM, SXM, JVSTM and LSA-STM, ensure opacity.

2.3 Contract Theory

2.3.1 The Notion of Contract

The notion of contract in object-oriented (O-O) programming is adapted from the contracts that are used in human affairs. Often, contracts are written and used between two or more parties when one party (the provider) performs some task for the others (the clients). Contracts impose expectations and obligations on the individual parties. Typically, an obligation of one party is a benefit for the other party, and vice versa. The aim of a contract specification is to record these benefits and obligations. The table below summarizes the terms of such a contract, which specifies the pre-condition and post-condition of some behavior (e.g., a transaction or a method):

            Obligations                   Benefits
Client      Must ensure precondition      May benefit from postcondition
Provider    Must ensure postcondition     May assume precondition

A contract contains the most important information that can be given about the routine: what each party in the contract must guarantee for a correct call, and what each party is entitled to in return. A contract governs the relations between the method implementation and any potential caller. A contract document protects both the client, by specifying how much should be done, and the provider, by stating that the provider is not liable for failing to carry out tasks outside of the specified scope.

2.3.2 Design by Contract

Design by contract (DbC) is about creating a contract between the provider and the client of software modules. The idea of DbC is to associate a specification with every software element. These specifications (or contracts) govern the interaction of the element with the rest of the world. The purpose of contracts is to improve the modularity, composability and reliability of software systems.


DbC takes advantage of the presence of contracts to permit self-documenting software. The specification defines a contract between the client and the provider. When a provider writes a contract with the client, it should document the obligations and benefits. When a client makes a call to the provider, the provider makes sure that the client follows the contract. This can be done by using a precondition within a software routine (the client's obligations to the code it calls) and a postcondition (the provider's obligations to the code that uses it).
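The division of obligations described above can be illustrated with a small Java sketch in which the contract is checked by plain runtime tests. The class ContractAccount and its withdraw method are hypothetical and are not part of AVID or of any DbC library; explicit exceptions are used here instead of a dedicated contract language, purely as an assumption for the example.

```java
// Hypothetical example class: the contract of withdraw is checked at runtime.
class ContractAccount {
    private int balance;

    ContractAccount(int initial) {
        if (initial < 0) {
            throw new IllegalArgumentException("precondition: initial >= 0");
        }
        balance = initial;
    }

    int withdraw(int amount) {
        // Precondition: the client's obligation, which the provider may assume.
        if (amount <= 0 || amount > balance) {
            throw new IllegalArgumentException("precondition: 0 < amount <= balance");
        }
        int old = balance;
        balance -= amount;
        // Postcondition: the provider's obligation, from which the client benefits.
        if (balance != old - amount || balance < 0) {
            throw new IllegalStateException("postcondition violated");
        }
        return balance;
    }

    int balance() {
        return balance;
    }
}
```

A call such as new ContractAccount(100).withdraw(30) satisfies the precondition and leaves a balance of 70, whereas a request for more than the balance is rejected before any state changes.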

2.3.3 DbC Elements

Assertions in the program define constraints that the system must adhere to. These are typically categorised as pre-conditions, post-conditions and invariants. An invariant is both an assumption and a guarantee for some properties. In O-O languages, the concept of invariant is mostly known as the class invariant.

2.3.3.1 Pre-condition/post-condition

Every method has associated documentation about the obligations of the caller (pre-condition) and the guarantees of the method (post-condition). Pre- and post-conditions apply to individual methods. In popular programming languages such as Java, C# and C++, pre- and post-conditions are commonly specified in the documentation section of a method. These conditions can then be checked by the client and the provider.

2.3.3.2 Class Invariant

A class invariant is an assertion describing consistency constraints that hold for all (object) instances of a class, characterizing the properties of the class. This notion of invariant is important for program testing, since the critical properties of a class are specified and verified. These properties are often not just temporary characteristics during an object's evolution, but rather constraints that hold throughout the object's lifecycle. In common programming languages like Java, C# and C++, class invariants are specified in the documentation section of a class.

2.3.3.3 Check Instruction

A check instruction is used wherever an assertion is needed, such as a method's pre-/post-condition or a class invariant. A check instruction contains a boolean expression and can be placed at any position in a method implementation. Below is an example of a Java check instruction, the assert statement.


class Person {
    void setAge(int age) {
        ...
        assert (check ());
    }
    boolean check() {
        return age >10 ∧ age
