A HOLISTIC APPROACH TO SECURITY ACROSS DISTRIBUTED INFORMATION SYSTEMS

DIMITRIOS SISIARIDIS

A thesis submitted in partial fulfilment of the requirements of Northumbria University for the degree of Doctor of Philosophy

August 2008

Re-submission August 2010


ACKNOWLEDGMENTS

This thesis is dedicated to my wife Ioulia Sidera-Sideri and to our children, Antonis and Morfoula, who have been my inspiration and motivation throughout this work. I would like to express my deepest gratitude to my mother, Antonia, my brother Apostolos and my father-in-law Kostantinos, who recently passed away, for their continuous support and encouragement. My family has been tolerant beyond the bounds of expectation and for this I am eternally grateful. I am grateful to my supervisors, Nick Rossiter and Michael Heather, for their intellectual input and encouragement. I would also like to thank the Greek National Institute for Scholarships (IKY) for supporting this thesis during the first two years.

DECLARATION

This work has not been submitted for any other award and the main body of the thesis is all my own work. The published papers in Appendix F show any joint authorship.

Dimitrios Sisiaridis


ABSTRACT

Distributed Information Systems, as well as their underlying networks, are exposed to a growing number and a wider variety of threats and vulnerabilities. A comprehensive analysis of the literature shows that security for distributed systems is not a local feature but has to be treated globally. It is expressed in the form of higher-order activities, usually between complex object types and their hierarchies, where higher-order functions are required. Current security approaches, whether baseline or part of risk management, are characterized by their locality. Usually, they are based on first-order logic. There are also approaches based on process calculi, mainly in the fields of concurrency and auditing control. A holistic approach with natural closure seems necessary for any global description. In this thesis, applied category theory, which is based on higher-order logic, has been used to provide a formal approach to process simply by the use of the arrow. The thesis describes several novel contributions that improve on state-of-the-art security approaches across distributed information systems. The proposed holistic security approach is conceptual and constructive, following a four-level architecture where the interaction of the Cartesian closed comma categories involved is expressed as adjoint functors, natural transformations, mappings in the form of 2-cells, 3-cells as modifications of system behaviour, as well as monad and comonad constructions to represent internal processing. Composition, through the levels of the architecture, is based on Godement calculus. Pullback and pushout diagrams are used to express global intensionality and constrained local extensionalities, in terms of security policies, security services and security mechanisms. The approach provides natural closure, based on security principles on the top level.
In practical terms, the current research can be the basis for the future development and implementation of a graphical software tool, based on mathematical principles and logic, able to visualize and evaluate the proposed applied categorical logic. Such a tool is likely to provide the basis for a future standard way to represent interoperability issues governed by a global security framework in distributed information systems, expressed in the four-level architecture.


TABLE OF CONTENTS

1 INTRODUCTION
1.1 Higher-order logic for global description
1.2 The need for a global, holistic security approach
1.3 Current security approaches
1.4 A complete security strategy with natural closure
1.5 Research hypothesis - applied category theory as the way forward for global security
1.6 UML notation used in the research
1.7 Thesis contributions
1.8 Roadmap to the thesis

2 DISTRIBUTED INFORMATION SYSTEMS
2.1 Definition - Hardware and Software issues
2.2 Communication issues
2.3 DIS requirements
2.4 The issue of time in (asynchronous) distributed systems – event ordering
2.4.1 Processes in a distributed system
2.4.2 Lamport's logical time
2.4.3 Logical Clocks
2.4.4 Totally ordered logical clocks
2.4.5 Vector clocks
2.4.6 Global states and consistent cuts
2.5 The object-oriented and service-oriented paradigm
2.6 Summary

3 SECURITY ISSUES ACROSS DISTRIBUTED INFORMATION SYSTEMS
3.1 Information Systems Security
3.1.1 Threats and vulnerabilities
3.1.2 Risk management
3.1.3 Security attacks
3.1.4 DoS attacks
3.1.5 Network security architectures
3.1.5.1 The ISO/OSI X.800 Security framework
3.1.6 Security policies
3.1.7 Security assessment and evaluation
3.1.8 Information security approaches
3.1.8.1 Bottom-up approaches
3.1.8.2 Top-down security approaches
3.1.8.3 The ISO/IEC 17799 standard
3.1.9 Naming issues in DIS
3.1.10 Security awareness
3.1.11 Integrating top-down and bottom-up approaches – the need for a holistic approach
3.1.12 Discussion on Information Systems security
3.2 Cryptography issues
3.2.1 Shared and public-key cryptosystems
3.2.2 Digital signatures
3.2.3 Authentication
3.2.3.1 Biometrics
3.2.3.2 Steganography, smart cards, and other authentication schemes
3.2.4 Secure channels
3.2.5 Key management
3.2.5.1 Key distribution
3.2.5.2 Key recovery and key escrow
3.2.6 Discussion on cryptography
3.3 Access control
3.3.1 Access rights
3.3.2 Access control classification
3.3.2.1 Hierarchical access control
3.3.2.2 Multilevel security
3.3.2.3 Multilateral security
3.3.2.4 Access Control Lists and Capabilities
3.3.2.5 Security policy domains
3.3.2.6 Role-based Access control
3.3.3 Multi-policy systems
3.3.3.1 The 'sandbox'
3.3.4 Firewalls
3.3.5 Virtual Private Networks (VPNs)
3.3.6 Discussion on access control
3.4 High-level security services
3.4.1 Intrusion Detection
3.4.2 Vulnerability analysis
3.4.3 Auditing and tracing techniques
3.4.4 Antivirus control
3.4.5 Hardware-based security
3.4.6 Discussion on high-level security services
3.5 Fault tolerance
3.5.1 Fault tolerance and dependability
3.5.2 Redundancy
3.5.3 Process groups
3.5.4 The Byzantine failure model
3.5.5 Business continuity and recovery from disasters
3.5.6 Intrusion detection and fault-tolerance
3.5.7 Failure analysis
3.5.8 Discussion on fault tolerance
3.6 Databases and database security
3.6.1 Information integration
3.6.2 Distributed databases
3.6.3 Database security
3.6.3.1 Database encryption
3.6.3.2 Integrity control
3.6.3.3 Transactions and concurrency control
3.6.3.4 Views
3.6.4 Discussion on database security
3.7 Distributed computing and security
3.7.1 Distributed computation
3.7.2 Workflow systems
3.7.3 Web and Semantic Web
3.7.3.1 Web overview
3.7.3.2 Semantic Web overview
3.7.3.3 Web security
3.7.3.4 E-commerce security
3.7.3.5 e-Trust
3.7.3.6 Privacy and identity management
3.7.3.7 XML security
3.7.4 The Grid infrastructure
3.7.4.1 Grid overview
3.7.4.2 Grid security
3.8 Summary

4 CATEGORY THEORY OVERVIEW AND ANALYSIS
4.1 Introduction to category theory
4.2 Categories and functors
4.3 Commutative diagrams
4.4 Natural transformations
4.5 Adjointness
4.6 Duality, contravariance and opposites
4.7 Universal arrows and universal constructions
4.8 Limits and colimits
4.9 Products and coproducts
4.10 Pullbacks
4.11 Exponentiation
4.12 Cartesian closed categories
4.13 Toposes or Topoi
4.14 Product of categories
4.15 Product Functors
4.16 Bifunctors or functors of two variables
4.17 Subcategories
4.18 Monads and comonads
4.19 Comma categories
4.20 2-categories
4.20.1 Adjointness defined in 2-categories
4.20.2 Natural transformations between 2-categories
4.20.3 Modifications (3-cells) in enriched 2-categories
4.20.4 n-categories
4.20.5 Pullback functor
4.20.6 Multicategories and operads
4.21 Examples of categories
4.22 Summary

5 CATEGORICAL CONSTRUCTIONS AND VISUALIZATIONS
5.1 Rationale
5.2 The Godement Calculus
5.2.1 The Cube and the Lattice of Cubes
5.2.2 Products/coproducts in pullback diagrams
5.3 Basic adjunctions in applied category theory
5.3.1 The product × as right adjoint of the diagonal functor Δ (Δ ┤ ×)
5.3.2 The coproduct + as left adjoint to diagonal functor Δ (+ ┤ Δ)
5.4 Type of categories and universal constructions in the proposed approach
5.4.1 An example of a product category
5.4.2 Examples of product functors
5.4.3 An example of a bifunctor
5.4.4 Monads/comonads analysis
5.4.5 Monads in partial orders
5.4.6 Examples of comma categories
5.4.7 Adjointness in terms of comma categories
5.4.8 A comma category as a pullback
5.4.9 The difference between a natural transformation and a comma category
5.4.10 Natural Transformations between simple categories
5.4.11 Adjointness defined in 2-categories
5.4.12 Natural transformations between 2-categories
5.4.13 Modifications (3-cells) in enriched 2-categories
5.4.14 Adjointness in terms of comma categories
5.5 Summary

6 THE FOUR-LEVEL ARCHITECTURE
6.1 Introduction to the architecture
6.2 Interoperability issues in the 4-level architecture
6.2.1 Naturality
6.2.2 Semantic & organizational interoperability in the 4-level architecture
6.2.3 Achieving ultimate closure at the top-level
6.2.4 Semantic & organizational interoperability using the lattice of cubes
6.2.5 More security examples based on interoperability
6.3 Adjointness in the 4-level architecture in terms of 2-categories
6.4 Natural transformations in the 4-level architecture
6.5 Comma categories in the four-level architecture – top level analysis
6.6 Security in distributed computations
6.7 Monads and comonads in the 4-level architecture
6.7.1 Evaluation and comparison of two policy frameworks using the cube and the monads/comonads
6.7.2 Maintaining database consistency
6.8 Identify risks based on threats and vulnerabilities
6.9 Balancing the cost of security measures against their effectiveness
6.10 Integrating two intrusion detection methods – auditing and logging procedures
6.11 Inter- and intra-relationships in the architecture
6.12 Functional dependencies in system components of the architecture – LCCC and 3NF
6.13 Summary

7 DISCUSSION OF THE RESULTS AND FUTURE WORK
7.1 The four-level architecture development stages in summary

APPENDIX A - PKI INFRASTRUCTURE
APPENDIX B – RBAC CONCEPTS
Separation of Duty handling in TRBAC models
APPENDIX C - CUSTOMER-ORIENTED INFORMATION INTEGRATION
APPENDIX D - WORKFLOW SYSTEMS
APPENDIX E - CORBA SECURITY
APPENDIX F – PUBLISHED WORK


TABLE OF FIGURES

FIGURE 1-1: THE PROPOSED COMPLETE SECURITY STRATEGY – A HOLISTIC APPROACH
FIGURE 3-1: DISTRIBUTED INFORMATION SYSTEMS ATTACKS CLASSIFICATION
FIGURE 3-2: DDOS ATTACKS CLASSIFICATION
FIGURE 3-3: X.800 OSI SECURITY MECHANISM FOR AVAILABLE SECURITY SERVICES
FIGURE 3-4: SECURITY IN DISTRIBUTED INFORMATION SYSTEMS
FIGURE 3-5: SYMMETRIC RBAC WITH CONSTRAINTS, BASED ON RBAC 96
FIGURE 3-6: ORBAC BASED ON RBAC96
FIGURE 3-7: THE TRBAC MODEL
FIGURE 3-8: ACCESS CONTROL COMPONENTS CLASSIFICATION FOR DISTRIBUTED ENVIRONMENTS
FIGURE 3-9: A UML DIAGRAM SHOWING A CLASSIFICATION OF THE METHODS DEALING WITH FAILURES
FIGURE 3-10: SEMANTIC INTEGRITY CONTROL
FIGURE 3-11: MECHANISMS FOR MAINTAINING DATABASE CONSISTENCY
FIGURE 4-1: EQUALITY OF PATHS IN A COMMUTATIVE DIAGRAM
FIGURE 4-2: COMPONENTS OF A NATURAL TRANSFORMATION
FIGURE 4-3: THE UNIT OF THE ADJUNCTION
FIGURE 4-4: THE CO-UNIT OF THE ADJUNCTION
FIGURE 4-5: ADJOINT FUNCTORS F AND G
FIGURE 4-6: A CATEGORY C AND ITS DUAL CATEGORY C^op
FIGURE 4-7: A COVARIANT FUNCTOR F : C → D
FIGURE 4-8: A COVARIANT FUNCTOR F : C^op → D
FIGURE 4-9: A CONTRAVARIANT FUNCTOR F : C → D
FIGURE 4-10: CONES { f_i : a → d_i } FOR A DIAGRAM D
FIGURE 4-11: A LIMIT d_i FOR A DIAGRAM D
FIGURE 4-12: THE PRODUCT OF OBJECTS a AND b
FIGURE 4-13: THE CO-PRODUCT OF OBJECTS a AND b
FIGURE 4-14: A PULLBACK DIAGRAM FOR TWO C-OBJECTS
FIGURE 4-15: EXPONENTIATION
FIGURE 4-16: A SUB-OBJECT CLASSIFIER Ω IN A CATEGORY C, AS A PULLBACK DIAGRAM
FIGURE 4-17: THE PRODUCT CATEGORY B × C
FIGURE 4-18: THE PRODUCT FUNCTOR U × V : B × C → B × C
FIGURE 4-19: THE COMPOSITE PRODUCT FUNCTOR (U′ × V′)(U × V) = U′U × V′V
FIGURE 4-20: A BIFUNCTOR F : B × C → D
FIGURE 4-21: THE BIFUNCTOR IN TERMS OF FUNCTORS LC AND MB
FIGURE 4-22: ASSOCIATIVE LAW FOR MONAD T
FIGURE 4-23: LEFT & RIGHT UNIT LAWS FOR MONAD T
FIGURE 4-24: ASSOCIATIVE LAW FOR T = GF
FIGURE 4-25: INTERCHANGE LAW FOR T = GF
FIGURE 4-26: LEFT & RIGHT UNIT LAWS FOR MONAD T = GF
FIGURE 4-27: ASSOCIATIVE LAW FOR COMONAD L
FIGURE 4-28: LEFT & RIGHT UNIT LAWS FOR COMONAD L
FIGURE 4-29: LEFT & RIGHT UNIT LAWS FOR COMONAD L = FG
FIGURE 4-30: OBJECTS UNDER B (b ↓ C)
FIGURE 4-31: OBJECTS OVER A (C ↓ a)
FIGURE 4-32: OBJECTS S-UNDER B (b ↓ S)
FIGURE 4-33: OBJECTS T-OVER A (T ↓ a)
FIGURE 4-34: THE COMMA CATEGORY (T ↓ S)
FIGURE 4-35: HORIZONTAL COMPOSITION
FIGURE 4-36: IDENTITY 2-CELL 1_b
FIGURE 4-37: VERTICAL IDENTITY 1_f
FIGURE 4-38: HORIZONTAL COMPOSITION FOR VERTICAL IDENTITIES
FIGURE 4-39: HORIZONTAL AND VERTICAL COMPOSITION BETWEEN 2-CELLS


FIGURE 4-40: HORIZONTAL COMPOSITION OF A 2-CELL WITH A 1-CELL
FIGURE 4-41: A 2-NATURAL TRANSFORMATION BETWEEN 2-FUNCTORS F, G FOR TWO 2-CATEGORIES T AND U
FIGURE 4-42: A MODIFICATION BETWEEN TWO 2-NATURAL TRANSFORMATIONS FOR TWO 2-CATEGORIES T AND U
FIGURE 5-1: GODEMENT NATURAL TRANSFORMATIONS FOR FIVE CATEGORIES, EIGHT FUNCTORS AND FOUR NATURAL TRANSFORMATIONS
FIGURE 5-2: THE GODEMENT'S RULES FOR FIGURE 5-1
FIGURE 5-3: A CUBE VISUALIZING NATURAL TRANSFORMATIONS BETWEEN TWO CATEGORIES AND FOUR FUNCTORS
FIGURE 5-4: REPRESENTING OBJECTS AS FUNCTORS
FIGURE 5-5: NATURAL TRANSFORMATION BETWEEN TWO CATEGORIES AND SIX FUNCTORS
FIGURE 5-6: A LATTICE OF CUBES VISUALIZING NATURAL TRANSFORMATION FOR TWO CATEGORIES AND SIX FUNCTORS
FIGURE 5-7: GODEMENT CALCULUS FOR 5 CATEGORIES, 8 FUNCTORS AND 4 NATURAL TRANSFORMATIONS USING THE LATTICE OF CUBES
FIGURE 5-8: THE PRODUCT/COPRODUCT OF OBJECTS a AND b, SHOWING ALL THE COMMUTING TRIANGLES
FIGURE 5-9: THE PRODUCT/COPRODUCT OF OBJECTS a AND b AS AN ABSTRACTION
FIGURE 5-10: THE DIAGONAL ARROW δ_c = ⟨1_c, 1_c⟩
FIGURE 5-11: THE DIAGONAL ARROW δ_c IN THE CONTEXT OF CATEGORY C
FIGURE 5-12: THE PRODUCT a × b
FIGURE 5-13: THE UNIT η_c OF THE ADJUNCTION
FIGURE 5-14: THE COUNIT ε_⟨a,b⟩ OF THE ADJUNCTION
FIGURE 5-15: THE INTEGRATED VIEW OF THE ADJUNCTION Δ ┤ ×
FIGURE 5-16: THE UNIT η_⟨a,b⟩ OF THE ADJUNCTION ⟨a, b⟩ → ⟨a + b, a + b⟩
FIGURE 5-17: THE COPRODUCT a + b OF TWO OBJECTS a AND b
FIGURE 5-18: THE FOLDING MAP [1_c, 1_c] FOR THE COUNIT OF THE ADJUNCTION ε_c : c + c → c
FIGURE 5-19: THE COUNIT ε_c OF THE ADJUNCTION
FIGURE 5-20: THE INTEGRATED VIEW OF THE ADJUNCTION + ┤ Δ
FIGURE 5-21: THE UNIVERSAL CONSTRUCTIONS LIMITS, PRODUCTS AND PULLBACKS AND THEIR ASSOCIATION
FIGURE 5-22: THE CATEGORIES USED IN THE CURRENT APPROACH
FIGURE 5-23: THE PRODUCT CATEGORY C × C
FIGURE 5-24: THE PRODUCT FUNCTOR U × V : C × C → C × C
FIGURE 5-25: THE PRODUCT FUNCTOR U′U × V′V : C × C → C × C → C × C
FIGURE 5-26: THE PRODUCT FUNCTOR U × V IN TERMS OF OBJECTS AND ARROWS
FIGURE 5-27: THE BIFUNCTOR U × V : C × C → C × C IN TERMS OF ARROWS
FIGURE 5-28: THE BIFUNCTOR U × V : C × C → C × C IN TERMS OF OBJECTS
FIGURE 5-29: THE UNIT AND COUNIT OF THE ADJUNCTION ON THE 1ST CYCLE OF A MONAD
FIGURE 5-30: THE UNIT AND COUNIT OF THE ADJUNCTION ON THE 2ND CYCLE OF A MONAD
FIGURE 5-31: THE UNIT AND COUNIT OF THE ADJUNCTION ON THE 3RD CYCLE OF A MONAD
FIGURE 5-32: THE THREE CYCLES OF A MONAD, COMBINED
FIGURE 5-33: THE THREE CYCLES OF A MONAD, COMBINED, ONLY THE ARROWS
FIGURE 5-34: ADJOINTNESS IN TERMS OF ISOMORPHIC COMMA CATEGORIES, WITH THE EQUIVALENT ELEMENTS PROJECTED ON THE PRODUCT CATEGORY
FIGURE 5-35: ADJOINTNESS IN TERMS OF COMMA CATEGORIES, SHOWING THE UNIT η_c AND THE COUNIT ε_d OF THE ADJUNCTION
FIGURE 5-36: CONNECTION BETWEEN THE PRODUCT FUNCTOR AND THE BIFUNCTOR IN A PULLBACK WITH COMMA CATEGORIES
FIGURE 5-37: USING COMMA CATEGORIES, PRODUCT FUNCTORS AND BIFUNCTORS IN A PULLBACK DIAGRAM
FIGURE 5-38: NATURAL TRANSFORMATIONS IN A COMMA CATEGORY


FIGURE 5-39: THE CORRESPONDENCE BETWEEN A FUNCTOR CATEGORY AND A VERTICAL CATEGORY
FIGURE 5-40: ANALYSIS OF THE ADJUNCTION F ┤ G, BASED ON THE UNIT η AND COUNIT ε
FIGURE 5-41: 2-CELLS 1_F (A) AND 1_G (B) OF THE ADJUNCTION F ┤ G, BASED ON THE UNIT η AND COUNIT ε
FIGURE 5-42: ADJOINTNESS Δ ┤ × FOR CATEGORIES C, C × C
FIGURE 5-43: ADJOINTNESS + ┤ Δ FOR CATEGORIES C × C, C
FIGURE 5-44: COMMA CATEGORIES FOR ADJOINT FUNCTORS + ┤ Δ ┤ × BETWEEN CATEGORIES C, C × C
FIGURE 5-45: A 2-NATURAL TRANSFORMATION BETWEEN 2-FUNCTORS F, G FOR TWO 2-CATEGORIES T AND U
FIGURE 5-46: A MODIFICATION BETWEEN TWO 2-NATURAL TRANSFORMATIONS FOR TWO 2-CATEGORIES T AND U
FIGURE 5-47: THE 3 CYCLES OF A MONAD/COMONAD CONSTRUCTION IN TERMS OF THE CORRESPONDING COMMA CATEGORIES (C ↓ G) AND (F ↓ D) FOR AN ADJUNCTION ⟨F, G, η, ε⟩ BETWEEN THE UNDERLYING CATEGORIES C AND D
FIGURE 5-48: THE COMMA CATEGORIES (C ↓ G) AND (C ↓ G′) IN THE UNDERLYING CATEGORY C, IN CASE OF HAVING TWO ADJUNCTIONS ⟨F, G, η, ε⟩, ⟨F′, G′, η′, ε′⟩
FIGURE 6-1: NATURAL COMPOSITION OF ADJOINT FUNCTORS
FIGURE 6-2: FOUR LEVELS DEFINED WITH COVARIANT FUNCTORS AND INTENSION-EXTENSION PAIRS
FIGURE 6-3: ADJOINTNESS BETWEEN TWO SYSTEMS
FIGURE 6-4: ADJOINTNESS BETWEEN CATEGORIES SCH AND DAT
FIGURE 6-5: SEMANTIC INTEROPERABILITY
FIGURE 6-6: ORGANIZATIONAL INTEROPERABILITY
FIGURE 6-7: SEMANTIC INTEROPERABILITY IN THE 4-LEVEL ARCHITECTURE USING 2-CATEGORIES
FIGURE 6-8: ORGANIZATIONAL INTEROPERABILITY IN THE 4-LEVEL ARCHITECTURE USING 2-CATEGORIES
FIGURE 6-9: SEMANTIC AND ORGANIZATIONAL INTEROPERABILITY, IN THE 4-LEVEL ARCHITECTURE, IN TERMS OF VERTICAL CATEGORIES AND FUNCTOR CATEGORIES – THE FUNCTOR CATEGORY CPT^DAT PROVIDES THE CLOSURE IN THE TOP LEVEL
FIGURE 6-10: SEMANTIC INTEROPERABILITY (RELATIONAL, OBJECT-RELATIONAL AND OBJECT-ORIENTED PARADIGM) IN THE 4-LEVEL ARCHITECTURE, BASED ON GODEMENT CALCULUS, USING THE LATTICE OF CUBES
FIGURE 6-11: ORGANIZATIONAL INTEROPERABILITY (RELATIONAL, OBJECT-RELATIONAL AND OBJECT-ORIENTED PARADIGM) FOR THE 4-LEVEL ARCHITECTURE, BASED ON GODEMENT CALCULUS, USING THE LATTICE OF CUBES
FIGURE 6-12: ADJOINTNESS BETWEEN THE LEVEL-PAIR (CPT, CST), IN TERMS OF COMMA CATEGORIES
FIGURE 6-13: A MODIFICATION FOR AN ARROW f IN CPT
FIGURE 6-14: THE FOUR-LEVEL ARCHITECTURE IN TERMS OF COMMA CATEGORIES – TOP/DOWN VIEW
FIGURE 6-15: THE FOUR-LEVEL ARCHITECTURE IN TERMS OF COMMA CATEGORIES – BOTTOM/UP VIEW
FIGURE 6-16: Objects-P under CPT
FIGURE 6-17: Objects-P over CST
FIGURE 6-18: COMMA CATEGORY (O ↓ I) DERIVED FROM COMMA CATEGORIES (O ↓ SCH) AND (SCH ↓ I)
FIGURE 6-19: THE COMMA CATEGORY (O ↓ I) DERIVED FROM THE PRODUCT CATEGORY CST × DAT, WHICH IN TURN IS DERIVED UNDER HORIZONTAL COMPOSITION (*) OF PRODUCT CATEGORIES CST × SCH AND SCH × DAT
FIGURE 6-20: A COMMUNICATION CHANNEL BETWEEN PROCESSES P AND Q
FIGURE 6-21: EXCHANGE OF MESSAGES BETWEEN CLIENT AND SERVER PROCESSES
FIGURE 6-22: INTERNAL EVENTS AND EVENTS ON A MESSAGE EXCHANGE
FIGURE 6-23: POTENTIAL RELATIONSHIPS BETWEEN THE EVENTS OF TWO PROCESSES IN THE CASE OF A MESSAGE EXCHANGE
FIGURE 6-24: A MODIFICATION THAT DETERMINES IF TWO EVENTS ARE PARALLEL OR NOT
FIGURE 6-25: CORRESPONDENCE BETWEEN SECURITY POLICIES AND SECURITY REQUIREMENTS


FIGURE 6-26: HIGHER-ORDER LOGIC IN THE CASE OF ACCESS CONTROL
FIGURE 6-27: ADJOINTNESS (EVALUATION) AND COMPARISON OF TWO APPLIED POLICY FRAMEWORKS BASED ON THE SAME SECURITY PRINCIPLES (CONFIDENTIALITY, INTEGRITY, AVAILABILITY)
FIGURE 6-28: ASSOCIATIVE LAW FOR T = P^op P
FIGURE 6-29: INTERCHANGE LAW FOR T = P^op P
FIGURE 6-30: LEFT & RIGHT UNIT LAWS FOR T = P^op P
FIGURE 6-31: LEFT & RIGHT UNIT LAWS FOR L = PP
FIGURE 6-32: THE UNIT AND COUNIT OF THE ADJUNCTION ON THE 1ST CYCLE OF A MONAD
FIGURE 6-33: THE UNIT AND COUNIT OF THE ADJUNCTION ON THE 2ND CYCLE OF A MONAD
FIGURE 6-34: THE UNIT AND COUNIT OF THE ADJUNCTION ON THE 3RD CYCLE OF A MONAD
FIGURE 6-35: THE INTEGRATED DIAGRAMS FOR THE MONAD/COMONAD CONSTRUCTION – THE 3 CYCLES, IN THE CASE OF HAVING TWO ADJUNCTIONS BETWEEN CATEGORIES CPT AND CST
FIGURE 6-36: THE THREE CYCLES OF THE MONAD/COMONAD CONSTRUCTION FOR MAINTAINING DATABASE CONSISTENCY
FIGURE 6-37: THE ACID PROPERTIES OF A TRANSACTION EXPRESSED AS HORIZONTAL COMPOSITION OF 2-CELLS, IN TERMS OF RELIABILITY (ATOMICITY & DURABILITY) AND CONCURRENCY CONTROL (CONSISTENCY & ISOLATION)
FIGURE 6-38: DATABASE CONSISTENCY IN TERMS OF ITS COMPONENTS, SEMANTIC INTEGRITY CONTROL, PROTECTION CONTROL AND ACID PROPERTIES (FOR RELIABILITY AND CONCURRENCY CONTROL)
FIGURE 6-39: FUNCTOR CATEGORIES AND THE CORRESPONDENT VERTICAL CATEGORIES IN DATABASE CONSISTENCY
FIGURE 6-40: ILLUSTRATING ATTACKS ON A SPECIFIED TARGET FROM AN IDENTIFIED RISK FROM AN ATTACKER, BASED ON EXPONENTIATION
FIGURE 6-41: THREAT ASSESSMENT, RISK EVALUATION AND CONTROL AND EFFECTIVENESS OF APPLIED SECURITY MEASURES
FIGURE 6-42: A RISK IDENTIFIED AS A RESULT OF THE INTERACTION OF THREATS AND VULNERABILITIES
FIGURE 6-43: HOW TO KEEP THE BALANCE BETWEEN SECURITY CONTROLS (THE EFFECTIVENESS OF THE APPLIED SECURITY MEASURES) AND COSTS (IN TERMS OF COMPUTATIONAL EFFORT AND NETWORK USAGE)
FIGURE 6-44: SYSTEM ACTIVITIES AND USER PROFILES (GLOBAL EXTENSIONALITY) CONSTRAINED TO INTRUSIONS AND NORMAL USER ACTIVITIES (LOCAL EXTENSIONALITY)
FIGURE 6-45: THE UNIT AND COUNIT OF THE ADJUNCTION BETWEEN INTRUSIONS AND NORMAL USER ACTIVITIES EXPRESS THE DEVIATION OF THE NORMAL USER BEHAVIOUR AND A NEW INTRUSION TECHNIQUE, RESPECTIVELY
FIGURE 6-47: THE UNIT AND COUNIT OF THE ADJUNCTION BETWEEN LEVELS DAT AND SCH (I)
FIGURE 6-48: THE UNIT AND COUNIT OF THE ADJUNCTION BETWEEN LEVELS DAT AND SCH (II)
FIGURE 6-49: ASSOCIATION BETWEEN OBJECTS OF THE FORM value, name AND type
FIGURE 6-50: MULTIPLE INHERITANCE IN TERMS OF OBJECTS value AND name
FIGURE 6-51: THE UNIT AND COUNIT OF THE ADJUNCTION BETWEEN LEVELS DAT AND SCH (III)
FIGURE 6-52: A COMMA CATEGORY AS A PULLBACK DIAGRAM – ALL THE FUNCTORS
FIGURE 6-53: ADJOINT FUNCTORS IN A PULLBACK DIAGRAM IN LEVEL DAT
FIGURE 6-54: ISOMORPHIC CATEGORIES BETWEEN VALUES AND VN IN LEVEL DAT
FIGURE 6-55: ISOMORPHIC CATEGORIES NAMES AND VN IN LEVEL DAT
FIGURE 6-56: OBJECTS OF VALUES VN PROJECTED IN THE ISOMORPHIC COMMA CATEGORIES
FIGURE 6-57: A BIFUNCTOR IN THE PULLBACK DIAGRAM OF A COMMA CATEGORY BETWEEN VALUES AND NAMES
FIGURE 6-58: REPRESENTING OBJECTS IN THE ISOMORPHIC COMMA CATEGORIES
FIGURE 6-59: THE COMMA CATEGORY IN THE CASE OF VISUALIZING A COMMUNICATION CHANNEL BETWEEN TWO PROCESSES
FIGURE 6-60: A RELATION IN 3NF - THE GENERAL CASE - THE PRODUCT a ×_e b AS A PULLBACK
FIGURE 6-61: A COMPOSITE KEY a × b × c × d FOR A RELATION R
FIGURE 6-62: A RELATION R IN 3NF
FIGURE 6-63: A RELATION R IN 2NF
FIGURE 6-64: THE RELATION R IN 3NF
210 FIGURE 7-1:DEVELOPMENT STAGES IN THE THESIS ............................................................................... 229


LIST OF TABLES

TABLE 2-1: DIFFERENT ASPECTS OF DISTRIBUTION TRANSPARENCY
TABLE 3-1: THE ADVANTAGES AND DISADVANTAGES OF CURRENT SECURITY APPROACHES
TABLE 3-2: PASSWORD-BASED AUTHENTICATION
TABLE 3-3: ADVANTAGES AND DISADVANTAGES OF SMART CARDS
TABLE 3-4: COMPARISON OF DIFFERENT SHARED-KEY DISTRIBUTION SERVICES
TABLE 3-5: ADVANTAGES AND DISADVANTAGES OF PKI INFRASTRUCTURE
TABLE 3-6: COMPARISON OF HIGH-LEVEL SECURITY SERVICES
TABLE 3-7: DEPENDABILITY IN DIS
TABLE 3-9: TYPES OF INFORMATION INTEGRATION
TABLE 3-9: THE KEY METHODOLOGIES FOR ESTABLISHING TRUST


1 Introduction

Today's organizational forms are location- and structure-independent, characterized by cooperation. Perimeterless computer networks connect hosts and user terminals into a distributed computing environment, which provides the advantages of increased reliability, shared information and shared computing power. Distributed information systems (DIS), as well as their underlying networks, are exposed to a growing number and a wider variety of threats and vulnerabilities. Technical and social solutions should be implemented to reduce the pervasive computer abuse problems. Organizations usually respond to security threats on a piecemeal basis, deploying anti-virus, anti-spam and anti-intrusion software, each of which needs to be updated and redeployed if it is to remain effective; such measures usually leave gaps and generate inconsistencies, which could be exploited by intruders. A holistic security approach with natural closure is necessary for any global description.

1.1 Higher-order logic for global description

A highly desirable feature required for distributed systems is exactness, which is related to the notion of certainty. The latter, according to Kurt Gödel, the famous logician of the 20th century, has components such as completeness and decidability. Gödel's 1929 doctoral thesis [1929] established that first-order predicate logic is complete: validity in first-order predicate calculus is equivalent to provability in a specific system of axioms and rules of inference. The completeness theorem [Gödel, 1930] was fundamental for the subject of model theory, proving any generality of the produced models; every model based on axioms, and therefore expressed in first-order predicate calculus, is complete and its validity can be proved based on experimental verification. But the use of models, which is the common approach in theoretical computer science, suffers from Gödel's uncertainty. For higher-order and open systems, experimental verification only holds locally without any guarantee of wider validity. Gödel's two incompleteness theorems [1931] [1934] prove that any formal system (e.g. a model) expressed in first- or second-order calculus includes undecidable propositions, as not every statement in the language of number theory can be either proved or disproved.

The proposed security architecture uses higher-order logic in the context of applied category theory. Indeed, certain Cartesian closed categories, the topoi (or toposes), have been proposed as a general setting for mathematics, instead of a traditional set theory, which is based on numbers [Lawvere and Rosebrugh, 2003], [Bell, 2005], [Goldblatt, 1984], [Johnstone, 2002], [Barr and Wells, 1985]. Categories in the architecture are Cartesian closed comma categories which express all the complex structure and security activities in a distributed system, including data repositories and processes that exchange messages via communication channels.

1.2 The need for a global, holistic security approach

A comprehensive analysis of the literature shows that security for DIS is not a local feature but has to be treated globally. Information security threats are global in nature, usually automated and loose on the Internet. Current security approaches, such as baseline approaches or risk management, are characterized by their locality. Security for modern, complex and usually heterogeneous distributed information systems is based on higher-order activities, usually between complex object types and hierarchies of them (e.g. high-level security services such as firewalls or network monitoring, threat assessment and risk management, access control policies in multi-policy environments etc.); such activities can be expressed using higher-order functions. Furthermore, security is related to issues such as data integrity and interoperability among complex heterogeneous systems. Shared data is the key concept in DIS, providing the basis for integration and interoperability. Interoperability itself is a global requirement. In the context of information systems, it is concerned with the inter-communication of data at different, and therefore usually heterogeneous, localities. The plethora of approaches to interoperability suggests that none has had universal success. They are characterized by their locality. Most of them work well for general closed systems where the logic is that of a closed Boolean world. In some cases, their mathematical basis is set theory. In other cases, process calculi provide a high-level tool for describing concurrency issues in distributed computations.

1.3 Current security approaches

Bottom-up approaches (e.g. risk analysis) are subjective; these are more suited to high-level security risks. On the other hand, top-down approaches (e.g. baseline approaches), such as the ISO/IEC 27001:2005 specification [2005] and the ISO/IEC 17799:2005 Code of Practice [2005], offer an alternative to conventional risk methods as they represent the minimally acceptable security countermeasures that an organization should have implemented. Simplicity and low cost are some of their benefits. In addition, no training is required to use the method. However, they leave the choice of control to the user; they are most appropriate for low-level security risks.

1.4 A complete security strategy with natural closure

In the current work, as will be clear from later analysis, it is felt that a complete security strategy needs to be layered to deal with high-level aspects such as continuity strategies (threat assessment, risk evaluation & control), security policies, incident response plan, host-based & network-based perimeter and/or perimeterless detection, auditing procedures, fault tolerance and recovery strategies, anti-malware control (intrusion detection, router and firewall security, anti-virus control) as well as legal and regulatory compliance (see the UML diagram in Figure 1-1; the UML notation used is explained in §1.6). A holistic approach with natural closure is necessary for any global description. It embraces all these aspects of security, including systems architecture, policies, procedures and user education, providing natural closure with a very high degree of certainty based on the CIA security principles (i.e. confidentiality, integrity and availability) [FIPS, 2003]. It focuses on securing the infrastructure itself by forcing users to adopt best security practices while ensuring that the network is "secure by design" rather than by post-rational customization. Thus, security considerations can be included as core processes of the DIS itself. The proposed holistic security approach is based on category theory, which provides a formal approach to process simply by the use of the arrow. It is inherently holistic, with intrinsic natural closure.

Figure 1-1: The proposed complete security strategy – a holistic approach

[UML diagram: the Complete Security Strategy (holistic approach) comprises the Continuity Strategy (threat assessment; risk evaluation & control), security policy, incident response plan, host-based & network-based perimeter and/or perimeterless detection, auditing procedures, fault tolerance & recovery strategy, anti-malware control (intrusion detection; router & firewall security; anti-virus control), legal & regulatory compliance, and security awareness & user education.]

1.5 Research hypothesis - applied category theory as the way forward for global security

Category theory brings together algebra, geometry and topology. It appears to be naturally suited to handling interoperability. Fundamental category theory shows that for physical existence the real world operates as Cartesian closed categories, as was described by Church [1932] [1933] [1936] [1941] and others [Lambek, 1980] [Seldin and Hindley, 1980] [Barendregt, 1984] [Hindley and Seldin, 1986]. Applied category theory based on Cartesian closed categories for process is natural. There is a strong connection between Cartesian closed categories and typed λ-calculus, an abstract programming language proposed by Church. Gödel accepted Church's thesis in the version given by Turing [1937], in the form of the Turing computable functions, as well as in Church's versions using λ-definability and general recursiveness. Security entities and distributed activities in a DIS (e.g. distributed transactions) are expressed as Cartesian closed categories and adjoint functors between them, following a four-level modular approach. It has been shown [Rossiter et al., 2006] that any realizable system can be conceptually expressed using four interchangeable levels in categorical terms, where local extensionalities (e.g. local security policies) are interconnected one with another through global intensionality (e.g. a global security policy / meta-policy framework) by integrating comma categories (e.g. local policy security domains, each one corresponding to a specific security policy). Composed adjunctions, in the form of 2-cells, appear particularly well-suited for modelling interoperability, with composition of distinct functors for mapping across a number of levels (from data values to data abstractions such as aggregation and inheritance, and vice versa) and of endofunctors in the form of monad and comonad constructions for organizational (business) process interoperability.
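The connection between Cartesian closed categories and typed λ-calculus mentioned above rests on exponentiation: maps out of a product A × B into C correspond one-to-one with maps from A into the exponential C^B. The following is a minimal sketch of that correspondence in the category of finite sets, written as currying and uncurrying of ordinary Python functions. The access-decision function `permits` and the role and action names are hypothetical and serve only as an illustration; they are not taken from the thesis.

```python
from itertools import product


def curry(f):
    """Turn a map A x B -> C into a map A -> (B -> C)."""
    return lambda a: lambda b: f(a, b)


def uncurry(g):
    """Turn a map A -> (B -> C) back into a map A x B -> C."""
    return lambda a, b: g(a)(b)


# Hypothetical access-decision function on the product ROLES x ACTIONS.
ALLOWED = {("admin", "read"), ("admin", "write"), ("guest", "read")}


def permits(role, action):
    return (role, action) in ALLOWED


ROLES = ["admin", "guest"]
ACTIONS = ["read", "write"]

curried = curry(permits)        # role -> (action -> bool)
round_trip = uncurry(curried)   # back to (role, action) -> bool

# The two forms agree on every element of the product, which is the
# bijection Hom(A x B, C) ~ Hom(A, C^B) checked pointwise.
agree = all(
    permits(r, a) == curried(r)(a) == round_trip(r, a)
    for r, a in product(ROLES, ACTIONS)
)
```

Here `curried("guest")` is itself a function of the action alone, which is the sense in which a higher-order function captures a per-role policy.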
The Service-Oriented Modelling paradigm attempts to take a holistic view of the analysis, design and architecture of all the information assets (called 'Services') in a system [Arsanjani, 2004] [Bierberstein et al., 2005]. This dynamic composition of services, which would benefit from a categorical approach, has been proposed as a way forward for interoperability. The internal structure of the categories in the proposed four-level architecture can be expressed in terms of Cartesian closed comma categories. The internal logic of certain Cartesian closed categories, the topoi (or toposes), is intuitionistic [Rydeheard and Burstall, 1988].

The behaviour of a system is represented by its limit. Limits and the dual notion of co-limits are examples of universal and co-universal constructions. In turn, initial and terminal objects, products/co-products, equalizers/co-equalizers, pullback/pushouts are specific instances of limits and co-limits and in such a way that is found to be directly applicable in the current research. Limits/colimits and products/coproducts express global intensionality while pullbacks/pushouts express local extensionalities. Thus, the research hypothesis can be formed as follows: “Global security for interoperability across heterogeneous distributed information systems can benefit from the use of category theory”. The deliverable is the proposed holistic security architecture that uses higher order logic in the context of applied category theory in order to define, develop and deploy security issues in interoperable distributed systems. The architecture will be the basis for a future graphical software tool visualizing the proposed holistic security framework.
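As a concrete instance of one of the limits listed above, the pullback of two maps f: A → C and g: B → C in the category of finite sets is the set of pairs (a, b) with f(a) = g(b), together with its two projections. The sketch below computes it directly. The users, resources and security levels are hypothetical sample data chosen for illustration, and the example works with plain sets rather than with the comma categories used in the thesis.

```python
def pullback(A, B, f, g):
    """Pairs (a, b) in A x B on which f and g agree."""
    return {(a, b) for a in A for b in B if f(a) == g(b)}


# Hypothetical data: both maps land in a common set of security levels.
CLEARANCE = {"alice": "secret", "bob": "public"}          # f: users -> levels
CLASSIFICATION = {                                        # g: resources -> levels
    "payroll.db": "secret",
    "index.html": "public",
    "memo.txt": "public",
}

P = pullback(CLEARANCE.keys(), CLASSIFICATION.keys(),
             CLEARANCE.get, CLASSIFICATION.get)

# The square commutes: following either projection and then the map
# into the set of levels gives the same result for every pair in P.
commutes = all(CLEARANCE[a] == CLASSIFICATION[b] for a, b in P)
```

In this toy reading, the common codomain plays the role of the global constraint and the pullback collects exactly the local pairs that are compatible with it.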

1.6 UML notation used in the research

While the research is not predicated upon UML, its notation is frequently used as a convenient technique for design. Class diagrams are fundamental to object-oriented analysis. Classes represent entities of the system. UML class diagrams are graphical representations of the elements of sub-systems of the DIS, thus modelling a specific view of the system under consideration [Bennett et al., 2002]. They are not the same as commutative diagrams in category theory. Generalization, specialization, aggregation and composition are issues dealt with in the semantics of a UML model. Generalization in the object-oriented paradigm is used to describe relationships of similarity between classes. Object classes can be arranged into disjoint hierarchies. Inheritance (or type hierarchy) is the mechanism for implementing generalization and specialization in the object-oriented paradigm. When two classes are related by the mechanism of inheritance, the more general class is called a superclass in relation to the other and the more specialized is called its subclass. A subclass inherits all the characteristics of its superclass and includes at least one detail not derived from its superclass. The triangle symbol ∆ is used to visualize the inheritance mechanism. Composition (represented as a solid diamond) is a type of abstraction that encapsulates groups of classes that collectively have the capacity to be a reusable sub-assembly. Composition is a strong form of aggregation (represented as a diamond ◊). Multiplicity (or cardinality) is the term used to describe constraints on the number of participating objects in a relationship. It reflects organization (or business) rules, which are constraints on the way that business activities can take place.
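The UML notions described above map directly onto object-oriented code. The following minimal sketch illustrates inheritance (a subclass adding one detail to its superclass), ownership of parts by a whole, and a multiplicity constraint of 1..*. All class and attribute names are hypothetical and are not drawn from the diagrams in the thesis.

```python
from dataclasses import dataclass


@dataclass
class SecurityControl:
    """Superclass: the general notion of a security control."""
    name: str

    def describe(self) -> str:
        return f"control:{self.name}"


@dataclass
class Firewall(SecurityControl):
    """Subclass: inherits `name` and adds one detail of its own."""
    rule_count: int = 0

    def describe(self) -> str:
        return f"firewall:{self.name} ({self.rule_count} rules)"


class SecurityStrategy:
    """Whole that owns its parts; multiplicity 1..* is checked on creation."""

    def __init__(self, controls):
        controls = list(controls)
        if len(controls) < 1:
            raise ValueError("a strategy needs at least one control (1..*)")
        self._controls = controls  # the strategy keeps its own list of parts

    def report(self):
        return [c.describe() for c in self._controls]


strategy = SecurityStrategy([SecurityControl("audit"), Firewall("edge", 12)])
```

Calling `strategy.report()` dispatches to the most specific `describe` of each part, which is the behaviour the inheritance triangle expresses in a class diagram.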

1.7 Thesis contributions

The current thesis describes several novel contributions to improve the state-of-the-art security approaches across distributed information systems. It discusses the use of applied category theory for handling security in DIS in a complete and holistic manner. A holistic security architecture should, at least, include baseline assessment, risk analysis, specific policy development, security measure implementation as well as monitoring and reporting action. The proposed layered holistic architecture, as well as the findings of the analysis of state-of-the-art security architectures across distributed information systems, are represented in the form of UML diagrams. UML has dominated as the language for designing and representing the object-oriented paradigm. All the figures, tables and UML diagrams in Chapter 3 are believed to be original unless it is clearly stated otherwise.

The proposed holistic security approach follows a four-level architecture where the interaction of the involved categories is expressed as adjoint functors, natural transformations, mappings in the form of 2-cells, 3-cells as modifications of system behaviour, as well as monad and comonad constructions to represent internal processing. Pullback and pushout diagrams are also widely used to express global intensionality and constrained local extensionalities, in terms of security policies, security services and security mechanisms. Good practice in software engineering for process management in distributed systems has been based on a variety of mathematical formal interpretations of processes, known as process calculi. Following the ideas of λ-calculus and π-calculus, processes, channels, participants, services, activities and the actual data transmitted in process interaction, or those referring to local system state changes, are treated in the four-level architecture as instances of an abstract type process. Good practice in software engineering follows the LCCC approach. We have shown that LCCC correspond precisely to the industry-strength standard of 3NF for data design, therefore justifying the choice of LCCC as the underlying structures in our architecture.

New categories have not been invented. Applied category theory, which is based on higher-order logic, has been used. The categories across levels are Cartesian closed, in the form of comma categories, to represent real-world entities. The behaviour of a system, in terms of system security, is given by its limit. Cartesian closed categories, based on limits and exponentiation, allow us to draw the boundaries of the system. The proposed holistic approach provides natural closure, based on security principles on the top level. Cartesian closed categories have been used widely in computing science, especially in the functional paradigm. Comma categories are widely used throughout this research as the basis of a number of applications and visualizations. The proposed form of pullback diagrams combines the use of limits and products (global security policy framework: system intensionality) and comma categories (local security policies: system extensionalities), mediated by product functors and bifunctors. There are also security application examples where pullbacks are combined with pushouts through a pair of adjoint functors, for example in order to balance the cost of applied security measures against their effectiveness (achieved security control). Adjointness based on comma categories has been used to compare and evaluate the impact of different security policies, based on security principles, as well as to compare different security measures. Access control components, such as ACLs, capabilities, RBAC etc., can be expressed using comma categories. Object-oriented characteristics such as generalization/specialization and aggregation/composition can also be expressed using comma categories, although categorification is not one of the objectives of this research. Composition, through the levels of the architecture, is based on Godement calculus.
Inspired originally by Godement calculus, we have constructed the Cube to represent all the different composed paths through arrows such as functions, higher-order functions as functors, natural transformations between functors, adjunctions, 2-cells as mappings between parallel pairs of arrows, 2-functors and 2-natural transformations for parallel computations through vertical and horizontal composition, as well as 3-cells as modifications for representing process interaction by defining event-ordering using higher-order logic instead of first-order predicate calculus, thus enhancing the certainty and validity of the proposed approach. Then, the Lattice of Cubes, in two different forms, has been constructed in order to express organization interoperability as a way to represent local and global security frameworks and meta-frameworks. The first form visualizes composition up to three-dimensional reality, covering semantic interoperability issues, while the second one explores composition up to four-dimensional reality. The ultimate closure in the proposed four-level architecture is achieved through horizontal composition of the vertical categories (based on vertical composition of 2-cells) between the level pairs and is simply expressed as the pair of functor categories DAT^CPT and CPT^DAT.

The monad and its dual categorical construction, the comonad, have been explored in order to enhance consistency in the proposed security applications. The ACID properties of a distributed transaction have been defined using endofunctors of a system of 3-cycles describing internal processing as closed operations. Achieving and maintaining database consistency in a distributed system, in terms of database integrity, database control (using encryption) and handling of transactions, is another given application based on monads and comonads. There are also application examples for defining and handling security attacks on DIS, for example distributed Denial of Service (DDoS) attacks, as well as a way to identify risks as the result of the interaction of threats and vulnerabilities by including all the possible combinations. There are also examples for non-repudiation and auditing control using logging and tracing-back techniques, embedded in Intrusion Detection Systems. Other applications include the evaluation of security channels, PKI infrastructure, authentication techniques, digital signing, biometrics and steganography.

1.8 Roadmap to the thesis

After this introductory chapter, Chapter two contains a background discussion on distributed information systems (DIS). It analyses related issues in DIS such as hardware and software issues, communication issues, and functional and non-functional requirements, finishing with a discussion of the issue of event-ordering in DIS as it is currently handled.


Chapter three contains an overview discussion on security across distributed systems, divided into seven subchapters. It reviews and analyses the concepts of information systems security (such as principles, policies, services, mechanisms, state-of-the-art approaches), cryptography (shared and public-key cryptosystems, digital signatures, authentication including PKI infrastructures and biometrics, secure channels and key management), access control (access control classification, multi-policy systems and firewalls), high-level security services (e.g. network monitoring, intrusion detection, auditing and tracing techniques, antivirus control and hardware-based security), fault tolerance (as it is associated with availability, one of the security principles) and database security. Finally, it defines the concept of distributed computation and its association with security in several examples of DIS, such as workflow systems, web and semantic web, e-commerce systems (including SSL/TLS and SET protocols) and Grid infrastructures. Furthermore, it analyses the issues of e-trust (usually based on the existence of trusted third parties), privacy and identity management, XML security and CORBA security.

Chapter four contains an overview and analysis of category theory. It introduces basic category theory notions as well as complex constructs such as 2-cells, 3-cells and monads/comonads, with an emphasis on applied category theory using Cartesian closed categories.

Chapter five presents a number of examples on natural transformations, adjunctions (especially on the adjunction pair ⊣ ⊣), exponentiation, product categories, product functors, bifunctors, monads/comonads, comma categories and 2-categories. It introduces constructions such as the Cube, the Lattice of Cubes and the integrated system of 3-cycles in a monad/comonad construction. In addition, it attempts to clarify the difference between natural transformations and comma categories, to express adjunctions and pullbacks using comma categories and to show the correspondence between functor categories and vertical categories.

In order to handle security in DIS, all these categorical concepts are integrated into the proposed four-level architecture presented in Chapter six, following a holistic security approach, through various security application examples, including event-ordering handling.

Finally, Chapter seven concludes with a discussion of contributions made, limitations of the proposed architecture, other ongoing holistic security approaches and future work. A way forward for implementation of the proposed architecture can be based on the functional paradigm, enriched with object extensions, in order to utilize functional, object-oriented and service-oriented programming.


2 Distributed Information Systems

2.1 Definition - Hardware and Software issues

Distributed information systems (DIS) are a combination of distributed applications and supporting distributed computer systems. The objective is to get a diverse group of systems and users working together in a coordinated fashion to use common resources and data [Benyon-Davies, 1998] distributed across the network, by only passing messages [Coulouris et al., 2005]. Users perceive the system as a single, coherent, integrated computing facility. The concealment from users of the separation of the individual components in a DIS is called transparency. There are different aspects of distribution transparency (Table 2-1).

Location transparency: Enables local and remote resources to be accessed using identical operations
Concurrency transparency: Processes can operate concurrently using shared resources without interference between them, thus aiming to increase the system's consistency
Replication transparency: Multiple instances of resources are used to increase reliability and performance without knowledge of the replicas (i.e. copies) by users or application programmers
Performance transparency: Allows the system to be reconfigured to improve performance as loads vary
Failure transparency: Enables the concealment of faults from users while the system subsequently repairs them

Table 2-1: Different aspects of distribution transparency

Components in DIS are usually autonomous, and can be used exclusively [Emmerich, 2000]. Computers can have either shared or private memory, called multiprocessors and multicomputers, respectively. Multicomputers are further distinguished as being either homogeneous, by using a single interconnection network where all processors are the same and have access to the same amount of private memory, or heterogeneous. There are three different types of operating systems for DIS: tightly-coupled systems or Distributed Operating Systems (DOS) for multiprocessors and homogeneous multicomputers, where the main goal is to hide and manage hardware resources; loosely-coupled systems or Network Operating Systems (NOS) for heterogeneous multicomputers, where the main goal is to offer local services to remote clients; and middleware, an abstract software layer atop NOS implementing general-purpose services such as communication facilities, naming, persistence facilities, distributed transactions and security, in order to provide distribution transparency. Most middleware is based on some model, or paradigm, for describing distribution and communication.

The actual organization of a DIS refers to the way that processes are organized. The model that has dominated in managing the complexity of DIS is the client-server model. In its simplest form, a server is a process implementing a specific service, such as a file system or a database service, whereas a client is a process that requests a service from a server by sending it a request and subsequently waiting for the server's reply. Client-server applications are layered, including three different levels: the user-interface level, which includes programs that allow end-users to interact with applications; the processing level, which contains the core functionality of the application; and the data level, which includes programs that maintain the actual data on which the applications operate. Client-server architectures vary from the simplest form, having just a server and a client, to multi-tiered (vertical distribution) and horizontal distribution. Vertical and horizontal distributions are further distinguished as two- or three-tiered, and server- or client (peer-to-peer) distribution, respectively. Implementations include multiple servers, proxy servers and caches, thin clients, mobile code and mobile agents, as well as peer-to-peer networks and integrated mobile devices. Any communication mechanism between processes is called a channel.
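The request/reply interaction described above can be sketched in a single program, with a thread standing in for the server process and in-memory queues standing in for channels. This is an illustrative assumption only: a real DIS would use network channels, and the stored key and value shown here are hypothetical.

```python
import queue
import threading


def server(requests, store):
    """Serve lookups until a None sentinel arrives on the request channel."""
    while True:
        message = requests.get()
        if message is None:
            break
        key, reply_channel = message
        reply_channel.put(store.get(key, "NOT_FOUND"))


def client_call(requests, key):
    """Send one request, then block until the server's reply arrives."""
    reply_channel = queue.Queue(maxsize=1)
    requests.put((key, reply_channel))
    return reply_channel.get(timeout=2)


def demo():
    requests = queue.Queue()                 # channel from client to server
    store = {"policy": "deny-by-default"}    # hypothetical data level
    worker = threading.Thread(target=server, args=(requests, store))
    worker.start()
    answers = [client_call(requests, "policy"),
               client_call(requests, "missing")]
    requests.put(None)                       # ask the server to stop
    worker.join()
    return answers
```

Each call carries its own reply channel, so the server needs no knowledge of its clients beyond the message it has just received.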

2.2 Communication issues

The International Organization for Standardization (ISO) developed the Open Systems Interconnection Reference Model (OSI model), which identifies the various levels involved in communication within open systems (e.g. DIS). The protocols that were developed as part of the model were never widely used. Communication is divided up into seven levels (layers), named the application layer, presentation layer, session layer, transport layer, network layer, data link layer and physical layer. Each layer provides an interface to the one above it. The interface consists of a set of operations that together define the service the layer is prepared to offer its users. The model has reliable transmission (error recovery) in the network and transport layers, but fault recovery has been put in the session layer, thus producing the same error and fault frequency. Additionally, in the application layer there must be a check for the success of any remote operation. Error recovery in the lower levels of protocols is only useful for purposes of increasing efficiency.

Communication in DIS, generally, can be implemented using a connectionless protocol (e.g. UDP) or a connection-oriented protocol (e.g. TCP). It is usually based on low-level message passing as offered by the underlying network, or on mechanisms built upon it such as Remote Procedure Calls (RPCs) or Remote Method Invocations (RMIs). An RPC allows a process to call a procedure on a remote machine; Java RMIs enable a programmer to create distributed Java-based applications in which the methods of remote Java objects can be invoked from other Java virtual machines, possibly on different hosts. An alternative is to use a high-level message-queuing model, called Message-oriented middleware (MOM). In the context of the computational Grid, communication is achieved using RPC/RMI (asymmetric), the Message Passing Interface (symmetric) or hybrid models (e.g. Open Message Passing, OmniRPC).
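The layering principle described above, in which each layer offers a service to the one above it, is commonly illustrated by encapsulation: on the way down each layer wraps the payload in its own header, and on the way up each layer strips exactly the header it added. The sketch below is a deliberately simplified illustration under that assumption. The bracketed text headers are hypothetical placeholders rather than real protocol headers, and the physical layer is omitted because it carries bits rather than adding a header.

```python
# Layers that add a header, listed from top to bottom of the stack.
HEADER_LAYERS = [
    "application", "presentation", "session",
    "transport", "network", "data link",
]


def encapsulate(payload: str) -> str:
    """Wrap the payload layer by layer; the lowest layer ends up outermost."""
    frame = payload
    for layer in HEADER_LAYERS:
        frame = f"<{layer}>{frame}"
    return frame


def decapsulate(frame: str) -> str:
    """Strip headers in the reverse order, from the lowest layer upwards."""
    for layer in reversed(HEADER_LAYERS):
        header = f"<{layer}>"
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame
```

A receiver that finds a header out of order rejects the frame, which mirrors the idea that each layer only understands the interface of its immediate neighbour.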

2.3

DIS requirements

Functional requirements in DIS are concerned with the functions that the system can perform for its users. Non-functional requirements, concerned with the quality of the system, include issues such as openness, scalability, heterogeneity, fault-tolerance and security. The openness of a DIS permits sharing of all resources among users independently of their locations [Anderson, 2001]. In order to be flexible and extensible, the system offers its services according to standard rules that describe the syntax and semantics of these services; services are generally specified through tested and verified published interfaces, which are often described in an Interface Definition Language (IDL). Scalability in DIS can be measured with respect to size, geographical extent and administration. A scalable DIS will remain effective when there is a significant increase in the number of resources and the number of users. Heterogeneity in DIS applies to networks, computer hardware, operating systems, programming languages, and implementations by different developers. Fault tolerance means that the system has to be capable of continuing in the face of single-point failures and of parallel execution. Security is related to integrity issues, and generally, can be achieved by securing the processes and the channels used for their interactions and by protecting the resources against unauthorized access.

2.4

The issue of time in (asynchronous) distributed systems – event ordering

One of the major problems in DIS is the lack of a global time, something that makes it difficult to find out the state of distributed computations i.e. the processes in a distributed system. Each computer may have its own physical clock, which typically deviates and cannot be synchronized perfectly. Lamport [1978] pointed out that in general physical time cannot be used to find out the order of any arbitrary pair of events occurring in the same or different processes in a distributed system. For security purposes, in DIS, it is especially important to know the state of each process.

2.4.1

Processes in a distributed system

A distributed system can be regarded as a collection $P$ of $N$ processes $p_i$, $i = 1, 2, \ldots, N$, communicating with each other only by passing messages. Each process is executed on a single processor. The processes do not share memory. Each process $p_i$ in $P$ has a local state $s_i$, which includes the values of all the variables within it, as well as the values of any objects in its local operating system environment that it affects, such as files. An event $e$ is the occurrence of a single action that a process carries out as it executes. It can be either a communication action (sending or receiving a message) or a state-transforming action (that is, one that changes one or more values in $s_i$). The sequence of events within a single process $p_i$ can be placed in a total ordering, denoted by the relation $\rightarrow_i$ between the events. That is, $e \rightarrow_i e'$ if and only if event $e$ occurs before $e'$ at process $p_i$. The history of process $p_i$ is defined as the series of events that takes place within it, ordered by the relation $\rightarrow_i$, that is $\mathrm{history}(p_i) = h_i = \langle e_i^0, e_i^1, e_i^2, \ldots \rangle$.

2.4.2

Lamport’s logical time

Lamport [Lamport, 1978] showed that if two events occur in the same process $p_i$, then they occur in the order in which $p_i$ observes them, that is the order $\rightarrow_i$. Whenever a message is sent between two processes, the event of sending the message occurs before the event of receiving the message. This partial ordering relationship is called 'happened-before' or 'causal ordering' or 'potential causal ordering'. It is defined as:

• If there is a process $p_i$ with $e \rightarrow_i e'$, then $e \rightarrow e'$.

• For every message $m$, $\mathrm{send}(m) \rightarrow \mathrm{receive}(m)$.

• If $e$ and $e'$ are events such that $e \rightarrow e'$, then the relationship $\rightarrow$ captures a flow of data in the underlying process (or underlying processes, if the events belong to different processes). The first event might or might not actually have caused the second event. In the latter case, a process might, for example, receive a message and subsequently issue another message, but one that it issues every ten minutes anyway and which bears no specific relation to the first message. No actual causality has been involved. Still, the relationship would order these events.

• If $e$, $e'$ and $e''$ are events such that $e \rightarrow e'$ and $e' \rightarrow e''$, then $e \rightarrow e''$.

• If $e$ and $e'$ are events and if $e \rightarrow e'$, then there is a series of events $e_1, e_2, \ldots, e_N$ (not necessarily unique) occurring in one or more processes such that $e = e_1$ and $e' = e_N$, and for $i = 1, 2, \ldots, N-1$ the events $e_i$ and $e_{i+1}$ either occur in succession at the same process or there is a message $m$ such that $e_i = \mathrm{send}(m)$ and $e_{i+1} = \mathrm{receive}(m)$.

• Two events $e$ and $e'$ that are not ordered by $\rightarrow$ are concurrent, written as $e \parallel e'$.
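The definition above can be illustrated with a minimal Python sketch, which is not part of the original text. It assumes events are represented by hypothetical string labels, that each process history is a list already in local order, and that messages are given as (send event, receive event) pairs. The relation is built as the transitive closure of the local orderings and the send/receive pairs.

```python
def happened_before(histories, messages):
    """Return the set of pairs (e, f) such that e happened-before f.

    histories: list of per-process event lists, each in local order.
    messages:  list of (send_event, receive_event) pairs.
    """
    # Direct edges: send -> receive, and successive events in one process.
    reach = set(messages)
    for h in histories:
        reach.update(zip(h, h[1:]))

    events = [e for h in histories for e in h]
    # Transitive closure (Warshall): the intermediate event k is the outer loop.
    for k in events:
        for a in events:
            if (a, k) in reach:
                for b in events:
                    if (k, b) in reach:
                        reach.add((a, b))
    return reach


def concurrent(e, f, hb):
    """Two distinct events are concurrent if neither happened-before the other."""
    return e != f and (e, f) not in hb and (f, e) not in hb


# Hypothetical run: p1 sends a message at a2, p2 receives it at b1,
# and p3 performs one isolated event c1.
histories = [["a1", "a2"], ["b1", "b2"], ["c1"]]
messages = [("a2", "b1")]
hb = happened_before(histories, messages)
```

In this run `("a1", "b2")` is in `hb` through the chain a1, a2, b1, b2, while `c1` is concurrent with every other event because no message connects p3 to the other processes.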

2.4.3

Logical Clocks

A Lamport Logical Clock [1986] is a monotonically increasing software counter, whose value need bear no particular relationship to any physical clock. Each process $p_i$ keeps its own clock $L_i$. A logical clock is used by a process to apply Lamport timestamps to events. The timestamp of an event $e$ at a process $p_i$ is denoted as $L_i(e)$. The timestamp of an event $e$ in whatever process it occurred is denoted as $L(e)$. Processes update their logical clocks and transmit the values of their logical clocks as follows:

• $L_i$ is incremented before each event is issued at process $p_i$: $L_i := L_i + 1$.

• When a process $p_i$ sends a message $m$, it also transmits the value $t = L_i$.

• On receiving $(m, t)$, a process $p_j$ first computes $L_j := \max(L_j, t)$ and then increments the value of $L_j$.

If $e$ and $e'$ are events and if $e \rightarrow e'$, then $L(e) < L(e')$. The converse is not always implied; that is, if $L(e) < L(e')$, then it is not guaranteed that $e \rightarrow e'$. This is the main shortcoming of Lamport's clocks.
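The update rules above can be sketched in a few lines of Python. This is an illustrative sketch only: the class name `LamportClock` and its method names are assumptions introduced here, not taken from the original text, and message transport is replaced by passing the timestamp directly between objects.

```python
class LamportClock:
    """Logical clock L_i of a single process (illustrative names)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Increment before timestamping an event: L_i := L_i + 1."""
        self.time += 1
        return self.time

    def send(self):
        """Timestamp the send event and return the value t to piggyback."""
        return self.tick()

    def receive(self, t):
        """Compute L_j := max(L_j, t), then increment for the receive event."""
        self.time = max(self.time, t)
        return self.tick()


p1, p2, p3 = LamportClock(), LamportClock(), LamportClock()
a = p1.tick()       # local event at p1
b = p1.send()       # p1 sends message m carrying t = b
c = p2.receive(b)   # p2 receives (m, t)
d = p3.tick()       # unrelated local event at p3
```

Here `b < c` reflects that the send happened-before the receive. The event at p3 has `d < c` although it is concurrent with the receive event, which is an instance of the shortcoming noted above.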

2.4.4

Totally ordered logical clocks

A total order of events is one for which all pairs of distinct events are ordered. It is obtained by taking into account the identifiers of the processes at which events occur. If $e$ is an event occurring at a process $p_i$ with local timestamp $T_i$ and $e'$ is an event occurring at a process $p_j$ with local timestamp $T_j$, then the global logical timestamps for these events are defined as $(T_i, i)$ and $(T_j, j)$, respectively. It is true that $(T_i, i) < (T_j, j)$ if and only if either $T_i < T_j$ or $(T_i = T_j) \wedge (i < j)$.
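As a small illustration, the comparison of global logical timestamps coincides with the lexicographic ordering of pairs, which Python tuples provide directly. The helper name `global_timestamp` and the sample values below are hypothetical.

```python
def global_timestamp(local_time, process_id):
    """Build the pair (T_i, i); Python compares tuples lexicographically."""
    return (local_time, process_id)


e1 = global_timestamp(3, 2)   # event with T = 3 at process 2
e2 = global_timestamp(3, 1)   # event with T = 3 at process 1
e3 = global_timestamp(1, 3)   # event with T = 1 at process 3

# Equal local timestamps are separated by the process identifier.
ordered = sorted([e1, e2, e3])
```

Sorting places `e3` first because it has the smallest local timestamp, and then `e2` before `e1` because the process identifier breaks the tie.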

2.4.5

Vector clocks

Vector clocks [Fidge, 1988] [Mattern, 1989] [Basten et al., 1997] try to overcome the shortcoming of Lamport's clocks. A Vector Clock for a system of $N$ processes is an array of $N$ integers. Each process keeps its own vector clock $V_i$, which it uses to timestamp local events. When a process $p_i$ sends a message $m$, it also transmits the value of its vector timestamp. The rules for updating vector clocks are as follows:

• Initially, $V_i[j] = 0$ for $i, j = 1, 2, \ldots, N$.

• Just before a process $p_i$ timestamps an event $e$, it sets $V_i[i] := V_i[i] + 1$.

• The process $p_i$ includes the value $t = V_i$ in every message it sends.

• When a process $p_i$ receives a timestamp $t$ in a message $m$, it sets $V_i[j] := \max(V_i[j], t[j])$ for $j = 1, 2, \ldots, N$.

o For a vector clock $V_i$, $V_i[i]$ is the number of events that $p_i$ has timestamped.

o $V_i[j]$, $i \neq j$, is the number of events that have occurred at $p_j$ that $p_i$ has potentially been affected by.
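The four update rules can be sketched as follows. This is a minimal sketch under stated assumptions: the class name `VectorClock` and its methods are illustrative, process indices are 0-based here whereas the text uses $1, \ldots, N$, and the timestamp is handed over directly instead of being sent over a network.

```python
class VectorClock:
    """Vector clock V_i of the process with (0-based) index i among n processes."""

    def __init__(self, n, i):
        self.i = i
        self.v = [0] * n          # initially V_i[j] = 0 for all j

    def event(self):
        """Just before timestamping an event: V_i[i] := V_i[i] + 1."""
        self.v[self.i] += 1
        return tuple(self.v)

    def send(self):
        """Timestamp the send event; the returned tuple is piggybacked as t."""
        return self.event()

    def receive(self, t):
        """Merge component-wise maxima, then timestamp the receive event."""
        self.v = [max(own, other) for own, other in zip(self.v, t)]
        return self.event()


p0, p1, p2 = VectorClock(3, 0), VectorClock(3, 1), VectorClock(3, 2)
a = p0.event()       # local event at p0
b = p0.send()        # p0 sends a message carrying t = b
c = p1.receive(b)    # p1 merges t and timestamps the receive
d = p2.event()       # unrelated local event at p2
```

After the receive, the first component of `c` records that p1 has potentially been affected by two events at p0, while `d` shares no knowledge with either process.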

The rules for comparing vector timestamps [Schwarz and Mattern, 1994] [Babaoglu and Marzullo, 1993] follow:

• $V = V'$ iff $V[j] = V'[j]$ for $j = 1, 2, \ldots, N$

• $V \leq V'$ iff $V[j] \leq V'[j]$ for $j = 1, 2, \ldots, N$

• $V < V'$ iff $(V \leq V') \wedge (V \neq V')$

With these rules, $e \rightarrow e'$ if and only if $V(e) < V(e')$, which is the property that Lamport's timestamps lack. Vector timestamps, compared with Lamport's timestamps, have the disadvantage of taking up an amount of storage and message payload that is proportional to $N$, that is the number of processes in the system.
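The comparison rules translate almost literally into code. The following sketch uses hypothetical function names and sample timestamps for a three-process system; two distinct timestamps that are not ordered in either direction are treated as belonging to concurrent events.

```python
def vc_leq(v, w):
    """V <= V' iff V[j] <= V'[j] for every component j."""
    return all(a <= b for a, b in zip(v, w))


def vc_less(v, w):
    """V < V' iff V <= V' and V != V'."""
    return vc_leq(v, w) and tuple(v) != tuple(w)


def vc_concurrent(v, w):
    """Distinct timestamps ordered in neither direction."""
    return tuple(v) != tuple(w) and not vc_less(v, w) and not vc_less(w, v)


send_ts = (2, 0, 0)      # hypothetical timestamp of a send event
recv_ts = (2, 1, 0)      # timestamp of the matching receive event
other_ts = (0, 0, 1)     # timestamp of an unrelated event elsewhere
```

With these values `vc_less(send_ts, recv_ts)` holds, whereas `recv_ts` and `other_ts` are incomparable and therefore concurrent.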

2.4.6

Global states and consistent cuts

A history of a process $p_i$ is defined as $\mathrm{history}(p_i) = h_i = \langle e_i^0, e_i^1, e_i^2, \ldots \rangle$. A finite prefix of the process's history is defined as $h_i^k = \langle e_i^0, e_i^1, \ldots, e_i^k \rangle$. The state of process $p_i$ immediately before the $k$th event occurs is denoted as $s_i^k$. Thus, $s_i^0$ is the initial state of $p_i$. Each process can record the events that take place there, and the succession of states it passes through. Thus, processes record the sending or receiving of all messages as part of their state. The global history of the distributed system $P$ is the union of the individual histories, that is $H = h_1 \cup h_2 \cup \ldots \cup h_N$. A global state $S$ corresponds to initial prefixes of the individual process histories. A cut $C$ of the system's execution is a subset of its global history, that is a union of prefixes of process histories, $C = h_1^{c_1} \cup h_2^{c_2} \cup \ldots \cup h_N^{c_N}$. The state $s_i$ in the global state $S$, corresponding to the cut $C$, is that of $p_i$ immediately after the last event processed by $p_i$ in the cut, defined as $e_i^{c_i}$, $i = 1, 2, \ldots, N$. The set of events $\{e_i^{c_i} : i = 1, 2, \ldots, N\}$ is called the frontier of the cut. A cut $C$ is consistent if, for each event it contains, it also contains all the events that happened-before that event, that is for all events $e \in C$, if $f \rightarrow e$ then $f \in C$. That is, a consistent cut is left-closed under the 'happened-before' relation. A consistent global state is one that corresponds to a consistent cut. The execution of a distributed system is characterized as a series of transitions between global states of the system, $S_0 \rightarrow S_1 \rightarrow S_2 \rightarrow \ldots$. In each transition, precisely one event occurs at some single process in the system. Events that occur simultaneously must be concurrent, that is neither happened-before the other. In this way, a system evolves through consistent global states. A run is a total ordering of all the events in a global history that is consistent with each local history's ordering, $\rightarrow_i$, $i = 1, 2, \ldots, N$. A linearization or consistent run is an ordering of the events in a global history that is consistent with the happened-before relation $\rightarrow$ on $H$. Thus a linearization is also a run. Not all runs pass through consistent global states, but all linearizations pass through consistent global states. A state $S'$ is reachable from a state $S$ if there is a linearization that passes through $S$ and $S'$. The ordering of concurrent events within a linearization may be altered, leading to the derivation of a run that still passes through only consistent global states. For example, if two successive events in a linearization are the receipt of messages from two processes, then the order of these two events may be swapped.
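The consistency condition on cuts can be checked mechanically. The sketch below rests on assumptions made only for illustration: a cut is encoded as a list of prefix lengths (how many events of each process it includes), processes and events are indexed from 0, and each message is given as a pair of (process, event index) positions. Since a cut is a union of prefixes, it is already closed under each local ordering, so it suffices to check that no message is received inside the cut but sent outside it.

```python
def is_consistent_cut(cut, messages):
    """Return True if the cut is left-closed under happened-before.

    cut[i]:   number of events of process i included in the cut.
    messages: list of ((sender, send_index), (receiver, receive_index)),
              with 0-based event indices within each process history.
    """
    for (sender, send_idx), (receiver, recv_idx) in messages:
        received_in_cut = recv_idx < cut[receiver]
        sent_in_cut = send_idx < cut[sender]
        if received_in_cut and not sent_in_cut:
            return False    # an effect is included without its cause
    return True


# Hypothetical execution: the second event of process 0 sends a message
# that is received by the first event of process 1.
messages = [((0, 1), (1, 0))]

consistent = is_consistent_cut([2, 1], messages)     # send and receive included
inconsistent = is_consistent_cut([1, 1], messages)   # receive without its send
earlier = is_consistent_cut([1, 0], messages)        # neither included
```

The cut `[1, 1]` is rejected because it contains the receive event but not the corresponding send event.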

2.5

The object-oriented and service-oriented paradigm

In computing, information hiding is the concealing of the implementation details in a computer program from users and classes, using for example stable interfaces. Examples include encapsulation and polymorphism. In networks, encapsulation allows an upper protocol layer to add data and functionality to the actual message, while passing them to a lower protocol layer. Both OSI and TCP/IP use this form of encapsulation. Object-oriented programming languages use encapsulation for defining classes which include attributes and methods applied on them. In general, distributed transparency is achieved through the use of encapsulation. The principle of the Separation of Concerns (SoC) is the process of breaking a program into distinct features that overlap in functionality as little as possible. It is achieved through the use of encapsulation and modularity (i.e. the property of programs to be composed out of separate parts, called modules). Object-oriented programming languages offer subtyping polymorphism using subclassing (also known as type inheritance) where derived subclasses are treated like their parent class (i.e. a superclass). A polymorphic function can be evaluated or applied to values of different types (overloaded function). In the same way, operators can be overloaded to apply to different data types. Polymorphism allows also for client programs to be written based on abstract interfaces of the objects which they manipulate. Computer viruses and worms sometimes hide their presence by making use of polymorphic code, which mutates while keeping the original algorithm intact.


In the context of cryptography, polyinstantiation is the existence of a cryptographic key in more than one secure physical location. In access control, mandatory policies use polyinstantiation for providing different security levels on system resources. The Service-Oriented Modeling paradigm attempts to take a holistic view of the analysis, design and architecture of all the information assets (called 'Services') in a system [Arsanjani, 2004] [Bierberstein et al., 2005]. It uses a modelling language for facilitating a Service-Oriented Approach (SOA) implementation based on a Service-Oriented Framework design.

2.6

Summary

Distributed information systems (DIS) allow users to use common resources and data transparently. A DIS can be regarded as a collection of processes communicating with each other only by passing messages, usually following a client-server architecture. For security purposes, it is especially important to know the state of each process as well as the state of the communication channels used for exchanging messages between processes. Communication can be implemented using a connectionless (UDP) or connection-oriented (TCP) protocol. One of the major problems in asynchronous DIS is the lack of global time. Lamport's clocks and timestamps as well as vector clocks are used to define event-ordering in processes, based on number theory. The object-oriented and, more recently, the service-oriented paradigm have dominated the design of DIS.


3

Security issues across Distributed Information Systems

3.1

Information Systems Security

Security is increasingly important in modern networked computer systems as they are exposed to a growing number and a wider variety of threats and vulnerabilities. It is a very complex set of processes that ranges from the level of crypto-primitives over crypto-protocols to the level of organizational matters and legislation [Anderson, 2001]. Gerber et al [2001] define information security as 'the process of controlling and securing information from inadvertent or malicious changes and deletions or unauthorized disclosure'. Information systems security has components such as confidentiality, integrity and availability (CIA) [FIPS, 2003]. Confidentiality refers to the property of a computer system that its information is disclosed only to authorized parties, restricting any inappropriate access. Integrity is the characteristic that alterations to a system's assets, e.g. hardware, software and data, can be made only in an authorized way; improper alterations in a secure computer system should be detectable and recoverable. Availability refers to the probability that the system is operating correctly at any given moment and is available to perform its functions on behalf of its users. Other security principles, mainly derived from the ISO 17799 [ISO, 2002] standard and its later version [ISO, 2005], are defence-in-depth, separation of duties, need-to-know, strict least privilege, order of events and dual control.

3.1.1

Threats and vulnerabilities

A security threat is a potential violation of a system's security which exists when there is a circumstance, capability, action, or event that could breach security and cause harm. Security threats, according to Coulouris et al [2005], include leakage, tampering and vandalism. Tanenbaum & Van Steen [2002] consider four types of security threats, as different forms of data falsification. Interception refers to the situation that an unauthorized party has gained access to a service or data. Interruption is when a file is corrupted or lost. In general, interruption refers to the situation in which services or data become unavailable, unusable, or destroyed. Modification involves unauthorized changing of data or tampering with a service so that it no longer adheres to its original specification. Finally, fabrication refers to the situation in which additional data or activity are generated that would normally not exist.

A vulnerability can be described as one or more attributes of the system that permit a subject to initiate patterns of misuse on that system. Vulnerabilities exist independently of any threats that may or may not be present. Johansson & Schultz [2003] distinguish vulnerabilities as pervasive or contextual. Pervasive vulnerabilities do not need a particular context to be exploited, but rather can be exploited at virtually any time, locally or remotely, while contextual vulnerabilities can only be exploited in a particular context: when the system is in some particular state, when a particular user is using the system, or when the user has local instead of remote access or possibly vice versa.

3.1.2

Risk management

Risk or exposure is the result of the interaction of threats and vulnerabilities. Both must exist for a risk to exist. Within the hacker community, risks are often referred to as 'exploits'. Risk management can be defined as the process of analysing exposure to risk and determining how best to handle such exposure. The objective is to enable a risk owner to manage risks by getting appropriate controls in place where they are needed. Security measures must be incorporated into computer systems whenever they are potential targets for malicious or mischievous attacks, with an overall plan to minimize risk [Morau, 2004]. A case study [Doughty, 2003] has shown that in order to minimize the risks concerning the integrity and the security of organizations' information channels (e.g. email, Internet, applications, DBMS, and operating systems), a security framework, supported by processes and tools, is required.

3.1.3

Security attacks

Network security consists of measures to deter, prevent, detect, and correct security violations that involve the transmission of information. Moitra & Konda [2004] define a security attack as a series of intentional steps taken by an attacker to achieve an unauthorized result; an incident refers to a group of related attacks that can be distinguished from other attacks. They argue that in order to develop better security and defences against network attacks, it is important to investigate the patterns of attacks on network systems and sites. According to Hawkins et al [2000], protection against security attacks can be achieved by implementing hardware and software solutions, as well as by using human intervention for network monitoring.


Attacks on DIS depend upon obtaining access to existing communication channels or establishing new channels that masquerade as authorized connections. Methods of attack can be classified, according to the way in which a channel is misused, into five categories: masquerading (pretending to have the identity of a legitimate user), eavesdropping (listening to or sniffing the network packets that are transmitted when implementing an object request), request tampering (request messages are intercepted before they reach the server object), replaying (the repetition of request messages), and Denial of Service (DoS) attacks (Figure 3-1).

Figure 3-1: Distributed information systems attacks classification (diagram labels: Attacker's intentional step, DIS attack, Incident; attack types: Masquerading, Eavesdropping, Request tampering, Replaying, DoS attack)

A useful means of classifying security attacks is in terms of passive attacks and active attacks [ISO, 1989] [ITU-T, 1991] [Stallings, 2002]. A passive attack (e.g. eavesdropping) attempts to learn or make use of information from the system by obtaining information that is being transmitted, but does not affect system resources, while active attacks (e.g. masquerading, request tampering, replaying, and DoS attacks) attempt to alter system resources or affect their operation. The emphasis in dealing with passive attacks is on prevention rather than detection; the opposite holds for active attacks. Additional threats come from the use of mobile code (e.g. code written in Java, where there is a potential threat of downloading and running malicious code that removes files or accesses private information) as well as from systems whose security is particularly sensitive to information leakage (whenever the results of a computation can be observed). Network security attacks include sequence number spoofing (guessing the sequence number of IP packets and issuing a forged IP address, thus concealing the identity of the sender), authentication attacks, routing attacks, spam (unsolicited commercial e-mail or electronic junk mail) and DoS attacks. Many actual network attacks involve a combination of vulnerabilities, such as the SYN flood attack (sending a succession of SYN messages to a target), root attacks, smurfing (a spoofing attack that exploits IP broadcast addressing to create a denial of service, using a program called smurf), virus attacks, worms, Trojan horses, spyware (software that tracks personal or sensitive information or displays advertisements, in which case it is called adware), and sniffing (the monitoring of the network). More details about DoS attacks are given in the next section.

3.1.4

DoS attacks

Denial of service (DoS) prevents or inhibits the normal use or management of communications facilities. The main aim of a DoS attack is the disruption of services by attempting to limit access to a machine or service, by targeting the network's bandwidth or connectivity, instead of subverting the service itself. The three most common forms of DoS attacks are bandwidth consumption, resource starvation, and resource exploitation. Distributed Denial of Service (DDoS) attacks exploit the inherent weakness of the Internet system architecture and its open resource access model. They are comprised of packet streams from disparate sources. The attacker must compromise the security of a large number of hosts, called 'zombies', to be used later as launching pads for a coordinated DoS attack against one or more targets [Janczewski et al., 2001]. The success of the attack is dependent on the time gap between detection and response. Abouzakhar & Manson [2002] [2003] claim that an effective solution to DDoS attacks must be based on the control of resource allocation and not on mandatory authentication. They propose the use of intelligent fuzzy agents, located at the network router and the network server, which dynamically provide automated response actions, in the form of management of resources, when a DDoS attack is launched. Figure 3-2 provides a classification for DDoS attacks, based on the work of Douligeris & Mitrokotsa [2004] as well as of Janczewski et al [2001].

3.1.5

Network security architectures

Networking protocols such as TCP/IP, designed for open and trusted communities, have inherent flaws and are vulnerable to a number of security attacks [Mason, 2003] [Cournane and Hunt, 2004] [Sherif, 2003] [Sherif and Ayers, 2003]. They have been developed without security in mind; the objective was quick, connection-oriented and error-free delivery of message packets across the network. The ISO/OSI X.800 security architecture is the best known network security framework.

Figure 3-2: DDoS attacks classification (diagram labels: Attacker, Handler, Zombie Host, Packet stream, Target; classification types: by degree of automation: Manual, Semiautomatic, Automatic, Direct, Indirect; by exploited vulnerability: Flood attack, UDP flood, ICMP flood, Amplification attack, Smurf attack, Fraggle attack, Protocol exploit attack, Malformed packet attack; by attack rate dynamics: Continuous, Variable, Fluctuating, Increasing; by impact: Disruptive, Degrading)

3.1.5.1

The ISO/OSI X.800 Security framework

The X.800 OSI Security architecture is useful to managers as a way of organizing the task of providing security. It focuses on security services, mechanisms and attacks. It defines a security service as 'a service provided by a protocol layer of communicating open systems, which ensures adequate security of the systems or of data transfers'. The clearest definition is found in RFC 2828 [Shirey, 2000], where a security service is defined as 'a processing or communication service that is provided by a system to give a specific kind of protection to system resources'. Security services enhance the security of the data processing systems and the information transfers of an organization, by countering security attacks, and using security mechanisms to provide the service. The OSI Security framework identified the necessary security requirements that could be enforced in a connection (connection-oriented or connectionless) as authentication, confidentiality, integrity, access control, and non-repudiation. The techniques and services specific to network systems have been classified as encipherment (confidentiality and integrity), digital signatures (non-repudiation, authentication, integrity, and access control mechanisms), data integrity mechanisms (integrity and authentication exchange mechanisms), traffic padding (confidentiality), routing control (to prevent sensitive data from traversing insecure channels), and notarisation (data integrity, peer authentication, and non-repudiation), as can be seen in Figure 3-3.

Figure 3-3: X.800 OSI security mechanism for available security services (diagram labels: Contextual, Pervasive; Encipherment, Digital signature mechanisms, Access Control mechanisms, Data integrity mechanisms, Traffic padding, Routing control, Notarization mechanisms, Trusted functionality, Security labels, Event detection, Security audit trails, Recovery procedures)

3.1.6

Security policies

According to the X.800 Security Framework, security services implement security policies and are in turn implemented by security mechanisms. The openness feature of any security architecture allows enabling parts of software to be added or replaced with no change to the architecture, with a true separation of security policies and mechanisms. Thus, different security policies must be separated from each other, while the implementation of this separation must be small, well structured and protected. Also, it must be guaranteed that any application-level communication is mediated by a specific security policy. Trcek [2000] defines a security policy as 'a continuous process of setting, refining and implementing security objectives, regarding all aspects and levels of IS resources and based on organizational structure and its mission'. In practice, it takes the form of a document that expresses clearly and concisely what the protection mechanisms are to achieve. It describes precisely which actions the entities in a system (e.g. users, services, data and machines) are allowed to take and which ones are prohibited. The policy framework is in line with compliance regulations. Suitable security policies should be developed and implemented in order to reduce the risk exposure from potential threats. The process of developing a security policy and obtaining agreement on it from the system owners is the process of requirements engineering, the most critical task of managing secure system development. Once a security policy has been laid down, it becomes possible to concentrate on the security mechanisms by which a policy can be enforced. Cryptography provides the basis for most computer security mechanisms. Pieprzyk et al [2003] distinguish five different types of security policy. The access control policy defines the collection of access privileges and access rules; it is further divided into mandatory and discretionary access control. The inference policy determines which data items have to be protected to eliminate a leakage or disclosure of confidential information. The user identification policy specifies the requirements for proper user identification. The accountability (or accounting) and audit policy indicates a collection of requirements for audit control. Accounting is the registration of the use of services by users. In traditional systems, accounting was primarily done to register consumption of resources; in modern open systems it is done to register the consumption of services or hierarchies of services. Auditing is related to the notion of non-repudiation, which makes users accountable for their actions by collecting evidence. Finally, the consistency policy defines the meaning of operational integrity, semantic integrity, and physical integrity of databases.

3.1.7

Security assessment and evaluation

Caelli [2002] argues that any secure system should be based on the fundamentals of computer security outlined in the 'Orange Book' or 'TCSEC', the Trusted Computer Systems Evaluation Criteria of 1983/1985. The original goal of the Orange Book [TCSEC, 1985] was to develop protection measures that would be standard in all major operating systems, not an expensive add-on for captive government markets. Common Criteria (CC) is an international standard [ISO/CCIT, 1999] for the evaluation of IT products and systems in general. It has much more flexibility than the Orange Book [Mason, 2000]. A product is evaluated against a protection profile, which is a set of security requirements, their rationale, and an Evaluation Assurance Level [Williamson and Healy]. A rationale consists of knowing how each threat is controlled by one or more objectives, and how each objective becomes necessary by some combination of threats. The standard does not deal with administrative security measures, cryptography, evaluation methodology, and the use of any other standards. It is focused on the technical aspects of design, and strongly oriented toward MLS (multi-level security) systems as well as to devices that support them, such as government firewalls and encryption boxes. Assuring optimal security of an information system is not a trivial task, as it requires a wide variety of expertise, from technological to organizational. Research has shown that there is no concrete method to ensure that all possible attacks and loopholes are excluded while a secure system is designed, as the final result will be based on the best available state-of-the-art standards. A good practice, while designing or evaluating a security framework, is to assume the worst. A study was carried out [Kankanhalli et al., 2003] which theoretically developed and empirically tested a model of IS security effectiveness that incorporates organizational factors (organizational size, top management support, and industry type) with the analysis of undertaken IS security measures (deterrent and preventive). The authors conclude that an increase in deterrent and preventive efforts appears to increase security effectiveness. Another case study has taken place [Vermeulen and Von Solms, 2002] to prove the effectiveness of the proposed security tool, called the information security management toolbox. Contemporary e-business networks are increasingly implementing the multi-layer security scheme in order to provide a reasonable measure of security for their information systems. An experiment took place in order to quantify the latency introduced by multi-layered security [Iheagwara and Blyth, 2002]. The results show that the absolute value of the end-to-end latency fluctuates based on the network load, while the multi-layered security architecture suffers from not being able to respond well under heavy network load.

3.1.8

Information security approaches

3.1.8.1

Bottom-up approaches

In general, while following bottom-up approaches, the system under consideration has to be examined for areas of weakness or backdoors, and its configurations assessed against 'industry best practice', in order to identify a checklist of the needed security controls. Risk analysis, a bottom-up approach, has been developed to address security in information systems. The aim is to eliminate or reduce risks and vulnerabilities that affect the overall operation of organizational computer systems. The Extended Risk analysis model [Reid and Floyd, 2001] combines the classical risk analysis and state-preference models. A UML design methodology, based on the Systems Security Engineering Capability Maturity Model (SSE-CMM), is proposed by Chan & Kwok [2001] in order to specify details for processes such as risk management, model engineering, and assurance. Lee et al [2002] proposed a model which integrates several software lifecycle process standards (SLPS), such as SSE-CMM, IEEE/EIA 12207, and ISO 17799, with security engineering (SE) activities. The Business Process Information Risk Management (BPIRM) [Coles and Moulton, 2003] ensures ownership of information risks using Standard Service Level Agreements (SSLAs). The RiMaHCof (Risk Management in Health Care) method [Smith and Eloff, 2002], as opposed to conventional techniques such as Annual-Loss Exposure (ALE) calculation, uses cognitive fuzzy techniques in order to achieve confidentiality of patient information.

3.1.8.2 Top-down security approaches

Top-down approaches to information security are generally harder to audit; the system is measured against a defined standard, which is derived from a security policy, and ultimately from a risk assessment. The security framework in this case includes a threat model, security policies, and security mechanisms [Anderson, 2001]. A threat model should be constructed that lists all the forms of attack to which the system is exposed, arising from all possible sources in the network, physical, and human environment, together with an evaluation of the risks and consequences of each. The effectiveness and the cost of the security techniques that are needed can then be balanced against the threats [Coulouris et al., 2005]. Threat-based Security Engineering (TBSE) [Leach, 2004] takes a non-deterministic approach to modelling how security threats interact with countermeasures, enabling quantitative forecasts of the likelihood and the characteristics of security incidents as a direct function of the security measures employed. ISCAP is another top-down model [Van der Haar and Von Solms, 2003], in which an organization identifies its own security needs by taking into account its properties and its security goals, thus leading to the derived control attributes associated with each control. The best-known top-down approach is the ISO/IEC 17799 standard [BSI, 1999] [ISO, 2002]. Other baseline security standards are ISACAF/COBIT, the Generally Accepted System Security Principles (GASSP) – from the USA National Research Council, GMITS (ISO/IEC PDTR 13335-1), and the ISF (Information Security Forum) Standard of Good Practice [Höne and Eloff, 2002].

3.1.8.3 The ISO/IEC 17799 standard

The main goal of the ISO/IEC 17799 standard and its later version [ISO, 2005] is to detect and prevent unauthorized acts by computer users. It defines information security management in terms of information security policy establishment and assessment, information security organization and responsibility, personnel security management and training, computer system security management, network security management, system access control, system development and maintenance security management, information assets security management, physical and environment security management, and business planning and management. The standard proposes Plan/Do/Check/Act as the life cycle for information security management. It also defines 10 baseline control areas, 36 control objectives, and 127 controls. Controls are activities that an organization can implement to reduce the possibility of adverse activity directed at an asset, or to limit the long- and short-term consequences if the threat materializes. An implementation in the UK National Health Service has been presented by Lillywhite [2004].


The standard does not define any security requirements explicitly, but describes the grounds on which the security requirements should be derived. The Security Requirements Exercise (SRE) approach [Gerber et al., 2001] is based on the BS 7799 baseline standard. A lookup matrix, which combines the amount of security for each security concern with the results of an impact analysis, is constructed to determine the level of security for each security requirement. A benchmarking model for critical security infrastructures has been developed against the framework of ISO 17799 Part 1. In its current form [Todd et al., 2002], the benchmarking model comprises a matrix which charts the checkpoints within each control against six activities developed from the deter/protect/detect/react cycle, in an effort to describe better the complex requirements of the critical infrastructure protection model. Furthermore, the model addresses the concept of a critical national infrastructure (CNI) as a defined network of computers, databases, protected transmission links and security procedures. The proposed model exemplifies the co-operation which must be established between the government and private-sector elements in terms of establishing an understanding of dependencies.

3.1.9 Naming issues in DIS

Naming (i.e. providing unique names for system entities) in a DIS can be very complex, as was shown in an article by Needham [1990]. Several assumptions that underlie names often change from one country to another or between different local naming systems. In addition, human naming conventions are not uniform, and assumptions about the relationship between government and people’s names vary from one country to another in ways that can cause subtle security failures. Other problems have to do with the stability of names and addresses, the semantics of names, the use of pseudonyms, and legal regulations on the use of names. The ITU-T X.500 standard [2005] deals with the construction of global directories of names and attributes.

3.1.10 Security awareness

The human factor may be the weakest link: deploying safe solutions requires a greater awareness and understanding of security issues. Siponen [2000] divides state-of-the-art approaches to minimizing user-related faults into two parts, those concerned with affecting the user, and those concerned with increasing the human-oriented character of technical solutions. A taxonomy of information security tasks and related human usability factors is given in Schultz et al [2001], based on the trade-off between usability and the degree of security provided by various information security methods, in order to make a strong case for the need for systematic usability analyses and for the development of usability metrics for information security. Three surveys exposed issues that security managers need to be aware of when implementing their procedures, in order to avoid unfortunate side effects for users [Pounder, 2003]. Another study [Leach, 2003] has shown that improving user security behaviour significantly reduces the internal security threat, expressed in terms of user errors, user negligence and deliberate acts against the company. The OECD Guidelines [2002] were designed to develop a "Culture of Security" and suggest the need for a greater awareness and understanding of security issues. Information security awareness refers to a state where users in an organization are aware of their security mission. According to Johnston & Eloff [2003], the use of security technologies can be significantly enhanced by employing Human Computer Interaction (HCI) concepts in the design of these technologies, leading to a system which is easier to use and more secure. Siponen [2000] argues that motivation and behavioural theories need to be considered in order to understand human behaviour, as well as the normative and prescriptive nature of end-users. Trompeter and Eloff [2001] define socio-ethical information security awareness as “the conforming of an organization to recognized information security ethical principles”. Wood [2004] suggests a team-based security approach where a team can be made up of people inside and outside of an organization.
King [2004] describes how a combination of technical and people skills can work towards achieving the goals of technical education and awareness of corporate goals and policies with respect to security, including raising security awareness, promoting centrally mandated policies and standards, and monitoring progress and ongoing product security.

3.1.11 Integrating top-down and bottom-up approaches – the need for a holistic approach

Wilson [2003] argues that “It is as important to develop a top-down approach, but with knowledge and understanding of the technical needs and environment, as it is to deliver technical solutions that meet business needs and are in tune with the corporate strategy”. Anderson [2003] proposes the following definition of enterprise information security: “A well-informed sense of assurance that information risks and controls are in balance”. Such hybrid approaches try to integrate the basic characteristics of top-down and bottom-up approaches and are usually introduced as ‘systemic’ or ‘holistic’. Examples can be found in the work of Patterson [2003], Trompeter & Eloff [2001] and Tickle [2002]. The Viable System Model (VSM), as introduced by Stafford Beer [1973] [1992] and later addressed by Warren and Hutchinson [2003] as a systemic risk management approach for e-commerce, first determines the high-level security risks and then uses baseline security methods to determine the lower-level security risks. Venter & Eloff [2003] argue that developing and implementing security policies, as well as a risk assessment, can minimize the number of security vulnerabilities. A high-level approach to implementing security policies through information security responsibilities, management accountability policy, and other baseline access control security policies is proposed by Ward & Smith [2002], clarifying principles such as defence-in-depth, separation of duties, need-to-know, and dual control, derived from ISO 17799. Janczewski & Portougal [2000] discuss the use of fuzzy security clearance modelling for ensuring the ‘need-to-know’ principle. Janczewski & Shi [2002] propose a set of information security baselines for healthcare organizations in New Zealand, based on AS/NZS 4444. The framework includes an overall baseline assessment, risk analysis, specific policy development, measure implementation, as well as monitoring and reporting action. Lee & Lee [2002] propose a holistic model, based on the Theory of Planned Behaviour, which assumes that behavioural intention is a key factor in predicting a person’s behaviour.
The Holistic Security Requirement Engineering (HSRE) approach for e-commerce [Zuccato, 2004], which is iterative and incremental, defines security requirements with respect to risks, business processes, stakeholder and environmental demands. SALSA [Sherwood, 1996] is a generic five-layered model for the development of a security architecture which is tuned to the business requirements of the enterprise. FARES (Forensic Analysis of Risks in Enterprise Systems) [2003] [2004] is an iterative process concerned with a holistic approach to risk analysis and risk management: it applies an impact and vulnerability analysis in which a threat impact is evaluated against classified threats defined in the Common Criteria Profiling Knowledge Base. The role of electronic forensics within an overall security policy and strategy is addressed by Wolfe et al [2003] as one part of a holistic view of protecting the assets, integrity, reputation, continuity and operation of any given organization. Hong et al [2003], in order to overcome the limitations of current information security management theories, proposed an integrated theory in which information security, internal control and contingency management are expressed in the form of functions.

3.1.12 Discussion on Information Systems security

Generally speaking, information security requirements correspond to specific security policies; each security policy is materialized through a specific security service, which in turn is implemented using one or more security mechanisms as countermeasures against specific security attacks (Figure 3-4). The advantages and disadvantages of the current security approaches, including top-down, bottom-up and the hybrid ones, are presented in Table 3-1.

[Figure: UML diagram. A security requirement corresponds to a security policy; a security policy is materialized through a security service; a security service is implemented by security mechanisms; security mechanisms protect against security attacks; security attacks are targeted to communication channels.]

Figure 3-4: Security in distributed information systems
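The chain of relationships in Figure 3-4 can be illustrated with a small data model. The following is a minimal sketch only, assuming one policy per requirement and one service per policy; the class names, field names and example values are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class SecurityMechanism:
    name: str
    # Names of the attacks this mechanism protects against (illustrative).
    protects_against: Set[str] = field(default_factory=set)


@dataclass
class SecurityService:
    name: str
    # A service is implemented by one or more mechanisms.
    mechanisms: List[SecurityMechanism] = field(default_factory=list)


@dataclass
class SecurityPolicy:
    name: str
    service: SecurityService  # the policy is materialized through this service


@dataclass
class SecurityRequirement:
    name: str
    policy: SecurityPolicy  # the requirement corresponds to this policy


def attacks_countered(requirement: SecurityRequirement) -> Set[str]:
    """Follow requirement -> policy -> service -> mechanisms -> attacks."""
    countered: Set[str] = set()
    for mechanism in requirement.policy.service.mechanisms:
        countered |= mechanism.protects_against
    return countered


# Hypothetical example values.
encryption = SecurityMechanism("encryption", {"eavesdropping"})
mac = SecurityMechanism("message authentication code", {"tampering"})
channel_service = SecurityService("secure channel", [encryption, mac])
policy = SecurityPolicy("protect messages in transit", channel_service)
requirement = SecurityRequirement("confidential and intact messages", policy)

print(sorted(attacks_countered(requirement)))  # ['eavesdropping', 'tampering']
```

Navigating the chain from the requirement down to the attacks makes it visible when a requirement is backed by no mechanism at all, which is one way of reading the figure.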


| | Top-down | Bottom-up | Hybrid |
|---|---|---|---|
| Advantages | Simplicity; low cost; no training is required to use the method | Security controls are taken on identified system security risks; the system is assessed against ‘industry best practice’; use of state-of-the-art technical solutions, e.g. security mechanisms; easy to audit | Integration of the basic characteristics of top-down and bottom-up approaches; raise security awareness; some of them are enforced during the system design |
| Disadvantages | Local solution; they leave the choice of control to the user; are generally harder to audit | Local solution; subjective; there is always a possibility that not all the risks of the system are identified; need for people with special knowledge and skills | Characterized by their locality |

Table 3-1: The advantages and disadvantages of current security approaches

Security across distributed systems is based on higher-order activities that should be handled using higher-order logic. Subchapters 3.2 – 3.7 review and analyze the security concepts, such as policies, services and mechanisms, the last usually based on cryptography. Most of them are partial solutions, as they deal with either the needed controls (addressing security management on a top-down basis) or security measures (in that case, ad hoc, locally applied security mechanisms, usually as a software solution, following a bottom-up approach). All these high-level security aspects should be dealt with under the umbrella of a complete, layered, holistic architecture based on higher-order logic, in order to ensure secure, transparent distributed computations and to enhance the availability of the system’s services. The architecture should be complete, in order to include all the state-of-the-art security services and to be extensible to future ones; holistic, in order to meet new trends and to ensure that everything is in balance; and based on higher-order logic, in order to provide a global solution for handling interoperability and data integrity, issues that are strongly related to security. Thus, processes and channels will be secured and resources will be protected in order to achieve data sharing transparently between system components. The UML diagrams presented here are believed to be original unless clearly stated otherwise. UML is used because it is a notation, well known to software designers, that enables us to describe and compare the various approaches.


3.2 Cryptography issues

Modern cryptography is concerned with the construction of systems that are robust against malicious attempts to make them deviate from their prescribed functionality, attempts that mainly target privacy and secrecy. Encryption and signature schemes are the most basic applications of cryptography, using tools such as computational difficulty, pseudo-randomness, or zero-knowledge proofs [Goldreich, 2003]. Their aim is to enhance confidentiality and to provide secret and reliable communication over networks. Encryption schemes are used to ensure the integrity and secrecy (or privacy) of the actual information being communicated (e.g. to avoid eavesdropping and request tampering); signature schemes are used to ensure the reliability and authenticity of the actual information being communicated, usually between a pair of principals. Both kinds of scheme consist of efficient algorithms: key generation together with encryption and decryption for the former, and key generation together with signing and verification for the latter. Encryption algorithms use keys to transform data in a manner that can only be reversed with knowledge of the corresponding decryption key. Encryption can be applied on the network either link by link or end to end. In the first case the platform will be more secure, but performance may suffer because of the multiple encryption layers. Link-by-link encryption is best for system-initiated encryption, as in the CORBA Security service (see Appendix E), while end-to-end encryption is preferred for encryption initiated at the application level.

3.2.1 Shared and public-key cryptosystems

There is a fundamental distinction between cryptographic systems, based on whether or not the encryption and decryption keys are the same. In a symmetric cryptosystem, also referred to as a secret-key or shared-key system, the same key is used to encrypt and decrypt the message. Authentication protocols based on a shared secret key are also known as challenge-response protocols. Examples of symmetric algorithms are DES [National Bureau of Standards, 1977], TEA [Wheeler and Needham, 1994], IDEA [Lai, 1992] as a successor to DES, and AES, standardized by NIST. In an asymmetric cryptosystem, or public-key system, the keys for encryption and decryption are different, but together form a unique pair. One of the keys is kept private, and the other is made public. The basis for all public-key schemes is the existence of ‘trap-door functions’ over large numbers: one-way functions with a secret exit, which are easy to compute in one direction but infeasible to invert unless the secret is known. RSA [Rivest et al., 1978] is the best-known public-key cryptosystem. Elliptic curve algorithms are another example of public-key algorithms.
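The trap-door idea behind RSA can be shown with textbook-sized numbers. This is a toy sketch only: the primes, exponent and message below are illustrative assumptions, far too small to be secure, and no padding is applied. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
# Toy RSA key pair built from two small primes (illustrative values only).
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)    # easy to compute only if p and q are known
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent: the "secret exit" of the trap door


def encrypt(message: int) -> int:
    """Easy direction: anyone holding the public key (e, n) can do this."""
    return pow(message, e, n)


def decrypt(ciphertext: int) -> int:
    """Inverse direction: feasible only with the private exponent d."""
    return pow(ciphertext, d, n)


message = 65
ciphertext = encrypt(message)
print(n, d, ciphertext)     # 3233 2753 2790
print(decrypt(ciphertext))  # 65
```

Recovering `d` from `(e, n)` alone requires factoring `n`, which is trivial here but is believed to be infeasible for moduli of realistic size.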

3.2.2 Digital signatures

Digital signing and digest (or secure hash) functions are two techniques for signing documents digitally in order to ensure message integrity. A hash function is a one-way function that takes a message of arbitrary length as input and produces a bit string of fixed length as output. Digital signatures can be constructed using public keys (e.g. RSA), secret keys (message authentication codes – MACs), and discrete logarithm-based schemes (including elliptic curve-based schemes). Widely used digest functions are the MD4 and MD5 algorithms, as well as the SHA algorithm (based on MD4). One-time signatures (OTS), based on an underlying one-way function, offer an alternative to public key-based digital signatures. Bicakci et al [2003] have developed a methodology for evaluating OTS methods and present optimal OTS techniques for a single OTS or a tree with many OTSs. A proxy signature scheme based on discrete logarithms [Li et al., 2003] was designed to withstand equation and forgery attacks.
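The two properties described above, a fixed-length digest and a secret-key MAC, can be sketched with the Python standard library. SHA-256 is used here as a stand-in for the digest functions named in the text; the key and messages are hypothetical.

```python
import hashlib
import hmac


def digest(message: bytes) -> str:
    """Fixed-length digest of an arbitrary-length message (SHA-256 here)."""
    return hashlib.sha256(message).hexdigest()


def make_mac(key: bytes, message: bytes) -> bytes:
    """Message authentication code computed with a shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_mac(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the MAC and compare in constant time."""
    return hmac.compare_digest(make_mac(key, message), tag)


short_msg = b"pay 10"
long_msg = b"pay 10" * 1000
# Both digests have the same length: 256 bits, i.e. 64 hex characters.
print(len(digest(short_msg)), len(digest(long_msg)))  # 64 64

key = b"shared-secret-key"  # hypothetical shared key
tag = make_mac(key, b"pay 10")
print(verify_mac(key, b"pay 10", tag))    # True
print(verify_mac(key, b"pay 1000", tag))  # False: the message was altered
```

A MAC only convinces parties that hold the shared key; a signature that third parties can verify needs a public-key or one-time signature scheme instead.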

3.2.3 Authentication

In DIS there is a need to authenticate client requests as well as to protect the content of requests and replies with the use of digital signatures and, optionally, the encryption of data. The basic authentication technique is to include in a message an encrypted portion that contains enough of the contents of the message to guarantee its authenticity. The original form of the message that is sent is called plain text and the encrypted form is referred to as the cipher text. User authentication based on the use of passwords is the basic security mechanism for remote login systems. Username/password combinations for user authentication have a variety of recognized weaknesses. Table 3-2 shows the advantages and disadvantages of this approach. A survey [Furnell et al., 2004] and a comparative study [Irakleous et al., 2002] revealed that users are positive about using alternative authentication techniques such as cognitive questions and image-based PINs. SAS, OSPA, and ROSI [Chien and Jan, 2003] are examples of password-based authentication protocols for distributed systems with low transmission cost.


| Advantages | Disadvantages |
|---|---|
| The most convenient and widely adopted method for user authentication | Very prone to attacks such as: replay; password search; stolen-verifier; DoS attacks |
| Any compromise of a user password does not compromise any earlier session keys | |
| Low transmission cost | |

Table 3-2: password-based authentication
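The replay weakness listed in Table 3-2 is commonly countered with a challenge-response exchange over a shared secret. The sketch below is a simplified illustration, not one of the cited protocols (SAS, OSPA, ROSI); the class and function names are hypothetical, and the shared key stands in for a password-derived secret.

```python
import hashlib
import hmac
import secrets


class Server:
    """Issues one-time challenges and verifies the client's response."""

    def __init__(self, shared_key: bytes):
        self._key = shared_key
        self._pending = set()  # nonces issued but not yet answered

    def challenge(self) -> bytes:
        nonce = secrets.token_bytes(16)
        self._pending.add(nonce)
        return nonce

    def verify(self, nonce: bytes, response: bytes) -> bool:
        if nonce not in self._pending:
            return False  # unknown or already used nonce: treated as a replay
        self._pending.discard(nonce)
        expected = hmac.new(self._key, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


def respond(shared_key: bytes, nonce: bytes) -> bytes:
    """Client side: prove knowledge of the key without sending it."""
    return hmac.new(shared_key, nonce, hashlib.sha256).digest()


key = secrets.token_bytes(32)  # hypothetical shared secret
server = Server(key)
nonce = server.challenge()
answer = respond(key, nonce)

first_try = server.verify(nonce, answer)
replayed = server.verify(nonce, answer)  # same message captured and resent
print(first_try, replayed)  # True False
```

Because each nonce is accepted once, an eavesdropper who records a valid response cannot reuse it. The sketch does not address the other weaknesses in the table, such as password search or stolen-verifier attacks.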

Authentication can also be based on the use of digital certificates, a method that attempts to provide a solution to the problem of trust in security services (by enhancing confidentiality). Organizations such as Verisign, which are called Certification Authorities (CAs), Key Distribution Centres (KDCs), or Authentication Services (ASs), provide digital certificates for individuals and businesses. These certificates allow users to authenticate entities with which they exchange information, as well as to encrypt data so that only the intended recipients are able to read the data, thus maintaining confidentiality [Sedov, 2000]. The most widely used standard format for certificates is the X.509 standard [ITU-T, 2000] (see Appendix A). The authentication service can be distributed logically or physically. A hybrid solution, as followed in Kerberos [MIT, 1994] [Neuman et al., 2005], has the system distributed into authentication domains called realms, where each authentication server is in charge of a domain. Kerberos has become an industry standard for securing intranet servers against unauthorized access and impostor attacks. An important issue concerning certificates is their longevity. Mechanisms to revoke a certificate, by making it publicly known that the certificate is no longer valid, include Certificate Revocation Lists (CRLs), published regularly by the certification authority, as well as other methods that restrict the lifetime of a certificate.
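The two longevity controls mentioned above, a bounded lifetime and a revocation list, can be sketched as a single acceptance check. This is a simplified illustration: the `Certificate` fields are a hypothetical subset of what an X.509 certificate carries, the CRL is modelled as a plain set of serial numbers, and verification of the issuer's signature is deliberately omitted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Set


@dataclass(frozen=True)
class Certificate:
    serial: int
    subject: str
    not_before: datetime
    not_after: datetime


def is_acceptable(cert: Certificate, revoked_serials: Set[int], now: datetime) -> bool:
    """Accept only certificates that are within their lifetime and not on the CRL."""
    within_lifetime = cert.not_before <= now <= cert.not_after
    return within_lifetime and cert.serial not in revoked_serials


issued = datetime(2008, 1, 1)
cert = Certificate(
    serial=1001,
    subject="server.example.org",  # hypothetical subject name
    not_before=issued,
    not_after=issued + timedelta(days=365),
)
crl = {2002}  # serial numbers published as revoked (illustrative)
now = datetime(2008, 6, 1)

print(is_acceptable(cert, crl, now))  # True
crl.add(1001)                         # the authority revokes the certificate
print(is_acceptable(cert, crl, now))  # False
```

A short lifetime limits how long a compromised certificate remains usable even if the verifier never sees an updated CRL.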

3.2.3.1 Biometrics

Biometric methods have been used in cryptography for authentication purposes. Biometrics is the process of automatically recognizing a person using distinguishing traits. There are different types of biometrics, such as fingerprint, hand geometry, iris scanning views, voice verification, and signature verification [Harris and Yen, 2002].

According to Groves [2002], biometrics, although its use can achieve cost reductions, is likely to struggle to gain mass deployment in the security industry, due to problems with legal systems and with user acceptance of biometrics as an advantageous and valuable business tool.

3.2.3.2 Steganography, smart cards, and other authentication schemes

Steganographic methods hide the encrypted message in cover carriers so that it cannot be seen while it is transmitted on public communication channels such as a computer network. Smart cards have been widely adopted in many e-commerce applications, network security protocols, and also remote authentication schemes, due to their low cost, portability, efficiency and cryptographic capacity (Table 3-3).

| Advantages | Disadvantages |
|---|---|
| Low cost | Hardware endurance |
| Portability | Maintenance of smart card databases |
| Efficiency | |
| Cryptographic capacity | |

Table 3-3: advantages and disadvantages of smart cards

3.2.4 Secure channels

Encryption and authentication are used to build secure channels as a service layer on top of existing communication services. Examples are Virtual Private Networks (VPNs), the Secure Sockets Layer (SSL) protocol [Netscape, 1996] and the Transport Layer Security (TLS) protocol [ITF, 1998], which is based on SSL and is the de facto standard security protocol in most e-commerce [Harding, 2003]. When a secure channel is established between a pair of processes, each of them knows reliably the identity of the principal on whose behalf the other process is executing. In addition, each message includes a physical or logical timestamp to prevent messages from being replayed or reordered. Secure channels offer performance benefits, enabling multiple requests to be handled without a need for repeated checking of authenticity. Besides authentication, a secure channel should also provide a guarantee of confidentiality and message integrity. Confidentiality, established using encryption, ensures that messages cannot be intercepted and read by eavesdroppers. Message integrity means that messages are protected against modification (protection against tampering). A common practice is to use secret-key cryptography by means of session keys. A session key is generally used only for as long as the secure channel exists. When the channel is closed, its associated session key is discarded (or rather, securely destroyed).
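The integrity and ordering guarantees of a secure channel can be sketched with a session key, a MAC and a logical timestamp. This is a minimal illustration under stated assumptions: a sequence number plays the role of the logical timestamp, the `ChannelEnd` class is hypothetical, the channel is one-directional, and encryption for confidentiality is left out because the Python standard library provides no cipher.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

Frame = Tuple[int, bytes, bytes]  # (sequence number, payload, MAC tag)


class ChannelEnd:
    """One end of a one-directional channel protected by a session key."""

    def __init__(self, session_key: bytes):
        self._key = session_key
        self._send_seq = 0
        self._recv_seq = 0

    def _tag(self, seq: int, payload: bytes) -> bytes:
        # The MAC covers the sequence number as well as the payload.
        data = seq.to_bytes(8, "big") + payload
        return hmac.new(self._key, data, hashlib.sha256).digest()

    def send(self, payload: bytes) -> Frame:
        frame = (self._send_seq, payload, self._tag(self._send_seq, payload))
        self._send_seq += 1
        return frame

    def receive(self, frame: Frame) -> bytes:
        seq, payload, tag = frame
        if seq != self._recv_seq:
            raise ValueError("replayed or reordered message")
        if not hmac.compare_digest(tag, self._tag(seq, payload)):
            raise ValueError("message was modified")
        self._recv_seq += 1
        return payload


session_key = secrets.token_bytes(32)  # discarded when the channel closes
sender, receiver = ChannelEnd(session_key), ChannelEnd(session_key)

frame = sender.send(b"debit 10")
print(receiver.receive(frame))  # b'debit 10'

try:
    receiver.receive(frame)     # the same frame delivered a second time
except ValueError as error:
    print(error)                # replayed or reordered message
```

Because the key is established once per channel, each subsequent message costs only one MAC computation, which is the performance benefit referred to in the text.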

3.2.5 Key management

3.2.5.1 Key distribution

Establishing and distributing keys can be achieved using symmetric or asymmetric cryptosystems. Public key distribution usually takes place by means of a Public-key Manager (PKM), or by the use of public-key certificates. In a symmetric cryptosystem, the initial shared secret key must be communicated along a secure channel. A trusted Key Distribution Centre (KDC) is presumed to generate, store and distribute session keys. Each principal shares a secret key with the KDC. A KDC can be implemented as a centralized, a partially distributed, or a fully distributed service. Major disadvantages of the centralized approach include performance degradation due to the server bottleneck, and poor reliability and scalability. A partially distributed service, in which the system is divided into regions with one KDC per region, offers intermediate performance, scalability and reliability. A fully distributed service is highly scalable: each KDC maintains a table of the secret keys of the other KDCs and communicates with them via secure channels.
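The role of a centralized KDC can be sketched as follows. This is a toy illustration only: the `wrap` and `unwrap` helpers use an XOR keystream derived from SHA-256 as a stand-in for a real cipher and provide no authentication, and all names are hypothetical.

```python
import hashlib
import secrets
from typing import Dict, Tuple


def _keystream(shared_key: bytes, nonce: bytes) -> bytes:
    # Toy 32-byte keystream; a real system would use a vetted cipher.
    return hashlib.sha256(shared_key + nonce).digest()


def wrap(shared_key: bytes, session_key: bytes) -> bytes:
    """Hide a 32-byte session key under a principal's long-term key."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(shared_key, nonce)
    return nonce + bytes(a ^ b for a, b in zip(session_key, stream))


def unwrap(shared_key: bytes, wrapped: bytes) -> bytes:
    nonce, body = wrapped[:16], wrapped[16:]
    stream = _keystream(shared_key, nonce)
    return bytes(a ^ b for a, b in zip(body, stream))


class KeyDistributionCentre:
    """Centralized KDC: shares one long-term secret key with each principal."""

    def __init__(self) -> None:
        self._principal_keys: Dict[str, bytes] = {}

    def register(self, principal: str) -> bytes:
        key = secrets.token_bytes(32)
        self._principal_keys[principal] = key
        return key  # assumed to reach the principal over a secure channel

    def issue_session_key(self, a: str, b: str) -> Tuple[bytes, bytes]:
        session_key = secrets.token_bytes(32)
        return (
            wrap(self._principal_keys[a], session_key),
            wrap(self._principal_keys[b], session_key),
        )


kdc = KeyDistributionCentre()
alice_key = kdc.register("alice")
bob_key = kdc.register("bob")

for_alice, for_bob = kdc.issue_session_key("alice", "bob")
# Each principal recovers the same session key with its own long-term key.
print(unwrap(alice_key, for_alice) == unwrap(bob_key, for_bob))  # True
```

Every session in this sketch passes through the single `KeyDistributionCentre` object, which is the bottleneck and single point of failure attributed to the centralized approach.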

3.2.5.2 Key recovery and key escrow

Several key recovery schemes have been developed having properties such as compliance, enforceability, traceability, confidentiality (end-to-end), and authorized accessibility to session keys. The properties of key recovery, key escrow, and trusted third party encryption were examined by Anderson [2004] and previously by Abelson et al. [1998], in order to outline the technical risks, costs, and implications of deploying systems that provide government access to encryption keys. The results show that key recovery systems are inherently less secure, more costly, and more difficult to use than similar systems without a recovery feature.


3.2.6 Discussion on cryptography

Security policies are enforced with the help of security mechanisms. Digital cryptography provides the basis for most computer security mechanisms. Cryptography is used for the secrecy and identity of the communicating parties (using encryption and checksums), the authentication of pairs of principals, and the creation of digital signatures. Encryption algorithms are employed using shared secret keys or public/private key pairs. A comparison of the different shared-key distribution services is given in Table 3-4. The advantages and disadvantages of the PKI infrastructure are presented in Table 3-5.

| | Performance | Reliability | Scalability |
|---|---|---|---|
| Centralized | Poor | Poor | Poor |
| Partially distributed | Medium | Medium | Medium |
| Fully distributed | High | Medium | High |

Table 3-4: comparison of different shared-key distribution services

| Advantages | Disadvantages |
|---|---|
| No need for a secure key-distribution mechanism | Lack of agreement upon standards for the parties involved |
| Secure verification of sender and recipient | Extremely costly to add PKI infrastructure to a security procedure |
| Secure transfer of online data | Public-key algorithms are computationally more complex than secret-key algorithms |
| Provides a legal basis for online transactions | Replacement of authentication is relatively expensive |
| Reduction of costs after installation | Holders of the digital identities must maintain the privacy of their IDs |
| | Software needs to be utilized |
| | Need for methods to restrict the lifetime of a certificate and revoke it when it is not valid |
| | Acceptable response time usually with a consistent number of users |
| | Lack of access control management |

Table 3-5: Advantages and disadvantages of PKI infrastructure


3.3 Access control

The goal of access control is to counter the threat of unauthorized operations involving computer or communication systems (e.g. to prevent masquerading attacks). The effort put into measures for preventing unauthorized access is proportional to the threat. A careful analysis of security threats and the risks associated with them is essential to work out an acceptable security policy with respect to overall database access control. Access control is performed at the application, host, and network level [Venter and Eloff, 2003]. Security protocols such as IPSec, ISAKMP and IKE, which operate at the application and network levels, use a standard procedure for regulating data transmission between computers or applications, in order to safeguard sensitive information. They have proven to be open to several interpretations and implementations, which causes interoperability problems [Dunbar, 2001]. The mechanisms that prevent unauthorized access are referred to as protection mechanisms. They include password-based login procedures and screening logic, as well as logging and monitoring controls that monitor and analyse stored information in an attempt to detect the presence of unwanted intruders. Administration can be centralized, allowing very fine-grained control over resources, or decentralized, which is more scalable. The centralized approach can become difficult when many modifications are needed. The decentralized approach must be carefully thought out to avoid a lack of consistency among different administration domains.

3.3.1 Access rights

All resources in a computer system can be divided into subjects (active) and objects (passive). A subject (or principal) may be a user, a group of users, a service, or an application. Subjects have different levels of access to certain objects in a system. An object may be a file, a directory, a printer, or a process. The way a subject acts on an object is called the access privilege or right. Access privileges can allow a subject to manipulate objects (read, write, execute, delete, modify, etc.) or to modify the access permissions (transfer ownership, grant and revoke privileges, etc.). The credentials are a set of evidence provided by a subject when requesting access to an object. Access control ensures that a subject has sufficient rights to perform certain actions on an object. In addition, it may include object management issues, such as creating, renaming, or deleting objects. Access control is based on valid authorization. In its simplest form, for each pair (subject, object), the access control policy assigns a collection of access rights. The assignment can be explicit (positive authorization) or implicit (negative authorization) [Pieprzyk et al., 2003].
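In its simplest form, the policy described above is a mapping from (subject, object) pairs to sets of rights. The sketch below is a minimal illustration with hypothetical subjects and objects; it assumes that rights are granted explicitly and that anything not granted is denied, which is one possible reading of the text.

```python
from typing import Dict, Set, Tuple

# (subject, object) -> rights explicitly granted to that pair.
policy: Dict[Tuple[str, str], Set[str]] = {
    ("alice", "report.txt"): {"read", "write"},
    ("bob", "report.txt"): {"read"},
}


def is_authorized(subject: str, obj: str, right: str) -> bool:
    """A request succeeds only if the right was granted to this pair."""
    return right in policy.get((subject, obj), set())


print(is_authorized("alice", "report.txt", "write"))  # True
print(is_authorized("bob", "report.txt", "write"))    # False
print(is_authorized("carol", "report.txt", "read"))   # False: no entry at all
```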

3.3.2 Access control classification

Mandatory access control [TCSEC, 1985], which defines user access to system resources using the user security clearance and the security classification of the resource, was primarily used for handling multilevel and multilateral security. Hierarchical access control is closely related to multilevel security. Discretionary access control specifies a user’s privileges relating to different system resources. In the context of Role-based access control (RBAC), roles represent functions within a given organization, while authorizations are granted to roles instead of to single users, and are strictly related to the data that are needed by a user in order to exercise the functions of the role.
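The indirection that RBAC introduces, with authorizations attached to roles rather than to users, can be sketched with two mappings. The roles, users and permissions below are hypothetical examples, and role hierarchies and constraints are not modelled.

```python
from typing import Dict, Set, Tuple

Permission = Tuple[str, str]  # (resource, action)

# Authorizations are granted to roles, not to individual users.
role_permissions: Dict[str, Set[Permission]] = {
    "nurse": {("patient_record", "read")},
    "doctor": {("patient_record", "read"), ("patient_record", "write")},
}

# Users acquire permissions only through the roles they hold.
user_roles: Dict[str, Set[str]] = {
    "alice": {"doctor"},
    "bob": {"nurse"},
}


def rbac_allows(user: str, resource: str, action: str) -> bool:
    """True if any role held by the user carries the requested permission."""
    return any(
        (resource, action) in role_permissions.get(role, set())
        for role in user_roles.get(user, set())
    )


print(rbac_allows("alice", "patient_record", "write"))  # True
print(rbac_allows("bob", "patient_record", "write"))    # False
```

When a user changes function within the organization, only the `user_roles` entry changes; the permissions of each role stay as they are.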

3.3.2.1 Hierarchical access control

In the case of hierarchical access control, users and their own information items, structured in a user hierarchy, are organized into a number of disjoint, partially ordered security classes, and each user is assigned to a security class called the user’s security clearance. A user in a higher class can derive the cryptographic keys of users in a lower class; the opposite is not allowed. Several hierarchical access control models have been proposed based on the Chinese Remainder Theorem and symmetric algorithms, such as Huang & Chang’s scheme [2004], [Shen and Chen, 2002], Chen & Chung’s scheme [2002] and [Bertino et al., 2003]. The problem of access control in a hierarchy for classified data management can also be modelled as a directed acyclic graph [Lin, 2001].
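The one-way derivation property, where a higher class can compute the keys of lower classes but not the reverse, can be illustrated with a keyed one-way function. This sketch is not one of the cited Chinese Remainder Theorem schemes: it swaps in a simpler HMAC-based derivation over a tree-shaped hierarchy, and the class names and the all-zero root key are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, List, Optional

# Hypothetical hierarchy: each class lists the classes directly below it.
children: Dict[str, List[str]] = {
    "director": ["manager"],
    "manager": ["clerk"],
    "clerk": [],
}


def derive_child_key(parent_key: bytes, child: str) -> bytes:
    """One-way step: easy to compute downward, infeasible to invert."""
    return hmac.new(parent_key, child.encode(), hashlib.sha256).digest()


def derive_key(start_key: bytes, start_class: str, target_class: str) -> Optional[bytes]:
    """Walk downward from start_class; return None if target is not below it."""
    if start_class == target_class:
        return start_key
    for child in children[start_class]:
        found = derive_key(derive_child_key(start_key, child), child, target_class)
        if found is not None:
            return found
    return None


director_key = bytes(32)  # placeholder root key, for illustration only
manager_key = derive_key(director_key, "director", "manager")
clerk_key = derive_key(director_key, "director", "clerk")

# The manager reaches the same clerk key from its own key.
print(derive_key(manager_key, "manager", "clerk") == clerk_key)  # True
# The clerk has no path upward, so no key can be derived.
print(derive_key(clerk_key, "clerk", "director"))                # None
```

In a general directed acyclic graph a class may have several parents, in which case a path-based derivation like this one would need extra public information to make all paths yield the same key.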

3.3.2.2 Multilevel security

In multilevel security, classifications are labels that run upward from ‘unclassified’ through ‘confidential’ and ‘secret’ to ‘top secret’. The Bell-LaPadula model (BLP) [1976] deals with information flow control. It enforces two properties. The ‘simple security property’, also known as ‘no read up’ (NRU), means that no process reads data at a higher level. The *-property, also known as ‘no write down’ (NWD), means that no process may write data to a lower level. System Z [McLean, 1985] was defined as a BLP model with the added feature that a user is able to temporarily reclassify any file from ‘high’ to ‘low’, with the system administrator’s permission. The IBM z/OS operating system [2004] implements multilevel security based on system Z. The BLP model does not deal with the creation or destruction of subjects or objects, something that was tackled by the HRU model [Harrison et al., 1976]. The Biba model [1977], known as the ‘BLP upside-down model’, deals with integrity, ignoring confidentiality entirely. The Clark-Wilson model [1987] also deals with integrity, in the context of distributed transactions, based on the principle of ‘separation of duties’. Multi-level secure (MLS) systems are used mainly in military applications.
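The two BLP properties reduce to comparisons between the level of the subject and the level of the object. The following is a minimal sketch using the four labels named in the text; the numeric ranks and function names are assumptions made for illustration, and categories or compartments are not modelled.

```python
# Classification labels ordered from lowest to highest.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top secret": 3}


def can_read(subject_level: str, object_level: str) -> bool:
    """Simple security property: no read up."""
    return LEVELS[subject_level] >= LEVELS[object_level]


def can_write(subject_level: str, object_level: str) -> bool:
    """*-property: no write down."""
    return LEVELS[subject_level] <= LEVELS[object_level]


print(can_read("secret", "confidential"))   # True: reading down is allowed
print(can_read("confidential", "secret"))   # False: no read up
print(can_write("secret", "confidential"))  # False: no write down
print(can_write("confidential", "secret"))  # True: writing up is allowed
```

Together the two checks ensure that information can only flow upward through the levels. Swapping the two comparisons gives the integrity-oriented behaviour that the text attributes to the Biba model.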

3.3.2.3 Multilateral security

In multilateral security, the information flow-control boundaries are vertical rather than horizontal as in the BLP model, expressing for example hierarchies between departments. The Lattice model [Denning, 1976] uses a lattice of security labels, similar to the Bell-LaPadula model. The Chinese wall [Brewer and Nash, 1989] introduces the concept of ‘separation of duty’. The BMA model [Anderson, 1996] was introduced to handle multilateral security in medical information systems, where data protection laws restrict the dissemination of personal data. Inference control and privacy are further issues that multilateral security systems try to address. Clauss & Kohntopp [2001] provided two different types of classification for multilateral security technologies. With respect to the number of involved parties, they are classified as unilateral, bilateral, trilateral, and multilateral.

3.3.2.4 Access Control Lists and Capabilities

A common approach to modelling access rights is to construct an access control matrix, with a row for each subject and a column for each object. Another approach is to have each object maintain a list of the access rights of the subjects that want to access it, thus implementing an Access Control List (ACL). The matrix in this case is distributed column-wise across all objects, and empty entries are left out. In the case where the matrix is split up by row, each subject is associated with a list of the capabilities it holds for each object. Access control lists are suited to environments where protection is data-oriented; they are less suitable where the user population is large and constantly changing, or where users want to be able to delegate their authority to run a particular program to another user for some set period of time. Common delegation issues include authenticity, integrity, lifetime of delegated rights, and the addition and revocation of rights. Capabilities are more efficient for runtime security checking and delegation. Public certificates are another form of capability.
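The relation between the matrix, ACLs and capabilities can be shown with a sparse matrix that is sliced by column or by row. This is a minimal sketch; the subjects, objects and rights are hypothetical.

```python
# Sparse access control matrix: (subject, object) -> set of rights.
# Empty entries are simply absent.
MATRIX = {
    ("alice", "report.txt"): {"read", "write"},
    ("bob", "report.txt"): {"read"},
    ("alice", "printer"): {"use"},
}

def acl(obj: str) -> dict:
    """Column of the matrix: the access control list stored with one object."""
    return {s: rights for (s, o), rights in MATRIX.items() if o == obj}

def capabilities(subject: str) -> dict:
    """Row of the matrix: the capability list held by one subject."""
    return {o: rights for (s, o), rights in MATRIX.items() if s == subject}

def allowed(subject: str, obj: str, right: str) -> bool:
    """Reference check against the full matrix."""
    return right in MATRIX.get((subject, obj), set())
```

Here `acl("report.txt")` lists both users with their rights, while `capabilities("alice")` lists every object Alice can reach, which is why capabilities are convenient for checking and delegating at run time.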

3.3.2.5 Security policy domains

One general way of reducing the size of ACLs is to make use of security policy domains. A security policy domain includes all components subject to a single security policy. Access to a security policy domain can be based upon the domain’s classification level [Stephenson, 2004], which determines who or what processes can access it and reflects the type, criticality and sensitivity of the data that it contains. The access matrix can then be constructed with a column for each object and a row for every domain. Security policy domains can be implemented as hierarchical groups, or roles [Matthews, 2000]. A group is a list of subjects; a role is a fixed set of access permissions that one or more subjects may assume for a period of time using some defined procedure [Anderson, 2001].

3.3.2.6 Role-based Access control

The structure of an organization involves users who perform their assigned tasks according to their job positions or business roles. Some tasks compose business processes, which have, under permission, special access control requirements. A number of business rules are involved in various business activities and access controls. RBAC models emerged to express this reality. RBAC models can be flat or symmetric. Other relevant extensions are related to role hierarchies and role constraints [1996], [Bertino, 2003] (see Appendix B). Role hierarchies are essential in reducing the complexity of role and authorization specification because, following the classical inheritance mechanism of object-oriented programming, they support the re-use and specialization of role definitions and authorizations. Constraints are relevant for better modelling of application-dependent restrictions on the use of roles by users. Type enforcement in RBAC models can be used for handling integrity. There is a difficulty in assigning permissions suitable for roles. For that reason, several variations of RBAC models have been developed, such as the RBAC96, Task-RBAC (TRBAC), Administrative-RBAC (ARBAC), and Object-RBAC (ORBAC) models [Oh and Park, 2003]. A symmetric RBAC with constraints, based on the work of Moona et al. [2004] and expressed in a UML diagram, is presented in Figure 3-5.

[UML class diagram. Classes shown: User, Role, Permission, Session and Constraint, with User constraint, Session constraint, Role constraint and Permission constraint. Association labels: inherits, has, instantiates, refers to. Permission-related labels: Disjoint permission (DP), Conflicting permissions (CP), Prerequisite permission (PP), Permission assigned to single role (PASR).]

Figure 3-5: Symmetric RBAC with constraints, based on RBAC 96
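The interplay of role hierarchies, sessions and constraints can be sketched in a few lines. This is an illustration only and not the model of Figure 3-5: the roles, permissions and the single mutual-exclusion constraint are hypothetical, and only role inheritance and one kind of session constraint are covered.

```python
from dataclasses import dataclass, field

# Hypothetical policy data.
ROLE_PERMS = {
    "employee": {"read_report"},
    "clerk": {"enter_payment"},
    "manager": {"approve_payment"},
}
JUNIORS = {"clerk": {"employee"}, "manager": {"employee"}}  # senior -> inherited roles
USER_ROLES = {"alice": {"clerk", "manager"}, "bob": {"employee"}}
EXCLUSIVE = [{"clerk", "manager"}]  # roles that may not be active in one session

def permissions(role: str) -> set:
    """Permissions of a role, including those inherited from junior roles."""
    perms = set(ROLE_PERMS.get(role, set()))
    for junior in JUNIORS.get(role, set()):
        perms |= permissions(junior)
    return perms

@dataclass
class Session:
    user: str
    active: set = field(default_factory=set)

    def activate(self, role: str) -> None:
        """Activate a role if it is assigned and violates no constraint."""
        if role not in USER_ROLES.get(self.user, set()):
            raise PermissionError(f"{role} is not assigned to {self.user}")
        for group in EXCLUSIVE:
            if role in group and self.active & (group - {role}):
                raise PermissionError(f"{role} conflicts with an active role")
        self.active.add(role)

    def can(self, permission: str) -> bool:
        return any(permission in permissions(r) for r in self.active)
```

A session in which Alice activates `clerk` inherits `read_report` from `employee`, but a later attempt to activate `manager` in the same session is refused, which is one way of enforcing separation of duties.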

The UML diagram of the ORBAC model is presented in Figure 3-6.


[UML class diagram. Classes shown: User, Position role, Task role, Privilege and Constraint, with User constraint, Position role constraint and Task role constraint. Association labels: inherits, has.]

Figure 3-6: ORBAC based on RBAC96

[UML class diagram. Classes shown: User, Role, Permission, Session, Workflow Task and Constraint, with User constraint, Session constraint, Role constraint and Permission constraint. Task classes: Private task, Supervision task, Workflow-oriented task, Passive access task, Active access task, Approval for activity task. Association labels: inherits, has, instantiates, refers to. Permission-related labels: PASR, PP, CP, DP.]

Figure 3-7: The TRBAC model


In a variation of constrained RBAC, the Temporal Constrained RBAC, roles can be enabled in some periods and not enabled in others (see Appendix B). T-RBAC models are founded on the classification of tasks. The UML diagram of the T-RBAC model can be seen in Figure 3-7.
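The idea of roles that are enabled only in certain periods can be sketched as a time-window check. The role names and daily windows below are hypothetical, and real temporal RBAC models support richer periodic expressions than a single daily interval.

```python
from datetime import time

# Hypothetical daily enabling windows: role -> (start, end), end exclusive.
ENABLED = {
    "day_nurse": (time(8, 0), time(16, 0)),
    "night_nurse": (time(22, 0), time(6, 0)),  # window wraps past midnight
}

def role_enabled(role: str, now: time) -> bool:
    """True if `role` may be activated at clock time `now`."""
    start, end = ENABLED[role]
    if start <= end:
        return start <= now < end
    # The interval crosses midnight: enabled late in the day or early next day.
    return now >= start or now < end
```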

3.3.3 Multi-policy systems

Interoperability and security in multi-policy systems have been examined by Kuhnhauser [1999]. He introduces the concept of policy groups as an approach to securing domain interactions, based on the HRU access control calculus. A policy group combines a set of security policies with a set of policies that control inter-domain actions. A systemic framework for achieving interoperability when multiple security policies are employed has been proposed by Kokolakis & Kiountouzis [2000]. They developed a Metapolicy Development System (MDS), together with a policy framework and a meta-policy framework that serve as conceptual devices in the application of the MDS, through a policy repository implemented in Telos, an object-oriented knowledge representation language. The custodian model [Halfmann and Kuhnhauser, 1999] was introduced to support multi-policy systems and expressive security policies. A custodian is a capsule containing an executable code unit, compiled from a policy’s programming language representation. It links the policy to the entities within its associated security domain at runtime.

3.3.3.1 The ‘sandbox’

Another way of implementing access control is to use a software sandbox. Users may want to run code that they have downloaded from the Web, for example as an applet. Java supports this by providing a ‘sandbox’ for such code: a restricted environment in which it has no access to the local hard disk and is only allowed to communicate with the host it came from. An alternative is proof-carrying code, where code to be executed must carry with it a proof that it does not do anything that compromises the local security policy. Both of these are less general alternatives to an architecture that supports proper supervisor-level confinement [Anderson, 2001].

3.3.4 Firewalls

One of the most common problems with perimeter defences, e.g. the use of internal/external firewalls, is the incorrect configuration of external firewalls, which can result in inadequate protection. Firewalls are gateways that tightly control message traffic between private and public networks. They protect a trusted network from an untrusted network by filtering traffic according to a specified security policy. A firewall is implemented by a set of processes, usually working at different protocol levels. Primary types of firewalls include screening routers, proxy servers, and stateful inspectors [Desai et al., 2002]. Screening routers, also known as ‘packet-filtering gateways’, operate as routers and decide whether or not to pass a network packet based on the source and destination addresses contained in the packet’s header. They have low hardware costs and are relatively simple, but lack any user-level authentication, which is provided by proxy servers. Proxy servers also provide logging and accounting information, but an application-layer gateway is needed for each application. Stateful inspectors, specialized for viruses in audio or video packets, include intrusion detection techniques, but incur high data transfer delay as well as high maintenance activity. Most firewalls operate at the TCP/IP layer and are unable to detect attacks at the application level, such as SQL injection, cross-site scripting or other CGI-script attacks [Midian, 2003]. Firewalls are not particularly effective against DoS attacks. Personal firewalls are installed on a normal workstation and attempt to protect only that specific workstation from the rest of the hosts on the network or the Internet. As it is infeasible to examine and test each firewall for all possible problems, a taxonomy is needed to understand firewall vulnerabilities in the context of firewall operations. A firewall vulnerability is defined as an error made during firewall design, implementation or configuration, which can be exploited to attack the trusted network that the firewall is supposed to protect.
According to Kamara et al. [2003], the most common causes of firewall vulnerabilities include validation error, input, origin, target, authorization error, serialization/aliasing error, boundary checking error, domain error, weak/incorrect design error, or other error. The effects of firewall vulnerabilities include execution of code, change of target resource, access to target resource, and DoS attacks. A framework for understanding vulnerabilities in firewalls using a data flow model of firewall internals was proposed by Frantzen et al. [2001]. The model depicts data flow as a sequence of stages that starts with the receipt of a packet.
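The decision made by a screening router, as described earlier in this section, can be sketched as a first-match rule table over source and destination addresses. The rule set and the default-deny policy are assumptions made for illustration; the addresses come from the ranges reserved for documentation.

```python
from ipaddress import ip_address, ip_network

# Hypothetical rule table: (action, source network, destination network).
# Rules are evaluated in order and the first match wins.
RULES = [
    ("deny", ip_network("203.0.113.0/24"), ip_network("0.0.0.0/0")),
    ("allow", ip_network("0.0.0.0/0"), ip_network("192.0.2.10/32")),
]
DEFAULT_ACTION = "deny"  # anything not explicitly allowed is dropped

def filter_packet(src: str, dst: str) -> str:
    """Decide on a packet using only the addresses in its header."""
    source, destination = ip_address(src), ip_address(dst)
    for action, src_net, dst_net in RULES:
        if source in src_net and destination in dst_net:
            return action
    return DEFAULT_ACTION
```

Because the decision uses header fields only, such a filter cannot see who the user is or what the payload contains, which is the limitation noted above for screening routers.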


3.3.5 Virtual Private Networks (VPNs)

VPNs extend the firewall protection boundary beyond the local intranet by the use of cryptographically protected channels at the IP level. Cheung & Misic [2002] proposed a VPN protocol consisting of four layers, namely the VPN application layer, the convergence and control layer, the data protection layer, and the network layer. It is based on the architecture of SNMPv3 and can be implemented over different architectures, such as OSI, ATM and WDM optical networks.

3.3.6 Discussion on access control

All resources in a computer system can be divided into subjects (active) and objects (passive) and can be associated with an access control security component within the boundaries of a security policy domain. A security domain policy is applied to the latter for a specific computational environment. The classification of access control components for distributed environments is presented in the following UML diagram (Figure 3-8).


[UML class diagram. Classes shown: Access Control, Delegation, Non-repudiation, Auditing, Message Protection, Security Domain Policy, Access Control Matrix, Computational environment, Execution Unit, Security Policy Domain, Access Control Security Component, Identity-based security component (need-to-know), Access Control List (ACL), Capability, Access Right, Role-based security component, Role, Task, Rule-based security policy component, Multilevel security component, Multilateral security component, Computer Resource, Subject (or principal), User, Group of Users, Service, Process, Object, Application, File, Directory, Printer.]

Figure 3-8: Access control components classification for distributed environments


3.4 High-level security services

Authentication and access control are prevention systems that are implemented as the first line of defence. Unfortunately, they do not provide a means of detecting and reporting security events as they happen. Continuous monitoring of a network domain is necessary to ensure proper operation of the network by detecting possible service violations and attacks, usually based on edge-to-edge measurements of Service Level Agreement (SLA) parameters. Network monitors (or sensors) are capable of monitoring specific TCP flows and returning that information to the application for the purpose of performance debugging (e.g. to prevent replay attacks). Operating system sensors, on the other hand, perform real-time intrusion monitoring, detection and prevention of malicious activity by analysing kernel-level events and host logs. The threat of intrusion according to Sherif & Ayers [2003] should be taken very seriously because ‘the threats are real, everything is on the Net, firewalls and VPNs are not enough, the amount of new vulnerabilities is increasing, and hackers are getting smarter’. Intrusion Detection Systems (IDS) are implemented as the second line of defence. An IDS can be seen as a collection of detection modules, also called sensors, with unique attack recognition and response capabilities. Vulnerability scanners are a special case of intrusion detection. They use signatures for the vulnerabilities they can identify. Anti-virus scanners attempt to scan for viruses and functions before they can cause any damage, much in the same way as vulnerability scanners. Because no list of threats is likely to be exhaustive, auditing methods are used in security-sensitive applications to detect system violations and trace them back to the users responsible. Trace-back techniques are used for working out the suspected path of an attack.

3.4.1 Intrusion Detection

IDS provide proactive monitoring of system activity and apply automatic responses when suspicious, unwanted or illegal activities are detected. Intrusion detection is a security technology that attempts to identify and isolate computer system intrusions. A computer system intrusion is seen as any set of actions that attempt to compromise the integrity, confidentiality or availability of a resource. It exploits vulnerabilities and introduces external disturbances into information systems, thus causing faults in software and hardware components, which then lead to errors and failures of system performance. Security threats related to intrusion detection include interruption, interception, modification, and fabrication. The first is a threat to availability, the second to confidentiality and the last two to the integrity of the system. There are two primary IDS techniques, anomaly detection and misuse detection [McHugh, 2001]. The first is based on the assumption that an attack on a computer system will be noticeably different from normal system activity and that an intruder will exhibit a pattern of behaviour different from that of the normal user. In the case of misuse detection, a collection of known intrusion techniques is kept in a knowledge base, and intrusions are detected by searching through the knowledge base for the same techniques. Experiments with real data [Han and Cho, 2003] have shown that no single approach can detect all types of intrusion. It is better to combine a specific anomaly approach with a specific misuse approach into one IDS. Anomaly approaches have accuracy and completeness rates, but it is difficult to define normal user behaviour. Misuse detection approaches detect only known attack patterns, but with high accuracy. The pitfalls of current IDS implementations can be summarized as a high number of false positive and false negative alerts, lack of efficiency, variant signatures, data overload, difficulty in functioning effectively in switched environments, and resilience and scale-up issues.
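The difference between the two detection techniques can be made concrete with a minimal sketch. The signature list, the choice of a three-standard-deviation threshold and the use of a single numeric feature (for example, logins per hour) are all assumptions made for illustration, not a description of any particular IDS.

```python
from statistics import mean, stdev

# Misuse detection: a (hypothetical) knowledge base of known attack patterns.
SIGNATURES = ["' OR 1=1", "../../etc/passwd"]

def misuse_alert(event: str) -> bool:
    """Alert only if the event matches a known intrusion technique."""
    return any(signature in event for signature in SIGNATURES)

def anomaly_alert(baseline: list, observed: float, threshold: float = 3.0) -> bool:
    """Alert if the observation deviates strongly from normal behaviour.

    `baseline` holds at least two past measurements of normal activity.
    """
    mu, sigma = mean(baseline), stdev(baseline)
    return abs(observed - mu) > threshold * sigma
```

The misuse check misses any attack absent from `SIGNATURES`, while the anomaly check depends entirely on how well `baseline` captures normal behaviour, which mirrors the trade-off discussed above and motivates combining the two.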

3.4.2 Vulnerability analysis

Vulnerability scanning is also referred to as ‘interval-based scanning’, because hosts on a network are scanned at certain intervals and not continuously. Vulnerability analysis and penetration testing cannot test for unknown vulnerabilities [Frantzen et al., 2001]. One of the problems facing penetration testers is that a test can generate vast quantities of information that need to be stored, analysed and cross-referenced for later use. A global approach to viewing vulnerabilities looks not at the individual components of the enterprise, but rather at the security policy domains in the enterprise and the paths between them. Malware writers are ready to take advantage of published vulnerabilities. Patching should by no means be regarded as a complete solution to the security problems introduced by vulnerabilities. The biggest benefits of automated patch management are significant time and resource savings, the closing of security vulnerabilities, consistent enterprise-wide patch management, and a proactive approach to establishing and maintaining a more stable and secure environment [Bakman, 2003].

3.4.3 Auditing and tracing techniques

Stephenson [2003] argues that it is unlikely that organizations will have a clear understanding of the probability that an event will occur. He claims that ‘an enterprise should be understood in terms of the vulnerabilities of the policy domains and the paths that a threat might take to exploit those vulnerabilities’. The concept of link analysis is fundamental in the tracing of various types of fraud. Link analysis, combined with the technique of trace-back, can be used for working out the suspected path of an attack [Stephenson, 2003]. In IP trace-back, a measure is considered reactive, e.g. debugging and controlled flooding, when the trace-back process is initiated in response to an attack. Where tracing information is generated concurrently as packets are routed through the network, for subsequent attacker identification, the measure is considered proactive, e.g. logging, messaging, and packet marking. Audit trails can help a system administrator trace a security violation once it has occurred, if possible back to the users responsible for it. They are made up of logs of typed events, i.e. users’ actions. An audit policy filters the security-related events that should be logged, to avoid log size overflow. A dedicated audit server can provide a number of facilities such as user accountability (through a user-level audit policy), a system-level policy concerning modification of configuration files and failed connection attempts, intrusion detection, trace-back techniques, strong protection mechanisms for the audit trail data, audit administration tools (for audit reduction, trend detection and attack signatures) and logging procedures (real-time or batch-processed), as well as frequent audit trail reviews [Beckers and Ballerini, 2003] [Botha and von Solms, 2003].
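An audit policy acting as a filter over typed events, and an audit trail used to trace a violation back to users, can be sketched as follows. The event types, field names and sample events are hypothetical.

```python
# Hypothetical audit policy: only these event types are security-related.
AUDIT_POLICY = {"login_failed", "config_changed", "privilege_granted"}

def audit_trail(events: list) -> list:
    """Keep only the events that the audit policy says should be logged."""
    return [event for event in events if event["type"] in AUDIT_POLICY]

def responsible_users(trail: list, resource: str) -> set:
    """Trace back: users whose logged actions touched `resource`."""
    return {event["user"] for event in trail if event["resource"] == resource}

events = [
    {"type": "file_read", "user": "bob", "resource": "notes.txt"},
    {"type": "config_changed", "user": "eve", "resource": "sshd_config"},
    {"type": "login_failed", "user": "mallory", "resource": "gateway"},
]
trail = audit_trail(events)
```

Filtering before logging keeps the trail small, at the cost that an event type missing from the policy can never be traced afterwards.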

3.4.4 Antivirus control

A virus can be defined as a set of instructions that can spread from computer to computer by attaching itself to otherwise legitimate programs, or as a fragment of a program which cannot run independently. It can act as a logic bomb, a Trojan horse that reproduces itself in other executable code, a normal program file, or instructions stored in the boot-up sequence that exists on all diskettes and the PC hard disk. There are different types of viruses such as boot viruses, multi-part viruses, systemic viruses, polymorphic viruses, stealth viruses, and meta-viruses. Problems encountered with viruses include the availability of current virus pattern updates and live-updates for specific platforms as well as the proxy server and MAC memory problem [Sherif and Gilliam, 2003]. Zenkin [2001] found that the main deficiency of commonly used anti-virus scanners is that they can detect and delete only those viruses whose signature is known. To overcome this drawback, he proposes, as the best protection against unknown viruses, a well-organized combination of techniques such as a heuristic analyser, redundant scanning, an integrity checker, and a behaviour blocker. The Analytic Hierarchy Process (AHP) [Mamaghani, 2002] is designed for decisions that require the integration of quantitative and qualitative data, and has been used to evaluate and select antivirus and content filtering software. It is a multiple-criteria decision-making method; the criteria include installation, operation, administration, notification/logging, and anti-virus/content filtering.

3.4.5 Hardware-based security

Hunter [2004] states that hardware solutions for security must be able to keep up with the speed of modern networks, be capable of analysing in principle any part of an IP packet, whether in the header or the payload, and be programmable. A programmable device based on Field Programmable Gate Arrays (FPGAs) can detect any virus on the basis of generic signatures. Distributed sensor networks have been identified as being useful in a variety of domains such as wireless networking and micro-electro-mechanical systems (MEMS). Avancha et al. [2003] presented a scenario defining perimeter protection as an application class of sensor networks. In order to reduce the computational cost, VLSI chips can be used for security functions. Lin [2001] proposed a scheme suitable for a low-cost tamper-proof VLSI chip, which uses tamper-proof devices, a one-way hash function, and a single Trusted Agent (TA). An alternative direction in cryptography is based on Data Dependent Rotations (DDR) [Sklavos and Koufopavlou, 2003]. DDR operations can be performed very quickly in hardware with permutation networks, and have proved to be a cryptographic primitive that is useful for designing fast hardware-oriented encryption algorithms for ensuring data integrity.


3.4.6 Discussion on high-level security services

Table 3-6 compares the high-level security services discussed in this section with respect to whether they are proactive or reactive, and whether or not they use continuous network monitoring.

Service                   Type                   Monitoring
IDS                       Proactive              Continuous
Vulnerability scanners    Proactive              At certain intervals
Penetration Testing       Proactive              On demand
Auditing procedures       Proactive / Reactive   Continuous / On demand
Tracing techniques        Proactive / Reactive   On demand
Anti-virus control        Proactive / Reactive   Continuous / At certain intervals / On demand

Table 3-6: Comparison of high-level security services


3.5 Fault tolerance

Fault tolerance demands that a system continues to operate even in the presence of faults. A fault is the cause of an error. Faults are classified as transient (they occur once and then disappear), intermittent (they occur, vanish and then reappear periodically), and permanent (they continue to exist until the faulty component is repaired). An error is a part of a system’s state that may lead to a failure. A system is said to fail when it cannot meet its promises. Failures in DIS are partial; some components fail while others continue to function. A crash failure occurs when a process halts and remains halted, while other processes may not be able to detect this state. An omission failure occurs when a server fails to respond to incoming requests; such failures can further be divided into receive and send omission failures, according to whether the server fails to receive incoming messages or to send messages. Timing failures occur when the server’s response lies outside the specified time interval. A response failure occurs when the server’s response is incorrect. Arbitrary failures, also known as Byzantine failures, occur when a process or a channel exhibits arbitrary behaviour, producing arbitrary responses at arbitrary times. Techniques for dealing with failures include detecting failures, masking failures, tolerating failures, and recovery from failures (Figure 3-9).

3.5.1 Fault tolerance and dependability

Fault tolerance in DIS is strongly related to the notion of dependability. A dependable computer is one that is trusted to deliver its services. Dependability covers a number of useful requirements (Table 3-7). Availability and performance tend to be mutually contradictory goals in DIS.

Reliability        The probability that the system under consideration does not experience any failures in a given time interval
Safety             The situation that when a system temporarily fails to operate correctly, nothing catastrophic happens
Maintainability    The ease with which a failed system can be repaired
Availability       The probability that the system is operating correctly at any given moment and is available to perform its functions on behalf of its users

Table 3-7: Dependability in DIS


3.5.2 Redundancy

A system is said to be k fault tolerant if it can survive faults in k components and still meet its specifications. The key technique in masking faults is to use redundancy. Some computers have been built with hardware redundancy. With process group redundancy, multiple copies of a system run on multiple servers in different locations. Backup is another form of redundancy; a copy of the system, also known as a checkpoint, is taken at regular intervals. A fallback system, to which processing reverts when the main system is unavailable, is an example of redundancy in the application layer. The Eternity Service [Anderson, 1996] deploys redundancy and scattering techniques to enhance data availability. Douceur [2002] has shown that large-scale peer-to-peer systems without a logically centralized authority cannot prevent the so-called Sybil attacks (where a single faulty entity can present multiple identities).
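Masking faults through redundancy can be illustrated with a majority voter over replica outputs. The sketch relies on the standard observation, not stated in the text above, that 2k + 1 replicas suffice to outvote k replicas that return wrong values; the function names are illustrative.

```python
from collections import Counter

def replicas_needed(k: int) -> int:
    """Replicas required so that k wrong outputs are always outvoted."""
    return 2 * k + 1

def vote(outputs: list):
    """Return the value produced by a strict majority of replicas."""
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise RuntimeError("no majority: too many replicas disagree")
    return value
```

With three replicas, `vote([42, 42, 7])` returns 42 and the single faulty output is masked; if the replicas only crash rather than answer wrongly, fewer copies are enough, since any surviving replica gives the right answer.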

3.5.3 Process groups

The key approach to tolerating a faulty process is to organize several identical processes into a group. The purpose of introducing groups is to allow processes to deal with collections of processes as single abstractions. When a message is sent to the group, all members of the group receive it. In this way, if one process in a group fails, some other process can take over from it. Process groups can be dynamic; they can also be flat or hierarchical. Group membership can be distributed or managed by a group server. An important issue with using process groups to tolerate faults is how much replication is needed. Replication can be primary-based, where clients communicate with a distinguished replica using primary-backup protocols, or replicated-write, where clients communicate by multicast with all replicas using replication and quorum-based protocols. The primary means of communication in the process group is group multicast. The audit of causal relationships of group multicast communications is an important component in achieving a solution to the problem of group-oriented distributed computing security. An auditing service should be able to collect, maintain, make available and validate irrefutable evidence regarding causal relationships. The approach proposed by Tsaur & Horng [2001] uses process dependency graphs, i.e. abstractions of the dynamic behaviour of a distributed program, which show the communications between processes while hiding the details of internal events within a process. An implementation of a portable CORBA fault-tolerant middleware service, based on the above protocol, is presented by Morgan et al. [1999]. It provides causality-preserving total order delivery to members of a group, which is preserved for multi-group processes. The Arjuna Transaction Service (ATS) [Parrington et al., 1995] is a transaction processing system that allows the construction of fault-tolerant distributed applications using nested transactions controlling operations on persistent objects. Confidential and reliable group communication can be achieved using symmetric or asymmetric systems.
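For quorum-based replicated-write protocols, the question of how much replication is needed is commonly answered with two overlap constraints on the read quorum and the write quorum. The constraints below are the usual ones from the quorum-voting literature rather than something taken from the works cited above; the parameter names are illustrative.

```python
def valid_quorums(n: int, n_read: int, n_write: int) -> bool:
    """Check the usual quorum constraints for n replicas.

    n_read + n_write > n : every read quorum overlaps every write quorum,
                           so a read always sees the latest write.
    2 * n_write > n      : any two write quorums overlap,
                           so conflicting writes cannot both succeed.
    """
    return n_read + n_write > n and 2 * n_write > n

def latest(read_quorum: list) -> tuple:
    """Pick the (version, value) pair with the highest version number."""
    return max(read_quorum, key=lambda replica: replica[0])
```

For twelve replicas, a read quorum of 3 with a write quorum of 10 is valid, whereas 7 and 6 is not, because two disjoint sets of six replicas could each accept a different write.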

3.5.4 The Byzantine failure model

The issue of reaching an agreement is crucial in fault-tolerant systems. Messages should be delivered within a known finite time. Several distributed agreement algorithms have been used for that reason. Contributory key agreement (CKA) protocols generate group keys based on the contributions of all group members and are particularly appropriate for relatively small collaborative peer groups. Amir et al. [2004] propose a CKA protocol, based on the group Diffie-Hellman contributory key agreement, which aims to provide resilience to any sequence of group changes by using the services of a group communication system that supports Virtual Synchrony semantics. The most famous agreement problem is the Byzantine Generals problem. The Byzantine failure model is inspired by the idea that there are P generals defending Byzantium, S of whom have been bribed by the Turks to cause as much confusion as possible in the command structure. For a set P of parties, the security requirements must specify for which sets S ⊆ P of parties their collective cheating must be tolerated, meaning that for the remaining parties (P \ S) the specification is still achieved.
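One simple way of specifying the tolerated sets S ⊆ P is a threshold: every set of at most t parties may cheat. The sketch below enumerates such sets and checks the classical bound for agreement with unsigned messages, n ≥ 3t + 1; both the threshold form and the bound are supplied here for illustration and are not taken from the text above.

```python
from itertools import combinations

def tolerated_sets(parties: set, t: int) -> list:
    """All sets S ⊆ P with |S| <= t, i.e. a threshold adversary structure."""
    ordered = sorted(parties)
    return [set(c) for size in range(t + 1) for c in combinations(ordered, size)]

def agreement_possible(n: int, t: int) -> bool:
    """Classical bound for Byzantine agreement with unsigned messages."""
    return n >= 3 * t + 1

generals = {"g1", "g2", "g3", "g4"}
```

With four generals and t = 1 there are five tolerated sets (the empty set and each single general), and agreement is achievable; with only three generals, a single traitor is already too many.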

3.5.5 Business continuity and recovery from disasters

A Disaster Recovery Plan (DRP) is a proactive document designed to assist an organization in recovering from data losses and restoring data assets. It is an action plan that identifies a set of policies, procedures and resources used to monitor and maintain corporate information technology before, during and after a disaster. Possible disasters include natural disasters (e.g. fires, earthquakes, lightning, storms and static electricity), software malfunctions, hardware or system malfunctions, power outages, computer viruses, man-made threats (e.g. vandalism, hackers and sabotage) and human error (e.g. improper computer shutdown, spilling liquids on the computer, and cigarette ash). An example of a DRP for a DIS was given by Hawkins et al. [2000]. Business Continuity [King and CBCP, 2003] was defined as a long-term, proactive and responsible approach to ensuring that the organization is operationally resilient to disasters.

3.5.6 Intrusion detection and fault-tolerance

Intrusion detection can also be used in the field of fault tolerance. In general, fault tolerance can be accomplished through fault avoidance, fault masking and error tolerance. Two intrusion tolerance methods that use fault masking were presented by Ye [2001]: Taguchi’s method for system configuration, and the sharing of resources via an information infrastructure for redundancy.

3.5.7 Failure analysis

In the method of distributed diagnosis (a bottom-up approach), each working node must maintain correct information about the status (working or failed) of each component in the system. In a fault-tolerant system some kind of statistical analysis may be needed. Statistical data analysis shows that early fault detection can cut costs significantly. A relevant architecture was proposed by Wong et al. [2003]. It is applied at source code level to cover the testing of software designs represented in the Specification and Description Language (SDL). Kranakis & Santoro [2001] presented algorithms, based on Boolean functions, for distributed computation on anonymous, oriented, asynchronous, n-dimensional hypercubes with faulty components (i.e. processors and links). They analyse only the case where faults occur before the start of the computation, without any consideration of faults that may occur at different stages of the computation. The academic literature concerning Fault Tree Analysis (FTA), a top-down approach, relates almost entirely to the design and development of safety-critical systems. A tree is constructed whose root is the undesired behaviour and whose successive nodes are its possible causes. According to Brooke & Paige [2003], FTA can also be applied to the design and analysis of systems with security requirements. A thorough analysis of failure models may combine top-down and bottom-up approaches, while dealing with security in safety-critical systems.
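A fault tree can be represented as nested AND/OR gates over basic events and evaluated top-down. The following minimal sketch uses a hypothetical security-flavoured tree; the tuple encoding and the event names are assumptions made for illustration.

```python
# A node is ("event", name), ("and", child, ...) or ("or", child, ...).
# Hypothetical tree whose root is the undesired behaviour "intranet breached".
TREE = (
    "or",
    ("event", "firewall misconfigured"),
    ("and", ("event", "password leaked"), ("event", "no second factor")),
)

def occurs(node: tuple, basic_events: set) -> bool:
    """True if the behaviour at `node` follows from the given basic events."""
    kind = node[0]
    if kind == "event":
        return node[1] in basic_events
    results = [occurs(child, basic_events) for child in node[1:]]
    return all(results) if kind == "and" else any(results)
```

Evaluating the tree for different sets of basic events shows which combinations of causes lead to the root: a leaked password alone does not, but together with a missing second factor it does.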

3.5.8 Discussion on fault tolerance

Fault tolerance is related especially to the notion of availability, one of the CIA security principles. A classification of the methods dealing with failures is illustrated in the following UML diagram (Figure 3-9).

[UML diagram. Node labels: Method for dealing with failures; Fault Avoidance; Fault Tree Analysis; Detecting & Tracing Analysis; Distributed Diagnosis; Bounded Correctness; Masking failures; Statistical Analysis; Message Logging; Sharing of Resources; Recovery from failures; Tolerate faults; Backward error recovery; Taguchi's method; Forward error recovery; Redundancy; Periodic Checkpoints; Hardware Redundancy; Backup; Logging & Auditing events; Replication; Fallback; Process Group Replication; Passive (or Primary-backup) Replication; Active (or Replicated-Write) Multicast Replication; Replication-based; Quorum-based.]

Figure 3-9: A UML diagram showing a classification of the methods dealing with failures

3.6 Databases and database security

3.6.1 Information integration

Databases are becoming one of the most valuable properties owned by organizations. Several solutions have been given over the years for handling information integration (Table 3-9). Database integration approaches have traditionally focused on schema integration issues at the 'structural' level of the schema. Common approaches are the global-as-view approach (the elements of the global schema are defined as views over the sources) and the local-as-view approach (the sources are characterized as views over the global schema) [Cali et al., 2004]. 'Customer-oriented' integration approaches can be found in Appendix C.

Data warehouse: Data from multiple independent sources 'physically' brought together into a single system.

Database integration: A formal data model for data storage and representation, along with a query algebra for integrating and querying the data.

Application integration: An object model for the necessary modelling capability, with a programming language for data integration.

Semantic data (or 'horizontal') integration: 'Conceptual' models linking data from heterogeneous databases to a global knowledge representation structure, using ontological structures.

Model-based (or 'vertical') integration: Statistical and probabilistic techniques for large-scale integration of data.

Table 3-9: Types of information integration

3.6.2 Distributed databases

Distributed Heterogeneous Database Management Systems (DHDBMS) are emerging to handle aggregation and coordinated access in distributed heterogeneous databases. A distributed database (DDB) can be defined as a logically integrated collection of shared data, which are physically distributed across the nodes of a computer network, fully replicated, fully partitioned, or partially replicated [Bell and Grimson, 1992]. An example was given by Spaccapietra et al. [1992], where the data model is defined using independent integration rules as correspondence assertions, which are customized to the various classical data models (represented as graphs) with the target of consistency in the system. The integration of heterogeneous databases, which uses mappings from local elements to global elements, often produces various conflicts in semantic, description and structural terms, both at the schema and at the instance integration level [Sattler et al., 2003].

In multi-database architectures, each local database is managed by a local DBMS and the various DBMSs are connected through a DDBMS.

3.6.3 Database security

Data security in a database system includes data protection and data authorization control. Data protection is needed to prevent unauthorized users from understanding the physical content of the data. The main data protection approach is data encryption. Authorization control must guarantee that only authorized users perform the operations that they are allowed to perform on the database. Authorization control in a distributed environment also includes remote user authentication, management of distributed authorization rules, and handling of views and user groups. In addition, other important security requirements for database management systems include integrity, reliability (well-formed transactions), recovery procedures (continuity of operation) and information security principles such as least privilege (i.e. users have the minimal privilege necessary to perform their tasks), separation of duties (i.e. no single individual has access to critical data), and the delegation of authority. The integrity of a database is concerned with its consistency, correctness, validity, and accuracy; the database must reflect the rules governing the organization that it models. Security issues with distributed databases include identification and authentication, distribution of authorization rules, encryption, and the existence of a global view mechanism [Wiseman, 2001]. Secure distributed, multi-level transactions are the most widely used method. Communication is based on SSL.
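Authorization control under the least-privilege principle can be sketched as a default-deny lookup over explicit rules. The role names, operations and relation names below are hypothetical and serve only as an illustration; a real DBMS would hold such rules in its catalogue rather than in application code.

```python
from typing import Dict, Set, Tuple

# Hypothetical authorization rules: role -> set of (operation, relation) pairs.
RULES: Dict[str, Set[Tuple[str, str]]] = {
    "clerk": {("SELECT", "orders")},
    "manager": {("SELECT", "orders"), ("UPDATE", "orders")},
}


def is_authorized(role: str, operation: str, relation: str) -> bool:
    """Allow an operation only if a rule explicitly grants it (default deny)."""
    return (operation, relation) in RULES.get(role, set())
```

Here a clerk may read orders but not update them, and an unknown role is granted nothing, which is the minimal-privilege behaviour described above.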

3.6.3.1 Database encryption

Database encryption should be implemented as a lower-level security mechanism that is applicable independently of the type of policy used in the database system. It provides the last line of defence against an attack, acting as a 'deterrent' to attackers. The disadvantage of encrypted databases is that record searching, particularly in the case of partial-match and range queries, becomes inflexible unless secure auxiliary information that maintains the positions of records or fields in the database is held. Encryption can be applied at three levels of data granularity, namely to whole tuples (records), to whole attributes (fields) and to individual data elements. Instead of using decryption during processing, operations can be performed on cryptograms.

3.6.3.2 Integrity control

A database state is said to be consistent if the database satisfies a set of constraints, called semantic integrity constraints. Maintaining a consistent database requires various mechanisms such as concurrency control, reliability, protection, and semantic integrity control. Semantic integrity control ensures database consistency by rejecting update programs that lead to inconsistent database states, or by activating specific actions on the database state that compensate for the effects of the update programs. In general, semantic integrity constraints are rules that represent the knowledge about the properties of an application. Two main types of integrity constraints can be distinguished: structural constraints and behavioural constraints. Structural constraints express basic semantic properties inherent to a model, such as unique key constraints and associations, while behavioural constraints regulate the application behaviour, for example dependencies among objects or descriptions of an object structure [Ozsu and Valduriez, 1999]. Moreover, integrity in distributed databases has to handle problems such as inconsistencies with local integrity constraints, difficulties in specifying global integrity constraints, and inconsistencies between local and global constraints.
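The rejection of updates that would lead to an inconsistent state can be illustrated with a small sketch using SQLite through Python's standard `sqlite3` module. The schema (tables `dept` and `emp`) and the helper `try_update` are hypothetical; the unique key and the foreign key stand for structural constraints, and the `CHECK` clause stands for a simple application rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)")
conn.execute(
    "CREATE TABLE emp ("
    " id INTEGER PRIMARY KEY,"
    " dept_id INTEGER NOT NULL REFERENCES dept(id),"  # association between relations
    " salary INTEGER NOT NULL CHECK (salary >= 0))"   # rule on admissible values
)
conn.execute("INSERT INTO dept VALUES (1, 'audit')")
conn.execute("INSERT INTO emp VALUES (1, 1, 1000)")
conn.commit()


def try_update(sql: str) -> bool:
    """Run an update; return False if the DBMS rejects it as inconsistent."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False
```

For instance, `try_update("UPDATE emp SET salary = -5 WHERE id = 1")` is rejected by the `CHECK` clause, and an update that points `dept_id` at a non-existent department is rejected by the foreign key, leaving the stored state unchanged.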

3.6.3.3 Transactions and concurrency control

A reliable DDBMS is one that can continue to process user requests even when the underlying system is unreliable. One of the most basic ingredients of a fault-tolerant system is the transaction mechanism. A transaction brackets a number of operations in such a way that either all of them happen or, in the case of failure, none of them do. Transactions make crash recovery much easier, because a transaction can end in only one of two states: carried out completely or failed completely. Transactions in DIS should be atomic, consistent, isolated, and durable [Bertino et al.]. In general, the reliability of a DBMS refers to atomicity and durability among the four ACID properties of transactions. Two relevant protocols with these properties are the commit and recovery protocols (Figure 3-11). Transactions also represent the basic unit of recovery in a DBMS. The idea of error recovery is to replace an erroneous state with an error-free state. In backward recovery, a method widely used in DIS, the goal is to bring the system from its present erroneous state back into a previously correct state. When the effort is instead to bring the system into a correct new state from which it can continue to execute, it is called forward recovery. Recovery procedures can take place in the presence of unpredictable failures, for example using database log files and periodic checkpoints of the database. Security auditing mechanisms are concerned with logging security-relevant events on persistent storage so that they can be analysed further.

The consistency and isolation properties of ACID are mainly the responsibility of concurrency control. Processes are said to be concurrent if they run at the same time. Concurrency problems can occur at a number of levels of the system, from the hardware to the business environment. Two such problems are replay attacks on protocols, where an attacker manages to pass off out-of-date credentials, and race conditions. Other problems include the likelihood of deadlock or inconsistent updates, as well as the provision of accurate time. Locking and call-back mechanisms are used to solve such problems.

Electronic transactions, such as email, the purchase of goods and services, banking transactions and micro-transactions (e.g. paid by end-users), can be safely performed only when they are protected by appropriate security policies and mechanisms. Financial transactions take place in a trusted and spontaneous manner over the Internet, even between previously unknown parties. Security may be even more important in Internet transactions because of the accessibility of the Internet and the need to establish the customer's legitimacy and privacy.
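The all-or-nothing behaviour of a transaction, and the return to a previously correct state on failure, can be sketched with SQLite through Python's standard `sqlite3` module. The `account` table and the `transfer` and `balances` helpers are hypothetical names introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(src: str, dst: str, amount: int) -> bool:
    """Move `amount` between accounts; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances() -> dict:
    """Return the current balance of every account."""
    return dict(conn.execute("SELECT id, balance FROM account"))
```

If the debit would make a balance negative, the second update fails and the credit already applied by the first update is rolled back with it, so the database is left in the state it had before the transaction began.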

3.6.3.4 Views

A database view can be defined as a predefined, named retrieval query that creates a virtual (static) relation over base relations. Such views can provide content-dependent security, but they also have drawbacks: view definitions can be complex and may contain errors, the database upon which the views are defined may also contain errors, and views are vulnerable to Trojan horses. Access views are used for retrieving or updating data, as well as for classification constraints, which specify access classes between relations.
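Content-dependent security through a view can be sketched as follows, again using SQLite via Python's standard `sqlite3` module. The `staff` table and the `sales_staff` view are hypothetical. SQLite itself has no privilege system, so the sketch shows only what the view exposes; in a DBMS with access privileges, users would be granted access to the view rather than to the base relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("ann", "sales", 30), ("bob", "sales", 32), ("eve", "audit", 50)],
)
# The view exposes only the sales rows and hides the salary attribute.
conn.execute(
    "CREATE VIEW sales_staff AS SELECT name, dept FROM staff WHERE dept = 'sales'"
)

rows = conn.execute("SELECT * FROM sales_staff ORDER BY name").fetchall()
```

The result contains the two sales employees without their salaries; the audit row is not visible through the view at all.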


3.6.4 Discussion on database security

Database integrity can be interpreted in terms of the enforcement of database integrity constraints, concurrency control, and backup and recovery procedures within an overall security and access control framework. The different types of semantic integrity control that deal with consistency problems are illustrated in Figure 3-10. The relevant mechanisms for maintaining database consistency are presented in the UML diagram of Figure 3-11.

[UML diagram. A semantic integrity constraint is either structural (unique key, association) or behavioural (description of an object structure, object dependencies). Semantic integrity constraints deal with consistency problems (many-to-many): inconsistency in local integrity constraints, difficulty in specifying global integrity constraints, and inconsistency between local and global constraints.]

Figure 3-10: Semantic integrity control mechanism of maintaining database consistency

[UML diagram. Node labels: reliability; protection; semantic integrity control; concurrency control; transaction ACID property (atomicity, consistency, isolation, durability).]

Figure 3-11: Mechanisms for maintaining database consistency

3.7 Distributed computing and security

The move from procedural technology, expressed through the step-wise procedural paradigm, to object technology has triggered changes in the design of new information systems and software engineering operations [Bézivin, 2001], to satisfy the increasing need for reusable components and flexible combination of existing patterns. In the context of DIS, components extend the object-oriented paradigm by enabling objects to manage the interfaces they present and discover those presented by others. Examples include COM/DCOM, the CORBA Component Model, Enterprise Java Beans, and the Common Component Architectures [Raptis et al., 2001]. Application-level access control is an important requirement in many distributed environments; several authorization decisions are based on application domain-specific factors. In an object-oriented DIS, security functions can be based on the object request broker (ORB), a software component that mediates communication between objects, as in the CORBA architecture. The CORBA, COM, and Java/RMI security models are similar. Threats include the unauthorized disclosure of information, violation of data or code integrity, denial of service (DoS), repudiation of users' actions, malicious code, and traffic analysis. DCOM implements an extended ACL, which includes components and associated users. The Lightweight Directory Access Protocol (LDAP) runs directly over TCP, thus eliminating the overhead of the OSI session and presentation layers required by DAP, the X.500 Directory Access Protocol [Naor and Levy, 2003]. This simplifies the X.500 functional model, with the use of string encoding for distinguished names and data elements. LDAP can be applied on a CA, combined with RBAC concepts [Yeh et al., 2002].

3.7.1 Distributed computation

Generally speaking, a distributed computation $M$, with $M = \{P, W\}$, describes the execution of a distributed program by a collection of processes, where $P = \{p_1, p_2, \ldots, p_n\}$ is a finite set of processes that run at one or more nodes connected by a communication network, and $W$ is a finite set of communication channels. The processes in $P$ have disjoint address spaces and communicate with each other by message passing via $W$. The communication channels between correct processes are authenticated and protect integrity and secrecy (privacy). The activity of each sequential process is modelled as executing a sequence of events. The sequence of all the events in a process constitutes its local history. A global history of the computation contains all the events. A distributed computation, through its lifetime, is composed of a dynamic group of processes running on different resources and sites. The processes constituting a computation may communicate by using a variety of mechanisms, including unicast and multicast. While these processes form a single, fully connected logical entity, low-level communication connections (e.g. TCP/IP sockets) may be created and destroyed dynamically during program execution. Parallel computations that acquire multiple computational resources introduce the need to establish security relationships not simply between a client and a server, but among potentially hundreds of processes that collectively span many administrative domains.

It is known [Coulouris et al., 2005] [Tanenbaum and Van Steen, 2002] that in an asynchronous system, information may flow from one event to another either because the two events belong to the same process and thus may access the same local state, or because the two events belong to different processes and correspond to the exchange of a message. A binary relation $\rightarrow$ ("happened before") over the events of the system can be defined in order to express the sequential ordering of events. Certain events of the global history may be causally unrelated; that is, for two distinct events $e$ and $e'$, neither $e \rightarrow e'$ nor $e' \rightarrow e$ holds. Such events are called 'concurrent', written as $e \parallel e'$.
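One common way to decide the 'happened before' and 'concurrent' relations in practice is to timestamp events with vector clocks. The text above only defines the relations, so the vector-clock technique and all the function names in this sketch are assumptions introduced for illustration.

```python
from typing import Tuple

Clock = Tuple[int, ...]  # one counter per process


def happened_before(a: Clock, b: Clock) -> bool:
    """True if the event stamped `a` causally precedes the event stamped `b`."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a: Clock, b: Clock) -> bool:
    """True if neither event causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


def tick(clock: Clock, i: int) -> Clock:
    """Local event (or send) at process i: increment its own counter."""
    return clock[:i] + (clock[i] + 1,) + clock[i + 1:]


def receive(clock: Clock, msg: Clock, i: int) -> Clock:
    """Receive at process i: merge the message timestamp, then tick."""
    merged = tuple(max(x, y) for x, y in zip(clock, msg))
    return tick(merged, i)


# Two processes: p0 sends a message to p1; p1 has an independent local event first.
e_send = tick((0, 0), 0)               # (1, 0)
e_local = tick((0, 0), 1)              # (0, 1)
e_recv = receive(e_local, e_send, 1)   # (1, 2)
```

In this run the send event happened before the receive event, while the send event and the earlier local event of the other process are concurrent, since neither timestamp dominates the other.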

3.7.2 Workflow systems

Workflow systems are increasingly being used to streamline organizations' business processes. They use the Internet as the underlying communications infrastructure. Functional access control requirements in Workflow Systems (WS), such as 'strict least privilege', 'order of events', and 'separation of duty', are often associated with Business-Process Reengineering (BPR). A workflow is a set of sequences of activities called business processes, which represent the functioning of an organization and are executed in the run-time environment. A business process is described to the workflow system by means of a process definition; the latter identifies the tasks that form part of the business process and also provides rules specifying the conditions for executing those tasks in certain roles (see Appendix D).


3.7.3 Web and Semantic Web

3.7.3.1 Web overview

The Web has become an infrastructure for distributed applications. Information is organized into documents; a document usually contains links that refer to other documents. Standard Web protocols and services, such as TCP/IP, HTTP/HTTPS, SOAP [W3C a, 2001], Universal Description Discovery and Integration (UDDI), and Web Services Description Language (WSDL) [W3C c, 2002], support the service-oriented approach. WSDL descriptions are used to ensure interoperability. The development and deployment of ontologies is a major topic in Web services. In the context of knowledge and Web engineering, an ontology is simply a published conceptualization of a specific area. The ontology may describe objects, processes, resources, capabilities etc. Ontologies improve communication between systems by establishing an agreed model and enhancing interoperability. An ontology can also promote reuse of content, ensure a clear specification of what content or a service is about, and increase the chance that content and services can be successfully integrated. Guarino & Welty [2000] have proposed a methodology for ontology-based model engineering, which provides well-defined taxonomies, e.g. clarifying the is-a relation, through a rigorous analysis based on the analytic notions of identity, unity, rigidity, and dependence that have been drawn from Philosophy and adapted to Engineering.

3.7.3.2 Semantic Web overview

The Semantic Web [W3C b, 2001] is an extension of the current Web. It is the idea of having data on the Web defined and linked in such a way that it can be used for more effective discovery, automation, integration, and reuse across various applications. Services are a particularly important component of the Semantic Web. A semantic service description language (or mark-up language) enhances the quality and quantity of e-commerce transactions on the Web by describing the capabilities of Web services. Information exchange in the Web is usually facilitated using XML/XMI [W3C d, 2001]. RDF/RDF(S) is a standard way of expressing metadata, though in fact it can be used to represent structured data in general. DAML-S/DAML + OIL, built on top of XML and RDF(S), provide a process modelling language for describing taxonomic information of a Web domain. OWL/OWL-S have succeeded DAML + OIL.

Real-scale semantic web applications, e.g. Knowledge Portals, e-voting systems, and E-Market places, require the management of voluminous repositories of resource metadata. Portals can be viewed as providing a Web-based interface to a distributed system. They entail a three-tier architecture. E-voting systems make it possible for the voters to cast their ballots over a distributed network. E-Market places are an example of e-commerce.

3.7.3.3 Web security

Web-based applications greatly increase the availability of information and the ability of people to access and share information in a collaborative environment. According to Skoularidou & Spinellis [2003], security in Web-based systems should be enforced in the Web client, data transport, Web server, and the operating system itself. Typical operating system security features include memory and file protection, resource access control, and user authentication. Web attacks, i.e. attacks exclusively using the HTTP/HTTPS protocol, are rapidly becoming one of the fundamental threats for information systems connected to the Internet. A taxonomy and semantic-dependent encoding scheme of Web attacks was proposed by Álvarez & Petrovic [2003]. The authors argue that an entry point has a vulnerability which threatens a service and is exploited by an action, using input length against a target with a certain scope, and thus obtaining privileges.

The agent-based computing paradigm provides a perspective on software systems in which entities typically have properties such as autonomy, social ability, reactivity, and pro-activeness. Mobile agents are software programmes, developed in open, distributed and heterogeneous environments, which act on behalf of users or other software programmes by travelling over the Internet via some communication paths and returning the results to the original users. Agent-based computing is particularly well suited to a dynamically changing environment, in which autonomy enables the computation to adapt to changing circumstances using techniques such as on-the-fly negotiation between agents. Security concerns for mobile agents include protecting hosts from access by unauthorized parties or malicious agents, and protecting the agents themselves from attacks by other agents or hosts.

In a Peer-to-Peer (P2P) architecture, computers that have traditionally been used as clients communicate directly among themselves; they can act as both clients and servers, assuming whatever role is most efficient for the network.

3.7.3.4 E-commerce security

Lothian & Wenham [2001] claim that a secure e-business infrastructure should address prevention, assurance, detection, and recovery. E-commerce business transactions include business-to-business (B2B), business-to-consumer (B2C), consumer-to-business (C2B), and consumer-to-consumer (C2C) [Aljifri et al., 2003]. Independent security measures have resulted in a growing amount of security information that must be stored and recalled by the consumer. National laws on electronic commerce, as well as international laws and European Community directives, agree that online documentation has the same legal value as an original paper document, provided that the authenticity, non-repudiation, integrity and confidentiality of the document and its content are assured. The tools used for such purposes include digital signatures (providing authenticity, identification & authorization) and encryption (for confidentiality). The framework proposed by Kesh et al [2002] defines the procedures that should be followed for the analysis and development of e-commerce security. It focuses on analysing threats as well as listing countermeasures and security tools for developing the final security architecture. Mayes & Markantonakis [2003] argue that a citizen-centric approach is a potential solution that should be researched further. They propose the use of a citizen card or token as a consumer-owned secure platform for access to a wide range of services (Multi-Application Citizen Card).

SSL (Secure Sockets Layer), together with its successors TLS and WTLS, is a security protocol on top of TCP/IP, which works as a secure transport mechanism by creating a network connection as part of the security context establishment. It has become popular in the context of the Web as the underlying protocol for HTTPS (HTTP over SSL/TLS). It provides a secure communication channel between two parties, i.e. the client (or a customer) and the server (a merchant or a bank), without involving a Trusted Third Party. SSL provides mechanisms for data confidentiality and data integrity but lacks any mechanism for a non-repudiation service; the server is always authenticated, while client authentication (e.g. by digital signature) is optional. SSL is also the basic security mechanism for CORBA security. The Secure Electronic Transaction (SET) protocol [Rousset and Reynaud], on the other hand, provides partial non-repudiation based on digital signatures. It is used for guaranteeing transaction security (network-based payment card transactions, e.g. Visa & MasterCard) in e-commerce systems. Financial transactions take place in a trusted and spontaneous manner over the Internet, even between previously unknown parties. Authentication is based on digital certificates, while data integrity is achieved using the public/private keys of the parties involved.
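The asymmetry between mandatory server authentication and optional client authentication in SSL can be seen in the defaults of Python's standard `ssl` module, which implements TLS, the successor of SSL. The sketch below only inspects and sets context options and opens no network connection; the helper name `require_client_certificate` is hypothetical.

```python
import ssl

# Client side: the default context verifies the server certificate and hostname.
client_ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)

# Server side: a fresh server context does not request a client certificate.
server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)


def require_client_certificate(ctx: ssl.SSLContext) -> None:
    """Opt in to client authentication.

    A real server would also need trusted CA certificates loaded
    (e.g. via load_verify_locations) in order to validate client certificates.
    """
    ctx.verify_mode = ssl.CERT_REQUIRED
```

By default the client context has `verify_mode == ssl.CERT_REQUIRED`, whereas the server context has `verify_mode == ssl.CERT_NONE` until client authentication is explicitly requested.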

3.7.3.5 e-Trust

The major reason why most people are still sceptical about e-commerce is, according to Labuschagne & Eloff [2000], the perceived security risks associated with electronic transactions over the Internet. They have shown that a balance may be maintained between business needs and technology requirements by adopting a dynamic approach where risks can be analysed online and in real time, requiring less human intervention. Factors contributing to trust in e-commerce transactions are easy access to descriptions of products and services, ease of placing orders, order confirmation, order tracking and post-sales service. The confidentiality of a distributed computation can be compromised if trust on the client side cannot be assured. Transaction-based websites reduce the time and effort required by an organization to complete transactions, as well as reducing errors caused by manual data entry. The key methodologies for establishing trust are presented in Table 3-9. Trust in e-business is governed by the quality of the exchanged services or products, usually deployed with the existence of a Trusted Third Party (TTP), for example using Managed Security Services Providers (MSSP); selecting an MSSP is a complex task and carries a risk [Beckers and Ballerini, 2003]. A TTP is an impartial organization delivering business confidence, through commercial and technical security features, to an electronic transaction. Threats against e-trust include monitoring of communication lines, shared key guessing, unauthorised modification of information in transit, forged network addresses, masquerading, password stealing, unauthorized access, repudiation of origin, private key stealing, and private key compromise [Ruppel et al., 2003]. Gritzalis & Gritzalis [2001] proposed the use of digital seals, where a PKI consists of more than one TTP. Tscheme [Emery et al., 2003] is the independent, industry-led, co-regulatory scheme for electronic trust services.


Methodology: Organizational solution
Main characteristics: It relies on detection to be an effective deterrent. An entity can breach an agreement, something that invokes a prescribed punishment.
Disadvantages: Applications are only permitted to execute on hosts that have previously been identified as trustworthy by subscribing to the agreement.
Examples: Contracts, service level agreements (SLA), and laws, in conjunction with techniques such as auditing.

Methodology: Reputation
Main characteristics: All execution hosts and parties are considered equal and trustworthy until they are reported through appeals to a central authority.
Disadvantages: By the time a host can be declared untrustworthy, the hosted computation and any information it contains have already been compromised. There is the problem of reliable detection when such a compromise occurs. With no other means of protection, an untrustworthy execution host could continue to be assigned applications.
Examples: In the real world, the spam black-hole services, where customers of the service report potential distributors of spam to the central authority, which then decides whether to publicly advise people that the mail server sends spam.

Methodology: Validity of the host
Main characteristics: The validity of a host is established before execution begins, while the trustworthiness is maintained throughout the agent's execution. A mobile computation no longer needs to be attacked before a host can be declared untrustworthy.
Disadvantages: In order to succeed, aspects of both automated technological and organizational methodologies need to be employed. Any time the host has become untrustworthy, the environment should be notified in order to allow the appropriate action to be taken.
Examples: It has been the primary approach taken by the development of the Safe-GRID environment.

Table 3-9: The key methodologies for establishing trust

3.7.3.6 Privacy and identity management

Failure to protect personal data could demonstrate negligence, as well as a breach of the Data Protection Act [DPA, 1998]. The Electronic Communications Act [ECA, 2000] is part of a collection of laws created to deal with the issue of identification, and to bind people to whatever they have done and signed electronically. Other acts in the UK include the Regulation of Investigatory Powers Act, the Human Rights Act, and the Anti-terrorist Acts [Jayeju-akinsiku, 2002]. Forte [2003] argues that there is no interoperability between the European and the US legal systems in matters of privacy. Ross Anderson [Anderson, 2004] defines privacy as 'the ability and/or right to prevent invading our personal space'. Privacy threats include identity disclosure, linking data traffic with identity, location disclosure in connection with data content transfer, as well as user profile and data disclosure. For example, phishing, also called 'carding', is the act of sending an email to a user falsely claiming to be an established legitimate enterprise in an attempt to scam the user into surrendering private information such as credit card numbers, bank account information, social security numbers, passwords, and other sensitive information that will be used for identity theft. Fischer-Hubner, as well as Senicar et al [2003], identified four ways in which privacy protection can be achieved: protection by government laws, privacy-enhancing technologies (PETs), self-regulation for fair information practice through codes of conduct promoted by businesses, and privacy education of consumers and IT professionals. PETs refer to the various technologies that have been developed to help users protect their privacy, through the use of agents. The 'anonymous remailer', the Rewebber, the TAZ server, onion routing, Crowds, JANUS, P3P, PISA, RAPID, and EU-GUIDES are examples of PETs.

Cyberspace trust starts with clear, transparent, negotiated, and documented policies associated with identity. An identity management system provides the tools for managing user identities by using pseudonyms. It supports multilateral security, i.e. security in communication between different parties; it can be used in e-commerce, email, auctions, and e-voting. The Liberty Alliance version 1.0 specifications [LATE Group, 2002] allow businesses to connect heterogeneous systems to handle identity and authorization in a more efficient and controlled manner. ID-PKI [Paterson, 2003] [Paterson, 2002] was created as a means to overcome the problem of managing the certificate and associated key in PKI; it uses a Trusted Authority (TA) instead of a certificate. Identity-based public key cryptography (ID-PKC), which is enabled through pairings (maps) from one group to another, is a promising solution in particular kinds of applications. In the distributed computing environment, it might be required to maintain user anonymity, where only the service provider can identify the user, while all other entities cannot determine any information on the user's identity. An e-voting system makes it possible for the voters to cast their ballots over a distributed network. It provides various services such as the anonymity and accuracy of voters, collision freedom, tally correctness, verifiability, and double voting detection [Lin et al., 2003] [Dini, 2001].
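The use of pseudonyms in identity management can be illustrated with a minimal sketch in which a user derives a different, stable pseudonym for each service from a secret that only the user holds. The keyed-hash (HMAC) derivation and the function name `pseudonym` are assumptions made for illustration; they do not describe how the Liberty Alliance specifications or any of the PETs named above work.

```python
import hashlib
import hmac


def pseudonym(user_secret: bytes, service: str) -> str:
    """Derive a stable, service-specific pseudonym from a user-held secret."""
    digest = hmac.new(user_secret, service.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]
```

The same user obtains the same pseudonym each time for a given service, but different pseudonyms for different services, so two services comparing their records cannot link them to one user without knowing the secret.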

3.7.3.7 XML security

Extensible Mark-up Language (XML), although designed as an information mark-up language and not as a mark-up language for electronic commerce, has become a standard for data exchange and electronic commerce, and inevitably suffers from various security weaknesses [Blyth et al., 2003]. King [2003] proposed a secure layer above the Web service using security standards such as WS-Security, Security Assertion Markup Language (SAML), XML Key Management Specification (XKMS), or XML Application Gateways (a combination of WS-Security and SAML). Other techniques include building security into the Web service, or hardening the network infrastructure that hosts the Web service. Integrity constraints are also an essential part of modern schema definition languages. They are useful for semantic specification, update consistency control, and query optimisation. UCM, a model of integrity constraints for XML [Fan et al., 2002], relies on a single notion of keys and foreign keys and on a powerful type system.

3.7.4 The Grid infrastructure

3.7.4.1 Grid overview

The Grid can be thought of as a series of abstract layers of different widths, such as connectivity layers, collective services, user applications, and resources containing any number of components. In practice, it is a set of additional protocols and services that build on Internet protocols and services, such as resource discovery, data management, scheduling of computation, and security. It offers powerful ways of working such as science portals, distributed computing, large-scale data analysis, and collaborative work. Grid service interfaces must be globally and uniquely named. The implementation of a level of abstraction can be done at any level of the data management software infrastructure. The visions of the Grid and the Semantic Web have much in common but can perhaps be distinguished by a difference in emphasis; the Grid is traditionally focused on computation, while the ambitions of the Semantic Web take it towards inference, proof, and trust.

The evolution of the Grid has been a continuous process. First-generation Grid systems involved solutions for sharing high-performance computing resources. Second-generation systems introduced middleware to cope with scale and heterogeneity. Third-generation systems are adopting a service-oriented approach, emphasizing distributed collaboration; they are metadata-enabled and may exhibit autonomic features of DIS. All components of the environment are virtualized in order to hide the complexities of data management and data integration. A Virtual Organization (VO) is a service defined as a network-enabled entity that permits mappings of multiple logical resource instances onto the same physical resource and composition of services regardless of implementation, as well as resource management based on composition from lower-level resources.

3.7.4.2 Grid security

Authentication, authorization, and security policy are among the most challenging issues in Grids. The inter-domain security solutions must be able to interoperate with, rather than replace, the diverse intra-domain access control technologies inevitably encountered in individual domains. In Grid environments, the distinction between client and server tends to disappear, because an individual resource can act as a server at one moment and as a client at another. There is no standard authorization mechanism for the Grid. Almost all current Grid software uses some form of access control lists (ACLs). Notable requirements are 'single sign-on' to many resources, achieved through proxy credentials used by user proxies (a proxy credential is a token that allows its owner to operate with the same or restricted rights and privileges as the subject that granted the token); 'mapping to local security mechanisms'; delegation; and 'community authorization', in which policy decisions, such as group membership identified by a TTP, are delegated to a community representative. It is relatively easy to establish a policy for homogeneous communities, but difficult to establish trust for large, heterogeneous VOs. Current Web technologies lack certain features required for VOs, such as single sign-on or delegation.

The Globus Toolkit [2001] is a community-based, open-architecture, open-source set of services and software libraries that supports Grids and Grid applications. It includes software for security, information infrastructure, resource management, data management, communication, fault detection, and portability. The Globus security architecture consists of entities such as users, user proxies, resource proxies, and general processes. It defines four different protocols for the creation of a user proxy, the allocation of a resource by the user or by a process in a remote domain, and making the user known in a remote domain. The Access Grid is a collection of resources and interfaces that support human collaboration across the Grid, including large-scale distributed meetings and training. A mediation service for monitoring terms of service and enhancing collaboration in Virtual Organizations [Sklavos and Koufopavlou] by ensuring secure, private access to service resources is presented by Shrivastava [2005].

3.8 Summary

Security requirements include confidentiality, integrity, availability, user identification, access control and accounting. Security measures against identified security risks are divided into proactive measures (oriented to deter and prevent passive security attacks) and reactive measures (oriented to detect security violations, i.e. active attacks, and to correct or repair system functionality). Security risks are the result of the interaction of threats and vulnerabilities. Security attacks are carried out over the system's communication channels. Information security requirements correspond to specific security policies; each security policy is materialized through a specific security service, which in turn is implemented using one or more security mechanisms as countermeasures against specific security attacks.

Digital cryptography provides the basis for most computer security mechanisms. Cryptography is used for the secrecy and identity of the communicating parties (using encryption and checksums), authentication of pairs of principals and the creation of digital signatures. Encryption algorithms are employed using shared secret keys or public/private key pairs. Digital signing and digital functions are used to sign documents digitally for ensuring message integrity. Authentication can be based on the use of passwords, digital signatures, data encryption, public-key certificates, biometrics, smart cards and steganography. Secure channels like SSL, based on encryption and authentication, provide a service layer on top of existing communication services.

All resources in a computer system can be divided into subjects (active) and objects (passive) and can be associated with an access control security component in the boundaries of a security policy domain. Access control, based on valid authorization, determines which operations subjects are allowed to perform on objects of the system. A security domain policy is then applied to a security policy domain for a specific computational environment. Different forms of access control include the access control matrix, Access Control Lists, capabilities, mandatory access control, discretionary access control, RBAC, multilevel and multilateral security and multipolicy systems. Firewalls are gateways that tightly control message traffic between private and public networks, while VPNs extend the firewall operation beyond the local intranet by using cryptographically protected channels at the IP level.

High-level security services are implemented as a second line of defence. They include network monitoring, Intrusion Detection Systems, auditing and tracing techniques, antivirus control and hardware-based security services, like VLSI chips. Network sensors monitor specific TCP flows. There are two primary IDS techniques, anomaly detection and misuse detection. Auditing techniques are used for accountability purposes. Tracing back techniques are used for working out the suspected path of an attack. Antivirus control is used for detecting and handling viruses, worms and Trojan horses. FPGAs and VLSI chips are hardware solutions offered as an alternative to software-based security. Fault tolerance is strongly related to dependability. Fault tolerance demands that a system continues to operate even in the presence of faults.
Redundancy is the primary method for masking faults, while process group replication is the main method for tolerating faults. Other methods for dealing with failures are fault avoidance, detecting & tracing analysis and recovery from failures, usually based on checkpoints and logging & auditing techniques.

Database security in DIS includes data protection (usually based on encryption), data authorization control and data integrity. Database integrity can be interpreted in terms of the enforcement of database integrity constraints, concurrency control, and backup and recovery procedures within an overall security and access control framework. The method of distributed multi-level transactions is the most used technique for operating on resources of a DIS. Communication is based on the SSL/TLS protocol. Mechanisms for maintaining database consistency include reliability, protection, semantic integrity control and concurrency control.

The object-oriented paradigm has dominated the design of DIS. Examples include CORBA, DCOM and Enterprise JavaBeans; their security models are similar. The Web has become an infrastructure for distributed applications. Web protocols support the service-oriented approach. Grid infrastructures, Peer-to-Peer architectures, workflow systems, e-commerce and customer-oriented web services are based on secure distributed computations in the form of distributed transactions. A distributed computation is composed of a dynamic group of processes running on different resources and sites. The processes exchange messages through authenticated communication channels that protect integrity and secrecy (privacy), and also provide user anonymity when it is required, based on the use of protocols such as SSL/TLS, SET and PET. Trust in e-business is based on the quality of the exchanged services or products and usually relies on the existence of a Trusted Third Party.

The analysis of the current security approaches across distributed systems, like baseline approaches or risk management, has shown that they are characterized by their locality. Perimeter network defence based primarily on the use of internal and external firewalls, or solutions applied on a piecemeal basis, cannot guarantee adequate protection from potential security breaches. Assuring optimal security of a distributed information system is not a trivial task, as it requires a wide variety of expertise, from technological to organizational.
A complete, holistic security strategy needs to be layered to deal with high-level aspects such as continuity strategies (threat assessment, risk evaluation & control), security policies, incident response plan, host-based & network-based perimeter and/or perimeterless detection, auditing procedures, fault tolerance and recovery strategies, anti-malware control (intrusion detection, router and firewall security, antivirus control) as well as legal and regulatory compliance. These layered protection measures need to be taken in a distributed system in order to ensure secure transparent distributed computations and to enhance the availability of the system's services in complex and interoperable environments. Processes and channels should be secured and resources should be protected in order to achieve data sharing transparently between system components.

4 Category theory overview and analysis

4.1 Introduction to category theory

Category theory is a generalized mathematical theory of structures, which formalizes a number of algebraic properties of collections of transformations between mathematical objects (such as binary relations, groups, sets, topological spaces, etc.) of the same type, subject to the constraint that the collections contain the identity mapping and are closed with respect to compositions of mappings. Category theory provides a formal approach to process simply by the use of the arrow; for example, commutative diagrams are as formal as an algebraic expression. Categories are of the nature of types. The following overview of category theory is based on the work of Mac Lane [1998], Lawvere [1963; 1964; 1969a; 1969b; 1986], Pierce [1991], Barr and Wells [1985; 1999], Rydeheard and Burstall [1988], Lambek [1980] and Scott [1986], Pitt et al. [1985], Schalk and Simmons [2003], Asperti and Longo [1991], Freyd and Scedrov [1990] and Fokkinga [1994]. All the figures, tables and diagrams in the current and the next chapter are believed to be original unless clearly stated otherwise. Categories in figures are represented as rectangles with internal structure (objects and arrows). Their names are written in bold uppercase. The categories involved in the applications using applied category theory in Chapter 5 and Chapter 6 are written in lowercase italics.

4.2 Categories and functors

A category consists of a collection of data (objects), e.g. computer resources such as subjects and objects, that satisfy some particular properties (such as operations and axioms), and for each pair of objects, a collection of morphisms (arrows), e.g. access rights, from one to another. Operations assign to each arrow $f$ an object $\mathrm{dom}\, f$, its domain, and an object $\mathrm{cod}\, f$, its codomain ($f : a \to b$, $\mathrm{dom}\, f = a$, $\mathrm{cod}\, f = b$). The collection of all arrows with domain $a$ and codomain $b$ is written $C(a, b)$. Thus, a category can be thought of as a directed graph [Barr and Wells, 1999] together with two functions, composition and identity; the objects of a category correspond to the nodes of a graph. The composition operator $\circ$ assigns to each pair of arrows $f$ and $g$, with $\mathrm{cod}\, f = \mathrm{dom}\, g$, a composite arrow $g \circ f : \mathrm{dom}\, f \to \mathrm{cod}\, g$, satisfying the associative law, that is for any arrows $f : a \to b$, $g : b \to c$, $h : c \to d$ (with $a, b, c, d$ not necessarily distinct), $h \circ (g \circ f) = (h \circ g) \circ f$. For each object $a$, an identity arrow $1_a : a \to a$ satisfies the identity law, that is for any arrow $f : a \to b$, $1_b \circ f = f$ and $f \circ 1_a = f$.

A functor $F : C \to D$ is a map taking each C-object $a$ to a D-object $Fa$ and each C-arrow $f : a \to b$ to a D-arrow $Ff : Fa \to Fb$, such that for all C-objects $a$ and composable C-arrows $f$ and $g$, $F(1_a) = 1_{Fa}$ and $F(g \circ f) = Fg \circ Ff$. Such functors are usually called covariant.
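The definition above can be made concrete with a small finite category. The following is a minimal sketch, assuming a hypothetical toy category whose objects are three resources and whose arrows are named access paths; none of these names come from the thesis. It checks the identity and associative laws by brute force.

```python
from itertools import product

# Hypothetical toy category: three resources and named access paths.
# Each arrow name maps to its (domain, codomain) pair.
ARROWS = {
    "id_user": ("user", "user"),
    "id_proxy": ("proxy", "proxy"),
    "id_file": ("file", "file"),
    "delegate": ("user", "proxy"),
    "read": ("proxy", "file"),
    "read_via_proxy": ("user", "file"),
}
IDENTITY = {"user": "id_user", "proxy": "id_proxy", "file": "id_file"}
# Composites of non-identity arrows, keyed as (g, f) for g . f.
COMPOSITES = {("read", "delegate"): "read_via_proxy"}


def composable(g, f):
    """g . f is defined exactly when cod f = dom g."""
    return ARROWS[f][1] == ARROWS[g][0]


def compose(g, f):
    """Return the composite g . f (apply f first, then g)."""
    if not composable(g, f):
        raise ValueError(f"{g} . {f} is not defined")
    if f == IDENTITY[ARROWS[f][0]]:
        return g
    if g == IDENTITY[ARROWS[g][1]]:
        return f
    return COMPOSITES[(g, f)]


def satisfies_category_laws():
    """Check the identity and associative laws on every composable triple."""
    for f, (a, b) in ARROWS.items():
        if compose(IDENTITY[b], f) != f or compose(f, IDENTITY[a]) != f:
            return False
    for f, g, h in product(ARROWS, repeat=3):
        if composable(g, f) and composable(h, g):
            if compose(h, compose(g, f)) != compose(compose(h, g), f):
                return False
    return True
```

Here `compose("read", "delegate")` yields the arrow `read_via_proxy` from `user` to `file`, so the composite has domain $\mathrm{dom}\, f$ and codomain $\mathrm{cod}\, g$ as the definition requires.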

4.3 Commutative diagrams

A diagram in a category C is a collection of vertices and directed edges, consistently labelled with objects and arrows of C. Such a diagram is said to commute if, for every pair of vertices $x$ and $y$, all the paths in the diagram from $x$ to $y$ are equal, in the sense that each path determines an arrow (the composite of its edges) and these arrows are equal in C. For example, saying that the diagram in Figure 4-1 commutes [Pierce, 1991] means that $f \circ g' = g \circ f'$.

Figure 4-1: Equality of paths in a commutative diagram

4.4 Natural transformations

Given two categories, C and D, and two functors $F$ and $G$ from C to D, a natural transformation $\eta$ from $F$ to $G$ (written $\eta : F \xrightarrow{\cdot} G$ and often called a morphism of functors) is a function that assigns to each C-object $a$ a D-arrow $\eta_a : Fa \to Ga$, such that for any C-arrow $f : a \to b$ the diagram in Figure 4-2 [Pierce, 1991] commutes in D, that is $\eta_b \circ Ff = Gf \circ \eta_a$.

Figure 4-2: Components of a natural transformation
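As an illustration of the naturality condition, the sketch below takes $F$ to be the list functor and $G$ an "optional value" functor on Python functions, with `safe_head` playing the role of $\eta$. These functors and names are assumptions chosen for the example and are not taken from the thesis; `None` stands for a missing value, so the sketch assumes lists that do not themselves contain `None`.

```python
def list_map(f):
    """Arrow part of the list functor F: sends f : a -> b to F f : list[a] -> list[b]."""
    return lambda xs: [f(x) for x in xs]


def optional_map(f):
    """Arrow part of the 'optional' functor G; None stands for a missing value."""
    return lambda x: None if x is None else f(x)


def safe_head(xs):
    """Component eta_a : list[a] -> optional[a], the same recipe for every a."""
    return xs[0] if xs else None


def naturality_holds(f, xs):
    """Check eta_b . F f == G f . eta_a on one input xs."""
    return safe_head(list_map(f)(xs)) == optional_map(f)(safe_head(xs))
```

Calling `naturality_holds(len, ["ab", "c"])` compares the two paths around the square of Figure 4-2 for the arrow `len`; a single check is evidence for one input, not a proof of naturality.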

4.5 Adjointness

An adjunction consists of a pair of categories (C and D) [Lawvere, 1969a] [Mac Lane, 1998], a pair of functors $F : C \to D$ and $G : D \to C$, and a natural transformation $\eta : I_C \xrightarrow{\cdot} (G \circ F)$, such that for each C-object $X$, D-object $Y$ and C-arrow $f : X \to G(Y)$, there is a unique D-arrow $f^{\#} : F(X) \to Y$ yielding the commutative diagram in Figure 4-3 [Pierce, 1991], that is $G(f^{\#}) \circ \eta_X = f$.

Figure 4-3: The unit of the adjunction

The pair $(F, G)$ is an adjoint pair of functors. $F$ is the left adjoint of $G$ and $G$ is the right adjoint of $F$. The natural transformation $\eta$ is called the unit of the adjunction.

Associated with each adjunction is another natural transformation $\varepsilon : (F \circ G) \xrightarrow{\cdot} I_D$, called the co-unit of the adjunction, with the property that for each D-arrow $g : F(X) \to Y$ there is a unique C-arrow $g^{*} : X \to G(Y)$ for which the diagram in Figure 4-4 commutes [Pierce, 1991], that is $\varepsilon_Y \circ F(g^{*}) = g$.

Figure 4-4: The co-unit of the adjunction

Adjointness occurs when there is an exact correspondence between D-arrows $F(X) \to Y$ and C-arrows $X \to G(Y)$, as can be seen in the diagram in Figure 4-5.

Figure 4-5: Adjoint functors F and G
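A familiar instance of this correspondence, offered here only as an illustrative assumption and not as an example from the thesis, is the adjunction between sets and monoids: $F$ sends a set to the monoid of lists over it and $G$ forgets the monoid structure. The sketch below builds $f^{\#}$ from $f$ and lets one check that $G(f^{\#}) \circ \eta_X = f$.

```python
from operator import add


def unit(x):
    """eta_X : X -> G(F(X)), sending an element to the one-element list."""
    return [x]


def extend(f, op, neutral):
    """Return f# : F(X) -> Y for a monoid Y = (op, neutral) and f : X -> G(Y)."""
    def f_sharp(xs):
        result = neutral
        for x in xs:
            result = op(result, f(x))
        return result
    return f_sharp


# Illustrative choice: X = strings, Y = integers under addition, f = len.
total_length = extend(len, add, 0)
```

With this choice, `total_length(unit("abc"))` agrees with `len("abc")`, and `total_length` respects list concatenation and the empty list, which is what makes it an arrow between monoids.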

4.6 Duality, contravariance and opposites

The dual (or opposite) category $C^{op}$ of a category C is defined as having the objects of C; $C^{op}$-arrows are in one-to-one correspondence with the C-arrows, but in the reverse direction. Thus, for each C-arrow $f : a \to b$, there is a corresponding $C^{op}$-arrow $f^{op} : b \to a$.

Given two categories, C and D, a covariant functor $F : C^{op} \to D$ can equivalently be regarded as a contravariant functor $F : C \to D$, which assigns to each C-object $a$ a D-object $Fa$ and to each C-arrow $f : a \to b$ a D-arrow $Ff : Fb \to Fa$ in the opposite direction (Figures 4-6, 4-7, 4-8, 4-9).

Figure 4-6: A category C and its dual category $C^{op}$

Figure 4-7: A covariant functor $F : C \to D$

Figure 4-8: A covariant functor $F : C^{op} \to D$

Figure 4-9: A contravariant functor $F : C \to D$

4.7 Universal arrows and universal constructions

A universal construction describes a class of objects and accompanying arrows that share a common property and picks out the objects that are terminal when this class is considered as a category. The entities defined by a universal construction are said to be universal among entities satisfying the given property, or simply to have the universal property. The unique (universal) arrows to them from other objects sharing the given property are often called mediating arrows. In the special case that these arrows are not unique, they are called weak universal arrows. A co-universal construction has the same form as a universal construction, except that the arrows are reversed and it picks out the initial object with the specified property.

Limits and the dual notion of co-limits are examples of universal and co-universal constructions. Initial and terminal objects, products/co-products, equalizers/co-equalizers and pullbacks/pushouts are specific instances of them. Another example of a universal arrow is the initial object $\langle f, d \rangle$ in the comma category $(b \downarrow S)$, for a functor $S : D \to C$ and a C-object $b$.

4.8 Limits and colimits

Let C be a category and D a diagram in C. A cone $\{f_i : a \to d_i\}$ for D is a C-object $a$ and arrows $f_i : a \to d_i$, one for each D-object $d_i$, such that for each D-arrow $g : d_i \to d_j$ the diagram in Figure 4-10 [Pierce, 1991] commutes, that is $g \circ f_i = f_j$.

Figure 4-10: Cones $\{f_i : a \to d_i\}$ for a diagram D

A limit for a diagram D is a cone $\{f_i : a \to d_i\}$ with the property that if $\{f'_i : a' \to d_i\}$ is another cone for D then there is a unique arrow $k : a' \to a$ such that the diagram in Figure 4-11 [Pierce, 1991] commutes for every $d_i$ in D, that is $f_i \circ k = f'_i$. Thus, a limit is a terminal object in the category of cones for the diagram D.

Figure 4-11: A limit for a diagram D

4.9 Products and coproducts

A C-object $0$ is called an initial object if, for every C-object $a$, there is exactly one C-arrow from $0$ to $a$. Dually, a C-object $1$ is called a terminal or final object if, for every C-object $a$, there is exactly one C-arrow from $a$ to $1$. An arrow from a terminal object to an object $a$ is called a global element or constant of $a$.

A product of two C-objects $a$ and $b$ is a C-object $a \times b$, together with two projection arrows $P : a \times b \to a$ and $Q : a \times b \to b$, such that for any C-object $c$ and pair of arrows $f : c \to a$ and $g : c \to b$, there is exactly one mediating arrow $\langle f, g \rangle : c \to a \times b$ making the diagram in Figure 4-12 commute [Pierce, 1991], that is $P \circ \langle f, g \rangle = f$ and $Q \circ \langle f, g \rangle = g$.

Figure 4-12: The product of objects a and b

If a category C has a product $a \times b$ for every pair of objects $a$ and $b$, then it is said that C has all (binary) products, or simply that C has products.

A coproduct of two C-objects $a$ and $b$ in a category C is a C-object $a + b$, together with two injection C-arrows $I : a \to a + b$ and $J : b \to a + b$, such that for any C-object $c$ and pair of C-arrows $f : a \to c$ and $g : b \to c$, there is exactly one arrow $[f, g] : a + b \to c$ making the diagram in Figure 4-13 commute [Pierce, 1991], that is $[f, g] \circ I = f$ and $[f, g] \circ J = g$.

Figure 4-13: The co-product of objects a and b
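The product and coproduct constructions can be sketched for ordinary Python functions, assuming tuples as products and tagged values as coproducts. The tags `"left"` and `"right"` and the helper names are illustrative assumptions, not notation from the thesis.

```python
def pairing(f, g):
    """Mediating arrow <f, g> : c -> a x b."""
    return lambda x: (f(x), g(x))


def P(pair):
    """Projection P : a x b -> a."""
    return pair[0]


def Q(pair):
    """Projection Q : a x b -> b."""
    return pair[1]


def I(x):
    """Injection I : a -> a + b, tagging the value as coming from a."""
    return ("left", x)


def J(y):
    """Injection J : b -> a + b, tagging the value as coming from b."""
    return ("right", y)


def copairing(f, g):
    """Mediating arrow [f, g] : a + b -> c."""
    def mediate(tagged):
        tag, value = tagged
        return f(value) if tag == "left" else g(value)
    return mediate
```

For instance, `P(pairing(len, str.upper)("ab"))` recovers `len("ab")`, and `copairing(len, abs)(J(-4))` recovers `abs(-4)`, mirroring the two pairs of equations stated above.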

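4.10 Pullbacks

A pullback of a pair of C-arrows $f : A \to C$ and $g : B \to C$ is a C-object $P$ and a pair of C-arrows $g' : P \to A$ and $f' : P \to B$ making the diagram in Figure 4-14 [Asperti and Longo, 1991] commute, that is $f \circ g' = g \circ f'$, and universal with this property: for any other C-object $X$ with arrows $p : X \to A$ and $q : X \to B$ such that $f \circ p = g \circ q$, there is a unique arrow $k : X \to P$ with $g' \circ k = p$ and $f' \circ k = q$. The dual universal construction of a pullback is a pushout.

Figure 4-14: A pullback diagram for two C-objects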
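In the category of finite sets, a pullback can be computed directly as the set of pairs on which the two arrows agree. The sketch below assumes hypothetical data (users mapped to a role, resources mapped to a required role) purely for illustration; it is not an example from the thesis.

```python
def pullback(A, B, f, g):
    """Pullback of f : A -> C and g : B -> C in finite sets."""
    P = {(a, b) for a in A for b in B if f(a) == g(b)}
    g_prime = lambda p: p[0]   # g' : P -> A
    f_prime = lambda p: p[1]   # f' : P -> B
    return P, g_prime, f_prime


# Hypothetical data: users and resources, each mapped to a role.
role_of = {"alice": "admin", "bob": "staff"}
required_role = {"payroll": "admin", "wiki": "staff", "audit_log": "admin"}

P, g_prime, f_prime = pullback(
    role_of.keys(), required_role.keys(), role_of.get, required_role.get
)
```

Every element `p` of `P` satisfies `role_of[g_prime(p)] == required_role[f_prime(p)]`, which is the commutativity condition $f \circ g' = g \circ f'$ read pointwise.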

4.11 Exponentiation

In certain categories, the collection of arrows $C(a, b)$ is representable as a C-object $b^a$, that is $b^a = \{f : a \to b\}$. Associated with $b^a$ is a special C-arrow (the evaluation function) $eval : (b^a \times a) \to b$, defined by the rule $eval(f, a) = f(a)$, and having the universal property that for each C-arrow $g : (c \times a) \to b$ there is exactly one C-arrow $curry(g) : c \to b^a$ such that $eval \circ (curry(g) \times 1_a) = g$, as can be seen in Figure 4-15. The projection arrows $P$, $Q$, $P'$, $Q'$ are used to get the components $c$, $a$, $b^a$, $a$ of the products $c \times a$ and $b^a \times a$, respectively.

Figure 4-15: Exponentiation

Exponentiation is a way for functional languages to allow functions to be passed as parameters to other functions and returned as results of functions. Thus, a two-argument function is reduced to a one-argument function that yields a function from the second argument to the result. This is one way to interpret the notion of currying. When the product of two objects $c \times a$ should be associated with a third object $b$, this can be achieved through the arrow $g$. The arrow $curry(g)$ then allows us to associate the first component of the product (i.e. $c$) with the collection of arrows between the second (i.e. $a$) and the third object (i.e. $b$), that is the exponential object $b^a$. In a category C that has exponentiation, the functor $a \times (-)$ is a left adjoint to the functor $(-)^a$ for a C-object $a$. The evaluation function also in this case forms a natural transformation; object $b$ is taken from the exponential object, by deduction.
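The relationship between $g$, $curry(g)$ and $eval$ can be sketched with Python closures. The function `has_access` and its data are hypothetical and serve only to give $g$ a concrete shape; the trailing underscore in `eval_` avoids shadowing Python's built-in `eval`.

```python
def curry(g):
    """curry(g) : c -> b^a for g : c x a -> b."""
    return lambda c: (lambda a: g(c, a))


def eval_(pair):
    """eval : b^a x a -> b, applying a function to an argument."""
    f, a = pair
    return f(a)


def via_exponential(g):
    """The composite eval . (curry(g) x 1_a), which should equal g."""
    return lambda c, a: eval_((curry(g)(c), a))


def has_access(role, resource):
    # Hypothetical two-argument check used only for illustration.
    return (role, resource) in {("admin", "payroll"), ("staff", "wiki")}
```

Here `curry(has_access)("admin")` is a one-argument function from resources to truth values, and `via_exponential(has_access)` agrees with `has_access` on every pair of arguments.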

4.12 Cartesian closed categories

A Cartesian closed category is a category with a terminal object, finite (binary) products and exponentials. One area of computer science where the influence of category theory is particularly evident is in semantic models of programming languages. There is a strong connection between Cartesian closed categories and typed λ-calculus, an abstract programming language, which is outside the scope of this research. Typed λ-calculi served as the foundation for modern type systems. In general, λ-calculus is a formal system for function definition, function application and recursion. More information on λ-calculus can be found in the work of Church [1941], Lambek [1980] and Scott [1986], Selding and Hindley [1980; 1986], Barendregt [1984], Lawvere [1969b], Hankin [2004], Asperti and Longo [1991] and Fokkinga [1994].

The category CPO of complete partial orders and continuous functions is Cartesian closed, with $b^a$ the CPO of continuous functions from $a$ to $b$. In any Cartesian closed category, $B^{a \times a'}$ is isomorphic to $(B^a)^{a'}$ [Pierce, 1991]. Another example, adapted from Pierce [1991], refers to propositional logic. Let S be the sentences of propositional logic, regarded as a preorder $(S, \leq)$, where $p \leq q$ means that from p we can derive q. Then S forms a Cartesian closed category, where products are given by conjunction of propositions and the exponential $q^p$ corresponds to "p implies q".

Certain Cartesian closed categories, the topoi, have been proposed as a general setting for mathematics, instead of traditional set theory.

4.13 Toposes or Topoi

A topos T is a Cartesian closed category with an object $\Omega$, which represents truth values. The internal logic of toposes is intuitionistic [Lambek and Scott, 1986]. For a category C, a sub-object classifier is an object $\Omega$ together with a monomorphism $t : 1 \to \Omega$ (called the truth), where $1$ is the terminal object (with $! : a \to 1$), such that every monomorphism $m : a \to c$ in C is a pullback of $t$ along a unique arrow $\chi : c \to \Omega$, as can be seen in Figure 4-16; in particular $\chi \circ m = t \circ {!}$. The unique arrow $\chi$ is called the character of $m$ [Mac Lane, 1998] [Rydeheard and Burstall, 1988] [Barr and Wells, 1985].

Figure 4-16: A sub-object classifier $\Omega$ in a category C, as a pullback diagram
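In the category of sets the sub-object classifier is the two-element set of truth values, and the character of an inclusion is the familiar membership test. The sketch below assumes that setting and uses hypothetical data (a set of principals and its authorised subset) for illustration only.

```python
def character(subset):
    """chi : c -> Omega for the inclusion m : subset -> c, with Omega = {True, False}."""
    return lambda x: x in subset


def pull_back_truth(c, chi):
    """Pullback of t : 1 -> Omega along chi: the elements that chi sends to True."""
    return {x for x in c if chi(x)}


# Hypothetical data: all principals, and the sub-object of authorised ones.
principals = {"alice", "bob", "carol"}
authorised = {"alice", "carol"}
chi = character(authorised)
```

Pulling the truth arrow back along `chi` returns exactly `authorised`, so the subset and its character determine one another.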





4.14 Product of categories

Given two categories B and C, a new category $B \times C$ can be constructed, called the product of B and C [Mac Lane, 1998]. A $B \times C$-object is a pair $\langle b, c \rangle$ of a B-object $b$ and a C-object $c$. A $B \times C$-arrow $\langle b, c \rangle \to \langle b', c' \rangle$ is a pair $\langle f, g \rangle$ of a B-arrow $f : b \to b'$ and a C-arrow $g : c \to c'$. The composite of two such arrows

$$\langle b, c \rangle \xrightarrow{\langle f, g \rangle} \langle b', c' \rangle \xrightarrow{\langle f', g' \rangle} \langle b'', c'' \rangle$$

is defined in terms of the composites in B and C by $\langle f', g' \rangle \circ \langle f, g \rangle = \langle f' \circ f, g' \circ g \rangle$. Functors $P : B \times C \to B$ and $Q : B \times C \to C$, called the projections of the product, are defined on objects by $P\langle b, c \rangle = b$ and $Q\langle b, c \rangle = c$, and on arrows by $P\langle f, g \rangle = f$ and $Q\langle f, g \rangle = g$. For any category D and two functors $R : D \to B$ and $T : D \to C$, there is a unique functor $F : D \to B \times C$ with $PF = R$ and $QF = T$. For a D-arrow $h$, these two conditions require that $Fh = \langle Rh, Th \rangle$, making the diagram in Figure 4-17 commute [Mac Lane, 1998]. This property of the product category states that the projections $P$ and $Q$ are universal among pairs of functors to B and C.

Figure 4-17: The product category $B \times C$

4.15 Product functors

Two functors $U : B \to B'$ and $V : C \to C'$ have a product functor $U \times V : B \times C \to B' \times C'$ defined explicitly on objects and arrows as $(U \times V)\langle b, c \rangle = \langle Ub, Vc \rangle$ and $(U \times V)\langle f, g \rangle = \langle Uf, Vg \rangle$ for a B-arrow $f : b \to b'$ and a C-arrow $g : c \to c'$, respectively, thus making the diagram in Figure 4-18 commute.

Figure 4-18: The product functor $U \times V : B \times C \to B' \times C'$

The operation $\times$ is itself a functor ($Cat \times Cat \to Cat$). To each pair of categories $\langle B, C \rangle$ there is a corresponding new category $B \times C$ and to each pair of functors $\langle U, V \rangle$ a new functor $U \times V$. Moreover, when the composites $U' \circ U$ and $V' \circ V$ are defined, we have $(U' \times V') \circ (U \times V) = (U' \circ U) \times (V' \circ V)$, which makes the diagram in Figure 4-19 commute.

Figure 4-19: The composite product functor $(U' \times V') \circ (U \times V) = (U' \circ U) \times (V' \circ V)$

4.16 Bifunctors or functors of two variables

Functors $F : B \times C \to D$, from a product category to another category, are called bifunctors (on categories B and C). If one of the two arguments in a bifunctor $F(\_, \_)$ is held constant, the result is an ordinary functor of the remaining argument, and the whole bifunctor $F$ is determined by all the available combinations of these two one-variable functors (Figure 4-20) [Mac Lane, 1998]. A bifunctor may also have mixed variance: the hom-functor, for example, is contravariant in its first argument and covariant in the second.

Figure 4-20: A bifunctor $F : B \times C \to D$

If for all C-objects $c$ and B-objects $b$ there are functors $L_c : B \to D$ and $M_b : C \to D$ such that $M_b(c) = L_c(b)$, then there exists a bifunctor $F : B \times C \to D$ with $F(\_, c) = L_c$ for all $c$ and $F(b, \_) = M_b$ for all $b$, if and only if for every pair of arrows $f : b \to b'$ and $g : c \to c'$, $M_{b'} g \circ L_c f = L_{c'} f \circ M_b g$ (Figure 4-21) [Mac Lane, 1998]. These equal arrows [Mac Lane, 1998 p.37] are the value $F(f, g)$ of the arrow function of $F$ at $f$ and $g$.

Figure 4-21: The bifunctor in terms of functors $L_c$ and $M_b$
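The interchange condition can be illustrated with the pairing bifunctor on Python tuples, an assumed example in which $L_c f$ acts on the first component and $M_b g$ on the second. The names `L`, `M` and `F` follow the symbols in the text; everything else is illustrative.

```python
def L(f):
    """L_c f = F(f, 1_c): act on the first component only."""
    return lambda p: (f(p[0]), p[1])


def M(g):
    """M_b g = F(1_b, g): act on the second component only."""
    return lambda p: (p[0], g(p[1]))


def F(f, g):
    """Arrow function of the pairing bifunctor: apply f and g componentwise."""
    return lambda p: (f(p[0]), g(p[1]))
```

For a sample pair such as `("ab", -3)`, applying `L(len)` then `M(abs)`, or `M(abs)` then `L(len)`, gives the same result as `F(len, abs)`, which is the equality of the two composites stated above.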

4.17 Subcategories

A subcategory S of a category C is a collection of C-objects and of C-arrows, which includes:

• for each arrow $f$ the object $\mathrm{dom}\, f$, its domain, and the object $\mathrm{cod}\, f$, its codomain ($f : A \to B$, $\mathrm{dom}\, f = A$, $\mathrm{cod}\, f = B$)

• for each object $A$, its identity arrow $1_A$

• for each pair of composable arrows $f : A \to B$, $f' : B \to C$, their composite $f'' : A \to C$

Moreover, the injection (inclusion) map $S \to C$ that sends each S-object and each S-arrow to itself (in C) is a functor, the inclusion functor.

4.18 Monads and comonads

A functor $T$ from a category C to itself ($T : C \to C$) is called an endofunctor. It has composites $T^2 = T \circ T : C \to C$ and $T^3 = T^2 \circ T : C \to C$. If $\mu : T^2 \to T$ is a natural transformation, with components $\mu_c : T^2 c \to Tc$ for each $c \in C$, then $T\mu : T^3 \to T^2$ denotes the natural transformation with components $(T\mu)_c = T(\mu_c) : T^3 c \to T^2 c$, while $\mu T : T^3 \to T^2$ has components $(\mu T)_c = \mu_{Tc}$.

A monad (or triple) $T = \langle T, \eta, \mu \rangle$ in a category C [Mac Lane, 1998] [Barr and Wells, 1985] [Lawvere, 1986] consists of a functor $T : C \to C$ and two natural transformations $\eta : I_C \to T$ and $\mu : T^2 \to T$, making the following diagrams commute (Figures 4-22 and 4-23, respectively) [Mac Lane, 1998], that is $\mu \circ T\mu = \mu \circ \mu T$ and $\mu \circ \eta T = 1_T = \mu \circ T\eta$.

Figure 4-22: Associative law for monad T

Figure 4-23: Left & right unit laws for monad T
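The two monad laws can be checked on a standard example, the list monad, which is assumed here for illustration and is not an example drawn from the thesis. In this sketch `T` is the arrow part of the list functor, `eta` wraps a value in a one-element list and `mu` flattens one level of nesting.

```python
def T(f):
    """Arrow part of the list endofunctor: T f maps f over a list."""
    return lambda xs: [f(x) for x in xs]


def eta(x):
    """Unit eta : I -> T, wrapping a value in a one-element list."""
    return [x]


def mu(xss):
    """Multiplication mu : T^2 -> T, flattening one level of nesting."""
    return [x for xs in xss for x in xs]


def associative_law(xsss):
    """Check mu . T mu == mu . mu T on one triply nested list."""
    return mu(T(mu)(xsss)) == mu(mu(xsss))


def unit_laws(xs):
    """Check mu . eta T == 1_T == mu . T eta on one list."""
    return mu(eta(xs)) == xs == mu(T(eta)(xs))
```

Both checks correspond to the diagrams of Figures 4-22 and 4-23 evaluated at a single input, so they illustrate the laws rather than prove them.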

Every adjunction $\langle F, G, \eta, \varepsilon \rangle : C \rightharpoonup D$ gives rise to a monad in the category C. The two functors $F : C \to D$, $G : D \to C$ have a composite endofunctor $T = GF$, while $\eta : I_C \to T$ is the unit and $\varepsilon : FG \to I_D$ is the counit of the adjunction, respectively. The latter yields by horizontal composition a natural transformation $\mu = G\varepsilon F : GFGF \to GF = T$, making the first of the following diagrams commute (Figure 4-24) [Mac Lane, 1998]. Also, dropping $G$ in front and $F$ behind and applying the horizontal composition $\varepsilon\varepsilon = \varepsilon \circ (FG\varepsilon) = \varepsilon \circ (\varepsilon FG)$ produces the second commutative diagram (Figure 4-25) [Mac Lane, 1998].

Figure 4-24: Associative law for $T = GF$

Figure 4-25: Interchange law for $T = GF$

For $T = GF$, the left and right unit laws in the diagram of Figure 4-23 reduce to the two triangular identities of the adjunction, $1_G = G\varepsilon \circ \eta G : G \to G$ (where $G = I_C G \xrightarrow{\eta G} GFG \xrightarrow{G\varepsilon} G I_D = G$) and $1_F = \varepsilon F \circ F\eta : F \to F$ (where $F = F I_C \xrightarrow{F\eta} FGF \xrightarrow{\varepsilon F} I_D F = F$), making the diagram in Figure 4-26 commute [Mac Lane, 1998].

Figure 4-26: Left & right unit laws for monad $T = GF$

A comonad (or cotriple) in a category D consists of an endofunctor $L : D \to D$ and natural transformations $\varepsilon : L \to I_D$ and $\delta : L \to L^2$ making the diagrams in Figures 4-27 and 4-28 commute, that is $L\delta \circ \delta = \delta L \circ \delta$ and $\varepsilon L \circ \delta = 1_L = L\varepsilon \circ \delta$.

Figure 4-27: Associative law for comonad L

Figure 4-28: Left & right unit laws for comonad L

In this way, every adjunction $\langle F, G, \eta, \varepsilon \rangle : C \rightharpoonup D$ gives rise to a comonad $\langle FG, \varepsilon, F\eta G \rangle$ in category D, where $\delta = F\eta G$. For $L = FG$, the left and right unit laws in the diagram of Figure 4-28 again reduce to the two triangular identities, $1_F = \varepsilon F \circ F\eta : F \to F$ (where $F \xrightarrow{F\eta} FGF \xrightarrow{\varepsilon F} I_D F = F$) and $1_G = G\varepsilon \circ \eta G : G \to G$ (where $G \xrightarrow{\eta G} GFG \xrightarrow{G\varepsilon} G I_D = G$), making the diagram in Figure 4-29 commute.

Figure 4-29: Left & right unit laws for comonad $L = FG$

4.19 Comma categories Let C be a category and b a C-object, then the category of objects under b (b  C) has as objects all pairs  f , c where c is a C-object and f a C-arrow f : b  c . An arrow of this category is in the form h :  f , c   f , c for which h f  f  (Figure 4-30). Composition in category (b  C) is given by the composition of the basic arrows h in category C. In a similar way, we can define the category of objects over a

(C  a) , of a C-object a. It has as objects all pairs c, f  where c is a C-object and f a C-arrow f : c  a . An arrow of this category is in the form h : c, f   c, f  for which f  h  f (Figure 4-31). Composition in category (C  a) is given again by the composition of the basic C-arrows h. C

Figure 4-30: Objects under b, $(b \downarrow C)$

Figure 4-31: Objects over a, $(C \downarrow a)$

Now, if b is a C-object and $S : D \to C$ a functor between categories D and C, then the category of objects S-under b, $(b \downarrow S)$, has as objects all the pairs $\langle f, d \rangle$, where d is an object of D and $f : b \to Sd$ a C-arrow (Figure 4-32). An arrow $\langle f, d \rangle \to \langle f', d' \rangle$ of this category is a D-arrow $h : d \to d'$ for which $f' = Sh \circ f$. Composition in $(b \downarrow S)$ is given by the composition of D-arrows h. Thus, for D-arrows $h : d \to d'$ and $h' : d' \to d''$, composition is given by the arrow $h'' = h' \circ h$. Equality of arrows in $(b \downarrow S)$ means their equality as arrows of D. In a similar way, we can define the category of objects T-over a, $(T \downarrow a)$ (Figure 4-33), for a C-object a and a functor $T : E \to C$ between two categories E and C. The category $(T \downarrow a)$ has as objects all the pairs $\langle e, f \rangle$, where e is an E-object and $f : Te \to a$ a C-arrow. An arrow $\langle e, f \rangle \to \langle e', f' \rangle$ of this category is an E-arrow $k : e \to e'$ for which $f = f' \circ Tk$. Composition in $(T \downarrow a)$ is given by the composition of E-arrows k. Thus, having E-arrows $k : e \to e'$ and $k' : e' \to e''$, composition is given by the arrow $k'' = k' \circ k$. Equality of arrows in $(T \downarrow a)$ means their equality as arrows of E.

Figure 4-32: Objects S-under b, $(b \downarrow S)$

Figure 4-33: Objects T-over a, $(T \downarrow a)$

Considering categories and functors $T : E \to C$ and $S : D \to C$, the comma category $(T \downarrow S)$ (Figure 4-34), which can also be written as $(T, S)$, has as objects all the triples $\langle e, d, f \rangle$, where d is a D-object, e is an E-object and $f : Te \to Sd$ a C-arrow. An arrow $\langle e, d, f \rangle \to \langle e', d', f' \rangle$ of this category consists of a pair of arrows $\langle k, h \rangle$, that is an E-arrow $k : e \to e'$ and a D-arrow $h : d \to d'$, for which $f' \circ Tk = Sh \circ f$. The composite $\langle k'', h'' \rangle = \langle k', h' \rangle \circ \langle k, h \rangle$ is given by the pair of arrows $\langle k' \circ k, h' \circ h \rangle$, when it is defined. Equality of arrows in $(T \downarrow S)$ refers to the equality of the corresponding arrows in categories E and D [Lawvere, 1963] [Lawvere, 1969a] [Mac Lane, 1998] [Rydeheard and Burstall, 1988].

Figure 4-34: The comma category $(T \downarrow S)$
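The commuting condition $f' \circ Tk = Sh \circ f$ on arrows of a comma category can be checked mechanically in a small finite setting. The sketch below is only an illustration under stated assumptions: finite functions are stored as Python dicts, T is taken to be the identity functor and S the functor $X \mapsto X \times X$ on finite sets. All names (`compose`, `identity_functor`, `square_functor`, `is_comma_arrow`) and the sample data are hypothetical.

```python
def compose(g, f):
    """Return g . f for finite functions stored as dicts."""
    return {x: g[f[x]] for x in f}

def identity_functor(arrow):
    """Arrow part of the identity functor (assumed choice for T)."""
    return dict(arrow)

def square_functor(arrow):
    """Arrow part of X |-> X x X (assumed choice for S): h |-> h x h."""
    return {(x, y): (arrow[x], arrow[y]) for x in arrow for y in arrow}

def is_comma_arrow(T, S, f, f_prime, k, h):
    """Check f' . Tk = Sh . f for objects <e, d, f> and <e', d', f'>."""
    return compose(f_prime, T(k)) == compose(S(h), f)

# Object <e, d, f> with e = {0, 1}, d = {"p", "q"} and f : e -> d x d.
f = {0: ("p", "q"), 1: ("q", "q")}
k = {0: "u", 1: "u"}                  # E-arrow k : e -> e' = {"u"}
h_good = {"p": "r", "q": "r"}         # D-arrow collapsing d to {"r"}
f_good = {"u": ("r", "r")}            # f' : e' -> d' x d'
h_bad = {"p": "r", "q": "s"}          # D-arrow that keeps p and q apart
f_bad = {"u": ("r", "s")}
```

With this data, the pair `(k, h_good)` satisfies the condition and is an arrow of the comma category, whereas `(k, h_bad)` is not, because the two paths around the square disagree on the element 1.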

The comma category $(T \downarrow S)$ can be thought of either as the category of objects S-under objects T, or as the category of objects T-over objects S. Therefore, using a comma category, a category C can be split up into two parts (slices), in the context of the objects of two other categories E and D, by using appropriate functors $T : E \to C$ and $S : D \to C$ (only in this order, not in the opposite one). In the case where categories E and D are subcategories of C, T and S play the role of the inclusion functors.

4.20 2-categories

A 2-category is a system of 2-cells or 'maps' which can be composed in two different but commuting categorical ways: horizontal and vertical. A 2-category is a structure consisting of objects, arrows between the objects and 2-cells between the arrows [Mac Lane, 1998] [Lambek and Scott, 1986] [Kelly, 1982] [Rydeheard and Burstall, 1988] [Pitt et al., 1985] [Baez, 1997] [Leinster, 2004]. Let us have a category C with objects a, b, ... and arrows $f : a \to b$. Then, a 2-category on C has a 2-cell $\alpha : f \Rightarrow g : a \to b$ for two parallel C-arrows $f : a \to b$ and $g : a \to b$. Let us have also two more 2-cells, $\beta : g \Rightarrow h : a \to b$ and $\alpha' : f' \Rightarrow g' : b \to c$. Then, horizontal composition is given by the 2-cell $\alpha' \circ \alpha : f' \circ f \Rightarrow g' \circ g : a \to c$; an identity 2-cell $1 : 1 \Rightarrow 1 : b \to b$ acts as a 2-sided identity for this composition. Vertical composition is given by the 2-cell $\beta \cdot \alpha : f \Rightarrow h : a \to b$; an identity 2-cell $1_g : g \Rightarrow g$ acts as a 2-sided identity for this composition. In addition, the horizontal composite of two identity 2-cells $1_f : f \Rightarrow f$ and $1_{f'} : f' \Rightarrow f'$ is given by the identity 2-cell $1_{f' \circ f} = 1_{f'} \circ 1_f : f' \circ f \Rightarrow f' \circ f : a \to c$ (Figures 4-35, 4-36, 4-37 & 4-38) [Mac Lane, 1998] [Kelly, 1982].

Figure 4-35: Horizontal composition $\alpha' \circ \alpha$

Figure 4-36: Identity 2-cell $1_b$

Figure 4-37: Vertical identity $1_f$

Figure 4-38: Horizontal composition for vertical identities

Let us have also another 2-cell $\beta' : g' \Rightarrow h' : b \to c$. Then vertical and horizontal composition between the 2-cells $\alpha, \beta, \alpha', \beta'$ is given by the equation $(\beta' \cdot \alpha') \circ (\beta \cdot \alpha) = (\beta' \circ \beta) \cdot (\alpha' \circ \alpha) : f' \circ f \Rightarrow h' \circ h : a \to c$. A horizontal composition of a 2-cell $\alpha : f \Rightarrow g : a \to b$ with the 1-cell (vertical identity) $1_{f'} : f' \Rightarrow f'$ is given by the 2-cell $1_{f'} \circ \alpha : f' \circ f \Rightarrow f' \circ g : a \to c$ (Figures 4-39 & 4-40) [Mac Lane, 1998] [Kelly, 1982] [Leinster, 2004].

Figure 4-39: Horizontal and vertical composition between 2-cells

Figure 4-40: Horizontal composition of a 2-cell with a 1-cell

The vertical category of 2-cells on objects a, b is usually represented as $T(a, b)$ [Mac Lane, 1998 p.275]. Then, the horizontal composition between two vertical categories is a bifunctor $K_{a,b,c} : T(b, c) \times T(a, b) \to T(a, c)$. The operation $U_a$, which sends any object a to its vertical identity 2-cell $1 : 1 \Rightarrow 1 : a \to a$, is a functor from the terminal category $\mathbf{1}$ to the vertical category $T(a, a)$, that is $U_a : \mathbf{1} \to T(a, a)$. Thus, a 2-category can be given by the following data:

• Objects a, b, ...
• A function that assigns to each ordered pair of objects $(a, b)$ a vertical category of 2-cells $T(a, b)$
• For each ordered triple $\langle a, b, c \rangle$ of objects, a bifunctor $K_{a,b,c} : T(b, c) \times T(a, b) \to T(a, c)$ called composition, and
• For each object a, a functor $U_a$ acting as a left and right identity for this composition.

4.20.1 Adjointness defined in 2-categories

The notion of an adjunction, as it has been developed in CAT, can be carried over to other 2-categories. Thus, in a 2-category, two 1-cells $f : a \to b$, $g : b \to a$ are adjoint when there are 2-cells $\eta : 1_a \Rightarrow g \circ f : a \to a$ and $\varepsilon : f \circ g \Rightarrow 1_b : b \to b$ (the unit and counit of the adjunction, respectively), such that $(\varepsilon f) \cdot (f \eta) = 1_f : f \Rightarrow f \circ g \circ f \Rightarrow f : a \to b$ and $(g \varepsilon) \cdot (\eta g) = 1_g : g \Rightarrow g \circ f \circ g \Rightarrow g : b \to a$ [Mac Lane, 1998] [Leinster, 2004].

4.20.2 Natural transformations between 2-categories

A morphism $F : T \to U$ between two 2-categories T and U is called a 2-functor [Mac Lane, 1998]. Having two 2-functors $F, G : T \to U$, a 2-natural transformation $\theta : F \xrightarrow{\,\cdot\,} G$ is a function that sends each T-object a to a U-arrow $\theta_a : Fa \to Ga$ in such a way that the equality of 2-composites $G\alpha \circ \theta_a = \theta_b \circ F\alpha$ holds for each 2-cell $\alpha : f \Rightarrow g : a \to b$ in T (with $f, g : a \to b$), making the diagram in Figure 4-41 commute. In the case that $\theta$ is applied to 1-cells (e.g. $1_f : f \Rightarrow f$), the associated functors F and G become ordinary functors.

Figure 4-41: A 2-natural transformation $\theta$ between 2-functors F, G for two 2-categories T and U

4.20.3 Modifications (3-cells) in enriched 2-categories

Given two 2-natural transformations $\theta, \varphi : F \xrightarrow{\,\cdot\,} G$, a map (i.e. a 3-cell) $\rho : \theta \Rrightarrow \varphi$ [Mac Lane, 1998; Leinster, 2004; Kelly, 2005], called a modification, assigns to each T-object a a 2-cell $\rho_a : \theta_a \Rightarrow \varphi_a$, such that the equality of 2-composites $\rho_b \circ F\alpha = G\alpha \circ \rho_a$ holds for every 2-cell $\alpha : f \Rightarrow g : a \to b$, with $f, g : a \to b$, as visualized in Figure 4-42.

Figure 4-42: A modification $\rho$ between two 2-natural transformations $\theta, \varphi$ for two 2-categories T and U

4.20.4 n-categories

In higher-order logic we can quantify over predicates (something that is not allowed in first-order logic). A higher-order predicate takes one or more other predicates as arguments, something that also holds for higher-order functions. According to higher-dimensional category theory, an n-category is an algebraic structure consisting of objects, morphisms between objects, 2-morphisms, and so on up to n-morphisms. n-categories can be considered as single sets made of subsets (subsets for 0-cells, 1-cells, 2-cells, 3-cells, 4-cells, etc.) [Baez, 1997] [Leinster, 2002]. A category (n+1)-Cat is the category of categories enriched over the category n-Cat. In general, in higher-dimensional category theory, when composition is not strictly associative, but associative only up to isomorphism (including natural isomorphisms to support adjunctions), we have a weak n-category. For example, a bicategory is a weak 2-category, a tricategory is a weak 3-category and a tetracategory is a weak 4-category [Leinster, 2002]. In bicategories, horizontal composition of the categories $B(a, b)$, $B(b, c)$ and $B(c, d)$ of a bicategory B is associative up to a natural isomorphism $\alpha$ between composite iterated functors (where the bifunctor $\otimes$ is called horizontal composition). Thus, $B(b, c) \times B(a, b) \xrightarrow{\otimes} B(a, c)$, $B(c, d) \times B(b, c) \xrightarrow{\otimes} B(b, d)$, $B(c, d) \times B(a, c) \xrightarrow{\otimes} B(a, d)$ and $B(b, d) \times B(a, b) \xrightarrow{\otimes} B(a, d)$, with $\alpha_{h,g,f} : h \otimes (g \otimes f) \cong (h \otimes g) \otimes f$ and $\alpha_{h',g',f'} : h' \otimes (g' \otimes f') \cong (h' \otimes g') \otimes f'$ (where a, b, c, d are 0-cells, f and f' are 1-cells in $B(a, b)$ with the 2-cell $\beta : f \Rightarrow f' : a \to b$, g and g' are 1-cells in $B(b, c)$ with the 2-cell $\gamma : g \Rightarrow g' : b \to c$, and h and h' are 1-cells in $B(c, d)$ with the 2-cell $\delta : h \Rightarrow h' : c \to d$, respectively) [Mac Lane, 1998]. Vertical composition of 2-cells is also associative, with 1-cells as vertical identities.

4.20.5 Pullback functor

Let C be a category with pullbacks. Then for any C-arrow $f : a \to b$ we can get a functor $f^* : (C \downarrow b) \to (C \downarrow a)$. If $x : d \to b$ is an object of $(C \downarrow b)$, then in the pullback diagram in Figure 4-43, $f^* x = p_1 : d' \to a$. If $y : c \to b$ is another object of $(C \downarrow b)$ and $g : c \to d$ an arrow of $(C \downarrow b)$ from y to x (that is, $x \circ g = y$), then we get the diagram in Figure 4-44, where both the small and the bigger square are pullbacks. Since $f^* y : c' \to a$, $p_2' : c' \to c$ and $g : c \to d$, we have an arrow $h : c' \to d'$, with $f^* g = h$.

Figure 4-43: Image of objects in a pullback diagram

Figure 4-44: Image of arrows in a pullback diagram (the pullback functor)
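The action of a pullback functor on objects can be made concrete in finite sets, where the pullback of $f : a \to b$ along $x : d \to b$ is the set of pairs on which f and x agree. The following Python sketch is an illustration only; it assumes finite functions stored as dicts, and the names `pullback`, `compose` and the sample data are hypothetical.

```python
def compose(g, f):
    """Return g . f for finite functions stored as dicts."""
    return {key: g[f[key]] for key in f}

def pullback(f, x):
    """Pullback of f : a -> b along x : d -> b in finite sets.

    Returns (pairs, p1, p2), where pairs = {(s, t) | f[s] == x[t]},
    p1 : pairs -> a plays the role of f* x and p2 : pairs -> d.
    """
    pairs = {(s, t) for s in f for t in x if f[s] == x[t]}
    p1 = {pair: pair[0] for pair in pairs}
    p2 = {pair: pair[1] for pair in pairs}
    return pairs, p1, p2

f = {"a1": "b1", "a2": "b2"}               # f : a -> b
x = {"d1": "b1", "d2": "b1", "d3": "b3"}   # x : d -> b, an object over b
pairs, p1, p2 = pullback(f, x)
```

Here `p1` is an object of the slice over a, and the square commutes in the sense that `compose(f, p1)` and `compose(x, p2)` are the same function on `pairs`.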

4.20.6 Multicategories and operads

A multicategory is a generalization of the concept of category that allows morphisms of multiple arity. If morphisms in a category are viewed as analogous to functions, then morphisms in a multicategory are analogous to functions of several variables. Generally, given any monoidal category C, there is a multicategory whose objects are objects of C, where a morphism from the C-objects $X_1, X_2, \ldots, X_n$ to the C-object Y is a C-morphism from the monoidal product of $X_1, X_2, \ldots, X_n$ to Y. In a multicategory $(C, T)$, C is a Cartesian closed category and T a Cartesian closed monad, that is, $\eta$ and $\mu$ are Cartesian natural transformations, while T preserves pullbacks based on $\eta$ and $\mu$ [Lambek and Scott, 1986] [Leinster, 2004]. Such a multicategory is a monad in the bicategory of spans ($\mathrm{Span}(C, T)$). An application of the bicategory of spans is in implementing b-trees and presheaves. Kasangian et al. [1997] used bicategories to represent information flow and access control in security frameworks. An operad is a one-object multicategory [Leinster, 2004]. The algebra for an operad is a model of that theory. A bicategory B with one 0-cell is a monoidal category with the bifunctor $\otimes$ (called multiplication) associative up to a natural isomorphism $\alpha$ between composite iterated functors, as mentioned in the previous paragraph. A strict monoidal category is based on a monoid M. The category of all endofunctors on a category C is a strict monoidal category with the composition of functors as the product of multiplication and the identity functor as the identity object (the unit of multiplication) [Mac Lane, 1998]. In general, any category with finite products as the monoidal product and the terminal object as the unit is called a Cartesian monoidal category [Baez, 1995].

4.21 Examples of categories

The category 0 has no objects and no arrows. The category 1 has one object and one (identity) arrow. The category 2 has two objects, two identity arrows, and an arrow from one object to the other. The category 3 has three objects (A, B, and C), three identity arrows, and three non-identity arrows ($f : A \to B$, $g : B \to C$, $h : A \to C$, with composition defined in only one way, as $g \circ f = h$). A category is discrete when every arrow is an identity. A monoid is a category with one object; equivalently, it is a set M with a binary operation $M \times M \to M$ which is associative and has an identity. Thus a monoid is a semigroup with an identity element. A group is a category with one object in which every arrow has a two-sided inverse under composition (every arrow is an isomorphism).
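The monoid and group conditions above can be checked exhaustively for a small finite carrier set. The following Python sketch is illustrative only; the function names `is_monoid` and `is_group` and the sample structures on $\{0, 1, 2, 3\}$ are assumptions introduced here, not examples from the thesis.

```python
from itertools import product

def is_monoid(elements, op, unit):
    """Check closure, associativity and the two-sided identity law."""
    closed = all(op(x, y) in elements
                 for x, y in product(elements, repeat=2))
    associative = all(op(op(x, y), z) == op(x, op(y, z))
                      for x, y, z in product(elements, repeat=3))
    unital = all(op(unit, x) == x == op(x, unit) for x in elements)
    return closed and associative and unital

def is_group(elements, op, unit):
    """A monoid in which every arrow has a two-sided inverse."""
    return is_monoid(elements, op, unit) and all(
        any(op(x, y) == unit == op(y, x) for y in elements)
        for x in elements)

z4 = {0, 1, 2, 3}

def add_mod4(x, y):
    return (x + y) % 4

def mul_mod4(x, y):
    return (x * y) % 4
```

Under these definitions, addition modulo 4 makes `z4` both a monoid and a group, while multiplication modulo 4 gives a monoid that is not a group, since 0 has no inverse.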

4.22 Summary

A category consists of objects (data) and arrows (functions) between them. Higher-order functions, called functors (arrows between categories), underpin notions such as natural transformations (arrows between functors) and adjunctions, expressed as 1-cells, 2-cells and 3-cells. Commutative diagrams are as formal as algebraic equations. Natural transformations are used for process evaluation. Adjointness can be used to describe system behaviour, something that is important for anticipatory systems. Contravariant logic is fundamental for systems and type theory. Composition of arrows can be used to describe process interaction for computation and communication purposes, e.g. in distributed transactions for achieving concurrency. Pullbacks provide a tool to illustrate top- and low-level relations and associations between the elements of a system. The exponentiation feature is adequate for binding system activities in a framework. Cartesian closed categories provide a natural setting for λ-calculus. Certain Cartesian closed categories, the topoi, have been proposed as a general setting for mathematics, instead of traditional set theory. Product categories, product functors and bifunctors can be used to explain the intensionality and extensionality of a system in the context of comma categories, expressed as pullbacks. Monads and comonads can be used to describe, step by step, the internal processing of closed operations in categories. Higher-order logic in categorical terms describes notions such as n-categories and multicategories. 2-categories are needed in order to integrate local activities of system components into a global view. 3-cells, in particular, can be used to explain system state changes. Bicategories have already been used to represent information flow and access control in security frameworks. Pullback functors can explain the relations in typed systems in terms of locally Cartesian closed categories (LCCC) and categories with families in general. All diagrams, unless clearly stated otherwise, are original and aim to present in an understandable way the basics of categorical logic as well as the higher-dimensional category theory needed for the development of the proposed architecture.

5 Categorical constructions and visualizations

5.1 Rationale

The purpose of this chapter is to develop the categorical structures that are used in Chapter 6 to handle the holistic security requirement. The order is to develop the Godement calculus for composition in interoperability, Cartesian closed categories for products, limits and toposes, monads and comonads for process, and comma categories for expressing pullbacks. The ultimate closure required for security, including event ordering, leads into n-categories, in particular 2-categories and 3-cells. With applied structures, a natural order of presentation of concepts readily emerges, perhaps facilitating a more pedagogical approach, as has been attempted here with applied category theory. The representation of categorical structures in Chapter 4 and in this chapter can be used to conceptualize the nature of any data-driven or any other real-world realizable system of variable perspectives, with the emphasis in Chapter 6 on defining formal methods of handling global security in distributed information systems.

5.2 The Godement Calculus

Composability is a cornerstone of category theory. Composition of the mappings (i.e. arrows) between the levels is natural; functors and natural transformations may be composed with each other. Rules governing this composition are derived from the Godement calculus [Godement, 1958]. For example, considering five categories, eight functors between them, and four natural transformations (Figure 5-1), Godement's five rules (equations) are given in Figure 5-2. The interchange law (commutativity) is indicated by equation 1, associativity by equations 2 & 3, permutation of paths by equation 4 and, finally, production of simultaneous equations representing different paths through the diagram by equation 5.

Figure 5-1: Godement natural transformations for five categories, eight functors and four natural transformations

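$(\delta \circ \gamma)(\beta \circ \alpha) = (\delta\beta) \circ (\gamma\alpha)$   (1)

$(L \circ G)\alpha = L(G\alpha)$   (2)

$\gamma(F \circ M) = (\gamma F)M$   (3)

$G(\beta \circ \alpha)M = (G\beta M) \circ (G\alpha M)$   (4)

$\gamma\alpha = (\gamma F') \circ (G\alpha) = (G'\alpha) \circ (\gamma F)$   (5)

Figure 5-2: Godement's rules for Figure 5-1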
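Rule (5) can be tried out on a small programming example. The sketch below reads rule (5) with $\alpha : F \to F'$ and $\gamma : G \to G'$ (the naming assumed here for the transformations of Figure 5-1) and makes further illustrative assumptions that are not taken from the thesis: F, F' and G are all the list functor, G' is a 'maybe' functor whose values are tuples of length at most one, $\alpha$ is list reversal and $\gamma$ is the safe head of a list. All function names are hypothetical.

```python
def list_map(f, xs):
    """Arrow part of the list functor (used for F, F' and G)."""
    return [f(x) for x in xs]

def maybe_map(f, m):
    """Arrow part of the 'maybe' functor G'; a value is () or (x,)."""
    return tuple(f(x) for x in m)

def alpha(xs):
    """alpha : F -> F', here list reversal."""
    return list(reversed(xs))

def gamma(xs):
    """gamma : G -> G', here the safe head of a list."""
    return tuple(xs[:1])

def via_G_alpha(xss):
    """The path (gamma F') . (G alpha) through the diagram."""
    return gamma(list_map(alpha, xss))

def via_gamma_F(xss):
    """The path (G' alpha) . (gamma F) through the diagram."""
    return maybe_map(alpha, gamma(xss))
```

For a sample list of lists such as `[[1, 2, 3], [4, 5]]`, both paths yield the same value, which is the component of the horizontal composite $\gamma\alpha$ at that object.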

There is a similarity between Figure 5-1 and Figure 5-5 in §5.2.1. Categories C, D, C have been replaced with categories B, C, D, respectively. In addition, categories A and E, along with functors $M : A \to B$ and $L : D \to E$, have been added. In the lattice of cubes visualized in Figure 5-7, category A with the associated functor M represents an input to the system, while category E, with the associated functor L, represents an output of the system. The Godement calculus is visualized using the lattice of cubes in Figure 5-7.

5.2.1 The Cube and the Lattice of Cubes

Let us have two categories C and D, functors $F : C \to D$, $F' : C \to D$, $G : D \to C$, $G' : D \to C$ and natural transformations $\alpha : F \xrightarrow{\,\cdot\,} F'$, $\gamma : G \xrightarrow{\,\cdot\,} G'$, as can be seen in Figure 5-3.

Figure 5-3: A cube visualizing natural transformations between two categories and four functors

Here, the notation $\alpha a$ (for a D-object) is preferred instead of $\alpha_a$, as the objects and arrows of a category C can be written as functors $a : \mathbf{1} \to C$ and $f : \mathbf{2} \to C$, visualized in Figure 5-4:

Figure 5-4: Representing objects as functors

The above cube is constructed of commutative diagrams. For example, see the top diagram with the diagonal $\gamma\alpha a$ (the top side of the cube), where the transformation $\gamma$ is natural for arrows $\alpha a$. Thus, $\gamma\alpha a = G'\alpha a \circ \gamma Fa = \gamma F'a \circ G\alpha a$. Similarly, for the bottom commutative diagram (the bottom side of the cube), $\gamma\alpha b = G'\alpha b \circ \gamma Fb = \gamma F'b \circ G\alpha b$. The diagram constructed by the top and the back square (blue arrows) shows that the composite transformation $\gamma\alpha : GF \xrightarrow{\,\cdot\,} G'F'$ is natural for any C-arrow f. Composite natural transformations, then, are of the form $G\alpha : GF \xrightarrow{\,\cdot\,} GF'$, $G'\alpha : G'F \xrightarrow{\,\cdot\,} G'F'$, $\gamma F : GF \xrightarrow{\,\cdot\,} G'F$, $\gamma F' : GF' \xrightarrow{\,\cdot\,} G'F'$, which means that vertical composition can be written, generally, as $\gamma\alpha = G'\alpha \circ \gamma F = \gamma F' \circ G\alpha$. Having three categories and four natural transformations, as can be seen in Figure 5-5, the vertical and horizontal composites are related using the "interchange law" [Godement, 1958], as $(\delta \circ \gamma)(\beta \circ \alpha) = (\delta\beta) \circ (\gamma\alpha)$.

Figure 5-5: Natural transformations between two categories and six functors

The cube for four natural transformations is presented in Figure 5-6. Due to space limitations, all the vertical and horizontal composites (in the form of dashed arrows) are omitted.

Figure 5-6: A lattice of cubes visualizing natural transformations for two categories and six functors

Figure 5-7: Godement calculus for 5 categories, 8 functors and 4 natural transformations using the lattice of cubes

5.2.2 Products/coproducts in pullback diagrams

The next two figures (Figures 5-8 & 5-9) visualize the product and the coproduct of two objects a and b in the context of an object c. In Figure 5-8, the internal arrows are shown, while they are omitted in Figure 5-9. By replacing a and b with c, the product and coproduct in the following figures become $c \times c$ and $c + c$, respectively.

Figure 5-8: The product/coproduct of objects a and b, showing all the commuting triangles

Figure 5-9: The product/coproduct of objects a and b as an abstraction

5.3 Basic adjunctions in applied category theory

5.3.1 The product $\times$ as right adjoint of the diagonal functor $\Delta$ ($\Delta \dashv \times$)

The unit of the adjunction $\eta_c : c \to c \times c$ (Figure 5-13) monitors the behaviour of a C-object c, as the diagonal arrow $\delta_c = \langle 1_c, 1_c \rangle$ (Figure 5-10 and, alternatively, Figure 5-11), with $c \overset{\Delta}{\longmapsto} \langle c, c \rangle \overset{\times}{\longmapsto} c \times c$.

Figure 5-10: The diagonal arrow $\delta_c = \langle 1_c, 1_c \rangle$

Figure 5-11: The diagonal arrow $\delta_c$ in the context of category C

Figure 5-12: The product $a \times b$

Figure 5-13: The unit $\eta_c$ of the adjunction

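The counit of the adjunction $\varepsilon_{\langle a, b \rangle} : \langle a \times b, a \times b \rangle \to \langle a, b \rangle$ (Figure 5-14) monitors the behaviour of a $C \times C$-object $\langle a, b \rangle$, as a pair of arrows $p : a \times b \to a$, $q : a \times b \to b$, with $\langle a, b \rangle \overset{\times}{\longmapsto} a \times b \overset{\Delta}{\longmapsto} \langle a \times b, a \times b \rangle$, $h : \langle c, c \rangle \to \langle a, b \rangle$ and $h^* : \langle c, c \rangle \to \langle a \times b, a \times b \rangle$.

Figure 5-14: The counit $\varepsilon_{\langle a, b \rangle}$ of the adjunction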

The integrated view of the adjunction $\Delta \dashv \times$ is given in Figure 5-15.

Figure 5-15: The integrated view of the adjunction $\Delta \dashv \times$
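The roles of the unit (the diagonal arrow) and the counit (the pair of projections) in the adjunction between the diagonal functor and the product can be sketched with ordinary Python functions on values. This is an illustration under the assumption that objects are sets of Python values and arrows are functions; the names `unit`, `p`, `q`, `pairing` and `product_map` are hypothetical.

```python
def unit(x):
    """eta_c : c -> c x c, the diagonal arrow <1_c, 1_c>."""
    return (x, x)

def p(pair):
    """First projection p : a x b -> a (first half of the counit)."""
    return pair[0]

def q(pair):
    """Second projection q : a x b -> b (second half of the counit)."""
    return pair[1]

def pairing(f, g):
    """<f, g> : c -> a x b induced by f : c -> a and g : c -> b."""
    return lambda x: (f(x), g(x))

def product_map(f, g):
    """f x g, the action of the product functor on the pair <f, g>."""
    return lambda pair: (f(pair[0]), g(pair[1]))

def transposes_agree(f, g, x):
    """Check <f, g> = (f x g) . eta_c at the point x."""
    return pairing(f, g)(x) == product_map(f, g)(unit(x))

def triangle_identities_hold(x, y):
    """Check both triangular identities on sample values."""
    first = p(unit(x)) == x and q(unit(x)) == x
    second = product_map(p, q)(unit((x, y))) == (x, y)
    return first and second
```

For instance, `transposes_agree(len, str.upper, "abc")` compares the two ways of obtaining the arrow into the product, and `triangle_identities_hold(1, "z")` checks that composing unit and counit returns the starting value.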

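5.3.2 The coproduct $+$ as left adjoint to the diagonal functor $\Delta$ ($+ \dashv \Delta$)

The unit of the adjunction $\eta_{\langle a, b \rangle} : \langle a, b \rangle \to \langle a + b, a + b \rangle$ (Figure 5-16) monitors the behaviour of the $C \times C$-object $\langle a, b \rangle$, as a pair of arrows (injections) $i : a \to a + b$ and $j : b \to a + b$ ($a \xrightarrow{i} a + b \xrightarrow{[f, g]} c$ and $b \xrightarrow{j} a + b \xrightarrow{[f, g]} c$), with $\langle a, b \rangle \overset{+}{\longmapsto} a + b \overset{\Delta}{\longmapsto} \langle a + b, a + b \rangle$ (Figures 5-16 & 5-17).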

Figure 5-16: The unit $\eta_{\langle a, b \rangle}$ of the adjunction, $\langle a, b \rangle \to \langle a + b, a + b \rangle$

Figure 5-17: The coproduct $a + b$ of two objects a and b

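The counit of the adjunction $\varepsilon_c : c + c \to c$ (Figure 5-18) monitors the behaviour of the C-object c, as the folding map $[1_c, 1_c] : c + c \to c$, with $c \overset{\Delta}{\longmapsto} \langle c, c \rangle \overset{+}{\longmapsto} c + c$ ($i_1 x \mapsto x$ and $i_2 x \mapsto x$).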

Figure 5-18: The folding map $[1_c, 1_c]$ for the counit of the adjunction, $\varepsilon_c : c + c \to c$

Figure 5-19: The counit $\varepsilon_c$ of the adjunction

The integrated view of the adjunction $+ \dashv \Delta$ is presented in Figure 5-20.

Figure 5-20: The integrated view of the adjunction $+ \dashv \Delta$

5.4 Type of categories and universal constructions in the proposed approach

In Figure 5-21, the constraint projection of universal constructions as limits, products and pullbacks, is illustrated.

Figure 5-21: The universal constructions limits, products and pullbacks and their association.

In the proposed architecture, categories are locally Cartesian closed categories (LCCC). That means that all slice categories of an LCCC are Cartesian closed categories (CCC) (Figure 5-22).

Figure 5-22: The categories used in the current approach

5.4.1 An example of a product category

Consider a category C and C-arrows $f : a \to a'$, $f' : a' \to a''$, $g : b \to b'$, $g' : b' \to b''$, and the composites $f'' = f' \circ f : a \to a''$, $g'' = g' \circ g : b \to b''$. Then, arrows in the product category $C \times C$ can be defined as $\langle f, g \rangle : \langle a, b \rangle \to \langle a', b' \rangle$ and $\langle f', g' \rangle : \langle a', b' \rangle \to \langle a'', b'' \rangle$.

Let us now have a category D with a D-arrow $h : c \to d$, a functor $F : D \to C \times C$ and two functors $R : D \to C$, $T : D \to C$ with $Fh = \langle Rh, Th \rangle$. Following the theory [Mac Lane, 1998], $Fh = \langle h', h'' \rangle$ for a pair of C-arrows $h' : c' \to d'$ and $h'' : c'' \to d''$. Thus $PFh = Rh = h'$ and $QFh = Th = h''$, as can be seen in Figure 5-23.

Figure 5-23: The product category $C \times C$

5.4.2 Examples of product functors

Let us have two functors $U : C \to C'$ and $V : C \to C'$. The product $U \times V : C \times C \to C' \times C'$ is defined in terms of objects and arrows as $(U \times V)\langle a, b \rangle = \langle Ua, Vb \rangle$ and $(U \times V)\langle f, g \rangle = \langle Uf, Vg \rangle$. In the case of having two product functors $C \times C \xrightarrow{U \times V} C' \times C' \xrightarrow{U' \times V'} C'' \times C''$, the product $U'U \times V'V$ is defined in terms of objects and arrows as $(U'U \times V'V)\langle a, b \rangle = \langle U'Ua, V'Vb \rangle$ and $(U'U \times V'V)\langle f, g \rangle = \langle U'Uf, V'Vg \rangle$ (Figures 5-24 & 5-25).

Figure 5-24: The product functor $U \times V : C \times C \to C' \times C'$

Figure 5-25: The product functor $U'U \times V'V : C \times C \to C' \times C' \to C'' \times C''$

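Figure 5-26 shows the role of the product $U \times V : C \times C \to C' \times C'$, having $C \times C \xrightarrow{P} C \xrightarrow{U} C'$ and $C \times C \xrightarrow{Q} C \xrightarrow{V} C'$. The product functor applied to the $C \times C$-arrow $\langle f, g \rangle : \langle a, b \rangle \to \langle a', b' \rangle$ produces the mapping $\langle f', g' \rangle : \langle c, d \rangle \to \langle c', d' \rangle$, that is $(U \times V)\langle f, g \rangle = \langle Uf, Vg \rangle = \langle f', g' \rangle$, with $P'\langle f', g' \rangle = f'$ and $Q'\langle f', g' \rangle = g'$, where $P'$ and $Q'$ denote the projections of $C' \times C'$ and $f'$, $g'$ here denote the images $Uf$, $Vg$.

In other words, $UP\langle a, b \rangle = P'(U \times V)\langle a, b \rangle = P'\langle c, d \rangle = c$, $VQ\langle a, b \rangle = Q'(U \times V)\langle a, b \rangle = Q'\langle c, d \rangle = d$, $UP\langle f, g \rangle = P'(U \times V)\langle f, g \rangle = P'\langle f', g' \rangle = f'$ and $VQ\langle f, g \rangle = Q'(U \times V)\langle f, g \rangle = Q'\langle f', g' \rangle = g'$.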

Figure 5-26: The product functor $U \times V$ in terms of objects and arrows

The product functor $U \times V : C \times C \to C' \times C'$ can be regarded either as an ordinary functor, where $C \times C \xrightarrow{P} C \xrightarrow{U} C'$ and $C \times C \xrightarrow{Q} C \xrightarrow{V} C'$, or as a bifunctor $(U \times V)(\_, \_)$, where $C \times C \xrightarrow{U \times V} C' \times C'$ with $P' : C' \times C' \to C'$ and $Q' : C' \times C' \to C'$, as can be seen in the next paragraph.

5.4.3 An example of a bifunctor

Let us consider categories C, C', the product categories $C \times C$, $C' \times C'$, functors $U : C \to C'$, $V : C \to C'$ and the bifunctor $U \times V : C \times C \to C' \times C'$ (replacing category D with the product category $C' \times C'$). Then, following the theory [Mac Lane, 1998] [Lawvere, 1986], $(U \times V)(\_, b) = U_b$ and $(U \times V)(a, \_) = V_a$. Thus, for arrows $f : a \to a'$ and $g : b \to b'$, $V_{a'} g \circ U_b f = U_{b'} f \circ V_a g$. If $U_b a = c$, $V_a b = d$, and $V_a b = U_b a$, then $c \cong d$, that is c and d are isomorphic. This is the case where two constants are used as arguments in a bifunctor S (here the bifunctor $U \times V$). The result will be the same, either using functor U with the left argument, or functor V with the right argument (Figures 5-27 & 5-28).

Figure 5-27: The bifunctor $U \times V : C \times C \to C' \times C'$ in terms of arrows

Figure 5-28: The bifunctor $U \times V : C \times C \to C' \times C'$ in terms of objects

5.4.4 Monads/comonads analysis

Having a C-object a and a D-object b, the unit and the counit of the adjunction involved in each of the three cycles are presented in Figures 5-29, 5-30 and 5-31.

Figure 5-29: The unit and counit of the adjunction on the 1st cycle of a monad

Figure 5-30: The unit and counit of the adjunction on the 2nd cycle of a monad

Figure 5-31: The unit and counit of the adjunction on the 3rd cycle of a monad

The above three diagrams can be combined into one as in Figure 5-32, where $T = GF$, $T^2 = T \circ T = GFGF$, $T^3 = T^2 \circ T = GFGFGF$, $L = FG$, $L^2 = L \circ L = FGFG$ and $L^3 = L^2 \circ L = FGFGFG$. In this diagram, the dashed arrows represent the unit and the counit of the adjunction involved in each of the three cycles, for a C-object Gb and a D-object Fa.

Figure 5-32: The three cycles of a monad, combined

In Figure 5-33, the actual arrows and the objects are omitted. Only the flow of the arrows is depicted. The top part captures the 'forward' ongoing process of the universe (e.g. the entropy of the system under consideration, that is the tendency for systems to go from a state of higher organization to a state of lower organization, or simply the change from a greater to a lesser potential energy level), while the bottom part provides a way to 'look into the future through the past'. In this case the C-object a operates as a starting point (limiting cone) for the process (or a group of processes) that will follow, while the D-object b serves as the desired state of the system (the term 'system' here refers to the pair of categories C and D). Two other important states of the system are the top right corner (top closure: the process comes to an end or a new recursive one starts over) and the bottom right corner.

Figure 5-33: The three cycles of a monad, combined (arrows only)

Thus, in each of the three cycles, the following take place:

• The unit of the adjunction represents an 'onward' (or 'forward') process or a group of processes in the left-hand category, while the counit of the adjunction represents a 'backward' process in the right-hand category.

• The unit of the adjunction represents the new state of the left-hand category, while the counit shows a previous state of the right-hand category.

After three cycles, the image of the left-hand category will show 'what will happen', while the image of the right-hand category will show 'what happened' to the system under consideration. In this way, all the morphisms and mappings can be clearly defined. That means that all the employed processes can be deployed, compared (using exponentiation) and evaluated against standards or clearly specified conditions following both a top-down and a bottom-up approach, which is a cornerstone of a systemic approach. All of the above holds if and only if the employed categories are all Cartesian closed categories.

5.4.5 Monads in partial orders

Let $P$ be a preorder. A functor $T : P \to P$ is just a monotonic function, that is, $x \le y \Rightarrow Tx \le Ty$. Then there are natural transformations $\eta$, $\mu$ precisely when $x \le Tx$ (a) and $T(Tx) \le Tx$ (b) for all $x \in P$. The diagrams in Figures 4-49 & 4-50 (the associative law and the left/right unit laws for a monad $T$, respectively) necessarily commute, because in a preorder there is at most one arrow between any two objects [Mac Lane, 1998]. Then (a), applied to $Tx$, gives $Tx \le T(Tx)$. If the preorder is a partial order, that is, $x \le y \le x \Rightarrow x = y$, then from (a) & (b) we know that $T(Tx) = Tx$. Hence a monad in a partial order $P$ is just a closure operation $t$ in $P$, that is, a monotonic function $t : P \to P$ with $x \le tx$ and $t(tx) = tx$ for all $x \in P$.
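The closure-operation reading of a monad can be checked mechanically on a small finite partial order. The following is a minimal sketch only: the carrier set `UNIVERSE`, the operation `t` (closing a set of numbers under taking divisors) and the helper names are hypothetical choices for illustration, not constructions taken from this thesis. The partial order is the set of subsets of `UNIVERSE` ordered by inclusion.

```python
from itertools import combinations

UNIVERSE = frozenset(range(1, 7))  # hypothetical carrier set {1, ..., 6}


def t(s):
    """Illustrative closure operation: close a subset under taking divisors."""
    return frozenset(d for d in UNIVERSE if any(n % d == 0 for n in s))


def subsets(u):
    """All subsets of u, i.e. the elements of the partial order (2^u, inclusion)."""
    items = sorted(u)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_closure_operation(op, elements):
    """Check the three conditions: monotone, x <= op(x), op(op(x)) == op(x)."""
    monotone = all(op(x) <= op(y)
                   for x in elements for y in elements if x <= y)
    unit = all(x <= op(x) for x in elements)            # condition (a)
    idempotent = all(op(op(x)) == op(x) for x in elements)
    return monotone and unit and idempotent
```

Calling `is_closure_operation(t, subsets(UNIVERSE))` exercises the unit condition (a), idempotence (which in a partial order follows from (a) and (b)) and monotonicity over every element of the order.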

5.4.6 Examples of comma categories

• A $C$-object $b$ may be regarded as the functor $b : 1 \to C$. If $T = b$, that is, $T(e) = b$ for a fixed $C$-object $b$ and for every $E$-object $e$, then the category $(T \downarrow S)$ becomes the category $(b \downarrow S)$. In this case $T$ is called a selection functor.

• If $S = 1_C$, we have the category $(b \downarrow C)$ of $C$-objects under $b$. In this case $S$ is the identity functor, while $T$ is the selection functor of a fixed $C$-object $b$. The comma category $(b \downarrow C)$ is also known as the coslice category with respect to $b$.

• Similarly, if $T = 1_C$ (the identity functor) and $S$ is the selection functor of a fixed $C$-object $a$, then we have the category $(C \downarrow a)$ of $C$-objects over $a$, also known as the slice category over $a$.

• If $S = T = 1_C$, that is, both of them are identity functors, we have the category $(C \downarrow C)$, that is, the category $C^2$ of all $C$-arrows.

• Instead of having two underlying categories $E$ and $D$, we may have only one of them, for example $E$. A special case is when this category is itself a comma category (e.g. $(e \downarrow E)$).

• If $S = a : 1 \to C$ and $T = b : 1 \to C$, that is, $S$ and $T$ are both selection functors with $a$ and $b$ being fixed $C$-objects, then $(T \downarrow S) = (b \downarrow a)$, which is the category whose objects are all the arrows $f : b \to a$. In some cases it is denoted as $a^b$, the exponential object. As said earlier, categories with limits (finite products) and exponentials are called Cartesian closed categories. Such categories form the basis for all structures developed in this research.

• In $(T \downarrow S)$ with $E \xrightarrow{\;T\;} C \xleftarrow{\;S\;} D$, categories $E$ and $D$ can be subcategories of $C$.

• A category $C$ is locally Cartesian closed (LCCC) when it has pullbacks and all the derived comma categories are Cartesian closed.
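The special cases listed above can be enumerated for a very small category. The sketch below is illustrative only and rests on assumptions that are not part of the thesis: the category $C$ is taken to be the numbers 1 to 12 ordered by divisibility (so there is at most one arrow between two objects), the one-object category 1 is modelled as the list `[0]`, and all function names are hypothetical.

```python
OBJECTS = range(1, 13)  # hypothetical category C: objects 1..12
ONE = [0]               # the one-object category 1


def has_arrow(x, y):
    """Exactly one arrow x -> y when x divides y, none otherwise."""
    return y % x == 0


def identity(x):
    return x


def select(c):
    """Selection functor 1 -> C picking the fixed object c."""
    return lambda _: c


def comma(e_objs, d_objs, T, S):
    """Objects of (T ↓ S): pairs (e, d) with an arrow T(e) -> S(d).
    Arrows are unique in this category, so the pair determines the object."""
    return [(e, d) for e in e_objs for d in d_objs if has_arrow(T(e), S(d))]


def coslice(b):
    """(b ↓ C): the C-objects under b."""
    return [d for (_, d) in comma(ONE, OBJECTS, select(b), identity)]


def slice_over(a):
    """(C ↓ a): the C-objects over a."""
    return [e for (e, _) in comma(OBJECTS, ONE, identity, select(a))]


def arrow_category():
    """(C ↓ C): one object per C-arrow."""
    return comma(OBJECTS, OBJECTS, identity, identity)
```

In this setting `coslice(3)` lists the multiples of 3, `slice_over(12)` lists the divisors of 12, and `comma(ONE, ONE, select(b), select(a))` is non-empty exactly when there is an arrow from `b` to `a`.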


5.4.7 Adjointness in terms of comma categories

Lawvere [1963] [1969a] [1964] showed that the functors $F : C \to D$ and $G : D \to C$ are adjoint iff the comma categories $(F \downarrow D)$ and $(C \downarrow G)$ are isomorphic, and equivalent elements in the comma category can be projected onto the same elements of $C \times D$. This allows adjunctions to be described without involving sets; that was in fact the original motivation for introducing comma categories.

An isomorphism $F : C \to D$ is a functor $F$ from $C$ to $D$ which is a bijection, both on objects and on arrows [Mac Lane, 1998]. Alternatively, a functor $F : C \to D$ is an isomorphism if and only if there is a functor $G : D \to C$ with $I_C = G \circ F$ and $I_D = F \circ G$. In this case $G$ is called the two-sided inverse, $G = F^{-1}$. A natural transformation $\tau : T \to S$ for functors $T : C \to D$ and $S : C \to D$ is often called a morphism of functors. If every morphism $\tau_c : Tc \to Sc$ in $D$ is invertible (i.e. there is a morphism $(\tau_c)^{-1} : Sc \to Tc$), then $\tau$ is called a natural equivalence or natural isomorphism.

Adjointness in terms of the implicated comma categories is depicted in the next two figures. The first one (Figure 5-34) is focused on the equivalent elements of the isomorphic comma categories, while the second one (Figure 5-35) shows adjointness in terms of the unit and counit of the adjunction. Equivalent elements, e.g. $c, c', d, d'$, are projected onto the same elements of $C \times D$, e.g. $(\langle c, d \rangle, \langle c', d' \rangle)$.

Figure 5-34: Adjointness in terms of isomorphic comma categories, with the equivalent elements projected on the product category

Figure 5-35: Adjointness in terms of comma categories, showing the unit $\eta_c$ and the counit $\varepsilon_d$ of the adjunction
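Lawvere's characterization can be illustrated on a toy adjunction between two partial orders, where a comma-category object is determined by its projection onto the product category. This is a hedged sketch under stated assumptions: the functors `F` (multiplication by a hypothetical constant `K`) and `G` (floor division by `K`), the finite ranges of integers and all names are illustrative and do not come from the thesis.

```python
K = 3  # hypothetical scaling factor

C_OBJECTS = range(-5, 6)     # category C: integers -5..5 ordered by <=
D_OBJECTS = range(-15, 16)   # category D: integers -15..15 ordered by <=


def F(c):
    """Left adjoint C -> D."""
    return K * c


def G(d):
    """Right adjoint D -> C (floor division)."""
    return d // K


def comma_F_D(left=F):
    """Objects of (F ↓ D), projected onto C x D: pairs with an arrow Fc -> d."""
    return {(c, d) for c in C_OBJECTS for d in D_OBJECTS if left(c) <= d}


def comma_C_G(right=G):
    """Objects of (C ↓ G), projected onto C x D: pairs with an arrow c -> Gd."""
    return {(c, d) for c in C_OBJECTS for d in D_OBJECTS if c <= right(d)}


def not_adjoint_G(d):
    """A monotone map that is not right adjoint to F, for comparison."""
    return d // K + 1
```

For the adjoint pair the two sets of projected pairs coincide, which is the isomorphism over $C \times D$ in this degenerate setting; replacing `G` by `not_adjoint_G` breaks the coincidence.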

5.4.8 A comma category as a pullback

Let us consider categories $E$ and $D$, the comma category $(T \downarrow S)$ and a bifunctor $R' : E \times D \to C^2$ with $C^{dom} \circ R' = T \circ P'$ and $C^{cod} \circ R' = S \circ Q'$ (Figure 5-36). Then $(T \downarrow S)$ is a pullback with $R' = R \circ L$ (where $L : E \times D \to (T \downarrow S)$), as can be seen in Figure 5-37.

$E \times D \xrightarrow{\;L\;} (T \downarrow S) \xrightarrow{\;R\;} (C \downarrow C)$, where $L$ is a product functor and $R$ is a bifunctor.

Figure 5-36: Connection between the product functor and the bifunctor in a pullback with comma categories

$(T \downarrow S)$ acts as a constraint limit of the product category $E \times D$. Functors $P$ and $Q$ are the projections of the comma category, while functors $P'$ and $Q'$ are the projections of the product category $E \times D$. Functors $C^{dom}$ and $C^{cod}$ provide the domain and codomain of an arrow $f$ in the functor category $C^2$. In the case where $S = T = 1_C$, that is, both of them are identity functors, the category $C^2$ of all $C$-arrows can also be represented as the comma category $(C \downarrow C)$ (Figure 5-37).

C H E D M

Figure 5-37: Using comma categories, product functors and bifunctors in a pullback diagram

K L

P

(T  S )

P

Q

Q

R

(C  C)

E

T

Ccod

Cdom

D

S

C Functors T and S may be, for example, selection or identity functors, giving thus a variety of different comma categories and applications. In the case where E  D represents all the pairs of events of two processes in a DIS, E and D, the comma category (T  S ) represents all the parallel events between these two processes (the two categories represent actually the local histories of the two processes). The product functor L can be applied on arrows or objects, e.g. L( f , g ) or L(a, b) while the bifunctor R can be applied to none (two constants), one (one constant – one variable), or two variables, e.g. R(a, b) R(, f ) , R( f , ) , R(, ) .


5.4.9 The difference between a natural transformation and a comma category

A natural transformation is a particular collection of morphisms in the target category between the two functors (one morphism for each object in the domain) which makes the diagram in Figure 4-2 commute, while the comma category contains all the morphisms in the target category between the two functors which make the diagram commute. Thus a natural transformation $\tau : T \to S$, with $T : E \to C$ and $S : E \to C$, is a functor $\tau : E \to (T \downarrow S)$ such that each object $e$ in $E$ is mapped to a morphism $Te \to Se$ in $C$, that is, to an object of $(T \downarrow S)$. This functor picks just one such morphism from the comma category for each object in $E$, based on the natural transformation $\tau$ (Figure 5-38).
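The contrast between 'one morphism per object' and 'all morphisms' can be made concrete in a small example. The sketch below makes several assumptions of its own: $E$ is the ordered set 0..3, $C$ is the integers ordered by $\le$ (so naturality holds automatically, since arrows are unique), and the functors `T` and `S` are hypothetical monotone maps chosen for illustration.

```python
E_OBJECTS = range(0, 4)  # hypothetical category E: 0..3 ordered by <=


def T(e):
    """Illustrative functor E -> C (identity on numbers)."""
    return e


def S(e):
    """Illustrative functor E -> C (doubling)."""
    return 2 * e


def comma_objects():
    """All objects of (T ↓ S): pairs (e1, e2) with an arrow T(e1) -> S(e2)."""
    return [(e1, e2) for e1 in E_OBJECTS for e2 in E_OBJECTS
            if T(e1) <= S(e2)]


def natural_transformation():
    """Select one component T(e) -> S(e) per object e of E.
    Returns a dict e -> (e, e), or None if some component does not exist."""
    if all(T(e) <= S(e) for e in E_OBJECTS):
        return {e: (e, e) for e in E_OBJECTS}
    return None
```

Here `comma_objects()` lists every arrow from an image of `T` to an image of `S`, whereas `natural_transformation()` selects only the four 'diagonal' ones, one for each object of $E$.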

Figure 5-38: Natural transformations in a comma category

5.4.10 Natural Transformations between simple categories

Given functors $R, S, T : C \to D$, $R', S', T' : D \to E$ and natural transformations $\sigma : R \to S$, $\tau : S \to T$, $\sigma' : R' \to S'$, $\tau' : S' \to T'$, the natural transformations are the 'maps' between the functors. Horizontal composition ($\circ$) and vertical composition ($\cdot$) are related by the equation

$(\tau' \cdot \sigma') \circ (\tau \cdot \sigma) = (\tau' \circ \tau) \cdot (\sigma' \circ \sigma) : R'R \to T'T : C \to E$.

Horizontal composition between the three categories $C$, $D$, $E$ is given by the bifunctor $E^D \times D^C \to E^C$. There is a function that sends each functor category between two categories to the corresponding vertical category, $D^C = T(C, D)$, with $D^C(R, S) = \{\sigma \mid \sigma : R \to S \text{ natural}\}$. Then, the horizontal composition between vertical categories is given by the bifunctor $K_{C,D,E} : T(D, E) \times T(C, D) \to T(C, E)$, and the left and right identity of the composition by the functor $U_D : 1 \to T(D, D)$, $1 \mapsto I_D$ (Figure 5-39).

Figure 5-39: The correspondence between a functor category and a vertical category

5.4.11 Adjointness defined in 2-categories

The notion of an adjunction can be carried over to other 2-categories. Thus, in a 2-category, two 1-cells $f : a \to b$, $g : b \to a$ are adjoint when there are 2-cells $\eta : 1_a \Rightarrow g \circ f : a \to a$ and $\varepsilon : f \circ g \Rightarrow 1_b : b \to b$ (the unit and counit of the adjunction, respectively), such that $(\varepsilon f) \cdot (f \eta) = 1_f : f \Rightarrow f g f \Rightarrow f : a \to b$ and $(g \varepsilon) \cdot (\eta g) = 1_g : g \Rightarrow g f g \Rightarrow g : b \to a$ [Mac Lane, 1998] [Leinster, 2004].

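Consider the case of having two adjunctions and three functors, $F \dashv G \dashv F'$ (e.g. the diagonal functor between the categories $C$ and $C \times C$ together with its left and right adjoints). Figure 5-40 presents an analysis of the first adjunction $F \dashv G$; the second adjunction $G \dashv F'$ can be treated similarly. The two functors $F$ and $G$ (1-cells) are adjoint when there are 2-cells $\eta : 1_C \Rightarrow GF : C \to C$ and $\varepsilon : FG \Rightarrow 1_D : D \to D$, such that $(\varepsilon F) \cdot (F \eta) = 1_F : F \Rightarrow FGF \Rightarrow F : C \to D$ and $(G \varepsilon) \cdot (\eta G) = 1_G : G \Rightarrow GFG \Rightarrow G : D \to C$, as can be seen in Figure 5-41 (a) & (b).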

Figure 5-40: Analysis of the adjunction $F \dashv G$, based on the unit $\eta$ and counit $\varepsilon$

Figure 5-41: 2-cells $1_F$ (a) and $1_G$ (b) of the adjunction $F \dashv G$, based on the unit $\eta$ and counit $\varepsilon$.

In Figures 5-40 & 5-41, categories $C$ and $D$ can be replaced by the comma categories $(C \downarrow G)$ and $(F \downarrow D)$ for the first adjunction $F \dashv G$, while for the second adjunction $G \dashv F'$ they can be replaced by the comma categories $(D \downarrow F')$ and $(G \downarrow C)$, respectively. Thus, to explore the role of three such adjoint functors between the categories $C$ and $C \times C$, we construct Figures 5-42, 5-43 and 5-44.

Figure 5-42: Adjointness for categories $C$, $C \times C$

Figure 5-43: Adjointness for categories $C \times C$, $C$

Figure 5-44: Comma categories for the three adjoint functors between categories $C$, $C \times C$

5.4.12 Natural transformations between 2-categories

A morphism $F : T \to U$ between two 2-categories $T$ and $U$ is called a 2-functor [Mac Lane, 1998]. Given two 2-functors $F, G : T \to U$, a 2-natural transformation $\theta : F \to G$ is a function that sends each $T$-object $a$ to a $U$-arrow $\theta_a : Fa \to Ga$ in such a way that the equality of 2-composites $G\alpha \circ \theta_a = \theta_b \circ F\alpha$ holds for each 2-cell $\alpha : f \Rightarrow g : a \to b$ in $T$ (with $f, g : a \to b$), making the diagram in Figure 5-45 commute. In case $\alpha$ is restricted to identity 2-cells on 1-cells (e.g. $1_f : f \Rightarrow f$), the associated 2-functors $F$ and $G$ become ordinary functors.

Figure 5-45: A 2-natural transformation $\theta$ between 2-functors $F$, $G$ for two 2-categories $T$ and $U$

5.4.13 Modifications (3-cells) in enriched 2-categories

Given two 2-natural transformations $\theta, \varphi : F \to G$, a map (i.e. a 3-cell) $\rho : \theta \to \varphi$ [Mac Lane, 1998; Leinster, 2004; Kelly, 2005], called a modification, assigns to each $T$-object $a$ a 2-cell $\rho_a : \theta_a \Rightarrow \varphi_a$, such that the equality of 2-composites $\rho_b \circ F\alpha = G\alpha \circ \rho_a$ holds for every 2-cell $\alpha : f \Rightarrow g : a \to b$, with $f, g : a \to b$, as visualized in Figure 5-46.

Figure 5-46: A modification $\rho$ between two 2-natural transformations $\theta$, $\varphi$ for two 2-categories $T$ and $U$

5.4.14 Adjointness in terms of comma categories

Adjointness in terms of comma categories is depicted in Figures 5-47 and 5-48. In Figure 5-47 adjointness is expressed in terms of the isomorphic comma categories with the equivalent elements projected on the product category, for two $C$-objects $c$, $c'$ and for two $D$-objects $d$, $d'$. Figure 5-32 presents the three cycles of a monad/comonad, combined. Considering $c' = GFc = Tc$, Figure 5-47 represents the three cycles of a monad/comonad construction in terms of the corresponding comma categories $(C \downarrow G)$ and $(F \downarrow D)$ for an adjunction $\langle F, G, \eta, \varepsilon \rangle$ between the underlying categories $C$ and $D$.

Figure 5-47: The 3 cycles of a monad/comonad construction in terms of the corresponding comma categories $(C \downarrow G)$ and $(F \downarrow D)$ for an adjunction $\langle F, G, \eta, \varepsilon \rangle$ between the underlying categories $C$ and $D$

Given two adjunctions $\langle F, G, \eta, \varepsilon \rangle$, $\langle F', G', \eta', \varepsilon' \rangle$ and two natural transformations $\sigma : F \to F' : C \to D$, $\tau : G \to G' : D \to C$, horizontal composition is given by $\tau \circ \sigma : GF \to G'F'$, that is, $T \to T' : C \to C$. The three cycles of the monad/comonad constructions $T$, $L$, $T'$, $L'$ are given in terms of the corresponding comma categories $(C \downarrow G)$, $(F \downarrow D)$ and $(C \downarrow G')$, $(F' \downarrow D)$. Figure 5-48 shows the comma categories $(C \downarrow G)$ and $(C \downarrow G')$ in the underlying category $C$, where object $c$ belongs to both comma categories. In the same way, we can work with the comma categories $(F \downarrow D)$ and $(F' \downarrow D)$ in the underlying category $D$.

Figure 5-48: The comma categories $(C \downarrow G)$ and $(C \downarrow G')$ in the underlying category $C$, in case of having two adjunctions $\langle F, G, \eta, \varepsilon \rangle$, $\langle F', G', \eta', \varepsilon' \rangle$

The case of having two adjunctions and three functors, $F \dashv G \dashv F'$, is described in §5.4.11.

5.5 Summary

The Cube and the Lattice of Cubes, introduced here, visualize composition in category theory, based on the Godement Calculus. The monad and comonad constructions are also introduced, to visualize internal processing in categories as closed operations. The categories used here are Cartesian closed comma categories. Comma categories are expressed as pullbacks. Categorical constructions and visualizations describe the use of 1-cells, 2-cells and 3-cells.

6 The four-level architecture

6.1 Introduction to the architecture

It has been shown [Sisiaridis et al., 2008] [Rossiter et al., 2006] [Heather et al., 2007] that any realizable system can be conceptually expressed in categorical terms using four interchangeable levels, which are sufficient to provide ultimate closure for computational types to construct information systems, defining an arrow as unique up to natural isomorphism [Mac Lane, 1998] [Rossiter and Heather, 2005]. The levels are named Concepts (CPT), Constructs (CST), Schema (SCH) and Data (DAT). CPT, CST, SCH and DAT are just labels, not new categories. They represent general categories with internal structure and connectivity.

The relationships between the levels are expressed as categorical adjunctions in terms of higher-order functions: functors which ensure that certain appropriate restrictions are satisfied in the mapping between the source and target categories. Each type level, taken with its adjacent type level, acts as a level pair, so that there are three level pairs across the four levels. This means that each point at each level is directly related to a point at the other level in the level pair. Between each level the mappings are strictly defined by their starting and terminating points in the respective level types. Thus, each functor in the downward direction (decreasing abstraction) or the upward direction (increasing abstraction) acts on a level pair (Figure 6-1).

Figure 6-1: Natural composition of adjoint functors

At the top level (CPT), concepts relating to policy and philosophy are defined, e.g. object-oriented abstractions. In principle, only one instance of them need be defined here, as in a coherent system there can be only one collection of such types. With the open-ended nature of object-oriented structures, however, some extensibility may be required.

At the second level (CST), schema construction facilities are defined. Each system will have its own type definition. For example, constructions would include record-types as an aggregation of single- or multi-valued data field-types, while relations would include table-types as an aggregation of single-valued data fields.

At the third level (SCH), the schema for each application is defined. There will clearly be many intensions defined in an organization, one for each application. Typing of names and other constraints will be applied to data objects and their methods. UML diagrams are another example of this level.

At the fourth level (DAT), the data values for each application are defined. There will be one collection of data values for each schema, the values being consistent with the types of names and constraints of the schema. Data values may be simple objects as in relations, or complex objects as in computer-aided design and multimedia systems.

For matching across the levels in a contravariant manner, the intension (e.g. SCH) is defined with arrows of the form $name \to type$ and the extension (e.g. DAT) with arrows of the form $value \to name$. The four levels can be seen as two intension-extension pairs in Figure 6-2 (CPT/CST & SCH/DAT), that is, concepts/constructs and schema/data respectively.

Figure 6-2: Four levels defined with covariant functors and intension-extension pairs

6.2 Interoperability issues in the 4-level architecture

6.2.1 Naturality

In today's global environments, which are based on non-local activities as in distributed information systems, interoperable systems are free and open. Higher-order operations are needed, as the same conditions applied in different systems may lead to unpredictable results. In Figure 6-3, this interoperability is expressed as the adjointness $\langle F, G, \eta, \varepsilon \rangle$ where $1_L \le GF$ if and only if $FG \le 1_R$. Naturality is based on the ordering and interoperability of the two free and open category systems represented. Triangles represent unique correlation of components of the system. Functors $F$ and $G$ are the free and underlying functors, respectively: an underlying (or forgetful) functor 'forgets' some or all of the structure of an algebraic object (the cofreeness principle as the prescription of rules), while a free functor allows selection of a target type at a lower level (the freeness principle as un-prescribed development) [Rossiter and Heather, 2006]. Thus the lower-limit functor $F$ preserves colimits and the upper-limit functor $G$ preserves limits.

Figure 6-3: Adjointness between two systems

Using adjointness, for example between categories SCH and DAT (Figure 6-4), arrows $f$ in SCH are compared with arrows $g$ in DAT. Thus, a SCH-object $a$ is compared with the result $GFa$, after applying functors $F$ and $G$. This comparison is a natural transformation $\eta$, called the unit of the adjunction, involving type changing $a \to Fa \to GFa$. The comparison is made in the context of the corresponding object $Gb$, which maps $b$ in DAT to SCH. Similarly, a DAT-object $b$ is compared with the result $FGb$, after applying functors $G$ and $F$. This comparison is a natural transformation $\varepsilon$, called the counit of the adjunction, involving type changing $b \to Gb \to FGb$. Thus, based on equation solving, there is a functorial way to relate any arrow $g : Fa \to b$ to an arrow $f : a \to Gb$ in such a way that $f$ solves the equation $g = \varepsilon_b \circ Fy$ and the solution is unique, for either some arrow $y$ or object $y$ in category DAT (as any object $y$ can be regarded as an arrow $y : 1 \to$ DAT).
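The two comparisons ($a$ with $GFa$, and $b$ with $FGb$) can be tried out on a small adjunction between partial orders. The sketch below is illustrative only and is not the SCH/DAT adjunction of the thesis: it assumes a hypothetical value-to-name map `NAME_OF`, takes the left-hand category to be the sets of values and the right-hand category to be the sets of names (both ordered by inclusion), and uses the direct image as `F` and the preimage as `G`.

```python
from itertools import combinations

# Hypothetical value -> name assignment, in the spirit of arrows value -> name.
NAME_OF = {"alice": "user", "bob": "user", "8080": "port"}
VALUES = frozenset(NAME_OF)
NAMES = frozenset({"user", "port", "host"})


def subsets(u):
    """All subsets of u, the objects of the partial order (2^u, inclusion)."""
    items = sorted(u)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def F(a):
    """Left adjoint: direct image of a set of values."""
    return frozenset(NAME_OF[v] for v in a)


def G(b):
    """Right adjoint: preimage of a set of names."""
    return frozenset(v for v in VALUES if NAME_OF[v] in b)


def unit_holds(a):
    """Comparison of a with GFa (the unit): a is contained in G(F(a))."""
    return a <= G(F(a))


def counit_holds(b):
    """Comparison of FGb with b (the counit): F(G(b)) is contained in b."""
    return F(G(b)) <= b


def adjoint(a, b):
    """An arrow F(a) -> b exists exactly when an arrow a -> G(b) exists."""
    return (F(a) <= b) == (a <= G(b))
```

In a partial order the solution $f$ of $g = \varepsilon_b \circ Ff$ is trivially unique, because there is at most one arrow between two objects; the sketch therefore only checks existence in both directions.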

Figure 6-4: Adjointness between categories SCH and DAT

Consider a user Alice as a participant in a client process. Alice is an object or a record at level DAT. It is actually a user_instance (a user_object_instance or a user_record_instance, in levels DAT and SCH) of a type user (object_type or entity_type, in level SCH). User is an instance of the abstract_object_type or of a table (in level CST), as an example of encapsulation or aggregation (in the CPT level), in the context of an object-oriented system design or an RDBMS, respectively.

6.2.2 Semantic & organizational interoperability in the 4-level architecture

Semantic interoperability (in the metadata level) refers to obtaining as many as possible of the types in the different systems. Semantic and organizational interoperability (in the meta-meta level) for information systems implementing the four-level architecture can be defined in terms of Godement Calculus, as shown in Figures 6-5 and 6-6, respectively, where comparisons are between a relational (r), an object-relational (or) and an object-oriented (oo) environment. In Figure 6-6, the introduced functor $P$ varies, in order to express a variable meta-meta level (e.g. real-time processing between heterogeneous, non-local, distributed information systems with complex multi-policy domains). In multi-database architectures, each local database is managed by a local DBMS and the various DBMSs are connected through a DDBMS. Thus, local extensionalities (e.g. local security policies) are interconnected with one another through global intensionality (e.g. a global security policy / meta-policy framework) by integrating local slice categories (e.g. local policy security domains, each one corresponding to a specific security policy).

Figure 6-5: Semantic interoperability

Figure 6-6: Organizational interoperability

The four-level architecture, in the cases of achieving semantic interoperability and organizational interoperability, with its levels expressed as objects of the 2-category CAT (enriched with 3-cells), is given in Figures 6-7 & 6-8. Identity functors are used as 2-cells when there is only one functor between two simple categories (as objects).

Figure 6-7: Semantic interoperability in the 4-level architecture using 2-categories

Figure 6-8: Organizational interoperability in the 4-level architecture using 2-categories

6.2.3 Achieving ultimate closure at the top-level

The vertical categories (i.e. bifunctors) in Figure 6-8 for the three level-pairs are $T(\mathrm{CPT}, \mathrm{CST})$, $T(\mathrm{CST}, \mathrm{SCH})$ and $T(\mathrm{SCH}, \mathrm{DAT})$. Horizontal composition between the second and third level-pairs is given by the bifunctor $K_{\mathrm{CST},\mathrm{SCH},\mathrm{DAT}} : T(\mathrm{SCH}, \mathrm{DAT}) \times T(\mathrm{CST}, \mathrm{SCH}) \to T(\mathrm{CST}, \mathrm{DAT})$. Then, horizontal composition of the first level-pair and the composite of the other two level-pairs is given by the bifunctor $K_{\mathrm{CPT},\mathrm{CST},\mathrm{DAT}} : T(\mathrm{CST}, \mathrm{DAT}) \times T(\mathrm{CPT}, \mathrm{CST}) \to T(\mathrm{CPT}, \mathrm{DAT})$.

Figure 6-9: Semantic and organizational interoperability, in the 4-level architecture, in terms of vertical categories and functor categories – the functor category $\mathrm{DAT}^{\mathrm{CPT}}$ provides the closure in the top level

In terms of the corresponding functor categories, horizontal composition of the second and the third level-pairs is given by the bifunctor $\mathrm{DAT}^{\mathrm{SCH}} \times \mathrm{SCH}^{\mathrm{CST}} \to \mathrm{DAT}^{\mathrm{CST}}$, while horizontal composition of the first level-pair and the composite of the other two level-pairs is given by the bifunctor $\mathrm{DAT}^{\mathrm{CST}} \times \mathrm{CST}^{\mathrm{CPT}} \to \mathrm{DAT}^{\mathrm{CPT}}$. The correspondence between the implicated vertical categories and the functor categories is represented in Figure 6-9. The functor category $\mathrm{DAT}^{\mathrm{CPT}}$, as the exponential object, provides the ultimate closure in the 4-level architecture.

Security principles such as confidentiality, integrity and availability will be achieved by the applied security policies, implemented by a number (e.g. a collection) of security mechanisms (usually based on cryptographic techniques), following a top-down approach, thus reducing abstraction. On the other hand, following a bottom-up approach (increasing abstraction), any piecemeal intervention on the applied security measures, based on risk management (including high level security services, security awareness and the experience of the experts), will always target the desired level of security, in the context of security principles.
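The two composition steps above can be sketched on the object parts of the level functors alone. This is a simplified illustration under stated assumptions: each functor is reduced to a finite mapping on object names, the dictionaries `P`, `O`, `I` and their entries are hypothetical (loosely following the Alice example of §6.2.1), and arrows, natural transformations and the adjoints are not modelled.

```python
def compose(g, f):
    """Composite g ∘ f of two finite mappings given as dicts (f first, then g)."""
    return {x: g[f[x]] for x in f}


# Hypothetical object mappings for the three level-pairs.
P = {"encapsulation": "abstract_object_type",   # CPT -> CST
     "aggregation": "table"}
O = {"abstract_object_type": "user_object_type",  # CST -> SCH
     "table": "user_entity_type"}
I = {"user_object_type": "alice_object",          # SCH -> DAT
     "user_entity_type": "alice_record"}

IO = compose(I, O)     # an element of DAT^CST (second and third level-pairs)
IOP = compose(IO, P)   # an element of DAT^CPT (closure over all four levels)
```

Composing in the other order of bracketing, `compose(I, compose(O, P))`, yields the same mapping, so a single top-to-bottom arrow from CPT to DAT is obtained regardless of how the level-pairs are grouped.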

6.2.4 Semantic & organizational interoperability using the lattice of cubes

Figure 6-10: Semantic interoperability (relational, object-relational and object-oriented paradigm) in the 4-level architecture, based on Godement Calculus, using the lattice of cubes

Figure 6-11: Organizational interoperability (relational, object-relational and object-oriented paradigm) for the 4-level architecture, based on Godement Calculus, using the lattice of cubes

Semantic interoperability in the 4-level architecture using the lattice of cubes is presented in Figure 6-10. Organizational interoperability is represented in Figure 6-11, which, due to page limits, is separated into two parts (two pages). In the first part, category DAT is represented as an empty rectangle. Its full analysis is given in the second part. Between categories CPT and CST, there are two natural transformations $\alpha : P_r \to P_{or} : \mathrm{CPT} \to \mathrm{CST}$ and $\alpha' : P_{or} \to P_{oo} : \mathrm{CPT} \to \mathrm{CST}$. Between categories CST and SCH, there are two natural transformations $\beta : O_r \to O_{or} : \mathrm{CST} \to \mathrm{SCH}$ and $\beta' : O_{or} \to O_{oo} : \mathrm{CST} \to \mathrm{SCH}$. Between the categories SCH and DAT, there are two natural transformations $\gamma : I_r \to I_{or} : \mathrm{SCH} \to \mathrm{DAT}$ and $\gamma' : I_{or} \to I_{oo} : \mathrm{SCH} \to \mathrm{DAT}$. Thus, the arrows between the top lattice of cubes and the middle one represent natural transformations $\gamma$, for example $\gamma_{O_r P_r b} : I_r O_r P_r b \to I_{or} O_r P_r b$. The arrows between the middle lattice of cubes and the bottom one represent natural transformations $\gamma'$, for example $\gamma'_{O_r P_r b} : I_{or} O_r P_r b \to I_{oo} O_r P_r b$.

6.2.5 More security examples based on interoperability

The implementation of security protocols, at the application and host level, has caused several interoperability problems. On the Web, the Web Services Description Language (WSDL) is used to ensure interoperability. Ontologies have also been introduced to solve interoperability problems. An ontology describes objects, processes, resources, capabilities, etc. It can be expressed in a categorical manner, as a comma category constructed from its elementary subcategories, with functors representing its functionality. From an application point of view, in Figure 6-11, functors $P_r$, $P_{or}$, $P_{oo}$ apply security policies on three different implemented systems: in the first one, the information assets are stored in relational databases, in the second one in object-relational databases and in the third one in object-oriented databases. The three systems are based on the same concepts, expressed in the top layer (category CPT). Another example might be the three database schemas being partitions of a global integrated schema. Local security policies (expressed in the form of the above functors) are compared through the two natural transformations.


Instead of using three different paradigms, the three functors (now named $P$, $P'$, $P''$) may represent the applied security policies in the distributed system, in order to achieve confidentiality, integrity and availability, following a top-down security approach (a baseline approach, based on security standards). The adjoint functors $\bar{P}$, $\bar{P}'$, $\bar{P}''$ in the backward direction (increasing abstraction) may be used to apply a bottom-up approach (e.g. risk analysis) based on a threat model. Global security in the distributed system under examination is achieved through the integration of the two approaches, expressed in the form of the adjoint functors (e.g. $P$ as a left adjoint to $\bar{P}$).

6.3 Adjointness in the 4-level architecture in terms of 2-categories

The unit and the counit of the adjunction between categories CPT and CST are $\eta : I_{\mathrm{CPT}} \Rightarrow \bar{P}P : \mathrm{CPT} \to \mathrm{CPT}$ and $\varepsilon : P\bar{P} \Rightarrow I_{\mathrm{CST}} : \mathrm{CST} \to \mathrm{CST}$, respectively, such that $(\varepsilon P) \cdot (P \eta) = 1_P : P \Rightarrow P\bar{P}P \Rightarrow P : \mathrm{CPT} \to \mathrm{CST}$ and $(\bar{P} \varepsilon) \cdot (\eta \bar{P}) = 1_{\bar{P}} : \bar{P} \Rightarrow \bar{P}P\bar{P} \Rightarrow \bar{P} : \mathrm{CST} \to \mathrm{CPT}$ (Figure 6-12).

Figure 6-12: Adjointness between the level-pair (CPT, CST), in terms of comma categories

Functors $P : \mathbf{CPT} \to \mathbf{CST}$ and $P^{op} : \mathbf{CST} \to \mathbf{CPT}$ are adjoint iff the comma categories $(P \downarrow \mathbf{CST})$ and $(\mathbf{CPT} \downarrow P^{op})$ are isomorphic, and equivalent elements in the comma categories can be projected onto the corresponding elements of $\mathbf{CPT} \times \mathbf{CST}$.

Figure 6-12 shows adjointness in terms of the unit and counit of the adjunction. Equivalent elements, e.g. $c$, $d$, are projected onto the corresponding element of $\mathbf{CPT} \times \mathbf{CST}$, e.g. $\langle c, d \rangle$.
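The adjointness condition can be checked exhaustively in a small preorder setting. The sketch below is only an illustration and not the construction used in the four-level architecture: it assumes two powersets ordered by inclusion, with a hypothetical map `f`, where the direct image plays the role of the left adjoint $P$ and the preimage plays the role of the right adjoint $P^{op}$. All names (`X`, `Y`, `f`, `image`, `preimage`) are assumptions made for the example.

```python
from itertools import chain, combinations

def powerset(xs):
    """All subsets of xs as frozensets (the objects of the preorder)."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return [frozenset(s) for s in subsets]

X = {1, 2, 3}
Y = {"u", "v", "w"}
f = {1: "u", 2: "u", 3: "v"}  # hypothetical map X -> Y

def image(A):
    """Left adjoint: direct image of a subset of X."""
    return frozenset(f[x] for x in A)

def preimage(B):
    """Right adjoint: preimage of a subset of Y."""
    return frozenset(x for x in X if f[x] in B)

PX, PY = powerset(X), powerset(Y)

# Adjointness: image(A) <= B  iff  A <= preimage(B), for every pair (A, B).
is_adjoint = all((image(A) <= B) == (A <= preimage(B)) for A in PX for B in PY)

# Unit and counit, read as inclusions in the two preorders.
unit_holds = all(A <= preimage(image(A)) for A in PX)
counit_holds = all(image(preimage(B)) <= B for B in PY)

# Triangular identities collapse to equalities in a preorder.
triangles_hold = (
    all(image(preimage(image(A))) == image(A) for A in PX)
    and all(preimage(image(preimage(B))) == preimage(B) for B in PY)
)
```

In a preorder an arrow is just an inclusion, so the bijection between arrows $Pa \to b$ and $a \to P^{op}b$ reduces to the logical equivalence tested by `is_adjoint`.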

6.4 Natural transformations in the 4-level architecture

A 2-category $\mathcal{C}$ can be derived from a simple category C in three ways:

• There are two parallel arrows $f, g$ between every ordered pair $\langle a, b \rangle$ with $a, b \in C$, as well as a 2-cell (or map) $\alpha : f \Rightarrow g : a \to b$.

• There are only identity arrows $f$ for every ordered pair $\langle a, b \rangle$, with $1_f : f \Rightarrow f : a \to b$.

• A combination of parallel arrows and identity arrows. That is, there are ordered pairs with just one arrow between them and others with pairs of parallel arrows between them.

[Figure 6-13: A modification $\Gamma$ for an arrow $f$ in CPT]


In the case of having only identity arrows $f$ for every ordered pair $\langle a, b \rangle$, category C is the same as category $\mathcal{C}$. For every object $a \in C$, there is an identity 2-cell $1_{1_a} : 1_a \Rightarrow 1_a : a \to a$.

In the context of the four-level architecture, let us have two 2-natural transformations $\alpha, \beta : P \Rightarrow P' : \mathbf{CPT} \to \mathbf{CST}$ (the two 2-functors $P$, $P'$ behave now as ordinary functors). Then a modification $\Gamma : \alpha \Rrightarrow \beta$ is a 3-cell such that, for every $a \in \mathbf{CPT}$, $\Gamma_a : \alpha_a \Rightarrow \beta_a$. Thus, for a 2-cell $1_f : f \Rightarrow f : a \to b$, it is $\Gamma_b \circ P1_f = P'1_f \circ \Gamma_a$, which is visualized in Figure 6-13. An example of using 2-natural transformations in the four-level architecture is the comparison of two security policies (2-functors $P$ and $P'$) in terms of their effectiveness (actually, of the implemented security mechanisms) on an identified security issue, e.g. a system security breach. Another example illustrating the use of 2-categories in the four-level architecture is presented in §6.6, which deals with event ordering.

6.5 Comma categories in the four-level architecture – top level analysis

The four-level architecture, in terms of the involved comma categories and their interaction inside and between the levels, is presented in Figures 6-14 (top-down view) and 6-15 (bottom-up view). There, isomorphic categories, for example $(P \downarrow \mathbf{CST})$ and $(\mathbf{CPT} \downarrow P^{op})$, and equivalent elements in the comma categories can be projected onto the same elements of the product category $(\mathbf{CPT} \times \mathbf{CST})$. $P$, $O$, $I$ are the free functors, which select a target type at a lower level; they preserve colimits. $P^{op}$, $O^{op}$, $I^{op}$ are the underlying (or forgetful) functors, responsible for the prescription of rules; they preserve limits. In Figures 6-14 and 6-15, in the general form of a comma category $(T \downarrow S)$, functors $T$ and $S$ can be identity or selection functors. Actually, between the levels there are two pairs of functor categories, e.g. $\mathbf{DAT}^{\mathbf{SCH}}$ and $\mathbf{SCH}^{\mathbf{DAT}}$, which provide all the connectivity between components of the two levels. Natural transformations $\eta$ and $\varepsilon$ (the unit and counit of the adjunction) are here comparisons of type changing. In Figure 6-14, there is type changing as follows:

• $a \to Pa \to P^{op}Pa$

• $b \to Ob \to O^{op}Ob$

• $b \to P^{op}b \to PP^{op}b$

• $c \to Ic \to I^{op}Ic$

• $c \to O^{op}c \to OO^{op}c$

• $d \to I^{op}d \to II^{op}d$

Based on equation solving, there is a functional way to relate any arrow $g : Pa \to b$ to an arrow $f : a \to P^{op}b$ in such a way that it solves the equation $g = \varepsilon_b \circ Pf$.

[Figure 6-14: The four-level architecture in terms of comma categories, top-down view]

[Figure 6-15: The four-level architecture in terms of comma categories, bottom-up view]

In Figure 6-14, four different ways of representing comma categories in a pullback diagram, as in Figure 5-37, can be identified: objects-X under Y (e.g. objects-P under CPT) in Figure 6-16; objects-X over Y (e.g. objects-P over CST), where X is a functor and Y a level category, in Figure 6-17; a comma category produced from two other comma categories in Figure 6-18; and a comma category as a result of horizontal composition of two product categories (with $\langle c, d \rangle * \langle b, c \rangle = \langle b, d \rangle$) in Figure 6-19.

[Figure 6-16: Objects-P under CPT]

[Figure 6-17: Objects-P over CST]

[Figure 6-18: Comma category $(O \downarrow I)$ derived from comma categories $(O \downarrow \mathbf{SCH})$ and $(\mathbf{SCH} \downarrow I)$]

[Figure 6-19: The comma category $(O \downarrow I)$ derived from the product category $\mathbf{CST} \times \mathbf{DAT}$, which in turn is derived under horizontal composition (*) of product categories $\mathbf{CST} \times \mathbf{SCH}$ and $\mathbf{SCH} \times \mathbf{DAT}$]

Having a CCC C and a Cartesian closed monad T, then in the multicategory (C, T), $\eta$ and $\mu$ are Cartesian natural transformations, while T preserves pullbacks based on $\eta$ and $\mu$. A subject for future examination would be to check whether the comma categories in the levels could be multicategories (e.g. $((\mathbf{CPT} \downarrow P^{op}), T)$, $((O \downarrow I), T)$ etc.). Between levels, there is not only one adjoint pair of functors but rather two functor categories (e.g. $\mathbf{CST}^{\mathbf{CPT}}$ and $\mathbf{CPT}^{\mathbf{CST}}$) with all free and forgetful functors in adjoint pairs. Thus, for example, a security service in SCH is implemented by one or more security mechanisms in DAT. The unit $\eta$ and counit $\varepsilon$ are used as comparison measurements of applying security measures. In this way, vertical categories (e.g. bifunctors) T(CPT, CST) and the corresponding functor categories (e.g. $\mathbf{CST}^{\mathbf{CPT}}$) provide ultimate closure in the top level (i.e. CPT). A comma multicategory in that level (e.g. (C, T)) would ensure closure based on the Cartesian closed monad T. Thus, one or more security mechanisms can be combined in order to implement a security policy, while their effectiveness can be measured through the unit $\eta$ and counit $\varepsilon$. The product category (e.g. $\mathbf{CPT} \times \mathbf{CST}$) then provides all the possible combinations (global intensionality) while the internal comma categories represent local applicability (local extensionalities). Monads can be implemented as closures, in programming languages that support closures, based on the endofunctor T defined with free variables for closed operations on the components.
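The remark about implementing monads as closures can be made concrete with a small sketch. It assumes a state-passing monad, chosen here only for illustration: a computation is a closure from a state to a (value, state) pair, `unit` captures a value as a free variable, and `bind` chains computations. The names `unit`, `bind` and `log_access`, and the audit-trail reading of the state, are hypothetical and do not come from the thesis.

```python
def unit(value):
    """Wrap a plain value; the returned closure keeps `value` as a free variable."""
    def run(state):
        return value, state
    return run

def bind(m, f):
    """Sequence computation `m` with the function `f`, threading the state through."""
    def run(state):
        value, new_state = m(state)
        return f(value)(new_state)
    return run

def log_access(user):
    """Hypothetical step: append `user` to an audit trail and return the trail length."""
    def run(state):
        new_state = state + (user,)
        return len(new_state), new_state
    return run

# Two logged accesses followed by a pure combination of their results.
program = bind(
    log_access("alice"),
    lambda n1: bind(log_access("bob"), lambda n2: unit(n1 + n2)),
)

result, audit_trail = program(())
```

Running `program` on the empty trail threads the state through both steps without any mutable variable, which is the sense in which the closures keep the operations closed over their components.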

6.6 Security in distributed computations

A distributed system is an application that executes a collection of protocols to coordinate the actions of multiple processes on a network, such that all components cooperate to perform a single task or a small set of related tasks. Components in a distributed system include processes (e.g. computers), channels (i.e. networks), and time and failure handling components (such as delays and failure detectors in synchronous/asynchronous communication). Each component in a distributed system is generally constructed from a collection of other components. Good software engineering practice for process management in distributed systems has been based on a variety of mathematical formal interpretations of processes, known as process calculi.

Process calculi provide a tool for the high-level description of interactions, communication (through message-passing), and synchronization between a collection of independent agents or processes in general. They have been used for modelling transactions in distributed information systems, particularly for representing concurrency. They allow sequential and parallel composition of processes. They provide channel specification for sending and receiving data. They explain recursion and process replication based on repetition. The feature of hiding operations in channels allows agents (which are processes) to be composed in parallel. CCS (Calculus of Communicating Systems), CSP (Communicating Sequential Processes) [Hoare, 1985] and ACP (the Algebra of Communicating Processes) [Bergstra and Klop, 1995] constitute the three major branches of the process calculi family. The use of channels for communication is one of the features distinguishing the process calculi from other models of concurrency, such as Petri nets. A system can evolve in different ways depending on the interaction among processes. Milner has shown [1999] that computation and communication in distributed systems and particularly in mobile distributed systems, can be modelled using the notion of the process. λ-calculus has been used for handling single-threaded computation and in general does not offer any explicit constructs for parallel programming. But as Milner observed, processes consist of many elementary parallel, interacting, communicating threads which exchange messages with each other. In π-calculus, all participants in a process are themselves processes. Thus integers, strings, objects, workflows, procedures or any other computational entities (e.g. users as participants in process interaction) or services (e.g. security services as core processes of the distributed system), can be considered to be just different forms (i.e. types) of an abstract process data type. 
Names in π-calculus represent channels that can act both as the actual communication channels (e.g. input and output channels) and as the actual data (the contents encapsulated in messages). Pairs of processes interact with each other by sending and receiving named messages in a synchronized way. That means that processes can be composed, allowing them to communicate through named channels with complementary names (e.g. distributed agents may use the same named channel as an input and output interaction point). The mobility feature in π-calculus refers to the fact that the recipient process may use the received channel for further communication. It directly refers to a dynamic change in the communication topology among processes.
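The mobility feature described above can be sketched with ordinary thread-safe queues standing in for channels. This is only an approximation under stated assumptions: `queue.Queue` is a buffered, asynchronous channel, whereas communication in π-calculus is synchronous, and the function names and the message text are hypothetical. The point illustrated is that a channel is itself sent as data and then used by the recipient for further communication.

```python
import queue
import threading

def server(public):
    """Receive a request together with a channel, then reply on that channel."""
    request, reply_channel = public.get()  # the received name is itself a channel
    reply_channel.put("granted:" + request)

def client(public, results):
    """Create a fresh private channel and send it over the public one."""
    reply_channel = queue.Queue()
    public.put(("read file", reply_channel))
    results.append(reply_channel.get())    # blocks until the server replies

public = queue.Queue()
results = []
threads = [
    threading.Thread(target=server, args=(public,)),
    threading.Thread(target=client, args=(public, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Before the exchange only `public` connects the two threads; afterwards the server also holds `reply_channel`, which is the dynamic change in communication topology that mobility refers to.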

The BPMS technology, a life-cycle for good systems design, is a holistic management approach that promotes business effectiveness and efficiency by attempting to improve and optimize processes continuously. The relationship of π-calculus with Business Process Management, along with an implementation for workflow systems, was presented by Smith and Fingar [2003], where new values are created from processes in a way analogous to an RDBMS creating new values from data. Just as an RDBMS supports the aggregation of business data and the creation of application and enterprise data models, a BPMS achieves the same for business processes. Thus, an RDBMS creates new value from data while a BPMS creates new value from process. Sangiorgi and Walker examined the relationship between π-calculus and λ-calculus, as well as a number of applications of π-calculus to the object-oriented paradigm [2003].

A distributed channel-based implementation of π-calculus presented by Wischik [2002] demands that at runtime level only channels are present, each one either at its own location or co-located with some other channels (local extensionalities/global intensionality). The authors argue that creating a new channel is an actual physical event. On the other hand, location-based calculi, such as the distributed π-calculus and the ambient calculus, need a further set of definitions for interaction between locations, which adds a further level of complexity: in π-calculus only one type of entity is used (located_channels), while the latter two demand two types of entities at runtime (locations and channels).

In its simplest form, a process (i.e. a single-thread process in λ-calculus, or the initial thread, the terminal thread or any other thread in a list of parallel threads of a process in π-calculus) is regarded as a sequence of events (i.e. a sequential program). Events of a process A are in a total order $e_A^0 \to e_A^1 \to e_A^2 \to \dots \to e_A^n$.
In each step (clock tick), a process either performs some computation locally or sends/receives a message to/from another process (in this case, globally). Messages transfer information between processes or coordinate process activities. Processes interact with each other via communication channels. In Figure 6-20, each of the processes P and Q consists of a sequence of events. Those events either affect the local state of a process or are send/receive events denoting the exchange of a message (e.g. messages $m$, $m'$). Since a channel is created explicitly between two processes, each of them can use it to send and receive messages, represented in the form of send/receive events in the local history of the process. New channels can be created, even during the execution of a distributed computation. An agent or a process can be connected to more than one channel at a time. That means that a process can communicate and exchange messages with more than one other process. It also allows group communication, when a process sends a message to a group of processes.

[Figure 6-20: A communication channel between processes P and Q]

We could use a graph to illustrate process communication via communication channels, as proposed, for example, by Tsaur & Horng [2001]. In this way, vertices would represent processes, while edges would represent channels used for process interaction. A distributed service (e.g. a security service) can be provided by one or more server processes, interacting with each other and with other client processes, in order to maintain consistency on the service's resources. Processes (e.g. server processes or peer processes) encapsulate resources (e.g. objects) in the exchanged messages and allow clients to access them through interfaces. Interfaces are represented as components in the application level. Principals (users or other processes as participants) are authorized to operate on resources. Resources must be protected against unauthorized access. The issue of security in distributed systems mainly derives from the need to share resources via communication channels (e.g. secure channels such as SSL) used in the network level by processes in order to exchange messages. These channels should be secured in order to deliver secure distributed computations. In the processing level, this requirement is mainly achieved by the use of secure distributed transactions. In Figure 6-21, a client requests access to system resources from a server process ($send(m)$), and permission can be granted based on the access control rights (privileges) of the client (a user). In this case, the server replies to this request by issuing a $send(m')$ message.

[Figure 6-21: Exchange of messages between client and server processes]
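The request/reply exchange of Figure 6-21 can be sketched as follows. This is a minimal illustration under assumptions: the access control list `ACL`, the message classes `Request` and `Reply`, and the function `handle` are hypothetical names, and the privilege check is reduced to a dictionary lookup rather than a real access control service.

```python
from dataclasses import dataclass

# Hypothetical access control list: principal -> resource -> permitted operations.
ACL = {
    "alice": {"ledger": {"read", "write"}},
    "bob": {"ledger": {"read"}},
}

@dataclass(frozen=True)
class Request:
    """Plays the role of message m sent by the client process."""
    principal: str
    resource: str
    operation: str

@dataclass(frozen=True)
class Reply:
    """Plays the role of message m' sent back by the server process."""
    granted: bool
    reason: str

def handle(request, acl=ACL):
    """Server side: grant access only if the principal holds the privilege."""
    allowed = acl.get(request.principal, {}).get(request.resource, set())
    if request.operation in allowed:
        return Reply(True, "authorized")
    return Reply(False, "unauthorized")

reply_to_alice = handle(Request("alice", "ledger", "write"))
reply_to_bob = handle(Request("bob", "ledger", "write"))
reply_to_unknown = handle(Request("mallory", "ledger", "read"))
```

Unknown principals and unknown resources fall through to an empty privilege set, so the sketch denies by default.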

In the OSI X.800 security architecture [1989], a security service is defined as '… a processing or communication service that is provided by a system to give a specific kind of protection to system resources. It is a service provided by a protocol layer of communicating open systems which ensures adequate security of the system or of data transfers…'. The actual organization of a DIS refers to the way that processes are organized, usually following the client-server architecture (Figure 6-21). Therefore, the issue of security has to do with the applied level of security in each layer of communication between the different layers of a DIS, expressed in terms of the provided security services as core processes of the system.

Lamport clocks [1978] [1986] as well as vector clocks [Fidge, 1988] [Mattern, 1989] [Basten et al., 1997] use numbers, for example, in order to define the conditions for comparing vector timestamps or to define a total order of events in a global process history by taking into account the identifiers (i.e. numbers) of the processes at which events occur [Babaoglu and Marzullo, 1993]. Parallel events are identified and ordered by taking into account the identifiers of the processes at which they occur. Vector timestamps also have the disadvantage, compared to Lamport timestamps, of taking up an amount of storage and message payload that is proportional to the total number of processes in the distributed system. Totally ordered clocks, an extension to Lamport clocks, need two elements in order to be addressed and distinguished: the value of the process clock along with the process identifier. The shortcoming of Lamport clocks is that, given $L(e)$, $L(e')$ as the values of the clocks for two events $e$, $e'$ belonging to the same or different processes, $L(e) < L(e')$ does not imply $e \to e'$, as it might be the case that $e \parallel e'$.

The issue of event ordering or causal ordering in distributed systems is crucial, especially for security purposes, as it is directly connected with non-repudiation, accounting and auditing control. Furthermore, in distributed transactions via secure communication channels it is vital to know the condition of each activity at runtime level in order to control and evaluate the needed security measures, e.g. security policies in the form of security restrictions implemented as security mechanisms, to ensure data integrity and availability. The following analysis intends to clarify the use of the partial relationship 'happened-before' as well as the cases where two events are parallel. Furthermore, it will try to examine whether event ordering in a distributed system can be defined using higher-order logic in the context of category theory. For an event $a$ in a process A, one of the following is true:

• There is no event in A prior to $a$. That is, $a$ is the initial object in A and it is internal (Figure 6-22 [i]).

• There is an event $b$ in A prior to $a$. Both of them are internal events in A (Figure 6-22 [ii]).

• There are two events prior to $a$: an event $b$ in process A and an event $c$ in another process C. In this case there is an exchange of the message $m$, with $a = receive(m)$ and $c = send(m)$. Event $b$ is internal in A (Figure 6-22 [iii]).

The dashed arrows represent none, one, or more events in the local history of a process.

[Figure 6-22: Internal events and events on a message exchange]
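The shortcoming of Lamport clocks noted above, and the way vector timestamps avoid it, can be seen in a short sketch. The update rules are the standard ones for the two kinds of clock; the class names, method names and the two-process scenario are assumptions made for the example.

```python
class LamportClock:
    """Scalar logical clock."""
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        return self.local_event()  # the timestamp travels with the message

    def receive(self, message_time):
        self.time = max(self.time, message_time) + 1
        return self.time


class VectorClock:
    """One counter per process; `pid` is this process's index."""
    def __init__(self, pid, n_processes):
        self.pid = pid
        self.v = [0] * n_processes

    def local_event(self):
        self.v[self.pid] += 1
        return tuple(self.v)

    def send(self):
        return self.local_event()

    def receive(self, message_vector):
        self.v = [max(x, y) for x, y in zip(self.v, message_vector)]
        return self.local_event()


def happened_before(u, v):
    """Vector timestamp u precedes v iff u <= v componentwise and u != v."""
    return u != v and all(x <= y for x, y in zip(u, v))

def parallel(u, v):
    return u != v and not happened_before(u, v) and not happened_before(v, u)

# Scenario: process 0 performs two local events, process 1 performs one, no messages.
lamport_0, lamport_1 = LamportClock(), LamportClock()
vector_0, vector_1 = VectorClock(0, 2), VectorClock(1, 2)

lamport_0.local_event()
e_lamport = lamport_0.local_event()   # second event of process 0
f_lamport = lamport_1.local_event()   # first event of process 1

vector_0.local_event()
e_vector = vector_0.local_event()
f_vector = vector_1.local_event()

# A message from process 0 to process 1 does create a causal link.
m_vector = vector_0.send()
r_vector = vector_1.receive(m_vector)
```

Here the Lamport value of the event in process 1 is smaller than that of the second event in process 0, although no message relates them; the vector timestamps of the same two events are incomparable, which identifies them as parallel.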

Between two events $a$, $b$ that belong to the same process A, if $a \to b$, then there may be a series of events $e, e', e'', \dots$ that occur between them. That is, the arrow between two distinct events $a$, $b$ can be either single or composite. It should be noted that Lamport [1978] argued that those two events can also belong to different processes, thus the intermediate series of events can also belong to different processes. In this case, this sequence of events might not be unique, meaning the repetition of some of them. There may be two events $a$, $b$ that belong to two different processes with no arrow between them (that is, neither $a \to b$ nor $b \to a$ is true). These two events can be internal to their process or participate in a message exchange (a $send(m)$ or $receive(m)$ event), and are called parallel events. Thus, parallel events never belong to the same process. The associations between the events of two processes in the case of exchanging a message $m$ are presented in Figure 6-23.

[Figure 6-23: Potential relationships between the events of two processes in the case of a message exchange]

In terms of the partial ordering 'happened-before', $a = send(m)$, $b = receive(m)$, $a \to b$, $a' \to a$, $a \to a''$, $b' \to b$ and $b \to b''$. Questions occur about the relationship between the following pairs of events:

• $a'$, $b'$: as neither $a' \to b'$ nor $b' \to a'$, $a' \parallel b'$

• $a'$, $b$: as $a' \to a \to b$, $a' \to b$

• $a'$, $b''$: as $a' \to a \to b \to b''$, $a' \to b''$

• $b'$, $a$: as neither $a \to b'$ nor $b' \to a$, $b' \parallel a$

• $b'$, $a''$: as neither $b' \to a''$ nor $a'' \to b'$, $b' \parallel a''$

• $a$, $b''$: as $a \to b \to b''$, $a \to b''$

• $b$, $a''$: as neither $b \to a''$ nor $a'' \to b$, $b \parallel a''$

• $a''$, $b''$: as neither $a'' \to b''$ nor $b'' \to a''$, $a'' \parallel b''$

The above relationships can be summarized, in general, in cases when two events of two different processes are parallel, or can be partially ordered by the 'happened-before' relationship, as given below, for events $e, send(m) \in A$ and $e', receive(m) \in B$:

Partially ordered

• If $e \to send(m)$, then $e \to receive(m)$

• If $e \to send(m) \wedge receive(m) \to e'$, then $e \to e'$

• If $receive(m) \to e'$, then $send(m) \to e'$

Parallel events

• If $e \to send(m) \wedge e' \to receive(m)$, then $e \parallel e'$

• If $e' \to receive(m)$, then $e' \parallel send(m)$

• If $e' \to receive(m) \wedge send(m) \to e$, then $e' \parallel e$

• If $send(m) \to e$, then $receive(m) \parallel e$

• If $send(m) \to e \wedge receive(m) \to e'$, then $e \parallel e'$

The arrows in the above relationships can be either single or composite. Furthermore, if $e \to send(m)$ or $e' \to receive(m)$, then the relevant relationship, e.g. $e \parallel e'$, is true for all the events prior to $e$ or $e'$, until we find an event that belongs to a consistent run R of the system. Thus, for example, $e'' \to e \wedge e''' \to e' \wedge e \parallel e'$ implies that $e'' \parallel e'''$, for $e'' \in A$, $e''' \in B$, with $e'', e''' \notin R$.

Similarly, if $send(m) \to e$ or $receive(m) \to e'$, then the relevant relationship, e.g. $e \parallel e'$, is true for all the events after $e$ or $e'$, until we find an event that belongs to R. Thus, for example, $e \to e'' \wedge e' \to e''' \wedge e \parallel e'$ implies that $e'' \parallel e'''$, for $e'' \in A$, $e''' \in B$, with $e'', e''' \notin R$.

A total ordering of the events in a process of a distributed system, represented as a category E, can be given by a comma category $(e \downarrow \mathbf{E})$, where $e$ is the initial event of E (E-objects under $e$). This total ordering also expresses the individual, local history $history(\mathbf{E})$ of process E. A finite prefix of the process's history up to a specified event $e'$ (with $e \to \dots \to e'$) can be given by a comma category $(\mathbf{E} \downarrow e')$, that is, E-objects over $e'$.

A cut C of the system's execution is a subset of its global history, which is a union of prefixes of process histories. Thus, a cut C consists of the individual comma categories $(\mathbf{E} \downarrow e')$ as subcategories; in this case, the functor $T : (\mathbf{E} \downarrow e') \to C$ sends every object and arrow of the comma category to itself in C (an inclusion functor). In the case where $\mathbf{E} \times \mathbf{D}$, in the pullback diagram with comma categories introduced in §5.3.8, represents all the pairs of events of two processes E and D (the two categories actually represent the local histories of the two processes), the comma category $(T \downarrow S)$ represents all the parallel events between these two processes.

Let S be the category of sentences of propositional logic. We can consider S as a preorder $(S, \le)$, where $p \le q$ means that from $p$ we can derive $q$. Then S forms a Cartesian closed category, where products are given by conjunction of propositions and the exponential $q^p$ corresponds to "p implies q". Let us now have two arrows $f$, $f'$ between two events $a$, $b$ that belong to the same or different processes, with $f$ representing the partial causal relation 'happened-before' and $f'$ the relation 'is parallel with'. Between these two events, only one of them can be true; either one event happened before the other or they are parallel. Let us have two natural transformations (2-cells) between the two arrows, $\alpha, \beta : f \Rightarrow f'$. Let $\alpha$ represent the statement 'if $f$ is true then $f'$ is false' and $\beta$ represent the statement 'if $f$ is false then $f'$ is true'; both of them are composed of S-sentences of the form '$g$ is true' or '$g$ is false'. Then, a modification (3-cell) $\Gamma : \alpha \Rrightarrow \beta$ may modify the relation between the two events. For example, for an event $a$ that may or may not belong to a consistent run R, it may be determined whether it is parallel with another event $b$ (Figure 6-24).

[Figure 6-24: A modification $\Gamma$ that determines if two events are parallel or not]

Processes interact through a network that is shared by many users (enemies as well). Usually, we need to build a reliable system that runs over an unreliable communication network, which means we are forced to deal with uncertainty. Security activities, dealing with security requirements, policies, services and mechanisms, take place in the form of distributed computations at application, network and host level. In the end, all of them are translated into a system of interacting processes. Any application-level communication is mediated by a specific security policy. Security services are implemented by security mechanisms. Security services materialize specific security policies, which in turn satisfy security requirements. The correspondence between security policies and security requirements is shown in Figure 6-25. To these security requirements we should also add availability, as it is associated mainly with fault-tolerance policies, services and techniques.

[Figure 6-25: Correspondence between security policies and security requirements. Policies shown: access control policy, inference policy, user identification policy, accountability & audit policy, consistency. Requirements shown: access control, confidentiality, authentication, accounting & auditing, integrity.]

An example illustrating higher-order logic in access control is given in Figure 6-26.

[Figure 6-26: Higher-order logic in the case of access control]

6.7 Monads and comonads in the 4-level architecture

6.7.1 Evaluation and comparison of two policy frameworks using the cube and the monads/comonads

[Figure 6-27: Adjointness (evaluation) and comparison of two applied policy frameworks based on the same security principles (confidentiality, integrity, availability)]

The evaluation (based on the adjoint functors $P$ and $P'$) and comparison (natural transformation $\alpha$) of two security policy frameworks based on the same security principles of confidentiality, integrity and availability are presented in Figure 6-27. Functor $P$ is a left adjoint to functor $P^{op}$. Functor $P'$ is a left adjoint to functor $P'^{op}$. The adjunction $\langle P, P^{op}, \eta, \varepsilon \rangle$ gives rise to a monad $\mathbf{T} = \langle T, \eta, \mu \rangle$ with $T : \mathbf{CPT} \to \mathbf{CPT}$ as an endofunctor $T = P^{op}P$ and natural transformations $\eta : I_{\mathbf{CPT}} \to T$, $\mu : T^2 \to T$. The counit of the adjunction is $\varepsilon : PP^{op} \to I_{\mathbf{CST}}$, while the endofunctor $T$ has composites $T^2 = T \circ T = P^{op}PP^{op}P \to P^{op}P : \mathbf{CPT} \to \mathbf{CPT}$ and $T^3 = T^2 \circ T = P^{op}PP^{op}PP^{op}P \to P^{op}P : \mathbf{CPT} \to \mathbf{CPT}$. The associative law, the interchange law and the left and right unit laws for the monad $T = P^{op}P$ are given in Figures 6-28, 6-29 and 6-30, respectively.

[Figure 6-28: Associative law for $T = P^{op}P$]

[Figure 6-29: Interchange law for $T = P^{op}P$]

By applying the left and right unit laws, we obtain the two triangular identities, $1_{P^{op}} = P^{op}\varepsilon \circ \eta P^{op} : P^{op} \to P^{op}$ (where $\eta P^{op} : I_{\mathbf{CPT}}P^{op} \to P^{op}PP^{op}$ and $P^{op}\varepsilon : P^{op}PP^{op} \to P^{op}I_{\mathbf{CST}} = P^{op}$) and $1_P = \varepsilon P \circ P\eta : P \to P$ (where $P\eta : PI_{\mathbf{CPT}} \to PP^{op}P$ and $\varepsilon P : PP^{op}P \to I_{\mathbf{CST}}P = P$), making the diagram in Figure 6-30 commute.

[Figure 6-30: Left and right unit laws for $T = P^{op}P$]
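For reference, the laws for the monad $T = P^{op}P$ can be written out as equations. The block below is a standard derivation stated under the assumption that the multiplication is induced by the counit in the usual way, $\mu = P^{op}\varepsilon P$; it uses only the triangular identity $\varepsilon P \circ P\eta = 1_P$ and its dual $P^{op}\varepsilon \circ \eta P^{op} = 1_{P^{op}}$.

```latex
\begin{align*}
\mu &= P^{op}\,\varepsilon\,P : T^2 \to T, \\
\mu \circ T\mu &= \mu \circ \mu T : T^3 \to T
  && \text{(associative law)}, \\
\mu \circ T\eta &= P^{op}\varepsilon P \circ P^{op}P\eta
  = P^{op}(\varepsilon P \circ P\eta) = P^{op}1_P = 1_T
  && \text{(unit law)}, \\
\mu \circ \eta T &= P^{op}\varepsilon P \circ \eta P^{op}P
  = (P^{op}\varepsilon \circ \eta P^{op})P = 1_{P^{op}}P = 1_T
  && \text{(unit law)}.
\end{align*}
```

The associative law follows from the naturality of $\varepsilon$, which is why the interchange law appears alongside it.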

The adjunction $\langle P, P^{op}, \eta, \varepsilon \rangle$ also gives rise to a comonad $\mathbf{L} = \langle L, \varepsilon, \delta \rangle$ with $L : \mathbf{CST} \to \mathbf{CST}$ as an endofunctor $L = PP^{op}$ and natural transformations $\varepsilon : PP^{op} \to I_{\mathbf{CST}}$, $\delta : L \to L^2$ with $\delta = P\eta P^{op}$. By applying the left and right unit laws in the case of the comonad, we obtain the two triangular identities, $1_P = \varepsilon P \circ P\eta : P \to P$ (where $P\eta : P \to PP^{op}P$ and $\varepsilon P : PP^{op}P \to I_{\mathbf{CST}}P = P$) and $1_{P^{op}} = P^{op}\varepsilon \circ \eta P^{op} : P^{op} \to P^{op}$ (where $\eta P^{op} : P^{op} \to P^{op}PP^{op}$ and $P^{op}\varepsilon : P^{op}PP^{op} \to P^{op}I_{\mathbf{CST}} = P^{op}$), making the diagram in Figure 6-31 commute.

[Figure 6-31: Left and right unit laws for $L = PP^{op}$]

Having a CPT-object $a$ and a CST-object $b$, the unit and the counit of the adjunction involved in each of the three cycles are presented in Figures 6-32, 6-33 and 6-34.

[Figure 6-32: The unit and counit of the adjunction on the 1st cycle of a monad]

[Figure 6-33: The unit and counit of the adjunction on the 2nd cycle of a monad]

[Figure 6-34: The unit and counit of the adjunction on the 3rd cycle of a monad]

The above three diagrams can be combined into one, as in Figure 5-32, where $T = P^{op}P$, $T^2 = T \circ T = P^{op}PP^{op}P$, $T^3 = T^2 \circ T = P^{op}PP^{op}PP^{op}P$, $L = PP^{op}$, $L^2 = L \circ L = PP^{op}PP^{op}$ and $L^3 = L^2 \circ L = PP^{op}PP^{op}PP^{op}$. In that diagram, the dashed arrows represent the unit and the counit of the adjunction involved in each of the three cycles, for a CPT-object $P^{op}b$ and a CST-object $Pa$.

Similarly, the adjunction $\langle P', P'^{op}, \eta', \varepsilon' \rangle$ gives rise to a monad $\mathbf{T}' = \langle T', \eta', \mu' \rangle$ with $T' : \mathbf{CPT} \to \mathbf{CPT}$ as an endofunctor $T' = P'^{op}P'$ and natural transformations $\eta' : I_{\mathbf{CPT}} \to T'$, $\mu' : T'^2 \to T'$. The counit of the adjunction is $\varepsilon' : P'P'^{op} \to I_{\mathbf{CST}}$. The comonad for the second adjunction is $\mathbf{L}' = \langle L', \varepsilon', \delta' \rangle$ with $L' : \mathbf{CST} \to \mathbf{CST}$ as an endofunctor $L' = P'P'^{op}$ and natural transformations $\varepsilon' : P'P'^{op} \to I_{\mathbf{CST}}$, $\delta' : L' \to L'^2$ with $\delta' = P'\eta'P'^{op}$.

In order to address a security policy framework for achieving the three security principles, confidentiality, integrity and availability, three pairs of adjoint functors are needed, one corresponding to each security principle. That means three adjunctions, $\langle P, P^{op}, \eta, \varepsilon \rangle$, $\langle P', P'^{op}, \eta', \varepsilon' \rangle$ and $\langle P'', P''^{op}, \eta'', \varepsilon'' \rangle$, between categories CPT and CST. Each one of them can be visualized as an integrated schema, such as in Figure 6-35. In the top part of it (monad construction), closure is provided by the object $a$ (covariant way), while in the bottom part (comonad construction), closure is provided by the object $L^2Pa$ (contravariant way). The last one varies in order to express the different ways of applying the desired level of security in terms of different security implementations.

[Figure 6-35: The integrated diagrams for the monad/comonad construction (the 3 cycles), in the case of having two adjunctions between categories CPT and CST]

6.7.2

Maintaining database consistency

In the case of maintaining database consistency in distributed transactions (and transactions, in general) via secure communication channels, the three cycles of the monad/comonad construction are defined in Figure 6-36.

1st cycle: Applying integrity semantic control
2nd cycle: Applying protection (based mainly on encryption techniques)
3rd cycle: Applying the ACID properties of a transaction

Figure 6-36: The three cycles of the monad/comonad construction for maintaining database consistency

Mechanisms of maintaining database consistency include reliability, protection, semantic integrity control and concurrency control. Database reliability is expressed in terms of the atomicity and durability properties while concurrency control is expressed in terms of the consistency and isolation of the ACID properties of a transaction (Figure 3-11). Thus, reliability can be expressed as a 2-cell between atomicity and durability with concurrency control as a 2-cell between consistency and isolation (Figure 6-37).

Figure 6-37: The ACID properties of a transaction expressed as horizontal composition of 2-cells, in terms of reliability (atomicity & durability) and concurrency control (consistency & isolation)

The vertical category $T(CST, SCH)$ (i.e. a bifunctor) is defined from the vertical composition of the 2-cells $SemanticIntegrityControl \Rightarrow Protection \Rightarrow ACID : CST \to SCH$, as can be seen in Figure 6-38. There, the dual constructions ensure that the three adjunction cycles occur.

Figure 6-38: Database consistency in terms of its components, Semantic Integrity Control, Protection Control and ACID properties (for reliability and concurrency control)

The integrated diagrams for the three adjunctions between the categories CST and SCH are similar to those in Figure 6-35. Instead of the functors $P$, $P'$ and their opposites $P^{op}$, $P'^{op}$, we now have the functors $SemanticIntegrityControl$, $ProtectionControl$, $ACIDproperties$ and their opposites $SemanticIntegrityControl^{op}$, $ProtectionControl^{op}$, $ACIDproperties^{op}$. Also, instead of two diagrams, there are three integrated diagrams, one for each adjunction of the three cycles of a monad/comonad construction. In this case, closure is given by the exponential object $C^C$. The correspondence between a functor category and a vertical category of 2-cells, by substituting category E with category C, is given in Figure 6-39.

Figure 6-39: Functor categories ($C^C$, $C^D$, $C^D \times D^C$, $D^C$) and the corresponding vertical categories ($T(C, C)$, $T(D, C)$, $T(D, C) \times T(C, D)$, $T(C, D)$) in database consistency

6.8 Identify risks based on threats and vulnerabilities

Attacks on a target, through communication channels, from attackers (enemies or adversaries) are given by the exponential $target^{attacker}$ in Figure 6-40. The higher-order evaluation function $eval()$ provides the target of the attacks from an attacker. The target of a specific attack from an attacker can also be given by the function $g()$ applied to the product $risk \times attacker$ of an identified risk and an attacker. Then, the function $curry(g)$ identifies all the security attacks (passive and active) on a specific target from an attacker based on an identified risk exposure. Arrows $f$, $f'$ represent attacks on a specified target.

1

2

risk  attacker

f

g

curry( g ) 1attacker

curry( g )

attacker

f target

1attacker f

eval

f

target attacker

 1

target attacker  attacker

 2

attacker

Figure 6-40: Illustrating attacks on a specified target from an identified risk from an attacker, based on exponentiation
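
The exponentiation described above can be sketched in a few lines of Python. This is only an illustrative reading of the diagram, not part of the thesis: the table `G_TABLE`, the risk, attacker and target names, and the helper `eval_` are hypothetical. The sketch checks the defining property that evaluating $curry(g)$ at an attacker gives back $g$.

```python
# Hypothetical finite data: the target reached by a given attacker through a given risk.
G_TABLE = {
    ("weak_password", "mallory"): "mail_server",
    ("weak_password", "eve"): "file_server",
    ("open_port", "mallory"): "database",
    ("open_port", "eve"): "database",
}


def g(risk, attacker):
    """g : risk x attacker -> target."""
    return G_TABLE[(risk, attacker)]


def curry(fn):
    """curry(g) : risk -> target^attacker, i.e. a function attacker -> target."""
    return lambda risk: (lambda attacker: fn(risk, attacker))


def eval_(h, attacker):
    """eval : target^attacker x attacker -> target."""
    return h(attacker)


def check_universal_property():
    """Check eval . (curry(g) x 1_attacker) = g on every (risk, attacker) pair."""
    return all(eval_(curry(g)(r), a) == g(r, a) for (r, a) in G_TABLE)
```

Under these assumptions, `curry(g)("open_port")` plays the role of an element of the exponential: it lists, for one identified risk, the target each attacker would reach.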

Entities such as risk, attacker and target, as well as the entities in the next two examples, are themselves comma categories, in levels SCH and DAT (schema entities from the SCH category are projected onto simple or complex objects in the category DAT).

6.9 Balancing the cost of security measures against their effectiveness

The effectiveness and the cost of the security techniques that are needed should be balanced against the threats, thus providing an assurance that information risks and security controls are in balance. A risk is the result of the interaction of threats and vulnerabilities. Both must exist for a risk to exist. The cost is defined in terms of computational effort and network usage. Each security measure achieves a level of control (i.e. the effectiveness of the security measure). Figure 6-41 provides a view of threat assessment (based on a threat model), risk evaluation and control, as well as evaluation of the effectiveness of applied security measures, in categorical terms, displaying the appropriate categories.

Figure 6-41: Threat assessment, risk evaluation and control and effectiveness of applied security measures

Figure 6-42 shows how a risk can be defined in terms of threats and vulnerabilities. In Figure 6-43, the effectiveness (control) of a security measure on identified risks is balanced against cost. The comma category $(T \downarrow S)$, in each case, can represent all the possible combinations of threats and vulnerabilities (Figure 6-42) or risks and security measures (Figure 6-43), by choosing the appropriate type of functors $T$ and $S$ (e.g. selection or identity functors). In Figure 6-43, the actual cost, corresponding to the overall cost (in terms of computational effort and network usage), is represented as a pushout in the context of the balanced cost. The unit $\eta_{control}$ and counit $\varepsilon_{balancedcost}$ of the adjunction $\langle F, G, \eta_{control}, \varepsilon_{balancedcost} \rangle : control \to balancedcost$ provide the measures for keeping a balance between the cost of the applied security measures and their effectiveness in minimizing security risks.

Figure 6-42: A risk identified as a result of the interaction of threats and vulnerabilities

Figure 6-43: How to keep the balance between security controls (the effectiveness of the applied security measures) and costs (in terms of computational effort and network usage)

6.10 Integrating two intrusion detection methods – auditing and logging procedures

No single IDS approach can detect all types of intrusion; it is better to combine an anomaly detection approach with a misuse detection approach. Anomaly detection is based on the assumption that an attack on a computer system will be noticeably different from normal system activity and that an intruder will exhibit a pattern of behaviour different from that of a normal user. In the case of misuse detection, a collection of known intrusion techniques is kept in a knowledge base and intrusions are detected by searching through the knowledge base for the same techniques. One of the methods that have been used for implementing the two approaches is auditing (using audit trails). Audit trails can help a system administrator trace a security violation once it has occurred, if possible back to the user responsible for it. They are made of logged, typed events of user actions. An auditing service (e.g. implemented on a dedicated audit server) can be used for monitoring and logging procedures.

In Figures 6-44 & 6-45 a formal approach (on the top level) is presented. It combines a comparison of the attacker's steps with user profiles representing normal user behaviour as well as a comparison with stored intrusion techniques. System activities (including users, attackers and other processes) are given by the category SA. User profiles are represented by the category UP. The comma category $(SA \downarrow G)$ represents intrusion activities while the comma category $(F \downarrow UP)$ represents normal user activities. Then, the bifunctor composed from the product functor $L$ and the functor $R$ constrains the global extensionality $SA \times UP$, that is, all the combinations of system activities with stored user profiles (including stored intrusion techniques), to the comma category $(T \downarrow S)$, that is, the local association of an intrusion activity with a normal user profile.

In Figure 6-45, each single step (event) of the attacker is compared to the corresponding step of a normal user behaviour profile. The unit $\eta_c$ of the adjunction $\langle F, G, \eta, \varepsilon \rangle$ expresses the deviation from normal user behaviour, while the counit $\varepsilon_d$ defines a new intrusion technique (event $FGd$ differs from event $d$) which needs to be stored in the knowledge base. The functors $I$, $I'$ are the inclusion functors from the comma categories $(SA \downarrow G)$ and $(F \downarrow UP)$ (locally) to the categories SA and UP (globally), respectively, as subcategories.
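
The combination of the two detection approaches can be illustrated operationally with a small Python sketch. It is not the categorical construction itself, only a hedged illustration of the idea: the event names, the `threshold` parameter and the helper functions are all hypothetical. An audit trail is first searched for stored intrusion techniques (misuse detection) and then compared against a normal user profile (anomaly detection); a trail that deviates from the profile without matching the knowledge base is flagged as a candidate new technique.

```python
def contains(trail, pattern):
    """True if `pattern` occurs as a contiguous run of events in `trail`."""
    n = len(pattern)
    return any(tuple(trail[i:i + n]) == tuple(pattern)
               for i in range(len(trail) - n + 1))


def anomaly_score(trail, profile):
    """Fraction of events in the trail that the normal user profile does not contain."""
    if not trail:
        return 0.0
    unusual = [event for event in trail if event not in profile]
    return len(unusual) / len(trail)


def classify(trail, profile, signatures, threshold=0.5):
    """Misuse detection first (knowledge base), then anomaly detection (profile)."""
    if any(contains(trail, s) for s in signatures):
        return "known intrusion"
    if anomaly_score(trail, profile) > threshold:
        # Deviates from normal behaviour but matches no stored technique:
        # a candidate for a new entry in the knowledge base.
        return "suspected new intrusion"
    return "normal"


# Hypothetical knowledge base and user profile.
SIGNATURES = [("login_fail", "login_fail", "login_fail"), ("read_passwd", "upload")]
PROFILE = {"login", "read_mail", "logout"}
```

For instance, `classify(["login", "read_mail", "logout"], PROFILE, SIGNATURES)` is classified as normal, while three consecutive `login_fail` events match a stored technique.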

Figure 6-44: System activities and user profiles (global extensionality) constrained to intrusions and normal user activities (local extensionality)

Figure 6-45: The unit and counit of the adjunction between intrusions and normal user activities express the deviation of the normal user behaviour and a new intrusion technique, respectively

6.11 Inter- and intra-relationships in the architecture

Lawvere, in his Dialectica paper [1969a], deals among others with the notion of a hyperdoctrine. Thus, for a category T whose objects, of the form $type$, are the types of a typed system (i.e. the distributed system under consideration, where types refer directly to system components, i.e. an object type or a relation), there is an associated Cartesian closed category (CCC) $P(type)$, the attributes (and valid arrows between them) of a particular type (i.e. the attributes of an object type or a relation). In category T, arrows $1 \to type$ are constant terms of type $type$. For a morphism $f : type_1 \to type_2$ in T, there is a corresponding functor $f \cdot (\;) : P(type_2) \to P(type_1)$ with $(\;)\Sigma f : P(type_1) \to P(type_2)$ a left adjoint and $(\;)\Pi f : P(type_1) \to P(type_2)$ a right adjoint, that is $(\;)\Sigma f \dashv f \cdot (\;) \dashv (\;)\Pi f$.

In the four-level architecture, where arrows follow a contravariant logic of the form $value \to name$ and $name \to type$ (and therefore implicitly mediated arrows of the form $value \to type$), a particular user Alice is a user_instance, and ultimately a system user. These associations are presented in Figure 6-46. With the object Alice is associated a CCC $P(Alice)$, the values of the attributes of that object, with a correspondence between $P(user)$ and $P(Alice)$.

Figure 6-46: The association between a value (a record or an object) and a type of a system component, through the corresponding CCC $P(type)$ and $P(value)$.

According to Lawvere [1969a], in every adjunction we assume that one of the categories is the dual of the original underlying category for this part of the adjunction. Assuming that in category $SCH^{op}$ morphisms are of the form $type \to name$, that is, type objects can generally be distinguished into one or more name objects based on specific attributes (i.e. a CCC $P(type)$, the corresponding category of attributes of $type$), then in SCH morphisms are of the form $name \to type$. For example, for each event $e_A^x$ in a process A, there is a CCC $P(e_A^x)$ as the attributes of that event. Morphisms in category DAT are of the form $value \to name$. Then, in the bottom-up direction (increasing abstraction) we have adjunctions of the form $\langle D, C^{op}, F, G \rangle$, i.e. $\langle DAT, SCH, F, G \rangle$ for functors $F : DAT \to SCH$, $G : SCH \to DAT$. Looking at the top-down direction (decreasing abstraction), that is $\langle C^{op}, D, G', F' \rangle$, i.e. $\langle SCH, DAT, G', F' \rangle$ for functors $F' : DAT \to SCH$, $G' : SCH \to DAT$, we have, similarly, morphisms $value \to name$ in DAT and morphisms of the form $name \to type$ in SCH. In every case, functors between categories are covariant in order to preserve isomorphisms between the left- and right-hand Cartesian closed isomorphic comma categories which participate in the adjunction.

The case of describing an adjunction following the bottom-up direction of the architecture is shown in Figure 6-47. For an object $value$ in DAT and an object $type$ in SCH, the object $name$ (in the form of object $G(type)$ in DAT and $F(value)$ in SCH) is used as a comparison measurement in the unit and counit of the adjunction.

Figure 6-47: The unit and counit of the adjunction between levels DAT and SCH (i)

or,

Figure 6-48: The unit and counit of the adjunction between levels DAT and SCH (ii)

In Figure 6-48, an object $value$ is examined in the context of an object $type$ through the adjunction $F \dashv G$. An object $value$, in practice, is associated with an object $name$ through a morphism $f$. An object $value$ is measured for integrity, consistency, semantics, etc., through functor $G$ applied on object $name$. Practically, an object $value$ is of a type $type$ of the next level. An arrow $g : name \to type$ is related to an arrow $f : value \to name$ in such a way that solves the equation $g = \varepsilon_{type} \circ Ff$. The same analysis is followed also in the case of constructions such as monads and the dual comonads, in order to explain how closure, in the application level of a distributed system, is achieved through natural transformations of endofunctors (participating in the construction of the underlying Cartesian closed isomorphic comma categories as well as their interaction wrapped in the three-cycles scheme, presented in §5.3.4).

The role of objects of the form $name$ can be explained in terms of the derived comma categories generated in the two levels, between each level pair. As can be seen, for consistency purposes, in a typed system, objects of the form $name_i^j$ ($name_1^1 = G(type_1)$, $name_1^2 = G(type_2)$, …) belong to a category NAMES (a preorder); objects of the form $value_i^j$ ($value_1^1, value_1^2, \ldots$ with $f : value_1^1 \to name_1^1$, $f' : value_1^2 \to name_1^1$, etc., and $value_2^1, value_2^2, \ldots$ with $f : value_2^1 \to name_1^2$, $f' : value_2^2 \to name_1^2$, etc.) belong to a category VALUES (its internal structure will be analyzed in the following pages); and objects of the form $type_j$ belong to a category TYPES (a preorder). Figure 6-49 illustrates that association. Following the same principles, it can be redesigned in order to express the association between objects $name$ and objects $type$ (in the form of the corresponding Cartesian closed comma categories) in more detail.

Figure 6-49: Association between objects of the form value, name and type

Figure 6-50 presents the case where an object $value$ is associated with more than one object $name$ (e.g. multiple inheritance in the object-oriented paradigm).

Figure 6-50: Multiple inheritance in terms of objects value and name

A question that needs to be answered is the following: if morphisms of the form $value \to name$ in the fourth level and $name \to type$ in the third level are constructed from objects of different levels, in which way can the unit and counit of an adjunction between the levels be described?

Figure 6-51: The unit and counit of the adjunction between levels DAT and SCH (iii)

In Figure 6-51, an arrow $g : name_1^1 \to type_1$ is related to an arrow $f : value_1^1 \to name_1^1$ in such a way that solves the equation $g = \varepsilon_{type_1} \circ Ff$. Thus, between the third and the fourth level there is a pair of adjoint functors $F \dashv G$ with $F name_1^1 = 1_{name_1^1}$ and $G type_1 = 1_{name_1^1}$ in order to ensure consistency in type changing (through natural isomorphisms). In addition, there are all the other adjoint pairs of functors (that belong to the functor category corresponding to that level pair) in order to measure qualitative and quantitative changes taking place in security components, including system resources. These functor categories are needed in order to describe the complexity of security activities in a distributed system (different types of vulnerabilities, threats, risks, attacks, security breaches etc.) as well as to evaluate the effectiveness of security measures and to balance the cost based on risk management procedures. Such higher-order security activities can be illustrated with the use of the Cube and the Lattice of Cubes.

Figure 6-52: A comma category as a pullback diagram – all the functors

Another question has to do with the nature of the comma categories constructed in the four levels of the architecture. For example, in the case of assigning different roles to a user in the context of access control, as described in the RBAC models presented in §3.3.2.4, we need to define the appropriate comma categories (slice and coslice categories) and the adjunctions which materialize their interactions in terms of process interaction for user authorization in order to have access to system resources. In the pullback diagram in Figure 6-52, the product functor $L$ and the inclusion functor $R$ compose a bifunctor. Functor $T$ can be the free functor $T : E \to C$ (e.g. covariant functor $O : CST \to SCH$) while $S$ can be the underlying functor (e.g. adjoint functor $I' : DAT \to SCH$). Thus, in the four-level architecture, the different schemas are defined in terms of constructs (meta-meta-data) and data, for example a security service as a result of a security policy and the implemented security mechanisms. The product category $E \times D$ provides all the possible combinations, for example between a security policy implemented by security mechanisms in the context of a security service. Other examples have been given for intrusion detection and parallel events in distributed programming (§6.10 & §6.6).

In Figure 6-52, objects of category C are of the form $Te$ ($e$ is an E-object) and $Sd$. For an object $\langle e, d \rangle$ in category $E \times D$, it is $HTe = HSd = \langle e, d \rangle$. More details about functors $M$, $K$ and $H$ can be found in Mac Lane [1998, p. 36] (product categories). For the projection functors $P$, $Q$, $P'$, $Q'$ we have $P'\langle e, d, f \rangle = P\langle e, d \rangle = e$ and $Q'\langle e, d, f \rangle = Q\langle e, d \rangle = d$. For functors $M$, $K$, $T$, $S$ we have $MTe = e$, $KSd = d$, $C_{d_0} f = Te$ and $C_{d_1} f = Sd$.

An example can be given in the field of threat analysis. Let us have categories vulnerabilities (for category E), threats (for category D) and risks (for category C). Suppose that risk management procedures have shown that for a vulnerability $v$ there is a potential risk $Tv$ and for a threat $t$ there is a potential risk $St$. From risk analysis it is known that a risk is materialized and becomes real (i.e. identified) when a specific threat takes advantage of a specific vulnerability, that is $Tv \to St$, in the comma category $(T \downarrow S)$.

In order to satisfy arrows of the form $value \to name$, that is, a collection of objects $value$ associated with an object $name$ in the fourth level (and finally with an object $type$ of the third level through arrows of the form $name \to type$), objects of category VALUES are associated with objects of the category NAMES in the context of a category VN through a comma category $(T \downarrow S)$, where $T = 1_{VALUES} : VALUES \to VN$ is the identity functor for category VALUES and $S : NAMES \to VN$ is the selection functor for category NAMES (e.g. $S name_1^1$), as can be seen in Figure 6-53. Thus, objects of category VN are objects of category NAMES, projected to category VN using the selection functor $S$.
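
The threat-analysis example above can be sketched in Python under a strong simplifying assumption that is not made in the thesis: the category of risks is taken to be a preorder (at most one arrow between two risks, here ordered by a hypothetical severity level), so that an object of $(T \downarrow S)$ is fully described by a pair $\langle v, t \rangle$ for which an arrow $Tv \to St$ exists. All names and levels below are illustrative.

```python
from itertools import product

# Hypothetical preorder of risks: an arrow r1 -> r2 exists iff level(r1) <= level(r2).
RISK_LEVEL = {"low": 0, "medium": 1, "high": 2}

# Hypothetical object maps of the functors T : vulnerabilities -> risks
# and S : threats -> risks.
T = {"unpatched_os": "medium", "default_password": "low"}
S = {"worm": "medium", "insider": "high", "script_kiddie": "low"}


def has_arrow(r1, r2):
    """Arrow r1 -> r2 in the (assumed) preorder category of risks."""
    return RISK_LEVEL[r1] <= RISK_LEVEL[r2]


def comma_objects(t_map, s_map):
    """Objects <v, t> of (T | S): pairs for which an arrow Tv -> St exists."""
    return [(v, t) for v, t in product(t_map, s_map)
            if has_arrow(t_map[v], s_map[t])]


ALL_PAIRS = list(product(T, S))          # the product category: every combination
IDENTIFIED = comma_objects(T, S)         # the comma category: identified risks only
```

With this data the product contains six vulnerability/threat pairs while the comma category keeps five of them, which mirrors the way the global product is constrained to the local comma category.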

Figure 6-53: Adjoint functors in a pullback diagram in level DAT

For example, users (system components such as client and server processes) can be distinguished based on specified characteristics (of the type $name$, in the fourth level) using the identity functor $T$ (based on the specific characteristics of objects of the form $value$, e.g. a user_dimitris) and assigned to a specific object $name$ (e.g. a user_instance) using the selection functor $S$. In this way, we can explain examples such as those given in §6.9 and §6.10.

The next question has to do with the nature of functors $M$ and $K$. In Figure 6-53, functor $M$ is a right adjoint to the identity functor $T$, while functor $K$ is a right adjoint to the selection functor $S$ (Figure 6-54). Thus we preserve type changing in the fourth level (that is based on arrows of the form $value \to name$) and similarly in every level of the four-level architecture, as can be seen from the pairs of isomorphic comma categories $(VALUES \downarrow M)$, $(T \downarrow VN)$ with $T = 1_{VALUES}$ and $(NAMES \downarrow K)$, $(S \downarrow VN)$ for objects $r$ of category VN (Figure 6-55). In other words, $S$ is a two-sided inverse of $K$.

Figure 6-54: Isomorphic categories between VALUES and VN in level DAT

Figure 6-55: Isomorphic categories NAMES and VN in level DAT

In Figure 6-56, the unit and the counit of the adjunction $T \dashv M$ are illustrated. An object $v$ in category VALUES is recognized as being of the mediate type object $r$ in category VN. An arrow $g : Tv \to r$ is related to an arrow $f : v \to Mr$ in such a way that solves the equation $g = \varepsilon_r \circ Tf$. The projection functors $P$ and $Q$ ensure the consistency of this association: $P\langle v, r \rangle = v$ and $Q\langle v, r \rangle = r$, in the isomorphic comma categories $(VALUES \downarrow M)$ and $(T \downarrow VN)$.

In order to define the nature of objects $Mr$, $MTv$, $Tv$, $TMr$ we should examine the internal structure of the comma category $(T \downarrow VN)$. The analysis shows that it consists of a collection of subcategories, each one of them corresponding to an object $r$, which in turn comes from an object $n$ of category NAMES. In a similar way, we can represent the second pair of isomorphic comma categories in the fourth level, derived from the adjunction $S \dashv K$, that is categories $(NAMES \downarrow K)$ and $(S \downarrow VN)$.

Figure 6-56: Objects of $VALUES \times VN$ projected in the isomorphic comma categories

The objects of category VN are of the form $r$, of types $Tv$, where $v$ is an object of category VALUES, and $Sn$, where $n$ is an object of category NAMES. In Figure 6-57, which is based on Figure 5-37 that gives the general form of representing a comma category as a pullback, objects $Hr$ in the product category $VALUES \times NAMES$ are either of the form $HTv = \langle v, n \rangle$ or $HSn = \langle v, n \rangle$. Also, it is $PHTv = P\langle v, n \rangle = v$ and $QHSn = Q\langle v, n \rangle = n$. Thus, functors $M$ and $K$ are forgetful functors, operating as right adjoints to functors $T$ and $S$ of the Cartesian closed comma category $(T \downarrow S)$. Through functor $K$, an object of category VN is associated to an object of category NAMES. The nature of the forgetful functor $M : VN \to VALUES$ is illustrated in Figure 6-58. As can be seen, functor $M$ determines representing objects of the form $Mr$ in category VALUES, each one as a cone in the comma category $(VALUES \downarrow M)$, thus defining a collection of subcategories (that is, $M$ is a selection functor for objects $r$ of category VN, each one corresponding to objects $n$ in category NAMES).

Figure 6-57: A bifunctor in the pullback diagram of a comma category between VALUES and NAMES

Figure 6-58: Representing objects in the isomorphic comma categories

In order to represent all the complex associations between security components in a distributed system, it is necessary to define pairs of adjoint functors which constitute the functor category, between each level pair and inside levels, in the form of 2-cells and 3-cells interacting in monad/comonad constructions.

Figure 6-59: The comma category in the case of visualizing a communication channel between two processes

For two processes A and B (single-thread processes or threads from the lists of the parallel threads of two underlying processes), communication via a channel X is presented in Figure 6-59. Functors $F$ and $G$ pick up the send and receive events taking place in processes A and B, respectively, in the exchange of messages. The Cartesian closed comma category $(T \downarrow S)$ describes all those pairs of send/receive events for A and B in which send events refer to A and receive events refer to B. For example, process A sends a message $m$ (by issuing a $send(m)$ event) and subsequently process B receives this message (by issuing a $receive(m)$ event). Two special events in setting up and using a channel in this way are the initial $open()$, which starts a session, and the terminal $close()$, which terminates the session in that channel. We should mention that the exact number of messages exchanged between the processes via a communication channel is not known at the beginning of the session (the existence of parallel threads increases that number, although it can be anticipated using probabilistic methods). Thus, the generation of new events can be accomplished through recursion operations (i.e. $A = \{...\}.A$) using repetition that is based on the free variables of a monad construction implemented as a closure (similar operations can be used also for replication, i.e. $!A$, as for example in process group communication).

Communication channel X can also be used for exchanging messages in the opposite direction, that is, process B sends a message and process A receives that message (issuing every time the appropriate send/receive events). Local histories in processes A and B can include any number of events before, after and between send and receive events (e.g. $e_A \to send(m) \to e'_A \to send(m') \to e''_A \to send(m'')$ or $e_B \to receive(m) \to e'_B \to receive(m') \to e''_B \to receive(m'')$). The symbol $\to$ is used to express the happened-before ordering of events.

The execution of a distributed system is characterized as a series of transitions between global states of the system, that is, initial prefixes of the individual process local histories, as $S_0 \to S_1 \to S_2 \to \ldots$. That means that the evolution of the distributed system consists of a series of natural transformations from one global state to the next. The global history of a distributed system P is the union of the individual histories. A global state $S_j = \{ s_i^j \}$, where $s_i^j$ is the state of a process $p_i$, corresponds to initial prefixes of the individual process histories. Let us have $s_i^j$, $s_i^{j+1}$, two consecutive states for a process $p_i$ (0-cell). Then the transition from state $s_i^j$ to the next one $s_i^{j+1}$ (both defined as 1-cells $s_i^j, s_i^{j+1} : p_i \to p_i$) can be expressed as a natural transformation (i.e. a 2-cell) $l_j : s_i^j \Rightarrow s_i^{j+1} : p_i \to p_i$. States $s_i^j$ and $s_i^{j+1}$ are defined as Cartesian closed monads (i.e. both are endofunctors) for the associated adjoint pairs of functors in the functor category of each level pair (following a top-down or a bottom-up view).

A state $S_{j+1}$ is reachable from a state $S_j$ if there is a linearization $L_j$ that passes through $S_j$ and $S_{j+1}$, which is defined as a 2-cell $L_j : S_j \Rightarrow S_{j+1}$. All linearizations pass through consistent global states. Thus, the transition between two consecutive linearizations is defined as a modification of the system global state (i.e. a 3-cell) $M_j : L_j \Rrightarrow L_{j+1} : p_i \to p_i$. In Figure 5-18, cases (i) and (ii) are defined in terms of a totally ordered Cartesian closed comma category $(e \downarrow A)$. A finite prefix of the process local history until a specified event $e$, with $e' \to e'' \to e$, can be given by a Cartesian closed comma category $(A \downarrow e)$, that is A-objects over $e$ (i.e. all events in process A prior to $e$). Case (iii) is presented in detail in Figure 5-19. In that figure, partially ordered and parallel events, explained in §6.6, can be defined as illustrated in Figure 6-59.
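
The notion of a consistent global state used above can be made concrete with a small Python sketch. It uses the standard operational reading from distributed systems rather than the categorical formulation: a global state is a choice of prefix for each local history, and it is consistent when no included receive event lacks its matching send. The data layout (prefix lengths, message tuples) and all names are assumptions made for illustration.

```python
def is_consistent_cut(histories, cut, messages):
    """Check whether a global state (one history prefix per process) is consistent.

    histories: dict mapping a process name to its list of local events
    cut:       dict mapping a process name to the length of the chosen prefix
    messages:  list of (sender, send_index, receiver, receive_index) tuples
    """
    for process, events in histories.items():
        if not 0 <= cut[process] <= len(events):
            raise ValueError(f"invalid prefix length for process {process}")
    for sender, send_idx, receiver, recv_idx in messages:
        received = recv_idx < cut[receiver]
        sent = send_idx < cut[sender]
        if received and not sent:
            # A receive without its send violates the happened-before ordering.
            return False
    return True


# Hypothetical local histories for the two processes of the channel example.
HISTORIES = {
    "A": ["e_A", "send(m)", "e_A'"],
    "B": ["e_B", "receive(m)", "e_B'"],
}
MESSAGES = [("A", 1, "B", 1)]  # send(m) is event 1 of A, receive(m) is event 1 of B
```

In this sketch the cut `{"A": 1, "B": 2}` is rejected because it contains `receive(m)` without `send(m)`, whereas `{"A": 2, "B": 1}` is accepted: the message is merely in transit.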

6.12 Functional dependencies in system components of the architecture – LCCC and 3NF Good practice in software engineering is following the LCCC approach. The family of pullbacks such as the ones introduced here can be related to the normal forms of software engineering (1NF, 2NF and 3NF) and the concept of functional dependencies. Indeed we show that LCCC correspond precisely to the industry-strength standard of 3NF for data design, therefore justifying the choice of LCCC as the underlying structures in our architecture. 204

1NF disallows „relations within relations‟, or „relations as attributes or tuples‟. That means that the value of any attribute e of a system object value in the associated CCC P(value) of a relation (or an object-type) P(type) of a type type must be a single value

from the domain of that attribute (requirement of atomicity). With the acceptance of nested relations in the relational model and in class models, this requirement is considered optional depending on the target implementation. A relation that is no better than 1NF does have the property of connectivity: all attributes are connected (although maybe not optimally), which is a starting requirement for a Cartesian closed category. The relation P(type) is in 2NF if every non-key attribute is fully functionally dependent on the primary key of P(type) e.g. a  b  c  d . If not, and there are any dependencies on part of the key, then the relation is in 1NF. A functional dependency k  e in P(type) (where k refers to the Cartesian product of the attributes of the primary key) is a transitive dependency if there is a set of non-key attributes z that is neither a candidate key nor a subset of any key in P(type) , and both k  z and z  e hold. In this case, a new relation is created having as primary key the

Cartesian product of the attributes in z (e.g. if z  {e, e} , then a new type type is created along with the associated CCC P(type) which has as the primary key the product e  e with the fully functional dependency m : e e e  e for the non-key attribute e of

P(type) ). Both, now, P(type) and P(type) are in 3NF (and are LCCC, as will be shown later). In other words, a relation is in 3NF if non-key attributes are dependent on the key, the whole key and nothing but the key. Relations that are in 1NF or 2NF but not in 3NF suffer from difficulties in updating with regards to consistency, giving rise to storage anomalies. Hence 3NF is the industry standard for information systems in the design of databases and class models. The interest in underpinning the work in this dissertation is how 3NF relates to the LCCC employed as the basis of the four-level architecture. For relations the relevant LCCC construction is the pullback/pushout diagram (intension/extension). A pullback involves a binary relationship between attributes a and b in the context of a  b (and generally of an object x ) written as  a, b (Figure 6-60), with the arrows

p1 : ⟨a, b⟩ → a, p2 : ⟨a, b⟩ → b, i1 : a → a + b and i2 : b → a + b (a pushout is the dual construction where a and b are disjoint). Then the inner square in Figure 6-60 commutes in that i1 ∘ p1 = i2 ∘ p2 (something that makes the pullback an equalizer of the upper and lower paths). There is nothing assumed about the typing of the arrows in the pullback [Kelly, 1967]: the arrows can be defined as monic or epic but there is no necessity for such further specification. Additional arrows in a relation which affect the validity of the corresponding pullback are precisely those identified as undesirable in the normalization process. For example, there may be an additional functional dependency, outside the relationship, from part of the key to a non-key attribute, as can be the case for 1NF. The choice of LCCC as the basic building block of the four-level architecture is justified as the entities defined conform to the software engineering standard of 3NF. From the process perspective, software engineering has also adopted practices that approach those of LCCC in category theory. For instance the ideal of high cohesion, meaning that there is a single purpose to each process, ensures that everything is related in a process and the mixing of unrelated data and operations is avoided. This corresponds to one of the principles of Cartesian closed categories, that of connectivity (the exponential). The ideal of loose coupling, whereby processes are entered or initiated only through the official interface, also relates to Cartesian closed categories through the emphasis on an initial object in entrance to such categories. Process interaction in distributed transactions as closed operations in well-defined interfaces can be described using the monad/comonad constructions, in the way that was presented in this chapter as well as in previous [Rossiter et al. 2006] and new complementary work on monads [Rossiter et al. 2010].
Of course the underlying drive in software engineering is to represent the real world effectively by mapping logical structures onto physical ones. It is the physical world that ultimately drives tasks such as normalization. Mathematical structures that match the physical world are both desirable and practicable. In this sense LCCC are a fair representation of logical structures that handle effectively the physical world. In the general case where there is a relation R normalized in 3NF, where a ×_{a+b+Σxi} b is the Cartesian product (universal object) and a + b + Σxi the coproduct (co-universal object), with X = {x1, x2, ..., xn} the set of mutually independent disjoint non-key attributes xi and Σxi = x1 + x2 + ... + xn, then R can be expressed categorically as a LCCC represented by a pullback/pushout diagram as in Figure 6-60. In that diagram, it can be seen why a relation R in 3NF is LCCC: all records in R, in other words all instances (i.e. slice categories), such as ⟨a1, b1, x1¹, ..., xn¹⟩, ⟨a2, b2, x1², ..., xn²⟩, etc., are CCC. The initial object is the primary key and the terminal object is the dual coproduct based on the validity of the values of the attributes of R.

a q1 a a be b

p1

u

i1

 a, b

i2

p2

q2

i1

ab

l

q1

f

g

b

u

a  b  xi

ab

q2

i2

r

xi Figure 6-60: A relation in 3NF – the general case

Figure 6-61: A composite key a × b × c × d for a relation R

In the case that the primary key is composite, e.g. defined by four attributes a, b, c, d as the Cartesian product a × b × c × d, the relevant pullbacks are presented in Figure 6-61. The following examples explain the efficacy of the integrated pullback/pushout diagram for defining functional dependencies. Let us have a relation R = {a, b, e} where the domains of attributes are a = {1, 2, 3}, b = {11, 12, 13} and e = {40, 50, 60}. Then the Cartesian product a × b includes all the possible combinations of values of a and b (i.e. the limit as a universal construction). In order to define the Cartesian product in the context of the coproduct, in other words to enforce dependencies between combinations of values of a and b (i.e. subsets of the Cartesian product) and values of e so as to give the semantics of the relation, we create products and coproducts between meaningful combinations of values of a and b. This procedure is followed as we want to represent real world entities as meaningful components (i.e. entities) of an information system. Thus, we design the diagram in Figure 6-62 to represent a full functional dependency r, e.g. of the value of the primary key ⟨1, 11⟩ to the value of the non-key attribute e = 40, in the context of the coproduct 1 + 11 + 40. The pullback and pushout triples ((a ×_{a+b+e} b), q1, q2) and (a + b + e, q1′, q2′), respectively, which provide all the possible combinations of values for the primary key and the non-key attributes (i.e. the limit and colimit), make arrows u and u′ unique. Indeed, they are the universal arrows in the relation R, as a CCC, normalized in 3NF and represented in a pullback/pushout diagram. In the four-level architecture, a ×_{a+b+e} b and a + b + e are the initial and terminal objects, respectively, of category P(type), while ⟨1, 11⟩ and 1 + 11 + 40 are the initial and terminal objects of the category P(value). The instances ⟨1, 11, 40⟩ are also objects of this category, which are the values of a record/object value in the fourth level, of a type type of the third level. For values ⟨2, 12, 50⟩ of another record/object value in the fourth level, the value of the primary key ⟨2, 12⟩ determines the value of the non-key attribute e = 50, in the context of the coproduct 2 + 12 + 50, through the fully functional dependency r.

Figure 6-62: A relation R in 3NF
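The example relation R = {a, b, e} can be mirrored in a few lines of Python. This is only a minimal sketch under stated assumptions: the names `key_space`, `r`, `well_typed` and `records` are hypothetical, and representing the dependency r as a dictionary from key values to values of e is an illustrative choice, not the categorical construction itself. A dictionary maps each key to exactly one value, so the full functional dependency of e on the key ⟨a, b⟩ holds by construction.

```python
from itertools import product

# Attribute domains as given in the text.
dom_a = (1, 2, 3)
dom_b = (11, 12, 13)
dom_e = (40, 50, 60)

# The Cartesian product a x b: every possible value of the primary key.
key_space = set(product(dom_a, dom_b))

# Hypothetical encoding of the dependency r: only the "meaningful"
# key values are present, each determining exactly one value of e.
r = {(1, 11): 40, (2, 12): 50}

# The relation is a subset of the product, typed by the declared domains.
well_typed = set(r) <= key_space and set(r.values()) <= set(dom_e)

# The records of R, e.g. <1, 11, 40> and <2, 12, 50>.
records = [(a, b, e) for (a, b), e in r.items()]
```

Here `key_space` plays the role of the universal product of key values, while `r` selects the subset of it that carries the semantics of the relation.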

Let us now have a relation R′ = {a, b, e, e′} where the domains of attributes a, b, e are those defined above and the domain of the non-key attribute e′ is {90, 95, 100}. Assume that there is a functional dependency r′ : e → e′ between the non-key attributes (e.g. there is a dependency between the values 40 and 90 as 40 ↦ 90), that is, e determines e′; we then have the injection i1 : 1 + 11 + 40 → 1 + 11 + 40 + 90. From r′ and r : ⟨1, 11⟩ → 40, we have the transitive dependency r″ : ⟨1, 11⟩ → 90, which means that the relation R′ is only in 2NF, as shown in Figure 6-63.

1 q1 a a bee b

u

p1

q1 i1

l

1,11

f 1  11

p2

i2

g

i1

f 1  11  40 g

q2 r 

r r

11 40

i2

i1

1  11  40  90

u

q2

i2

90 Figure 6-63: A relation

R in 2NF

209

a  b  e  e

Let us now look at the diagram in Figure 6-63. Composition of the injections i1 : 1 → 1 + 11 and i1 : 1 + 11 → 1 + 11 + 40, as well as of i2 : 11 → 1 + 11 and the same injection i1, is valid, producing the arrows f and g, respectively, with assignments such as 1 ↦ 1 + 11 + 40 and 11 ↦ 1 + 11 + 40. That makes the value of the primary key ⟨1, 11⟩ the pullback of 1 + 11 + 40, something that holds, since the primary key determines the value of the non-key attribute e. For tuples of the form ⟨a, b, e, e′⟩, e.g. ⟨1, 11, 40, 90⟩, we notice that the context of the coproduct has changed. In this case, composition of f and the injection i1 : 1 + 11 + 40 → 1 + 11 + 40 + 90, as well as of g and the same injection, is valid, producing the arrows f′ and g′, respectively, with assignments such as 1 ↦ 1 + 11 + 40 + 90 and 11 ↦ 1 + 11 + 40 + 90. Again, that makes the value of the primary key ⟨1, 11⟩ (the initial object) the pullback of 1 + 11 + 40 + 90 (the terminal object), since we have the transitive dependency r″ = r′ ∘ r, from the primary key (e.g. with value ⟨1, 11⟩) to a non-key attribute e′ (e.g. with value 90) through the functional dependency e → e′ (with the assignment 40 ↦ 90). For another tuple ⟨2, 12, 50, 95⟩, assignments such as 2 ↦ 2 + 12 + 50 + 95 and 12 ↦ 2 + 12 + 50 + 95 make the value of the primary key ⟨2, 12⟩ (the initial object) the pullback of 2 + 12 + 50 + 95 (the terminal object) through the transitive dependency r″. Dependencies such as r′ : e → e′ would prevent the relation being in 3NF as the non-key attributes are not mutually independent. A relation that is in 2NF but not in 3NF may not be represented as a single pullback/pushout diagram. The solution, in an RDBMS as well as in the proposed four-level architecture, is the same: decompose the structure (i.e. the relation R′, which corresponds for example to P(type)) into two relations R, R″ (or pullbacks), with the latter pasted together with e as the linking object (Figures 6-62 and 6-64).

q1 e ee e

u

p1 l

 40, 40

q2

r

40

i1

i2

p2 40

q1

f

i1

u

40  90

g

q2

i2 Figure 6-64: The relation

90

e  e

R in 3NF

210
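The decomposition described above can be sketched on the example tuples. This is a hedged illustration, assuming a list-of-dictionaries encoding of relations; the helper names `holds`, `project` and `natural_join`, and the column name `e2` standing for e′, are hypothetical. The check is made on instance data only, whereas functional dependencies are properly statements about the design of the relation.

```python
def holds(rows, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds in these rows."""
    seen = {}
    for row in rows:
        left = tuple(row[c] for c in lhs)
        right = tuple(row[c] for c in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True


def project(rows, cols):
    """Project rows onto cols, dropping duplicate records."""
    out = []
    for row in rows:
        record = {c: row[c] for c in cols}
        if record not in out:
            out.append(record)
    return out


def natural_join(left, right, on):
    """Paste two relations together on the linking attributes."""
    return [{**l, **r} for l in left for r in right
            if all(l[c] == r[c] for c in on)]


# The 2NF relation R' = {a, b, e, e'} with the tuples used in the text.
R_prime = [
    {"a": 1, "b": 11, "e": 40, "e2": 90},
    {"a": 2, "b": 12, "e": 50, "e2": 95},
]

# key -> e and e -> e' together give the transitive dependency key -> e'.
transitive = holds(R_prime, ["a", "b"], ["e"]) and holds(R_prime, ["e"], ["e2"])

# Decompose into R = {a, b, e} and R'' = {e, e'}, linked through e.
R = project(R_prime, ["a", "b", "e"])
R_linked = project(R_prime, ["e", "e2"])

# Pasting the two parts back together on e recovers the original tuples.
rejoined = natural_join(R, R_linked, ["e"])
```

The join on `e` corresponds to pasting the two diagrams together with e as the linking object; on this data no tuple is lost or invented.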

In conclusion, we can say that if for a relation there is a fully functional dependency between the primary key and each non-key attribute, then the relation is at least in 2NF. A valid single pullback diagram can however only be constructed in such a case if the non-key attributes are mutually independent, that is, the relation is in 3NF. Pasted pullback diagrams are used for representing the decomposition of relations in much the same way as in relational design. If a relation is not in 2NF, there are functional dependencies from part of the key to a non-key attribute. Such a functional dependency (an extra arrow, for example y : b → e with the assignment 11 ↦ 40 or 12 ↦ 50) would mean that the pullback is not an equalizer for the composed arrows i1 ∘ p1, i2 ∘ p2 : a ×_{a+b+e} b → a + b + e and the diagram is no longer a strong pullback. It is possible, however, with limited amounts of data, to construct a (weak) pullback extension for a relation which fails the 2NF test. This occurs for instance when p2 is epic, where we can have the same result in the coproduct, e.g. x (the + side), following the upper and lower paths of the inner square. Therefore, a relation R defined in a typed system at least in 3NF is Cartesian closed and actually is a LCCC, as all the slice categories (i.e. records) are CCC. In terms of the four-level architecture, a user_instance (or an object, in the o-o paradigm) Alice_record (or Alice_object) of a type user (or a class user, respectively) is associated with a CCC P(Alice_record) (or P(Alice_object)). All such instances are CCC, thus making the categories P(user) and P(Alice_record) (or P(Alice_object)) LCCC. In a database management system, databases are created according to the activities and relationships in which we are interested. It is possible that there are more activities taking place between the identified entities, but the domain of interest is always the crucial factor.
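The three cases summarized above (a dependency on part of the key, a dependency between non-key attributes, and neither) can be told apart mechanically. The following is a rough sketch, assuming that the key already determines every non-key attribute and that dependencies are read off the instance data; the function names, the `table` helper and the sample rows are hypothetical, chosen from the domains used in the examples.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds in these rows."""
    seen = {}
    for row in rows:
        left = tuple(row[c] for c in lhs)
        right = tuple(row[c] for c in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True


def normal_form(rows, key, non_key):
    """Classify instance data as '1NF', '2NF' or '3NF'."""
    # A dependency from a proper part of the key (the extra arrow y : b -> e).
    for size in range(1, len(key)):
        for part in combinations(key, size):
            if any(holds(rows, part, [attr]) for attr in non_key):
                return "1NF"
    # A dependency between non-key attributes (the arrow r' : e -> e').
    for x in non_key:
        for y in non_key:
            if x != y and holds(rows, [x], [y]):
                return "2NF"
    return "3NF"


def table(cols, tuples):
    """Build a list of records from column names and value tuples."""
    return [dict(zip(cols, t)) for t in tuples]


partial = table(("a", "b", "e"), [(1, 11, 40), (2, 11, 40), (1, 12, 50)])
chained = table(("a", "b", "e", "e2"),
                [(1, 11, 40, 90), (1, 12, 50, 95), (2, 11, 60, 100), (2, 12, 40, 90)])
clean = table(("a", "b", "e"), [(1, 11, 40), (1, 12, 50), (2, 11, 60)])

results = [
    normal_form(partial, ("a", "b"), ("e",)),
    normal_form(chained, ("a", "b"), ("e", "e2")),
    normal_form(clean, ("a", "b"), ("e",)),
]
```

A dependency that holds in a small sample may be accidental, so a result from such a check is a hint about the design rather than a proof.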

6.13 Summary

Processes, channels, participants, services, activities and the actual data transmitted in process interaction or those referred to system state changes, in the four-level architecture, are treated as instances of an abstract type process, in the way that was described in this chapter. The full interaction and connectivity between system components is given by the functor categories between comma categories involved in the distributed system evolution, focused on those processes that deal with security issues.

Security services and activities are explained in terms of core processes of the distributed system. Examples are given, in the context of a holistic approach for handling security in local computational environments that form the distributed systems using applied category theory. The internal processing in distributed transactions via secure communication channels is explained using monads/comonads constructions. The step-by-step computation and communication processing is described using the Cube and the Lattice of Cubes. Moreover, we have shown that LCCC correspond precisely to the industry-strength standard of 3NF for data design, therefore justifying the choice of LCCC as the underlying structures in our architecture.

7 Discussion of the results and future work

The proposed architecture, as it was developed and deployed in the current research in the context of system resources management and system security management in particular, provides the means to organize in a holistic way the protection measures needed to be taken in a distributed system in order to ensure secure transparent distributed computations and to enhance the availability of the system's services in complex and interoperable environments. Processes and channels should be secured and resources should be protected in order to achieve data sharing transparently between system components. A comprehensive analysis of the literature shows that security for distributed systems is not a local feature but has to be treated globally. It is based on higher-order activities which are related to issues such as data integrity and interoperability among complex heterogeneous systems that share data coherently and transparently. Holistic security involves both network engineering and application development. Perimeter network defence based primarily on the use of internal and external firewalls cannot guarantee an adequate protection from potential security breaches. Butler [2007] defined holistic security in terms of an integration of proactive threat mitigation, vulnerability assessment, and the protection of assets against intentional destruction. IBM's Identity Management Services (IDMS), which is a collection of software applications and safeguards that are designed to protect identity systems and customer information databases, describes a holistic security framework under the umbrella of IBM's Information Security Framework (ISF) [IBM, 2006]. A holistic security management framework for e-commerce has been presented recently by Zuccato [2007], in which process activities are hierarchically organized and conducted iteratively.
The need for developing a holistic security approach is also reflected in the latest OECD guidelines towards a holistic security system reform and guidance [2005]. Holistic security was the primary objective of a project that investigated the consequences for the internal and external security of the European Union of the distinctive fragmented governance structures used by the Union in order to define a holistic security policy framework [Winn, 2005].


The analysis of the current security approaches, like baseline approaches or risk management, has shown that they are characterized by their locality. Baseline approaches as well as security standards are top-down while risk management is bottom-up. The latest attempts on a holistic basis are either platform-dependent (e.g. IBM's ISF), cover a specific restricted environment (e.g. e-commerce transactions, as for example Zuccato's security framework, or government structures in the European Union) or include some of the characteristics of the proposed architecture (e.g. Butler's holistic security framework). As indicated in Chapter 1, a complete, holistic security strategy needs to be layered to deal with high-level aspects such as continuity strategies (threat assessment, risk evaluation & control), security policies, incident response plan, host-based & network-based perimeter and/or perimeterless detection, auditing procedures, fault tolerance and recovery strategies, anti-malware control (intrusion detection, router and firewall security, anti-virus control) as well as legal and regulatory compliance (Figure 1-1). Assuring optimal security of an information system is not a trivial task, as it requires a wide variety of expertise from technological to organizational. Research has shown that there is no concrete method to ensure that all possible attacks and loopholes are excluded while a secure system is designed, as the final result will be based on the best available state-of-the-art standards. A good practice while designing or evaluating a security framework is to assume the worst. The current thesis presents an anticipatory framework, as it is unlikely that an organization will have a clear understanding of the probability that an event will occur.
It is an attempt to deal with issues such as process communication (processes, channels, resources), security mechanisms (in application, network and host level), security services (as complete solutions, not as patches or piecemeal countermeasures), and security policies (independent of the implemented security mechanisms and based on security requirements defined in standards). The thesis is an attempt to define a global security framework for interoperability, with natural closure on the top level, based on categorical logic (i.e. the study of logic with the help of categorical means). According to Cadish & Diskin [1995], the categorical logic theories are graphs with special labelling, i.e. sketches. Graphs have been widely used in computing, for example in the study of databases, programming languages, computer networks,

and operating systems [Reynolds, 1980], [Pitt et al., 1985], [Barr and Wells, 1999]. Power and Tourlas [2001] have used hypergraphs and higraphs to provide an algebraic foundation for some of the graph-based structures in computing. Schweimeir [2001] presented a categorical and graphical model for typed concurrent programming languages such as CCS. Schweimeir's model makes use of the interaction categories (monoidal categories) first proposed by Abramsky et al. [1996], where objects are types and arrows are processes and which attempts to express the semantics of language using closure conversion. Diaconescu [2000] presented category-based constraint semantics for constraint logic programming, as a special case of category-based equational logic. Categorical adjunctions and exponentiation were used by Kara & Asem [2003] to represent public-key cryptography based on the RSA algorithm. Froschle and Lasota [2006] use categorical logic in order to represent adjunctions between causal trees and event trees, through a mediating model of event trees. From an application viewpoint, a useful view, for example, of an adjunction is that of insertion in a constrained environment. The unit η can be thought of as quantitative creation, the counit ε as qualitative validation. There is then a relationship between the left and right adjoints such that η represents quantitative identification and ε qualitative identification. Examples of left adjoints are enrichments such as taking a set to a group, a set to a preorder, a collection of record keys to hashed addresses or a graph to a category. The corresponding right adjoints qualitatively identify the enrichment, ensuring that a number of type restrictions are satisfied. In this research, new categories have not been invented. Applied category theory has been used based on higher order logic, reflected in the processing in the three level-pairs and the internal processing in each of the four levels of the proposed architecture.
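The roles of the unit η and counit ε can be made concrete with a small sketch. The text lists "a set to a group" among its examples of left adjoints; the code below uses the closely related free monoid on a set (finite lists under concatenation) instead, because it is easier to write down. That substitution, and the names `unit`, `extend`, `counit` and `length_hom`, are assumptions made for illustration only.

```python
from functools import reduce


def unit(x):
    """eta: insert an element of the set into the free monoid as a one-element list."""
    return [x]


def extend(f, op, identity):
    """Extend f : X -> M to the monoid homomorphism from lists over X to (M, op, identity)."""
    def hom(xs):
        return reduce(op, (f(x) for x in xs), identity)
    return hom


def counit(op, identity):
    """epsilon at the monoid M: multiply out a list of elements of M."""
    return extend(lambda m: m, op, identity)


# Example: every record key counts as 1 in the monoid (int, +, 0),
# so the induced homomorphism computes the length of a list of keys.
length_hom = extend(lambda _key: 1, lambda m, n: m + n, 0)

total = counit(lambda m, n: m + n, 0)([1, 2, 3])
```

The unit creates structure freely, and `extend` shows the constraint: the homomorphism it builds agrees with `f` on one-element lists and respects concatenation, which is the universal property behind the adjunction.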
Higher-order functions extend the scope and the usability of applied category theory beyond the locality of first-order predicate calculus. Set theory has been proved to work well for local systems but its global character is in question. It has been widely used in database systems. Its efficacy is poor though on distributed databases, especially in knowledge-integration systems based on user requirements. Johnson and Rosebrugh [2000], for example, attempted to define a semantic data-modelling paradigm based on category

theory for achieving database interoperability and database federation through designated views. Their approach though is based on categorification; the views are a direct translation of existing set-based techniques using the SET category, rather than a selection of optimal categorical structures to meet the requirements. The current thesis works towards a non-set-theoretic approach. It uses applied category theory based on higher-order logic to express complex security activities. Functors (i.e. morphisms between categories) are another example of higher-order functions. In this research, functors have been widely used in natural transformations, adjunctions, monads/comonads, comma categories, mappings and modifications in higher-order categories. The use of adjoint functors between the level pairs aims to achieve an assurance that information risks and security controls are in balance by applying technical solutions that are in compliance with system security strategies. Composition of covariant/contravariant functors, 2-functors, endofunctors and ordinary functors (as 1-cells between object categories) through a system of adjunctions aims to provide a global security framework for interoperability across distributed information systems. Furthermore, product functors and bifunctors have been used for expressing universal constructions such as limits/colimits, products/coproducts and pullbacks/pushouts. Limits and products express global security (intension) while pullbacks, using comma categories, express local security computational domains (extensions). The full interaction and connectivity between system components is given by the functor categories between comma categories involved in the distributed system evolution with a focus on those processes that deal with security issues.
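What "higher-order" means in practice can be shown with a short sketch: functions that take or return functions, the list construction as a functor, and a natural transformation between functors. Everything here is an assumed illustration in ordinary Python; the names `curry`, `uncurry`, `compose`, `fmap` and `safe_head` are hypothetical and are not constructions taken from the architecture.

```python
def curry(f):
    """Turn f(a, b) into a function of a that returns a function of b."""
    return lambda a: lambda b: f(a, b)


def uncurry(g):
    """Inverse of curry: turn g(a)(b) back into a two-argument function."""
    return lambda a, b: g(a)(b)


def compose(g, f):
    """Composition of arrows: first f, then g."""
    return lambda x: g(f(x))


def fmap(f, xs):
    """Action of the list functor on an arrow f."""
    return [f(x) for x in xs]


def safe_head(xs):
    """A natural transformation from lists to lists of length at most one."""
    return xs[:1]


def add(a, b):
    return a + b


increment = curry(add)(1)
double = lambda n: 2 * n
values = [1, 2, 3]

# Functor law: mapping a composite equals composing the maps.
functor_law = fmap(compose(double, increment), values) == fmap(double, fmap(increment, values))

# Naturality: taking the head commutes with mapping.
naturality = safe_head(fmap(double, values)) == fmap(double, safe_head(values))
```

`curry` and `uncurry` witness the exponential of a Cartesian closed category in miniature, while the two boolean checks are instances of the functor and naturality laws on one sample input.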
These functor categories are needed in order to describe the complexity in security activities in a distributed system (different types of vulnerabilities, threats, risk, attacks, security breaches etc.) as well as to evaluate the effectiveness of security measures and to balance the cost based on risk management procedures. There are two functor categories between each level-pair (top-down / bottom-up direction). A higher-order activity (e.g. a security service) is represented as two adjoint pairs of functors, between each level-pair (for each pair, we have the actual service and its impact: two pairs of adjoint functors for evaluation of ongoing processes in a holistic way). As the distributed information system represents reality using good software

engineering practice, it is "live". That means that when something is happening, we should check for the next instance (i.e. view) of the system. In software engineering terms, such an activity is represented by a linearization, which describes the transition from a consistent global state of the system to a new one, also consistent. The system evolves (even when it seems that nothing is happening), thus we need to describe this ongoing process. In other words, we need to know what exists between two linearizations. A modification (i.e. 3-cell) is the connector between two instances of the system that is expressed in categorical terms, as explained in section 6.13. Therefore, if levels DAT and SCH are needed for defining schemas and explaining process interaction in low-level aspects, then levels CST and CPT are indeed meta- and meta-meta levels of the categories in the architecture (including objects, morphisms, functors, natural transformations etc.). What connects the levels is the pairs of functor categories between them, not only for defining structures (LCCC) but also for defining and describing all the activities that take place in the system (in the local computational environments/extensions). The pair of functor categories DATCPT and CPTDAT provide ultimate closure in the top level, even if we proceed to higher-dimensional categories such as n-categories. For example, composition of two 2-natural transformations in a 3-cell (i.e. a modification) takes place naturally inside the boundaries of a system expressed in the proposed four-level architecture. As associativity is not a strict requirement in exponentiation, composition in Cartesian closed categories can be associative up to natural isomorphisms. In weak n-categories, as they are treated in the current research with 0-cells being Cartesian closed comma categories, associativity is not strict (not given by equalities) but it is satisfied up to a natural isomorphism (of the next level). Kasangian et al.
[1997] used weak 2-categories (i.e. bicategories) to represent information flow and access control in security frameworks. One of the motivations of using Cartesian closed comma categories through this research is to avoid the use of numbers and sets, something that has been achieved to a significant degree. Indeed, certain Cartesian closed categories, the topoi, have been proposed as a general setting for mathematics, instead of the traditional set theory. For example, objects

of a class in a type hierarchy could be ordered in a comma category with the characteristic function providing the unique value of the OID. Good practice in software engineering is following the LCCC approach. We have shown that LCCC correspond precisely to the industry-strength standard of 3NF for data design, therefore justifying the choice of LCCC as the underlying structures in our architecture. The underlying drive in software engineering is to represent the real world effectively by mapping logical structures onto physical ones. It is the physical world that ultimately drives tasks such as normalization. Mathematical structures that match the physical world are both desirable and practicable. In this sense LCCC are a fair representation of logical structures that handle effectively the physical world. Higher-order logic, not based on category theory, was used for example by McCullough [1988], attempting to define various information system security issues. A generalization of McCullough's event-based restrictiveness model, as the basis for proving security properties of distributed system design, is presented by Alves-Foss and Levitt [1991]. The authors use higher order logic in order to define verification rules of secure distributed systems. More information on event-based, concurrency schemes and access control using higher order logic can be found in the work of Milner [1989], Hoare [1985], Kuo & Humen [2002] and Winskel & Nielsen [1995]. Good practice in software engineering for process management in distributed systems has been based on a variety of mathematical formal interpretations of processes, known as process calculi. Process calculi provide a tool for the high-level description of interactions, communication (through message-passing), and synchronization between a collection of independent agents or processes. They have been used for modelling transactions in distributed information systems, particularly for representing concurrency. 
They allow sequential and parallel composition of processes. They provide channel specification for sending and receiving data. They explain recursion and process replication based on repetition. The feature of hiding operations in channels allows agents (which are processes) to be composed in parallel. CCS (Calculus of Communicating Systems), CSP (Communicating Sequential Processes) [Hoare, 1985] and ACP (the Algebra of Communicating Processes) [Bergstra and Klop, 1995] constitute the three major branches of the process calculi family. The use of channels for communication is

one of the features distinguishing the process calculi from other models of concurrency, such as Petri nets. In Hoare logic [1969] [1985], system evolution is depicted as sequences of traces in order to define the different routes of system evolution (usually for concurrency purposes). A triple {P}C{Q} describes how the execution of a piece of code changes the state of the computation. C is a command, while P and Q express predicate logic formulas (i.e. assertions). In typed λ-calculus higher-order functions are generally those with types containing more than one arrow. In functional programming, the eval() and curry() functions are such examples. Function composition and structural transformations are other examples. Furthermore, methods of object definitions, based on categorification, are examples of higher-order functions. Abstract definitions of object class specifications, formalized using category theory, are given by Grant [1996]. He uses limits to model parallel composition of object specifications. The latter are triples of the form (P, α, D) or α : P → D, where P is a monoid, D is a set and α is a function from the carrier of P to D. A programming paradigm that combines the advantages of the logic, object, and functional paradigms was presented by Goguen et al. [2002], based on hidden algebra. In Yoon's categorical framework [1993] of object-oriented concurrent systems, objects represent algebras while processes represent sub-algebras. A system can evolve in different ways depending on the interaction among processes. The execution of a distributed system is characterized as a series of transitions between global states. Milner has shown [1999] that computation and communication in distributed systems, and particularly in mobile distributed systems, can be modelled using the notion of the process. λ-calculus has been used for handling single-threaded computation. But as Milner observed, processes consist of many elementary parallel, interacting, communicating threads which exchange messages with each other. In π-calculus, all participants in a process are themselves processes. Thus integers, strings, objects, workflows, procedures or any other computational entities (e.g. users as participants in process interaction) or services, can be considered to be just different forms (i.e. types) of an abstract process data type. Names represent channels that can act both as the actual communication channels (e.g. input and output channels) or as the

actual data (the contents encapsulated in messages). In π-calculus, the creation of a new channel is an actual physical event. Pairs of processes interact with each other by sending and receiving named messages in a synchronized way. That means that processes can be composed, allowing them to communicate through named channels with complementary names (e.g. distributed agents may use the same named channel as input and output interaction point). The mobility feature in π-calculus refers to the fact that the recipient process may use the received channel for further communication. It directly refers to a dynamic change in the communication topology among processes. Sangiorgi and Walker have examined the relationship between π-calculus and λ-calculus as well as a number of applications of π-calculus to the object-oriented paradigm. The relationship of π-calculus with Business Process Management (a holistic management approach for good systems design that promotes business effectiveness and efficiency by attempting to improve and optimize processes continuously), along with an implementation for workflow systems, was presented by Smith and Fingar [2003], where new values are created from processes in a way analogous to an RDBMS that creates new values from data. Following the ideas of λ-calculus and π-calculus, processes, channels, participants, services, activities and the actual data transmitted in process interaction or those referred to local system state changes, in the four-level architecture, are treated as instances of an abstract type process. The behaviour of a process can be expressed by algebraic equations or by commutative diagrams. The complete process behaviour can be illustrated using the Cube and the Lattice of Cubes, governed by rules defined in Godement calculus, where the interaction through identified secure channels is depicted step-by-step. Such constructions can be used also to describe type changing and adjunctions between the levels.
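The mobility feature described above, where a channel is itself sent as a message and then used for further communication, can be approximated with Python's standard `queue` and `threading` modules. This is a loose sketch rather than an encoding of π-calculus: `queue.Queue` is buffered, so sending does not synchronize with receiving as it does in the calculus, and the names `server`, `client` and `run` are hypothetical.

```python
import queue
import threading


def server(public, log):
    # Receive a channel over the public channel, then answer on it.
    private = public.get()
    private.put("ack")
    log.append("server replied on received channel")


def client(public, log):
    # Creating a fresh channel plays the role of name creation.
    private = queue.Queue()
    # The channel itself is the content of the message.
    public.put(private)
    # The reply arrives on the channel that was just sent.
    log.append(private.get())


def run():
    public = queue.Queue()
    log = []
    threads = [
        threading.Thread(target=server, args=(public, log)),
        threading.Thread(target=client, args=(public, log)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

After `run()` the server holds a channel it did not know at the start, which is the dynamic change in communication topology that the text calls mobility. The order of the two log entries depends on thread scheduling.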
Examples of type changing in the four-level architecture can be found in §3.1.5.1 (e.g. encipherment security mechanisms and services for confidentiality and integrity security requirements etc.). For example, in the case of assigning different roles to a user in the context of access control, as was described in RBAC models, we need to define the appropriate comma categories and adjunctions which materialize their interactions in terms of process interaction for user authorization in order to have access to system resources.

The internal processing in distributed transactions via secure communication channels is explained using monad/comonad constructions. Monads and their duals, comonads, can be implemented as closures in programming languages that support closures, based on the endofunctor T defined with free variables for closed operations on the components (monad constructions have already been implemented as wrappers in functional programming languages). In functional programming, a closure consists of the code of the body of a λ-function and the environment (i.e. the set of available variables and their values) in which the λ-function is defined. Because closures delay evaluation, they can be used to define control structures. This is applied in concurrent programming and, more generally, can be used to control the transition from one state to another, e.g. to describe the way a distributed system evolves through consistent global states, taking into account that events (as defined in section 6.6) either affect the local state of a process or are send/receive events denoting the exchange of a message. Modifications in the architecture (i.e. 3-cells) are used to describe system behaviour in terms of the system state changes that take place in process interaction. An example was given in section 6.6 of determining whether two events belonging to two different processes are parallel or not.

The issue of security in distributed systems derives mainly from the need to share resources via communication channels (e.g. secure channels such as SSL) used at the network level by processes in order to exchange messages. These channels should be secured in order to deliver secure distributed computations by enhancing confidentiality and integrity. At the processing level, this requirement is mainly achieved by the use of secure distributed transactions.
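To make the remark about closures and monad constructions concrete, here is a minimal Python sketch of a state-passing monad built only from closures. It is an illustration under stated assumptions, not the thesis's construction: `unit`, `bind` and the `send` event are names introduced here, and the "state" is just a tuple of logged messages standing in for a local process state.

```python
def unit(value):
    """Wrap a plain value as a computation that leaves the state unchanged."""
    def run(state):
        return value, state
    return run


def bind(step, next_step):
    """Sequence two computations; nothing is evaluated until `run` is called."""
    def run(state):
        value, new_state = step(state)
        return next_step(value)(new_state)
    return run


def send(message):
    """Hypothetical send event: appends `message` to the log held in the state."""
    def run(log):
        new_log = log + (message,)
        return len(new_log), new_log
    return run


# Building the transaction only composes closures; no event has happened yet.
transaction = bind(
    send("begin"),
    lambda n1: bind(send("commit"), lambda n2: unit(n1 + n2)),
)

# Evaluation is delayed until an initial state is supplied.
result, final_log = transaction(())
```

Each closure captures its environment (`message`, `step`, `next_step`), which is what allows the sequence of state transitions to be described first and executed later.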
In practice, security has to do with the level of security applied in each layer of communication between the different layers of a distributed system, expressed in terms of the security services provided as core processes of the system. A distributed service (e.g. a security service) can be provided by one or more server processes, interacting with each other and with other client processes, in order to maintain consistency of the service's resources. Processes (e.g. server processes or peer processes) encapsulate resources (e.g. objects) in the exchanged messages and allow clients to access them through interfaces. Interfaces are represented as components in the application level. Principals (users or other processes as participants) are authorized to operate on resources. Resources must be protected against unauthorized access. Security services and activities are explained in the thesis in terms of core processes of the distributed system, as explained in section 6.6. Following the top-down direction of the architecture, we can check the result of applying the required security controls to the actual system components that participate in interactions (i.e. the result on the actual data in a process interaction). The bottom-up direction allows us to check whether process interactions comply with security standards, in the form of the security policies implemented in the local computational environments of the distributed system. A holistic security architecture should include, at least, baseline assessment, risk analysis, specific policy development, security measure implementation, and monitoring and reporting.

Examples, in the context of a holistic approach for handling security in the local computational environments that form the distributed system using applied category theory, are presented in sections 6.6 (security in distributed transactions, process management in terms of applied security controls, event-ordering for non-repudiation, accountability and auditing control, access control etc.), 6.7.1 (evaluation and comparison of security policies framework), 6.7.2 (maintaining database consistency in distributed transactions, concurrency control etc.), 6.8 (construction of a threat model in categorical terms, handling attacks e.g.
DoS attacks), 6.9 (threat assessment, risk evaluation and control, as well as how to balance the cost of applied security measures against their effectiveness) and 6.10 (integration of intrusion detection techniques for increasing their performance, implementing auditing and logging techniques in categorical terms). The definition, in categorical terms, of processes, channels, interfaces and system components (e.g. system entities) involved in process interaction is presented in section 6.11.

Security policies are satisfied in the top level of the four-level architecture (e.g. security policies defining communication between client, server and peer processes) as well as in the bottom levels (e.g. access control policies governing access to system resources). Other security policies are defined for unicast and multicast group communication as well as for security policy groups. Security mechanisms are represented as a dynamic synthesis (i.e. composition) of comma categories. In every case, security requirements and security standards must be satisfied in every level of the four-level architecture. The details of distributed computations going from a higher level to a lower level (and the opposite) are given by the functor category between each level pair, in the form of composed adjunctions (closure provided by the exponential objects $DAT^{CPT}$ and $CPT^{DAT}$).

Local security policies, implemented using different paradigms (in the application level, for example as security services and mechanisms in middleware) and based on a global policy security scheme, are presented in Figure 6-11. These policies can be compared and evaluated in terms of their effectiveness using 2-natural transformations of 2-functors (2-cells) combined with integrated monad/comonad constructions (Figure 6-28 & Figure 6-36). Pairs of adjoint functors can be used to express the CIA security principles. Their composition through natural transformations (2-cells) provides the global security policy framework.

Risks have been identified in Figure 6-43 as the interaction of threats and vulnerabilities. The local extensionalities (the possible combinations of threats and vulnerabilities in a local subsystem of the distributed system) are expressed in a pullback diagram by the comma category $(T \downarrow S)$ in terms of identified risks (category risk, as the pullback object), while the product category $threat \times vulnerability$ corresponds to all the combinations and interactions of threats and vulnerabilities in the DIS. The cost of the applied security measures can be balanced against their effectiveness (security control) through the unit and the counit of the adjunction between categories control and balancedcost, respectively, in Figure 6-44. The actual cost of a security measure, as a countermeasure against active and passive security attacks, is defined in terms of computational effort and network usage, as a pushout of computationaleffort and networkusage participating in the total cost of applied security measures. Attacks from an enemy on a specified target can be expressed using exponentiation. An incident consists of a series of attacks. For example, Distributed DoS attacks (as parallel 1-cells) on a target from an attacker are expressed as the functor category $target^{attacker}$ in Figure 6-41. The higher-order function curry(g) identifies all these security attacks based on an identified risk.
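As a small illustration of the exponentiation idea, the Python sketch below curries a two-argument function into a function that, given a risk, returns a function from attackers to attacks. The signature assumed for g (risk and attacker in, set of attacks out) and all the sample data are assumptions made for this example; the thesis defines g in categorical terms elsewhere.

```python
from typing import Callable, Dict, FrozenSet, Tuple

# Hypothetical data: attacks that a given attacker can mount for a given risk.
ATTACKS: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("open_port", "botnet"): frozenset({"syn_flood", "udp_flood"}),
    ("open_port", "insider"): frozenset({"port_scan"}),
    ("weak_password", "insider"): frozenset({"credential_theft"}),
}


def g(risk: str, attacker: str) -> FrozenSet[str]:
    """Assumed uncurried form: (risk, attacker) -> set of attacks."""
    return ATTACKS.get((risk, attacker), frozenset())


def curry(f: Callable) -> Callable:
    """Turn f(x, y) into a function x -> (y -> f(x, y))."""
    def outer(x):
        def inner(y):
            return f(x, y)
        return inner
    return outer


def uncurry(h: Callable) -> Callable:
    """Inverse direction: turn x -> (y -> z) back into (x, y) -> z."""
    return lambda x, y: h(x)(y)


# Fixing the identified risk yields a function from attackers to attacks.
attacks_for_open_port = curry(g)("open_port")
```

The pair `curry`/`uncurry` mirrors the bijection that characterizes exponential objects in a Cartesian closed category; here it is only checked on sample inputs, not proved.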

Event-ordering in DIS is currently based on the use of Lamport clocks and vector clocks. We propose a way to handle event-ordering based on applied category theory, in terms of composed adjoint functors and endofunctors between Cartesian closed comma categories. A distributed system with n processes would normally need storage for n static arrays [0 .. n-1] to keep the values of the vector clock for each process, which increases the cost in terms of storage and computational effort for maintaining consistency. With the proposed approach, the cost is reduced to that of maintaining composition of the comma categories involved on a low-cost fault-tolerant system. Such an event-ordering scheme describes the communication in a distributed system in terms of how processes interact with each other, limiting the use of numbers to a significant degree. Thus, it increases validity, certainty and availability. By dealing with causal ordering, it provides non-repudiation and accountability. For example, audit trails, in the form of an attacker's intended and actual steps, can be visualized as a comma category (Figure 6-45). They can then be compared with user profiles (step by step and overall), based on the unit and counit of each individual step and normal user step, respectively, in the adjunction between system activities and user profiles (Figure 6-46).

The architecture maintains database integrity and consistency, for central and distributed databases that make use of distributed transactions, through the composition of vertical categories between the level pairs (Figures 6-37, 6-38 & 6-39). Endofunctors as closed operations on monad/comonad constructions are used to express the ACID properties of a distributed transaction. Moggi [1991] had already introduced the use of a monad T on a Cartesian closed category C to model call-by-value languages.
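For reference, the vector-clock baseline mentioned in the event-ordering discussion above can be sketched as follows. This is the conventional textbook scheme, shown only to make the storage cost visible (each of the n processes keeps an array of n counters); it is not the categorical scheme proposed in the thesis, and the class and function names are introduced here for illustration.

```python
from typing import List


class VectorClock:
    """Vector clock for one process in a system of n processes."""

    def __init__(self, n: int, pid: int) -> None:
        self.pid = pid
        self.clock: List[int] = [0] * n  # one counter per process

    def tick(self) -> List[int]:
        """Local event: advance this process's own counter."""
        self.clock[self.pid] += 1
        return list(self.clock)

    def send(self) -> List[int]:
        """Send event: tick and attach the timestamp to the message."""
        return self.tick()

    def receive(self, timestamp: List[int]) -> List[int]:
        """Receive event: merge component-wise maxima, then tick."""
        self.clock = [max(a, b) for a, b in zip(self.clock, timestamp)]
        return self.tick()


def happened_before(a: List[int], b: List[int]) -> bool:
    """a -> b iff a <= b component-wise and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a: List[int], b: List[int]) -> bool:
    """Two distinct events are parallel if neither happened before the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


p0, p1 = VectorClock(2, 0), VectorClock(2, 1)
e1 = p0.send()        # p0 sends a message
e2 = p1.tick()        # independent local event on p1
e3 = p1.receive(e1)   # p1 receives p0's message
```

Here `e1` causally precedes `e3`, while `e1` and `e2` are parallel; this is the kind of question (are two events of different processes parallel or not) that section 6.6 answers in categorical terms.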
The call-by-push-value paradigm was introduced by Levy [2005], subsuming the call-by-value and call-by-name language principles.

The proposed architecture addresses issues such as semantic and organizational interoperability and data integrity. It provides the means to handle the interoperability problems that arise when communication and computation are needed between local environments of a distributed system that implement different paradigms, by following industry standards. Interoperability in the architecture is based on the unit and counit of adjunctions between level pairs as well as between the Cartesian closed comma categories participating in the internal processing in each level of the architecture. Examples have been given of semantic and organizational interoperability in terms of the system's security. The Service-Oriented Modelling paradigm attempts to take a holistic view of the analysis, design and architecture of all the information assets (called Services) in a system [Arsanjani, 2004] [Bierberstein et al., 2005]. This dynamic composition of services has been proposed as a way forward for interoperability and it would benefit from a categorical approach.

Information systems security also includes the physical protection of system resources. The implementation of the security policies needed for that physical protection, using appropriate security services and security mechanisms, is defined in categorical terms in the four-level architecture in the form of the required security components (including the data to be protected) along with their interaction. The proposed framework provides the ability to revise, redefine, reorganize and evaluate the security measures currently taken in the system under consideration, using state-of-the-art security techniques, measurements of the efficiency of applied security measures and measurements for balancing the cost of applied security measures based on risk management, while also taking expert knowledge seriously into account, as in some cases it can reduce time and cost significantly. A further development step in the architecture would be to include evaluation procedures for security software applications based on evaluation criteria such as the Common Criteria.

A major difficulty of the current research was the lack of any previous work on using category theory to express a holistic categorical security framework. There are few examples of applied categorical logic, as already cited, mainly in the fields of security mechanisms, causal ordering and concurrency control.
Other difficulties have to do with the tasks of defining the appropriate categories across the levels and of capturing the nature of the adjoint functors in the level pairs. Defining categorical structures and internal processing within the levels, in order to express all the details of the proposed architecture, is another difficult task. The design of a distributed system and the initial input of system components is a drawback, as it is time-consuming. One way to do it would be to define the different levels of the architecture in stages. Following a bottom-up approach, system resources and system components, including security components such as the outcomes of a threat analysis, would be defined first in the fourth level, together with their interaction, in terms of processes of the distributed system. A top-down approach would need to define clearly the security requirements, security policies and security policies implemented as security mechanisms, reducing the abstraction in going from the first down to the fourth level of the architecture; again, all security activities are expressed in terms of processes.

This is novel work that still needs improvement, especially in the fields of implementation, testing and evaluation, in order to meet what the theory proposes. Among the issues that should be examined in the future is how fault tolerance techniques such as masking faults can be expressed using adjointness and evaluated based on the unit and counit of the adjunction, for example in the communication between two process groups. Other issues for future research include the use of natural transformations and adjoint functors in replication, for example between replicated servers, as well as concurrency issues in distributed transactions. Router and firewall security as well as antivirus control can be based on security policies defined on the second level of the proposed architecture. Access control components, such as ACLs, capabilities and RBAC, can be expressed using comma categories. Object-oriented characteristics such as generalization/specialization and aggregation/composition can also be expressed using comma categories, although categorification is not one of the objectives of this research (more information on this subject can be found in the work of Baez and Dolan [1998]).
Another field that needs further examination is the integration of security awareness issues in the architecture, for example based on the unit and counit of the adjunction between normal user behaviour and a potential attacker's behaviour (as insider and external threats to system security), combining proactive and reactive security measures in compliance with regulations, thus strengthening the holistic character of the proposed approach.

In practical terms, the current research can be the basis for the future development and implementation of a graphical software tool able to visualize and evaluate the proposed applied categorical logic in terms of holistic security in distributed information systems. This tool, based on mathematical principles and logic, could be the basis for a future standard way to represent interoperability issues governed by a global security framework in distributed information systems expressed in the four-level architecture. The feasibility of applying categorical methods in specifying and maintaining industrial-strength software systems using a software tool called Specware [Srinivas and Jullig, 1995] has already been demonstrated by Williamson & Healy [1999]. The Specware development tool supported the specification, design and semi-automated synthesis of correct-by-construction software. Taylor [1993] presented several category-theoretic tools for the understanding of some common constructions in computer science. The new tool can be implemented in multi-level functional languages that are able to represent higher-order logic using basic language structures, based on the fundamentals of typed λ-calculus, as expressed in the form of Cartesian closed categories, and on process management as treated in π-calculus. Such a tool, for a distributed system defined in the four-level architecture, should be scalable and independent of the amount of applied security, covering the following issues:

• Visualize the distributed information system, defined in the four-level architecture
• Provide a visualization of system components associated with security
• Provide a detailed visualization of current security measures, including security policies, security services and security mechanisms
• Provide a clear view of the following:
  o Threat model
  o System vulnerabilities
  o Risks, potential and identified
• Visualize and compare current security status against any future alterations to the system's security measures
• Feedback procedures
• System fault tolerance in terms of resource availability
• Visualization of event-ordering handling, including accountability and auditing control
• Decision-making procedures for critical situations, including recovery procedures
• Legal and regulatory compliance issues in the form of clearly defined security policies implemented as application-level security components
• Security awareness procedures, including user profiles
• Incident response plans, main and alternative, especially in the case of protecting sensitive data and situations critical for the system's continuity

In conclusion, the development of the categorical structures in this thesis to capture the requirements for holistic security, together with the advances in techniques for implementing category theory, aims to facilitate the realization of a global security framework for distributed information systems, at both the conceptual and practical levels.


7.1 The four-level architecture development stages in summary

Figure 7-1 presents a brief view of the analysis and development stages of the thesis.

[Figure 7-1 diagram. Boxes: "Four-level architecture before the thesis"; "Analysis of security issues in distributed systems"; "Four-level architecture revised in thesis to represent details of data and security components in a distributed system"; "Analysis and development of a holistic security architecture for open and interoperable distributed systems"; "The proposed holistic, layered security architecture for distributed systems"; "Future work: Implementation of a graphical software tool for representing security in a holistic way, for open and interoperable distributed systems defined in categorical terms using the proposed four-level architecture"]

Figure 7-1: Development stages in the thesis


REFERENCES [Abelson et al., 1998] Abelson, H., Anderson, J. R., Bellovin, M. S., Benaloh, J., Blaze, M., Diffie, W., Gilmore, J., Neumann, G. P., Rivest, L. R., Schiller, I. J. & Schneier, B. The risks of key recovery, key escrow, and Trusted Third Party encryption. Report, Ad hoc Group of Cryptographers and Computer Scientists,1998 http://www.cdt.org/crypto/risks98/ [Abouzakhar and Manson, 2002] Abouzakhar, S. N. & Manson, A. G. An intelligent approach to prevent distributed systems attacks. Information Management & Computer Security, 10 5 (203-209), 2002. [Abouzakhar and Manson, 2003] Abouzakhar, S. N. & Manson, A. G. Networks security measures using neuro-fuzzy agents. Information Management & Computer Security, 11 1 (33-38), 2003. [Abramsky et al., 1996] Abramsky, S., Gay, S. & Nagarajan, R. Interaction categories and the foundations of typed concurrent programming,In Proceedings of Deductive Program Design: Proceedings of the 1994 of MarktOberdorf Summer School, Springer-Verlang, 1996. [Aljifri et al., 2003] Aljifri, A. H., Pons, A. & Collins, D. Global e-commerce - a framework for understanding and overcoming the trust barrier. Information Management & Computer Security, 11 3 (130-138), 2003. [Álvarez and Petrovic, 2003] Álvarez, G. & Petrovic, S. A new taxonomy of Web attacks suitable for efficient encoding. Computers & Security, 22 5 (435-449), 2003. [Alves-Foss and Levitt, 1991] Alves-Foss, J. & Levitt, K. Verification of Secure Distributed Systems in Higher Order Logic: A Modular Approach Using Generic Components,In Proceedings of 1991 IEEE Symposium on Security and Privacy, (122), 1991. [Amir et al., 2004] Amir, Y., Kim, Y., Nita-Rotaru, C., Schultz, L. J., Stanton, J. & Tsudik, G. Secure group communication using robust contributory key agreement. IEEE Transactions on parallel and distributed systems, 15 5 (468-480), 2004. [Anderson, 1996] Anderson, J. R. The eternity service,In Proceedings of Proceedings of Pragocrypt'96, 1996. 
[Anderson, 1996] Anderson, J. R. Security in Clinical Information Systems. British Medical Association,1996 [Anderson, 2001] Anderson, J. R. Security engineering: A guide to building dependable distributed systems, John Wiley & Sons Inc, 2001. [Anderson, 2004] Anderson, J. R. Cryptography and competition policy - issues with 'trusted computing'.2004 http://www.cl.cam.ac.uk/ [Anderson, 2003] Anderson, M. J. Why we need a new definition of information security. Computers & Security, 22 4 (308-313), 2003. [Arsanjani, 2004] Arsanjani, A. Service-Oriented Modeling and Architecture: How to Identify, Specify and Realize Services for your SOA. IBM Developer Works,2004 [Asperti and Longo, 1991] Asperti, A. & Longo, G. Categories Types and Stuctures - An introduction to Category Theory for the working computer scientist, Foundations of Computing Series, 1st ed., M.I.T. Press, 1991. ftp.ens.fr/pub/dmi/users/longo/CategTypesStructures


[Avancha et al., 2003] Avancha, S., Undercoffer, J., Joshi, A. & Pinkston, J. Secure sensor networks for perimeter protection. Computers & Networks, 43 (421-435), 2003. [Babaoglu and Marzullo, 1993] Babaoglu, O. & Marzullo, K. Consistent global states of distributed systems: fundamental concepts and mechanisms. In Mullender, S. J. (Ed.) Distributed Systems, Addison-Wesley, Reading, Mass.,1993 [Baez, 1995] Baez, J. n-Categories - Sketch of a Definition.1995 http://math.ucr.edu/home/baez/ncat.def.html [Baez, 1997] Baez, J. An introduction to n-categories. 7th Conference on Category Theory and Computer Science, Springer Lecture Notes in Computer Science, 1290,1997 http://math.ucr.edu/home/baez [Baez and Dolan, 1998] Baez, J. & Dolan, J. Categorification.1998 http://arxiv.org/PS_cache/math/pdf/9802/9802029v1.pdf [Bakman, 2003] Bakman, A. Patch Management. Computer Fraud & Security, 2003 8 (911), 2003. [Barendregt, 1984] Barendregt, P. H. The Lambda Calculus: its syntax and semantics, revised ed., Amsterdam, North Holland, 1984. http://www.andrew.cmu.edu/user/cebrown/notes/barendregt.html [Barr and Wells, 1985] Barr, M. & Wells, C. Toposes, Triples and Theories, Grundlehren Der Mathematischen Wissenschaften, Springer-Verlag, 1985. Reprinted in Reprints in Theory and Applications of Category Theory, 12 (1-288), 2005 http://www.case.edu/artsci/math/wells/pub/pdf/ttt.pdf [Barr and Wells, 1999] Barr, M. & Wells, C. Category Theory for computing science, International series in computer science, 3rd ed., Les Publications Centre de Recherches Mathematiques, Montreal, Canada, 1999. [Basten et al., 1997] Basten, T., Kunz, T., P, B. J., H, C. M. & Taylor, D. S. Vector time and causality among abstract events in distributed computations. Distributed Computing, 11 (21-39), 1997. [Beckers and Ballerini, 2003] Beckers, J. & Ballerini, J. P. Advanced analysis of intrusion detection logs. Computer Fraud & Security, 2003 6 (9-12), 2003. [Bell and Grimson, 1992] Bell, D. & Grimson, J. 
Distributed Database Systems, 1st ed., Addison-Wesley, 1992. [Bell and LaPadula, 1976] Bell, E. D. & LaPadula, J. L. Secure Computer Systems: Unified exposition and Multics interpretation. Secure Computer Systems,Technical Report, Mitre Corporation,1976 [Bennett et al., 2002] Bennett, S., McRobb, S. & Farmer, R. Object-oriented systems analysis and design using UML, 2nd ed., McGraw Hill, 2002. [Benyon-Davies, 1998] Benyon-Davies, P. Information Systems Development, McMillan Press Ltd., 1998. [Bertino, 2003] Bertino, E. RBAC models concepts and trends. Computers & Security, 22 6 (511-514), 2003. [Bertino et al., 2003] Bertino, E., Fan, J., Ferrari, E., Hacid, M.-S., Elmagarmid, K. A. & Zhu, X. A hierarchical access control model for video database systems. ACM Transactions on Information Systems, 21 2 (155-191), 2003.


[Bézivin, 2001] Bézivin, J. From Object Composition to Model Transformation with the MDA,In Proceedings of Proceedings of Tools USA, Santa Barbara, IEEE Tools 39, 2001. [Biba, 1977] Biba, K. J. Integrity Considerations for Secure Computer Systems. Technical Report, MITRE Corporation, Bedford, Massachusetts,1977 [Bicakci et al., 2003] Bicakci, K., Tsudik, G. & Tung, B. How to construct optimal one-time signatures. Computers & Networks, 43 (339-349), 2003. [Bierberstein et al., 2005] Bierberstein, N., Bose, S., Fiammante, M., Jones, K. & Shah, R. Service-Oriented Architecture (SOA) Compass: Business Value, Planning, and Enterprise Roadmap, IBM Press, 2005. [Blyth et al., 2003] Blyth, A., Cunliffe, D. & Sutherland, I. Security analysis of XML, XML usage and XML parsing. Computers & Security, 22 6 (494-505), 2003. [Botha and Eloff, 2001] Botha, A. R. & Eloff, H. P. J. Access control in Document-centric Workflow systems - an agent based approach. Computers & Security, 20 6 (525-532), 2001. [Botha and von Solms, 2003] Botha, M. & von Solms, R. Utilizing fuzzy logic and trend analysis for effective intrusion detection. Computers & Security, 22 5 (423-434), 2003. [Brands, 2002] Brands, S. Secure access management: trends and solutions. Information Security Technical Report, 7 3 (81-94), 2002. [Brewer and Nash, 1989] Brewer, D. & Nash, M. The Chinese Wall Policy,In Proceedings of IEEE Symposium on Research in Security and Privacy, Oakland, California, IEEE, 1989. [Brooke and Paige, 2003] Brooke, P. J. & Paige, R. F. Fault trees for security system design and analysis. Computers & Security, 22 3 (256-264), 2003. [Bruschi et al., 2003] Bruschi, D., Curti, A. & Rosti, E. A quantitative study of public key infrastructures. Computers & Security, 22 1 (56-67), 2003. [BSI, 1999] BSI BS 7799 - Code of practice for information security management. British Standard Institute,1999 [Butler, 2007] Butler, C. W. "Holistic Security" References on Terrorism, Homeland Security, Threat Assessment and Preparedness, 3rd ed., Butler Research LLC, 2007. [Cadish and Diskin, 1995] Cadish, B. & Diskin, Z. Algebraic Graph-oriented approach to management of multi-base systems Part 1: view integration via sketches and equations,In Proceedings of Next Generation of Information Technologies and Systems,NGITS'95, 2nd Int.Workshop, Naharia (Israel), (69-79), 1995. [Cali et al., 2004] Cali, A., Calvanese, D., De Giacomo, G. & Lenzerini, M. Data integration under integrity constraints. Information Systems, 29 (147-163), 2004. [Chadwick and Basden, 2001] Chadwick, W. D. & Basden, A. Evaluating Trust in a public key certification authority. Computers & Security, 20 7 (592-611), 2001. [Chan and Kwok, 2001] Chan, T. M. & Kwok, F. L. Integrating security design into the software development process for e-commerce systems. Information Management & Computer Security, 9 3 (112-122), 2001. [Chen and Chung, 2002] Chen, T. S. & Chung, Y. Hierarchical access control based on Chinese Remainder theorem and symmetric algorithm. Computers & Security, 21 6 (565-570), 2002.

[Cheung and Misic, 2002] Cheung, H. K. & Misic, J. On virtual private networks security design issues. Computers & Networks, 38 (165-179), 2002. [Chien and Jan, 2003] Chien, H. Y. & Jan, J. K. Robust and Simple authentication protocol. The Computer Journal, 46 2 (193-200), 2003. [Church, 1932] Church, A. A set of postulates for the foundations of logic. Annals of Mathematics, 2 33 (346-366), 1932. [Church, 1933] Church, A. A set of postulates for the foundation of logic (second paper). Annals of Mathematics, 2 34 (839-864), 1933. [Church, 1936] Church, A. An unsolvable problem of elementary number theory. American Journal of Mathematics, 58 (345-363), 1936. Reprinted by Davis, Martin (ed), in The undecidable: basic papers on undecidable propositions, unsolvable problems, and computable functions, (108 - 115), with the correction incorporated, Hewlett, New York, Raven Press, 1965 [Church, 1941] Church, A. The Calculi of Lambda Conversion. Princeton University Press, Princeton, N.J,1941 [Clark and Wilson, 1987] Clark, D. D. & Wilson, R. D. A comparison of commercial and military computer security policies,In Proceedings of Proceeding of IEEE Symposium on security and privacy, Oakland, CA, (184-194), 1987. [Clauss and Kohntopp, 2001] Clauss, S. & Kohntopp, M. Identity management and its support of multilateral security. Computers & Networks, 37 (205-219), 2001. [Coles and Moulton, 2003] Coles, S. R. & Moulton, R. Operationalizing IT Risk Management. Computers & Security, 22 6 (487-493), 2003. [Conte and Sichman, 2002] Conte, R. & Sichman, S. J. Dependence Graphs - dependence within and between groups. Computational & Mathematical Organization Theory, 8 (87-112), 2002. [Coulouris et al., 2005] Coulouris, G., Dollimore, J. & Kindberg, T. Distributed Systems: Concepts and Design, 4th ed., Addison-Wesley/Pearson Education, 2005. [Cournane and Hunt, 2004] Cournane, A. & Hunt, R. An analysis of the tools used for the generation and prevention of spam. 
Computers & Security, 23 (154-166), 2004. [Denning, 1976] Denning, E. D. A lattice model of secure information flow. Communications of the ACM, 19 5 (236 - 243), 1976. [Desai et al., 2002] Desai, S. M., Richards, C. T. & von der Embse, T. System insecurity - firewalls. Information Management & Computer Security, 10 3 (135-139), 2002. [Deshpande and Karypis, 2004] Deshpande, M. & Karypis, G. Item-based top N recommendation algorithms. ACM Transactions on Information Systems, 22 1 (143-177), 2004. [Diaconescu, 2000] Diaconescu, R. Category-based Constraint Logic. Mathematical Structures in Computer Science, 10 3 (373-407), 2000. [Dini, 2001] Dini, G. Electronic voting in a large-scale distributed system. Networks, 38 1 (22-32), 2001. [Douceur, 2002] Douceur, R. J. The Sybil Attack,In Proceedings of Electronic Proceedings for the 1st International Workshop on Peer-to-Peer Systems (IPTPS '02), Cambridge, MA, USA, Springer-Verlang, 2002. [Doughty, 2003] Doughty, K. Implementing enterprise security: a case study. Information Systems Control Journal, 2 (99-114), 2003.


[Dunbar, 2001] Dunbar, N. Ipsec networking standards - an overview. Information Security Technical Report, 6 1 (35-48), 2001. [Emery et al., 2003] Emery, D., Upton, S. & Trevorah, R. tScheme - voluntary approval for certificate authority services. Information Security Technical Report, 8 3 (2335), 2003. [Emmerich, 2000] Emmerich, W. Engineering distributed objects, John Wiley & Sons, Ltd, 2000. [Fan et al., 2002] Fan, W., Kuper, M. G. & Simeon, J. A unified constraint model for XML. Computers & Networks, 39 (489-505), 2002. [Fidge, 1988] Fidge, E. J. Timestamps in message-passing systems that preserve partial ordering. The 11th Australian Computer Science Conference, (56-66),1988 [FIPS, 2003] FIPS Standards for security categorization of federal information and information systems. In U.S Department of Commerce (Ed.), National Institute of Standards and Technology (NIST), FEDERAL INFORMATION PROCESSING STANDARDS PUBLICATION,2003 [Forte, 2003] Forte, D. Biometrics: future abuses. Computer Fraud & Security, 2003 10 (12-14), 2003. [France et al., 2002] France, T., Yen, D., Wang, J. C. & Chang, C.-M. Integrating search engines with data mining for customer-oriented information search. Information Security Technical Report, 10 5 (242-254), 2002. [Frantzen et al., 2001] Frantzen, M., Kerschbaum, F., Schultz, E. E. & Fahmy, S. A framework for understanding vulnerabilities in firewalls using a data flow model of firewall internals. Computers & Security, 20 3 (263-270), 2001. [Froschle and Lasota, 2006] Froschle, S. & Lasota, S. Causality versus TrueConcurrency. Electronic Notes in Theoretical Computer Science, 154 (3-18), 2006. [Furnell et al., 2004] Furnell, M. S., Papadopoulos, I. & Dowland, S. P. A long-term trial of alternative user authentication technologies. Information Management & Computer Security, 12 2 (178-190), 2004. [Gerber et al., 2001] Gerber, M., von Solms, R. & Overbeek, P. Formalizing information security requirements. 
Information Management & Computer Security, 9 1 (3237), 2001. [Gladney, 1997] Gladney, H. M. Access control for large collections. ACM Transactions on Information Systems, 15 2 (154-194), 1997. [Globus, 2001] Globus The Globus Project.2001 http://www.globus.org [Gödel, 1929] Gödel, K. Über die Vollständigkeit des Logikkalküls, University of Vienna, 1929. Thesis, On the completeness of the Calculus of Logic, Reprinted by Feferman, S et al. (eds) in Kurt Gödel Collected Works Volume 1 Publications 1929 - 1936, translated by Stefan Bauer-Mengelberg & Jean van Heijenoort, Oxford University Press, 1986 [Gödel, 1930] Gödel, K. Die Vollständigkeit der Axiome des logischen Funktionenkalküls, Monatshefte für Mathematik und Physik, 1930. The completeness of the Axioms of the Functional Calculus of Logic, Reprinted by Feferman, S et al. (eds) in Kurt Gödel Collected Works Volume 1 Publications 1929 - 1936, translated by Stefan Bauer-Mengelberg, Oxford University Press, 1986 234

[Gödel, 1931] Gödel, K. Über formal unentscheidbare Sätze der Principia mathematica und verwandter Systeme I, Monatshefte für Mathematik und Physik, 1931. On formally undecidable propositions of Principia Mathematica and Related Systems I, Reprinted by Feferman, S et al. (eds) in Kurt Gödel Collected Works Volume 1 Publications 1929 - 1936, translated by Jean van Heijenoort, Oxford University Press, 1986 [Gödel, 1934] Gödel, K. On undecidable propositions of formal mathematical systems, (mimeographed lecture notes, taken by Stephen C. Kleene and J. Barkley Rosser), 1934. Reprinted by Davis, Martin (ed), in The undecidable: basic papers on undecidable propositions, unsolvable problems, and computable functions, Hewlett, New York, Raven Press, 1965. Also reprinted by Feferman, S et al. (eds) in Kurt Gödel Collected Works Volume 1 Publications 1929 - 1936, Oxford University Press, 1986 [Godement, 1958] Godement, R. Theorie des faisceaux, Hermann, 1958. [Goguen et al., 2002] Goguen, J., Grant, M. & Kemp, T. A hidden Herbrand Theorem: combining the object and logical paradigms. Journal of Logic and Algebraic programming, 51 (1-41), 2002. [Goldreich, 2003] Goldreich, O. Cryptography and cryptographic tools. Distributed Computing, 16 (177-199), 2003. [Grant, 1996] Grant, M. Interconnection of Object specification,In Proceedings of Formal Methods in Object Technology, Springer Workshops in Computing, 1996. [Gritzalis and Gritzalis, 2001] Gritzalis, S. & Gritzalis, D. A digital seal solution for deploying trust on commercial transactions. Information Management & Computer Security, 9 2 (71-79), 2001. [Gritzalis et al., 2000] Gritzalis, S., Iliadis, J. & Oikonomopoulos, S. Distributed component software security issues on deploying a secure electronic marketplace. Information Management & Computer Security, 8 1 (5-13), 2000. [Groves, 2002] Groves, J. Achieving cost reductions through biometrics. Computer Fraud & Security, 2002 9 (8-11), 2002. 
[Guarino and Welty, 2000] Guarino, N. & Welty, C. Towards a methodology for ontology-based model engineering. International Workshop of model engineering - IWME 2000, Nice / Sophia Antipolis, France,2000 [Halfmann and Kuhnhauser, 1999] Halfmann, U. & Kuhnhauser, E. W. Embedding security policies into a distributed computing environment. ACM SIGOPS Operating Systems Review, 33 2 (51-64), 1999. [Han and Cho, 2003] Han, S. J. & Cho, S. B. Detecting intrusion with rule-based integration of multiple models. Computers & Security, 22 7 (613-623), 2003. [Harding, 2003] Harding, A. SSL Virtual Private Networks. Computers & Security, 22 5 (416-420), 2003. [Harris and Yen, 2002] Harris, J. A. & Yen, C. D. Biometric authentication - assuring access to information. Information Management & Computer Security, 10 1 (1219), 2002. [Harrison et al., 1976] Harrison, A. M., Russo, I. W. & Ullman, D. J. Protection in operating systems. Communications of the ACM, 19 8 (461-471), 1976.

235

[Hawkins et al., 2000] Hawkins, M. S., Yen, C. D. & Chou, C. D. Disaster recovery planning: a strategy for data security. Information Management & Computer Security, 8 5 (222-229), 2000. [Hawkins et al., 2000] Hawkins, S., Yen, C. D. & Chou, C. D. Awareness and challenges of Internet security. Information Management & Computer Security, 8 3 (131143), 2000. [Heather et al., 2007] Heather, M., Rossiter, N. & Sisiaridis, D. The Semantics of Jitter in Anticipating Time Itself within Nano-Technology,In Proceedings of CASYS'07, Liège, (12pp), 2007. [Hindley and Seldin, 1986] Hindley, R. J. & Seldin, P. J. Introduction to combinators and λ-Calculus, London Mathematical Society Student Texts, 1st ed., Cambridge University Press, 1986. [Hoare, 1985] Hoare, C. A. Communicating Sequential Processes, Prentice Hall, London, 1985. [Hofmann, 2004] Hofmann, T. Latent semantic models for collaborative filtering. ACM Transactions on Information Systems, 22 1 (89-115), 2004. [Höne and Eloff, 2002] Höne, K. & Eloff, H. P. J. Information security policy - what do information security standards say. Computers & Security, 21 5 (402-409), 2002. [Hong et al., 2003] Hong, K. S., Chi, Y. P., Chao, R. L. & Tang, J. H. An integrated system theory of information security management. Information Management & Computer Security, 11 5 (243-248), 2003. [Huang and Chang, 2004] Huang, H. & Chang, C. A new cryptographic key assignment scheme with time-constraint access control in a hierarchy. Computer Standards & Interfaces, 26 (159-166), 2004. [Hunter, 2004] Hunter, P. Hardware-based security: FPGA-based devices. Computer Fraud & Security, 2004 2 (11-12), 2004. [IBM, 2004] IBM z/OS Release 5- Multilevel security on zSeries multiframes. IBM,2004 [IBM, 2006] IBM Federated Identity Management and Web Services Security with IBM Tivoli Security Solutions. 
RedBooks,2006 http://www.redbooks.ibm.com/redbooks/pdfs/sg246394.pdf [IETF, 1997] IETF Simple Public Key Infrastructure.1997 http://world.std.com/~cme/html/spki.html [Iheagwara and Blyth, 2002] Iheagwara, C. & Blyth, A. The impact of security layering on end-to-end latency and system performance in switched and distributed ebusiness environments. Computers & Networks, 39 (827-840), 2002. [Irakleous et al., 2002] Irakleous, I., Furnell, M. S., Dowland, S. P. & Papadaki, M. An experimental comparison of secret-based user authentication technologies. Information Management & Computer Security, 10 3 (100-108), 2002. [ISO, 1989] ISO Open systems interconnection - basic reference model part 2: security architecture.1989 [ISO, 2002] ISO Information Security Management ISO/IEC 17799 -BS 7799.2. International Organization for Standardization,2002 [ISO, 2005] ISO Information Security Management ISO/IEC 17799:2005 Code of Practice. International Organization for Standardization,2005 [ISO, 2005] ISO Information Security Management ISO/IEC 27001:2005 Specification. International Organization for Standardization,2005 236

[ISO/CCIT, 1999] ISO/CCIT Common Criteria for information technology security evaluation - ISO 15408. International Organization for Standardization,1999 [ITF, 1998] ITF TLS - Transport Layer Secure Protocol (active WG).1998 http://tools.ietf.org/wg/tls [ITU-T, 1991] ITU-T X.800: Security architecture for Open Systems Interconnection for CCITT applications. Recommendation, International Telecommunication Union,1991 [ITU-T, 2000] ITU-T Information technology - Open Systems Interconnection - The Directory: Public-key and attribute certificate frameworks. Recommendation, International Telecommunication Union,2000 [ITU-T, 2005] ITU-T ITU-T X.500 Directory Specifications. ITU-T X.500 Series of Recommendations,2005 http://www.x500standard.com/index.php?n=X500.X500 [Janczewski and Portougal, 2000] Janczewski, J. L. & Portougal, V. 'Need-to-know' principle and fuzzy security clearances modelling. Information Management & Computer Security, 8 5 (210-217), 2000. [Janczewski et al., 2001] Janczewski, J. L., Reamer, D. & Brendel, J. Handling distributed denial-of-service attacks. Information Security Technical Report, 6 3 (37-44), 2001. [Janczewski and Shi, 2002] Janczewski, J. L. & Shi, X. F. Development of information security baselines for Healthcare information systems in New Zealand. Computers & Security, 21 2 (172-192), 2002. [Jayeju-akinsiku, 2002] Jayeju-akinsiku, B. Technology and electronic communication Act 2000. Computers & Security, 21 7 (624-628), 2002. [Johansson and Schultz, 2003] Johansson, M. J. & Schultz, E. E. Dealing with contextual vulnerabilities in code: distinguishing between solutions and pseudosolutions. Computers & Security, 22 2 (152-159), 2003. [Johnson and Rosebrugh, 2000] Johnson, M. & Rosebrugh, R. Database Interoperability Through State Based Logical Data Independence,In Proceedings of Proceedings of CSCW2000, the Fourth International Conference on Computer Supported Collaborative Work, IEEE Hong Kong, (161-166), 2000. 
[Johnston and Eloff, 2003] Johnston, J. & Eloff, J. H. P. Security and human computer interfaces. Computers & Security, 22 8 (675-684), 2003. [Kamara et al., 2003] Kamara, S., Schultz, E. E., Kerschbaum, F. & Frantzen, M. Analysis of vulnerabilities in Internet firewalls. Computers & Security, 22 3 (214232), 2003. [Kankanhalli et al., 2003] Kankanhalli, A., Teo, H., Tan, C. Y. B. & Wei, K.-K. An integrative study of information systems security effectiveness. International Journal of Information Management, 23 2 (139-154), 2003. [Kara and Asem, 2003] Kara, A. & Asem, Y. On Categorical Notions in Cryptography. Journal of Three Dimensional Images, 17 1 (176-180), 2003. [Kasangian et al., 1997] Kasangian, S., Kelly, G. M. & Vighi, V. A bicategorical approach to information flow and security. Categorical studies in Italy, Perugia, 2 (99-122),1997 [Kelly, 1982] Kelly, G. M. Basic concepts on enriched category theory, Lecture Notes in Mathematics, Cambridge University Press, 1982. Reprinted in Reprints in Theory

237

and Applications of Category Theory, 10 (1-143), 2005 http://www.emis.de/journals/TAC/reprints/articles/10/tr10abs.html [Kelly, 2005] Kelly, G. M. Basic concepts on enriched category theory. Reprints in Theory and Applications of Category Theory, 10 (1-143), 2005. [Kesh et al., 2002] Kesh, S., Ramanujan, S. & Nerur, S. A framework for analyzing ecommerce security. Information Management & Computer Security, 10 4 (149158), 2002. [King and CBCP, 2003] King, D. L. & CBCP Moving towards a business continuity culture. Network Security, 2003 1 (12-17), 2003. [King, 2003] King, S. Threats and solutions to Web services security. Network Security, 9 (8-11), 2003. [King, 2004] King, S. Applying application security standards - a case study. Computers & Security, 23 (17-21), 2004. [Kokolakis and Kiountouzis, 2000] Kokolakis, S. A. & Kiountouzis, A. E. Achieving interoperability in a multiple security policies environment. Computers & Security, 18 4 (267-281), 2000. [Kranakis and Santoro, 2001] Kranakis, E. & Santoro, N. Distributed computing on oriented anonymous hyper-cubes with faulty components. Distributed Computing, 14 (185-189), 2001. [Kuhnhauser, 1999] Kuhnhauser, E. W. Policy groups. Computers & Security, 18 4 (351363), 1999. [Kuo and Humenn, 2002] Kuo, C. J. & Humenn, P. Dynamically authorized role-based access control for secure distributed computation,In Proceedings of Proceedings of the 2002 ACM workshop on XML security, Fairfax, VA, ACM Press (97-103), 2002. [Labuschagne and Eloff, 2000] Labuschagne, L. & Eloff, H. P. J. Electronic commerce the information security challenge. Information Management & Computer Security, 8 3 (154-157), 2000. [Lam et al., 2003] Lam, K. Y., Chung, S. L., Gu, M. & Sun, J. G. Security middleware for enhancing interoperability of public key infrastructrure. Computers & Security, 22 6 (535-546), 2003. [Lambek, 1980] Lambek, J. From λ-calculus to Cartesian Closed Categories. In Hindley, R. J. & Seldin, P. J. (Eds.) 
Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, Academic Press, (375-402),1980 [Lambek and Scott, 1986] Lambek, J. & Scott, P. J. Introduction to higher order categorical logic. Cambridge Studies in Advanced Mathematics, Cambridge University Press, 7,1986 [Lamport, 1978] Lamport, L. Time, clocks and the ordering of events in a distributed system. Communications of the ACM, 21 (558-565), 1978. [Lamport, 1986] Lamport, L. On interprocess communication: Part 1 - basic formalism and Part 2 - Algorithm. Distributed Computing, (77-101), 1986. [LATE Group, 2002] LATE Group The development of open, federated specification for network identity. Information Security Technical Report, 7 3 (55-64), 2002. [Lawvere, 1963] Lawvere, F. W. Functorial semantics of algebraic theories. Ph.D Thesis, Columbia University, New York,1963

238

[Lawvere, 1964] Lawvere, F. W. An elementary theory on the category of sets. Proceedings of the National Academy of Science, 52 6 (1506-1511), 1964. Reprinted with comments by the author and by Colin McLarty (long version) in Reprints in Theory and Applications of Category Theory, 11 (1-35), 2005 http://tac.mta.ca/tac/reprints/articles/11/tr11abs.html [Lawvere, 1969a] Lawvere, F. W. Adjointness in Foundations. 23 (281-296), 1969a. Reprinted with commentary in Reprints in Theory and Applications of Category Theory, 16 (1-16), 2006 http://tac.mta.ca/tac/reprints/articles/16/tr16abs.html [Lawvere, 1986] Lawvere, F. W. Taking categories seriously. Revista Colombiana De Mathematica XX, (147-178), 1986. Reprinted with author commentary in Reprints in Theory and Applications of Category Theory, 8 (1-24), 2005 http://tac.mta.ca/tac/reprints/articles/8/tr8abs.html [Leach, 2003] Leach, J. Improving user security behaviour. Computers & Security, 22 8 (685-692), 2003. [Leach, 2004] Leach, J. TBSE - an engineering approach to the design of accurate and reliable security systems. Computers & Security, 23 (22-28), 2004. [Lee and Lee, 2002] Lee, J. & Lee, Y. A holistic model of computer abuse within organizations. Information Management & Computer Security, 10 2 (57-63), 2002. [Lee et al., 2002] Lee, Y., Lee, J. & Lee, Z. Integrating software lifecycle process standards with security engineering. Computers & Security, 21 4 (345-355), 2002. [Leinster, 2002] Leinster, T. A survey of definitions of n-category. Theory and Applications of Categories, 10 1 (1-70), 2002. [Leinster, 2004] Leinster, T. Higher operads, higher categories, London Mathematical Society Lecture Notes, Cambridge University Press, 2004. Reprinted as Operads in higher-dimensional category theory in Reprints in Theory and Applications of Category Theory, 12 3 (73-194), 2004 http://arxiv.org/abs/math.CT/0305049, http://tac.mta.ca/tac/volumes/12/3/12-03abs.html [Levy, 2005] Levy, P. B. 
Adjunction models for call-by-push-value with stacks. Theory and Applications of Categories, 14 5 (75-110), 2005. http://tac.mta.ca/tac/volumes/14/5/14-05abs.html [Li et al., 2003] Li, L. H., Tzenga, S. F. & Hwangb, M. S. Generalization of proxy signature-based on discrete logarithms. Computers & Security, 22 3 (245-255), 2003. [Lillywhite, 2004] Lillywhite, P. T. Implementing BS7799 in the UK National Health System. Computer Fraud & Security, 2004 2 (4-8), 2004. [Lin, 2001] Lin, C. H. Hierarchical key assignment without public key cryptography. Computers & Security, 20 7 (612-619), 2001. [Lin et al., 2003] Lin, I. C., Hwang, M. S. & Chang, C. C. Security enhancement for anonymous secure e-voting over a network. Computer Standards & Interfaces, 25 (131-139), 2003. [Lopez et al., 2003] Lopez, J., Mana, A., Ortega, J. J., Troya, M. J. & Yague, I. M. Integrating PMI services in CORBA applications. Computer Standards & Interfaces, 25 (391-403), 2003. [Lothian and Wenham, 2001] Lothian, P. & Wenham, P. Database security in a web environment. Information Security Technical Report, 6 2 (12-20), 2001. 239

[Mac Lane, 1998] Mac Lane, S. Categories for the working mathematician, Graduate Texts in Mathematics, Axler, S., Gehring, W. F. & Ribet, A. K. (Eds.), 2nd ed., Springer-Verlang, New York, 1998. [Mamaghani, 2002] Mamaghani, F. Evaluation and selection of an anti-virus and content filtering software. Information Management & Computer Security, 10 1 (28-32), 2002. [Mason, 2003] Mason, S. Electronic security is a continuous process. Computer Fraud & Security, 2003 1 (13-15), 2003. [Mason, 2000] Mason, T. Platform security and Common Criteria. Information Security Technical Report, 5 1 (14-25), 2000. [Mattern, 1989] Mattern, F. Virtual time and global states of distributed system. International workshop on Parallel and Distributed Algorithms, (215-226),1989 [Matthews, 2000] Matthews, S. Authorization models - PKI versus the real world. Information Security Technical Report, 5 4 (66-71), 2000. [May and Lausen, 2004] May, W. & Lausen, G. A uniform framework for integration of information from the web. Information Systems, 29 59-91, 2004. [Mayes and Markantonakis, 2003] Mayes, K. & Markantonakis, K. Are we smart about security? Information Security Technical Report, 8 1 (6-16), 2003. [McCullough, 1988] McCullough, D. Foundations of Ulysses: The Theory of Security. Odyssey Research Associates, Inc.,Technical Report RADC-TR-87-222,1988 [McHugh, 2001] McHugh, J. Intrusion and intrusion detection. International Journal of Information Security, 1 1 (14-35), 2001. [McLean, 1985] McLean, J. A comment on the `basic security theorem' of Bell and LaPadula. Information Processing Letters, 20 2 (67-70), 1985. [Midian, 2003] Midian, P. How to ensure an effective penetration test. Information Security Technical Report, 8 4 (65-77), 2003. [Milner, 1989] Milner, R. A. Communicating and concurrency, Prentice Hall, New York, 1989. [Moggi, 1991] Moggi, E. Notion on computation and monads. Information and Computation, 93 (55-92), 1991. [Moitra and Konda, 2004] Moitra, D. S. & Konda, L. S. 
An empirical investigation of network attacks on computer systems. Computers & Security, 23 (43-51), 2004. [Moona et al., 2004] Moona, C. J., Parkb, D. H., Parkc, S. J. & Baik, D. K. Symmetric RBAC model that takes the separation of duty and role hierarchies into consideration. Computers & Security, 23 (126-136), 2004. [Morau, 2004] Morau, T. An information security framework addressing the initial cryptrographic key authentication challenges. CONNOTECH Experts-conceils inc,2004 [Morgan et al., 1999] Morgan, G., Shrivastava, S., Ezhilchelvan, D. P. & Little, C. M. Design and Implementation of a CORBA Fault-Tolerant Object Group Service,In Proceedings of Second IFIP WG 6.1 International Working Conference Distributed Applications and Interoperable Systems, 1999. [Naor and Levy, 2003] Naor, Z. & Levy, H. Announced dynamic access probability (DAP) protocol for next generation wireless networks. Computers & Networks, 41 (527-544), 2003.

240

[National Bureau of Standards, 1977] National Bureau of Standards Data Encryption Standard (DES), Federal Information Processing Standards No. 46. Washington DC: US National Bureau of Standards,1977 [Needham, 1990] Needham, M. R. Names. In Mullender, S. (Ed.) Distributed Systems. ACM Press,89-101, 1990. [Needham and Schroeder, 1978] Needham, M. R. & Schroeder, D. M. Using Encryption for Authentication in Large Networks of Computers. Communications of the ACM, 21 12 (993-999), 1978. [Netscape, 1996] Netscape SSL 3.0 Specification- Secure Sockets Layer. SSL Digital Certificates,1996 http://www.ssl.com [NIST, 1996] NIST An introduction to computer security, The NIST Handbook, National Institute of Standards and Technology, 1996. [OECD, 2002] OECD Guidelines for the security of information systems: towards a culture of security. OECD Council,2002 [OECD, 2005] OECD Security System Reform and Governance. DAC Guidelines and Reference Series,2005 http://www.oecd.org/dataoecd/8/39/31785288.pdf [Oh and Park, 2003] Oh, S. & Park, S. Task Role-based access control model. Information Systems, 28 (533-562), 2003. [OMG a, 2001] OMG a CORBA/IIOP Specification.2001 http:/www.omg.org/technology/documents/formal/corba_iiop.htm [Ozsu and Valduriez, 1999] Ozsu, M. T. & Valduriez, P. Principles of distributed database systems, 2nd ed., Prentice-Hall Inc, 1999. [Pan et al., 2003] Pan, J., Hou, Y. T. & Li, B. An overview of DNS-based server selections in content distribution networks. Computers & Networks, 43 (695-711), 2003. [Parrington et al., 1995] Parrington, D. G., Shrivastava, S., Wheater, M. S. & Little, C. M. The design and implementation of Arjuna. USENIX Computing Systems Journal, 8 3 (255-308), 1995. [Patel, 2001] Patel, A. Access control mechanisms in digital services. Computer Standards & Interfaces, 23 (19-28), 2001. [Paterson, 2003] Paterson, G. K. A comparison between traditional Public Key infrastructures and identity-based cryptography. 
Information Security Technical Report, 8 3 (57-72), 2003. [Paterson, 2002] Paterson, K. G. Cryptography from pairings: a snapshot of current research. Information Security Technical Report, 7 3 (41-54), 2002. [Peyravian et al., 2004] Peyravian, M., Roginsky, A. & Zunic, N. Non-PKI methods for public key distribution. Computers & Security, 23 (97-103), 2004. [Pieprzyk et al., 2003] Pieprzyk, J., Hardjono, T. & Seberry, J. Fundamentals of Computer Security, Springer-Verlang, 2003. [Pierce, 1991] Pierce, C. B. Basic Category Theory for Computer Scientists, Foundations in Computing, Michael, G. & Albert, M. (Eds.), 1st ed., MIT Press, Massachusetts, 1991. [Pitt et al., 1985] Pitt, D., Abramsky, S., Poigne, A. & Rydeheard, E. D. (Eds.) Category Theory and Computer Programming, Guildford, UK, Springer-Verlang, 1985. [Pounder, 2003] Pounder, C. Security with unfortunate side effects. Computers & Security, 22 2 (115-118), 2003. 241

[Power and Tourlas, 2001] Power, J. & Tourlas, K. An algebraic foundation for graphbased diagrams in computing,In Proceedings of MFPS 2001- Seventeenth Conference on the Mathematical Foundations of Programming Semantics, Aarhus, Denmark, Electronic Notes in Theoretical Computer Science, 45, 2001. [Purser, 2001] Purser, S. A simple graphical tool for modelling trust. Computers & Security, 20 (479-484), 2001. [Raptis et al., 2001] Raptis, K., Spinellis, D. & Katsikas, S. Multi-technology distributed objects and their integration. Computer Standards & Interfaces, 23 (157-168), 2001. [Reid and Floyd, 2001] Reid, C. R. & Floyd, A. S. Extending the risk analysis model to include market-insurance. Computers & Security, 20 4 (331-339), 2001. [Reynolds, 1980] Reynolds, J. Using category theory to design implicit conversions and generic operators,In Proceedings of Proceedings of the Aarchus Workshop on Semantics-Directed Compiler Generation, Aarchus, Holland, Jones, D. N. (Ed.), Springer-Verlang, 94 (211-258), 1980. [Rivest et al., 1978] Rivest, L. R., Shamir, A. & Adelman, L. A method of obtaining digital signatures and public key cryptosystems. Communications of the ACM, 21 2 (120-126), 1978. [Rossiter and Heather, 2005] Rossiter, N. & Heather, M. Conditions for interoperability,In Proceedings of 7th International Conference on Enterprise Information Systems (ICEIS), Florida, USA, (92-99), 2005. [Rossiter and Heather, 2006] Rossiter, N. & Heather, M. Free and Open Systems Theory,In Proceedings of 18th European Meeting on Cybernetics and Systems Research (EMCSR-2006), University of Vienna, Trappl, R. (Ed.), 1 (27-32), 2006. [Rossiter et al., 2006] Rossiter, N., Heather, M. & Sisiaridis, D. Process as a World Transaction,In Proceedings of Proceedings of ANPA 27, Cambridge University, (36pp), 2006. [Rossiter et al., 2010] Rossiter, N., Heather, M. & Sisiaridis, D. Information Systems and the Physical World, to be submitted following presentation in ANPA 31, (35pp), 2010. 
[Rousset and Reynaud, 2004] Rousset, M. & Reynaud, C. Knowledge representation for information integration. Information Systems, 29 (3-22), 2004. [Ruppel et al., 2003] Ruppel, C., Underwood-Queen, L. & Harrington, J. S. E-commerce - the roles of trust, security, and type of e-commerce involvement. e-Service Journal, 2 2, 2003. [Rydeheard and Burstall, 1988] Rydeheard, E. D. & Burstall, M. R. Computational Category Theory, 1st ed., Prentice Hall, 1988. [Sattler et al., 2003] Sattler, K. U., Conrad, S. & Saake, G. Interactive example-driven integration and reconciliation for accessing database federations. Information Systems, 28 (393-414), 2003. [Schultz et al., 2001] Schultz, E. E., Proctor, W. R., Lien, M. C. & Salvendy, G. Usability and security - an appraisal of usability issues in information security methods. Computers & Security, 20 7 (620-634), 2001. [Schwarz and Mattern, 1994] Schwarz, R. & Mattern, F. Detecting causal relationships in distributed computations: in search of the Holy Grail. Distributed Computing, 7 (149-74), 1994. 242

[Schweimeier, 2001] Schweimeier, R. Categorical and Graphical Models of Programming Languages. Thesis, University of Sussex, PhD,2001 [Sedov, 2000] Sedov, V. Certificate classes. Information Security Technical Report, 5 4 (72-80), 2000. [Seldin and Hindley, 1980] Seldin, P. J. & Hindley, R. J. To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, 1st ed., Academic Press, London-New York, 1980. [Senicar et al., 2003] Senicar, V., Jerman-Blazic, B. & Klobucar, T. Privacy-enhancing technologies - approaches and development. Computer Standards & Interfaces, 25 (147-158), 2003. [Shen and Chen, 2002] Shen, R. L. V. & Chen, T. S. A novel management scheme based on discrete logarithms and polynomial interpolations. Computers & Security, 21 2 (164-171), 2002. [Sherif, 2003] Sherif, S. J. Intrusion detection - the art and the practice, part I. Information Management & Computer Security, 11 4 (175-186), 2003. [Sherif and Ayers, 2003] Sherif, S. J. & Ayers, R. Intrusion detection - methods and systems, part ii. Information Management & Computer Security, 11 5 (222-229), 2003. [Sherif and Gilliam, 2003] Sherif, S. J. & Gilliam, P. D. Deployment of anti-virus software - a case study. Information Management & Computer Security, 11 3 (510), 2003. [Sherwood, 1996] Sherwood, J. SALSA: a method for developing the enterprise security architecture and strategy. Computers & Security, 15 6 (501-506), 1996. [Shirey, 2000] Shirey, R. http://www.rfc.net/rfc2828.html. Internet Security Glossary,2000 [Shrivastava, 2005] Shrivastava, S. Contract-Mediated Interorganizational Interactions. IEEE DISTRIBUTED SYSTEMS ONLINE 1541-4922, 6 11 (11), 2005. [Shrivastava and Wheater, 1998] Shrivastava, S. & Wheater, M. S. A Transactional Workflow Based Distributed Application Composition and Execution Environment,In Proceedings of the 8th ACM SIGOPS European workshop on Support for composing distributed applications, Sintra, Portugal, ACM (74-81), 1998. [Siponen, 2000] Siponen, T. M. 
Critical analysis of different approaches to minimizing user-related faults in information systems security: implications for research and practice. Information Management & Computer Security, 8 5 (197-209), 2000. [Sisiaridis et al., 2008] Sisiaridis, D., Rossiter, N. & Heather, M. A Holistic Security Architecture for Distributed Information Systems - A Categorical Approach,In Proceedings of EMCSR-2008: European Meeting on Cybernetics and Systems Research, Symposium Mathematical Methods in Cybernetics and Systems Theory, University Vienna, 25-28 March, I (52-57), 2008. [Sklavos and Koufopavlou, 2003] Sklavos, N. & Koufopavlou, O. Data dependent rotations, a trustworthy approach for future encryption system/ciphers: low cost and high performance. Computers & Security, 22 7 (585-588), 2003. [Skoularidou and Spinellis, 2003] Skoularidou, V. & Spinellis, D. Security architectures for network clients. Information Management & Computer Security, 11 2 (84-91), 2003. 243

[Smith and Eloff, 2002] Smith, E. & Eloff, H. P. J. A prototype for assessing information technology risks in Health Care. Computers & Security, 21 3 (266-284), 2002. [Sodiya et al., 2004] Sodiya, S. A., Longe, D. O. H. & Akinwale, T. A. A new two-tiered strategy to intrusion detection. Information Management & Computer Security, 12 1 (27-44), 2004. [Spaccarpietra et al., 1992] Spaccarpietra, S., Parent, C. & Dupont, Y. Model independent assertions for integration of heterogeneous schemas. The International Journal on Very Large Data Bases (VLDB), 1 1 (81-126), 1992. [Srinivas and Jullig, 1995] Srinivas, U. Y. & Jullig, R. Specware:Formal Support for Composing Software,In Proceedings of Mathematics of Program Construction, Springer-Verlang, 947 (399-422), 1995. [Stallings, 2002] Stallings, W. Cryptography and Nework Security: principles and practice, 3d ed., Prentice Hall, 2002. [Stephenson, 2003] Stephenson, P. Applying forensic techniques to information system risk management - first steps. Computer Fraud & Security, 2003 12 (17-19), 2003. [Stephenson, 2003] Stephenson, P. Manual link analysis and trace back. Computer Fraud & Security, 2003 6 (17-20), 2003. [Stephenson, 2004] Stephenson, P. Applying impact and vulnerability analysis to risk management. Computer Fraud & Security, 2004 2 (16-20), 2004. [Stephenson, 2004] Stephenson, P. Policy domain mapping. Computer Fraud & Security, 2004 4 (17-20), 2004. [Tanenbaum and Van Steen, 2002] Tanenbaum, S. A. & Van Steen, M. Distributed Systems - Principles and Paradigms, 1st ed., Prentice Hall, 2002. [Taylor, 1993] Taylor, P. An Exact Interpretation of while,In Proceedings of Proceedings of the First Imperial College Department of Computing Workshop on Theory and Formal Methods, Isle of Thorns Conference Centre, Chelwood Gate, Sussex, UK, Springer-Verlang (302-313), 1993. [TCSEC, 1985] TCSEC Trusted Computer System Evaluation Criteria. National Computer Security Center,1985 [Tickle, 2002] Tickle, I. 
Data integrity assurance in a layered security strategy. Computer Fraud & Security, 2002 10 (9-13), 2002. [Todd et al., 2002] Todd, M., Colwill, C., Allen, D. & Btexact Benchmarking for critical infrastructure protection. Information Security Technical Report, 7 2 (37-49), 2002. [Trompeter and Eloff, 2001] Trompeter, M. C. & Eloff, H. P. J. A framework for the implementation of socio-ethical controls in information security. Computers & Security, 20 (384-391), 2001. [Tsaur and Horng, 2001] Tsaur, W. J. & Horng, S. J. Auditing causal relationships of group multicast communications in group-oriented distributed systems. The Journal of Supercomputing, 18 (25-45), 2001. [Turing, 1937] Turing, A. M. On computable numbers, with an application to the Entcheidungsproblem. Proceedings of the Londond Mathematical Society, 2 42 (230-265), 1937. Reprinted by Davis, Martin (ed), in The undecidable: basic papers on undecidable propositions, unsolvable problems, and computable functions, (116 - 154), Hewlett, New York, Raven Press, 1965 244

[Usher, 2003] Usher, M. Certificate policies and certification practice statements - a practical approach. Information Security Technical Report, 8 3 (14-22), 2003. [Van der Haar and Von Solms, 2003] Van der Haar, H. & Von Solms, R. A model for deriving information security control attributed profiles. Computers & Security, 22 3 (233-244), 2003. [Venter and Eloff, 2003] Venter, H. S. & Eloff, J. H. P. A taxonomy for information security technologies. Computers & Security, 22 4 (299-307), 2003. [Vermeulen and Von Solms, 2002] Vermeulen, C. & Von Solms, R. The information security management toolbox - taking the pain out of the security management. Information Management & Computer Security, 10 3 (119-125), 2002. [W3C a, 2001] W3C a Pages on SOAP. World Wide Web Consortium,2001 http://www.w3.org/TR/SOAP/ [W3C b, 2001] W3C b Pages on the Semantic Web. World Wide Web Consortium,2001 http://www.w3.org/2001/sw/ [W3C c, 2002] W3C c Pages on Web Services Description Language (WSDL). World Wide Web Consortium,2002 http://www.w3.org/2002/ws/desc/ [W3C d, 2001] W3C d XML - a technical recommendation of the W3C. World Wide Web Consortium,2001 http://www.w3.org/TR/REC-xml [Ward and Smith, 2002] Ward, P. & Smith, C. L. The development of access control policies for information technology systems. Computers & Security, 21 4 (356371), 2002. [Wheeler and Needham, 1994] Wheeler, J. D. & Needham, M. R. TEA, a Tiny Encryption Algorithm. Two Cryptographic Notes,Technical Report 355, Computer Laboratory, University of Cambridge, (1-3),1994 http://www.ftp.cl.cam.ac.uk/ftp/papers/ [Williamson and Healy, 1999] Williamson, K. & Healy, M. Industrial Applications of Software Synthesis via Category Theory,In Proceedings of Proceedings of the 14th IEEE international conference on Automated software engineering, IEEE Computer Society Washington, DC, USA (35), 1999. [Wilson, 2003] Wilson, P. Top-down versus bottom-up different approaches to security. Network Security, 2003 12 (17-19), 2003. 
[Winn, 2005] Winn, N. Holistic Security in a Fragmented Governance Structure? 2005. http://www.esrcsocietytoday.ac.uk/ESRCInfoCentre/Plain_English_Summaries/governance_and_citizenship/global_governance/index11.aspx?ComponentId=15656&SourcePageId=11713

[Winskel and Nielsen, 1995] Winskel, G. & Nielsen, M. Models for concurrency. Handbook of Logic in Computer Science, Oxford University Press, New York, 4 (1-148), 1995.

[Wiseman, 2001] Wiseman, S. Database Security: retrospective and way forward. Information Security Technical Report, 6 2 (30-43), 2001.

[Wolfe-Wilson and Wolfe, 2003] Wolfe-Wilson, J. & Wolfe, H. B. Management strategies for implementing forensic security measures. Information Security Technical Report, 8 2 (55-64), 2003.

[Wong et al., 2003] Wong, W. E., Sugeta, T., Li, J. J. & Maldonado, C. J. Coverage testing software architectural design in SDL. Computers & Networks, 42 (359-374), 2003.

[Wood, 2004] Wood, C. C. Why information security is now multi-disciplinary, multi-departmental, and multi-organizational in nature. Computer Fraud & Security, 2004 1 (16-17), 2004.

[Ye, 2001] Ye, N. Robust intrusion tolerance in information systems. Information Management & Computer Security, 9 1 (38-43), 2001.

[Yeh et al., 2002] Yeh, Y. S., Lai, W. S. & Cheng, C. J. Applying Lightweight Directory Access Protocol (LDAP) service on session certification authority. Computers & Networks, 38 (675-692), 2002.

[Yoon, 1993] Yoon, H. H. D. The categorical framework of object-oriented concurrent systems. Computers & Mathematics with Applications, 25 2 (33-38), 1993.

[Zenkin, 2001] Zenkin, D. Fighting against the invisible enemy - methods for detecting an unknown virus. Computers & Security, 20 (316-321), 2001.

[Zuccato, 2004] Zuccato, A. Holistic security requirement engineering for electronic commerce. Computers & Security, 23 (63-76), 2004.

[Zuccato, 2007] Zuccato, A. Holistic security management framework applied in electronic commerce. Computers & Security, 26 3 (256-265), 2007.


APPENDIX A - PKI infrastructure

According to X.509, a public-key certificate is an electronic document signed by a Certification Authority (CA). It consists of a public key together with a string identifying the entity with which that key is associated. A CA is the basic component of a PKI. A PKI also includes a registration authority (RA), a PKI directory, a certificate policy [King and CBCP], and a certification practice statement that contains the rules specific to each CA. The same document defines attribute certificates, which carry attribute information about a subject. A particular type of attribute certificate, the 'role attribute certificate', carries information about roles assigned to the certificate subject in accordance with the subject's responsibilities and capabilities.

Certification Authorities (CAs) are also known as Key Distribution Centres (KDCs), or Authentication Services (ASs). They are special trusted services, which share a secret key with each of the hosts, but no pair of hosts is required to have a shared secret key as well. The Needham-Schroeder KDS [1978], used in the Kerberos system, and the Diffie-Hellman KDS are the best-known examples. The latter is only suitable for point-to-point communication.

X.509 is based on the global uniqueness of distinguished names, a goal that is impractical, as it does not reflect current legal and commercial practice. An alternative to X.509 is the Simple Public-Key Infrastructure (SPKI) [IETF, 1997]. It enables chains of certificates to be processed using logical inference to produce derived certificates. Chadwick & Basden [2001] have built an expert system that calculates the amount of trust, or trust quotient, which one can place in the name-to-public-key binding in a certificate, based on RFC 2527.
Another approach [Usher, 2003] considers the factors that influence the implementation of a documented infrastructure that will support the delivery of PKI services and the understanding of trust that can be placed on those services. A simple, graphical tool for modelling trust is presented by Purser [2001]; it can be used to model PKI, using trust diagrams. Lam et al. [2003] designed a middleware architecture to enhance the interoperability of PKI applications.

A quantitative evaluation study of X.509-compliant PKI using queuing models [Bruschi et al., 2003] focused on the performance of the subsystem in terms of generating and managing digital certificates as well as implementing revocation mechanisms and auditing activities. The results show that the subsystem can guarantee acceptable response times with a consistent number of users. In addition, throughput must not exceed 3.5 requests per second, where a request can be a certificate generation or a revocation request. Brands [2002] argues that X.509-style PKI fails in the context of access management, and presents a solution based on Digital Credentials. Peyravian et al. [2004] propose three methods that reduce the complexity of the operations and save storage and bandwidth in public key distribution, as an alternative to X.509 CA PKI.

An important issue concerning certificates is their longevity. Mechanisms to revoke a certificate, by making it publicly known that the certificate is no longer valid, include the use of Certificate Revocation Lists (CRLs), published regularly by the certification authority, as well as the use of other methods to restrict the lifetime of a certificate.
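The two longevity mechanisms mentioned above, a restricted lifetime and revocation through a CRL, can be illustrated with a minimal sketch. The `Certificate` class, the `is_valid` function and all sample values are hypothetical simplifications introduced here for illustration; they are not part of X.509 or of any PKI library, and signature verification is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Certificate:
    """Hypothetical, simplified stand-in for a public-key certificate."""
    serial: int
    subject: str
    not_before: datetime
    not_after: datetime


def is_valid(cert, crl, now):
    """Accept a certificate only inside its lifetime and if it is not revoked.

    `crl` is modelled as a plain set of revoked serial numbers (an assumption).
    """
    within_lifetime = cert.not_before <= now <= cert.not_after
    revoked = cert.serial in crl
    return within_lifetime and not revoked


issued = datetime(2010, 1, 1)
cert = Certificate(serial=42, subject="CN=alice",
                   not_before=issued,
                   not_after=issued + timedelta(days=365))

crl = set()                      # serial numbers the CA has published as revoked
check_time = datetime(2010, 6, 1)

valid_before_revocation = is_valid(cert, crl, check_time)
crl.add(cert.serial)             # the CA publishes an updated CRL
valid_after_revocation = is_valid(cert, crl, check_time)
valid_after_expiry = is_valid(cert, set(), datetime(2011, 6, 1))
```

In this sketch the same certificate is accepted before revocation and rejected both after its serial number appears in the CRL and after its lifetime has elapsed.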


APPENDIX B – RBAC CONCEPTS

Flat: User-role and permission-role assignment can be many-to-many.

Hierarchical: A 'role hierarchy' is a partial order defining a seniority relation between roles, whereby senior roles acquire the permissions of their juniors.

Constrained: Enforces Separation of Duty [Sodiya et al.], i.e. the partitioning of tasks and associated privileges among different roles in order to prevent a single user obtaining too much authority. SoD can be static (SSD) or dynamic (DSD).

Symmetric: Adds the permission-role review requirement, similar to the role-review requirement in flat RBAC, whereby the roles to which a particular permission is assigned can be determined, as well as the permissions assigned to a specific role.

Table A-1: Symmetric RBAC with constraints, based on RBAC 96

$R$: a set of roles $\{r_1, \ldots, r_n\}$

$U$: a set of users $\{u_1, \ldots, u_m\}$

$P$: a set of permissions $\{p_1, \ldots, p_o\}$

$UA \subseteq U \times R$: a many-to-many user-to-role assignment relation

$PA \subseteq P \times R$: a many-to-many permission-to-role assignment relation

$RH \subseteq R \times R$: a role hierarchy (partial order)

$users : R \to 2^U$: a function mapping each role $r_a$ to a set of users ($1 \leq a \leq n$)

$users(r_a) = \{u \in U \mid (u, r_a) \in UA\}$: a function returning all users assigned to $r_a$

$perms : R \to 2^P$: a function mapping each role $r_a$ to a set of permissions

$perms(r_a) = \{p \in P \mid (p, r_a) \in PA\}$: a function returning all permissions assigned to $r_a$

$roles : P \to 2^R$: a function mapping each permission $p_b$ to a set of roles ($1 \leq b \leq o$)

$roles(p_b) = \{r \in R \mid (p_b, r) \in PA\}$: a function returning all roles $p_b$ is assigned

$juniors : R \to 2^R$: a function mapping each role $r_a$ to a set of roles junior of $r_a$

$juniors(r_a) = \{r \in R \mid (r, r_a) \in RH\}$: a function returning all roles junior to $r_a$

$seniors : R \to 2^R$: a function mapping each role $r_a$ to a set of roles senior of $r_a$

$seniors(r_a) = \{r \in R \mid (r_a, r) \in RH\}$: a function returning all roles senior to $r_a$

Table A-2: The basic elements and functions of RBAC96
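The set-theoretic definitions of Table A-2 translate almost directly into set comprehensions. The following is a minimal sketch only: the user, role and permission names are hypothetical sample data, `RH` is assumed to be stored as (junior, senior) pairs of the immediate hierarchy rather than the full partial order, and `effective_perms` is an illustrative helper (not an RBAC96 function) showing how senior roles acquire the permissions of their juniors.

```python
# Hypothetical sample data; only the relations UA, PA and RH come from RBAC96.
UA = {("alice", "manager"), ("bob", "clerk"), ("carol", "clerk")}
PA = {("approve_order", "manager"), ("enter_order", "clerk")}
RH = {("clerk", "manager")}      # assumed layout: (junior, senior) pairs


def users(r):
    """All users assigned to role r."""
    return {u for (u, role) in UA if role == r}


def perms(r):
    """All permissions directly assigned to role r."""
    return {p for (p, role) in PA if role == r}


def roles(p):
    """All roles to which permission p is assigned."""
    return {r for (perm, r) in PA if perm == p}


def juniors(r):
    """All roles junior to r, i.e. every j with (j, r) in RH."""
    return {j for (j, s) in RH if s == r}


def seniors(r):
    """All roles senior to r, i.e. every s with (r, s) in RH."""
    return {s for (j, s) in RH if j == r}


def effective_perms(r):
    """Illustrative helper: permissions of r plus those inherited from juniors."""
    result = set(perms(r))
    for j in juniors(r):
        if j != r:               # skip reflexive pairs if RH contains them
            result |= effective_perms(j)
    return result
```

With this data, `roles` and `perms` together provide the permission-role review of symmetric RBAC, and `effective_perms("manager")` contains both the manager's own permission and the one inherited from the clerk role.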


Separation of Duty handling in TRBAC models

Separation of Duty in constrained RBAC models is represented in terms of conflicting roles instead of conflicting permissions. Thus, the notion of mutual exclusion is introduced, which identifies exclusive pairs of roles ($ME \subseteq R \times R$). In a variation of constrained RBAC, the Temporal Constrained RBAC, roles can be enabled in some periods and not enabled in others. The 'status assignment' function ($SA : R \to \{enabled, disabled\}$) associates an enabling status with roles. An event E is either the enabling or the disabling of a role. Each event has an associated priority. A periodic event (PE: ) is a prioritized event with an associated temporal interval of validity. Role triggers (RT) allow one to specify enabling/disabling dependencies among roles. A role trigger (RT: ) is characterized by a list of preconditions for the activation of a trigger, a prioritized event and two optional clauses, the 'after' and 'tolerance' clauses, which influence the time the event takes place.
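A minimal sketch of the two ideas above, mutual exclusion between roles and a time-dependent status assignment, is given below. Everything concrete in it is an assumption made for illustration: the role names, the representation of a validity interval as hours of the day, and the rule that the highest-priority event in force determines the status. Role triggers and their 'after' and 'tolerance' clauses are not modelled.

```python
from dataclasses import dataclass

# ME: mutually exclusive pairs of roles (hypothetical sample data).
ME = {frozenset({"purchaser", "approver"})}


def violates_sod(assigned_roles):
    """True if the role set contains both members of an exclusive pair."""
    return any(pair <= set(assigned_roles) for pair in ME)


@dataclass(frozen=True)
class PeriodicEvent:
    """Prioritized enabling/disabling event with a validity interval."""
    role: str
    enable: bool        # True = enable the role, False = disable it
    priority: int
    start_hour: int     # assumed interval [start_hour, end_hour) within a day
    end_hour: int


def status(role, events, hour, default="disabled"):
    """SA(role) at a given hour; the highest-priority event in force wins."""
    active = [e for e in events
              if e.role == role and e.start_hour <= hour < e.end_hour]
    if not active:
        return default
    winner = max(active, key=lambda e: e.priority)
    return "enabled" if winner.enable else "disabled"


events = [
    PeriodicEvent("day_doctor", enable=True, priority=1, start_hour=8, end_hour=20),
    PeriodicEvent("day_doctor", enable=False, priority=2, start_hour=12, end_hour=13),
]
```

Here the role is enabled during the day, a higher-priority event disables it between 12:00 and 13:00, and outside every interval the default status applies.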


APPENDIX C - 'Customer-oriented' information integration

Web caching and content distribution have been proposed to address scalability issues such as delivering Internet services to a large number of users. A survey [Pan et al., 2003] covers content distribution networks and, in particular, their system-based server selection schemes for domain names.

Dependence graphs [Conte and Sichman, 2002] aim to model the object of knowledge before modelling the knowledge itself, based on dependence theory. Social structures are viewed as patterns (graphs) of objective relationships of social dependence. The approach has been applied in modelling multi-agent and group dependence.

Hofmann [2004] proposes a family of model-based algorithms, which rely on a statistical modelling technique that introduces latent class variables in order to discover user communities and prototypical interest profiles. France et al. [2002] developed a search engine integrated with data mining, in an effort to support customer-oriented information search by reducing the consumer's information search perplexity.

Using knowledge representation techniques, mediators can be built based on a single mediated schema, which describes a domain of interest, and on a set of source descriptions (mappings) expressing how the content of each source available to the system is related to the domain of interest. Two recent information integration systems based on the mediated schema are Picsel and Xyleme [Rousset and Reynaud, 2004]. May & Lausen [2004] have proposed an integrated framework for Web exploration, wrapping, data integration, and querying semi-structured data, using the Florid system.

Recommender systems are a personalized information filtering technology used to identify a set of items that will be of interest to a certain user.
User-based collaborative filtering, the most successful technology for building recommender systems, aims to learn predictive models of user preferences, interests or behaviour from community data, that is, a database of available user preferences; its computational complexity grows linearly with the number of customers. Deshpande and Karypis [2004] propose a class of model-based recommendation algorithms that first determine the similarities between the various items and then use this information to identify the set of items to be recommended.
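The two-step idea of item-based recommendation, computing item-to-item similarities first and then scoring unseen items against what a user already has, can be sketched as follows. This is a simplified illustration, not the algorithm of Deshpande and Karypis: the basket data, the choice of cosine similarity over sets of users, and the function names are all assumptions made here.

```python
from math import sqrt

# Hypothetical community data: user -> set of items the user has chosen.
baskets = {
    "u1": {"book", "pen"},
    "u2": {"book", "pen", "lamp"},
    "u3": {"lamp", "desk"},
    "u4": {"book", "lamp"},
}


def item_users(data):
    """Invert the data: item -> set of users who chose it."""
    inverted = {}
    for user, items in data.items():
        for item in items:
            inverted.setdefault(item, set()).add(user)
    return inverted


def cosine(a, b):
    """Cosine similarity of two items, each represented as a set of users."""
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(user, data, k=2):
    """Rank items the user lacks by summed similarity to the items owned."""
    inverted = item_users(data)
    owned = data[user]
    scores = {}
    for candidate, candidate_users in inverted.items():
        if candidate in owned:
            continue
        scores[candidate] = sum(cosine(candidate_users, inverted[i])
                                for i in owned)
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]
```

For user `u1`, the item `lamp` is ranked above `desk` because it shares users with both `book` and `pen`, whereas `desk` shares none. In a larger system the item similarities would be precomputed, so that recommendation time would not depend on scanning all customers.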


APPENDIX D - Workflow systems

Workflow systems are increasingly being used to streamline organizations' business processes. They use the Internet as the underlying communications infrastructure. Functional access control requirements in Workflow Systems (WS), such as the 'strict least privilege', the 'order of events', and the 'separation of duty', are often associated with Business-Process Reengineering (BPR). A workflow is a set of sequences of activities called business processes, which represent the functioning of an organization and are executed in the run-time environment. A business process is described to the workflow system by means of a process definition; the latter identifies the tasks that form part of the business process and also provides rules specifying the conditions for executing those tasks in certain roles. Existing access control frameworks for WS separate the administration-time and the run-time aspects and are based on RBAC principles using different types of agents [Botha and Eloff, 2001].

Additional elements and functions of workflow systems compared with those of RBAC96 are given in Table 311. Here, a workflow $W$ is considered as a partial order of tasks ($W = (T, \leq)$). Thus, if $t_1 \leq t_2$, then $t_1$ is executed before $t_2$, or in parallel with $t_2$. Instead of associating a user with permissions based on the user's role, the association can be based on the requirements of the workflow task. Thus, a workflow session $WS$ controls the activation of roles in the workflow environment based on the task instance with which it is currently dealing: $WS : \{(u, t\_perms(t)) \mid (\exists r \in t\_roles(t)) : u \in users(r)\}$, which supports the 'strict least privilege'. Other access control principles that can be shown here are the order of events and the Separation of Duties [Sodiya et al.].

$O$: a set of objects $\{o_1, \ldots, o_q\}$

$M$: a set of actions (methods) $\{m_1, \ldots, m_s\}$

$T$: a set of tasks $\{t_1, \ldots, t_k\}$

$TA \subseteq T \times R$: a many-to-many task-to-role assignment relation

$t\_roles : T \to 2^R$: a function mapping each task $t_c$ to a set of roles ($1 \leq c \leq k$)

$t\_roles(t_c) = \{r \in R \mid (t_c, r) \in TA\}$: a function returning all roles $t_c$ is assigned

$t\_perms : T \to 2^P$: a function mapping each task $t_c$ to a set of permissions

Table D-1: Additional elements of WS compared with RBAC96
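The workflow session and the order-of-events principle can be sketched along the same lines as the RBAC96 functions. The sample users, roles, tasks and permission names are hypothetical, as is the relation `TP` used to back `t_perms` (the table defines the function but not an underlying relation). `ORDER` is assumed to hold strict precedence pairs only, which simplifies the partial order of the text by ignoring tasks that may run in parallel.

```python
# Hypothetical sample data for illustration.
UA = {("alice", "clerk"), ("bob", "manager")}                 # user-to-role
TA = {("enter_order", "clerk"), ("approve_order", "manager")}  # task-to-role
TP = {("enter_order", "orders.write"),                         # task-to-permission
      ("approve_order", "orders.approve")}                     # (assumed relation)
ORDER = {("enter_order", "approve_order")}                     # (before, after)


def users(r):
    """All users assigned to role r."""
    return {u for (u, role) in UA if role == r}


def t_roles(t):
    """All roles to which task t is assigned."""
    return {r for (task, r) in TA if task == t}


def t_perms(t):
    """All permissions required by task t."""
    return {p for (task, p) in TP if task == t}


def workflow_session(t):
    """WS for task instance t: each eligible user gets only t's permissions."""
    return {(u, frozenset(t_perms(t)))
            for r in t_roles(t)
            for u in users(r)}


def may_start(t, completed):
    """Order of events: every task that must precede t has been completed."""
    return all(before in completed
               for (before, after) in ORDER if after == t)
```

Because permissions are derived from the task rather than from everything a role could do, a user taking part in `enter_order` receives `orders.write` and nothing else, which is the 'strict least privilege' behaviour described above.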

Organizational practice was also investigated by Gladney [1997]. He proposed a method for handling access control in digital libraries using a subject tree with an ad hoc role that grants control privileged roles and binds access control information to objects indirectly. Another scheme, derived from a case study of authentication, access control, and payment scenarios for digital libraries [Patel, 2001], is based on the X.500 standard. A transactional workflow-based distributed application composition and execution environment is presented by Shrivastava and Wheater [1998]; it is designed for interoperability, scalability, flexible task composition, dependability, dynamic reconfiguration and auditability.


APPENDIX E - CORBA security

The Common Object Request Broker Architecture (CORBA) [OMG a, 2001] is a standard tool in which a meta-language interface is used to manage interoperability among objects. The primary focus has been on client-server interactions within a relatively static resource environment. Object member access is defined using the Interface Definition Language (IDL). An Object Request Broker (ORB) is used to provide resource discovery among client objects.

The CORBA security service was designed around the ideas of the Distributed Computing Environment (DCE) and Kerberos, both of which typically work in a small closed environment where the underlying platforms and policies can be controlled. It provides interfaces for object authentication (including delegation), secure transactions, auditing and non-repudiation. CORBA security depends on a distributed Trusted Computing Base (TCB). A TCB is defined as the set of components (hardware, software, human, etc.) whose correct functioning is sufficient to ensure that the security policy is enforced. The key security features include the identification and authentication of principals, authorization and access control, delegation, auditing, integrity and confidentiality of communication between objects, non-repudiation, security policy, and security mechanism independence.

TINA is an open architecture for telecommunication services based on distributed objects. Its security architecture, CrySTINA [Gritzalis et al., 2000], is aligned with the CORBA security specification. Lopez et al. [2003] have developed an approach that integrates the services of an external Privilege Management Infrastructure (PMI) into CORBA applications using the Resource Access Decision (RAD) facility of CORBA, a mechanism used by security-aware applications to obtain authorization decisions and to manage access decision policies.


APPENDIX F - Published work

[1] Rossiter, N., Heather, M. & Sisiaridis, D. Process as a World Transaction, In Proceedings of ANPA 27, Cambridge University, (36pp), 2006.

[2] Heather, M., Rossiter, N. & Sisiaridis, D. The Semantics of Jitter in Anticipating Time Itself within NanoTechnology, In Proceedings of CASYS'07, Liège, (12pp), 2007.

[3] Sisiaridis, D., Rossiter, N. & Heather, M. A Holistic Security Architecture for Distributed Information Systems - A Categorical Approach, In Proceedings of EMCSR-2008: European Meeting on Cybernetics and Systems Research, Symposium Mathematical Methods in Cybernetics and Systems Theory, University Vienna, 25-28 March, I (52-57), 2008.

[4] Rossiter, N., Heather, M. & Sisiaridis, D. Information Systems and the Physical World, to be submitted following presentation in ANPA 31, (35pp), 2010.
