Helsinki University of Technology
Control Engineering Laboratory
Espoo, October 2004

Report 145

COMPLEX SYSTEMS: SCIENCE AT THE EDGE OF CHAOS
Collected papers of the Spring 2003 postgraduate seminar
Heikki Hyötyniemi (ed.)

Abstract: Complexity theory studies systems that are too complex to be attacked with traditional methods. The search for new tools is boosted by intuition: There seem to exist strange similarities among different kinds of complex systems, and finding a general modeling framework for all of them would be a major breakthrough. However, today's complexity research consists of a variety of approaches and methodologies that seem to be mutually incompatible; it seems that the whole field is in a huge turmoil. This report tries to give a coherent view of the field from different points of view: The role of a historical perspective is emphasized when trying to understand the methodological developments.

Keywords: Chaos theory, complexity; agents, networks, cellular automata; fractals, power law distributions; hierarchies, decentralization; emergence.

Helsinki University of Technology
Department of Automation and Systems Technology
Control Engineering Laboratory

Distribution:
Helsinki University of Technology
Control Engineering Laboratory
P.O. Box 5500
FIN-02015 HUT, Finland
Tel. +358-9-451 5201
Fax. +358-9-451 5208
E-mail: [email protected]
http://www.control.hut.fi/

ISBN 951-22-7507-4
ISSN 0356-0872

Picaset Oy, Helsinki 2005

Preface

In Spring 2003, a postgraduate course on complex systems was organized at the HUT Control Engineering Laboratory. Complex systems is a hot topic today: Complexity in technical systems and in the nature around us is overwhelming. The traditional modeling methods do not seem to be powerful enough to capture this diversity, and it seems that a paradigm shift is needed. It is complexity theory that promises to give us tools for this. But what is it all about — what is it now, and what are the future possibilities? These issues are discussed in this report. In addition to the printed version, this report is also available in PDF format through the HTML page at the Internet address http://www.control.hut.fi/hyotyniemi/publications/04_report145/. I am grateful to everybody who has contributed to this effort. — All authors are responsible for their own texts.

Contents

1 Introduction
  1.1 Attacking complexity in general ...
  1.2 ... and complexity research in particular
  1.3 About the contents of the Report
    1.3.1 Philosophical view
    1.3.2 Top-down view
    1.3.3 Bottom-up view
    1.3.4 Systems view

2 Complex Systems
  2.1 Introduction
  2.2 Wolfram's "New Kind of Science"
    2.2.1 About the book and author
    2.2.2 Basic ideas
    2.2.3 Cellular Automaton
    2.2.4 Wolfram's conclusions
    2.2.5 Criticism
    2.2.6 Provocation?
  2.3 Networked systems
    2.3.1 New findings and inventions
    2.3.2 From hierarchy to network
    2.3.3 Networked structures
    2.3.4 Nodes or only connections?
  2.4 Conclusions: A way to the New World?

3 "What Kind of Science is This?"
  3.1 Science, what is that?
  3.2 Is the age of revolutions over?
  3.3 Is the Truth good, bad or ugly?
  3.4 Cellular automata ...?
  3.5 Has the science become just a show?
  3.6 Where to go?

4 Architecture of Complex Systems
  4.1 Conceptions of Complexity
    4.1.1 Holism and Reductionism
    4.1.2 Cybernetics and General System Theory
    4.1.3 Current interest in complexity
  4.2 The architecture of complexity
    4.2.1 Hierarchic systems
    4.2.2 The evolution of complex systems
    4.2.3 Nearly decomposable systems
    4.2.4 The description of complexity
  4.3 Conclusions

5 Towards Decentralization
  5.1 Introduction
  5.2 Intelligent agents that interact
  5.3 Rationales for multiagent systems
  5.4 Multiagent systems
    5.4.1 Introduction
    5.4.2 Motivations
  5.5 Degree of decentralization
    5.5.1 A single central server
    5.5.2 Multiple mirrored servers
    5.5.3 Multiple, non-mirrored servers
    5.5.4 Totally distributed peers
  5.6 Applications
  5.7 Conclusion

6 Networks of Agents
  6.1 Introduction
    6.1.1 Reductionism
  6.2 Advent of Graph Theory
    6.2.1 Königsberg bridges
    6.2.2 Random networks
  6.3 Degrees of separation
  6.4 Hubs and connectors
  6.5 The Scale-Free Networks
    6.5.1 The 80/20 Rule
    6.5.2 Random and scale-free networks
    6.5.3 Robustness vs. vulnerability
  6.6 Viruses and fads
  6.7 The Map of Life
  6.8 Conclusions

7 Cellular Automata
  7.1 Introduction
  7.2 History of Cellular Automata
    7.2.1 Spectacular historical automata
    7.2.2 Early history of cellular automata
    7.2.3 Von Neumann's self-reproducing cellular automata
    7.2.4 Conway's Game of Life
    7.2.5 Stephen Wolfram and 1-dimensional cellular automata
    7.2.6 Norman Packard's Snowflakes
  7.3 Cellular automata in technical terms
  7.4 A mathematical analysis of a simple cellular automaton
  7.5 Applications

8 From Chaos ...

9 ... Towards New Order
  9.1 Introduction to complexity
  9.2 Self-organized criticality
    9.2.1 Dynamical origin of fractals
    9.2.2 SOC
    9.2.3 Sand pile model
  9.3 Complex behavior and measures
    9.3.1 Edge of chaos — Langton's approach
    9.3.2 Edge of chaos — another approach
    9.3.3 Complexity measures
    9.3.4 Phase transitions
  9.4 Highly optimized tolerance
  9.5 Conclusions

10 Self-Similarity and Power Laws
  10.1 Introduction
  10.2 Self-Similarity
    10.2.1 Self-organization
    10.2.2 Power laws
    10.2.3 Zipf's law
    10.2.4 Benford's Law
    10.2.5 Fractal dimension
  10.3 References

11 Turing's Lure, Gödel's Curse
  11.1 Computability theory
  11.2 Gödelian undecidability
  11.3 Turing Machine
  11.4 Computation as framework
    11.4.1 Computations in cellular automata
  11.5 The phenomena of universality
  11.6 Game of Life
    11.6.1 Making a Life computer
  11.7 Conclusion

12 Hierarchical Systems Theory
  12.1 Introduction
  12.2 Process level
    12.2.1 Sub-system models
    12.2.2 Lower-level decision units
  12.3 Upper level
    12.3.1 Upper-level decision making
    12.3.2 Sub-system decision making problem
  12.4 Coordination
  12.5 The balancing principle
  12.6 The Lagrange technique
  12.7 A toy example
  12.8 Conclusions

13 Qualitative Approaches
  13.1 Introduction
  13.2 History of system dynamics
  13.3 Beer distribution game
    13.3.1 Rules of the beer distribution game
    13.3.2 Management flight simulators
  13.4 Applications
  13.5 Basic concepts of system dynamics
    13.5.1 Stocks and flows
    13.5.2 Causal loops
    13.5.3 Equations behind the model
    13.5.4 Flight simulator
  13.6 Literature and research institutes
    13.6.1 System dynamics literature
    13.6.2 Research institutes
  13.7 Conclusions

14 Systems Theory
  14.1 General System Theory
    14.1.1 Introduction
    14.1.2 History
    14.1.3 Trends in system theory
    14.1.4 Ideas in general system theory
    14.1.5 Open vs. closed systems
    14.1.6 The system concept
  14.2 Towards a New Science of industrial automation
    14.2.1 Towards new paradigm?
    14.2.2 Theoretical issues concerning New Science
  14.3 Conclusion

Session 1
Introduction

Heikki Hyötyniemi
Helsinki University of Technology, Control Engineering Laboratory
[email protected]

In the field of complex systems everything is complex — it is even difficult to define what it means for something to be "complex". In this context, the idea of emergence is taken as the unifying principle: It is assumed that in a complex system some unanticipated higher-level functionality or structure pops up when simple low-level constructs are appropriately connected and interact. The results on the higher level cannot be predicted when only the building blocks on the lower level are seen. In this way, research on complex systems defies traditional scientific practices: It is holistic rather than reductionistic. As systems scientists (!) our ambition is to construct models of systems, however complex they happen to be. How does one attack a modeling problem that by definition defies engineering-like deterministic analysis attempts?

1.1 Attacking complexity in general ...

The first steps when constructing a model for a system, complex or not, involve outlining the system and determining its boundaries and dependency structures; after that, data is collected. Model construction involves abstraction: The available data is contracted so that only the most relevant information remains. Despite approved modeling principles, a great deal of ingenuity and intuition is necessary when the pieces are put back together.

We see various examples of complex systems around us, intuitively understanding that there is something fundamentally similar beneath the surface. However, the complex systems we see are so multifaceted that it is difficult to see what this underlying similarity is. Typically the low-level actors, whatever their physical appearance, are essentially identical, but after some kind of dynamic process involving positive feedback loops takes place, some kind of self-organization emerges in which fractal and self-similar patterns can be detected. Consider the following examples:

• In a genetic (cybernetic) system, the emergent phenomenon is the specialization of tissues and organs, resulting from the interaction between genes and enzymes.

• In a cognitive system, the emergent phenomenon is intelligence or consciousness, resulting from the interaction between individual neurons.

• In a social system, the emergent phenomenon is the birth of organizations, nations, and business enterprises, and the rivalry among them, resulting from the actions of individual humans.

• In a memetic system, the emergent phenomenon is the formation of ideologies and religions, and also scientific paradigms, resulting from an interplay between ideas.

What can be said about all complex systems in general? From the modeling point of view, one has to recognize that there exists far too little data to draw definite conclusions. The examples of complex systems are too diverse, and there is too little common ground to uniquely determine the underlying similarities. Will the "General Theory of (All) Complex Systems" remain just a dream forever?

It has been claimed that complexity theory will revolutionize old ways of thinking. Is this just empty jargon, or is there some relevance beneath the bluster? Indeed, it can be claimed that the ways of doing science will not be the same any more. The natural sciences, doing analysis on natural phenomena, will change drastically because the nature being studied will change. — What on earth does this claim mean? As compared to traditional modeling problems, where, too, phenomena in the physical environment are being explained, a deeper transition is taking place. Let us study the power of the new thinking in concrete terms:


The proponents of traditional science say that the intuitive nature and shaky definition of complex systems is a disadvantage — a pessimist would say that one cannot study complexity because one cannot define it in the first place. But one recognizes a complex system when seeing one. As optimists we can think that, luckily enough, we do not have to define what a complex system exactly is! A complex system is anything that looks like a complex system. The essence is there, no matter what the origin of the system is, whether the creature dwells in the biosphere or in the infosphere. This makes it possible to artificially generate fresh data — examples of complex behavior — for analysis purposes. Complexity as a research topic, or the search for the general rules underlying all complex systems, is perhaps a more manageable problem than studying only one complex system at a time. For example, in artificial life research one is no longer bound to the limitations of real, existing carbon-bound life forms: The science of the principles governing the universal properties of life may transcend from "life as we know it" to "life as it could be". It is easier to see the big ideas and basic principles when there exists a larger body of material to study — and this kind of wealth can be reached when artificial life forms are constructed. The computer is the tool for creating strange worlds where creatures defined in terms of simple mathematical formulae interact. Whether or not complex behaviors emerge, the boundaries of life can thus be mapped. The philosophers will also have new material to ponder: How should the age-old definitions of what life is be refined? To get a glimpse of what kinds of manifestations complex systems can have, it is necessary to look at different kinds of examples. Thus, showing examples of complex systems from different points of view is the goal of this report. Whatever the final destiny of complexity research is, one contribution that will remain is this shift from an analytic way of doing science towards a synthetic science. It is simulation that plays a central role in future science. However, simulation must not be an end in itself.

1.2 ... and complexity research in particular

One reason for the current interest in complex systems research is Stephen Wolfram's recent book "A New Kind of Science". However, complex systems research has a much longer, colorful history, its mathematics dating back to Poincaré orbits and continuing through Lorenz attractors, Mandelbrot sets, etc. No definite conclusions have ever been drawn, and the same challenges are invented once again after the previous generation of researchers has left the field. Research seems to be cyclic — at least in the branches of research that are driven by fashionable buzzwords.

Indeed, the community of complexity researchers itself constitutes a complex system: The field is a mosaic where different ideas coexist — every now and then particularly good ideas emerge from the chaos, staying alive and flourishing, becoming seeds of further great ideas. The goal here is to try to understand the field of complex systems research. What are the dynamics there? Can we estimate the future state, and perhaps even personally contribute to reaching this new understanding? How can the research cycles be made into spirals, so that when the same old ideas pop up later, they can be seen on some higher level of understanding?

When trying to understand complex systems research, history cannot be forgotten. Typically in the sciences it is the results that are of importance, and one can forget about the background with its detours and dead ends. Seen in retrospect, only the Kuhnian "paradigm shifts" remain visible. Now, on the other hand, one is facing a science in the making, and everything is in turmoil — the new stasis has not yet been reached. It is not yet clear what the hard core of the new paradigm is. Modeling of dynamic transients is much more challenging than static modeling of steady states.

Without enough understanding of the Hegelian "Spirit of Complexity", its natural dynamics and inertias, the erratic course of this research field does not seem to make much sense. This Spirit determines what is "interesting" at a given time; the Spirit is an emergent phenomenon, bound not to individual researchers but to the dynamics of the whole research community. If the Spirit is not mature enough, an idea, however good, is not understood and widely discussed, and it suffers a "memetic death", never becoming a landmark along the evolution of the research branch.

To understand the whims of this free Spirit, one first has to understand what is interesting. One has to know what kind of understanding already exists, how the current state has been reached, and what the pressing problems are at the moment. It turns out that results from complexity theory can be applied to the analysis of this interestingness issue. As shown later, the interesting areas typically reside between order and chaos. How to remain there in research work, without swaying over the borderline into hopeless disorder and chaos — this is a key question. Indeed, it is easy to do "ironic science" in the chaotic domain, introducing new and fancy ideas with no solid grounding; to remain on the boundary between order and chaos in complexity research, one explicitly has to connect new ideas to relevant mathematics. Mathematics is an emblem of order; it is simply a way to express things in a logical, consistent way, and non-mathematics is not much more than hand-waving.

Stephen Wolfram speaks of a New Science where simulation is the exclusive way to do science, and mathematics has to be abandoned altogether. However, after seeing the flood of endless images and patterns, one has to face the fact that intuition has to be drawn from more concrete analyses. The hypothesis here is that the traditional, mathematically oriented analytic science still remains, and in this way the ongoing complexity research can continue along the edge of chaos. Sticking to the central role of mathematics may sound like a harsh limitation — but, truly, this restriction only opens up new horizons. Actually, one could be more ambitious here than Stephen Wolfram is: Even if there will be no New Science, it is the whole world of observations that will be changed. Within the computer, there exists an infinite number of New Worlds to be conquered!

To summarize, simulation is a new way to obtain new data, but it is no substitute for theoretical analysis. However, traditional mathematical tools are not well suited for the new challenges, and new ones need to be developed for the analysis of complex systems. Seeing the multitude of complex systems and the approaches to attacking them may help to see the problems one faces when trying to develop such mathematics.

1.3 About the contents of the Report

The role of this report is to try to give an overall view of the field of complex systems research. The view is necessarily incomplete and biased, mainly reflecting the editor's personal view. But if the view were not incomplete, then (following the Gödelian intuition!) it would certainly be inconsistent. Different perspectives on the same thing may open up new ways of understanding, and in what follows, the ideas behind the four selected projections are briefly explained. All four projections span a continuum where nothing is black-and-white — nor is it grey, but extremely colorful!


1.3.1 Philosophical view

The field of complex systems cannot be understood if the deeply human nature of this endeavour is not taken into account. Huge promises and equally deep disappointments have characterized the turbulent developments in the field. The fascinating nature of the research draws people into deep discussions ... this kind of layman's philosophy has divided people's opinions, so that there is a continuum from euphoria to absolute denial.

1. New Kind of Science ...
It has been claimed that the theory of complex systems will someday solve all problems concerning all kinds of complex systems. These systems include biological, economic, and social ones ... and the problems that would be solved include diseases, poverty, and unhappiness! The goal in Chapter 2 is to give insight into these huge promises.

2. ... or End of Science?
It seems that different kinds of unsubstantiated promises have become commonplace in different branches of science, and the concept of ironic science has been coined. There are warning voices saying, for example, that physics, which started from natural philosophy in the Middle Ages, is regressing back to metaphysical philosophizing based on mere speculation. The claims can be neither validated nor invalidated. This kind of hype is especially characteristic of fields such as complex systems research, where no actual results have been reached to justify the huge promises, and where making unsubstantiated prophecies seems to be the basic style of doing research (Chapter 3).

1.3.2 Top-down view

There are different ideas about what kinds of structural constructs complex systems are qualitatively composed of. Even though the differences between paradigms are clear, it is interesting to see the "mental inertia": One often wants to see the world in terms of one's own mental constructs, being reluctant to accept fresh approaches; this holds true for the whole research community as well — at any given time a particular way of seeing the world tends to dominate. Looking at complex systems research in perspective, one can recognize a shift from centrally organized to completely distributed control.

1. From strict hierarchies ...
Since the Aristotelian taxonomies, the traditional way of structuring systems has been through hierarchies; in the 1960s these views were formalized by Herbert Simon, and the ideas of the "empty world" and "near decomposability" were coined. It has been shown that hierarchies are more robust against disturbances and environmental changes than structureless constructions are; perhaps this robustness is the reason why such hierarchic structures seem to be such common results of natural evolution in different domains, not only in natural systems (biological, neuronal, etc.) but also within societies, organizations, etc. (Chapter 4).

2. ... towards decentralization ...
The pursuit of enhanced system robustness has led to an extension of the hierarchy idea: The substructures can be given more and more freedom and independence. These kinds of ideas are developed within the agent paradigm. For example, in contemporary artificial intelligence research speaking of agents seems to be fashionable (Chapter 5).

3. ... and interaction ...
In agent systems there still exists some central coordination; when this external control is explicitly stripped away, so that the structure is more or less random, one has a network where no node can be seen as more important than the others. The interesting thing is that some organization automatically emerges ... As an example of the change in thinking about what is efficient, army-like hierarchic organizations are turning into project organizations where people are networked (Chapter 6).

4. ... only on the local scale?
When the distribution idea is taken to the extreme, there is no external control whatsoever, the nodes being able to communicate only with their immediate neighbors, and one has a cellular automaton. This approach is the latest hit; however, it seems that more quantitative approaches are needed to reach real analysis tools (Chapter 7).

1.3.3 Bottom-up view

If one starts from the top, explicitly determining the structures, the essence of emergence is lost. In this sense, the new contributions to the age-old philosophical discussions result from quantitative simulation analyses, where data and algorithms can reveal their inner selves without predetermined prejudices. Different kinds of universality properties have been observed in the behaviors of different kinds of complex systems, some of these underlying principles having more long-lasting value than others.

1. From chaos ...
Simple nonlinearities in systems can result in mindbogglingly complex behaviors. Starting in the 1970s, many conceptually valuable phenomena were observed, including deterministic chaos, bifurcations, fractals, self-similarity, strange attractors, and Feigenbaumian universality. However, it soon became clear that chaos where all order disappears is too trivial; the interesting phenomena take place between order and chaos.

2. ... towards new order ...
Whereas chaos theory recognizes that simple systems can result in hopeless complexity, complexity theory, on the contrary, says that some simplicity can emerge from complex behaviors. Concepts like phase transitions, the edge of chaos, self-organized criticality, and highly optimized tolerance reveal that there exists an underlying similarity in very different environments (Chapter 9).

3. ... with new laws of behavior ...
It seems that the unifying principles are concentrated around power law distributions, which are characteristic of all systems where some kind of self-organization takes place and where self-similarity can be detected. In such environments the Gaussian distribution is no longer the normal distribution! The resulting "Zipf law" structure among variables in complex systems, artificial or natural, seems to offer new tools for high-level analysis of systems (Chapter 10; a small numerical illustration follows after this list).

4. ... or with no laws?
It can be seen that most of the proposed complex systems architectures are very powerful; indeed, the Turing machine can be implemented in those frameworks. Even though this power sounds like a benefit, it is not: In such frameworks Gödel's incompleteness theorem makes it clear that no analysis tools can be developed for such systems. It seems that there is a cycle here from order, or higher-level understanding, back to chaos — or has something been gained? How to circumvent this scientific dead end (Chapter 11)?
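As a hedged illustration of the power-law idea mentioned in item 3 above (the code, its sample sizes and function names are illustrative and not part of the report), the sketch below generates Zipf-like data and checks that its rank-frequency curve is roughly a straight line on a log-log scale, with a slope close to -1:

```python
# A minimal sketch of a Zipf-type power law: if the frequency of the item of
# rank r behaves like f(r) ~ C / r**a, then log(f) versus log(r) is roughly a
# straight line with slope -a. Here synthetic Zipf-like data (a = 1) is drawn
# and the slope is estimated with an ordinary least-squares fit.

import math
import random

def zipf_sample(n_items=1000, n_draws=100_000, a=1.0):
    weights = [1.0 / r ** a for r in range(1, n_items + 1)]
    draws = random.choices(range(n_items), weights=weights, k=n_draws)
    counts = [0] * n_items
    for i in draws:
        counts[i] += 1
    # Rank-frequency list: most frequent item first.
    return sorted((c for c in counts if c > 0), reverse=True)

def loglog_slope(freqs):
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# The fitted slope should come out near -1 (finite-sample noise bends the tail).
print(loglog_slope(zipf_sample()))
```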

1.3.4 Systems view

It should not be a surprise that complex systems are complex — one just has to somehow tackle such systems in practice. In the final chapters, examples of practical approaches, old and new, are presented. The objective of systems engineering is to manage and understand systems, however complicated they might be.

1. From hard mathematics ...
One of the last theoretical success stories of modern system theory was the framework intended for the hierarchical control of large-scale systems. In the 1960s and 1970s it was still thought that sophisticated system models could be constructed, or understood. The practitioners voted against such illusions and selected "postmodern" approaches (fuzzy and neural) instead (Chapter 12).

2. ... towards applying intuitions ...
When studying truly complex systems, no explicit models exist. The proponents of "system dynamics" assume that qualitative approaches suffice, forgetting about exact details and numbers while still utilizing system theoretical tools. Simulation of more or less heuristically determined qualitative causality models is by no means mathematically well justified, but without such assumptions not very much useful is left that could be applied in practice (Chapter 13).

3. ... to reach a systemic view?
In general system theory the abstractions are taken to the extreme — having no concrete numerical models available, one just has to manipulate symbols, the methods being too abstract to be of any real use, and one easily ends up doing mere philosophy. However, the system theoretic approaches may still offer holistic ideas that are not bound to today's practices. And, perhaps, from the system theoretic considerations one can complete the spiral back to hard mathematics.

Can there exist a systemic view of complexity? How can complex processes be seen in perspective? How could the emergent patterns be controlled, and how can something interesting be made to emerge out of the simulations? What kind of "holistic mathematics" might be available when analyzing complex systems? Specifically, how can the power of the tools be restricted so that they remain on the boundary between analyzability and non-analyzability? Indeed, such questions can be studied. For example, it seems that the idea of one single complexity theory has to be abandoned — perhaps the theory itself has to be fractal: To make something interesting emerge from the mindless simulations, the domain-specific semantics has to be coupled into the system structure. The structures also differ in different application domains. These kinds of observations only make the questions more interesting and challenging: The area of complexity research cannot be exhausted by the pioneers alone. For example, there will most probably exist separate research areas like "complex cognitive system theory" and "complex automation system theory", where the ways in which emergent phenomena pop up differ from each other. Correspondingly, the appropriate mathematical tools are different, and the new intuitions and tools are different. The last chapter hopefully gives some insight into what this all really means.

Session 2
Complex Systems — A New World View

Reino Virrankoski
Helsinki University of Technology, Control Engineering Laboratory
[email protected]

Because of the rapid increase in computational power, many numerical methods have become widely used in complex systems research. In addition, the development of information processing and of networked systems like the Internet has also had an effect on the field of complex systems. This chapter gives an introduction to the ongoing change and its possible promises.

2.1 Introduction

After the development of the microprocessor, computational capacity started to increase rapidly. This has opened up a totally new world for the modelling and simulation of complex systems — systems that are difficult or impossible to analyze using traditional mathematical methods, such as analysis. It has also generated several questions. It has been shown that it is possible to imitate complex system behaviour by using computer programs, but how useful and valid are such imitations in the end? Will we get something useful out of them, or does the intuitive similarity only confuse us? Is it possible to find something that reshapes our picture of systems called "complex" by using different types of computer programs and number-crunching methods, or are we just operating with new tools in an old field?

2.2 Wolfram's "New Kind of Science"

2.2.1 About the book and author

Stephen Wolfram has recently generated a lot of discussion by publishing the book A New Kind of Science [1]. Wolfram was born in London and educated at Eton, Oxford and Caltech. He received his Ph.D. in theoretical physics in 1979 at the age of 20, having already made lasting contributions to particle physics and cosmology. In 1986 he founded Wolfram Research, Inc. [2], and developed the computer algebra program Mathematica, which was also, according to Wolfram, an important instrument when he made the simulations and calculations presented in the book. Over the past decade Wolfram has divided his time between the leadership of his company and his pursuit of basic science.

2.2.2 Basic ideas

The crucial point in Wolfram's thinking is that computing capacity is today so huge that simple computer programs with simple rules should be enough for understanding complex behaviour. He presents several examples of how complex behaviour is generated from simple rules, and he points out several similarities between nature and systems generated by simple computer programs. This leads him to the conclusion that there are often simple rules behind complex behaviour. Wolfram formulates his main idea as the principle of computational equivalence: whenever one sees behaviour that is not obviously simple, it can be thought of as corresponding to a computation of equivalent sophistication [1].

2.2.3 Cellular Automaton

One important instrument in Wolfram's work is the cellular automaton. One of the first predecessors of cellular automata was the so-called Turing machine, developed by Alan Turing in the 1930s. A Turing machine is specified by giving a set of rules for how it should operate and change its state. Turing showed that any computing process can be reproduced by a Turing machine.
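As a minimal sketch of this rule-table formulation (the example machine below is hypothetical and not taken from the report), consider a Python toy in which the whole machine is nothing but a mapping from (state, symbol) pairs to (write, move, next state):

```python
# A minimal Turing machine sketch: the whole machine is just a rule table.
# Rules map (state, symbol) -> (symbol to write, head move, new state).
# This hypothetical example flips every bit of a binary string and halts.

def run_turing_machine(tape, rules, state="start", halt="halt", steps=1000):
    tape = dict(enumerate(tape))   # sparse tape: position -> symbol
    head = 0
    for _ in range(steps):
        if state == halt:
            break
        symbol = tape.get(head, "_")             # "_" is the blank symbol
        write, move, state = rules[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    return "".join(tape[i] for i in sorted(tape) if tape[i] != "_")

# Rule table: scan right, flipping 0 <-> 1, halt at the first blank.
rules = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run_turing_machine("10110", rules))   # prints "01001"
```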


In the 1950s John von Neumann formulated the cellular automaton. There are cells located somewhere in space, and at each time step the state of a single cell is determined based on the states of the neighbouring cells one step earlier. One of the most well-known cellular automata is the so-called Game of Life, a two-dimensional cellular automaton presented by John Conway in the 1960s. Stephen Wolfram started to work with cellular automata in the 1980s, and soon found out that even cellular automata generated by very simple rules have interesting properties. Using computer programs that he and his company developed, including Mathematica, Wolfram ran numerous simulations with many different types of cellular automata. He found interesting variations of chaotic and regular patterns in cellular automata, nested structures, evolutionary structures, and also interactions between different types of structures. Wolfram used cellular automata to investigate system sensitivity to different initial values, and he used cellular automaton-like substitution systems to imitate some natural processes, like fluid vortexes, the formation of snowflakes and the growth of plants and animals. He built classifications of cellular automaton types based on his findings, and also found analogies between cellular automata and traditional mathematics [1].
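To make the neighbourhood rule described above concrete, here is a small sketch (illustrative only, not code from Wolfram's book) of a one-dimensional, two-state cellular automaton; rule 30 is one of the elementary rules Wolfram studied, and even this tiny program produces an irregular, seemingly chaotic pattern from a single live cell:

```python
# One-dimensional, two-state cellular automaton: the next state of each cell
# depends only on its own state and its two neighbours one step earlier.
# The 8-bit rule number (here 30) encodes the new state for each of the
# 8 possible three-cell neighbourhoods.

def step(cells, rule=30):
    n = len(cells)
    out = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        neighbourhood = (left << 2) | (centre << 1) | right   # value 0..7
        out.append((rule >> neighbourhood) & 1)
    return out

# Start from a single live cell and print a few generations.
cells = [0] * 31
cells[15] = 1
for _ in range(15):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells)
```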

2.2.4 Wolfram's conclusions

Based on his investigations, Wolfram claims that most complex natural processes are based on the same type of simple rules that he implemented in cellular automata. He goes even further and postulates that even the laws of fundamental physics could be based on similar simple rules, and that underneath all the complex phenomena we see in physics there could be some simple program which, if run long enough, would reproduce our universe in every detail. As a comment on these postulates Wolfram writes: "The discovery of such a subprogram would certainly be an exciting event — as well as a dramatic endorsement for the new kind of science that I have developed in this book". Furthermore, he suggests that by combining this with a certain amount of mathematical and logical deduction, it will be possible to get at least as far as reproducing the known laws of physics [1].

After a huge number of examples and comments Wolfram returns to the main idea of his "New Kind of Science", what he calls the Principle of Computational Equivalence: "All processes, whether they are produced by human effort or occur spontaneously in nature, can be viewed as computations" [1]. This raises the question of whether there is any need for traditional mathematical analysis any more; is traditional mathematical analysis even a valid tool in some cases?

2.2.5 Criticism

Wolfram presents several examples of how to generate complex behaviour from simple rules, but in many cases the reader gets the feeling that something is missing. The big question — especially from the system theoretic and control point of view — is how to go backwards. In real-life situations the problem is usually that we already have a complex system, and we are trying to find rules for handling it. This task is much more demanding than just showing that it is possible to generate complex behaviour with simple rules.

There are also some contradictions in Wolfram's presentation: In some cases he argues that it is important to take all the details into account, but in other cases he says it should be enough just to follow the main lines and forget the details [1]. Another weakness is that comparisons between traditional methods and Wolfram's methods are missing. In many examples presented in the book (concerning vortexes, for example) it would have been interesting to see a model of the same system based on traditional methods. In some cases there are quite courageous conclusions about the equivalence between cellular automata-based systems and processes in real nature. It is hard to believe that it is enough that the behaviour only looks similar at some level [1].

2.2.6 Provocation?

It is no wonder that Wolfram's book has evoked a lot of discussion — and also furious criticism — in the scientific community. What is in any case a widely accepted and important point of the book is that it shows how rapidly computational power has increased and how the capabilities of many simulation and computer algebra programs are becoming better and better. Many methods which until recently were shot down immediately because of their huge demand for computation capacity are part of normal practice today. There is more computational power in many cheap and simple calculators than there was in building-sized machines 25 years ago. Could one of the motivations of Wolfram's courageous text be provocation?


When talking about the big basic things in science, the scientific community is today so smug that many radical new ideas are immediately shot down without any further discussion if they are presented conventionally. Hence, a more provocative style is needed to generate active debate.

2.3 Networked systems

2.3.1 New findings and inventions

The basic principles of the neural system in biological organisms have been known for a relatively long time, and during the latest decades a lot of new knowledge has been gathered in the area of neuroscience. In addition, the development of computers and the rapid growth of computing resources have enabled the creation of different networked systems. Probably the two most well-known network categories are the Internet and the artificial neural network algorithms developed for modelling and simulation. The success of these two network types and the rapid increase in the number of users have also generated new viewpoints on complex systems and, more generally, on the universe around us.

2.3.2 From hierarchy to network

The traditional way to see many complex systems has been as a hierarchical structure. Even in modern quantum physics the basic structure contains different electron energy levels, and each of those levels is further split into a hierarchical structure at a lower level. On the macroscale, space consists of solar systems, galaxies and so on. There are of course a lot of hypotheses and different types of theories concerning these models, but traditionally things have been seen as some kind of hierarchical structure. Because of this traditional hierarchical way of thinking and the relatively simple structure of hierarchical models, such models have been widely used in control engineering when trying to model and control complex systems. In many cases this approach has been useful, or at least good enough, when controlling industrial processes, for example.

2.3.3 Networked structures

In his book "Emergence" [3], Steven Johnson presents the idea that everything is linked rather than hierarchically structured. Furthermore, Johnson stresses the importance of local low-level interactions in the network. He finds analogies for example in the behaviour of ant colonies, in economic systems and in the growth of cities. The structure of all of these systems is network-like, and there seem to be a lot of local interactions that are not centrally controlled. It may be that some general trends in the behaviour are initially or centrally set, like the rules of markets and cities and the instincts behind ant behaviour. Nevertheless, local interactions and decisions that cannot be centrally controlled still have a crucial effect on the development of the whole system. Thus, Johnson sees many systems as being constructed from the bottom up. He also presents some discussion about the development of local structures and local decision-making clusters [3]. In his book "Linked" [4], Albert-László Barabási presents similar thoughts about network-structured systems, and he formulates agents as functional objects in the network.

2.3.4 Nodes or only connections?

Two interesting issues concerning network structures are the importance of the nodes and the importance of the node connections. From one point of view, the locations of single nodes do not matter much, or do not even matter at all; what matters is how the nodes are connected to each other, because that determines how information can be transferred in the network. One can, for example, think of the cellular phone system. It is not necessary to know the exact location of a single cellular phone to make the system work; cell-level accuracy has been enough to build a robust communication system. As for the Internet, it is a fact that it has made the world smaller during the last 10 years, yet there were no big changes in physical travelling times between the beginning of the 1990s and the beginning of the new millennium. The change that has made the world smaller is the change in information transfer time and capacity. Whereas in Euclidean space the distance between two points is calculated using the Euclidean norm, in a networked space it can be, for example, the number of hops between two nodes, or the time it takes to send information from one node to another [1, 2, 4].
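As a sketch of the "number of hops" notion of distance (the small example graph and the function name are made up for illustration), a breadth-first search gives the hop count between two nodes:

```python
# "Distance" in a networked space: instead of the Euclidean norm, the
# distance between two nodes is the minimum number of hops, found here
# with a breadth-first search over a hypothetical adjacency-list graph.

from collections import deque

def hop_distance(graph, source, target):
    seen = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return seen[node]
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return None   # target not reachable from source

network = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
print(hop_distance(network, "A", "E"))   # 3 hops: A -> B -> D -> E
```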

2.4 Conclusions: A way to the New World?

Since the development of mathematical analysis, theoretical research based on differential calculus and algorithm development has had the leading role in the area of systems research. Complex systems that were difficult or impossible to model were seen as exceptions to general uniform rules. Recently, the computational resources that have become available, and their continuous increase, have started to shake this balance. Methods based on numerical algorithms and other non-standard approaches have given interesting, and in some cases also promising, results in complex systems research, and in many traditional problems simulators based on numerical methods have also become widely used. Many critics argue that because of the rapid development the possibilities of the new methods are still not understood well enough. For example, some of the onboard computers installed in basic 1990s car models have the same computational capacity as the computers used in the first Moon flight.

During the revolution in physics at the beginning of the 1900s, the study of phenomena that had earlier been regarded merely as holes in a uniform physical picture created a radically new physical world view. Could it be that, similarly, the use of new computational methods in the research of complex systems will radically renew the field of systems research? If so, mathematical analysis in the form it developed from the 1600s onwards will be just a part of system modelling, valid in some restricted area — just as the results and methods of classical physics are special cases in the field of modern physics. Furthermore, when the new capabilities of information processing and networked systems are connected to the new science, the change could be even more radical, and in the near future we may be able to deal with many systems earlier considered too difficult or even impossible to model and control.

Bibliography

1. Wolfram, S., A New Kind of Science, Wolfram Media, Inc., 2002.
2. http://www.wolframresearch.com
3. Johnson, S., Emergence: The Connected Lives of Ants, Brains, Cities, and Software, Touchstone, 2002.
4. Barabási, A.-L., Linked: The New Science of Networks, Perseus Publishing, 2002.


Session 3
"What Kind of Science is This?"

Heikki Pisto
Helsinki University of Technology, Control Engineering Laboratory
[email protected]

What is the purpose of science? Is it valuable in itself, or is it only supposed to serve other human interests? In his book The End of Science, John Horgan argues that the age of scientific revolutions and breakthroughs is over, and that all there is left to do is to fill in some gaps, and that's it.

3.1 Science, what is that?

Most of what can be called the triumphant march of industrialised societies is basically built on two feet. One is the free-market way of dividing and sharing resources, which has led to a very profound specialisation between individuals. The market economy has in some form or another existed as long as Homo sapiens itself. Nobody invented or discovered it; it seems quite natural, although there are good reasons to modify the rules and harness the actors if needed.

The other, the natural sciences, is also very old; it has its roots deep in antiquity. There is much evidence that the first advanced civilizations, like the Babylonians and Egyptians, were clearly aware of the benefits of what could be described as "applied sciences". From the golden age of the Greeks numerous manuscripts have survived, and many show a clear interest in knowledge purely for its own sake, not only for economic or other benefits. This idea, so-called basic research, has proven to be as crucial to scientific development as the other half, application. For example, celestial mechanics may seem a useless endeavour at first, but it has always been one of the basic objects of curiosity of mankind. After one learns that the earth is round rather than flat, vast possibilities immediately open up.

The Greeks had much interest in philosophy, and they considered the study of nature one branch of it, natural philosophy. They emphasised rationality, and tried to formulate laws about how nature works. On the other hand, they did not consider testing their theories with experiments too important, and some of their ideas are somewhat confusing, even silly, to a modern person. The important idea, which emerged little by little during the otherwise long and dark Middle Ages, was the empirical way of acquiring information about nature. In the 13th century Grosseteste worked on optics and was concerned with verifying theories by experiments. His student Bacon used mathematics to describe optical systems, and conducted systematic experiments with them. Later, the methods of Galilei, da Vinci and their successors of looking at the world through experiments became very powerful and popular.

From those days on, the natural sciences have been increasingly related to mathematics. Observations needed interpretations, theories, to be useful, and mathematics was practically the only way to express them precisely. This symbiosis of mathematics and science is still in good health, but lately there have been some suspicions about this old paradigm of doing science.

3.2 Is the age of revolutions over?

Every now and then scientists have felt that there is nothing more to discover. For a long time Newton's laws were good enough for everybody, mainly because they explained almost all of the common phenomena occurring in everyday life. Electricity changed everything. This mysterious force raised new questions but also made it possible to arrange new experiments, which revealed new phenomena, which in turn needed new theories. In the early 20th century there was suddenly again plenty to do and explain, and it took some 50 years for particle physicists to clean up the mess. Towards the end of the century there were again separate theories that were capable of explaining basically all observations. The only problem is that these theories are not consistent with each other. Or is it a problem?

Today most of the questions concerning space, time and matter can be considered solved, at least for current practical purposes. The picture of physics looks good, but is it The Reality? Hardly. And even if it were, the whole is far too broad to be mastered by any individual. Still, there are people who think it is worth trying. Some scientists have taken the quantum leap beyond experimentability, and continued doing science in further dimensions. Although this means quite a change in the traditional virtues of science, it may not be wise to blame them for doing that. The ghost of Democritus is still lurking among physicists.

The other way to make progress seems to be to construct something from the already acquired building blocks of the Standard Model. The approaches are numerous. Economically the most lucrative branch of science is probably materials science, where the increase of knowledge has over the last 50 years caused another industrial revolution. The so-called Moore's law, which says that device complexity on a chip doubles every 18 months, has held true for almost 40 years now, and probably will for at least 10 more years. The stream of new applications will continue into the distant future, and yet new markets will emerge. Unfortunately, as smaller details and scales are attained, the costs of investments in industry have risen as fast as the markets. The so-called Moore's second law states that the cost of fabrication facilities also increases on a semi-log scale. A modern semiconductor factory may easily cost billions and billions of euros in the future, which effectively keeps profits from rising exponentially. Biotechnology has also made big promises, but has not fully met them yet, despite huge investments. It is the same phenomenon that concerns basic research: For an ever-increasing use of resources there are inevitably diminishing returns.

It seems that progress is not slowing down because of a lack of adept scientists. The pool of potential scientists is larger than ever. It is only that gigantic telescopes, particle colliders and fusion reactors seem to be the only way to go ever further and make any progress. Could it be that scientists are running out of good ideas?
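A back-of-the-envelope calculation (illustrative only, not from the original text) of what "doubling every 18 months for almost 40 years" amounts to:

```python
# If complexity doubles every 18 months, then over 40 years there are
# 40 * 12 / 18 ≈ 26.7 doublings, i.e. roughly a hundred-million-fold
# increase. The figures are purely illustrative.

years = 40
doublings = years * 12 / 18
print(doublings, 2 ** doublings)   # ≈ 26.7 doublings, ≈ 1.1e8 times
```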

3.3

Is the Truth good, bad or ugly?

Science journalist John Horgan has taken to himself the burden of revealing the emperors new clothes. In his 1996 published book The End of Science he argues, that although there are vast amount of scientists, more than ever before, bustling around science, all they have left to do is to fill some gaps and figure out applications. He bases his argument on dozens of interviews he has made with some prominent scientists of late 20th century. They include philosophers, biologists, physicists, social scientists, neurologists, and people that he amusingly calls “chaoplexologists”. By chaoplexity he refers

28

Session 3. “What Kind of Science is This?”

to complex systems and chaos research, which has over last decades changed names according to concurrent fashions. Through the interviews he creates quite ambiguous construction about modern science. Some of the scientists are looking for The Answer; somebody seems to already have found it. Some believe in the Big Bang, some swear by evolution. And all this is flavoured with Horgans sceptical and refined comments. Horgan certainly hit a sore point. To make science is a profession, and “pure science”, basic research, is mostly publicly funded. In private sector people working with technology and knowledge like to call it “product development”, as contrary to “science” in public sector, even if they would be doing the same thing. Engineers in the private companies want to please the capitalists, scientists for one the taxpayers, the great public that is following their work. To this public it may sound quite confusing, if a notable science journalist declares, “it is all over, everybody go to your homes, there is nothing more to see”. It may be a simple mans shot to make a handprint in the world, but what if ...? The most common objection to Horgans argument has been “That’s what they thought hundred years ago”. Even though there are some famous quotes to support this, it is probably not true. Horgans answer is simply “No they didn’t”. For most of the sciences and scientists, this is the fact. In the end of 19th century science was by no means in a dead end, at least for a scientifically oriented mind. It is another thing, that for some spectators and appliers of technology, like Horgan hundred years later, it might have seemed that way. By Horgan this 100-years-argument means, that people want to say, “because science has advanced so rapidly over the past century or so, it can and will continue to do so, possibly forever”. This rhetoric turns out to be a philosophical hypothesis, which sounds unconvincing, thus supporting Horgans point. Yet that is not too convincing either in its strictest form, as in the title of his book. If the truth is somewhere in between, what is the catch? Some, if not most, of Horgans critic is pointed at something he calls “ironic science”. His most important observation about modern science is, that either in lack of potential or because of pure curiosity about abstract, it is heading more and more to speculative mode, separate from verifiable physical experiments. Physical world is becoming engineering and is left for engineers, because for potential future Einsteins, the prospects are bad. For example there is superstring theory, which Horgan calls “naïve ironic science”. Superstrings are mathematical constructions, that may make sense at certain level of comprehension, but is it really enough to form a world-view? String


String theory can be described as speculative; it is not possible to do any experiments in its extra dimensions, with energies unimaginable. Also, some of its practitioners have an almost religious relationship with their subject. When they see the truth, it is the truth and should be treated as such, despite the fact that none of its results are verifiable; they can in no way affect our lives. It is known that scientific revolutions result from a change of paradigm, the way of seeing and doing science. This speculative mode may well be a new paradigm, at least if there really is nothing else to do. In that case, the experimental method of science, which has been so successful for hundreds of years, also needs some new explaining. So far the purpose of theories has been to provide explanations and, if possible, predictions about nature. Experiments have been necessary to evaluate the usefulness of theories. If the importance of experiments is diminishing, or if the experiments are becoming too expensive to conduct, there is clearly a transformation going on between paradigms. To Horgan it means that science is going into decline. Obviously the situation is not such that there is nothing left to discover. But even though "answers raise new questions", many of the questions are by nature such that they cannot be critically answered with objective truth. Consider cosmology: every layman can formulate questions that are as impossible to answer as those of religion. Science has its limits because of its (present) method; there is nothing that can be done about it. And when we pass beyond the method of science, is it the end of science? In Horgan's opinion ironic science has its function, though. By handling unanswerable or unimportant questions, it reminds human beings how little they know, and maybe sets their existence in perspective, if it doesn't happen to be there already.

3.4 Cellular automata ...?

One common paradigm in doing science has been that the truth in itself is somehow simple and beautiful. One of the more famous formulations of this is the so-called Occam's Razor, after William of Occam (1300-1349), which basically recommends the simplest of all rational explanations for an observed phenomenon. This approach can be deceiving; for example, Horgan's book gloomily quotes, without reference, that "in biology Occam's Razor cuts your throat". Indeed, since the end of the 19th century much interest has been devoted to the dynamically complex behaviour of systems of several simple entities, as opposed to studying the properties of individual components. The n-body problem has often been considered the starting point for complex systems research.


Since then there have been numerous important and fascinating examples of other such systems, for example strange attractors, fractals and cellular automata. Each of them has in turn been a very fashionable subject, only to be replaced in a few years by something else. All of these share the common trait that the beauty is not in the simplicity but in the complexity. More specifically, what interests the human mind is something between simple and complex, something that comes out of chaos and appears to be somehow more than the original ingredients. Philosophically speaking, for example in the case of cellular automata, all behaviour is essentially equal from the system's point of view. For an external observer, in this case a human being, there is in some cases more to see. With certain rules, some patterns expand infinitely, some survive indefinitely, some repeat cycles, and some die out. This is fascinating, and so are fractals. But what kind of practical results can be derived from them? Stephen Wolfram says: every kind of result. In his book A New Kind of Science he asserts that basically the whole of reality works like a cellular automaton, and that knowledge about it can be acquired through simulations of other cellular automata. His hypothesis of computational equivalence suggests that everything can be reduced to a series of computations, which in turn can be carried out by very simple logic. This would mean that everything could be simulated, and not just approximately but precisely. This is another major deviation from the current paradigms of science. Like superstring theory, it is a quantum leap into obscurity, and it seems quite difficult to work out the way back to the normal, concrete world. And before the connection with experimental reality is formed again, it is nothing more than the metaphysics of the ancient Greeks. So what are we researching when we study these so-called complex systems? Many scientists have been very optimistic about what can be forged out of them: maladies, natural catastrophes, all kinds of messes could be cleaned up in the future, making human life better. Yet all the progress so far has been in discovering new intriguing dynamics in different kinds of systems, but little more. Characteristic of the research has been that for every research subject there is a real background somewhere in reality; complex systems researchers seem to differ from mathematicians, who do not care about nature as such. For a mathematician, only logic and consistency are relevant, and abstraction is a virtue; in set theory, it is not very professional to use apples and horses as members of sets. On the complexity side, by contrast, people are more than eager to develop great stories about what is going on in their complex world, and to relate new results straightforwardly to other things in reality, even if there is no real connection. Then, on the bottom line, horses are happy to eat apples, and it is much more probable for the research to get publicity and funding.


In the past it was the work of science journalists to make quantum mechanics sound interesting; now the researchers do it themselves. Is science becoming popular culture, and are researchers becoming superstars? Of course Einstein and Hawking are superstars, although few have any idea of the content of their work. If there is any Philosopher's Stone to be found in the chaoplexity field, it will give its finder the juiciest stories in the history of science, and more fame than anyone has had before. After all, some contemporary complexity researchers have already had a taste of this, but not everybody is convinced yet.

3.5 Has the science become just a show?

Science has traditionally been prone to personality cults, even though the great majority of all-important discoveries seem, in hindsight, quite inevitable. Planck's constant could well be somebody else's constant, and Pauli's exclusion rule could be Schulzennagel's rule. All this encourages generation after generation to continue doing science. It might be difficult for a young scientist to be content with only developing applications and lecturing about existing knowledge; many ambitious scientists cannot have resisted the idea of inventing science rather than discovering it. If a theory great enough could be constructed so that it did not contradict any of the existing theories but instead expanded and incorporated them, then one could truly call that an achievement, regardless of whether it was useful or even experimentally provable at the moment. This approach has worked on several occasions before, most notably for special relativity. Basically this is also what Horgan calls ironic science. The first deliberate ironic comment was physicist Alan Sokal's nonsense article in Social Text, "a journal of cultural and political analysis", in 1996, written just to test whether any intellectual criticism still existed among post-modern philosophers. The text passed the editors and was not revealed as a hoax until Sokal himself did so. In some people's opinion a more serious occasion came in 2002, when the infamous Bogdanov brothers got caught for at least four gibberish papers, for which they had also received their Ph.D.s at the University of Bourgogne. The brothers had their own TV show in France and were local celebrities. The alarm bells rang, and almost all the people who considered themselves scientists wondered what was happening to their beloved mission for The Answer called science. Who can be so vain as to do things like that? From the outside, though, there is no difference between a paper written with a hoax in mind and one written by studying something of little importance in itself.


All this might cause talented young people to hesitate to choose scientific careers, and perhaps prefer business or government, even after acquiring a scientific degree from a university. The problem is not, Horgan argues, that there are no more questions to be answered; after all, one can always wonder about the meaning of life. If anything, there is a shortage of good questions, which have always been essential. It is hard to imagine theories that would have even remotely the same impact as the theory of evolution, general relativity or quantum mechanics. There are still many unanswered questions in biology and cosmology, for example, but the nature of these questions is such that every hypothesis is necessarily very difficult, if not impossible, to verify. The origin of life may remain unanswered forever, unless some life form resembling existing life really can be created artificially and repeatably from the elements. Likewise, the idea of the first moments of the universe is so abstract that one has to take it with a grain of salt. It might be mathematically consistent in the scientist's head, but whether it is true is a whole different thing. For an individual scientist it might bring fame and welfare even if the theory were false, as history can show. But from a practical point of view it is quite safe to remain dubious about such things. Scepticism has always been a virtue in thinking when it comes to matters that have little tangible effect on life itself. Whatever that is.

3.6 Where to go?

"It's like the jazz musician who was asked where jazz is going, and he said, 'If I knew, we would be there right now.'"

Chaos research and complexity research are basically the same thing under different names. Chaos refers to disorder, complexity to something elaborate and sophisticated, but still somehow ordered to the human mind. One could say that the weather is chaotic: you cannot predict it, say, a year ahead, although you can assume with great confidence that in Finland it will be chilly. Some cellular automaton may be complex, but it is not really chaotic, because it is so discrete and computable. Both terms are quite ambiguous and mean different things to different people. Ways of attacking complexity are numerous. Purely theoretical, mathematical ones, like fractals and cellular automata, have produced little more than entertaining patterns. Some more practical approaches, like emulating nature with networks and agents, have produced some real applications, like the Internet, where the whole is so much more complicated than what the constructors were planning in the beginning.


It basically has a life of its own and cannot be controlled by any single quarter. From the net's point of view the users are not using it as a traditional tool, but rather living, in part, the life of the net. If there were no voluntary, diverse activity in the net, it would be little more than an advanced telegraph. The Internet was not designed to fulfil various sexual desires, but because of human nature such services emerged there; this in turn encourages other uses of the net, further improving its usefulness. How could this have been predicted? This something-from-nothing aspect has characterised much of the research in the field. It is always a good thing when something interesting emerges. But making really desired and useful effects, like future stock prices or a wristwatch, emerge from some trivially available resource, like computing capacity or a ton of ore, appears to be daydreaming. Then again, it is impossible to tell what is possible and what is not; life itself should not be possible, because from the human point of view it is too complex to work. The complexity of complexity has encouraged certain researchers to make predictions about a glorious future, starting from curing diseases and ending with world peace. The human being would move on from being a part of the system to being the master of the system, knowing all relevant information and taking into account all side effects of its actions on the system. Many practical problems of an average person are also complex in nature, like guessing lottery numbers. Others are usually computationally equivalent to working out whether one has enough money for an additional bottle of beer, or whether one can afford to sleep ten minutes longer in the morning. Such considerations will always remain the same, but in the best case, in the distant future, complex systems research could make the lottery meaningless, or maybe at least the stock exchange. Up to the present, the research of complexity and chaos has been more or less alchemy with computers. The hype has caused some people, eager to leave a handprint in history, to search desperately for the Philosopher's Stone that would convert complexity into simplicity. It should be evident that this is not going to happen. But, after all, the research of chemistry was based on alchemy. Alchemy was Isaac Newton's hobby, but not all alchemists were Isaac Newtons. And indeed something has emerged, if nothing else, then plenty of material for scientific Monty Python sketches.

Bibliography
1. Horgan, J.: The End of Science. Broadway Books, 1997.


Session 4
Architecture of Complex Systems
Eugene M. Burmakin
Helsinki University of Technology
Industrial IT Laboratory
[email protected]

4.1 Conceptions of Complexity

The last century saw strong interest in complexity and complex systems, in three main waves. The post-WWI interest in the topic focused on the claim that the whole is more than the sum of its parts, and was strongly anti-reductionistic in flavor. The post-WWII interest focused rather on the ideas of feedback control and self-stabilization of complex systems. The current interest in complexity focuses mainly on the mechanisms that create and sustain complexity, and on analytical tools for describing and analyzing it.

4.1.1 Holism and Reductionism

"Holism" is a modern name for a very old idea. In the words of its author, the South African statesman and philosopher J.C. Smuts: "Holism regards natural objects as wholes . . . It looks upon nature as consisting of discrete, concrete bodies and things . . . which are not entirely resolvable into parts; and . . . which are more than the sums of their parts, and the mechanical putting together of their parts will not produce them or account for their characters and behavior."

Two different interpretations, a "weaker" and a "stronger" one, can be given to this idea of holism. The stronger one postulates new system properties and relations among subsystems that have no place in the system components; hence it calls for emergence, a creative principle. In the weaker interpretation, emergence simply means that the parts of a complex system have mutual relations that do not exist for the parts in isolation. By adopting the weak interpretation of holism one can adhere to reductionism, even though it is not easy to prove rigorously that the properties of the whole can be obtained from the properties of the parts. This is the usual concept of science as building things from elementary parts.

4.1.2 Cybernetics and General System Theory

In the 1930s and 1940s Norbert Wiener defined "cybernetics" as a combination of servomechanism theory, information theory, and stored-program computers. His works provide new insight into complexity. Information theory explains organized complexity in terms of the reduction of entropy that is achieved when systems absorb energy from external sources and convert it into a pattern or structure. In information theory, energy, information, and pattern all correspond to negative entropy. Feedback control describes how a system can achieve a goal and adapt to a changing environment.

4.1.3 Current interest in complexity

The current wave of interest in complexity shares many ideas with the second one, but new ideas are also considered, such as catastrophe theory, chaos theory, genetic algorithms, and cellular automata. The motivation for the current interest is the need to tackle complexity in global and large-scale systems such as the environment, society, and organisms. Also, the tools that were developed for managing system complexity during the second wave are not appropriate for the systems currently under investigation, due to their increased complexity.

4.2 The architecture of complexity

It is good to define the term complex system before going into the architecture of complex systems. There are many definitions of a system, but roughly, a complex system is made up of a large number of parts that have many interactions. In such systems the "whole is more than the sum of the parts": given the properties of the parts and the laws of their interactions, it is not a trivial matter to infer the properties of the whole.

4.2.1 Hierarchic systems

By a hierarchic system we mean a system that is composed of interrelated subsystems, each of them being in turn hierarchic in structure, until some lowest level of elementary subsystems is reached. The question immediately arises: what is an elementary subsystem? Probably, in different cases different types of elementary units can be chosen according to the needs of the specific task. There are many examples of hierarchical systems in different domains, such as social systems (governments, business firms) and biological and physical systems (cells, tissues, organs), all having a clearly visible parts-within-parts structure.
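To make the parts-within-parts idea concrete, here is a minimal Python sketch of our own (the class and example names are illustrative assumptions, not taken from the source text): a hierarchic system is simply subsystems nested within subsystems until a chosen level of elementary units is reached.

```python
# A minimal sketch of a hierarchic system: subsystems nested within
# subsystems until some chosen level of elementary units (here, strings).
# All names are illustrative, not taken from the source text.

class Subsystem:
    def __init__(self, name, parts):
        self.name = name
        self.parts = parts            # Subsystem objects or elementary units

    def levels(self):
        """Number of nested subsystem levels above the elementary units."""
        inner = [p.levels() for p in self.parts if isinstance(p, Subsystem)]
        return 1 + max(inner, default=0)

# Cells -> tissues -> organs, echoing the biological example above.
tissue_a = Subsystem("tissue-a", ["cell-1", "cell-2", "cell-3"])
tissue_b = Subsystem("tissue-b", ["cell-4", "cell-5"])
organ = Subsystem("organ", [tissue_a, tissue_b])
print(organ.levels())   # 2: an organ made of tissues made of elementary cells
```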

4.2.2 The evolution of complex systems

Let us quote here the "watchmaker's metaphor" that was presented in [4.3]: There once were two watchmakers, named Hora and Tempus, who manufactured very fine watches. They made their watches in their own premises, and the phone rang frequently with new customers calling. Hora prospered while Tempus became poorer and poorer. What was the reason? The watches consisted of 1000 parts each. Tempus assembled his part by part, and if the phone rang (the work process was interrupted) he had to put the unfinished watch down; it immediately fell to pieces and had to be reassembled from the elements. The watches that Hora made were no less complex, but he had designed them so that he could put together subassemblies of about ten elements each. Ten of these subassemblies could be put together into a larger subassembly, and so on, until ten of the last subassemblies would be combined into the whole watch. Hence, when he was interrupted, he had to put down the partly assembled watch to answer the phone, but he lost only a small part of his work in comparison to Tempus.

If we take into account some ideas from biological evolution and other fields, a number of objections arise against this metaphor.
• Complex forms can arise from simple ones by a purely random process. Direction is provided to the scheme by the stability of the complex forms, once these come into existence.
• Not all systems appear hierarchical. For example, most polymers are linear chains of a large number of identical components.
• Multi-cellular organisms have evolved through multiplication and specialization of the cells of a single system, rather than through the merging of previously independent subsystems.
Hence, there are reasons to dismiss the metaphor. However, systems that evolve by means of specialization have the same kind of boxes-within-boxes structure as systems that evolve by assembly of simpler systems. There are applications where the metaphor works. Consider, for example, theorems and their proofs: the process starts with axioms and previously proven theorems, and various transformations allow new expressions to be obtained until the theorem is proven. Another good example is the development of an empire: Philip assembled his Macedonian empire and gave it to his son, to be later combined with the Persian subassembly and others into Alexander's greater system. On Alexander's death his empire did not crumble into dust but fragmented into some of the major subsystems that had composed it.

4.2.3 Nearly decomposable systems

In hierarchic systems one can distinguish between the interactions among subsystems and the interactions within subsystems. The interactions at the different levels may be of different orders of magnitude. For example, in a formal organization there will be more interaction between employees of the same department than between employees of different departments. A system can be considered decomposable if there are no interactions among the subsystems at all. In practice these interactions can be weak but not negligible; hence one moves to the theory of nearly decomposable systems. There are two main theoretical propositions concerning nearly decomposable systems:
• In a nearly decomposable system the short-run behavior of each of the component subsystems is approximately independent of the short-run behavior of the other components.
• In the long run the behavior of any one of the components depends on the behavior of the other components in only an aggregate way.
These ideas can be illustrated with an example. Consider a building whose outside walls (the boundary of our system) provide perfect thermal insulation from the environment. The building is divided into a large number of rooms; the walls between them (the subsystem boundaries) are not perfect insulators. Each room is further divided by partitions into cubicles, which are poorly insulated from each other. A thermometer hangs in each cubicle. At the first observation there is wide variation in temperature, but a few hours later the temperature variation among the rooms inside the building has become very small. One can describe the process of reaching equilibrium formally by setting up the usual equations of heat flow. The equations can be represented by the matrix of their coefficients: r_ij is the rate at which heat flows from the ith cubicle to the jth one. If cubicles i and j do not have a common wall, r_ij will be zero. If cubicles i and j are in the same room, r_ij will be large. If cubicles i and j are separated by a room wall, r_ij will be small but not zero. Grouping these coefficients together, one gets a matrix in which all the large elements are located inside a string of square submatrices along the main diagonal. We shall call a matrix with these properties a nearly decomposable matrix (see Fig. 4.1). It has been shown that a dynamic system that can be described in terms of a nearly decomposable matrix has the properties of nearly decomposable systems stated earlier in this section. Hence we have seen that hierarchies have the property of near decomposability. Intra-component linkages are generally stronger than inter-component linkages. This fact has the effect of separating the high-frequency dynamics of a hierarchy (involving the internal structure of the components) from the low-frequency dynamics (involving interaction among components).

Figure 4.1: The nearly decomposable matrix. The matrix coefficients r_ij are the rates at which heat flows from cubicle i to cubicle j (A, B, and C are rooms; A1 denotes cubicle 1 in room A).
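A small numerical sketch can illustrate the two propositions above. The script below is our own illustration: the coupling values, room layout and time step are assumed, and for simplicity every pair of cubicles in different rooms is weakly coupled, rather than only those sharing a wall.

```python
import numpy as np

# Nearly decomposable coupling matrix for 3 rooms (A, B, C) with 3 cubicles
# each: r_ij is large inside a room, small across rooms. Values are assumed
# for illustration only.
strong, weak = 1.0, 0.01
per_room, n = 3, 9

R = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            R[i, j] = strong if i // per_room == j // per_room else weak

rng = np.random.default_rng(0)
T = 20 + 10 * rng.random(n)                   # initial cubicle temperatures

dt = 0.01
for step in range(2001):
    T = T + dt * (R @ T - R.sum(axis=1) * T)  # dT_i/dt = sum_j r_ij (T_j - T_i)
    if step in (0, 100, 2000):
        rooms = T.reshape(3, per_room)
        print(step,
              np.round(rooms.std(axis=1), 3),         # spread inside each room
              np.round(rooms.mean(axis=1).std(), 3))  # spread between rooms
```

The spread inside each room collapses within the first hundred steps (the short-run, high-frequency dynamics), while the spread between the room averages decays far more slowly (the aggregate, low-frequency dynamics), just as the two propositions state.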

4.2.4 The description of complexity

Information about a complex object is arranged hierarchically, like a topical outline. When information is put in outline form, it is easy to include information about the relations among the major parts and information about the internal relations of the parts within each sub-outline. Detailed information about the relations of subparts belonging to different parts has no place in the outline and is likely to be lost. From the discussion of the dynamic properties of nearly decomposable systems, we have seen that comparatively little information is lost by representing them as hierarchies: subparts belonging to different parts interact only in an aggregate fashion, and the detail of their interaction can be ignored. For example, in studying the interaction between two molecules we generally do not need to consider in detail the interactions of the nuclei of the atoms belonging to one molecule with the nuclei of the atoms belonging to the other. The fact that many complex systems have a nearly decomposable hierarchic structure is thus a major facilitating factor enabling us to understand, describe and even see such systems and their parts.


From another point of view, if there are important systems in the world that are complex without being hierarchic, they may to a considerable extent escape our observation and understanding. Analysis of their behavior would involve such detailed knowledge and calculation of the interactions of their elementary parts that it would be beyond our capacities of memory or computation. There are two main types of description that seem to be available to us when seeking an understanding of complex systems (a small sketch after the list illustrates the difference):
• State description
  – "A circle is the locus of all points equidistant from a given point."
  – Pictures, chemical structural formulas, blueprints, etc.
• Process description
  – "To construct a circle, rotate a compass with one arm fixed until the other arm has returned to its starting point."
  – Recipes, differential equations, etc.
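As a tiny illustration of the difference, here is a sketch in Python (our own example, following the circle description in the list above; the function names are assumptions): the state description is a predicate that tests what a circle is, while the process description is a recipe that constructs one.

```python
import math

# State description: a predicate that says what a circle *is*.
def on_circle(x, y, cx=0.0, cy=0.0, r=1.0, tol=1e-9):
    return abs((x - cx) ** 2 + (y - cy) ** 2 - r ** 2) < tol

# Process description: a recipe that says how to *construct* one,
# like rotating a compass about a fixed arm.
def construct_circle(cx=0.0, cy=0.0, r=1.0, steps=360):
    return [(cx + r * math.cos(2 * math.pi * k / steps),
             cy + r * math.sin(2 * math.pi * k / steps))
            for k in range(steps)]

# Every point generated by the process satisfies the state description.
assert all(on_circle(x, y) for x, y in construct_circle())
```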

4.3 Conclusions

Empirically, a large proportion of the complex systems one encounters have a hierarchical structure, or one perceives them as hierarchies in order to solve complex problems. A complex system can be presented as a hierarchy in order to control it properly. This form of presentation is very desirable in many cases because it has useful properties, such as near decomposability, that simplify the description and analysis of the system.

Reference
• Herbert A. Simon: The Sciences of the Artificial. MIT Press, Cambridge, Massachusetts, 1996 (third edition).


Session 5
Towards Decentralization
Matti Saastamoinen
[email protected]

The study of multiagent systems began in the field of distributed artificial intelligence (DAI) about 20 years ago. Today these systems are not simply a research topic, but are also beginning to become an important subject of academic teaching and of industrial and commercial application. DAI is the study, construction, and application of multiagent systems, that is, systems in which several interacting, intelligent agents pursue some set of goals or perform some set of tasks [1]. This paper is based mostly on the book Multiagent Systems, A Modern Approach to Distributed Artificial Intelligence by Gerhard Weiss, Part I: Basic Themes.

5.1 Introduction

Artificial intelligence is an advanced form of computer science that aims to develop software capable of processing information on its own, without the need for human direction [2]. The term was first used in 1956 by John McCarthy at the Massachusetts Institute of Technology. Artificial intelligence includes:
• game playing: programming computers to play games such as chess and checkers
• expert systems: programming computers to make decisions in real-life situations (for example, some expert systems help doctors diagnose diseases based on symptoms)
• natural language: programming computers to understand natural human languages
• neural networks: systems that simulate intelligence by attempting to reproduce the types of physical connections that occur in animal brains
• robotics: programming computers to see and hear and react to other sensory stimuli

An agent is a computational entity, such as a software program or a robot, that can be viewed as perceiving and acting upon its environment, and that is autonomous in that its behavior at least partially depends on its own experience. As an intelligent entity, an agent operates flexibly and rationally in a variety of environmental circumstances, given its perceptual and effectual equipment. Behavioral flexibility and rationality are achieved by an agent on the basis of key processes such as problem solving, planning, decision making, and learning. As an interacting entity, an agent can be affected in its activities by other agents and perhaps by humans. A key pattern of interaction in multiagent systems is goal- and task-oriented coordination, both in cooperative and in competitive situations. In the case of cooperation, several agents try to combine their efforts to accomplish as a group what the individuals cannot, and in the case of competition, several agents try to get what only some of them can have. Two main reasons to deal with DAI can be identified, and these two reasons are the primary driving forces behind the growth of this field in recent years. The first is that multiagent systems have the capacity to play a key role in current and future computer science and its applications. Modern computing platforms and information environments are distributed, large, and heterogeneous. Computers are no longer stand-alone, but have become tightly connected both with each other and with their users. The increasing complexity of computer and information systems goes together with an increasing complexity of their applications. These often exceed the level of conventional, centralized computing because they require, for instance, the processing of huge amounts of data, or of data that arises at geographically distinct locations. To cope with such applications, computers have to act more as "individuals" or agents, rather than just as "parts".

5.2 Intelligent agents that interact

"Agents" are autonomous, computational entities that can be viewed as perceiving their environment through sensors and acting upon their environment through effectors. Agents pursue goals and carry out tasks in order to meet their design objectives, and in general these goals can be supplementary as well as conflicting [3]. "Intelligent" indicates that the agents pursue their goals and execute their tasks such that they optimize some given performance measures. To say that agents are intelligent does not mean that they are omniscient or omnipotent, nor does it mean that they never fail. Rather, it means that they operate flexibly and rationally in a variety of environmental circumstances, given the information they have and their perceptual and effectual capabilities. "Interacting" indicates that agents may be affected by other agents, or perhaps by humans, in pursuing their goals and executing their tasks. Interaction can take place indirectly through the environment in which they are embedded, or directly through a shared language. To coordinate their goals and tasks, agents have to take the dependencies among their activities explicitly into consideration. Two basic, contrasting patterns of coordination are cooperation and competition. In the case of cooperation, several agents work together and draw on the broad collection of their knowledge and capabilities to achieve a common goal. In the case of competition, by contrast, several agents work against each other because their goals are conflicting. Cooperating agents try to accomplish as a team what the individuals cannot, and so fail or succeed together. Competitive agents try to maximize their own benefit at the expense of others, so the success of one implies the failure of others. The following major characteristics of multiagent systems can be identified:
• each agent has only incomplete information and is restricted in its capabilities;
• system control is distributed;
• data is decentralized; and
• computation is asynchronous.
Broadly construed, both DAI and traditional AI deal with computational aspects of intelligence, but they do so from different points of view and under different assumptions.


Where traditional AI concentrates on agents as "intelligent stand-alone systems" and on intelligence as a property of systems that act in isolation, DAI concentrates on agents as "intelligent connected systems" and on intelligence as a property of systems that interact. In this way, DAI is not so much a specialization of traditional AI as a generalization of it.

5.3 Rationales for multiagent systems

The two major reasons to study multiagent systems are:
• Technological and Application Needs — Multiagent systems offer a promising and innovative way to understand, manage, and use distributed, large-scale, dynamic, open, and heterogeneous computing and information systems. The Internet is the most prominent example of such systems; other examples are multi-database systems and in-house information systems. These systems are too complex to be completely characterized and precisely described. DAI aims not only at providing know-how for building sophisticated interactive systems from scratch, but also at interconnecting existing legacy systems such that they coherently act as a whole. Moreover, like no other discipline, DAI aims at providing solutions to inherently distributed and inherently complex applications [4].
• Natural View of Intelligent Systems — Multiagent systems offer a natural way to view and characterize intelligent systems. Intelligence and interaction are deeply and inevitably coupled, and multiagent systems reflect this insight. Natural intelligent systems, like humans, do not function in isolation. Instead, they are at the very least a part of the environment in which they and other intelligent systems operate. Humans interact in various ways and at various levels, and most of what humans have achieved is a result of interaction.
In addition, multiagent systems, as distributed systems, have the capacity to offer several desirable properties:
• Speed-up and Efficiency — Agents can operate asynchronously and in parallel, and this can result in increased speed.
• Robustness and Reliability — The failure of one or several agents does not necessarily make the overall system useless, because other agents already available in the system may take over their part.
• Scalability and Flexibility — The system can be adapted to an increased problem size by adding new agents, and this does not necessarily affect the operationality of the other agents.
• Costs — A multiagent system may be much more cost-effective than a centralized system, since it can be composed of simple subsystems of low unit cost.
• Development and Reusability — Individual agents can be developed separately by specialists, the overall system can be tested and maintained more easily, and it may be possible to reconfigure and reuse agents in different application scenarios.
The available computer and network technology forms a sound platform for realizing these systems. In particular, recent developments in object-oriented programming, parallel and distributed computing, and mobile computing, as well as ongoing progress in programming and computing standardization efforts such as KSE, FIPA and CORBA, are expected to further improve the possibilities of implementing and applying DAI techniques and methods.

5.4 Multiagent systems

5.4.1 Introduction

Agents operate and exist in some environment, which typically is both computational and physical. The environment might be open or closed, and it might or might not contain other agents. Although there are situations where an agent can operate usefully by itself, the increasing interconnection and networking of computers is making such situations rare, and in the usual state of affairs the agent interacts with other agents [5]. Communication protocols enable agents to exchange and understand messages. Interaction protocols enable agents to have conversations, which for our purposes are structured exchanges of messages. As a concrete example of these, a communication protocol might specify that the following types of messages can be exchanged between two agents:
• Propose a course of action
• Accept a course of action
• Reject a course of action
• Retract a course of action
• Disagree with a proposed course of action
• Counterpropose a course of action
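As a purely illustrative sketch of how such a protocol might be encoded, the Python below defines the listed message types as an enumeration and a simple message record; the class names, fields and example conversation are our own assumptions, not taken from [1] or [5].

```python
from dataclasses import dataclass
from enum import Enum, auto

# Minimal sketch of the message types listed above as a typed protocol.
class Performative(Enum):
    PROPOSE = auto()
    ACCEPT = auto()
    REJECT = auto()
    RETRACT = auto()
    DISAGREE = auto()
    COUNTERPROPOSE = auto()

@dataclass
class Message:
    sender: str
    receiver: str
    performative: Performative
    action: str                      # the course of action being discussed

# One possible conversation under such a protocol (illustrative only):
dialogue = [
    Message("agent-A", "agent-B", Performative.PROPOSE, "ship order today"),
    Message("agent-B", "agent-A", Performative.COUNTERPROPOSE, "ship order tomorrow"),
    Message("agent-A", "agent-B", Performative.ACCEPT, "ship order tomorrow"),
]
for m in dialogue:
    print(f"{m.sender} -> {m.receiver}: {m.performative.name} ({m.action})")
```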

5.4.2 Motivations

Why should we be interested in distributed systems of agents? Distributed computations are sometimes easier to understand and easier to develop, especially when the problem being solved is itself distributed. There are also times when a centralized approach is impossible, because the systems and the data belong to independent organizations that want to keep their information private and secure for competitive reasons [5]. The information involved is necessarily distributed, and it resides in information systems that are large and complex in several senses: (1) they can be geographically distributed, (2) they can have many components, (3) they can have huge content, both in the number of concepts and in the amount of data about each concept, and (4) they can have a broad scope, i.e., coverage of a major portion of a significant domain. The topology of these systems is dynamic and their content is changing so rapidly that it is difficult for a user or an application program to obtain correct information, or for an enterprise to maintain consistent information. There are four major techniques for dealing with the size and complexity of these enterprise information systems: modularity, distribution, abstraction, and intelligence, i.e., being smarter about how you seek and modify information. The use of intelligent, distributed modules combines all four of these techniques, yielding a distributed artificial intelligence (DAI) approach. For the practical reason that the systems are too large and dynamic for global solutions to be formulated and implemented, the agents need to execute autonomously and be developed independently.

5.5 Degree of decentralization

Here’s a look at degrees of decentralization of agents that serve multiple, geographically-dispersed users [6].

5.5.1 A single central server

This is generally the easiest and most obvious organization for any sort of agent that serves multiple users.
Advantages:
• Easy to coordinate; no work has to be done by the agent itself to do so.
• Easy for users to know where to contact.
• Lends itself to crossbar algorithms and similar cases in which the entire knowledge base must be examined for each query or action.
• If the server is used by very widely spread users, timezones may spread out some of the load.
Disadvantages:
• Doesn't scale: generally, the workload goes up as the square of the number of users.
• Not fault tolerant: the server is a single point of failure for both performance and security (it is a single obvious point to compromise).
• The majority of users will find themselves a good part of an Internet diameter away from the server; this can be serious if low latency is required of the server.

5.5.2 Multiple mirrored servers

This describes a class of server where the basic algorithm is run in parallel on a number of machines (typically, very-loosely-coupled parallelism, e.g., separate workstations, rather than a single MIMD or SIMD parallel architecture). Such architectures in general can be divided into:
• Tightly-consistent architectures, in which all servers have exactly the same, or virtually the same, database and simply handle multiple requests or take actions for users in parallel, possibly checkpointing with each other as each action is taken, and
• Loosely-consistent architectures, in which the servers have mostly the same information, or at least information in the same domain, but they do not try to enforce a particularly strong and consistent world view among themselves.

The choice of tight or loose consistency is generally a function of the operations being supported by the servers.
Advantages:
• These architectures are handy when it is relatively simple to maintain database consistency between servers (for example, if user requests or actions taken on their behalf do not side-effect the database, then its consistency is easier to maintain).
• Load-balancing is fairly simple, and extra hosts can be added incrementally to accommodate increases in load.
• The servers may be geographically distributed to improve either network load-balancing, time zone load-balancing, or fault-tolerance.
Disadvantages:
• If the algorithm requires tight consistency, the requisite interserver communications costs can eventually come to dominate the computation.
• Even loosely-consistent servers will probably still suffer from roughly quadratic growth in load with the number of users. This implies that, to keep up with even linear user growth, a quadratically-increasing number of servers must be put online; keeping up with typical exponential growth, of course, is much harder.

5.5.3 Multiple, non-mirrored servers

These types of agent architectures can fairly trivially be divided into:
• Those that know about each other, and
• Those that probably don't.
The Web itself is an example of this architecture; the servers do not need to know about each other and, in general, do not mirror each other. Few agent architectures seem to be designed in this way, however, except in the limit of the same agent simply being run in multiple instantiations in different information domains.
Advantages:
• Consistency is easy to achieve.
• Load sharing may be implemented as in the mirroring case above.
Disadvantages:
• Similar to mirrored servers, though the disadvantage of maintaining consistency is eliminated.
• Load growth may still be a problem.
• It may be difficult to find all servers if the algorithm demands it, since the lack of mirroring means servers may tend to fall out of touch with each other.

5.5.4 Totally distributed peers

As in the case above of multiple, non-mirrored servers, totally-distributed peers can be divided into:
• Those that know about each other
• Those that probably don't
This approach resembles an ALife system more than the approaches above, and deviates most radically from typical client/server or centralized-system approaches.
Advantages:
• Can probably be scaled up to accommodate loading easily, because servers are also peers and probably associate close to 1-to-1 with the user base.
• No central point of either failure or compromise.
Disadvantages:
• Coordination between the peers becomes much more difficult.
• Algorithms that require global consistency are probably impossible to achieve with acceptable performance.
• It may be difficult to keep all agents at similar software revision levels.

5.6 Applications

Many existing and potential industrial and commercial applications for DAI and multiagent systems are described in the literature [7]. Examples of such applications include:
• Electronic commerce and electronic markets, where "buyer" and "seller" agents purchase and sell goods on behalf of their users.
• Real-time monitoring and management of telecommunication networks, where agents are responsible, e.g., for call forwarding and signal switching and transmission.
• Modelling and optimization of in-house, in-town, national or worldwide transportation systems, where agents represent, e.g., the transportation vehicles or the goods or customers to be transported.
• Information handling in information environments like the Internet, where multiple agents are responsible, e.g., for information filtering and gathering.
• Improving the flow of urban or air traffic, where agents are responsible for appropriately interpreting data arising at different sensor stations.
• Automated meeting scheduling, where agents act on behalf of their users to fix meeting details like location, time and agenda.
• Optimization of industrial manufacturing and production processes like shopfloor scheduling or supply chain management, where agents represent, e.g., different workcells or whole enterprises.
• Analysis of business processes within or between enterprises, where agents represent the people or distinct departments involved in these processes at different stages and at different levels.
• Electronic entertainment and interactive, virtual-reality-based computer games, where, e.g., animated agents equipped with different characters play against each other or against humans.
• Design and re-engineering of information- and control-flow patterns in large-scale natural, technical, and hybrid organizations, where agents represent the entities responsible for these patterns.
• Investigation of social aspects of intelligence and simulation of complex social phenomena such as the evolution of roles, norms, and organizational structures, where agents take on the role of the members of the natural societies under consideration.

5.7 Conclusion

Distributed intelligent agents have the potential to play a significant role in the future of software engineering. Further research is needed to develop the basis and techniques for societies of computational agents that execute in open environments for indefinite periods [8]. Distributed planning has a variety of reasonably well-studied tools and techniques in its repertoire. One of the important challenges to the field is characterizing these tools and understanding where and when to apply each. Until much of the assumed context and semantics of the plans is unveiled, the goal of having heterogeneous plan-generation and plan-execution agents work together is likely to remain elusive. The field of distributed problem solving is even more wide open, because the characterization of a 'problem' is that much broader. Representations and general-purpose strategies for distributed problem solving are thus even more elusive. Distributed problem solving and planning strategy still has more 'art' to it than we would like to see in an engineering discipline [9]. Although real-time search provides an attractive framework for resource-bounded problem solving, the behavior of the problem solver is not rational enough for autonomous agents: the problem solver tends to perform superfluous actions before attaining the goal; it cannot utilize and improve previous experiments; it cannot adapt to dynamically changing goals; and it cannot cooperatively solve problems with other problem solvers [10].

In the future, systems will increasingly be designed, built, and operated in a distributed manner. A large number of systems will be used by multiple real-world parties. The problem of coordinating these parties and avoiding manipulation cannot be tackled by technological or economic methods alone. Instead, the successful solutions are likely to emerge from a deep understanding and careful hybridization of both [11]. Centralized systems are failing for two simple reasons: they cannot scale, and they don't reflect the real world of people [12]. Decentralized approaches often seem impractical, but they work in practice. The Internet itself is a prime example: it works because the content, the domain name system and the routers are radically distributed. But it is the human element that is really driving the pressure for decentralized solutions. This shouldn't be too surprising. Biological phenomena like the human body and the global biosphere have had billions of years to evolve, and they are the most complex decentralized systems we encounter. Decentralization is neither automatic nor absolute. The most decentralized system doesn't always win. The challenge is to find the equilibrium points: the optimum group sizes, the viable models and the appropriate social compromises.

Bibliography
1. Gerhard Weiss: Multiagent Systems, A Modern Approach to Distributed Artificial Intelligence. The MIT Press, 1999, p. 1.
2. http://webopedia.internet.com/Term/a/artificialintelligence.html
3. Gerhard Weiss: Multiagent Systems, A Modern Approach to Distributed Artificial Intelligence. The MIT Press, 1999, pp. 2–5.
4. Gerhard Weiss: Multiagent Systems, A Modern Approach to Distributed Artificial Intelligence. The MIT Press, 1999, pp. 8–9.
5. Gerhard Weiss: Multiagent Systems, A Modern Approach to Distributed Artificial Intelligence. The MIT Press, 1999, pp. 79–81.
6. http://foner.www.media.mit.edu/people/foner/Essays/Agent-Coordination/degree-of-decentralization.html
7. Gerhard Weiss: Multiagent Systems, A Modern Approach to Distributed Artificial Intelligence. The MIT Press, 1999, pp. 6–7.
8. Gerhard Weiss: Multiagent Systems, A Modern Approach to Distributed Artificial Intelligence. The MIT Press, 1999, pp. 28–73.
9. Gerhard Weiss: Multiagent Systems, A Modern Approach to Distributed Artificial Intelligence. The MIT Press, 1999, pp. 121–158.
10. Gerhard Weiss: Multiagent Systems, A Modern Approach to Distributed Artificial Intelligence. The MIT Press, 1999, pp. 165–196.
11. Gerhard Weiss: Multiagent Systems, A Modern Approach to Distributed Artificial Intelligence. The MIT Press, 1999, pp. 201–251.
12. http://192.246.69.113/archives/000010.html


Session 6
Networks of Agents
Jani Kaartinen
[email protected]

It seems that in today's world everything is linked, or networked, together. When studied more closely, networks varying from the Internet to a cocktail party or ancient bacteria resemble each other, and the same rules apply. However, there are no solid tools for studying networks. That is why we have to learn how to interpret these nets and try to see whether "network thinking" could enable us to see something new, or at least see things in a different light.

6.1 Introduction

On February 7th, 2000, at 10:20 Pacific Standard Time, Yahoo, one of the most popular search engines at the time, started to receive billions of service requests. This would have been a cause for celebration, but there was a problem. The requests were not produced by ordinary people trying to find information; they were produced by a computer program running on a large number of computers. The requests were not very meaningful either: the messages sent to hundreds of computers at Yahoo's Santa Clara, California, headquarters contained only the message "Yes, I heard you!" [24]. Yahoo's computers were under a "denial of service" attack while millions of legitimate customers, who wanted a movie title or an airline ticket, waited. The next day the same attack was targeted at Amazon.com, eBay, CNN.com, ETrade and Excite. The damage caused to these "giants of the Web" was huge, and thus a very high-profile search was launched by the Federal Bureau of Investigation (FBI). The common opinion was that the attack must be the doing of a group of sophisticated crackers who had hijacked hundreds of computers in schools, research labs and businesses and turned them into zombies screaming "Yes, I heard you!" at Yahoo and the others. Finally the FBI solved the crime. They did not find the much-anticipated cyberterrorist organization, however. Instead, they tracked down a single fifteen-year-old teenager living in the Canadian suburbs. The funny thing was that the FBI would never have found the boy had he not been bragging about his doings in a chat room where he called himself MafiaBoy. MafiaBoy successfully managed to halt the operation of billion-dollar companies with access to the best computer security experts in the world. Although it turned out that MafiaBoy was not even very skilled with computers, he still managed to organize this attack from his suburban home using his modest desktop computer. What is more interesting is that these big companies were completely powerless against this type of rudimentary attack. What made it possible for this fifteen-year-old to cause this kind of damage? The answer lies in the structure of the complex network that he was using to accomplish his task. Although it is clear that he could not have done anything to these computers directly, he exploited the properties and weaknesses of the underlying network, the Internet. If a mere youth can cause this kind of harm, what could a small group of trained and skilled professionals achieve? How vulnerable are we to such attacks?

In the light of the example above, it seems clear that the properties and weaknesses of the Internet have to be studied carefully in order to assure the reliable operation of legitimate companies and of the other everyday activities that we use it for. However, there have been findings suggesting that the same kinds of structures and behaviors can be found in a number of different kinds of networks, varying from the social network at a cocktail party to the metabolic network within a cell. Thus, a growing number of researchers believe that there must be some universal laws describing these nets, and once we find them for a certain network we can possibly utilize the laws in some other nets and explain their behavior more precisely.

6.1.1 Reductionism

Reductionism has been a very popular method among scientists in the last decade. The idea behind reductionism is that to comprehend nature we have to study its smallest parts: once we understand the parts, it will be easy to grasp the whole. Divide and conquer; the devil is in the details. To understand the universe we have to study atoms and superstrings, molecules to comprehend life, individual genes to understand complex human behavior, and so on. However, even if we know almost everything there is to know about the pieces, we are still as far as we have ever been from understanding nature as a whole. The main problem with reductionism is that once we start to put our little pieces of the puzzle together, we run into the hard wall of complexity. There are billions of ways to do this assembly, if you will, and we would need something more than just the knowledge of the pieces. One possible remedy to this problem is networks; they are everywhere, and all we need is an eye for them. Once the general laws governing the properties and formation of different networks are studied and understood, we may have a very powerful "roadmap" or "blueprint" for a variety of our complex problems.

6.2 Advent of Graph Theory

Graph theory can be considered the basis for today's thinking about networks. It was born in 1736, when Leonhard Euler, a Swiss-born mathematician who spent most of his career in Berlin and St. Petersburg, offered a proof for a problem that had been troubling the people of Königsberg [8].

6.2.1 Königsberg bridges

In Euler's time Königsberg was a flourishing city in eastern Prussia, since it had not yet confronted the horrors of the Second World War. The city had a busy fleet of ships, and their trade offered a comfortable life to the local merchants and their families. Thanks to the good economic situation, the city officials had built seven bridges across the Pregel river. The bridges can be seen in Figure 6.1, where the seven bridges connect four pieces of land (A, B, C and D) together. For a long time there was an amusing mind puzzle among the citizens of Königsberg: "Can one walk across the seven bridges and never cross the same one twice?"


Figure 6.1: Königsberg bridges

Once Euler heard about this, he was intrigued by the problem and decided to search for a solution. This he quickly found, and he was able to prove that such a path does not exist. He even wrote a short paper describing his solution [5]. What is interesting to us is not so much the fact that Euler could solve the problem, but rather the intermediate step that he took in the process: Euler had the foresight to simplify his problem by considering it as a graph consisting of links and nodes. The nodes in this case are the four pieces of land and the links are the seven bridges connecting them. Once this is done, it is easy to prove in general that if a node has an odd number of links, it must be either a starting point or an end point of the journey. This means that there cannot be more than two nodes with an odd number of links, and in the case of the bridges there were four, i.e. all of the nodes had an odd number of links. Thus the path cannot exist, even if we spent our lives searching for it. This is important to notice: it does not depend on our ability to find such a path; rather, it is a property of the graph. Finally the people of Königsberg agreed with Euler, and later on they built yet another bridge, solving the problem. There were many consequences to this intermediate step Euler took in his proof. First of all, it launched an avalanche of studies and contributions by mathematical giants such as Cauchy, Hamilton, Cayley, Kirchhoff and Pólya. On the other hand, it started the thinking about networks. It was acknowledged that there are many types of networks (or graphs, at that time) and that they can have properties hidden in their construction that enable or restrict what can be done with them. Also, by making even very small changes to these properties we can drastically change the behavior of the network, and perhaps open hidden doors that allow new possibilities to emerge, as was the case in Königsberg: just by adding a single link to the graph, the whole problem was solved.
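Euler's parity argument is easy to reproduce computationally. The sketch below is our own illustration; the edge list follows the usual textbook layout of the four land masses, and the lettering is an assumption that may differ from Figure 6.1.

```python
from collections import Counter

# The seven Konigsberg bridges as links between the four pieces of land
# (usual textbook layout: the island A has five bridges; B, C and D have
# three each; the lettering is assumed).
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

links = Counter()
for u, v in bridges:
    links[u] += 1
    links[v] += 1

odd_nodes = [node for node, k in links.items() if k % 2 == 1]
print(dict(links))      # {'A': 5, 'B': 3, 'C': 3, 'D': 3}
print(len(odd_nodes))   # 4 -- more than 2, so no such walk exists
```

Since a walk that crosses every bridge exactly once can tolerate at most two nodes with an odd number of links, and here all four are odd, the desired path cannot exist; adding one more bridge flips the parity of its two endpoints, leaves exactly two odd nodes, and makes the walk possible, as noted above.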


After Euler's work, the main goal of graph theory was to discover and catalogue various graphs and their properties. For example, labyrinth problems, first solved in 1873, were very famous at that time, along with problems like finding a sequence of moves with a knight on a chess board such that each square is visited only once and the knight returns to its starting point. Also, things like the lattice formed by atoms in a crystal or the hexagonal lattice made by bees in a beehive were studied.

6.2.2 Random networks

It took two centuries after the advent of graph theory before scientists moved from studying the properties of various graphs to asking how networks form. What are the laws governing their appearance, structure and the changes in them? The first answers to these questions were given by two Hungarian mathematicians, Paul Erdős and Alfréd Rényi, when in 1959 they published their paper about random networks [9]. The idea behind random networks was that when creating a network, the links between nodes are added completely randomly. Once they started adding random links one by one, they noticed that when the average number of links per node approached one, something special happened: a single network emerged. Physicists would perhaps call this phenomenon a phase transition, similar to the moment in which water freezes. Sociologists would say that the subjects had just formed a community, but all would agree that something special happened: after placing a critical number of links, the network changed drastically. The properties of random networks were studied further, and one of Erdős's students, Béla Bollobás, proved in 1982 that the histogram of links per node in a random network follows a Poisson distribution [6]. The Poisson distribution tells us that the majority of the nodes have the same number of links as the average node (the peak value of the distribution) and that it is extremely rare for a node in the graph to deviate significantly from this "prototype node". Translated to society, for example, we end up with a very democratic world in which everyone has on average the same number of friends or other social links. It tells us that most companies trade with roughly the same number of companies, most neurons connect to roughly the same number of neurons, most Websites are visited by roughly the same number of people, and so on. But is this the way that nature truly behaves? Of course it is not.


Although Erdős and Rényi understood that there is a very diverse spectrum of different kinds of networks in the universe, they deliberately disregarded this diversity and proposed the simplest possible way nature could follow: randomness. They never tried to solve all network problems at once, but rather to propose a general approach to certain problems by introducing randomness into their model. This can also be seen in their seminal 1959 paper: "the evolution of graphs may be considered as a rather simplified model of the evolution of certain communication nets (railway, road or electric network systems, etc.)" [9]. Still, their work inspired research into the theory behind complex networks, and since for a long time there was no better approach available, complex networks were often considered to be fundamentally random.
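Both observations above, the emergence of a single connected network near an average of one link per node and the Poisson-shaped link histogram, are easy to reproduce numerically. The following is a minimal sketch assuming the networkx library is available; the network size and the sampled average degrees are arbitrary illustrative choices, not values from the text.

import networkx as nx
from collections import Counter

N = 10000                          # number of nodes (arbitrary)
for k_avg in (0.5, 1.0, 1.5, 4.0):
    p = k_avg / (N - 1)            # link probability giving the desired average degree
    G = nx.gnp_random_graph(N, p, seed=1)
    giant = max(nx.connected_components(G), key=len)
    print(f"<k> = {k_avg}: largest component holds {len(giant)/N:.2%} of the nodes")

# Degree histogram for a supercritical case: sharply peaked around <k>,
# as the Poisson distribution predicts.
G = nx.gnp_random_graph(N, 4.0 / (N - 1), seed=1)
hist = Counter(d for _, d in G.degree())
for k in sorted(hist)[:10]:
    print(k, hist[k])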

6.3 Degrees of separation

One interesting aspect of networks that has intrigued scientists over the decades is their smallness: even if the network at hand is very large, it often seems to be so interconnected that the average distance between nodes is surprisingly small. One of the most important concepts in this area is the so-called "six degrees of separation" [13]. The concept was born in the writings of Frigyes Karinthy. In his short story "Láncszemek" ("Chains" in English) he proposes that everybody is at most five handshakes away from everybody else. He presents a few interesting examples in which he can connect himself to any Nobelist or to any worker in Henry Ford's car factory. Although the "six degrees of separation" did not draw much attention in Karinthy's time, it was rediscovered decades later, in 1967, by Stanley Milgram, a Harvard professor who turned the concept into a famous study of our interconnectivity [21]. He modified Karinthy's idea so that he could estimate the "distance" between any two people in the United States. In his experiment he sent the following letter, along with other information, to 160 randomly chosen people living in Wichita and Omaha.

HOW TO TAKE PART IN THIS STUDY

1. ADD YOUR NAME TO THE ROSTER AT THE BOTTOM OF THIS SHEET, so that the next person who receives this letter will know who it came from.

2. DETACH ONE POSTCARD. FILL IT OUT AND RETURN TO HARVARD UNIVERSITY. No stamp is needed. The postcard is very important. It allows us to keep track of the progress of the folder as it moves toward the target person.


3. IF YOU KNOW THE TARGET PERSON ON A PERSONAL BASIS, MAIL THIS FOLDER DIRECTLY TO HIM (HER). Do this only if you have previously met the target person and know each other on a first name basis.

4. IF YOU DO NOT KNOW THE TARGET PERSON ON A PERSONAL BASIS, DO NOT TRY TO CONTACT HIM DIRECTLY. INSTEAD, MAIL THIS FOLDER (POSTCARDS AND ALL) TO A PERSONAL ACQUAINTANCE WHO IS MORE LIKELY THAN YOU TO KNOW THE TARGET PERSON. You may send the folder to a friend, relative or acquaintance, but it must be someone you know on a first name basis.

After sending the letters, Milgram wondered whether any of them would find their way to the recipient, and if they did, how many links it would take. He also consulted other well-educated people with his question, and their best estimate was close to 100. To their great surprise, the first letter arrived within a few days, passing through only two intermediate links! This turned out to be the shortest path ever recorded. Eventually 42 of the 160 letters made it back, and the median number of intermediate persons turned out to be 5.5, quite a small number compared to what he and his well-educated friends had come up with. Although the results of Stanley Milgram's experiment have been questioned (see [18] and [17]), they provide clear evidence that the distance between any two people in a "social web" is relatively small. Degrees of separation have also been studied in many other networks: for example, the species in a food web are separated by two degrees of separation ([28], [22]), molecules in a cell are connected through three chemical reactions ([15], [14], [26]), scientists through four co-authorships ([23]), neurons in the brain of the Caenorhabditis elegans worm through 14 synapses ([27]), and so on. However, one could argue that experiments performed with social networks and with such a small number of data points could be very far from the truth. This suspicion can today be addressed with the enormous databases that are kept on various systems. For example, social networks among actors and actresses have been studied utilizing the Internet Movie Database (www.imdb.com), which contains virtually every actor or actress in the world that has appeared in a movie. Also the modern computer networks that are connected through routers, the largest being the Internet, are a good source of well-documented information for these studies. In fact, the degree of separation for the Internet is 10. The largest degree of separation found today is 19, and it was identified for the highly interconnected network of a billion-plus documents that are connected to each other through hyperlinks in the World Wide Web (WWW) [2].


The problem can also be approached completely analytically by using the theory of random networks. It can be shown that, in general, the degree of separation d for a random network is

d = \frac{\log N}{\log k},

where N is the size of the net and k is the average number of links per node. As can be seen from the logarithmic behavior of d, the degree of separation grows very slowly even when the size of the network becomes large. This tells us in the clear terms of mathematics that when it comes to interconnected networks, "we live in a small world".
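The logarithmic formula above is easy to evaluate. The sketch below plugs in a few example (N, k) pairs; the numbers are round illustrative values chosen here, not measurements from the studies cited above.

import math

def degrees_of_separation(N, k):
    """Random-network estimate d = log N / log k."""
    return math.log(N) / math.log(k)

examples = [
    ("small social circle", 1_000, 10),
    ("city-sized network",  1_000_000, 50),
    ("Web-scale network",   1_000_000_000, 7),
]
for name, N, k in examples:
    d = degrees_of_separation(N, k)
    print(f"{name:>20s}: N = {N:>13,d}, k = {k:3d}  ->  d = {d:.1f}")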

6.4 Hubs and connectors

Malcolm Gladwell, a staff writer at the New Yorker magazine, wrote a book named The Tipping Point in which he introduces a simple test: the task is to go through a list of 248 surnames that he had compiled from the Manhattan phone book and count the number of people from your own circle of friends and acquaintances whose surname appears on the list, counting also multiple occurrences [11]. The purpose of the test was to measure the social connectedness of different people. Gladwell tested about four hundred people, including several different types of groups and also very homogeneous groups (similar age, education and income). The average varied according to the education, age and social status of the groups and was somewhere between 20 and 40. This was not so important, however, and could easily be explained. The interesting thing in these experiments was that the variation in the results was very large and independent of the group or the homogeneity of the people in it. The lowest score was typically between 2 and 16, and the highest between 95 and 118. Gladwell found that in every test there is a small number of people who have a special ability to make friends and acquaintances more easily than others. He called these people connectors, but in network terms they can be called hubs. The same kind of phenomenon has been studied in many different networks (e.g. the WWW, the Internet, social nets) [16], [10]. Yet the random universe discussed above does not support connectors. If the networks were random, the existence of hubs would be practically forbidden due to their vanishingly small probability. If the WWW were a random network, the probability of there being a page with five hundred incoming links would be 10^{-99}. Yet the latest Web survey, covering roughly 20% of the full Web, found four hundred such documents.


The document with the most incoming links had over two million links. The chance of finding such a document in a random network is smaller than that of locating a particular atom in the universe. Thus, new laws for describing networks had to be found.

6.5 The Scale-Free Networks

When researchers studied the WWW, it was found that the number of links between HTML (Hyper Text Markup Language) pages did not follow the Poisson distribution expected of a random network. From real data collected by robots crawling through the WWW it was found that, when the data was plotted on a log-log scale, it fell on a straight line. This was a strong indication that the distribution follows a power law [7]. This implies that the number of Web pages with exactly k links, denoted by N(k), follows N(k) \sim k^{-\gamma}, where the parameter \gamma is the degree exponent. For the WWW, \gamma was recorded to be 2.1 for incoming links and 2.5 for outgoing links [2]. The same kind of behavior has been found in many other networks as well. For example, by utilizing the vast database of movies (The Internet Movie Database, www.imdb.com) it was found that the number of actors with links to exactly k other actors decays following a power law [10]. Another example is the cell, where the number of molecules interacting with exactly k other molecules also follows a power law [26].
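A quick way to test a list of node degrees for power-law behavior is to estimate the degree exponent. The sketch below uses the standard maximum-likelihood estimator gamma = 1 + n / sum(ln(k_i / k_min)); the synthetic degree data generated here is only a stand-in for real crawl data such as the WWW measurements cited above, and the chosen exponent and sample size are arbitrary.

import math
import random

def estimate_gamma(degrees, k_min=1.0):
    """Maximum-likelihood estimate of the exponent of N(k) ~ k^(-gamma)."""
    ks = [k for k in degrees if k >= k_min]
    return 1.0 + len(ks) / sum(math.log(k / k_min) for k in ks)

# Synthetic power-law degrees via inverse-transform sampling (continuous approximation).
random.seed(1)
gamma_true, k_min = 2.1, 1.0
degrees = [k_min * (1.0 - random.random()) ** (-1.0 / (gamma_true - 1.0))
           for _ in range(100_000)]

print("true exponent     :", gamma_true)
print("estimated exponent:", round(estimate_gamma(degrees, k_min), 2))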

6.5.1 The 80/20 Rule

One interesting implication of the existence of power laws in certain systems has been around for a long time, since the beginning of the 20th century, and is known as the 80/20 rule. The rule is due to the influential Italian economist Vilfredo Pareto. Pareto was an avid gardener, and he happened to notice that 80 percent of his peas were produced by only 20 percent of the peapods. He also made other similar observations; for example, he noticed that 80 percent of Italy's land was owned by only 20 percent of the population, and so on. More recently, Pareto's Law or Principle got the name 80/20 rule and turned into the Murphy's Law of management: 80 percent of profits are produced by only 20 percent of employees, 80 percent of customer service problems are created by 20 percent of consumers, 80 percent of decisions are made during 20 percent of the meeting time, 80 percent of crime is committed by 20 percent of criminals, and so on [19].


The 80/20 rule is a clear implication of the existence of power laws in certain systems: it says that there is a small number of strong influencers (the hubs) in the system, while the other nodes are far less important.
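The connection can be illustrated numerically: for a power-law degree sequence, a small fraction of the nodes holds a large fraction of all the links. The sketch below draws a synthetic power-law sample (the exponent, minimum degree and sample size are arbitrary illustrative choices) and reports the share of links held by the best-connected 20 percent of the nodes.

import random

random.seed(2)
gamma, k_min, n = 2.1, 1.0, 100_000

# Synthetic power-law degrees (continuous approximation, inverse-transform sampling).
degrees = sorted((k_min * (1.0 - random.random()) ** (-1.0 / (gamma - 1.0))
                  for _ in range(n)), reverse=True)

top_20_percent = degrees[: n // 5]
share = sum(top_20_percent) / sum(degrees)
print(f"share of all links held by the top 20% of nodes: {share:.0%}")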

6.5.2 Random and scale-free networks

Of course, not all networks follow a power-law distribution. There are many networks in the universe that have nothing to do with power laws. But there also exist many real networks that do, and they are called scale-free networks. The name comes from the fact that networks following a power law have no intrinsic scale: there is no single node that we could pick and declare the prototype node of the network, as we can in a network that follows a bell curve. A good way to visualize this is to consider the differences between a random network and one described by a power-law degree distribution. Let us compare a U.S. roadmap with an airline routing map. In the roadmap, cities are the nodes and the highways connecting them are the links. As one can imagine, the roadmap is a fairly uniform network: each major city has at least one link to the highway system, and on the other hand there are no cities with hundreds of links. The airline routing map is a completely different graph. In this map, airports (nodes) are connected to other airports by direct flights between them (links). In this case it is clear that there exist a few major airports in the country with a very high number of flights and a large number of smaller airports with only a few flights. When the two networks are compared, it is clear that they have very different properties, and this is why they also look very different. The roadmap follows a bell curve, telling us that the vast majority of the nodes have the same number of links and that it is extremely rare to find a node that differs greatly from the average, which is located at the peak value of the degree distribution. The airline map, on the other hand, follows a power-law distribution, and there is no single node that could be selected as a prototype of the network. Rather, there exist a few major influencers (hubs) that take care of most of the traffic and make the network scale-free. As can be seen, scale-free networks are a special class of networks whose topology is such that there is no prototype node; instead, the hubs define the behavior of the network [3].

6.5.3 Robustness vs. vulnerability

Let us consider the two types of networks presented above and think of them as communication networks, where each node is a communication point and messages can be sent via the links between nodes. Which would be more reliable? As usual, the answer is not unambiguous. If we consider robustness against random errors (equipment failures, for example), then studies have shown that scale-free networks are extremely error tolerant: typically nearly 80% of the nodes can be removed at random before the network is crippled. A random network, on the other hand, is halted fairly easily. This difference in behavior is due to the connecting force of the hubs in the scale-free network. Since the probability of an error occurring in a node is exactly the same for every node in the network, the scale-free network can afford to lose a large number of the "less important" nodes without a significant loss in performance. There is a tradeoff, however: while the hubs increase the network's stability against random errors, at the same time they decrease its tolerance against targeted attacks. This is clear because of the important role of the hubs in the network, and it is exactly how MafiaBoy succeeded in his attack (as described in the beginning): the attack was targeted at the most important hubs, and thus the damage was enormous. This tradeoff is very important to notice and should be considered carefully in network design. Obviously, the selection of the network topology should be based on the boundary conditions (intended use, security aspects and so on) [1].
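The error-versus-attack tradeoff can be reproduced with a small simulation. The sketch below assumes the networkx library is available and uses a preferential-attachment (Barabási–Albert) construction as a stand-in for a scale-free network, which is a standard mechanism not described in the text; sizes and removal fractions are arbitrary illustrative choices. Nodes are removed either uniformly at random or in order of decreasing degree, and the size of the surviving largest connected component is reported.

import random
import networkx as nx

def giant_fraction(G, n_original):
    """Largest connected component as a fraction of the original network size."""
    if G.number_of_nodes() == 0:
        return 0.0
    return len(max(nx.connected_components(G), key=len)) / n_original

def remove_and_measure(G, fraction, targeted):
    H = G.copy()
    n_remove = int(fraction * G.number_of_nodes())
    if targeted:   # attack: remove the best-connected nodes (the hubs) first
        by_degree = sorted(H.degree(), key=lambda nd: nd[1], reverse=True)
        victims = [node for node, _ in by_degree[:n_remove]]
    else:          # random failures
        victims = random.sample(list(H.nodes()), n_remove)
    H.remove_nodes_from(victims)
    return giant_fraction(H, G.number_of_nodes())

random.seed(0)
N = 5000
nets = {
    "random (ER)":     nx.gnp_random_graph(N, 4.0 / N, seed=0),
    "scale-free (BA)": nx.barabasi_albert_graph(N, 2, seed=0),
}
for name, G in nets.items():
    for frac in (0.05, 0.20, 0.50):
        fail = remove_and_measure(G, frac, targeted=False)
        attack = remove_and_measure(G, frac, targeted=True)
        print(f"{name:>15s}, remove {frac:.0%}: giant component "
              f"{fail:.2f} after random failures, {attack:.2f} after a targeted attack")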

6.6 Viruses and fads

Another interesting and much studied property of most networks is their ability to spread information rapidly. In most cases this is a desirable phenomenon, but there are cases where it can lead to unwanted results (e.g. computer viruses, or the spreading of diseases such as AIDS or, more recently, SARS). There are, however, many unanswered questions, like: Why do some things make it while others just "crash and burn"? How could this behavior be predicted in a network? Could it be controlled reliably? Again, it is important to understand the properties of the underlying network when studying these things.


Aiming to explain the disappearance of some fads and viruses and the spread of others, social scientists and epidemiologists have developed a tool called the threshold model. The model is based on the fact that people differ in their willingness to adopt a new idea, but that in general, with sufficient positive evidence, everyone can be convinced; only the required level of positive evidence differs. Thus, the model gives every individual (or, generally, every node) an individual threshold value indicating the likelihood that he or she will adopt a given innovation. Another parameter, called the critical threshold, can be calculated from the properties of the network, and thus the spreading rate of a given innovation can be estimated [12].
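A minimal version of the threshold model is sketched below, assuming a simple fractional-threshold variant: each node gets a random threshold, and it adopts the innovation once the fraction of its neighbours that have adopted reaches that threshold. The graph type, the threshold range and the seed set are arbitrary illustrative choices, and the networkx library is assumed to be available.

import random
import networkx as nx

def threshold_cascade(G, seeds, thresholds):
    """Iterate the fractional threshold model until no new node adopts."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node in G.nodes():
            if node in adopted:
                continue
            neigh = list(G.neighbors(node))
            if not neigh:
                continue
            frac = sum(1 for v in neigh if v in adopted) / len(neigh)
            if frac >= thresholds[node]:
                adopted.add(node)
                changed = True
    return adopted

random.seed(3)
G = nx.watts_strogatz_graph(1000, 6, 0.1, seed=3)      # a small-world "social" network
thresholds = {v: random.uniform(0.05, 0.5) for v in G.nodes()}
seeds = random.sample(list(G.nodes()), 5)               # a handful of early adopters

adopted = threshold_cascade(G, seeds, thresholds)
print(f"innovation reached {len(adopted)/G.number_of_nodes():.0%} of the network")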

Figure 6.2: Mike Collins's cartoon

One interesting example of the rapid spreading rate was the frustration of a twenty-six-year-old municipal water board engineer with the confusion in the presidential elections in Florida in 2000. The man was called Mike Collins, and to make fun of the situation he sketched a picture like the one shown in Figure 6.2 and sent it to thirty friends through e-mail. The next day was his birthday and the day his sister gave birth to a daughter, so he was out all day. When he came home, a huge surprise awaited: 17 000 new hits on his Web page and several hundred e-mails. While he was away, his biting cartoon had circled the globe, and newspapers and Websites from the United States to Japan were bombarding him with requests for permission to publish. In a few hours he went from unknown "Mike" to an instant celebrity, with girls hitting on him and parents wanting to fix him up with their daughters. On the other hand, when commercial companies try to do similar things with their products, they often fail even if they spend huge amounts of money trying [20]. It seems that the threshold model is quite a powerful tool for estimating the spreading of different things in networks: epidemiologists use it to model the probability that a new infection will turn into an epidemic; marketing people use it to estimate how a new product will do in the market; sociologists use it to explain the spread of birth control practices among women; political scientists exploit it to explain the life cycle of parties and movements, or to model the likelihood that peaceful demonstrations turn into riots.

6.7 The Map of Life

"Today we are learning the language in which God created life," said President Bill Clinton on June 26, 2000, at the White House ceremony announcing the decoding of the 3 billion chemical "letters" of the human genome. Was he right? Now that we have the full DNA of a human being figured out, don't we have all the information needed to cure diseases and to predict medical conditions? The answer is: no! And the reason should be clear by now. As explained earlier when talking about reductionism, we again run into the hard wall of complexity. We only have the pieces, but not the blueprint. To cure most illnesses, we need to understand living systems in their integrity. We need to understand how and when different genes work together, how messages travel within the cell, and which reactions are or are not taking place in this complex cellular network. However, studies have shown that scale-free network behavior is present also in metabolic networks: for example, a study conducted on 43 different organisms on how many reactions each molecule participates in showed that all the nets were scale-free with three degrees of separation [15]. There are also many other examples showing that networks are present in living organisms [25]. So the question is: if we improve our "network thinking" abilities, can we also improve our understanding of living systems? Many researchers seem to think that we can.

6.8 Conclusions

Networks are present almost everywhere; all we need is an eye for them. Perhaps, if we shift our way of thinking towards a network world, we can open up new doors for science. One interesting area of research would be dynamical models of the changes taking place in different networks. After all, most real networks are under continuous change, and the networks describing them are only the skeletons of complexity. Identifying the underlying structures and their properties gives us a map for our journey through complex worlds!


Bibliography

1. Albert R., Jeong H., Barabási A-L.: Attack and Error Tolerance of Complex Networks. Nature, vol. 406, 2000, p. 378.
2. Albert R., Jeong H., Barabási A-L.: Diameter of the World Wide Web. Nature, vol. 401, 1999, pp. 130–131.
3. Amaral L. A. N., Scala A., Barthélémy M., Stanley H. E.: Classes of Small-World Networks. Proceedings of the National Academy of Sciences, vol. 97, 2000, pp. 11149–11152.
4. Barabási A-L.: Linked: The New Science of Networks. Perseus Publishing, Cambridge, Massachusetts, 2002.
5. Biggs N. L., Lloyd E. K., Wilson R. J.: Graph Theory: 1736–1936. Clarendon Press, Oxford, England, 1976.
6. Bollobás B.: Degree Sequences of Random Graphs. Discrete Mathematics, vol. 33, 1981, p. 1.
7. Buchanan M.: Ubiquity: The Science of History . . . Or Why the World Is Simpler Than We Think. Crown Publishers, New York, 2001.
8. Dunham W.: Euler: The Master of Us All. Mathematical Association of America, Washington D.C., 1999.
9. Erdős P., Rényi A.: On Random Graphs I. Publ. Math. Debrecen, vol. 6, 1959, pp. 290–297.
10. Fass C., Ginelli M., Turtle B.: Six Degrees of Kevin Bacon. Plume, New York, 1996.
11. Gladwell M.: The Tipping Point. Little, Brown, New York, 2000.
12. Granovetter M.: Threshold Models of Collective Behavior. American Journal of Sociology, vol. 83, no. 6, 1978, pp. 1420–1443.
13. Guare J.: Six Degrees of Separation. Random House, New York, USA, 1990.
14. Jeong H., Mason S., Barabási A-L., Oltvai Z. N.: Centrality and Lethality of Protein Networks. Nature, vol. 411, 2001, pp. 41–42.


15. Jeong H., Tombor B., Albert R., Oltvai Z. N., Barabási A-L.: The Large-Scale Organization of Metabolic Networks. Nature, vol. 407, 2000, pp. 651–654.
16. Kleinberg J.: Authoritative Sources in a Hyperlinked Environment. Proceedings of the 9th ACM–SIAM Symposium on Discrete Algorithms, 1998.
17. Kleinfeld J.: Six Degrees of Separation: An Urban Myth. Psychology Today, 2002.
18. Kleinfeld J.: The Small World Problem. Society, vol. 39, 2002, pp. 61–66.
19. Koch R.: The 80/20 Principle — The Secret to Success by Achieving More with Less. Currency, New York, USA, 1998.
20. Mandelbaum R.: Only in America. New York Times Magazine, November 26, 2000.
21. Milgram S.: The Small World Problem. Psychology Today, vol. 2, 1967, pp. 60–67.
22. Montoya J. M., Solé R. V.: Small World Patterns in Food Webs. http://www.santafe.edu/sfi/publications/Abstracts/00-10-059abs.html.
23. Newman M. E. J.: The Structure of Scientific Collaboration Networks. Proceedings of the National Academy of Sciences of the United States of America, vol. 98, January 16, 2001, pp. 404–409.
24. Taylor C.: Behind the Hack Attack. Time Magazine, February 21, 2000.
25. Venter J. C. et al.: The Sequence of the Human Genome. Science, vol. 291, 2001, pp. 1304–1351.
26. Wagner A., Fell D.: The Small World Inside Large Metabolic Networks. Proceedings of the Royal Society of London, Series B—Biological Sciences, vol. 268, September 7, 2001, pp. 1803–1810.
27. Watts D. J., Strogatz S. H.: Collective Dynamics of 'Small-World' Networks. Nature, vol. 393, 1998, pp. 440–442.


28. Williams R. J., Martinez N. D., Berlow E. L., Dunne J. A., Barabási A-L.: Two Degrees of Separation in Complex Food Webs. http://www.santafe.edu/sfi/publications/Abstracts/01-07-036abs.html.

Session 7

Cellular Automata

Juan Li
Communications Laboratory, HUT

Beginning with an introduction to the history of cellular automata, this paper presents several important ideas in the field. Then a brief analysis of 1-D cellular automata is given. Finally, some more advanced applications of cellular automata are demonstrated.

7.1 Introduction

The increasing prominence of computers has led to a new way of looking at the world. This view sees nature as a form of computation. That is, we treat objects as simple computers, each obeying its own set of laws. A computer follows rules. At each moment, the rules determine exactly what the computer will do next. We say that a computer is an example of an automaton. Other, simpler examples of automata also exist.¹ These more abstract rule-following devices can be easier to study using computers, and they can be interesting to study in their own right. One type of automaton that has received a lot of attention is cellular automata. For one thing, they

¹ Automata is the plural of automaton. While the word "automaton" conjures up the image of a mechanical toy or a soulless organism, in computer science it has a very precise meaning. It refers to all machines whose output behaviour is not a direct consequence of the current input, but of some past history of its inputs. They are characterised as having an internal state which is a repository of this past experience. The inner state of an automaton is private to the automaton, and is not available to an external observer.


make pretty pictures. For another, they are related to exciting new ideas such as artificial life and the edge of chaos. For a fairly simple example see [1]. The “cellular automaton” provides a way of viewing whole populations of interacting “cells”, each of which is itself a computer (automaton). By building appropriate rules into a cellular automaton, we can simulate many kinds of complex behaviours, ranging from the motion of fluids governed by the Navier-Stokes equations to outbreaks of starfish on a coral reef.

7.2 History of Cellular Automata

7.2.1 Spectacular historical automata

The earliest automata were mechanical devices that seemed to demonstrate lifelike behaviour. They took advantage not only of gears, but also of gravity, hydraulics, pulleys and sunlight — the effect could be dazzling, as with the extraordinary clock of Berne: Created in 1530, this massive timepiece hourly disgorged a dazzling pageantry of automata figures, beginning with a crowing cock and followed by a procession in which the nodding head of a clock king allowed the passage of a parade of spear-wielding bear cubs and a ferocious lion! The most famous of early automata was the creation of Jacques de Vaucanson, who in 1738 dazzled Paris with “an artificial duck made of gilded copper who drinks, eats, quacks, splashes about the water, and digests his food like a living duck.” The complexity of this duck was enormous — there were over four hundred moving pieces in a single wing.

7.2.2

Early history of cellular automata

Mathematician Stanislaw M. Ulam liked to invent pattern games for the computer at Los Alamos. Given certain fixed rules, the computer would print out ever-changing patterns. Many patterns grew almost as if they were alive. A simple square would evolve into a delicate, coral-like growth. Two patterns would "fight" over territory, sometimes leading to mutual annihilation. He developed 3-D games too, constructing thickets of coloured cubes as prescribed by computer. He called the patterns "recursively defined geometric objects". Ulam's games were cellular games. Each pattern was composed of square (or triangular or hexagonal) cells. In effect, the games were played on limitless chessboards.


All growth and change of patterns took place in discrete jumps. From moment to moment, the fate of a given cell depended only on the states of its neighbouring cells. The advantage of the cellular structure is that it allows a much simpler physics. Without the cellular structure, there would be infinitely many possible connections between components. Ulam suggested that John von Neumann "construct" an abstract universe for his analysis of machine reproduction. It would be an imaginary world with self-consistent rules, as in Ulam's computer games. It would be a world complex enough to embrace all the essentials of machine operation, but otherwise as simple as possible. Von Neumann adopted an infinite chessboard as his universe. Each square cell could be in any of a number of states corresponding roughly to machine components. A "machine" was a pattern of such cells. The rules governing the world would be a simplified physics. A proof of machine reproduction would be easier to devise in such an imaginary world, as all the nonessential points of engineering would be stripped away [1]. Ironically, the name von Neumann is now strongly associated with the old-fashioned, single-CPU computer architecture. He was also the major pioneer in parallel computing via his research on arrays of computers, or cellular automata. In 1944, von Neumann was introduced to electronic computing via a description of the ENIAC by Goldstine. Shortly afterwards, he formed a group of scientists, headed by himself, to work on problems in computers, communications, control, time-series analysis, and the "communications and control aspects of the nervous system". The last topic was included due to his great interest in the work on neural networks of McCulloch and Pitts. In 1946, von Neumann proceeded to design the EDVAC (Electronic Discrete Variable Computer), which built on the ideas on automata developed by Post (1936) and Turing (1936). By this time he had commenced studies on the complexity required for a device or system to be self-reproductive. These studies also included work on the problem of organizing a system from basically unreliable parts (a field of study which we now know as "fault tolerant computing"). At first, von Neumann investigated a continuous model of a self-reproducing automaton based on a "system of non-linear partial differential equations, essentially of the diffusion type". He also pursued the idea of a kinematic automaton which could, using a description of itself, proceed to mechanically assemble a duplicate from available piece parts. When von Neumann found it difficult to provide the rigorous and explicit rules and instructions needed to realize such an automaton, and when it became evident that the value of such an automaton would be moot, he redirected


his efforts towards a model of self-reproduction using an array of computing elements. Both Burks and Goldstine confirm that the idea of such an array was suggested to von Neumann by Stanislaw Ulam. Von Neumann was also attracted to this idea of using parallelism because he saw that it would eventually lead to high computational speed. By 1952 he had put his ideas in writing and in 1953 described them more fully in his lectures at Princeton University. Unfortunately, his premature death in 1957 prevented him from completely achieving his goals. Thus, it can be said that, in the early 1950s, von Neumann conceived of the cellular automata [2].

7.2.3 Von Neumann's self-reproducing cellular automata

The central problem of self-reproduction is the problem of infinite regress. Living organisms are finite and reproduce in a finite time. Any machine, whether in the real world, von Neumann’s cellular space, or Life’s cellular space, is likewise finite. If self-reproduction involves infinities, then it is pointless to look for self- reproducing machines. The problem with machine reproduction is that the universal constructor is a mindless robot. It has to be told very explicitly what to do. If such a machine were given a description of an alleged self-reproducing machine, this constructor needs to understand that it is supposed to be reproducing the machine, including its description. Von Neumann’s solution was to append a “supervisory unit” to the machine to handle precisely such tasks. Infinities are avoided because the machine description does not try to encapsulate itself, but is interpreted in two ways. It is first interpreted literally, as a set of directions to be followed in order to make a certain type of machine. Once the self-reproduction has entered the second phase, the instructions are ignored, and the code is treated merely as data for the copying process. Von Neumann began with a horizonless grid, with each cell in an inactive state. An organism was then introduced, covering two hundred thousand cells. The details of this creature were represented by different states of individual cells — there were 29 possible states. The different combinations of these states governed the behaviour of the organism, and defined the organism itself. It was shaped like a box with a very long tail; the box, about eighty cells long by four hundred cells wide, contained suborganisms: • A factory, gathering “materials” from the environment and arranging


them according to instructions from another suborganism, • A duplicator, reading informational instructions and copying them, and • A computer, the control apparatus. These took up only a quarter of the creature’s total number of cells. The rest of the cells were in a single-file line of 150,000 cells, acting as a code for the instructions to construct the entire organism. Once this automaton was embedded in the grid, each cell, as an individual finite state machine, began to follow the rule that applied to it. The effect of these local behaviours caused a global behaviour to emerge: The selfreproducing structure interacted with neighbouring cells and changed some of their states. It transformed them into the materials — in terms of cell states — that made up the original organism. The tail of the cell contained instructions for the body of the creature. Eventually, by following rules of transition (drawn up by Von Neumann), the organism made a duplicate of its main body; information was passed through an “umbilical cord”, from parent to child. The last step in the process was the duplication of the tail, and the detachment of the “umbilical cord”. Two identical creatures, both capable of self-reproduction, were now on the grid.

7.2.4 Conway's Game of Life

Life was devised in 1970 by John Horton Conway, a young mathematician at Gonville and Caius College in Cambridge, and it was introduced to the world via two of Martin Gardner's columns in Scientific American (October 1970 and February 1971). Although it was true that von Neumann's automaton qualified as a universal computer (it could emulate any describable function of any other machine by the use of a set of logical rules), the organism itself was very complex, with its two hundred thousand cells in any of twenty-nine states. Conway suspected that a cellular automaton with universal computing capabilities might be simpler. The key to this simplicity would be the rules that dictated survival, birth and death. The game Life is a simple 2-D analogue of basic processes in living systems. The game consists in tracing changes through time in the patterns formed by sets of "living" cells arranged in a 2-dimensional grid. Any cell in the grid may be in either of two states: "alive" or "dead". The state of each cell changes from one generation to the next depending on the state of its immediate neighbours.


The rules governing these changes are designed to mimic population change. The behaviour in Life is typical of the way in which many cellular automata reproduce features of living systems. That is, regularities in the model tend to produce order: starting from an arbitrary initial configuration, order (i.e., patches of pattern) usually emerges fairly quickly. Ultimately, most configurations either disappear entirely or break up into isolated patterns that are either static or else cycle between several different forms with a fixed period. Conway tried many different numerical thresholds for birth and survival. He had three objectives:

• First, Conway wanted to make sure that no simple pattern would obviously grow without limit. It should not be easy to prove that any simple pattern grows forever.

• Second, he wanted to ensure, nonetheless, that some simple patterns do grow wildly. There should be patterns that look like they might grow forever.

• Third, there should be simple patterns that evolve for a long time before stabilising. A pattern stabilised by either vanishing completely or producing a constellation of stable objects.

If Life's rules said that any cell with a live neighbour qualifies for a birth and no cell ever dies, then any initial pattern would grow endlessly like a crystal. If, on the other hand, the rules were too anti-growth, then everything would die out; Conway contrived to balance the tendencies for growth and death. It turned out that these objectives were fulfilled when the following rules were applied:

STASIS If, for a given cell, the number of on neighbours is exactly two, the cell maintains its status into the next generation. If the cell is on, it stays on; if it is off, it stays off.

GROWTH If the number of on neighbours is exactly three, the cell will be on in the next generation. This is regardless of the cell's current state.

DEATH If the number of on neighbours is 0, 1, or 4–8, the cell will be off in the next generation.

In the real world, matter-energy cannot be created or destroyed. If a colony of bacteria grows to cover a Petri dish, it is only by consuming nutrient. The Petri dish as a whole weighs the same before and after.


However, no such restriction applies in Life: the amount of "matter" in the Life universe can fluctuate arbitrarily. Simple rules can have complex consequences. The simple set of rules in Life has such a wealth of implications that it is worth examining in detail. Life is forward-deterministic: a given pattern leads to one, and only one, sequel pattern. The Game of Life is not backward-deterministic; a pattern usually has many patterns that may have preceded it. In short, a configuration has only one future, but usually many possible pasts. This fact is responsible for one of the occasional frustrations of playing Life. Sometimes you will see something interesting happen, stop the program, and be unable to backtrack and repeat it. There is no simple way you can program a computer to go backward from a Life state; there are too many possibilities. Conway had tried to design Life's rules so that unlimited growth patterns were possible. It was not immediately obvious that he had succeeded, though. Conway conceived of two ways of demonstrating that unlimited growth is possible, if indeed it is. Both ways involved finding a pattern that does grow forever, yet in a disciplined, predictable manner. In 1970, in response to Conway's challenge in Scientific American (his conjecture that no finite initial configuration of Life could generate an infinite population), R. W. Gosper and his group used a DEC PDP-6 computer to run Life simulations far quicker than could be done by hand. Eventually, they found the "glider gun", which generated gliders continually, and a "puffer train" that moved on the grid, leaving behind a constant trail of live cells. They even found "puffer trains that emitted gliders that collided to make glider guns which then emitted gliders, but in a quadratically increasing number ...". From this breakthrough, Conway could prove that Life could indeed support universal computation. Using glider streams to represent bits, he was able to produce the equivalent of AND gates, OR gates and NOT gates, as well as an analogue of a computer's internal storage.
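The STASIS/GROWTH/DEATH rules described above translate directly into code. The sketch below is a minimal, set-based Python implementation of one Life generation; the starting pattern (a glider) is only an illustrative choice.

from collections import Counter

def life_step(live):
    """One generation of Life; 'live' is the set of (x, y) cells that are on."""
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1) for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # GROWTH: exactly three on neighbours; STASIS: two on neighbours and already on.
    # Every other cell is off in the next generation (DEATH).
    return {cell for cell, n in neighbour_counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A glider, stepped for a few generations.
pattern = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
for generation in range(4):
    print(generation, sorted(pattern))
    pattern = life_step(pattern)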

7.2.5 Stephen Wolfram and 1-dimensional cellular automata

Stephen Wolfram worked with a one-dimensional variant of von Neumann's cellular automata; this was fully horizontal and occurred on a single-file line. Each cell touched only two other cells, its two immediate neighbours on either side, and each succeeding generation was represented by the line underneath the preceding one.


A cell in generation 2 would determine its state by looking at the cell directly above it, i.e., in generation 1, and that cell's two neighbours. Thus, there are eight possible combinations of the states of those 3 cells, ranging from "000" (all off) to "111" (all on). Since there are 8 possible states for the ancestors of any given cell, and each of these may result in one of two states (1 or 0), there are 256 possible rulesets for this type of cellular automaton. Wolfram explored them all. Some of those rules quickly resolved themselves into boring configurations, for example all dead cells or all live cells. Wolfram called these Class 1 cellular automata. Another variation was some other frozen configuration where initial activity ceased and stable structures reigned; Wolfram designated these as Class 2. Other configurations, however, broke up into relatively disordered patterns, resembling video noise, but sometimes scattered with inverted triangles. This was Class 3. There was a final class, Class 4, of cellular automata that displayed behaviour that was not disordered, but complex, and sometimes long-lived. These were capable of propagating information, and included all cellular automata that supported universal computers (see Fig. 7.1).

Figure 7.1: Wolfram’s four classes of cellular automata
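Each of the 256 rulesets can be labelled by the 8-bit number formed from its outputs, which is Wolfram's rule-numbering convention. The sketch below evolves an arbitrary rule from a single seed cell so that the four classes can be eyeballed; the rule number, lattice width and number of generations are free choices.

def evolve(rule_number, width=63, steps=30):
    """Print the evolution of an elementary (1-D, 2-state, 3-cell) cellular automaton."""
    rule = [(rule_number >> i) & 1 for i in range(8)]   # output for neighbourhoods 000..111
    cells = [0] * width
    cells[width // 2] = 1                               # a single seed cell
    for _ in range(steps):
        print("".join("#" if c else "." for c in cells))
        cells = [rule[(cells[(i - 1) % width] << 2) |   # left neighbour
                      (cells[i] << 1) |                 # the cell itself
                      cells[(i + 1) % width]]           # right neighbour
                 for i in range(width)]

evolve(90)   # e.g. rule 90 gives the nested-triangle, Class 3 pattern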

7.2.6 Norman Packard's Snowflakes

Norman Packard was a physicist working with Stephen Wolfram at the Institute for Advanced Study, and he chose to apply information theory to cellular automata in order to emulate snowflakes. He reasoned that, firstly, it was not the case that every snowflake was identical; however, if every snowflake had a random structure, the information from each snowflake would be meaningless. Therefore, there is a syntax within snowflakes: several main types of structures, which are capable of containing individual variations. Packard discovered that different weather conditions result in snowflakes taking on different general aspects. One set of conditions yields configurations that look like plates, another determines snowflakes shaped like collections of rods, and another yields dendritic stars. He wrote a cellular automaton simulation in which the "off" cells, those with a value of "0", represented water vapour, and the "on" cells, those assigned a value of "1", represented ice and appeared on the screen in colour. The snowflake would grow outwards from its boundary. A typical set of rules initiated a scan of a cell's neighbourhood, summed the values of the surrounding cells, and filled the new cells with either ice or water vapour, depending on whether the total was odd or even. The resulting artificial snowflakes lacked the complexity of real snowflakes, particularly those with structures based on patterns of needle-like shapes, but they did have plates and dendrites growing from the corners of the plates, from which more dendrites grew; they were easily identifiable as snowflakes.

7.3 Cellular automata in technical terms

A cellular automaton is an array of identically programmed automata, or “cells”, which interact with one another. The arrays usually form either a 1-dimensional string of cells, a 2-D grid, or a 3-D solid. Most often the cells are arranged as a simple rectangular grid, but other arrangements, such as a honeycomb, are sometimes used. The essential features of a cellular automaton (“CA” for short) are: • Its STATE is a variable that takes a different value for each cell. The state can be either a number or a property. For instance if each cell represents part of a landscape, then the state might represent (say) the number of animals at each location or the type of forest cover growing there.

• Its NEIGHBOURHOOD is the set of cells that it interacts with. In a grid these are normally the cells physically closest to the cell in question. Below, some simple neighbourhoods (cells marked n) of a cell (C) in a 2-D grid are shown:

  n
n C n
  n

n n n
n C n
n n n

    n
  n n n
n n C n n
  n n n
    n

• Its PROGRAM is the set of rules that defines how its state changes in response to its current state, and that of its neighbours.

Example — Fabric patterns

Imagine a CA consisting of a line of cells as follows:

• States: 0 or 1

• Neighbourhood: the two adjacent cells (or "n C n")

• Rules: the rule table gives the new state of the cell for each possible local "configuration", i.e. arrangement of states for the cell and its two neighbours. Because there are two possible states (0 or 1) for each of 3 cells, there are 2^3 = 8 rules required, one for each configuration from 0 0 0 to 1 1 1.

Suppose that we start with just a single cell in state 1. Fig. 7.2 shows how the array will then change with time (here "." denotes 0 for clarity). When the successive states are plotted in this way, some properties emerge. Self-organization: a pattern emerges. Even if the line of cells starts with a random arrangement of states, the rules force patterns to emerge; the pattern depends on the set of rules (for examples, see Fig. 7.3).


Figure 7.2: Evolution in a one-dimensional CA

Figure 7.3: Fabric patterns

Life-like behaviour: Empirical studies by Wolfram and others show that even the simple linear automata behave in ways reminiscent of complex biological systems. For example, the fate of any initial configuration of a cellular automaton is either

1. to die out;
2. to become stable or cyclic with a fixed period;
3. to grow indefinitely at a fixed speed; or
4. to grow and contract irregularly.

"Thermal behavior": In general, models that force a change of state in only a few configurations tend to "freeze" into fixed patterns, whereas models that change the cell's state in most configurations tend to behave in a more active "gaseous" way. That is, fixed patterns do not emerge.

In other words, a cellular automaton is a discrete dynamical system. Space, time, and the states of the system are discrete. Each point in a regular spatial lattice, called a cell, can have any one of a finite number of states. The states of the cells in the lattice are updated according to a local rule. That is, the state of a cell at a given time depends only on its own state one time step previously, and the states of its nearby neighbors at the previous time step. All cells on the lattice are updated synchronously. Thus the state of the entire lattice advances in discrete time steps [4].

7.4 A mathematical analysis of a simple cellular automaton

As an example of a cellular automaton, consider a line of sites, each with value 0 or 1 (Fig. 7.4). Take the value of the site at position i on time step t to be a_i^{(t)}. One very simple rule for the time evolution of these site values is

a_i^{(t+1)} = \left( a_{i-1}^{(t)} + a_{i+1}^{(t)} \right) \bmod 2,    (7.1)

where mod 2 indicates that the 0 or 1 remainder after division by 2 is taken. According to this rule, the value of a particular site is given by the sum modulo 2 (or, equivalently, the Boolean algebra “exclusive or”) of the values of its left and right hand nearest neighbor sites on the previous time step. The rule is implemented simultaneously at each site. Even with this very simple rule quite complicated behavior is nevertheless found.

Figure 7.4: A typical configuration in the simple cellular automaton described by (7.1)

The cellular automaton rule (7.1) is particularly simple and allows a rather complete mathematical analysis. The analysis proceeds by writing for each configuration a characteristic polynomial

A(x) = \sum_{i=0}^{N-1} a_i x^i,    (7.2)


where x is a dummy variable, and the coefficient of x^i is the value of the site at position i. In terms of characteristic polynomials, the cellular automaton rule (7.1) takes on the particularly simple form

A^{(t+1)}(x) = T(x)\, A^{(t)}(x) \bmod (x^N - 1),    (7.3)

where

T(x) = x + x^{-1},    (7.4)

and all arithmetic on the polynomial coefficients is performed modulo 2. The reduction modulo x^N - 1 implements periodic boundary conditions. The structure of the state transition diagram may then be deduced from algebraic properties of the polynomial T(x). Since a finite cellular automaton evolves deterministically with a finite total number of possible states, it must ultimately enter a cycle in which it visits a sequence of states repeatedly. Such cycles are manifest as closed loops in the state transition graph. The algebraic analysis of Martin et al. shows that for the cellular automaton of (7.1) the maximal cycle length (of which all other cycle lengths are divisors) is given for odd N by

2^{\mathrm{sord}_N(2)} - 1.    (7.5)

Here \mathrm{sord}_N(2) is a number-theoretical function defined to be the minimum positive integer j for which 2^j = \pm 1 modulo N. The maximum value of \mathrm{sord}_N(2), typically achieved when N is prime, is (N-1)/2. The maximal cycle length is thus of order 2^{N/2}, approximately the square root of the total number of possible states 2^N. An unusual feature of this analysis is the appearance of number-theoretical concepts. Number theory is inundated with complex results based on very simple premises. It may be part of the mathematical mechanism by which natural systems of simple construction yield complex behavior.
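The cycle structure predicted by (7.5) can be checked by brute force for small lattices. The sketch below iterates rule (7.1) on a ring of N sites, measures the cycle length reached from a random initial state, computes sord_N(2), and verifies that the cycle length divides 2^{sord_N(2)} - 1. The chosen values of N and the random states are arbitrary.

import random

def step(state):
    """One update of rule (7.1): each site becomes the XOR of its two neighbours."""
    N = len(state)
    return tuple((state[(i - 1) % N] + state[(i + 1) % N]) % 2 for i in range(N))

def cycle_length(state):
    """Iterate until a previously seen state recurs; return the length of that cycle."""
    seen, t = {}, 0
    while state not in seen:
        seen[state] = t
        state = step(state)
        t += 1
    return t - seen[state]

def sord(N):
    """Smallest j > 0 with 2^j = +1 or -1 (mod N)."""
    j, power = 1, 2 % N
    while power not in (1, N - 1):
        power = (2 * power) % N
        j += 1
    return j

random.seed(0)
for N in (5, 7, 9, 11, 15):
    state = tuple(random.randint(0, 1) for _ in range(N))
    L = cycle_length(state)
    bound = 2 ** sord(N) - 1
    print(f"N = {N:2d}: cycle length {L:4d}, 2^sord_N(2) - 1 = {bound:4d}, divides: {bound % L == 0}")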

7.5 Applications

Cellular automata applications are diverse and numerous. The laws in our Universe are only partly known and appear to be highly complex, whereas in a cellular automaton laws are simple and completely known. One can then test and analyse the global behaviour of a simplified universe, for example:

• Simulation of gas behaviour. A gas is composed of a set of molecules whose behaviour depends on that of the neighbouring molecules.

• Study of ferromagnetism according to the Ising model. This model (1925) represents the material as a network in which each node is in a given magnetic state. This state — in this case one of the two orientations of the spins of certain electrons — depends on the state of the neighbouring nodes.

• Simulation of percolation processes.

• Simulation of forest fire propagation.

• Conception of massively parallel computers.

• Simulation and study of urban development.

• Simulation of crystallisation processes.

In different fields, cellular automata can be used as an alternative to differential equations. Cellular automata can also be used as graphic generators.² The images in Fig. 7.5 show some graphic effects.

Figure 7.5: Some pictures generated by cellular automata

² According to Rucker and Walker (op. cit.), "In five years you won't be able to watch television for an hour without seeing some kind of cellular automata".


Bibliography

1. http://math.hws.edu/xJava/CA/
2. Kendall Preston, Jr. and Michael J. B. Duff: Modern Cellular Automata: Theory and Applications. Plenum Press, New York, 1984.
3. http://life.csu.edu.au/complex/tutorials/tutorial1.html
4. http://shakti.trincoll.edu/ pkenned3/alife.html
5. http://math.hws.edu/xJava/CA/CA.html
6. http://www.tu-bs.de/institute/WiR/weimar/ZAscriptnew/intro.html
7. http://www.stephenwolfram.com/publications/articles/ca/85-two/2/text.html
8. http://www.brunel.ac.uk/depts/AI/alife/al-ca.htm
9. http://www.rennard.org/alife/english/acintrogb01.html
10. http://www.ce.unipr.it/pardis/CNN/cnn.html
11. http://cafaq.com/soft/index.shtml
12. N. Howard, R. Taylor, and N. Allinson: The design and implementation of a massively-parallel fuzzy architecture. Proc. IEEE, pages 545–552, March 1992.


Session 8

From Chaos ...

Complexity theory cannot be understood without taking a look at chaos theory. Chaos theory flourished in the 1980s; it had been observed that simple rules can result in very complicated structures. Only after that, in complexity theory, was the emphasis turned in the opposite direction, searching for the underlying rules beneath the observed complexity. Thus, in complexity research there are many underlying assumptions, etc., that are directly inherited from chaos research. Unfortunately, this chapter was never delivered in the correct format to be included in this report.


Session 9

. . . Towards New Order

Lasse Eriksson
Helsinki University of Technology
Control Engineering Laboratory
[email protected]

In this chapter we move from chaos towards new order, i.e. complexity. Complexity lies at the edge of chaos (EOC), which is the border area between static and chaotic behavior. The other subjects in this chapter, such as self-organized criticality (SOC), highly optimized tolerance (HOT) and measures of complexity, are closely related to EOC.

9.1 Introduction to complexity

It is obvious that chaos theory existed long before anyone spoke about complexity. But there was a field called "catastrophe theory" even before chaos. Catastrophe theory studies and classifies phenomena characterized by sudden shifts in behavior arising from small changes in circumstances. Originated by the French mathematician René Thom in the 1960s, catastrophe theory is a special branch of dynamical systems theory [1]. Anyway, as James P. Sethna from Cornell University, Ithaca, says: ". . . the big fashion was topological defects. Everybody was . . . finding exotic systems to write papers about. It was, in the end, a reasonable thing to do. The next fashion, catastrophe theory, never became important for anything" [2].


Chaos theory, the successor of catastrophe theory, was described in the previous chapter. In this chapter the emphasis is on the latest research area of this field, i.e. complexity and complex systems. There has been a lot of discussion about whether complexity is any different from chaos. Is there really something new, or are we talking about the same subject under a new name? The famous egg diagram of Langton (the "father" of EOC) in Fig. 9.1 gives some insight into this matter. In the egg, the behavior of systems has been divided into four different classes: "Fixed", "Periodic", "Complex" and "Chaotic". It is justified to say that complex behavior is different from chaotic behavior. Thus complexity is not chaos, and there is something new in this field. Yet the things studied in the field of complexity are somehow familiar from the past.

Figure 9.1: Langton's famous egg diagram [3]

The advances in the scientific study of chaos have been important motivators and roots of the modern study of complex systems. Chaos deals with deterministic systems whose trajectories diverge exponentially over time, and the models of chaos generally describe the dynamics of one or a few real-valued variables. Using these models, some characteristic behaviors of their dynamics can be found. Complex systems do not necessarily have these behaviors. Complex systems have many degrees of freedom: many elements that are partially but not completely independent. Complex behavior can be seen as "high-dimensional chaos". Chaos is concerned with a few parameters and the dynamics of their values, while the study of complex systems is concerned with both the structure and the dynamics of systems and their interaction with their environment [4]. Complexity can be defined in many ways, and there are different definitions depending on the field of science. For example, in the field of information technology complexity is defined to be either the minimal length of a description of a system (in information units) or the minimal time it takes to create the system [4]. On the other hand, complexity necessarily depends on the language that is used to model the system.

9.2. Self-organized criticality

93

on the language that is used to model the system. The original Latin word complexus signifies “entwined” or “twisted together”. This may be interpreted in the following way: in order to have a complex you need two or more components, which are joined in such a way that it is difficult to separate them [5]. Another important term in the field of complexity is emergence, which is [4] 1. What parts of a system do together that they would not do by themselves (collective behavior). How behavior at a larger scale of the system arises from the detailed structure, behavior and relationships on a finer scale. 2. What a system does by virtue of its relationship to its environment that it would not do by itself. 3. Act or process of becoming an emergent system. Emergence refers to all the properties that we assign to a system that are really properties of the relationship between a system and its environment.

9.2 Self-organized criticality

In this section, self-organized criticality (SOC) is described. SOC has its roots in fractals. So far we have discussed fractals only as a geometrical phenomenon, but in the SOC case the dynamics is important. SOC is often demonstrated using the sand pile model, which is also discussed here.

9.2.1 Dynamical origin of fractals

Many objects in nature are best described geometrically as fractals with self-similar features on all length scales. In nature, there are for example mountain landscapes that have peaks of all sizes, from kilometers down to millimeters, river networks that have streams of all sizes, and earthquakes, which occur on structures of faults ranging from thousands of kilometers to centimeters. Fractals are scale-free, so you cannot determine the size of a picture of a part of a fractal without a yardstick. The interesting question, then, is how nature produces fractals.

The origin of fractals is a dynamical, not a geometrical problem. The geometrical characterization of fractals has been widely examined, but it would be more interesting to gain understanding of their dynamical origin. Consider for example earthquakes. They last for a few seconds, but the fault formations in the crust of the earth are built up over millions of years. The crust seems static if the observation period is only a human lifetime. The laws of physics are local, but fractals are organized over large distances. Large equilibrium systems operating near their ground state tend to be only locally correlated. Only at a critical point, where a continuous phase transition takes place, are those systems fractal [6].

9.2.2 SOC

Per Bak, Chao Tang, and Kurt Wiesenfeld introduced the concept of self-organized criticality in 1987. SOC refers to the tendency of large dissipative systems to drive themselves to a critical state with a wide range of length and time scales. As an example, consider a damped spring as in Fig. 9.2. A mass m is attached to the spring, which is fastened to the wall on the left. The spring constant is k and the damping coefficient is B. The distance from the equilibrium point is x(t).

Figure 9.2: A damped spring.

From first principles it is easy to model the spring system traditionally:

m \ddot{x}(t) + B \dot{x}(t) + k x(t) = 0  \Rightarrow  \ddot{x}(t) = -\frac{1}{m} [ B \dot{x}(t) + k x(t) ]        (9.1)

It is also possible to simulate the system. To do that, the following initial conditions and constants have been defined.


k = 0.1 N/m,   B = 0.08 kg/s,   m = 1 kg,   \dot{x}(0) = 1 m/s        (9.2)

The Simulink model of the spring is presented in Fig. 9.3. (The model consists of two integrators in series with feedback gains B, k and −1/m; the output x is written to the workspace.)

Figure 9.3: Simulink model of the spring.

After the simulation (200 seconds) the response x(t) is plotted in Fig. 9.4. Figure 9.5 represents the last 40 seconds of the simulation (on a finer scale: amplitudes of the order of 10^-3 m).

Figure 9.4: The response x(t). (Amplitude (m) versus Time (s).)

Figure 9.5: A closer look at the damping. (Amplitude (m) versus Time (s), t = 165...200 s.)
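The same simulation can be reproduced without Simulink. The sketch below is a minimal Python/SciPy version of the model (9.1) with the constants of (9.2); it is not the original Simulink diagram, and it assumes x(0) = 0, since only the initial velocity is given in (9.2).

    import numpy as np
    from scipy.integrate import solve_ivp

    # Damped spring m*x'' + B*x' + k*x = 0, written as a first-order system.
    k, B, m = 0.1, 0.08, 1.0                 # N/m, kg/s, kg, as in (9.2)

    def spring(t, state):
        x, v = state
        return [v, -(B * v + k * x) / m]     # (9.1)

    # x(0) is not given in (9.2); zero is assumed here. x'(0) = 1 m/s as in (9.2).
    sol = solve_ivp(spring, (0.0, 200.0), [0.0, 1.0], max_step=0.1)
    print("amplitude at t = 200 s:", abs(sol.y[0][-1]))   # small, but never exactly zero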

The scale is very different in the two plots. As we can see from the simulation results, the oscillatory behavior of the spring with a decreasing amplitude theoretically continues forever. In the real world, the motion would stop because of imperfections such as dust. Once the amplitude gets small enough, the motion suddenly stops. This generally occurs when the velocity is smallest, i.e. at the "top" or at the "bottom" of an oscillation. This is not the state of smallest energy! In a sense, the system is most likely to settle near a "minimally stable" state. In that state the system still has some potential energy: if the spring stops at the top of the oscillation, some energy remains in the spring. The system then becomes very sensitive to small perturbations, because a little "push" can free the energy and the spring may start oscillating again.

This kind of behavior can be detected as well when analyzing pendulums. Consider a system with multiple coupled pendulums. If the system is in a steady state with all the pendulums in a minimally stable state, a small perturbation can avalanche through the whole system. Small disturbances can grow and propagate through the system with little resistance despite the damping and other impediments. Since energy is dissipated in the process, the energy must be replenished for avalanches to continue. When self-organized criticality is considered, the interest is in systems where energy is constantly supplied and eventually dissipated in the form of avalanches [6].

The dynamics in the SOC state is intermittent, with periods of inactivity separated by well-defined bursts of activity or avalanches. The critical state is an attractor for the dynamics. The SOC idea provides a unifying concept for large-scale behavior in systems with many degrees of freedom. SOC complements the concept of chaos, wherein simple systems with a small number of degrees of freedom can display quite complex behavior. Large avalanches occur rather often, there is no exponential decay of avalanche sizes (which would result in a characteristic avalanche size), and there is a variety of power laws without cutoffs in various properties of the system [7]. The paradigm model for this type of behavior is the celebrated sand pile cellular automaton, also known as the Bak-Tang-Wiesenfeld (BTW) model.

9.2.3 Sand pile model

Adding sand slowly to a flat sand pile will result only in some local rearrangement of particles. The individual grains, or degrees of freedom, do not interact over large distances. Continuing the process will result in the slope increasing to a critical value where an additional grain of sand gives rise to avalanches of any size, from a single grain falling up to the full size of the sand pile. The pile can no longer be described in terms of local degrees of freedom; only a holistic description in terms of one sand pile will do. The distribution of avalanches follows a power law.

If the slope of the pile were too steep, one would obtain a large avalanche and a collapse to a flatter and more stable configuration. If the slope were too shallow, the new sand would just accumulate to make the pile steeper. If the process is modified, for instance by using wet sand instead of dry sand, the pile will modify its slope during a transient period and return to a new critical state. Consider for example snow screens: if they were built to prevent avalanches, the snow pile would again respond by locally building up to steeper states, and large avalanches would resume.

The sand pile model can be simulated, and there are some Java applets available on the Internet (e.g. [8]). Typically, the simulation model is such that there is a 2-D lattice (cellular automaton) with N sites. Integer values z_i represent the local sand pile height at each site i. If the height exceeds some critical height z_cr (say, three), then one grain is transferred from the unstable site to each of the four neighboring sites (this is called a toppling). A toppling may initiate a chain reaction, where the total number of topplings is a measure of the size of an avalanche.

To explore the SOC of the sand pile model, one can randomly add sand onto the pile and let the system relax. The result is unpredictable, and one can only simulate the resulting avalanche to see the outcome. After adding a large amount of sand, the configuration seems random, but some subtle correlations exist (e.g. never do two black cells lie adjacent to each other, nor does any site have four black neighbors). An avalanche is triggered if a small amount of sand is added to a site near the center. The size of the avalanche is calculated and the simulation is repeated. After the simulations, it is possible to analyze the distribution of the avalanche sizes. The distribution of avalanche sizes follows a power law

P(s) = s^{1-\tau},   \tau \approx 2.1        (9.3)

where s is the size of an avalanche and P(s) is the probability of having an avalanche of size s. Because of the power law, the initial state was actually remarkably correlated, although it did not appear to be. For a random distribution of the z's (pile heights), one would expect an avalanche to be either subcritical (a small avalanche) or supercritical (an exploding avalanche with a collapse of the entire system). The power law indicates that the reaction is precisely critical, i.e. the probability that the activity at some site branches into more than one active site is balanced by the probability that the activity dies [6].
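A minimal BTW-style simulation can be written directly from these rules. The sketch below (plain Python with NumPy) drops grains at random sites, topples unstable sites, and records avalanche sizes; the lattice size, the number of grains and the open-boundary rule (grains falling off the edge are lost) are illustrative choices, not values from the text.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 50                                   # lattice is N x N
    z = np.zeros((N, N), dtype=int)
    z_cr = 3                                 # a site topples when its height exceeds 3

    def relax(z):
        """Topple unstable sites until the pile is stable; return the avalanche size."""
        topplings = 0
        while True:
            unstable = np.argwhere(z > z_cr)
            if len(unstable) == 0:
                return topplings
            for i, j in unstable:
                z[i, j] -= 4                 # one grain to each of the four neighbours;
                topplings += 1               # grains dropped over the edge are lost
                if i > 0:
                    z[i - 1, j] += 1
                if i < N - 1:
                    z[i + 1, j] += 1
                if j > 0:
                    z[i, j - 1] += 1
                if j < N - 1:
                    z[i, j + 1] += 1

    sizes = []
    for _ in range(20000):                   # drive slowly, one grain at a time
        a, b = rng.integers(0, N, size=2)
        z[a, b] += 1
        s = relax(z)
        if s:
            sizes.append(s)

    # The avalanche-size histogram should approach the power law (9.3).
    print(np.histogram(sizes, bins=[1, 2, 4, 8, 16, 32, 64, 128, 256])[0])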

9.3 Complex behavior and measures

In this section two approaches to the edge of chaos phenomenon are discussed as well as the measures of complexity. Phase transitions and their relation to Wolfram’s four classes (discussed in the earlier chapters) are also presented.

9.3.1 Edge of chaos — Langton’s approach

The edge of chaos is the interesting area where complex rather than chaotic or static behavior arises. Backed by Kauffman's work on co-evolution, Wolfram's cellular automata studies, and Bak's investigations of self-organized criticality, Langton proposed (in 1990) the general thesis that complex systems emerge and maintain themselves at the edge of chaos, the narrow domain between frozen constancy and chaotic turbulence. The edge of chaos idea is another step towards an elusive general definition of complexity [9].

In general, some cellular automata are boring since all cells either die after a few generations or quickly settle into simple repeating patterns. You could call these "highly ordered" cellular automata because their behavior is predictable. Other cellular automata are boring because their behavior seems to be totally random. You could call these "chaotic" cellular automata because their behavior is totally unpredictable. On the other hand, some cellular automata show interesting (complex, lifelike) behavior, which arises near the border between chaos and order. If these cellular automata were more ordered, they would be predictable, and if they were less ordered, they would be chaotic. This boundary is called the edge of chaos.

Christopher Langton introduced a 1-D cellular automaton whose cells have two states: the cells are either "alive" or "dead". If a cell and its neighbors are dead, they will remain dead in the next generation. Langton defined a simple number that can be used to help predict whether a given cellular automaton will fall in the ordered realm, in the chaotic realm, or near the boundary. The number lambda (0 < λ < 1) can be computed from the rule space of the cellular automaton: the value is simply the fraction of rules in which the new state of the cell is living. The number of rules R of a cellular automaton is determined by R = K^N, where K is the number of possible states and N is the number of neighbors. If λ = 0, the cells will die immediately, and if λ = 1, any cell with a living neighbor will live forever. Values of λ close to zero give cellular automata in the ordered realm, and values near one give cellular automata in the chaotic realm. The edge of chaos is somewhere in between.


The value of λ does not by itself pinpoint the edge of chaos; it is more complicated than that. You could start with λ = 0 (death) and randomly add rules that lead to life instead of death (⇒ λ > 0). You would get a sequence of cellular automata with values of λ increasing from zero to one. In the beginning the cellular automata would be highly ordered, and in the end they would be chaotic. Somewhere in between, at some critical value of λ, there would be a transition from order to chaos. It is near this transition that the most interesting cellular automata tend to be found, the ones that have the most complex behavior. The critical value of λ is not a universal constant [10].
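To make the definition of λ concrete, the toy sketch below builds a random rule table for a 1-D, K-state, N-neighbour cellular automaton and computes λ as the fraction of the R = K^N rule entries that map to a living state. The values of K, N and the rule probability are arbitrary illustrations, not Langton's original setup.

    import itertools
    import random

    K = 2     # number of cell states (0 = dead, 1 = alive)
    N = 3     # neighbourhood size, so the rule table has R = K**N entries

    def random_rule_table(p_alive):
        """Map every neighbourhood configuration to a new state ('alive' with prob. p_alive)."""
        table = {}
        for config in itertools.product(range(K), repeat=N):
            if all(s == 0 for s in config):
                table[config] = 0            # an all-dead neighbourhood stays dead
            else:
                table[config] = 1 if random.random() < p_alive else 0
        return table

    def langton_lambda(table):
        """Fraction of rule entries whose outcome is a living state."""
        return sum(1 for new in table.values() if new != 0) / len(table)

    random.seed(1)
    print(langton_lambda(random_rule_table(p_alive=0.3)))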

9.3.2 Edge of chaos — another approach

We can approach the edge of chaos from another point of view, based on measuring so-called perturbation strength. Let us consider some examples. The first one is domino blocks in a row. The blocks are first in a stable state, standing still. The initial state is in a way minimally stable, because a small perturbation can avalanche through the whole system. Once the first block is nudged, an avalanche is started, and the system will become stable again once all blocks are lying down. The nudge is called a perturbation and the duration of the avalanche is called a transient. The strength of the perturbation can be measured in terms of the effect it had, i.e. the length of time the disturbance lasted (the transient length) plus the permanent change that resulted (none in the domino case).

Other examples of perturbation strength are buildings in earthquakes, and air molecules. For buildings, we require a short transient length and a return to the initial state (buildings are almost static). Air molecules, on the other hand, collide with each other continually, never settling down and never returning to exactly the same state (molecules are chaotic). For air molecules the transient length is infinite, whereas for our best buildings it would be zero. How about in the middle? Consider yet another example, a room full of people. A sentence spoken may be ignored (zero transient), may start a chain of responses which die out and are forgotten by everyone (a short transient), or may be so interesting that the participants will repeat it later to friends who will pass it on to other people until it changes the world completely (an almost infinite transient, e.g. old religions).

Systems with zero transient length are static and systems with infinite transient length are chaotic. The instability with order described in the examples is called the edge of chaos, a system midway between the stable and chaotic domains. EOC is characterized by the potential to develop structure over many different scales and is an often-found feature in those complex systems whose parts have some freedom to behave independently. The three responses in the room example could occur simultaneously, by affecting various group members differently. The idea of transients is not restricted in any way, and it applies to different types of systems: social, inorganic, political, psychological, and so on. Hence we have the possibility to measure totally different types of systems with the same measure. It seems that we have a quantifiable concept that can apply to any kind of system. This is the essence of the complex systems approach: ideas that are universally applicable [11].

9.3.3 Complexity measures

In the previous section transient length was presented, and it was concluded to be a universal measure for complex systems. Another measure is correlation distance. Correlation, in general, is a measure of how closely a certain state matches a neighboring state; it can vary from 1 (identical) to -1 (opposite). For a solid we expect a high correlation between adjacent areas, and the correlation is also constant with distance. For gases the correlation should be zero, since there is no order within the gas: each molecule behaves independently. Again the distance is not significant; zero should be found at all scales. Each patch of gas or solid is statistically the same as the next. For this reason an alternative definition of transient length is often used for chaotic situations, i.e. the number of cycles before statistical convergence has returned. When we can no longer tell that anything unusual has happened, the system has returned to the steady state or equilibrium. Instant chaos would then be said to have a transient length of zero, the same as a static state, since no change is ever detectable.

For complex systems one should expect to find neither maximum correlation (nothing is happening) nor zero (too much happening), but correlations that vary with time and average around midway. One would also expect to find strong short-range correlations (local order) and weak long-range ones. Thus we have two measures of complexity: correlations varying with distance and long non-statistical transients [11].
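One way to operationalize the correlation-distance measure is sketched below: for a one-dimensional snapshot of states, estimate the correlation between sites separated by a distance d. The ordinary sample correlation coefficient is only one of several possible choices, and the two example signals are invented stand-ins for "solid-like" and "gas-like" behavior.

    import numpy as np

    def correlation_at_distance(states, d):
        """Sample correlation between the values at sites i and i + d."""
        return np.corrcoef(states[:-d], states[d:])[0, 1]

    rng = np.random.default_rng(1)
    disordered = rng.random(10_000)                       # "gas": independent sites
    ordered = np.sin(np.linspace(0, 4 * np.pi, 10_000))   # slowly varying, "solid-like"

    for d in (1, 10, 100):
        print(d,
              round(correlation_at_distance(ordered, d), 3),      # stays close to 1
              round(correlation_at_distance(disordered, d), 3))   # stays close to 0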

9.3.4 Phase transitions

Phase transition studies came about from the work begun by John von Neumann and carried on by Stephen Wolfram in their research on cellular automata. Consider what happens if we heat and cool systems: at high temperatures systems are in a gaseous state (chaotic) and at low temperatures systems are in a solid state (static). At some point between high and low temperatures the system changes its state between the two, i.e. it makes a phase transition [11].

There are two kinds of phase transitions: first order and second order. First-order transitions are familiar from ice melting to water: molecules are forced by a rise in temperature to choose between order and chaos right at 0 °C. This is a deterministic choice. Second-order phase transitions combine chaos and order; there is a balance of ordered structures that fill up the phase space. The liquid state is where complex behavior can arise [12].

Consider once again the "egg diagram" in Fig. 9.1. It shows a schematic drawing of the cellular automaton rule space, indicating the relative location of the periodic, chaotic, and complex transition regimes. Crossing over the lines (in the egg) produces a discrete jump between behaviors (first-order phase transitions). It is also possible that the transition regime acts as a smooth transition between periodic and chaotic activity (like the EOC experiments with λ). This smooth change in dynamical behavior is primarily second-order, also called a critical transition [3].

In his research, Wolfram has divided cellular automata into four classes based on their behavior. Different cellular automata seem to settle down to these classes, which are: constant field (Class I), isolated periodic structures (Class II), uniformly chaotic fields (Class III), and isolated structures showing complicated internal behavior (Class IV). There is a relationship between Wolfram's four classes and the underlying phase transition structure in the cellular automata's rule space [13]. Figure 9.6 represents a schematic drawing of the relationship.

The phase transition feature allows us to control complexity by external forces. Heating or perturbing the system drives it towards chaotic behavior, and cooling or isolating the system drives it towards static behavior. This is seen clearly in relation to brain temperature: low temperature means static behavior (hypothermia), medium temperature normal, organized behavior, and high temperature chaotic behavior (fever).


Figure 9.6: A schematic drawing of the cellular automata’s rule space showing the relationship between Wolfram’s classes and the underlying phase transition structure. [3]

9.4 Highly optimized tolerance

John Doyle’s contribution to the complex systems field is “highly optimized tolerance”. It is a mechanism that relates evolving structure to power laws in interconnected systems. HOT systems arise, e.g. in biology and engineering where design and evolution create complex systems sharing common features, such as high efficiency, performance, robustness to designed-for uncertainties, hypersensitivity to design flaws and unanticipated perturbations, non-generic, specialized, structured configurations, and power laws. Through design and evolution, HOT systems achieve rare structured states that are robust to perturbations they were designed to handle, but yet fragile to unexpected perturbations and design flaws. As an example of this, consider communication and transportation systems. These systems are regularly modified to maintain high density, reliable throughput for increasing levels of user demand. As the sophistication of the systems is increased, engineers encounter a series of tradeoffs between greater productivity and the possibility of catastrophic failure. Such robustness tradeoffs are central properties of the complex systems that arise in biology and engineering [14]. Robustness tradeoffs also distinguish HOT states from the generic ensembles typically studied in statistical physics under the scenarios of the edge of chaos and self-organized criticality. Complex systems are driven by design or evolution to high-performance states that are also tolerant to uncertainty in the environment and components. This leads to specialized, modular,

9.4. Highly optimized tolerance

103

hierarchical structures, often with enormous “hidden” complexity with new sensitivities to unknown or neglected perturbations and design flaws. An example of HOT system design is given here to enlighten the idea. Often the HOT examples deal with the forest fire model, thus it is described here as well. Consider a square lattice where forest is planted. The forest density on the lattice can be something between zero and one. Zero means no forest at all and one means that the lattice is full of trees. Assume that a “spark” hits the lattice at a single site. A spark that hits an empty site does nothing, but a spark that hits a cluster of trees burns the whole cluster (see Fig. 9.7) [14], [15]. The yield of the forest after the hit of the spark is determined to be the forest density minus the loss (burned forest). In ideal case (no sparks) the yield of the forest grows linearly as a function of density. But if there are sparks, the linearity holds only for values lower than approximately 0.55. If the density of the forest exceeds a critical point (ρ = 0.5927), the yield decreases rapidly if forest density is yet increased. This is, of course, because the forest becomes too dense and the tree clusters are almost fully connected (the sizes of tree clusters increase). Figure 9.8 represents the yield as a function of forest density on an infinite lattice (lattice size N → ∞) [15].

Figure 9.7: A square lattice and a spark hitting a cluster of trees. [15]

Figure 9.8: Forest yield as a function of density. [15]
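The yield curve of Fig. 9.8 can be approximated numerically: plant each site with probability ρ, drop one random spark, burn the connected cluster it hits, and average the remaining density. The sketch below uses SciPy's connected-component labelling; the lattice size and number of trials are arbitrary choices for illustration.

    import numpy as np
    from scipy.ndimage import label

    rng = np.random.default_rng(0)
    N = 64                                        # lattice is N x N

    def expected_yield(density, trials=200):
        """Average density left after one random spark hits the lattice."""
        total = 0.0
        for _ in range(trials):
            forest = rng.random((N, N)) < density
            clusters, _ = label(forest)           # 4-connected clusters of trees
            i, j = rng.integers(0, N, size=2)     # a single random spark
            loss = np.sum(clusters == clusters[i, j]) if forest[i, j] else 0
            total += (forest.sum() - loss) / N**2
        return total / trials

    for rho in (0.3, 0.5, 0.55, 0.6, 0.7):
        print(rho, round(expected_yield(rho), 3))

On a finite lattice the drop in yield above the critical density ρ ≈ 0.59 is already clearly visible.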

According to Fig. 9.8 the sparks do not matter if the density of the forest is less than the critical density (0.5927). If the forest density is higher, the whole forest might get burned because of one spark hitting anywhere in the forest.

If HOT is compared with SOC or EOC, we clearly see the difference. SOC and EOC assume that the interesting phenomena are at criticality. HOT-state systems actually work above the critical point, where their performance is optimized. In the forest model example this means that the forest can be planted denser than the critical density. In that way the yield of the forest can also be increased, and the system is said to run in a HOT state. The idea is to optimize the yield based on knowledge of the distribution of the sparks. By HOT mechanism optimization, almost any distribution of sparks gives a power-law distribution of events; there exist both numerical and analytical results for that.

One HOT mechanism is based on increasing the design degrees of freedom (DDOF). The goal is to optimize the yield, or in other words, push it towards the upper bound. The upper bound is the case where there would be no sparks and the yield would be the same as the density of the forest. The DDOFs are in this case the densities of different areas on the lattice: the lattice can be divided into smaller lattices, and each small lattice represents a design degree of freedom. The HOT states specifically optimize yield in the presence of a constraint. A HOT state corresponds to a forest which is densely planted to maximize the timber yield, with firebreaks arranged to minimize the spread of damage. The firebreaks could be forest planted at the critical density. Depending on the distribution of the sparks and the optimization method, it might be possible to plant the forest with density one everywhere else but in the small firebreaks with a critical density. In practice this means that we could have a lattice with total density 0.85 and yield 0.77, for example.

There are different HOT mechanisms that can be used to optimize the yield. Figures 9.10, 9.11 and 9.12 represent three different forest densities that result if three different methods are used in yield optimization. The assumed distribution of sparks is presented in Fig. 9.9. In the upper left corner of Fig. 9.9 the probability of sparks is almost 0.2, but in the other corners of the lattice it is lower than 10^-10; in the middle, the probability decreases from the upper left corner towards the lower right corner. The three design mechanisms in Figs. 9.10 - 9.12 are called "grid", "evolved" and "DDOF". In grid design, the firebreaks are straight lines and the density of the forest between the firebreaks is constant (in the small quadrangles). The evolved mechanism produces a forest density that is one almost everywhere; only in the upper left corner of the lattice are lower densities used. The DDOF mechanism produces a very dense forest as well: even the firebreaks are planted at the critical density, and everywhere else the density is one. The design problem in DDOF is to optimize the areas and alignment of the firebreaks.


Figure 9.9: An example distribution of sparks. [15]

Figure 9.10: Grid. [15]

Figure 9.11: Evolved. [15]

Figure 9.12: DDOF. [15]

Increasing the design degrees of freedom increases densities and yield, because the forest can be planted densely and the firebreaks effectively stop possible fires; hence losses are decreased as well. However, the sensitivity to design flaws increases significantly. That is why HOT systems are said to be robust yet fragile. They are robust in the sense that they effectively handle the disturbance situations they were designed for, but on the other hand the systems become very sensitive to the disturbances they were not designed to handle. In the forest case, the system is very fragile if a rare event happens, e.g. a spark hits the lower right corner of the lattice.

HOT may be a unifying perspective for many systems. HOT states are both robust and fragile: they are ultimately sensitive to design flaws. Complex systems in engineering and biology are dominated by robustness tradeoffs, which result in both high performance and new sensitivities to perturbations the system was not designed to handle. Current work with HOT includes new Internet protocol design (optimizing the throughput of a network by operating in a HOT state), forest fire suppression, ecosystem management, analysis of biological regulatory networks, and convergent networking protocols [15].

9.5 Conclusions

In this chapter the emphasis has been on complexity. It has been approached from different directions, and some essential features and terminology have been discussed. Catastrophe, chaos and complexity theory all have common features but different approaches. Complexity adds dimensions to chaos. All these branches have a common problem, the lack of useful applications; with complexity there is still hope...

Self-organized criticality refers to the tendency of large dissipative systems to drive themselves into a critical state. Coupled systems may collapse because of an avalanche that results from a perturbation to the system. As an example of SOC, a sand pile model (although not a very realistic one) was discussed.

The edge of chaos is the border area where the interesting phenomena (i.e. complexity) arise. The edge lies between periodic and chaotic behavior. Where cellular automata are concerned, the fraction of rules that lead to the "alive" state determines the so-called λ value. With some values of λ static behavior is detected, and with some other values chaotic behavior is seen. Somewhere in between, at some critical value of λ, complexity arises; that value of λ represents the edge of chaos.

Highly optimized tolerance (HOT) was discussed in the last section. The idea is to optimize the profit, yield or throughput of a complex system. By design we can reduce the risk of catastrophes, but the resulting systems, operating in HOT states with considerable performance, are still fragile.


Bibliography

1. http://www.exploratorium.edu/complexity/CompLexicon/catastrophe.html
2. http://www.lassp.cornell.edu/sethna/OrderParameters/TopologicalDefects.html
3. http://www.theory.org/complexity/cdpt/html/node5.html
4. http://necsi.org/guide/concepts/
5. http://pespmc1.vub.ac.be/COMPLEXI.html
6. Bunde, A.; Havlin, S. (Eds.): Fractals in Science. Springer Verlag, 1994.
7. http://cmth.phy.bnl.gov/~maslov/soc.htm
8. http://cmth.phy.bnl.gov/~maslov/Sandpile.htm
9. http://pespmc1.vub.ac.be/CAS.html
10. http://math.hws.edu/xJava/CA/EdgeOfChaos.html
11. http://www.calresco.org/perturb.htm
12. http://www.wfu.edu/~petrejh4/PhaseTransition.htm
13. http://delta.cs.cinvestav.mx/~mcintosh/newweb/what/node8.html
14. Carlson, J.M.; Doyle, J.: Highly Optimized Tolerance: Robustness and Design in Complex Systems. 1999.
15. http://www.cds.caltech.edu/~doyle/CmplxNets/HOT_intro.ppt


Session 10
Self-Similarity and Power Laws

Tiina Komulainen
Helsinki University of Technology
Laboratory of Process Control and Automation
[email protected]

The aim of this chapter is to relate the common features of complex systems, self-similarity and self-organization, to power (scaling) laws and to fractal dimension, and to show how all of these are intertwined. First, the basic concepts of self-similarity, self-organization, power laws and fractals are presented. Then, power laws and fractals are discussed in more detail. Finally, examples of the applications of power laws and fractal dimension are demonstrated.

10.1 Introduction

Power laws and fractal dimensions are just two sides of a coin, and there is a tight relationship joining them together, as shown in Fig. 10.1. The relationship can be clarified with a mathematical discussion. The general equation for a power law is shown in (10.1). It is a mathematical pattern in which the frequency of an occurrence of a given size is inversely proportional to some power n of its size:

y(x) = x^{-n}.        (10.1)


Figure 10.1: Relationships between self-similarity, self-organization, power laws and fractal dimension

Note that

y(\lambda x) = (\lambda x)^{-n} = \lambda^{-n} x^{-n} = \lambda^{-n} y(x).        (10.2)

It turns out that the power law can be expressed in "linear form" using logarithms:

\log(y(x)) = -n \log(x),        (10.3)

where the coefficient n represents the fractal dimension [2]. The mathematical relationship connecting self-similarity to power laws and to fractal dimension is the scaling equation. For a self-similar observable A(x), which is a function of a variable x, a scaling relationship holds:

A(\lambda x) = \lambda^s A(x),        (10.4)

where λ is a constant factor and s is the scaling exponent, which is independent of x. Looking at (10.2), it is clear that the power law obeys this scaling relationship.

The data emerging from the combination of self-similarity and self-organization cannot be described by either the normal or the exponential distribution. The reason is that the emergence of order in complex systems is fundamentally based on correlations between different levels of scale. The organization of phenomena that belong at each level in the hierarchy rules out a preferred scale or dimension. The relationships in this type of system are best described by power laws and fractal dimension [3].
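Because (10.3) is linear in log-log coordinates, the exponent n can be estimated from data by a straight-line fit of log y against log x. The sketch below is just that naive least-squares fit on synthetic data; for serious work on heavy-tailed data a maximum-likelihood estimator would usually be preferred.

    import numpy as np

    def fit_power_law(x, y):
        """Estimate n and a in y = a * x**(-n) from a straight-line fit in log-log coordinates."""
        slope, intercept = np.polyfit(np.log10(x), np.log10(y), 1)
        return -slope, 10**intercept

    rng = np.random.default_rng(0)
    x = np.logspace(0, 3, 50)
    y = 2.0 * x**-1.5 * rng.lognormal(0.0, 0.05, size=x.size)   # synthetic power-law data

    n, a = fit_power_law(x, y)
    print(round(n, 2), round(a, 2))    # close to 1.5 and 2.0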

10.2 Self-Similarity

Self-similarity means that a structure, or a process, and a part of it appear to be the same when compared. A self-similar structure is infinite and it is not differentiable at any point.

Approximate self-similarity means that the object does not display perfect self-similarity. For example, a coastline is a self-similar object, a natural fractal, but it does not have perfect self-similarity. A map of a coastline consists of bays and headlands, but when magnified, the coastline is not identical; rather, statistically the average proportions of bays and headlands remain the same no matter the scale [4]. It is not only natural fractals that display approximate self-similarity; the Mandelbrot set is another example. Identical pictures do not appear straight away, but when magnified, smaller examples will appear at all levels of magnification [4].

Statistical self-similarity means that the degree of complexity repeats at different scales instead of geometric patterns. Many natural objects are statistically self-similar, whereas artificial fractals are geometrically self-similar [5]. Geometrical similarity is a property of the space-time metric, whereas physical similarity is a property of the matter fields. The classical shapes of geometry do not have this property: a circle, on a large enough scale, will look like a straight line. This is why people believed that the world was a flat pancake; the earth just looks that way to humans [4,6].

Examples of self-similar systems

One well-known example of self-similarity and scale invariance is fractals, patterns that are formed of smaller objects that look the same when magnified. Many natural forms, such as coastlines, fault and joint systems, folds, layering, topographic features, turbulent water flows, drainage patterns, clouds, trees, leaves, bacteria cultures [4], blood vessels, broccoli, roots, lungs and even the universe, look alike on many scales [6]. It appears as if the underlying forces that produce the network of rivers, creeks, streams and rivulets are the same at all scales, which results in the smaller parts and the larger parts looking alike, and these looking like the whole [3,4].

"Human-made" self-similar systems include for example music [3], the behavior of Ethernet traffic, programming languages [7], and architecture (of Asian temples, etc.). The process of human cognition facilitates scales and similarity: the human mind groups similar objects of the same size into a single level of scale. This process has been compared with digital image compression, because it reduces the amount of presented information about a complex structure [3]. Self-similarity in music comes from the coherent nature of the sounds; the coherencies agree with each other in every scale and dimension in which they are perceived [5]. The expansion of the Universe from the big bang and the collapse of a star to a singularity might both tend to self-similarity in some circumstances [6].

A self-similar program is a program that can mutate itself into a new, more complex program that is also self-similar. For example, a self-similar language can be extended with a new language feature, which will be used to parse or compile the rest of the program. Many languages can be made self-similar. When a language is simple but powerful (Lisp, Smalltalk), self-similarity is easier to achieve than when it is complex but powerful (Java, C++) [7].

10.2.1 Self-organization

Two types of stable systems can be found in the physical universe: the death state of perfect equilibrium and the infinitely fertile condition of self-organized non-equilibrium. Self-organization provides useful models for many complex features of the natural world, which are characterized by fractal geometries, cooperative behavior, self-similarity of structures, and power law distributions. Openness to the environment and coherent behavior are necessary conditions for self-organization and the growth of complexity [8].

Because of a common conceptual framework or microphysics, self-organizing systems are characterized by self-similarity and fractal geometries, in which similar patterns are repeated with different sizes or time scales without changing their essential meaning. Similar geometric patterns are repeated at different sizes and are expressive of a fundamental unity of the system, such as braiding patterns ranging from streambeds to root systems and the human lung [8].

Systems as diverse as metabolic networks or the World Wide Web are best described as networks with complex topology. A common property of many large networks is that the vertex connectivities follow a scale-free power-law distribution. This feature is a consequence of two generic mechanisms shared by many networks: networks expand continuously by the addition of new vertices, and new vertices attach preferentially to already well-connected sites. A model based on these two ingredients reproduces the observed stationary scale-free distributions, indicating that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems [9].
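The two ingredients mentioned above, growth and preferential attachment, are easy to put into code. The sketch below is a bare-bones toy in the spirit of the Barabási-Albert model (parameters and implementation details are arbitrary, not taken from [9]); its degree distribution develops a heavy, power-law-like tail.

    import random
    from collections import Counter

    def grow_network(n_nodes, m=2, seed=0):
        """Growth plus preferential attachment: every new vertex links to m existing
        vertices, chosen with probability proportional to their current degree."""
        random.seed(seed)
        edges = [(i, j) for i in range(m + 1) for j in range(i)]   # small connected core
        pool = [v for e in edges for v in e]   # each vertex appears once per unit of degree
        for new in range(m + 1, n_nodes):
            targets = set()
            while len(targets) < m:
                targets.add(random.choice(pool))                   # degree-biased sampling
            for t in targets:
                edges.append((new, t))
                pool.extend((new, t))
        return edges

    degrees = Counter(v for e in grow_network(5000) for v in e)
    histogram = Counter(degrees.values())
    print(sorted(histogram.items())[:10])   # many small-degree vertices, a few large hubs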

10.2.2 Power laws

A power law is one of the common signatures of a nonlinear dynamical process, i.e. a chaotic process which is at some point self-organized. With power laws it is possible to express the self-similarity of the large and small, i.e. to unite different sizes and lengths. In fractals, for example, there are many more small structures than large ones. Their respective numbers are represented by a power law distribution. A common power law for all sizes demonstrates the internal self-consistency of the fractal and its unity across all boundaries. The power law distributions result from a commonality of laws and processes at all scales [2].

The scaling relationship of power laws applies widely and brings into focus one important feature of the systems considered. In general it is not evident that a major perturbation will have a large effect and a minor one only a small effect. The knock-on effect of any perturbation of a system can vary from zero to infinite: there is an inherent fractal unpredictability [10].

When using power laws, one must notice that statistical data for a phenomenon that obeys one of the power laws (exponential growth) is biased towards the lower part of the range, whereas that for a phenomenon with saturation (logarithmic growth) tends to be biased towards the upper part of the range [11].

Applications of power laws

The natural world is full of power law distributions between the large and small: earthquakes, words of the English language, interplanetary debris, and coastlines of continents. For example, power laws define the distribution of catastrophic events in self-organized critical systems. If a SOC system shows a power law distribution, it could be a sign that the system is at the edge of chaos, i.e. going from a stable state to a chaotic state. Power laws can thus be useful for predicting the phase of this type of system. A power law distribution is also a litmus test for self-organization, self-similarity and fractal geometries [2,12].


The Gutenberg-Richter law, used in seismography, is an empirical observation that earthquakes occur with a power law distribution. The crust of the Earth, buckling under the stress of plates sliding around, produces earthquakes. Small earthquakes, too weak to detect without instruments, occur somewhere on Earth every day. Somewhat bigger earthquakes that rattle dishes are less common, and the big earthquakes that cause mass destruction happen only once in a while [13].

Power laws are also applied to monitor the acoustic emissions in materials which are used for bridges etc. Internal defects in materials make popping sounds, acoustic emissions, under stress. Engineering materials contain tiny cracks that grow under stress, until they grow large enough to cause the material to fail. This can cause the failure of buildings, bridges and other societal structures. One way to determine the condition of structures is to monitor the acoustic emissions. This monitoring method is a simple and inexpensive form of non-destructive testing. With the help of the method it is also possible to develop new techniques to make better materials and design structures less vulnerable to cracking [13].

The Matthew effect in scientific communities is the discovery that the relationship between the number of citations received by the members of a scientific community and their publishing size follows a power law with exponent 1.27±0.0. The exponent is shown to be constant with time and relatively independent of the nationality and size of a science system. Also the publishing size and the size rank of the publishing community in a science system have a power-law relationship; Katz has determined the exponent to be -0.44±0.01. The exponent should be relatively independent of the nationality and size of the science system although, according to Katz, the rank of a specific community in a science system can be quite unpredictable [14].

Some examples of power laws in biology and sociology are Kleiber's law, which relates metabolic rate to body mass in animals; Taylor's power law of population fluctuations; and Yoda's thinning law, which relates density of stems to plant size [15].

The following subchapters concentrate on Zipf's and Benford's laws. Typical for these phenomena is that large events are fairly rare, whereas smaller ones are much more frequent, and in between are cascades of different sizes and frequencies. With Zipf's law the structure of the phenomena can be explained, and it is possible to make some sort of prediction for future earthquakes.

10.2.3 Zipf's law

Zipf’s law is named after the Harvard professor George Kingsley Zipf (1902– 1950). The law is one of the scaling laws and it defines that the frequency of occurrence of some event (P ), as a function of the rank (i) is a power-law function [16,17]. In Zipf’s law the quantity under study is inversely proportional to the rank. The rank is determined by the frequency of occurrence and the exponent a is close to unity [17]. Zipf’s law describes both common and rare events. If an event is number 1 because it is most popular, Zipf’s plot describes the common events (e.g., the use of English words). On the other hand, if an event is number 1 because it is unusual (biggest, highest, largest, ...), then it describes the rare events (e.g., city population) [17]. Benoit Mandelbrot has shown that a more general power law is obtained by adding constant terms to the denominator and power. In general, denominator is the rank plus first constant c and power is one plus a second constant d. Zipf’s law is then the special case in which the two constants, c and d are zero [16]: Pi ∼ 1/(i + c)(1+d) .

(10.5)

For example, the law for squares can be expressed with d = 1, so the power will be a = 2 [16]:

P_i ∼ 1/(i + c)^2.        (10.6)

Mathematics gives a meaning to intermediate values of the power as well, such as 3/4 or 1.0237 [16]. Mandelbrot's generalization of Zipf's law is still very simple: the additional complexity lies only in the introduction of two new adjustable constants, a number added to the rank and a number added to the power 1, so the modified power law has two additional parameters. There is widespread evidence that population and other socio-economic activities at different scales are distributed according to Zipf's law, and that such scaling distributions are associated with systems that have matured or grown to a steady state where their growth rates do not depend upon scale [18].


Applications of Zipf’s law The constants of the generalized power law or modified Zipf’s law should be adjusted to gain the optimal fit of the data. The example of the cities in the USA will clarify this. In the Table 10.1 cities are listed by their size (rank). The population of the cities year 1990 are compared to the predictions of unmodified and modified Zipf’s law. The unmodified Zipf’s law uses 10 milloin people as the coefficient for the equation 1. Pi = 10, 000, 000 · 1/i1 .

(10.7)

The modified Zipf’s law consists of population coefficient 5 million and constants c = −2/5 and d = −1/4. Pi = 5, 000, 000 · 1/(i − 2/5)(3/4) .

(10.8)

The sizes of the populations given by the unmodified Zipf's law differ from the real sizes by approximately 30 %, and the sizes given by the modified Zipf's law only by about 10 %. The obvious conclusion is that the modified version of Zipf's law is more accurate than the original one, although the predictions of the original are quite good too [16].
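The two predictions compared in Table 10.1 below can be reproduced directly from (10.7) and (10.8). The sketch uses a handful of the ranks and 1990 populations from the table; apart from rounding it recovers the last two columns.

    rows = [                       # (rank, 1990 population), a few rows of Table 10.1
        (1, 7_322_564), (7, 1_027_974), (13, 736_014), (25, 496_938), (97, 174_820),
    ]

    for i, population in rows:
        unmodified = 10_000_000 / i**1                    # (10.7)
        modified = 5_000_000 / (i - 2 / 5) ** (3 / 4)     # (10.8)
        print(i, population, round(unmodified), round(modified))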

Table 10.1: Zipf's law applied to the population of cities in the USA. The unmodified Zipf's law uses a population size of 10 million and constants 0. The modified Zipf's law uses a population size of 5 million and constants -2/5 and -1/4 [16].

Rank (n)  City                  Population (1990)  Unmodified Zipf's law  Modified Zipf's law
1         New York              7.322.564          10.000.000             7.334.265
7         Detroit               1.027.974          1.428.571              1.214.261
13        Baltimore             736.014            769.231                747.639
19        Washington, D.C.      606.900            526.316                558.258
25        New Orleans           496.938            400.000                452.656
31        Kansas City, Mo.      434.829            322.581                384.308
37        Virginia Beach, Va.   393.089            270.270                336.015
49        Toledo                332.943            204.082                271.639
61        Arlington, Texas      261.721            163.934                230.205
73        Baton Rouge, La.      219.531            136.986                201.033
85        Hialeah, Fla.         188.008            117.647                179.243
97        Bakersfield, Calif.   174.820            103.093                162.270

Benford’s Law

Benford’s law (known also as the first digit law, first digit phenomenon or leading digit phenomenon) states that the distribution of the first digit is not uniform. This law applies, if the numbers under investigation are not entirely random, but somehow socially or naturally related. Generally the law applies to data that is not dimensionless, meaning that the probability distribution of the data is invariant under a change of scale, and data, that is selected out of a variety of different sources. Benford’s law does not apply to uniform distributions, like lottery numbers [11,20,21]. Benford’s law results that the probability of a number to be the first digit in tables of statistics , listings, etc., is biggest for one and smallest for nine [11,20]. Many physical laws and human made systems require cutoffs to number series, for example the street addresses begin from 1 and end up to some cutoff value. The cutoffs impose these systems to obey Benford’s law, which can be presented for a digit D (1, 2,. . . ,9) by the logarithmic distribution: PD = log10 (1 + 1/D)

(10.9)


The base of the logarithm can also be other than 10 [21]. According to this equation the probabilities for the digits 1, ..., 9 lie between 0.301 and 0.045, as shown in Fig. 10.2. The probability for 1 is over 6 times greater than for 9 [20].

Figure 10.2: Distribution of Benford’s law for digits 1, . . . , 9 [20]
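Equation (10.9) and the curve of Fig. 10.2 are easy to verify, and a quick empirical check can be run on any data set that spans several orders of magnitude; the sketch below uses the powers of 2 as a stand-in example (this choice of data is mine, not the chapter's).

    import math
    from collections import Counter

    # Theoretical probabilities from (10.9): 0.301 for digit 1 down to about 0.046 for digit 9.
    theory = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
    print({d: round(p, 3) for d, p in theory.items()})

    # Empirical first-digit frequencies of the first 1000 powers of 2.
    counts = Counter(int(str(2**n)[0]) for n in range(1, 1001))
    print({d: counts[d] / 1000 for d in range(1, 10)})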

The examples of Benford’s law are numerous, including addresses, the area and basin of rivers, population, constants, pressure, molecular and atomic weight, the half-lives of radioactive atoms, cost data, powers and square root of whole numbers, death rate, budget data of corporations, income tax and even computer bugs [11,20,21]. Benford’s law is a powerful and relatively simple tool for pointing suspicion at frauds, embezzlers, tax evaders and sloppy accountants. The income tax agencies of several nations and several states, including California, are using detection software based on Benford’s law, as are a score of large companies and accounting businesses [21]. The social and natural impact affects all the listed examples. Budget data is affected by the corporation size, particular industry a company belongs to, the quality of the management, the state of the market, etc. The size of a river basin depends on a depth and breadth of the river. Most of the dependencies can be approximated by simple formulas: Linear, power or exponential, oscillating, leading to saturation.

10.2.5 Fractal dimension

Fractals are characterised by three concepts: self-similarity, the response of measure to scale, and the recursive subdivision of space. Fractal dimension can be measured by many different types of methods. Common to all these methods is that they rely heavily on the power law when plotted on a logarithmic scale, which is the property relating fractal dimension to power laws [22,23]. One definition of the fractal dimension D is the following equation:

D = \log_{10} N / \log_{10}(1/R),        (10.10)

where N is the number of segments created when dividing an object, and R is the length of each of the segments. This equation relates to power laws as follows:

\log(N) = D · \log(1/R) = \log(R^{-D}),        (10.11)

so that

N = R^{-D}.        (10.12)

It is simple to obtain a formula for the dimension of any given object. The procedure is just to determine how many parts (N) it gets divided up into when we reduce its linear size, or scale it down (1/R). By applying the equation to a line, a square and a cube, we get the following results. For a line divided into 4 parts, N is 4 and R is 1/4, so the dimension is D = log(4)/log(4) = 1. For a square divided into four parts, N is 4, R is 1/2, and the dimension is D = log(4)/log(2) = 2 · log(2)/log(2) = 2. And for a cube divided into 8 parts, N is 8, R is 1/2 and the dimension is D = log(8)/log(2) = 3 · log(2)/log(2) = 3.

The following series of pictures (Fig. 10.3) represents the iteration of the Koch curve. By applying equation (10.11) to the Koch curve, as in Table 10.2, it is evident that the dimension is not an integer, but instead lies between 1 and 2. The dimension is always 1.26185, regardless of the iteration level. Hence,

D = \log N / \log(1/R) = 1.26185,        (10.13)

which can also be written as

N = (1/R)^{1.26185}.        (10.14)

Figure 10.3: The Koch curve

Table 10.2: Statistics of the Koch curve

Iteration number            1        2        3        4        5
Number of segments, N       4        16       64       256      1024
Segment length, R           1/3      1/9      1/27     1/81     1/243
Total length, N · R         1.33333  1.77777  2.37037  3.16049  4.21399
log N                       0.60206  1.20412  1.80618  2.40824  3.01030
log(1/R)                    0.47712  0.95424  1.43136  1.90848  2.38561
Dimension log N / log(1/R)  1.26185  1.26185  1.26185  1.26185  1.26185

The formulas above indicate that N and R are related through a power law. In general, a power law is a nonlinear relationship which can be written in the form N = a(1/R)^D, where D is normally a non-integer constant and a is a numerical constant; in the case of the Koch curve, D ≈ 1.26 (cf. (10.14)).

Another way of defining the fractal dimension is box counting. In box counting the fractal is put on a grid made of identical squares with side length h. Then the number of non-empty squares, k, is counted. The magnification of the method equals 1/h, and the fractal dimension is defined by the equation [4]

D = \log_{10}(k) / \log_{10}(1/h).        (10.15)

Also Hausdorff’s and Kolmogorov’s methods can be used to approximate the fractal dimension. These methods are more accurate, but also harder to use. They are described in [18,24].

10.3 References

1. Focardi, S.M.: Fat tails, scaling and stable laws: A critical look at modeling extremal events in economic and financial phenomena. http://www.theintertekgroup.com/scaling.pdf, [18.3.2003].
2. Malville, J. McKim: Power Law Distributions. http://www.colorado.edu/Conferences/pilgrimage/papers/Malville-Pap/kimpap1.html, [2.3.2003].
3. Salingaros, N.A. and West, B.J.: A universal rule for the distribution of sizes. http://www.math.utsa.edu/sphere/salingar/Universal.html, [15.3.2003].
4. Judd, C.: Fractals – Self-Similarity. http://www.bath.ac.uk/~ma0cmj/FractalContents.html, [16.3.2003].
5. Yadegari, S.: Self-similarity. http://www-crca.ucsd.edu/~syadegar/MasterThesis/node25.html, [16.3.2003].
6. Carr, B.J. and Coley, A.A.: Self-similarity in general relativity. http://users.math.uni-potsdam.de/~oeitner/QUELLEN/ZUMCHAOS/selfsim1.htm, [16.3.2003].
7. Lebedev, A.: What is Self-Similar. http://www.self-similar.com/what_is_self_similar.html, [16.3.2003].
8. Malville, J. McKim: Power Law Distributions. http://www.colorado.edu/Conferences/pilgrimage/papers/Malville-Pap/kimpap1.html, [2.3.2003].
9. Barabasi, L.: Emergence of Scaling in Complex Networks. http://discuss.santafe.edu/dynamics/stories/storyReader$10, [18.3.2003].
10. Lucas, C.: Perturbation and Transients — The Edge of Chaos. http://www.calresco.org/perturb.htm, [17.3.2003].
11. Bogomolny, A.: Benford's Law and Zipf's Law. http://www.cut-the-knot.org/do_you_know/zipfLaw.shtml, [15.3.2003].
12. Goldstein, J.: Edgeware — Glossary. http://www.plexusinstitute.com/edgeware/archive/think/main_gloss.html, [15.3.2003].


13. Hoyle, P.: Physics Research: Crumpling Paper. http://www.honeylocust.com/crumpling/other.html, [17.3.2003].
14. Katz, S.J.: The Self-Similar Science System. http://www.sussex.ac.uk/Users/sylvank/pubs/SSSS.pdf, [17.3.2003].
15. Brown, J.H.: Research plan on BioComplexity. http://biology.unm.edu/jhbrown/Research/BioComplexity/BioComplexityProposal.pdf, [18.3.2003].
16. Bogomolny, A.: Benford's Law and Zipf's Law. http://www.cut-the-knot.com/do_you_know/zipfLaw.shtml, [13.3.2003].
17. Li, W.: Zipf's law. http://linkage.rockefeller.edu/wli/zipf/, [13.3.2003].
18. Shiode, N. and Batty, M.: Power Law Distributions in Real and Virtual Worlds. http://www.isoc.org/inet2000/cdproceedings/2a/2a_2.htm, [17.3.2003].
19. Adamic, L.A.: Zipf, Power-Laws and Pareto – a ranking tutorial. http://ginger.hpl.hp.com/shl/papers/ranking/ranking.html, [24.2.2003].
20. Wolfram Research: Benford's Law. http://mathworld.wolfram.com/BenfordsLaw.html, [15.3.2003].
21. Browne, M.W.: Following Benford's Law, or Looking Out for No. 1. http://courses.nus.edu.sg/course/mathelmr/080498sci-benford.htm, [15.3.2003].
22. Green, E.: Measuring Fractal Dimension. http://www.cs.wisc.edu/~ergreen/honors_thesis/dimension.html, [16.3.2003].
23. Earth Monitoring Station: Measuring fractal dimension. http://ems.gphys.unc.edu/nonlinear/fractals/dimension.html, [16.3.2003].
24. Charles Stuart University: The Hausdorff Dimension. http://life.csu.edu.au/fractop/doc/notes/chapt2a2.html, [17.3.2003].

Session 11 Turing’s Lure, Gödel’s Curse Abouda Abdulla Helsinki University of Technology Communication Engineering Laboratory [email protected]

Linear systems are well understood and completely predictable, whatever the initial state and no matter what is the system dimension. But if the system is nonlinear then different situation will be faced, and the theory of linear systems no longer is valid. It has been shown that the resulting behaviors from simple rules might emerge to form very complicated structure [6][10]. For example, the emerging behaviors from non-linear continuous time system or the emerging behaviors from hybrid systems or even from cellular automata can be theoretically intractrable. The complexity here means that these systems will forever defy analysis attempts [10]. In this article we will be elaborating on these issues by considering cellular automata as an example.

11.1

Computability theory

Computability theory dates back to 1930’s when mathematicians began to think about what it means to be able to compute a function. That was before the advent of digital computers. One of those mathematicians is Alan Turing (1912–1954) who is the inventor of Turing machines [1]. The conclusion that 123

124

Session 11. Turing’s Lure, Gödel’s Curse

Alan Turing came out with was: A function is computable if it can be computed by a Turing machine.

11.2

Gödelian undecidability

“All frame works that are powerful enough are either incomplete or inconsistent”. This is what is known as Gödelian undecidability principle. This principle has two interpretations, in mathematics and in physics. In mathematics it says that there does not exist any reasonable finite formal system from which all mathematical truth is derivable. And in physics it says that with respect to predication, there are some systems that cannot be completely predicted — like computation in Turing machine or predictbility of Life pattern in the game of Life. In this article we consider the physical interpretation [2]. It has been shown that any formalism that is powerful enough suffers from this Gödelian undecidability problem. A prototype of such a powerful system is Turing machine.

11.3

Turing Machine

A Turing machine is a very simple machine that has all the power that any digital computer has. A Turing machine is a particularly simple kind of computer, one whose operations are limited to reading and writing symbols on a tape, or moving along the tape to the left or right. This tape is divided into squares, any square of which may contain a symbol from a finite alphabet, with the restriction that there can be only finitely many non-blank squares on the tape. At any time, the Turing machine has a read/write head positioned at some square on the tape. Figure 11.1 shows how the tape looks like with a series of A’s and B’s written on the tape and with read/write head located on the rightmost of these symbols. ..

A

B

A

A

A

B

B

A

.. .

Figure 11.1: A tape with series of A’s and B’s

11.3. Turing Machine

125

Furthermore, at any time, the Turing machine is in any one of a finite number of internal states. The Turing machine is specified by a set of instructions of the following form: (current state, current symbol, new state, new symbol, left/right) The instruction means that if the Turing machine is now in current state, and the symbol under the read/write head is current symbol, then the Turing machine changes its internal state to a new state, replaces the symbol on the tape at its current position by a new symbol, and moves the read/write head one square in the given direction left or right. It may happen that the Turing machine is in a situation for which it has no instruction and that makes the Turing machine to halt. Defining instructions of Turing machine can be though of as programming the Turing machine. There are several conventions commonly used. The convention that numbers are represented in unary notation is adopted here. It means that the nonnegative integer number n is represented by using of successive 1’s. Furthermore, if a function f (n1 ,n2 ,...,nk ) to be computed, we assume that initially the tape consists of n1 ,n2 ,...,nk , with each sequence of 1’s separated from the previous one by a single blank, and with tape head initially located at the right most bit of the first input argument, and the state of the Turing machine in some initial specified value. We say that the Turing machine has computed m = f (n1 ,n2 ,...,nk ) if, when the machine halts, the tape consists of the final result and the read/write head at the right most bit of the result.

Example 1: Multiplication using Turing machine Suppose we want to create a Turing machine to compute the function m = multiply(n1 ,n2 )=n1 ×n2 . If n1 = 3 and n2 = 4, then the tape in the beginning of computation will look like Fig. 11.2. ..

1

1

1

1

1

1

1

.. .

Figure 11.2: The tape in the beginning of computation Here the position of read/write head is marked with dark cell. After the computation is finished the Turing machine should halt with its tape looking as Fig. 11.3, giving the final result (m = 3 × 4 = 12).

126

Session 11. Turing’s Lure, Gödel’s Curse ..

1

1

1

1

1

1

1

1

1

1

1

1

..

Figure 11.3: The tape at the end of computation Due to the simplicity of Turing machine, programming one machine to perform a specific computation task is challenge [1][2][3][4] 1 .

11.4

Computation as frame work

Cellular automata and other systems with simple underlying rules can produce complicated behaviors. To analyze the behavior of these systems, our experience from traditional science might suggest that standard mathematical analysis should provide the appropriate basis for any framework. This kind of analysis tends to be useful only when the overall behavior of the system is fairly simple. So what can one do when the over all behavior is more complex? The main purpose of Wolfram’s book “A New Kind of Science” is to develop a new kind of science that allows progress to be made in such cases. The single most important idea that underlines this new science is the notation of computation. So far we have been thinking of cellular automata and other systems as simple computer programs. Now we will think of these systems in terms of the computations they can perform [5]. In a typical case, the initial conditions for a system like cellular automaton or Turing machine can be viewed as corresponding to the input to a computation, while the state of the system after some number of steps corresponding to the output. The key idea is to think in purely abstract terms about the computation that is performed. This abstract is useful due to two reasons, the first, because it allows us to discuss in a unified way systems that have completely different underlying rules, and the second reason, because it becomes possible to imagine formulating principles that apply to a very wide variety of different systems [5].

11.4.1

Computations in cellular automata

Cellular automaton can be viewed as a computation. The following examples show how cellular automata can be used to perform some computations [5]. 1

A Visual Turing Machine simulation program is available from the web page http://www.cheran.ro/vturig

11.4. Computation as frame work

127

Example 2: Computing whether a given number is even or odd If one starts cellular automaton shown in Fig. 11.4 with an even number of black cells, then after a few steps of evolution, no black cells are left. But if instead one stars it with an odd number of black cells, then a single cell survives forever. Using this cellular automaton we can check whether the given number is even or odd.

Figure 11.4: Cellular automaton to check whether the given number is odd or even But cellular automaton can be used to perform more complicated computations, too.

Example 3: Computing the square of any number The cellular automaton shown in Fig. 11.5 can be used to calculate the square of any given number. If one starts with 5 black squares, then after a certain number of steps the cellular automaton will produce a block of exactly 5 × 5 = 25 black squares. Example 4: Computing the successive prime numbers The rule for this cellular automaton (Fig. 11.6) is somewhat complicated; it involves a total of sixteen colors possible for each cell. But the cellular automata that have been presented so far can be conveniently described in terms of traditional mathematical notations. But what kinds of computations are cellular automata like the one presented in Fig. 11.7 performing? Of course we cannot describe these computations by anything

128

Session 11. Turing’s Lure, Gödel’s Curse

Figure 11.5: Cellular automaton that computes the square roots of any number as simple as saying, for example, they generate primes. So how then can we ever expect to describe these computations? [5]

11.5

The phenomena of universality

Universal system is a system whose underlying construction remains fixed, but which can be made to perform different tasks just by being programmed in different ways. For example, take the computer: The hardware of the computer remains fixed, but the computer can be programmed for different tasks by loading different pieces of software. If a system is universal, then it must effectively be capable of emulating any other system, and as a result it must be able to produce behavior that is as complex as the behavior of any other system. As we have seen before simple rules can produce complex behavior. We will see later how simple rules can produce universal system. With their simple and rather specific underlying structure one might think that cellular automata would never be capable of emulating a very wide range of other systems. But what we will see in this section is that cellular automata can be

11.6. Game of Life

129

Figure 11.6: Cellular automaton to compute successive prime numbers made to emulate computer. We will consider cellular automata with specific rules known as Game of Life [6][7][8][9]. Because the construction of a “Life Computer” is an involved task, a thorough discussion is needed here.

11.6

Game of Life

One form of cellular automata is Game of Life, which was invented by Cambridge mathematician John Conway. Game of Life is no-player game. Life is played on an infinite squared board. The cells in this game represent the population. At any time some of the cells will be live and others will be dead. The initial structure of live cells can be set as one wants but when time starts to go on the cells birth or death is controlled by rules [6]. The rules of this game are as follows2 : • Birth: A cell that is dead at time t becomes live at time t + 1 only if exactly three of its eight neighbors were live at time t. • Death by overcrowding: A cell that is live at time t and has four or more of its eight neighbors live at time t will be dead by time t + 1. 2

The Game of Life simulation program is available, for example, at the web page http://www.bitstorm.org

130

Session 11. Turing’s Lure, Gödel’s Curse

Figure 11.7: Cellular automata that can not be described mathematically • Death by exposure: A live cell that has only one live neighbor, or none at all, at time t, will also be dead at time t + 1. • Survival: A cell that was live at time t will remain live at time t + 1 if and only if it had just 2 or 3 live neighbors at time t. An example of typical Life history is shown in Fig. 11.8. The starting generation has five live cells G0. In the first generation G1 there are three cells on either side of the line that are dead at G0, but have exactly three live neighbors, so will come to life at G1. From G1 to G2 the corner will survive, having 3 neighbors each, but every thing else will die by overcrowding. There will be 4 births, one by the middle of each side. From G2 to G3 there is a ring which each live cell has 2 neighbors so everything survives, there are 4 births inside. From G3 to G4 overcrowding will kill all cells except the 4 outer cells and the neighbors of these cells will born. From G4 to G5 another survival ring with 8 happy events about to take place. From G5 to G6 more overcrowding again leaves just 4 survivors. This time the neighboring births form: From G6 to G7 four separated lines of 3, called Blinkers, which will never interact again. From G7 to G8, G9, G10 - at each generation the

11.6. Game of Life

131

tips of Blinkers die of exposure but the births on each side reform the line in a perpendicular direction. The configuration will therefore oscillate with period two forever. The final pair of configuration known as Traffic Lights.

Figure 11.8: Typical Life history

Still Life There are some common forms of still lives; some of these are shown in Fig. 11.9, with their traditional names. The simple cases are usually loops in which each live cell has two or three neighbors according to local curvature, but the exact shape of the loop is important for effective birth control.

Figure 11.9: Some of the common forms of Still Life

132

Session 11. Turing’s Lure, Gödel’s Curse

Life Cycle There are some configurations whose Life history repeats itself after some time. This period of time known as Life Cycle. The Blinker is the simplest example of these configurations, where the Blinker repeats itself with period larger than 1. Some configurations with their own Life cycles are shown in Figs. 11.10 and 11.11.

Figure 11.10: Three Life cycles with period two

Figure 11.11: Life cycle with period three: (a) Two eaters Gnash at each other. (b) The Cambridge Pulsar

11.6. Game of Life

133

The Glider and other space ships If we look at Fig. 11.12, we can notice that the generation 4 is just like generation 0 but moved one diagonal place, so that the configuration will steadily move across the plane. The arrangements at times 2, 6, 10, . . . are related to those at times 0, 4, 8, 12, . . . by symmetry that geometers call a glide reflexion — so this creature is known as “Glider”.

Figure 11.12: The Glider moves one square diagonally each four generations

The unpredictability of Life Is it possible to tell before hand the destiny of a Life pattern? Is it going to fade away completely? Or is it going to be static? Or it will travel across the plane, or its going to expand indefinitely? To answer these questions let us have a look at very simple configuration, a straight line of n live cells. When n = 1 or 2 the Life pattern fades immediately. When n = 3 the result is the Blinker. When n = 4 it becomes a Beehive at time 2. When n = 5 it becomes Traffic Light at time 6. When n = 6 it fades at time 12. When n = 7 it makes a beautifully symmetric display before terminating in the Honey Farm at time t = 14. When n = 8 it gives 4 blocks and 4 Beehives. When n = 9 it makes two sets of Traffic Lights, and so on. Even when we start with very simple configuration and small number of cells it is not easy to see what goes on [6]. The other question of Life is that, can the population of a Life grow without limit? The answer of this question was YES and a group at M.I.T proved it in November 1970. The conclusion from this part is that, Life is really unpredictable.

134

11.6.1

Session 11. Turing’s Lure, Gödel’s Curse

Making a Life computer

Many computers have been programmed to play the game of Life. In this section we will see how to define Life patterns that can imitate computers. Computers are made from pieces of wire along pulses of electricity go. To mimic these by certain lines in the plane along which Glider travel (see Fig. 11.14.

Figure 11.13: The three logical gates

Figure 11.14: Gliding pulses In the machine there is a part called clock, this clock generating pulses at regular intervals. The working parts of the machine are made up of logical gates. The Glider Guns can be considered as pulse generators. To understand how to construct logical gates we first study the possible interactions of two Gliders, which crash at right angles [6]. When Glider meets Glider The figure below (Fig. 11.15) shows two Gliders crash in different ways: 1. To form a Blinker 2. To form a Block 3. To form a Pond 4. To annihilate themselves.

11.6. Game of Life

135

The last crashing way has special interest, because the vanishing reactions turn out to be surprisingly useful.

Figure 11.15: Gliders crashing in diverse fashion

How to make a not gate The vanishing reaction can be used to make a not gate. As shown in Fig. 11.16, the input stream enters at the left of the figure and the Glider Gun is positioned and timed so that every space in the input stream allows just one Glider to escape from the gun [6][9].

The Eater Other phenomena that happens when two Gliders meet, an Eater will be constructed. This is shown in Fig. 11.17.

Gliders can build their own guns So far we have been concerned by Glider meets another Glider, what happens when a Glider meets other thing? More constructively it can turn a Pond into a Ship (Fig. 11.18) and a Ship into a part of the Glider Gun. And since Gliders can crash to make Block and Pond they can make a whole Glider Gun [6].

136

Session 11. Turing’s Lure, Gödel’s Curse

Figure 11.16: A Glider Gun and vanishing reaction make a NOT gate

Figure 11.17: Two Gliders crash to form an Eater The kickback reaction Other useful reaction between Gliders is the kickback (Fig. 11.19) in which the decay product is a Glider travelling along a line closely parallel to one of the original ones but in the opposite direction. We may think of this Glider as having been kicked back by the other one [6][9]. Building blocks for our computer Figure 11.20 shows logical gates constructed based on vanishing reactions. From here on it is just an engineering problem to construct an arbitrarily large finite computer. Such computers can be programmed to do any thing [6] (construction of memory elements, for example, has been skipped here).

11.7. Conclusion

137

Figure 11.18: In (b), Glider dives into Pond and comes up with Ship. In (c), Glider crashes into Ship and makes part of Glider Gun

Figure 11.19: The kickback

11.7

Conclusion

The conclusion that can be drawn is that complex systems like cellular automata or Turing machine with very simple underlying rules are universal systems which can emulate other complicated systems. When we come to analyze such systems, we can see clearly that we do not have a framework to deal with these systems. New kind of science to analyze these powerful systems is needed.

Bibliography 1. http://www.cheran.ro/vturig 2. http://tph.tuwien.ac.at/ svozil/publ/1996-casti.htm 3. http://www.turing.org.uk/turing/ 4. http://www.bitstorm.org

138

Session 11. Turing’s Lure, Gödel’s Curse

Figure 11.20: (a) AND gate, (b) OR gate, and (c) NOT gate 5. Wolfram, S.: A New kind of science.Wolfram Media, Champaign, Illinois, 2002. 6. Elwyn Berlekamp, Conway, J., and Guy, R.: “Winning Ways”, Vol. 2, Chapter 25. 7. http://www.ams.org/new-in-math/cover/turing.html 8. http://www-csli.stanford.edu/hp/turing1.html 9. http://ddi.cs.uni-potsdam.de 10. Hyötyniemi, H.: On universality and undecidability in dynamic systems. Helsinki University of Technology, Report 133, December 2002.

Session 12 Hierarchical Systems Theory Jari Hätönen Systems Engineering Laboratory, University of Oulu P.O. BOX 4300, FIN-90014 University of Oulu, Finland

12.1

Introduction

The design and analysis of large and complex system requires frequently that the system is divided into smaller units, called sub-systems. The structure of overall system resulting from the interconnections of sub-systems can be very complex. One of the most common structures, is the hierarchical structure, i.e. the lay-out of the structure is vertical. Consequently in this report only hierarchical systems are considered, and a special emphasis is put on twolevel hierarchical systems. A diagram of a standard two-level hierarchical system is shown in Fig. 12.1, where, as expected, two levels can be found, namely the lower level and the upper level. The lower level consists of the process level, where the process level has been divided into N sub-systems. The sub-systems are connected to each other because there is either material or information flows between these sub-systems. Each sub-system has its own decision unit, which tries to control the behaviour of the sub-system so that the objectives of this particular sub-system would be met. The decision unit can also use feedback information from the sub-system to improve its ’control policy’. However, quite often the objectives of the sub-systems are conflicting, resulting in a poor overall performance. Hence an upper-level decision unit 139

140

Session 12. Hierarchical Systems Theory

Upper level

or a coordinator has to be introduced, and the objective of this decision unit is to coordinate the decision making of the sub-systems so that the overall performance of the system would be improved. The coordinator receives information from the sub-systems so that it can monitor the performance of the overall system. Note that this approach is not as restrictive as it sounds, because if a new higher level is added into a two-level system, in most of cases the new level can be chosen to be the higher level of the modified system, and the original two-level system becomes the lower-level of the modified system.

Coordinator

Lower level

1

1

2



N

Lower-level decision making

2



N

Process level

Figure 12.1: A diagram of a standard two-level hierarchical system The strength of the two-level hierachical system theory is that there exists a large number of important systems around us that can be seen as being a twolevel hierarchical system. Examples considered in [3] and [2] are organisation structures in large companies, distribution networks, oil refineries and power plants, to name a few. Consequently it is an interesting question why twolevel hierarchical systems are so frequent. The following observations might be at least a partial answer to this question: 1) It is easier to analyse and (re)design large-scale systems if they are broken into smaller units. 2) The sub-system approach allows specialisation, where a sub-system is only responsible for its own task and does not require information of the objectives of the overall system. 3) Hierarchical systems allow a certain degree of fault tolerance. This is due to the fact that if a sub-system breaks down, the overall system does not necessarily stop working. Furthermore, due to ’module structure’ the failure is ’localised’ and hence easy to detect and and repair

12.2. Process level

141

(i.e. only the faulty module has to replaced and its connections reestablished). The coordinator, however, is the weak point, because if it stops working, the overall system cannot function anymore. 3) Even evolution seems to favour two-level hierachical systems. For example in a human body the brain can be considered as being the coordinator, whereas the rest of the body forms the sub-system level. 4) In the evolution of organisations two-level hierachical systems play a major role. Even pre-historic tribes had a tribe leader, whom was responsible for coordinating the actions of individual tribe members in order to improve the overall well-being of the tribe. In the following material it is shown how two-level hierarchical systems can be analysed mathematically. Furthermore, it is shown how optimisation techniques can be used to coordinate the running of a two-level hierarchical systems. The material is a rather mathematical, but in order to understand it, only a fairly modest background is needed in constrained optimisation theory. This report is based on [1], which is a nice introduction into the theory of hierarchical systems.

12.2 12.2.1

Process level Sub-system models

The process level consists of N inter connected sub-systems. For each subsystem i there exists a set of admissible inputs Ii , the set of admissible outputs Oi and a sub-system model fi : Ii → Oi . Because the sub-systems are interconnected to each other, the set of inputs Ii is divided into the set of free inputs Mi and the set of interconnected inputs Xi and consequently Ii ⊂ Mi × Xi

(12.1)

In a similar fashion the the outputs are divided into the set of free ouput variables Yi and set of constrained output variables Zi , i.e. the outputs zi ∈ Zi are fed as inputs into other sub-systems, see Fig. 12.2. With this division the set of admissible outputs Oi can be written as O i ⊂ Y i × Zi

(12.2)

142

Session 12. Hierarchical Systems Theory

In order to achieve mathematical tractability, from now on it is assumed that Mi , Xi , , Yi, Zi are suitable vector spaces. Typically the vector spaces would be chosen to be suitable vector spaces of time functions (this is of course highly application dependent), L2 [0, T ]-spaces, l2 [0, T ] and C∞ [0, T ] being frequently used spaces. In order to describe the interconnections present in the system an interconnection mapping H : Z → X is defined where Z := Z1 × Z2 × . . . × ZN X := X1 × X2 × . . . × XN

(12.3)

and N is again the number of sub-systems.

Figure 12.2: Division of variables into free and interconnected variables Without loss of generality it can be shown that for each xi , the corresponding interconnections can be written as xi =

N 

Cij zj ,

i = 1, 2, . . . , N

(12.4)

i=1

where xi ∈ Rni , zj ∈ Rmj and Cij is a real-valued ni × nj -matrix. Furthermore, the element ckl ij of Cij is either zero or one depending on whether or not the lth component of zj is connected to the kth element of input vector xi . Define now X := X1 × X2 × . . . × XN M := M1 × M2 × . . . × MN Y := Y1 × Y2 × . . . × YN Z := Z1 × Z2 × . . . × ZN

(12.5)

12.2. Process level

143

and x = (x1 , x2 , . . . , xN ) m = (m1 , m2 , . . . , mN ) y = (y1 , y2, . . . , yN ) z = (z1 , z2 , . . . , zN )

(12.6)

Define further the mapping f : M × X ⊃ D(f ) → Y × Z and C : Z → X so that f : (m, x) → (y, z) = (y1 , . . . , yN , z1 , . . . , zN )

(12.7)

where (yi , zi ) = fi (mi , xi ),

i = 1, 2, . . . , N

(12.8)

and the interconnection mapping C where C(z) = x = (x1 , x2 , . . . , xN )

(12.9)

and xi =

N 

Cij zj

(12.10)

i=1

From now on it is assumed that exists a model F : M ⊃ D(F ) → Y × X × Z so that for an arbitrary m ∈ D(F ) the equations 

(y, z) = f (m, x) x = C(z)

(12.11)

define uniquely x ∈ X, z ∈ Z and y ∈ Y so that F : m → F (m) = (y, x, z)

(12.12)

Consequently F represents an overall process model shown in Fig. 12.3. For future purposes the overall mapping F is divided into components P : D(F ) → Y , K : D(F ) → X, S : D(F ) → Z so that F (m) = (P (m), K(m), S(m))

(12.13)

144

Session 12. Hierarchical Systems Theory

y

m

f

F

y z x

x

C

z

Figure 12.3: The overall mapping F Remark 1 Note that even the overall process model F exists, it is not necessarily explicitly available for calculations. This is due to the fact that in order to construct F the constrained input vector x has to be solved as a function of the free input vector m from the set of equations in (12.11). If the number of sub-systems is large and their mathematical models fi are complex, it can be either impossible or impractical to solve F from (12.11). Remark 2 In the context of dynamical systems the existence of F requires that the initial conditions of the system have to specified. Furthermore, the interconnection has to made so that the overall system model F is also causal. Note that P , K and S can written in the following component form P = (P1 , P2 , . . . , PN ) K = (K1 , K2 , . . . , KN ) S = (S1 , S2 , . . . SN , )

(12.14)

where Pi : D(F ) → Yi , Ki : D(F ) → Xi , Si : D(F ) → Zi , i = 1, 2, . . . , N.

12.2.2

Lower-level decision units

As was explained earlier, for each sub-system i there exists a decision unit, and the objective of the decision unit is to control the sub-system according to its objectives by manipulating the free input variables mi . The objectives of the lower-level decision units can vary a lot, but in this report each subsystem i is associated with a cost function gi , which is a function of the

12.3. Upper level

145

sub-systems input and output variables mi , xi , yi and zi . In other words for each sub-system i there exists a mapping (cost function) gi : Mi × Xi × Yi × Zi → R

(12.15)

Furthermore, the sub-system model fi can be used to eliminate the output variables from the cost function, and the cost function becomes only a function of mi and xi in the following way: Gi : Mi × Xi ⊃ D(Gi ) → R,

D(Gi ) = D(fi )

(12.16)

where 

Gi (mi , xi ) = gi mi , xi , fi1 (mi , xi ), fi2 (mi , xi )



(12.17)

with the notation fi (mi , xi ) = (fi1 (mi , xi ), fi2 (mi , xi )). In a similar fashion an overall cost function g is defined where g :M ×X ×Y ×Z → R

(12.18)

The overall process model can then used to eliminate the output variables resulting in equivalent representation of (12.18) G : M ⊃ D(G) → R, D(G) = D(F ) G(m) = g (m, K(m), P (m), S(m))

(12.19)

From now on it is assumed that the individual cost functions Gi and the overall cost funtion G are related to each other with the equation G(m) =

N 

Gi (mi , Ki (m))

(12.20)

i=1

To judge whether or not this is always plausible is left to the reader.

12.3 12.3.1

Upper level Upper-level decision making

The objective of the coordinator is affect the lower-level decision making so that that the cost function G : D(F ) → R is minimised, i.e. that the overall

146

Session 12. Hierarchical Systems Theory

optimisation problem is being solved. The overall optimisation problem can be written equivalently as a constrained optimisation problem min

N 

m∈D(F )

Gi (mi , xi )

(12.21)

i=1

with the constraint x = K(m)

(12.22)

This formulation, unfortunately, requires that the overall process model F is explicitly available, and as was earlier explained, this is not the case if the system contains a large number of complex sub-systems. Consequently the overall optimisation problem has to modified so that it can be divided into independent optimisation problems for each sub-system. In order to achieve this, define a modified sub-system model f˜i and cost function g˜i , i = 1, 2, . . . , N that now become a function of an external parameter γ ∈ Γ, where γ is the coordination parameter: f˜i : Mi × Xi × Γ ⊃ D(f˜i ) → Yi × Zi , g˜i : Mi × Xi × Yi × Zi × Γ → R

D(f˜i = D(fi ) × Γ

(12.23)

Furthermore, the modified sub-system model can be used to eliminate the output variables from the modified cost function, resulting in 

˜ i (mi , xi , γ) = g˜i mi , xi , f˜1 (mi , xi , γ), f˜2(mi , xi , γ), γ G i i



(12.24)

This results in the following sub-system decision making problem

12.3.2

Sub-system decision making problem

The sub-system decision unit i has optimise with a given γ ∈ Γ the cost ˜ i (mi , xi , γ), i.e. the decision unit has to find (mi , xi ) ∈ Mi × Xi so function G ˜ i (mi , xi , γ). that resulting pair (mi , xi ) minimises the cost function G Remark 3 The important point here is that now the optimisation problem is now an unconstrained optimisation problem, and the fact that xi is determined by the behaviour of other sub-systems is not taken into account.

12.3. Upper level

147

The optimal solution of the problem (if it exists in the first place) is called as the γ-optimal solution (m(γ), x(γ)), where m(γ) = (m1 (γ), m2 (γ), . . . , mN (γ))

(12.25)

˜ i (mi (γ), xi (γ), γ) ˜ i (mi , xi , γ) = G minmi ,xi G i = 1, 2, . . . , N

(12.26)

and

The objective of the coordinator is select the coordination variable γ ∈ Γ so that the overall optimisation is being minimised, i.e. min G(m) = G (m(γ))

m∈D(F )

(12.27)

The question whether or not there exits a γ so that (12.27) holds depends ˜ i and the set of coordistrongly on how the modified ’system variables’ f˜i , G nation variables Γ are chosen. Consider now the case where the modification is done so that for an arbitrary γ ∈ Γ, m ∈ D(F ) it holds that f˜i (mi , Ki (m), γ) = fi (mi , Ki (m)) ˜ i (mi , Ki (m), γ) = Gi (mi , Ki (m)) G

(12.28)

i.e. the modified sub-system model is equivalent to the original sub-system model and the modified cost function is equivalent to the original cost function when the constraint equation x = K(m) is met. Proposition 1 Suppose that sub-system model fi and the sub-system cost function Gi are modified according to (12.28) and htat there exists m◦ ∈ D(F ) and γ ◦ ∈ Γ so that minm∈D(F ) G(m) = G(m◦ ) x(γ ◦ ) = K (m(γ ◦ ))

(12.29)

This implies that G(m◦ ) = G (m(γ ◦ )). Proof.

◦ G (m(γ ◦ )) = N i=1 Gi (mi , Ki (m(γ )) N ˜ i (mi (γ ◦ ), Ki (mi (γ ◦ ), γ ◦ ) = i=1 G N ˜ i (mi , xi , γ ◦ ) = i=1 minmi ,xi ∈D(fi ) G N ˜ i (mi , Ki (m), γ ◦ ) ≤ i=1 minm∈D(F ) G N ˜ ≤ minm∈D(F ) i=1 Gi (mi , Ki (m), γ ◦ ) ◦ ◦ ◦ ˜ ≤ N i=1 Gi (mi , Ki (m ), γ ) N ◦ ◦ = i Gi (mi , Ki (m ) = G(m◦ )

(12.30)

148

Session 12. Hierarchical Systems Theory

and hence G (m(γ ◦ )) ≤ G(m◦ ). However, by optimality, G (m(γ ◦ )) ≥ G(m◦ ), and consequently G (m(γ ◦ )) = G(m◦ ), which concludes the proof. Note that even if the modification is done as in (12.28), is does not guarantee that condition (12.27) is met because in the proof it is assumed that there exists at least one γ ◦ ∈ Γ so that resulting pair (m(γ ◦ ), x(γ ◦ ))) satisfies the ’interconnection equation’ x(γ ◦ ) = K (m(γ ◦ )). Whether or not this is true leads to the definition of coordinability, which is the topic of the next section.

12.4

Coordination

As was explained in the previous section the sub-system decision problems were defined to be optimisation problems where the optimisation problems can be modified with an external parameter γ ∈ Γ. The purpose of this modification is to make the sub-system decision problems independent from each other and to remove the ’conflicts’ caused by the interconnections between the sub-systems. Furthermore, the objective of the upper-level coordinator is to find a γ ◦ ∈ Γ so that with this particular γ ◦ the solutions of the subsystem optimisation problems also satisfy the interconnection equations and solve the overall optimisation problem. This results in the following definition of coordinability: Definition 1 (Coordinability) 1) The overall optimisation problem has a solution, i.e. ∃m◦ ∈ D(F ) so that min G(m) = G(m◦ )

m∈D(F )

(12.31)

ii) There exists γ ◦ ∈ Γ so that there exists a γ ◦ -optimal pair (m(γ ◦ ), x(γ ◦ )), i.e. ˜ i (mi (γ ◦ ), xi (γ ◦ ), γ ◦ ) ˜ i (mi , xi , γ ◦ ) = G min(mi ,xi)∈D(fi ) G i = 1, 2, . . . , N

(12.32)

iii) The objectives of the sub-system decision units and upper-level decision unit are in ’harmony’, i.e. m(γ ◦ ) ∈ D(F ) and G(m◦ ) = G (m(γ ◦ ))

(12.33)

12.4. Coordination

149

As a consequence the coordinability of the overall system guarantees that there exists at least one γ ∈ Γ so that the γ-optimal of the sub-system optimisation problem is equal to solution of the overall optimisation problem. The coordinability of two-level hierachical system, however, does not lead into an efficient solution method, because the coordinator has to know a’priori the optimal γ ◦ that results in the solution of the overall optimisation problem. If the coordinator cannot choose directly the optimal value γ ◦ , which is the case in most problems, the coordination becomes an iterative process where the coordination variable is changed iteration by iteration to the ’correct’ direction. In order to implement the ’correction process’ the coordinator needs a coordination strategy. In the coordination strategy the new value for the coordination variable depends on the current value of the coordination variable and the corresponding γ-optimal solution (m(γ), x(γ)), resulting in the coordination strategy η, η :γ×M ×X →Γ

(12.34)

The coordination algorithm related to the coordination strategy can be described in the following way: 1) Select a suitable initial guess for γ ∈ Γ 2) The lower-level decision units solve their own optimisation problems resulting in the γ-optimal pair (m(γ), x(γ)) 3) If γ is not optimal (whether or not this can be tested in practise depends highly on the modification technique, see next section for further details), the coordinator selects a new coordination variable γ ∈ Γ using the coordination strategy γ ← η (γ, m(γ), x(γ))

(12.35)

and the algorithm jumps back to 2). Note this algorithm is purely abstract and due to its abstractness it is impossible to analyse whether not the coordination algorithm converges. In this next section one possible way to implement the coordination strategy is being discussed.

150

Session 12. Hierarchical Systems Theory

12.5

The balancing principle

The balancing principle is coordination method where the interconnections between sub-systems are ’cut off’ resulting in N independent optimisation problem. Each sub-system decision unit i optimises its running not only as a function of mi but also a function of the interconnected input variable xi . Because now the sub-systems optimise their running independently from each other, the constraint equation xi = N i=1 Cij zj , i = 1, 2, . . . , N is not typically met, and the overall optimisation problem remains unsolved. Consequently in the balancing principle the sub-system decision units are forced to select optimal solutions (mi , xi ) so that constraint equation is met. The modification of the ’system variables’ in the balancing technique is done in the following way: f˜i (mi , xi , γ) := fi (mi , xi ) g˜i (mi , xi , yi, zi , γ) := gi (mi , xi , yi, zi ) + ψi (xi , zi , γ) ˜ i (mi , xi , γ) := g˜i (mi , xi , f 1(mi , xi ), f 2 (mi , xi ), γ) G 1 i

(12.36)

where the mappings ψi : Xi × Zi × Γ → R for i = 1, 2, . . . , N are defined so that ψ(x, z, γ) :=

N 

ψi (xi , zi , γ) = 0

(12.37)

i

if the balance equation xi =

N 

Cij zj ,

i = 1, 2, . . . , N

(12.38)

i=1

is met. Hence in the balance principle only the cost functions are modified whereas the sub-system model is equal to the original sub-system model. Note that modification that satisfies (12.37) and (12.38) is called a zero-sum modification. The defining property of a zero-sum modification is that the effect of the modification disappears from the overall cost function N i=1 Gi when the system is in ’balance’, i.e. the interconnection equations hold. In this case the overall cost can be calculated as the sum of the individual values of the cost functions in the following way: G(m) =

N  i=1

˜ i (mi , xi , γ) = G

N  i=1

Gi (mi , xi )

(12.39)

12.5. The balancing principle

151

How to select the coordination variable γ ∈ Γ is not discussed here. However, in the next section it is shown how Langrange multiplier theory can be used to implement the balancing principle. In the balancing principle the decision problem for each sub-system decision unit is given by ˜ i (mi , xi , γ) = G ˜ i (mi (γ), xi (γ), γ) min G

(12.40)

(mi ,xi )

Define now a mapping φi : Γ ⊃ D(φi) → R, φi (γ) =

min

(mi ,xi )∈D(fi )

˜ i (mi , xi , γ) G

(12.41)

where D(φi) = {γ ∈ Γ|φi (γ) exists}

(12.42)

Consider now the modified cost function that can be written as N ˜ G (mi , xi , γ) i=1 N i = g (m , x , f 1 (m i=1

i

i

i

i

2 i , xi ), fi (mi , xi )

+

N i=1

ψi (xi , fi2 (mi , xi ), γ)

(12.43)

If the original overall optimisation problem has a solution m◦ and there exists a γ ◦ ∈ ∩D(φi ) so that the lower-level γ ◦ -optimal solution (m(γ ◦ ), x(γ ◦ )) satisfies x(γ ◦ ) =

N 

Cij fj2 (mj (γ ◦ ), xj (γ ◦ )) ,

i = 1, 2, . . . , N

(12.44)

i=1

or in a more explicit form x(γ ◦ ) = K(m (γ ◦ ))

(12.45)

then it holds that ◦ ◦ ◦ ˜ φ(γ ◦ ) = N φi (γ ◦ ) = N i=1 i=1 Gi (mi (γ ), xi (γ ), γ ) ◦ ◦ ◦ = i=1 Gi (mi (γ ), xi (γ )) = G (m(γ ))

(12.46)

because of the zero-sum modification. On the other hand ∀γ ∈ D(φ) ˜ φ(γ) = min(m,x)∈D(f ) N i=1 Gi (mi , xi , γ) N ˜ i (mi , xi , γ) ≤ min(m,x)∈D(f ),x=K(m) i=1 G N = min(m,x)∈D(f ),x=K(m) i=1 Gi (mi , xi ) = G(m◦ ) ≤ G (m(γ ◦ ))

(12.47)

152

Session 12. Hierarchical Systems Theory

and was shown previously in Section 12.3, G(m◦ ) = G (m(γ ◦ )). Consequently in tha balancing principle the coordinator variable γ is taken to be the solution of the maximisation problem max φ(γ)

(12.48)

γ∈D(φ)

In practise it can be difficult (impractical) to solve analytically this maximisation problem, and numerical methods have to be used instead. Typically the gradient ∇φ is available (see next section), and for example a steepestdescent algorithm can be used to solve iteratively the maximisation problem.

12.6

The Langrange technique

In the Langrange technique the overall optimisation problem is modified by adjoining the interconnection equation into the cost function, resulting in the following Langrange function L(m, x, y) =

N 

Gi (mi , xi ) +

i=1

N  j=1



γi , xi −

N 



Cij fj2 (mj , xj )

(12.49)

i=1

where < ·, · > is the inner product in Xi (for simplicity it is assumed here that Xi for i = 1, 2, . . . , N is always the same space and that it is reflexive, i.e. the dual space of Xi is Xi . Reflexive spaces are for example the Euclidian space RN (discrete-time case with finite time-axis) and L2 [0, T ] (continuoustime case with finite time-axis), which are one of the most commonly used spaces in control theory. By changing the summation order it can be shown that the Langrange function can be written equivalently as

L(m, x, γ) = N i=1 Li (m, x, γ) 2 Li (m, x, γ) = Gi (m, xi ) + γi , xi  − N j=1 γj , Cij fi (mi , xi )

(12.50)

In summary L(m, x, γ) has been divided into a sum where each term Li (mi , xi , γ) depends only on the variables related to sub-system i and the Langrange multiplier γ. This modification is clearly a zero-sum modification. As was shown in the previous section, in the balancing technique the sub-system decision problem is to solve with a given γ (which is now the Langrange-multpier) the optimisation problem φi (γ) :=

min

mi ,xi ∈D(fi )

Li (mi , xi , γ) = Li (mi (γ), xi (γ), γ)

(12.51)

12.6. The Langrange technique

153

and upper-level decision problem is max φ(γ)

γ∈D(φ)

(12.52)

where (as previously) φ=

N 

φi

(12.53)

i=1

As was mentioned previously, the upper level decision problem does not necessarily have a nice closed-form solution. Consequently in order to use numerical optimisation methods, the gradient ∇φi is needed. However, in the Langrange technique this is just (can you show this?) ∇γ φ(γo ) = [1 , 1 , . . . , N ]T := 

(12.54)

where i = xi (γo ) − Cij fj2 (m(γo ), xj (γo))

(12.55)

and consequently the coordination strategy can be chosen to be η(γ, m, x) = γ + k · 

(12.56)

where k is a step-length parameter the algorithm designer has to select. This results in the following algorithm: 1) Set k = 0 and select an initial guess γk = γ0 . 2) Solve the sub-system optimisation problems with γ = γk . 3) If the system is in balance, stop, the overall optimisation problem has been solved. Otherwise set k → k + 1 and update γk+1 = γk + k · ∇γ φ(γk )

(12.57)

and go to step 2. Remark 4 If sub-system decision problems are linear programmes (i.e. they are of the form cT x then it can be shown that the Langrange multipier can be understood to be the price that a sub-system has to pay to the other sub-system for the transfer of material (information). In other words coordination is equivalent to ’an optimal pricing policy between departments’, see [3]!

154

Session 12. Hierarchical Systems Theory

The toy example in the following section will illustrate the Langrange approach. Note that even this example considers only static sub-system models, this technique can be applied on dynamical systems models without any major complications, see [2] and [1] for further details.

12.7

A toy example

Consider the linear static system 

y1 = 2m1 + u1 = P1 (m1 , u1 ) y2 = 2m2 + u2 = P2 (m2 , u2 )

(12.58)

with the interconnections u1 = m2 , u2 = m1 , and the overall cost function is defined to be G(m, y) = m21 + m22 + (y1 − 1)2 + (y2 − 2)2

(12.59)

It is a straightforward exercise to show that the optimal solution for the optimisation problem is m ˆ = [1/5 7/10]T . The corresponding Langrange function is given by L(m, y, γ) = m21 +m22 +(y1 −1)2 +(y2 −2)2 +γ1 (u1 −m2 )+γ2 (u2 −m1 )(12.60) resulting in following two sub-system cost functions G1 (m1 , y1 , γ) = m21 + (y1 − 1)2 + γ1 u1 − γ2 m1 G1 (m2 , y2 , γ) = m22 + (y2 − 2)2 + γ2 u2 − γ1 m2

(12.61)

For a fixed γ the optimal control policies for P1 and P2 become

10 4 4 2



m1 u1



=

4 + γ2 2 − γ1



,

10 4 4 2



m2 u2



=

8 + γ1 4 − γ2

(12.62)

The gradient of φ(γ) becomes

∇γ φ(γ) =

u1 − m2 u2 − m1

(12.63)

and the update law for the coordination variable (Langrange multplier) becomes

γ1 (k + 1) γ2 (k + 1)



=

γ1 (k) γ2 (k)



+k

u1 − m2 u2 − m1

(12.64)

12.8. Conclusions

155

where k = 0.1 (a sophisticated guess). The initial guess for γ is γ(0) = [10 10]T . Fig. 12.4 shows how the ’input functions’ converge, i.e. in this figure the Euclidian norm of [m1 − 1/5 m2 − 7/10]T is plotted as function of the iteration rounds, showing a reasonable convergence speed. Fig. 12.5 on the other hand shows the value of the modified cost function. From this figure it can be seen that the coordinator is maximising the modified cost function, which is the expected result.

Convergence of m1 and m2

25

l2−norm of m−mref

20

15

10

5

0 0

20

40

60

Iteration round k

80

100

Figure 12.4: Convergence of the inputs

12.8

Conclusions

In this chapter a general theory for the optimisation of two-level hierarchical systems has been introduced. This theory can be applied on a wide range of applications, examples being economics, organisation theory and largescale industrial plants. In this theory the system is divided into sub-systems, where each sub-system tries to achieve its own objectives without considering whether or not they contradict with other sub-systems. In order to rectify

156

Session 12. Hierarchical Systems Theory

The modified cost function 0.7 0.68 0.66 0.64

J(k)

0.62 0.6 0.58 0.56 0.54 0.52 0.5 0

20

40

60

80

100

Iteration round k Figure 12.5: The value of the modified cost function the ’selfishness’ of the sub-system decision making, an upper-level decision making system has to be introduced. The objective of the upper-level decision making unit is to coordinate the decision making of each sub-system, so that an overall harmony is achieved. A special emphasis was put on the so called balancing technique, because it can be implemented numerically with the well-known Langrange technique. Note that in this report no rigorous convergence theory was presented for the iterative balancing technique. Convergence, however, can be proved in some special cases, see [2] for further details. The required theoretical background for understanding the theory of twolevel hierarchical systems in the most abstract setting is, unfortunately, reasonably sophisticated. Consequently the theory can be quite hard for an engineer to digest, and therefore it has not found its way to the mainstream material taught at graduate level in control engineering courses. On the other hand it offers an interesting option for more mathematically oriented engineers to do high-quality control for complex systems without resorting to ad-hoc methods (i.e. fuzzy control, agents etc.).

Bibliography [1] L. Hakkala, Hierarkisten järjestelmien optimointi: koordinointiperiaattet, Report C 23, Systems Theory Laboratory, Helsinki University of Technology, 1976 [2] Mesarovic, Macko, Takahara, Theory of Hierarchical Multilevel Systems, Academic Press, New York, 1970 [3] Lasdon, Optimisation Theory for Large Systems, The Macmillan Publishing Company, New York, 1970

157

158

BIBLIOGRAPHY

Session 13 Qualitative Approaches Petteri Kangas [email protected] As a tool to model complex systems, system dynamics is an alternative. It can be used for wide range of applications. Mostly qualitative results can be achieved by using this method. The most valuable benefit is to understand the inner functionality of the observed system. System dynamics has been used for real applications during 50 years.

13.1

Introduction

In the previous chapters, complex systems have been introduced. Their structures have been presented, their behaviour has been introduced and different ways to model complex systems have been shown. The aim of this chapter is to show the basic ideas of system dynamics and system thinking. These are tools for modelling and understanding different kind of complex systems, from technology to biology and from economy to human behaviour. System dynamics and thinking can be seen as a general tool for modelling different kind of complex systems. First, this chapter will give short historical perspective of system dynamics during last 50 years. It tries to connect system dynamics to other fields of research. A well-known example, the Beer distribution game is presented. Few other applications are shown. At the end, the key concepts of system dynamics are presented with a small example model. Finally, important publications, software and research institutes are listed. 159

160

Session 13. Qualitative Approaches

As an example of a dynamic system, Naill’s natural gas model [1] can be seen in the figure 13.1.

13.2

History of system dynamics

The father of System Dynamics is Jay Forrester. He graduated from Engineering Collage at the University of Nebraska. After that, he got a job from Massachusetts Institute of Technology as research assistant under supervision of Gordon Brown, a pioneer in “feedback control systems”. At Servomechanism Laboratory, they built an experimental control for radar, which was used during World War II [1], [2]. After the war, Jay Forrester continued to work at MIT. He started to build a flight simulator for U.S. Navy. Soon they noticed that analog computers were not able to handle problems this large. Development of digital computers started. In 1947, the Digital Computer Laboratory of MIT was founded. It was placed under the direction of Jay Forrester [1]. They designed the Whirlwind digital computer, which later on was developed to SAGE (Semiautomatic Ground Environment) air defence system for North America [2]. At the middle of 50’s Jay Forrester felt that pioneering days of digital computers were over. He was also running projects worth of several million dollars and noticed that social systems are much harder to understand and control than are physical systems [1]. A business school was found in MIT and Jay Forrester joined that school in 1956. He started to think management problems with engineering background [2]. The Jay Forrester’s first dynamic system was a simple pencil and paper simulation of employment instability at General Electrics Kentucky plants. Forrester was able to show that the instability of employment was due to the internal structure of the firm, not external forces such as the business cycle [2]. Next step was a compiler SIMPLE (Simulation of Industrial Management with Lots of Equations) developed by Richard Bennett in 1958. Later, 1959, SIMPLE and other system dynamics compiler was developed to DYNAMO compiler by Jack Pugh [2]. The system dynamics became an industrial standard for over 30 years. Forrester published his classic book, Industrial Dynamics, in 1961 [1]. Urban dynamics by Jay Forrester was the book, the brought system dynamics to even wider knowledge. It was the very first non-corporate application of

13.2. History of system dynamics

Figure 13.1: Naill’s natural gas model [1].

161

162

Session 13. Qualitative Approaches

system dynamics [1]. This model was developed with John Collins, the former mayor of Boston and his experts in 1968. The results of this study were not generally accepted, at first. They showed that many policies widely used for urban problems were in fact ineffective or even caused worse problems. Solutions, which seemed to be wrong at first glance, were, at the other hand, very effective [2]. The next turning point of system dynamics was year 1970. Jay Forrester was invited by the Club of Rome, to a meeting. He was asked if system dynamics could be used to describe the "predicament of mankind". Jay Forrester said YES and developed the first model of world’s socioeconomic system, called WORLD1. Later that model was refined and the WORLD2 model was published in the Forrester’s book titled World Dynamics [2]. This model and book showed that world’s socioeconomic system would collapse sometime during 21st century if no actions were taken to lessen the demands for earth’s carrying capacity. Forrester’s formed Ph.D. student Dennis Meadows developed the WORLD2 model further and the Club of Rome published a book titled the Limits of Growth on based on the WORLD3 model [1]. Although, the Limits of Growth is widely criticised, it is an excellent example of usability of system dynamics. Last years Jay Forrester has been developing National model for U.S. In addition, he has introduced system dynamics to education from kindergartens to high schools [1]. At 1990 Peter Senge, a director of the Systems Thinking and Organizational Learning Programs at MIT’s Sloan School of Management published a best seller book the Fifth Discipline – The Art & Practice of The Learning Organization. It re-introduced the system thinking to managers worldwide. The relationship of system thinking and system dynamics is following: They look the same kind of system from the same kind of perspective. They construct the same causal loop diagrams but system thinking rarely takes additional steps for constructing or testing computational simulations [3]. In Finland Markku Perkola is using ideas of system thinking [4]. At the beginning of the new millennium, 2000, Professor John Sterman for MIT’s Sloan School of Management, published a book Business Dynamics – System Thinking and Modelling for a Complex World. The roots of this book relates back to the publications of Jay Forrester and the Fifth Discipline by Peter Senge. The ideas of system dynamics and system thinking are combined into this book. There are wide range of examples for using system dynamics and thinking for different kind of applications, mostly economical. In addition, the wide range of graphically attractive computer programs for

13.3. Beer distribution game

163

system dynamics modelling has brought the ideas of Jay Forrester back again. System dynamics is seen as an option for modelling the complex world. System dynamics has almost 50 years history behind it. It is a real tool used for solving complex system problems, not a piece of ironic new science.

13.3

Beer distribution game

A classical example of dynamic systems is the Beer distribution game. The concepts of this dynamic system are shown at first. Next, the management flight simulator is presented.

13.3.1

Rules of the beer distribution game

Beer distribution game is said to be an enjoyable and effective introduction to the principles of system thinking [5]. MIT’s Sloan School of Management developed it in the beginning of 60’s. The game is used to teach supply chain management. It illustrates that the problems of system is caused by the system itself, not an external factor [6]. Beer distribution game shows how inventories and delays create fluctuation and overcompensation in supply chains. The rules of the game are rather simple: 1. There are four different links in supply chain: factory, distributor, wholesaler and retail, seen in the figure 13.2. 2. Orders are sent to upstream and shipments downstream. There are delays before ordered goods arrive. 3. Orders from downstream and shipments from upstream are the only information available for supplier. The levels of other inventories are not known. 4. If supplier cannot deliver orders, they will be logged in backlog, which is expensive. 5. Keeping goods in inventory is expensive; backlogging goods is even more expensive. The goal is to minimize these fees. Players are trying to follow those rules. They try to keep inventories as low as possible and at same time, deliver all the orders to their customer. Players

164

Session 13. Qualitative Approaches

soon realise that even the rules of this game are very simple, the whole system is complex. Overcompensations and fluctuations appear soon. The supply chain of the beer distribution game is shown in the figure 13.2 [7].

Figure 13.2: Supply chain of the beer distribution game [7]. One can also inspect the factors affecting on a supplier in the supply chain. These effects can be illustrated by using causal loop diagrams, which are the basis of the system dynamics and thinking. The figure 13.2 was an example of causal loop diagrams. In the figure 13.3 those loops are shown for a single supplier. Customer orders define the supplied goods. There is a delay however. According to the customer orders, the sales forecast is calculated. In addition, desired inventory must be calculated. According to the level of current inventory, desired inventory and sales forecast, the orders are sent to supplier. Again, there is a shipping delay before goods arrive to supplier’s inventory. All these effects can be seen in the figure 13.3. We have to keep in mind that system thinking is causal thinking. There is a cause and an effect. We cannot look and calculate correlations between the different things. Correlations among variables reflect the past behaviour of a system [5], not necessary the true causality between variables. Well know example is correlation with ice cream eating and drowning during a warm summer. We have to look at the true correlations between different things. Sometimes this is easy; sometimes several experts are needed in order to find

Figure 13.3: Simple, single-stage model in the Beer distribution game [7].

The causal structure of the distributor's inventory can be seen in figure 13.4.

Figure 13.4: Causal tree of the distributor's inventory in the Beer distribution game [7].

After playing the beer distribution game for a while, the players notice that the overcompensation and fluctuation are caused by the system itself. An example can be seen in figure 13.5. One way to compensate for this overcompensation and fluctuation is to increase the available information. Vendor Managed Inventory (VMI) is discussed here [7]. Every supplier in the supply chain knows the average retail sales (the retail sales forecast).

Figure 13.5: Multi-stage response to changes at the retailer level. Picture from the Beer distribution game [7].

In addition, every supplier knows the levels of all inventories in the supply chain. Each stage ships goods based on these values: the average retail sales plus the balance of the downstream inventories. After applying this simple method, the overcompensation and fluctuation seen in figure 13.5 are almost removed, as can be seen in figure 13.6. More information and an example game can be found on the Internet [7].
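The amplification mechanism is easy to reproduce in a few lines of code. The sketch below is not the MIT game or the simulator of [7]; it is a minimal model with made-up parameters, in which every stage simply passes its incoming order upstream plus a correction towards a target inventory, while shipments arrive after a fixed delay. Even this naive ordering rule produces the oscillation and amplification discussed above.

```python
from collections import deque

STAGES = 4      # retailer, wholesaler, distributor, factory
DELAY = 2       # shipping/production delay in periods (an assumption)
TARGET = 12     # desired inventory per stage (an assumption)
WEEKS = 30

inventory = [float(TARGET)] * STAGES
# Goods already in transit towards each stage (steady state of 4 per period).
pipeline = [deque([4.0] * DELAY) for _ in range(STAGES)]

def customer_demand(week):
    """Classic step input: 4 cases per week, then 8 from week 5 onwards."""
    return 4.0 if week < 5 else 8.0

for week in range(WEEKS):
    incoming_order = customer_demand(week)
    orders = []
    for s in range(STAGES):
        arrival = pipeline[s].popleft()           # shipment placed DELAY ago
        inventory[s] += arrival - incoming_order  # negative inventory = backlog
        # Naive decision rule: pass the order on, plus a correction that
        # tries to close the whole inventory gap in one period.
        order = max(0.0, incoming_order + (TARGET - inventory[s]))
        orders.append(order)
        incoming_order = order                    # becomes demand upstream
    for s in range(STAGES):
        pipeline[s].append(orders[s])             # arrives after DELAY periods
    print(f"week {week:2d}  orders per stage: " +
          " ".join(f"{o:6.1f}" for o in orders))
```

The correction term reacts only to the on-hand inventory and ignores the goods already on order, which is enough to make the orders overshoot and swing more strongly at every stage further from the customer.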

13.3.2 Management flight simulators

A management flight simulator is a visually attractive front end to an underlying system dynamics model. The aim is to invite decision makers to train in the simulator just as an aircraft pilot does. The decision maker runs the dynamic system through the flight simulator: he or she makes a decision, runs the model further, sees the response of the system and makes another decision. After the simulation run is over, the decision maker is invited to examine why the system behaved as it did. The same simulation can be run repeatedly with different decision policies. After a certain number of runs, the decision maker hopefully understands the underlying system and will apply the lessons learned in reality [1]. Peter Senge used management flight simulators as a principal tool of the learning organization. He focused on the idea that employees should learn the underlying system. Employees of a learning organization have a shared, holistic and systemic vision, and the capacity and commitment to learn continually. They do not just execute a master plan put forth by a "grand strategist" at the top of the hierarchy [6].

Figure 13.6: Multi-stage response to changes at the retailer level after implementing Vendor Managed Inventory (VMI). Picture from the Beer distribution game [7].

Peter Senge's ideas are somewhat contradictory to a hierarchical structure; they are closer to the ideas of a networked structure. Hierarchical and network structures were presented in the previous chapters as examples of complex systems.

13.4 Applications

The supply chain game, the Beer distribution game, is not the only application of system dynamics and systems thinking. A few other applications were already presented at the beginning of this chapter. As mentioned earlier, system dynamics is a real tool used for solving real-life problems; some additional applications are presented here. The world models showed that finite natural resources would be overused, the result being overshoot and collapse of the world's socioeconomic system. The WORLD2 and WORLD3 models were developed for the Club of Rome [1]. The life cycle theory of the geologist M. King Hubbert is based on the idea that only a limited amount of a certain resource, e.g. oil, is available. If consumption grows exponentially, the limited resource will lead to a peak in consumption after the growth, followed by a long decline. Naill later confirmed this theory in his Master's thesis.

The work grew into a three-year study of the U.S. energy transition problem. The problem is currently being solved by importing energy resources, which is cheaper than developing new domestic energy supplies [1]. Naill also developed a model of natural gas resources. That model was an aggregate one; it was refined and developed further into COAL1, COAL2 and FOSSIL1. The name FOSSIL1 was chosen because the model looked at the transition of an economy powered by fossil fuels to one powered by alternative energy sources. FOSSIL1 was later developed into FOSSIL2, which was used by the U.S. Department of Energy from the late 1970s to the early 1990s to analyse the following issues [1]:
• The net effect of supply-side initiatives (including price deregulation) on US oil imports.
• The US vulnerability to oil supply disruptions due to political unrest in the Middle East or a doubling of oil prices.
• Policies aimed at stimulating US synfuel production.
• The effects of tax initiatives (carbon, BTU, gasoline, oil import fees) on the US energy system.
• The effects of the Cooper-Synar CO2 Offsets Bill on the US energy system.
These days the FOSSIL2 model has been refined into the IDEAS model, which is still actively used. As a Ph.D. student, John Sterman noticed that the models above, the COAL-FOSSIL-IDEAS family, were isolated from the rest of the economy. Sterman showed that there are interactions between the energy sector and the rest of the economy, and in his dissertation he built in significant energy-economy interactions [1]. Fiddaman presented economy-climate interactions in his dissertation in 1997; the FREE (Feedback-Rich Energy Economy) model developed in it was the first dynamic model of these interactions [1]. The applications presented above are shown in figure 13.7. These were a few examples of real-life problems that have been modelled using system dynamics. As mentioned earlier, it is a good tool for understanding large complex systems.

Figure 13.7: Development of system dynamics models [1].

13.5 Basic concepts of system dynamics

The basic concepts of system dynamics are presented next. Stocks and flows are discussed and causal loops are presented. In addition, a simple model of the Finnish car population is built, together with a flight simulator. The model and the flight simulator are built using a demo version of the Powersim simulator. The model is by no means complete or verified; it is just a humble example.

13.5.1 Stocks and flows

First, stocks and flows are presented. We build an aging chain, which describes the aging of Finnish cars [5]. The Finnish cars are divided into three groups: 0-5 year old cars, 5-10 year old cars, and over 10 year old cars. The inflows to the model are brand new cars and old cars imported from Germany. The outflows are very old cars sold to Russia and wrecked cars. The stocks and flows are shown in figure 13.8. The correspondence between stocks and flows and tanks and valves is easy to see.
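Numerically, a stock is just a state variable and a flow is its rate of change, so the tank-and-valve analogy boils down to an Euler update of the stock level. A minimal sketch (the names and numbers below are illustrative only, not taken from the Powersim model):

```python
def step_stock(stock, inflow, outflow, dt=1.0):
    """One Euler integration step: the tank level changes by (in - out) * dt."""
    return stock + (inflow - outflow) * dt

# Illustrative run: a single stock with constant inflow and proportional outflow.
level = 100.0
for year in range(10):
    level = step_stock(level, inflow=50.0, outflow=0.2 * level)
    print(year + 1, round(level, 1))
```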

13.5.2 Causal loops

Causality is another basic concept of system dynamics; it is presented next. When dealing with stocks, the delays between the different storages must be defined. The first storage is for the new cars, the 0-5 year old cars.

Figure 13.8: Aging chain of Finnish cars.

The only inflow to it is newly purchased cars. After 5 years these cars move to the next storage, the 5-10 year old cars. This storage has another inflow, old cars imported from Germany. The originally new cars move from the old-car storage to the very-old-car storage after another 5 years, whereas the cars imported from Germany stay in the old-car storage for only 2 years (the imported cars are assumed to be 8 years old). The final storage, very old cars, holds all cars that are over 10 years old. The outflow from the very old cars is wrecked cars; very old cars are not sold to Russia in this model. The delays are added to figure 13.8 using causal loops, and the result can be seen in figure 13.9.

Figure 13.9: Delays of the Finnish car population.

The correspondence between causal loops and feedback control systems is easy to see.
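The aging transitions above are essentially transport delays: a car entering the 0-5 year stock leaves it exactly five years later. In code, such a delay can be represented as a pipeline, a first-in first-out queue whose length equals the delay. The sketch below only illustrates the idea with made-up numbers; it is not the Powersim implementation, and the German imports of equation (13.5) are left out for brevity.

```python
from collections import deque

class MaterialDelay:
    """Fixed transport delay: what flows in now flows out `delay` steps later."""
    def __init__(self, delay, initial_flow=0.0):
        self.pipeline = deque([initial_flow] * delay, maxlen=delay)

    def step(self, inflow):
        outflow = self.pipeline[0]        # oldest cohort leaves the stage
        self.pipeline.append(inflow)      # newest cohort enters the pipeline
        return outflow

aging_1 = MaterialDelay(delay=5, initial_flow=50_000)   # new -> old cars
aging_2 = MaterialDelay(delay=5, initial_flow=50_000)   # old -> very old cars

for year in range(8):
    purchases = 50_000 + 5_000 * year                   # illustrative ramp
    to_old = aging_1.step(purchases)
    to_very_old = aging_2.step(to_old)
    print(year, purchases, int(to_old), int(to_very_old))
```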

13.5.3 Equations behind the model

Even though system dynamics is usually considered a qualitative tool, numbers have to be entered into the model if simulations are to be run. Next, we define the ratios that control the inflows, the outflows, and the flows between the different age groups:

    New_cars(0) = 250000
    d(New_cars)/dt = Purchase_ratio - Aging_1                                 (13.1)

    Old_cars(0) = 250000
    d(Old_cars)/dt = Old_cars_from_Germany - Aging_2 + Aging_1                (13.2)

    Very_old_cars(0) = 500000
    d(Very_old_cars)/dt = -Wrecking_ratio - Very_old_cars_to_Russia + Aging_2 (13.3)

    Aging_1(t) = 50000                                   for t = 0, ..., 4
    Aging_1(t) = Purchase_ratio(t - 5)                   otherwise            (13.4)

    Aging_2(t) = 50000                                   for t = 0, 1, 2
    Aging_2(t) = 50000 + Old_cars_from_Germany(t - 2)    for t = 3, 4, 5
    Aging_2(t) = Aging_1(t - 5) + Old_cars_from_Germany(t - 2)   otherwise    (13.5)

    Wrecking_ratio = (New_cars + Old_cars + Very_old_cars)
                     - Total_amount_of_cars,   Wrecking_ratio >= 0            (13.6)

    Old_cars_from_Germany(0) = 0                                              (13.7)

    Purchase_ratio(0) = 50000                                                 (13.8)

    Very_old_cars_to_Russia = 0                                               (13.9)

The total amount of cars is defined as a curve, seen in figure 13.10. The number of cars is predicted to increase. The curve can be adjusted if more precise forecasts become available.

Figure 13.10: Total amount of cars (*1000) as a function of time.

In addition, different scenarios can be tested. The total amount of cars is considered independent of the number of new cars: if more new cars are bought, more old cars are wrecked; if fewer new cars are bought, people use their old cars longer and fewer cars are wrecked. This is of course an assumption. The model defined here describes the age distribution of Finnish cars. The model is rather simple but qualitatively correct. If tax revenue were added to the model, it would become much more interesting and realistic.
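To make the equations concrete, the sketch below re-implements the aging chain of equations (13.1)-(13.9) with a one-year time step. It is an independent approximation, not the Powersim model: the total-car curve of figure 13.10, the purchase policy and the average-age weights are all placeholder assumptions, so the printed numbers are only indicative.

```python
from collections import deque

def simulate_fleet(years=20, purchases=50_000, german_imports=0):
    """Euler simulation (dt = 1 year) of the three-stock aging chain."""
    new, old, very_old = 250_000.0, 250_000.0, 500_000.0
    aging_1 = deque([50_000.0] * 5, maxlen=5)   # new -> old, 5-year delay
    aging_2 = deque([50_000.0] * 5, maxlen=5)   # old -> very old, 5-year delay
    imports_pipe = deque([0.0] * 2, maxlen=2)   # German cars stay 2 years in "old"
    history = []
    for t in range(years):
        total_target = 1_000_000 + 10_000 * t    # placeholder for figure 13.10
        a1 = aging_1[0]                          # cars bought 5 years ago
        a2 = aging_2[0] + imports_pipe[0]        # eq. (13.5)
        wrecking = max(0.0, new + old + very_old - total_target)  # eq. (13.6)
        new += purchases - a1                    # eq. (13.1)
        old += german_imports + a1 - a2          # eq. (13.2)
        very_old += a2 - wrecking                # eq. (13.3), no export to Russia
        aging_1.append(purchases)
        aging_2.append(a1)
        imports_pipe.append(german_imports)
        history.append((new, old, very_old))
    return history

# A crude comparison of two policies via an average-age proxy (band midpoints).
for label, kwargs in [("more new cars", dict(purchases=80_000)),
                      ("more imports", dict(german_imports=30_000))]:
    new, old, very_old = simulate_fleet(**kwargs)[-1]
    avg_age = (2.5 * new + 7.5 * old + 13.0 * very_old) / (new + old + very_old)
    print(f"{label:14s}  fleet: {int(new + old + very_old):>9d}  "
          f"avg age ~ {avg_age:.1f} years")
```

In this crude sketch, extra new-car purchases lower the average-age proxy more than extra imports of 8-year-old cars do, which is in line with the conclusion drawn from the Powersim flight simulator below.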

13.5.4 Flight simulator

The third basic concept of system dynamics demonstrated here is a flight simulator; an example is built below. Jay Forrester tried to build a real flight simulator in the 1950s before he moved to system dynamics. The flight simulators are still here, although they are nowadays called management flight simulators. A management flight simulator is an easy-to-use front end to a dynamic model which illustrates the system behind it. It may have graphs and sliders, and the user can run and test different combinations. The idea is to learn the behaviour of the system and to make decisions based on this new knowledge. We try to build a simple flight simulator for our model of the Finnish car population.

The most important things to watch are the total amount of cars as well as the amounts in the different age groups. In Finland, we have the oldest cars in Europe; those cars are unsafe and their emissions are high. The goal of politicians should be to lower the average age of Finnish cars. They can do this by lowering the taxes on cars imported from Germany or the taxes on new cars sold in Finland. With this flight simulator, it is possible to test how the average age of the cars behaves if a great number of cars is imported from Germany or if a great number of new cars is bought.

Figure 13.11: Flight simulator of Finnish car population.

After testing different combinations, it can be seen that the only way to lower the average age of Finnish cars is to lower the taxes on new cars. Later on, the model can be developed further, and the taxes and prices of cars can be added to it.

13.6 Literature and research institutes

At the end of this chapter, the current literature on system dynamics and systems thinking is presented, the related software is discussed, and research institutes studying system dynamics and systems thinking are listed.

13.6.1 System dynamics literature

Jay Forrester invented system dynamics in the late 1950s; his work was referred to earlier when the history of system dynamics was presented. In the 1990s, after Peter Senge's book The Fifth Discipline, system dynamics and systems thinking seemed to gain popularity. Most likely the visually attractive software has brought system dynamics to wider attention. In addition, the interest in complex systems at the beginning of the new millennium offers new ground for system dynamics. The following books, related to Sterman's Business Dynamics, were published around 1990-2000; the list is from Amazon.com [8]:
• Jamshid Gharajedaghi, Systems Thinking: Managing Chaos and Complexity: A Platform for Designing Business Architecture, Butterworth-Heinemann (Trd); (May 1999)
• Barry Richmond, The "Thinking" in Systems Thinking: Seven Essential Skills, Pegasus Communications; (January 5, 2000)
• Stephen G. Haines, The Manager's Pocket Guide to Systems Thinking, Human Resource Development Pr; Pocket edition (February 1999)
• Joseph O'Connor and Ian McDermott, The Art of Systems Thinking: Essential Skills for Creativity and Problem Solving, Thorsons Pub; (April 1997)
• Peter Checkland, Systems Thinking, Systems Practice: Includes a 30-Year Retrospective, John Wiley & Sons; (September 16, 1999)
• Stephen G. Haines, The Systems Thinking Approach to Strategic Planning and Management, Saint Lucie Press; (June 13, 2000)
• Gerald M. Weinberg, An Introduction to General Systems Thinking, Dorset House; (April 15, 2001)

• Virginia Anderson and Lauren Johnson, Systems Thinking Basics: From Concepts to Causal Loops, Pegasus Communications; (March 1997)
• John D. W. Morecroft, Peter Senge and Jay Forrester, Modeling for Learning Organizations, Productivity Press; (August 2000)
Most of the literature discusses business and management issues. It can be seen that the ideas of system dynamics can motivate those who control complex systems.

13.6.2 Research institutes

• The MIT System Dynamics Group is the home of system dynamics and systems thinking. http://sysdyn.mit.edu/sd-group/home.html
• The System Dynamics Society is a good source of information about system dynamics. http://www.systemdynamics.org
• The Santa Fe Institute is a private non-profit institute with multidisciplinary projects, also focusing on complex systems. http://www.santafe.edu/

13.7 Conclusions

In this chapter, system dynamics and systems thinking were presented as qualitative tools for modelling complex systems. System dynamics has an almost 50-year history behind it. The best known application is the WORLD3 model, which the Club of Rome used in its book The Limits to Growth. System dynamics is actively used for solving real-life problems, and its history is closely bound to the development of digital computers and programming languages. Systems thinking shares ideas with network structures and is almost the reverse of the ideas behind hierarchical structures. In addition, system dynamics and systems thinking have thoughts in common with the system theory presented in the next chapter. System dynamics and systems thinking are ways to understand complex systems; they are not an ironic new science.

Bibliography

[1] Radzicki, M., U.S. Department of Energy's Introduction to System Dynamics – A Systems Approach to Understanding Complex Policy Issues, 1997. Available at http://www.systemdynamics.org/DL-IntroSysDyn [5.4.2003].
[2] Forrester, J., The Beginning of System Dynamics, banquet talk at the international meeting of the System Dynamics Society, Stuttgart, Germany, July 13, 1989. Available at http://sysdyn.clexchange.org/sdep/papers/D-4165-1.pdf [30.3.2003].
[3] Home page of System Thinking Europe Oy. Available at http://www.steoy.com/index.html [5.4.2003].
[4] Home page of the System Dynamics Society. Available at http://www.systemdynamics.org/ [5.4.2003].
[5] Sterman, J., Business Dynamics – Systems Thinking and Modelling for a Complex World, McGraw-Hill, 2000.
[6] Senge, P., The Fifth Discipline – The Art & Practice of The Learning Organization, Doubleday, New York, 1990.
[7] Forrester, M. & Forrester, N., Simple Beer Distribution Game Simulator. Available at http://web.mit.edu/jsterman/www/SDG/MFS/simplebeer.html [30.3.2003].
[8] List of books at Amazon.com. Available at http://www.amazon.com/exec/obidos/ASIN/0750671637/ref=pd_sxp_elt_l1/002-1327772-7212832 [30.3.2003].
[9] Sastry, A. & Sterman, J., Desert Island Dynamics: An Annotated Survey of the Essential System Dynamics Literature. Available at http://web.mit.edu/jsterman/www/DID.html [30.3.2003].


Session 14

Towards a Systemic View of Complexity?

Yuba Raj Adhikari
Laboratory of Process Control and Automation
Helsinki University of Technology, Finland

The concept of General System Theory is close to the theory of control systems. In this chapter, we try to find a systemic view of complexity from the viewpoint of system concepts. The first part of the chapter covers General System Theory as presented in Bertalanffy's book General System Theory [1]. In that book he argues that reductionism is inadequate and that modelling a system from the holistic viewpoint is the correct approach. One approach to looking at complex systems from a systemic perspective, as well as a way to control the emerging patterns, is demonstrated.

14.1 General System Theory

14.1.1 Introduction

"System theory" represents a novel paradigm in scientific thinking [1]. General system theory is comparable to the theory of evolution, which comprises about everything between fossil digging, anatomy and the mathematical theory of selection, or to behavior theory, which extends from bird watching to sophisticated neurophysiological theories. Broadly speaking, the three major aspects of the theory are system science, system technology and system philosophy.

System science deals with the scientific exploration and theory of "systems" in the various sciences (physics, biology, psychology, etc.), and with general system theory as the doctrine of principles applying to all systems. System technology reflects the fact that technology has been led to think not in terms of single machines but in terms of systems. System philosophy refers to the reorientation of thought and worldview ensuing from the introduction of "system" as a new scientific paradigm. It has been learned that for an understanding not only the elements but their interrelations as well are required; this is the domain of general system theory. Systems thinking plays a dominant role in a wide range of fields, from industrial enterprise and armaments to esoteric topics of pure science. Professions and jobs have appeared under names such as system design, system analysis and systems engineering; they are the very nucleus of a new technology and technocracy.

14.1.2 History

The idea of general system theory was first introduced by Ludwig von Bertalanffy, the author of the book General System Theory, prior to cybernetics, systems engineering and the emergence of related fields. There had been a few preliminary works in the field. Köhler's "physical gestalten" (1924) pointed in this direction but did not deal with the problem in full generality, restricting its treatment to physics. Lotka (1925) dealt with a general concept of systems, but being a statistician his interests lay more in population problems than in biological problems [1]. In the early 1920s, Bertalanffy advocated an organismic conception in biology that emphasizes consideration of the organism as a whole or system, and sees the main objective of the biological sciences in the discovery of the principles of organization at its various levels. His first statement goes back to Whitehead's philosophy of "organic mechanism", published in 1925. Cannon's work on homeostasis appeared in 1929 and 1932. In the first year of the Center for Advanced Study in the Behavioral Sciences (Palo Alto), the biomathematician A. Rapoport, the physiologist Ralph Gerard and Bertalanffy found themselves together. The project of a Society for General System Theory was realized at the Annual Meeting of the American Association for the Advancement of Science (AAAS) in 1954; its name was later changed to the Society for General System Research, affiliated with the AAAS. Local groups of the Society were established at various centers in the United States and subsequently in Europe.

Meanwhile, another development had taken place: Norbert Wiener's Cybernetics appeared in 1948, resulting from the then recent developments in computer technology, information theory, and self-regulating machines.

14.1.3 Trends in system theory

Miniskirts and long hair have been called a teenage revolution, and any new styling of automobiles or new drug introduced by the pharmaceutical industry can be termed revolutionary as well; but only in a strictly technical sense can one speak of "scientific revolutions". A scientific revolution is defined by the appearance of new conceptual schemes or paradigms. These bring to the fore aspects which previously were not seen or perceived, or were even suppressed in "normal" science. The system problem is essentially the problem of the limitations of analytical procedures in science. This used to be expressed by half-metaphysical statements such as "emergent evolution" or "the whole is more than the sum of its parts", but it has a clear operational meaning. "Analytical procedure" means that an entity investigated can be resolved into, and hence can be constituted or reconstituted from, parts put together, these procedures being understood both in their material and conceptual sense. This is the basic principle of "classical" science, which can be circumscribed in different ways: resolution into isolable causal trains, the search for "atomic" units in the various fields of science, etc. The application of the analytical procedure depends on two conditions. The first is that the interactions between the "parts" be non-existent or weak enough to be neglected for certain research purposes; only under this condition can the parts actually be worked out, logically and mathematically, and then be put together. The second condition is that the relations describing the behavior of the parts be linear; only then is the condition of summativity given, i.e., partial processes can be superimposed to obtain the total process. These conditions are not fulfilled in the entities called systems, i.e., entities consisting of parts in interaction. The prototype of their description is a set of simultaneous differential equations, which are nonlinear in the general case. A system, or "organised complexity", may be circumscribed by the existence of "strong interactions", or interactions which are nontrivial, i.e., nonlinear. The methodological problem of system theory, therefore, is to provide for problems which, compared with the analytical-summative ones of classical science, are of a more general nature. There are various approaches for dealing with such problems; the more important ones are the following [1]:

• Classical system theory
• Computerization and simulation
• Compartment theory
• Set theory
• Graph theory
• Net theory
• Cybernetics
• Information theory
• Theory of automata
• Game theory
• Decision theory
• Queuing theory.
A verbal model is better than no model at all. Mathematics essentially means the existence of an algorithm that is more precise than ordinary language. It may be preferable to first have some non-mathematical model, with its shortcomings but expressing some previously unnoticed aspect, hoping for the future development of a suitable algorithm, than to start with premature mathematical models that follow known algorithms and therefore possibly restrict the field of vision. Models in ordinary language have their place in system theory; the system idea retains its value even where it cannot be formulated mathematically. Within the system approach, there are mechanistic and organismic trends and models, which try to master systems either by "analysis", "linear causality" and "automata", or by "wholeness", "interaction" and "dynamics". These models are not mutually exclusive, and the same phenomena may even be approached with different models. The fundamental statement of automata theory is that happenings that can be defined in a finite number of words can be realized by an automaton (e.g., a Turing machine). An automaton can, by definition, realize a finite series of events (however large), but not an infinite one. What if the number of steps required is immense? To map them on a Turing machine, a tape of immense length would be required, i.e., one exceeding not only practical but physical limitations.

These considerations pertain particularly to a concept, or complex of concepts, which is indubitably fundamental in the general theory of systems: that of hierarchic order. The universe is seen as a tremendous hierarchy, from elementary particles to atomic nuclei, to atoms, molecules and high-molecular compounds, to the wealth of structures between molecules and cells, to cells, organisms, and beyond to supra-individual organizations. A general theory of hierarchic order will obviously be a mainstay of general system theory. The General System Theory extended by Bertalanffy in the 1950s is one of the existing classification frameworks that has been examined and tested on lists of complex systems of interest. The list as presented by Bertalanffy had a strong orientation towards his own discipline of biology and is summarized below [1,8]. This is the systems classification according to Bertalanffy:
• Static structures
• Clockworks
• Control mechanisms
• Open systems
• Lower organisms
• Animals
• Man
• Socio-cultural systems
• Symbolic systems.
In this list, each successive item is meant to be more complex and to some degree to incorporate the preceding entries. In addition, Bertalanffy suggests the theories and models useful at each level of the hierarchy.

14.1.4 Ideas in general system theory

There exist models, principles, and laws that apply to generalized systems or their subclasses, irrespective of their particular kind, the nature of their component elements, and the relations or forces between them. It seems legitimate to ask for a theory, not of systems of a more or less special kind, but of universal principles applying to systems in general.

In this way, General System Theory is postulated; its subject matter is the formulation and derivation of those principles which are valid for systems in general. A consequence of the existence of general system properties is the appearance of structural similarities, or isomorphisms, in different fields. For example, an exponential law of growth applies to certain bacterial cells, to populations of bacteria, of animals or of humans, and to the progress of scientific research measured by the number of publications in genetics or in science in general. The entities, such as bacteria, animals, men and books, are completely different, and so are the causal mechanisms involved, whereas the mathematical law is the same. Similarly, there are systems of equations describing the competition of animal and plant species in nature, and the same systems of equations apply in certain fields of physical chemistry and in economics as well. This correspondence is due to the fact that the entities concerned can be considered, in certain respects, as "systems", i.e., complexes of elements standing in interaction. Similar concepts, models and laws have often appeared in widely different fields, independently and based upon totally different facts. There are many instances where identical principles were discovered several times because the workers in one field were unaware that the theoretical structure required was already well developed in some other field. General system theory will go a long way towards avoiding such unnecessary duplication of labor. The major aims of general system theory are [1]:
1. There is a general tendency towards integration in the various sciences, natural and social.
2. Such integration seems to be centered in a general theory of systems.
3. Such a theory may be an important means for aiming at exact theory in the non-physical fields of science.
4. By developing unifying principles running "vertically" through the universe of the individual sciences, this theory brings us nearer to the goal of the unity of science.
5. This can lead to a much-needed integration in scientific education.

14.1.5 Open vs. closed systems

In a closed system, the final state is unequivocally determined by the initial conditions: for example, in the motion of a planetary system the positions of the planets at a time t are determined by their positions at a time t0, and in chemical equilibrium the final concentrations of the reactants depend on the initial concentrations. If either the initial conditions or the process is altered, the final state is also changed. This is not so in open systems: the same final state may be reached from different initial conditions and in different ways. An open system is defined as a system exchanging matter with its environment, presenting import and export, building-up and breaking-down of its material components. Living systems are basically open systems.
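This property of open systems, which Bertalanffy calls equifinality, can be made concrete with a standard textbook example (added here for concreteness, not taken from [1]). Consider a single stock x with a constant import p and an export proportional to its level:

$$\frac{dx}{dt} = p - kx \quad\Longrightarrow\quad x(t) = \frac{p}{k} + \Bigl(x(0) - \frac{p}{k}\Bigr)e^{-kt} \;\longrightarrow\; \frac{p}{k} \quad (t \to \infty).$$

The final level p/k depends only on the rates of exchange with the environment, not on the initial value x(0). In a closed system, by contrast, conservation ties the final state to the initial conditions; for a closed chemical reaction, for instance, the equilibrium concentrations depend on the initial amounts of the reactants.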

14.1.6 The system concept

Three different kinds of distinctions may be made when dealing with complexes of elements: distinctions according to the number, the species, and the relations of the elements. In the first two cases, the complex can be understood as the sum of its elements considered in isolation (summative characteristics); in the third case, not only the elements but also the relations between them must be known (constitutive characteristics). In the summative case the characteristics of the elements are the same within and outside the system; constitutive characteristics are those which depend on the specific relations within the complex. An example of the first type is weight or molecular weight; an example of the second type is chemical characteristics (e.g. isomerism). The meaning of the somewhat mystical expression "the whole is more than the sum of the parts" is simply that constitutive characteristics are not explainable from the characteristics of the isolated parts. The characteristics of the complex, compared to those of the elements, therefore appear as new or emergent. A system can be defined as a set of elements standing in interrelations. In mathematics a system can be defined in various ways; we can choose a system of simultaneous differential equations for different applications, e.g. the law of mass action, demographic problems, the kinetics of cellular processes, or the theory of competition within an organism. The differential equations can be used to describe the "growth" of the system, and a solution of the equations, the exponential law, is useful in various fields. When we talk about a system or a whole, every whole is based on the competition of its elements and presupposes the "struggle between parts"; the systems of differential equations also indicate competition between the parts.

One characteristic of the system can often be described as a power function of another characteristic (the allometric equations). Summativity in the mathematical sense means that the change in the total system obeys an equation of the same form as the equations for the parts; this is possible only in the linear case. There is a further case, which appears to be unusual in physical systems but is common and basic in biological, psychological and sociological systems, in which the interactions between the elements decrease with time. In this case the system passes from wholeness to a state of independence of its elements; we may call this progressive segregation. Progress is possible only by passing from a state of undifferentiated wholeness to a differentiation of parts. This implies that the parts become fixed with respect to a certain action, and therefore progressive segregation also means progressive mechanization. If, in a system of differential equations, the coefficients of one quantity are large and all the others are very small, the system is centered around this leading element; this is called the principle of centralization. The differential equations can also have different kinds of solutions that describe the finality of the system. The system concept asks for an important addition: systems are frequently structured so that their individual members are themselves systems of the next lower level. Such a superposition of systems is called "hierarchical order". For its individual levels, the aspects of wholeness and summativity, progressive mechanization, centralization, finality, etc., again apply [1].
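The growth and allometric relations mentioned above can be written out explicitly (standard forms, added here for concreteness):

$$\frac{dQ}{dt} = aQ \;\Rightarrow\; Q(t) = Q_0\,e^{at}, \qquad\qquad y = b\,x^{\alpha} \;\Leftrightarrow\; \log y = \log b + \alpha \log x.$$

The first is the exponential law whose solution has the same form whether Q counts bacteria, organisms or publications; the second expresses one characteristic of the system as a power function of another.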

14.2 Towards a New Science of Industrial Automation

There are still plenty of challenges to face in the field of automation and systems engineering. Because of new devices and technologies, there is an "information explosion" in the available process data, and the process models are also becoming more and more sophisticated. We need new approaches to cope with this wealth of data and models, and the theory of complex systems promises solutions to such problems [2]. The properties of a complex system can be summarized as follows:
• It consists of a large assemblage of interconnected, nonlinearly interacting parts.
• The parts evolve over time, adapting to the environment.

• They tend to be organized hierarchically.
• They obey decentralized control.
• The dynamics is basically both top-down and bottom-up.
• The overall behavior is self-organized, emergent, and consists of a non-equilibrium order.

Figure 14.1: An illustration of complexity [12].

Speaking very loosely, one can describe chaos as the study of how simple systems can generate complex behavior, whereas complexity is the study of how complicated systems can generate simple behavior [12]. There are various approaches to defining complexity; an attempt is made here to visualize some of them. Malanson states that "the goal of the science of complexity is to understand how simple, fundamental processes, derived from reductionism, can combine to produce complex holistic systems ..." [9]

... A system that is complex, in the sense that a great many independent agents are interacting with each other in a great many ways. (Waldrop 1993:11) [7]

... To understand the behavior of a complex system we must understand not only the behavior of the parts but how they act together to form the whole. (Bar-Yam, 1997:1) [7]

... You generally find that the basic components and the basic laws are quite simple; the complexity arises because you have a great many of these simple components interacting simultaneously. The complexity is actually in the organization—the myriad possible ways that the components of the system can interact. (Stephen Wolfram, quoted in Waldrop 1993:86) [7]

Let us list some examples of complex systems:
• Predator-prey relationships in natural ecologies
• The economic dynamics of the world market
• Turbulence in fluid flow
• The chaotic dynamics of global weather patterns
• The firing patterns of neurons in a human brain
• Information flow in the Internet
• The apparently goal-directed behavior of an ant colony
• The competing strategies of a nation's political infrastructure.
There are thus various examples of complex systems around us, and we can notice something fundamentally similar beneath the surface. However, the systems are so multifaceted that it is difficult to see what this underlying similarity is. It is assumed that there exist some underlying simple processes, so that when we iterate them massively, something qualitatively different comes out. The observed complexity is just an emergent phenomenon, and it would be enough to reveal the underlying simple function to completely understand the fundamentals [2,3]. Emergence is the appearance of higher-level properties and behaviors that, while obviously originating from the collective dynamics of the system's components, are neither found in nor directly deducible from the lower-level properties [12]. Wolfram's claim is that his cellular automata are the means to deal perfectly with any complexity [10].

We could say that this is only one approach, and perhaps even a bad modelling technique, because everything is reduced to elementary units; it is always permissible to put the metaphysical questions aside and to take up only the fruitful issues when developing a model [2]. The systems thinking approach may also help in dealing with complexity; systems thinking is claimed to be a better way to deal with our most difficult problems [11].

14.2.1 Towards a new paradigm?

Thomas Kuhn put forward the idea that there are paradigm shifts within a science [1]: things are seen from another point of view, and new conceptual tools are introduced. Systemic thinking may help in understanding complex systems and may reveal new challenges. A new, data-centered view connected to general system theory is demonstrated here [2]. The theory of complex systems may offer new tools for mastering complicated automation systems. At different levels of abstraction, the appropriate way of looking at the whole system changes altogether. A good example is the modelling of gases [2]:
• Elementary particles of a gas behave stochastically (quantum theory to be applied)
• Atoms behave deterministically (Newtonian ideal gas model)
• Groups of atoms behave stochastically (statistical mechanics)
• Large volumes behave deterministically (states described by pressure and temperature)
• Still larger volumes behave stochastically (turbulence and fluctuations becoming acute)
• A perfectly stirred volume behaves deterministically (ideal mixer model).
The same underlying laws still remain, but the lower-level tools are not the most economical ones for describing the complexity. The key point is to look at the phenomena from a higher abstraction level, ignoring the details of the physical components. Rather than concentrating on the actual realizations of the dynamic signals, one looks at the statistical and static relationships between the "qualifiers", or process parameters, and the corresponding "qualities", or process behaviors. It can be claimed that a higher-level statistical model emerges from the lower-level deterministic behaviors.

Extensive Monte Carlo simulations with slightly varied operating conditions should deliver the necessary information. The key concept for getting rid of the details, or reaching the higher level of abstraction, is presented in figure 14.2 [2,3,4].

Figure 14.2: A more abstract level of looking at a dynamic system.

Assuming that the necessary information is available, the data is unimodal, and the variations in the data are normally distributed, the data can be expressed by a multivariate Gaussian distribution, which offers a nice mathematical model. Most of the information (variation) is captured in the subspace determined by the most significant principal components. Unfortunately, when studying a complex system, the unimodality assumption cannot be made. When studying the properties of a distribution in general, one faces the data mining problem. There exist some structural properties that can be exploited, and special assumptions about the distribution can be made. It can be claimed that sparse components are best suited for modelling natural, multimodal data representing a Gaussian mixture model. The appropriate model is then piecewise linear, consisting of linear substructures that are spanned by the individual data clusters, each having a separate normal distribution. Mixture modelling (clustering) concerns modelling a statistical distribution by a mixture of other distributions. The different Gaussian mixture structures can be motivated in a rather natural way as below [2]:

• Continuous nonlinearity: smooth nonlinear functions can be approximated by locally linearising the function and using the nearest submodel to represent each sample.
• Clustered data: it can be assumed that there exist different operating modes and fault conditions.
• Independent components: they are capable of nicely capturing physically relevant phenomena.
Linear representations alone are too weak to capture real-life complexity: if linear structures are summed or combined, the resulting structure is still linear. Still, a linear (affine) basic underlying structure is assumed, and it is boosted with nonlinearity to make complex representations possible and to facilitate the emergence of sparsity.
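As a generic illustration of this piecewise-linear, mixture-of-Gaussians idea (a sketch using scikit-learn, not the actual modelling machinery of [2]): data from two artificial "operating modes" is generated, a two-component Gaussian mixture is fitted to it, and the dominant principal component of each cluster is taken as its local linear substructure.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Two artificial operating modes, each an elongated (locally linear) cloud.
mode_a = rng.normal([0, 0], [3.0, 0.3], size=(500, 2))
mode_b = rng.normal([8, 5], [0.3, 3.0], size=(500, 2))
data = np.vstack([mode_a, mode_b])

# Fit a two-component Gaussian mixture: each component is one local model.
gmm = GaussianMixture(n_components=2, random_state=0).fit(data)
labels = gmm.predict(data)

# Within each cluster, the first principal component spans the local
# linear substructure mentioned in the text.
for k in range(gmm.n_components):
    cluster = data[labels == k]
    pca = PCA(n_components=1).fit(cluster)
    direction = pca.components_[0]
    share = pca.explained_variance_ratio_[0]
    print(f"cluster {k}: {len(cluster)} samples, "
          f"main direction {np.round(direction, 2)}, "
          f"explains {share:.0%} of the local variance")
```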

14.2.2 Theoretical issues concerning the New Science

In complex systems research, it is assumed that interesting patterns emerge from massive iteration. But questions arise, such as when this is possible and what is interesting to begin with. To answer such questions, truly deep philosophical issues need to be studied; the key terms are epistemology, ontology and semantics [2]. For something meaningful and non-trivial to emerge automatically from some mindless mechanical machinery, the domain-specific meaning (semantics) must somehow be captured in the functions. The domain-area expertise and understanding is traditionally hand-coded into the model in symbolic form. However, if the models remain on a purely syntactic level, no real semantics can be buried in the data structures. Rather than speaking of the real semantics of the system, one can speak of its behavior in an environment and of its reactions. Simulations supply the grounding, the hard data, and the evaluation supplies the interpretation and expert understanding. This means that the proposed structure can carry some formalized semantics, and possibly the emergence of something interesting.

14.3 Conclusion

General system theory should be, methodologically, an important means of controlling and investigating the transfer of principles from one field to another, so that it will no longer be necessary to duplicate or triplicate the discovery of the same principles in different fields isolated from each other.

Bertalanffy also suggests the theories and models that are useful at each level of the hierarchy of the system structure. The holistic view of understanding a system is that not only the parts but also the interactions between them must be considered when the system is modelled. The most natural way to connect low-level models to high-level tools is simulation. The new, data-centered world view might be a new approach to be pursued within General System Theory. Mastering a complex large-scale system means understanding what is happening, and what is relevant and what is not in the wealth of data.

Bibliography

[1] Bertalanffy, L.: General System Theory: Foundations, Development, Applications (revised edition). George Braziller, New York, 1969.
[2] Hyötyniemi, H.: Emergence and Complex Systems — Towards a New Science of Industrial Automation? Proceedings of the 4th International Conference on Intelligent Processing and Manufacturing of Materials (IPMM'03), May 18–23, 2003, Sendai, Japan.
[3] Hyötyniemi, H.: Towards New Languages for Systems Modelling. Proceedings of the 42nd Scandinavian Simulation Conference (SIMS'02), September 26–27, 2002, Oulu, Finland.
[4] Hyötyniemi, H.: Emergent Phenomena and Optimization of Parameters. SIMS'02 Conference, Oulu, Finland, September 26–27, 2002.
[5] Hyötyniemi, H.: Studies on Emergence and Cognition — Part 1: Low-Level Functions. Finnish Artificial Intelligence Conference (STeP'02), December 16–17, 2002, Oulu, Finland.
[6] Hyötyniemi, H.: Studies on Emergence and Cognition — Part 2: Higher-Level Functionalities. Finnish Artificial Intelligence Conference (STeP'02), December 16–17, 2002, Oulu, Finland.
[7] http://www.new-paradigm.co.uk/complex-od.htm
[8] Lloyd, S.: Complexity — the State of the Art. ESD Internal Conference Extended Abstracts, MIT, March 25, 2002. http://esd.mit.edu/wps/esd-wp-2002-05.pdf
[9] http://geog.tamu.edu/~laura/classes/complexity.html (10 April 2003).
[10] Wolfram, S.: A New Kind of Science. Wolfram Media, Champaign, Illinois, 2002.
[11] http://www.thinking.net/Systems_Thinking/Intro_to_ST/intro_to_st.html (15 May 2003).
[12] Ilachinski, A.: Analytical Implication of Complexity. Systems and Tactics Team (Operating Forces Division), Center for Naval Analyses.

HELSINKI UNIVERSITY OF TECHNOLOGY
CONTROL ENGINEERING LABORATORY
Editor: H. Koivo

Report 132  Gadoura, I. A. Design of Robust Controllers for Telecom Power Supplies. September 2002.
Report 133  Hyötyniemi, H. On the Universality and Undecidability in Dynamic Systems. December 2002.
Report 134  Elmusrati, M. S., Koivo, H. N. Radio Resource Scheduling in Wireless Communication Systems. January 2003.
Report 135  Blomqvist, E. Security in Sensor Networks. February 2003.
Report 136  Zenger, K. Modelling, Analysis and Controller Design of Time-Variable Flow Processes. March 2003.
Report 137  Hasu, V. Adaptive Beamforming and Power Control in Wireless Communication Systems. August 2003.
Report 138  Haavisto, O., Hyötyniemi, H. Simulation Tool of a Biped Walking Robot Model. March 2004.
Report 139  Halmevaara, K., Hyötyniemi, H. Process Performance Optimization Using Iterative Regression Tuning. April 2004.
Report 140  Viitamäki, P. Hybrid Modeling of Paper Machine Grade Changes. May 2004.
Report 141  Pöyhönen, S. Support Vector Machine Based Classification in Condition Monitoring of Induction Motors. June 2004.
Report 142  Elmusrati, M. S. Radio Resource Scheduling and Smart Antennas in Cellular CDMA Communication Systems. August 2004.
Report 143  Tenno, A. Modelling and Evaluation of Valve-Regulated Lead-Acid Batteries. September 2004.
Report 144  Hyötyniemi, H. Hebbian Neuron Grids: System Theoretic Approach. September 2004.
Report 145  Hyötyniemi, H. (ed.) Complex Systems: Science at the Edge of Chaos - Collected papers of the Spring 2003 postgraduate seminar. October 2004.

ISBN 951-22-7507-4 ISSN 0356-0872 Picaset Oy, Helsinki 2005