Cellular Automata modelling paradigm - CiteSeerX

QIUWEN CHEN

UNESCO-IHE Institute for Water Education

CELLULAR AUTOMATA AND ARTIFICIAL INTELLIGENCE IN ECOHYDRAULICS MODELLING

Cellular Automata and Artificial Intelligence in Ecohydraulics Modelling

Cellular Automata and Artificial Intelligence in Ecohydraulics Modelling

DISSERTATION

Submitted in fulfilment of the requirements of the Board for Doctorates of Delft University of Technology and of the Academic Board of the UNESCO-IHE Institute for Water Education for the Degree of DOCTOR, to be defended in public on Tuesday, 1 June 2004 at 13:00 hours in Delft, The Netherlands

by Qiuwen Chen
Master of Science in Hydroinformatics with Distinction,
born in Huangmei, Hubei Province (China)


This dissertation has been approved by the promotor: Prof. dr. ir. A.E. Mynett, TU Delft / UNESCO-IHE Delft, The Netherlands

Members of the Awarding Committee:
Chairman: Rector Magnificus, TU Delft, The Netherlands
Co-chairman: Rector, UNESCO-IHE Delft, The Netherlands
Prof. dr. ir. A.E. Mynett, TU Delft / UNESCO-IHE, The Netherlands, promotor
Prof. dr. ir. G.S. Stelling, TU Delft, The Netherlands
Prof. dr. ir. G. Ooms, TU Delft, The Netherlands
Prof. dr. ir. E. Backer, TU Delft, The Netherlands
Prof. dr. L.R. Mur, University of Amsterdam, The Netherlands
Prof. dr. K. Jorde, University of Idaho, Boise, USA
Prof. dr. ir. M.J.F. Stive, TU Delft, The Netherlands, reserve member

Copyright © 2004 Taylor & Francis Group plc, London, UK. All rights reserved. No part of this publication or the information contained herein may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, by photocopying, recording or otherwise, without written prior permission from the publisher. Although all care is taken to ensure the integrity and quality of this publication and the information herein, no responsibility is assumed by the publisher nor the authors for any damage to property or persons as a result of operation or use of this publication and/or the information contained herein. Published by

A.A. Balkema Publishers, a member of Taylor & Francis Group plc. www.balkema.nl and www.tandf.co.uk

ISBN 90 5809 696 3 (Taylor & Francis Group)

Keywords: environmental hydroinformatics, cellular automata, artificial intelligence, eutrophication, ecohydraulics, computer modelling

The way to truth is long... I must keep on pursuing...
- Qu Yuan (340-278 B.C.)

Contents

Abstract
Samenvatting
Acknowledgement
List of symbols
List of abbreviations

1 Introduction
  1.1 Background
  1.2 Scope of the research
  1.3 Outline of the thesis

2 Cellular automata modelling paradigm
  2.1 Fundamentals of cellular automata
    2.1.1 A brief historical view
    2.1.2 Concepts of standard cellular automata
    2.1.3 Characteristics and behaviour of cellular automata
    2.1.4 Recent advances in cellular automata
  2.2 Some initial applications of cellular automata
    2.2.1 Lattice-gas modelling hydrodynamics
    2.2.2 Game of Life
  2.3 Cellular automata modelling prey predator dynamics
    2.3.1 Development of EcoCA
    2.3.2 Quantification of model results
  2.4 Effects of cell size and configuration
    2.4.1 Design of numerical experiments
    2.4.2 Experiment results
    2.4.3 Scales analysis and coupling
  2.5 Discussion

3 Rule based modelling techniques and application to algal blooms
  3.1 Algal blooms
  3.2 Fundamental factors affecting algal blooms
    3.2.1 Abiotic factors
    3.2.2 Biotic factors
  3.3 Approaches to algal bloom modelling
    3.3.1 Deductive approaches
    3.3.2 Inductive approaches
    3.3.3 Fuzzy logic and decision trees as rule based approaches
  3.4 Rule based algal bloom modelling in practice
    3.4.1 Fuzzy logical modelling cyanobacteria bloom in Taihu Lake
    3.4.2 Fuzzy logical modelling algal biomass in Dutch coastal waters
    3.4.3 Decision trees modelling Phaeocystis globosa bloom
  3.5 Discussion

4 Cellular automata and rule based techniques in ecohydraulics modelling
  4.1 Cellular automata modelling macrophytes competition and succession
    4.1.1 Description of study area
    4.1.2 Model development
    4.1.3 Results and discussion
  4.2 Integrated numerical and decision tree modelling of HAB
    4.2.1 Model development
    4.2.2 Case study in the Dutch coast
  4.3 Integrated numerical and fuzzy cellular automata modelling of HAB
    4.3.1 Description of study area
    4.3.2 Model development
    4.3.3 Results and discussion
  4.4 Discussion

5 Conclusions and recommendations
  5.1 Review of the research objectives
  5.2 Conclusions
  5.3 Recommendations

References

Appendixes
  1 EcoCA modelling environment
  2 CA modelling environment of macrophyte dynamics

Index
Curriculum Vitae

Abstract

Water and environment are recognised to be globally important issues that require international collaboration to secure the sustainable development of these scarce resources. Recently, the United Nations Summit in Johannesburg (2002) and the 3rd World Water Forum in Kyoto, Shiga and Osaka (2003) brought together tens of thousands of participants, focusing the world's attention on issues related to water and environment. The emphasis nowadays is on stimulating direct actions towards meeting the difficult challenges of improving water resources management, conserving natural resources and improving people's lives around the world. At the same time, the World Water Council has been promoting awareness and has been building political commitment on critical water issues like efficient conservation, planning, development and management of valuable resources. Each of the worldwide events emphasised the vital importance of information and communication technologies (ICT) and their role in knowledge sharing and technology transfer. UNESCO-IHE Institute for Water Education, formerly known as IHE Delft, is contributing to the education and training of water professionals and to building the capacity of sector organisations, knowledge centres and other institutions that are active in the fields of water, environment and infrastructure. Hydroinformatics originated at IHE Delft (1991) and has been actively involved in developing tools and technologies for engineering, management and knowledge dissemination related to the aquatic environment. The Environmental Hydroinformatics group at UNESCO-IHE is working closely together with WL | Delft Hydraulics and with TU Delft. The PhD research reported here was conducted within the framework of environmental modelling, with a special interest in the development of instruments for modelling eutrophication and aquatic ecosystem dynamics.

Ecohydraulics is an interdisciplinary subject which couples hydrodynamics and ecodynamics. It strives to achieve sustainable solutions to problems related to water and the environment, by developing simulation methods and decision support systems for assessing and protecting valuable ecosystems. These areas are receiving considerable attention in international research and application activities. Research fields cover water quality, environmental flows, eutrophication effects, algal blooms, river and lake restoration and wetlands dynamics. However, literature review shows that so far one of the major obstacles in the field of ecohydraulics concerns the integration of hydrodynamic and ecological modelling. In the few integrated ecohydraulics models available, the problem of coupling multiple spatial-temporal scales between hydrodynamic and ecological processes remains largely unresolved. The same is true for ways of defining and formulating appropriate ecological 'rules' from the limited data available. Some of the current ecohydraulics models start from a physical description of the hydrodynamic and transport processes, often formulated in terms of partial differential (conservation) equations. However, incorporation of chemical, biological and ecological processes into numerical algorithms is often still too complicated, in particular when not all processes are known or can (yet) be described in mathematical equations. Alternatively, other ecohydraulics models are based on aggregated formulations that lump species into biomass. These models may violate at least two fundamental ecological considerations, viz. (1) that individual species (properties) can be different from each other, and (2) that interactions often take place locally. As a result, the spatial properties and patchy dynamics may not be well captured.


The possibility to integrate ecological processes with numerical hydrodynamic simulation models still requires considerable research. Contributing to this field is precisely the interest of Environmental Hydroinformatics research at UNESCO-IHE and WL | Delft Hydraulics. In order to improve current ecohydraulics models and/or develop new modelling tools, the research in this thesis focuses on exploring possible alternative modelling paradigms and simulation techniques. First, the applicability of cellular automata (CA) to ecological modelling is investigated. Cellular automata are mathematical systems which are discrete in time and space, while the dynamics are driven by local evolution rules. Cellular automata are known to have the capability of achieving spatial heterogeneity through local interactions. Numerical experiments based on the "Game of Life" and EcoCA (cellular automata based prey-predator modelling) indicate that cellular automata could well be a relevant paradigm in ecological modelling. However, the definition and implementation of local evolution rules still requires further research. In order to derive these rules from the often limited data available from in-situ measurements, techniques like fuzzy logic rule-based systems are explored in this research, combining data with expert knowledge from biologists and ecologists. Within the context of the EU-HABES project (Harmful Algae Blooms Expert System), headed by WL | Delft Hydraulics and supervised by an international expert panel, an expert system was developed for a range of species and conditions. Case studies on algal bloom prediction in Taihu Lake, China and along the Dutch coastal waters showed that fuzzy logic can be a useful technique in ecohydraulics modelling, especially if only limited data are available.

The fuzzy logic technique developed within this thesis was integrated into the cellular automata model to formulate ecological rules; the resulting integrated FuzzyCA module was then coupled with the hydrodynamic module of Delft3D. Application of the resulting model along the Dutch coastal zone showed promising results. From this, the general conclusion is drawn that it seems feasible to integrate numerical techniques with data analysis procedures and knowledge base systems, leading to practical tools in ecohydraulics modelling. Nowadays, advanced software systems are a prominent way to encapsulate knowledge. This research took the available Delft3D hydrodynamic and transport modules as a starting point and contributed to the extension of the ecological module. The research results are to be embedded in the well-established Delft Hydro Software system of WL | Delft Hydraulics.
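The abstract characterises cellular automata as systems that are discrete in time and space and driven by local evolution rules, and it names the "Game of Life" as one of the numerical experiments. As a minimal sketch of that idea, and not the thesis's own EcoCA code, the following Python fragment performs one synchronous update of Conway's Game of Life. The function name `life_step` and the periodic (wrap-around) boundary are assumptions made for illustration only.

```python
def life_step(grid):
    """Return the next generation of Conway's Game of Life.

    `grid` is a list of rows holding 0 (dead) or 1 (alive).
    Periodic boundaries are an assumption of this sketch.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count the eight Moore neighbours of cell (r, c).
            neighbours = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # Local rule: a live cell survives with 2 or 3 live
            # neighbours; a dead cell becomes alive with exactly 3.
            if grid[r][c] == 1 and neighbours in (2, 3):
                new_grid[r][c] = 1
            elif grid[r][c] == 0 and neighbours == 3:
                new_grid[r][c] = 1
    return new_grid


# A "blinker": three live cells in a row, which oscillates with period 2.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
after_one = life_step(blinker)
after_two = life_step(after_one)
```

Every cell is updated from the previous generation only, so the rule is purely local and synchronous; the spatial pattern emerges from neighbour interactions rather than from any global equation.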

Samenvatting (Summary)

Water and the environment are seen worldwide as important areas of concern for the coming century. International collaboration will be needed to secure the sustainable development of scarce resources. At the recent United Nations Summit in Johannesburg (2002) and the 3rd World Water Forum in Kyoto, Shiga and Osaka (2003), tens of thousands of participants brought the world's water problems to the attention of science, society and politics. The emphasis is currently on stimulating concrete actions to confront the growing problems, in order to preserve natural resources for the future. At the same time, the World Water Council strives to raise awareness and political will in order to promote sustainable management of water and the environment. All these bodies emphasise the crucial importance of information and communication technology (ICT) for exchanging and sharing knowledge and experience. UNESCO-IHE Institute for Water Education, formerly known as IHE Delft, contributes to this through education and research aimed at developing countries all over the world, and stimulates knowledge transfer to institutions and organisations concerned with water and the environment. Hydroinformatics, which originated at IHE Delft, seeks to exploit the possibilities of information and communication technology to help solve problems in the field of water and the environment. To that end, computer-based methods and techniques are developed that can be applied to analysing data, simulating proposed measures and supporting policy decisions. The Environmental Hydroinformatics group at UNESCO-IHE works closely with WL | Delft Hydraulics and with TU Delft.

The PhD research in this thesis focuses on developing a set of software instruments that can warn of the occurrence of harmful algae. Within the interdisciplinary field of ecohydraulics, hydrodynamics is coupled with ecodynamics, with the aim of simulating the behaviour of integrated systems and assessing the effect of intended measures. There is much international interest in this; research topics include modelling water quality, assessing the effects of eutrophication, restoring natural river systems, etc. One of the challenges within ecohydraulics concerns the integration of different techniques into a reliable and coherent (software) instrument suitable for practical applications. Identifying the characteristic time and length scales of the processes involved always remains a difficult task; coupling processes with widely differing characteristics requires a major effort every time. The same holds for analysing the often scarce measurement data from the field, and for deriving characteristic processes from them. Some ecohydraulic models are based on physical principles such as conservation of mass, momentum and energy, which can be expressed by mathematical equations. However, it is not always possible to include the chemical, biological and ecological processes in these equations. This holds in particular when not all processes are (yet) known and understood in detail. Hence many ecological models restrict themselves to describing the behaviour of aggregated quantities, such as biomass. These models naturally have their limitations, since (1) no account is taken of the different species that make up this biomass, and (2) local interactions between these species are neglected. As a consequence, details in spatial structures cannot be represented well.

The integration of numerical simulation models for hydrodynamics and ecodynamics still requires considerable research; contributing to this is precisely the focus of the Environmental Hydroinformatics research at UNESCO-IHE and WL | Delft Hydraulics. This thesis therefore investigates the possibilities that Cellular Automata (CA) may offer in the field of ecological modelling. CA are discrete mathematical systems in space and time, based on relatively simple local interactions with which complex spatial structures can be realised. Numerical experiments based on the "Game of Life" and EcoCA (a software system developed in the context of this thesis for simulating predator-prey behaviour) show that CA can be a highly relevant technique for describing ecosystem dynamics, although establishing the evolution rules still requires considerable research. One possibility would be to derive these rules from measurement data, although these often have limited resolution in time and space. These data therefore often have to be supplemented with knowledge of the behaviour of the underlying processes. Techniques that lend themselves well to this are Fuzzy Logic (FL) and Expert Systems. Within the recent EU project HABES (Harmful Algae Blooms Expert System), an international consortium for which WL | Delft Hydraulics acted as project leader, the knowledge of various international experts was brought together in a software system.

Applications to Taihu Lake (China) and to harmful algal blooms along the Dutch coast indicate that fuzzy logic in combination with expert systems is a very useful technique for modelling ecosystems, certainly when relatively few specific data are known for the area concerned. The FL technique developed in this thesis was used to provide the CA module with evolution rules; the resulting FuzzyCA system was then integrated with the hydrodynamic and transport module of Delft3D and applied to a practical case for the Dutch coast. The results appear promising. The general conclusion is therefore that integrating numerical and data-oriented techniques in combination with knowledge systems can lead to a practically usable set of modelling instruments in the field of ecohydraulics. Since software systems are becoming an increasingly important instrument for storing and transferring knowledge, the results of the research described in this thesis are implemented in the Delft Hydro Systems of WL | Delft Hydraulics.

Acknowledgements

This PhD thesis is a result of the research programme in Environmental Hydroinformatics at UNESCO-IHE, carried out in close collaboration with WL | Delft Hydraulics (WL) and Delft University of Technology (TU Delft). The research was financed by the Department of Strategic Research & Development (S&O) of WL, as part of its ongoing strategic research and development programme. I would like to express my sincere appreciation to all organisations and persons that directly or indirectly motivated me to conduct this research. Special thanks are due to UNESCO-IHE, in particular to the department of Hydroinformatics and Knowledge Management (HIKM), for the platform it provided for education and research. When starting my MSc thesis at IHE, exploring the CA paradigm, I could not suspect that I had actually set out on a much longer way ... to pursue PhD research. This was made possible by the financial support from WL | Delft Hydraulics. I am very grateful for this and for having experienced the stimulating working environment at the S&O Department. Without this, it would have been very hard to carry out this research. Also, the cooperation with Delft University of Technology that developed during the course of this research programme is much appreciated and could well be the starting point of continued future collaboration.

My deepest appreciation goes to my promotor Professor Arthur Mynett, who provided me with this opportunity and offered me guidance when I most needed it. As a professor in Environmental Hydroinformatics at UNESCO-IHE and TU Delft, he introduced me to the long way of doing tedious research and taught me how to persevere even in difficult times. As Head of the S&O Department at WL | Delft Hydraulics he showed me the importance of planning my research and communicating intermediate results with colleagues from other departments and disciplines. Moreover, he provided me with the opportunity to travel with him to workshops and conferences around the world, allowing me to gain experience in presenting my work and stimulating me to publish in scientific journals.

My sincere gratitude is given to Dr. Anthony Minns (IHE / WL) for introducing me to the topic of CA, to Dr. Henk van den Boogaard (WL) and Dr. Vincent Guinot (IHE) for guiding me through the initial steps of my research, and to Emeritus Professor Michael Abbott (IHE / WL) for stimulating me right from the start. Dr. Hans Goossens, Ir. Anouk Blauw and Drs. Hans Los provided considerable support on delicate biological and ecological issues. I am much indebted to Prof. Guus Stelling (TUD) and Ir. Leo Postma (WL) for their strict reviews and constructive discussions.

I am much indebted to my parents and to my wife Mrs. Hengjin Xia, who continuously sent me their long distance support during my studies in Delft. Their love and care always accompanied me on my walk through the long journey ... and helped me to persevere. I sincerely hope that soon I will be able again to take up my responsibility as son and husband. I deeply appreciate my former employers Mr. Yuan Hongren and Mr. Weng Lida, who provided me with the first chance to come to study in Delft and encouraged me all the time to keep on pursuing. I deeply acknowledge the encouragement from my friend Mrs. Ying Li, who also coached me in playing badminton. I wish her every success in her upcoming PhD defence. I am also grateful to all my friends: we help and learn from each other.

Qiuwen Chen
March, 2004
Delft, The Netherlands

List of notations

Roman lower case
a            intrinsic growth rate of prey
a_ij         area of patch ij
S_ij^t       state of cell ij at time t
SDI          Shannon diversity index
K_d          light attenuation coefficient
             coefficient of determination
RMSE         root mean square error
S_r          samples of subset r in samples S
SD(S)        standard deviation of sample S
             source term
T            temperature                                                [°C]
T_opt        optimal growth temperature                                 [°C]
U            uptake rate                                                [μmol/h]
U_max        maximum uptake rate                                        [μmol/h]
             random variable                                            [-]
             k dimensional time series of length n
z            water depth                                                [m]
z_c          compensation depth                                         [m]
z_cr         critical depth                                             [m]
z_m          mixing depth                                               [m]
z_p          photic depth                                               [m]
[HCO3-]      carbon oxidise concentration                               [mol/l]
[NH4]        ammonia concentration                                      [mg/l]
[NO2-]       nitrite concentration                                      [mg/l]
[NO3-]       nitrate concentration                                      [mg/l]

Greek lower case
α            functional response of prey to predator                    [day^-1]
β            functional response of predator to prey                    [day^-1]
             standard deviation of variable k
             Newton iteration convergence factor
             scale factor of Wavelet transform
             learning rate
             time step                                                  [unit time]
             lattice space                                              [unit length]
             neighbourhood function
             centring point of Wavelet transform
             k dimensional normalised time series of length n
             normalised value of variable k in time series of length n
ρ            mass density of particles
             parameter in spatial correlation
ρ_x,x+Δx     spatial correlation between x and x+Δx
             derivative on time
             sinking or buoyancy velocity of species                    [m/s]
ν            lattice viscosity
ν_i          the fired degree of rule i                                 [-]
ω            species concentration                                      [cell/l]
ω(z,t)       species concentration at depth z, time t                   [cell/l]
ω*           species concentration at stationary state                  [cell/l]
             Wavelet transform
             Wavelet variance
μ            growth rate                                                [day^-1]
μ_max        maximum growth rate                                        [day^-1]
μ(I)         light-dependent specific growth rate                       [day^-1]
μ_B(x)       membership function of fuzzy set B of variable x           [-]
             collision operator
∞            infinity
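Two of the quantities in the notation list, the diversity index (SDI) and the root mean square error (RMSE), are standard summary statistics. The sketch below assumes that SDI denotes the usual Shannon index, H = -Σ p_i ln p_i over the proportions p_i of each class, and that RMSE has its conventional definition; the thesis may apply variants of either, and the function names `shannon_diversity` and `rmse` are hypothetical.

```python
import math


def shannon_diversity(counts):
    """Shannon diversity index H = -sum(p_i * ln p_i).

    `counts` holds the abundance of each class (for example, the number
    of cells occupied by each species). Empty classes are skipped.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Two equally abundant classes give H = ln 2; a single class gives H = 0.
h_even = shannon_diversity([10, 10])
h_single = shannon_diversity([5])
error = rmse([0.0, 0.0], [3.0, 4.0])
```

Under these definitions a larger SDI indicates a more even mix of classes, while RMSE measures the typical deviation between model output and observations in the units of the observed variable.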

List of abbreviations

AI        artificial intelligence
ANN       artificial neural networks
ANOVA     analysis of variance
AFDW      ash free dry weight
ARNN      auto-regressive neural networks
ASP       amnesic shellfish poisoning
AZP       azaspiracid shellfish poisoning
BOD       biological oxygen demand
CA        cellular automata
CFP       ciguatera fish poisoning
Chla      chlorophyll a
DIN       dissolved inorganic nitrogen
DIP       dissolved inorganic phosphorus
DO        dissolved oxygen
DOP       dissolved organic phosphorus
DSP       diarrhetic shellfish poisoning
EcoCA     CA based ecological model developed by author
EU        European Union
FFT       fast Fourier transform
FHP       Frisch, Hasslacher, Pomeau
FL        fuzzy logic
FuzzHAB   CA based fuzzy logic model for HABs developed by author
GA        genetic algorithm
GIS       geographic information system
HABES     harmful algal bloom expert system
HABs      harmful algal blooms
HPP       Hardy, de Pazzis, Pomeau
IAHR      international association of hydraulics research
IBM       individual based modelling
LV        Lotka-Volterra
MLP       multiple layer perceptron
NS        Navier-Stokes
NSP       neurotoxic shellfish poisoning
NW        Noordwijk
ODE       ordinary differential equations
OMS       open modelling systems
OOP       object-oriented programming
PAR       photosynthetically active radiation
PCA       principal component analysis
PDE       partial differential equations
PSP       paralytic shellfish poisoning
RS        remote sensing
RMSE      root mean square error
SOFM      self organization feature maps
TBA       Taihu Basin Authority

Chapter 1

Introduction

1.1 Background

As a consequence of global population growth, increased urbanisation and non-sustainable exploitation of water resources around the world, parts of the global aquatic environment are deteriorating rapidly. The loss of biodiversity and the degradation of ecosystems are increasing. The impacts on local livelihoods and economies are apparent in many places around the world (WWF3, 2003). In order to achieve sustainable development, the 1st World Water Forum in Marrakech (1997), the 2nd World Water Forum in The Hague (2000) and the 3rd World Water Forum in Kyoto, Shiga and Osaka (2003) brought together tens of thousands of participants. The aim was to focus the world's attention on the importance of aquatic resources and stimulate direct action towards meeting the difficult challenges of improving water resources management, conserving natural resources and improving people's lives around the world. Meanwhile, the World Water Council has been promoting awareness and building political commitment on critical water issues at all levels, including the highest decision-making level, to facilitate the efficient conservation, protection, development, planning, management and use of water in all its dimensions on an environmentally sustainable basis for the benefit of all life on earth. These worldwide events all identified that information and knowledge sharing is vital to the effective management of water resources and to environmental sustainability (WWF3, 2003). At the United Nations Johannesburg Summit (2002), it was concluded that technology development and transfer are required to promote world water security (Rijsberman, 2003).

UNESCO-IHE, Institute for Water Education, formerly known as IHE Delft, is contributing to the education and training of water professionals and to building the capacity of sector organisations, knowledge centres and other institutions active in the fields of water, environment and infrastructure in developing countries and countries in transition. Evolving water management and policy issues create new information and knowledge demands, while advancing information and communication technologies open up new avenues for knowledge partnership and collaboration across geographic, political, scholarly, societal and institutional boundaries. The field of Hydroinformatics, which originated at IHE Delft (Abbott, 1991), centres on applying advanced information technology to water related problems. Hydroinformatics takes computational hydraulics as its core (Abbott and Minns, 1998) and encapsulates many other techniques, in particular artificial intelligence (Babovic, 1996). Since its emergence, Hydroinformatics has been actively involved in developing tools and technologies for engineering, management and knowledge dissemination oriented to the aquatic environment (Price and Mynett, 2002). The Environmental Hydroinformatics group at UNESCO-IHE is working closely together with WL | Delft Hydraulics on ecohydraulics issues (Mynett, 2002a, 2002b). Ecohydraulics is an interdisciplinary subject which couples hydrodynamics and ecodynamics. It develops ecosystem approaches to water management, methods for assessing and protecting ecosystems that provide invaluable goods and services to societies, and ways in which sustainable management can lead to conservation in order to improve people's lives. Important related topics include the restoration of wetlands, lake conservation, environmental flows, and water treatment.

The PhD research reported here was conducted within the framework of Environmental Hydroinformatics, with a special interest in the development of instruments for modelling eutrophication and predicting harmful algal blooms (HABs). HABs in fresh or brackish water involve two groups of organisms: the toxin producers and the high biomass producers. The toxin producers can kill (shell)fish and contaminate seafood, causing illness or even death of its consumers. The high biomass producers can lead to anoxia and deterioration of aquatic ecosystems after reaching dense concentrations. However, no matter what kind of organisms they are, all the blooms usually exhibit strong local behaviour and patchy dynamics, as observed from satellite or remote sensing images (Fig 1.1-1.2).

Taihu Lake is located in the most developed and highly populated Yangtze Delta, and is the third largest shallow freshwater lake in China. Due to excessive discharge from agriculture and sewage, the lake exhibits a serious eutrophication problem. The nearly annual Cyanobacteria (blue-green algae) bloom in the lake (Fig 1.1) has caused great problems for aquatic fisheries and the water supply to the Yangtze Delta (TBA, 1998; Chen, 2001). To support sustainable development of the Yangtze Delta, some restoration projects are under implementation, aimed at improving the water quality and aquatic ecosystem of the lake. However, before initiating any engineering work, it is indispensable to have instruments to study the efficiency of any such projects.

Fig 1.1 Remote sensing image of Cyanobacteria bloom in Taihu Lake, China

In the tropical Hong Kong waters of the South China Sea, there is a long record of harmful algal blooms because of the warm weather and high input from the Pearl River. The blooms lead to great economic loss due to fish mortality, shellfish contamination and even human casualties (Lee et al., 2003). In 2000, the algal blooms in the Hong Kong waters killed 650 tons of fish and resulted in a loss of $70 million (Fig 1.2). Although pollution abatement and water quality improvement are essential ways to mitigate the disasters, they usually take as long as decades. Hence, for the time being, forecasting and early warning of HABs are seen to be important measures to minimise the economic loss (Lee, 2002).


Fig 1.2 Algal bloom in Mirs Bay, Hong Kong in 2000 (left) and the fish mortality (right)

Eutrophication and HABs are not only problems in developing countries, but also serious issues in developed regions. Lake Veluwe (Fig 1.3) is an artificially isolated part of the large Lake IJssel in the centre of the Netherlands. It was formed by the construction of dams in the south-east part of the IJsselmeer in 1952. The main functions of this lake are groundwater retention and recreation. According to the long-term documentation, the submerged vegetation in the lake has shifted greatly since its formation due to the change of nutrient loading (Van den Berg, 1999). Before 1968, the water in the lake was clear, with diverse macrophyte vegetation. From 1970 to 1989, the water became turbid and Chara aspera completely disappeared, and blue-green algae blooms occurred occasionally due to eutrophication caused by the discharge of wastewater from some small cities around the lake. From 1979, measures were taken to reduce the phosphorus loading in the lake (Hosper, 1997), including wastewater treatment and lake flushing. After 1990, C. aspera colonised steadily and gradually replaced the dominance of Potamogeton pectinatus. In order to review and understand the effectiveness of the restoration measures, there is a demand for research to investigate the competition and succession processes between P. pectinatus and C. aspera.

Fig 1.3 Location of Lake Veluwe, the Netherlands

Yet another application is related to the Dutch coast, which receives drainage from the rivers Rhine and Meuse and is one of the most productive fishing areas in the world. In the past 20-50 years, the increase of nutrients discharged by the rivers has led to eutrophication of the coastal zones. Spring phytoplankton blooms dominated by diatoms and Phaeocystis globosa occur frequently (Fig 1.4). The blooms of P. globosa usually bring thick foams to the beach and terrible smells under certain wind conditions, which severely hamper recreation. The sedimentation of dead algae can cause anoxic conditions in the bottom layer, which can lead to massive death of mussels and great economic loss. The P. globosa bloom in 2001 alone caused a loss of €20 million (Peperzak, 2002). The EU-HABES (Harmful Algal Blooms Expert System) project is especially dedicated to the construction of a rule based warning system for HABs in European coastal waters. One of the goals of the project is to bring together available knowledge on the physical and biological processes involved.

Fig 1.4 Satellite image of P. globosa bloom along the Dutch coast of the North Sea in 2003 (left), and the foams after a P. globosa bloom (right)

In general, eutrophication and HABs are serious problems for global aquatic ecosystems. Prevention of water pollution and protection of aquatic ecosystems are becoming mandatory issues to be considered in water-related activities. This implies that engineers, scientists and decision-makers alike urgently need effective tools for integrated management of the aquatic environment. Ecohydraulics modelling can be an important instrument to support these demands (Franks, 1997).

1.2 Scope of the research

Ecohydraulics modelling is an interdisciplinary subject that tries to relate ecological processes to the hydrodynamics of open water bodies. This area is drawing increasing attention in international research and application activities such as the IAHR Hydroinformatics Conferences (Mynett 2002a, 2004), the NATO Advanced Research Workshop (Mynett, 2001), the Regional Workshop on Coastal Eutrophication in Hong Kong (Mynett, 2002b) and the Third World Water Forum (Mynett, 2003). However, literature review shows that so far few breakthroughs have been made in this field, especially with respect to the integration of hydrodynamic and ecological modelling (Donaghay and Osborn, 1997; Chapra, 1998). Most of the currently available ecohydraulics models start from a physical description of the hydrodynamic and transport processes formulated in terms of partial differential equations (PDEs). However, due to the high complexity and non-linearity of ecosystem behaviour, the possibility of formulating ecological processes in this way still requires considerable research. As a result, the validity of these models remains doubtful (Recknagel, 1997). Besides, these models are usually aggregated, lumping species into biomass and formulating the dynamics as ordinary or partial differential equations, e.g. the Lotka-Volterra model for the description of prey-predator systems (May, 1976). They may violate at least two fundamental ecological considerations, viz. (1) that properties of individual species can be significantly different from each other, and (2) that interactions often take place locally. Therefore, these models usually fail to capture the local behaviour and patchy dynamics that are commonly exhibited in most ecosystems, and the modelled results sometimes become categorically unacceptable (Minns, et al., 2000; Wootton, 2001; Chen et al., 2002; Chen and Mynett, 2003).

In the fields of eutrophication and HABs, it has been widely recognised that the problems are multidisciplinary and complex since hydrodynamic, chemical and biological processes take place simultaneously. The accurate modelling of the underlying physical processes is the key to the prediction of the initiation, growth, competition and species composition (Huisman, et al., 1999; 2002; Chen et al., 2004; Chen and Mynett, 2004). The performance of eutrophication and red tide models has so far been restricted by the insufficient ability to integrate both the biological and the underlying physical processes (Chapra, 1998; Verkhozina, et al., 2000; Chen, et al., 2003). Therefore, a reliable model to predict initiation, transport, competition, succession and colonisation of aquatic species should successfully couple hydrodynamic modules with ecological modules (Chen, 2000). The integration of hydrodynamic and ecological processes consequently brings up the issue of spatial-temporal scale analysis and multiple scales coupling.
Spectral analysis and wavelet analysis are particularly suitable to investigate spatial-temporal scales of processes and are presently employed to examine the characteristic scales of physical and biological properties (Moliner, et al., 1996; Deutschman, et al., 1996; Katul, et al., 2001; Borcard and Legendre, 2002). Once determined, the principal spatial-temporal scales can be used for multiple scales coupling (Lunati, et al., 2001; Wu and David, 2002; Chen et al., 2002).

In order to take into account spatial heterogeneity and local interactions and to capture pattern dynamics, cellular automata (CA) and individual based modelling (IBM) paradigms are explored in our research group. The IBM expresses the population dynamics by describing the individuals and the messages passing between them to trigger appropriate behaviours. It has been widely used in modelling of fish and forest dynamics, and is expected to be a promising ecological model paradigm in the coming years (DeAngelis and Gross, 1992; Masahiko, et al., 1991; Hong, et al., 1999). Several important research teams including Oak Ridge National Laboratory, Applied Ecology at University of Amsterdam, Institute for Forestry and Nature Research in University of Wageningen, UNESCO-IHE and WL | Delft Hydraulics (Mynett, 1999) are heading the application of IBM. Current research activities include prediction of density-dependent dynamics of smallmouth bass populations (DeAngelis and Godbout, 1991), super-individuals for modelling large populations (Scheffer, et al., 1995), modelling of marine fish early life history (Hinckley, et al., 1996), modelling of predator-prey functional response (Blaine and DeAngelis, 1997), expert systems for animal foraging simulation (Carter and Finn, 1999), modelling of juvenile salmon migration (Petersen and DeAngelis, 2000), and modelling of lake and river (shell)fish communities (McDermot and Rose, 2000; Morales, et al., 2003). However, they are still at the theoretical and analytical stage; besides, the integration with underlying physical processes is not always well explored (Anderson, 1995; Chapra, 1998).


CA based models attempt to reproduce complex spatial-temporal dynamic patterns by some simple local evolution rules. A well-known example is Conway's game of life (Minns, et al., 2000). Cellular automata have been shown to be a viable paradigm in spatially-explicit models (Wolfram, 1984; Wortmann, et al., 1997; Chen et al., 2002) and have been widely applied to ecological modelling (Karafyllidis and Thanailakis, 1996; Wootton, 2001; Chen and Mynett, 2003). Recent research in CA and its application to ecosystems on a large scale includes the evolution of urban land-use patterns (Engelen, et al., 1998), vegetation dynamics (Balzter et al., 1997), rainforest dynamics (David and Richard, 2000) and many others. On a smaller scale, recent studies cover the population dynamics of animals (Gronewold and Sonnenschein, 1998; Sirakoulis, et al., 2000) and the spreading of phytoplankton species (Babovic and Baretta, 1996). Some other applications include patchiness analysis of marine species (Caswell and Etter, 1996) and prey/predator system dynamics (Minns, et al., 2000).

Compared to aggregated models, both IBM and CA type models have the flexibility to implement individual property differences and local interactions. The main difference is that IBM takes individual species as the study object (similar to a Lagrangian approach in classical fluid mechanics), while a CA model schematises the investigated space into cells and takes each individual cell as the study object (similar to an Eulerian approach). However, eutrophication and HABs are not only governed by local interactions, but also determined by external forcings. The evolution rules of traditional cellular automata depend entirely on geometric relations, which are obviously not enough for eutrophication and HABs modelling. One potential approach is to extend conventional cellular automata to include external forcings in the local evolution rules, either in a deterministic and/or a stochastic way.
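The idea of adding an external forcing to a local rule can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the function names, the 0/1 cell states, the forcing field scaled to [0, 1], and the threshold are assumptions, not the rules developed in this thesis. A cell can become or stay occupied only if it or one of its Moore neighbours is occupied (the geometric part), and the forcing then decides the outcome either deterministically or stochastically.

```python
import random

# Moore neighbourhood offsets (the eight surrounding cells).
MOORE = [(-1, -1), (-1, 0), (-1, 1),
         (0, -1),           (0, 1),
         (1, -1),  (1, 0),  (1, 1)]


def occupied_neighbours(grid, i, j):
    """Count occupied Moore neighbours of (i, j); cells outside the grid count as empty."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for di, dj in MOORE:
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols:
            total += grid[ni][nj]
    return total


def step_with_forcing(grid, forcing, threshold=0.5, rng=None):
    """One synchronous update of a hypothetical CA with external forcing.

    grid    : 2-D list of 0/1 states (1 = species present).
    forcing : 2-D list of driver values in [0, 1] (an assumed, scaled quantity).
    rng     : None selects the deterministic rule; a random.Random selects the stochastic rule.
    """
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            has_source = grid[i][j] == 1 or occupied_neighbours(grid, i, j) > 0
            if not has_source:
                continue  # geometric condition fails: nothing nearby to spread from
            if rng is None:
                # Deterministic: the forcing must reach the threshold.
                new[i][j] = 1 if forcing[i][j] >= threshold else 0
            else:
                # Stochastic: the forcing acts as an occupation probability.
                new[i][j] = 1 if rng.random() < forcing[i][j] else 0
    return new


grid = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
forcing = [[0.9, 0.9, 0.1],
           [0.9, 0.9, 0.1],
           [0.1, 0.1, 0.1]]
after = step_with_forcing(grid, forcing)
```

With the deterministic rule, the single occupied cell spreads only into the cells where the assumed forcing is high; passing `rng=random.Random(seed)` switches to the stochastic variant.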
Eutrophication and HABs involve a number of different processes, of which some such as hydrodynamics can be investigated in detail, while many others still remain unclear. However, there are other techniques such as fuzzy logic (FL) and decision trees that can be incorporated into ecological modelling. This implies that fuzzy logic techniques could be combined with the cellular automata paradigm for the formulation of empirical evolution rules other than deterministic or stochastic rules. Such integration is not only necessary, but also possible because both CA and FL have finite values and are rule based modelling approaches. Therefore, the full setting of the model developed within the framework of this thesis for the prediction of eutrophication and red tides is to ultimately couple well-established numerically based hydrodynamic models like Delft3D with fuzzy cellular automata based ecological modules (Chen, 2000; Chen and Mynett, 2004).

Thus the objective of the research is to develop an integrated numerical-artificial intelligence (AI) eutrophication/HABs model which applies cellular automata as an ecological module paradigm. This research will take the hydrodynamic and transport modules of the Delft3D system as a starting point and focus on contributing to further development and extension of Delft3D-ECO to sustain the state-of-the-art ecohydraulics module. Practical application areas involve the modelling of prey/predator population dynamics and system stabilities, competitive growth and species succession of underwater macrophytes in eutrophic lakes, and harmful algal blooms in fresh and brackish waters. Part of the research results are contributed to the EU-HABES project, which involved 13 institutes / universities from 9 EU countries and is headed by WL | Delft Hydraulics and supervised by an international expert panel.


1.3 Outline of the thesis

The thesis is structured in 5 chapters. Chapter 1 introduces the research background and motivation, the description of the research problems and the related methodologies. Chapter 2 elaborates the fundamental aspects of the cellular automata model paradigm, some initial applications of CA to lattice-gas hydrodynamics and the "Game of Life", the modelling of a prey-predator system, the effects of CA computational stencils, and the remaining problem of formulating local rules. Chapter 3 presents the problem of harmful algal blooms and the related physical, chemical and biological processes involved, the numerical, rule based and integrated modelling of HABs, and the remaining problem of capturing local behaviour and patchy dynamics. Some case studies are given to substantiate the strength of rule based approaches such as fuzzy logic and decision trees in HABs modelling. Chapter 4 starts with the CA modelling of competition and succession of macrophytes in the eutrophic Lake Veluwe and the integrated numerical and decision tree modelling of HABs. The research finally coupled the developed rule based technique with the cellular automata paradigm and conducted a case study on modelling HABs along the Dutch coast. The integrated model took Delft3D as the starting point and incorporated the methodologies described in chapters 2 and 3. Chapter 5 concludes the research activities and highlights the findings of the research campaign. In addition, unsolved problems and recommendations for future explorations are outlined.

Chapter 2

Cellular Automata modelling paradigm

A brief historical review of the origin and further development of Cellular Automata (CA) is presented in this chapter. The main purpose is to emphasise the fundamental aspects of cellular automata and to show some initial applications to modelling hydrodynamics and prey-predator system dynamics. The effects of cellular automata computational stencils are also studied.

2.1 Fundamentals of cellular automata

2.1.1 A brief historical review

Cellular Automata are discrete dynamical systems, in which many simple components act together locally to produce complex patterns on a global scale, which may exhibit "self-organising" behaviour. The idea of a cellular automaton was proposed by Von Neumann in the 1940s when he was interested in finding a logic abstraction (machine) that could have self-control and even some self-repair mechanism (Wolfram, 1983; Chopard and Droz, 1998). During the following decades, cellular automata have been developed and reformulated in many different ways, and have been applied to very broad fields.

The first cellular automaton as conceived by Von Neumann (1949) was composed of a two-dimensional square lattice with 200,000 elementary cells, and each of these cells had up to 29 possible states. The evolution rules depended on the different combinations of the state of each cell and its four nearest neighbours, located in north, south, west and east directions (Burks, 1970). Due to its complexity, Von Neumann's cellular automaton has only been partially implemented on a computer. However, Von Neumann had succeeded in finding a discrete structure of cells that bear within themselves the recipe to generate new identical individuals, viz. self-reproduction (Chopard and Droz, 1998). Von Neumann's work on self-reproducing automata was completed by Burks (1970). After Von Neumann, the research on cellular automata still followed the same line of self-reproduction and computational universality, and remained largely focused on theoretical aspects (Codd, 1968; Preston, et al., 1984).

In 1970, John Horton Conway, a young mathematician at Gonville and Caius College in Cambridge, introduced his now famous game of life via Martin Gardner (1970). Conway's cellular automata were quite simple and had only two states, compared with Von Neumann's twenty-nine; a cell would be either alive or dead.
Very rich behaviour appeared from even such a simple form of automata: "life forms" such as gliders, blinkers, still lifes, small and large oscillators, etc. Some simulation snapshots of the game and life forms are given in section 2.2. The system is seen to always lead to stable states after sufficient iterations. It is precisely this simple "ecological" behaviour that has brought the concept of cellular automata to the attention of a broad audience (Berlekamp, et al., 1982; Wolfram, 1986; Minns, et al., 2000).

At the beginning of the 1980s, Stephen Wolfram studied the statistical mechanics of one-dimensional cellular automata, which are now called Wolfram's Cellular Automata (Wolfram, 1983). He found that a cellular automaton is a discrete dynamical system that could exhibit many of the behaviours encountered in continuous systems, but in a much simpler framework. He also studied the computational theory of cellular automata (Wolfram, 1984a) and investigated cellular automata as models of complexity (Wolfram, 1984b).

Also in the 1980s, it was recognised that the lattice-gas models developed in the 1970s (Hardy, et al., 1976) were in fact cellular automata. Lattice gas models are simple discrete dynamical systems in which particles move and collide in such a way that mass and momentum are conserved. Lattice gas models have been successfully used in simulating fluid dynamics (Salem and Wolfram, 1985; Frisch, et al., 1986; Chopard and Masselot, 1999). The relation between microscopic discrete lattice-gas systems and macroscopic continuous fluid dynamics was also set up through the Chapman-Enskog method (Chopard and Masselot, 1999; Wolf-Gladrow, 2000).

Cellular automata were originally motivated by biological concepts, and have also been adopted in studies of biological processes. By about 1953, molecular biologists had identified the structure of DNA and its reproduction mechanism. It is remarkable to find that the essence of Von Neumann's self-reproduction is self-organisation, viz. the ability of a biological system to contain a complete description of itself and use that information to create new copies. The idea of cellular automata was recently introduced to study the development of tumours and HIV infections (Mielke and Pandey, 1998; Sloot, et al., 2002). From the 1990s, following ecologists' awareness of the significant roles of spatial heterogeneity and local interactions within ecosystems, the development and application of cellular automata were greatly stimulated in environmental and ecological fields (Engelen, et al., 1993; Wortmann, et al., 1997).
In addition, conventional cellular automata have been extended to incorporate external factors, where the evolution depends not only on geometric relations (the state of a cell and its nearest neighbours), but also on external driving factors (Engelen, et al., 1996; Wootton, 2001; Chen, et al., 2002a; Chen and Mynett, 2004c). Cellular automata are now an important component of artificial life, a domain that seeks to better understand real life and the behaviour of living species through computer models (Chopard and Droz, 1998). They are also a different computational paradigm relative to ordinary / partial differential equations (May, 1975) or individual-based schemes (DeAngelis and Gross, 1992; Lett, et al., 1999), especially when modelling spatial-temporal dynamics of physical systems (Chen, et al., 2002a; Chen and Mynett, 2003b). Clearly, object-oriented programming (OOP) techniques have greatly facilitated the computer implementation of specific algorithms.

2.1.2 Concepts of standard cellular automata

Definition of cellular automata

Cellular automata are an idealisation of a physical system where space and time are discrete, and the physical quantities take only a finite set of values.

Fig 2.1 1 and 2-dimensional cellular automata

A cellular automata system consists of a regular lattice of cells. Each cell has a finite number of possible values, which are also called states, and the states of all the cells are updated simultaneously in discrete time steps according to predefined local rules $f$ that for each cell depend on the last states of the cell itself as well as its neighbourhood. There are 1, 2 and 3-dimensional cellular automata, but mostly the 1 and 2-dimensional cellular automata have been studied so far (Fig 2.1). According to its definition, a cellular automaton is a combination of data and computing rules that has at least six properties (Chopard and Masselot, 1999):

1. Dimensions, the number of spatial coordinates $n$;
2. Width of dimension, the number of cells in the $i$-th ($i = 1, 2, 3, \ldots, n$) coordinate, $w_i$;
3. Neighbourhood, a spatial region for a cell to gather information for updating;
4. Width of neighbourhood, the number of neighbourhood cells in the $i$-th coordinate, $d_i$;
5. Cell state, the possible values that a cell can take on;
6. Localised evolution rules, the function $f$ which defines the update of a cell state at each time step.

Of these six properties, neighbourhood schemes and evolution rules are considered the most important ones within the context of this thesis, and worth further elaboration.

Neighbourhood Schemes

Since cellular automata rules are defined locally, a spatial region must be specified for a cell to gather information from its vicinity when updating its state. This spatial region is called the neighbourhood. For one-dimensional cellular automata, the nearest neighbourhood contains two cells, located at the left and right (Fig 2.2a). For two-dimensional cellular automata, two types of neighbourhood are often considered. The first is the Von Neumann neighbourhood, which consists of a central cell and its four nearest neighbours located in north, south, west and east (Fig 2.2b). It is the neighbourhood that was used in Von Neumann's first cellular automata. The second is the Moore neighbourhood, which contains, in addition to the Von Neumann type, the four neighbours on the diagonals, located in north-east, north-west, south-east and south-west respectively (Fig 2.2c). The Moore neighbourhood is most widely used nowadays. In ecosystem studies, an extended Moore type neighbourhood is also often applied, which takes into account the information from the second nearest neighbours in longitudinal and transversal direction (Fig 2.2d).


Fig 2.2 (a) neighbourhood of one-dimensional cellular automata; (b) Von Neumann neighbourhood; (c) Moore neighbourhood; and (d) extended Moore neighbourhood. The dark cell indicates the central cell, which is updated according to the state of the cell and that of the shaded cells
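The two-dimensional neighbourhood schemes described above can be written down as lists of index offsets. The sketch below is illustrative only: the constant and function names are hypothetical, the central cell is left out of the lists, the periodic boundary is an assumption, and the extended Moore list follows one reading of the description (the Moore cells plus the second nearest cells along the two axes).

```python
# Offsets (row, column) relative to the central cell, which is not listed itself.
VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]            # north, south, west, east
MOORE = VON_NEUMANN + [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # plus the four diagonals
# One reading of the extended Moore type: add the second nearest cells along both axes.
EXTENDED_MOORE = MOORE + [(-2, 0), (2, 0), (0, -2), (0, 2)]


def neighbours(i, j, offsets, rows, cols):
    """Return neighbour coordinates of (i, j), wrapping around the lattice edges
    (periodic boundary, an assumption made for this sketch)."""
    return [((i + di) % rows, (j + dj) % cols) for di, dj in offsets]


corner = neighbours(0, 0, VON_NEUMANN, 3, 3)
```

Keeping the neighbourhood as data rather than hard-coding it in the update rule makes it easy to compare the effect of different computational stencils on the same rule.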

However, during their sixty years of existence, cellular automata have often been re-shaped while conserving the basic concepts. In lattice-gas models, cellular automata are arranged in a triangular scheme for reasons of isotropy (Fig 2.3a, 2.3b). The same reason holds for hexagonal automata, where the neighbourhood contains six cells (Fig 2.3c). Another important neighbourhood is the Margolus neighbourhood (Fig 2.3d), which is useful in physical modelling, particularly when microscopic reversibility is of concern.


It may be valuable to provide some more details of triangular and Margolus cellular automata. An advantage of triangular automata consists in their property of isotropy. The Margolus scheme allows a partitioning of space and a reduction of rule complexity. Cells are partitioned into finite, disjoint and uniformly arranged blocks, usually of size 2x2. Block rules are defined that look at the contents of a block and update the whole block rather than a single cell. The same rules are applied to every block. Blocks do not overlap and no information is exchanged between adjacent blocks. The partition is changed from one time step to the next. It alternates between an odd and an even partition (Fig 2.3d) so that information can propagate outside the boundaries of the blocks when evolution takes place.
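The alternating block partition can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from the thesis: the function names are hypothetical, the lattice is periodic with even side lengths, and the choice of which partition is used on even time steps is arbitrary. The example block rule simply rotates the four cells of a block, which conserves the number of occupied cells.

```python
def margolus_step(grid, block_rule, t):
    """Apply block_rule to every 2x2 block of the partition used at time step t.

    grid       : 2-D list with even numbers of rows and columns (periodic edges assumed).
    block_rule : function taking (a, b, c, d) = (top-left, top-right, bottom-left,
                 bottom-right) and returning the four updated values in the same order.
    t          : time step; even steps anchor blocks at (0, 0), odd steps at (1, 1).
    """
    rows, cols = len(grid), len(grid[0])
    shift = t % 2
    new = [row[:] for row in grid]
    for bi in range(shift, rows + shift, 2):
        for bj in range(shift, cols + shift, 2):
            i0, i1 = bi % rows, (bi + 1) % rows
            j0, j1 = bj % cols, (bj + 1) % cols
            a, b, c, d = block_rule(grid[i0][j0], grid[i0][j1],
                                    grid[i1][j0], grid[i1][j1])
            new[i0][j0], new[i0][j1] = a, b
            new[i1][j0], new[i1][j1] = c, d
    return new


def rotate_clockwise(a, b, c, d):
    """Illustrative block rule: rotate the block contents by 90 degrees clockwise."""
    return c, a, d, b


start = [[1, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
step1 = margolus_step(start, rotate_clockwise, 0)  # particle moves within its first block
step2 = margolus_step(step1, rotate_clockwise, 1)  # shifted partition lets it leave that block
```

In the example the single particle moves inside its block on the first step and crosses the former block boundary on the second step, which is the propagation effect obtained by alternating the partitions.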

Fig 2.3 (a) triangular lattice gas; (b) triangular cellular automata; and (c) hexagonal cellular automata. The dark cell indicates the central cell which is updated according to the state of the cell and that of the shaded cells. (d) Margolus neighbourhood of 2x2 blocks. It alternates between the odd partition (thick lines) and the even partition (thin lines) from one step to the next step. The cell labelled o will become • at the next time step with alternative partitioning

Local Evolution rules

The state of every cell is updated at each time step according to the local rules $f$ that are functions of the state of a cell and its neighbourhood. Mathematically, the evolution of one-dimensional cellular automata (Fig 2.1) with nearest neighbours can be described as:

\[ a_i^{t+1} = f\left(a_{i-1}^{t},\, a_i^{t},\, a_{i+1}^{t}\right) \qquad (2.1) \]

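Similarly, the two-dimensional CA with the nearest nine neighbours (Moore scheme) is denoted by

\[ a_{i,j}^{t+1} = f\left(a_{i-1,j-1}^{t},\, a_{i-1,j}^{t},\, a_{i-1,j+1}^{t},\, a_{i,j-1}^{t},\, a_{i,j}^{t},\, a_{i,j+1}^{t},\, a_{i+1,j-1}^{t},\, a_{i+1,j}^{t},\, a_{i+1,j+1}^{t}\right) \qquad (2.2) \]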
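The synchronous update of the one-dimensional case (Eq. 2.1) can be sketched as follows. The function names, the periodic boundary and the example rule (new state equal to the exclusive-or of the two neighbours) are assumptions made for illustration; any function of the three arguments could be passed as the local rule $f$.

```python
def step_1d(cells, rule):
    """Synchronous update a_i(t+1) = f(a_{i-1}(t), a_i(t), a_{i+1}(t)) with periodic ends."""
    n = len(cells)
    # The new list is built entirely from the old one, so all cells update simultaneously.
    return [rule(cells[(i - 1) % n], cells[i], cells[(i + 1) % n]) for i in range(n)]


def xor_rule(left, centre, right):
    """Illustrative two-state rule: the new state is the XOR of the two neighbours."""
    return left ^ right


cells = [0, 0, 0, 1, 0, 0, 0]
history = [cells]
for _ in range(3):
    history.append(step_1d(history[-1], xor_rule))
```

The two-dimensional case of Eq. 2.2 follows the same pattern, with the rule receiving the nine states of the Moore scheme instead of three.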

For two-dimensional cellular automata, the so-called totalistic rules are often considered in which the state of a cell depends only on the sum of the values in the neighbourhood that is conveniently specified by a code $C$ (Packard and Wolfram, 1985):

\[ C = \sum_{n} f(n)\, k^{n} \qquad (2.3) \]

where $n$ is the number of neighbouring cells, and $k$ is the number of possible states. Each code value refers to a certain rule in a lookup table. The outer totalistic rules are also often considered where the state depends separately on the sum of the values of its neighbours and on the value of itself:


This can also be specified by a code (Packard and Wolfram, 1985): [...]

[...] are concentrations of light intensity, nitrogen and phosphorus; $k_I$, $k_N$, $k_P$ are saturation constants; $f(T)$ is the temperature dependency function. Many processes in biological systems have a temperature optimum and some models take it into account, such as Jorgensen (1976):

\[ f(T) = k_{opt} \exp\left(-2.3\, [\ldots]\right) \qquad (3.4) \]

in which $k_{opt}$ is the optimal process rate, and $T_{opt}$ is the corresponding temperature. The equation assumes that the temperature dependency is symmetrical around the optimal value. Different types of light functions have been developed, including or excluding light-inhibition. Vollenweider (1965) has formulated the irradiance dependency as:

\[ f(I) = \frac{I/k_I}{\sqrt{1 + (I/k_I)^2}} \cdot \frac{1}{\left(1 + (aI)^2\right)^{n/2}} \qquad (3.5) \]

where $I$ is the actual light intensity, $k_I$ is the light saturation value and $a$ and $n$ are constants. Eilers and Peeters (1988) proposed another light dependency function which is also widely applied in phytoplankton dynamic modelling, where $a$, $b$ and $c$ are constants:

\[ f(I) = \frac{I}{aI^2 + bI + c} \qquad (3.6) \]


Equation 3.3 presumes that the total effect of several limiting factors is their product. However, it is argued that at any particular time there is only one limiting factor. Broqvist (see Jorgensen, 1994) developed a model according to Liebig's law of the minimum:

\[ \mu = f(T)\, \min\left(\frac{C_I}{C_I + k_I},\, \frac{C_N}{C_N + k_N},\, \frac{C_P}{C_P + k_P}\right) \qquad (3.7) \]
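The difference between the multiplicative assumption and the minimum law of Eq. 3.7 can be made concrete with a small numerical sketch. The function names and the input values are hypothetical, and the multiplicative form is written here only from the verbal description (a product of saturation terms scaled by $f(T)$); it is not claimed to reproduce Equation 3.3 term by term.

```python
def saturation(c, k):
    """Saturation-type limitation term c / (c + k), which lies between 0 and 1."""
    return c / (c + k)


def growth_product(f_t, light, nitrogen, phosphorus):
    """Multiplicative form: all limitation terms are multiplied together."""
    return f_t * saturation(*light) * saturation(*nitrogen) * saturation(*phosphorus)


def growth_minimum(f_t, light, nitrogen, phosphorus):
    """Liebig form (Eq. 3.7): only the most limiting term counts."""
    return f_t * min(saturation(*light), saturation(*nitrogen), saturation(*phosphorus))


# Hypothetical (concentration, saturation constant) pairs in arbitrary units.
light, nitrogen, phosphorus = (3.0, 1.0), (1.0, 1.0), (1.0, 3.0)
f_t = 1.0  # temperature factor, set to 1 for simplicity
mu_product = growth_product(f_t, light, nitrogen, phosphorus)  # 0.75 * 0.5 * 0.25
mu_minimum = growth_minimum(f_t, light, nitrogen, phosphorus)  # min(0.75, 0.5, 0.25)
```

With these assumed values the product form gives a growth rate well below the minimum form, because every sub-optimal factor reduces the rate further, whereas under the minimum law only phosphorus, the scarcest resource here, matters.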

The reproduction of most algal species is through binary fission, which follows typical logistic growth. Therefore, the growth of a population is usually described by a first-order kinetic equation:

\[ \frac{d\omega}{dt} = \mu\omega \qquad (3.8) \]

where $\omega$ is the population and $\mu$ is the growth rate. In reality, a population cannot grow exponentially to infinity because some environmental factor will soon become limiting. The equation of population dynamics is thus modified as

\[ \frac{d\omega}{dt} = r\omega\left(1 - \frac{\omega}{K}\right) \qquad (3.9) \]

where $r$ is the intrinsic growth rate and $K$ is the carrying capacity. From equation 3.9, it is easy to see the difference between r-strategy and K-strategy. An algal bloom is a typical r-strategy that is characterised by rapid development, high mortality and short life-span.

Competition

Inter- and intraspecific competition between individuals can be direct or indirect. A frequently observed direct interspecific competition is allelopathy, where a species excretes chemical substances into the environment which inhibit or stimulate the growth and development of other organisms. It is concluded from field measurements and laboratory results (Gentien, 1990; Arzul, et al., 1993) that some dinoflagellates such as Gyrodinium aureolum severely hamper the growth of diatoms through the release of inhibitory compounds. Indirect competition, which is also called exploitation, involves the use of limited resources, including physical space, nutrients and light. By consuming the same resource, a species lowers its availability and thus exerts a negative effect on other species. Indirect competition generally does not affect the maximum algal biomass of a system, but it is important to species composition. Although there are numerous growth limiting factors, a phytoplankton community is usually limited by few resources at a particular time (Huisman, 1997). Therefore, coexistence of only a small number of phytoplankton species would be expected (Hutchinson, 1961). In reality, phytoplankton communities are generally characterised by high diversities, which is known as Hutchinson's famous paradox of the plankton. A possible solution to this paradox is spatial heterogeneity and temporal variability on very small time-scales.
Recent developments in resource competition theory have indicated that multiple resource competition may generate complex dynamics such as oscillations and chaos (Huisman and Weissing, 1999), which create the potential for coexistence of more species than the number of limiting resources. As a general rule, the diversity of phytoplankton communities is related to the degree of eutrophication of the system: the higher the eutrophication, the lower the biodiversity. During an algal bloom, often one species becomes very dominant. Therefore, forecasting algal blooms depends on the ability to identify the exact causes for the catastrophic event of sudden dominance and massive reproduction of the blooming species. Cell size can be an important factor during competition for nutrients since the surface/volume ratio is crucial to nutrient uptake, for instance the change from colonial matrix to solitary cells of Phaeocystis globosa under phosphorus depletion (Peperzak, 2002).

Migration and swimming

Some phytoplankton species like dinoflagellates and ciliates show vertical migration cued to day-night cycles. There are some important advantages to vertical migration:

(1) Enhancement of competition. Dinoflagellates have a relatively high light saturation constant, so they migrate to the surface in the daytime for active photosynthesis where other species may be inhibited. At night, they migrate downwards to deeper waters where nutrients are not so depleted as near the surface. This behaviour is more significant in stratified systems.


Fig 3.10 Horizontal transport of migratory organism by vertical migration between two water masses flowing in different directions (from Barnes and Hughes, 1982)

(2) Horizontal transport. Vertical migration, both on a daily and on a seasonal time scale, may aid a population both in maintaining its location in a very dynamic environment and in dispersal. It is often the case that different layers of water travel in different directions or at different speeds. By moving downwards and then upwards, an organism enters different currents and thus changes its horizontal position in the photic zone, as illustrated in Fig 3.10. Understanding the relation between vertical migration and horizontal transport is very important to explain some onshore algal blooms which in fact are originally initiated offshore, as observed in e.g. the Gulf of Maine (Anderson, 2002).

Some phytoplankton can swim relative to the water to increase their nutrient uptake (Pasciak and Gavis, 1974). However, it is concluded that swimming is not a very effective method to alleviate diffusion limitation in low Reynolds number conditions. In order to elaborate this conclusion, the "stirring" concept given by Berg and Purcell (1977) is applied. The time required for swimming a distance $L$ is $L/u$, where $L$ is the distance and $u$ is the speed. On the other hand, the time required for transporting a nutrient by diffusion is $L^2/D$, where $D$ is the diffusion coefficient. The effectiveness of swimming versus diffusion, for any given distance and diffusion constant, is given by the equation

\[ \frac{\text{time for transport by diffusion}}{\text{time for movement}} = \frac{L^2/D}{L/u} = \frac{Lu}{D} \qquad (3.10) \]
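The ratio in Eq. 3.10 is easy to evaluate numerically. The sketch below uses assumed, order-of-magnitude values that are not taken from the thesis: a distance of 1 micrometre, a swimming speed of 10 micrometres per second, and a diffusion coefficient of 1e-9 m²/s, which is typical for a small solute in water. The function name is hypothetical.

```python
def diffusion_to_swimming_ratio(length, speed, diffusivity):
    """Return (L**2 / D) / (L / u) = L * u / D, the dimensionless ratio of Eq. 3.10."""
    time_diffusion = length ** 2 / diffusivity  # time for transport by diffusion
    time_swimming = length / speed              # time for movement
    return time_diffusion / time_swimming


# Assumed illustrative values in SI units (not taken from the thesis).
L = 1e-6  # distance: 1 micrometre
u = 1e-5  # swimming speed: 10 micrometres per second
D = 1e-9  # diffusion coefficient of a small solute in water, m^2/s
ratio = diffusion_to_swimming_ratio(L, u, D)
```

A ratio far below one means that diffusion covers the distance much sooner than swimming does; because the ratio grows linearly with $L$, swimming only starts to pay off over much longer distances.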

For scales on the order of 1 μm, the ratio works out to approximately $10^{-2}$. Diffusion is about 100 times faster than the organisms' own movement. Therefore, in the world of low Reynolds numbers, very little is gained by swimming. The only possible advantage to the organisms that undertake locomotion is that they might encounter nutrients in a higher concentration (Purcell, 1977; Sommer, 1988). Phytoplankton species like Phaeocystis globosa have no ability to migrate or swim, but can develop a certain buoyancy by forming a large matrix (Riegman, et al., 1993; Peperzak, 1993) to increase their competition capability.

Loss

Species loss can occur in a number of ways, including respiration, cell lysis, sinking and grazing, of which grazing is the most important. Photosynthesis takes place only in the daytime when light energy is available, but respiration happens all the time. Complete respiration can be expressed as

\[ \mathrm{C_6H_{12}O_6 + 6\,O_2 \rightarrow 6\,CO_2 + 6\,H_2O + energy} \qquad (3.11) \]

Respiration consumes oxygen. During algal blooms, a large amount of dissolved oxygen is used by phytoplankton for respiration, which sharply depletes the oxygen content at night and causes massive fish mortality, for example in Taihu Lake (Chen, 2001a). Respiration rate is an important characteristic of a species and is related to its competitive ability.

When the environment becomes stressful due to light or nutrient limitation, phytoplankton quickly settle to the bottom and are decomposed by bacteria. In an algal bloom event, a large amount of dead algae sinks to the bottom, which leads to anoxia of the deep water. The bloom of Phaeocystis globosa along the Dutch coast in 2001 caused a loss of € 20 million of blue mussels Mytilus edulis (Peperzak, 2002).

Grazing is mostly the dominant factor in algal loss and is the only top-down control of algal blooms. In tropical oceans, any increase in phytoplankton is quickly consumed by grazers and algal blooms can hardly develop. Therefore, there is seldom a phytoplankton or herbivore peak, in contrast to temperate and polar regions. The difference in seasonal patterns of phytoplankton biomass between the North Pacific and the North Atlantic is also explained by grazing activities. There are usually prominent spring blooms of phytoplankton in the North Atlantic; the mechanism is that in winter turbulent mixing of the entire water column brings nutrients to the surface waters, after which thermal stratification due to increased irradiance sets in and the phytoplankton are held in the euphotic zone so that they can multiply quickly (Mills, 1989). In the North Pacific, because of the continued biological production throughout the winter, the zooplankton population is ready and able to prevent a build-up of phytoplankton biomass when phytoplankton production increases in spring with the shallowing of the mixing layer (Evan and Parslow, 1985; Boyd, et al., 1995a, b). Hily (1991) stated that the grazing of benthic suspension feeders is one of the factors controlling eutrophication in the Bay of Brest, France, despite the high inputs of nutrients.

Disruption of natural grazer populations or mismatches between algae and grazers may play an important role in red tide dynamics. Buskey et al (1997) reported that protozoan and planktonic grazers were almost eliminated and benthic biomass also declined before the onset of the brown tide species Aureococcus anophagefferens in Laguna Madre, Texas, USA. The disruption of the grazer communities may have been caused by high salinity due to an extended period of drought.

Selective predation is another important aspect of phytoplankton dynamics, in particular of species composition. Selective predation is advantageous to species that are not favoured by grazers. Many red tide species such as dinoflagellates and freshwater nuisance species like cyanobacteria are not preferred by grazers. The acceleration of nutrient cycling due to grazing can indirectly enhance the competitiveness of nuisance species.

The trophic inter-relations between algae and grazers are by no means entirely one way. By consuming algal biomass and metabolizing their tissues, grazers release nutrients from their organic binding and excrete them into the environment. Equivalently, by feeding on bacteria which have themselves absorbed the dissolved organics released by the phytoplankton, the microplankton release further stocks of nitrogen and phosphorus back into the water. These consumers thus regenerate incorporated nutrients and make them available to the phytoplankton again. Therefore, grazing can enhance nutrient cycling and stimulate productivity. This is especially favourable to red tide species if predation is selective towards non-nuisance taxa.

Intrinsic mortality and cell lysis are sometimes the main contributors to biomass loss. Cell lysis is usually a natural process, but it can also be induced by strong irradiance.

3.3 Approaches to algal bloom modelling

In view of the huge ecological and economic impacts of harmful algal blooms, it is important to have mitigation measures. Abatement of pollution and improvement of water quality are essential ways to alleviate the disasters (Hosper, 1997; Meijer, 2000). However, these approaches usually take as long as decades (Scheffer, 1998; Meijer, 2000). Forecasting and early warning of possible algal blooms and species compositions are seen to be important to the fishery industries although they cannot essentially resolve the problems (Lu, et al., 2000; Lee, 2002). Through predictions, economic loss can be reduced by early harvesting, towing away fish cages and pumping uncontaminated water to dilute concentrations.

Models are the major tools to obtain insight into the dynamics of algal blooms and to provide predictions (Franks, 1997). It has been widely recognised that accurate modelling of the underlying physical and (bio)chemical processes is key to the prediction of the initiation and species composition of algal blooms. The performance of red tide models has so far been restricted by the insufficient ability to integrate both the biological and the underlying physical processes (Donaghay and Osborn, 1997; Verkhozina, et al., 2000; Chen, et al., 2003). Therefore, a reliable model to predict initiation, transport and persistence of red tides has yet to be established (Chapra, 1998; Chen, 2000). In general, potential approaches to algal bloom modelling can be inductive as well as deductive; inductive methods prove to be both straightforward and useful (Recknagel, 1997).

3.3.1 Deductive approaches

The deductive approach to modelling phytoplankton dynamics is to numerically solve a set of differential equations in order to calculate the phytoplankton growth rate, where usually a Monod type method is implemented for calculating nutrient and light dependent growth. The Delft software systems of WL | Delft Hydraulics offer three different ways, viz. DYNAMO, BLOOM II and DELPART, to model algae dynamics, their growth and mortality.

The DYNAMO approach is a traditional and relatively simple model of primary production based on Monod kinetics for the calculation of growth. Two groups of algae are considered: diatoms and non-diatoms. The main difference between the types is that diatoms utilise silica as an essential nutrient. This approach originated from the DYNAMO model for the North Sea (de Vries, et al., 1990; Klein and Buuren, 1992) and is recommended for general reconnaissance studies, in which distinction of different species and competition between species are not considered of prime importance. The aim is to calculate primary production in terms of algal biomass concentrations.

The BLOOM II approach is based on an optimisation technique that distributes the available resources (nutrients and light) among different types of algae (Los and Brinkman, 1988). BLOOM II optimises the species composition through linear programming to obtain the maximum growth rate under the given conditions. A large number of groups and/or species of algae and even different phenotypes within one species can be considered. In the same way, phytoplankton (algae living in the water column) and water plants (living on sediment) can be included with their ecophysiological characteristics. With the BLOOM II approach, apart from the calculation of biomass concentrations, the dynamics of algae communities including competition for light and nutrients, adaptation to environmental conditions and species compositions can be simulated.

DELPART is a particle tracking model which estimates concentration distributions by following the tracks of particles in time. It is a 3-dimensional model that can track thousands of particles in time. The model was originally devoted to a detailed description of concentration contours of instantaneous or continuous releases of conservative or simple decaying substances, including simulations of physical quantities like temperature and density. Model simulations include tidal variations, effects of time-varying wind fields on particle patches, effects of bottom friction on the patches, and limited vertical dispersion due to stratification of the water column in two layers. The instrument was later extended to allow for algal bloom modelling, which simulates transport and growth kinetics of phytoplankton species including nutrients and settling velocities of phytoplankton cells on the sediment.

The choice of approach depends primarily on the modelling objective and also on the availability of data and computation time. When dealing with a relatively constant environment, with biomass concentrations being the main modelling objective, the DYNAMO approach will often be sufficient. It gives a rather linear picture of biomass development and a simplified picture of the outcome. The BLOOM II approach can simulate non-linear, more dynamic behaviour. When confronted with a changing environment that affects the algal community on a relatively small time scale, BLOOM II will often be the better choice even if biomass concentration is the main objective. Whenever modelling of competition for resources, growth limiting factors, species composition within the algal community and/or adaptation to changes of the environment is desired, BLOOM II should be applied. In general, the availability and accuracy of the data needed to determine the values of the model coefficients, as well as data for model validation, limit the reliability of modelling results.
Although the DYNAMO approach has fewer coefficients than BLOOM II, it has to be noted that describing complex processes by simple equations makes it necessary to tune the coefficients for local conditions. Moreover, coefficients of the DYNAMO approach are rather hard to determine and the modelling results can be very sensitive to the values of the coefficients. Comparatively, validation of the model results for a wide range of both freshwater and marine systems is a special advantage of BLOOM II (Los, et al., 1994). The computation times for the DYNAMO and BLOOM II approaches are not significantly different because DYNAMO has fewer equations and parameters, but uses a much smaller time scale. The advantages of DELPART are that processes of transport, sedimentation, buoyancy and vertical migration of phytoplankton cells can be easily included and that the model is able to create steep concentration gradients, which are frequently observed during red tide events. The shortcomings lie in the simple growth kinetics and the absence of species competition.

Despite all the differences between the approaches, the general procedures for calculating algae concentrations (growth and transport) are almost the same and are briefly described below. According to the Lambert-Beer law, the light intensity at depth $z$ is given by equation 3.1, which is repeated here for convenience:

$$I_z = I_0 e^{-K_d z} \qquad (3.12)$$

where $I_0$ is the irradiance at the surface and $K_d$ is the light attenuation coefficient. The attenuation is mainly caused by the water body (background attenuation, $a$), suspended solids ($ss$), salts ($sal$), and phytoplankton ($Chla$), and is usually calculated by a linear regression:

$$K_d = a + b \cdot ss + c \cdot sal + d \cdot Chla \qquad (3.13)$$
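The two light relations can be sketched in a few lines: the regression for the attenuation coefficient followed by the exponential Lambert-Beer decay of irradiance with depth. The function names and all coefficient values below are hypothetical placeholders; in practice the regression coefficients would have to be fitted to local data.

```python
import math


def attenuation_coefficient(a, b, c, d, ss, sal, chla):
    """Linear regression for the light attenuation coefficient (1/m)."""
    return a + b * ss + c * sal + d * chla


def light_at_depth(surface_irradiance, k_d, depth):
    """Lambert-Beer law: irradiance remaining at a given depth (m)."""
    return surface_irradiance * math.exp(-k_d * depth)


# Hypothetical coefficients and concentrations, for illustration only.
k_d = attenuation_coefficient(a=0.2, b=0.05, c=0.0, d=0.02,
                              ss=10.0, sal=0.0, chla=20.0)
for depth in (0.0, 0.5, 1.0, 2.0):
    print(depth, light_at_depth(100.0, k_d, depth))
```

Because suspended solids and chlorophyll a both raise the attenuation coefficient, a bloom shades the deeper water, which is one way the biological and physical parts of such a model feed back on each other.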

The mass balance of inorganic nitrogen (NH4+, NO3-) in water can be written as:

$$\frac{dN}{dt} = \text{mineralisation} - \text{denitrification} - \text{uptake} \qquad (3.14)$$

The budget of inorganic phosphorus in the water column is given as:

$$\frac{dP}{dt} = \text{mineralisation} + \text{desorption} - \text{adsorption} - \text{uptake} \qquad (3.15)$$

The mass balance for dissolved silicon in the water column is:

$$\frac{dSi}{dt} = \text{mineralisation} - \text{uptake} \qquad (3.16)$$

The light and nutrient limited growth rate is described by the Michaelis-Menten equation 3.3, where $\mu_{max}$ is the maximum growth rate of the modelled species, $K_s$ is the half-saturation constant, and $C$ can be light intensity, nitrogen, phosphorus or silicon concentration, or others.

$$\mu = \mu_{max} \frac{C}{K_s + C} \qquad (3.17)$$

The Monod-kinetic type growth of phytoplankton can be written as equation 3.18, in which $\omega$ is the concentration and $l$ is the loss rate including respiration, grazing and sinking.

$$\frac{d\omega}{dt} = (\mu - l)\,\omega \qquad (3.18)$$
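A minimal sketch of how the Michaelis-Menten limitation and the Monod-type growth equation fit together is given below. It assumes a single limiting resource held constant in time and uses a simple explicit Euler step; both are simplifications chosen for illustration, and all parameter values are hypothetical.

```python
def monod_growth_rate(mu_max, resource, half_saturation):
    """Michaelis-Menten / Monod limitation: mu = mu_max * C / (K_s + C)."""
    return mu_max * resource / (half_saturation + resource)


def simulate_biomass(omega0, mu_max, half_saturation, resource,
                     loss_rate, dt, n_steps):
    """Integrate d(omega)/dt = (mu - l) * omega with explicit Euler steps.

    Assumption: the resource concentration stays constant, so mu is fixed.
    """
    mu = monod_growth_rate(mu_max, resource, half_saturation)
    omega = omega0
    series = [omega]
    for _ in range(n_steps):
        omega += dt * (mu - loss_rate) * omega
        series.append(omega)
    return series


# Hypothetical parameters (per-day rates, arbitrary concentration units).
growing = simulate_biomass(omega0=1.0, mu_max=1.0, half_saturation=0.5,
                           resource=0.5, loss_rate=0.1, dt=0.1, n_steps=50)
steady = simulate_biomass(omega0=1.0, mu_max=1.0, half_saturation=0.5,
                          resource=0.5, loss_rate=0.5, dt=0.1, n_steps=50)
print(growing[-1], steady[-1])
```

When the resource equals the half-saturation constant the growth rate is half of its maximum, so a loss rate of the same size holds the biomass constant, while a smaller loss rate gives exponential growth.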

The transport is then described through a 3-dimensional advection-diffusion equation,

$$\frac{\partial C}{\partial t} = \frac{\partial}{\partial x}\left(D_x \frac{\partial C}{\partial x} - u_x C\right) + \frac{\partial}{\partial y}\left(D_y \frac{\partial C}{\partial y} - u_y C\right) + \frac{\partial}{\partial z}\left(D_z \frac{\partial C}{\partial z} - u_z C\right) + f_R(C, t) \qquad (3.19)$$

where the reaction term $f_R(C, t)$ is given according to equation 3.18.

3.3.2 Inductive models

Inductive approaches are not based on detailed descriptions of physical mechanisms but are built on cause-effect relationships. The major inductive approaches in algal bloom modelling include regression analysis, neural networks and fuzzy logic, or their combinations.

Regression methods set up relations between environmental conditions and algal bloom initiation by linear or non-linear functions, which can be used at least as a first step towards early warning of the likelihood of an algal bloom event. Wang (1993) proposed a correlation model predicting the frequency of red tides in Hong Kong waters through the use of a eutrophic index that quantifies the eutrophication status of the water. However, the validity of this relatively simple approach remains doubtful due to the poor correlation between nutrient concentrations and red tide occurrences in this area found by other researchers (Yung, et al., 1997). Peperzak (2002) developed a linear regression model for predicting P. globosa blooms in the Dutch coastal waters. Chen and Mynett (2004b) argued that linear regression inherently has difficulty dealing with a common phenomenon in algal blooms, viz. that the limiting factor changes, which directly affects species competition and species composition. Therefore, piecewise regression, also called model tree, is proposed (Chen and Mynett, 2004b; Chen, et al., 2004).

Artificial neural networks (ANNs) constitute a structure composed of a number of interconnected units (artificial neurones). Each unit has an input/output characteristic and implements a local computation or function. The output of any unit is determined by its input/output characteristics, its interconnections to other units and possibly external inputs. The network usually develops an overall functionality through one or more forms of training.
ANNs are nowadays regarded as universal approximators by the scientific community owing to their success in mapping the relationship between input and output with sufficient accuracy. This kind of black-box model shows great advantage when understanding of the processes of the system is limited, provided a large amount of data is available. There are several types of networks, which are suited to different applications. The most widely used network is the multi-layer perceptron (MLP) network, especially after the introduction of the error-back-propagation learning strategy. Since information at the current time step usually has an influence on future time steps, auto-regressive neural networks (ARNNs) have been introduced, where the computed outputs are fed back to the network's inputs for one or more future time steps (Boogaard, et al., 1998). The construction of a network model includes the determination of input/output factors, setting up the network architecture and training. By using historical data, the network learns the relationships between inputs and outputs, which are reflected by the weights. Therefore, neural network models pose very high requirements on the quality and quantity of data.

There are a number of applications of ANNs to algal bloom modelling. Recknagel et al (1995) and Recknagel (1997) developed neural network models for predicting the abundance and succession of cyanobacteria (blue-green algae) in Finnish and Australian lakes. Maier et al (1998) applied neural networks to modelling the cyanobacteria species Anabaena in the River Murray, South Australia. Scardi (1996) and Scardi et al (1999) developed neural network models for predicting phytoplankton primary production in estuaries like the Chesapeake Bay. Barciela et al (1999) integrated dynamic models and neural networks to simulate primary production in the Western Spanish coastal embayment Ria de Arousa, which is affected by upwelling. Lee et al (2003a) used neural networks to develop a real-time algal bloom forecasting system for Tolo Harbour along the Hong Kong coast. Most of the authors concluded that in general ANN is a good approach in algal bloom modelling. However, water quality and biological data are usually sparse and their sampling is quite irregular, while algal blooms are instantaneous and sporadic events. In view of the Nyquist frequency concept, such sparse sampling cannot resolve short-lived events, so it is difficult or even hardly possible to build neural network models by using limited biological data, or the model results remain questionable (Chen and Mynett, 2003a; Lee, et al., 2003b).

Fuzzy logic (Zadeh, 1965) is a method in between purely numerical and purely data-driven approaches which integrates partial knowledge and partial data.
Owing to its ability to deal with imprecise, uncertain data or ambiguous relationships among data sets (Metternicht, 2001), fuzzy logic has been increasingly applied to ecological models (Recknagel, et al., 1994; Foody, 1996; Vonk and Michielsen, 1998; Chen and Mynett, 2003a, 2004a, 2004c). The construction of fuzzy logic models consists of the selection of inputs, the definition of membership functions and the formulation of inference rules. Since fuzzy logic is transparent and straightforward, it makes it easier to bring together ecologists/biologists who have less modelling background and modellers who lack biological knowledge. Bokhorst and Vonk (1997) developed a fuzzy logic model to predict nuisance surface scums of the cyanobacteria Microcystis in Lake IJsselmeer, the Netherlands. Within the European Commission project Harmful Algal Bloom Expert System (HABES), Chen and Mynett (2003a, 2004a, 2004c) and Blauw et al (2002) have developed fuzzy logic models to forecast P. globosa blooms in the Dutch coastal waters. Chen et al (2002c) also compared the performance of fuzzy logic and ANN models in predicting P. globosa blooms in the same area.

Instead of constructing a model purely based on fuzzy logic, the approach can be integrated with numerical models. Bokhorst and Vonk (1997) successfully demonstrated the combination of fuzzy logic with particle tracking techniques for modelling nuisance surface scums of Cyanobacteria Microcystis in the IJsselmeer, the Netherlands. Chen and Mynett (2004c) integrated Delft3D-WAQ (water quality module) with fuzzy logic to predict algae concentrations in the Dutch coastal waters, where the transport of nutrients is simulated in detail by WAQ and the algal biomass is calculated by means of fuzzy logic. The disadvantages of fuzzy logic lie in the definition of inference rules and the modelling of multiple species, in particular species competition and succession.


3.3.3 Fuzzy logic and decision trees as rule based approaches

First proposed by Zadeh (1965), fuzzy logic has proved to be a convenient way to map an input space to an output space based on natural language. Since fuzzy logic is tolerant of imprecise data, conceptually easy to understand and can be built on top of the experience of experts, it is very useful when a PDE description becomes difficult and data are limited. Although fuzzy logic has been widely used in system control (Von Altrock, 1995; Czogala and Leski, 2000; Lin, et al., 2001), the development of fuzzy logic (FL) in ecohydraulics modelling is relatively slow (Salski and Sperlbaum, 1991; Jorgensen, 1994). Recent applications include fuzzy modelling of vegetation dynamics (Foody, 1996) and skylark population dynamics (Daunicht, et al., 1996), fuzzy clustering of ecological data (Friederichs, et al., 1996; Melcher and Matthies, 1996), modelling of eutrophication and algal blooms in lakes and reservoirs (Recknagel, et al., 1994; Los and Vonk, 1996; Vonk and Michielsen, 1998; Chen and Mynett, 2003a), modelling of harmful algal blooms in coastal waters (Chen, et al., 2002c; Chen and Mynett, 2004a), and fuzzy expert systems for wastewater plants (Lee et al., 1997; Marsili-Libelli and Giunti, 2002).

[Figure 3.11 diagram labels: INPUT (linguistic terms, crisp values); fuzzy sets; knowledge base (set of rules, definition of fuzzy sets); linguistic approximation; defuzzification; OUTPUT (linguistic terms, crisp values)]

Fig 3.11 Information flow of fuzzy-knowledge based model

A fuzzy inference system consists of three major components: fuzzy sets, membership functions and logic operators. Development of a fuzzy based model involves (1) determination of the model structure, i.e. input and output variables; (2) construction of appropriate membership functions; (3) setting up appropriate linguistic rules. A general framework of fuzzy logic modelling is given in Fig 3.11. Fuzzy logic is implemented by a group of "if-then" rules. For instance, a eutrophication system can be described by fuzzy logic as:

if water_temperature is high and nutrients is high then chlorophyll a is high
if water_temperature is low and nutrients is high then chlorophyll a is middle
if water_temperature is high and nutrients is low then chlorophyll a is low
if water_temperature is low and nutrients is low then chlorophyll a is low

Here "high", "middle" and "low" are fuzzy sets, and "and" is an operator. The mapping of different environmental conditions to different eutrophication levels depends on the membership functions, which are in the inference component.

The major problem of fuzzy logic modelling is to define appropriate membership functions and linguistic rules describing the system to be modelled. They can be taken either directly from experts' experience or from machine learning. To define membership functions: (1) when there is knowledge available, use the knowledge such as thresholds, optimal conditions and critical conditions as parameters. From Fig 3.12, an exemplary P-I curve of diatoms, it is seen that for $I \in (0, I_1)$ light is limiting, for $I \in [I_1, I_2]$ light is optimal, and for $I \in (I_2, \infty)$ light is toxic. Therefore, $I_1$ and $I_2$ can be used as parameters to define membership functions. (2) When empirical knowledge is not available, the membership functions can be defined by learning from data. Fig 3.13 illustrates cluster analysis through a self organising feature map (Boogaard et al., 1998; Chen, 2001a) and Fig 3.14 gives a partitioning analysis (C-means classification) for the case where there is no evident cluster (Chen and Mynett, 2003a, 2004c).
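To make the interplay of rules and membership functions concrete, the sketch below evaluates the four eutrophication rules quoted earlier. Several choices are assumptions made only for illustration and are not prescribed by the text: linear ramp membership functions with hypothetical break points, the minimum operator for "and", and a weighted average of hypothetical representative output values as defuzzification.

```python
def ramp_up(x, lo, hi):
    """Membership rising linearly from 0 at lo to 1 at hi (clipped to [0, 1])."""
    return max(0.0, min(1.0, (x - lo) / (hi - lo)))


def fuzzify(x, lo, hi):
    """Degrees of membership of x in the fuzzy sets 'low' and 'high'."""
    high = ramp_up(x, lo, hi)
    return {"low": 1.0 - high, "high": high}


# (water_temperature term, nutrients term) -> chlorophyll a term
RULES = [
    (("high", "high"), "high"),
    (("low", "high"), "middle"),
    (("high", "low"), "low"),
    (("low", "low"), "low"),
]

# Hypothetical representative chlorophyll a values for each output set.
OUTPUT_CENTRES = {"low": 10.0, "middle": 30.0, "high": 60.0}


def infer_chlorophyll(temperature, nutrients):
    """Fire all rules with min as 'and' and defuzzify by weighted average."""
    temp = fuzzify(temperature, 10.0, 25.0)  # hypothetical break points
    nutr = fuzzify(nutrients, 0.5, 2.0)      # hypothetical break points
    weighted_sum = 0.0
    total_weight = 0.0
    for (t_term, n_term), out_term in RULES:
        strength = min(temp[t_term], nutr[n_term])
        weighted_sum += strength * OUTPUT_CENTRES[out_term]
        total_weight += strength
    return weighted_sum / total_weight


print(infer_chlorophyll(30.0, 3.0))
print(infer_chlorophyll(17.5, 1.25))
```

With fully "high" inputs only the first rule fires, whereas intermediate inputs activate all four rules to some degree and the output becomes a blend of their conclusions.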

[Figure 3.12: curve plotted against light intensity I, with the thresholds I_1 and I_2 marked on the horizontal axis]

Fig 3.12 Exemplary P-I curve of diatom

Fig 3.13 Illustration of cluster analysis

[Figure 3.14: scatter plot with TIN on the horizontal axis]

Fig 3.14 Partitioning analysis of TIN vs. TIP

In self organising feature maps (SOFM, Fig 3.15), unsupervised learning algorithms are implemented (Ross, 1994; Boogaard, et al., 1998; Chen, 2001a). The procedure of SOFM can be summarised as follows (Kohonen, 1982):

1. Initialisation: choose random values for the initial weight vectors $\mathbf{w}_j(0)$. The only restriction here is that the $\mathbf{w}_j(0)$ are different for $j = 1, 2, \ldots, N$, where $N$ is the number of neurones in the lattice;
2. Sampling: draw a sample $\mathbf{x}$ from the input distribution with a certain probability;
3. Similarity matching: find the best-matching (winning) neurone $i(\mathbf{x})$ at time $t$, using the minimum Euclidean distance criterion:

$$i(\mathbf{x}) = \arg\min_j \lVert \mathbf{x}(t) - \mathbf{w}_j \rVert, \quad j = 1, 2, \ldots, N \qquad (3.20)$$

4. Updating: adjust the synaptic weight vectors of all neurones, using the update formula $\mathbf{w}_j(t+1) = \mathbf{w}_j(t) + \eta(t) \cdot \Lambda(t, r) \cdot (\mathbf{x} - \mathbf{w}_j(t))$, where $\eta(t)$ and $\Lambda(t, r)$ are the learning rate and neighbourhood function respectively, and $r$ is the radius of the neighbourhood;
5. Continuation: continue with step 2 until no noticeable changes in the feature map are observed.

Fig 3.15 Self organising feature map
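The five steps of the SOFM procedure can be sketched for a one-dimensional lattice as follows. This is a minimal illustration rather than the implementation used in this research: the Gaussian neighbourhood function, the linear decay of learning rate and radius, and the fixed iteration count (in place of a convergence check in step 5) are all assumptions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Step 3: index of the neurone whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: math.dist(x, weights[j]))


def train_sofm(samples, n_neurones, n_iterations=2000, seed=0):
    """Train a 1-D lattice of neurones on a list of input vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Step 1: random initial weight vectors.
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_neurones)]
    for t in range(n_iterations):
        frac = t / n_iterations
        eta = 0.5 * (1.0 - frac) + 0.01                      # learning rate
        radius = max(n_neurones / 2.0 * (1.0 - frac), 0.5)   # neighbourhood radius
        x = rng.choice(samples)                              # Step 2: sampling
        winner = best_matching_unit(weights, x)              # Step 3: matching
        for j, w in enumerate(weights):                      # Step 4: updating
            neighbourhood = math.exp(-((j - winner) ** 2) / (2.0 * radius ** 2))
            for k in range(dim):
                w[k] += eta * neighbourhood * (x[k] - w[k])
    # Step 5 is replaced here by a fixed number of iterations.
    return weights


# Hypothetical, already normalised two-variable samples forming two groups.
samples = [[0.1, 0.2], [0.15, 0.1], [0.9, 0.8], [0.85, 0.95]]
weights = train_sofm(samples, n_neurones=3, n_iterations=500, seed=1)
print(weights)
```

After training, each input can be assigned to its winning neurone with `best_matching_unit`, and the resulting groups can serve as candidate clusters for defining membership functions or rules.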

There are two strategies for the induction of inference rules: feature reasoning (Kohonen, 1982; Yager and Filev, 1994; Chen and Mynett, 2003a), and case reasoning (Wang and Mendel, 1992; Abe and Lan, 1995; Chen, et al., 2002c, 2003). Both strategies have been investigated in this research and the selection between them depends on the characteristics of the modelled systems.

In feature reasoning, the clusters obtained from SOFM analysis are used. Each feature (cluster) provides at most one rule (Shen and Chouchoulas, 2001; Lin and Tsai, 2001; Chen and Mynett, 2003a). These rules are added to the general rule base. Fig 3.16 is the SOFM analysis of total inorganic nitrogen (TIN), total inorganic phosphorus (TIP) and chlorophyll a (Chla) concentrations in the eutrophic Taihu Lake, China. It is seen from the result that three rules can be generated. This is a typical nitrogen limiting system where lipid is produced rather than chlorophyll a.

if TIN is high and TIP is middle then chlorophyll a is high
if TIN is middle and TIP is high then chlorophyll a is middle
if TIN is low and TIP is low then chlorophyll a is low

[Figure 3.16: plot of means for each cluster (standardised data), Clusters No. 1, 2 and 3]

Fig 3.16 SOFM analysis of TIN, TIP and Chla in the eutrophic Taihu Lake, China (Adapted from Chen and Mynett, 2003a)

In case reasoning, each case is studied and the rules are generated through statistical analysis of all individual cases. The whole learning process goes through 5 steps:

1. fuzzification of all cases: consider the original data of case $i$ given by $[x_1^i, x_2^i, \ldots, x_j^i, \ldots, x_n^i; y^i]$, where $x_j$ are input variables and $y$ is the output. According to $\max(f_j^1(x_j), f_j^2(x_j), \ldots, f_j^k(x_j))$, where $f_j^k(x_j)$ is the $k$-th membership function of $x_j$, it can be fuzzified into $[u_1^i, u_2^i, \ldots, u_j^i, \ldots, u_n^i; u_y^i]$, and expressed in linguistic terms such as [high, low, ..., middle, ...; middle]. This can then be considered as a rule generated from case $i$;
2. checking whether rules already exist in the general rule base; if so, neglect the particular generated rule; if not, continue;
3. checking whether newly generated rules conflict (same premise but different reasoning) with the general rule base; if there are conflicts, preference is given to the general rule base, eliminating the newly generated rules; if not, continue;
4. selecting relevant rules by the Bernoulli test (Krone and Taeger, 2001): let $p = n_C / n$ and $p_S = n_{CS} / n_S$, where $n$ is the total number of cases, $n_C$ is the number of cases with a certain reasoning $C$, $n_S$ is the number of cases with a certain premise $S$, and $n_{CS}$ is the number of cases with reasoning $C$ under the premise $S$. If the confidence intervals of $p$ and $p_S$ do not intersect, the rule "If $S$ then $C$" is a positively relevant rule;

5. checking for any conflict among the remaining newly generated rules: if conflicts arise, count the number of cases of each conflicting rule and weigh them according to their occurrences; if there are no conflicts, add the rules directly to the rule base.

A decision tree is a modelling technique which provides qualitative discrete outputs of a system under certain conditions represented by parameters (Solomatine and Dulal, 2003; Chen and Mynett, 2004b). It splits the parameter domain into sub-domains, and the system output for each sub-domain is learned from historical records. When a new condition comes up, the system output is predicted by looking up which sub-domain it belongs to. A split point is a node and the corresponding output is a leaf. A node can be further split, and the whole procedure results in a tree-like modelling structure. Therefore, the key issue in constructing a decision tree model is to find the right attribute(s) and optimal splits of the parameter domain (Quinlan, 1992). The splitting criterion is to have maximum entropy gain:

$$Gain(S, A) = E(S) - \sum_{v} \frac{|S_v|}{|S|} E(S_v) \qquad (3.21)$$

where $Gain(S, A)$ is the entropy gain of samples $S$ split on attribute $A$, $E(S)$ is the entropy of the samples $S$, $S_v$ are the samples belonging to subset $v$ and $E(S_v)$ is its entropy. The computation of entropy is given by

$$E(S) = -\sum_{i} p_i \log_2 p_i \qquad (3.22)$$

in which $p_i$ is the proportion of samples that enter subset $i$. To illustrate the splitting, an example is presented here. Suppose a tennis player decides whether or not to play according to outlook and temperature; 14 records are collected, given in Table 3.1.

Table 3.1 Records of a tennis player

Record  Outlook   Temperature  Play?
1       Sunny     Hot          No
2       Sunny     Hot          No
3       Overcast  Hot          Yes
4       Rainy     Mild         Yes
5       Rainy     Cool         Yes
6       Rainy     Cool         No
7       Overcast  Cool         Yes
8       Sunny     Mild         No
9       Sunny     Cool         Yes
10      Rainy     Mild         Yes
11      Sunny     Mild         Yes
12      Overcast  Mild         Yes
13      Overcast  Hot          Yes
14      Rainy     Mild         No

There are 9 "yes" and 5 "no", so the entropy of the samples is

$$E(S) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.94 \qquad (3.23)$$

If split on the attribute "Outlook", the entropy gain is

$$Gain(S, Outlook) = 0.246 \qquad (3.24)$$

And if split on the attribute "Temperature", the entropy gain is

$$Gain(S, Temperature) = 0.029 \qquad (3.25)$$

Therefore, the best split is on the attribute "Outlook". The same procedure is continued until no significant entropy gain is obtained. The constructed tree is illustrated in Fig 3.17.

[Figure 3.17: tree with root node Outlook, branches Sunny, Overcast and Rainy, and further Temperature nodes]

Fig 3.17 The decision tree of the simple example
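The entropy and entropy gain of the tennis example can be checked with a short script. The records are transcribed from Table 3.1; the function names are illustrative and not part of any particular library.

```python
import math
from collections import Counter

# (outlook, temperature, play) for records 1 to 14 of Table 3.1
RECORDS = [
    ("Sunny", "Hot", "No"), ("Sunny", "Hot", "No"),
    ("Overcast", "Hot", "Yes"), ("Rainy", "Mild", "Yes"),
    ("Rainy", "Cool", "Yes"), ("Rainy", "Cool", "No"),
    ("Overcast", "Cool", "Yes"), ("Sunny", "Mild", "No"),
    ("Sunny", "Cool", "Yes"), ("Rainy", "Mild", "Yes"),
    ("Sunny", "Mild", "Yes"), ("Overcast", "Mild", "Yes"),
    ("Overcast", "Hot", "Yes"), ("Rainy", "Mild", "No"),
]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def entropy_gain(records, attribute_index):
    """Entropy of all records minus the weighted entropy of the subsets."""
    subsets = {}
    for record in records:
        subsets.setdefault(record[attribute_index], []).append(record[-1])
    gain = entropy([record[-1] for record in records])
    for labels in subsets.values():
        gain -= len(labels) / len(records) * entropy(labels)
    return gain


total_entropy = entropy([record[-1] for record in RECORDS])
gain_outlook = entropy_gain(RECORDS, 0)
gain_temperature = entropy_gain(RECORDS, 1)
print(round(total_entropy, 3), round(gain_outlook, 3), round(gain_temperature, 3))
```

The computed values agree with equations 3.23 to 3.25 to within rounding, and the much larger gain for "Outlook" is what makes it the first split.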

However, the computational load clearly increases rapidly with the number of divisions and the number of attributes. In addition, over-fitting may appear due to relentless splitting. Therefore, the constructed tree needs to be pruned to prevent these problems (Solomatine and Dulal, 2003).

The idea of the model tree is similar to that of decision trees except that at each leaf there is a multi-variable linear regression function instead of a discrete constant output (Quinlan, 1992; Witten and Frank, 2000; Chen and Mynett, 2004b), as illustrated in Fig 3.18. Therefore, it can predict continuous numerical outputs. Due to the splitting of the parameter domain, the multi-regression function at each leaf usually has few variables even if there are many parameters in total. Although the regression function at each leaf is linear, the function on the whole domain is non-linear. The splitting criterion in piecewise regression is to have maximum standard deviation reduction, given by:

$$SDR(S, A) = SD(S) - \sum_{v} \frac{|S_v|}{|S|} SD(S_v) \qquad (3.26)$$

where SDR(S, A) is the S t a n d a r d de\ iation reduction o f samples S split o n attribute A, SD{S) is the Standard deviation o f samples S, 5, are samples belonging to subset v, and SD(Sy) is its S t a n d a r d deviation.

Fig 3.18 An illustration of piecewise regression model

Taking the same procedure as in decision trees to split the parameter space, and setting up a multi-variable regression at each leaf, a model tree is then constructed. To prevent possible over-fitting from relentless division and to increase interpretability, the constructed model must be pruned, for example by replacing a node with a leaf. At the final stage, a smoothing process is performed to compensate for the sharp discontinuities that will inevitably occur between adjacent linear models at the leaves of the pruned tree, in particular for models constructed from a small number of records.

3.4 Rule based algal bloom modelling in practice

Case studies of modelling harmful algal blooms by integrated numerical-artificial intelligence approaches have been conducted both in freshwater and brackish water systems in this research. The following sections present the studies in the eutrophic Taihu Lake, China and along the Dutch coast of the North Sea.

3.4.1 Fuzzy logic modelling Cyanobacteria bloom in Taihu Lake

Taihu Lake (N 30°56'~31°53', E 119°54'~120°36') is located in the developed and highly populated Yangtze Delta and is the third largest shallow freshwater lake in China (Fig 3.19). The water depth ranges from 1 to 2.5 m, with an average of 1.89 m. The total water area is about 2338.11 km² and the mean water volume is around 44.297 × 10^8 m³. The lake is well mixed almost all the time. Due to excessive discharge from agriculture and sewage, the lake exhibits serious problems of eutrophication. Especially in summer, blue-green algae (Cyanobacteria) blooming may become very severe in some local areas (TBA, 1998; Chen, 2001a). Fig 3.20 is the remote sensing image of the Cyanobacteria bloom in Taihu Lake in the year 2000. The blooms cause great damage to recreation, water supply and aquatic fisheries (TBA, 1998; Chen, 2001a).


Fig 3.19 The map of Taihu Lake basin and studied area (from TBA, 1998)

Fig 3.20 Block diagram of the model process (dash line means relevant techniques)

A fuzzy logic model was developed for prediction of algal biomass (Chla) concentration in this eutrophic lake, integrating data mining techniques and heuristic knowledge (Fig 3.21). Principal component analysis (PCA) was used to identify major abiotic driving factors and to reduce dimensionality. The SOFM technique and heuristic knowledge were applied jointly for membership function definition and inference rule induction.

[Block diagram: Raw dataset, PCA, Reduced dataset, SOFM, Clusters, Membership functions and Inference rules (with Heuristic knowledge), FL model]

Fig 3.21 Block diagram of the model process (dash line means relevant techniques)

Biweekly water quality data were collected from 1991 to 1995, and the sampled parameters included station, time, pH, conductivity, and concentrations of BOD, DO, NH4+, NO3-, NO2-, total inorganic phosphorus (TIP) and chlorophyll a (Chla). Total inorganic nitrogen (TIN) was calculated as NH4+ + NO3- + NO2-. The N/P ratio was calculated as TIN/TIP. The monitoring system comprised 32 regular sampling stations, which almost covered the dynamics of the whole aquatic ecosystem of the lake. Observations were taken from April to October each year, with a sampling frequency of once every half-month. The data from the 5 stations in the Northwest area (Fig 3.19), where the most serious blooming occurred, were chosen for study. Three of them (S13, S14, S16) were used for model construction and the other two (S12, S17) for model testing.

According to correlation analysis, TIN, TIP, N/P and NH4+ were preliminarily selected as abiotic driving factors for Chla. All observations of a particular variable from different stations were pooled together as if they had all been made at the same station (Petersen, et al., 2001). After eliminating the cases with missing data, there were 195 pairs of samples left for model construction.

In the raw data sets, some variables have large variation or spread, which may lead to trivial results in PCA and SOFM analysis due to the non-commensurate units between different variables (Haan, 1977; Boogaard, et al., 1998). Therefore the raw data set $X_k = (x_{1k}, x_{2k}, \ldots, x_{nk})$ should be normalised into $\xi_k = (\xi_{1k}, \xi_{2k}, \ldots, \xi_{nk})$, where n is the number of cases and k is the index of variables:

$$\xi_{ik} = \frac{x_{ik} - \mu_k}{\sigma_k} \qquad (3.27)$$

where $\mu_k$ and $\sigma_k$ are the mean and the standard deviation of variable k over the n cases (equations 3.28 and 3.29).
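The normalisation step can be sketched as follows. This is only an illustration of a standard z-score transform: the input values are hypothetical, and the choice of the sample (n - 1) standard deviation is an assumption, since the estimator used in the study is not recoverable from the text.

```python
import statistics

def standardise(column):
    """Return (x - mean) / standard deviation for every value in `column`."""
    mu = statistics.mean(column)
    sigma = statistics.stdev(column)  # ASSUMPTION: sample (n - 1) estimator
    return [(x - mu) / sigma for x in column]

# Hypothetical TIN observations in mg/l, for illustration only
tin_raw = [1.2, 2.5, 4.1, 3.3, 1.9]
tin_std = standardise(tin_raw)

print([round(z, 2) for z in tin_std])
```

After the transform every variable has zero mean and unit spread, so variables measured in different units contribute comparably to PCA and SOFM.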


Dimensionality reduction in a fuzzy logic approach can not only exponentially reduce rule set size and improve efficiency, but also help experts to formulate inference rules (Shen and Chouchoulas, 2001; Chen and Mynett, 2004a). PCA is a multivariate linear technique useful for data reduction which enables highly correlated variables to be reduced to a small number of orthogonal components (Ould-dedah, et al., 1999). A typical criterion for selecting principal components is 75% to 95% of total variance. This criterion sometimes results in too drastic a reduction, so an alternative criterion is to select those components that account for a higher than average variance (Weiss and Indurkhya, 1998). If a variable has no significant factor loading with a principal component, then that variable is not contributing much to the variance of this component and therefore it can be eliminated from further consideration (Haan, 1977). PCA has been widely used in data reduction (Legendre and Legendre, 1998; Park and Park, 2000; Shen and Chouchoulas, 2001), identification of major factors and processes of ecosystems (Park and Park, 2000; Petersen, et al., 2001), and analysis of spatial and temporal patterns of ecosystems (Baldacci et al., 2001; Ould-dedah, et al., 1999).

Because Chla is the model output, PCA is performed only on the abiotic driving factors TIN, TIP, N/P and NH4+ (Table 3.2, Table 3.3). From Table 3.2, it is found that only the first two components account for a higher than average variance, so they are selected as principal components. The factor loadings in Table 3.3 reveal the different contributions of the four factors to these two principal components. The results show that only TIN and TIP are the significant abiotic driving factors. This result is consistent with the cross-correlation coefficients between TIN, TIP and Chla, which are 0.52 and 0.32 respectively. Therefore, TIN and TIP are selected as model input variables. It is seen that PCA successfully reduced the dimension of inputs from four to two.

Table 3.2 Eigenvalues and explained ratio
Component    Eigenvalue    Explained ratio
1            1.437         0.359
2            1.067         0.267
3            0.962         0.241
4            0.534         0.134
Σ            4.00          1.00

Table 3.3 Factor loading of variables
Variables    Component 1    Component 2
TIN          0.202          0.935
TIP          0.848          -0.165
N/P          -0.700         0.309
NH4+         0.420          -0.263
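The "higher than average variance" criterion can be checked directly against the eigenvalues in Table 3.2. The short sketch below only reuses the tabulated values; the variable names are illustrative.

```python
# Eigenvalues of the four components, as listed in Table 3.2
eigenvalues = [1.437, 1.067, 0.962, 0.534]

total = sum(eigenvalues)
explained = [ev / total for ev in eigenvalues]   # explained ratio per component
average = total / len(eigenvalues)               # average eigenvalue

# Keep the components whose eigenvalue exceeds the average
selected = [i + 1 for i, ev in enumerate(eigenvalues) if ev > average]
cumulative = sum(explained[i - 1] for i in selected)

print(selected)              # [1, 2]
print(round(cumulative, 3))  # 0.626
```

With standardised inputs the average eigenvalue is 1, so the criterion keeps components 1 and 2, which together explain about 63% of the total variance.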

Common to any FL model, the definition of membership functions and the induction of inference rules are the most difficult parts (Castro, et al., 2001; Krone and Taeger, 2001; Chen and Mynett, 2004a). They conventionally rely on the use of semi-qualitative "heuristic knowledge" (Kompare, et al., 1994). An alternative is to obtain them from real observations when data is available (Dzeroski, et al., 1997; Lin, et al., 2001) or from combinations when both data and heuristic knowledge are semi-qualitative (Recknagel, et al., 1994; Eau, et al., 2001).

An important and widely used technique for analysing multi-dimensional data is self organising feature maps. The clustering results can be used to provide information about the better location of the IF-part membership functions, and the extracted features can be used as inference rules (Lin and Tsai, 2001; Shen and Chouchoulas, 2001). The data of TIN, TIP and Chla were classified by a 3*3 lattice and three prominent clusters were found (Fig 3.22). The mean and 95% confidence intervals of each cluster are given in Fig 3.23 ~ 3.25. No overlap is found between the clusters, therefore the classification is statistically satisfactory and the mean of each cluster (Table 3.4) is representative.

[Figure: plot of means for each cluster (standardised data)]

Fig 3.22 Results of SOFM cluster analysis of TIN, TIP and Chla

Table 3.4 Mean of each cluster (after de-standardisation)
             TIN     TIP     Chla
Cluster 1    1.73    0.07    7.06
Cluster 2    4.45    0.12    12.11
Cluster 3    2.42    0.22    25.97

[Figures: whisker plots of TIN, TIP and Chla by cluster]

Fig 3.23 Mean and 95% confidence interval of each cluster of TIN

Fig 3.24 Mean and 95% confidence interval of each cluster of TIP

Fig 3.25 Mean and 95% confidence interval of each cluster of Chla

Taking the means as typical values which have the membership degree of 1.0, the membership functions for TIN and TIP were defined (Fig 3.26 and 3.27). Since output variables in the FL approach usually have no less than 5 linguistic terms (Von Altrock, 1995; Czogala and Leski, 2000), the 3 clusters of Chla were then manually interpolated linearly into 5 clusters, which gives the membership function in Fig 3.28.

Fig 3.26 Membership functions of TIN (input variable); terms low, middle, high; axis: concentration (mg/l)

Fig 3.27 Membership functions of TIP (input variable); axis: concentration (mg/l)

Fig 3.28 Membership functions of Chla (both for input and output); terms very low, low, middle, high, very high; axis: concentration (µg/l)

Fig 3.29 Membership functions of seasons (heuristic knowledge); axis: month (1 to 12)
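To make the step from cluster means to membership functions concrete, the following sketch builds piecewise-linear membership functions that peak (degree 1.0) at the TIN cluster means of Table 3.4. The shapes are an assumption: the figures are not recoverable here, so shoulder-shaped outer sets and triangular inner sets with feet on the neighbouring peaks are used purely for illustration, and `make_memberships` is a hypothetical helper name.

```python
def make_memberships(peaks):
    """Build piecewise-linear membership functions peaking at sorted `peaks`.

    ASSUMED shapes: the two outer sets are shoulders (degree 1.0 beyond the
    outer peak); inner sets are triangles whose feet sit on the neighbours.
    """
    peaks = sorted(peaks)

    def degree(x, k):
        p = peaks[k]
        if x == p:
            return 1.0
        if x < p:
            if k == 0:                      # left shoulder
                return 1.0
            left = peaks[k - 1]
            return max(0.0, (x - left) / (p - left))
        if k == len(peaks) - 1:             # right shoulder
            return 1.0
        right = peaks[k + 1]
        return max(0.0, (right - x) / (right - p))

    return [lambda x, k=k: degree(x, k) for k in range(len(peaks))]

# TIN cluster means from Table 3.4 (mg/l), sorted into low, middle, high
tin_low, tin_middle, tin_high = make_memberships([1.73, 2.42, 4.45])

x = 3.435  # halfway between the middle and high peaks
print(round(tin_low(x), 2), round(tin_middle(x), 2), round(tin_high(x), 2))
```

With this construction neighbouring sets overlap so that the degrees at any concentration sum to one, which is a common but not the only possible design choice.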

Empirical knowledge from experts indicates that algal growth is not only governed by available inorganic nutrients, but is also greatly affected by the previous biomass concentration and the current irradiation condition. Unfortunately, irradiation data are often not available, and neither are water temperature data. In order to still incorporate these effects, an input variable was added whose fuzzy set of four seasons (Fig 3.29) was constructed purely on the basis of expert experience. The effects of previous biomass concentrations were incorporated by using Chla_t as one of the model input variables. The same membership function as in Fig 3.28 is used for Chla_t. The model can thus be generalised by:

$$Chla_{t+\Delta t} = f(TIN_t,\ TIP_t,\ Chla_t,\ Season) \qquad (3.30)$$

where Δt is around 15 days here, and f is the inference rule that is to be discussed next.

The clusters obtained from SOFM analysis can also provide inference rules in addition to the membership function definitions (Lin and Tsai, 2001; Shen and Chouchoulas, 2001). Each feature (Fig 3.22) generates at most one rule, in this case:
(1) if low TIN_t and low TIP_t, then very low Chla_{t+Δt};
(2) if middle TIN_t and high TIP_t, then middle Chla_{t+Δt};
(3) if high TIN_t and middle TIP_t, then high Chla_{t+Δt}.
Since clusters 2 and 3 are crossed (Fig 3.28), they can be interpolated into another two rules:
(4) if middle TIN_t and middle TIP_t, then low Chla_{t+Δt};
(5) if high TIN_t and high TIP_t, then very high Chla_{t+Δt}.

It is found from the analysis that high TIP_t and low TIN_t did not result in correspondingly high Chla_{t+Δt}, because phytoplankton in this case forms lipids but not biomass. These rules conform to the previous findings that nitrogen was the locally limiting factor in areas where serious blooming occurred (Chen, 2001a). It was also confirmed by the species composition that Cyanobacteria Microcystis was dominant during the bloom period because it can fix molecular nitrogen (N2) when TIN was limited while TIP was excessive.

The rules about irradiation effects were also set up according to heuristic knowledge that can be generalised as: under the same TIN_t and TIP_t condition, from winter to spring and to summer, Chla_{t+Δt} concentration increases; and vice versa. For example:
if spring, middle TIN_t, middle TIP_t, low Chla_t, then low Chla_{t+Δt};
if summer, middle TIN_t, middle TIP_t, low Chla_t, then middle Chla_{t+Δt};
if winter, middle TIN_t, middle TIP_t, low Chla_t, then very low Chla_{t+Δt}.

Using the above 5 rules extracted from the data as a basis and incorporating the heuristic knowledge about irradiation effects, there are 15 rules defined for spring, 20 for summer, 15 for autumn and 10 for winter. A summary of these rules is given in Table 3.5.

Table 3.5 Summary of inference rules
Season    Basic rules    Chla_t                           Number of rules
Spring    5              low, middle, high                15
Summer    5              low, middle, high, very high     20
Autumn    5              middle, high, very high          15
Winter    5              very low, low                    10
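The rule counts in Table 3.5 appear to follow from combining the 5 basic rules with each Chla_t term considered in a season. The sketch below reproduces that arithmetic; reading the table as a simple product is an interpretation, not something the text states explicitly.

```python
BASIC_RULES = 5  # rules (1) to (5) extracted from the SOFM clusters

# Chla_t linguistic terms considered per season, as listed in Table 3.5
CHLA_TERMS = {
    "spring": ["low", "middle", "high"],
    "summer": ["low", "middle", "high", "very high"],
    "autumn": ["middle", "high", "very high"],
    "winter": ["very low", "low"],
}

# One rule per (basic rule, Chla_t term) pair in each season
rule_counts = {season: BASIC_RULES * len(terms)
               for season, terms in CHLA_TERMS.items()}
total_rules = sum(rule_counts.values())

print(rule_counts)  # {'spring': 15, 'summer': 20, 'autumn': 15, 'winter': 10}
print(total_rules)  # 60
```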

The constructed FL model was tested at two sampling sites, S12 and S17 (Fig 3.19), using the month, Chla_t and observed TIN_t and TIP_t values as inputs. The model outputs, as summarised in Table 3.6, are in general qualitatively in agreement with the field observations, especially for the cases of high concentration. The incorrect fuzzy predictions were mainly for the cases of low or middle concentration.

Table 3.6 Number of cases of incorrect fuzzy prediction
Station    L/M    L/H    M/L    M/H    H/L    H/M    Σ
S12        7      0      6      2      0      2      17
S17        4      0      10     0      0      0      14

L: low; M: middle; H: high; L/M: observation is low but model output is middle, and the same format is used for the others.

In order to conduct quantitative comparison, the outputs were defuzzified by the centre of gravity method, which is given by equation 3.31.

$$Df_{out} = \frac{\sum_i v_i\, M(B_i)}{\sum_i v_i} \qquad (3.31)$$

where v_i is the fired degree of rule i, and M(B_i) is the fuzzy mean of the corresponding output fuzzy set of rule i, that is given by equation 3.32, in which μ_B is the membership function of fuzzy set B on variable x:

$$M(B) = \frac{\int x\, \mu_B(x)\, dx}{\int \mu_B(x)\, dx} \qquad (3.32)$$
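A minimal numerical sketch of equations 3.31 and 3.32 is given below. The triangular output sets and the fired degrees are hypothetical, and the midpoint-rule integration is simply one convenient way to approximate the fuzzy mean.

```python
def triangle(a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    def mu(x):
        if a < x <= b:
            return (x - a) / (b - a)
        if b < x < c:
            return (c - x) / (c - b)
        return 0.0
    return mu

def fuzzy_mean(mu, lo, hi, steps=10_000):
    """Approximate M(B) = int x*mu(x) dx / int mu(x) dx on [lo, hi] (midpoint rule)."""
    dx = (hi - lo) / steps
    num = den = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        m = mu(x)
        num += x * m * dx
        den += m * dx
    return num / den

def defuzzify(fired, means):
    """Centre of gravity output: sum(v_i * M(B_i)) / sum(v_i)."""
    return sum(v * m for v, m in zip(fired, means)) / sum(fired)

# Hypothetical output sets and fired degrees, for illustration only
m_low = fuzzy_mean(triangle(0.0, 10.0, 20.0), 0.0, 20.0)
m_high = fuzzy_mean(triangle(10.0, 20.0, 30.0), 10.0, 30.0)
output = defuzzify([0.25, 0.75], [m_low, m_high])

print(round(m_low, 3), round(m_high, 3), round(output, 3))
```

Because the output is a weighted average of the fuzzy means, it can never fall below the smallest or rise above the largest fuzzy mean of the output sets.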

The defuzzified outputs are plotted together with observations (Fig 3.30) and the modelled and observed series are seen to match quite well. Two measures, root mean square error (RMSE) and coefficient of determination (R²), are used to quantitatively evaluate the model performance. RMSE measures the deviation of the modelled values from the actually observed values. R² represents the proportion of variation that has been explained or accounted for by the modelled values. The ideal value for RMSE is 0, and for R² it is 1. Their definitions are:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \hat{x}_i\right)^2} \qquad (3.33)$$

and

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(x_i - \hat{x}_i\right)^2}{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2} \qquad (3.34)$$

where n is the number of observations, $x_i$ is the particular observation, $\hat{x}_i$ is the model result and $\bar{x}$ is the mean of the observations.

For the Taihu Lake case, the computed RMSE and R² are 5.08 and 76% at S12, and 3.26 and 69% at S17 (Fig 3.31). From these values, it is seen that the model outputs are also quantitatively in agreement with the field observations. The model performance at S17 is observed to be worse than at S12. One reason from the model itself is that data from the light blooming area (S13) were mixed together with those from severe blooming areas (S14, S16) for model construction. As a result, the membership functions and inference rules were biased towards the severe blooming cases, while they may not be valid enough for the light blooming cases because these may follow different ecological mechanisms. This can be remedied by using spatially explicit model paradigms such as cellular automata (Chen, 2001b; Chen, et al., 2002a), which can implement different model processes if spatially different ecological mechanisms occur.

The model has not successfully captured the very high peaks, which is clearly seen at S12 (Fig 3.31). The same holds for the very low values, as is evident at S17 (Fig 3.31). One reason is that the centre of gravity method was used here for defuzzification of the output Chla, which gives a minimum value of 3.55 µg/l and a maximum of 34.18 µg/l. As a result, observations outside this range cannot be reproduced by the model. A similar phenomenon was observed in Ozelkan and Duckstein (2001). Another reason is that the model variables are divided into three or five classes, which is too coarse to be precise. This could be improved if a finer division is applied.
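The two performance measures of equations 3.33 and 3.34 can be computed as in the sketch below. The observed and modelled series are hypothetical and serve only to show the arithmetic.

```python
import math

def rmse(observed, modelled):
    """Root mean square error between observations and model results."""
    n = len(observed)
    return math.sqrt(sum((o - m) ** 2 for o, m in zip(observed, modelled)) / n)

def r_squared(observed, modelled):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - m) ** 2 for o, m in zip(observed, modelled))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical Chla series in µg/l, for illustration only
obs = [5.0, 12.0, 30.0, 18.0]
mod = [7.0, 10.0, 26.0, 19.0]

error = rmse(obs, mod)
r2 = r_squared(obs, mod)
print(round(error, 2), round(r2, 3))  # 2.5 0.926
```

Note that RMSE carries the units of the variable, whereas R² is dimensionless, so a station with a smaller RMSE can still have a lower R² if its observations vary less.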

[Figure: time series of observations and model output at S12 (top panel) and S17 (bottom panel); x-axis: time (2 observations/month, April ~ October of each year), 1991 to 1995]

Fig 3.30 Plot of observations vs. simulations at test site S12 (top) and S17 (bottom). 2 data points per month, from April to October each year. There is no observation between the gaps. The first observation of each April is used as initial condition, so there is no model output for this point.

[Figure: scatter plots of observed vs. simulated values; R² = 0.7621 at S12 and R² = 0.6903 at S17]

Fig 3.31 Scatter plot of measured vs. simulated data at testing site S12 (left) and S17 (right)

A very slight phase error was also observed in the model, which exhibits a small time lag at turning points. The phenomenon may come from (1) the use of Chla_t as a model input variable; (2) some other factors that in reality change the growth trend but were not taken into account. This drawback is not serious, in particular when the time interval Δt is sufficiently small (Huarng, 2001).

Despite the drawbacks, the model is acceptable for qualitative prediction (without defuzzification). The results are also promising for quantitative prediction (with defuzzification), especially if the membership functions and inference rules are improved further by incorporating some optimisation and sensitivity analysis techniques. Spatially explicit model paradigms such as cellular automata (Minns, et al., 2000; Chen, 2001b; Chen, et al., 2002a) can be applied to cope with spatially different ecological mechanisms if they appear. The detailed coupling of fuzzy logic and cellular automata is presented in Chapter 4.

3.4.2 Fuzzy logic modelling algal biomass in Dutch coastal waters

The case study of the Taihu Lake applied the feature reasoning technique in the fuzzy logic model. In this section, a case reasoning approach is introduced in fuzzy logic modelling of algal biomass in the Dutch coastal waters.

Fig 3.32 Solitary flagellated and colonial non-flagellated Phaeocystis cells (ref. L. Peperzak)

[...] 0.2 (i.e. 0.0062 mg/l) inorganic phosphorus and values > 100 Wh m⁻² day⁻¹ irradiance (Peperzak, et al., 1998). A high nitrate-ammonia ratio is believed to promote the colonial life form as well (Riegman, et al., 1992). Relatively high growth rates (> 0.5 day⁻¹) occur at salinity levels of 20 - 35 psu within a temperature range of 7 - 22 °C and a daily irradiance value of > 100 Wh m⁻² day⁻¹. No photoinhibition was observed in this area.

Biweekly data were collected at 17 stations (Fig. 3.34) during the period from May 1975 to March 1983. The observations include temperature, pH, salinity, SiO2, total inorganic phosphorus, NO2-, NO3-, NH4+, chlorophyll a and others, totalling 18 parameters. Since there is no direct observation of Phaeocystis concentration, it is assumed on the basis of laboratory bio tests (Peperzak, 2002) that any Chla concentration above 30.0 µg/l represents a Phaeocystis bloom. Since the data at station Noordwijk 2 (NW2) are more complete, they were selected for this research.

According to the knowledge on Phaeocystis physiology (Lancelot, et al., 1987; Riegman, et al., 1992; Peperzak, 2002) and correlation analysis (Chen, et al., 2003), salinity (S), temperature (T), total inorganic nitrogen (TIN) and total inorganic phosphorus (TIP) were used as inputs to predict the Chla concentration. No data were available on wind or solar irradiance, although they are known to be important factors in the studied field (Peperzak, 2002). The total data set contained 171 records, of which 145 (May 1975 ~ Dec 1981) were selected for model construction and 26 (Jan 1982 ~ Mar 1983) were used for verification. According to the observations, the salinity values are between 19 and 31 psu, which is within the range of optimal growth; therefore, salinity is not seen to be a significant factor in the studied area.
Following the knowledge from ecologists that high growth rates occur when the temperature T is between 7 and 22 °C, the typical values for defining the membership function for T are set as low (7 °C), middle (14.5 °C) and high (22 °C). A SOFM analysis was performed which led to distinguishing three predefined classes for TIN and TIP and five classes for Chla respectively (Fig. 3.35). The resulting values used to define the membership functions are summarised in Table 3.7.

Table 3.7 Typical values for definition of membership functions
Variables    Very low    Low      Middle    High     Very high
T                        7        14.5      22
TIN                      0.076    0.16      0.25
TIP                      0.052    0.084     0.129
Chla         1.92        7.04     14.89     27.09    56.53

Fig 3.35 Membership functions of TIN (left, concentration in mg/l) and Chla (right, concentration in µg/l)

In rule form, the ecologist's knowledge can be generalised as

if TIN_t is high and TIP_t is middle and T is not low, then Chla_{t+Δt} remains increasing

where Δt is about 15 days. The logic is "when available inorganic nutrients are enough and irradiance is not limited, then algal concentration will not be low". This rule serves as a benchmark to accept or reject the rules generated from the data.

[Figure: reproduction of learning cases]

Fig 3.36 Model reproduction of learning data (time interval ~ 15 days)

The extended case reasoning strategy is applied for rule generation. After the membership functions of each variable are defined, the 145 learning data are fuzzified following the procedures described in case reasoning. From these 145 records a limited set of 15 rules is learned. The model was then tested on the learning data to check whether it could reproduce them. Such a test can also reveal whether the generated rules are representative or not. From the test results (Fig 3.36) it is apparent that the developed model can reproduce the dynamic patterns of the system, especially in a qualitative way (viz. predicting whether or not an algal bloom occurs).

After this test, the constructed fuzzy model was verified using the other 26 records (Jan 1982 ~ Mar 1983). According to the assumption that a Chla concentration above 30.0 µg/l represents a Phaeocystis bloom, two bloom events are observed in the data. It is found that the model successfully captures these two blooms (Table 3.8). Therefore, with respect to algal bloom alarm, the model performance is acceptable. In order to conduct a quantitative comparison, the model outputs are defuzzified by the centre of gravity method with normalised weighted sum combination, which is given by equation (3.35).

$$Df_{out} = \frac{\sum_i w_i\, v_i\, M(B_i)}{\sum_i w_i\, v_i} \qquad (3.35)$$

where Df_out is the numerical output, w_i is the weight associated with rule i, v_i is the fired degree of rule i, and M(B_i) is the fuzzy mean of the corresponding output fuzzy set of rule i.

Table 3.8 Comparison between modelled results and observations
                 Observed    Qualitative comparison    Quantitative comparison
                 blooms      Correct     Wrong         RMSE     R²
Training data    7           4           3             9.6      45%
Testing data     2           2           0             7.0      83%

Fig 3.37 Chla concentration at NW2 (Δt = 15 days)

Fig 3.38 Scatter plot and R²

The defuzzified outputs are plotted together with observations for comparison (Fig 3.37, 3.38). The two series are seen to match closely, and R² goes up to 0.83. Although there seems to be a considerable discrepancy at point 14, closer analysis of the procedure revealed that no rule was generated from the learning data for this particular case. As a result, the average value was assigned by the fuzzy model as default, while the actual measurement indicated a very low concentration. It is also seen that the model fails to numerically reproduce the very low or very high values. One possible reason is that the model variables are divided into three or five classes, which is too coarse to be accurate. This could be improved by applying a finer division into subclasses if quantitative prediction is of interest (Bardossy and Duckstein, 1995). The results have also been compared to those of an artificial neural network model for the same data set (Chen, et al., 2002c). The overall conclusion is that the fuzzy logic model can be a good tool for algal bloom alarm, provided irradiance data from meteorological observations and nutrient data from a hydrodynamic model, e.g. the Delft3D system, are available (Chen and Mynett, 2003a).

3.4.3 Decision tree modelling P. globosa bloom

Due to the lack of data about specific species concentrations (cell number per litre), both of the preceding fuzzy logic models took chlorophyll a as the bloom indicator. However, algal blooms are usually dominated by a single species, and the prediction of that particular species is the main interest of HABs models. For example, the objective of the Dutch pilot study within the EU-HABES project is specified as modelling the bloom timing, intensity and duration of P. globosa. The use of the lumped indicator Chla is too rough to meet this requirement.

Reconsidering the problem, which is to answer whether P. globosa will bloom (Yes) or not bloom (No) and what the bloom intensity will be under certain conditions, the problem actually becomes a decision-oriented and localised issue. Suitable techniques for such problems include decision trees and piecewise regression. The research is intended to introduce these techniques to the Dutch pilot study at the second stage of the HABES project. The decision tree model is able to predict bloom timing (bloom or not bloom on a certain day) and the nonlinear piecewise regressions can predict the bloom intensity (cell concentrations) on the basis of available meteorological data. A multi-variable regression model has been developed to predict bloom duration if the decision tree forecasts blooming to take place. Since field observations and laboratory experiments indicate that irradiance plays an important role in this particular ecosystem, three different scenarios considering irradiance conditions have been investigated. It is found from the modelling exercise that using the mean water column irradiance of the photic zone provides the best model performance.

The studied area Noordwijk 10 (Fig 3.34) is about 10 km off the Dutch coast near the village of Noordwijk. The water depth at the location is approximately 19 m. The water is usually well mixed. Weak stratification, mainly due to salinity differences, appears only temporarily (de Kok, et al., 2001) with a mixing depth of about 7.5 m. The yearly averaged percentage of freshwater from the Rhine is around 10%, which means the river inflow may have a certain influence on the seasonal and annual variation of the phytoplankton biomass (Gieskes and Schaub, 1990; Cadee, 1991; Schaub and Gieskes, 1991).
Ten years (1990-1999) of data have been collected: (1) meteorological data that contain daily information on irradiance (I), air temperature, wind speed and wind direction; (2) hourly data of river discharge at station Maasluis, which is located at the river Rhine outlet; (3) water quality data that have different sampling frequencies. In spring and summer (April ~ August), sampling was performed every week (sometimes more frequently during bloom); in winter generally once a month. The water quality data contain 28 parameters, including water temperature (WT), pH, suspended solid (ss), salinity (sal), attenuation coefficient (K_d), NH4+, NO2-, NO3-, PO4^3-, SiO2, Chla concentration and P. globosa cell concentrations. Dissolved inorganic nitrogen (DIN) is calculated as the sum of NH4+, NO2- and NO3-. The surface photosynthetically active radiation (PAR, I_0) is given by (Peperzak, 1993):

$$I_0 = 0.45\, I \qquad (3.36)$$

Because of the variation in sampling frequency, only the data from March to September of each year are used in the study. After eliminating the records with missing data, there are finally 261 records left. Data from 1995 ~ 1997 (81 records) are used for independent model testing and the others (180) are used for model construction.

According to the Lambert-Beer Law, the mean water column irradiance in the photic zone can be computed by:

$$I_m = \frac{I_0}{K_d Z_p}\left(1 - e^{-K_d Z_p}\right) \qquad (3.37)$$

in which I_0 is the incident irradiance at the surface (i.e. PAR), K_d is the attenuation coefficient (m⁻¹) and Z_p is the depth of the photic zone (m), which is defined as the depth at which irradiance falls to 0.01 I_0 (Brush, et al., 2002; Brawley, et al., 2003). Therefore, Z_p is given by

$$Z_p = \frac{4.61}{K_d} \qquad (3.38)$$

If the photic depth Z_p is larger than the water depth Z, then Z_p = Z. However, de Kok et al (2001) pointed out that weak stratification appears temporarily in the study area in spring and early summer, with a mixing depth Z_m of 7.5 m. It is suggested to use this depth to compute the mean water column irradiance. For comparison, the research investigated three different scenarios regarding the computation of I_m (Table 3.9).

Table 3.9 Scenarios different in considering water column irradiance
Scenario    I_m
1           Use weekly mean I_0 directly
2           Computed by using Z_p
3           Computed by using Z_m
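The three irradiance scenarios can be compared with a short sketch. It assumes the usual depth average of the Lambert-Beer profile, I_m = I_0 (1 - exp(-K_d Z)) / (K_d Z), as the form of the mean water column irradiance. The daily irradiance value is hypothetical, while K_d = 0.55 m⁻¹ is the in situ estimate for NW10 quoted in this section and the 19 m water depth and 7.5 m mixing depth are the values given for the site.

```python
import math

def surface_par(irradiance):
    """Surface PAR from total irradiance, I_0 = 0.45 * I (equation 3.36)."""
    return 0.45 * irradiance

def photic_depth(kd, water_depth):
    """Depth of 1% surface light, Z_p = 4.61 / K_d, capped at the water depth."""
    return min(4.61 / kd, water_depth)

def mean_column_irradiance(i0, kd, depth):
    """ASSUMED form: depth average of I_0 * exp(-K_d * z) over [0, depth]."""
    return i0 * (1.0 - math.exp(-kd * depth)) / (kd * depth)

i0 = surface_par(1000.0)   # hypothetical daily irradiance, for illustration
kd = 0.55                  # m^-1, constant estimate quoted for NW10
z_p = photic_depth(kd, water_depth=19.0)

scenario_1 = i0                                         # use I_0 directly
scenario_2 = mean_column_irradiance(i0, kd, z_p)        # photic depth Z_p
scenario_3 = mean_column_irradiance(i0, kd, depth=7.5)  # mixing depth Z_m

print(round(z_p, 2), round(scenario_2, 1), round(scenario_3, 1))
```

Averaging over the shallower mixing depth gives a higher mean irradiance than averaging over the photic depth, so the choice of scenario changes the irradiance thresholds a tree would learn.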

Since I_0 can be easily obtained from I by equation 3.36, the problem then becomes the estimation of the attenuation coefficient K_d. Peperzak et al (1993) used in situ measurements and estimated that the K_d at NW10 is 0.55 ± 0.26. As a result, a constant value of 0.55 was used in his model. Colijn (1983) and Riegman and Hemdl (2002) found that K_d is mainly determined by suspended matter (ss), salinity (sal) and phytoplankton (Chla). Therefore, in this study, K_d is estimated from ss, salinity and Chla by a linear regression function:

$$K_d = a + b \cdot ss + c \cdot sal + d \cdot Chla \qquad (3.39)$$

where the most important factor is suspended matter, which varies considerably with tidal cycles and wind (Cadée, 1982; Riegman and Hemdl, 2002; Colijn and Cadée, 2003). There are 75 records within the data that contain measured K_d, ss, sal and Chla. The multiple regression function is obtained by the least square error method with an R² of 0.41:

$$K_d = 2.416 + 0.0224 \cdot ss - 0.0664 \cdot sal + 0.0066 \cdot Chla \qquad (3.40)$$

It is seen (in normalised form) that the contribution from phytoplankton is small compared to the other two factors, and it is usually more difficult to obtain Chla data than ss and sal. However, under the HABES project, there is a Smartbuoy installed at NW10 which can instantly provide hourly data of ss and sal. Therefore, it will be practically more useful if the Chla component can be dropped without significant adverse effect on the results. Similarly, another regression function is obtained with R² = 0.39, and the predicted values vs. observations are plotted in Fig 3.39:

$$K_d = 2.028 + 0.0238 \cdot ss - 0.0521 \cdot sal \qquad (3.41)$$

Fig 3.39 Regression vs. observation of K_d


Compared to the first regression, there is no significant difference. Therefore, for practical application of the model, the pilot study adopted the second regression function for estimating $K_d$ values.
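The two regression functions above are simple enough to evaluate by hand, but a short Python sketch makes the comparison concrete. The function name and the example inputs (ss = 10, sal = 30, Chla = 5) are hypothetical and chosen only for illustration; the units are assumed to be those of the source data, which this section does not state.

```python
def attenuation_coefficient(ss, sal, chla=None):
    """Estimate the light attenuation coefficient K_d.

    With chla=None the two-variable regression (Eq. 3.41) is used,
    otherwise the three-variable regression (Eq. 3.40).
    Units of ss, sal and chla are assumed to match the source data.
    """
    if chla is None:
        # Eq. 3.41: suspended matter and salinity only
        return 2.028 + 0.0238 * ss - 0.0521 * sal
    # Eq. 3.40: suspended matter, salinity and chlorophyll a
    return 2.416 + 0.0224 * ss - 0.0664 * sal + 0.0066 * chla


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only
    print(round(attenuation_coefficient(10.0, 30.0), 3))       # Eq. 3.41
    print(round(attenuation_coefficient(10.0, 30.0, 5.0), 3))  # Eq. 3.40
```

For these illustrative inputs the two functions give about 0.703 and 0.681, which is in line with the statement that dropping the Chla term changes the estimate only slightly.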

Fig 3.40 Seasonal variation of meteorological, hydrological, nutrients and Phaeocystis

Basic statistical analyses of the data are conducted to understand the annual and seasonal dynamics of the studied ecosystem. The analysis includes annual and monthly means and standard deviations. Because water quality data have irregular sampling intervals, only monthly analysis was performed on them. The monthly analysis reveals seasonal variations that provide preliminary information about the relations between driving factors and P. globosa development. The annual analysis can possibly provide explanations for the differences in P. globosa conditions between different years. The seasonal variations of meteorological and hydrological conditions, nutrients and P. globosa are presented in Fig 3.40. The annual variations of meteorological and hydrological conditions and P. globosa are presented in Fig 3.41.



0.2 -> Bloom = N
Rule 2: WKI_m > 159.56, DIP > 0.2, WKAirT < 11.7 -> Bloom = Y
Rule 3: WKI_m > 159.56, DIP < 0.2, WKAirT > 11.7, WKAirT < 13.3 -> Bloom = Y
Rule 4: DIP > 0.2, WKAirT > 11.7 -> Bloom = N
Rule 5: WKAirT > 13.3 -> Bloom = N
Default: Bloom = N

Evaluation on training data (180 cases, error rate 7.8%): (a) (b)

13.08 then BloomInt = 4.6293 - 0.34*WT + 0.18*WKAirT - 0.03*SiO2 - 0.28*DIP + 0.005*DIN + 0.0006*WKI_m
Rule 3: if WKI_m > 102.13 and WT < 13.08 then BloomInt = 8.8816 - 0.123*SiO2 - 0.15*WKAirT - 0.12*WT - 0.32*DIP + 0.005*DIN + 0.0006*WKI_m

To evaluate predictions of bloom intensity and duration, two numerical measures are applied: root mean square error (RMSE) and coefficient of determination ($R^2$). The evaluation of the model performance is presented in Table 3.11.

Table 3.11 Comparison between model outputs and observations of bloom intensity

                   Training data   Testing data
Number of cases    180             81
RMSE               2.11            2.20
$R^2$              0.52            0.45
Correlation ρ      0.72            0.70
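The bloom timing rules (Rules 2 to 5 and the default) can be sketched as a small Python function. This is only an illustration under stated assumptions: Rule 1 is not reproduced because its conditions are not legible in this section, the function and argument names are hypothetical, and the inputs are assumed to be weekly mean water column irradiance, DIP in μmol/l and weekly mean air temperature in °C. Rules 2 to 5 do not overlap, so the order in which they are checked does not affect the result.

```python
def predict_bloom(wk_im, dip, wk_air_t):
    """Return True if the listed rules predict a P. globosa bloom.

    wk_im    : weekly mean water column irradiance (assumed units as in the text)
    dip      : dissolved inorganic phosphorus (assumed umol/l)
    wk_air_t : weekly mean air temperature (assumed deg C)
    Rule 1 is omitted because its conditions are not recoverable here.
    """
    # Rule 2: enough light, DIP above the threshold, still cool -> bloom
    if wk_im > 159.56 and dip > 0.2 and wk_air_t < 11.7:
        return True
    # Rule 3: enough light, DIP depleted, intermediate temperature -> bloom
    if wk_im > 159.56 and dip < 0.2 and 11.7 < wk_air_t < 13.3:
        return True
    # Rule 4: DIP above the threshold but warm -> no bloom
    if dip > 0.2 and wk_air_t > 11.7:
        return False
    # Rule 5: late summer temperatures -> no bloom
    if wk_air_t > 13.3:
        return False
    # Default class
    return False


if __name__ == "__main__":
    print(predict_bloom(200.0, 0.3, 10.0))   # Rule 2 fires
    print(predict_bloom(200.0, 0.1, 12.5))   # Rule 3 fires
    print(predict_bloom(200.0, 0.3, 12.5))   # Rule 4 fires
```

Writing the rules out this way also shows how few inputs the classifier needs: only irradiance, DIP and air temperature appear in the conditions.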

During the basic statistical analysis of the data, it is observed that bloom duration is related to the bloom timing and the starting intensity. Other factors may include irradiance, river inflow and temperature. Since information on irradiance, river inflow and temperature can be easily obtained from meteorological stations (weekly weather forecast), and bloom timing and intensity can be predicted by the model, it is possible to set up a multi-variate regression model to predict bloom duration if a bloom occurs:

$$\mathit{BloomDur} = a + b \cdot \mathit{StartDay} + c \cdot \mathit{StartCon} + d \cdot \mathit{MeanI} + e \cdot \mathit{MeanQ} + f \cdot \mathit{MeanAirT} \qquad (3.43)$$

where BloomDur: bloom duration; StartDay: the day that the bloom starts; StartCon: the concentration of the bloom when it starts; MeanI: weekly mean instant irradiance at the surface; MeanQ: weekly mean discharge from Maassluis; MeanAirT: weekly mean air temperature; a, b, c, d, e, f: coefficients of the regression function.

The main task is to find the optimal set of coefficients of the driving factors that gives the least square error between regression outputs and observations, i.e.

$$\min \sum_{i=1}^{n} \left( \mathit{BloomDur}_i - \mathit{Obs}_i \right)^2 \qquad (3.44)$$

in which Obs is the observed bloom duration (days) and n is the number of records. By examining the normalised regression coefficients, the irrelevant parameters can be eliminated and the regression becomes simpler. A positive coefficient indicates that the factor has a positive relation with the model output, and a negative coefficient indicates a negative relation. Therefore, the multi-variable regression can not only predict bloom duration, but also provide information about important factors and their relations. The obtained regression function is:

$$\mathit{BloomDur} = 346.5 + 9.398 \cdot \mathit{MeanAirT} - 0.865 \cdot \mathit{StartDay} - 38.087 \cdot \mathit{StartCon} - 0.043 \cdot \mathit{MeanQ} + 0.01 \cdot \mathit{MeanI} \qquad (3.45)$$

with $R^2$ = 0.83. However, it is observed by checking the normalised regression coefficients that irradiance does not have a significant effect on the bloom duration provided it is higher than the threshold (102.13 Wh/m² day⁻¹), which is usually the case in the bloom season. Therefore, a final regression model was made without irradiance, given by:

$$\mathit{BloomDur} = 346 + 9.762 \cdot \mathit{MeanAirT} - 0.873 \cdot \mathit{StartDay} - 36.903 \cdot \mathit{StartCon} - 0.043 \cdot \mathit{MeanQ}$$

with $R^2$ = 0.83. Regression outputs and observations are presented in Fig 3.42.

Fig 3.42 Regression and observation of bloom duration (x-axis: predicted bloom durations (days); regression line with 95% confidence band)
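The final bloom duration regression (the version without irradiance) can be written as a one-line function, which is convenient for checking the sign and size of each effect. The sketch below is illustrative only: the function and argument names are hypothetical, and the units or transformations of the inputs (in particular of the starting concentration) are assumed to be those of the source data, which this section does not specify.

```python
def bloom_duration(mean_air_t, start_day, start_con, mean_q):
    """Predicted bloom duration (days) from the final regression without irradiance.

    Inputs are assumed to use the same units and scaling as the source data.
    """
    return (346.0
            + 9.762 * mean_air_t    # warmer weeks -> longer bloom
            - 0.873 * start_day     # later start -> shorter bloom
            - 36.903 * start_con    # higher starting concentration -> shorter bloom
            - 0.043 * mean_q)       # higher river discharge -> slightly shorter bloom


if __name__ == "__main__":
    # Sensitivity check with hypothetical inputs: shift the start by 10 days
    base = bloom_duration(10.0, 110.0, 3.0, 1400.0)
    later = bloom_duration(10.0, 120.0, 3.0, 1400.0)
    print(round(later - base, 2))  # change in duration caused by the later start
```

With all other inputs fixed, a bloom starting 10 days later is predicted to last 8.73 days less, which is the negative relation between duration and starting day discussed in the text.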

Three model scenarios have been investigated in the research, and scenario 2, which used the mean water column irradiance over the photic depth, gave the best performance. This also indicates the necessity of estimating the attenuation coefficient $K_d$. The multi-variate regression function that estimates $K_d$ from only ss and sal with acceptable accuracy is very applicable in practice because of the installation of the SmartBuoy, which provides hourly information on ss and sal.

Five rules are constructed from the data in the decision tree model, which shows notably good performance in predicting bloom timing. In addition to the low error rates, 7.8% for training data and 12% for testing data, the model successfully captured the bloom timing except in 1993 (Table 3.10). Several remarks can be made from the model trees:
1. The model is compact. Although 9 parameters are provided, only 3 (irradiance, air temperature and DIP) are needed by the model. These data are easily obtained from meteorological forecasts; therefore the model is applicable in practice.
2. The model has good interpretability. Rule 1: before the bloom starts, irradiance is limited; Rule 2: the bloom starts in spring and early summer when irradiance is higher than a certain threshold and DIP is above the minimum requirement; Rule 3: bloom duration, DIP is depleted; Rule 4: the bloom terminates; Rule 5: late summer, post bloom termination.
3. It is striking that the model found the threshold of DIP (0.2 μmol/l) without any predefined interference. This threshold value for P. globosa bloom had been independently discovered by Escaravage et al (1995) and Brussaard et al (2003) through laboratory experiments. When available DIP is lower than the threshold value, initiation of P. globosa bloom is constrained (Peperzak, 2002).

In the nonlinear piecewise regression model for predicting bloom intensity, three linear functions were constructed. From the quantitative evaluations in Table 3.12, the model gives acceptable performance. As far as the rules are concerned, some important points are the following:
1. All these rules can have ecological interpretations. Rule 1: the phase before P. globosa initiates, when irradiance is limited; Rule 2: the phase in which P. globosa initiates and develops towards a bloom; Rule 3: the phase in which P. globosa dies and degrades.
2. It is remarkable that the model found the irradiance threshold (102.13 Wh/m² day⁻¹) automatically. Peperzak (1998) independently identified the irradiance threshold of 100 Wh/m² day⁻¹ for P. globosa initiation by laboratory experiments.
3. However, as long as the irradiance is above the threshold, the effect of irradiance on bloom intensity is comparatively small, or even negligible.
This can be seen from the normalised coefficient in the regression function. This, on the other hand, confirms that irradiance just acts as a trigger to the bloom.
4. Temperature is also identified as an important factor, which is consistent with the conclusions of Egge and Aksnes (1992) and Peperzak et al (1998). However, the required water temperature is found to be 13.1 °C, which is a little lower than the value (14 °C) given by Peperzak (1993) and Hodgkiss et al (2000), who used laboratory experiments.

The multi-variate regression model for predicting bloom duration (if a bloom happens) compares very well with the field data ($R^2$ = 0.83), which is also shown in Fig 3.48. From the regression function, a negative relation between duration and bloom starting day was found. The possible reason is that if the P. globosa bloom starts late, other phytoplankton such as diatoms may already have begun developing and taking up the nutrients, in particular phosphorus. When P. globosa initiates, it is depressed by the low DIP, as in the year 2002. Brussaard et al (2003) and Ruardij et al (2003) discovered through experiments that in phosphorus depleted blooms, P. globosa is easily infected by viruses. Different from nitrogen depleted blooms, where P. globosa disintegrated into ghost colonies without sticking together into large aggregates, phosphorus depleted blooms form very large aggregates overnight and easily sink down to the bottom. As a result, the bloom completely collapses. This is also confirmed by the data: blooms collapsed when the phosphorus concentration was found to be lower than 0.2 μmol/l. A similar reason holds for the negative relation between duration and starting concentration. If the starting concentration is too high, the remaining resources such as DIP may become too low to sustain the bloom, so the bloom collapses in a short time. The river inflow during this period also has negative effects on P. globosa duration. However, it is not significant. The river discharge may stimulate the growth of other phytoplankton, presumably the second diatom bloom (Peperzak et al, 1998), through supply of silica. Irradiance is seen to have little effect on the bloom duration. This again confirms that irradiance just acts as one of the triggers to the bloom. As long as it is higher than the threshold, extra irradiance has almost no influence on the bloom.

In general, the model clearly indicates that the joint effects of water column irradiance ($I_m$), temperature ($T$) and dissolved inorganic phosphorus (DIP) determine bloom timing and intensity of P. globosa, while DIN is not a limiting factor in the ecosystem. The bloom duration depends on the bloom timing (starting day), starting intensity and temperature. Irradiance is seen to act just as one of the triggers to P. globosa bloom. As long as it is higher than the threshold, extra irradiance plays little role in bloom intensity or duration. River discharge from the Rhine does not have a significant immediate effect on the P. globosa bloom. The threshold values of $I_m$, $T$ and DIP independently found by the model are in accordance with those discovered by other researchers through laboratory experiments (Egge and Aksnes, 1992; Escaravage et al., 1995; Peperzak, 1998). These implicitly reflect interspecific competition with diatoms. The conclusion is consistent with the previous discoveries obtained through laboratory and mesocosm experiments (Egge, 1993; Brussaard et al, 1995; Peperzak et al, 1998; Peperzak, 2002). Due to the splitting of the parameter space by decision trees and piecewise regression, the model is capable of dealing with the common problem in algal blooms that the limiting factor changes. Therefore, decision trees and piecewise regression can be good alternative techniques in harmful algal bloom modelling at the species specific level.

3.5 Discussion

Algal blooms are a serious and complex worldwide problem that involves hydrodynamic and biological processes. Knowledge on HABs is usually semi-qualitative, while large ecological data sets hardly exist or the data is semi-qualitative due to high sampling cost. Therefore, modelling algal blooms is an ambitious and difficult topic. Owing to its ability to deal with imprecise, uncertain data or ambiguous relationships among data, fuzzy logic has proven to be a practical and successful approach to harmful algal bloom modelling. In the case study of Taihu Lake, this research developed a methodology that combines data mining techniques with heuristic knowledge for dimensionality reduction, membership function definition and inference rule induction, combined within a fuzzy logic approach. The method was shown to be promising, as indicated by the results of modelling algal biomass concentration in the eutrophic Taihu Lake. SOFM is only applicable when the data has clear clusters embedded, which is usually not the case in reality. Therefore, partitioning analysis in combination with case reasoning techniques was applied to a case study for the Dutch coast. The model developed within this research proved robust in predicting algal biomass (Chla) concentration at station Noordwijk 2 in the North Sea. However, algal blooms are usually dominated by a single species, and the prediction of the particular species instead of lumped biomass is the main interest of HABs modelling. In order to forecast P. globosa bloom along the Dutch coast, decision trees and piecewise regression methods have been introduced to develop a model on the basis of Noordwijk 10 data. The model was tested by an independent dataset from the same area, and the model results agree well with the observations both qualitatively and quantitatively, viz. regarding P. globosa bloom timing, intensity and duration. Therefore, decision tree modelling and piecewise regression analyses can be good alternative techniques for harmful algal bloom modelling at the species specific level.

In general, the fuzzy logic and decision tree models are intuitive and transparent. By integrating data with heuristic knowledge, these approaches can be applied in cases with sparse data and semi-qualitative knowledge. However, they are acknowledged to lack the ability to simulate system dynamics because the underlying physical processes are not coupled. Besides, these models are constructed and tested on the basis of point data. The spatial heterogeneity and local behaviour are not taken into account. Hence, the models have no capability to capture the patchiness phenomenon of HABs. Therefore, the next research activity is to couple the Delft3D water quality module (Delft-WAQ) with the rule based modules to build up an integrated algal bloom forecasting system. The Delft-WAQ module simulates abiotic factors while the rule based modules developed in this thesis predict the ecological dynamics for a range of conditions. The cellular automata paradigm is applied to realise spatial heterogeneity and local behaviour. More details on integrated numerical / fuzzy logic / cellular automata modelling, including some case studies, are presented in chapter 4.

Chapter 4

Cellular automata and rule based techniques in ecohydraulics modelling

In the context of this chapter, the integration of the cellular automata paradigm and rule based modelling techniques is investigated. Clearly, the dynamics of the EcoCA model introduced in chapter 2 are still purely dependent on geometric relations between neighbouring cells, and do not account for physical or biological processes. Therefore, the approach followed here is to extend the conventional cellular automata paradigm by incorporating external forcing. First a CA based model is developed to simulate the competition and succession of two macrophyte species in the eutrophic Lake Veluwe. The EcoCA and the macrophyte models use either deterministic rules or stochastic rules. However, there are many systems, in particular ecosystems such as HABs, where the detailed mechanisms and their statistical behaviour remain unclear. Hence, the developed fuzzy logic technique is introduced into the cellular automata paradigm for rule formulation, namely fuzzy cellular automata. It has been emphasised that hydrodynamics often drive the evolution of aquatic ecosystems. Hence, the coupling of flow and water quality is mandatory for any ecohydraulics model. In this research integrated numerical and decision tree modelling is explored first, followed by the integration of the fuzzy cellular automata module EcoCA with the numerically based Delft3D system. The instantiation of integrated models in predicting algal blooms along the Dutch coast substantiates the importance and prospects of combining different methods and different paradigms in future ecohydraulics modelling.

4.1 Cellular automata modelling macrophyte growth and succession

4.1.1 Description of study area

Lake Veluwe is an artificial isolated part of the larger Lake IJssel in the centre of the Netherlands. The total water surface is around 3300 ha, with an average depth of 1.4 m. It was formed by the construction of dams in the Southeast part of Lake IJssel in 1952 (Fig 4.1). According to long-term documentation, the submerged vegetation of the lake has experienced a great change after its formation due to the change in nutrient loading (Marcel, 1999). Before 1968, the water in the lake was clear, with diverse macrophyte vegetation. Due to the discharge of wastewater from some small cities, the lake was eutrophicated, and blue-green algae became dominant (Hosper, 1997). Some restoration measures were taken in the late 1970s, which resulted in the increase of P. pectinatus. The increase of P. pectinatus provided the precondition for the return of C. aspera. After 1990, C. aspera colonised steadily and gradually replaced the dominance of P. pectinatus. From an ecological point of view, it seemed that P. pectinatus would outcompete C. aspera in this lake system, since: (1) P. pectinatus has a better ability than C. aspera to live in moderately turbid water; (2) P. pectinatus germinates earlier and colonises the upper layer, which shades C. aspera in Lake Veluwe; (3) P. pectinatus is less sensitive to eutrophication level, especially to phosphorus concentration, than C. aspera. However, C. aspera outcompeted P. pectinatus and replaced it gradually in Lake Veluwe. Analysis of long-term observations indicated a self-reinforcing ability of C. aspera during eutrophication. C. aspera returned at a lower phosphorus level (0.1 mg/l) than the level at the time of its disappearance (0.3 mg/l), a phenomenon known as hysteresis (Hosper, 1997; Marcel, 1999); therefore phosphorus is not a key factor in this case. It is supposed that the competition for dissolved inorganic carbon HCO3⁻ and the competition for light are the two main factors of the succession. However, the replacement process is still unclear, which motivated a model simulation. Considering the environmental heterogeneity and the local interactions, this research selected a CA approach to simulate the competition for light and HCO3⁻, in order to explain the essential features of the replacement process.

Fig 4.1 The study area Lake Veluwe

4.1.2 Model development

In this CA model, deterministic evolution rules obtained through laboratory and field experiments are applied. The model is designed to contain two partly interacting parallel submodels, one for P. pectinatus and the other for C. aspera. The processes considered in each submodel include shading, attenuation, HCO3⁻ competition, photosynthesis, respiration, mortality and spreading. A conceptual framework of the model is presented in Fig 4.2, where solid lines refer to mass or energy flow, and dashed lines indicate related processes. The local interactions between the two species are indicated by the two-directional dashed lines, for instance "shading". Before model construction, some important assumptions are made for simplification, as summarised below:
(1) the spatial scale is 10 × 10 m², and the temporal scale is 1 day;
(2) the model lumps the vegetation in each cell into biomass, which is counted by ash free dry weight (AFDW);
(3) the whole depth is considered as one layer, since Lake Veluwe is a shallow lake with very little stratification;
(4) within a day, the irradiation is assumed to be constant in daytime;
(5) since C. aspera and P. pectinatus colonise at different vertical positions, different depths are used when computing light attenuation;
(6) outside well-vegetated areas, the light attenuation coefficient is considered constant in the whole growing period; inside vegetated areas, both species have the same attenuation value at first, but after P. pectinatus reaches the water surface, it takes smaller values than C. aspera;
(7) in the first month after the end of initialisation of C. aspera, the concentration of HCO3⁻ is the same everywhere; after that, it decreases to 0.4 mM in vegetated areas, while remaining 2.5 mM outside;
(8) resources for growth are allocated after fulfilment of respiration;
(9) respiration is proportional to the total biomass present;
(10) mortality is also proportional to total biomass;
(11) biomass loss due to wave action is neglected;
(12) grazing effects on biomass loss and propagule dispersal are negligible (Hosper, 1997);
(13) the saturated biomass density is 350 g AFDW/m² (Marcel, 1999) for both species;
(14) a cell is considered vegetated when 75% of the saturated biomass is reached;
(15) spreading happens when the cell is saturated and there is space available in the adjacent 8 cells;
(16) the loss of propagules during winter is a proportion of the total propagules in the cell.
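Assumptions (13) to (15) describe the purely spatial part of the model and can be illustrated with a short Python sketch. It is not the thesis implementation. The amount of biomass that a saturated cell passes to a neighbour is not given in this section, so the `inoculum` parameter is a hypothetical stand-in, and "space available" is interpreted here as "the neighbouring cell is not yet saturated". All function names are illustrative.

```python
SATURATED_BIOMASS = 350.0   # g AFDW/m^2, assumption (13)
VEGETATED_FRACTION = 0.75   # assumption (14)

# Offsets of the 8 adjacent cells (Moore neighbourhood), assumption (15)
MOORE_OFFSETS = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
                 if (di, dj) != (0, 0)]


def is_vegetated(biomass):
    """A cell counts as vegetated at 75% of the saturated biomass."""
    return biomass >= VEGETATED_FRACTION * SATURATED_BIOMASS


def spread(grid, inoculum=1.0):
    """One synchronous spreading step on a rectangular grid of biomass values.

    Every saturated cell adds `inoculum` (hypothetical amount) to each
    adjacent cell that is not yet saturated. The input grid is not modified.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] < SATURATED_BIOMASS:
                continue  # only saturated cells spread
            for di, dj in MOORE_OFFSETS:
                ni, nj = i + di, j + dj
                inside = 0 <= ni < rows and 0 <= nj < cols
                if inside and grid[ni][nj] < SATURATED_BIOMASS:
                    new_grid[ni][nj] = min(SATURATED_BIOMASS,
                                           new_grid[ni][nj] + inoculum)
    return new_grid


if __name__ == "__main__":
    lake = [[0.0] * 3 for _ in range(3)]
    lake[1][1] = SATURATED_BIOMASS          # one saturated cell in the middle
    for row in spread(lake, inoculum=5.0):
        print(row)
```

The update reads from the old grid and writes to a copy, so all cells are updated synchronously, which is the usual convention in cellular automata. Growth, respiration and mortality within a cell are deliberately left out of this sketch.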

Fig 4.2 Conceptual framework of the model (labels: irradiation, shading, attenuation, underwater light condition, photosynthesis)

conveision [HCOfl Respiration rate Ig O , 6 ' AFDW h - ' i Respiration time {h> Mortality rate Saturation biomass Seed yield ratio Seed loss ratio

cottt

[HCO,] CRR RT CMR CSB CSYR CSLR

25 2?;. 0.00018 0.12 10 I0(l20tll ISOttl). 14 (I5llh-210lh». 12 l2l Uh 24iilhl I2u (12mh l. 191.29 " Temp 0.42 -> no bloom 10.958]

The model performance is evaluated in Table 4.6.

Table 4.6 Evaluation of model performance

            Training   Testing
Accuracy    92.11%     76.92%
RMSE*       0.2654     0.4437

* RMSE: root mean square error

In this part, the one-dimensional P. globosa model is integrated with the decision tree model to forecast possible blooms depending on irradiance and nutrient conditions. Although the model still needs large improvement, the case study results of Noordwijk 70 are seen to be acceptable. The one-dimensional physically based model is helpful to understand the fundamental mechanism of the P. globosa blooming process. However, it is not sufficient to use only critical depth for bloom prediction. Although the computation of tide/wind induced turbulent diffusion can be achieved, the determination of critical turbulence is difficult if not impossible, which indicates why incorporation of numerical simulation is necessary.

4.3 Integrated numerical and fuzzy cellular automata modelling of HAB

In this section, an integrated numerical and fuzzy cellular automata model was developed to predict possible P. globosa blooms along the Dutch coast based on irradiance, nutrients and neighbourhood conditions. The numerical module uses Delft3D to simulate hydrodynamic and water quality processes, and fuzzy rule based systems and cellular automata are applied to predict ecosystem behaviour. The simulation results of year 1995 are compared with field observations, and the modelled spatial patterns are compared to the satellite image.

4.3.1 Description of study area

The Dutch coast receives drainage from the rivers Rhine and Meuse and is one of the most productive fishing areas in the world. In the past 20-50 years, the increase of nutrients discharged by the rivers has led to eutrophication of the coastal zones (Klein and Burren, 1992). Spring algal blooms, dominated by diatoms and followed by Phaeocystis globosa, occur frequently in Dutch coastal waters. The blooms, defined as chlorophyll a concentration > 30 μg/l, are usually non-toxic, but they may be harmful because the sedimentation of dead algae can lead to anoxia and result in massive mussel mortality (Peperzak, 2002). Algal blooming is a multidisciplinary and complex problem where hydrodynamic, chemical and biological processes take place simultaneously. The blooms usually occur very locally and show strong patchy dynamics. Some of the processes, such as the hydrodynamics, can be investigated in detail, while there are still a lot of biological mechanisms that remain unclear. Besides, water quality and biological data are usually too sparse and uncertain for detailed analysis (Chen and Mynett, 2003a).


In the last decade, a number of studies have been conducted in this area and some numerical models were set up to simulate the water quality and eutrophication (Nelissen and Stefels, 1988; Villars, 1997; Huthnance, 1997). There are also some other types of models developed for the studied area. Chen and Mynett (2004a) and Blauw et al (2003) developed fuzzy logic models for forecasting algal blooms along the Dutch coast under the European Commission project HABES. Chen and Mynett (2004b) also applied decision trees and a piecewise regression approach to predict P. globosa blooms. These models are either completely deductive or completely inductive, as discussed in Chapter 3. However, it has been widely recognised that accurate modelling of the underlying physical and (bio)chemical processes is crucial to the prediction of initiation and species composition of algal blooms (Franks, 1997; Anderson, 2002). The performance of red tide models has so far been restricted by their insufficient ability to integrate both the biological and the underlying physical processes (Donaghay and Osborn, 1997; Verkhozina et al., 2000; Chen et al., 2003). The strong patchy dynamics result mainly from spatial heterogeneity and local interactions, which are absent in most conventional models. Therefore, a reliable model to predict initiation, transport, and persistence of red tides has yet to be established (Chapra, 1998; Chen and Mynett, 2004a). In this research, an integrated numerical and fuzzy cellular automata model was developed to predict phytoplankton biomass and hence algal blooms in the Dutch coastal waters. The numerical Delft3D-WAQ (water quality module) simulates the flow and transport conditions, water column irradiance, and nitrogen and phosphorus concentrations that are influenced by the discharge from the River Rhine. The fuzzy logic module was transferred from the one that was developed on the basis of Noordwijk 10 data (Chen and Mynett, 2004a) and was used to predict algal biomass on the basis of the computed abiotic factors. In order to take into account the spatial heterogeneity and local behaviour and to capture patchiness dynamics, a cellular automata paradigm was implemented in the developed model. Thus, at the final stage of this research, the fuzzy cellular automata approach, which was initiated by Wu (1996) and Mielke and Pandey (1998), was explored, where the local evolution rules $f$ are defined by fuzzy logic.

Fig 4.13 Monitoring locations in the BIOMON, DONAR and MONISNEL monitoring programs (MONISNEL uses a limited number of monitoring stations. Ref. Blauw, A.N., 2003)


The study focuses on the near shore area of the Dutch coast (Fig 4.13). The water depth is between 0 and 30 m, the water temperature varies from 5 to 22 °C, and the irradiance is between 132 and 1700 Wh m⁻² day⁻¹. The concentrations of inorganic nitrogen and phosphorus are between 0.007-1.246 mg/l and 0-0.073 mg/l respectively. The biomass concentration (in chlorophyll a) ranges from 0.1 to 90.2 μg/l. The discharge from the River Rhine at the Maassluis station (including tidal effects) is between -2744 and 4649 m³/s, with a mean of 1382 m³/s. The water is usually well mixed except for temporary weak stratification caused by salinity.

4.3.2 Model development

A curvilinear grid is used for the computations with Delft3D-WAQ (Fig 4.14), totalling 1157 computational cells in the studied area. Nitrogen and phosphorus processes include mineralization of organic compounds, denitrification (for nitrogen only, the reduction of nitrate to gaseous nitrogen) and uptake (the part assimilated by algae). Following the computational approach as outlined in Chapter 3 (Eq. 3.12 - 3.19), the nitrogen, phosphorus and the transport are calculated first.

Fig 4.14 The curvilinear computational grid (1157 computation cells)

The water column irradiance is calculated according to the Lambert-Beer law, and the attenuation coefficient is estimated by the equation in Chen et al (2003). Since the studied area is usually well-mixed with only temporary and weak stratifications, the mean water column irradiance (Eq. 4.33) is then used.
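As a worked illustration of this step, the sketch below averages the Lambert-Beer profile $I(z) = I_0 e^{-K_d z}$ over a water column of depth $Z$, which gives $I_m = I_0 (1 - e^{-K_d Z}) / (K_d Z)$. Eq. 4.33 is not reproduced in this section, so it is only an assumption that it has this standard depth-averaged form. The function names are hypothetical; the example values use the attenuation coefficient of 0.55 and the mixing depth of 7.5 m quoted in chapter 3, with a hypothetical surface irradiance of 1000.

```python
import math


def irradiance_at_depth(i0, kd, z):
    """Lambert-Beer law: irradiance at depth z for surface irradiance i0."""
    return i0 * math.exp(-kd * z)


def mean_column_irradiance(i0, kd, depth):
    """Depth-averaged irradiance over [0, depth] (assumed form of Eq. 4.33).

    i0    : irradiance just below the surface
    kd    : attenuation coefficient (1/m)
    depth : averaging depth (m), e.g. water depth, photic depth or mixing depth
    """
    if kd < 0 or depth < 0:
        raise ValueError("kd and depth must be non-negative")
    x = kd * depth
    if x == 0.0:
        return i0  # no attenuation or zero depth: the mean equals the surface value
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small x
    return i0 * (-math.expm1(-x)) / x


if __name__ == "__main__":
    # Hypothetical surface irradiance of 1000 (same units as the output)
    print(round(mean_column_irradiance(1000.0, 0.55, 7.5), 1))
```

Because the mean depends on the product $K_d Z$, an error in the attenuation coefficient translates directly into an error in the irradiance that drives the bloom rules, which is why the estimation of $K_d$ receives so much attention.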

Fig 4.15 Membership functions of model variables and output (left: TIN, 0.00 to 0.20 mg/l; right: Chla, 0 to 70 μg/l)


The fuzzy logic model developed by Chen and Mynett (2004a) and Chen et al (2003) was used to predict algal biomass on the basis of the calculated nutrient concentrations. The membership functions of Chla and dissolved inorganic nitrogen are shown in Fig 4.15. The other variables include Julian date, water temperature and concentration of dissolved inorganic phosphorus. In total there are 72 inference rules in the rule base, which come both from ecologists' experience and from the data learning process (Chen and Mynett, 2004a). The simulation time step for nutrients is 1 hour and that for algal biomass is 7 days, therefore a time aggregation is made before initiating the fuzzy cellular automata module. In this research, an arithmetic average is used to compute the weekly mean values. The cellular automata module is directly implemented on the curvilinear grid, which of course does not strictly follow the original definition of CA, where square grids are used. However, this approximation seems acceptable as the geometry of the cells does not vary much among the nearest neighbours. The Moore neighbourhood configuration is applied in the CA model and the local evolution rules constructed by fuzzy logic techniques follow the general formula:

$$S_{i,j}^{t+1} = f\left({}^{\prime}S_{i,j}^{t+1},\; S_{N(i,j)}^{t}\right) \qquad (4.39)$$

where $S_{i,j}^{t+1}$ is the state of cell $(i, j)$ at time step $t+1$, ${}^{\prime}S_{i,j}^{t+1}$ is the state of the cell $(i, j)$ at time step $t+1$ which is preliminarily predicted by the fuzzy logic model, and $S_{N(i,j)}^{t}$ are the states of the eight neighbouring cells at time step $t$, while $f$ represents the local fuzzy evolution rules. The model framework is sketched in Fig 4.16.
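The structure of the update in Eq. 4.39 can be illustrated in Python. This is only a structural sketch under several assumptions. The fuzzy rule base that defines $f$ is not given in this section, so `placeholder_rule` is a crisp stand-in (a weighted blend of the preliminary prediction and the neighbourhood mean), not the fuzzy rules of the thesis. The curvilinear grid is assumed to be addressable as a logically rectangular $(i, j)$ array, boundary cells simply use the neighbours that exist, and all names are hypothetical. The `weekly_mean` helper shows the arithmetic time aggregation from hourly values that precedes the update.

```python
def weekly_mean(hourly_values):
    """Arithmetic mean used to aggregate hourly model output to a weekly value."""
    return sum(hourly_values) / len(hourly_values)


def moore_neighbours(grid, i, j):
    """States of the up to 8 adjacent cells (fewer on the grid boundary)."""
    rows, cols = len(grid), len(grid[0])
    return [grid[i + di][j + dj]
            for di in (-1, 0, 1) for dj in (-1, 0, 1)
            if (di, dj) != (0, 0)
            and 0 <= i + di < rows and 0 <= j + dj < cols]


def placeholder_rule(prelim, neighbours, weight=0.8):
    """Crisp stand-in for the fuzzy rules f (hypothetical, for illustration).

    Blends the preliminary prediction with the mean of the neighbour states.
    """
    if not neighbours:
        return prelim
    return weight * prelim + (1.0 - weight) * sum(neighbours) / len(neighbours)


def ca_step(prev_state, prelim_state, rule=placeholder_rule):
    """One update: S(t+1) = f('S(t+1), neighbour states at time t)."""
    rows, cols = len(prev_state), len(prev_state[0])
    return [[rule(prelim_state[i][j], moore_neighbours(prev_state, i, j))
             for j in range(cols)]
            for i in range(rows)]


if __name__ == "__main__":
    previous = [[10.0] * 3 for _ in range(3)]      # biomass at time t
    preliminary = [[20.0] * 3 for _ in range(3)]   # fuzzy model prediction for t+1
    for row in ca_step(previous, preliminary):
        print([round(v, 2) for v in row])
```

The design point worth noting is that the neighbour states are taken from the previous time step while the cell's own contribution is the preliminary prediction for the new time step, so the neighbourhood acts as a spatial correction to a point prediction.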

Fig 4.16 Sketch of the model framework (components: Delft3D-WAQ numerical module, external forcings, fuzzy cellular automata)

4.3.3 Results and discussion

The modelled results of dissolved inorganic nitrogen, inorganic phosphorus and chlorophyll a concentrations in 1995 are given in Fig 4.17, where the figures in the right column represent the results at the peak-bloom period.

Fig 4.17 Concentrations of modelled inorganic nitrogen, inorganic phosphorus and algal biomass in winter period (left) and peak-bloom period (right) (panel titles: Nitrate (NO3), Bloom Intensity)

It is seen from the modelled results that the nutrient concentrations are high in the winter period and very low in late spring and early summer. In contrast, algal biomass concentration is quite low in winter and suddenly becomes high in early summer. In the spatial domain, the nutrient concentrations are higher in the estuaries than in the coastal waters, hence the algal biomass is greater there as well, since it greatly depends on nutrient availability. Another observation is that the concentrations are higher in South-North direction (along the shore) than in East-West direction.


The maximum concentrations in the coastal waters reach values of 2.8 mg/l for NO3⁻, 0.18 mg/l for PO4³⁻, and 48 µg/l for chlorophyll a. They are even higher in the river mouth areas. In order to compare and evaluate the performance of the fuzzy cellular automata module, the BLOOM II model of the Delft3D system was also used. The BLOOM II model applies optimization techniques to obtain the maximum growth rate under given conditions through linear programming methods (Los, 1988). The algal biomass concentration at the first peak period calculated by the BLOOM II model is presented in Fig 4.18.

Fig 4.18 The first peak of chlorophyll a concentration modelled by BLOOM II of the Delft3D software system

Also, for comparison, the satellite image of the algal bloom at the beginning of May 2003 along the Dutch coast is shown in Fig 4.19, where the main species was identified as P. globosa.

Fig 4.19 Algal bloom in early May, 2003 along the Dutch coast

The nutrient concentrations are higher in winter and early spring than in the summer period because of (1) the internal cycling from decomposition; (2) the main rainfall period being in winter and early spring, which results in large external input by river discharge; and (3) the low uptake by phytoplankton due to light-limited growth.


In terms of spatial pattern, the nutrient concentrations are higher along the coast. The reason is that the residual flow of river discharge is from South to North following the coastline, due to the effects of the Coriolis force. Therefore, the algal blooms are centred on near-shore areas. It is also seen that the blooms are more severe near the Noordwijk transect and the Wadden Sea area because of the discharge from the land. By examining the observations of 1995, it is found that the algal bloom initiated by the end of April and the first peak at station Noordwijk 10 appeared on May 3rd with a chlorophyll a concentration of 58.2 [...]

    [...] = 2 then                      # prey will reproduce
        Ppd = 0
        Ppy = k1 * Npy / (Npy + Npd)
    else if [...]                       # predator will reproduce
        Ppy = 0
        Ppd = k2 * Npy / (Npd + 1)

if cell is prey:
    if Npy = 0 and Npd = 0 then         # prey may die (loneliness)
        Ppy = 0.9  (= p1)
    else if Npy = 0 and Npd ≠ 0 then    # prey will probably be eaten
        Ppy = 0.1  (= p2)
    else if Npd > Npy + 1 then          # prey will definitely be eaten
        Ppy = 0
    else                                # survival depends on Npy and Npd
        Ppy = 1 - k3 * Npd / (Npy + 1)

if cell is predator:
    if Npy = 0 and Npd = 0 then         # predator will probably die (no food & loneliness)
        Ppd = 0.2  (= p3)
    else if Npy = 0 and Npd ≠ 0 then    # predator will definitely die (no food & competition)
        Ppd = 0    (= p4)
    else if Npy ≠ 0 and Npd = 0 then    # predator may die (loneliness)
        Ppd = 0.8  (= p5)
    else if Npy > Npd + 1 then          # predator will survive
        Ppd = 1
    else                                # survival depends on Npy and Npd
        Ppd = k4 * Npy / (Npd + 1)
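The prey and predator rules above translate fairly directly into code. The Python sketch below uses the probability values quoted in the rules (0.9, 0.1, 0.2, 0 and 0.8) and defaults the adjustment factors for survival to 1. The function names, the random-draw helper and the clamping of the computed probabilities to the range [0, 1] are assumptions added for illustration and are not taken from the EcoCA implementation. The rules for empty cells are omitted because they are only partly legible.

```python
import random

# Probability constants as quoted in the rules above.
P1, P2, P3, P4, P5 = 0.9, 0.1, 0.2, 0.0, 0.8


def clamp(p):
    """Keep a computed probability inside [0, 1] (an added safeguard)."""
    return max(0.0, min(1.0, p))


def prey_survival(n_py, n_pd, k3=1.0):
    """Survival probability of a prey cell with n_py prey and n_pd predator neighbours."""
    if n_py == 0 and n_pd == 0:
        return P1                      # prey may die (loneliness)
    if n_py == 0:
        return P2                      # prey will probably be eaten
    if n_pd > n_py + 1:
        return 0.0                     # prey will definitely be eaten
    return clamp(1.0 - k3 * n_pd / (n_py + 1))


def predator_survival(n_py, n_pd, k4=1.0):
    """Survival probability of a predator cell with the same neighbour counts."""
    if n_py == 0 and n_pd == 0:
        return P3                      # no food and loneliness
    if n_py == 0:
        return P4                      # no food and competition
    if n_pd == 0:
        return P5                      # loneliness
    if n_py > n_pd + 1:
        return 1.0                     # plenty of prey
    return clamp(k4 * n_py / (n_pd + 1))


def survives(probability, rng=random):
    """Draw one Bernoulli outcome for a cell with the given survival probability."""
    return rng.random() < probability
```

For example, a prey cell with three prey and two predator neighbours gets a survival probability of 1 - 2/4 = 0.5 when k3 = 1.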

Several probability constants and "adjustment" parameters can be identified in the above rules that may affect the evolutionary process. These are:

p1  the probability that a prey will survive in the absence of any neighbours
p2  the probability that a single prey will survive in the presence of predators
p3  the probability that a predator will survive on its own with no food
p4  the probability that a predator will survive in a group with no food
p5  the probability that a single predator will survive in the presence of prey
k1  adjustment factor for reproduction rate of prey
k2  adjustment factor for reproduction rate of predator
k3  adjustment factor for effect of predators upon prey survival
k4  adjustment factor for effect of prey upon predator survival

In a first experiment, the probability values were defined by the rules shown above and the adjustment factors (ki) were set equal to unity. Fig 1 shows the simulation environment of the EcoCA.

Fig 1 Simulation environment of EcoCA

Appendix 1


Change of cell size

To investigate the effects of spatial scale, the cell size can be changed by an integer factor while retaining the initial conditions. This procedure is reversible, i.e. the original cell size can be restored. Fig 2 shows the initial condition with cell size Δx, and Fig 3 with cell size Δx/3.



Fig 2 Initial configuration of EcoCA, cell size Δx

Fig 3 Initial configuration of EcoCA, cell size Δx/3
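One simple way to change the cell size by an integer factor while keeping the initial pattern, and to undo the change afterwards, is sketched below in Python. This is an assumed scheme for illustration only: each cell is split into factor × factor sub-cells that inherit its state, and the reverse step keeps one sub-cell per block. How EcoCA itself redistributes the states is not described in the text.

```python
def refine(grid, factor):
    """Split every cell into factor x factor sub-cells that inherit its state."""
    return [[cell for cell in row for _ in range(factor)]
            for row in grid for _ in range(factor)]


def coarsen(grid, factor):
    """Inverse of refine: keep one representative sub-cell per block."""
    return [row[::factor] for row in grid[::factor]]
```

With this scheme `coarsen(refine(grid, 3), 3)` returns the original grid as long as the refined grid has not been evolved in between.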

Change of spatial configuration

It has been observed that neighbourhood schemes, or so-called spatial configurations, have important impacts on the spatial evolution patterns and system stability of CA models. Commonly used neighbourhood schemes include the Von Neumann scheme, the Moore scheme and the extended Moore scheme (Chen and Mynett, 2003a). All three schemes have been implemented in the EcoCA model. Fig 4 shows the interface for switching between schemes. In addition, a Gaussian-type spatial correlation function is implemented in EcoCA to consider the effects of distance on local interactions:

[equation illegible in the source]   (2)
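The three neighbourhood schemes and a distance weighting can be sketched in Python as follows. Two assumptions are made here: the extended Moore scheme is taken to be the Moore neighbourhood of radius 2 (24 cells), and the Gaussian weight uses the generic form exp(-d²/(2σ²)) with a hypothetical width parameter `sigma`. The exact function used in EcoCA cannot be read from the text.

```python
import math


def von_neumann(radius=1):
    """Offsets of cells within Manhattan distance `radius` (4 cells for radius 1)."""
    return [(di, dj)
            for di in range(-radius, radius + 1)
            for dj in range(-radius, radius + 1)
            if 0 < abs(di) + abs(dj) <= radius]


def moore(radius=1):
    """Offsets of cells in the surrounding square (8 for radius 1, 24 for radius 2)."""
    return [(di, dj)
            for di in range(-radius, radius + 1)
            for dj in range(-radius, radius + 1)
            if (di, dj) != (0, 0)]


def gaussian_weight(di, dj, sigma=1.0):
    """Assumed Gaussian decay of interaction strength with cell distance."""
    d2 = di * di + dj * dj
    return math.exp(-d2 / (2.0 * sigma ** 2))
```

Weighting each neighbour's contribution by `gaussian_weight` makes diagonal and second-ring cells count less than directly adjacent ones, which is one way to account for distance in the larger neighbourhoods.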


Fig 4 Initial configuration of EcoCA, cell size Δx/3

Pattern analysis

The simulated spatial patterns at each time step can be saved to an ASCII file for post-analysis or for comparison with GIS raster format data (Chen and Mynett, 2003b).
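As an illustration of such an export, the Python sketch below writes a grid of cell states in the layout of an Esri ASCII raster (six header lines followed by one text row per grid row), which most GIS packages can read. The choice of this particular format, the function name and the default georeferencing values are assumptions. The text only states that an ASCII file is written.

```python
def to_ascii_grid(grid, cellsize=1.0, xll=0.0, yll=0.0, nodata=-9999):
    """Return the grid as text in Esri ASCII raster layout (assumed format)."""
    header = [
        f"ncols {len(grid[0])}",
        f"nrows {len(grid)}",
        f"xllcorner {xll}",
        f"yllcorner {yll}",
        f"cellsize {cellsize}",
        f"NODATA_value {nodata}",
    ]
    rows = [" ".join(str(value) for value in row) for row in grid]
    return "\n".join(header + rows) + "\n"
```

The returned string can be written to disk once per time step, for example with `open(path, "w").write(to_ascii_grid(grid))`, where `path` is a file name chosen by the user.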

Appendix 2

CA modelling macrophytes dynamics

Introduction

A CA model was developed to simulate the competition and consequent succession of two underwater macrophytes, Chara aspera (C. aspera) and Potamogeton pectinatus (P. pectinatus), in a eutrophic lake in the Netherlands.

Simulation environment

The model is coded in Visual C++, where each cell is an object. The simulation environment has a friendly user interface, great flexibility and online animation. Fig 1 illustrates a demonstration run of the model.
