MARSH BULLETIN
Marsh Bulletin 3(2)(2008)125-135
Amaricf_Basra
[email protected] [email protected] [email protected]
Rainfall-Runoff modeling by using M5 model trees technique: an example of Tigris catchment area in Baghdad, Middle of Iraq A.M. Atiaaa and H.B. Ghalibb Dept. of Geology, Coll. of Sci., Univ. of Basra, Basra, Iraq a b
e-mail:
[email protected]
e-mail:
[email protected]
Abstract This paper investigates the applicability of the M5 model trees technique to emulate rainfall-runoff transformation of Tigris catchment area in Baghdad city, Middle of Iraq. For building M5 model, a free of charge open –leading machine learning and data mining weka software is used. Four models are build firstly to study the interdependency among the input variables and to select the effective variables. The applicability of this technique is studied by predicting runoff (discharge) of Tigris River one and two months ahead. The results show the high accuracy of the M5 technique to identify low values and some of high values of flow with very high accuracy, but most of the high flows were underestimated. M5 model tree and other data-driven models could be used alone or corporation with physically-based models such as HEC-HMS to manage water resources of Iraq after a detailed monitor hydrological programming surveys are employed.
Key words Model tree, Rainfall-runoff transformation, Tigris River, Iraq
watershed
1-Introduction The
rainfall-runoff
transformation
complex
runoff
involves
processes
such
many as
highly
interception,
process is one of the most complex and non-
depression storage, evaporation, base flow, and
linear relationship in hydrology and water
other many hydrological components. For many
resources. The transformation of rainfall to
years,
water
PDF created with pdfFactory Pro trial version www.pdffactory.com
resources
practitioners
and
126
A.M. Atiaa and H.B. Ghalib / Marsh Bulletin 3(2)(2008)125-135
engineers have attempted to understand and
published papers are founded, for example to
investigate
model
the
control
processes
of
stage-discharge
relationship
transformation of a given rainfall to runoff in
(Bathettchery and Solomatine, 2005) and for
order to predict stream flow for hydrological
simulating rainfall-runoff process (Solomatine
studies. Many models were developed during
and Xue, 2004). Works of Solomatine and
the last decades to capture and simulate this
others proved that M5 is a very effective
highly non-linear relationship. These models
technique and more understandable and allows
can be basically divided into three groups: (1)
one to build a family of models of varying
the deterministic models seek to simulate the
complexity and accuracy.
physical process in the catchment involved in
The aim of this paper is to investigate the
the transformation of rainfall to runoff. (2) the
applicability of the M5 model tree for modeling
stochastic model describe the hydrological time
rainfall-runoff relationship of Tigris catchment
series of the several measured variable such as
area in Baghdad city, capital of Iraq. Also, this
rainfall, evaporation and stream flow involving
paper tries to introduce these techniques to field
distribution in probability (Shaw, 1999). (3)
of water resources of Iraq as alternative tool to
data-driven models which attempt to handle
manage water resources of country with
hydrological system as black box and try to find
inexpensive techniques.
a relationship between causal variables such as
M5 model trees
rainfall and consequent output such as stream
A decision tree is a logical model
flow. These models do not require an explicit
represented as a binary (two-way split) tree that
well-defined representation of the physical
shows how the values of a target (dependent)
processes and govern equations of the process.
variable can be predicted by using the values of
Several data driven models have been applied
a set of predictor (independent) variables. These
successfully to simulate the rainfall-runoff
are basically two types of decision trees: (1)
dynamics of the catchment for example by
classification trees are the most common and
using artificial neural network (Solomatine and
are used to predict a symbolic attribute (class).
Dulal, 2003), adaptive neuro-fuzzy system
(2) regression trees which are be used to predict
(Vernieuwe et al., 2005), evolutionary neural
the value of a numeric attribute (Witten and
networks (Nazemi et al., 2003), genetic
Frank, 1999). If each leaf in the tree contains a
programming (Whigham and Crapper, 2001)
linear regression model, that is used to predict
and M5 model trees (Solomatine and Xue,
the target variable at that leaf, is called a model
2004).
tree.
Application
of M5
model trees
in
The M5 model tree algorithm was
hydrology is limited; only some of the
originally developed by Quinlan (1992). Detail
PDF created with pdfFactory Pro trial version www.pdffactory.com
127
A.M. Atiaa and H.B. Ghalib / Marsh Bulletin 3(2)(2008)125-135 description of this technique is beyond of this
which, 1415 km are within Iraq with a
paper and can be found in Witten and Frank
catchment area of 235000 km2 Fig.2. The River
(1999). A bit description of this technique
emerges from the south-west part of Turkey
follows. The M5 algorithm constructs a
(form lake Hazar) which fed by a series of
regression tress by recursively splitting the
small watercourses originating from Geldjuk
instance space using tests on a single attributes
lake at an altitude of about 1200 m above sea
that maximally reduce variance in the target
level (Iraqi Ministries of Environment, water
variable. Figure 1 illustrates this concept. The
resources, and Municipalities and public works,
formula to compute the standard deviation
2006). Three large tributaries join the river
reduction (SDR) is: (Quinlan, 1994)
within the area of Turkey; these are Batman, Garzan, and Bokhtanch. These tributaries provide most of the water in the upper reach of
where
represents a set of example that
reaches the node;
represents the subset of th
examples that have the i potential set; and
the Tigris. Five large tributaries join the Tigris on its left bank within the area of Iraq: the
outcome of the
Greater Zab River, the Rawanduz River, the
represents the standard
Lesser Zab River, the Adhaim River, and the
deviation.
Diyala River. In its lower reaches, the Tigris
After the tree has been grown, a linear multiple
traverses
regression is built for every inner node using
adjoining area is composed of alluvial deposits.
the data associated with that node and all the
From Kut, the river runs through the marshland
attributes that participate for tests in the subtree
area. From the Hawizhe Marshes, water return
to that node. After that, every subtree is
to the Tigris River through the Kassarah and
considered for pruning process to overcome the
Swaib canals, which join the Tigris respectively
overfitting problem. Pruning occurs if the
59 km and 23 km upstream of the city of Al-
estimated error for the linear model at the root
Qurna.
the
Mesopotamian
plain.
The
of a subtree is smaller or equal to the expected
Baghdad is situated in the middle of Iraq;
error for the subtree. Finally, the smoothing
this city lies on the Tigris River at its closest
process is employed to compensate for the
point to the Euphrates, 40 km to the west. The
sharp discontinuities between adjacent linear
terrain surrounding Baghdad is a flat alluvial
models at the leaves of the pruned tree.
plain 34 m above sea level. Historically, the
Study area and available data set The Tigris is one of the largest rivers of the Middle East stretching for over 1900 km, of
city has been inundated by periodic floods. The city is located on a vast plain bisected by the Tigris River. The Tigris splits Baghdad in half,
PDF created with pdfFactory Pro trial version www.pdffactory.com
128
A.M. Atiaa and H.B. Ghalib / Marsh Bulletin 3(2)(2008)125-135
with the eastern half being called ‘Al-Risafa’
flat and low-lying, being of alluvial origin due
and the western half known as ‘Al-Karkh’. The
to the periodic large floods which have
land on which the city is built is almost entirely
occurred on the river.
X2 Model 2
˜ ˜ ˜ ˜˜
Model 3 ˜ ˜ ˜ ˜
˜
˜ ˜ ˜ ˜ ˜
˜ ˜
˜
˜ ˜ ˜ ˜ ˜ ˜ ˜ ˜
Model 4
˜ ˜ ˜ ˜
˜
˜
˜
˜
˜
Model 1
Model 6 ˜
˜ ˜ ˜
˜
˜ ˜ ˜
˜
˜
˜
˜
Model 5
˜
˜ ˜
˜
˜
X1
Y (output)
X1 & X2 inputs of system
X2>2 Yes
No
X1>2.5
X1