WCCI 2012 IEEE World Congress on Computational Intelligence (IJCNN), June 10-15, 2012 - Brisbane, Australia

Evolving Granular Neural Network for Fuzzy Time Series Forecasting

Daniel Leite, Pyramo Costa, and Fernando Gomide
Abstract—A primary requirement of a broad class of evolving intelligent systems is to process a sequence of numeric data over time. This paper introduces a granular neural network framework for evolving fuzzy system modeling from fuzzy data streams. The evolving granular neural network (eGNN) efficiently handles concept changes, which are distinctive events of nonstationary environments. eGNN constructs interpretable multi-sized local models using fuzzy neural information fusion. An incremental learning algorithm builds the neural network topology from the information contained in data streams. Here we emphasize fuzzy intervals and objects with trapezoidal membership functions; triangular fuzzy numbers, intervals, and numeric data are particular instances of trapezoids. An example concerning weather time series forecasting illustrates the performance of the neural network. The goal is to extract, from monthly temperature data, information of interest to attain accurate one-step forecasts and better rapport with reality. Simulation results suggest that eGNN learns from fuzzy data successfully and is competitive with state-of-the-art approaches.
I. INTRODUCTION

The prominent presence of data streams and online information processing in real-world systems, along with the necessity of modeling, analyzing, and understanding these systems, has brought new challenges, higher demands, and new research directions. Research and development of conceptual frameworks, methods, and algorithms capable of extracting knowledge from data streams is also motivated by a manifold of important applications [1] - [3]. Data stream modeling is fundamentally based on computational learning approaches that process data continuously, attempting to find similarities in their spatiotemporal features, and thereafter provide insights about the phenomenon that governs the data. The ultimate goal is to obtain more abstract (often human-centric) representations of large amounts of detailed data with no apparent value. As real-world systems become more complex, modeling, processing, and disposing of information become more complex as well. Data streams are characterized by nonstationarity, nonlinearity, and heterogeneity; they are potentially endless and may be subject to changes of various kinds. Direct application of machine learning and data mining algorithms to data streams is often infeasible because it is difficult to maintain all the data in memory. In particular, a challenge faced in stream modeling concerns handling uncertainty.

Daniel Leite and Fernando Gomide are with the Department of Computer Engineering and Automation, School of Electrical and Computer Engineering, University of Campinas, 13083-852 BRA, e-mail: {danfl7, gomide}@dca.fee.unicamp.br. Pyramo Costa is with the Graduate Program in Electrical Engineering, Pontifical Catholic University of Minas Gerais, 30535-610 BRA, e-mail: [email protected].
Uncertainty is an attribute of information, since our ability to perceive reality is limited [4]. The more complex a system is, the more uncertain the available information, and the more imprecise our understanding of that system. As Kreinovich stated, measurements and expert estimates are never exact [5]. Granular computing theory [6] - [10] hypothesizes that accepting some level of uncertainty may be beneficial and therefore suggests a balance between precision and uncertainty. Information granulation for uncertainty representation is a fundamental manifestation of human knowledge [6]. Information granulation means that, instead of dealing with detailed real-world data, the data are considered from a more abstract and conceptual perspective. The result of information granulation is called an information granule, a granule being a clump of objects, subsets, clusters, or elements of a universe put together by similarity, proximity, or functionality [11]. There are close relations between granulation, data mining [12], data fusion [13], and knowledge discovery [14]. Granular models developed from data streams can be expressed in several computationally tractable frameworks. Of special concern to this paper are fuzzy granular data streams and the evolving neurofuzzy modeling framework. Fuzzy granular data may arise from expert judgment, readings from unreliable sensors, and summaries of numeric (singular) data over time periods. Artificial neural networks are nonlinear, highly plastic systems equipped with significant learning capability. Fuzzy sets and fuzzy neurons provide neural networks with mechanisms of approximate reasoning and transparency of the resulting construction. Fuzzy sets and neurocomputing are complementary in terms of their strengths, thus motivating neurofuzzy granular computing. The evolving aspect of neurofuzzy networks accounts for endless streams of nonstationary data and structural adaptation of models on an incremental basis.

This paper introduces an evolving granular neural network (eGNN) approach for fuzzy time series forecasting. Refer to [15] for the pioneering work on granular non-evolving neural networks, to [16] - [17] for regression and semi-supervised classification applications of eGNN, and to [18] - [19] for related interval and fuzzy evolving granular approaches. In this paper, the proposed eGNN plays the role of an evolving predictor able to capture the essence of uncertain (fuzzy) time series data in a more abstract and compact representation.

The remainder of this paper is organized as follows. Section II presents fuzzy aggregation neurons, which are key constructs of granular neurofuzzy networks.
The topology of eGNN is introduced in Section III. Section IV addresses the gradual construction of granular networks by means of a one-pass recursive learning algorithm. Section V presents the results obtained by eGNN and alternative approaches in temperature time series forecasting. Section VI concludes the paper and suggests issues for further investigation.

II. FUZZY AGGREGATION NEURON

Aggregation neurons are artificial neuron models based on aggregation operators [20]. Evolving granular neural networks may use different types of aggregation neurons to perform information fusion. In general, there are no specific guidelines for choosing a particular aggregation operator to construct a fuzzy neuron. The choice depends on the application environment and domain knowledge [21]. Aggregation operators $A : [0,1]^n \to [0,1]$, $n > 1$, combine input values in the unit hypercube $[0,1]^n$ into a single output value in $[0,1]$. They must satisfy two fundamental properties: (i) monotonicity in all arguments, i.e., given $x^1 = (x_1^1, ..., x_n^1)$ and $x^2 = (x_1^2, ..., x_n^2)$, if $x_j^1 \le x_j^2 \ \forall j$ then $A(x^1) \le A(x^2)$; and (ii) boundary conditions: $A(0, 0, ..., 0) = 0$ and $A(1, 1, ..., 1) = 1$. The classes of aggregation operators considered in this work are summarized below. See [7], [21] for a detailed coverage and [16] for other examples of operators that can be used to construct granular networks.
A. T-norm aggregation

T-norms $T$ are commutative, associative, and monotone operators on the unit hypercube whose boundary conditions are $T(\alpha, \alpha, ..., 0) = 0$ and $T(\alpha, 1, ..., 1) = \alpha$, $\alpha \in [0,1]$. An example of T-norm is the minimum operator,

$$T_{min}(x) = \min_{j=1,...,n} x_j, \qquad (1)$$

which is the strongest T-norm because

$$T(x) \le T_{min}(x) \ \text{ for any } x \in [0,1]^n. \qquad (2)$$

The minimum is also idempotent, symmetric, and Lipschitz-continuous. A further example of T-norm is the product,

$$T_{prod}(x) = \prod_{j=1}^{n} x_j, \qquad (3)$$

which is a non-idempotent, but symmetric and Lipschitz-continuous aggregation operator.

B. Averaging aggregation

An aggregation operator $A$ is averaging if for every $x \in [0,1]^n$ it is bounded by

$$T_{min}(x) \le A(x) \le S_{max}(x), \qquad (4)$$

where $S_{max}$ is the maximum S-norm operator,

$$S_{max}(x) = \max_{j=1,...,n} x_j. \qquad (5)$$

The basic rule is that the output of an averaging operator can be neither lower than the smallest input value nor higher than the largest one. An example of averaging operator is the arithmetic mean,

$$M(x) = \frac{1}{n} \sum_{j=1}^{n} x_j. \qquad (6)$$

Averaging operators are idempotent, strictly increasing, symmetric, homogeneous, and Lipschitz continuous.

C. Fuzzy aggregation neuron model

Let $\tilde{x} = (\tilde{x}_1, ..., \tilde{x}_n)$ be a vector of membership degrees of a sample $x = (x_1, ..., x_n)$ in the fuzzy sets $G = (G_1, ..., G_n)$. Let $w = (w_1, ..., w_n)$ be a weighting vector such that

$$w_j \in [0,1], \ j = 1, ..., n. \qquad (7)$$

Fuzzy aggregation neurons employ the product T-norm to perform synaptic processing and an aggregation operator $A$ to fuse the individual results of synaptic processing in the neuron body. The output of a fuzzy aggregation neuron is

$$o = A(\tilde{x}_1 w_1, ..., \tilde{x}_n w_n). \qquad (8)$$

An aggregation neuron produces a diversity of nonlinear mappings between neuron inputs and output, depending on the choice of the weights $w$ and of the aggregation operator $A$. The structure of a fuzzy aggregation neuron is shown in Fig. 1.

Fig. 1. Fuzzy aggregation neuron model
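For illustration only, the short Python sketch below implements operators (1), (3), (5), and (6) and the neuron output (8). The function names and the example values are ours, not part of the original formulation; any operator satisfying the bounds (4) could replace the arithmetic mean.

```python
import numpy as np

def t_min(x):   # minimum T-norm, Eq. (1)
    return np.min(x)

def t_prod(x):  # product T-norm, Eq. (3)
    return np.prod(x)

def s_max(x):   # maximum S-norm, Eq. (5)
    return np.max(x)

def mean(x):    # arithmetic mean, an averaging operator, Eq. (6)
    return np.mean(x)

def fuzzy_aggregation_neuron(x_tilde, w, A=mean):
    """Neuron output, Eq. (8): synaptic processing by the product
    T-norm (elementwise x_tilde * w), fusion by the operator A."""
    return A(np.asarray(x_tilde) * np.asarray(w))

# Example: three membership degrees and three synaptic weights
o = fuzzy_aggregation_neuron([0.9, 0.6, 0.8], [1.0, 0.5, 0.7], A=t_min)
```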
III. EVOLVING GRANULAR NEURAL NETWORKS

The eGNN approach concerns online modeling of fuzzy data streams. Generally speaking, fuzzy data arise from imprecise perception or description of the value of a variable [22] - [23]. This paper emphasizes fuzzy trapezoidal data, that is, granular data expressed by trapezoidal fuzzy numbers. Trapezoids allow some freedom in the choice of representative granules, since they encompass triangular fuzzy numbers, intervals, and real values as particular instances [24]. The basic processing units of eGNN are fuzzy aggregation neurons. The eGNN topology encodes a set of fuzzy rules, and neural processing conforms with a fuzzy inference system. The topology results from a gradual network construction that is transparent and interpretable. eGNN manages to discover more abstract, high-level granular knowledge from finer granular data. High-level granular knowledge can be easily translated into a fuzzy knowledge base. The consequent (Then) part of an eGNN rule is composed of a linguistic term and a local functional (real-valued function) term. Independently of the choice of aggregation neurons, network parameters, and the nature of the input-output data, the linguistic term of the rule consequent produces a granular output, while the functional term gives a singular (pointwise) output.
Learning in eGNN means recursively accommodating new data into existing granular models. Learning may add, remove, and combine granules, neurons, and respective connections whenever necessary. The parameters of the real-valued functions of rule consequents are also objects of learning. This means that eGNN captures new information from data streams, adapts itself to the new scenario, and avoids redesigning and retraining.

A. Fuzzy data stream

Fuzzy data arise from perceptions of expert knowledge, inaccurate measurements, variables that are hard to quantify precisely, or from pre-processing steps that introduce uncertainty into singular data. A fuzzy data stream is a sequence of samples that conveys fuzzy granular information. Fuzzy intervals and numbers are instances of fuzzy data. A fuzzy datum $x_j$ has the following canonical form:

$$x_j(z) = \begin{cases} \phi_j, & z \in [\underline{x}_j, x_j[ \\ 1, & z \in [x_j, \overline{x}_j] \\ \iota_j, & z \in \,]\overline{x}_j, \overline{\overline{x}}_j] \\ 0, & \text{otherwise,} \end{cases} \qquad (9)$$

where $z$ is a real number in $X_j$. If the fuzzy datum $x_j$ is normal ($x_j(z) = 1$ for at least one $z \in \Re$) and convex ($x_j(\kappa z^1 + (1-\kappa)z^2) \ge \min(x_j(z^1), x_j(z^2))$, $z^1, z^2 \in \Re$, $\kappa \in [0,1]$), then it is a fuzzy interval [7]. In particular, if

$$\phi_j = \frac{z - \underline{x}_j}{x_j - \underline{x}_j} \quad \text{and} \qquad (10)$$

$$\iota_j = \frac{\overline{\overline{x}}_j - z}{\overline{\overline{x}}_j - \overline{x}_j}, \qquad (11)$$

then the fuzzy datum (9) has a trapezoidal membership function and can be represented by the quadruple $(\underline{x}_j, x_j, \overline{x}_j, \overline{\overline{x}}_j)$. When $x_j = \overline{x}_j$, the fuzzy datum is a fuzzy number. In this paper we focus on data streams of trapezoidal and symmetric fuzzy intervals, similarly to [23]. Fuzzy granular data streams generalize singular data streams by allowing fuzzy data.

B. Structure and processing

Let $x = (x_1, ..., x_n)$ be an input vector and $y$ its corresponding output. Assume that the data stream $(x, y)^{[h]}$, $h = 1, ...$, consists of samples produced by a nonstationary function $f$. Inputs $x_j$ and output $y$ are symmetric fuzzy data. Fig. 2 depicts the four-layer eGNN structure. The first layer inputs samples $x^{[h]}$, one at a time, to the network. The second (granular) layer consists of a collection of fuzzy sets $G_j^i$, $j = 1, ..., n$; $i = 1, ..., c$, stratified from the input data. Fuzzy sets $G_j^i$, $i = 1, ..., c$, form a fuzzy partition of the $j$-th input domain, $X_j$. Similarly, fuzzy sets $\Gamma^i$, $i = 1, ..., c$, give a fuzzy partition of the output domain $Y$. A granule $\gamma^i = G_1^i \times ... \times G_n^i \times \Gamma^i$ is a fuzzy relation, a multidimensional fuzzy set in $X_1 \times ... \times X_n \times Y$. Thus, granule $\gamma^i$ has membership function $\gamma^i(x, y) = \min\{G_1^i(x_1), ..., G_n^i(x_n), \Gamma^i(y)\}$ in $X_1 \times ... \times X_n \times Y$. For short, granule $\gamma^i$ is denoted by $\gamma^i = (G^i, \Gamma^i)$ with $G^i = (G_1^i, ..., G_n^i)$. The granule $\gamma^i$ has a companion local function $p^i$. In this paper we use real-valued affine functions:

$$\hat{y}^i = p^i(\hat{x}_1, ..., \hat{x}_n) = a_0^i + \sum_{j=1}^{n} a_j^i \hat{x}_j. \qquad (12)$$

Parameters $a_0^i$ and $a_j^i$ are real values; $\hat{x}_j$ is the midpoint of $x_j = (\underline{x}_j, x_j, \overline{x}_j, \overline{\overline{x}}_j)$, computed as

$$\hat{x}_j = \text{mp}(x_j) = \frac{x_j + \overline{x}_j}{2}. \qquad (13)$$

Similarity degrees $\tilde{x}^i = (\tilde{x}_1^i, ..., \tilde{x}_n^i)$ result from matching the input $x = (x_1, ..., x_n)$ against the fuzzy sets of $G^i = (G_1^i, ..., G_n^i)$, see Section IV-C. The third (aggregation) layer has fuzzy aggregation neurons $A^i$, $i = 1, ..., c$, to combine the values coming from different inputs. A fuzzy neuron $A^i$ combines weighted similarity degrees $(\tilde{x}_1^i w_1^i, ..., \tilde{x}_n^i w_n^i)$ into a single value $o^i$. The fourth (output) layer processes the weighted values $(o^1 \hat{y}^1 \delta^1, ..., o^c \hat{y}^c \delta^c)$ using a fuzzy aggregation neuron $A^f$ to produce a singular output $\hat{y}^{[h]}$.
Fig. 2. eGNN topology and singular output
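A minimal Python sketch of the trapezoidal representation and of the affine consequent (12) fed by core midpoints (13) follows. The class and field names (l, lc, uc, u for the support and core bounds) are illustrative choices of ours.

```python
from dataclasses import dataclass

@dataclass
class Trapezoid:
    """Trapezoidal fuzzy object with support [l, u] and core [lc, uc],
    i.e., the quadruple of Eq. (9)."""
    l: float    # lower support bound
    lc: float   # lower core bound
    uc: float   # upper core bound
    u: float    # upper support bound

    def membership(self, z):
        """Canonical membership function, Eqs. (9)-(11)."""
        if self.lc <= z <= self.uc:
            return 1.0
        if self.l <= z < self.lc:                      # rising slope, Eq. (10)
            return (z - self.l) / (self.lc - self.l)
        if self.uc < z <= self.u:                      # falling slope, Eq. (11)
            return (self.u - z) / (self.u - self.uc)
        return 0.0

    def midpoint(self):
        """Core midpoint, Eq. (13)."""
        return (self.lc + self.uc) / 2.0

def local_function(a0, a, x):
    """Affine consequent p^i of Eq. (12), evaluated on core midpoints."""
    return a0 + sum(aj * xj.midpoint() for aj, xj in zip(a, x))

# Example: a granule's local function on two trapezoidal inputs
x = [Trapezoid(0.1, 0.2, 0.3, 0.4), Trapezoid(0.5, 0.6, 0.7, 0.8)]
y_hat = local_function(0.05, [0.4, 0.6], x)
```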
An $m$-output eGNN needs a vector of local functions $(p_1^i, ..., p_m^i)$, $m$ output layer neurons $(A_1^f, ..., A_m^f)$, and $m$ outputs $(\hat{y}_1, ..., \hat{y}_m)$. The network output $\hat{y}$, obtained as shown in Fig. 2, is a singular approximation of $f$, independently of whether the input data are singular or granular. A granular approximation of function $f$ at step $H$ is a set of granules $\gamma^i$, $i = 1, ..., c$, such that

$$(x, y)^{[h]} \subseteq \bigcup_{i=1}^{c} \gamma^i, \ h = 1, ..., H. \qquad (14)$$

The granular approximation is constructed by granulating both the input data $x^{[h]}$ into fuzzy sets of $G^i$, as shown in Fig. 2, and the output data $y^{[h]}$ into fuzzy sets $\Gamma^i$, as summarized in Fig. 3. Note that the granular approximation is the convex hull of the output fuzzy sets $\Gamma^{i^*}$, where $i^*$ are the indices of the active granules, that is, those for which $o^i > 0$. This guarantees that the singular approximation $\hat{y}^{[h]}$ is included in the granule.

Fig. 3. Granular approximation from input and output data granulation

The convex hull of trapezoidal fuzzy sets $\Gamma^1, ..., \Gamma^i, ..., \Gamma^c$, with $\Gamma^i = (\underline{u}^i, u^i, \overline{u}^i, \overline{\overline{u}}^i)$, is a trapezoidal fuzzy set $\text{ch}(\Gamma^1, ..., \Gamma^c)$ whose representation is

$$\text{ch}(\Gamma^1, ..., \Gamma^c) = \big(\min(\underline{u}^1, ..., \underline{u}^c),\ \min(u^1, ..., u^c),\ \max(\overline{u}^1, ..., \overline{u}^c),\ \max(\overline{\overline{u}}^1, ..., \overline{\overline{u}}^c)\big). \qquad (15)$$

In particular, the trapezoid $(\underline{u}^{i^*}, u^{i^*}, \overline{u}^{i^*}, \overline{\overline{u}}^{i^*})$ of Fig. 3 that results from $\text{ch}(\Gamma^{i^*})$, $i^* = \{i : o^i > 0,\ i = 1, ..., c\}$, is a granular approximation of $y$. It is worth noting that the granular approximation at step $h$ does not depend on the availability of $y^{[h]}$, because $o^i$ is obtained from $x^{[h]}$ (see Fig. 2). Only the collection of output fuzzy sets $\Gamma^i$ is required.

Figure 4 illustrates the singular and granular approximations, $p$ and $\bigcup_{i=1}^{c} \gamma^i$, of a function $f$. In Fig. 4(a), a singular input $x^{[h_1]}$ and a granular input $x^{[h_2]}$ produce singular outputs $\hat{y}^{[h_1]}$ and $\hat{y}^{[h_2]}$ using $p$. In Fig. 4(b), the granular input $x^{[h]}$ activates the fuzzy sets of $G^2$ and $G^3$; therefore, the granular output is $\text{ch}(\Gamma^2, \Gamma^3)$. Notice that $y^{[h]} \subset \text{ch}(\Gamma^2, \Gamma^3)$.

Fig. 4. eGNN singular (a) and granular (b) approximation of function $f$

eGNN develops functional and linguistic fuzzy models. While functional fuzzy models are more precise, linguistic fuzzy models are more interpretable. Accuracy and interpretability require tradeoffs, and one usually excels over the other. eGNN links functional and linguistic systems into a single framework. Under assumptions on specific weights and neuron types, the fuzzy rules extracted from eGNN can be of the type:

$R^i$: IF ($x_1$ is $G_1^i$) AND ... AND ($x_n$ is $G_n^i$) THEN ($\hat{y}$ is $\Gamma^i$) [linguistic] AND $\hat{y} = p^i(x_1, ..., x_n)$ [functional]

As an example, eGNN can combine Mamdani and functional Takagi-Sugeno fuzzy models.

IV. RECURSIVE LEARNING

Construction of the fuzzy rules encoded in the eGNN structure and approximation of nonstationary functions from granular data streams are the key goals of the learning approach. Because the application domain may be unknown beforehand, eGNN learning is mostly bottom-up. We assume that no granules and neurons exist before training starts. The algorithm builds the network structure in plug-and-play mode. A single pass over the data enables eGNN to address the issues of unbounded data sets and scalability.

A. Expansion

The membership functions of $G_j^i = (\underline{g}_j^i, g_j^i, \overline{g}_j^i, \overline{\overline{g}}_j^i)$ and of the input data $x_j = (\underline{x}_j, x_j, \overline{x}_j, \overline{\overline{x}}_j)$ are trapezoidal. Similarly, $\Gamma^i = (\underline{u}^i, u^i, \overline{u}^i, \overline{\overline{u}}^i)$ and the output data $y = (\underline{y}, y, \overline{y}, \overline{\overline{y}})$ are trapezoids. Each rule antecedent $G^i = (G_1^i, ..., G_n^i)$ has a corresponding consequent $\Gamma^i$. With $\gamma^i = (G^i, \Gamma^i)$, eGNN looks at examples $(x, y)$ at a coarser granule size. The support and the core of the trapezoidal membership function $G_j^i$ are

$$\text{supp}(G_j^i) = [\underline{g}_j^i, \overline{\overline{g}}_j^i], \qquad (16)$$

$$\text{core}(G_j^i) = [g_j^i, \overline{g}_j^i]. \qquad (17)$$

The midpoint and width of $G_j^i$ are

$$\text{mp}(G_j^i) = \frac{g_j^i + \overline{g}_j^i}{2}, \qquad (18)$$

$$\text{wdt}(G_j^i) = \overline{\overline{g}}_j^i - \underline{g}_j^i. \qquad (19)$$
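Continuing the Trapezoid sketch introduced earlier, the helpers below mirror the interval operations (16) - (19) and the convex hull (15); the function names are ours.

```python
def supp(t):  # support, Eq. (16)
    return (t.l, t.u)

def core(t):  # core, Eq. (17)
    return (t.lc, t.uc)

def mp(t):    # midpoint, Eq. (18)
    return (t.lc + t.uc) / 2.0

def wdt(t):   # width, Eq. (19)
    return t.u - t.l

def convex_hull(traps):
    """Trapezoidal convex hull, Eq. (15): the smallest trapezoid
    enclosing the supports and cores of all arguments."""
    return Trapezoid(l=min(t.l for t in traps),
                     lc=min(t.lc for t in traps),
                     uc=max(t.uc for t in traps),
                     u=max(t.u for t in traps))
```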
The maximum width fuzzy sets $G_j^i$ are allowed to expand to is denoted by $\rho$, i.e., $\text{wdt}(G_j^i) \le \rho$, $j = 1, ..., n$; $i = 1, ..., c$. Let the expansion region of a fuzzy set $G_j^i$ be

$$E_j^i = \left[\text{mp}(G_j^i) - \frac{\rho}{2},\ \text{mp}(G_j^i) + \frac{\rho}{2}\right]. \qquad (20)$$

It follows that $\text{wdt}(G_j^i) \le \text{wdt}(E_j^i)\ \forall j, i$. Expressions similar to (16) - (20) can be written for the fuzzy sets $\Gamma^i$. Values of $\rho$ allow different representations of the same process at different levels of abstraction. Expansion regions help to derive criteria for deciding whether or not granular data should be considered enclosed by the current model.

B. Granularity adaptation

An appropriate balance between parametric and structural adaptation is key to capturing changes in nonstationary systems online. The procedure developed next gives a mechanism to parsimoniously reconcile parametric and structural changes in eGNN. The value of $\rho$ affects the granularity, accuracy, and transparency of models. In practice, $\rho \in [0,1]$ settles the size of the expansion regions (20) and the need to either create or adapt rules. In the most general case, eGNN starts learning with an empty rule base and with no a priori knowledge of the data properties. In these circumstances it is worth initializing $\rho$ at an intermediate value, e.g., $\rho^{[0]} = 0.5$. Let $r$ be the number of rules created in $h_r$ steps. If the number of rules grows faster than a rate $\eta$, that is, $r > \eta$, then $\rho$ is increased,

$$\rho(\text{new}) = \left(1 + \frac{r}{h_r}\right) \rho(\text{old}). \qquad (21)$$

Equation (21) acts against outbursts of growth, since large rule bases increase model complexity and worsen generalization. If the number of rules grows at a rate smaller than $\eta$, that is, $r \le \eta$, then $\rho$ is decreased,

$$\rho(\text{new}) = \left(1 - \frac{(\eta - r)}{h_r}\right) \rho(\text{old}). \qquad (22)$$

If $\rho = 1$, then eGNN is structurally stable, but unable to capture abrupt changes. Conversely, if $\rho = 0$, then eGNN overfits the data, causing excessive complexity and irreproducibly optimistic results. Life-long adaptability is reached by choosing intermediate values for $\rho$. Reducing the maximum granule width may require shrinking larger granules to fit them to the new value. In this case, the support of fuzzy set $G_j^i$ is narrowed as follows:

If $\text{mp}(G_j^i) - \frac{\rho(\text{new})}{2} > \underline{g}_j^i$ then $\underline{g}_j^i(\text{new}) = \text{mp}(G_j^i) - \frac{\rho(\text{new})}{2}$
If $\text{mp}(G_j^i) + \frac{\rho(\text{new})}{2} < \overline{\overline{g}}_j^i$ then $\overline{\overline{g}}_j^i(\text{new}) = \text{mp}(G_j^i) + \frac{\rho(\text{new})}{2}$
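A sketch of the granularity update (21) - (22) and of the support narrowing rules just given, reusing mp() from the earlier sketch; names are illustrative.

```python
def update_granularity(rho, r, eta, h_r):
    """Granularity update: Eq. (21) if r > eta, Eq. (22) otherwise."""
    if r > eta:
        return (1.0 + r / h_r) * rho
    return (1.0 - (eta - r) / h_r) * rho

def narrow_support(g, rho_new):
    """Shrink the support of a trapezoid g (in place) so that it
    fits the reduced maximum width, as in the two rules above."""
    m = mp(g)
    if m - rho_new / 2.0 > g.l:
        g.l = m - rho_new / 2.0
    if m + rho_new / 2.0 < g.u:
        g.u = m + rho_new / 2.0
```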
The cores $[g_j^i, \overline{g}_j^i]$, as well as the supports $[\underline{u}^i, \overline{\overline{u}}^i]$ and cores $[u^i, \overline{u}^i]$ of the fuzzy sets $\Gamma^i$, are handled similarly. Time-varying granularity is useful to avoid guesses on how fast and how often the data stream properties change. The accuracy-interpretability tradeoff is an important issue in neurofuzzy computing [25].

C. Computing similarity between data and models

Data and granules are trapezoidal fuzzy objects. In this case, a convenient similarity measure to quantify how the input data match current knowledge is

$$\tilde{x}_j^i = 1 - \frac{|\underline{g}_j^i - \underline{x}_j| + |g_j^i - x_j| + |\overline{g}_j^i - \overline{x}_j| + |\overline{\overline{g}}_j^i - \overline{\overline{x}}_j|}{4\left(\max(\overline{\overline{g}}_j^i, \overline{\overline{x}}_j) - \min(\underline{g}_j^i, \underline{x}_j)\right)}. \qquad (23)$$

This measure returns $\tilde{x}_j^i = 1$ for identical trapezoids and decreases linearly as any term of the numerator increases.
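A direct transcription of (23) over the Trapezoid sketch; the guard for a degenerate denominator (both trapezoids collapsed to the same point) is our assumption.

```python
def similarity(g, x):
    """Similarity between trapezoids g and x, Eq. (23);
    equals 1 for identical trapezoids."""
    den = 4.0 * (max(g.u, x.u) - min(g.l, x.l))
    if den == 0.0:   # both degenerate to the same point (assumption)
        return 1.0
    num = (abs(g.l - x.l) + abs(g.lc - x.lc)
           + abs(g.uc - x.uc) + abs(g.u - x.u))
    return 1.0 - num / den
```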
D. Creation of new granules

The incremental procedure to create granules runs whenever the support of at least one entry of $(x_1, ..., x_n)$ is not enclosed by the expansion regions $(E_1^i, ..., E_n^i)$, $i = 1, ..., c$. This is the case when the fuzzy sets $G^i$ cannot be expanded beyond the limit $\rho$ to fit the sample. Analogously, if $\text{supp}(y)$ is not enclosed by $E^i$ for at least one $\Gamma^i$, then the sample should be enclosed by a new granule. A new granule $\gamma^{c+1}$ is formed by fuzzy sets $G_j^{c+1}$ and $\Gamma^{c+1}$ whose parameters match the parameters of the sample:

$$G_j^{c+1} = (\underline{g}_j^{c+1}, g_j^{c+1}, \overline{g}_j^{c+1}, \overline{\overline{g}}_j^{c+1}) = (\underline{x}_j, x_j, \overline{x}_j, \overline{\overline{x}}_j), \qquad (24)$$

$$\Gamma^{c+1} = (\underline{u}^{c+1}, u^{c+1}, \overline{u}^{c+1}, \overline{\overline{u}}^{c+1}) = (\underline{y}, y, \overline{y}, \overline{\overline{y}}). \qquad (25)$$

The coefficients of the real-valued local function $p^{c+1}$ are

$$a_0^{c+1} = \text{mp}(y), \quad a_j^{c+1} = 0, \ j \ne 0. \qquad (26)$$
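In code, granule creation is a copy of the sample plus a constant local function, as (24) - (26) prescribe; this sketch again builds on the Trapezoid and mp() helpers above.

```python
def create_granule(x, y):
    """New granule initialized on a sample (x, y), Eqs. (24)-(26)."""
    G = [Trapezoid(xj.l, xj.lc, xj.uc, xj.u) for xj in x]   # Eq. (24)
    Gamma = Trapezoid(y.l, y.lc, y.uc, y.u)                 # Eq. (25)
    a0, a = mp(y), [0.0] * len(x)                           # Eq. (26)
    return G, Gamma, a0, a
```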
E. Adaptation of granules

Adaptation of granules means expanding or contracting the support and the core of fuzzy sets $G_j^i$ and $\Gamma^i$ and, simultaneously, updating the coefficients of the local functions $p^i$. Granule $\gamma^i$ can be adapted whenever a sample $(x, y)$ falls within its expansion region, that is, $\text{supp}(x_j) \subset E_j^i$, $j = 1, ..., n$, and $\text{supp}(y) \subset E^i$. In situations in which two or more granules qualify to enclose the data, adapting only one of them is enough to guarantee data inclusion. In particular, we may choose $\gamma^i$ such that $i = \arg\max(o^1, ..., o^c)$, that is, the granule with the highest activation level. Adaptation proceeds depending on where the input datum $x_j$ is located relative to fuzzy set $G_j^i$. More specifically:

If $\underline{x}_j \in [\text{mp}(G_j^i) - \frac{\rho}{2}, \underline{g}_j^i]$ then $\underline{g}_j^i(\text{new}) = \underline{x}_j$
If $x_j \in [\text{mp}(G_j^i) - \frac{\rho}{2}, g_j^i]$ then $g_j^i(\text{new}) = x_j$
If $x_j \in [g_j^i, \text{mp}(G_j^i)]$ then $g_j^i(\text{new}) = x_j$
If $x_j \in [\text{mp}(G_j^i), \text{mp}(G_j^i) + \frac{\rho}{2}]$ then $g_j^i(\text{new}) = \text{mp}(G_j^i)$
If $\overline{x}_j \in [\text{mp}(G_j^i) - \frac{\rho}{2}, \text{mp}(G_j^i)]$ then $\overline{g}_j^i(\text{new}) = \text{mp}(G_j^i)$
If $\overline{x}_j \in [\text{mp}(G_j^i), \overline{g}_j^i]$ then $\overline{g}_j^i(\text{new}) = \overline{x}_j$
If $\overline{x}_j \in [\overline{g}_j^i, \text{mp}(G_j^i) + \frac{\rho}{2}]$ then $\overline{g}_j^i(\text{new}) = \overline{x}_j$
If $\overline{\overline{x}}_j \in [\overline{\overline{g}}_j^i, \text{mp}(G_j^i) + \frac{\rho}{2}]$ then $\overline{\overline{g}}_j^i(\text{new}) = \overline{\overline{x}}_j$

The first and last rules perform support expansion, and the second and seventh rules take care of core expansion. The remaining cases concern core contraction. Operations on the core parameters $g_j^i$ and $\overline{g}_j^i$ require adjustment of the midpoint of the respective fuzzy set:

$$\text{mp}(G_j^i)(\text{new}) = \frac{g_j^i(\text{new}) + \overline{g}_j^i(\text{new})}{2}. \qquad (27)$$

As a result, support contraction may happen in two occasions:

If $\text{mp}(G_j^i)(\text{new}) - \frac{\rho}{2} > \underline{g}_j^i$ then $\underline{g}_j^i(\text{new}) = \text{mp}(G_j^i)(\text{new}) - \frac{\rho}{2}$
If $\text{mp}(G_j^i)(\text{new}) + \frac{\rho}{2} < \overline{\overline{g}}_j^i$ then $\overline{\overline{g}}_j^i(\text{new}) = \text{mp}(G_j^i)(\text{new}) + \frac{\rho}{2}$

Adaptation of the consequent fuzzy sets $\Gamma^i$ is done similarly using output data $y$. Coefficients $a_j^i$ of the local functions $p^i$ are updated using the midpoints of the trapezoidal fuzzy data $(x, y)$ and the recursive least squares algorithm [18] - [19].
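The following sketch applies the eight adaptation rules in the order of the text, followed by the midpoint adjustment (27) and the two support-contraction rules. Which bound of the datum each rule tests was reconstructed from the garbled source and should be read as our interpretation; the helpers come from the earlier sketches.

```python
def adapt_fuzzy_set(g, x, rho):
    """Adapt one antecedent trapezoid g to a trapezoidal datum x
    (Section IV-E); rules checked in the order given in the text."""
    m = mp(g)
    if m - rho / 2.0 <= x.l <= g.l:
        g.l = x.l                    # rule 1: support expansion
    if m - rho / 2.0 <= x.lc <= g.lc:
        g.lc = x.lc                  # rule 2: core expansion
    if g.lc <= x.lc <= m:
        g.lc = x.lc                  # rule 3: core contraction
    if m <= x.lc <= m + rho / 2.0:
        g.lc = m                     # rule 4: core contraction
    if m - rho / 2.0 <= x.uc <= m:
        g.uc = m                     # rule 5: core contraction
    if m <= x.uc <= g.uc:
        g.uc = x.uc                  # rule 6: core contraction
    if g.uc <= x.uc <= m + rho / 2.0:
        g.uc = x.uc                  # rule 7: core expansion
    if g.u <= x.u <= m + rho / 2.0:
        g.u = x.u                    # rule 8: support expansion
    m_new = mp(g)                    # midpoint adjustment, Eq. (27)
    if m_new - rho / 2.0 > g.l:      # support contraction
        g.l = m_new - rho / 2.0
    if m_new + rho / 2.0 < g.u:
        g.u = m_new + rho / 2.0
```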
F. Weights update

Aggregation layer weights $w_j^i \in [0,1]$ represent the importance of the $j$-th attribute of fuzzy set $G_j^i$ to the neural network output. If $w_j^i = 1$, then the output is not affected; a relatively lower value of $w_j^i$ discounts the impact of the respective attribute. The procedure described below assigns lower weight values to less helpful attributes. Whenever a granule $\gamma^{c+1}$ is created, the learning procedure sets $w_j^{c+1} = 1 \ \forall j$. If it is known a priori that different input variables have different importance, then the values of $w_j^{c+1}$ can be chosen to reflect the application domain. Considering the similarity measure (23) and the current approximation error,

$$\epsilon^{[h]} = y^{[h]} - p(x^{[h]}), \qquad (28)$$

the weights $w_j^i$ corresponding to the most active granule $\gamma^i$, where $i = \arg\max(o^1, ..., o^c)$, are recursively updated using

$$w_j^i(\text{new}) = w_j^i(\text{old}) - \tilde{x}_j^i\, o^i\, |\epsilon|. \qquad (29)$$

Equation (29) ascribes to the $j$-th attribute of $G^i$ a proportion of the approximation error.
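A one-line rendering of (29); the floor at 0 is our assumption, added to keep the weights inside the $[0,1]$ range that the text requires.

```python
def update_weights(w, x_tilde, o_i, err):
    """Weight update for the most active granule, Eq. (29): each
    attribute absorbs a share of the absolute approximation error.
    The floor at 0 (an assumption) keeps w_j inside [0, 1]."""
    return [max(0.0, wj - xtj * o_i * abs(err))
            for wj, xtj in zip(w, x_tilde)]
```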
G. Pruning granules

Pruning inactive granules aims at simplifying the eGNN structure and keeping it flexible to model dynamic behavior. Retaining a small number of highly active granules is a way to emphasize compactness and fast processing. Output layer weights $\delta^i \in [0,1]$ help pruning by encoding the amount of data assigned to granule $\gamma^i$. Learning starts with $\delta^i = 1$. In subsequent steps, $\delta^i$ is reduced whenever $\gamma^i$ is not activated within $h_r$ steps:

$$\delta^i(\text{new}) = \zeta\, \delta^i(\text{old}), \qquad (30)$$

where $\zeta \in [0,1]$. Otherwise, if $\gamma^i$ is activated at least once within $h_r$ steps, then $\delta^i$ is increased:

$$\delta^i(\text{new}) = \delta^i(\text{old}) + \zeta(1 - \delta^i(\text{old})). \qquad (31)$$

If the value of $\delta^i$ falls below a threshold $\vartheta$, then granule $\gamma^i$ and its respective neuron $A^i$ are pruned, because they do not affect system accuracy significantly.
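The pruning bookkeeping of (30) - (31) reduces to a two-branch update; the tuple return signalling the pruning decision is a design choice of ours.

```python
def update_delta(delta, activated, zeta, theta):
    """Output-layer weight update, Eqs. (30)-(31); the boolean
    returned with the new delta signals pruning (delta < theta)."""
    if activated:
        delta = delta + zeta * (1.0 - delta)   # Eq. (31)
    else:
        delta = zeta * delta                   # Eq. (30)
    return delta, delta < theta
```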
H. Combination of granules

Relationships between granules may be strong enough to justify assembling a larger granule that inherits the information of the smaller granules. A suitable metric to measure the distance between trapezoidal objects is

$$D(\gamma^{i_1}, \gamma^{i_2}) = \frac{1}{4(n+1)} \Big( \sum_{j=1}^{n} \big( |\underline{g}_j^{i_1} - \underline{g}_j^{i_2}| + |g_j^{i_1} - g_j^{i_2}| + |\overline{g}_j^{i_1} - \overline{g}_j^{i_2}| + |\overline{\overline{g}}_j^{i_1} - \overline{\overline{g}}_j^{i_2}| \big) + |\underline{u}^{i_1} - \underline{u}^{i_2}| + |u^{i_1} - u^{i_2}| + |\overline{u}^{i_1} - \overline{u}^{i_2}| + |\overline{\overline{u}}^{i_1} - \overline{\overline{u}}^{i_2}| \Big). \qquad (32)$$

$D$ satisfies:

$D(\gamma^{i_1}, \gamma^{i_2}) \ge 0$;
$D(\gamma^{i_1}, \gamma^{i_2}) = 0$ if and only if $\gamma^{i_1} = \gamma^{i_2}$;
$D(\gamma^{i_1}, \gamma^{i_2}) = D(\gamma^{i_2}, \gamma^{i_1})$;
$D(\gamma^{i_1}, \gamma^{i_3}) \le D(\gamma^{i_1}, \gamma^{i_2}) + D(\gamma^{i_2}, \gamma^{i_3})$,

for any $\gamma^{i_1}$, $\gamma^{i_2}$, and $\gamma^{i_3}$. Thus $D$ is a distance measure. In addition, $D$ is fast to compute and more accurate than both the distance between midpoints and the distance between closest points. Granules are combined after $h_r$ steps considering the lowest value of $D(\gamma^{i_1}, \gamma^{i_2})$, $i_1, i_2 = 1, ..., c$, $i_1 \ne i_2$, and a decision criterion. The decision criterion may check whether the new granule obeys the maximum width $\rho$. A new granule $\gamma^i$, a coarsening of $\gamma^{i_1}$ and $\gamma^{i_2}$, is formed by trapezoidal membership functions $G_j^i$ as follows:

$$G_j^i = \text{ch}(G_j^{i_1}, G_j^{i_2}), \ j = 1, ..., n. \qquad (33)$$

$\Gamma^i$ is obtained similarly. The new granule $\gamma^i$ encloses the supports and cores of the granules combined. The coefficients of the new local function $p^i$ are found using

$$a_j^i = \frac{1}{2}\left(a_j^{i_1} + a_j^{i_2}\right), \ j = 0, ..., n. \qquad (34)$$
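A sketch of the distance (32) and the coarsening (33) - (34), reusing convex_hull() from the earlier sketch; granules are assumed to be passed as lists of Trapezoid antecedents plus one Trapezoid consequent.

```python
def distance(G1, Gamma1, G2, Gamma2):
    """Distance between granules, Eq. (32): averaged L1 distance over
    the 4(n+1) trapezoid parameters of antecedents and consequent."""
    total = 0.0
    for t1, t2 in zip(G1 + [Gamma1], G2 + [Gamma2]):
        total += (abs(t1.l - t2.l) + abs(t1.lc - t2.lc)
                  + abs(t1.uc - t2.uc) + abs(t1.u - t2.u))
    return total / (4.0 * (len(G1) + 1))

def combine(G1, Gamma1, a1, G2, Gamma2, a2):
    """Coarsening of two granules: convex hulls of antecedents and
    consequent, Eq. (33), and averaged coefficients, Eq. (34)."""
    G = [convex_hull([t1, t2]) for t1, t2 in zip(G1, G2)]
    Gamma = convex_hull([Gamma1, Gamma2])
    a = [(c1 + c2) / 2.0 for c1, c2 in zip(a1, a2)]
    return G, Gamma, a
```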
I. Learning algorithm

The learning algorithm to evolve granular neural networks can be summarized as follows:

———————————————————————
BEGIN
  Select a type of neuron for the aggregation and output layers;
  Set parameters ρ, h_r, η, ζ, ϑ; c = 0;
  Read (x, y)^[h], h = 1;
  Create granule γ^{c+1}, neurons A^{c+1}, A^f, and respective connections;
  For h = 2, ... do
    Read (x, y)^[h];
    Input x^[h] to the network;
    Compute compatibility degrees (o^1, ..., o^c);
    Aggregate values using A^f to get singular approximation ŷ^[h];
    Compute convex hull of Γ^{i*}, i* = {i : o^i > 0};
    Find granular approximation (u̲^{i*}, u^{i*}, ū^{i*}, u̿^{i*});
    Compute output error ε^[h] = y^[h] − ŷ^[h];
    If x^[h] is not within expansion regions E^i ∀i
      Create granule γ^{c+1}, neuron A^{c+1}, and connections;
    Else
      Update the most active granule γ^i, i = arg max(o^1, ..., o^c);
      Update local function parameters a_j^i using recursive least squares;
      Update connection weights w_j^i ∀j, i;
    If h = αh_r, α = 1, 2, ...
      Combine granules when feasible;
      Update model granularity ρ;
      Adapt connection weights δ^i ∀i;
      Prune inactive granules and respective connections;
END
———————————————————————

V. WEATHER TEMPERATURE FORECASTING

This section considers fuzzy granular data streams derived from monthly mean, minimum, and maximum temperatures of weather time series from different geographic regions. The aim is to predict monthly temperatures for all regions.
A. Weather forecasting

Weather forecasts are useful to plan activities, protect property, and assist decision making in several economic sectors such as energy, transportation, aviation, agriculture, and inventory planning. Any system that is sensitive to the state of the atmosphere may benefit from weather forecasts. Monthly temperature data carry a degree of uncertainty due to the imprecision of atmospheric measurements, instrument malfunction, and erroneous transcripts. Temperature data are usually numerical, but the processes that originate and supply the data are imprecise. Temperature estimates at finer time granularities (days, weeks) are a common demand. Evolving granular approaches such as eGNN provide guaranteed granular predictions of the time series in these cases. How satisfactory a granular prediction is depends on the compactness of the prediction model. Granular predictions together with singular predictions are important because they convey both a value and a range of possible temperature values. The computational experiments assume fuzzy data whose membership functions are translations of the average minimum, mean, and maximum monthly temperatures into triangular fuzzy numbers. The data were normalized to the range [0, 1]. We use data from different weather stations, summarized in Table I (data available at http://eca.knmi.nl and http://cdiac.ornl.gov/epubs/ndp/ushcn/ushcn.html).

TABLE I
MONTHLY TEMPERATURE VALUES

Station        Samples   From       To         Std. Dev.
Bucharest      960       Jan 1930   Dec 2010   0.1795
Death Valley   1308      Jan 1901   Dec 2009   0.1835
Helsinki       1680      Jan 1871   Dec 2010   0.1842
Lisbon         1200      Jan 1910   Dec 2009   0.1556
Ottawa         1380      Jan 1895   Dec 2009   0.1790
During the computational experiments described subsequently, eGNN scans the data only once, on a per-sample basis, to simulate online data stream processing. Algorithm performance is evaluated using the root mean square error of the normalized singular predictions,

$$RMSE = \sqrt{\frac{1}{H} \sum_{h=1}^{H} \left(\text{mp}(y)^{[h]} - \hat{y}^{[h]}\right)^2}, \qquad (35)$$

the non-dimensional error index,

$$NDEI = \frac{RMSE}{\text{std}\left(\text{mp}(y)^{[h]}\ \forall h\right)}, \qquad (36)$$

the average number of rules in the model structure, and the per-sample CPU time in milliseconds. The computer has a dual-core 2.54 GHz processor with 4 GB of RAM.
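For reference, the two error indices (35) - (36) can be computed as below, with y_mid the stream of target midpoints and y_hat the singular predictions.

```python
import numpy as np

def rmse(y_mid, y_hat):
    """Eq. (35): RMSE of normalized singular predictions."""
    y_mid, y_hat = np.asarray(y_mid), np.asarray(y_hat)
    return np.sqrt(np.mean((y_mid - y_hat) ** 2))

def ndei(y_mid, y_hat):
    """Eq. (36): RMSE scaled by the standard deviation of the targets."""
    return rmse(y_mid, y_hat) / np.std(np.asarray(y_mid))
```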
B. Performance analysis

Different computational intelligence approaches were chosen for performance assessment: the dynamic evolving neuro-fuzzy inference system (DENFIS) [2], evolving Takagi-Sugeno (eTS) [26], fuzzy set-based evolving modeling (FBeM) [19], interval-based evolving modeling (IBeM) [18], the multilayer perceptron neural network (MLP) [27], and extended Takagi-Sugeno (xTS) [28]. The task of the different approaches is to give a one-step-ahead forecast of the monthly temperature $y^{[h+1]}$ using the last 12 observations $x^{[h-11]}, ..., x^{[h]}$. The online methods employ a sample-per-sample testing-before-training approach as follows. First, an estimate $\hat{y}^{[h+1]}$ is derived for a given input $(x^{[h-11]}, ..., x^{[h]})$. One time step later, the actual value $y^{[h+1]}$ becomes available and model adaptation is performed if necessary. Table II shows the forecasting results.

TABLE II
TEMPERATURE FORECASTS

Station       Method   # Rules   RMSE     NDEI     CPU
Bucharest     DENFIS   5.00      0.0800   0.4457   4.7
              eGNN     3.80      0.0594   0.3309   1.6
              eTS      3.00      0.0598   0.3331   1.1
              FBeM     7.57      0.0603   0.3359   1.1
              IBeM     5.88      0.0643   0.3582   1.0
              MLP      –         0.0892   0.4969   35.5
              xTS      10.00     0.0643   0.3582   1.0
Death Valley  DENFIS   8.00      0.0600   0.3270   4.7
              eGNN     3.91      0.0498   0.2714   1.6
              eTS      3.00      0.0491   0.2676   1.0
              FBeM     8.00      0.0506   0.2757   1.1
              IBeM     8.79      0.0541   0.2948   1.0
              MLP      –         0.0584   0.3183   44.2
              xTS      10.00     0.0503   0.2741   1.1
Helsinki      DENFIS   24.00     0.0780   0.4235   5.7
              eGNN     2.78      0.0607   0.3295   1.6
              eTS      4.00      0.0634   0.3442   1.4
              FBeM     6.00      0.0602   0.3268   1.1
              IBeM     10.38     0.0764   0.4148   1.2
              MLP      –         0.0892   0.4843   35.5
              xTS      16.00     0.0651   0.3534   1.1
Lisbon        DENFIS   12.00     0.0880   0.5656   5.2
              eGNN     2.77      0.0577   0.3708   1.7
              eTS      4.00      0.0714   0.4589   2.3
              FBeM     5.63      0.0599   0.3850   1.2
              IBeM     3.59      0.0687   0.4415   1.0
              MLP      –         0.0955   0.6138   48.2
              xTS      11.00     0.0744   0.4781   1.0
Ottawa        DENFIS   7.00      0.0770   0.4302   4.9
              eGNN     3.88      0.0575   0.3212   1.5
              eTS      3.00      0.0604   0.3374   1.0
              FBeM     6.80      0.0609   0.3402   1.1
              IBeM     9.28      0.0734   0.4101   1.1
              MLP      –         0.0769   0.4296   41.3
              xTS      14.00     0.0631   0.3525   1.1
Table II shows that eGNN gives the most precise forecasts in 3 of the 5 temperature data sets, followed by eTS and FBeM with one each. The eGNN structures are, on average, the most parsimonious. Alternative evolving approaches such as DENFIS, eTS, and xTS use numeric data, namely the mean temperature. In contrast, granular approaches such as eGNN, IBeM, and FBeM take into account the mean and its neighboring data to bound the forecasts. IBeM and xTS are the fastest among the algorithms evaluated in this paper. As an example, the one-step singular and granular forecasts of eGNN for the Helsinki time series are shown in Fig. 5. The additional plots of Fig. 5 show the granularity, error indices, and number of rules. Note that while the singular prediction $p$ attempts to match the actual mean temperature value, the corresponding granular information $[\underline{u}, \overline{\overline{u}}]$, formed by the lower and upper bounds of the consequent trapezoidal membership functions, intends to envelop the previous data and the uncertainty of the unknown temperature function $f$.
Fig. 5. eGNN Helsinki temperature forecasts

The results suggest that eGNN benefits from data uncertainty and from the neurofuzzy granular framework to provide accurate and linguistic predictions of fuzzy time series. eGNN is an evolving approach able to process fuzzy data streams and to provide simultaneous singular and granular forecasts.

VI. CONCLUSION

This paper has introduced a fuzzy data stream modeling framework based on an evolving granular neural network approach. The eGNN framework processes fuzzy data streams using fuzzy granular models, fuzzy aggregation neurons, and an incremental learning algorithm. Its neurofuzzy structure encodes a set of fuzzy rules and embeds a fuzzy inference system. The resulting modeling approach trades off precision and interpretability by combining functional and linguistic fuzzy models in a single framework. eGNN provides singular as well as granular approximations of functions. An application example concerning weather temperature forecasting has shown that eGNN is highly competitive with state-of-the-art evolving approaches. Further work shall address methods to control the specificity of granules during learning, and linguistic approximation.

ACKNOWLEDGMENT

The first author acknowledges CAPES, Brazilian Ministry of Education, for his fellowship. The second author is grateful to the Energy Company of Minas Gerais - CEMIG, Brazil, for grant P&D178. The last author thanks CNPq, the Brazilian National Research Council, for grant 304596/2009-4.

REFERENCES
[1] Angelov, P.; Filev, D.; Kasabov, N. (Eds.) Evolving Intelligent Systems: Methodology and Applications. Wiley-IEEE Press Series on CI, 2010.
[2] Kasabov, N. Evolving Connectionist Systems: The Knowledge Engineering Approach. Springer-Verlag - London, 2nd edition, 2007.
[3] Lughofer, E. Evolving Fuzzy Systems - Methodologies, Advanced Concepts and Applications. Springer-Verlag, Berlin Heidelberg, 2011.
[4] Bouchon-Meunier, B.; Marsala, C.; Rifqi, M.; Yager, R. Uncertainty in Intelligent and Information Systems. World Scientific - SG, 2008.
[5] Kreinovich, V. "Interval computations as an important part of granular computing: an introduction." In: Pedrycz; Skowron; Kreinovich (Eds.) Handbook of Granular Computing, pp: 1-31, 2008.
[6] Bargiela, A.; Pedrycz, W. Granular Computing: An Introduction. Kluwer Academic Publishers - Boston, 1st edition, 2002.
[7] Pedrycz, W.; Gomide, F. Fuzzy Systems Engineering: Toward Human-Centric Computing. Wiley - Hoboken, 2007.
[8] Yao, J. T. "A ten-year review of granular computing." IEEE International Conference on Granular Computing, pp: 734-739, 2007.
[9] Yao, Y. Y. "Granular computing: past, present and future." IEEE International Conference on Granular Computing, pp: 80-85, 2008.
[10] Zadeh, L. A. "Fuzzy sets and information granularity." In: Gupta, M. M.; Ragade, R. K.; Yager, R. R. (Eds.) Advances in Fuzzy Set Theory and Applications, North Holland - Amsterdam, pp: 3-18, 1979.
[11] Zadeh, L. A. "Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic." Fuzzy Sets and Systems, Vol. 90, Issue 2, pp: 111-127, 1997.
[12] Witten, I. H.; Frank, E.; Hall, M. A. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, 3rd edition, 2011.
[13] Liggins, M. E.; Hall, D. L.; Llinas, J. (Eds.) Handbook of Multisensor Data Fusion: Theory and Practice. CRC Press, 2nd edition, 2008.
[14] Maimon, O. Z.; Rokach, L. The Data Mining and Knowledge Discovery Handbook. Springer - New York, USA, 2005.
[15] Pedrycz, W.; Vukovich, W. "Granular neural networks." Neurocomputing, Vol. 36, pp: 205-224, 2001.
[16] Leite, D.; Costa, P.; Gomide, F. "Evolving granular neural networks from fuzzy data streams." Neural Networks (Submitted), 17p, 2012.
[17] Leite, D.; Costa, P.; Gomide, F. "Evolving granular neural network for semi-supervised data stream classification." World Congress on Computational Intelligence, pp: 1877-1884, Jul. 2010.
[18] Leite, D.; Costa, P.; Gomide, F. "Interval approach for evolving granular system modeling." In: Mouchaweh, M.; Lughofer, E. (Eds.) Learning in Non-stationary Environments, Springer - NY, 30p, 2012.
[19] Leite, D. "Evolving Fuzzy Granular Modeling from Nonstationary Fuzzy Data Streams." Evolving Systems (Submitted), 25p, 2012.
[20] Bouchon-Meunier, B. (Ed.) Aggregation and Fusion of Imperfect Information (SFSC). Physica-Verlag, Heidelberg, New York, 1998.
[21] Beliakov, G.; Pradera, A.; Calvo, T. Aggregation Functions: A Guide for Practitioners (SFSC). Springer-Verlag - Berlin, Heidelberg, 2007.
[22] Zadeh, L. A. "Generalized theory of uncertainty (GTU) - principal concepts and ideas." Computational Statistics & Data Analysis, Vol. 51, pp: 15-46, 2006.
[23] Yager, R. R. "Participatory learning with granular observations." IEEE Transactions on Fuzzy Systems, Vol. 17, Issue 1, 2009.
[24] Yager, R. R. "Learning from imprecise granular data using trapezoidal fuzzy set representations." In: Prade, H.; Subrahmanian, V. S. (Eds.) Lecture Notes in Computer Science, Vol. 4772, pp: 244-254, 2007.
[25] Pedrycz, W. "Heterogeneous fuzzy logic networks: fundamentals and development studies." IEEE Transactions on Neural Networks, Vol. 15, Issue 6, pp: 1466-1481, 2004.
[26] Angelov, P.; Filev, D. "An approach to online identification of Takagi-Sugeno fuzzy models." IEEE Transactions on Systems, Man, and Cybernetics - Part B, Vol. 34, Issue 1, pp: 484-498, 2004.
[27] Haykin, S. Neural Networks: A Comprehensive Foundation. Prentice Hall, 2nd edition, 1999.
[28] Angelov, P.; Zhou, X. "Evolving fuzzy systems from data streams in real-time." Int. Symp. on Evolving Fuzzy Systems, pp: 29-35, 2006.