An Approach to Employ eTS Learning Algorithm for the Valuation of Residential Premises

Tadeusz Lasota¹, Zbigniew Telec², Bogdan Trawiński³, Krzysztof Trawiński⁴

¹ Wrocław University of Environmental and Life Sciences, Dept. of Spatial Management, Ul. Norwida 25/27, 50-375 Wrocław, Poland
²,³ Wrocław University of Technology, Institute of Informatics, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
⁴ European Centre for Soft Computing, Edificio Científico-Tecnológico, 3ª Planta, C. Gonzalo Gutiérrez Quirós S/N, 33600 Mieres, Asturias, Spain
[email protected], {Zbigniew.Telec, Bogdan.Trawinski}@pwr.wroc.pl, [email protected]

Abstract. An attempt has been made to employ the evolving Takagi-Sugeno algorithm (eTS) to build models assisting property valuation on the basis of actual data drawn from a cadastral system, a registry of sales transactions, and a cadastral map. Seven methods of feature selection were applied and evaluated. The eTS performance was compared with three algorithms implemented in KEEL: a model tree for regression, an artificial neural network, and a support vector machine. The results confirmed the advantages of the eTS algorithm.

Keywords: eTS learning algorithm, property valuation, feature selection, GIS

1 Introduction

Evolving intelligent systems are self-developing and self-learning systems that combine intelligent systems with on-line learning algorithms extracting knowledge from data. The evolving concept on which they are based means a model structure that expands or shrinks, capable of adapting to changes in the environment that cannot be provided or learned a priori. The field emerged in response to the lack of on-line learning intelligent systems for real-time applications.

The foundation for evolving systems is established by fuzzy and neuro-fuzzy systems, which include the evolving connectionist system (ECOS) [7], the evolving fuzzy neural network (EFuNN) [6], and dynamic evolving neuro-fuzzy inference systems (DENFIS) [8]. Another group is formed by evolving fuzzy systems (EFS), which are based on fuzzy rules with fuzzy antecedents and generic consequents. EFS combine unsupervised learning for the antecedent part of the model with supervised learning of the consequent parameters. An evolving Takagi-Sugeno system (eTS) was proposed by Angelov and Filev in [2], [3]. Its rule base and parameters continuously evolve by adding new rules with more summarization power and/or modifying existing rules and parameters. The core of the system is keeping the Takagi-Sugeno consequent parameters and the scatter-type antecedents, derived from a fuzzy clustering process, up to date.

In our previous works [10], [11] we investigated different machine learning algorithms, among others genetic fuzzy systems, devoted to building data-driven models to assist with real estate appraisals using MATLAB and KEEL tools. However, our former works did not consider the time dimension, whereas evolving models, which are able to learn on-line, offer a promising solution to the problem of selecting an appropriate model for a system assisting real estate appraisal. In this paper we apply the eTS model to perform feature selection on a data set obtained by combining cadastral data of residential premises, records of sales/purchase transactions, and GIS data derived from a cadastral map.

The concept of data-driven models for premises valuation presented in this paper was developed on the basis of the sales comparison method. The architecture of the proposed system is shown in Fig. 1. The appraiser accesses the system through the internet and inputs the values of the attributes of the premises being appraised; the system calculates the output using a given model and sends the suggested value of the property back to the appraiser.

Fig. 1. Concept of information systems for property valuation using eTS system

2 Evolving Fuzzy Systems

Evolving systems are self-developing and self-learning systems that combine intelligent systems with on-line learning algorithms capable of extracting knowledge from data. In this contribution we employ the evolving Takagi-Sugeno model (eTS), proposed in [2], [3], to assist with real estate appraisal. In the eTS model, the rule base and parameters continuously evolve by adding new rules with more summarization power and/or modifying existing rules and parameters. The core of the system is keeping the Takagi-Sugeno consequent parameters and the scatter-type antecedents, derived from a fuzzy clustering process, up to date. The eTS is based on the TS model, which deals with a special type of fuzzy rules with fuzzy antecedents and functional consequents:

\mathcal{R}_i:\ \text{IF } x_1 \text{ is } A_{i1} \text{ AND } \dots \text{ AND } x_n \text{ is } A_{in} \text{ THEN } y_i = a_{i0} + a_{i1} x_1 + \dots + a_{in} x_n    (1)

where \mathcal{R}_i denotes the ith fuzzy rule, i = 1, \dots, R; x_j denotes the jth input, j = 1, \dots, n; A_{ij} stands for the antecedent fuzzy sets; y_i is the output of the ith linear subsystem with a_{il}, l = 0, \dots, n, as its parameters; and R and n are the numbers of rules and input variables, respectively. The data space is divided into fuzzy regions, where each region is associated with a linear subsystem; the result is a collection of loosely (fuzzily) combined subsystems. The standard eTS model uses Gaussian antecedent fuzzy sets:

\mu_{ij} = e^{-\alpha (x_j - x_{ij}^*)^2}    (2)

where \alpha = 4/r^2 and r is a positive constant which defines the spread of the antecedent and the zone of influence of the ith model.

The identification of the eTS model is divided into two parts. First, unsupervised learning of the antecedent part of the model takes place; then supervised learning is applied to estimate the consequent parameters. The first task is solved by clustering the input-output data space into fuzzy regions. The eTS on-line clustering approach is based on subtractive clustering, an improved version of the so-called mountain clustering approach, and calculates for each data sample a measure of its spatial proximity to all other data points, the so-called potential. The potential of the new data point z_k is recursively calculated as follows:

P_k(z_k) = \frac{k-1}{(k-1)(\vartheta_k + 1) + \sigma_k - 2\nu_k}    (3)

where \vartheta_k = \sum_{j=1}^{n+1} (z_k^j)^2; \sigma_k = \sum_{l=1}^{k-1} \sum_{j=1}^{n+1} (z_l^j)^2; \nu_k = \sum_{j=1}^{n+1} z_k^j \beta_k^j; \beta_k^j = \sum_{l=1}^{k-1} z_l^j. The parameters \vartheta_k and \nu_k are calculated from the current data point z_k, while \sigma_k and \beta_k^j are recursively updated as \sigma_k = \sigma_{k-1} + \sum_{j=1}^{n+1} (z_{k-1}^j)^2 and \beta_k^j = \beta_{k-1}^j + z_{k-1}^j. The potential of the cluster centres is updated as follows:

P_k(z_l^*) = \frac{(k-1)\, P_{k-1}(z_l^*)}{k - 2 + P_{k-1}(z_l^*) + P_{k-1}(z_l^*) \sum_{j=1}^{n+1} \big(d_{k(k-1)}^j\big)^2}    (4)

where P_k(z_l^*) is the potential at time k of the cluster centre z_l^*, and d_{k(k-1)}^j denotes the projection on the jth axis of the distance between the current data point and the cluster centre.

When a new data point is substantially different from the existing clusters, the cluster structure is updated. If the potential of the new data point is higher than the potential of the existing centres and the new data point is close to an old centre, the new data point replaces that centre. If only the first condition is fulfilled, the point is added to the rule base as the centre of a new rule. For fixed cluster centres, the eTS model transforms into a linear model:

y = \psi^T \theta    (5)

where \theta = [\pi_1^T, \pi_2^T, \dots, \pi_R^T]^T is a vector composed of the linear model parameters, and \psi = [\lambda_1 x_e^T, \lambda_2 x_e^T, \dots, \lambda_R x_e^T]^T is a vector of the expanded inputs x_e = [1, x^T]^T weighted by the normalized firing levels \lambda_i of the rules.

The subsystem parameters can be learned recursively using the recursive least squares (RLS) algorithm, also known as the Kalman filter:

\theta_k = \theta_{k-1} + C_k \psi_k (y_k - \psi_k^T \theta_{k-1})    (6)

C_k = C_{k-1} - \frac{C_{k-1} \psi_k \psi_k^T C_{k-1}}{1 + \psi_k^T C_{k-1} \psi_k}    (7)

where C is the R(n+1) \times R(n+1) covariance matrix. The eTS employs the extended Kalman filter, as the linearity assumption fails when the cluster centres are continuously updated [2], [3].
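Two of the building blocks above, the recursive potential of eq. (3) and the RLS consequent update of eqs. (6)-(7), can be sketched numerically. The snippet below is a simplified, offline illustration with fixed cluster centres and an invented one-dimensional toy stream; it is not the authors' MATLAB implementation, and all function names and constants are ours.

```python
# Sketch of two eTS building blocks: recursive potential (eq. 3) and the
# RLS consequent update (eqs. 5-7). Cluster centres are fixed here; the
# full eTS also adds and replaces centres on-line. Illustrative only.
import numpy as np

def recursive_potential(Z):
    """Return P_k(z_k) for each point z_k of the stream Z (eq. 3)."""
    sigma, beta = 0.0, np.zeros(Z.shape[1])  # sums over past points
    potentials = []
    for k, z in enumerate(Z, start=1):
        if k == 1:
            potentials.append(1.0)           # first point gets full potential
        else:
            theta_k = np.sum(z ** 2)         # \vartheta_k
            nu_k = np.dot(z, beta)           # \nu_k
            potentials.append((k - 1) / ((k - 1) * (theta_k + 1) + sigma - 2 * nu_k))
        sigma += np.sum(z ** 2)              # recursive sigma update
        beta += z                            # recursive beta update
    return np.array(potentials)

def firing(x, centres, r=0.5):
    """Normalized rule firing levels from Gaussian memberships (eq. 2)."""
    alpha = 4.0 / r ** 2
    tau = np.exp(-alpha * ((x - centres) ** 2).sum(axis=1))
    return tau / tau.sum()

def rls_step(theta, C, psi, y):
    """One RLS step: update C (eq. 7) first, then theta (eq. 6)."""
    Cpsi = C @ psi
    C = C - np.outer(Cpsi, Cpsi) / (1.0 + psi @ Cpsi)
    return theta + (C @ psi) * (y - psi @ theta), C

# Toy run: two rules on a 1-D stream with target y = 2x + 1.
centres = np.array([[0.25], [0.75]])
theta, C = np.zeros(4), 1000.0 * np.eye(4)   # R*(n+1) = 2*2 parameters
for x in np.linspace(0.0, 1.0, 200):
    lam = firing(np.array([x]), centres)
    psi = np.concatenate([l * np.array([1.0, x]) for l in lam])
    theta, C = rls_step(theta, C, psi, 2.0 * x + 1.0)

lam = firing(np.array([0.5]), centres)
psi = np.concatenate([l * np.array([1.0, 0.5]) for l in lam])
y_hat = psi @ theta                          # converges towards 2*0.5 + 1 = 2
```

Note that, as in eqs. (6)-(7), the covariance matrix is updated before the parameter vector, since eq. (6) uses C_k rather than C_{k-1}.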

3 Experiment Description and Results

The main goal of our study was to exploit the strengths of the eTS model, i.e. very short execution time and relative resistance to outliers, to perform feature selection on a data set obtained by combining cadastral data of residential premises, records of sales/purchase transactions, and GIS data derived from a cadastral map. The resulting data set consisted of 3843 records with 19 input variables characterizing the premises and the transaction price as the output. The features with their descriptive statistics are presented in Table 1. Fig. 2 illustrates how the GIS data were obtained for given premises.

Table 1. Features of residential premises supplemented with GIS data

No.  Name         Max     Min    Avg     Std    Med     Description
01   Area         178.6   14.4   49.8    20.4   46.8    usable area of premises
02   Year         2004    1850   1945    34     1957    year of building construction
03   Floor        4       0      2       1      2       floor of premises
04   Storeys      12      1      5       2      5       no. of storeys in a building
05   Rooms        8       1      3       1      3       no. of rooms in premises
06   ResPrem      91      1      15      17     10      residential premises in a building
07   NResPrem     42      0      9       9      6       non-resid. premises in a building
08   Xc           100296  82713  94272   1910   94139   geodetic coordinate Xc (NS)
09   Yc           49753   39809  44674   1704   44357   geodetic coordinate Yc (WE)
10   Centre       12396   20     2150    1200   1973    distance from the centre of a city
11   Shopping     3557    81     1026    528    1027    distance from the shopping centre
12   ComHub       8493    15     1052    662    989     distance from communications hub
13   TownEdge     6984    965    5012    989    5135    distance from the edge of a town
14   RiverBank    5330    92     1342    773    1184    distance from the river bank
15   Park         2304    86     868     401    804     distance from the closest park
16   Forest       9637    872    6458    1198   6536    distance from the closest forest
17   Railway      2368    31     699     429    625     distance from railway line
18   RailStation  5777    161    1828    956    1780    distance from railway station
19   Airport      15228   2801   8863    1826   8712    distance from the airport
20   Price        540200  20500  120939  64954  102000  price of premises


Fig. 2. Illustration of features obtained for given premises from the cadastral map

Table 2. Features ordered by importance using different methods

Order  Expert  FFS  BFS  Corr.  VSC  VIB  VIF
1      01      01   01   01     01   01   01
2      04      02   02   05     05   02   05
3      03      14   19   07     07   10   02
4      02      06   11   04     02   08   10
5      05      12   05   18     18   05   07
6      10      09   18   13     03   19   18
7      14      17   15   17     10   07   13
8      15      03   04   15     11   09   16
9      06      15   14   09     04   16   11
10     07      11   10   03     06   17   17
11     09      07   09   10     09   11   15
12     08      04   12   11     13   14   08
13     11      10   16   16     15   18   14
14     12      16   13   19     17   13   09
15     17      13   17   06     14   12   19
16     16      05   07   08     16   06   12
17     13      08   06   02     12   04   06
18     18      19   08   12     19   15   04
19     19      18   03   14     08   03   03

Seven different feature selection approaches were compared. First, an academic expert in the field of real estate prioritized the features as property price drivers. Second, forward feature selection (FFS) and backward feature selection (BFS) were accomplished using the eTS algorithm with the mean squared error (MSE) as the selection criterion; this required laborious data processing comprising in total 380 runs of the eTS algorithm with data sets containing from 1 to 19 input variables. Third, the features were arranged in order of decreasing linear correlation between the individual features and the premises prices. Finally, three methods available in the Statistica Data Miner package were employed [5]: variable screening, which makes no assumption about the relationships between the predictors and the outcome variable (VSC), as well as the variable importance measures of the gradient boosting machine (VIB) and random forests (VIF) [9].

All the methods resulted in a ranking of the features according to their importance (see Table 2). Then, for each method, 19 models comprising the best 1, 2, ..., 19 variables were created using the eTS algorithm. The resulting MSE values of the models obtained in this way are presented in Table 3, where values below 0.001 have been distinguished. The best performance was revealed by the models built on the basis of the FFS, BFS, and VIB methods and the expert's ranking, whereas the correlation, VSC, and VIF methods did not provide an MSE lower than 0.001 in any case.

Table 3. MSE for models with increasing number of features according to their importance

No. of features  Expert   FFS      BFS      Correlation  VSC      VIB      VIF
1                0.00112  0.00112  0.00112  0.00112      0.00112  0.00112  0.00112
2                0.00092  0.00071  0.00071  0.00125      0.00125  0.00071  0.00125
3                0.00094  0.00071  0.00088  0.00114      0.00114  0.00073  0.00101
4                0.00093  0.00071  0.00081  0.00101      0.00116  0.00120  0.00155
5                0.00098  0.00074  0.00089  0.00114      0.00117  0.03000  0.00116
6                0.00132  0.00081  0.00109  0.00112      0.00157  0.00125  0.00121
7                0.00149  0.00112  0.00076  0.00117      0.00166  0.00147  0.00281
8                0.00178  0.00097  0.00094  0.00146      0.00144  0.00122  0.00194
9                0.00224  0.00086  0.00090  0.00169      0.00169  0.00178  0.00299
10               0.00156  0.00135  0.00114  0.00124      0.00135  0.00359  0.00109
11               0.00292  0.00168  0.00105  0.00116      0.00161  0.00336  0.00121
12               0.00338  0.00152  0.00126  0.00155      0.00164  0.00279  0.00145
13               0.00180  0.00146  0.00106  0.00181      0.00183  0.00162  0.00142
14               0.00268  0.00159  0.00118  0.00201      0.00208  0.00448  0.00178
15               0.00177  0.00190  0.00148  0.00234      0.00230  0.00479  0.00221
16               0.00192  0.00194  0.00108  0.00166      0.00262  0.00534  0.00164
17               0.00205  0.00205  0.00098  0.00179      0.00274  0.00250  0.00172
18               0.00276  0.00247  0.00135  0.00195      0.00152  0.00135  0.00135
19               0.00174  0.00174  0.00174  0.00174      0.00174  0.00174  0.00174
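The FFS procedure described above, scoring candidate subsets by the MSE of a freshly trained model, follows the standard greedy wrapper scheme. A minimal sketch is given below; a plain least-squares model stands in for the eTS learner purely to keep the example self-contained, and the synthetic data are our own invention.

```python
# Greedy forward feature selection scored by MSE. In the paper each candidate
# subset is scored by training the eTS model; a least-squares fit stands in
# for it here so that the sketch is self-contained. Illustrative only.
import numpy as np

def subset_mse(X, y, cols):
    """MSE of a least-squares fit on the chosen columns (stand-in scorer)."""
    A = np.column_stack([np.ones(len(X)), X[:, cols]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((A @ coef - y) ** 2))

def forward_selection(X, y):
    """Return all features ordered by greedy forward selection."""
    remaining = list(range(X.shape[1]))
    selected = []
    while remaining:
        # Pick the feature whose addition yields the lowest subset MSE.
        best = min(remaining, key=lambda j: subset_mse(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy check: feature 0 carries almost all of the signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=200)
order = forward_selection(X, y)
```

Backward selection (BFS) works symmetrically, starting from the full set and greedily dropping the feature whose removal degrades the MSE least; running both over 1 to 19 variables accounts for the 380 eTS runs mentioned above.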

Figs. 3 and 4 show the performance of the different feature selection methods as the number of variables included increases. It can be observed that up to 8 features all methods except VIF provide good performance. As the number of features increased, the methods became unstable, giving first lower and then higher values of MSE, probably due to overfitting.

Fig. 3. Performance comparison of different methods of feature selection

Fig. 4. Performance comparison of different methods of feature selection

In order to further evaluate the features selected with the eTS algorithm, the 18 best sets of features giving an MSE lower than 0.001 (from those shown in Table 3) were used to generate models with another experimental system, KEEL [1]. Three popular machine learning algorithms implemented in KEEL were employed: the M5Rules decision tree (M5R) [13], the MLPerceptronConj-Grad neural network (MLP) [12], and the NU_SVR support vector machine (SVM) [4]. The list of models and the features they encompass is shown in Table 4.

Table 4. Models chosen for comparative study with KEEL

Models [Features]
A=[01,02]         G=[01,02,03,04]     M=[01,02,06,09,12,14]
B=[01,04]         H=[01,02,06,14]     N=[01,02,03,06,09,12,14,17]
C=[01,02,10]      I=[01,02,11,19]     O=[01,02,03,06,09,12,14,15,17]
D=[01,02,14]      J=[01,02,03,04,05]  P=[01,02,05,11,15,18,19]
E=[01,02,19]      K=[01,02,05,11,19]  R=[01,02,04,05,11,15,18,19]
F=[01,03,04]      L=[01,02,06,12,14]  S=[01,02,04,05,11,14,15,18,19]

Owing to the different properties of the eTS system and KEEL (the latter does not allow the time factor to be taken into account), the data set of 3843 sales transactions had to be divided into four two-year subsets covering the years 1999-2000, 2001-2002, 2003-2004, and 2005-2006, respectively. For each subset it may be assumed that within two years the prices of premises with similar attributes were roughly comparable. The median of the MSE obtained over all subsets for the individual models is shown in Fig. 5. It can be observed that lower values of MSE were provided by the models comprising greater numbers of features, i.e. from 6 to 9. However, the MSE values were much higher than in the case of the eTS algorithm, which may be explained by the fact that the eTS algorithm is designed to process streams of data and is resistant to outliers, and consequently does not produce large errors. The data used in our research need to be further cleansed.

Fig. 5. MSE obtained for models with different features generated using KEEL
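The evaluation protocol described above, splitting transactions into two-year subsets and reporting the median of the per-subset MSE, can be sketched as follows. The (year, error) record layout and the example values are our own illustrative assumptions, not the paper's data.

```python
# Sketch of the KEEL evaluation protocol: group transactions into two-year
# subsets (1999-2000 ... 2005-2006), compute the MSE per subset, and report
# the median across subsets. Record layout is an illustrative assumption.
from statistics import mean, median

def two_year_subsets(records):
    """Group (year, error) pairs into two-year bins starting at 1999."""
    bins = {}
    for year, err in records:
        start = 1999 + 2 * ((year - 1999) // 2)
        bins.setdefault((start, start + 1), []).append(err)
    return bins

def median_mse(records):
    """Median over the per-subset mean squared errors."""
    bins = two_year_subsets(records)
    return median(mean(e * e for e in errs) for errs in bins.values())

# Example with constant per-period residuals 0.1, 0.2, 0.3, 0.4.
records = [(1999, 0.1), (2000, 0.1), (2001, 0.2), (2002, 0.2),
           (2003, 0.3), (2004, 0.3), (2005, 0.4), (2006, 0.4)]
result = median_mse(records)
```

The median, rather than the mean, keeps a single badly fitted subset from dominating the comparison between models.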

4 Conclusions and Future Work

The main goal of our study was to exploit the strengths of the eTS model, i.e. very short execution time and relative resistance to outliers, to perform feature selection on a data set obtained by combining cadastral data of residential premises, records of sales/purchase transactions, and GIS data derived from a cadastral map. The resulting data set consisted of 3843 records with 19 input variables characterizing the premises and the transaction price as the output. Seven methods of feature selection were applied and evaluated. The best performance was revealed by the models built on the basis of the FFS, BFS, and VIB methods and the expert's ranking. The eTS method outperformed three algorithms implemented in KEEL: a model tree for regression, an artificial neural network, and a support vector machine. The study proved the usefulness of the eTS algorithm for an internet system assisting with premises valuation, especially as regards its data processing speed and resistance to outliers. However, further investigation is needed, because our preliminary research revealed that eTS performance strongly depends on the quality of the data, and the results reported in this paper refer to the set of actual data which provided the lowest values of MSE.

Acknowledgments. Many thanks to Plamen Angelov for granting us his eTS algorithm code in MATLAB, thereby allowing us to conduct the experiments.

References

1. Alcalá-Fdez, J. et al.: KEEL: A software tool to assess evolutionary algorithms for data mining problems. Soft Computing 13:3, pp. 307--318 (2009)
2. Angelov, P.: Evolving Rule-based Models: A Tool for Design of Flexible Adaptive Systems. Springer-Verlag, Heidelberg (2002)
3. Angelov, P., Filev, D.: An approach to on-line identification of evolving Takagi-Sugeno models. IEEE Trans. on Systems, Man and Cyb., part B, 34:1, pp. 484--498 (2004)
4. Fan, R.E., Chen, P.H., Lin, C.J.: Working set selection using the second order information for training SVM. Journal of Machine Learning Research 6, pp. 1889--1918 (2005)
5. Hill, T., Lewicki, P.: Statistics: Methods and Applications. StatSoft, Tulsa (2007)
6. Kasabov, N.: Evolving fuzzy neural networks for on-line supervised/unsupervised, knowledge-based learning. IEEE Trans. on Sys., Man and Cyb., part B, 31, pp. 902--918 (2001)
7. Kasabov, N.: Evolving connectionist systems: Methods and applications in bioinformatics, brain study, and intelligent machines. Springer-Verlag, Heidelberg (2002)
8. Kasabov, N., Song, Q.: DENFIS: Dynamic Evolving Neural-Fuzzy Inference System and Its Application for Time-Series Prediction. IEEE Trans. on Fuzzy Systems, 10:2, pp. 144--154 (2002)
9. Sandri, M., Zuccolotto, P.: A Bias Correction Algorithm for the Gini Variable Importance Measure in Classification Trees. J. of Comp. and Graph. Stat., 17:3, pp. 611--628 (2008)
10. Król, D., Lasota, T., Trawiński, B., Trawiński, K.: Investigation of evolutionary optimization methods of TSK fuzzy model for real estate appraisal. International Journal of Hybrid Intelligent Systems 5:3, pp. 111--128 (2008)
11. Lasota, T., Mazurkiewicz, J., Trawiński, B., Trawiński, K.: Comparison of Data Driven Models for the Validation of Residential Premises using KEEL. International Journal of Hybrid Intelligent Systems, in press (2009)
12. Moller, F.: A scaled conjugate gradient algorithm for fast supervised learning. Neural Networks 6, pp. 525--533 (1990)
13. Quinlan, J.R.: Learning with Continuous Classes. 5th Australian Joint Conference on Artificial Intelligence (AI92), Singapore, pp. 343--348 (1992)
