Journal of Systems Engineering and Electronics Vol. 22, No. 4, August 2011, pp.615–620 Available online at www.jseepub.com

Combined model based on optimized multi-variable grey model and multiple linear regression

Pingping Xiong1,2,*, Yaoguo Dang1, Xianghua Wu2, and Xuemei Li1

1. College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, P. R. China;
2. College of Mathematics and Physics, Nanjing University of Information Science and Technology, Nanjing 210044, P. R. China

Abstract: Starting from the source of construction errors, the construction method of the background value in the original multi-variable grey model (MGM(1,m)) is improved. The MGM(1,m) with optimized background value is used to eliminate random fluctuations and errors in the observed data of all variables, and a combined prediction model with multiple linear regression is then established in order to improve simulation and prediction accuracy. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed through an example. The results show that the model performs well in both simulation and prediction.

Keywords: multi-variable grey model (MGM(1,m)), background value, optimization, multiple linear regression, combined prediction model.

DOI: 10.3969/j.issn.1004-4132.2011.04.010

1. Introduction

Since the grey system theory was proposed in 1982, it has been widely applied in economic management and many other areas [1,2]. Many scholars have carried out extensive studies on grey systems, covering grey correlation analysis, grey decision making and grey prediction. Chen discussed a grey measurement system and proposed a method of grey representation and information processing [3]. Xie commented on several kinds of grey relational models [4]. Wang extended the multi-attribute decision model of grey targets and gave an optimization method for the weights of grey targets [5]. Rao presented a new multi-attribute decision making model based on optimal membership and relative entropy [6]. Gu proposed an effectiveness evaluation model for weapon systems [7]. Wang derived prediction formulas of grey models under different initial conditions with a recursive method and studied the optimization problem under two criteria [8].

Manuscript received October 16, 2009.
*Corresponding author.
This work was supported by the National Natural Science Foundation of China (71071077); the Ministry of Education Key Project of National Educational Science Planning (DFA090215); the China Postdoctoral Science Foundation (20100481137); and the Funding of Jiangsu Innovation Program for Graduate Education (CXZZ11-0226).

The grey model GM(1,1) is the core of grey prediction, and the multi-variable grey model MGM(1,m) is an extension of the single-variable GM(1,1). Many scholars have improved the MGM(1,m). Zhai proposed the MGM(1,m) and verified by an example that its accuracy is higher than that of GM(1,1) [9]. Li, Wang and Cui used different methods to improve the accuracy of the MGM(1,m) [10-12]. The construction method of the background value is the main factor affecting the prediction accuracy and adaptability of a grey model. Tan and Luo studied the construction error of the background value of GM(1,1) and optimized the background value, respectively [13,14]. Wang used a non-homogeneous exponential function to fit the accumulated generating sequence and optimized the background value [15]. Wang researched the optimization of the background value of the non-equidistant GM(1,1) [16]. In view of the respective advantages of different models, many combined forecasting models of the grey model and other models have been proposed; they exploit the advantages of each model, overcome their shortcomings, and have improved prediction accuracy in practice. Bao predicted the data points of possible future transition dates with GM(1,1), used a piecewise linear regression function for the actual problem, and achieved good results [17]. Liu established a combined model of the original multi-variable grey model and multiple linear regression to remove noise from the data and improve prediction accuracy [18]. On the basis of these studies, this article optimizes the background value of the MGM(1,m) and establishes a combined prediction model with multiple linear regression, which effectively improves the accuracy of simulation and prediction. Finally, the practicality and effectiveness of the combined model are verified by an example.

2. MGM(1,m) based on optimized background value

2.1 Construction of the original MGM(1,m)

Let the original nonnegative data vector be $X^{(0)} = \{X_1^{(0)}, X_2^{(0)}, \ldots, X_m^{(0)}\}^{\rm T}$, where $X_j^{(0)} = \{x_j^{(0)}(1), x_j^{(0)}(2), \ldots, x_j^{(0)}(n)\}^{\rm T}$ $(j = 1, 2, \ldots, m)$. Accumulating generation is conducted for the sequences $X_1^{(0)}, X_2^{(0)}, \ldots, X_m^{(0)}$, respectively, and we obtain the first-order accumulated generating vector $X^{(1)} = \{X_1^{(1)}, X_2^{(1)}, \ldots, X_m^{(1)}\}^{\rm T}$, where $X_j^{(1)} = \{x_j^{(1)}(1), x_j^{(1)}(2), \ldots, x_j^{(1)}(n)\}^{\rm T}$ and $x_j^{(1)}(i) = \sum_{k=1}^{i} x_j^{(0)}(k)$ $(j = 1, 2, \ldots, m;\ i = 1, 2, \ldots, n)$.
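As an illustration of the first-order accumulated generating operation defined above, the following Python sketch (our own illustration; the function name `ago` is not from the paper) builds $X^{(1)}$ from $X^{(0)}$ by row-wise cumulative summation.

```python
import numpy as np

def ago(X0: np.ndarray) -> np.ndarray:
    """First-order accumulated generating operation (1-AGO).

    X0: array of shape (m, n); row j holds x_j^(0)(1), ..., x_j^(0)(n).
    Returns an array of the same shape whose row j holds
    x_j^(1)(i) = sum_{k=1}^{i} x_j^(0)(k).
    """
    return np.cumsum(X0, axis=1)

# Small illustrative example with m = 2 sequences of length n = 4.
X0 = np.array([[1.0, 2.0, 3.0, 4.0],
               [2.0, 2.5, 3.5, 5.0]])
print(ago(X0))  # rows become running sums: [1, 3, 6, 10] and [2, 4.5, 8, 13]
```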

The first-order differential equations of the MGM(1,m) are as follows:

$$\begin{cases}
\dfrac{{\rm d}x_1^{(1)}}{{\rm d}t} = a_{11}x_1^{(1)} + a_{12}x_2^{(1)} + \cdots + a_{1m}x_m^{(1)} + b_1 \\
\dfrac{{\rm d}x_2^{(1)}}{{\rm d}t} = a_{21}x_1^{(1)} + a_{22}x_2^{(1)} + \cdots + a_{2m}x_m^{(1)} + b_2 \\
\qquad\vdots \\
\dfrac{{\rm d}x_m^{(1)}}{{\rm d}t} = a_{m1}x_1^{(1)} + a_{m2}x_2^{(1)} + \cdots + a_{mm}x_m^{(1)} + b_m
\end{cases} \tag{1}$$

Note that $A = (a_{ij})_{m\times m}$ and $B = (b_1, b_2, \ldots, b_m)^{\rm T}$; then (1) can be written as

$$\frac{{\rm d}X^{(1)}(t)}{{\rm d}t} = AX^{(1)}(t) + B \tag{2}$$

and the time response vector of (1) is

$$X^{(1)}(t) = \{x_1^{(1)}(t), x_2^{(1)}(t), \ldots, x_m^{(1)}(t)\}^{\rm T} = {\rm e}^{A(t-1)}\bigl(X^{(1)}(1) + A^{-1}B\bigr) - A^{-1}B \tag{3}$$

where $X^{(1)}(1) = \{x_1^{(1)}(1), x_2^{(1)}(1), \ldots, x_m^{(1)}(1)\}^{\rm T}$.

Discretizing (1), we can get

$$x_j^{(0)}(k) = \sum_{l=1}^{m} a_{jl}z_l^{(1)}(k) + b_j,\quad j = 1, 2, \ldots, m;\ k = 2, 3, \ldots, n \tag{4}$$

where $z_l^{(1)}(k) = 0.5\bigl(x_l^{(1)}(k-1) + x_l^{(1)}(k)\bigr)$ is the traditional background value. By the least square method, we can obtain the parameter sequences

$$\hat a_j = (\hat a_{j1}, \hat a_{j2}, \ldots, \hat a_{jm}, \hat b_j)^{\rm T} = (P^{\rm T}P)^{-1}P^{\rm T}Y_j,\quad j = 1, 2, \ldots, m$$

where

$$P = \begin{bmatrix}
z_1^{(1)}(2) & z_2^{(1)}(2) & \cdots & z_m^{(1)}(2) & 1 \\
z_1^{(1)}(3) & z_2^{(1)}(3) & \cdots & z_m^{(1)}(3) & 1 \\
\vdots & \vdots & & \vdots & \vdots \\
z_1^{(1)}(n) & z_2^{(1)}(n) & \cdots & z_m^{(1)}(n) & 1
\end{bmatrix},\qquad
Y_j = \{x_j^{(0)}(2), \ldots, x_j^{(0)}(n)\}^{\rm T},\quad j = 1, 2, \ldots, m.$$

We thus obtain the identified values of the parameter matrix $A$ and the vector $B$: $\hat A = (\hat a_{ij})_{m\times m}$, $\hat B = (\hat b_1, \hat b_2, \ldots, \hat b_m)^{\rm T}$.

The time response vector of the MGM(1,m) is

$$\hat X^{(1)}(k) = \{\hat x_1^{(1)}(k), \hat x_2^{(1)}(k), \ldots, \hat x_m^{(1)}(k)\}^{\rm T} = {\rm e}^{\hat A(k-1)}\bigl(X^{(1)}(1) + \hat A^{-1}\hat B\bigr) - \hat A^{-1}\hat B \tag{5}$$

and the restored vector is

$$\hat X^{(0)}(k) = \{\hat x_1^{(0)}(k), \hat x_2^{(0)}(k), \ldots, \hat x_m^{(0)}(k)\}^{\rm T} = \hat X^{(1)}(k) - \hat X^{(1)}(k-1),\quad k = 2, 3, \ldots, n. \tag{6}$$
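To make the estimation and restoration steps concrete, a minimal Python sketch of (4)-(6) with the traditional background value might look as follows. The function names and structure are our own (not from the paper), and `scipy.linalg.expm` is used for the matrix exponential in (5).

```python
import numpy as np
from scipy.linalg import expm

def fit_mgm(X0: np.ndarray):
    """Least-squares identification of A_hat (m x m) and B_hat (m,) from (4)."""
    m, n = X0.shape
    X1 = np.cumsum(X0, axis=1)                  # 1-AGO sequences
    Z = 0.5 * (X1[:, :-1] + X1[:, 1:])          # traditional background values z_l^(1)(k), k = 2..n
    P = np.hstack([Z.T, np.ones((n - 1, 1))])   # design matrix P of (4)
    A_hat, B_hat = np.empty((m, m)), np.empty(m)
    for j in range(m):
        coeffs, *_ = np.linalg.lstsq(P, X0[j, 1:], rcond=None)  # (P^T P)^{-1} P^T Y_j
        A_hat[j, :], B_hat[j] = coeffs[:m], coeffs[m]
    return A_hat, B_hat

def simulate_mgm(X0: np.ndarray, A_hat: np.ndarray, B_hat: np.ndarray, horizon: int = 0):
    """Restored values x_hat^(0)(k), k = 1..n+horizon, via (5) and (6)."""
    m, n = X0.shape
    x1_start = X0[:, 0]                         # X^(1)(1) equals X^(0)(1)
    AinvB = np.linalg.solve(A_hat, B_hat)
    X1_hat = np.column_stack([
        expm(A_hat * (k - 1)) @ (x1_start + AinvB) - AinvB
        for k in range(1, n + horizon + 1)
    ])
    X0_hat = np.diff(X1_hat, axis=1, prepend=X1_hat[:, [0]])    # first differences as in (6)
    X0_hat[:, 0] = X0[:, 0]                     # k = 1 is kept at the observed value
    return X0_hat
```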

2.2 MGM(1,m) with optimized background value

Both sides of the m whitening equations in (1) are integrated over the interval $[k-1, k]$:

$$\int_{k-1}^{k}\frac{{\rm d}x_j^{(1)}}{{\rm d}t}\,{\rm d}t = \int_{k-1}^{k}\bigl[a_{j1}x_1^{(1)} + a_{j2}x_2^{(1)} + \cdots + a_{jm}x_m^{(1)} + b_j\bigr]\,{\rm d}t,\quad j = 1, 2, \ldots, m.$$

Through simplifying, we gain

$$x_j^{(0)}(k) = \sum_{l=1}^{m} a_{jl}\int_{k-1}^{k} x_l^{(1)}\,{\rm d}t + b_j,\quad j = 1, 2, \ldots, m;\ k = 2, 3, \ldots, n. \tag{7}$$

Contrasting (4) with (7), we can see that the traditional calculation of the background value actually replaces $\int_{k-1}^{k} x_j^{(1)}\,{\rm d}t$ with $z_j^{(1)}(k)$, namely $0.5\bigl(x_j^{(1)}(k-1) + x_j^{(1)}(k)\bigr)$ $(j = 1, 2, \ldots, m)$; the difference between the two formulae is the source of error in the traditional background value formula [10,11].

According to the non-homogeneous exponential vector form of (3) and the quasi-exponential law of the first-order accumulated generating sequence, we may set

$$x_j^{(1)}(t) = b_j{\rm e}^{a_j(t-1)} + c_j,\quad j = 1, 2, \ldots, m$$

where $a_j$, $b_j$ and $c_j$ are constants to be fixed. We denote the optimized background value by

$$\bar z_j^{(1)}(k) = \int_{k-1}^{k} x_j^{(1)}\,{\rm d}t,\quad j = 1, 2, \ldots, m.$$

Through calculating, we can get the formula

$$\bar z_j^{(1)}(k) = \frac{x_j^{(0)}(k)}{\ln x_j^{(0)}(k) - \ln x_j^{(0)}(k-1)} + x_j^{(0)}(1) + \frac{\bigl[x_j^{(0)}(k-1)\bigr]^{k-1}}{\bigl[x_j^{(0)}(k)\bigr]^{k-3}\bigl[x_j^{(0)}(k-1) - x_j^{(0)}(k)\bigr]}. \tag{8}$$

According to the least square method, we can get the parameter vectors for the grey equation $x_j^{(0)}(k) = \sum_{l=1}^{m} a_{jl}\bar z_l^{(1)}(k) + b_j$ $(j = 1, 2, \ldots, m;\ k = 2, 3, \ldots, n)$ of the optimized model:

$$\hat a_j = (\hat a_{j1}, \hat a_{j2}, \ldots, \hat a_{jm}, \hat b_j)^{\rm T} = (\bar P^{\rm T}\bar P)^{-1}\bar P^{\rm T}\bar Y_j,\quad j = 1, 2, \ldots, m \tag{9}$$

where the matrix $\bar P$ and the vector $\bar Y_j$ are, respectively,

$$\bar P = \begin{bmatrix}
\bar z_1^{(1)}(2) & \bar z_2^{(1)}(2) & \cdots & \bar z_m^{(1)}(2) & 1 \\
\bar z_1^{(1)}(3) & \bar z_2^{(1)}(3) & \cdots & \bar z_m^{(1)}(3) & 1 \\
\vdots & \vdots & & \vdots & \vdots \\
\bar z_1^{(1)}(n) & \bar z_2^{(1)}(n) & \cdots & \bar z_m^{(1)}(n) & 1
\end{bmatrix},\qquad
\bar Y_j = \{x_j^{(0)}(2), \ldots, x_j^{(0)}(n)\}^{\rm T},\quad j = 1, 2, \ldots, m.$$

Substituting (8) into (9), we can gain the parameter sequences $\hat a_j = (\hat a_{j1}, \hat a_{j2}, \ldots, \hat a_{jm}, \hat b_j)^{\rm T}$ $(j = 1, 2, \ldots, m)$, and then obtain the simulation and prediction values of the grey multi-variable model by substituting them into (5) and (6).
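A sketch of the optimized background value (8) and the corresponding least-squares step (9), again with illustrative names of our own choosing, is given below; the resulting estimates would then be fed into (5) and (6) exactly as in the sketch for the original model. As in (8), the data are assumed strictly positive with distinct consecutive values.

```python
import numpy as np

def optimized_background(X0: np.ndarray) -> np.ndarray:
    """Optimized background values z_bar_j^(1)(k) of (8), for k = 2..n.

    X0: strictly positive array of shape (m, n) with x_j^(0)(k) != x_j^(0)(k-1).
    Returns an (m, n-1) array; column k-2 holds z_bar_j^(1)(k).
    """
    m, n = X0.shape
    Z_bar = np.empty((m, n - 1))
    for j in range(m):
        for k in range(2, n + 1):
            xk, xk1 = X0[j, k - 1], X0[j, k - 2]      # x_j^(0)(k) and x_j^(0)(k-1)
            Z_bar[j, k - 2] = (xk / (np.log(xk) - np.log(xk1))
                               + X0[j, 0]
                               + xk1 ** (k - 1) / (xk ** (k - 3) * (xk1 - xk)))
    return Z_bar

def fit_mgm_optimized(X0: np.ndarray):
    """Least-squares estimate (9) built on the optimized background values."""
    m, n = X0.shape
    P_bar = np.hstack([optimized_background(X0).T, np.ones((n - 1, 1))])
    A_hat, B_hat = np.empty((m, m)), np.empty(m)
    for j in range(m):
        coeffs, *_ = np.linalg.lstsq(P_bar, X0[j, 1:], rcond=None)
        A_hat[j, :], B_hat[j] = coeffs[:m], coeffs[m]
    return A_hat, B_hat
```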

3. Multiple linear regression model with optimized MGM(1,m)

The general form of the multiple linear regression model is as follows:

$$y_k = \beta_0 + \beta_1 x_{1k} + \beta_2 x_{2k} + \cdots + \beta_m x_{mk},\quad k = 1, 2, \ldots, n$$

where $m$ is the number of explanatory variables, $n$ is the sample size with $n \geqslant m+1$, and $\beta_0, \beta_1, \beta_2, \ldots, \beta_m$ are the regression coefficients to be fixed.

Now we discuss the combined prediction model of the optimized MGM(1,m) and the multiple linear regression. The specific modeling steps are as follows.

Firstly, make use of the MGM(1,m) with optimized background value to simulate and predict the original sequence data of the explanatory variables, namely $\{x_1^{(0)}(k), x_2^{(0)}(k), \ldots, x_m^{(0)}(k)\}$ $(k = 1, 2, \ldots, n)$; then get the simulation values $\{\hat x_1^{(0)}(k), \hat x_2^{(0)}(k), \ldots, \hat x_m^{(0)}(k)\}$ $(k = 1, 2, \ldots, n)$ and the prediction values $\{\hat x_1^{(0)}(k), \hat x_2^{(0)}(k), \ldots, \hat x_m^{(0)}(k)\}$ $(k = n+1, n+2, \ldots)$, and take the simulation values as the base sequence data for the multiple linear regression modeling.

Secondly, according to the simulation values $\{\hat x_1^{(0)}(k), \hat x_2^{(0)}(k), \ldots, \hat x_m^{(0)}(k)\}$ $(k = 1, 2, \ldots, n;\ n \geqslant m+1)$ and the dependent variable $y(k)$ $(k = 1, 2, \ldots, n)$, we can get the estimated values $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_m$ of the regression coefficients, and then gain the combined prediction model of the optimized MGM(1,m) and the multiple linear regression

$$\hat y(k) = \hat\beta_0 + \hat\beta_1\hat x_1^{(0)}(k) + \hat\beta_2\hat x_2^{(0)}(k) + \cdots + \hat\beta_m\hat x_m^{(0)}(k),\quad k = 1, 2, \ldots, n, n+1, \ldots \tag{10}$$

After estimating the regression coefficients, we need to make a statistical test of the model. If the test is passed, we go to the third step.

Thirdly, substituting the simulation values $\{\hat x_1^{(0)}(k), \hat x_2^{(0)}(k), \ldots, \hat x_m^{(0)}(k)\}$ $(k = 1, 2, \ldots, n)$ and the prediction values $\{\hat x_1^{(0)}(k), \hat x_2^{(0)}(k), \ldots, \hat x_m^{(0)}(k)\}$ $(k = n+1, n+2, \ldots)$ into (10), we can get the simulation and prediction values $\hat y(k)$ $(k = 1, 2, \ldots, n, n+1, \ldots)$.
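The three steps can be chained as in the sketch below, reusing the `fit_mgm_optimized` and `simulate_mgm` helpers sketched in Section 2 (our own illustrative names, not code from the paper); the regression step of (10) is an ordinary least-squares fit of y on the grey-simulated explanatory values. A significance test such as the F test used in Section 4 should be applied to the fitted regression before the third step.

```python
import numpy as np

def combined_model(X0: np.ndarray, y: np.ndarray, horizon: int):
    """Combined optimized MGM(1,m) / multiple linear regression model.

    X0: (m, n) original explanatory sequences; y: (n,) dependent variable.
    Returns (beta_hat, y_hat), with y_hat covering k = 1..n+horizon.
    """
    n = X0.shape[1]

    # Step 1: grey simulation (k = 1..n) and prediction (k = n+1..n+horizon)
    # of the explanatory variables with the optimized MGM(1,m).
    A_hat, B_hat = fit_mgm_optimized(X0)
    X0_hat = simulate_mgm(X0, A_hat, B_hat, horizon)   # shape (m, n + horizon)

    # Step 2: regress y on the simulated values for k = 1..n, giving (10).
    D = np.hstack([np.ones((n, 1)), X0_hat[:, :n].T])
    beta_hat, *_ = np.linalg.lstsq(D, y, rcond=None)

    # Step 3: feed the simulated and predicted values into (10).
    D_all = np.hstack([np.ones((n + horizon, 1)), X0_hat.T])
    return beta_hat, D_all @ beta_hat
```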

4. Application example

Take the population and gross domestic product (GDP) of Guangzhou from 1990 to 2000 as the explanatory variables $x_1$ and $x_2$, and the size of motor vehicle (SMV) as the dependent variable $y$. We construct the combined prediction model of the MGM(1,2) with optimized background value and binary linear regression, establish the combined model of the original MGM(1,2) and binary linear regression with the method in [18], and directly build the binary linear regression model at the same time. We then simulate the actual values of the SMV from 1990 to 1997 and predict the values from 1998 to 2000. Finally, we test the accuracy of the models by comparing their relative errors. The data on the population ($x_1$), GDP ($x_2$) and SMV ($y$) of Guangzhou from 1990 to 2000 are shown in Table 1; the data come from [18].

Firstly, we construct the MGM(1,2) with optimized background value and the original MGM(1,2). The simulation and prediction values of the two multi-variable models for the explanatory variables $x_1$ and $x_2$ are shown in Table 2.

Table 1  Population (noted as P), GDP and the SMV of Guangzhou from 1990 to 2000

Year    P/Million    GDP/Million    SMV/Vehicle
1990    594.20       319.60         331 242
1991    602.22       386.67         397 882
1992    612.20       510.70         466 184
1993    623.66       740.84         581 682
1994    637.02       976.18         691 241
1995    646.71       1 243.07       868 590
1996    656.05       1 444.94       907 527
1997    666.49       1 646.26       1 017 577
1998    674.14       1 841.61       1 107 597
1999    684.25       2 056.74       1 233 217
2000    700.69       2 375.91       1 350 390

Table 2  Simulation and prediction values of the two multi-variable models for the explanatory variables x1 and x2

                                    Optimal MGM(1,2) model        Original MGM(1,2) model
Year    x1(0)(k)    x2(0)(k)        x̂1(0)(k)      x̂2(0)(k)        x̂1(0)(k)      x̂2(0)(k)
1990    594.20      319.60          594.20        319.60          594.20        319.60
1991    602.22      386.67          601.57        348.71          601.53        350.29
1992    612.20      510.70          613.12        553.69          613.23        552.81
1993    623.66      740.84          624.51        763.78          624.62        762.19
1994    637.02      976.18          635.73        978.95          635.68        978.41
1995    646.71      1 243.07        646.80        1 199.16        646.36        1 201.5
1996    656.05      1 444.94        657.67        1 424.42        656.67        1 431.41
1997    666.49      1 646.26        668.35        1 654.67        666.56        1 668.13
1998    674.14      1 841.61        678.80        1 889.86        675.98        1 911.61
1999    684.25      2 056.74        689.03        2 129.99        684.96        2 161.84
2000    700.69      2 375.91        699.03        2 374.96        693.41        2 418.71

Note: Simulation values are for 1990-1997 and prediction values for 1998-2000.
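For reference, the Table 1 data can be entered directly as arrays, and the combined-model sketch from Section 3 can be applied to them as shown below (the helper names are from our illustrative sketches, not from the paper; because of rounding and implementation details the resulting figures will only approximate those reported in Tables 2, 4 and 5).

```python
import numpy as np

population = np.array([594.20, 602.22, 612.20, 623.66, 637.02, 646.71,
                       656.05, 666.49, 674.14, 684.25, 700.69])   # x1, 1990-2000
gdp = np.array([319.60, 386.67, 510.70, 740.84, 976.18, 1243.07,
                1444.94, 1646.26, 1841.61, 2056.74, 2375.91])     # x2, 1990-2000
smv = np.array([331242., 397882., 466184., 581682., 691241., 868590.,
                907527., 1017577., 1107597., 1233217., 1350390.]) # y, 1990-2000

# Fit on 1990-1997 and predict 1998-2000, mirroring the split used in the paper.
X0_fit = np.vstack([population[:8], gdp[:8]])
beta_hat, y_hat = combined_model(X0_fit, smv[:8], horizon=3)
print(beta_hat)      # intercept and two slopes, cf. (11)
print(y_hat[-3:])    # 1998-2000 predictions, cf. Table 5
```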

Secondly, we establish a combined prediction model of the MGM(1,2) with optimized background value and the binary linear regression (noted as Model 1) and a combined prediction model of the original MGM(1,2) and the binary linear regression (noted as Model 2). At the same time, we directly construct the binary linear regression model (noted as Model 3) from the data in Table 1. The three models are

$$\hat y(k) = -3\,345\,605 + 6\,086.364\,\hat x_1^{(0)}(k) + 184.9551\,\hat x_2^{(0)}(k),\quad k = 1, 2, \ldots, 11 \tag{11}$$

$$\hat y(k) = -3\,141\,821 + 5\,731.119\,\hat x_1^{(0)}(k) + 208.9992\,\hat x_2^{(0)}(k),\quad k = 1, 2, \ldots, 11 \tag{12}$$

$$\hat y(k) = -2\,005\,414 + 3\,779.077\,\hat x_1^{(0)}(k) + 311.5144\,\hat x_2^{(0)}(k),\quad k = 1, 2, \ldots, 11. \tag{13}$$

At the significance level of 0.05, the coefficients of determination (noted as R²) and the F statistics (noted as F) of Model 1, Model 2 and Model 3 are shown in Table 3.

Table 3  Coefficients of determination and F statistics of the three models

Model    Model 1      Model 2      Model 3
R²       0.989 855    0.989 898    0.994 458
F        243.917 7    244.965 3    448.580 6

Looking up the F distribution table, we get the quantile F0.05(2, 5) = 5.79. The F statistics of the three models in Table 3 are all larger than 5.79, so the three binary linear regression models (11)-(13) are all statistically significant. Finally, according to (11)-(13), we obtain the simulation and prediction values and the relative errors of the three models for y, shown in Table 4 and Table 5. From Tables 4 and 5, we can see that the simulation and prediction errors of Model 1 are smaller than those of Model 2 and Model 3, so we can conclude that the simulation and prediction accuracy has been improved.
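The figures in Table 3 can be cross-checked from R²: for a regression with m = 2 explanatory variables and n = 8 simulation years, F = (R²/m) / ((1 − R²)/(n − m − 1)), and the relative errors in Tables 4 and 5 follow directly from their definition. A short illustrative sketch (our own, using values quoted above):

```python
import numpy as np

def f_statistic(r_squared: float, n: int, m: int) -> float:
    """F statistic of a multiple linear regression computed from its R^2."""
    return (r_squared / m) / ((1.0 - r_squared) / (n - m - 1))

# Model 1 in Table 3: R^2 = 0.989855, n = 8 years, m = 2 regressors.
print(round(f_statistic(0.989855, n=8, m=2), 1))   # about 243.9 > F_0.05(2, 5) = 5.79

def relative_error_percent(y: np.ndarray, y_hat: np.ndarray) -> np.ndarray:
    """Percentage relative errors |y_hat - y| / y * 100, as in Tables 4 and 5."""
    return np.abs(y_hat - y) / y * 100.0

# 1990-1992 actual SMV against the Model 1 simulation values of Table 4.
y = np.array([331242.0, 397882.0, 466184.0])
y_hat = np.array([330024.0, 380265.0, 488474.0])
print(relative_error_percent(y, y_hat))            # about [0.37, 4.43, 4.78] %
```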

Table 4  Simulation values and the relative errors of the three models for y

                              Model 1                        Model 2                        Model 3
Year    y             ŷ            Relative error/%    ŷ            Relative error/%    ŷ            Relative error/%
1990    331 242       330 024      0.367 7             330 406      0.252 4             339 674      2.545 4
1991    397 882       380 265      4.427 8             378 728      4.813 9             376 594      5.350 4
1992    466 184       488 474      4.781 4             487 763      4.629 0             484 096      3.842 3
1993    581 682       596 655      2.574 1             596 950      2.624 7             592 586      1.874 5
1994    691 241       704 741      1.953 0             706 223      2.167 4             702 016      1.558 7
1995    868 590       812 846      6.417 8             815 690      6.090 3             812 449      6.463 5
1996    907 527       920 668      1.448 0             925 067      1.932 7             923 699      1.782 0
1997    1 017 577     1 028 256    1.049 5             1 034 397    1.653 0             1 035 786    1.789 4
Average relative error             2.877 4                          3.020 4                          3.150 8

Table 5  The prediction values and the relative errors of the three models for y

                              Model 1                        Model 2                        Model 3
Year    y             ŷ            Relative error/%    ŷ            Relative error/%    ŷ            Relative error/%
1998    1 107 597     1 135 358    2.506 4             1 143 442    3.236 3             1 148 542    3.696 7
1999    1 233 217     1 242 035    0.715 0             1 252 258    1.544 0             1 262 006    2.334 5
2000    1 350 390     1 348 207    0.161 7             1 360 768    0.768 5             1 376 108    1.904 5

5. Conclusions

We optimize the background value of the original MGM(1,m) and establish the MGM(1,m) based on the optimized background value, and then build the multiple linear regression model on the simulation data of the optimized model in order to eliminate the fluctuations or random errors of the original observational data of all variables and improve the simulation and prediction accuracy. Finally, we construct a combined model of the MGM(1,2) with optimized background value and the binary linear regression and verify that its simulation and prediction accuracy is higher than that of the combined model of the original MGM(1,2) and the binary linear regression. The optimized combined model is also verified to be better than the direct binary linear regression model.

References

[1] J. L. Deng. The basis of grey theory. Wuhan: Press of Huazhong University of Science & Technology, 2002: 1–2.
[2] S. F. Liu, Y. G. Dang, Z. G. Fang. Grey system theory and its application. Beijing: Science Press, 2004: 1–3.
[3] C. L. Chen, D. Y. Dong, Z. H. Chen, et al. Grey systems for intelligent sensors and information processing. Journal of Systems Engineering and Electronics, 2008, 19(4): 659–665.
[4] N. M. Xie, S. F. Liu. Research on evaluations of several grey relational models adapt to grey relational axioms. Journal of Systems Engineering and Electronics, 2009, 20(2): 304–309.
[5] Z. X. Wang, Y. G. Dang, J. Wei, et al. Study on the extending multi-attribute decision model of grey target. Journal of Systems Engineering and Electronics, 2009, 20(5): 985–991.
[6] C. J. Rao, Y. Zhao. Multi-attribute decision making model based on optimal membership and the relative entropy. Journal of Systems Engineering and Electronics, 2009, 20(3): 537–542.
[7] H. Gu, B. F. Song. Study on effectiveness evaluation of weapon systems based on grey relational analysis and TOPSIS. Journal of Systems Engineering and Electronics, 2009, 20(1): 106–111.
[8] Z. X. Wang, Y. G. Dang, B. Liu. Recursive solution and approximating optimization to grey models with high precision. The Journal of Grey System, 2009, 21(2): 185–194.
[9] J. Zhai, J. M. Sheng, Y. J. Feng. The grey model MGM(1,n) and its application. Systems Engineering-Theory & Practice, 1997, 17(5): 110–114. (in Chinese)
[10] X. X. Li, X. J. Tong, M. Y. Chen. MGM_p(1,n) optimization model. Systems Engineering-Theory & Practice, 2003, 23(4): 47–51. (in Chinese)
[11] F. X. Wang. Multivariable non-equidistance GM(1,m) model and its application. Systems Engineering and Electronics, 2007, 29(3): 388–390. (in Chinese)
[12] L. Z. Cui, S. F. Liu, Z. P. Wu. MGM(1,n) based on vector continued fractions theory. Systems Engineering, 2008, 26(10): 47–51. (in Chinese)
[13] G. J. Tan. The structure method and application of background value in grey system GM(1,1) (II). Systems Engineering-Theory & Practice, 2000, 20(5): 125–127. (in Chinese)
[14] D. Luo, S. F. Liu, Y. G. Dang. The optimization of grey model GM(1,1). Engineering Science, 2003, 5(8): 50–53.
[15] Z. X. Wang, Y. G. Dang, S. F. Liu. An optimal GM(1,1) based on the discrete function with exponential law. Systems Engineering-Theory & Practice, 2008, 28(2): 61–67. (in Chinese)
[16] Y. M. Wang, Y. G. Dang, Z. X. Wang. The optimization of background value in non-equidistant GM(1,1). Chinese Journal of Management Science, 2008, 16(4): 159–162.
[17] Y. D. Bao, Y. P. Wu, Y. He. A new forecasting model based on the combination of GM(1,1) and linear regression. Systems Engineering-Theory & Practice, 2004, 24(3): 95–98. (in Chinese)
[18] W. D. Liu, F. X. Wang, Y. R. Liu. Multiple linear regression model based upon multi-variables grey forecast model. Science Technology and Engineering, 2007, 7(24): 6403–6406.


Biographies

Pingping Xiong was born in 1981. She is a Ph.D. candidate in the School of Economics and Management, Nanjing University of Aeronautics and Astronautics, and is currently a teacher at Nanjing University of Information Science and Technology. Her main research interest is grey system theory. E-mail: [email protected]

Yaoguo Dang was born in 1964. He is a professor in the School of Economics and Management, Nanjing University of Aeronautics and Astronautics. His current research interests include grey system theory and regional economic research. E-mail: [email protected]

Xianghua Wu was born in 1980. She is a Ph.D. candidate and a teacher at Nanjing University of Information Science and Technology. Her research interests include grey system theory. E-mail: [email protected]

Xuemei Li was born in 1985. She is a master's student in the School of Economics and Management, Nanjing University of Aeronautics and Astronautics. Her research interests include grey system theory. E-mail: [email protected]
