PONS2train: tool for testing the MLP architecture and local training methods for runoff forecast

Petr Máca, Jirka Pavlásek, Pavel Pech

Department of Water Resources and Environmental Modeling, Faculty of Environmental Sciences, Czech University of Life Sciences Prague

[email protected]

Introduction

C. The Data Transformation.

The Case Study

The purpose of this poster is to introduce PONS2train, a software application developed for runoff prediction via the multilayer perceptron (MLP). The application implements 12 different MLP transfer functions, enables comparison of 9 local training algorithms, and evaluates MLP performance via 17 selected model evaluation metrics.

Two simple methods for data transformation are implemented:

The runoff forecast at a small micro-catchment is based on a dataset with hourly resolution.

The PONS2train software is written in the C++ programming language. Its implementation consists of 4 classes. The NEURAL NET and NEURON classes implement the MLP; the CRITERIA class estimates the model evaluation metrics and evaluates model performance on the testing and validation datasets; the DATA PATTERN class prepares the calibration, testing and validation datasets. The application uses the LAPACK, BLAS and Armadillo C++ linear algebra libraries.
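As a rough illustration of this four-class layout, here is a minimal C++ sketch. Every member and method name is an assumption made for illustration, not the actual PONS2train API; the hidden layer uses tanh and the output neuron is linear, matching the case-study setup described elsewhere on the poster.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// DATA PATTERN class role: holds prepared input/target patterns
struct DataPattern {
    std::vector<std::vector<double>> x;  // inputs, e.g. Q(t-1), P(t-1..3)
    std::vector<double> y;               // target, e.g. Q(t)
};

// NEURON class role: one unit computing its net input
struct Neuron {
    std::vector<double> w;               // weights
    double b = 0.0;                      // bias
    double activate(const std::vector<double>& in) const {
        double a = b;
        for (std::size_t i = 0; i < w.size(); ++i) a += w[i] * in[i];
        return a;                        // net input; transfer applied by caller
    }
};

// NEURAL NET class role: MLP with one hidden layer, linear output
struct NeuralNet {
    std::vector<Neuron> hidden;
    Neuron out;
    double forward(const std::vector<double>& in) const {
        std::vector<double> h;
        for (const Neuron& n : hidden)
            h.push_back(std::tanh(n.activate(in)));  // tanh hidden layer
        return out.activate(h);                      // linear output layer
    }
};
```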

D. The Additional Functionalities.

• MLP with one hidden layer, the output layer activation function – linear

• the non-linear data transformation

• inputs Q(t − 1), P(t − 1), P(t − 2), P(t − 3), the output Q(t)

D_trans = 1 − exp(−γ · D_orig)

I. The persistency index of 7 TA on 11 different AcF.

The MLP weight initialization:

• Nguyen-Widrow's method

• uniform distribution

The over-fitting control:

• early stopping of training based upon the selected model evaluation metrics

• neuron saturation control

The OLS benchmark model. The multi-run training and simulation. The 17 model evaluation metrics.
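The two data transformations can be sketched as follows. The poster gives only the non-linear formula; the min-max form of the linear transformation and the γ value used below are illustrative assumptions.

```cpp
#include <cassert>
#include <cmath>

// Non-linear transformation from the poster: D_trans = 1 - exp(-gamma * D_orig)
double nonlinearTransform(double d, double gamma) {
    return 1.0 - std::exp(-gamma * d);
}

// Linear transformation, assumed here to be ordinary min-max scaling onto [0, 1]
double linearTransform(double d, double dMin, double dMax) {
    return (d - dMin) / (dMax - dMin);
}
```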

PONS2train implements these first order local optimization algorithms: standard on-line and batch back-propagation with a learning rate combined with momentum, their variants with a regularization term, Rprop, and standard batch back-propagation with variable momentum and learning rate. The second order local training algorithms are the Levenberg–Marquardt algorithm, with and without regularization, and four variants of scaled conjugate gradients.

• the linear transformation

(Fig 2 panels – training algorithms: FLET, HEST, PER, POL, LM, BP, BP_regul; calibration results.)

PONS2train consists of 4 classes, written in C++ using the Armadillo template library.

II. The ensemble of validation runoff forecast.

• same settings for training algorithms

3. The Neural Net Class Methods.

• network initialization

• network training

• neuron saturation control

• estimation of neuron output signal

• estimation of the activation and of the derivatives of the error function

• selection of the activation function

Fig 3. The Levenberg–Marquardt TA, validation results: Inverse abs (top left), Gaus AcF (top right), LogSig (bottom right) and Wave AcF (bottom left).

• early stopping

• neuron saturation control

• multi-run training

• multi-run simulation

• network weight analysis

III. The AcF correlation.

• the persistency index of the 150-run ensemble, the scaled conjugate gradient – Perry


• network training, testing and validation

• OLS benchmark model of the given problem


4. The Criteria Class Methods.


• estimation of 17 model evaluation metrics


• saving the model evaluation results


The PONS2train C++ implementation makes it possible to extend the MLP architecture to the hybrid MLP case.


The simple MLP


• Scaled conjugate gradient – Fletcher (FLET)

• Rprop and Rprop−


Fig 4. Green – all 150 results, blue – models filtered with PI > 0.

Fig 1. ANN architectures built with PONS2train.

Currently only local first order and second order gradient methods are implemented.

B. The 9 Local Search Training Algorithms (TA)

The list of online training methods:

• standard back-propagation with learning rate and momentum term

• standard back-propagation with regularization term

The list of batch training methods:

• Back-propagation with learning rate and momentum term (BP)

• Back-propagation with regularization term (BP regul)

• Batch back-propagation with variable self-adapting learning rate

• Levenberg–Marquardt (LM)

• Levenberg–Marquardt with regularization

• Scaled conjugate gradient – Hestenes (HEST)

• Scaled conjugate gradient – Polak (POL)

• Scaled conjugate gradient – Perry (PER)

5. The MLP Architecture Examples.

The hybrid MLP


Tab 1. The built-in activation functions – AcF. The AcF is constant for neurons in one layer.

Function name        Transfer function
Logistic Sigmoid     y(a) = 1 / (1 + exp(−a))
Hyperbolic Tangent   y(a) = tanh(a)
Linear function      y(a) = a
Gaussian function    y(a) = exp(−a²)
Inverse abs          y(a) = a / (1 + |a|)
LogLog               y(a) = exp(−exp(−a))
ClogLog              y(a) = 1 − exp(−exp(a))
ClogLogm             y(a) = 1 − 2 exp(−0.7 exp(a))
RootSig              y(a) = a / (1 + √(1 + a²))
LogSig               y(a) = (1 / (1 + exp(−a)))²
Sech                 y(a) = 2 / (exp(a) + exp(−a))
Wave                 y(a) = (1 − a²) exp(−a²)
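A few of the Tab 1 transfer functions written out directly from the formulas, as a minimal sketch rather than the PONS2train implementation:

```cpp
#include <cassert>
#include <cmath>

// Selected activation functions from Tab 1
double logisticSigmoid(double a) { return 1.0 / (1.0 + std::exp(-a)); }
double gaussian(double a)        { return std::exp(-a * a); }
double inverseAbs(double a)      { return a / (1.0 + std::fabs(a)); }
double wave(double a)            { return (1.0 - a * a) * std::exp(-a * a); }
```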


2. The Neuron Class Methods.

A. The 12 Activation Functions

PONS2train enables building an MLP with different activation functions – see Tab 1.

1. The Data Pattern Class Methods.

• loading the data from data files

• data transformation

• preparing the training, testing and validation data patterns

• saving the results of network training, testing and validation

• different activation function in hidden layer


The PONS2train Features

Fig 2. Each box-plot shows the results with PI > 0 selected from 11 AcF × 150 simulations; left – calibration, right – validation.


The runoff forecast case study focuses on the PONS2train implementation and shows different aspects of MLP training: MLP architecture estimation, neural network weight analysis, and the uncertainty of model training.

The PONS2train Software Implementation


Other important PONS2train features are multi-run training, weight saturation control, early stopping of training, and MLP weight analysis. Weight initialization is done via two different methods: random sampling from a uniform distribution on an open interval, or the Nguyen–Widrow method. The data patterns can be transformed via a linear or non-linear transformation.
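Both initialization methods can be sketched as follows. The Nguyen–Widrow step shown here follows the commonly published recipe (rescale each hidden neuron's uniform weights to the norm β = 0.7·H^(1/I), with H hidden neurons and I inputs); the interval bounds and RNG seed are illustrative choices, and the exact PONS2train variant may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Uniform initialization: sample n weights from the open interval (lo, hi)
std::vector<double> uniformInit(std::size_t n, double lo, double hi,
                                unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(lo, hi);
    std::vector<double> w(n);
    for (double& wi : w) wi = dist(gen);
    return w;
}

// Nguyen-Widrow step: rescale one hidden neuron's weight vector so its
// Euclidean norm equals beta = 0.7 * H^(1/I)
void nguyenWidrowRescale(std::vector<double>& w, std::size_t nHidden) {
    double beta = 0.7 * std::pow(double(nHidden), 1.0 / double(w.size()));
    double norm = 0.0;
    for (double wi : w) norm += wi * wi;
    norm = std::sqrt(norm);
    for (double& wi : w) wi = beta * wi / norm;  // scale to length beta
}
```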

The simple MLP can have multiple outputs.

The Final Remarks

• PONS2train is freely available at http://kvhem.cz or upon request from the authors