ESCUELA TÉCNICA SUPERIOR DE INGENIERÍA (ICAI)
INGENIERÍA INDUSTRIAL

FINANCIAL FORECASTING SYSTEM BASED ON BAYESIAN MODELING

Author: Vicente Díaz García
Director: Carlos Maté Jiménez

Madrid, May 2012

Project carried out by the student: Vicente Díaz García

Signed: ………………….

Date: .…./.…./…….

Submission of this project, whose information is not of a confidential nature, is authorized. THE PROJECT DIRECTOR: Professor Dr. Carlos Maté Jiménez

Signed: ………………….

Date: .…./.…./…….

Approval of the Project Coordinator: Ms. Susana Ortiz Marcos

Signed: ………………….

Date: .…./.…./…….

FINANCIAL FORECASTING SYSTEM BASED ON BAYESIAN MODELING

Author: Díaz García, Vicente. Director: Maté Jiménez, Carlos. Collaborating institution: ICAI – Universidad Pontificia Comillas

PROJECT SUMMARY

The availability of good forecasts in financial markets, such as foreign exchange, commodities or stocks, in both stable and turbulent periods, is always a great challenge for companies and organizations, as well as a problem that admits multiple approaches and solutions.

Starting from this reality, the main objective of this project is the design of a system for predicting the different states of the financial market. Specifically, the project proposes a method for detecting and predicting both stable and turbulent financial states.

This objective is pursued through an exhaustive analysis of the different types of states that exist in financial markets. The research is carried out by studying and analyzing several financial series (stocks, currencies and commodities).

The project has produced a MATLAB program that is able to analyze and, to some extent, predict the state of the market, distinguishing three possible states in a financial series: stable, pseudo turbulent and turbulent. The designed prediction model responds to a real and very important problem in the financial world: the importance of having a good prediction of the different types of financial market states.


The ability to detect the current financial situation and predict what will happen in the immediate future gives the investor a great advantage when choosing where to invest his money. The way to invest differs depending on the state of the market; therefore, an investor will be in a better position if he knows the type of state of a given stock, index, currency, etc.

To carry out the prediction, a Markov regime-switching model with two and with three distinct states is proposed. The method includes automatic prediction, both Bayesian and non-Bayesian. Since the investor is able to run the prediction in the way he considers best, his prior knowledge is included in the prediction, leading to a much better forecast than if this knowledge were not included.

The program is designed to analyze long-term states making use of the monthly range variations of financial securities. The software tool returns useful characteristics of the financial series under study, such as the specifications and predictions for each state, the transition probabilities between states, the steady-state probabilities and the expected durations in the states.

For the non-Bayesian prediction, an automatic technique has been designed to explore and detect the state of the periods. Periods are catalogued as stable, pseudo turbulent or turbulent according to a white noise test applied to the data and the Nelson rules detected in the financial series. White noise periods are associated with stable states, while periods that satisfy the Nelson rules are associated with turbulent periods. In this way, each period is matched with its corresponding state. Each state is then fitted with a first-order autoregressive model and several predictions are made.

Bayesian methodology is set to become a fundamental element in business processes oriented to predicting and anticipating new situations. A Bayesian model goes further than a frequentist (non-Bayesian) model in the analysis and adds to the data the prior knowledge that the researcher has about the problem at hand. The prediction can therefore become much more robust than with the non-Bayesian system.

The Bayesian approach is developed with the WinBUGS software, using a three-state and a two-state Bayesian prediction model. These are based on normal distributions for the periods and on a first-order autoregressive model, which makes it possible to analyze and predict the state of the financial market. The selection of the best model to carry out the prediction depends on the case and on what the expert is looking for. Throughout the research it becomes clear that the Bayesian method meets the requirements to become a key element in the prediction of financial states; Bayesian methodology is needed to achieve a better prediction.

With all of the above, an investment decision can be made taking into account not only the price, but also the type of state of the financial security and the probability with which it is expected to change state. With this information, the trader can gain a great advantage, predicting future changes of state and thus limiting the risk of the investment.

This project also provides a method to delve into the analysis of turbulent periods in other fields of knowledge, such as medicine, meteorology, electricity prices, etc.


FINANCIAL FORECASTING SYSTEM BASED ON BAYESIAN MODELING

ABSTRACT

The availability of good predictions in financial markets, such as foreign exchange, commodities or stocks, both in stable and in turbulent periods, is always a huge challenge for companies and organizations, as well as a problem that admits multiple approaches and solutions.

Against this background, the main objective of this project is the design of a system for predicting the different financial market states. Specifically, the project establishes a method for both detecting and predicting stable and turbulent financial states.

This project contains an in-depth analysis of the different types of states that exist in financial markets. In the research, several financial series (stocks, currencies and commodities) are studied and analyzed.

The project has led to a MATLAB program that is able to analyze and, to some extent, predict the state of the market. The program distinguishes three possible states in a financial series: stable, pseudo turbulent and turbulent. The designed prediction model responds to a real and important problem in the financial world: the importance of having a good prediction of the different types of financial market states.

The capability to detect the current financial state and predict what will happen in the immediate future gives an investor a great advantage when choosing where to invest his money. The way to invest differs depending on the state of the market. Therefore, an investor will be in a better position if he knows the type of period of a specific financial stock, index, commodity, currency, etc.


To carry out the prediction, two- and three-state Markov switching models are proposed. The method includes automatic prediction, both Bayesian and non-Bayesian. The user is able to run the prediction in the way he thinks best, so the investor's prior knowledge is included in the prediction, making the forecast much better than if no prior knowledge were added.

The program is designed to analyze long-term states making use of the monthly range variations of financial securities. The software returns useful characteristics of the financial series, such as the specifications and predictions for each state, the transition probabilities among states, the steady-state probabilities and the expected durations in the states.

For the non-Bayesian prediction, an automatic technique for exploring and detecting the state of each period is designed. Periods are catalogued as stable, pseudo turbulent or turbulent according to a white noise test performed on the data and the Nelson rules detected in the financial series. White noise periods are associated with stable states, whereas periods that satisfy the Nelson rules are associated with turbulent states. In this way, each period is linked with its corresponding state. Each state is then fitted with a first-order autoregressive model and various predictions are performed.

Bayesian methodology is set to become a fundamental element in business processes oriented to predicting and forecasting new situations and quantities. A Bayesian model goes further than a frequentist (non-Bayesian) model in the analysis and adds to the data the prior knowledge that the researcher has about the problem at hand. The prediction can therefore become much more robust than the non-Bayesian one.

To compute the Bayesian approach, the WinBUGS software is used. Three-state and two-state Bayesian predictions are developed to analyze and forecast the state of the financial market. The prediction is based on normal distributions for the periods and a first-order autoregressive model. The selection of the best model to perform the prediction depends on the case and on what the researcher is looking for. It is shown throughout the investigation that the Bayesian method meets the requirements to become a key element in forecasting financial states; Bayesian methodology is needed to achieve a better prediction.

From all the above, an investment decision can be made considering not only the price, but also the type of state of the financial security and how likely it is to change state. Hence, the trader can take great advantage of this information and predict future changes of state, limiting the risk of the investment.

This project also provides a method to approach the analysis of turbulent periods in other fields such as medicine, weather, electricity prices, etc.


List of Figures

Figure 2-1. Flows of Funds through the Financial System
Figure 2-2. Treasury bills features
Figure 2-3. Reaction of stock price to good news information in efficient and inefficient markets
Figure 2-4. Reaction of stock price to bad news information in efficient and inefficient markets
Figure 2-5. The general method for assessing the value of a firm
Figure 2-6. IBM's price chart. Trends and random movements
Figure 2-7. IBEX 35 Spanish index closing quotes between 1992 and 2009
Figure 2-8. Periods of the monthly range variation of Santander bank quotes between January 2003 and March 2012
Figure 3-1. Classical decomposition methods
Figure 3-2. Nelson rule number 1
Figure 3-3. Nelson rule number 2
Figure 3-4. Nelson rule number 3
Figure 3-5. Nelson rule number 4
Figure 3-6. Nelson rule number 5
Figure 3-7. Nelson rule number 6
Figure 3-8. Nelson rule number 7
Figure 3-9. Nelson rule number 8
Figure 3-10. Nelson rules 1, 2 and 5 applied to the monthly range variation of Santander bank quotes between January 2003 and March 2012
Figure 3-11. White noise test for the monthly range variation of Santander bank quotes between January 2003 and March 2012
Figure 3-12. MAPE error and Theil's U statistic for the turbulent periods of the monthly range variation of Santander quotes between January 2003 and March 2012
Figure 4-1. Monthly range variation of Santander bank quotes between January 2003 and March 2012
Figure 5-1. Data analysis
Figure 5-2. Metropolis-Hastings algorithm
Figure 5-3. Target distribution and histogram of the MCMC samples at different iteration points
Figure 5-4. Approximations obtained using the MH algorithm with three Gaussian proposal distributions of different variances
Figure 5-5. Gibbs sampler
Figure 5-6. Generic reversible jump MCMC
Figure 5-7. WinBUGS output for Santander three states prediction
Figure 5-8. Bayesian results in the MATLAB interface for Santander three states prediction
Figure 6-1. Daily closing quotes plot for Santander from January 2003 to March 2012
Figure 6-2. Monthly returns plot for Santander from January 2003 to March 2012
Figure 6-3. Weekly returns plot for Santander from January 2010 to March 2012
Figure 6-4. Daily returns plot for Santander from March 2011 to March 2012
Figure 6-5. Monthly range variations plot for Santander from January 2003 to March 2012
Figure 6-6. Weekly range variations plot for Santander from January 2010 to March 2012
Figure 6-7. Nelson rules 1, 2 and 3 for the monthly range variations plot for Santander from January 2003 to March 2012
Figure 6-8. White noise test with an 18 rolling window size for the monthly range variations plot for Santander from January 2003 to March 2012
Figure 6-9. Three states for the monthly range variations plot for Santander from January 2003 to March 2012
Figure 6-10. Two states for the monthly range variations plot for Santander from January 2003 to March 2012
Figure 6-11. Normality tests report for stable periods
Figure 6-12. Stable periods of the three states monthly range variations plot for Santander from January 2003 to March 2012
Figure 6-13. Pseudo turbulent and turbulent periods of the three states monthly range variations plot for Santander from January 2003 to March 2012
Figure 6-14. AR (1) computation with zero drift for the two states monthly range variations plot for Santander from January 2003 to March 2012
Figure 6-15. WinBUGS history for Santander two states prediction
Figure 6-16. WinBUGS history for Santander two states prediction
Figure 6-17. Bayesian results for Santander two states prediction
Figure 6-18. Bayesian results for Santander three states prediction
Figure 7-1. Nelson rules 1, 2 and 3 applied to the monthly range variation of BBVA quotes between January 2003 and March 2012
Figure 7-2. White noise test for the monthly range variation of BBVA quotes between January 2003 and March 2012
Figure 7-3. Financial states of BBVA between January 2003 and March 2012
Figure 7-4. Nelson rules 1, 2 and 6 applied to the monthly range variation of Euro/US Dollar quotes between January 2000 and December 2010
Figure 7-5. White noise test for the monthly range variation of Euro/US Dollar quotes between January 2000 and December 2010
Figure 7-6. Financial states of Euro/US Dollar between January 2000 and December 2010
Figure 7-7. Nelson rules 1, 2 and 6 applied to the monthly range variation of Brent quotes between May 2003 and December 2010
Figure 7-8. White noise test for the monthly range variation of Brent quotes between May 2003 and December 2010
Figure 7-9. Financial states of Brent Crude oil between May 2003 and December 2010
Figure 7-10. Nelson rules 1, 2 and 6 applied to the monthly range variation of Shell quotes between January 2003 and May 2012
Figure 7-11. White noise test for the monthly range variation of Shell quotes between January 2003 and May 2012
Figure 7-12. Financial states of Shell between January 2003 and May 2012
Figure 7-13. Shell closing quotes between January 2003 and May 2012
Figure 7-14. Nelson rule 2 applied to the monthly range variation of BBVA quotes between January 2003 and March 2012
Figure 7-15. Nelson rule 2 applied to the monthly range variation of Euro/US Dollar quotes between January 2000 and December 2010
Figure 7-16. Nelson rule 2 applied to the monthly range variation of Brent quotes between May 2003 and December 2010
Figure 7-17. Nelson rule 2 applied to the monthly range variation of Shell quotes between January 2003 and May 2012
Figure 7-18. Prediction results of BBVA
Figure 7-19. Pseudo turbulent periods normal distribution tests of BBVA
Figure 7-20. AR (1) prediction with zero drift of BBVA
Figure 7-21. AR (1) prediction with drift of BBVA
Figure 7-22. Prediction results of Euro/US Dollar exchange rate
Figure 7-23. AR (1) prediction with drift of Euro/US Dollar
Figure 7-24. Prediction results of Brent Crude oil
Figure 7-25. AR (1) prediction with drift of Brent
Figure 7-26. Prediction results of Shell
Figure 7-27. AR (1) prediction with drift of Shell
Figure 7-28. Bayesian results for BBVA three states prediction
Figure 7-29. Bayesian results for Euro/US Dollar two states prediction
Figure 7-30. WinBUGS history for Santander two states prediction
Figure 7-31. Bayesian results for Brent two states prediction
Figure 7-32. Bayesian results for Shell two states prediction
Figure A-1. Yahoo finance downloaded csv file
Figure A-2. Selecting the files window
Figure A-3. Main program window
Figure A-4. Different available options in the main window
Figure A-5. "White noise test" and "Report" buttons
Figure A-6. White noise test
Figure A-7. Report window
Figure A-8. Non-Bayesian prediction button
Figure A-9. Non-Bayesian prediction window
Figure A-10. Normality test report and significance level options
Figure A-11. Graph in a new window option
Figure A-12. Drift and the calculate button options
Figure A-13. AR (1) prediction window
Figure A-14. Bayesian prediction button
Figure A-15. Two states Bayesian prediction window
Figure A-16. Three states Bayesian prediction window
Figure A-17. Two states Bayesian prediction specifications
Figure A-18. Three states Bayesian prediction specifications

List of Tables

Table 1.1. Gantt diagram of the project's activities
Table 1.2. Hours expended in each activity of the project
Table 2.1. Principal Capital Market Instruments
Table 2.2. The Buy Side of the Trading Industry
Table 2.3. The Sell Side of the Trading Industry
Table 2.4. Trading Instruments Summary
Table 2.5. World main exchanges sorted by continent
Table 3.1. Schematic of Trend and Seasonal Component of a series
Table 3.2. AR (1) model for the three periods (stable, pseudo turbulent and turbulent) of the monthly range variation of Santander bank quotes between January 2003 and March 2012
Table 4.1. Estimation of the parameters, for monthly range variation of Santander quotes between January 2003 and March 2012, with zero drift
Table 4.2. Estimation of the parameters, for monthly range variation of Santander quotes between January 2003 and March 2012, with nonzero drift
Table 5.1. Distributions in Bayesian Data Analysis
Table 5.2. Conjugate distributions for other likelihood distributions
Table 6.1. Probabilities of transition for the three states monthly range variations plot for Santander from January 2003 to March 2012
Table 6.2. Steady state probabilities and expected durations of the three states monthly range variations plot for Santander from January 2003 to March 2012
Table 6.3. Steady state probabilities and expected durations of the two states monthly range variations plot for Santander from January 2003 to March 2012
Table 6.4. WinBUGS output for Santander two states prediction
Table 6.5. WinBUGS output for Santander three states prediction
Table 7.1. WinBUGS output for BBVA three states prediction
Table 7.2. WinBUGS output for Euro/US Dollar two states prediction
Table 7.3. WinBUGS output for Brent two states prediction
Table 7.4. WinBUGS output for Shell two states prediction
Table 9.1. Project work packages
Table 9.2. Estimated material costs
Table 9.3. Amortization costs
Table 9.4. Summarized budget

Contents

PROJECT SUMMARY

ABSTRACT

LIST OF FIGURES

LIST OF TABLES

CONTENTS

1. INTRODUCTION
   1.1 Project motivation
   1.2 Objectives
   1.3 Methodology

2. FINANCIAL MARKETS
   2.1 Introduction to Financial Markets
   2.2 Classification of Financial Markets
      2.2.1 Debt and Equity Markets
      2.2.2 Primary and Secondary Markets
      2.2.3 Exchanges and Over-the-Counter Markets
      2.2.4 Money and Capital Markets
   2.3 Financial Market Instruments
      2.3.1 Money Market Instruments
      2.3.2 Capital Markets Instruments
   2.4 The Trading Industry
      2.4.1 The traders
      2.4.2 Trade facilitators
      2.4.3 Trading Instruments
      2.4.4 Trading Markets
      2.4.5 Market Regulation
   2.5 Methods and Hypothesis for Analyzing Financial Markets
      2.5.1 Efficient Markets
      2.5.2 Fundamental Analysis
      2.5.3 Technical Analysis
      2.5.4 Quantitative Analysis
   2.6 Extreme events in financial markets

3. STATISTICAL TOOLS FOR THE ANALYSIS AND PREDICTION OF FINANCIAL TIME SERIES
   3.1 Data analysis
      3.1.1 Scales
   3.2 Time series
      3.2.1 Prediction methods based on data
      3.2.2 Components of a time series
      3.2.3 Classical decomposition methods
   3.3 Statistical process control. Nelson rules
   3.4 White Noise
      3.4.1 Random Walk
   3.5 Autoregressive models
      3.5.1 Simple autoregressive models AR (p)
      3.5.2 Simple Moving Average MA (q)
      3.5.3 Autoregressive Moving Average Models ARMA (p, q)
      3.5.4 Autoregressive Integrated Moving Average Models ARIMA (p, d, q)
   3.6 Error measurements

4. MARKOV CHAINS IN THE ANALYSIS OF FINANCIAL TIME SERIES
   4.1 Introduction
   4.2 Markov Chains
      4.2.1 Transition Probabilities
      4.2.2 Probability pj (n)
      4.2.3 Transition probability in n steps pij (n)
      4.2.4 Classification of states
      4.2.5 Classification of chains
   4.3 Markov Switching Models
   4.4 Regime switching in stock market returns
      4.4.1 Are there regimes in stock market returns?
      4.4.2 Different specifications of switching
      4.4.3 Multivariate specifications
      4.4.4 Time variation in transition probabilities
   4.5 Application of the Markov Switching Autoregressive Model in a financial time series

5. BAYESIAN DATA ANALYSIS
   5.1 Introduction
   5.2 Bayesian Analysis for Normal and other distributions
      5.2.1 Univariate Normal distribution
      5.2.2 Other distributions
   5.3 Nonparametric Bayesian
   5.4 Bayesian Numerical Computation
   5.5 MCMC algorithms
      5.5.1 The Metropolis-Hastings algorithm
      5.5.2 The Gibbs sampler
      5.5.3 Reversible jump MCMC
   5.6 Bayesian Software
      5.6.1 WinBUGS
      5.6.2 MatBUGS

6. METHODOLOGY
   6.1 Incorporation of market quotations
   6.2 Non-Bayesian Model
      6.2.1 Determination of stable, pseudo turbulent and turbulent periods
      6.2.2 Prediction methodology
   6.3 Bayesian Model
      6.3.1 Two states prediction
      6.3.2 Three states prediction
      6.3.3 Bayesian results

7. RESULTS
   7.1 Introduction
   7.2 Non-Bayesian Model
      7.2.1 Determination of stable, pseudo turbulent and turbulent periods
      7.2.2 Prediction
   7.3 Bayesian Model

8. CONCLUSIONS AND FURTHER DEVELOPMENTS
   8.1 Non-Bayesian model conclusions
   8.2 Bayesian model conclusions
   8.3 General conclusions
   8.4 Future developments

9. PROJECT BUDGET
   9.1 Engineering costs
   9.2 Investment and Elements Costs
   9.3 Summarized budget

A. USER'S GUIDE
   A.1 Data entry
   A.2 Selection of data
   A.3 Non-Bayesian prediction
   A.4 Bayesian prediction

BIBLIOGRAPHY

Chapter 1

1. Introduction

1.1 Project motivation

The availability of good predictions in financial markets, such as foreign exchange, commodities or stocks, both in stable periods and in crisis (turbulent) periods, is always a huge challenge for companies and organizations, as well as a problem that admits multiple approaches and solutions.

Against this background, the main objective of this project is the design of a system for predicting the different financial market states, to be programmed in MATLAB. This project establishes a method for both detecting and predicting stable and turbulent financial states. The two types of predictions developed are based on a non-Bayesian and a Bayesian approach.

Bayesian methods have matured and improved in several ways during the last fifteen years, and they are increasingly attractive to researchers. Successful applications of Bayesian data analysis have appeared in many different fields, including Actuarial Science, Biometrics, Finance, Market Research, Marketing, Medicine, Engineering and Social Science. The main feature offered by Bayesian data analysis is the possibility of incorporating the researcher's knowledge about the problem to be handled: the more precise the prior knowledge, the better and more reliable the results.

The choice of this project stems from the author's interest in financial markets. Big banks' growing need for automatic trading and prediction programs motivated the author to look for an effective financial forecasting system. With the personal goal of gaining a position in the financial job market, the author has tried to develop a project that can serve as an introduction to it.


1.2 Objectives

The project will create a MATLAB program that aims to automate a financial forecasting process. The prediction will differentiate between stable, pseudo turbulent and turbulent periods. The program is then tested in order to draw conclusions.

The general objectives of the project are the following:

- Design a prediction tool that takes into account the needs of the investor. This includes a prediction system with a Bayesian approach.

- Test the tool for different cases. With the tool running, and after checking several quotes, the effectiveness of the method is studied.

Along with the general objectives, some specific objectives are also pursued in this project:

- Application of the Nelson rules and white noise techniques to the monthly range variations of different traded securities. This helps to differentiate turbulent periods from stable periods (a sketch of this classification follows this list).

- Programming market behavior for two states (stable and turbulent) and three states (stable, pseudo turbulent and turbulent). The model for each state will be different; Markov chains will be used for this distinction between states.

- Design of a financial forecasting system based on Bayesian modeling. This will allow the program to take the investor's prior knowledge into account and use it to build a more powerful predictive tool.
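As an illustration of the first specific objective, the sketch below flags Nelson rules 1 and 2 on a vector of monthly range variations. It is a minimal example under assumptions made here (the function name and the use of the overall sample mean and standard deviation as control limits are this example's choices), not the project's actual implementation.

```matlab
% Minimal sketch: flag Nelson rules 1 and 2 on a series of monthly range
% variations x. Hypothetical helper, not the project's implementation.
function flags = nelson_rules_sketch(x)
    x     = x(:);                    % work with a column vector
    mu    = mean(x);
    sigma = std(x);
    n     = numel(x);
    flags = false(n, 2);             % column 1: rule 1, column 2: rule 2

    % Rule 1: a point more than 3 standard deviations from the mean.
    flags(:, 1) = abs(x - mu) > 3 * sigma;

    % Rule 2: nine or more consecutive points on the same side of the mean.
    side   = sign(x - mu);
    runLen = 1;
    for k = 2:n
        if side(k) ~= 0 && side(k) == side(k-1)
            runLen = runLen + 1;
        else
            runLen = 1;
        end
        if runLen >= 9
            flags(k-8:k, 2) = true;  % mark the whole run of nine points
        end
    end
end
```

Months flagged by either rule point towards turbulence, while unflagged stretches that also pass a white noise test point towards stability.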

1.3 Methodology

The prediction problem will be addressed with the help of MATLAB. The steps to take are:

- Incorporation of market quotations. The first step is to load into the program the databases with the prices of the securities of interest.

- Programming the determination of stable, pseudo turbulent and turbulent periods for the non-Bayesian approach. The conditions for classifying periods as stable, pseudo turbulent or turbulent have to be programmed. The whole methodology is developed on monthly range variations. The Nelson rules and the white noise rolling-window technique will be used to determine the state of each period.

- Programming the non-Bayesian prediction methodology. The methodology differs depending on whether the period is stable, pseudo turbulent or turbulent. Each state will be modeled with a normal distribution whose mean and variance have to be estimated. The transition probabilities among states, the steady-state probabilities and the expected durations in each state also have to be calculated; Markov chains will be used to accomplish this (a sketch of these computations closes this chapter).

- Bayesian approach methodology. This step consists of the development of the Bayesian prediction, which takes the needs and knowledge of the investor into account; the aim is a more precise tool. As in the non-Bayesian approach, each state will be modeled with a normal distribution whose mean and variance have to be estimated, and the transition probabilities among states, the steady-state probabilities and the expected durations in each state will also be calculated.


- Results. After completing the prediction tool, its effectiveness will be tested on different market securities. The results will help to improve the program.

- Report. Finally, all the important steps taken during the project and the conclusions derived from it will be written down.

The approximate dates of the tasks are shown in Table 1.1.

Table 1.1. Gantt diagram of the project's activities

The number of hours dedicated to each activity is shown in Table 1.2.

Table 1.2. Hours expended in each activity of the project

Concept                                        Hours
Previous research and documentation            90 hours
Development of the non-Bayesian methodology    150 hours
Development of the Bayesian methodology        120 hours
Analysis of the results                        50 hours
Write the report including the User's Guide    110 hours
Total                                          520 hours
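The Markov-chain quantities named in the methodology can be sketched in a few lines of MATLAB. The transition matrix below is purely illustrative, not a result from the project: the steady-state vector solves pi*P = pi with the probabilities summing to one, and the expected duration in state i of a first-order Markov chain is 1/(1 - p_ii).

```matlab
% Illustrative 3x3 transition matrix (rows and columns ordered as
% stable, pseudo turbulent, turbulent). The numbers are made up.
P = [0.90 0.08 0.02;
     0.20 0.70 0.10;
     0.10 0.30 0.60];

% Steady-state probabilities: solve pi*P = pi subject to sum(pi) = 1.
n = size(P, 1);
A = [P' - eye(n); ones(1, n)];
b = [zeros(n, 1); 1];
piSteady = A \ b;                       % stationary distribution

% Expected duration in each state: 1 / (1 - p_ii), in months here.
expectedDuration = 1 ./ (1 - diag(P));
```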

Chapter 2

2. Financial markets

2.1 Introduction to Financial Markets

A financial market is a market in which people and entities can trade financial securities, commodities, and other fungible items of value at low transaction costs and at prices that reflect supply and demand. Securities include stocks and bonds; commodities include precious metals and agricultural goods.

Financial markets and financial intermediaries (banks, insurance companies, and pension funds) have the basic function of getting lenders and borrowers together. Well-functioning financial markets and financial intermediaries are needed to improve economic efficiency and are crucial to economic health. This chapter presents an overview of the fascinating study of financial markets and institutions.

Financial markets perform the essential economic function of channeling funds from people who have saved surplus funds by spending less than their income to people who have a shortage of funds because they wish to spend more than their income. The principal lender-savers are households, but business enterprises and the government sometimes find themselves with excess funds and lend them out. The most important borrower-spenders are businesses and the government, but households also borrow to finance their purchases of cars, furniture and houses.

In direct finance, borrowers borrow funds directly from lenders in financial markets by selling them securities (also called financial instruments). Securities are claims on the borrower's future income or assets: they are assets for the person who buys them and liabilities (debts) for the individual or firm that sells (issues) them.


When banks or other institutions collect demand or time deposits, or borrow through the issue of bonds, and use those funds to provide loans or to purchase securities such as stocks and bonds, there is indirect finance [PECO05]. More information can be found in [BREA95].

The channeling of funds from lender-savers to borrower-spenders is shown schematically in Figure 2-1.

Figure 2-1. Flows of Funds through the Financial System


2.2 Classification of Financial Markets

The following description classifies financial markets and illustrates their essential features.

2.2.1 Debt and Equity Markets

There are two basic methods of raising funds in a financial market: issuing debt or issuing equities. The most common method is to issue a debt instrument, such as a bond or a mortgage, which is a contractual agreement by the borrower to pay the holder of the instrument fixed dollar amounts at regular intervals (interest and principal payments) until a specified date (the maturity date), when the final payment is made. The maturity of a debt instrument is the time (term) to that instrument's expiration date. A debt instrument is short-term if its maturity is less than a year and long-term if its maturity is ten years or longer. Debt instruments with a maturity between one and ten years are said to be intermediate-term.

The second method of raising funds is to issue equities, such as common stock, which are claims to a share in the net income (income after expenses and taxes) and the assets of a business. If you own one share of common stock in a company that has issued one million shares, you are entitled to one one-millionth of the firm's net income and one one-millionth of the firm's assets. Equities usually make periodic payments (dividends) to their holders and are considered long-term securities because they have no maturity date. In addition, owning stock means that you own a portion of the firm and thus have the right to vote on issues important to the firm and to elect its directors.

The main disadvantage of owning a corporation's equities rather than its debt is that an equity holder is a residual claimant; i.e., the corporation must pay all its debt holders before it pays its equity holders. The advantage of holding equities is that equity holders benefit directly from any increase in the corporation's profitability and asset value, because equities confer ownership rights on their holders. Debt holders do not share this benefit, because their payments are fixed.


2.2.2 Primary and Secondary Markets

A primary market is a financial market in which new issues of a security, such as a bond or a stock, are sold to initial buyers by the corporation or government agency borrowing the funds. A secondary market is a financial market in which securities that have been previously issued (and are thus second hand) can be resold.

The primary markets for securities are not well known to the public, because the selling of securities to initial buyers often takes place behind closed doors. An important financial institution that assists in the initial sale of securities in the primary market is the investment bank. It does this by underwriting securities: it guarantees a price for a corporation's securities and then sells them to the public.

When an individual buys a security in the secondary market, the person who has sold the security receives money in exchange for it, but the corporation that issued the security acquires no funds. A corporation acquires new funds only when its securities are first sold in the primary market. Nonetheless, secondary markets serve two important functions. First, they make it easier to sell these financial instruments to raise cash; that is, they make the instruments more liquid. The increased liquidity then makes them more desirable and thus easier for the issuing firm to sell in the primary market. Second, they determine the price of the security that the issuing firm sells in the primary market: the investors that buy securities in the primary market will pay the issuing corporation no more than the price they think the secondary market will set for the security. The conditions in the secondary market are therefore the most relevant to corporations issuing securities.

2.2.3 Exchanges and Over-the-Counter Markets

Secondary markets can be organized in two ways. One is to organize exchanges, where buyers and sellers of securities (or their brokers) meet in one central location to conduct trades. The other method is to have an over-the-counter (OTC) market, in which dealers at different locations stand ready to buy and sell securities to anyone who comes to them. Because over-the-counter dealers are in computer contact and know the prices set by one another, the OTC market is very competitive and not very different from a market with an organized exchange.


2.2.4 Money and Capital Markets

Another way of distinguishing between markets is on the basis of the maturity of the securities traded in each market. The money market is a financial market in which only short-term debt instruments are traded. The capital market is the market in which longer-term debt and equity instruments are traded. Money market securities are usually more widely traded than longer-term securities and so tend to be more liquid. In addition, short-term securities have smaller fluctuations in prices than long-term securities, making them safer investments; this will be shown in Chapter 7 with the case study of the Euro/US Dollar exchange rate. As a result, corporations and banks actively use the money market to earn interest on surplus funds that they expect to have only temporarily. Capital market securities, such as stocks and long-term bonds, are often held by financial intermediaries such as insurance companies and pension funds. More information can be found in [MISH02].

2.3 Financial Market Instruments

To complete the understanding of how financial markets perform the important role of channeling funds from lender-savers to borrower-spenders, one needs to examine the securities (instruments) traded in financial markets. This project will try to predict the fluctuations of these instruments and the financial states (stable, pseudo turbulent and turbulent) they pass through.

2.3.1 Money Market Instruments

Because of their short terms to maturity, the debt instruments traded in the money market undergo the smallest price fluctuations and so are the least risky investments.


Treasury Bills

These short-term government instruments are issued in 3-, 6- and 12-month maturities to finance the federal government. They pay a set amount at maturity and make no interest payments, but they effectively pay interest by being sold at a discount, that is, at a price lower than the amount paid at maturity (a worked example follows the list of features below).

Figure 2-2. Treasury bills features

The main features of these instruments are:

- They are the most liquid of all money market instruments, because they are the most actively traded.

- They are also the safest of all money market instruments, because there is almost no possibility of default. The government is always able to meet its debt obligations because it can raise taxes or issue currency (paper money or coins).
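The discount mechanism can be made concrete with a small computation. The figures below are invented for the illustration, not taken from the project; the two conventions shown (bank-discount and investment yield) are the standard ways of quoting a bill's return.

```matlab
% Hypothetical 6-month T-bill bought at a discount. All numbers are
% illustrative assumptions, not data from this project.
face  = 10000;   % set amount paid at maturity
price = 9850;    % discounted purchase price today
days  = 182;     % days to maturity

% Bank-discount yield: quoted against face value on a 360-day year.
discountYield = (face - price) / face * 360 / days;     % about 2.97%

% Investment yield: return on the money actually invested, 365-day year.
investmentYield = (face - price) / price * 365 / days;  % about 3.05%
```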

Negotiable Bank Certificates of Deposit

A certificate of deposit (CD) is a short- or medium-term debt instrument sold by a bank to depositors that pays annual interest and, at maturity, pays back the original purchase price. CDs are an extremely important source of funds for commercial banks. Negotiable CDs have highly liquid secondary markets.

Commercial Paper

Commercial paper is a short-term debt instrument issued by large banks and well-known corporations, such as General Motors and AT&T. Before 1960, corporations usually borrowed their short-term funds from banks, but since then they have come to rely more heavily on selling commercial paper to other financial intermediaries and corporations for their immediate borrowing needs.


In other words, they engage in direct finance. Maturities on commercial paper rarely extend beyond 270 days.

Bankers Acceptances

A banker's acceptance is a bank draft (a promise of payment similar to a check) issued by a firm, payable at some future date, and guaranteed for a fee by the bank that stamps it "accepted". The firm issuing the instrument is required to deposit the required funds into its account to cover the draft. If the firm fails to do so, the bank's guarantee means that it is obliged to make good on the draft. These money market instruments are created in the course of carrying out international trade and have been in use for hundreds of years. The advantage to the firm is that the draft is more likely to be accepted when purchasing goods abroad, because the foreign exporter knows that even if the company purchasing the goods goes bankrupt, the bank draft will still be paid off.

2.3.2 Capital Markets Instruments

Capital market instruments are debt and equity instruments with maturities greater than one year. They have wider price fluctuations than money market instruments and are considered fairly risky investments. The principal capital market instruments are listed in Table 2.1.

Table 2.1. Principal Capital Market Instruments


Stocks

Stocks are equity claims on the net income and assets of a corporation. Their value exceeds that of any other type of security in the capital market. The amount of new stock issued in any given year is typically quite small, less than 1% of the total value of shares outstanding. Individuals hold around half of the value of stocks; pension funds, mutual funds, and insurance companies hold the rest.

Corporate Bonds

These are long-term bonds issued by corporations with very strong credit ratings. The typical corporate bond sends the holder an interest payment twice a year and pays off the face value when the bond matures. Some corporate bonds, called convertible bonds, have the additional feature of allowing the holder to convert them into a specified number of shares of stock at any time up to the maturity date. Because the outstanding amount of both convertible and non-convertible corporate bonds for any given corporation is small, they are not nearly as liquid as other securities such as government bonds.

Although the corporate bond market is substantially smaller than the stock market, the volume of new corporate bonds issued each year is substantially greater than the volume of new stock issues. Thus the behavior of the corporate bond market is probably far more important to a firm's financing decisions than the behavior of the stock market. The main buyers of corporate bonds are insurance companies, pension funds and households. Bonds are also called fixed-income securities because the cash flow from them is fixed.

Government Securities

Treasury notes (T-notes) have a term of more than one year, but not more than 10 years. Treasury bonds (T-bonds) have terms of more than 10 years. They bear a stated interest rate, and the owner receives semi-annual interest payments. Treasury securities can be bought at original issue or on the secondary market. At original issue, the Treasury Department sells new securities to the public. On the secondary market, traders buy and sell previously issued securities.


Additionally, Government Agencies and State and Local Governments also issue bonds to raise cash. Bills, notes and bonds are all fixed-income securities classified by maturity; all debt instruments are collectively known as fixed-income securities because the cash flow from them is fixed.

Consumer and Bank Commercial Loans

These are loans to consumers and businesses made mainly by banks, but in the case of consumer loans, also by finance companies. There are often no secondary markets in these loans, which makes them the least liquid capital market instruments. More information about financial market instruments can be found in [MADU03].

2.4 The Trading Industry

This section provides a brief overview of the trading industry. First, the traders themselves are considered. Then trading instruments and the markets where they trade are characterized. Finally, it is examined how regulators oversee trading.

2.4.1 The traders

Traders are the people who trade. Traders have long positions when they own something (securities). Traders with long positions profit when prices rise; they try to buy low and sell high. Traders have short positions when they have sold something that they do not own. Traders with short positions hope that prices will fall so they can repurchase at a lower price; when they repurchase, they cover their positions. Short sellers profit when they sell high and buy low (see the sketch just below).

The trading industry has a buy side and a sell side. The buy side consists of traders who buy exchange services. Liquidity, the ability to trade when you want to trade, is the most important of these services. Traders on the sell side sell liquidity to the buy side.
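Before turning to the buy and sell sides in detail, the long/short profit arithmetic can be sketched in a few lines; the prices and quantity below are invented for the illustration.

```matlab
% Sketch of the long/short profit arithmetic described above.
% Prices and quantity are illustrative assumptions.
qty = 100;                                        % shares traded

% Long position: buy low, then sell high.
buyPrice  = 20.0;
sellPrice = 23.5;
longPnL = (sellPrice - buyPrice) * qty;           % +350: gains when prices rise

% Short position: sell high, then repurchase (cover) low.
shortSalePrice = 23.5;
coverPrice     = 21.0;
shortPnL = (shortSalePrice - coverPrice) * qty;   % +250: gains when prices fall
```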


The buy side

The buy side of the trading industry includes individuals, funds, firms, and governments that use the markets to help solve various problems they face. These problems typically originate outside of trading markets. For example, investors use securities markets to solve inter-temporal cash flow problems: they have income today that they would like to have available in the future, so they use the markets to buy stocks and bonds that move their income from the present to the future. Many buy-side institutions are pension funds, mutual funds, trusts, and foundations that invest money. Table 2.2 provides a summary of the buy side of the trading industry.

Table 2.2. The Buy Side of the Trading Industry

The sell side

The sell side of the trading industry includes dealers and brokers who provide exchange services to the buy side. Both types of traders help buy-side traders trade when they want to trade. Dealers are individuals or entities, such as a securities firm, that act as a principal and stand ready to buy and sell for their own account. They accommodate trades that their clients want to make by trading with them when their clients want to trade. Dealers hold an inventory of securities and profit when they buy low and sell high. In contrast, brokers trade on behalf of their clients. Brokers arrange trades that their clients want to make by finding other traders who will trade with their clients. Brokers profit when their clients pay them commissions for arranging trades with other traders.

Many sell-side firms employ traders who both deal and broker trades; these firms are therefore known as broker-dealers. Sell-side firms exist only because the buy side will pay for their services. Therefore, one must understand why the buy side trades before one can understand when the sell side is profitable. Table 2.3 provides a summary of the sell side of the trading industry.

Table 2.3. The Sell Side of the Trading Industry

2.4.2 Trade facilitators

Many institutions help traders trade. Exchanges and clearing firms are introduced here.

Exchanges

Exchanges provide forums where traders meet to arrange trades. They are the markets in which securities are traded. Exchange traders may include dealers, brokers and buy-side traders. At most exchanges, only members can trade; non-members can trade by asking member-brokers to trade for them. Historically, traders met on exchange floors. Exchanges once were owned and controlled by their members; nowadays, many exchanges have converted, or are in the process of converting, to corporate ownership, and at many exchanges traders meet only via electronic communications networks.


Clearing Firms

A clearing firm is an organization associated with an exchange that handles the confirmation, settlement and delivery of transactions, with the main obligation of ensuring that transactions are completed promptly and efficiently. Clearing firms are also referred to as clearing houses.

2.4.3 Trading Instruments

Trading instruments vary by type. They include real assets, financial assets, derivative contracts, insurance contracts and gambling contracts. This section describes the various classes of trading instruments and special aspects of the markets in which they trade. A summary of the various types of instruments appears in Table 2.4.

Table 2.4. Trading Instruments Summary

Real assets

Real assets include physical commodities, real estate, machines, patents, and other intellectual properties. The real assets that trade in the most liquid markets are industrial and precious metals, agricultural commodities, fuels, and pollution credits. The case of Brent Crude oil will be studied in Chapter 6. These instruments are quite fungible: one unit is very similar, if not identical, to all other units.


Financial Assets

Financial assets are instruments that represent ownership of real assets and the cash flows they produce. Stocks and bonds are financial assets because they represent ownership of the assets of a corporation. Some common financial assets are defined next.

Stocks signify ownership in a corporation and represent a claim on part of the corporation's assets and earnings. Santander bank, BBVA bank and the oil company Royal Dutch Shell are stocks that will be studied throughout this project. Bonds are debt securities issued by corporations, governments, and occasionally individuals; debtors create bonds when they borrow money. Unit trusts or mutual funds are securities that give investors access to a well-diversified portfolio of equities, bonds and other securities. Currency is money circulated within an economy, including coins and paper notes. The Euro/US Dollar exchange rate is studied in Chapter 6.

Derivative Contracts

Derivative contracts are instruments that derive their values from the values of the underlying instruments upon which they are based. They include forward contracts, futures contracts, options and swaps. Sellers create derivative contracts when they first sell them. Therefore they are in zero net supply; that is, the sum of all long positions minus the sum of all short positions is zero. Almost all derivative contracts have an expiration date, on which traders make final settlement and the contract expires. Derivative contracts may be physically settled or cash settled. The former requires that the seller deliver the underlying instrument to the buyer when obliged to do so (at the expiration date); the latter requires that the seller deliver the cash value of the underlying instrument instead. Many derivative contracts require that buyers and sellers make variation margin payments on a regular (daily) basis. Variation margin payments transfer money from buyers to sellers or from sellers to buyers to adjust the prices of their contracts to reflect current market conditions. This procedure reduces the chance that traders will default when their contracts expire.

Forward contracts are contracts for the future sale of some commodity, which may be physical or financial. Most forward contracts are not standardized and are not traded on exchanges. A farmer would use a forward contract to "lock in" a price for his grain ahead of the fall harvest.


Standardized futures contracts are forward contracts that an exchange clearing house guarantees. Futures traders therefore do not need to care whether their counterparts are creditworthy. An option contract gives its holder the right, but not the obligation (unlike a futures contract), to buy or sell an underlying instrument at a fixed price. A call option is an option to buy at a fixed strike price; a put option is an option to sell at a fixed strike price (their payoffs at expiration are sketched below). Swaps are contracts for the exchange of two future cash flows, where a cash flow is a series of payments. For example, a currency swap provides for the exchange of a future series of fixed payments in one currency for a future series of payments in another currency.

Insurance Contracts and Gambling Contracts

Both insurance contracts and gambling contracts are instruments that derive their values from the outcomes of future events. The distinction between the two depends on the reasons why people buy them: people who are concerned about a loss contingent on future events buy insurance contracts, whereas people who simply want to bet on those events buy gambling contracts.

Hybrid Contracts

Some trading instruments defy easy classification because they embody elements of more than one type of instrument.
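To make the option definitions concrete, the following minimal MATLAB sketch plots the expiration payoffs of a call and a put; the strike and the price grid are hypothetical numbers chosen only for illustration:

% Toy payoff illustration: at expiration a call pays max(S - K, 0)
% and a put pays max(K - S, 0), for strike K and underlying price S.
K = 50;                        % hypothetical strike price
S = 0:100;                     % possible underlying prices at expiration
callPayoff = max(S - K, 0);    % holder exercises only when S > K
putPayoff  = max(K - S, 0);    % holder exercises only when S < K
plot(S, callPayoff, S, putPayoff);
legend('call', 'put'); xlabel('price at expiration'); ylabel('payoff');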

2.4.4 Trading Markets

When trading the most active stocks, public traders often trade with other public traders. When trading the least active stocks, public traders often trade with dealers. Most trades are small retail trades; large institutional traders account for most of the share volume. Although listed options contracts do not trade for most stocks, the number of different options far exceeds the number of stocks: for each option-eligible stock, options exchanges may list many put and call options for various expiration months and strike prices. Very few options trade frequently, however. The large number of corporate and municipal bond issues ensures that most issues hardly ever trade. Since many fixed-income portfolios hold their bonds until maturity, some bond issues never trade again after they are first


issued. The buy side trades almost exclusively with dealers because public buyers and sellers rarely want to trade the same bond issue at the same time. Government bond issues are far less numerous than corporate and municipal bond issues; they are also far larger. The tremendous size of these issues and the widespread interest in these securities make these markets extremely liquid. Although the public often trades government bonds with dealers, buy-side traders often trade with other buy-side traders in new electronic trading systems.

Some of the world's most liquid instruments trade in the futures market. Contracts on major agricultural, industrial, and financial commodities are extremely useful to hedgers throughout the economy. These contracts also attract the interest of many speculators. The most important world currencies trade in extremely liquid markets; volumes are high because international trade and cross-border capital transactions generally require currency conversions. Real estate trades in brokered markets because every parcel is unique. The difficulties that buyers and sellers have finding each other make the real estate market the least liquid of the markets discussed here. Clearing and settlement in real estate markets is also quite expensive because the trades usually are large, complex, and among traders who do not have standing credit relationships.

Stock Markets

Corporations apply to exchanges in order to list their stocks. Exchanges generally list all companies that meet their listing standards and pay their listing fees. All but the smallest publicly traded stocks are listed for trading at one or more markets. The listing standards of an exchange generally require that its listed companies meet specified minimum standards for capital value, number of shareholders, and financial strength. Most exchanges also require listed companies to report their accounts regularly according to generally accepted accounting practices (GAAP). Usually, securities admitted to the list may be suspended from dealings or removed from the list whenever a company falls below certain quantitative and qualitative continued-listing criteria. When this happens, the exchange reviews the appropriateness of continued listing.


Table 2.5. World main exchanges sorted by continent

Equity Options Markets

Most organized trading in standardized stock option contracts takes place at the same exchange at which the underlying stocks trade. Investment banks also trade specialized option contracts OTC with their clients. These contracts usually have strike prices, maturity dates, settlement terms, or other features that differ from the standardized options available at the exchanges.

Futures Markets

Most futures exchanges have their own clearinghouses; therefore they do not compete to trade the same contracts. Instead, each exchange and its associated clearinghouse try to create contracts that will attract traders, and most exchanges have large research and marketing departments that design such contracts. Futures exchanges generally trade several contracts that vary by expiration date for each commodity that they trade. Most commodities have at least four delivery months.

In financial and industrial commodities, traders mostly trade only the front-month contract (the one that will expire next). When it expires, they roll their positions into the next contract.

Corporate and Municipal Bond Markets

Throughout the world, most corporate and municipal bonds trade over-the-counter in investment banks or commercial banks. Some stock exchanges list corporate bonds, but exchange bond trading volumes are usually trivial compared to OTC volumes.

Treasury Markets

Most national treasuries conduct public auctions at which they issue their bills, notes, and bonds. Some smaller nations, however, use underwriters to issue their securities. Generally, everyone may participate in Treasury auctions. Secondary trading of Treasury securities occurs primarily over-the-counter in investment and commercial banks.

Swaps and Spot Currency Markets

Swaps and spot currencies mostly trade OTC in investment and commercial banks. Some brokers and some data providers organize markets in these instruments. For more information about trading and exchange markets see [HARR02].

2.4.5 Market Regulation

Regulators create and enforce rules that facilitate trading. Good regulations help ensure that traders communicate effectively with each other, that people do not defraud others, and that all things generally are as they appear. Governments usually require that regulators regulate in the public interest, but the definition of what is in the public interest may be vague. Since regulators set the rules of the game, and since rules help determine who will be successful, people devote tremendous effort to lobbying for rules that favor them.


2.5 Methods and Hypotheses for Analyzing Financial Markets

Academics and practitioners have been discussing how financial markets work for more than one hundred years without reaching a consensus. Basically, there are four completely different lines of thought:

- Efficient Market Hypothesis. Many academics believe that markets are efficient and therefore follow random walks; i.e., the prices of trading instruments change unpredictably, without exhibiting trends or patterns. Consequently, it is futile to use technical or fundamental analysis.

- Fundamental Analysis. Fundamental analysts believe that markets are not efficient, and that trading instruments are sometimes undervalued and other times overvalued. Therefore, these traders try to benefit from the discrepancy between price and value.

- Technical Analysis. Technical analysts believe that the prices of trading instruments exhibit patterns and trends, and therefore they develop trading rules that try to benefit from them. They also believe that the price of a trading instrument is not related to its "underlying fundamental value".

- Quantitative Analysis. Quantitative analysis is a set of financial analysis techniques that seek to understand behavior by using complex mathematical and statistical modeling, measurement and research. This is the line that this project follows, supported by statistical theory.

In this section, all of them will be briefly described.

2.5.1 Efficient Markets

Efficient market theory is a field of economics that seeks to explain the behavior of capital markets. The theory makes several assumptions, including:

- Perfect information. Perfect information is a term used in economics and game theory to describe a state of complete knowledge about the actions of other players that is instantaneously updated as new information arises. Perfect information would practically mean that all consumers know all things, about all products, at all times, and therefore always make the best purchasing decision.

- Instantaneous receipt of news.

- A marketplace with many small participants (rather than one or more large ones with the power to influence prices).

- News arises randomly in the future (otherwise the non-randomness would be analyzed and forecasted).

- No transaction costs.

The efficient market theory asserts that all financial prices accurately reflect all public information at all times. In other words, financial assets are always priced correctly, given what is publicly known, at all times. Prices may appear to be too high or too low at times, but, according to the efficient market theory, this appearance must be an illusion. Stock prices, by this theory, approximately describe "random walks" through time: the price changes are unpredictable, since they occur only in response to genuinely new information, which, by the very fact that it is new, is unpredictable. Therefore, technical analysis and statistical forecasting will most likely be fruitless. In an efficient market, no group of investors should be able to consistently beat the market using a common investment strategy, and the expected returns from any investment will be consistent with the risk of that investment over the long term. The following two figures illustrate how share prices respond in real markets to new information.

Figure 2-3. Reaction of stock price to good news information in efficient and inefficient markets


Figure 2-4. Reaction of stock price to bad news information in efficient and inefficient markets

More information about the market efficiency theory can be found in [FAMA65] and in [DIMS98].

2.5.2 Fundamental Analysis

Fundamental analysis is the examination of the underlying forces that affect the well-being of the economy, industry groups, and companies. It is based on the hypothesis that every trading instrument has an underlying or "intrinsic" value. The method used by fundamentalists consists of: (i) analyzing the underlying forces that drive values; (ii) assessing them; (iii) calculating values; and (iv) comparing them to current prices.

Fundamental analysts believe that prices will inevitably converge to values. The underlying forces that drive values depend on the trading instrument. For the national economy, fundamental analysis might focus on economic data to assess the present and future growth of the economy. At the industry level, there might be an examination of supply and demand forces for the products offered. At the company level, fundamental analysis may involve examination of financial data, management, business concept and competition.


Therefore, fundamental analysts specialize in a small set of trading instruments, for example treasury bonds, financial stocks, emerging stock markets, or currencies. To forecast future stock prices, fundamental analysis combines economic, industry, and company analysis to derive a stock's current fair value and to forecast its future value. If the fair or "intrinsic" value is not equal to the current stock price, fundamental analysts believe that the stock is either overvalued or undervalued and that the market price will ultimately gravitate towards fair value. Because they believe that prices do not accurately reflect all available information, fundamental analysts look to capitalize on perceived price discrepancies.

There are several steps associated with fundamental analysis. The investor must examine the current and future overall health of the economy as a whole, attempt to determine the short-, medium- and long-term direction and level of interest rates, and understand the industry sector involved, including the maturity of the sector and any cyclical effects that the overall economy has on it. Once these steps have been undertaken, the individual firm must be analyzed. This analysis must include the factors that give the firm a competitive advantage in its sector (low-cost producer, technological superiority, distribution channels, etc.). As well, an in-depth look at the firm must be undertaken, examining such factors as management experience and competence, history of performance, accuracy of forecasting revenues and costs, and growth potential.

Financial statement analysis is the biggest part of fundamental analysis for forecasting future stock prices. It involves looking at historical performance data to estimate the future performance of stocks. Followers of fundamental analysis want as much data as they can find on revenue, expenses, assets, liabilities and all the other financial aspects of a company. Listed firms must report this data regularly to the market regulator, and it is publicly available. Fundamental analysts look at this information to gain insight into a company's future performance. The three most important pieces of information used by fundamental analysts to assess a firm's value are:

- The Balance Sheet. The balance sheet highlights the financial condition of a company at a single point in time. It lists all of the assets held by a company in addition to the portion of those assets that are financed by debt (liabilities) or equity (retained earnings and stock).

- The Income Statement. In essence, an income statement tells you how much money a company brought in (its revenues), how much it spent (its expenses), and the difference between the two (its profit or loss) over a specified time.

- The Cash Flow Statement. The cash flow statement shows how the company has performed in managing inflows and outflows of cash, and provides a sharper picture of the company's ability to pay bills and creditors and to finance growth.

More information about fundamental analysis can be found in [COPE00].
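As a hedged illustration of steps (iii) and (iv), calculating a value and comparing it to a price, the following MATLAB sketch applies one common valuation approach, a simple discounted-cash-flow calculation. The cash flows, growth rate and discount rate are hypothetical and are not taken from the project:

% Toy discounted-cash-flow valuation (one common way to "calculate value").
cf = [5 6 7 8 9];              % hypothetical free cash flows, years 1 to 5
g = 0.02; r = 0.10;            % assumed perpetual growth and discount rates
t = 1:numel(cf);
pvCf = sum(cf ./ (1 + r).^t);             % present value of explicit forecasts
terminal = cf(end)*(1+g) / (r - g);       % Gordon-growth terminal value at t = 5
pvTerm = terminal / (1 + r)^numel(cf);    % discounted terminal value
firmValue = pvCf + pvTerm;                % estimate of "intrinsic" firm value

If firmValue exceeded the firm's market capitalization, a fundamental analyst would regard the firm as undervalued, and vice versa.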

Figure 2-5. The general method for assessing the value of a firm

2.5.3 Technical Analysis

Technical analysts study market action, trying to identify recurrent price patterns and trends. Their goal is to profit from trading when trends or patterns recur. At the turn of the twentieth century, the Dow Theory laid the foundations for what was later to become modern technical analysis. Dow Theory was not presented as one complete amalgamation, but rather pieced together from the writings of


Charles Dow over several years. Of the many theorems put forth by Dow, three stand out:

- Price discounts everything.
- Price movements are not totally random.
- "What" is more important than "why".

Price Discounts Everything

This theorem is similar to the strong and semi-strong forms of market efficiency. Technical analysts believe that the current price fully reflects all information. Because all information is already reflected in the price, it represents the fair value and should form the basis for analysis. After all, the market price reflects the sum knowledge of all participants, including traders, investors, portfolio managers, buy-side analysts, sell-side analysts, market strategists, technical analysts, fundamental analysts and many others. It would be folly to disagree with the price set by such an impressive array of people with impeccable credentials. Technical analysis utilizes the information captured by the price to interpret what the market is saying, with the purpose of forming a view on the future.

Price Movements Are Not Totally Random

Most technicians agree that prices trend. However, most technicians also acknowledge that there are periods when prices do not trend. If prices were always random, it would be extremely difficult to make money using technical analysis. A technician believes that it is possible to identify a trend, invest or trade based on the trend, and make money as the trend unfolds. Because technical analysis can be applied to many different time frames, it is possible to spot both short-term and long-term trends. An uptrend is renewed when the stock breaks above its trading range; a downtrend begins when the stock breaks below the low of the previous trading range.

"What" Is More Important than "Why"

Technicians, as technical analysts are called, are only concerned with two things: (1) What is the current price? (2) What is the history of the price movement? The price is the end result of the battle between the forces of supply and demand for the company's stock. The objective of analysis is to forecast the


direction of the future price. By focusing on price and only price, technical analysis represents a direct approach. Fundamentalists are concerned with why the price is what it is. For technicians, the "why" portion of the equation is too broad, and many times the fundamental reasons given are highly suspect. Technicians believe it is best to concentrate on "what" and never mind "why". Why did the price go up? It is simple: more buyers (demand) than sellers (supply). After all, the value of any asset is only what someone is willing to pay for it. Who needs to know why?

Figure 2-6. IBM’s price chart. Trends and random movements.

Conclusions

The beauty of technical analysis lies in its versatility. Because the principles of technical analysis are universally applicable, all trading instruments may be analyzed with the same theoretical background. One does not need an economics degree to analyze a market index chart. Charts are charts. It does not matter if the time frame is 2 minutes or 2 years. It does not matter if it is a stock, market index, future, commodity or any tradable instrument whose price is influenced by the forces of supply and demand. The technical principles of support, resistance, trend, trading range and other aspects can be applied to any chart. While this may sound easy, technical analysis is by no means easy: success requires serious study, dedication and an open mind [PECO05].


2.5.4 Quantitative Analysis

Quantitative analysis is a set of financial analysis techniques that seek to understand behavior by using complex mathematical and statistical modeling, measurement and research. By assigning a numerical value to variables, quantitative analysts try to replicate reality mathematically. Quantitative analysis can be done for a number of reasons, such as measurement, performance evaluation or valuation of a financial instrument. It can also be used to predict real-world events such as changes in a share price. Because of their backgrounds, quantitative analysts draw from three forms of mathematics: statistics and probability, calculus centered on partial differential equations, and econometrics. The majority of quantitative analysts apply a mindset drawn from the physical sciences. The most commonly used numerical methods are:

- Finite difference method, used to solve partial differential equations.

- Monte Carlo method, also used to solve partial differential equations; Monte Carlo simulation is also common in risk management.

There are two types of quantitative analysts:

- A typical problem for a statistically oriented quantitative analyst would be to develop a model for deciding which stocks are relatively expensive and which are relatively cheap. The model might include a company's book-value-to-price ratio, its trailing earnings-to-price ratio, and other accounting factors. An investment manager might implement this analysis by buying the underpriced stocks, selling the overpriced stocks, or both. Statistically oriented quantitative analysts tend to rely more on statistics and econometrics, and less on sophisticated numerical techniques and object-oriented programming. These analysts tend to enjoy trying to find the best approach to model data, and can accept that there is no "right answer" until time has passed and one can see retrospectively how the model performed.

- A typical problem for a numerically oriented quantitative analyst would be to develop a model for pricing, hedging, and risk-managing a complex derivative product. Mathematically oriented quantitative analysts tend to rely more on numerical analysis, and less on statistics and econometrics. These analysts tend to prefer a deterministically "correct" answer: once there is agreement on input values and market variable dynamics, there is only one correct price for any given security (which can be demonstrated, although often inefficiently, through a large volume of Monte Carlo simulations).

Both types of quantitative analysts demand a strong knowledge of sophisticated mathematics and computer programming proficiency. The typical investment problems that quantitative analysis may address are described next.

Pricing

Imagine there is an investment opportunity that will pay exactly $110 at the end of one year. How much is this investment worth today? In other words, what is the appropriate price of this investment, given the overall financial environment? If the current interest rate for a one-year investment is 10%, then this investment should have a price of exactly $100. The price is determined by a simple application of the comparison principle: the investment can be directly compared with investing money in a one-year certificate of deposit (or a one-year Treasury bill), and hence it must bear the same interest rate. This interest rate example is a simple instance of the general pricing problem: given an investment with known payoff characteristics (which may be random), what is the reasonable price; or, equivalently, what price is consistent with the other securities that are available? This problem will be encountered in many contexts, for example when determining the appropriate price of a bond, a stock, or derivative contracts such as futures and options.

Hedging

Hedging is the process of reducing financial risks that either arise in the course of normal business operations or are associated with investments. Futures and options contracts are ways of diminishing risk.

Pure Investment

Pure investment refers to the objective of obtaining increased future return in exchange for a present allocation of capital. This investment problem is referred to as the portfolio selection problem, since the real issue is to determine where to invest available capital. Usually it is solved as an optimization problem, where each stock is modeled with two parameters: an expected return and a standard deviation of that return.


2.6 Extreme events in financial markets

What leads to extreme events in the field of economics? Motivated by recent theories and by the economic developments taking place in the world today, this project addresses the subject of extreme events in the financial field. In the spring and summer of 2007, the shock of the subprime crisis, which originated in a relatively small part of the U.S. financial market, came to destabilize hedge funds and major international markets. In Britain, the interbank rate reached its highest level in 9 years. Modern economies are continually being subjected to such extreme events, sometimes simultaneously or in rapid sequence, as in the episodes of contagion in East Asia during the 1990s.

Extreme events are often perceived as unpredictable, but this may not be true. Discussions of extreme economic events often assume that the extremes are caused exogenously by nature and have a constant probability of occurrence. But what is the likelihood that extreme events are also affected by our behavior? Are peaks not observed in the frequency of occurrence of extremes? The answer to both questions is yes. Endogenously driven extremes occur in economics and in nature, including the effect of human activity on the likelihood of both extreme financial events and extreme weather. Thus, one can distinguish two types of extreme events, static and dynamic, and they can be endogenous or exogenous: the extreme component of a financial market can come both from outside and from inside the economic operators themselves. Importantly, when human activity endogenously increases the likelihood of extremes, they may become less "rare". Financial crises and extreme events have a clear endogenous component because they affect many individuals in the national or global financial system. This extreme behavior, caused endogenously by economic insiders, is also known as an externality, which is the main cause of inefficiency in the system. Consequently, if extreme events are pure externalities, their cost is not borne by those who cause them but by society as a whole.

How can this formulation of endogenous extremes help? It does so in two ways. First, it helps to understand the source of some extremes (the endogenous ones), giving an idea of what one can seek to avoid. Second, it provides banks and regulatory authorities with a set of tools that can help address extreme events before and during their occurrence. According to [LAND08], the current crisis reminds us that extreme events occur within the financial system. In support of the view that endogenous extreme behavior is influenced by human activity, it should not be forgotten that financial systems are "human" systems. Their dynamics are influenced by the manner in which humans react to changes in their environment. These reactions can reinforce and amplify shocks to the point where they become a catastrophe. "Herd" behavior has long been known as one of the most essential features of financial markets: the reactions of individuals can, by virtue of their mutual interaction, produce strong amplifying effects on the markets.


Finally, note that financial systems are complex systems. They are based on interdependence between multiple actors and counterparts. Transmissions occur through networks whose structure and architecture are constantly changing due to financial innovation and regulatory arbitrage. This gives them the following characteristics: nonlinearity, discontinuities and sensitivity to initial conditions. Complexity creates the possibility of extreme events, but does not make them inevitable. For all this, one can say that financial markets are prone to extreme events [REXA11].

In the following figure the IBEX 35 Spanish index closing quotes between 1992 and 2009 are represented. Several turbulent periods or crises can be seen. Especially important is the crisis that started in 2007; the huge downtrend of those years is shown in the figure.

Figure 2-7. IBEX 35 Spanish index closing quotes between 1992 and 2009

The next figure shows the monthly range variation of Santander bank quotes between January 2003 and March 2012 and its different periods (stable, pseudo-turbulent and turbulent). The classification of the periods will be studied in more detail in the following chapters.


Figure 2-8. Periods of the monthly range variation of Santander bank quotes between January 2003 and March 2012



Chapter 3

3. Statistical tools for the analysis and prediction of financial time series

3.1 Data analysis

Statistics is an essential tool for understanding the behavior of a population or data set. This implies the availability of data from at least one variable. Such variables are present when studying the many events that are not deterministic. When studying a phenomenon in any field, a series of data is usually collected, processed, and used to work out a series of conclusions; the way data are collected and plotted is very important. Engineering as a business is committed to finding solutions for different types of industries, and this search for solutions has always required the collection and analysis of data. Nowadays, technology allows large amounts of data to be collected in any field, and it therefore seems obvious that the analysis of those data is required for understanding and process control.

3.1.1 Scales

Ratio scales. They have a natural zero, which corresponds to the absence of the phenomenon under study in a given observation, and a constant measurement unit (currency, kg, km). They allow one to know how many times an object is greater than another.

Interval scales. They do not have a natural zero, and the distance between two adjacent points is equal. Since there is no zero, it is possible to add, subtract and compare, but not to multiply or form ratios. They allow the calculation of mode, median, mean, range and standard deviation, e.g., temperature.

Ordinal scales. Used for classifying objects in an order depending on whether they have a greater or lesser degree of a specific feature. One object cannot be compared quantitatively with another, since there is no measure of dimension. They allow the calculation of mode and median, e.g., the positions of the drivers in a race.

Nominal scales. They divide data populations into different classes within which objects are considered equivalent. The only operations that can be made are based on the equivalence relation. A particular case is that of dichotomous or Boolean variables, which record the presence or absence of a feature. They allow the calculation of mode and frequency, e.g., the sex of a person.

In the case of time series, the data collected are two-dimensional, because both the position of the observation in time and its value have to be known. The position corresponds to an ordinal scale, and a metric scale measures the value.

3.2 Time series

A time series is a sequence of consecutive observations of a variable at different points in time. These series may be continuous (the variable can be observed at any instant of time) or discrete (values of the series are collected at equally spaced time intervals). Calling the dependent variable "Y" and the instant of time "t", the series is represented as follows: {Y(t)} with t = 1, ..., n.

3.2.1 Prediction methods based on data

Predictions fall into two categories: quantitative and qualitative methods.

Quantitative methods

Quantitative prediction methods require three conditions:

- Information about the past is available.

- This information can be quantified in the form of numerical data.

- It can be assumed that some aspects of the past patterns will continue in the future.

The criticism they receive is that quantitative methods cannot accurately describe the future because everything is constantly changing. There are two main model classes for quantitative prediction: time series models and explanatory models. Explanatory models predict a variable from its relationship with other, independent variables; hence, the goal is to discover how the variables are related. Time series models are based on the past values of a variable, but they do not explore how external factors affect the system; the objective is to extrapolate into the future what happened in the past. Both have advantages: time series models are easier to use, while explanatory models can be very successful in supporting decisions. There are two types of analysis in quantitative methods:

- Univariate analysis: forecasts are made using only information about the series under study (ARIMA models).

- Causal analysis: the behavior of the series is explained by external factors (regression analysis and multivariate analysis of time series).

Qualitative methods

Qualitative methods require a decision based on accumulated knowledge. The utility of this kind of method is more difficult to measure, and it is used together with quantitative methods (Bayesian analysis). These are the opinions of experts, which are used to predict future events. They are commonly used for series in which the past is not relevant to future values, or when there are no data on the past or no techniques for analyzing them.

3.2.2 Components of a time series

The general components of a time series are:

- Trend: the long-term evolution of the series. It can be defined as a long-term change that occurs in relation to the mean level, or a change in the long-term average. The trend is identified with a smooth long-term movement of the series.

- Cycle: oscillatory movements above and below the trend.

- Seasonal component: periodic fluctuations whose length is bounded by the calendar (daily, weekly, monthly, quarterly...) and which repeat over time.

- Irregular component or remainder: sporadic variations not included in the above. Once the previous components have been identified and removed, there remain values that are random. The aim is to study what kind of random behavior these residuals present, using some kind of probabilistic model to describe them. They can be of two types: random (small accidental effects) or erratic (cannot be predicted).

Normally the random component εt is considered to be the only residual; it represents impacts produced by chance, with mean constant and equal to zero and variance σ2 constant over time. A time series {εt} is a white noise process if it verifies the following:

1. E(εt) = 0 for all t.

2. E(εt εt') = 0 for all t other than t'. This means that the observations of the series present no correlation.

3. V(εt) = σ2 for all t.

A white noise process (a normal distribution is not required, although it may be usual) is totally unpredictable: since it has no memory of past values of the series, it does not provide information about what may happen in the future. This concept has been crucial for the development of the methodology of this project.

3.2.3 Classical decomposition methods

A time series can be expressed as follows:

Yt = f (St, Tt, Et)

where:
Yt is the value of the time series in period t;
St is the seasonal component in period t;
Tt is the trend component (evolution of the series) together with the cyclical factor (oscillatory movements about the trend) in period t;
Et is the irregular component in period t.


The following figure shows the decomposition graph of the components of a series. The data is plotted above, the trend component in the second row, the seasonal component in the third and the irregular component in the fourth.

Figure 3-1. Classical decomposition methods

First in the figure is the observed time series. Second is the component corresponding to the trend and cycle of the series, which usually indicates its long-term behavior; in this case it can be seen that the series has an increasing trend. Third is the seasonal component, a pattern repeated continuously along the entire series every certain time period (the length of the season), in this case 12 months. Finally, the residual component, which cannot be predicted, is represented. Adding the three component series up, the initially observed series would be obtained.


Depending on the relationship between the various components, there are different models of decomposition:

Additive decomposition

Values of the time series are obtained by adding the various components:

Yt = Tt + St + Et      (3.1)

Multiplicative decomposition

Time series values are obtained by multiplying the individual components. Thus, the oscillations in the series are more pronounced than in the additive decomposition:

Yt = Tt * St * Et      (3.2)

Mixed decomposition

Adding and multiplying components results in the time series values: one part arises from multiplying the trend, cyclical and seasonal factors, and an irregular element (It) is then added:

Yt = Tt * St * Et + It      (3.3)

Table 3.1. Schematic of Trend and Seasonal Component of a series

This table shows graphically the behavior of a series based on two components, trend and seasonality. Besides that, it also illustrates the difference for these two components when they are zero, additive or multiplicative. It can be seen that when there is no trend the series remains stable around a value. When the series has an additive trend, the time series acquires a positive or negative constant slope. When the trend is multiplicative the slope of the series grows exponentially. If there is no seasonal component the series seems noisy and no pattern is repeated every certain period of time. An additive seasonal component will be reflected as a pattern that repeats every certain period of time with peaks and valleys. In the case of multiplicative seasonal component, the maximums and minimums of the pattern are increasingly separating (diverging). [REXA11]
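A minimal MATLAB sketch of the additive model (3.1) on simulated data may clarify the idea; the components and the classical 2x12 moving average used to recover the trend are illustrative choices, not the project's own procedure:

% Simulated monthly series built from the three additive components.
t = (1:120)';                          % ten years of monthly observations
trend     = 0.5 * t;                   % Tt: linear trend
seasonal  = 10 * sin(2*pi*t/12);       % St: period-12 seasonal pattern
irregular = randn(120, 1);             % Et: white-noise remainder
y = trend + seasonal + irregular;      % additive model (3.1)

% Recover the trend with a centered 2x12 moving average (a classical choice):
w = [0.5, ones(1, 11), 0.5] / 12;      % weights average out period-12 seasonality
trendHat = conv(y, w', 'same');        % endpoint estimates are unreliable
detrended = y - trendHat;              % leaves seasonal plus irregular parts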

3.3 Statistical process control. Nelson rules

A process under control is one whose behavior with respect to variation is stable over time. Statistical Process Control (SPC) is, from the perspective of quality, a very important tool used to control the background variability in a process, using statistical techniques that allow real-time monitoring of the quality of the process. The most commonly used techniques are control charts, which can be categorized into control charts for variables, for attributes, or for the number of defects. These control charts make it possible to identify instability and abnormal circumstances. They are a visual comparison of the performance data of a process with control limits drawn as lines on the graph. In industry, when a chart indicates an out-of-control point, one can start an investigation to identify causes and make corrective decisions. The objective in this project is to relate the concept of an "out of control" point to the turbulent periods of a financial series. Control charts for variables have been used to develop the method: a graphic representation of the variation produced in the series of ranges over time. Any variation beyond the fixed limits is considered anomalous and out of control (turbulent). The basic components of a control chart are:

- Centerline: a horizontal line parallel to the time axis, at the point on the vertical axis that represents the mean of the sample data. The median can also be used as the centerline.

- Statistical control limits: upper and lower boundaries (typically ±3 standard deviations) within which any point of the variable is considered to be in statistical control.

Control charts have been the subject of much research and many publications because of their useful application in various fields. One of the most accepted and widely used sets of rules in this discipline is the "Nelson rules". The rules are used to see whether a process is out of control at a given point, so in this project they have been useful for relating turbulent periods to "out of control" points. Nelson rules are a method in process control for determining whether some measured variable is out of control (unpredictable versus consistent). Rules for detecting out-of-control or non-random conditions were first postulated by Walter A. Shewhart in the 1920s. Lloyd S. Nelson first published the Nelson rules in an article in the October 1984 issue of the Journal of Quality Technology. The rules are applied to a control chart on which the magnitude of some variable is plotted against time, and they are based on the mean value and the standard deviation of the samples. The 8 Nelson rules for detecting "out of control" points are [WIKI12]:

Rule 1. One point is more than 3 standard deviations from the mean. One sample (two shown in this case) is grossly out of control.

Figure 3-2. Nelson rule number 1


Rule 2. Nine (or more) points in a row are on the same side of the mean. Some prolonged bias exists.

Figure 3-3. Nelson rule number 2

Rule 3. Six (or more) points in a row are continually increasing (or decreasing). A trend exists.

Figure 3-4. Nelson rule number 3


Rule 4. Fourteen (or more) points in a row alternate in direction, increasing then decreasing. This much oscillation is beyond noise. This is directional and the position of the mean and size of the standard deviation do not affect this rule.

Figure 3-5. Nelson rule number 4

Rule 5. Two (or three) out of three points in a row are more than 2 standard deviations from the mean in the same direction. There is a medium tendency for samples to be medium out of control. The side of the mean for the third point is unspecified.

Figure 3-6. Nelson rule number 5


Rule 6. Four (or five) out of five points in a row are more than 1 standard deviation from the mean in the same direction. There is a strong tendency for samples to be slightly out of control. The side of the mean for the fifth point is unspecified.

Figure 3-7. Nelson rule number 6

Rule 7. Fifteen points in a row are all within 1 standard deviation of the mean on either side of the mean. With 1 standard deviation, greater variation would be expected.

Figure 3-8. Nelson rule number 7


Rule 8. Eight points in a row exist with none within 1 standard deviation of the mean and the points are in both directions from the mean. Jumping from above to below whilst missing the first standard deviation band is rarely random.

Figure 3-9. Nelson rule number 8

Applying these rules indicates when a potential "out of control" situation has arisen. However, there will always be some false alerts, and the more rules that are applied, the more false alerts will occur. For some processes, it may be beneficial to omit one or more rules. Equally, there may be some missed alerts, where a specific "out of control" situation is not detected. Empirically, the detection accuracy is good. The Nelson rules have been applied in this project to differentiate between stable and turbulent periods. An example of rules 1, 2 and 5 is shown in the following figure, where the rules are applied to the monthly range variation of Santander bank quotes between January 2003 and March 2012.


Figure 3-10. Nelson rules 1, 2 and 5 applied to the monthly range variation of Santander bank quotes between January 2003 and March 2012

The points shown in the graph represent points that meet Nelson rules 1, 2 or 5. Therefore, these points are related to turbulent points and will be used to develop the methodology explained in Chapter 6.
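The following MATLAB sketch shows one way rules 1, 2 and 5 could be coded; the placeholder series and the decision to flag whole rule windows are assumptions made for illustration, not the project's exact implementation:

% Sketch of Nelson rules 1, 2 and 5 applied to a generic series x.
x  = randn(120, 1);            % placeholder: substitute the monthly range series
mu = mean(x); sd = std(x); n = numel(x);

r1 = abs(x - mu) > 3*sd;       % rule 1: a point beyond 3 standard deviations

r2 = false(n, 1);              % rule 2: nine or more points on one side of the mean
side = sign(x - mu);
for i = 9:n
    if side(i) ~= 0 && all(side(i-8:i) == side(i))
        r2(i-8:i) = true;
    end
end

r5 = false(n, 1);              % rule 5: two of three consecutive points beyond
for i = 3:n                    %         2 standard deviations, on the same side
    win = x(i-2:i);
    if sum(win > mu + 2*sd) >= 2 || sum(win < mu - 2*sd) >= 2
        r5(i-2:i) = true;
    end
end

outOfControl = r1 | r2 | r5;   % candidate "turbulent" points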

3.4 White Noise

The concept of white noise has been very useful for developing the methodology of this project. A methodology has been designed to explore and distinguish between stable, pseudo-turbulent and turbulent periods. White noise is related to stable periods; in this way it is possible to distinguish stable periods from pseudo-turbulent or turbulent ones.

A signal whose values at two different time instants are statistically uncorrelated is called a white noise signal. For such a signal, the sample autocorrelation coefficients are approximately normally distributed with zero mean and standard error 1/√n, where n is the number of observations.


A model for this type of noise would be:

Yt = c + et      (3.4)

where et represents the random error component and "c" the level of the series. It is expected that 95% of the autocorrelation coefficients of a white noise series lie in the range ±1.96/√n. If this is not the case, the series is likely not to be white noise. This is why it is common to draw lines that define these limits (critical values) when graphs of the ACF (autocorrelation function) are viewed. For a white noise series, all theoretical autocorrelation coefficients must be equal to zero. In practice, this condition is difficult to achieve, which is why a period is considered white noise if the ACF values are close to zero.

In this project, the methodology developed to identify white noise periods is based on a rolling-window technique, which will be described in Chapter 6. Applying the white noise detection methodology makes it possible to distinguish white noise periods from non-white-noise periods. An example of the detection of white noise periods is shown in the following figure, which represents the monthly range variation of Santander bank quotes between January 2003 and March 2012. The rolling window size applied is 18 months, since this is the one that seems to work best, as stated in [REXA11]. With the program that has been developed in MATLAB it is possible to differentiate white noise periods from non-white-noise periods.
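A minimal MATLAB sketch of this check on a single window might look as follows; the placeholder data, the number of lags and the 5% decision rule are illustrative assumptions:

% Compare sample autocorrelations against the +-1.96/sqrt(n) white-noise band.
x = randn(120, 1);                 % placeholder series; substitute real data
n = numel(x); xc = x - mean(x);
maxLag = 20;
acf = zeros(maxLag, 1);
for h = 1:maxLag
    acf(h) = sum(xc(1+h:end) .* xc(1:end-h)) / sum(xc.^2);
end
crit = 1.96 / sqrt(n);             % 95% critical values for white noise
fracOutside = mean(abs(acf) > crit);
isWhiteNoise = fracOutside <= 0.05;   % rough rule: at most ~5% outside the band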

Figure 3-11. White noise test for the monthly range variation of Santander bank quotes between January 2003 and March 2012


In the plot, the red points (non-white-noise periods) are related to non-stable periods. This white noise test, combined with the Nelson rules, will be used to classify each period as stable, pseudo-turbulent or turbulent.

3.4.1 Random Walk

In its most general form, a random walk is any random process in which the position of a particle at a certain instant depends only on its position at some previous instant and on some random variable that determines its subsequent direction and step length. Like a white noise series, a random walk is quite unpredictable. If Y(t) defines a time series, a random walk is modeled by the following expression:

Y(t + τ) = Y(t) + Φ(τ)      (3.5)

where Φ is the random variable describing the probability law for taking the next step, and τ is the time interval between consecutive steps. As the length and direction of a given step depend only on the position Y(t) and not on any previous position, the random walk has the Markov property; for more detail see [MAKR98]. The fact that a series is a random walk means that it is not white noise, so the ACF graph associated with that series would not have values close to zero.
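A short simulation of (3.5) in MATLAB, assuming (as one possible choice of Φ) i.i.d. standard normal steps:

% Random walk built by accumulating independent Gaussian steps.
T = 500;
steps = randn(T, 1);       % Phi: random step at each instant
y = cumsum(steps);         % Y(t+1) = Y(t) + Phi
plot(y); xlabel('t'); ylabel('Y(t)');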


3.5 Autoregressive models

3.5.1 Simple autoregressive models AR(p)

Autoregressive models (AR models) are based on performing a regression of the variable Yt on itself (auto-regression), i.e., on the values that the variable took in the p previous periods. The simplest model is the autoregressive AR(1), which follows this pattern:

Yt = c + k Yt-1 + et      (3.6)

If the value of k is zero, the series is a white noise, and if it is one, the series is a random walk, as the following sketch illustrates.
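A minimal MATLAB sketch contrasting the three cases of k in (3.6); the sample size and coefficient values are arbitrary:

% Simulate AR(1) series for k = 0 (white noise), |k| < 1 (stationary)
% and k = 1 (random walk).
T = 300; c = 0; ks = [0, 0.9, 1];
Y = zeros(T, numel(ks));
for j = 1:numel(ks)
    e = randn(T, 1);                        % Gaussian innovations
    for t = 2:T
        Y(t, j) = c + ks(j) * Y(t-1, j) + e(t);
    end
end
plot(Y);
legend('k = 0 (white noise)', 'k = 0.9 (stationary)', 'k = 1 (random walk)');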

Autoregressive models (AR) of order p are expressed as:

Yt = c + k1 Yt-1 + k2 Yt-2 + ... + kp Yt-p + et      (3.7)

and the coefficients must satisfy stationarity conditions, which in the AR(1) case reduce to -1 < k < 1.

A state j is accessible from a state i if pij(n) > 0 for some integer n. A matrix A = [aij] is said to be positive if aij > 0 for all i, j. A transition matrix T is said to be regular if there exists an integer N such that T^N is positive. A regular chain is obviously irreducible, but the converse does not necessarily hold.

Closed sets

A Markov chain may contain some states that are recurrent, others that are transient and others that are absorbing. Recurrent states can be part of closed subsets. A set of states C in a Markov chain is said to be closed if any state in C can be reached from any other state in C and no state outside C can be reached from any state in C. A necessary condition for this is that pij = 0 for every i ∈ C and j ∉ C.

An absorbing state is a closed set with a single element. Note that a closed subset can itself be regarded as a sub-chain of the complete Markov chain.

Ergodic chains

All states in an irreducible chain belong to the same class. If all states are ergodic, i.e., recurrent, non-null and aperiodic, then the chain is defined as ergodic. The canonical representation of a transition matrix consists of rearranging the transition matrix, separating the recurrent states from the transient ones. At the same time, the recurrent states have to be reordered, joining those that communicate with each other. In this way, the states of a chain can be divided into two disjoint subsets (one of which may be ∅): the transient and the recurrent, such that the transient states are inaccessible from the recurrent ones.


Infinite chains

If the chain is not finite, the process continues indefinitely. A visit to a state j is the stage or moment at which the chain is in state j, and Nj denotes the number of visits to state j during the process.

Periodic chains

If a state j is periodic with period δ and another state i communicates with it, then state i is also periodic with the same period. Thus, the period is a common feature of irreducible and closed chains. If an irreducible and closed sub-chain of states C ∈ E has period δ, then C can be subdivided into δ subsets D0, D1, ..., Dδ-1 such that the evolution of the chain occurs cyclically, moving at every stage from Dr to Dr+1 and from Dδ-1 back to D0.
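As an illustration of these definitions, the following MATLAB sketch checks regularity and computes the stationary distribution of a hypothetical 3-state transition matrix (written here with the row convention, each row summing to one):

% Hypothetical 3-state transition matrix (rows = current state, cols = next).
P = [0.90 0.08 0.02;
     0.10 0.80 0.10;
     0.05 0.15 0.80];

% Regularity check: some power of P has all entries strictly positive.
isRegular = all(all(P^20 > 0));

% Stationary distribution: the left eigenvector of P for eigenvalue 1,
% normalized so that its entries sum to one (piVec' * P = piVec').
[V, D] = eig(P');
[~, idx] = min(abs(diag(D) - 1));
piVec = real(V(:, idx));
piVec = piVec / sum(piVec);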

4.3 Markov Switching Models

Many economic time series occasionally exhibit dramatic breaks in their behavior, associated with events such as financial crises or abrupt changes in government policy. Of particular interest to economists is the apparent tendency of many economic variables to behave quite differently during economic downturns, when underutilization of factors of production, rather than their long-run tendency to grow, governs economic dynamics. Abrupt changes are also a prevalent feature of financial data. Consider describing the consequences of a dramatic change in the behavior of a single variable yt. Suppose that the typical historical behavior could be described with a first-order autoregression,

yt = c1 + k yt-1 + εt      (4.34)


with εt ∼ N(0, σ2), which seemed to adequately describe the observed data for t = 1, 2, ..., t0. Suppose that at date t0 there was a significant change in the average level of the series, so that one wishes to describe the data according to

yt = c2 + k yt-1 + εt      (4.35)

for t = t0 + 1, t0 + 2, ... This fix of changing the value of the intercept from c1 to c2 might help the model to get back on track with better forecasts, but it is rather unsatisfactory as a probability law that could have generated the data. We surely would not want to maintain that the change from c1 to c2 at date t0 was a deterministic event that anyone would have been able to predict with certainty looking ahead from date t = 1. Instead, there must have been some imperfectly predictable forces that produced the change. Hence, rather than claim that expression (4.34) governed the data up to date t0 and (4.35) after that date, there should be some larger model encompassing them both,

yt = c(st) + k yt-1 + εt      (4.36)

where st is a random variable that, as a result of institutional changes, happened in our sample to assume the value st = 1 for t = 1, 2, ..., t0 and st = 2 for t = t0 + 1, t0 + 2, ... A complete description of the probability law governing the observed data would then require a probabilistic model of what caused the change from st = 1 to st = 2. The simplest such specification is that st is the realization of a two-state Markov chain with

Pr(st = j | st-1 = i, st-2 = k, ...) = Pr(st = j | st-1 = i) = pij      (4.37)

Assume that st is not observed directly; its operation can only be inferred through the observed behavior of yt. The parameters necessary to fully describe the probability law governing yt are then the variance of the Gaussian innovation σ2, the autoregressive coefficient k, the two intercepts c1 and c2, and the two state transition probabilities, p11 and p22. The specification in (4.37) assumes that the probability of a change in regime depends on the past only through the value of the most recent regime, though nothing in the approach described below precludes looking at more general probabilistic specifications. But the simple time-invariant Markov chain (4.37) seems the natural starting point and is clearly preferable to acting as if the shift from c1 to c2 was a deterministic event. Permanence of the shift would be represented by p22 = 1, though the Markov formulation invites the more general possibility that p22 < 1. Certainly in the case of business cycles or financial crises, we know that the situation, though dramatic, is not permanent. Furthermore, if the regime change reflects a fundamental change in monetary or fiscal policy, the prudent assumption would seem to be to allow the possibility for it to change back again, suggesting that p22


< 1 is often a more natural formulation for thinking about changes in regime than p22 = 1. A model of the form of (4.36)-(4.37) with no autoregressive elements (k = 0) appears to have been first analyzed by [LIND78] and [BAUM80]. Specifications that incorporate autoregressive elements date back in the speech recognition literature to [PORI82], [JUAN85], and [RABI89] (who described such processes as "hidden Markov models"). [GOLD73] introduced Markov-switching regressions in econometrics. The likelihood function was first correctly calculated by [COSS85]. The formulation of the problem described here, in which all objects of interest are calculated as a by-product of an iterative algorithm similar in spirit to a Kalman filter, is due to [HAMI89] and [HAMI94]. General characterizations of moment and stationarity conditions for such processes can be found in [FRAN01].

Suppose that the econometrician observes yt directly but can only make an inference about the value of st based on what is seen happening with yt. This inference will take the form of two probabilities

ξj,t|t = Pr(st = j | Ωt ; θ)      (4.38)

for j = 1, 2, where these two probabilities sum to unity by construction. Here Ωt = {yt, yt-1, ..., y1, y0} denotes the set of observations obtained as of date t, and θ is a vector of population parameters, which for the above example would be θ = (σ2, k, c1, c2, p11, p22), and which for now is presumed to be known with certainty. The inference is performed iteratively for t = 1, 2, ..., T, with step t accepting as input the values

ξi,t-1|t-1 = Pr(st-1 = i | Ωt-1 ; θ)      (4.39)

for i = 1, 2 and producing as output (4.38). The key magnitudes needed to perform this iteration are the densities under the two regimes,

ηjt = f(yt | st = j, Ωt-1 ; θ) = (1 / √(2πσ2)) exp[-(yt - cj - k yt-1)2 / (2σ2)]      (4.40)

for j = 1, 2. Specifically, given the input (4.39), the conditional density of the t-th observation can be calculated from

f(yt | Ωt-1 ; θ) = Σ(i=1..2) Σ(j=1..2) pij ξi,t-1|t-1 ηjt      (4.41)

and the desired output is then

ξj,t|t = [Σ(i=1..2) pij ξi,t-1|t-1 ηjt] / f(yt | Ωt-1 ; θ)      (4.42)
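A minimal MATLAB sketch of this filter for N = 2 regimes, run on data simulated from the model itself, may help fix ideas; all parameter values are hypothetical and θ is treated as known:

% Two-regime switching model: y_t = c(s_t) + k*y_{t-1} + eps_t.
c = [0.5; -1.0]; k = 0.3; sigma = 1.0;        % regime intercepts, AR term, std
p11 = 0.95; p22 = 0.90;
P = [p11, 1-p22; 1-p11, p22];                 % P(j,i) = p_ij = Pr(s_t=j | s_t-1=i)

% Simulate T observations (s is the hidden regime path):
T = 200; s = zeros(T, 1); y = zeros(T, 1);
s(1) = 1; y(1) = c(1) + sigma*randn;
for t = 2:T
    stay = p11*(s(t-1) == 1) + p22*(s(t-1) == 2);
    if rand < stay, s(t) = s(t-1); else, s(t) = 3 - s(t-1); end
    y(t) = c(s(t)) + k*y(t-1) + sigma*randn;
end

% Filter: xi(:,t) holds Pr(s_t = j | Omega_t) for j = 1, 2.
xi = zeros(2, T);
xiPrev = [0.5; 0.5];                          % flat starting value for xi_0
loglik = 0;
for t = 1:T
    if t == 1, ylag = 0; else, ylag = y(t-1); end
    eta = exp(-(y(t) - (c + k*ylag)).^2 / (2*sigma^2)) / (sqrt(2*pi)*sigma); % (4.40)
    num = (P * xiPrev) .* eta;                % numerator of (4.42)
    f = sum(num);                             % conditional density (4.41)
    xi(:, t) = num / f;                       % filtered probabilities (4.42)
    loglik = loglik + log(f);                 % accumulates (4.43)
    xiPrev = xi(:, t);
end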


As a result of executing this iteration, one will have evaluated the sample conditional log likelihood of the observed data,

L(θ) = Σ(t=1..T) log f(yt | Ωt-1 ; θ)      (4.43)

for the specified value of θ. An estimate of the value of θ can then be obtained by maximizing (4.43) by numerical optimization. Several options are available for the value ξi0 used to start these iterations. If the Markov chain is presumed to be ergodic, one can use the unconditional probabilities

ξ1,0 = (1 - p22) / (2 - p11 - p22),  ξ2,0 = (1 - p11) / (2 - p11 - p22)      (4.44)

Other alternatives are simply to set ξi0 = ½ or to estimate ξi0 itself by maximum likelihood. The calculations do not increase in complexity if one considers an (r × 1) vector of observations yt whose density depends on N separate regimes. Let Ωt = {yt, yt-1, ..., y1} be the observations through date t, P be an (N × N) matrix whose row j, column i element is the transition probability pij, ηt be an (N × 1) vector whose j-th element f(yt | st = j, Ωt-1 ; θ) is the density in regime j, and ξt|t an (N × 1) vector whose j-th element is Pr(st = j | Ωt, θ). Then (4.41) and (4.42) generalize to

f(yt | Ωt-1 ; θ) = 1' (P ξt-1|t-1 ⊙ ηt)      (4.45)

ξt|t = (P ξt-1|t-1 ⊙ ηt) / [1' (P ξt-1|t-1 ⊙ ηt)]      (4.46)

where 1 denotes an (N × 1) vector all of whose elements are unity and denotes element-by-element multiplication. Vector applications include describing the movements between stock prices and economic output and the tendency for some series to move into recession before others. There further is no requirement that the elements of ηt be Gaussian densities or even from the same family of densities. For example, [DUEK97] studied a model in which the degrees of freedom of a Student t distribution change depending on the economic regime. One is also often interested in forming an inference about what regime the economy was in at date t based on observations obtained through a later date T, denoted ξt|T. These are referred to as “smoothed” probabilities, an efficient algorithm for whose calculation was developed by [KIM_94]. The calculations in (4.45) and (4.46) remain valid when the probabilities in P depend on lagged values of yt or strictly exogenous explanatory variables. However, often there are relatively few transitions among regimes, making it 88

difficult to estimate such parameters accurately, and most applications have assumed a time-invariant Markov chain. For the same reason, most applications assume only N = 2 or 3 different regimes, though there is considerable promise in models with a much larger number of regimes, either by tightly parameterizing the relationship between the regimes or with prior Bayesian information.

In the Bayesian approach, both the parameters θ and the values of the states s = (s1, s2, ..., sT)′ are viewed as random variables. Bayesian inference turns out to be greatly facilitated by Markov chain Monte Carlo methods, specifically the Gibbs sampler. This is achieved by sequentially (for a = 1, 2, ...) generating a realization θ(a) from the distribution of θ | s(a−1), ΩT, followed by a realization of s(a) from the distribution of s | θ(a), ΩT. The first distribution, θ | s(a−1), ΩT, treats the historical regimes generated at the previous iteration, s1(a−1), s2(a−1), ..., sT(a−1), as if they were fixed known numbers. Often this conditional distribution takes the form of a standard Bayesian inference problem whose solution is known analytically using natural conjugate priors. For example, the posterior distribution of k given the other parameters is a known function of easily calculated OLS coefficients. An algorithm for generating a draw from the second distribution, s | θ(a), ΩT, was developed by [ALBE93]. The Gibbs sampler also turns out to be a natural device for handling transition probabilities that are functions of observable variables.

It is natural to want to test the null hypothesis that there are N regimes against the alternative of N + 1; for example, when N = 1, to test whether there are any changes in regime at all. Unfortunately, the likelihood ratio test of this hypothesis fails to satisfy the usual regularity conditions, because under the null hypothesis some of the parameters of the model would be unidentified. For example, if there is really only one regime, the maximum likelihood estimate of p11 does not converge to a well-defined population magnitude, meaning that the likelihood ratio test does not have the usual χ2 limiting distribution. To interpret a likelihood ratio statistic one instead needs to appeal to the methods of [HANS92] or [GARC98]. An alternative is to rely on generic tests of the hypothesis that an N-regime model accurately describes the data, though these tests are not designed for optimal power against the specific alternative hypothesis of N + 1 regimes. A test recently proposed by [CARR04], which is easy to compute but not based on the likelihood ratio statistic, seems particularly promising. Other alternatives are to use Bayesian methods to calculate the value of N implying the largest value for the marginal likelihood or the highest Bayes factor, or to compare models on the basis of their ability to forecast.

A specification where the density depends on a finite number of previous regimes, f (yt | st, st−1, ..., st−m, Ωt−1; θ), can be recast in the above form by a suitable redefinition of regime. For example, if st follows a 2-state Markov chain with transition probabilities Pr(st = j | st−1 = i) and m = 1, one can define a new regime variable s∗t such that f (yt | s∗t, Ωt−1; θ) = f (yt | st, st−1, ..., st−m, Ωt−1; θ) as follows:

s∗t = 1 if st = 1 and st−1 = 1
s∗t = 2 if st = 2 and st−1 = 1
s∗t = 3 if st = 1 and st−1 = 2
s∗t = 4 if st = 2 and st−1 = 2    (4.47)

Then s∗t itself follows a 4-state Markov chain with transition matrix

         p11   0    p11   0
P∗ =     p12   0    p12   0     (4.48)
          0   p21    0   p21
          0   p22    0   p22

whose row j, column i element gives Pr(s∗t = j | s∗t−1 = i).
More problematic are cases in which the order of dependence m grows with the date of the observation t. Such a situation often arises in models whose recursive structure causes the density of yt given Ωt−1 to depend on the entire history yt−1, yt−2, ..., y1, as is the case in ARMA, GARCH, or state-space models. Consider for illustration a GARCH(1,1) specification in which the coefficients are subject to changes in regime, yt = ht vt, where vt ∼ N (0, 1) and

h_t^2 = γ_{s_t} + α_{s_t} y_{t−1}^2 + β_{s_t} h_{t−1}^2    (4.49)

Solving (4.49) recursively reveals that the conditional standard deviation ht depends on the full history {yt−1, yt−2, ..., y0, st, st−1, ..., s1}. One way to avoid this problem was proposed by [GRAY96], who postulated that instead of being generated by (4.49), the conditional variance is characterized by

h_t^2 = γ_{s_t} + α_{s_t} y_{t−1}^2 + β_{s_t} h̃_{t−1}    (4.50)

where

h̃_{t−1} = E(y_{t−1}^2 | Ω_{t−2})    (4.51)

In Gray’s model, ht in (4.50) depends only on st, since h̃t−1 is a function of the data Ωt−1 only. An alternative solution, due to [HAAS04], is to hypothesize N separate GARCH processes whose values hit all exist as latent variables at date t,

h_{it}^2 = γ_i + α_i y_{t−1}^2 + β_i h_{i,t−1}^2,    i = 1, 2, ..., N    (4.52)

and then simply pose the model as yt = hst vt. Again, the feature that makes this work is the fact that hit in (4.52) is a function solely of the data Ωt−1 rather than of the states {st−1, st−2, ..., s1}. A related problem arises in Markov-switching state-space models, which posit an unobserved state vector zt characterized by

z_t = F_{s_t} z_{t−1} + v_t    (4.53)

with vt ∼ N (0, In), and with observed vectors yt and xt governed by

y_t = A′_{s_t} x_t + H′_{s_t} z_t + w_t    (4.54)

for wt ∼ N (0, Ir). Again the model as formulated implies that the density of yt depends on the full history {st, st−1, ..., s1}. [KIM_94] proposed a modification of the Kalman filter equations, similar in spirit to the modification in (4.50), that could be used to approximate the log likelihood. A more common practice recently has been to estimate such models with numerical Bayesian methods, as in [KIM_99]. [HAMI05]
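To make the filter recursion of this section concrete, the following minimal MATLAB sketch runs (4.40)-(4.43) for the two-regime AR(1) model discussed above. The series and all parameter values are illustrative assumptions, not estimates from any model in this document.

% Minimal sketch of the Hamilton filter (4.40)-(4.43) for the two-regime
% AR(1) model y_t = c_{s_t} + k*y_{t-1} + e_t. All values are illustrative.
y     = [1.2 0.8 3.1 2.9 0.5 1.7];   % toy observed series
c     = [0.5; 2.0];                  % regime intercepts c1, c2
k     = 0.3;  sigma = 1.0;           % AR coefficient and innovation std
P     = [0.9 0.2; 0.1 0.8];          % P(j,i) = p_ij; columns sum to one
xi    = [0.5; 0.5];                  % flat starting probabilities xi_0
loglik = 0;
for t = 2:numel(y)
    eta    = normpdf(y(t), c + k*y(t-1), sigma);  % regime densities (4.40)
    xipred = P*xi;                                % xi_{t|t-1}, as in (4.46)
    f      = sum(xipred .* eta);                  % conditional density (4.41)
    xi     = (xipred .* eta) / f;                 % filtered xi_{t|t} (4.42)/(4.45)
    loglik = loglik + log(f);                     % accumulates (4.43)
end
fprintf('Pr(s_T = 1 | Omega_T) = %.3f, log likelihood = %.3f\n', xi(1), loglik);

A recursion of this kind, extended to three regimes, underlies the estimation carried out by the program described in Chapter 6; maximizing loglik over θ by numerical optimization yields the non-Bayesian estimates.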

4.4 Regime switching in stock market returns
In an influential paper, [HAMI89] suggested Markov switching techniques as a method for modeling non-stationary time series. In the [HAMI89] approach, the parameters are viewed as the outcome of a discrete-state Markov process. For example, expected returns in the stock market may be subject to occasional, discrete shifts. An extension of Hamilton's approach is used here to describe and analyze stock market returns. The Markov switching technique allows a variety of interesting new questions to be posed. Can we distinguish distinct regimes in stock market returns? How do the regimes differ? How frequent are regime switches and when do they occur? Are returns predictable, even after accounting for regime switches? Are regime switches predictable? The answers to these questions give a new set of stylized facts about stock market returns.

4.4.1 Are there regimes in stock market returns?
In this section it is examined whether there is evidence of distinct regimes in stock market returns. In a formal sense, regime switching econometric models refer to a situation in which stock market returns are drawn from two different distributions, with some well-defined stochastic process determining the likelihood that each return is drawn from a given distribution. Regime switching in returns could arise in several ways; for example, the economy's endowment may switch between high economic growth and low


economic growth, accounting for several features of stock market returns, such as leptokurtosis and mean reversion. Previous papers that have estimated Markov switching models on stock market returns have faced a problem in testing the null hypothesis of no switching against the alternative hypothesis of switching. The problem is that the transition probabilities are not identified under the null hypothesis of no switching. Under these circumstances, the asymptotic distributions of the likelihood ratio, Lagrange multiplier and Wald tests are non-standard. Recent econometric work by Hansen [HANS92] has suggested a method for calculating the non-standard asymptotic distribution. [GARC92] shows how Hansen's work can be applied to the problem of testing for Markov switching by treating the transition probabilities as nuisance parameters. To determine whether there is switching in stock market returns, both cases (switching against no switching) have to be considered and, after testing them, one determines which fits best. According to [SCHA97], the conclusion for US stock market returns is clear: there is strong evidence of regime switching. The evidence for switching is robust to different specifications of the nature of switching, and the conclusion does not appear to be sensitive to the estimation period.

4.4.2 Different specifications of switching
The nature of the switching one finds in the data will depend on the economic forces that give rise to switching behavior. Suppose, for example, that the source is time variation in the uncertainty of excess stock returns. If this is variation in the diversifiable component of returns, then mean returns might be the same across regimes while their volatility differs. However, if the non-diversifiable risk component is the source of this variation, we might expect the high-variance state to have higher mean returns. On the other hand, [BLAC76] suggested that the leverage effect might cause higher variances to be associated with lower average returns. These differences make it interesting to consider how the behavior of mean returns and their variances are related across regimes.

Switching in means
The first specification to examine is the one in which stock market returns are drawn from two distributions which differ only in their means. The most striking feature of the estimates is the enormous difference in mean returns between the two regimes.


Switching in variances
Another specification that can be examined is the one in which stock market returns are drawn from two distributions that differ only in their variances.

Switching in means and variances
The third specification that can be examined is the one in which stock market returns are drawn from distributions that differ in both means and variances.

[SCHA97] studies these three cases and concludes that the stock market is characterized by a state in which risk is relatively low and investors earn more than they would by holding treasury bills, and a state in which risk is substantially higher and investors lose money. The negative returns are associated with a higher variance than the positive returns. The diagnostic tests show no evidence of omitted ARCH effects. This is particularly interesting because stock market returns show strong ARCH effects, as has been widely documented. These specifications assume that a first-order Markov chain can adequately describe stock market returns. In other words, they assume that the current state is a function only of the previous state, not of the state two periods ago or further back. The results of [SCHA97] will be applied to the prediction program developed in this project, which will be explained in Chapter 6. Hence, the model will be forecast with a first-order autoregressive model (AR(1)).

4.4.3 Multivariate specifications
Several recent studies have found evidence that stock market returns can be predicted using macroeconomic variables. Several other studies have provided evidence that stock market returns do not follow a random walk and, more specifically, that future returns can be predicted on the basis of returns over the previous several years. A natural question is whether, after controlling for switching, there is still evidence that stock market returns can be predicted using macroeconomic variables. Fads models provide one possible economic motivation for Markov switching processes in which stock market returns are predictable. In a world with fads, [CUTL91] shows that returns will be correlated with lagged values of the price/dividend ratio. If allowance is made for state-dependent heteroscedasticity, this yields a switching model in which the price/dividend ratio predicts returns. The predictability of returns can also come from time variation in the risk premium in an efficient market.


A second economic motivation for a Markov switching process in which returns are predictable is regime switching in endowments. Such models have been proposed by [CECC90] and [KAND90] as ways of accounting for stock market anomalies such as mean reversion. [NORD96] provide simulation evidence that regime switching in endowments in a Lucas asset pricing model will lead to regime switching in returns. The non-linear predictability of returns could also arise in a switching specification as a result of stochastic bubbles of the kind proposed by [BLAN82].

4.4.4 Time variation in transition probabilities
This section examines whether the transition probabilities vary over time; in particular, whether the transition probabilities are influenced by the price/dividend ratio. In addition to econometric novelty, there are two motivations for allowing transition probabilities to depend on the price/dividend ratio. First, the price/dividend ratio appears to influence future returns. It is possible that this apparent predictive power arises because the price/dividend ratio is useful in determining the state in the next period. Second, a 'bad' state for investors involves lower returns and greater risk. It would be highly desirable for investors to be able to predict which regime will occur in the following period, based on currently available information. [SCHA97] states, after studying both options, that although the effect of the price/dividend ratio on the transition probabilities is provocative, it is not very precisely measured. The 'bad' state is therefore hard to predict, at least using the price/dividend ratio.


4.5 Application of the Markov Switching Autoregressive Model in a financial time series
[ENGE90] found that a Markov switching model of exchange rates generates better forecasts than a random walk. [YUAN11] proposed an exchange rate forecasting model which combines the multi-state Markov-switching model with smoothing techniques. Markov switching autoregressive models admit a great variety of specifications: they can be applied where the autoregressive parameters, the mean or the intercepts are regime-dependent. Here a point forecasting method for the Markov switching autoregressive model is presented. Three regimes were defined in the model developed and described in Chapter 6. The regimes differentiate whether the period is stable, pseudo-turbulent or turbulent. The Markov switching model was introduced by [HAMI89]. A Markov switching autoregressive model (MS-AR) of three states with an AR process of order 1 (AR(1)) is written as:

y_t = c_i + k_i y_{t−1} + ε_{ti}    when s_t = i,  i = 1, 2, 3

with εt1 ∼ N (0, σ12), εt2 ∼ N (0, σ22), εt3 ∼ N (0, σ32). Therefore, the parameters that need to be estimated are c1, c2, c3, k1, k2, k3, σ12, σ22 and σ32. This model is applied to the monthly range variation of Santander bank quotes between January 2003 and March 2012. Each period is categorized as stable, pseudo-turbulent or turbulent and the predictions are made afterwards. The following graph shows these range variations and their respective period type.


Figure 4-1. Monthly range variation of Santander bank quotes between January 2003 and March 2012
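To illustrate the data-generating process behind a series of this kind, the following minimal MATLAB sketch simulates a three-state MS-AR(1); all parameter values are invented for illustration and are not the estimates reported in tables 4.1 and 4.2.

% Minimal sketch: simulate a three-state MS-AR(1) series.
% Parameter values are illustrative, not estimates.
rng(1);
T   = 111;                            % about one point per month, 2003-2012
c   = [2 5 12];                       % intercepts: stable, pseudo, turbulent
k   = [0.3 0.4 0.5];                  % AR(1) coefficients per regime
sig = [1 2.5 6];                      % innovation std deviations per regime
P   = [0.90 0.08 0.02;                % P(i,j) = Pr(s_t = j | s_{t-1} = i)
       0.20 0.70 0.10;
       0.05 0.25 0.70];
s = zeros(T,1); y = zeros(T,1);
s(1) = 1; y(1) = c(1);
for t = 2:T
    s(t) = find(rand <= cumsum(P(s(t-1),:)), 1);       % draw next regime
    y(t) = c(s(t)) + k(s(t))*y(t-1) + sig(s(t))*randn; % AR(1) step in regime
end
plot(y); xlabel('month'); ylabel('range variation (%)');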

The results of the autoregressive model, applied to the monthly range variation of Santander quotes between January 2003 and March 2012, are shown in the following tables, which were obtained with the program developed in MATLAB. The parameters are first estimated with zero drift (c1 = 0, c2 = 0, c3 = 0), and then with nonzero intercepts (nonzero drift).

Table 4.1. Estimation of the parameters, for monthly range variation of Santander quotes between January 2003 and March 2012, with zero drift


Table 4.2. Estimation of the parameters, for monthly range variation of Santander quotes between January 2003 and March 2012, with nonzero drift

Similarly, [MOST12] outlines techniques for point forecasting in a Markov switching autoregressive model with two regimes, applied to the fluctuations of the U.S. Dollar/Euro exchange rate. The regimes describe the periods of downtrend and the periods of uptrend of the exchange rate. The fluctuations of the U.S. Dollar/Euro exchange rate show jumps in their behavior, and Markov switching models, by changing regime, keep themselves up to date when such jumps arise in the time series data. Hence, this model can be useful for modeling and forecasting these data, which is also confirmed by that study. The findings demonstrated that the MS-AR achieved superior forecasts relative to the random walk with drift. The results of the out-of-sample forecast indicated that the U.S. Dollar/Euro exchange rate would be rising from May 2011 to May 2013.



Chapter 5

5. Bayesian Data Analysis
5.1 Introduction
Statistics can be defined as the discipline that provides a methodology to collect, organize, summarize and analyze a set of data. Data analysis can be divided into two branches: exploratory data analysis and confirmatory data analysis. The former is used to represent, describe and analyze a set of data through simple methods in the first stages of statistical analysis. The latter is applied to make inferences from data, based on probability models. In the same way, confirmatory data analysis is divided into two branches depending on the adopted approach. The first one, known as frequentist, is used to make inferences from data resulting from a sampling through classical methods. The second branch, known as Bayesian, goes further in the analysis and adds to those data the prior knowledge that the researcher has about the problem treated. [SALG07]

Figure 5-1. Data analysis

As far as Bayesian analysis is concerned and according to [GELM04], the process can be divided into the following three steps:

- Set up a full probability model, through a joint probability distribution for all observable and unobservable quantities in the problem.

- Condition on observed data, obtaining the posterior distribution.

- Finally, evaluate the fit of the model and the implications of the resulting posterior distribution.

f (θ, y), known as the joint probability distribution, is obtained by means of

f (θ, y) = f (y | θ) f (θ)    (5.1)

where y is the set of sampled data (when there are several parameters, θ denotes the corresponding parameter vector). So this distribution is the product of two densities that are referred to as the sampling distribution f (y | θ) and the prior distribution f (θ). The sampling distribution, as its name suggests, is the probability model that the researcher assigns to the statistic (or set of statistics) to be studied after the data have been observed. Here an important problem stands out in relation to the parametric approach: the probability model that the researcher chooses might not be adequate. The nonparametric approach overcomes this inconvenience, as will be seen later. When y is considered fixed, so that (5.1) is a function of θ, the sampling distribution is called the likelihood function and obeys the likelihood principle, which states that, for a given sample of data, any two probability models f (y | θ) with the same likelihood function yield the same inference for θ. The prior distribution does not depend upon the data. Accordingly, it contains the information and the knowledge that the researcher has about the situation or problem to be solved. When there is no previous significant population from which the engineer can take his knowledge, that is, when the researcher has no prior information about the problem, a noninformative prior distribution must be used in the analysis in order to let the data speak for themselves. Hence, it is assumed that the prior knowledge will have very little importance in the results. But most noninformative priors are “improper”, in that they do not integrate to 1, and this fact can cause problems. In these cases it is necessary to be sure that the posterior distribution is proper. Another possibility is to use an informative prior distribution but with an insignificant weight (around zero) associated to it. Though the prior distribution can take any form, it is common to choose particular classes of priors that make computation and interpretation easier. These are the conjugate priors. A conjugate prior distribution is one which, when


combined with the likelihood function, gives a distribution that falls in the same class of distributions as the prior. Furthermore, and according to [KOOP03], a natural conjugate prior has the additional property that it has the same form as the likelihood. But it is not always possible to find this kind of distribution, and the researcher then has to manage many distributions to be able to give expression to his prior knowledge about the problem. This is another handicap that the nonparametric approach reduces. There are three different points of view, corresponding to different styles of Bayesians, on how to choose the prior distribution:

- Classical Bayesians consider that the prior is a necessary evil, and priors that interject the least information possible should be chosen.

- Modern parametric Bayesians consider that the prior is a useful convenience, and priors with desirable properties, such as conjugacy, should be chosen. They remark that, given a distributional choice, prior hyper-parameters that interject the least information possible should be chosen.

- Subjective Bayesians give essential importance to the prior, in the sense that they consider it a summary of old beliefs. So prior distributions based on previous knowledge (either the results of earlier studies or non-scientific opinion) should be chosen.

Returning to the Bayesian data analysis process, simply conditioning on the observed data and applying the Bayes Theorem, the posterior distribution, namely f (θ | y), yields:

f (θ | y) = f (θ, y) / f (y) = f (θ) f (y | θ) / f (y)    (5.2)-(5.3)

where

f (y) = ∫ f (θ) f (y | θ) dθ

is known as the prior predictive distribution, since it is not conditional upon a previous observation of the process and is applied to an observable quantity. An equivalent form of the posterior distribution displayed above omits the prior predictive distribution, since it does not involve θ and the interest is


based on learning about θ. So, with y fixed, it can be said that the posterior distribution is proportional to the joint probability distribution f (θ, y). Once the posterior distribution is calculated, some kind of summary measure will be required to estimate the uncertainty about the parameter θ. This is due to the fact that the posterior distribution is a high-dimensional object, and its direct use is not practical for a problem. The measure that summarizes the posterior distribution can be the posterior mean, mode, median or variance, among others. Its choice will depend on the requirements of the problem. So the posterior distribution has a great importance, since it lets the researcher manage the uncertainty about θ and provides him information about it, taking into account both his prior knowledge and the data collected from sampling on that parameter. According to [MATE06], it is not difficult to deduce that the posterior inference will coincide with the non-Bayesian one as long as the estimate that the researcher gives to the parameter θ is the same as the one resulting from the sampling. Once the data have been observed, a new unknown observable quantity ỹ can be predicted for the same process through the posterior predictive distribution, namely f (ỹ | y):

f (ỹ | y) = ∫ f (ỹ, θ | y) dθ = ∫ f (ỹ | θ, y) f (θ | y) dθ = ∫ f (ỹ | θ) f (θ | y) dθ    (5.6)

In conclusion, the basic idea is to update the prior distribution f (θ) through the Bayes theorem by observing the data, in order to get a posterior distribution f (θ | y). Then a summary measure or a prediction for new data can be obtained from f (θ | y). Table 5.1 reflects what has been said.

Table 5.1. Distributions in Bayesian Data Analysis


5.2 Bayesian Analysis for Normal and other distributions
5.2.1 Univariate Normal distribution
The basic model to be discussed concerns an observable variable, normally distributed with unknown mean µ and variance σ2:

y | µ, σ^2 ∼ N(µ, σ^2)    (5.7)

The likelihood function for a single observation is

f (y | µ, σ^2) = (1/√(2πσ^2)) exp[−(y − µ)^2 / (2σ^2)]    (5.8)

This means that the likelihood function is proportional to a Normal distribution, omitting those terms that are constant. Consider n independent observations y1, y2, ..., yn. According to the previous section, the parameters to estimate are µ and σ2:

θ = (θ1, θ2) = (µ, σ^2)    (5.9)

A full probability model must be set up through a joint probability distribution:

f (θ, (y1, y2, ..., yn)) = f (θ, y) = f (y | θ) f (θ)    (5.10)

The likelihood function for a sample of n observations in this case is:

f (y | µ, σ^2) = (2πσ^2)^(−n/2) exp[−Σ_{i=1}^{n} (y_i − µ)^2 / (2σ^2)]    (5.11)

As recommended previously, a conjugate prior will be chosen; in fact, it will be a natural conjugate prior. According to [GELM04], this likelihood function suggests a conjugate prior distribution of the form

f (θ) = f (µ, σ^2) = f (µ | σ^2) f (σ^2)    (5.12)

where the marginal distribution of σ2 is the Scaled Inverse-χ2 and the conditional distribution of µ given σ2 is Normal [SALG07]:

µ | σ^2 ∼ N (µ0, σ^2 V0)    (5.13)

σ^2 ∼ Inv-χ^2 (ν0, s0^2)    (5.14)


So the joint prior distribution is:

f (θ) = f (µ, σ^2) = f (µ | σ^2) f (σ^2) ∝ N-Inv-χ^2 (µ0, s0^2 V0; ν0, s0^2)    (5.15)

Its four parameters can be identified as the location and scale of µ and the degrees of freedom and scale of σ2, respectively. As a natural conjugate prior was employed, the posterior joint distribution will have the same form as the prior. So, conditioning on the data, and according to the Bayes Theorem:

f (θ | y) = f (µ, σ^2 | y) ∝ f (y | µ, σ^2) f (µ, σ^2) ∝ N-Inv-χ^2 (µ1, s1^2 V1; ν1, s1^2)    (5.16)

where it can be shown that

µ1 = (V0^−1 + n)^−1 (V0^−1 µ0 + n ȳ)    (5.17)
V1 = (V0^−1 + n)^−1    (5.18)
ν1 = ν0 + n    (5.19)
ν1 s1^2 = ν0 s0^2 + (n − 1) s^2 + [V0^−1 n / (V0^−1 + n)] (ȳ − µ0)^2    (5.20)

All these equations show that Bayesian inference combines prior and sample information. The first equation means that the posterior mean µ1 is a weighted mean of the prior mean µ0 and the empirical mean ȳ, divided by the sum of their respective weights, where these are represented by V0^−1 and the sample size n. The second equation represents the weight that the posterior mean carries, and it can be seen as a compromise between the sample size and the significance given to the prior mean. The third equation indicates that the degrees of freedom of the posterior variance are the sum of the prior degrees of freedom and the sample size. That is, the prior degrees of freedom can be understood as a fictitious sample size on which the expert's prior information is based. The last equation expresses the posterior sum of square errors as a combination of the prior and empirical sums of square errors, plus a term that measures the conflict between prior and sample information. The marginal posterior distributions are:

µ | σ^2, y ∼ N (µ1, σ^2 V1)    (5.21)

σ^2 | y ∼ Inv-χ^2 (ν1, s1^2)    (5.22)

Integrating out σ2, the marginal for µ will be a t-distribution:

µ | y ∼ t_{ν1} (µ1, s1^2 V1)    (5.23)

An application to the Spanish stock market would be the following, supposing that the monthly close values of the Ibex 35 are normally distributed. Taking the values at which the Spanish index closed during the first two weeks of January 2006, it can be shown that the mean was 10893.29 and the standard deviation was 61.66. So the non-Bayesian approach would infer a Normal distribution with the previous mean and standard deviation. Guessing that the Ibex 35 would decrease slightly in January, the mean close value at the end of the month would be around 10870 and, hence, the standard deviation would be higher, around 100. Then, according to the previous formulas, the posterior parameters would be µ1 = 10872, V1 = 0.0091, ν1 = 110 and s1 = 96. This means that there is a difference of almost 20 points between the Bayesian estimate and the non-Bayesian one for the mean close value of January. Once January had passed, comparing both results showed that the Bayesian estimates were closer to the actual mean close value and standard deviation.
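The update can be reproduced with a few lines of MATLAB. The following sketch implements (5.17)-(5.20); the prior settings V0 = 0.01 and ν0 = 100 are assumptions chosen to be consistent with the posterior figures quoted above.

% Minimal sketch of the conjugate Normal update (5.17)-(5.20) for the
% Ibex 35 example; prior settings are assumptions consistent with the text.
n    = 10;  ybar = 10893.29;  s2 = 61.66^2;  % sample: first two weeks of January
mu0  = 10870;  V0 = 0.01;                    % prior mean and scale
nu0  = 100;    s02 = 100^2;                  % prior df and scale (std about 100)
V1   = 1/(1/V0 + n);                         % (5.18)
mu1  = V1*(mu0/V0 + n*ybar);                 % (5.17)
nu1  = nu0 + n;                              % (5.19)
s12  = (nu0*s02 + (n-1)*s2 + ((1/V0)*n/(1/V0 + n))*(ybar - mu0)^2)/nu1;  % (5.20)
fprintf('mu1 = %.0f, V1 = %.4f, nu1 = %d, s1 = %.0f\n', mu1, V1, nu1, sqrt(s12));

Running the sketch reproduces, up to rounding, the posterior values quoted above.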

5.2.2 Other distributions

As has just been done with the Normal distribution, a Bayesian analysis for other distributions could be carried out. For instance, the exponential distribution is commonly used in reliability analysis. Because this project deals with the Normal distribution for the likelihood, the analysis with other distributions will not be explained in detail. Table 5.2 shows the conjugate prior and posterior distributions for other likelihood distributions. More details can be found in [GELM04].


Table 5.2. Conjugate distributions for other likelihood distributions

5.3 Nonparametric Bayesian
To overcome the limitations mentioned before, the nonparametric approach is the one that manages to reduce the restrictions of the parametric approach. This kind of analysis can be performed through the so-called Dirichlet Process, which allows expressing in a simple way the prior distribution of F, where F is the distribution function of the variable under study. This process has a parameter, called α, which can be normalized into a probability distribution. According to [MATE06], a Dirichlet Process for F (t) requires:


- A previous proposal for F (t), denoted F0 (t), which corresponds to the distribution function that represents the prior knowledge the engineer has, F0 (t) = E[F (t)].

- A measure of the confidence in the previous proposal, denoted by M, whose values can vary between 0 and ∞, depending on whether there is total confidence in the data or in the previous proposal, respectively.

It can be demonstrated that the posterior distribution for F (t), after sampling n data, is given by

F̂_n (t) = p_n F0 (t) + (1 − p_n) F_n (t)    (5.25)

where F_n (t) is the empirical distribution function and p_n = M / (M + n).

With this approach, not only is the limitation of the parametric approach related to the probability model of the variable under study avoided, since no distributional hypothesis is required, but it also allows us to confer a quantified importance on the prior knowledge. The prior knowledge is given by the specialist, depending on his confidence in the certainty of that knowledge.
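A minimal MATLAB sketch of the posterior estimate (5.25) follows, assuming a Normal prior proposal F0 and a small toy sample; all values are illustrative.

% Minimal sketch of the Dirichlet process posterior mean (5.25):
% prior proposal F0 mixed with the empirical cdf Fn, weight pn = M/(M+n).
data = [1.8 2.5 0.9 3.2 2.1 1.4 2.8 0.7 1.9 2.3];  % toy sample
M    = 5;                               % confidence in the prior proposal
n    = numel(data);
pn   = M/(M + n);                       % weight of the prior proposal
t    = linspace(0, 4, 200);             % evaluation grid
F0   = normcdf(t, 2, 1);                % assumed prior proposal F0(t)
Fn   = mean(bsxfun(@le, data(:), t), 1);% empirical distribution function
Fhat = pn*F0 + (1 - pn)*Fn;             % posterior estimate of F(t)
plot(t, Fhat);

With M = 0 the estimate reduces to the empirical distribution function (total confidence in the data); as M grows, the estimate leans toward the prior proposal F0.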

5.4 Bayesian Numerical Computation
MCMC Bayesian computation arose in the 1980s from two independent sources: the statistical physics heritage, as represented by [GEMA84], and the EM heritage, as represented by [TANN87]. A synthesis of these two traditions occurred in the important work of [GELF90]. Like the former, they employed the Gibbs sampling version of MCMC. Like the latter, they focused on traditional statistical models and relied on the use of latent variables to create iterative sampling schemes. Their paper provided many examples to illustrate the ease of use and effectiveness of iterative sampling, and clarified the relationship between the data augmentation algorithm and the Gibbs sampler. The framing of data augmentation as MCMC also raised some new and interesting theoretical issues in the analysis of the MCMC output. For example, in data augmentation the estimate of an expectation E (g(θ) | y) is given by:

(1/M) Σ_{i=1}^{M} E[g(θ) | y, z_i]

where z_i are the currently sampled values of the latent variable z. [GELF90] refer to the use of this estimate, instead of the usual estimate

(1/M) Σ_{i=1}^{M} g(θ_i),

as Rao-Blackwellization. [GELF90] reasoned that if the z_i are independently drawn, as in a final iteration of the data augmentation algorithm, then clearly Rao-Blackwellization will reduce estimation error. They did not analyze the situation where the samples are dependent, as when they are generated by the Gibbs sampling process. The superiority of the Rao-Blackwellized estimator in the two-component Gibbs sampler was later established in [LIU_94]. After the publication of Gelfand and Smith's influential paper, many mainstream statisticians began to adopt MCMC in their own research, and the results in these early applications quickly established MCMC as a dominant methodology in Bayesian computation. However, it should be noted that in any given problem there can be a great many ways to formulate an MCMC sampler. In simulating an Ising model [SWEN87], for example, one can try to flip each spin conditional on the rest, or flip a whole set of spins connected by (artificially introduced) bonds that are sampled alternately with the spins. The effectiveness of the [SWEN87] algorithm in the Ising model does not simply stem from the fact that it is a Gibbs sampler, but rather depends critically on the clever design of the specific form of the sampler. A large part of the success of MCMC in the early 1990s was based on versions of Gibbs samplers that were designed to exploit the special structure of statistical problems in the style of the EM and data augmentation algorithms. Thus, the emergence of MCMC in mainstream Bayesian inference has depended as much on the introduction of the mathematically elegant MCMC formalism as on the realization that the structure of many common statistical models can be fruitfully exploited to design versions of the algorithm that are feasible and effective for these models. The appearance of [GELF90] marked the end of the emergence of the MCMC approach to the study of posterior distributions, and the beginning of an exciting period, lasting to this day, of the application of this approach to a vast array of problems, including inference in non-parametric problems. Advanced techniques have also been developed in this framework to accelerate convergence. Statisticians, no longer laggards in MCMC methodology, now rival physicists in its advancement. More information can be found in [TANN10].


Bayesian approaches are natural in the analysis of financial models. It has long been recognized that Bayesian thinking is relevant to fundamental questions about risk and uncertainty. Many modern financial models have a sequential structure expressed in terms of the dynamics of latent variables. In these models, the basic Bayesian advantage of coherent assessment of uncertainty is coupled with powerful computational methods. An example of Bayesian financial models is studied in [HORE08], which reviews two approaches to the Bayesian analysis of sequential models and attempts to illustrate ways to apply them to models derived from financial and economic theory. Particle filtering methods allow one to update the inferences about the state and, in some cases, the parameters quickly and dynamically. The advantage of this approach is its simplicity and wide applicability, without asking the user to make difficult choices about algorithm details. More information about Bayesian analysis can be found on the International Society for Bayesian Analysis (ISBA) web page: http://bayesian.org/. The society promotes the development and application of Bayesian analysis useful in the solution of theoretical and applied problems in science, industry and government.

5.5 MCMC algorithms
MCMC is a strategy for generating samples x(i) while exploring the state space X using a Markov chain mechanism. This mechanism is constructed so that the chain spends more time in the most important regions. In particular, it is constructed so that the samples x(i) mimic samples drawn from the target distribution p(x). MCMC is used when samples from p(x) cannot be drawn directly, but p(x) can be evaluated up to a normalizing constant. In Bayesian statistics the posterior distribution p(x | y) contains all relevant information on the unknown parameters x given the observed data y. All statistical inference can be deduced from the posterior distribution by reporting appropriate summaries. This typically takes the form of evaluating integrals of the type:

J = ∫ f (x) p (x | y) dx

of some function f (x) with respect to the posterior distribution. The problem is that these integrals are usually impossible to evaluate analytically. Moreover, when the parameter is multidimensional, even numerical methods may fail. Over the last ten years a barrage of literature has appeared concerned with the evaluation of such integrals by methods collectively known as Markov chain Monte Carlo (MCMC) simulation. The underlying rationale of MCMC is to set up a Markov chain in x with ergodic distribution p (x | y). Starting with some initial state x(0), one simulates M transitions under this Markov chain and records the simulated states x(j), j = 1, ..., M. The ergodic sample average

Ĵ = (1/M) Σ_{j=1}^{M} f (x(j))

converges to the desired integral J (subject to some technical conditions), i.e., Ĵ provides an approximate evaluation of J. The art of MCMC is to set up a suitable Markov chain with the desired posterior as stationary distribution, and to judge when to stop the simulation, i.e., to diagnose when the chain has practically converged. In many standard problems it turns out to be surprisingly easy to define a Markov chain with the desired stationary distribution. [MULL08]

Markov chain Monte Carlo (MCMC) methods use computer simulation of Markov chains in the parameter space. The Markov chains are defined in such a way that the posterior distribution in the given statistical inference problem is the asymptotic distribution. This allows using ergodic averages to approximate the desired posterior expectations. Several standard approaches to define such Markov chains exist, including Metropolis-Hastings, Gibbs sampling and reversible jump. Using these algorithms it is possible to implement posterior simulation in essentially any problem that allows pointwise evaluation of the prior distribution and likelihood function.

5.5.1 The Metropolis-Hastings algorithm
The Metropolis-Hastings (MH) algorithm is the most popular MCMC method [HAST70], [METR53]. Most practical MCMC algorithms can be interpreted as special cases or extensions of this algorithm. An MH step with invariant distribution p (x) and proposal distribution q (x⋆ | x) involves sampling a candidate value x⋆ given the current value x according to q (x⋆ | x). The Markov chain then moves towards x⋆ with acceptance probability

A (x, x⋆) = min{1, p (x⋆) q (x | x⋆) / [p (x) q (x⋆ | x)]}    (5.30)

otherwise it remains at x.

The pseudo-code [ANDR03] is shown in figure 5.2, while figure 5.3 shows the results of running the MH algorithm with a Gaussian proposal distribution q (x⋆| x(i)) = N (x(i), 100)

(5.31)

and a bimodal target distribution p(x) ∝ 0.3 exp(−0.2x2) + 0.7 exp( −0.2(x − 10)2)

(5.32)

for 5000 iterations. As expected, the histogram of the samples approximates the target distribution.

Figure 5-2. Metropolis-Hastings algorithm


Figure 5-3. Target distribution and histogram of the MCMC samples at different iteration points
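The run shown in figures 5-2 and 5-3 can be reproduced with a minimal MATLAB sketch; since the Gaussian proposal is symmetric, the q terms in (5.30) cancel.

% Minimal sketch of the MH run above: Gaussian proposal (5.31) with
% variance 100 (std 10), bimodal target (5.32), 5000 iterations.
p = @(x) 0.3*exp(-0.2*x.^2) + 0.7*exp(-0.2*(x - 10).^2);  % target, unnormalized
N = 5000;
x = zeros(N,1);                        % chain, started at x(1) = 0
for i = 1:N-1
    xstar = x(i) + 10*randn;           % candidate from N(x(i), 100)
    if rand < min(1, p(xstar)/p(x(i))) % acceptance probability (5.30)
        x(i+1) = xstar;                % accept candidate
    else
        x(i+1) = x(i);                 % reject: chain stays put
    end
end
hist(x, 50);                           % histogram approximates the target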

The MH algorithm is very simple, but it requires careful design of the proposal distribution q (x⋆ | x). Many MCMC algorithms arise by considering specific choices of this distribution. In general, it is possible to use suboptimal inference and learning algorithms to generate data-driven proposal distributions. The transition kernel for the MH algorithm is:

K_MH (x(i+1) | x(i)) = q (x(i+1) | x(i)) A (x(i), x(i+1)) + δ_x(i) (x(i+1)) r (x(i))    (5.33)

where r (x(i)) is the term associated with rejection,

r (x(i)) = ∫ q (x⋆ | x(i)) (1 − A (x(i), x⋆)) dx⋆    (5.34)

It is fairly easy to prove that the samples generated by the MH algorithm will mimic samples drawn from the target distribution asymptotically. By construction, K_MH satisfies the detailed balance condition

p (x(i)) K_MH (x(i+1) | x(i)) = p (x(i+1)) K_MH (x(i) | x(i+1))    (5.35)

and, consequently, the MH algorithm admits p(x) as invariant distribution. To show that the MH algorithm converges, one needs to ensure that there are no cycles (aperiodicity) and that every state that has positive probability can be reached in a finite number of steps (irreducibility). Since the algorithm always allows for rejection, it follows that it is aperiodic. To ensure irreducibility, one simply needs to make sure that the support of q(·) includes the support of p(·). Under these conditions, one obtains asymptotic convergence. If the space X is small, it is possible to use minorization conditions to prove uniform (geometric) ergodicity. The independent sampler and the Metropolis algorithm are two simple instances of the MH algorithm. In the independent sampler the proposal is independent of the current state,

q (x⋆ | x(i)) = q (x⋆)    (5.36)

Hence, the acceptance probability is

A (x(i), x⋆) = min{1, p (x⋆) q (x(i)) / [p (x(i)) q (x⋆)]}    (5.37)

This algorithm is close to importance sampling, but now the samples are correlated, since they result from comparing one sample to the other. The Metropolis algorithm assumes a symmetric random walk proposal

q (x⋆ | x(i)) = q (x(i) | x⋆)    (5.38)

and, hence, the acceptance ratio simplifies to

A (x(i), x⋆) = min{1, p (x⋆) / p (x(i))}    (5.39)
Some properties of the MH algorithm are worth highlighting. Firstly, the normalizing constant of the target distribution is not required. One only needs to know the target distribution up to a constant of proportionality. Secondly, although the pseudo-code makes use of a single chain, it is easy to simulate several independent chains in parallel. Lastly, the success or failure of the algorithm often hinges on the choice of proposal distribution. This is illustrated in figure 5.4. Different choices of the proposal standard deviation σ⋆ lead to very different results. If the proposal is too narrow, only one mode of p (x) might be visited. On the other hand, if it is too wide, the rejection rate can be very high, resulting in high correlations. If all the modes are visited while the acceptance probability is high, the chain is said to “mix” well.


Figure 5-4. Approximations obtained using the MH algorithm with three Gaussian proposal distributions of different variances

5.5.2 The Gibbs sampler

Suppose we have an n-dimensional vector x and expressions for the full conditionals p (x_j | x_1, ..., x_{j−1}, x_{j+1}, ..., x_n). In this case, it is often advantageous to use the following proposal distribution for j = 1, ..., n:

q (x⋆ | x(i)) = p (x⋆_j | x(i)_{−j})  if x⋆_{−j} = x(i)_{−j},  and 0 otherwise,

where x_{−j} denotes all the components of x except x_j. The corresponding acceptance probability is:

A (x(i), x⋆) = min{1, p (x⋆) q (x(i) | x⋆) / [p (x(i)) q (x⋆ | x(i))]} = 1

That is, the acceptance probability for each proposal is one and, hence, the deterministic scan Gibbs sampler algorithm is often presented as shown in figure 5.5. [ANDR03]

Figure 5-5. Gibbs sampler

Since the Gibbs sampler can be viewed as a special case of the MH algorithm, it is possible to introduce MH steps into the Gibbs sampler. That is, when the full conditionals are available and belong to the family of standard distributions (Gamma, Gaussian, etc.), one will draw the new samples directly. Otherwise, one can draw samples with MH steps embedded within the Gibbs algorithm. For n = 2, the Gibbs sampler is also known as the data augmentation algorithm, which is closely related to the expectation maximization (EM) algorithm. Directed acyclic graphs (DAGs) are one of the best-known application areas for Gibbs sampling. Here, a large-dimensional joint distribution is factored into a directed graph that encodes the conditional independencies in the model. In particular, if x_pa(j) denotes the parent nodes of node x_j, then:

p (x) = Π_j p (x_j | x_pa(j))
115

It follows that the full conditionals simplify as follows:

p (x_j | x_{−j}) = p (x_j | x_pa(j)) Π_{k∈ch(j)} p (x_k | x_pa(k))

where ch(j) denotes the children nodes of x_j. That is, one only needs to take into account the parents, the children and the children's parents. This set of variables is known as the Markov blanket of x_j. This technique forms the basis of the popular software package for Bayesian updating with Gibbs sampling (BUGS), which will be explained later in the chapter. Sampling from the full conditionals, with the Gibbs sampler, lends itself naturally to the construction of general-purpose MCMC software. It is sometimes convenient to block some of the variables to improve mixing.
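As a minimal illustration, the following MATLAB sketch runs a two-component deterministic scan Gibbs sampler for a standard bivariate Normal with correlation ρ, whose full conditionals are univariate Normals, so every draw is accepted.

% Minimal sketch of a two-component Gibbs sampler for a standard bivariate
% Normal with correlation rho; each full conditional is N(rho*other, 1-rho^2).
rho = 0.8;  N = 5000;
x = zeros(N,2);                        % chain, started at (0,0)
for i = 2:N
    x(i,1) = rho*x(i-1,2) + sqrt(1 - rho^2)*randn;  % draw x1 | x2
    x(i,2) = rho*x(i,1)   + sqrt(1 - rho^2)*randn;  % draw x2 | x1
end
plot(x(:,1), x(:,2), '.');             % scatter approximates the joint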

5.5.3 Reversible jump MCMC

Reversible jump MCMC attacks the more complex problem of model selection. Typical examples include estimating the number of neurons in a neural network, the number of splines in a multivariate adaptive regression splines (MARS) model, the number of sinusoids in a noisy signal, the number of lags in an autoregressive process, the number of components in a mixture, the number of levels in a change-point process, the number of components in a mixture of factor analyzers, the appropriate structure of a graphical model or the best set of input variables. Reversible jump is a mixture of MCMC kernels (moves). In addition to the split and merge moves, one could have other moves such as the birth of a component, the death of a component and a simple update of the locations. The various moves are carried out according to the mixture probabilities (bk, dk, mk, sk, uk), as shown in figure 5.6. In fact, it is the flexibility of including so many possible moves that can make reversible jump a more powerful model selection strategy than schemes based on model selection using a mixture indicator or diffusion processes using only birth and death moves. However, the problem with reversible jump MCMC is that engineering reversible moves is a very tricky, time-consuming task. For more information about reversible jump MCMC see [ANDR03].


Figure 5-6. Generic reversible jump MCMC


5.6 Bayesian Software

5.6.1 WinBUGS
The BUGS (Bayesian inference Using Gibbs Sampling) project is concerned with flexible software for the Bayesian analysis of complex statistical models using Markov chain Monte Carlo (MCMC) methods. The project began in 1989 in the MRC Biostatistics Unit, Cambridge, and led initially to the 'Classic' BUGS program, and then on to the WinBUGS software developed jointly with the Imperial College School of Medicine at St Mary's, London. Therefore, WinBUGS is part of the BUGS project, which aims to make practical MCMC methods available to applied statisticians. WinBUGS can either use a standard 'point-and-click' Windows interface for controlling the analysis, or construct the model using a graphical interface called DoodleBUGS. WinBUGS is a stand-alone program, although it can be called from other software. WinBUGS can be downloaded for free from www.mrc-bsu.cam.ac.uk/bugs/. WinBUGS is an interactive Windows version of the BUGS program for Bayesian analysis of complex statistical models using Markov chain Monte Carlo (MCMC) techniques. WinBUGS allows models to be described using a slightly amended version of the BUGS language, or as Doodles (graphical representations of models), which can, if desired, be translated to a text-based description. The BUGS language is more flexible than the Doodles. A manual can be found in [SPIE03]. More information on the program is available in [LUNN00]. In this project WinBUGS will be used to estimate parameters for a financial series prediction. A Bayesian Markov chain Monte Carlo analysis will be computed using this software, as suggested by [MART07], which uses WinBUGS to predict influenza critical periods. A complete WinBUGS program consists of three parts: Model Specification, Data Input, and Starting Values. The model is based on a Markov chain with three different states: stable, pseudo-turbulent and turbulent. The user selects the financial data that he desires to study and the starting values. The option of selecting the starting values allows the previous knowledge of the user to be included and then used in the prediction. Therefore, the knowledge of a trader can be exploited, resulting in a better prediction. Figure 5.7 shows the WinBUGS window when performing the Bayesian prediction for the Santander bank monthly range variations between January 2003 and March 2012. This window pops up when MATLAB is asked to execute the prediction. This is done through the MatBUGS interface that is explained in the next section.


Figure 5-7. WinBUGS output for Santander three states prediction.

5.6.2 MatBUGS

MatBUGS is a MATLAB interface for WinBUGS and OpenBUGS, which are programs for Gibbs sampling applied to hierarchical Bayesian models. The program developed in this project is written in MATLAB. Hence, in this project MatBUGS will be used to connect with WinBUGS. Therefore, it will not be necessary to open WinBUGS: all the prediction can be made from a single MATLAB-based program. However, it will be necessary to install both MATLAB and WinBUGS on the computer.


MatBUGS automatically generates the script file from the model, calls WinBUGS, reads in the results (using bugs2mat), and then performs simple convergence diagnostics. Thus it provides complete functionality. For more information about MatBUGS see [MATB12]. MatBUGS has been integrated into the program developed in this project. Hence, the MATLAB program is able to perform a Bayesian prediction. Figure 5.8 shows the results of applying the WinBUGS approach to the Santander bank monthly range variations between January 2003 and March 2012. The initial values used to run the predictions, which are seen in the figure, can be modified, thereby including the previous knowledge of the user.

Figure 5-8. Bayesian results in the MATLAB interface for Santander three states prediction.
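A call of the kind behind figures 5.7 and 5.8 might look like the following MATLAB sketch; the model file name, data fields, initial values and monitored parameters are hypothetical placeholders, not the actual files of the project.

% Minimal sketch of a MatBUGS call; file names, data fields and monitored
% parameters are hypothetical placeholders.
R = 10*rand(111,1);                          % stand-in for monthly range variations
dataStruct = struct('y', R, 'T', numel(R));  % data passed to the BUGS model
init1 = struct('c', [1 3 8]);                % starting values for chain 1
[samples, stats] = matbugs(dataStruct, ...
    fullfile(pwd, 'msar3.txt'), ...          % hypothetical BUGS model file
    'init',          {init1}, ...
    'nChains',       1, ...
    'nburnin',       1000, ...
    'nsamples',      5000, ...
    'monitorParams', {'c', 'tau'}, ...
    'view',          0);                     % run WinBUGS without the GUI
disp(stats.mean);                            % posterior means of the monitors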


Chapter 6

6. Methodology
One of the most complicated challenges in this project was to initially define the scope of the investigation. Trying to predict financial events of an extreme nature in the economic field has strengthened the importance of selecting good cases in order to draw the best possible conclusions. A big challenge was to focus on a specific field, given the tremendous number of possibilities for developing a financial prediction methodology; it was hard to pick the best one from all the options. This project has led to a MATLAB program that is able to analyze and predict in some way the state of the market. The program differentiates three possible states for a financial series: stable, pseudo-turbulent and turbulent periods. The capability to detect the current financial state and predict what will happen in the immediate future gives an investor a great advantage in choosing where to invest his money. The way to invest is different depending on the state of the market. Therefore, an investor can benefit from knowing the type of period of a specific financial stock, index, commodity, currency, etc. Firstly, the incorporation of the data that will be analyzed is described. Then, there are two options to continue with the analysis: the non-Bayesian model and the Bayesian model. Each of these processes, with their respective predictions, will also be described in this chapter.

6.1 Incorporation of market quotations
To predict some financial stock, the first step is to include in the program databases with the prices of the financial series. These databases have been downloaded from the web page http://es.finance.yahoo.com/ and contain daily market quotations. These databases are read from MATLAB, and afterwards the analysis can start. Any kind of database with daily financial time series data could be adapted to the format read by this program so that the analysis could be performed.


The second step is to determine the type of data and the starting and ending dates of the data that will be analyzed. The choices that the program gives regarding the type of data to be analyzed are the following:

- Closing quotes: the data directly obtained from the database, representing the daily closing quote of the stock. The daily closing quotes plot for Santander from January 2003 to March 2012 is represented in figure 6.1.

Figure 6-1. Daily closing quotes plot for Santander from January 2003 to March 2012

- Monthly returns: the data analyzed is the month-to-month return. To calculate each return the following expression is used:

r_t = 100 (y_t − y_{t−1}) / y_{t−1}    (6.1)

where rt is the return of month t and yt is the data of the last day of month t,


for t = 2, ..., n, where n is the total number of months. With this calculation the monthly returns expressed in percentage are obtained. The monthly returns plot for Santander from January 2003 to March 2012 is represented in figure 6.2.

Figure 6-2. Monthly returns plot for Santander from January 2003 to March 2012

- Weekly returns: the data analyzed is the week-to-week return. To calculate each return expression (6.1) is used, but now yt is the data of the last day of each week. The weekly returns are expressed in percentage. The weekly returns plot for Santander from January 2010 to March 2012 is represented in figure 6.3.


Figure 6-3. Weekly returns plot for Santander from January 2010 to March 2012

- Daily returns: the data analyzed is the day-to-day return. To calculate each return expression (6.1) is used, but now yt represents the data of day t. The daily returns are expressed in percentage. The daily returns plot for Santander from March 2011 to March 2012 is represented in figure 6.4.


Figure 6-4. Daily returns plot for Santander from March 2011 to March 2012

- Monthly range variations: the monthly range variations are calculated from the daily returns. The maximum and minimum daily returns of each month are selected, and the difference between these numbers is the range variation for that specific month:

Rt = max{rt} − min{rt}    (6.2)

where Rt is the range variation of month t and rt represents the daily returns of month t. The monthly range variations give one point per month and are expressed in percentage. This is the type of data that has been studied throughout the project, because it gives the best results for analyzing the state of a financial time series: after trying the different choices for analyzing the data, this one turned out to be the best. Since there is only one point per month, studying the monthly range variations requires data from several years. Therefore, it is a medium- to long-term approach. The monthly range variations plot for Santander from January 2003 to March 2012 is represented in figure 6.5.


Figure 6-5. Monthly range variations plot for Santander from January 2003 to March 2012
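The computations in (6.1) and (6.2) can be sketched in MATLAB as follows, reusing the dates and closes vectors from the loading step above (variable names are illustrative).

% Minimal sketch of (6.1) and (6.2): daily returns in percent, then the
% monthly range variation as max minus min daily return within each month.
r = 100*diff(closes)./closes(1:end-1);       % daily returns (6.1), percent
[yy, mm] = datevec(dates(2:end));            % year and month of each return
[~, ~, g] = unique([yy mm], 'rows');         % chronological month index
Rt = accumarray(g, r, [], @max) - accumarray(g, r, [], @min);   % (6.2)
plot(Rt); ylabel('monthly range variation (%)');  % as in figure 6.5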

- Weekly range variations: calculated like the monthly range variations, but using a timeframe of one week. Hence, the weekly range variations give one point per week and are also expressed in percentage. The weekly range variations plot for Santander from January 2010 to March 2012 is represented in figure 6.6.


Figure 6-6. Weekly range variations plot for Santander from January 2010 to March 2012

6.2 Non-Bayesian Model
Before the Bayesian model is analyzed, a non-Bayesian approach is studied, whose results will also be important. In the non-Bayesian approach it is necessary to differentiate between stable, pseudo-turbulent and turbulent states before the prediction is developed.

6.2.1 Determination of stable, pseudo turbulent and turbulent periods

The first step is to plot the data that will be analyzed. The monthly range variations are the type of data that gives the best results. The second step is to apply the Nelson rules to the data plotted. The program will represent the Nelson rule points as Chapter 3 describes. The rules that will be represented and used for the predictions will be the ones that the user marks in the interface window of the program. If none are marked, the default Nelson rules used in the prediction will be 1, 2 and 3, as [REXA11] suggests. Periods that satisfy Nelson rules will be related to turbulent periods. For rule number 1 to occur it is necessary that the data go above or below 3 standard deviations with respect to the average; in this case that data point is an extreme event. If there is a period with two or more extreme events in less than 13 months, that period will also be considered to satisfy rule 1. Rule number 2 may alert of a future extreme event [REXA11]. The representation of Nelson rules 1, 2 and 3 for the monthly range variations plot for Santander from January 2003 to March 2012 is shown in figure 6.7. This is shown in the main interface window of the program.

Figure 6-7. Nelson rules 1, 2 and 3 for the monthly range variations plot for Santander from January 2003 to March 2012
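The following MATLAB fragment sketches how rule number 1 can be flagged. It is a simplified illustration (the rule set of Chapter 3 is richer than this), and the variable name x for the monthly range variations is an assumption.

% Nelson rule 1: points more than 3 standard deviations from the mean
m = mean(x);
s = std(x);
rule1 = abs(x - m) > 3*s;              % logical flags for extreme events
% A period with two or more extreme events in less than 13 months is
% also considered to satisfy rule 1:
idx = find(rule1);
clusteredRule1 = any(diff(idx) < 13);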

After the Nelson rules are chosen, it is necessary to differentiate white noise periods from non-white noise periods. Non-white noise periods will be related to turbulent periods and white noise periods to stable periods. The rolling windows technique will be used to explore and detect candidates for turbulent periods (non-white noise periods). The steps of this technique are:

- Establish a fixed window size (number of periods) for the rolling windows technique. The default value is 18, as suggested in [REXA11] (18 months is considered the best option to get a good white noise test result), but the user can select any number desired.

- Run lbqtest(res) in MATLAB on each window. This white noise test (the Ljung-Box Q-test) determines which periods are white noise and which are not. The test can be run starting from the beginning or from the end of the series; in either case, the first or the last window of periods (18 by default) cannot be tested because there is not enough data. Since it is more interesting to know what will happen in the future, a backward test is run. Hence, the first periods or months (18 by default) are not categorized as white noise or non-white noise.

- If the p-value of the test is less than 10%, the null hypothesis that the period is distributed as white noise is rejected. That period is then considered a non-white noise period.

This test is run at each period of the financial series, producing points divided into white noise periods and non-white noise periods. This is shown in figure 6.8 for the monthly range variations plot for Santander from January 2003 to March 2012 (a MATLAB sketch follows the figure).

Figure 6-8. White noise test with an 18 rolling window size for the monthly range variations plot for Santander from January 2003 to March 2012
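The backward rolling-window test just described can be sketched in MATLAB as follows, assuming the Econometrics Toolbox function lbqtest is available. The number of lags shown here is illustrative; it has to be smaller than the window size (lbqtest's default of 20 lags would not fit in an 18-point window).

w = 18;                                  % rolling window size (default)
alpha = 0.10;                            % 10% significance level
isWN = nan(numel(x), 1);                 % NaN: first w-1 periods are untestable
for t = w:numel(x)
    seg = x(t-w+1:t);                    % window ending at period t
    [~, p] = lbqtest(seg - mean(seg), 'Lags', 6, 'Alpha', alpha);
    isWN(t) = (p >= alpha);              % white noise unless p < 10%
end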


After computing the Nelson rules and the white noise test, the periods are classified as follows:

- Stable periods: white noise periods that don't satisfy any of the Nelson rules applied.

- Pseudo turbulent periods:
  - White noise periods that satisfy at least one of the Nelson rules applied.
  - Non-white noise periods that don't satisfy any of the Nelson rules applied.

- Turbulent periods: non-white noise periods that satisfy at least one of the Nelson rules applied.

With the stable, pseudo turbulent and turbulent periods defined, the program is able to represent each period with its characteristic state. This is shown in figure 6.9 for the example represented before: the monthly range variations plot for Santander from January 2003 to March 2012. A sketch of this classification logic is given after the figure.

Figure 6-9. Three states for the monthly range variations plot for Santander from January 2003 to March 2012
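The classification can be sketched by combining two vectors built in the previous steps; their names here are assumptions: nelson(t), true when period t satisfies at least one selected Nelson rule, and isWN(t), 1 when the test labels period t white noise, 0 when it does not (NaN when untestable).

state = zeros(size(isWN));               % 0 = stable (white noise, no rule)
state((isWN == 1) &  nelson) = 1;        % 1 = pseudo turbulent
state((isWN == 0) & ~nelson) = 1;        % 1 = pseudo turbulent
state((isWN == 0) &  nelson) = 2;        % 2 = turbulent
% Untestable periods (isWN == NaN) are left at 0 here for simplicity.
% Two-state variant: pseudo turbulent periods are re-labeled as stable
state2 = double(state == 2);             % 0 = stable, 1 = turbulent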

The user can represent the periods divided not only into three different states but also into two. In that case the financial time series has only two types of periods: stable and turbulent. When the user wants to differentiate only two states (stable and turbulent), the program classifies the periods that were previously pseudo turbulent as stable.


The financial time series that was represented in figure 6.9 with three states is now represented in figure 6.10 with only two states. As can be seen, the periods previously considered pseudo turbulent are now considered stable periods.

Figure 6-10. Two states for the monthly range variations plot for Santander from January 2003 to March 2012

6.2.2 Prediction methodology

After determining the states of the financial time series, the non-Bayesian model is ready to start the prediction. The method is based on the Markov switching model. Different models are needed for the different types of periods (stable, pseudo turbulent and turbulent). Stable periods are modeled according to a Gaussian white noise process, and turbulent and pseudo turbulent periods (non-stable periods) according to a first-order autoregressive process with zero drift (AR(1)). The range variations are represented by Yi, where i denotes the month. Each period is associated with a random variable Zi: if the period is stable, Zi equals 0; if it is pseudo turbulent, Zi equals 1; and if it is turbulent, Zi equals 2. According to this, the model is:


Stable periods: $Y_i = \mu_0 + \varepsilon_i$, with $\varepsilon_i \sim N(0, \sigma_0^2)$

Pseudo turbulent periods: $Y_i = \phi_1 Y_{i-1} + \varepsilon_i$, with $\varepsilon_i \sim N(0, \sigma_1^2)$

Turbulent periods: $Y_i = \phi_2 Y_{i-1} + \varepsilon_i$, with $\varepsilon_i \sim N(0, \sigma_2^2)$

To start using this model it is required to estimate the parameters $\mu_0$, $\sigma_0^2$, $\phi_1$, $\sigma_1^2$, $\phi_2$ and $\sigma_2^2$.

Normality tests

Since the model is based on normality, it first has to be verified that the periods can be modeled with a normal distribution and that they behave as Gaussian white noise. The tests performed are: Kolmogorov-Smirnov, Anderson-Darling, D'Agostino-Pearson, Shapiro-Wilk and Jarque-Bera. More than one test is performed because, depending on which one is executed, the periods may or may not be considered normally distributed. For instance, since the Jarque-Bera test is based on moments of the data, it has a zero breakdown value; in other words, a single outlier can make the test worthless [BRYS04]. The fact that one normality test can work better than another depending on the sample supports the idea of using several normal distribution tests. For example, [AISH11] states that the Shapiro-Wilk test is the best normality test because it rejects the null hypothesis of normality at the smallest sample size compared to the other tests, for all levels of skewness and kurtosis of the distributions studied. The p-values of these normality tests must be greater than 5% so that the normal distribution hypothesis cannot be rejected. The prediction continues even if only one test accepts normality; however, the program gives feedback to the user about how many tests have accepted it. The user can also change the significance level and see the information of the five tests performed on the series. This is shown in figure 6.11.

Figure 6-11. Normality tests report for stable periods.
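A sketch of this screening with the tests that ship with the Statistics Toolbox is shown below. Shapiro-Wilk and D'Agostino-Pearson are not built into MATLAB, so those two presumably rely on additional implementations; x (the data of one state) and alpha are assumed names.

alpha = 0.05;                            % default significance level
z = (x - mean(x)) / std(x);              % kstest compares against N(0,1)
[~, pKS] = kstest(z);                    % Kolmogorov-Smirnov
[~, pAD] = adtest(x, 'Alpha', alpha);    % Anderson-Darling
[~, pJB] = jbtest(x, alpha);             % Jarque-Bera
pvals = [pKS pAD pJB];
nAccepted = sum(pvals > alpha);          % tests that do not reject normality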


Once it is confirmed that the data can be modeled with a normal distribution, the prediction can continue.

Stable periods

The average of the means and the average of the variances of all stable periods are used to estimate $\hat{\mu}_0$ and $\hat{\sigma}_0^2$, respectively. The results for the stable periods are:

- The estimated density function of the range variations for stable periods.

- A box and whisker plot with 95% confidence.

The results for the stable periods of the three states monthly range variations plot for Santander from January 2003 to March 2012 are shown in figure 6.12.

Figure 6-12. Stable periods of the three states monthly range variations plot for Santander from January 2003 to March 2012

Non-stable periods (turbulent and pseudo turbulent periods)

Both pseudo turbulent and turbulent periods are treated in the same way. To estimate $\hat{\phi}_i$ and $\hat{\sigma}_i^2$, where i = 1, 2, the AR(1) parameters fitted in each non-stable period k are averaged, where k indexes the periods that have two or more consecutive non-stable data points and each such period is modeled according to a first-order autoregressive process (AR(1)). There are two options to model the AR(1):


With constant: $Y_t = c + \phi_i Y_{t-1} + \varepsilon_t$

With no constant: $Y_t = \phi_i Y_{t-1} + \varepsilon_t$

The preferred option is the one with no constant, because $\hat{\phi}_i$ will be greater and so will the variance. For non-stable periods the variance is higher than in stable periods, which is why the option that gives the higher variance (the AR(1) with no constant) is selected for the prediction. The estimator for the variance is

$$\hat{\sigma}_i^2 = \frac{1}{K}\sum_{k=1}^{K} \hat{\sigma}_{i,k}^2$$

where $\hat{\sigma}_{i,k}^2$ is the variance of period k (the periods that have two or more consecutive non-stable data points) and K is the number of such periods.

The results for the non-stable periods are:

- A prediction interval with 95% confidence, with its graph.

- A graph of the prediction.

- Key measures of the prediction error:
  - Mean absolute percentage error (MAPE), equation (3.22).
  - Theil's U, equation (3.23).
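For reference, both error measures can be computed in a few lines; y (observed range variations) and f (one-step-ahead forecasts) are assumed names, and Theil's U is written here in its U2 forecast-accuracy form, which should be checked against equation (3.23).

mape = 100 * mean(abs((y - f) ./ y));                         % MAPE, in percent
num = sqrt(mean(((f(2:end) - y(2:end)) ./ y(1:end-1)).^2));   % forecast changes
den = sqrt(mean(((y(2:end) - y(1:end-1)) ./ y(1:end-1)).^2)); % naive no-change
U2 = num / den;                          % U2 < 1: beats the naive forecast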

The results for the pseudo turbulent and turbulent periods of the three states monthly range variations plot for Santander from January 2003 to March 2012 are shown in figure 6.13.


Figure 6-13. Pseudo turbulent and turbulent periods of the three states monthly range variations plot for Santander from January 2003 to March 2012

As can be seen, the pseudo turbulent periods cannot be modeled with a normal distribution, and therefore the program does not compute that prediction. In the case of the turbulent periods, the significance level has to be set to 0.5 so that at least one test accepts the normal distribution and the prediction can be computed.

Probabilities of transition

The probabilities of transition from one state to another have to be estimated. Denoting by $p_{ij}$ the probability of moving from state i to state j,

$$p_{ij} = P(Z_t = j \mid Z_{t-1} = i), \qquad i, j \in \{0, 1, 2\}$$

each of these probabilities is computed according to the transitions that have occurred in the past. Therefore, to calculate the probabilities of transition from one state to another, the following estimator is used:

$$\hat{p}_{ij} = \frac{n_{ij}}{\sum_{j} n_{ij}}$$

where $n_{ij}$ is the number of observed transitions from state i to state j.

The probabilities of transition for the three states monthly range variations plot for Santander from January 2003 to March 2012 are shown in table 6.1. The probabilities show that, being in a particular state, it is much more probable to stay in it than to change to another state. A MATLAB sketch of this estimation is given after the table.


Table 6.1. Probabilities of transition for the three states monthly range variations plot for Santander from January 2003 to March 2012
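A minimal MATLAB sketch of this estimator, counting past transitions in a state vector like the one built in section 6.2.1 (state taking values 0, 1, 2):

S = 3;                                   % number of states
N = zeros(S);                            % N(i,j): transitions from i to j
for t = 2:numel(state)
    i = state(t-1) + 1;                  % +1: MATLAB indexing starts at 1
    j = state(t) + 1;
    N(i, j) = N(i, j) + 1;
end
T = N ./ sum(N, 2);                      % row-normalized transition matrix
% A state never visited would give a zero row (NaN after normalization)
% and has to be handled separately.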

Steady state probabilities and expected durations

After setting up the matrix of transition probabilities (transition matrix T):

$$T = \begin{pmatrix} p_{00} & p_{01} & p_{02} \\ p_{10} & p_{11} & p_{12} \\ p_{20} & p_{21} & p_{22} \end{pmatrix}$$

let P0 be the long-term probability of being in state 0 (stable), P1 the long-term probability of being in state 1 (pseudo turbulent) and P2 the long-term probability of being in state 2 (turbulent), and set up the vector:

V = (P0 P1 P2)    (6.14)

The steady state probabilities are the components of vector V such that VT = V:

$$\begin{aligned} P_0 p_{00} + P_1 p_{10} + P_2 p_{20} &= P_0 \\ P_0 p_{01} + P_1 p_{11} + P_2 p_{21} &= P_1 \\ P_0 p_{02} + P_1 p_{12} + P_2 p_{22} &= P_2 \end{aligned} \qquad (6.15)$$

This gives 3 simultaneous equations, but one of them is redundant (because the probabilities in each row of T add up to 1), so one of the 3 equations has to be thrown away (it doesn't matter which one). Because the steady state probabilities must also add up to 1, the following additional equation is needed:

P0 + P1 + P2 = 1    (6.16)

This again results in 3 simultaneous equations whose solution gives the steady state probabilities. The system to be solved (throwing away the third equation of system (6.15) and adding equation (6.16)) is:

$$\begin{aligned} P_0 p_{00} + P_1 p_{10} + P_2 p_{20} &= P_0 \\ P_0 p_{01} + P_1 p_{11} + P_2 p_{21} &= P_1 \\ P_0 + P_1 + P_2 &= 1 \end{aligned}$$


In the case of two states the same methodology is followed, obtaining the system of equations:

$$\begin{aligned} P_0 p_{00} + P_1 p_{10} &= P_0 \\ P_0 + P_1 &= 1 \end{aligned}$$

Considering that:

$$p_{00} + p_{01} = 1, \qquad p_{10} + p_{11} = 1$$

the steady state probabilities for a two-state Markov chain result in (stable state 0 and turbulent state 1):

$$P_0 = \frac{p_{10}}{p_{01} + p_{10}}, \qquad P_1 = \frac{p_{01}}{p_{01} + p_{10}}$$

The expected duration in each state is calculated as the average number of consecutive periods of stay in that state in the past data. Table 6.2 shows the steady state probabilities and expected durations for the three states monthly range variations plot for Santander from January 2003 to March 2012. Table 6.3 shows the same for the case of only two states (stable and turbulent). A MATLAB sketch of the steady state computation is given after the tables.


Table 6.2. Steady state probabilities and expected durations of the three states monthly range variations plot for Santander from January 2003 to March 2012

Table 6.3. Steady state probabilities and expected durations of the two states monthly range variations plot for Santander from January 2003 to March 2012
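The steady state computation reduces to a small linear system. The following sketch solves VT = V together with the sum-to-1 constraint for any number of states, given the transition matrix T estimated above.

S = size(T, 1);
A = [T' - eye(S); ones(1, S)];           % stack (VT = V) and the sum-to-1 row
b = [zeros(S, 1); 1];
V = A \ b;                               % steady state probabilities (column)
% Two-state check against the closed form:
% V(1) should equal T(2,1) / (T(1,2) + T(2,1)).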

Auto regressive prediction

The program developed in this project also allows the user to compute a pure autoregressive prediction. This prediction models each state with a different AR(1) model:

$$Y_t = c + \phi Y_{t-1} + \varepsilon_t$$

and also gives the errors of the prediction. The program is able to compute the AR(1) with zero drift (c = 0) or with nonzero drift. An example of the AR(1) computation with zero drift for the two states monthly range variations plot for Santander from January 2003 to March 2012 is shown in figure 6.14.


Figure 6-14. AR (1) computation with zero drift for the two states monthly range variations plot for Santander from January 2003 to March 2012
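The zero-drift AR(1) fit for one state can be sketched as a least squares regression through the origin; y is an assumed name for a column vector holding a segment of consecutive periods of that state.

yLag = y(1:end-1);
yNow = y(2:end);
phi = (yLag' * yNow) / (yLag' * yLag);   % OLS estimate of phi, no constant
res = yNow - phi * yLag;                 % residuals of the fit
s2 = sum(res.^2) / (numel(res) - 1);     % residual variance estimate
fNext = phi * y(end);                    % one-step-ahead forecast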

6.3 Bayesian Model

As explained in Chapter 5, a Bayesian model goes further than a frequentist (non-Bayesian) model in the analysis, adding to the data the prior knowledge that the researcher has about the problem treated. Another advantage of the Bayesian computation is that the periods don't have to be divided into different states beforehand. The software used to compute the Bayesian approach is WinBUGS. To run it directly from MATLAB, the MatBUGS interface has been used: MATLAB passes the information to WinBUGS, and the results are received and displayed in MATLAB. WinBUGS performs a number of iterations (selected by the user) with the supplied model and outputs the information of the parameters. After the iterations, the software distinguishes between the different states.
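The round trip between MATLAB and WinBUGS looks roughly as follows. This is only a sketch: the model file name, the data fields and the WinBUGS installation path are hypothetical, and the argument list should be checked against the MatBUGS documentation.

% R: monthly range variations; model2states.txt: WinBUGS model file (assumed)
dataStruct = struct('y', R, 'n', numel(R));
initStruct = struct('comp', ones(1, numel(R)));  % initial state assignments
[samples, stats] = matbugs(dataStruct, 'model2states.txt', ...
    'init', initStruct, ...
    'nburnin', 1000, 'nsamples', 5000, ...       % iteration counts set by user
    'monitorParams', {'P.mat', 'comp', 'lambda', 'ro'}, ...
    'view', 0, 'Bugdir', 'C:/Program Files/WinBUGS14');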


Two different models have been developed, depending on whether the user desires a two states (stable and turbulent) prediction or a three states (stable, pseudo turbulent and turbulent) prediction. The observations in each state are considered to be normally distributed, with their mean modeled according to a first-order autoregressive process with zero drift:

Stable periods: $Y_t \sim N(\phi_0 Y_{t-1}, \sigma_0^2)$

Pseudo turbulent periods: $Y_t \sim N(\phi_1 Y_{t-1}, \sigma_1^2)$

Turbulent periods: $Y_t \sim N(\phi_2 Y_{t-1}, \sigma_2^2)$

The WinBUGS model has to be provided with some prior distributions and initial values so that the prediction can be accomplished.
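To make the switching model concrete, the following sketch simulates data from it with purely illustrative parameter values (all numbers below are assumptions, not estimates from the project).

phi   = [0.30 0.60 0.85];                % AR coefficients per state
sigma = [0.5  1.0  2.0];                 % state standard deviations, increasing
P     = [0.90 0.08 0.02;                 % illustrative transition matrix
         0.10 0.80 0.10;
         0.05 0.15 0.80];
n = 120;  z = ones(n, 1);  y = zeros(n, 1);
for t = 2:n
    z(t) = find(cumsum(P(z(t-1), :)) >= rand, 1);    % draw the next state
    y(t) = phi(z(t)) * y(t-1) + sigma(z(t)) * randn; % N(phi*y(t-1), sigma^2)
end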

6.3.1 Two states prediction

The WinBUGS model for the two states prediction contains the following prior distributions:

- For the mean parameters: normal priors on the autoregressive coefficients, centered on the user-supplied values roNormStable and roNormTurbulent.

- For the probabilities of transition: Beta priors on the persistence probabilities, with the user-supplied parameters p00beta and p22beta; the remaining transition probabilities of each row are therefore obtained by complement, since each row of the transition matrix adds up to 1.

- For the variances: priors with the user-supplied parameters a and b on the precision parameters (lambda), constrained so that $\sigma_0^2$ is less than $\sigma_1^2$, because a stable state has less variance than a turbulent state [MART07].

The values of the parameters of these distributions (roNormStable, roNormTurbulent, p00beta, p22beta, a and b) are provided by the user, who also feeds the program some initial values to start the iterations. Further options that can be changed are the number of iterations to perform, the number of samples saved and the option to view the WinBUGS window. By providing these values the program adds the prior knowledge that the investor has; consequently, the Bayesian prediction can be much more powerful than the non-Bayesian one. The prediction made in WinBUGS for the Santander monthly range variations with two states gives the outputs shown in the following figures; the MATLAB results will be shown in section 6.3.3. Table 6.4 shows the results for the transition probabilities (P.mat), some of the state variables (comp), the standard deviations (lambda) and the autoregressive coefficients (ro). The state variables can be explained the following way: comp(i)=1 means that period i is stable and comp(j)=2 means that period j is turbulent. Therefore, there are as many state variables as total periods in the financial series studied. Each iteration assigns each period a state value equal to 1 or 2; hence, the number of state values assigned to each period is equal to the number of iterations. Figure 6.15 presents the evolution of the previous results through the last 100 iterations tested.


Table 6.4. WinBUGS output for Santander two states prediction.


Figure 6-15. WinBUGS history for Santander two states prediction.

6.3.2 Three states prediction

The WinBUGS model for the three states prediction is a little different from the two states model. In this case, it contains the following prior distributions:

- For the mean parameters: normal priors on the autoregressive coefficients, centered on the user-supplied values roNormStable, roNormPseudoTurbulent and roNormTurbulent.

- For the probabilities of transition: uniform priors on the off-diagonal transition probabilities, with the user-supplied upper bounds p01unifsup, p02unifsup, p10unifsup, p12unifsup, p20unifsup and p21unifsup; the diagonal (persistence) probabilities are therefore obtained by complement, since each row of the transition matrix adds up to 1.

- For the variances: as explained before, $\sigma_0^2$ should be less than $\sigma_1^2$, and $\sigma_1^2$ less than $\sigma_2^2$, so the priors on the precision parameters (with the user-supplied parameters a and b) are constrained to preserve this ordering.


The values of the parameters of these distributions (roNormStable, roNormPseudoTurbulent, roNormTurbulent, p01unifsup, p02unifsup, p10unifsup, p12unifsup, p20unifsup, p21unifsup, a and b) are provided by the user, who also feeds the program some initial values to start the iterations. By providing these values the program adds the prior knowledge that the investor has, and the prediction becomes more powerful, as happened with the two states prediction. The prediction made in WinBUGS for the Santander monthly range variations with three states gives the outputs shown in the following figures; the MATLAB results will be shown in section 6.3.3. Table 6.5 shows the results for the transition probabilities (P.mat), some of the state variables (comp), the standard deviations (lambda) and the autoregressive coefficients (ro). Here comp(i)=1 means that period i is stable, comp(j)=2 means that period j is pseudo turbulent and comp(k)=3 means that period k is turbulent. Figure 6.16 presents the evolution of the previous results through the last 100 iterations tested.

Table 6.5. WinBUGS output for Santander three states prediction.


Figure 6-16. WinBUGS history for Santander three states prediction.


6.3.3 Bayesian results

For both the two states and the three states prediction the results are:

- A plot with the probabilities that WinBUGS has computed of being in a specific state (comp). According to this plot, in the case of three states, y=1 is related with stable periods, y=2 with pseudo turbulent periods and y=3 with turbulent periods; in the case of two states, y=1 is related with stable periods and y=2 with turbulent periods. With this defined, the states are classified according to:
  - In the case of three states:
    - If y
