A Concise Review of Classical Linear Regression ...

University of Dar es Salaam

Department of Economics

A CONCISE REVIEW OF CLASSICAL LINEAR REGRESSION MODEL ASSUMPTIONS WITH PRACTICE USING STATA

Majune Kraido Socrates

June 2017

CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ACRONYMS
ABSTRACT
CHAPTER ONE: INTRODUCTION
  1.1 Background
  1.2 Objectives
CHAPTER TWO: DESCRIPTION OF DATA
  2.1 Introduction
    2.1.1 Descriptive Statistics
    2.1.2 Graphical Analysis of Data
CHAPTER THREE: CORRELATION ANALYSIS AND EARLY REGRESSION ANALYSIS
  3.1 Introduction
  3.2 Correlation Analysis
  3.3 Early Regression Analysis
CHAPTER FOUR: MULTICOLLINEARITY
  4.1 Introduction
  4.2 Test for Multicollinearity
  4.3 Conclusion
CHAPTER FIVE: HETEROSCEDASTICITY
  5.1 Introduction
  5.2 Testing for heteroscedasticity
    5.2.1 Goldfeld-Quandt test
    5.2.2 White's General Test
    5.2.3 Breusch-Pagan-Godfrey Test
  5.3 Conclusion
CHAPTER SIX: AUTOCORRELATION
  6.1 Introduction
  6.2 Testing for autocorrelation
    6.2.1 Durbin-Watson Test
    6.2.2 Breusch-Godfrey Test / Lagrange Multiplier Test
CHAPTER SEVEN: ENDOGENEITY - SIMULTANEOUS EQUATIONS
  7.1 Introduction
  7.2 Background of Simultaneous Equations
  7.3 Endogeneity Problem
    7.3.1 Two stage least square method
  7.4 Conclusion
CHAPTER EIGHT: COMPARATIVE REGRESSION ANALYSIS
  8.1 Introduction
  8.2 Nominal variables
  8.3 Real Variables
  8.4 Growth Variables
CHAPTER NINE: CONCLUSION AND RECOMMENDATION
  9.1 Introduction
  9.2 Summary of the study
  9.3 Econometric recommendations
  9.4 Conclusion
REFERENCES
APPENDIX A
APPENDIX B
APPENDIX C

LIST OF TABLES

Table 1: Descriptive Statistics for Nominal Variables
Table 2: Descriptive Statistics for Real Variables
Table 3: Descriptive Statistics for Growth Variables
Table 4: Correlation Results for Nominal Variables
Table 5: Correlation Results for Real Variables
Table 6: Correlation Results for Growth Variables
Table 7: Early Regression Results using Nominal data
Table 8: Early Regression Results using Real data
Table 9: Early Regression Results using Growth data
Table 10: Results of Multicollinearity Test using Nominal, Real and Growth Data
Table 11: Rank of GNP values
Table 12: Regression Results for the First Sub-sample
Table 13: Results for the second Sub-sample
Table 14: Results for Sub-samples of Real variables
Table 15: Results for Sub-samples of Growth variables
Table 16: White's test for nominal, real and growth variables respectively
Table 17: Breusch-Pagan-Godfrey Test
Table 18: Durbin Watson test rules (Elementary case)
Table 19: Durbin-Watson test decision matrix
Table 20: Durbin Watson Results for nominal, real and growth variables
Table 21: Breusch-Godfrey Test Results for nominal, real and growth variables
Table 22: Comparative Regression Results for Nominal Variables
Table 23: Results Durbin–Wu–Hausman test for endogeneity (Nominal data)
Table 24: estat firststage results (Nominal variables)
Table 25: Beta regression results for Nominal variables
Table 26: Comparative Regression Results for Real Variables
Table 27: Results Durbin–Wu–Hausman test for endogeneity (Real data)
Table 28: estat firststage results (Real variables)
Table 29: Beta regression results for Real variables
Table 30: Comparative Regression Results for Growth Variables
Table 31: Results Durbin–Wu–Hausman test for endogeneity (Growth data)
Table 32: estat firststage results (Growth variables)
Table 33: Beta regression results for Growth variables
Table A-1: Summary of Unit Root Test Results
Table A-2: Shapiro-Wilk W test for normality
Table A-3: Elasticity Results
Table B-1: Data
Table C-1: Do-File

LIST OF FIGURES

Figure 1: Negative and Positive Skewness
Figure 2: Kurtosis
Figure 3: Actual vs. Fitted values and Kernel density function for nominal variables
Figure 4: Nominal VS Real Plots for variables
Figure 5: Growth of Nominal VS Real Plots for variables
Figure 6: Money Supply vs. Interest rate; Money Supply vs. CPI

LIST OF ACRONYMS

2SLS - Two Stage Least Squares
ADF - Augmented Dickey Fuller
AR - Autoregressive
CLRM - Classical Linear Regression Model
CPI - Consumer Price Index
ESS - Error Sum of Squares
GNP - Gross National Product
IV - Instrumental Variable
OLS - Ordinary Least Squares
PP - Phillips-Perron
RSS - Residual Sum of Squares
TSS - Total Sum of Squares
ZA - Zivot-Andrews

ABSTRACT

The main objective of this study is to use Stata software in practice to conduct data analysis. This is coupled with two specific objectives. The first is to conduct diagnostic tests for multicollinearity, heteroscedasticity, autocorrelation and endogeneity, each of which violates an assumption of the CLRM. Endogeneity is analyzed through a system of simultaneous equations. The second objective is to analyze the effect of GNP, interest rate, money supply and devaluation on investment. A hypothetical dataset of 40 observations is used. Variables are arranged in three forms: nominal, real and growth (percentage). The dependent variable is investment, while the independent variables are GNP, money supply, CPI (for nominal and growth variables), interest rate and devaluation (a dummy). We establish that multicollinearity is present among nominal and growth variables but absent among real variables. As a remedy, we run a regression based on real data. Heteroscedasticity is present in all three forms, but autocorrelation is absent. Hence, we run robust regression models to handle heteroscedasticity. Using the Durbin–Wu–Hausman test, we establish that there is an endogeneity problem, and we therefore run a 2SLS regression. In general, we establish that the most important determinant of investment is GNP, followed by money supply, interest rate, CPI and devaluation (for nominal and growth variables). For real variables, the most important determinant of real investment is real GNP, followed by real money supply, real interest rate and devaluation. We recommend that future studies explore major time series econometric problems such as unit roots and co-integration. In line with the main objective, we provide a detailed Stata Do-File in the Appendix.
JEL Codes: C01, C22, C32, C82, C87

Key Words: Autocorrelation, Multicollinearity, Heteroscedasticity, Endogeneity, Simultaneous Equations, Stata

Note: This paper is an assignment that was done during my first year (2017) of PhD Economics at the University of Dar es Salaam. The views therein are those of the author and do not necessarily reflect the views of the university. For purposes of improving the paper, the author will highly appreciate feedback from readers on email: [email protected] or [email protected].

CHAPTER ONE
INTRODUCTION

1.1 Background
This report is about violations of the assumptions of the Classical Linear Regression Model (henceforth CLRM). The following violations are discussed: multicollinearity, heteroscedasticity, autocorrelation and endogeneity. Endogeneity is assessed in a system of simultaneous equations. For each violation, the theory, tests and remedies are discussed and then demonstrated in practice using data in Stata 14.

To facilitate the discussion, we consider a study of the determinants of investment. A hypothetical dataset of 40 observations is used. This is 80 percent of the original data of 50 observations [1]. Investment is the dependent variable, while money supply, Gross National Product (henceforth GNP), interest rate and devaluation are the covariates. The Consumer Price Index (henceforth CPI) is used as a deflator. The analysis is based on nominal variables, real variables and percentage growth rates of the variables. Natural logarithms of the variables are also used to establish the elasticities of the explanatory variables.

The report is arranged as follows. After this brief introduction in Chapter One, which also includes the objectives, we describe the data in Chapter Two. Chapter Three is on Correlation Analysis and Early Regression Analysis, Chapter Four is on Multicollinearity, Chapter Five is on Heteroscedasticity, Chapter Six is on Autocorrelation and Chapter Seven is on Endogeneity - Simultaneous Equations. Chapter Eight is on Comparative Regression Analysis, and Chapter Nine concludes the study.

1.2 Objectives
The main objective of this study is to use Stata software in practice to conduct data analysis. The specific objectives are:
a) To conduct diagnostic tests of multicollinearity, heteroscedasticity, autocorrelation and endogeneity in Stata.
b) To analyze the effect of GNP, interest rate, money supply, CPI and devaluation on investment.

[1] The appropriate command in Stata is sample 80 after loading the data.
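For readers who want to see the idea behind the 80 percent draw outside Stata, the following is a minimal Python sketch of sampling without replacement. The function name `sample_percent`, the seed value and the observation identifiers are illustrative assumptions; this is not the author's Do-File and it will not reproduce the same 40 rows that Stata selected.

```python
import random


def sample_percent(rows, percent, seed=None):
    """Return a random `percent` of `rows`, drawn without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    keep_count = round(len(rows) * percent / 100)
    kept_positions = sorted(rng.sample(range(len(rows)), keep_count))
    return [rows[i] for i in kept_positions]  # original ordering is preserved


# Hypothetical observation identifiers standing in for the 50 original rows.
original = list(range(1, 51))
subset = sample_percent(original, 80, seed=2017)
print(len(subset))  # 40
```

Fixing a seed matters for replication: without it, each run keeps a different set of 40 observations and every downstream statistic changes slightly.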

CHAPTER TWO
DESCRIPTION OF DATA

2.1 Introduction
In this chapter, a preliminary examination of the data is conducted to depict its features. It is important to understand the basic features of the data before undertaking any analysis. Therefore, both graphical evidence and descriptive statistics for all variables are presented [2]. The latter entail measures of central tendency, measures of dispersion and an assessment of the shape of the frequency distributions using skewness and kurtosis [3].

2.1.1 Descriptive Statistics
The descriptive statistics computed in this study are the mean, standard deviation, coefficient of variation, skewness and kurtosis. These measures are discussed in that order for variables in nominal, real and growth form. The theory in this section is based on Salvatore & Reagle (2002).

2.1.1.1 Mean
The mean, or arithmetic mean, measures the average value of given data points. In practice, the mean of a dataset is the sum of all the values divided by the number of observations. It is usually denoted as $\bar{X}$ and its formula is as follows:

$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$

where $X_i$ is the value of the $i$th observation and $n$ is the total number of observations.

2.1.1.2 Standard deviation
Standard deviation measures the degree of dispersion or variation of observations from the mean. It is usually denoted as $S$ and its formula is as follows:

$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$

where $X_i$ is the value of the $i$th observation, $\bar{X}$ is the mean and $n$ is the total number of observations. Notably, the standard deviation is the square root of the variance ($S^2$). A small standard deviation indicates that the data points are very close to the mean, while a large standard deviation indicates that the data points are spread out over a large range of values.

[2] This mixture of numerical and graphical summaries complement one another and they aid in understanding the basic features of the data (Geda, Ndung'u, & Zerfu, 2012).
[3] Measures of central tendency help to locate the center of the distribution while measures of dispersion show how observations are spread out on either side of the center. Skewness is based on the third moment while kurtosis is based on the fourth moment.

2.1.1.3 Coefficient of Variation
The coefficient of variation measures relative dispersion. It is the ratio of the standard deviation to the mean, expressed as a percentage. It is usually denoted as $CV$ and its formula is as follows:

$CV = \frac{S}{\bar{X}} \times 100\%$

where $S$ is the standard deviation and $\bar{X}$ is the mean. It is a useful statistic for comparing the degree of variation from one data series to another.
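The three formulas above can be checked by hand on a small series. The sketch below is a plain Python illustration, assuming made-up observations and function names of our own choosing; it is not taken from the Do-File.

```python
import math


def mean(values):
    """Arithmetic mean: sum of the values over the number of observations."""
    return sum(values) / len(values)


def std_dev(values):
    """Sample standard deviation S, using the n - 1 divisor from the text."""
    x_bar = mean(values)
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return math.sqrt(squared_deviations / (len(values) - 1))


def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean."""
    return std_dev(values) / mean(values) * 100


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up observations
print(mean(data))
print(round(std_dev(data), 3))
print(round(coefficient_of_variation(data), 1))
```

The same arithmetic can be used to sanity-check reported results. For example, the GNP row of Table 1 gives S = 50.91 and a mean of 98.35, and 50.91 / 98.35 × 100 is approximately 51.8, which matches the CV shown there.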

2.1.1.4 Skewness
Skewness assesses the shape of the distribution of a dataset. A distribution is symmetric if it looks the same to the left and right of the center point. Ideally, its mean equals its median and mode [4]. A symmetric distribution, of which the normal distribution is the leading example, has a skewness of zero. Nevertheless, a distribution can be negatively or positively skewed depending on the position of the mean. A positively skewed distribution has observations that are more spread out on the right-hand side than on the left-hand side (see Figure 1). Typically the mean is then the largest of the three measures, the mode is the smallest and the median lies between the two. Its skewness is positive.

[4] The median is the value of the middle item when all the items of a dataset are arranged in either ascending or descending order. The mode is the value that occurs most frequently in a dataset.

Figure 1: Negative and Positive Skewness

Source: (kullabs, 2017)

A distribution is negatively skewed if its observations are more spread out on the left-hand side than on the right-hand side (see Figure 1). Typically the mode is then the largest of the three measures, the mean is the smallest and the median lies between the two. Its skewness is negative. The formula for skewness is as follows:

$Sk = \frac{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^3}{S^3}$

where $X_i$ is the value of the $i$th observation, $\bar{X}$ is the mean, $S$ is the standard deviation and $n$ is the number of observations.
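To see how the sign of the statistic tracks the direction of the longer tail, the sketch below computes skewness as the third central moment over the 1.5 power of the second. It assumes divide-by-n moments throughout (so the standard deviation here uses n rather than n - 1), which is the convention we understand Stata's summarize to follow; the sample values are invented for illustration.

```python
def central_moment(values, order):
    """Average of (x - mean) ** order, dividing by n."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** order for x in values) / len(values)


def skewness(values):
    """Third central moment scaled by the cube of the (divide-by-n) std. dev."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]        # mirror image around 3
right_tailed = [1.0, 1.0, 2.0, 2.0, 10.0]    # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]     # the same data reflected

print(skewness(symmetric))
print(skewness(right_tailed) > 0, skewness(left_tailed) < 0)
```

Reflecting a series around zero flips the sign of its skewness but leaves the magnitude unchanged, which is a quick way to check an implementation.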

2.1.1.5 Kurtosis
Kurtosis measures the degree of flatness or peakedness of a distribution. This is judged against a normal distribution, whose symmetrical, bell-shaped curve is called mesokurtic. Under the formula below a normal distribution has a kurtosis of 3, so its excess kurtosis ($K - 3$) is zero. As with skewness, there are departures from the mesokurtic case. If a curve is relatively narrower and more peaked at the top, it is designated leptokurtic; its excess kurtosis is positive. Equally, if the curve is flatter than a normal curve, it is designated platykurtic; its excess kurtosis is negative. Figure 2 illustrates this discussion.

Figure 2: Kurtosis

Source: (Medica, 2017)

The formula for kurtosis is given as follows:

$K = \frac{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^4}{S^4}$

where $X_i$ is the value of the $i$th observation, $\bar{X}$ is the mean, $S$ is the standard deviation and $n$ is the number of observations.
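Because the benchmark value matters when reading kurtosis tables, it helps to see what the formula returns for a roughly normal sample and for a flat one. The following Python sketch assumes divide-by-n moments and uses simulated and evenly spaced values that are purely illustrative.

```python
import random


def central_moment(values, order):
    """Average of (x - mean) ** order, dividing by n."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** order for x in values) / len(values)


def kurtosis(values):
    """Fourth central moment over the squared second moment (no 3 subtracted)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


rng = random.Random(0)  # fixed seed so the simulated sample is reproducible
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
flat_values = [float(i) for i in range(1001)]  # evenly spread, no peak

print(round(kurtosis(normal_draws), 2))  # close to 3 for a normal sample
print(round(kurtosis(flat_values), 2))   # about 1.8 for a flat distribution
```

On this scale, values above 3 point to a leptokurtic shape and values below 3 to a platykurtic one; subtracting 3 gives the excess kurtosis, whose benchmark is zero.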

2.1.1.6 Results of Descriptive Statistics
This section presents and discusses the above measures for the variables in nominal, real and growth terms. We first describe how the variables are converted into the different terms before showing the results. First, the raw data are taken as nominal, so no modification is made. Second, real data are created by deflating nominal data by CPI, except for devaluation. Third, growth values are created by taking the difference between two successive values of a variable, dividing it by the earlier value and multiplying by 100 to express the result as a percentage [5]. The respective Stata commands are provided in the Do-File in the Appendix.

[5] Note that modification to real and growth terms is not done for devaluation because it is discrete (binary) and not continuous over time.
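The two transformations described above can be sketched in a few lines of Python. The series and function names below are hypothetical, and deflation is shown as simple division by CPI for a level variable such as GNP; the interest rate is left out because this section does not spell out how its real counterpart is built in the Do-File.

```python
def deflate(nominal, cpi):
    """Real value = nominal value divided by the price index of the same period."""
    return [value / price for value, price in zip(nominal, cpi)]


def growth_rate(series):
    """Percentage change on the previous period; the first period has no value."""
    changes = [(current - previous) / previous * 100
               for previous, current in zip(series, series[1:])]
    return [None] + changes


gnp = [20.0, 25.0, 30.0]  # hypothetical nominal GNP
cpi = [0.5, 1.0, 2.0]     # hypothetical price index

print(deflate(gnp, cpi))
print(growth_rate(gnp))
```

The first growth observation is undefined because there is no earlier period to compare with, which is consistent with the growth variables in Table 3 having 39 observations rather than 40.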

Results for variables in nominal terms are presented in Table 1:

Table 1: Descriptive Statistics for Nominal Variables

Variable        N    Mean    S       CV     Skewness   Kurtosis
GNP             40   98.35   50.91   51.8   -0.0885    1.410
Investment      40   49.52   26.76   54.0   -0.107     1.341
Money supply    40   39.30   21.91   55.8    0.181     1.710
Interest rate   40   1.528   1.173   76.8    0.951     3.440
CPI             40   1.591   0.820   51.6    0.543     1.873

Key: N=Number of observations, S=Standard Deviation, CV=Coefficient of Variation.

From Table 1, GNP has the highest mean among the variables at 98.35, followed by investment at 49.52. Interest rate and CPI have the lowest means at 1.528 and 1.591 respectively. Equally, GNP has the highest standard deviation (50.91), indicating high dispersion around the mean, while CPI has the lowest (0.820). By the coefficient of variation, all variables vary by more than 50% relative to their means; interest rate varies the most, at about 77%. GNP and investment are negatively skewed while money supply, interest rate and CPI are positively skewed. However, the skewness values of GNP, investment and money supply are close to zero, indicating that these variables may be approximately symmetric, as a normal distribution is. Measured against the normal benchmark of 3 that applies to the kurtosis formula in Section 2.1.1.5, only interest rate (3.440) is leptokurtic; the remaining variables have kurtosis below 3 and are therefore platykurtic, with flatter distributions than the normal.

Results for variables in real terms are presented in Table 2:

Table 2: Descriptive Statistics for Real Variables

Variable            N    Mean      S       CV        Skewness   Kurtosis
Real GNP            40   60.91     13.71   22.5       0.424     4.743
Real Investment     40   30.28     7.617   25.2       0.403     3.259
Real Money Supply   40   23.91     5.197   21.7      -0.420     2.445
Real Interest Rate  40   -0.0630   1.774   -2815.2    0.581     2.298

Key: N=Number of observations, S=Standard Deviation, CV=Coefficient of Variation.

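As per Table 2, Real GNP has the highest mean at 60.91, followed by Real Investment, Real Money Supply and Real Interest Rate respectively. The ordering of the standard deviations matches the ordering of the means. Real Interest Rate has by far the largest coefficient of variation in absolute terms, indicating high fluctuation over time; the figure is inflated because its mean is close to zero. Except for Real Money Supply, all variables have longer right-hand tails, as shown by their positive skewness. Measured against the normal benchmark of 3 that applies to the kurtosis formula in Section 2.1.1.5, Real GNP (4.743) and Real Investment (3.259) are leptokurtic, while Real Money Supply (2.445) and Real Interest Rate (2.298) are platykurtic.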

Results for variables in growth terms are presented in Table 3:

Table 3: Descriptive Statistics for Growth Variables

Variable              N    Mean    S       CV      Skewness   Kurtosis
GNP growth            39   48.97   154.8   316.1   2.896      13.07
Investment growth     39   47.13   133.4   283.0   1.838      6.917
Money Supply growth   39   54.71   173.1   316.4   2.911      12.61
Interest rate growth  39   105.5   257.7   244.3   1.547      3.946

Key: N=Number of observations, S=Standard Deviation, CV=Coefficient of Variation.

Whereas all variables grew by more than 45% on average, interest rate had the highest mean growth at 105.5%. Likewise, interest rate has the highest standard deviation, indicating high dispersion around the mean, and all variables have a standard deviation above 130 percentage points. The coefficient of variation is highest for GNP and money supply. This could be due to the susceptibility of these variables to shocks in the economy. All variables are positively skewed and leptokurtic (kurtosis above 3). Hence they have right-tailed distributions that are sharply peaked at the top.

2.1.2 Graphical Analysis of Data
This section provides a visual inspection of the data in nominal, real and growth terms. Graphical plots of each variable are constructed.

2.1.2.1 Nominal Plots
Figure 3 plots each variable against time together with its fitted trend. It also shows the kernel density function of each variable against the corresponding normal density.

Figure 3: Actual vs. Fitted values and Kernel density function for nominal variables

[Figure: five pairs of panels, one pair per variable (GNP, Investment, Money Supply, Interest Rate, CPI). Left panel of each pair: the variable and its fitted values plotted against Time. Right panel: kernel density estimate of the variable with the normal density overlaid (vertical axis: Density).]

All variables in their nominal form show signs of stationarity [6]: they move both upwards and downwards while oscillating within a certain range. This suggests that they are likely to have error terms with a constant variance (homoscedasticity, as discussed in Chapter Five). Apart from interest rate, all variables have an upward trend in their fitted values. Investment, money supply and CPI show the fastest growth, indicating that these variables are expected to keep growing over time. Conversely, the fitted values of interest rate slope steeply downwards, indicating a fast decline over time.

[6] Although this is beyond the scope of this study, results in Table A-1 in the Appendix confirm that the variables are stationary at level and have no structural break. The Augmented Dickey-Fuller and Phillips-Perron tests are used to test for stationarity, while the Zivot-Andrews test serves as a robustness check for stationarity and structural breaks.
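The fitted values discussed above are read here as a straight line fitted to each series against time; that reading is an assumption, since the text does not state how the fitted values in Figure 3 were produced. Under that assumption, a minimal Python sketch of the least-squares trend, with an invented series, looks like this:

```python
def linear_trend(series):
    """Least-squares fit of the series on t = 1..n; returns (intercept, slope)."""
    n = len(series)
    periods = range(1, n + 1)
    t_bar = (n + 1) / 2
    y_bar = sum(series) / n
    s_ty = sum((t - t_bar) * (y - y_bar) for t, y in zip(periods, series))
    s_tt = sum((t - t_bar) ** 2 for t in periods)
    slope = s_ty / s_tt
    intercept = y_bar - slope * t_bar
    return intercept, slope


def fitted_values(series):
    """Points on the fitted trend line, one per time period."""
    intercept, slope = linear_trend(series)
    return [intercept + slope * t for t in range(1, len(series) + 1)]


rising = [20.0, 26.0, 23.0, 31.0, 35.0, 33.0, 40.0]  # hypothetical series
intercept, slope = linear_trend(rising)
print(slope > 0)  # True: the fitted line slopes upwards
```

The sign of the slope is what the discussion of upward and downward trends refers to, and its size indicates how fast the fitted line rises or falls per period.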

A comparison of the kernel density functions with the normal densities suggests that only interest rate shows any tendency towards normality. This visual impression is checked formally with the Shapiro-Wilk W test [7], whose results are in Table A-2 in Appendix A; the test finds that none of the variables is normally distributed. Nevertheless, although normality is one of the assumptions of the CLRM, it is not essential to most of its results (Greene, 2012, pp. 64-65).
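The visual comparison described above rests on two curves: a kernel density estimate built from the data and a normal density with the same mean and standard deviation. The Python sketch below illustrates both, assuming a Gaussian kernel and the normal-reference rule-of-thumb bandwidth h = 1.06 S n^(-1/5); Stata's own default kernel and bandwidth may differ, and the data are invented.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and std. dev. sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def sample_sd(values):
    """Sample standard deviation with the n - 1 divisor."""
    x_bar = sum(values) / len(values)
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (len(values) - 1))


def rule_of_thumb_bandwidth(values):
    """Normal-reference bandwidth: 1.06 * S * n ** (-1/5)."""
    return 1.06 * sample_sd(values) * len(values) ** (-0.2)


def kernel_density(values, x, bandwidth):
    """Gaussian kernel density estimate at point x."""
    total = sum(normal_pdf((x - v) / bandwidth) for v in values)
    return total / (len(values) * bandwidth)


data = [1.2, 1.9, 2.1, 2.4, 2.8, 3.0, 3.3, 3.9, 4.6, 6.5]  # made-up, right-skewed
h = rule_of_thumb_bandwidth(data)
mu = sum(data) / len(data)
sigma = sample_sd(data)

# Compare the two curves at a few points, as the figure does visually.
for x in (1.0, 3.0, 5.0, 7.0):
    print(x, round(kernel_density(data, x, h), 3), round(normal_pdf(x, mu, sigma), 3))
```

Where the two columns diverge, the data depart from the normal shape. A formal test such as Shapiro-Wilk is still needed, because with only 40 observations the estimated density is sensitive to the bandwidth chosen.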

2.1.2.2 Nominal VS Real plots
Figure 4 shows the plots for nominal and real values for GNP, Investment, Money Supply and Interest Rates.

Figure 4: Nominal VS Real Plots for variables

[Figure: four panels plotted against Time: Gross National Product and Real Gross National Product; National Investment and Real National Investment; Money Supply and Real Money Supply; Interest Rate and Real Interest Rate.]

[7] This is a formal test of normality; the p-value must be greater than 0.01 for a variable to be considered normal.

As expected, Figure 4 shows that real values are generally lower than nominal values because they are adjusted for the price level. Real values move in the same direction as nominal values, although the gap between them is large, especially for GNP, Investment and Money Supply. For instance, whereas nominal GNP ranges between 20 and 172, real GNP ranges between 27 and 106. Similarly, nominal investment ranges between 11 and 88 while real investment ranges between 16 and 52. Nominal money supply ranges between 8 and 78 while real money supply ranges between 11 and 33. However, the gap between the real and nominal interest rates is narrow, especially in periods when the trend rises. Conversely, this gap is large in periods when the trend falls, such as periods 9, 22, 31 and 35. This indicates the effect of inflation in periods when the nominal interest rate decreases.

2.1.2.3 Growth rates
Figure 5 shows the plots for growth rates of nominal and real values for GNP, Investment, Money Supply and Interest Rates.

Figure 5: Growth of Nominal VS Real Plots for variables
[Figure 5 panels, each plotted against Time: Growth of GNP and Growth of Real GNP; Growth of National Investment and Growth of Real Investment; Growth of Money Supply and Growth of Real Money Supply; Growth of Interest rate and Growth of Real Interest Rate]

From Figure 5, the highest growth among all variables except interest rates occurred in period 9. This can be attributed to a positive shock in the economy. Nevertheless, in general, GNP, money supply and investment (both real and nominal values) display an almost identical trend throughout the period. This shows the procyclical nature of money supply and investment and how these variables influence GNP. Accordingly, their growth is unstable up to around period 15 before gaining stability up to period 26, where they become unstable again up to period 30. The first half of periods 30-40 is stable while the other half is unstable. These fluctuations indicate the susceptibility of these variables to economic shocks.

The growth of real interest rates is generally stable apart from periods 2, 19 and 39. Conversely, the growth of nominal interest rates has been generally unstable, especially before period 20. Whereas the post-period 20 era too has highs and lows, their magnitudes are not as high as those in the prior period. Nonetheless, the relative stability of the real interest rates justifies the need to account for inflation.

Figure 6: Money Supply vs. Interest rate; Money Supply vs. CPI
[Figure 6 panels, each plotted against Time: Growth of Money Supply and Growth of Interest rate; Growth of Money Supply and Growth of CPI]

The relationship between money supply and interest rates, both in nominal form, is negative. Therefore, an increase in money supply reduces interest rates and vice versa, as predicted by theory (Romer, 2012, p. 243). On the contrary, the relationship between money supply and CPI, both in nominal form, is positive and they grow at a relatively similar rate. This is in line with the Quantity Theory and Keynesian Theory in which inflation is demand-pull (Branson, 1989).


CHAPTER THREE
CORRELATION ANALYSIS AND EARLY REGRESSION ANALYSIS
3.1 Introduction
This Chapter presents results of the correlation analysis and early regression analysis. Correlation analysis is conducted to measure the strength and direction of linear association between two variables (Gujarati & Porter, 2009). On the other hand, early regression analysis is conducted for purposes of interpreting the regression results without taking into account the violations of the CLRM assumptions. These results are later compared to the results of the Comparative Regression analysis in Chapter Eight, by which point the specific violations, such as autocorrelation, will have been addressed.

3.2 Correlation Analysis
Correlation analysis, as aforesaid, measures the degree of association between variables. The correlation coefficient is denoted r and its values range between negative one and positive one (−1 ≤ r ≤ 1). A value of -1 or close to it indicates a strong negative relationship while a value of 1 or close to it indicates a strong positive relationship. It is formally presented as (Gujarati & Porter, 2009):

$$r = \frac{n \sum X_i Y_i - (\sum X_i)(\sum Y_i)}{\sqrt{[n \sum X_i^2 - (\sum X_i)^2][n \sum Y_i^2 - (\sum Y_i)^2]}}$$

Where n is the sample size, $X_i$ is the i-th observation of X and $Y_i$ is the i-th observation of Y. This study uses Pearson’s Correlation to test the relationship of variables at one percent level of significance. Results are as follows:

Table 4: Correlation Results for Nominal Variables
                 Investment   GNP         Money Supply   Interest Rate   CPI
Investment       1
GNP              0.9932*      1
Money Supply     0.9728*      0.9777*     1
Interest Rate    -0.7397*     -0.7551*    -0.6862*       1
CPI              0.9316*      0.9407*     0.9641*        -0.5695*        1
Key: Asterisk (*) indicates 1% level of significance

There is a very high positive and significant correlation between Investment, GNP and Money Supply as shown in Table 4. This affirms the results of Figure 5, whose main conclusion was that these variables portrayed an almost identical trend. The fact that these relationships have a magnitude of more than 0.93 points to a possible problem of multicollinearity, which is handled in Chapter Four. The rule of thumb is that a correlation of more than 0.8 indicates the possibility of high dependence among variables, which leads to multicollinearity (Gujarati & Porter, 2009). Interest rate has a negative and significant relationship with all variables, with the highest magnitude being 0.76 with GNP and the lowest being 0.57 with CPI. This is a stylized fact established in theory (Romer, 2012).
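The correlation formula of Section 3.2 can be checked against Stata's built-in commands. The following is a minimal sketch on hypothetical simulated data (the variables x and y and the seed are assumptions made for illustration, not the study's dataset); it computes r by hand from the sums in the formula and prints it next to the output of pwcorr.

```stata
* Hypothetical simulated data; not the study's dataset
clear
set obs 40
set seed 2017
generate double x = rnormal()
generate double y = 0.8*x + rnormal()

* Built-in pairwise correlation, starred at the 1% level
pwcorr x y, sig star(0.01)

* Manual computation following the formula in Section 3.2
generate double xy = x*y
generate double xx = x^2
generate double yy = y^2
quietly summarize x
scalar n  = r(N)
scalar sx = r(sum)
quietly summarize y
scalar sy = r(sum)
quietly summarize xy
scalar sxy = r(sum)
quietly summarize xx
scalar sxx = r(sum)
quietly summarize yy
scalar syy = r(sum)
scalar r_manual = (n*sxy - sx*sy) / sqrt((n*sxx - sx^2)*(n*syy - sy^2))
display "r (manual) = " r_manual
```

With the study's data in memory, the same pwcorr call would presumably be run on the variable list that appears in the regression commands of Chapter Five (investment gnp moneysupply interest cpi).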

Table 5: Correlation Results for Real Variables
                     Real GNP    Real Investment   Real Money Supply   Real Interest Rate
Real GNP             1
Real Investment      0.9032*     1
Real Money Supply    0.8253*     0.7929*           1
Real Interest Rate   -0.5481*    -0.5619*          -0.6454*            1
Key: Asterisk (*) indicates 1% level of significance

The relationship between Real GNP, Real Investment and Real Money Supply is still positive and significant at 1% (see Table 5). The highest correlation is between Real GNP and Real Investment while the lowest is between Real Money Supply and Real Investment. However, the fact that these correlations are approximately 0.8 or higher points to a possible multicollinearity problem. Real Interest Rate has maintained a negative and significant relationship with all variables. In general, the magnitude of correlation among real values is lower than that of nominal values.

Table 6: Correlation Results for Growth Variables
                 Investment   GNP        Money Supply   Interest Rate
Investment       1
GNP              0.8785*      1
Money Supply     0.8765*      0.9858*    1
Interest Rate    -0.453       -0.389     -0.360         1
Key: Asterisk (*) indicates 1% level of significance

The correlation among Investment, GNP and Money Supply is still highly positive and significant as per Table 6. Notably, Money Supply and GNP are very highly correlated at 0.99 while the other correlations are above 0.87. As said earlier, this indicates the possible presence of multicollinearity. The correlation of Interest Rate with the other variables remains negative but is insignificant throughout.

3.3 Early Regression Analysis
Regression analysis extends the discussion of correlation analysis by reviewing the relationship between one variable, the dependent variable, and one or more other variables, the independent variables (Greene, 2012; Gujarati & Porter, 2009). Therefore, the main interest of regression analysis is to predict the average value of one variable on the basis of the fixed values of other variables. This relationship is specified in a model that follows a certain distribution and has to adhere to certain assumptions. Accordingly, the Classical Linear Regression Model used in this study has five main assumptions (the structure of a CLRM is explained in the next paragraph). First, it is assumed that there is a linear relationship between dependent and independent variables. Second, the assumption of full rank, which rules out the possibility of an exact linear relationship among independent variables (no multicollinearity). Third, the assumption of spherical disturbances, which assumes the variance of the error term is constant (homoscedastic) and uncorrelated across observations (nonautocorrelation). Fourth, strict exogeneity, in which the expected value (mean) of the error term given the independent variable(s) is zero. Fifth, the assumption of normality of the error term⁸.

A regression model can formally be presented as:
$$y = X\beta + \varepsilon$$ (Equation 1)
Where $y$ is the dependent variable, in our case Investment, while $X$ is the independent variable(s)⁹. $\beta$ is a vector of parameters to be estimated and $\varepsilon$ is the error term. Generally the first part of the right hand side of the equation, $X\beta$, is deterministic while the second part, $\varepsilon$, is the stochastic (random) error term, also called the disturbance term. However, this is in a set up of a population and $\beta$ is a population parameter. Given that $\beta$ is unknown, it is estimated using sample data, which is observable. Therefore, Equation 1 is converted into a sample set up as follows:
$$y = Xb + e$$ (Equation 2)
Where $y$ is still the dependent variable, $X$ is the independent variable(s), $b$ is the estimate of $\beta$ and $e$ is the vector of residuals. $b$ is estimated by the method of Ordinary Least Squares (henceforth OLS), which involves minimizing the sum of squared residuals to obtain:
$$b = (X'X)^{-1}X'y$$ (Equation 3)
A detailed derivation of $b$ is in Greene (2012, pp. 66-68) and Hayashi (2000, pp. 15-18). It should be recognized that $b$ is the vector of coefficients that is obtained in practice.

8 See Greene (2012), Gujarati & Porter (2009) and Hayashi (2000) for details of the five assumptions.
9 It is a simple regression model if there is one independent variable and a multiple regression if there is more than one independent variable.
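Equation 3 can be reproduced directly with Stata's matrix language. The following is a minimal sketch on hypothetical simulated data (x, y, the seed and the scalar b_slope are illustrative assumptions, not part of the study); it fits the model with regress and then recomputes the coefficients as $(X'X)^{-1}X'y$ in Mata.

```stata
* Hypothetical simulated data; not the study's dataset
clear
set obs 40
set seed 2017
generate double x = rnormal()
generate double y = 2 + 0.5*x + rnormal()

* OLS through the built-in command
regress y x

* The same coefficients from Equation 3, b = (X'X)^(-1) X'y, in Mata
mata:
X = (st_data(., "x"), J(st_nobs(), 1, 1))
y = st_data(., "y")
b = invsym(cross(X, X)) * cross(X, y)
b'
st_numscalar("b_slope", b[1])
end
display "slope from Equation 3 = " b_slope
```

The column of ones appended to X plays the role of the intercept, so the first element of b is the slope on x and the second is the constant reported by regress.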

To reiterate, a regression analysis is conducted at this early stage to show its results when CLRM assumptions are violated. Two kinds of regression results are presented. The first is a simple regression analysis, which involves regressing the dependent variable, Investment, on individual independent variables. The second is a multiple regression, which involves regressing the dependent variable, Investment, on all the independent variables. Results for variables in nominal, real and growth terms are then presented in Table 7, Table 8 and Table 9 respectively. Inference is then made with regard to the magnitude of the coefficient ($b$ in Equation 3), its significance using the t-test, and the overall significance of the model using the F-test and the Coefficient of Determination, R-Squared.

The formal presentation of the calculated t, F and R-Squared is as follows (Refer to Greene (2012), Gujarati (2012) and Hill, Griffiths, & Lim (2011) for details):

The calculated t-statistic is:
$$t = \frac{\hat{b}_i}{\text{standard error}(\hat{b}_i)} = \frac{\hat{b}_i}{\sqrt{\sigma^2 [(X'X)^{-1}]_{ii}}}$$ (Equation 4)
Where $[(X'X)^{-1}]_{ii}$ is the i-th diagonal element of $(X'X)^{-1}$. The statistic has $n-k$ degrees of freedom (n is the number of observations and k is the number of parameters to be estimated). This is tested at a two-tail level, $\alpha/2$, because of the hypotheses, which are:
$H_0: \hat{b}_i = 0$ (Coefficient is insignificant)
$H_1: \hat{b}_i \neq 0$ (Coefficient is significant)
$\alpha$ is the level of significance. In this case, it is either 1%, 5% or 10%.

The formal presentation of the F-statistic is:
$$F = \frac{ESS/(k-1)}{RSS/(n-k)} = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}$$ (Equation 5)
Where ESS is the Explained Sum of Squares, RSS is the Residual Sum of Squares and $R^2$ is R-squared. $k-1$ and $n-k$ are degrees of freedom which aid in establishing the critical F-value at a certain level of significance. In this case, it is either 1%, 5% or 10% and the hypotheses are as follows:
$H_0: \hat{b}_1 = 0, \hat{b}_2 = 0, \dots, \hat{b}_i = 0$ (Model is insignificant)
$H_1$: at least one $\hat{b}_j \neq 0$ (Model is significant)

R-Squared is the Coefficient of Determination. It measures the proportion of the variation in the dependent variable that is explained by the independent variables. It is formally represented as:
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$$ (Equation 6)
Where ESS is the Explained Sum of Squares, TSS is the Total Sum of Squares and RSS is the Residual Sum of Squares. One major disadvantage of R-squared is that it tends to increase with an increase in the number of explanatory variables. To overcome this, the adjusted R-squared is also presented. It is formally written as:
$$\bar{R}^2 = 1 - (1 - R^2)\frac{n-1}{n-k}$$ (Equation 7)

A practical illustration of early regression results, alongside t-statistics, F-statistics, R-squared and adjusted R-squared results for nominal, real and growth variables, is presented in Tables 7, 8 and 9 respectively.
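As a quick arithmetic check of Equations 4, 5 and 7, the sketch below plugs in the Model 1 figures reported in Table 7 (GNP coefficient, standard error, R-squared and 40 observations). The scalar names are illustrative assumptions; only the numbers come from the table. Because the reported R-squared is rounded to four decimals, the F value recovered from it differs slightly from the reported one, whereas the square of the t-statistic matches it closely.

```stata
* Figures copied from Model 1 of Table 7 (GNP as the only regressor)
scalar b_gnp  = 0.5219403
scalar se_gnp = 0.0099578
scalar R2     = 0.9864
scalar n_obs  = 40
* Number of estimated parameters: slope and intercept
scalar k_par  = 2

* Equation 4: t-statistic
scalar t_stat = b_gnp/se_gnp
display "t = " t_stat

* In a simple regression the F-statistic equals the squared t-statistic
display "t^2 = " t_stat^2

* Equation 5: F recovered from the (rounded) R-squared
scalar F_R2 = (R2/(k_par - 1)) / ((1 - R2)/(n_obs - k_par))
display "F from R2 = " F_R2

* Equation 7: adjusted R-squared
scalar adjR2 = 1 - (1 - R2)*(n_obs - 1)/(n_obs - k_par)
display "Adjusted R2 = " adjR2

* Critical values at the 1% level for comparison
display "F(1,38) critical = " invFtail(1, 38, 0.01)
display "t(38) two-tail critical = " invttail(38, 0.005)
```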

Table 7: Early Regression Results using Nominal data
Dependent Variable: Investment

                    Simple Regression                                                   Multiple Regression
                    Model 1          Model 2          Model 3          Model 4          Model 5
GNP                 0.5219403*                                                          0.5516064*
                    (0.0099578)                                                         (0.0597931)
Money Supply                         1.187879*                                          0.1321657
                                     (0.0458808)                                        (0.1410643)
Interest Rate                                         -16.87*                           1.212757
                                                      (2.489776)                        (0.8512642)
CPI                                                                    30.3845*         -4.234505
                                                                       (1.923639)       (2.78561)
Constant            -1.807828        2.84135          75.29393*        1.198455         -5.037108***
                    (1.099883)       (2.058331)       (4.773136)       (1.198455)       (2.85957)
Observations        40               40               40               40               40
F                   (1,38)=2747.35   (1,38)=670.32    (1,38)=45.91     (1,38)=249.49    (4,35)=687.74
Prob > F            0.0000           0.0000           0.0000           0.0000           0.0000
R-squared           0.9864           0.9464           0.5471           0.8678           0.9874
Adjusted R-squared  0.9860           0.9449           0.5352           0.8643           0.9860
Key: Asterisks represent level of significance, with * being 1% level and *** being 10% level. Standard errors are in parentheses.

Model 1, Model 2 and Model 4 in Table 7 indicate that GNP, Money Supply and CPI significantly increase Investment at individual levels. CPI has the biggest influence, as a unit increase in CPI increases Investment by 30 units. Interest rate (Model 3), as established in Chapter Two, reduces Investment by about 17 units. This negative relationship is highly significant. Whereas Models 1 to 4 are all significant at one percent level using the F-test¹⁰, they have different shares of influence on Investment as per the R-squared values. GNP explains 99% of the variations in Investment, Money Supply explains 95%, CPI explains 87% and Interest rate explains the least at 55%.

10 The F-test requires that the calculated value be compared to the critical value. For instance, Model 1 has a calculated value of 2747.35 with (1, 38) degrees of freedom. The critical value at these degrees of freedom is 7.35254. If the calculated value is greater than the critical value, the model is significant; if it is lower, the model is insignificant. In this case, the calculated value is greater than the critical value, thereby indicating significance of Model 1.

Model 5, which is a multiple linear regression, indicates that all variables except Interest Rate and CPI maintain the signs of influence on Investment that they had individually. However, only GNP is significant. The model is significant at one percent using the F-test and it has an R-squared value of 99%. Hence, GNP, Money Supply, CPI and Interest Rate explain 99% of the variations in Investment. In general, results from Model 5 point to the problem of multicollinearity, as earlier suspected in the Correlation Analysis. This is because, according to Gujarati & Porter (2009) and Greene (2012), having a high R-squared with few significant variables in a significant model is an indicator of multicollinearity. These characteristics fit Model 5.

Table 8: Early Regression Results using Real data
Dependent Variable: Real Investment

                    Simple Regression                                  Multiple Regression
                    Model 1          Model 2          Model 3          Model 4
Real GNP            0.501636*                                          0.4316118*
                    (0.0386743)                                        (0.0685551)
Real Money Supply                    1.162181*                         0.1603694
                                     (0.1448997)                       (0.1981067)
Real Interest Rate                                    -2.4131*         -0.2803783
                                                      (0.5763365)      (0.391928)
Constant            -0.2766054       2.49014          30.12411*        0.136444
                    (2.413025)       (3.543226)       (1.009988)       (3.293708)
Observations        40               40               40               40
F                   (1,38)=168.24    (1,38)=64.33     (1,38)=17.53     (4,35)=56.69
Prob > F            0.0000           0.0000           0.0002           0.0000
R-squared           0.8158           0.6287           0.3157           0.8253
Adjusted R-squared  0.8109           0.6189           0.2977           0.8107
Key: Asterisk (*) represents level of significance, with * being 1% level. Standard errors are in parentheses.

The signs of influence for Real GNP (Model 1), Real Money Supply (Model 2) and Real Interest Rate (Model 3) are the same as in the nominal results in Table 7. Whereas the magnitudes of Real GNP and Real Money Supply reduce mildly in comparison to their nominal form, the magnitude of Real Interest Rate falls substantially because inflation is accounted for. Models 1 to 3 are highly significant using the F-test but have lower R-squared values. Real GNP accounts for 82% of variations in Real Investment while Real Money Supply and Real Interest Rate account for 63% and 32% respectively. This is the influence of accounting for inflation.

Model 4 indicates that Real GNP and Real Money Supply increase Real Investment while Real Interest Rate reduces it. However, only Real GNP is significant and the model is significant at 1% using the F-test. Furthermore, Real GNP, Real Money Supply and Real Interest Rate explain 83% of the variations in Real Investment as per the R-squared value. Nonetheless, as earlier concluded, the high R-squared value together with a highly significant model and few significant variables indicates the presence of multicollinearity.

Table 9: Early Regression Results using Growth data
Dependent Variable: Investment Growth

                      Simple Regression                                                   Multiple Regression
                      Model 1          Model 2          Model 3          Model 4          Model 5
GNP Growth            0.7568951*                                                          0.372
                      (0.0676547)                                                         (0.342)
Money Supply Growth                    0.6752378*                                         -0.0860
                                       (0.0609762)                                        (0.325)
Interest Rate Growth                                    -0.2344527*                       -0.0557
                                                        (0.0758431)                       (0.0373)
CPI Growth                                                               1.254525*        0.808*
                                                                         (0.0977156)      (0.210)
Constant              10.06361         10.19137         71.87011*        9.390527         15.21
                      (10.85613)       (10.94084)       (20.88801)       (9.721863)       (10.39)
Observations          39               39               39               39               39
F                     (1,37)=125.16    (1,37)=122.63    (1,37)=9.56      (1,37)=164.83    (4,34)=50.52
Prob > F              0.0000           0.0000           0.0038           0.0000           0.0000
R-squared             0.7718           0.7682           0.2053           0.8167           0.8560
Adjusted R-squared    0.7657           0.7619           0.1838           0.8117           0.8390
Key: Asterisk (*) represents level of significance, with * being 1% level. Standard errors are in parentheses.

The signs of GNP, Money Supply, Interest Rate and CPI (all in growth form, Models 1 to 4) resemble those in Table 7. They are also significant at individual levels. However, CPI has the highest R-squared at 82%, followed by GNP, Money Supply and Interest Rate in that order. Model 5 gives a different result in that only growth in CPI is significant, and all variables together explain 86% of variations in growth of Investment. Nevertheless, the presence of only one significant variable in a model that is highly significant and has a high R-squared value indicates the presence of multicollinearity.


CHAPTER FOUR
MULTICOLLINEARITY
4.1. Introduction
This Chapter discusses the theoretical basis of multicollinearity and practically tests for it in our data. Multicollinearity is a violation of the CLRM assumptions and it means that there exists an exact or an almost perfect linear relationship among independent variables¹¹ (Greene, 2012). As has been widely stated in Chapter Three, the main indicators of multicollinearity include (Greene, 2012; Gujarati & Porter, 2009):
i. Very few significant coefficients with a high R-squared value.
ii. Small changes in data produce wide swings in parameter estimates.
iii. Coefficients might have the “incorrect” signs or implausible magnitudes.
iv. A correlation coefficient (r) of at least 0.8 between independent variables.

The main problem of multicollinearity is that it results in imprecise estimators. This means that an estimator has a large variance. This in turn affects inference based on the t-test. Recall from Equation 4, in Chapter Three, that the t-statistic is the ratio of the estimator to its respective standard error. The standard error is the square root of the variance. Hence, if the variance is inflated, the standard error is also inflated, the resulting t-statistic is understated, and it is not fit to be used for making an inference. It is therefore likely to lead to a Type II error, where the null hypothesis is not rejected yet it should have been rejected.

Formally, the variance of an estimator without multicollinearity is as follows:
$$Var[b|X] = \sigma^2 (X'X)^{-1} = \frac{\sigma^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}$$ (Equation 8)
Whereas the variance of an estimator with multicollinearity is of the form:
$$Var[b_k|X] = \frac{\sigma^2}{(1 - r_{23}^2)\sum_{i=1}^{n}(X_{ik} - \bar{X}_k)^2}, \quad k = 2, 3$$ (Equation 9)
Where $r_{23}^2$ is the squared Correlation Coefficient between the two independent variables. Refer to Chapter Three for the formula. It can be seen that if there is a high correlation between variables, then the variance becomes big, and it is infinite if the correlation is perfect, that is $r_{23}^2 = 1$.

11 Precisely, variables are observationally equivalent.

4.2 Test for Multicollinearity
Multicollinearity can be detected using the Variance Inflation Factor (VIF). VIF is formally written as:
$$VIF = \frac{1}{1 - r_{23}^2}$$ (Equation 10)

The rule of thumb is that a VIF of more than 10 indicates the presence of multicollinearity. Alternatively, Tolerance (which is the ratio of 1 over VIF) can also be used to test for multicollinearity. In that case, a value equal to or less than 0.1 signals the presence of multicollinearity.
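The relationship between VIF, Tolerance and Equation 10 can be illustrated with a few lines of Stata. The sketch below first checks two reciprocal pairs reported in Table 10, then evaluates Equation 10 for a hypothetical correlation of 0.95, and finally runs estat vif on hypothetical simulated data with two nearly collinear regressors. The variables x2, x3 and y, the seed and the scalar names are assumptions made for illustration and are not the study's data.

```stata
* Tolerance is the reciprocal of VIF; figures copied from Table 10
display "Money Supply (nominal) tolerance = " 1/37.18
display "Money Supply (growth)  VIF       = " 1/0.023852

* Equation 10 with a hypothetical correlation of 0.95 between two regressors
scalar r23      = 0.95
scalar vif_demo = 1/(1 - r23^2)
display "VIF when r23 = 0.95: " vif_demo

* Hypothetical simulated data with two nearly collinear regressors
clear
set obs 40
set seed 2017
generate double x2 = rnormal()
generate double x3 = x2 + 0.1*rnormal()
generate double y  = 1 + x2 + x3 + rnormal()
regress y x2 x3
estat vif
```

A correlation of 0.95 already places the VIF just above the threshold of 10, which is why correlations of the size reported in Chapter Three raise concern.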

Table 10: Results of Multicollinearity Test using Nominal, Real and Growth Data
                 Nominal Variables      Real Variables         Growth Variables
Variable name    VIF      Tolerance     VIF      Tolerance     VIF      Tolerance
Money Supply     37.18    0.026896      3.76     0.265652      41.93    0.023852
GNP              36.06    0.027729      3.14     0.318504      37.25    0.026849
CPI              20.32    0.049211                             5.38     0.185888
Interest Rate    3.88     0.257653      1.72     0.582703      1.23     0.814507
Mean VIF         24.36                  2.87                   21.44
Key: VIF- Variance Inflation Factor

There is evidence of multicollinearity in Money Supply and GNP in both nominal and growth form, and in CPI in nominal form. This is because their VIF values are above 10 and their Tolerance values are below 0.1¹². This result is not surprising as it had been hinted at in Chapter Two and Chapter Three. Accounting for inflation (CPI in this case) eliminates the problem of multicollinearity. It can be seen from Table 10 that all real variables have a VIF below 10, hence there is no multicollinearity. Interest rate, for its part, shows no multicollinearity in any form of the variables.

12 More tests confirming multicollinearity among these variables can be found in the Do-File. For instance, the test that Money Supply=GNP was significant.

4.3 Conclusion
The remedy for the problem of multicollinearity is to run a regression model using the real values, which have been established as being free from multicollinearity. Using real values is a data transformation remedy for multicollinearity, as noted by Gujarati & Porter (2009, p. 344). This is because real values are created by dividing nominal variables by CPI. Alternatively, we could transform the variables by obtaining their first differences, but this is beyond the scope of this study.

The last remedy explains the purpose of discussing Endogeneity in a framework of Simultaneous Equations in Chapter Seven. The high correlation between Money Supply, GNP and CPI begs the question of causality when at least one of them is not included in a model. This is the lacuna that Chapter Seven addresses.


CHAPTER FIVE
HETEROSCEDASTICITY
5.1 Introduction
Heteroscedasticity (heteroskedasticity) is a violation of the CLRM assumption of a constant error variance: the error term has a non-constant variance (Greene, 2012). An ideal variance of an error term that has no heteroscedasticity (that is, homoscedastic) is of the form:
$$Var[\varepsilon|X] = \sigma^2 I$$ (Equation 11)
A variance that is heteroscedastic is of the form:
$$Var[\varepsilon_i|X] = \sigma_i^2, \quad i = 1, \dots, n$$ (Equation 12)
Note that this variance is the numerator of Equation 8. Therefore, heteroscedasticity is a problem because it makes the estimators inefficient, as their variances are no longer minimum. Reverting to Equation 8, the ideal variance of an estimator is $Var[b|X] = \sigma^2 (X'X)^{-1}$. However, with heteroscedasticity, the variance becomes:
$$Var[b|X] = \sigma^2 (X'X)^{-1} (X'\Omega X) (X'X)^{-1}$$ (Equation 13)
The heteroscedastic error variance can be presented as follows in matrix form:
$$\sigma^2 \Omega = \sigma^2 \begin{bmatrix} \omega_1 & 0 & \cdots & 0 \\ 0 & \omega_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \omega_n \end{bmatrix} = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{bmatrix}$$

Analogous to multicollinearity, when the variance is affected and it is not minimum under heteroscedasticity, then making an inference using the t-test and F-test is inappropriate.

5.2 Testing for heteroscedasticity
Three tests for identifying heteroscedasticity are discussed and tested practically.

5.2.1 Goldfeld-Quandt test
This test assumes that the heteroscedastic variance is positively related with one of the explanatory variables. For instance, this variance may be of the form $\sigma_i^2 = \sigma^2 X_i^2$. A practical illustration of this test follows these steps (Gujarati & Porter, 2009):
i. Form the hypotheses such that:
$H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_n^2$ (Constant variance i.e. homoscedastic)
$H_1: \sigma_1^2 \neq \sigma_2^2 \neq \dots \neq \sigma_n^2$ (Non-constant variance i.e. heteroscedastic)
ii. Order observations in ascending order based on the values of $X_i$. In this case, we ordered observations based on GNP values as shown in Table 11.

Table 11: Rank of GNP values
. tab gnp

      Gross |
   National |
    Product |      Freq.     Percent        Cum.
------------+-----------------------------------
         20 |          1        2.50        2.50
         25 |          1        2.50        5.00
         27 |          1        2.50        7.50
         29 |          1        2.50       10.00
         36 |          1        2.50       12.50
         38 |          1        2.50       15.00
         44 |          2        5.00       20.00
         45 |          1        2.50       22.50
         46 |          1        2.50       25.00
         52 |          1        2.50       27.50
         55 |          1        2.50       30.00
         56 |          1        2.50       32.50
         57 |          1        2.50       35.00
         60 |          3        7.50       42.50
         75 |          1        2.50       45.00
        110 |          3        7.50       52.50
        112 |          1        2.50       55.00
        130 |          2        5.00       60.00
        133 |          1        2.50       62.50
        135 |          2        5.00       67.50
        138 |          1        2.50       70.00
        140 |          1        2.50       72.50
        141 |          1        2.50       75.00
        146 |          1        2.50       77.50
        148 |          1        2.50       80.00
        150 |          2        5.00       85.00
        154 |          1        2.50       87.50
        159 |          1        2.50       90.00
        165 |          1        2.50       92.50
        168 |          1        2.50       95.00
        169 |          1        2.50       97.50
        172 |          1        2.50      100.00
------------+-----------------------------------
      Total |         40      100.00

iii. Omit $\gamma$ central observations and divide the remaining observations into two groups, each with $(n - \gamma)/2$ observations. In our case, eight (8) values were eliminated out of the forty observations. Hence the remainder were divided into two sub-samples with sixteen observations each, as follows: $(n - \gamma)/2 = (40 - 8)/2 = 16$. The eliminated values fall in the range from 60 (one of the observations with a value of 60) to 130 in Table 11.
iv. Fit two separate regressions, one for each group of $(n - \gamma)/2$ observations, and obtain their respective Residual Sums of Squares (RSSs). Let the first be $RSS_1$ and the second be $RSS_2$. Note that each RSS has $\frac{(n - \gamma)}{2} - k$ degrees of freedom, where k is the number of parameters including the intercept. Results for the first sub-sample are as per Table 12.

Table 12: Regression Results for the First Sub-sample
. reg investment gnp moneysupply interest cpi if gnp [...]

Number of obs   =        17
F               =     14.91
Prob > F        =    0.0001
R-squared       =    0.8325
Adj R-squared   =    0.7766
Root MSE        =    2.9745

  investment |   P>|t|     [95% Conf. Interval]
-------------+---------------------------------
         gnp |   0.659    -.3991915    .6081266
 moneysupply |   0.076    -.0634629    1.110565
    interest |   0.631    -2.882087    4.569304
         cpi |   0.323    -18.19797    50.91418
       _cons |   0.361    -28.96556    11.36994

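Table 13: Results for the second Sub-sample
. reg investment gnp moneysupply interest cpi if gnp >=133

      Source |       SS           df       MS      Number of obs   =        16
-------------+----------------------------------   F(4, 11)        =     27.58
       Model |  612.668846         4  153.167211   Prob > F        =    0.0000
    Residual |  61.0811545        11  5.55283223   R-squared       =    0.9093
-------------+----------------------------------   Adj R-squared   =    0.8764
       Total |      673.75        15  44.9166667   Root MSE        =    2.3564

------------------------------------------------------------------------------
  investment |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gnp |   .4594797   .2595719     1.77   0.104    -.1118342    1.030794
 moneysupply |  -.0516484   .3747207    -0.14   0.893    -.8764031    .7731063
    interest |   2.792137   2.344982     1.19   0.259    -2.369133    7.953407
         cpi |  -1.103851   2.927679    -0.38   0.713     -7.54763    5.339928
       _cons |   10.93099   17.42973     0.63   0.543    -27.43158    49.29356
------------------------------------------------------------------------------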

The RSS from Table 13 is 61.0811545 with 11 degrees of freedom.

v. Compute the ratio
$$\tau = \frac{RSS_1 / \text{degrees of freedom}}{RSS_2 / \text{degrees of freedom}}$$ (Equation 14)
$\tau$ follows an F distribution with $\frac{(n - \gamma)}{2} - k$ degrees of freedom in both the numerator and the denominator. Thus an inference is made by comparing the calculated $\tau$ and the critical F value.
$$\tau = \frac{106.173449/11}{61.0811545/11} = 1.738$$
This is compared to the critical value at 1%, i.e. $F(11,11) = 4.46244$.

The calculated value of 1.738 is less than the critical value of 4.46244. Hence, we fail to reject the null hypothesis and conclude that there is homoscedasticity or, alternatively, that there is no heteroscedasticity. Similarly, the calculated values for real and growth variables are 3.47 and 0.00732 respectively. Compared against the critical value of 4.46244, these indicate that we fail to reject the null hypothesis and conclude that there is homoscedasticity. Tables for the real and growth variables are as follows:
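The arithmetic of Equation 14 for the nominal variables can be reproduced as follows. This is a minimal sketch that takes the two residual sums of squares and the degrees of freedom exactly as they are used in the text (11 and 11); the scalar names are illustrative assumptions.

```stata
* Residual sums of squares and degrees of freedom as used in the text
scalar rss1 = 106.173449
scalar rss2 = 61.0811545
scalar df1  = 11
scalar df2  = 11

* Equation 14
scalar tau = (rss1/df1)/(rss2/df2)
display "tau = " tau

* 1% critical value and the p-value of tau under the null hypothesis
display "F(11,11) critical = " invFtail(df1, df2, 0.01)
display "p-value = " Ftail(df1, df2, tau)
```

With a dataset in memory, rss1 and df1 would presumably be taken from e(rss) and e(df_r) after each sub-sample regression rather than typed in by hand.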

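Table 14: Results for Sub-samples of Real variables
. reg realinvestment realgnp realmoneysupply realinterestrate if realgnp [...]

Number of obs   =        16
F               =      2.66
Prob > F        =    0.0953
R-squared       =    0.3998
Adj R-squared   =    0.2497
Root MSE        =    3.7046

  realinvestment |   P>|t|     [95% Conf. Interval]
-----------------+---------------------------------
         realgnp |   0.957    -.4867042    .4625224
 realmoneysupply |   0.614    -.7006808    1.138023
realinterestrate |   0.210    -2.538606    .6201203
           _cons |   0.033     1.951156    40.16373

. reg realinvestment realgnp realmoneysupply realinterestrate if realgnp >=61.85567

          Source |       SS           df       MS      Number of obs   =        16
-----------------+----------------------------------   F(3, 12)        =     29.17
           Model |  345.616707         3  115.205569   Prob > F        =    0.0000
        Residual |   47.399832        12    3.949986   R-squared       =    0.8794
-----------------+----------------------------------   Adj R-squared   =    0.8492
           Total |  393.016539        15  26.2011026   Root MSE        =    1.9875

----------------------------------------------------------------------------------
  realinvestment |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
         realgnp |   .4867499   .0608666     8.00   0.000      .354133    .6193668
 realmoneysupply |   .0635633   .2105565     0.30   0.768    -.3951998    .5223265
realinterestrate |  -1.083354   .5945799    -1.82   0.093    -2.378833    .2121239
           _cons |   -.952946   5.514599    -0.17   0.866    -12.96822    11.06233
----------------------------------------------------------------------------------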

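Table 15: Results for Sub-samples of Growth variables
. reg investementgrowth gnpgrowth moneysupplygrowth interestgrowth CPIgrowth if gnpgrowth [...]

Number of obs   =        15
F               =    626.60
Prob > F        =    0.0000
R-squared       =    0.9960
Adj R-squared   =    0.9944
Root MSE        =    2.2129

investementgrowth |   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------+-----------------------------------------------------
        gnpgrowth |   .0807214    12.45   0.000      .824827    1.184544
moneysupplygrowth |     .07383     3.30   0.008     .0787749    .4077821
   interestgrowth |   .0029437     1.01   0.335    -.0035804    .0095375
        CPIgrowth |   .0752198    -2.28   0.045    -.3394378   -.0042373
            _cons |   1.737153     1.15   0.278    -1.875751    5.865486

. reg investementgrowth gnpgrowth moneysupplygrowth interestgrowth CPIgrowth if gnpgrowth >=31.57895

           Source |       SS           df       MS      Number of obs   =        14
------------------+----------------------------------   F(4, 9)         =      6.95
            Model |  202528.834         4  50632.2085   Prob > F        =    0.0078
         Residual |  65553.8035         9  7283.75594   R-squared       =    0.7555
------------------+----------------------------------   Adj R-squared   =    0.6468
            Total |  268082.637        13  20621.7413   Root MSE        =    85.345

-----------------------------------------------------------------------------------
investementgrowth |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------+----------------------------------------------------------------
        gnpgrowth |  -.2643189   .8165793    -0.32   0.754     -2.11155    1.582912
moneysupplygrowth |   .3531013   .8170492     0.43   0.676    -1.495192    2.201395
   interestgrowth |  -1.111587   .7045384    -1.58   0.149    -2.705364    .4821892
        CPIgrowth |   .8105771   .5986266     1.35   0.209    -.5436104    2.164764
            _cons |   5.901895   57.43246     0.10   0.920    -124.0194    135.8232
-----------------------------------------------------------------------------------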

These results have two main implications. First, they confirm the theory that heteroscedasticity is likely to be a trivial problem in time series data, such as our data (Kmenta, 1971). Second, they confirm the suspicions of Chapter Two that the graphs oscillated around a constant range, which indicated that the variance of the error term was constant and in turn homoscedastic.


5.2.2 White’s General Test
White’s General test is based on an auxiliary regression model. This auxiliary regression model has the squared residuals from the original model as the dependent variable, while the covariates are the values of the Xs, the squared values of the Xs and the cross product(s) of the Xs. The inference is made from the product of the sample size (n) and the R-squared of the auxiliary regression, i.e. $n \cdot R^2$. This statistic asymptotically follows a Chi-square distribution with $k - 1$ degrees of freedom, where $k$ is the number of covariates in the auxiliary model including the constant.

Similar to Goldfeld-Quandt test, the null hypothesis is that of homoscedasticity and the alternative is for heteroscedasticity. Hence, we conclude that there is heteroscedasticity if the calculated value is greater than the critical value meaning that the null hypothesis is rejected. Conversely, we conclude that there is homoscedasticity if the calculated value is less than the critical value and therefore we fail to reject the null hypothesis.

The results for this test in Stata are as shown in Table 16:

Table 16: White’s test for nominal, real and growth variables respectively
                      Nominal values   Real values   Growth values
Chi-squared           31.91            26.92         32.79
Prob > Chi-squared    0.0041           0.0014        0.0001

White’s test indicates that there is heteroscedasticity for all forms of the variables: nominal, real and growth. This is contrary to the results of the Goldfeld-Quandt test.

5.2.3 Breusch-Pagan-Godfrey Test

This is a Lagrange multiplier test that assumes the variance of the error term to be $\sigma_i^2 = \sigma^2 f(\alpha_0 + \alpha' Z_i)$, where $Z_i$ is a vector of nonstochastic independent variables. The model is homoscedastic if $\alpha = 0$, meaning that $\sigma_i^2 = \sigma^2 f(\alpha_0)$, which is a constant.

Inference is made by comparing the Lagrange multiplier statistic with the Chi-square distribution with $k - 1$ degrees of freedom. In this case,

\[
LM = \tfrac{1}{2}\,(\text{Explained sum of squares}) \ \text{from the regression of} \ \frac{e_i^2}{e'e/n} \ \text{on} \ Z_i .
\]

If the calculated value of $\tfrac{1}{2}$ (Explained sum of squares) is greater than the critical Chi-square value, then we reject the null hypothesis and conclude that there is heteroscedasticity. Conversely, we fail to reject the null hypothesis and conclude that there is homoscedasticity if the calculated value is less than the critical value.

Results of the Breusch-Pagan-Godfrey Test are presented in Table 17.

Table 17: Breusch-Pagan-Godfrey Test

               Nominal values   Real values   Growth values
chi2(1)                  1.33         42.51          123.16
Prob > chi2            0.2492        0.0000          0.0000

From Table 17, only the nominal variables are homoscedastic. Real and growth variables are heteroscedastic.

5.3 Conclusion

Among the three tests, only the Goldfeld-Quandt test finds homoscedasticity for the variables in all forms. Although this is a welcome finding, the main disadvantage of this test is that it relies on omitting a number of central observations, 𝛾, for which theory offers no agreed selection criterion. Hence, we adopt the results of White’s test and the Breusch-Pagan-Godfrey test and conclude that there is heteroscedasticity. To correct for heteroscedasticity, we estimate the models with heteroscedasticity-robust standard errors (Greene, 2012). This is seen in the final models in Chapter Eight.


CHAPTER SIX
AUTOCORRELATION

6.1 Introduction

Autocorrelation is a violation of the CLRM in which the disturbance term of one observation is correlated with the disturbance term of another observation. The CLRM requires that, in a time-series set-up such as the one in this study, the effect of a disturbance in one period is not carried over to the next period (Gujarati & Porter, 2009; Kmenta, 1971). Autocorrelation is formally presented as $E[u_t u_{t-s}] \neq 0$ for $t > s$ and $s \neq 0$. It means that the covariances between disturbances, i.e. the off-diagonal elements of the disturbance covariance matrix, are not equal to zero, as can be seen in the following matrix:

\[
\sigma^2 \Omega = \sigma^2
\begin{bmatrix}
1 & \rho_1 & \dots & \rho_{n-1} \\
\rho_1 & 1 & \dots & \rho_{n-2} \\
\vdots & \vdots & \ddots & \vdots \\
\rho_{n-1} & \rho_{n-2} & \dots & 1
\end{bmatrix}
\]

The most common form of autocorrelation is the first-order autoregressive scheme, often denoted as AR(1). Similar to heteroscedasticity, the main problem of autocorrelation is that it makes the OLS estimators inefficient. The estimators no longer attain the minimum possible variance and the usual standard errors are biased, which in turn makes inference based on t and F tests unreliable.

6.2 Testing for autocorrelation

Two main tests are discussed and applied in this study.

6.2.1 Durbin-Watson Test

This test is based on the ratio of the sum of squared differences in successive residuals to the RSS, as follows:

\[
d = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2}
  = 2(1 - r) - \frac{e_1^2 + e_T^2}{\sum_{t=1}^{T} e_t^2}
  \approx 2(1 - r) \qquad \text{(Equation 15)}
\]

Here $r = \sum_{t=2}^{T} e_t e_{t-1} / \sum_{t=1}^{T} e_t^2$ is the estimated AR(1) coefficient of the residuals, which lies between negative one and positive one ($-1 \le r \le 1$). Inference is then made considering the rules in Table 18.

Table 18: Durbin Watson test rules (Elementary case)

Value of r   Value of d   Remark
-1           4            Presence of negative autocorrelation
 0           2            No autocorrelation
 1           0            Presence of positive autocorrelation
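The relationship between d and r in Equation 15 can be checked numerically. The following is a minimal Python sketch, not part of the original Stata workflow; the function names and the residual values are hypothetical and chosen only for illustration.

```python
def durbin_watson(residuals):
    """Return d = sum_{t=2}^{T}(e_t - e_{t-1})^2 / sum_{t=1}^{T} e_t^2."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def two_times_one_minus_r(residuals):
    """Return 2(1 - r), with r = sum(e_t * e_{t-1}) / sum(e_t^2)."""
    cross = sum(residuals[t] * residuals[t - 1]
                for t in range(1, len(residuals)))
    r = cross / sum(e ** 2 for e in residuals)
    return 2 * (1 - r)


# Hypothetical residuals, invented purely for illustration.
e = [0.5, 0.7, 0.4, -0.2, -0.6, -0.3, 0.1, 0.4]
d = durbin_watson(e)
approx = two_times_one_minus_r(e)
end_term = (e[0] ** 2 + e[-1] ** 2) / sum(x ** 2 for x in e)

# d equals 2(1 - r) minus the end-point term, which shrinks as T grows.
print(round(d, 4), round(approx - end_term, 4), round(approx, 4))
```

The first two printed values coincide, which is the exact identity in Equation 15. The third is the approximation 2(1 − r), which differs only by the end-point term.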

In practice, inference is usually made by comparing the calculated d-statistic with two critical values: the lower bound, $d_L$, and the upper bound, $d_U$. Inference is then made based on the rules in Table 19 (see Gujarati & Porter (2009) for details).

Table 19: Durbin-Watson test decision matrix

Null Hypothesis                If                       Remark
No positive autocorrelation    0 < d < dL               Reject
No positive autocorrelation    dL ≤ d ≤ dU              No decision
No negative autocorrelation    4 − dL < d < 4           Reject
No negative autocorrelation    4 − dU ≤ d ≤ 4 − dL      No decision
No autocorrelation             dU < d < 4 − dU          Do not reject
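The decision matrix in Table 19 can be written as a small function. This is a Python sketch under the assumption that the critical bounds are looked up separately from published Durbin-Watson tables; the function name and return strings are hypothetical. The example call uses the nominal-data values reported in this chapter (d = 1.545359, dL = 1.2305, dU = 1.7859).

```python
def dw_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic using the rules of Table 19."""
    if not 0 <= d <= 4:
        raise ValueError("the Durbin-Watson statistic must lie in [0, 4]")
    if d < d_lower:
        return "reject: positive autocorrelation"
    if d <= d_upper:
        return "no decision"
    if d < 4 - d_upper:
        return "do not reject"
    if d <= 4 - d_lower:
        return "no decision"
    return "reject: negative autocorrelation"


# Nominal data: d lies between dL and dU, the inconclusive region.
print(dw_decision(1.545359, 1.2305, 1.7859))
```

A statistic close to 2, for example `dw_decision(2.0, 1.2305, 1.7859)`, falls between dU and 4 − dU and leads to non-rejection of the null hypothesis of no autocorrelation.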

Results of this test for nominal, real and growth variables are in Table 20.

Table 20: Durbin Watson Results for nominal, real and growth variables

Form of data   Null Hypothesis      d          dL       dU       Criteria           Remark
Nominal        No autocorrelation   1.545359   1.2305   1.7859   dL ≤ d ≤ dU        No decision
Real           No autocorrelation   1.345037   1.2848   1.7209   dL ≤ d ≤ dU        No decision
Growth         No autocorrelation   1.741014   1.2734   1.7215   dU < d < 4 − dU    Do not reject

Results in Table 20 portray the key disadvantage of the Durbin-Watson test: for the nominal and real data the statistic falls in the inconclusive region, so we are not able to conclude whether autocorrelation exists or not. Only for the growth data does d (1.741) exceed dU (1.7215), so that the null hypothesis of no autocorrelation is not rejected. This justifies the need for the second test, which is discussed below.


6.2.2 Breusch-Godfrey Test / Lagrange Multiplier Test

This is a Lagrange Multiplier test whose null hypothesis is that there is no autocorrelation, while the alternative hypothesis is that the error term follows either an autoregressive process of order p, AR(p), or a moving average process of order p, MA(p) (Greene, 2012). In our case, p = 1 because we consider an AR(1) process. Although Gujarati & Porter (2009) give a step-by-step illustration, the main idea of this test is that an auxiliary regression of the residuals on the original regressors and the lagged residuals is run, and inference is then based on $(n - p)R^2 \sim \chi^2_p$ (Johnston & DiNardo, 1998). This calculated value is compared with the critical Chi-square value with p degrees of freedom.
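For the AR(1) case considered here (p = 1), the auxiliary regression can be sketched as follows. The notation ($\alpha_j$, $\rho_1$, $v_t$, and $k$ regressors) is introduced only for this illustration and is not taken from the original text.

```latex
\[
e_t = \alpha_0 + \alpha_1 X_{1t} + \dots + \alpha_k X_{kt} + \rho_1 e_{t-1} + v_t ,
\qquad
(n - 1) R^2 \;\overset{a}{\sim}\; \chi^2_{1}
\]
```

Under the null hypothesis of no autocorrelation, $\rho_1 = 0$. At the 5% level the critical value of $\chi^2_1$ is 3.84, so the null hypothesis is rejected when $(n - 1)R^2$ exceeds that value.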

Results for nominal, real and growth variables are as follows:

Table 21: Breusch-Godfrey Test Results for nominal, real and growth variables

Form of data   Null hypothesis      Lags (p)   F       DF        Prob > F   Remark
Nominal        No autocorrelation   1          0.288   (1, 34)   0.5952     Do not reject
Real           No autocorrelation   1          0.068   (1, 35)   0.7956     Do not reject
Growth         No autocorrelation   1          0.101   (1, 34)   0.7523     Do not reject

According to Table 21, there is no autocorrelation among variables in all forms: nominal, real and growth. Nevertheless, we should recognize that the Breusch-Godfrey Test has a shortcoming in establishing the appropriate number of lags (Gujarati & Porter, 2009). Several methods of establishing lags have been suggested, but they are beyond the scope of this study (see footnote 13).

6.2.3 Conclusion

We conclude that there is no autocorrelation based on the Breusch-Godfrey Test.

13 These include the Akaike Information Criterion and the Schwarz Information Criterion, inter alia (Gujarati & Porter, 2009).


CHAPTER SEVEN
ENDOGENEITY - SIMULTANEOUS EQUATIONS

7.1 Introduction

This Chapter is on endogeneity in a set-up of simultaneous equations. As stated in Chapter Four, simultaneous equations are also a remedy for multicollinearity, especially in our case where several equations can be formed and at least one variable can be left out of each equation.

This Chapter is arranged as follows. After this introduction comes a background of simultaneous equations, followed by a discussion of the endogeneity problem and of methods of handling it. Results are left for Chapter Eight, which contains the comparative regression analysis. That analysis compares results from OLS with those of Two-Stage Least Squares (henceforth 2SLS).

7.2 Background of Simultaneous Equations

A simultaneous equation model entails estimating a system of equations rather than a single equation. More than one dependent variable is involved, which requires estimating as many equations as there are endogenous variables. Concisely, dependent variables are called endogenous variables, while independent variables, which are pre-determined, are called exogenous variables (see footnote 14). As will be seen shortly when explaining the endogeneity problem, applying OLS to such a system gives biased and inconsistent results, because an endogenous regressor violates the classical linear regression assumption of no correlation between the explanatory variables and the error term, i.e. strict exogeneity (Hill, Griffiths, & Lim, 2011, p. 450).

Based on theory, we estimate one simultaneous equation model with two equations (see footnote 15). The first explains investment, in accordance with the original objective. Investment is determined by money supply, GNP, CPI (for nominal and growth data) and interest rates. The second explains money supply, which is determined only by interest rates.

14 See the detailed explanations in Greene (2012), Hill, Griffiths, & Lim (2011) and Gujarati & Porter (2009).
15 The specification of a model should be consistent with theory (Gujarati & Porter, 2009; Hill et al., 2011).

The model specification for each category, nominal, real and growth is as follows:

i) Nominal specification

\[ Y_{1t} = \alpha_{10} + \alpha_{11} Y_{2t} + \lambda_{11} X_{1t} + \lambda_{12} X_{2t} + \lambda_{13} X_{3t} + \varepsilon_{1t} \qquad \text{(Equation 16)} \]
\[ Y_{2t} = \alpha_{20} + \lambda_{23} X_{3t} + \varepsilon_{2t} \qquad \text{(Equation 17)} \]

Where $Y_{1t}$ is Investment, $Y_{2t}$ is Money Supply, $X_{1t}$ is GNP, $X_{2t}$ is CPI and $X_{3t}$ is Interest Rates. $\alpha_{ij}$ and $\lambda_{ij}$ are the respective structural coefficients of the variables, with the first subscript indexing the equation. $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are error terms. Note that subscript “t” is used to index the observations in time series.

ii) Real specification

\[ Y_{1t} = \alpha_{10} + \alpha_{11} Y_{2t} + \lambda_{11} X_{1t} + \lambda_{12} X_{2t} + \varepsilon_{1t} \qquad \text{(Equation 18)} \]
\[ Y_{2t} = \alpha_{20} + \lambda_{22} X_{2t} + \varepsilon_{2t} \qquad \text{(Equation 19)} \]

Where $Y_{1t}$ is Real Investment, $Y_{2t}$ is Real Money Supply, $X_{1t}$ is Real GNP and $X_{2t}$ is Real Interest Rates. $\alpha_{ij}$ and $\lambda_{ij}$ are the respective coefficients of the variables. $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are error terms.

iii) Growth specification

\[ Y_{1t} = \alpha_{10} + \alpha_{11} Y_{2t} + \lambda_{11} X_{1t} + \lambda_{12} X_{2t} + \lambda_{13} X_{3t} + \varepsilon_{1t} \qquad \text{(Equation 20)} \]
\[ Y_{2t} = \alpha_{20} + \lambda_{23} X_{3t} + \varepsilon_{2t} \qquad \text{(Equation 21)} \]

Where $Y_{1t}$ is Growth of Investment, $Y_{2t}$ is Growth of Money Supply, $X_{1t}$ is Growth of GNP, $X_{2t}$ is Growth of CPI and $X_{3t}$ is Growth of Interest Rates. $\alpha_{ij}$ and $\lambda_{ij}$ are the respective coefficients of the variables. $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are error terms.

7.3 Endogeneity Problem

In a simultaneous equation model, an endogenous variable in one equation may appear as an explanatory variable in another equation of the system, as seen in Equations 16 to 21. As a consequence, the endogenous explanatory variable is stochastic and correlated with the disturbance term of the equation in which it appears. In this study, money supply is the dependent variable in the money supply equation and is also an explanatory variable in the investment equation. Therefore, it is correlated with the disturbance term of the investment equation. The major problem with this is that the least squares estimators are biased and inconsistent and cannot be used to estimate these equations. The objective is to estimate the structural coefficients on the endogenous variables $Y_{1t}$ and $Y_{2t}$ and on the exogenous variables $X_{1t}$, $X_{2t}$ and $X_{3t}$. To solve the endogeneity problem, we use Two Stage Least Squares (2SLS), which is an Instrumental Variables (IV) estimator (see footnote 16). This method produces estimators that are biased in small samples but consistent.

7.3.1 Two stage least square method

This method consists of using an instrument for $Y_{2t}$ in the form of its predicted values from a regression of $Y_{2t}$ on all the X's in the system. Since $Y_{2t}$ is correlated with the disturbance term, the solution is to purge $Y_{2t}$ of its endogeneity. This is done in two steps, hence the name Two Stage Least Squares.

Before explaining these steps, we note some important facts that need to be considered before using 2SLS:
i) 2SLS is applied where a model is over-identified (see footnote 17). In our case, the Investment equation is unidentified but the Money Supply equation is over-identified.
ii) As aforesaid, we assume that we have instruments for Money Supply (see footnote 18). The instruments for Money Supply are Interest Rates and CPI for nominal and growth variables. In the case of real values, the Real Interest Rate is used as the instrument.
iii) A test for simultaneity should be conducted to establish whether the variables treated as endogenous in the model are in fact exogenous. This can also be stated as a test for endogeneity. We use the Durbin–Wu–Hausman test, though as a post-estimation test for 2SLS. Hence its results are in Chapter Eight.

16 The Three Stage Least Squares method can also be used.
17 It is important to check the identification of a model before estimating it (Hill, Griffiths, & Lim, 2011). This can be done using the order or rank condition. See Gujarati & Porter (2009, pp. 699-702).
18 Instruments for Money Supply are variables that are highly correlated with it and are not included in the Investment model. Hence, we modify the equations with Investment as the dependent variable by assuming that they do not contain Interest Rates (in nominal, real and growth form) and CPI (in nominal and growth forms).

Reverting to our discussion, the steps for the 2SLS are as follows:
Step one: Run the reduced form regression of $Y_{2t}$ on all the exogenous variables in the system and obtain $\hat{Y}_{2t}$.
Step two: Replace $Y_{2t}$ in the original equation by the estimated $\hat{Y}_{2t}$ and then apply OLS to the transformed equation.
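The two steps can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the Stata estimation used in this study: it assumes one endogenous regressor (money supply), a single instrument (the interest rate) and no other exogenous regressors, and it uses only the first eight observations of Table B-1. The function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def simple_ols(x, y):
    """Regress y on a constant and x; return (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def two_stage_least_squares(y1, y2, z):
    """2SLS for y1 = a + b*y2 + e with a single instrument z for y2."""
    # Step one: reduced-form regression of y2 on the exogenous variable z.
    pi0, pi1 = simple_ols(z, y2)
    y2_hat = [pi0 + pi1 * zi for zi in z]
    # Step two: replace y2 by its fitted values and apply OLS.
    return simple_ols(y2_hat, y1)


# First eight observations of Table B-1, used only as a small illustration.
investment = [20, 88, 70, 55, 20, 21, 69, 11]       # Y1
money_supply = [8, 75, 50, 30, 15, 16, 53, 10]      # Y2
interest_rate = [5, 1.4, 0.5, 1.5, 2, 3, 0.4, 3]    # instrument (assumption)

intercept, slope = two_stage_least_squares(investment, money_supply, interest_rate)
print(round(intercept, 4), round(slope, 4))
```

In this just-identified case the 2SLS slope equals the ratio of the covariance between instrument and outcome to the covariance between instrument and endogenous regressor. Note that standard errors computed naively from the second-stage OLS are not valid, which is one reason to rely on a dedicated 2SLS routine in practice.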

7.4 Conclusion

The main aim of this Chapter was to define and explain simultaneous equations, including how to run an analysis in a set-up of simultaneous equations. To that end, 2SLS was discussed. Since 2SLS aims to remove the bias and inconsistency of the OLS estimators, we compare the 2SLS results with the OLS results in Chapter Eight.


CHAPTER EIGHT
COMPARATIVE REGRESSION ANALYSIS

8.1 Introduction

In this Chapter, OLS results are compared with 2SLS results for nominal, real and growth variables. However, it is important to take note of the following matters:
i) A dummy variable called devaluation is introduced. It is assumed to arise from a structural break. Hence, it is one (1) if there is devaluation and zero (0) if there is no devaluation. Devaluation occurred after time 25. The Do-File in the appendix elaborates on how to create this variable.
ii) We take into account recommendations from previous chapters when analyzing OLS regressions. For instance, Chapter Four on multicollinearity recommends the use of real values because their regression has no multicollinearity. Additionally, Chapter Five on heteroscedasticity recommends that we report robust results for nominal, real and growth variables.
iii) From footnote 18 in Chapter Seven, the instruments for Money Supply when running the 2SLS are Interest Rates (in nominal, real and growth form) and CPI (in nominal and growth forms).
iv) In accordance with Chapter Seven, we run a Durbin–Wu–Hausman test after conducting the 2SLS estimation.
v) We establish the most important variables in each model using Beta coefficients.
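The devaluation dummy in item i) can be illustrated with a short Python sketch. It is not the author's Do-File. It assumes 40 periods numbered 1 to 40 (the number of observations reported for the nominal models) and reads "after time 25" as strictly later than period 25; both are assumptions made for this example.

```python
# Assumptions: periods run from 1 to 40 and the structural break
# takes effect strictly after period 25.
BREAK_PERIOD = 25
N_PERIODS = 40

time = list(range(1, N_PERIODS + 1))

# 1 if there is devaluation in period t, 0 otherwise.
devaluation = [1 if t > BREAK_PERIOD else 0 for t in time]

# Inspect the periods around the break.
for t in (24, 25, 26, 27):
    print(t, devaluation[t - 1])
```

Under these assumptions the dummy is zero for the first 25 periods and one for the remaining 15.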

8.2 Nominal variables

We run a robust OLS regression model alongside the 2SLS. Their results are as per Table 22.

Table 22: Comparative Regression Results for Nominal Variables
Dependent Variable: Investment

                          OLS            2SLS
Money Supply              1.084985*      1.130936*
                          (0.0527589)    (0.0542584)
Interest Rate             -3.46943**     -2.898794*
                          (1.50286)      (1.008739)
Devaluation               -4.030134**    -4.176123**
                          (1.654583)     (1.783496)
Constant                  13.79669*      11.17757*
                          (4.139407)     (3.543523)
Number of observations    40             40
P-value                   0.0000         0.0000
R-squared                 0.9612         0.9605
Root MSE                  5.4839         5.2521

Key: Standard errors are in parentheses. Asterisk (*) represents level of significance. * is for 1% and ** is for 5%.

Results from Table 22 indicate a slight difference in the magnitude of the coefficients from OLS and 2SLS, but the two sets of results are broadly the same. 2SLS gives slightly larger coefficients (in absolute value) for money supply and devaluation and a smaller one for the interest rate, so the two estimators agree closely. Only money supply positively influences investment, while interest rates and devaluation reduce investment. Money supply, interest rates and devaluation explain over 96% of the variations in investment as per the R-squared value. Furthermore, both the OLS and 2SLS models are highly significant.

Results for the Durbin–Wu–Hausman test for endogeneity are shown in Table 23.

Table 23: Results of the Durbin–Wu–Hausman test for endogeneity (Nominal data)

. estat endogenous

  Tests of endogeneity
  Ho: variables are exogenous

  Durbin (score) chi2(1)   =  15.9641   (p = 0.0001)
  Wu-Hausman F(1,35)       =  23.2462   (p = 0.0000)

Results in Table 23 confirm that endogeneity exists, because we reject the null hypothesis that the variables are exogenous. Hence, running the 2SLS was appropriate. We also provide results to establish whether the instruments for money supply are suitable, since they are expected to be highly correlated with money supply. The instruments in this case are interest rate, devaluation, CPI and GNP. The estat firststage results are presented in Table 24.

Table 24: estat firststage results (Nominal variables)

. estat firststage

First-stage regression summary statistics

Variable       R-sq.    Adjusted R-sq.   Partial R-sq.   F(2,35)   Prob > F
moneysupply    0.9760   0.9732           0.9542          364.699   0.0000

Minimum eigenvalue statistic = 364.699

Critical Values                        # of endogenous regressors: 1
Ho: Instruments are weak               # of excluded instruments:  2

                                       5%      10%     20%     30%
2SLS relative bias                     (not available)

                                       10%     15%     20%     25%
2SLS Size of nominal 5% Wald test      19.93   11.59   8.75    7.25
LIML Size of nominal 5% Wald test      8.68    5.33    4.42    3.92

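From Table 24, the R-squared value is very high, indicating the close relationship between money supply and interest rate, devaluation, CPI and GNP. In addition, the minimum eigenvalue statistic (364.699) is far above the largest critical value in the table (19.93), so the null hypothesis that the instruments are weak is rejected.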

To assess the order of importance of determinants of Investment, we run a regression with beta and the result is presented in Table 25.

Table 25: Beta regression results for Nominal variables

. reg investment gnp moneysupply interest cpi devaluation, beta

Source       SS            df    MS
Model        27571.2855     5    5514.25711
Residual     350.689463    34    10.314396
Total        27921.975     39    715.948077

Number of obs = 40
F(5, 34)      = 534.62
Prob > F      = 0.0000
R-squared     = 0.9874
Adj R-squared = 0.9856
Root MSE      = 3.2116

investment      Coef.        Std. Err.    t       P>|t|    Beta
gnp             .5542207     .0664195     8.34    0.000    1.054579
moneysupply     .1273716     .151463      0.84    0.406    .1043102
interest        1.243568     .9205758     1.35    0.186    .0545258
cpi             -4.25439     2.833376     -1.50   0.142    -.130437
devaluation     .1155779     1.196279     0.10    0.924    .0021431
_cons           -5.167484    3.199425     -1.62   0.116    .

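The most important determinant of Investment is GNP followed by Money Supply, Interest rates, CPI and devaluation. Note that in absolute terms the Beta coefficient of CPI (0.130) slightly exceeds that of Money Supply (0.104), so these two variables are of comparable importance.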
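A Beta (standardized) coefficient rescales a slope by the ratio of the standard deviation of the regressor to that of the dependent variable, which is what makes coefficients comparable across variables measured in different units. The Python sketch below illustrates this for a single regressor, using only the first eight observations of GNP and Investment from Table B-1. It is an illustration of the idea, not a reproduction of Table 25, and the function names are hypothetical.

```python
from statistics import mean, stdev


def ols_slope(x, y):
    """Slope from regressing y on a constant and x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def beta_coefficient(x, y):
    """Standardized coefficient: slope * sd(x) / sd(y)."""
    return ols_slope(x, y) * stdev(x) / stdev(y)


# First eight observations of Table B-1, used only as a small illustration.
gnp = [20, 172, 130, 110, 44, 45, 138, 29]
investment = [20, 88, 70, 55, 20, 21, 69, 11]

print(round(ols_slope(gnp, investment), 4),
      round(beta_coefficient(gnp, investment), 4))
```

With one regressor the Beta coefficient coincides with the correlation coefficient between the two variables. In a multiple regression the same rescaling is applied to each partial slope, and variables are then ranked by the absolute value of their Beta.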

We also provide results for the elasticity in Table A-3 in the appendix.

8.3 Real Variables

We run a robust OLS regression model alongside the 2SLS. Their results are as per Table 26. Recall from Chapter Four that real variables are the major remedy for the problem of multicollinearity.

Table 26: Comparative Regression Results for Real Variables
Dependent Variable: Real Investment

                          OLS            2SLS
Real Money Supply         1.15012*       1.791202*
                          (0.2464497)    (0.2650642)
Real Interest Rate        -0.5047486     0.6244376
                          (0.5827745)    (0.6632093)
Devaluation               -3.165434**    -4.152562**
                          (1.384805)     (1.732614)
Constant                  4.012866       -10.84845***
                          (5.721219)     (6.230379)
Number of observations    40             40
P-value                   0.0000         0.0000
R-squared                 0.6703         0.5625
Root MSE                  4.5523         4.9752

Key: Standard errors are in parentheses. Asterisk (*) represents level of significance. * is for 1%, ** is for 5% and *** is for 10%.

There is a substantial difference in the magnitude of the coefficients between the two estimators. Apart from real interest rates, the two methods give the same sign for the effect of real money supply and devaluation on real investment. OLS predicts that real interest rates negatively impact real investment, which is in line with theory. On the other hand, 2SLS shows a positive relationship between real interest rates and real investment. Nevertheless, both methods indicate that the real interest rate is an insignificant determinant of real investment. Real money supply has a positive and significant effect on real investment, while devaluation has a negative and significant effect.

Both models are significant at 1%, and real interest rates, devaluation and money supply explain at least 56% of the variations in real investment as per the R-squared value.


Table 27: Results of the Durbin–Wu–Hausman test for endogeneity (Real data)

. estat endogenous

  Durbin (score) chi2   =  19.0872   (p = 0.0000)
  Wu-Hausman F          =  31.9445   (p = 0.0000)

We conclude from Table 27 that endogeneity exists, because we reject the null hypothesis of no endogeneity. Therefore, running the 2SLS was appropriate and we adopt its results, since relying on the OLS results would be inappropriate.

In this regard, we check for the relationship between Real Money Supply and its instruments; real interest rate, devaluation and Real GNP. Results are presented in Table 28.

Table 28: estat firststage results (Real variables)

. estat firststage

First-stage regression summary statistics

Variable        R-sq.    Adjusted R-sq.   Partial R-sq.   F(1,36)   Prob > F
realmoneys~y    0.7707   0.7516           0.5933          52.5127   0.0000

Critical Values                        # of endogenous regressors: 1
                                       # of excluded instruments:  1

2SLS relative bias                     (not available)

                                       10%     15%     20%     25%
2SLS Size of nominal 5% Wald test      16.38   8.96    6.66    5.53
LIML Size of nominal 5% Wald test      16.38   8.96    6.66    5.53

The R-squared in Table 28 shows that real money supply is strongly related with its instruments.

To assess the order of importance of determinants of Real Investment, we run a regression with beta and the result is presented in Table 29.

Table 29: Beta regression results for Real variables

. reg realinvestment realgnp realmoneysupply realinterestrate devaluation, beta

Source       SS            df    MS
Model        1872.84567     4    468.211419
Residual     390.047003    35    11.1442001
Total        2262.89268    39    58.0228892

Number of obs = 40
F(4, 35)      = 42.01
Prob > F      = 0.0000
R-squared     = 0.8276
Adj R-squared = 0.8079
Root MSE      = 3.3383

realinvestment      Coef.        Std. Err.    t       P>|t|    Beta
realgnp             .4146117     .0733573     5.65    0.000    .7465027
realmoneysupply     .2149832     .2148067     1.00    0.324    .1466682
realinterestrate    -.3198358    .3989818     -0.80   0.428    -.0744705
devaluation         -.8392182    1.220841     -0.69   0.496    -.0546611
_cons               .1993249     3.319366     0.06    0.952    .

The most important determinant of real investment is real GNP followed by real money supply, real interest rates and devaluation.

8.4 Growth Variables

We run a robust OLS regression model alongside the 2SLS. Their results are as per Table 30.

Table 30: Comparative Regression Results for Growth Variables
Dependent Variable: Investment Growth

                          OLS              2SLS
Money Supply Growth       0.6288854*       0.6509025*
                          (0.0654304)      (0.0629803)
Interest Rate Growth      -0.0831456***    -0.077507***
                          (0.0436476)      (0.0415042)
Devaluation               -3.915449        -2.396236
                          (21.32782)       (20.25034)
Constant                  23.10643         20.68371
                          (15.96035)       (15.18889)
Number of observations    39               39
P-value                   0.0000           0.0000
R-squared                 0.7901           0.7894
Root MSE                  63.665           60.409

Key: Standard errors are in parentheses. Asterisk (*) represents level of significance. * is for 1%, ** is for 5% and *** is for 10%.

OLS and 2SLS results are very similar as per Table 30. Growth of money supply has a positive effect and growth of interest rates a negative effect on growth of investment, which is in accordance with theory. These relationships are significant at 1% and 10% respectively. Devaluation has a negative but insignificant relationship with growth in investment. Both models are highly significant. Money supply growth, interest rate growth and devaluation explain about 79% of variations in growth of investment.


Table 31: Results of the Durbin–Wu–Hausman test for endogeneity (Growth data)

. estat endogenous

  Durbin (score) chi2   =  4.34046   (p = 0.0372)
  Wu-Hausman F          =  4.25786   (p = 0.0468)

Endogeneity exists as per Table 31 where we reject the null hypothesis of exogeneity at 5% level of significance.

In this regard, we check for the relationship between Growth of Money Supply and its instruments: interest rate growth, devaluation, CPI growth and GNP growth. Results are presented in Table 32.

Table 32: estat firststage results (Growth variables)

. estat firststage

First-stage regression summary statistics

Variable        R-sq.    Adjusted R-sq.   Partial R-sq.   F(2,34)   Prob > F
moneysuppl~h    0.9765   0.9738           0.9718          584.829   0.0000

Critical Values                        # of endogenous regressors: 1
                                       # of excluded instruments:  2

2SLS relative bias                     (not available)

                                       10%     15%     20%     25%
2SLS Size of nominal 5% Wald test      19.93   11.59   8.75    7.25
LIML Size of nominal 5% Wald test      8.68    5.33    4.42    3.92

The correlation between money supply growth and its instruments is very high as per Table 32.

To assess the order of importance of determinants of Growth of Investment, we run a regression with beta and the result is presented in Table 33.

Table 33: Beta regression results for Growth variables

. reg investementgrowth gnpgrowth moneysupplygrowth interestgrowth CPIgrowth devaluation

Source       SS            df    MS
Model        581706.516     5    116341.303
Residual     94220.8736    33    2855.17799
Total        675927.389    38    17787.5629

Number of obs = 39
F(5, 33)      = 40.75
Prob > F      = 0.0000
R-squared     = 0.8606
Adj R-squared = 0.8395
Root MSE      = 53.434

investementgrowth     Coef.        Std. Err.    t       P>|t|    [95% Conf. Interval]
gnpgrowth             .3804756     .3418281     1.11    0.274    -.3149788    1.07593
moneysupplygrowth     -.1285384    .3267432     -0.39   0.697    -.7933024    .5362257
interestgrowth        -.0615173    .0376717     -1.63   0.112    -.1381611    .0151264
CPIgrowth             .8540074     .2138646     3.99    0.000    .4188966     1.289118
devaluation           -19.16295    18.29504     -1.05   0.303    -56.38448    18.05859
_cons                 24.19086     13.46361     1.80    0.082    -3.201057    51.58278

The most important determinant of Growth of Investment is growth of GNP followed by growth of money supply, interest rate growth, CPI and devaluation.


CHAPTER NINE
CONCLUSION AND RECOMMENDATION

9.1 Introduction

This Chapter summarizes the present study and gives recommendations based on the findings. The Chapter has three further sections. Section 9.2 presents the summary of the study and discusses the empirical findings. Econometric recommendations are given in Section 9.3, and Section 9.4 gives the final conclusion of the study.

9.2 Summary of the study

The objectives of this study were two-fold. The first was to conduct diagnostic tests of multicollinearity, heteroscedasticity, autocorrelation and endogeneity in Stata. These are violations of the CLRM assumptions, and endogeneity was to be analyzed in a system of simultaneous equations. The second was to analyze the effect of GNP, interest rate, money supply and devaluation on investment. Nonetheless, the overriding aim of this study was to use the Stata software in practice to conduct data analysis. To achieve these objectives, we first conducted a preliminary examination of the data (at nominal, real and percentage growth levels) using descriptive statistics and graphical analysis. Measures of dispersion, measures of central tendency, frequency distributions using skewness and kurtosis, trend graphs, and Kernel density functions were used. This was followed by a review of what we call “early regression results.” At this stage, we ran regressions without conducting any diagnostic tests. This was meant to compare results that violate CLRM assumptions with those that correct for the violations at a later stage. The study proceeded to conduct diagnostic tests in this order: multicollinearity (using VIF); heteroscedasticity (using the Goldfeld-Quandt test, White’s test and the Breusch-Pagan-Godfrey Test); and autocorrelation (using the Durbin-Watson test and the Breusch-Godfrey / Lagrange Multiplier Test). We later reviewed simultaneous equations with the aim of testing for endogeneity. This involved conducting a 2SLS estimation and testing for endogeneity using the Durbin–Wu–Hausman test.


In general, the study found that there is high multicollinearity among money supply, GNP and CPI for nominal and growth variables. However, real variables, which were created by dividing through by CPI, did not have a multicollinearity problem. Hence, using real variables and conducting 2SLS were suggested as remedies for this problem. Note that investment was the dependent variable. Whereas heteroscedasticity was present, there was no autocorrelation among variables. Hence, it was recommended that we run robust OLS regressions. The Durbin–Wu–Hausman test affirmed the problem of endogeneity and necessitated the use of 2SLS. Therefore, two simultaneous equations were used: one for the determinants of investment and the other for the determinants of money supply. Hence, we formed instruments for money supply at the respective levels (nominal, real and growth).

It was established that the most important determinant of Investment is GNP, followed by money supply, interest rate, CPI and devaluation (for nominal and growth variables). In the case of real variables, the most important determinant of real investment is real GNP, followed by real money supply, real interest rates and devaluation.

9.3 Econometric recommendations

This section discusses the two main weaknesses of this study and in turn gives recommendations for future studies. The first weakness is that this study is a basic-level analysis of time series data. An extension of the study should consider econometric methods that test for major time series data problems such as unit roots and co-integration. Second, the study uses hypothetical data, whereas real world data could offer different insights that would enable us to give policy recommendations and prescriptions. Nevertheless, the main strength of this study is that it provides a detailed Do-File in Appendix C which a researcher can use. It also gives leads to sources of information that can be used by a researcher.

9.4 Conclusion

This chapter outlined the core objective of the entire study, summarized the findings, highlighted the weaknesses and strengths and proposed areas of future research. In light of the study’s finding that the most important determinant of investment is GNP followed by money supply, interest rate, CPI and devaluation, it is crucial for governments to maintain stability for these variables.

REFERENCES

Branson, W. (1989). Macroeconomic Theory and Policy. New York: Harper and Row.
Geda, A., Ndung'u, N., & Zerfu, D. (2012). Applied Time Series Econometrics: A Practical Guide for Macroeconomic Researchers with a Focus on Africa. Nairobi: University of Nairobi Press.
Greene, W. (2012). Econometric Analysis. Essex: Pearson Education Limited.
Gujarati, D. (2012). Econometrics by Example. Hampshire: Palgrave Macmillan.
Gujarati, D. N., & Porter, D. C. (2009). Basic Econometrics. New York: McGraw-Hill/Irwin.
Hayashi, F. (2000). Econometrics. New Jersey: Princeton University Press.
Hill, R., Griffiths, W., & Lim, G. (2011). Principles of Econometrics. New Jersey: John Wiley & Sons, Inc.
Johnston, J., & DiNardo, J. (1998). Econometric Methods (4 ed.). Singapore: McGraw-Hill.
Kmenta, J. (1971). Elements of Econometrics. New York: Macmillan Publishing Co., Inc.
kullabs. (2017, May 20). kullabs. Retrieved from www.kullabs.com: https://www.kullabs.com/classes/subjects/units/lessons/notes/note-detail/9135
Medica, B. (2017, May 21). Biochemia medica. Retrieved from Biochemia medica: http://www.biochemia-medica.com/content/reference-intervals-tool-total-qualitymanagement
Romer, D. (2012). Advanced Macroeconomics. New York: McGraw-Hill.
Salvatore, D., & Reagle, D. (2002). Theory & Problems of Statistics & Econometrics. New York: 2002.


APPENDIX A

Table A- 1: Summary of Unit Root Test Results

Variable         Test   Test Statistic   Inference   Remark
Investment       ADF    -6.210*          I(0)        Stationary at level
                 PP     -6.232*          I(0)        Stationary at level
                 ZA     -6.980*          I(0)        Stationary at level
GNP              ADF    -6.469*          I(0)        Stationary at level
                 PP     -6.469*          I(0)        Stationary at level
                 ZA     -7.440*          I(0)        Stationary at level
Money supply     ADF    -6.350*          I(0)        Stationary at level
                 PP     -6.372*          I(0)        Stationary at level
                 ZA     -7.480*          I(0)        Stationary at level
Interest Rate    ADF    -6.433*          I(0)        Stationary at level
                 PP     -6.395*          I(0)        Stationary at level
                 ZA     -7.450*          I(0)        Stationary at level
CPI              ADF    -6.614*          I(0)        Stationary at level
                 PP     -6.627*          I(0)        Stationary at level
                 ZA     -7.996*          I(0)        Stationary at level

Key: ADF- Augmented Dickey-Fuller test, PP- Phillips-Perron and, ZA- Zivot-Andrews. Asterisk (*) represents significance at 1% level.

Table A-2: Shapiro-Wilk W test for normality

Variable        Observations   W       V       z       Prob>z
Investment      40             0.866   5.303   3.511   0.000220
GNP             40             0.887   4.469   3.151   0.000810
Money supply    40             0.921   3.133   2.403   0.00813
Interest Rate   40             0.908   3.652   2.726   0.00321
CPI             40             0.866   5.278   3.501   0.000230

Key: A variable is treated as normally distributed if its p-value (Prob>z) is greater than 0.01. None of the five variables meets this criterion.
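The Prob>z column of Table A-2 can be cross-checked against the z column. The sketch below assumes, as in Stata's swilk output, that the reported p-value is the upper-tail probability of a standard normal variate at z; \Phi denotes the standard normal distribution function, a symbol introduced here.

```latex
p = 1 - \Phi(z), \qquad \text{e.g. Investment: } p = 1 - \Phi(3.511) \approx 0.00022 .
```

In Stata the same figure can be obtained with display 1 - normal(3.511). Every p-value in the table is below 0.01, so normality is rejected for all five variables.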

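Table A-3: Elasticity Results

. reg LNInvestment LNGNP LNIMoneysupply LNIInterest LNIcpi devaluation

      Source |       SS         df       MS          Number of obs =      40
-------------+------------------------------        F(5, 34)      =  165.70
       Model |  17.1384886       5   3.42769771      Prob > F      =  0.0000
    Residual |  .703316033      34   .020685766      R-squared     =  0.9606
-------------+------------------------------        Adj R-squared =  0.9548
       Total |  17.8418046      39   .457482169      Root MSE      =  .14383

----------------------------------------------------------------------------------
   LNInvestment |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+-----------------------------------------------------------------
          LNGNP |   .6750993    .225774     2.99   0.005     .2162713    1.133927
 LNIMoneysupply |    .122548   .2242323     0.55   0.588    -.3331468    .5782427
    LNIInterest |  -.0420006   .0409571    -1.03   0.312    -.1252355    .0412344
         LNIcpi |   .2464348   .1443013     1.71   0.097    -.0468206    .5396903
    devaluation |  -.0253225   .0523345    -0.48   0.632    -.1316791     .081034
          _cons |    .234934   .5287077     0.44   0.660    -.8395292    1.309397
----------------------------------------------------------------------------------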
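As a consistency check, the summary statistics of Table A-3 (165.70, 0.9606, 0.9548 and .14383) can be reproduced from the sums of squares and degrees of freedom in the same table. The subscripted symbols SS and MS are shorthand introduced here for the Model, Residual and Total rows.

```latex
\begin{align*}
R^2 &= \frac{SS_{\text{Model}}}{SS_{\text{Total}}} = \frac{17.1384886}{17.8418046} \approx 0.9606 \\
F(5,34) &= \frac{MS_{\text{Model}}}{MS_{\text{Residual}}} = \frac{3.42769771}{0.020685766} \approx 165.70 \\
\bar{R}^2 &= 1 - \frac{MS_{\text{Residual}}}{MS_{\text{Total}}} = 1 - \frac{0.020685766}{0.457482169} \approx 0.9548 \\
\text{Root MSE} &= \sqrt{MS_{\text{Residual}}} = \sqrt{0.020685766} \approx 0.14383
\end{align*}
```

Each t statistic is likewise the coefficient divided by its standard error; for LNGNP, .6750993 / .225774 is approximately 2.99.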

APPENDIX B

Table B-1: Data

Time   ID   GNP   Investment   Money supply   Interest rate   CPI
1      4    20    20           8              5               0.73
2      50   172   88           75             1.4             3
3      27   130   70           50             0.5             1.67
4      23   110   55           30             1.5             1.48
5      7    44    20           15             2               0.77
6      17   45    21           16             3               0.99
7      32   138   69           53             0.4             1.98
8      3    29    11           10             3               0.69
9      37   150   72           58             0.3             2.49
10     11   52    22           17             2               0.87
11     48   168   84           76             1.5             3.05
12     36   146   75           57             0.2             2.04
13     42   154   80           66             1.8             2.89
14     2    27    11           12             3.5             0.67
15     8    44    21           16             2               0.81
16     24   110   60           40             0.5             1.54
17     1    25    12           10             4               0.65
18     6    38    16           15             3               0.78
19     14   55    24           19             3               0.91
20     16   60    33           30             1.2             0.97
21     22   112   55           35             1               1.06
22     33   140   68           53             0.3             1.99
23     29   135   72           52             0.3             1.72
24     9    46    21           16             2               0.83
25     13   57    26           25             2               0.87
26     20   75    36           30             1               1
27     47   165   86           74             1.3             2.99
28     19   60    25           22             1.5             0.99
29     30   130   70           50             0.5             1.89
30     26   110   62           48             1               1.67
31     40   148   74           60             0.2             2.88
32     38   150   77           57             0.3             2.61
33     28   133   71           51             0.4             1.68
34     45   159   81           71             1.4             2.92
35     18   56    26           28             2               0.98
36     35   141   70           56             0.3             2.03
37     49   169   83           78             1.4             2.89
38     15   60    30           27             1               0.93
39     31   135   68           52             0.4             1.95
40     5    36    16           14             3               0.76

APPENDIX C

Table C-1: Do-File

**************************************************************
* Determinants of Investment
* Socrates Kraido Majune
* Violations of CLRM and Simultaneous Equations assignment
**************************************************************
set more off
* Load the dataset
use "E:\Economics\PhD Economics\PhD Semester Two\QM2\EC706 Data.dta"
* Selecting 80 percent of the data
sample 80
* Generating a time variable
gen time=_n
* Setting ID
gen ID=time
* Labeling variables
label variable time "Created Time"
label variable ID "Item Number"
label variable gnp "Gross National Product"
label variable investment "National Investment"
label variable moneysupply "Money Supply"
label variable interest "Interest Rate"
label variable cpi "Consumer Price Index"
**************************************************************
* Generation of real variables
gen realgnp=gnp/cpi
gen realinvestment=investment/cpi
gen realmoneysupply=moneysupply/cpi

gen realinterestrate=interest-cpi
* Labeling variables
label variable realgnp "Real Gross National Product"
label variable realinvestment "Real National Investment"
label variable realmoneysupply "Real Money Supply"
label variable realinterestrate "Real Interest Rate"
**************************************************************
* Generation of growth variables
gen gnpgrowth=(gnp[_n]-gnp[_n-1])/gnp[_n-1]*100
gen investementgrowth=(investment[_n]-investment[_n-1])/investment[_n-1]*100
gen moneysupplygrowth=(moneysupply[_n]-moneysupply[_n-1])/moneysupply[_n-1]*100
gen interestgrowth=(interest[_n]-interest[_n-1])/interest[_n-1]*100
* Labeling variables
label variable gnpgrowth "Growth of GNP"
label variable investementgrowth "Growth of National Investment"
label variable moneysupplygrowth "Growth of Money Supply"
label variable interestgrowth "Growth of Interest rate"
**************************************************************
* Descriptive statistics for Nominal Values
tabstat gnp investment moneysupply interest cpi, stat(n mean sd cv skewness kurtosis) long col(stat)
* To transfer output directly to a Word document,
* first install the logout command, which exports the results
ssc install logout
logout, save(mytable) word replace: tabstat gnp investment moneysupply interest cpi, stat(n mean sd cv skewness kurtosis) long col(stat)
* Descriptive statistics for Real values
tabstat realgnp realinvestment realmoneysupply realinterestrate, stat(n mean sd cv skewness kurtosis) long col(stat)
* To transfer output directly to a Word document
logout, save(mytable) word replace: tabstat realgnp realinvestment realmoneysupply realinterestrate, stat(n mean sd cv skewness kurtosis) long col(stat)

* Descriptive statistics for Growth values
tabstat gnpgrowth investementgrowth moneysupplygrowth interestgrowth, stat(n mean sd cv skewness kurtosis) long col(stat)
* To transfer output directly to a Word document
logout, save(mytable) word replace: tabstat gnpgrowth investementgrowth moneysupplygrowth interestgrowth, stat(n mean sd cv skewness kurtosis) long col(stat)
**************************************************************
* Graphical Analysis
* This requires running a simple regression of each variable on time,
* then generating the respective predictions as follows
reg gnp time
predict gnpp
reg investment time
predict investmentp
reg moneysupply time
predict moneysupplyp
reg interest time
predict interestp
reg cpi time
predict cpip
* Graphs are plotted as follows
twoway (line gnp gnpp time), ytitle(GNP) xtitle(Time) title(GNP VS TIME) name(Figure1)
kdensity gnp, normal
twoway (line investment investmentp time), ytitle(Investment) xtitle(Time) title(Investment VS Time) name(Figure3)
kdensity investment, normal
twoway (line moneysupply moneysupplyp time), ytitle(Money Supply) xtitle(Time) title(Money Supply VS Time) name(Figure5)
kdensity moneysupply, normal
twoway (line interest interestp time), ytitle(Interest rate) xtitle(Time) title(Interest Rate VS Time) name(Figure7)
kdensity interest, normal
twoway (line cpi cpip time), ytitle(CPI) xtitle(Time) title(CPI VS Time) name(Figure8)


kdensity cpi, normal
* Unit root tests
* Augmented Dickey-Fuller test
* First set data as time series
tsset time
dfuller gnp
dfuller investment
dfuller moneysupply
dfuller interest
dfuller cpi
* Phillips-Perron test
pperron gnp
pperron investment
pperron moneysupply
pperron interest
pperron cpi
* Zivot-Andrews test
* First install Zivot-Andrews, then run commands
ssc install zandrews
zandrews gnp
zandrews investment
zandrews moneysupply
zandrews interest
zandrews cpi
* Shapiro-Wilk W test for normality
swilk investment gnp moneysupply interest cpi
* Exporting Results Directly into Word
logout, save(mytable) word replace: swilk investment gnp moneysupply interest cpi
* Plotting graphs for Nominal Vs Real values


line gnp time, yaxis(1 2) xaxis(1 2) || line realgnp time
line investment time, yaxis(1 2) xaxis(1 2) || line realinvestment time
line moneysupply time, yaxis(1 2) xaxis(1 2) || line realmoneysupply time
line interest time, yaxis(1 2) xaxis(1 2) || line realinterestrate time
*** Plotting graphs for growth of Nominal Vs Real values
* First generate growth of real values as follows
gen gnprgrowth=(realgnp[_n]-realgnp[_n-1])/realgnp[_n-1]*100
gen realinvestmentgrowth=(realinvestment[_n]-realinvestment[_n-1])/realinvestment[_n-1]*100
gen realmoneysupplygrowth=(realmoneysupply[_n]-realmoneysupply[_n-1])/realmoneysupply[_n-1]*100
gen realinterestrategrowth=(realinterestrate[_n]-realinterestrate[_n-1])/realinterestrate[_n-1]*100
gen CPIgrowth=(cpi[_n]-cpi[_n-1])/cpi[_n-1]*100
* Labeling variables
label variable gnprgrowth "Growth of Real GNP"
label variable realinvestmentgrowth "Growth of Real Investment"
label variable realmoneysupplygrowth "Growth of Real Money Supply"
label variable realinterestrategrowth "Growth of Real Interest Rate"
label variable CPIgrowth "Growth of CPI"
* Graphs are then generated as follows
line gnpgrowth time, yaxis(1 2) xaxis(1 2) || line gnprgrowth time
line investementgrowth time, yaxis(1 2) xaxis(1 2) || line realinvestmentgrowth time
line moneysupplygrowth time, yaxis(1 2) xaxis(1 2) || line realmoneysupplygrowth time
line interestgrowth time, yaxis(1 2) xaxis(1 2) || line realinterestrategrowth time
**************************************************************
* Correlation Analysis
* Correlation Of Nominal Variables
pwcorr investment gnp moneysupply interest cpi, star(.01) bonferroni
* Exporting Results Directly into Word
logout, save(mytable) word replace: pwcorr investment gnp moneysupply interest cpi, star(.01) bonferroni
* Correlation Of Real Variables


pwcorr realgnp realinvestment realmoneysupply realinterestrate, star(.01) bonferroni
* Exporting Results Directly into Word
logout, save(mytable) word replace: pwcorr realgnp realinvestment realmoneysupply realinterestrate, star(.01) bonferroni
* Correlation Of Growth Variables
pwcorr investementgrowth gnpgrowth moneysupplygrowth interestgrowth, star(.01) bonferroni
* Exporting Results Directly into Word
logout, save(mytable) word replace: pwcorr investementgrowth gnpgrowth moneysupplygrowth interestgrowth, star(.01) bonferroni
**************************************************************
* Regression Analysis
* Regression of nominal variables
reg investment gnp
reg investment moneysupply
reg investment interest
reg investment cpi
reg investment gnp moneysupply interest cpi
* Regression of Real variables
reg realinvestment realgnp
reg realinvestment realmoneysupply
reg realinvestment realinterestrate
reg realinvestment realgnp realmoneysupply realinterestrate
* Regression of Growth variables
reg investementgrowth gnpgrowth
reg investementgrowth moneysupplygrowth
reg investementgrowth interestgrowth
reg investementgrowth CPIgrowth
reg investementgrowth gnpgrowth moneysupplygrowth interestgrowth CPIgrowth
**************************************************************
* Multicollinearity Test


* Nominal variables
quietly reg investment gnp moneysupply interest cpi
estat vif
* Real variables
quietly reg realinvestment realgnp realmoneysupply realinterestrate
estat vif
* Growth variables
quietly reg investementgrowth gnpgrowth moneysupplygrowth interestgrowth CPIgrowth
estat vif
* Additional tests for Multicollinearity
quietly reg investment gnp moneysupply interest cpi
test gnp= moneysupply
test cpi= gnp
test cpi= moneysupply
**************************************************************
* Heteroscedasticity Test
* Goldfeld-Quandt test - Steps
tab gnp //step 1
reg investment gnp moneysupply interest cpi if gnp =133 //step 3
tab realgnp
reg realinvestment realgnp realmoneysupply realinterestrate if realgnp =61.85567
tab gnpgrowth
reg investementgrowth gnpgrowth moneysupplygrowth interestgrowth CPIgrowth if gnpgrowth =31.57895
* White's General Test
quietly reg investment gnp moneysupply interest cpi
estat imtest, white


quietly reg realinvestment realgnp realmoneysupply realinterestrate
estat imtest, white
quietly reg investementgrowth gnpgrowth moneysupplygrowth interestgrowth CPIgrowth
estat imtest, white
* Breusch-Pagan-Godfrey Test
estat hettest
quietly reg investment gnp moneysupply interest cpi
estat hettest
quietly reg realinvestment realgnp realmoneysupply realinterestrate
estat hettest
quietly reg investementgrowth gnpgrowth moneysupplygrowth interestgrowth
estat hettest
**************************************************************
* Autocorrelation Test
* Durbin-Watson test
quietly reg investment gnp moneysupply interest cpi
estat dwatson
quietly reg realinvestment realgnp realmoneysupply realinterestrate
estat dwatson
quietly reg investementgrowth gnpgrowth moneysupplygrowth interestgrowth
estat dwatson
* Durbin's alternative test for serial correlation
* (estat durbinalt; the Breusch-Godfrey test itself is run with estat bgodfrey)
quietly reg investment gnp moneysupply interest cpi
estat durbinalt, small
quietly reg realinvestment realgnp realmoneysupply realinterestrate
estat durbinalt, small
quietly reg investementgrowth gnpgrowth moneysupplygrowth interestgrowth
estat durbinalt, small


**************************************************************
* Simultaneous Equations
* Conducting 2SLS
* First create a variable called devaluation as follows
gen devaluation=0 if time=25
label variable devaluation "devaluation after the break"
label define devaluation 0 "nodevaluation" 1 "devaluationpresent"
* Attach the value label defined above to the variable
label values devaluation devaluation
* We can order devaluation to appear before time using this command
order devaluation, before(time)
* Conducting robust regression for nominal variables
reg investment moneysupply interest devaluation, robust
* Conducting 2SLS for nominal variables
ivregress 2sls investment interest devaluation (moneysupply = cpi gnp)
* Conducting Durbin-Wu-Hausman test for endogeneity for nominal variables
estat endogenous
* To establish the correlation of instruments with Money Supply for nominal variables
ivregress 2sls investment interest devaluation (moneysupply = cpi gnp)
estat firststage
* Obtaining Beta Regression Results for nominal variables
reg investment gnp moneysupply interest cpi devaluation, beta
**************************
* Commands for Real variables
reg realinvestment realmoneysupply realinterestrate devaluation, robust
ivregress 2sls realinvestment realinterestrate devaluation (realmoneysupply = realgnp)
estat endogenous
estat firststage


reg realinvestment realmoneysupply realinterestrate devaluation, beta
**************************
* Commands for Growth variables
ivregress 2sls investementgrowth interestgrowth devaluation (moneysupplygrowth = CPIgrowth gnpgrowth)
reg investementgrowth moneysupplygrowth interestgrowth devaluation
ivregress 2sls investementgrowth interestgrowth devaluation (moneysupplygrowth = CPIgrowth gnpgrowth)
estat endogenous
estat firststage
reg investementgrowth moneysupplygrowth interestgrowth devaluation, beta
**************************************************************
* Elasticity results
* First generate log of variables and label them
gen LNGNP=ln(gnp)
gen LNInvestment=ln(investment)
gen LNIMoneysupply=ln(moneysupply)
gen LNIInterest=ln(interest)
gen LNIcpi=ln(cpi)
label variable LNGNP "Log GNP"
label variable LNInvestment "Log Investment"
label variable LNIMoneysupply "Log Money Supply"
label variable LNIInterest "Log Interest Rate"
label variable LNIcpi "Log CPI"
* Run regression model to obtain elasticity as follows
reg LNInvestment LNGNP LNIMoneysupply LNIInterest LNIcpi devaluation
**************************************************************
* End Do-file
* capture prevents an error when no log file is open
capture log close
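The final regression in the do-file is a log-log specification, so its slope coefficients can be read as elasticities. The sketch below writes out that reading using the estimates in Table A-3. The symbols \beta_0 to \beta_4, \delta and u_t are notation introduced here, and the percentage formula for the dummy variable is the standard approximation for a log-dependent-variable model, not a calculation reported by the author.

```latex
\begin{align*}
\ln(\text{Investment}_t) &= \beta_0 + \beta_1 \ln(\text{GNP}_t) + \beta_2 \ln(\text{Money supply}_t)
   + \beta_3 \ln(\text{Interest}_t) + \beta_4 \ln(\text{CPI}_t) + \delta\,\text{devaluation}_t + u_t \\
\frac{\partial \ln(\text{Investment})}{\partial \ln(\text{GNP})} &= \beta_1, \qquad \hat{\beta}_1 \approx 0.675 \\
\%\Delta\,\text{Investment (devaluation = 1 vs. 0)} &\approx 100\,\bigl(e^{\hat{\delta}} - 1\bigr)
   = 100\,\bigl(e^{-0.0253} - 1\bigr) \approx -2.5\%
\end{align*}
```

Read this way, a 1% increase in GNP is associated with roughly a 0.68% increase in investment (p = 0.005), while the devaluation effect of about -2.5% is not statistically significant (p = 0.632).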
