Spatial dependence and risk aversion in the residential Canadian housing market ∗





Yuan Zhang, Yiguo Sun, Thanasis Stengos June 15, 2015

Abstract

This paper studies the spatial dependence of residential resale housing returns in 10 major Canadian Census Metropolitan Areas (CMAs) from 1992Q4 to 2012Q4 and makes the following methodological contributions. Firstly, in the context of a spatial dynamic panel data model, we use grid search to derive the appropriate spatial weight matrix W among different possible specifications. We select the compound W with the minimum root mean squared error, formed by geographical distances and the number of inter-CMA migrants. We further offer an interpretation of the selected W that is directly linked to the definition of the Arrow-Pratt risk aversion parameter. Secondly, contrary to common practice in the literature, we decompose our parameter estimates into direct and indirect effects and we proceed to derive and plot the impulse response functions of housing returns to external shocks. The empirical results suggest that Canadian residential housing markets exhibit statistically significant spatial dependence and spatial autocorrelation, and that both geographical distances and economic closeness are the dominant channels. Furthermore, the special feature of the Canadian housing market is, as seen from the impulse response functions, that the responses to shocks do not spread widely across regions and that they fade fast over time.

Key Words: Canadian residential housing returns; Impulse response functions; Spatial dependence; Spatial dynamic panel data model; Spatial weight matrix W.

∗ Ph.D. candidate in Economics and Finance, University of Guelph, 50 Stone Road East, Guelph, ON, N1G 2W1, email: [email protected]. This paper was presented at the 2nd Annual Doctoral Workshop in Applied Econometrics, the 41st Annual Conference of the Eastern Economics Association, and the 49th Annual Conference of the Canadian Economics Association. We wish to thank Dr. Martin Burda, Xuefeng Pan, Dr. Christos Ntantamis and the conference participants for their helpful comments on earlier drafts of this paper. We also wish to thank Dr. Paul Anglin and Dr. Min Seong Kim for their valuable comments.
† Professor of Economics, University of Guelph, 50 Stone Road East, Guelph, ON, N1G 2W1. Contact: 519-824-4120 ext. 58948, email: [email protected].
‡ Professor of Economics, University of Guelph, 50 Stone Road East, Guelph, ON, N1G 2W1. Contact: 519-824-4120 ext. 53917, email: [email protected].

1 Introduction

Residential real estate investment accounts for a large share of total household expenditures as well as of national wealth. Statistics Canada's 2011 National Household Survey showed that 69.0% of households in Canada, or 9.2 million of 13.3 million, owned their dwellings, and the share of total housing-related spending in nominal GDP was 19.6% in 2011. Moreover, housing demand affects the demand for mortgage financing and carries vital macroeconomic implications which influence potentially profitable investment opportunities and economic growth. The most recent U.S. financial market crisis beginning in 2008 demonstrates the impact of the housing market on the global economy. The immediate trigger of the crisis was the bursting of the U.S. housing bubble. Thus, changes in house prices have huge macroeconomic implications. Over the past few decades, sustained growth in house prices and their recurrent fluctuations around the growth path seem to be a common phenomenon in many countries. Canadian housing prices have also experienced rapid growth in recent years and there has been much debate on whether this growth constitutes a housing market bubble. The real value of residential houses and land has increased, and home equity has accounted for a rising share of household net worth over the past decade.¹ Hence, it is worthwhile to study the housing price and return behavior.

New insights on links between economic conditions and housing returns will be of value both to household investors and mortgage managers.

Houses have fixed locations; hence, unlike other markets, housing markets are geographically separated. Allen et al. (2006) provide an empirical analysis of Canadian city housing prices, and their conclusion is consistent with the theoretical modeling of Glaeser and Gyourko (2006). Both conclude that housing is local in nature and suggest using city-level house prices rather than a nationwide index. Our interest is therefore to examine whether there is any spatial dependence across these location-separated housing markets and whether price shocks in one region are subsequently transmitted to other regions.

In general, spatial dependence of housing prices could arise from information spillovers and asset demands. Meen (1999) suggests that national housing markets are a series of interlinked local markets, and provides several channels through which spatial interaction can possibly occur. First, households take advantage of differences in regional house prices to move between locations. Population changes and migration flows may be important factors. This refers to house price ripple effects as households relocate and produce changes in the spatial distribution of house prices. The second possible channel is through equity transfer and spatial arbitrage. When housing prices in one region rise, the equity of homeowners in this region also increases. They can then trade or transfer their expanded equity to other markets, say low-priced regions, and force up the prices in those regions. This channel is related not only to migration movements but also to arbitrage investment motives without households physically moving. Third, the observed pattern of house prices can occur even if there are no spatial links between housing markets, if the other determining factors follow similar patterns. Housing prices are determined by local characteristics. Such determinants may be linked inter-regionally and eventually lead to dependencies in housing markets across regions.

¹ Canada Mortgage and Housing Corporation (CMHC), Canadian Housing Observer 2012.

Given these theoretical possibilities of spatial linkages of housing prices, it is reasonable to study the spatial dependence across housing markets. Studying the spatial dependence of returns could also be meaningful to researchers, home buyers, and investors. However, there is surprisingly little research on the Canadian housing markets using modern time-series and spatial methods, compared with the U.S. and some European markets; see Miller and Peng (2006), Gupta and Miller (2012), Karoglou (2013), Tsai (2010) and Guirguis et al. (2007).

The main contributions of this paper can be summarized as follows. Firstly, this paper studies the Canadian residential housing market with a dynamic spatial panel data model, which has not been applied before to the Canadian market. Secondly, it uses a data-driven method relying on grid search to select the spatial weight matrix W. We then provide a novel interpretation of the selected weight matrix W using arguments from risk aversion theory. Thirdly, we offer an explanation and interpretation of the estimated coefficients that is quite different from standard non-spatial models, and we are able to decompose the estimated effects into direct and indirect ones. The current practice in the literature is for applied researchers to interpret spatial model estimates in the same way as those from non-spatial models. Finally, we derive impulse response functions to shocks, something that is also novel in the context of the model that we use.

The remainder of this paper is organized as follows. Section 2 offers additional motivation and the relevant literature for our paper. Section 3 deals with the data and the choice of the econometric model that we employ, while Section 4 presents the estimation results. Section 5 discusses the results from the impulse response functions and, finally, Section 6 offers the conclusions. The appendix collects all the derivations, including those of the impulse response functions.

2 Motivation and Literature review

Most house buyers purchase a home as a primary residence. The question then arises of the importance of housing returns to these primary residence buyers. Before attempting to answer this question, it is instructive to consider some of the main highlights of housing market activities from the National Household Survey (NHS) 2011 conducted by Statistics Canada. Of the approximately 9.2 million (69%) owner households in 2011, about 2.6 million (or 28.1%) had moved into their dwelling between 2006 and 2011, reflecting a large part of residential real estate market activity over this five-year period. Of these 2.6 million households, about 1.5 million (or 58.6%) were local housing market purchasers, and the remaining 1.1 million (or 41.4%) were purchasers from outside the municipality. The proportion of owner households that had moved between 2006 and 2011 differed across CMAs. From these data, we can observe how many Canadians moved within a five-year period, and whether CMAs have any spatial linkage through inter-CMA migrants. When movers need to sell their current house in order to buy a new one, house price appreciation could be the most important issue for them. Any discrepancy between buying and selling house prices could affect the decision to move. A house is likely to be the largest single asset and most valuable property in a person's life, and home equity is the current market value of the home minus the remaining balance of the mortgage. In addition, the amount of property equity that one has built up through both price appreciation and mortgage principal reductions can be one of the most powerful financial tools for reborrowing, e.g. through a home equity line of credit (HELOC) for renovations, debt consolidation, higher education, and so on. These are the main reasons for people to be concerned about housing returns.

In recent years, the housing sector has seen dramatic changes as the number of vacation house buyers and investors has increased. According to the findings of the Canada Mortgage and Housing Corporation (CMHC), the number of homes built in Canada from 2001 to 2011 exceeded the net increase in households by about 225,000. This suggests that the excess of housing completions over household formation is due to growth in the number of second homes, as well as the ongoing replacement of homes lost to, for example, fire, demolition or conversion to other uses. In addition, the banking system in Canada is widely considered one of the safest banking systems in the world, and the majority of provinces have no restrictions on foreign ownership of real estate in Canada.² These features attract not only domestic investors but also foreign non-resident investors. Thus, housing returns could also be valuable information for these people.

In some studies of housing markets, authors use aggregate house price indexes. Tsai, Chen and Ma (2010) use U.K. nationwide house price data to analyze volatility properties of U.K. house prices.

For the Canadian housing market, Hossain and Latif (2009) use average housing prices to identify the determinants of housing price volatility. However, the aggregate index may not represent any particular housing location. As in Allen et al. (2006), the lack of a long-run relationship for Canadian house prices suggests the need to seek city-specific house-price determinants to study the Canadian housing market. This finding is consistent with the theoretical modeling approach of Glaeser and Gyourko (2006), who argue that less than 8% of the variation in price levels and only about 25% of the variation of price changes can be accounted for by national year-specific fixed effects. Therefore, housing is local in nature.

² British Columbia, Ontario, Quebec, Nova Scotia, Newfoundland, and New Brunswick have no restrictions on foreign ownership; some provinces do limit the amount of property or land that a non-resident can purchase, e.g. Prince Edward Island and Manitoba.

In the real estate literature, a number of studies have attempted to examine housing prices, housing returns and/or housing volatility behavior. Time series approaches are commonly used, such as vector autoregressive (VAR), ARCH, and multivariate GARCH models. Dolde and Tirtiroglu (2002) use a non-parametric time series model to examine thirty-six volatility events in the U.S., while Miller and Peng (2006) use GARCH models and a panel VAR model to analyze housing price volatility in 277 U.S. metropolitan statistical areas (MSAs). Miles (2008) estimates GARCH models for fifty U.S. states to study house price appreciation and volatility. His results suggest that the U.S. housing market exhibits volatility clustering. Gupta and Miller (2012) use a VAR model to examine the time-series relationship between house prices in eight Southern California MSAs and use a vector-error-correction (VEC) model to calculate out-of-sample forecasts in each MSA. In the studies mentioned above, the authors either estimate regions separately or cluster the regions into several subgroups.

However, none of these studies considers the one-to-one linkages among individual regions, something that we explicitly examine in the present paper for the selected 10 CMAs in the Canadian housing market.

We use a spatial dynamic panel data (SDPD) model to examine Canadian residential housing returns. Compared with standard time series models, this model can capture the spatial feature and one-to-one linkages among spatial units, while compared with simple spatial models it contains a dynamic temporal feature. Brady (2011) uses an SDPD model to study 31 California counties and measures the diffusion of housing prices. Zhu et al. (2013) use an SDPD model to investigate the spatial linkage in return and volatility among U.S. regional housing markets. However, in both papers, the authors choose a subjectively pre-specified spatial weight matrix W without offering much of a convincing discussion for their choice. Nor do the authors of these papers interpret their estimates by taking into consideration the complicated spatial structure, as in LeSage and Pace (2009). In the present paper we interpret our estimates accounting for the spatial model structure. We discuss both direct and indirect impacts associated with a change of a particular explanatory variable in a single region, and we examine how this change will affect the region itself (the direct impact) and potentially affect all other regions (the indirect impacts).

The spatial weight matrix is the key element in spatial regression models.

Anselin (1988) defines it as the formal expression of spatial dependence between spatial units. It is usually denoted as the W matrix and it is commonly constructed using geographical distances; see Moran (1950), Geary (1954) and Cliff and Ord (1969), to name only a few among the earlier specifications of spatial weights. More recently, Getis and Aldstadt (2004) summarize various previous studies on creating a spatial weight matrix, and they propose a Local Statistics Model (LSM) based on the concept of local statistic. Geographical W can be formed as the spatial contiguity with binary variables, see Cliff and Ord (1981); Zhang and Murayama (2003) using distances with decaying weights defined by certain functions, k-nearest neighbors; Zhang and Murayama (2000) with weights based on lengths of shared borders, and so on. Since W is defined beforehand, it is widely seen as the most controversial part in the spatial structure. In many studies, researchers subjectively choose a pre-specified W, while other researchers use an objective method to select W, which nevertheless lies outside the contextual framework of their model; see for example Brady (2011) and Zhu et al. (2013). The formulation and calculation of the spatial weight matrix and the criteria for selecting a weighting scheme among various possible alternatives need to be carefully analyzed. Most applied researchers do not justify their choice of W and they also fail to interpret their results conditional on that choice. In this paper we provide a data-driven approach based on grid search to decide on an appropriate W, and we offer some guidance on the objective selection of W within the context of the given model. We also provide a risk-aversion based interpretation of W, which is new in the literature.

3 Data and methodology

3.1 Data

The data used in this paper are quarterly data from 1992Q2 to 2012Q4, and the summary statistics of all data can be found in Table 1. Housing price data are real quarterly residential resale housing prices provided by the University of British Columbia (UBC) Centre for Urban Economics and Real Estate, based on the Royal LePage (RLP) Survey of Canadian House Prices data.³ Real house prices are calculated in 2002Q2 dollars. The RLP city housing indices are based on a standardized house type (average of bungalow and two-storey executive) in 250 well-defined neighborhoods across Canada.⁴

There are 10 major census metropolitan areas (CMAs) selected from 7 provinces: Halifax, Nova Scotia; Montreal, Quebec; Toronto, Ontario; Ottawa, Ontario; Winnipeg, Manitoba; Regina, Saskatchewan; Calgary, Alberta; Edmonton, Alberta; Vancouver, British Columbia; and Victoria, British Columbia.⁵ These include the largest urban centers in Canada, geographically spanning the whole of the country. According to Statistics Canada, a CMA is a large urban area together with adjacent rural areas that have a high degree of social and economic integration with urban cores. A CMA has an urban core population of at least 100,000. However, finding high-quality, reliable and comparable Canadian data at the CMA level is problematic. Bearing this limitation in mind, we begin our analysis by discussing city-level housing price determinants.

³ Another available housing price measure is the Multiple Listing Service (MLS®) index compiled by the Canadian Real Estate Association (CREA). However, the MLS® index is not quality adjusted. Holios and Pesando (1992) discussed some shortcomings of the MLS® index: the mix of units sold varies over time, so that the 'quality' of dwelling units sold is not held constant; the short-run signals regarding price movements often diverge; and the MLS® index suggests a more rapid rate of price escalation.
⁴ Detailed definitions of bungalow and two-storey executive can be found in the Glossary of Housing Types of the Royal LePage House Price Survey.

The value of residential building permits (BP) is a leading indicator for the construction industry since the issuance of a building permit is the first step in the construction process. We use the seasonally adjusted value of residential BP to capture the costs associated with construction. The quarterly value of residential building permits (BP) is obtained from Statistics Canada. The construction union wage index (UWI) is used as a proxy for the labour cost of either building a new house or improving an existing one, and the UWI data are also obtained from Statistics Canada.

From the supply side, we include new residential dwelling starts and completions. These two determinants are leading factors in housing construction activity. Builders may adjust construction activity according to dwelling starts and completions in order to manage inventory levels or the existing housing stock. For home-builders and investors evaluating the real estate market, housing starts can be looked at in conjunction with mortgage rates, existing house sales and a housing price index. According to Statistics Canada, housing starts are considered a key gauge of economic conditions. Housing completions directly capture the number of homes added to the existing housing stock. The total units of quarterly housing starts are obtained from the Canada Mortgage and Housing Corporation (CMHC) and are seasonally adjusted at annual rates. According to CMHC, completions are a better measure for assessing housing inventory. Housing inventory levels can often be assessed by aggregating the number of completed and unoccupied units. The total units of newly completed dwellings are also obtained from CMHC and include all newly completed and unoccupied units.

The new housing price index (NHPI) measures changes over time in the selling prices of new residential houses. Although we study resale residential housing prices, we expect a positive relationship between new and resale home prices. If a new home price is high, the price is most likely to be high in the future when the owner resells it. New homes are potential previously owned homes that will appear in the resale market in the future. In addition, the NHPI tracks events and trends in the housing construction sector.

⁵ See Appendix A for the detailed list of the 10 selected CMAs and their geographical locations. A CMA is a larger area than a city, as it also includes municipalities adjacent to the city urban core. The geographic definitions used by Royal LePage differ slightly from those used by Statistics Canada. However, it is the closest data available.

Statistics Canada also uses the NHPI series to calculate some elements of the Consumer Price Index. The NHPI data are obtained from Statistics Canada, with 2007 as the base year.

When a buyer buys a house other than for primary residence, it is usually for collecting rents. If the rental price of a residential unit is high and the rental vacancy rate is low in that area, then it is valuable to invest, something that may drive property prices up. In addition, if rents are too high, some people may substitute away from renting and consider buying a home. In this sense, residential home prices will be affected by average rents. The rent index is obtained from Statistics Canada; it is the sub-index of the Consumer Price Index (CPI) called "rented accommodation". The rent index includes all rental expenses, such as rents, tenants' insurance premiums, maintenance, repairs and other related expenses.

Some other common demographic variables are the population growth rate, the unemployment rate, and household income. The population growth rate is considered the most important factor affecting housing demand. The unemployment rate is the labor market variable, since people are unlikely to buy a house when they lose their jobs. Hence, housing prices are expected to be low when unemployment is high. Population and unemployment rates for CMAs are from Statistics Canada Labour Force Survey (LFS) estimates. Variables that exhibit seasonality are seasonally adjusted. Another demographic variable is household income. In high-income CMAs, housing prices also tend to be high, and as such household income is usually considered a possible determinant of housing prices. Statistics Canada only provides annual metro-level household income. To obtain quarterly metro-level household income, we use quarterly national data and assume that the ratio of CMA-specific data to the national data remains constant within a given year.⁶
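The ratio-based interpolation just described can be sketched in a few lines. This is only an illustration of the stated assumption (a constant CMA-to-national ratio within each year); the function name and all numbers are hypothetical and are not taken from the paper's data.

```python
def quarterly_cma_income(annual_cma, annual_national, quarterly_national):
    """Scale quarterly national income by the annual CMA-to-national ratio.

    The ratio is assumed constant within the year, as stated in the text.
    """
    ratio = annual_cma / annual_national
    return [ratio * q for q in quarterly_national]


# Hypothetical figures for one CMA in one year (illustrative only).
quarters = quarterly_cma_income(
    annual_cma=55_000.0,
    annual_national=50_000.0,
    quarterly_national=[12_000.0, 12_400.0, 12_600.0, 13_000.0],
)
```

By construction, the within-year movement of the CMA series mirrors the national series exactly; only its level differs across CMAs and years.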

The variables used in the spatial weight matrix are as follows. The geographical distances between metropolitan areas are obtained from Google Maps. Total employment levels and data for movers are from the Canadian population Census for the years 1996, 2001, 2006 and 2011. The employment levels used in the weight matrix are averages of total employment in 1996, 2001, 2006 and 2011 for each CMA. Inter-CMA migrants are from 2006 Census data; they are the total number of people who moved between our selected CMAs in a five-year period from 2006 to 2011. Movers are people who lived at a different address 5 years before the census year. Movers include migrants (inter-provincial and intra-provincial migrants, and external migrants or immigrants) and non-migrants who lived at a different address. Real GDP used in the weight matrix is the average value of real GDP from the Conference Board of Canada for the years 2001 to 2012.

⁶ More specifically, we divide annual CMA-level household income by annual national-level household income, and we assume this ratio remains constant in a given year. We then multiply this ratio by the quarterly national-level income to get the quarterly CMA-level household income.

Table 1: Summary Statistics

Variables | Mean | Std. Dev. | Min | Max

Dependent variable
Housing returns (Y_t) (in %) | 0.73 | 2.86 | -9.01 | 22.90

Explanatory variables
New dwelling starts (Starts) (in 1,000 units) | 9.83 | 10.54 | 0.30 | 51.41
New dwelling completed (Completed) | 2,755 | 3,035.20 | 54 | 15,905
Residential Building Permits (BP) (values, in $1,000) | 417,017.23 | 497,895.70 | 6,775 | 2,872,221
Construction Union Wage Index (UWI) | 93.16 | 13.88 | 68.30 | 134.10
New housing price index (NHPI) | 79.84 | 23.75 | 33.03 | 148.23
Rent index (Rent) | 100.67 | 10.33 | 81.10 | 135.30

Demand side shifters
Population growth (pop) (in %) | 0.40 | 0.22 | -0.36 | 1.27
Unemployment rate (unem) (in %) | 6.90 | 2.10 | 2.80 | 14.20
Household income (inc) (in $) | 13,509.79 | 2,145.11 | 9,390.27 | 20,591.90

Variables used in spatial weight matrix
Geographical distances (in 100 km) | 11.30 | 8.28 | 0.58 | 27.84
Average total employment level (emp) (in 1,000 persons) | 770.11 | 769.96 | 105.98 | 2,476.85
Inter-CMA migrants, now and five years ago | 4,230.83 | 11,236.10 | 90 | 104,445
Average real GDP (in $1,000 millions) | 65.71 | 67.07 | 8.42 | 225.12
Movers (in 1,000 persons) | 633.77 | 627.59 | 78.22 | 2,000.48

Sources: Royal LePage, CANSIM, CMHC, and Conference Board of Canada. The data span 1992Q4 to 2012Q4. Geographical distances are obtained from Google Maps. Detailed data sources and descriptions can be found in Appendix B. The ten selected CMAs are: Halifax, Nova Scotia; Montreal, Quebec; Toronto, Ontario; Ottawa, Ontario; Winnipeg, Manitoba; Regina, Saskatchewan; Calgary, Alberta; Edmonton, Alberta; Vancouver, British Columbia; Victoria, British Columbia.

3.2 Model

To capture and detect possible time dynamics and spatial interaction in housing return data, we apply a spatial dynamic fixed-effects panel data model

$$Y_t = \gamma Y_{t-1} + \lambda W_N Y_t + X_{t-1}\beta + \mu + v_t \tau_N + u_t, \qquad (1)$$

$$u_t = \rho W_N u_t + \varepsilon_t, \qquad t = 1, \ldots, T, \qquad (2)$$

where the spatial dynamic panel data (SDPD) model (1) is characterized by the inclusion of the time-lag term $Y_{t-1}$ as an explanatory variable, and the spatial lag term $W Y_t$ captures spatial dependence among housing returns from different CMAs. The disturbance in spatial autoregressive (SAR) form in equation (2) indicates that the error term is assumed to be spatially correlated. Section 4 provides empirical evidence to support the use of model (1)-(2).

In model (1)-(2), $i = 1, \ldots, N$ and $t = 1, \ldots, T$; $Y_t = (y_{1,t}, y_{2,t}, \ldots, y_{N,t})'$ is an $N \times 1$ vector of dependent variables, and $y_{i,t}$ is the log-difference of housing price values for each spatial unit $i$ in period $t$. The log-return is calculated as $y_{i,t} = \ln\left(p_{i,t} / p_{i,t-1}\right)$, where $p_{i,t}$ represents the housing price at period $t$ and $p_{i,t-1}$ is the housing price in the previous period. $Y_{t-1}$ is an $N \times 1$ vector representing the one-period-lagged values of the dependent variable, and $\gamma$ is the corresponding coefficient, which captures the pure dynamic effects of returns. $W_N$ is the normalized $N \times N$ non-stochastic spatial weight matrix that generates the cross-sectional correlations, and $W_N Y_t$ captures the correlation with housing prices in the neighborhood regions, usually known as the spatial lag vector. The corresponding coefficient $\lambda$ is called the spatial multiplier, and it captures the spatial effects of the returns' interdependence.
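As a small illustration of the log-return definition $y_{i,t} = \ln(p_{i,t}/p_{i,t-1})$, the sketch below computes returns for a price panel stored as a (periods × regions) array. The array layout, the function name and the prices are assumptions made for this example only; Table 1 reports returns in percent, which would correspond to multiplying the output by 100.

```python
import numpy as np


def log_returns(prices):
    """Return y[t, i] = ln(p[t, i] / p[t-1, i]) for a (T+1, N) price array."""
    p = np.asarray(prices, dtype=float)
    return np.log(p[1:] / p[:-1])


# Hypothetical prices: 3 periods (rows) for 2 regions (columns).
prices = np.array([
    [100.0, 200.0],
    [110.0, 190.0],
    [121.0, 209.0],
])
Y = log_returns(prices)  # shape (2, 2): one fewer row than the price panel
```

Each row of `Y` plays the role of the vector $Y_t$ in model (1), and taking differences costs one observation at the start of the sample.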

$X_{t-1}$ is an $N \times k$ matrix of explanatory variables in the previous period, and $\beta$ is the $k \times 1$ vector of corresponding coefficients to be estimated. We use lagged explanatory variables to avoid endogeneity problems. Furthermore, due to the low frequency of housing market transactions, participants rely mainly on past information. Here, $X_{t-1}$ includes the determinants of housing demand and supply, which are new dwelling starts (Starts), new dwellings completed (Completed), total value of building permits (BP), construction costs (Cost) and the new housing price index (NHPI), as well as several demand side shifters, including the labor market variables household income (inc) and unemployment rate (unem), and the demographic variable population (pop). $\mu$ is the $N \times 1$ vector of time-invariant individual market fixed effects, which represent the individual market characteristics and some geographic attributes, while $v_t$ is the time-specific fixed effect that captures nationwide factors, including national inflation and mortgage rates. Finally, $\tau_N$ is the $N \times 1$ vector of 1's.

Equation (2) is a spatial autoregressive (SAR) form of the error term and it captures the cross-sectionally dependent unexpected shocks. The spatial weight matrix $W_N$ is also used to capture the spatial correlation in disturbances, and $\rho$ is the spatial autocorrelation coefficient.⁷ $\varepsilon_t$ is an $N \times 1$ vector of i.i.d. error terms with zero mean and constant variance matrix $\sigma^2 I_N$, where $I_N$ is an $N \times N$ identity matrix.
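To make the structure of model (1)-(2) concrete, the following sketch simulates it through its reduced form, $Y_t = (I_N - \lambda W_N)^{-1}\left(\gamma Y_{t-1} + X_{t-1}\beta + \mu + v_t\tau_N + u_t\right)$ with $u_t = (I_N - \rho W_N)^{-1}\varepsilon_t$, which follows from solving (1) and (2) for $Y_t$ and $u_t$. It is an illustration only, not the estimation procedure: the function name, the Gaussian errors, and every parameter value and data array are hypothetical.

```python
import numpy as np


def simulate_sdpd(W, X_lag, beta, gamma, lam, rho, mu, v, sigma, y0, rng):
    """Simulate Y_1..Y_T from equations (1)-(2) via the reduced form.

    W: (N, N) normalized weights; X_lag: (T, N, k) array holding X_{t-1};
    mu: (N,) market fixed effects; v: (T,) time effects; y0: (N,) initial Y.
    """
    T, N, _ = X_lag.shape
    identity = np.eye(N)
    a_inv = np.linalg.inv(identity - lam * W)  # (I - lambda W)^{-1}
    b_inv = np.linalg.inv(identity - rho * W)  # (I - rho W)^{-1}
    Y = np.empty((T, N))
    y_prev = np.asarray(y0, dtype=float)
    for t in range(T):
        eps = rng.normal(0.0, sigma, size=N)   # assumed Gaussian i.i.d. errors
        u = b_inv @ eps                        # equation (2) solved for u_t
        rhs = gamma * y_prev + X_lag[t] @ beta + mu + v[t] * np.ones(N) + u
        Y[t] = a_inv @ rhs                     # equation (1) solved for Y_t
        y_prev = Y[t]
    return Y


# Hypothetical inputs for N = 3 regions, T = 8 periods, k = 2 regressors.
rng = np.random.default_rng(0)
N, T, k = 3, 8, 2
W0 = np.array([[0.0, 1.0, 0.5], [1.0, 0.0, 0.25], [0.5, 0.25, 0.0]])
W = W0 / np.max(np.abs(np.linalg.eigvals(W0)))  # largest |eigenvalue| equals 1
X_lag = rng.normal(size=(T, N, k))
beta = np.array([0.5, -0.2])
mu = np.array([0.1, 0.0, -0.1])
v = rng.normal(scale=0.1, size=T)
Y = simulate_sdpd(W, X_lag, beta, gamma=0.3, lam=0.2, rho=0.1,
                  mu=mu, v=v, sigma=1.0, y0=np.zeros(N), rng=rng)
```

Because $W$ is normalized so that its largest eigenvalue in absolute value is one, both inverses exist whenever $|\lambda| < 1$ and $|\rho| < 1$, which is why the sketch does not check for singularity.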

This model can be estimated by quasi-maximum likelihood (QMLE) with fixed effects; see Yu et al. (2008). They developed a QMLE estimator with fixed effects when $T$ is large relative to $n$, and the resulting estimator is $\sqrt{nT}$-consistent and asymptotically normally distributed. Below, for notational convenience, we will drop the subscript of $W_N$ when it causes no confusion.

⁷ In most applied spatial econometrics, the same spatial weight matrix is used in both the spatial lag of the dependent variable and the disturbances. We tried to estimate our model with different weight matrices in the spatial lags and spatial errors, yet the results did not change much. This indicates that the spatial dependence of housing returns and the spatial autocorrelation in disturbances come from the same channel.

3.3 Spatial weight matrix ($W_N$)

The estimates of the spatial multiplier $\lambda$ and the spatial autocorrelation coefficient $\rho$ are very sensitive to the choice of spatial weights; thus carefully choosing an appropriate spatial weight matrix is one of the most important issues in the estimation of parametric spatial models. However, even though several choices are commonly used in the literature, there is little agreement regarding the best form of $W$. Typically, the spatial weights are calculated from geographic distances and economic distances; see Anselin (1988). To construct a distance between units, some authors use the length of shared borders and combined distance-boundary measures; see for example Cliff and Ord (1981). Below, we list several popular methods of constructing spatial weights using distances:



• spatial contiguity neighbors (Cliff and Ord 1981, Zhang and Murayama 2003);

• k-nearest neighbors (e.g. Zhang and Murayama 2000);

• inverse distances raised to some power;

• negative exponential weighting.

Some researchers use a non-parametric weight matrix, treating the distance as a regressor with kernel weights; examples are the bandwidth distance decay method (Fotheringham, Charlton, and Brunsdon 1996) and the Gaussian distance decay method (LeSage 2003). Spatial weight matrices formed from distances are sometimes referred to as local statistics, and this form of spatial weights is more often used in geography studies. The contiguity of neighbors method treats all neighbors the same and assigns the same weight to all of them, and the k-nearest neighbors method only considers first neighbors, second neighbors, etc. However, it does not take into consideration who will have a more significant effect among the group of first neighbors. The 10 selected CMAs in our study do not have joint borders, so all weights associated with borders are not applicable. Moreover, in an economic study, we also need to consider the economic meaning of any such choice of weights. The purely nonparametric weights, such as the bandwidth distance decay and Gaussian distance decay methods, do not have any economic meaning. This leaves us with the inverse distance weighting (IDW) and negative exponential distance weighting methods.

The inverse distance weighting (IDW) scheme is formed as follows. Each spatial weight $w_{i,j}$ is the $(i,j)$th element of the weight matrix $W$, and it typically reflects the spatial influence of region $i$ on region $j$. We set $w_{i,i} = 0$ for all $i = 1, \ldots, N$ to exclude self-influence. Denote by $\xi_i(W)$, $i = 1, \ldots, N$, the eigenvalues of $W$, and by $\pi(W) = \max_{1 \le i \le N} |\xi_i(W)|$ the largest eigenvalue of $W$ in absolute value. In addition, below we use $W^0$ to represent the spatial weight matrix before the normalization procedure, and the $(i,j)$th element of $W^0$ is denoted by $w^0_{i,j}$.⁸ We normalize the inverse distance matrix by its largest positive eigenvalue $\pi(W)$ as in Kelejian and Prucha (1998). Specifically, we denote the actual distance between $i$ and $j$ by $d_{i,j}$; then the geographical distance weight matrix $W^{G,0}$ is formed as follows:

$$W^{G,0} = \left(w^{G,0}_{i,j}\right), \quad \text{and} \quad w^{G,0}_{i,j} = d_{i,j}^{-\alpha}, \quad i \neq j, \qquad (3)$$

where $\alpha$ is the power parameter to be determined, and the superscript $G$ indicates the geographical weight. In empirical studies, $\alpha$ is usually set equal to 1 or 2 subjectively. The inverse relation shows that the closer two observations are, the more weight they receive in the spatial weight matrix. In practice, $\alpha = 2$ is usually chosen based on the gravity model of Haggett et al. (1977). This formation of geographical weights with quadratic power is the most popular of all the distance-based $W$'s according to Getis and Aldstadt's (2004) study of constructing the $W$ matrix using local statistics.

The geographical distance weight matrix formed by negative exponential weighting function is

  G,0 W G,0 = wi,j , The parameter

α

G,0 and wi,j = exp(−αdi,j ), i 6= j

(4)

in (4) also needs to be determined.
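As an illustration of equations (3) and (4), the following minimal sketch builds both raw geographical weight matrices and scales each by its largest eigenvalue in absolute value. The function names and the three-region distance matrix are hypothetical and chosen only for the example; they are not taken from the paper's data or code.

```python
import numpy as np


def spectral_normalize(w_raw):
    """Scale a raw weight matrix by its largest eigenvalue in absolute value."""
    radius = np.max(np.abs(np.linalg.eigvals(w_raw)))
    return w_raw / radius


def inverse_distance_weights(dist, alpha):
    """Raw weights of equation (3): d_ij^(-alpha) for i != j, zero on the diagonal."""
    dist = np.asarray(dist, dtype=float)
    off_diag = ~np.eye(dist.shape[0], dtype=bool)
    w = np.zeros_like(dist)
    w[off_diag] = dist[off_diag] ** (-alpha)
    return w


def negative_exponential_weights(dist, alpha):
    """Raw weights of equation (4): exp(-alpha * d_ij) for i != j, zero on the diagonal."""
    dist = np.asarray(dist, dtype=float)
    off_diag = ~np.eye(dist.shape[0], dtype=bool)
    w = np.zeros_like(dist)
    w[off_diag] = np.exp(-alpha * dist[off_diag])
    return w


# Hypothetical pairwise distances (in 1000 km) between three regions.
dist = np.array([[0.0, 1.0, 4.0],
                 [1.0, 0.0, 2.0],
                 [4.0, 2.0, 0.0]])

w_idw = spectral_normalize(inverse_distance_weights(dist, alpha=2.0))
w_exp = spectral_normalize(negative_exponential_weights(dist, alpha=0.5))
```

In both schemes the closer pair of regions receives the larger weight, and after scaling the largest eigenvalue of each matrix is one in absolute value.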

The spatial weight matrix can also be defined based on economic distances. The economic variables or indicators that are commonly used to construct the weight are personal income, GDP, or total employment level. In Case et al. (1993), the weight for city $i$ to city $j$ is calculated by the inverse weighting function (of power 1, or $\alpha = 1$) with row standardization. We use the inverse weighting function to form each economic weight, where the power $\alpha$ will be determined in the context of the model.$^{9}$ We construct the economic weight using the inverse weighting function to capture the economic similarity between regions.

8 The superscript 0 indicates the raw weight matrix before the normalization, and we use this notation throughout this section.

9 We also tried to construct the economic weight matrix using the negative exponential function. However, in those cases we ran into computational problems, and as such we only use the inverse distance weighting function to construct the economic weight.

The first economic weight is formed as follows:

$$W^{emp,0} = \left(w^{emp,0}_{i,j}\right), \quad \text{and} \quad w^{emp,0}_{i,j} = |emp_i - emp_j|^{-\alpha}, \quad i \neq j \qquad (5)$$

The term $|emp_i - emp_j|$ calculates the difference in total employment levels between two regions. The inverse function ensures that regions with similar employment levels take more weight. Thus, this weight matrix reflects the economic closeness in total employment levels between two regions.

The second economic weight $W^{mig}$ is formed to capture the direct economic connection between two CMAs. The variable $mig_{i,j}$ is the inverse of the number of inter-CMA migrants who moved between regions $i$ and $j$ in the five-year period from 2006 to 2011, that is,

$$mig_{i,j} = \frac{1}{\text{number of migrants between CMA } i \text{ and } j}.$$

The larger the number of inter-CMA migrants, the smaller $mig_{i,j}$ is. After we apply the inverse function to $mig_{i,j}$, a smaller $mig_{i,j}$ puts more weight on $w^{mig,0}_{i,j}$. In other words, the more people move between regions $i$ and $j$, the more connected the two regions are. The direct economic connection before normalization is formed as follows:

$$W^{mig,0} = \left(w^{mig,0}_{i,j}\right), \quad \text{and} \quad w^{mig,0}_{i,j} = mig_{i,j}^{-\alpha}, \quad i \neq j \qquad (6)$$

The third economic weight matrix, which uses the number of movers in CMA $i$ and $j$, is formed as follows:

$$W^{mov,0} = \left(w^{mov,0}_{i,j}\right), \quad \text{and} \quad w^{mov,0}_{i,j} = |mov_i - mov_j|^{-\alpha}, \quad i \neq j \qquad (7)$$

This weight matrix uses the total number of movers in each CMA. Movers are people who lived at a different address five years before the census year, which includes migrants (inter-provincial migrants, intra-provincial migrants, and external migrants or immigrants) and non-migrants who lived at a different address. $W^{mov}$ is also a measure of direct connection between CMAs.$^{10}$

The fourth economic weight matrix, which uses GDP values, is formed as follows:

$$W^{GDP,0} = \left(w^{GDP,0}_{i,j}\right), \quad \text{and} \quad w^{GDP,0}_{i,j} = |GDP_i - GDP_j|^{-\alpha}, \quad i \neq j \qquad (8)$$

This weight matrix is used to capture the economic similarity in real GDP levels between two regions.

10 The variable $mov_i$ includes all people who lived at a different address 5 years before the census year, regardless of where they moved. By contrast, the variable $mig_{i,j}$ in (6) only counts the number of people who moved between CMA $i$ and $j$ specifically, and it does not include the population growth from immigrants and external migrants. These two variables are different, and we think they are both important, so we use both to see which one better captures the economic connection between CMA $i$ and $j$ in order to better explain the spatial dependence of housing returns.
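The economic weights in (5)-(8) can be sketched in a few lines. The sketch below assumes that no two regions share exactly the same level (otherwise the inverse of a zero gap is undefined) and that every pair of regions has a positive migrant count; the function names and the numbers are hypothetical.

```python
import numpy as np


def level_gap_weights(levels, alpha):
    """Raw weights as in (5), (7), (8): |x_i - x_j|^(-alpha) off the diagonal."""
    levels = np.asarray(levels, dtype=float)
    gap = np.abs(levels[:, None] - levels[None, :])
    off_diag = ~np.eye(levels.size, dtype=bool)
    w = np.zeros_like(gap)
    w[off_diag] = gap[off_diag] ** (-alpha)  # assumes all levels are distinct
    return w


def migration_weights(migrant_counts, alpha):
    """Raw weights as in (6): mig_ij^(-alpha) with mig_ij = 1 / migrant count."""
    counts = np.asarray(migrant_counts, dtype=float)
    off_diag = ~np.eye(counts.shape[0], dtype=bool)
    mig = np.zeros_like(counts)
    mig[off_diag] = 1.0 / counts[off_diag]   # assumes positive counts for all pairs
    w = np.zeros_like(counts)
    w[off_diag] = mig[off_diag] ** (-alpha)
    return w


# Hypothetical employment levels and inter-region migrant counts.
employment = [10.0, 12.0, 20.0]
migrants = np.array([[0, 100, 25],
                     [100, 0, 4],
                     [25, 4, 0]])

w_emp = level_gap_weights(employment, alpha=1.0)
w_mig = migration_weights(migrants, alpha=0.5)
```

Regions with similar employment levels, or with many migrants moving between them, receive the larger raw weights, which is the behaviour described in the text.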

The general form of the spatial weight matrix is then defined as $W_N$, which is the matrix $W_N^0$ given by (3)-(8) after the normalization procedure with all diagonal elements set to zero:

$$W_N = \begin{pmatrix} 0 & w_{12} & w_{13} & \cdots & w_{1N} \\ w_{21} & 0 & w_{23} & \cdots & w_{2N} \\ w_{31} & w_{32} & 0 & \cdots & w_{3N} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ w_{N1} & w_{N2} & w_{N3} & \cdots & 0 \end{pmatrix}$$

The first row captures the spatial connection between the first metropolitan area and all other metropolitan areas. In other words, it reflects how other metropolitan areas affect the housing return in the first metropolitan area. For example, the element $w_{12}$ indicates how the second region will affect the housing return in the first region, while $w_{21}$ represents the effect of the first region on the second region. Note that the spatial weight matrix $W_N$ is not symmetric with standardization, thus the magnitude with which region $i$ influences region $j$ may not be the same as the effect of region $j$ on region $i$. We also notice that all off-diagonal elements in $W_N$ are positive.

Case et al. (1993) proposed a method to construct a compound spatial weight matrix. We can use a combination of the geographical weight and an economic weight

$$W = aW^{G} + (1 - a)W^{E}, \quad a \in [0, 1] \qquad (9)$$

where $W^{G}$ is the geographical weight matrix constructed by either (3), the inverse distance weighting function, or (4), the negative exponential weighting function, and $W^{E}$ is the economic weight that is chosen from one of $W^{emp}$, $W^{mig}$, $W^{mov}$, and $W^{GDP}$ using equations (5) to (8). We propose to simultaneously estimate the parameters appearing in the model using the compound weight matrix. We apply grid search to find the most appropriate $W_N$. We estimate the model with the compound weight matrix formed by both geographical distance and economic similarities,

$$W_N = g(d_{i,j}, z_{i,j}, \theta) \qquad (10)$$

where $\theta = (\alpha_G, \alpha_E, a)$. Here $\alpha_G$ is the power parameter in the geographical weight, and $\alpha_E$ is the power parameter in the economic weight. The geographical distance $d_{i,j}$ is calculated by either (3) or (4), and the economic similarity $z_{i,j}$ is calculated by one of (5), (6), (7) and (8). We choose the power parameter $\alpha_G$ appearing in the geographical weight matrix, the power parameter $\alpha_E$ appearing in the economic weight matrix and the combination parameter $a$ simultaneously. For each combination of $\theta$, we estimate model (1)-(2) by the QMLE, and then choose the combination of $\theta$ that minimizes the RMSE as the estimate of $\theta$. By employing a grid search for the selection of the spatial weight matrix $W$, we offer a data driven approach that is model specific, contrary to the common practice in the literature of using a pre-assigned spatial weight matrix.
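The grid search over $\theta = (\alpha_G, \alpha_E, a)$ can be sketched as a loop over candidate values. In the sketch below, `fit_rmse` is a hypothetical callable standing in for the QMLE step of model (1)-(2), which is not reproduced here; the toy objective simply measures the distance to a known target matrix so that the example runs on its own. All names, data and grid values are assumptions made for illustration.

```python
import itertools
import numpy as np


def spectral_normalize(w_raw):
    """Scale a raw weight matrix by its largest eigenvalue in absolute value."""
    return w_raw / np.max(np.abs(np.linalg.eigvals(w_raw)))


def power_weights(gap, alpha):
    """Inverse power weights gap_ij^(-alpha) off the diagonal, zero on it."""
    gap = np.asarray(gap, dtype=float)
    off_diag = ~np.eye(gap.shape[0], dtype=bool)
    w = np.zeros_like(gap)
    w[off_diag] = gap[off_diag] ** (-alpha)
    return w


def compound_weight(dist, econ_gap, alpha_g, alpha_e, a):
    """Compound matrix a * W_G + (1 - a) * W_E as in equation (9)."""
    w_g = spectral_normalize(power_weights(dist, alpha_g))
    w_e = spectral_normalize(power_weights(econ_gap, alpha_e))
    return a * w_g + (1.0 - a) * w_e


def grid_search(dist, econ_gap, fit_rmse, alpha_grid, a_grid):
    """Return (rmse, alpha_g, alpha_e, a) for the grid point with the lowest RMSE."""
    best = None
    for alpha_g, alpha_e, a in itertools.product(alpha_grid, alpha_grid, a_grid):
        rmse = fit_rmse(compound_weight(dist, econ_gap, alpha_g, alpha_e, a))
        if best is None or rmse < best[0]:
            best = (rmse, alpha_g, alpha_e, a)
    return best


# Hypothetical distances and GDP levels for four regions.
dist = np.array([[0, 1, 3, 5],
                 [1, 0, 2, 4],
                 [3, 2, 0, 6],
                 [5, 4, 6, 0]], dtype=float)
gdp = np.array([50.0, 80.0, 30.0, 120.0])
econ_gap = np.abs(gdp[:, None] - gdp[None, :])

# Toy stand-in for the model fit: distance to a known target matrix.
target = compound_weight(dist, econ_gap, 0.55, 0.35, 0.1)


def toy_rmse(w):
    return float(np.sqrt(np.mean((w - target) ** 2)))


best = grid_search(dist, econ_gap, toy_rmse,
                   alpha_grid=[0.1, 0.35, 0.55, 1.0, 2.0],
                   a_grid=[0.0, 0.1, 0.5, 0.9, 1.0])
```

In an actual application `toy_rmse` would be replaced by a routine that estimates the model for the given weight matrix and returns its RMSE, and the grids would be made finer.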

4 Empirical Results

In Table 2, we report several tests conducted before finalizing our proposed model (1)-(2). We first checked whether the 10 housing return series exhibit cross-sectional dependence using Pesaran's CD test, see Pesaran (2004), which confirmed the presence of cross-sectional dependence between the spatial units. Breusch-Pagan's LM test supported the two-way fixed effects specification, while the Baltagi et al. (2007) conditional LM test detected spatial autocorrelation in the error terms, suggesting the inclusion of a SAR disturbance in the model. Finally, the Hausman test rejected the random effects model in favour of the fixed-effects specification.

This section contains two subsections. Section 4.1 presents our estimation results, and Section 4.2 offers a justification of our modeling strategy using arguments from risk aversion theory.
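For readers unfamiliar with the cross-sectional dependence check mentioned above, the CD statistic of Pesaran (2004) for a balanced panel is $CD = \sqrt{2T/(N(N-1))}\,\sum_{i<j}\hat{\rho}_{ij}$, which is asymptotically standard normal under the null of no cross-sectional dependence. The sketch below applies it directly to a simulated $T \times N$ array; in practice the pairwise correlations are computed from regression residuals, and the simulated data and function names here are assumptions made for illustration.

```python
import math
import numpy as np


def pesaran_cd(panel):
    """CD statistic for a T x N array (rows: periods, columns: units)."""
    t_periods, n_units = panel.shape
    corr = np.corrcoef(panel, rowvar=False)        # N x N pairwise correlations
    upper = corr[np.triu_indices(n_units, k=1)]    # pairs with i < j
    scale = math.sqrt(2.0 * t_periods / (n_units * (n_units - 1)))
    return scale * float(upper.sum())


def two_sided_normal_p(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Simulated panel: 80 periods, 10 units sharing a common factor (hypothetical data).
rng = np.random.default_rng(0)
common = rng.standard_normal(80)
panel = np.column_stack([common + rng.standard_normal(80) for _ in range(10)])

cd = pesaran_cd(panel)
p_value = two_sided_normal_p(cd)
```

Because the simulated units share a common factor, the statistic is large and the null of no cross-sectional dependence is rejected.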

4.1 Estimation results

Equations (1)-(2) are estimated by quasi-maximum likelihood (QMLE) with fixed effects, see Yu et al. (2008). We first estimate the SDPD model (1)-(2) with the six different specifications of the spatial weight matrix given in equations (3) to (8). In Table 3, the first column lists the results from a traditional dynamic panel data model without spatial interactions. We observe that the RMSE in that case is much larger than in all other columns, which include spatial dependence and a spatial error term. The second and third columns contain the estimation results where $W$ is calculated from geographical distances as the channel of spatial dependence across CMAs. In the second column, weights are calculated by the inverse weighting function, while in the third column weights are calculated by the negative exponential function of distances. The last four columns give results when $W$ is calculated from total employment levels, inter-CMA migrants, the total number of movers and real GDP based on equations (5) to (8). The RMSE is significantly reduced relative to that calculated from the non-spatial model, and the coefficient estimates of $\lambda$ and $\rho$ are both significant. In Table 3, the minimum RMSE is obtained in the fifth column when we use $W^{mig}$. This indicates that direct connections in terms of inter-CMA migrants between regions can better explain spatial dependence. We then proceed to use

Table 2: Tests on model selection

No. | Test | Null vs. alternative hypotheses | Test statistic | p-value
1 | Pesaran's CD test for cross-sectional dependence in panels | H0: no cross-sectional dependence; H1: cross-sectional dependence | 8.1093 | 5.092e-16
2 | Breusch-Pagan's LM test on two-way fixed effects | H0: pooling model; H1: significant two-way effects | 52.5103 | 3.958e-12
3 | Baltagi, Song and Koh's LM test for spatial panels | H0: $\rho = \sigma^2_\mu = 0$; H1: spatial autocorrelation and individual-specific effects | 37.7409 | 1.998e-09
4 | Hausman specification test | H0: random effects model; H1: fixed effects model | 24.3421 | 5.178e-06
5 | Baltagi, Song, Jung and Koh LM test for spatial panels | H0: no serial correlation in error terms (assuming $\rho \neq 0$ and $\sigma^2_\mu > 0$); H1: serial correlation in error terms (assuming $\rho \neq 0$ and $\sigma^2_\mu > 0$) | 1.5037 | 0.22401

Notes: Pesaran's CD test refers to the test in Pesaran (2004). Breusch-Pagan's LM test is based on Breusch and Pagan (1980). For Baltagi, Song, Jung and Koh's LM test see Baltagi, Song, Jung and Koh (2007). For Baltagi, Song and Koh's LM test for spatial panels see Baltagi, Song and Koh (2003). We tried to apply the spatial Hausman test (see Mutl and Pfaffermayr, 2011); however, the R command sphtest in package splm does not currently work for maximum likelihood estimations. Thus, we apply the Hausman specification test on traditional panel models without a spatial weight matrix. We find strong evidence to support cross-sectional dependence in a two-way fixed effects panel data model with no serial correlation in $\varepsilon_{i,t}$.

17 (0.0038)

− 0.0019 (0.0038)

−0.0003

(0.0043)

BP

0.66836

0.02051

0.0383 (0.0379)

0.0291

(0.0421)

0.0042 (0.0097)

(0.0029)

(0.5725)

− 0.0054

(0.6507)

2.5322 ***

(0.0028)

3.0193 ***

− 0.0046

(0.0032)

0.1367 ***

(0.0234)

− 0.0051

0.1337 ***

(0.1286)

(0.1454)

(0.0273)

0.1360

(0.0035)

0.0806

(0.0038)

0.02032

(0.0375)

0.0382

(0.0097)

0.0055

(0.5668)

2.1758 ***

(0.0029)

− 0.0045

0.1425 ***

(0.0238)

(0.1280)

0.1241

(0.0034)

− 0.0078 **

0.01963

(0.0371)

0.0216

(0.0093)

0.0060

(0.5780)

2.7144 ***

(0.0029)

− 0.0043

0.1350 ***

(0.0249)

(0.1311)

0.6466

(0.0037)

− 0.0017

(0.0033)

− 0.0079 **

0.0054 **

(0.0031)

(0.0322)

0.0789 **

(0.1915)

5 mig,0

−α = migi,j

***

0.02025

(0.0375)

0.0349

(0.0097)

0.0047

(0.5814)

2.6422 ***

(0.0028)

− 0.0064 **

(0.02346)

0.1308

(0.1310)

0.1079

(0.0038)

− 0.0001

(0.0036)

− 0.0092 **

0.0065 **

(0.0033)

(0.0331)

0.1118 **

(0.1734)

− 0.9878 ***

(0.1611)

−0.7968***

α = 0.28

wi,j

mov,0

= movi − movj −α

***

0.02005

(0.0379)

0.0247

(0.0099)

0.0062

(0.6192)

3.0981 ***

(0.0031)

− 0.0052 *

(0.02367)

0.1297

(0.1352)

0.0473

(0.0039)

− 0.0036

(0.0037)

− 0.0099 ***

0.0077 **

(0.0033)

(0.0354)

0.0792 **

(0.0641)

− 0.1637 **

(0.0393)

−0.0367

α = 0.02

wi,j

6

GDP,0

= GDPi − GDPj −α

(0.0036)

0.02145

(0.0383)

0.0278

(0.0098)

0.0071

2.9388 *** (0.6060)

(0.0030)

− 0.0049

0.1299 *** (0.0253)

(0.1349)

0.0562

(0.0039)

− 0.0035

− 0.0096 ***

0.0071 ** (0.0034)

0.0813 ** (0.0352)

(0.0333)

− 0.0762 **

(0.0276)

−0.0356

α = 0.1

wi,j

7

Notes: ***, **, and * indicate that the estimate is statistically significant at the 1%, 5%, and 10% significance levels, respectively. The first column is the estimation result from the dynamic panel model without the spatial weight matrix. All other columns are estimations of the spatial dynamic panel data model with different specifications of the weight matrix. We estimate the model with different values of the power parameter $\alpha$, and the $\alpha$ that minimizes the RMSE is reported in the table in each case.

RMSE:σ ˆ

inc

unem

pop

Rent

N HP I

UW I

0.0022

− 0.0083 **

− 0.0031 **

Completed

0.0062 *

(0.0033)

0.0060 *

(0.0033)

0.0069 *

(0.0038)

(0.0339)

0.1139 ***

(0.1728)

Starts

(0.0339)

0.1060 ***

(0.1466)

(0.1791)

(0.1565)

− 0.8988 ***

(0.1397)

− 0.8173 ***

− 0.9943 ***

α = 0.28 −0.7881***

α = 0.5 −0.7098***

= empi − empj −α

α = 0.35

emp,0

−0.7353***

4 wi,j

G,0

3 wi,j = exp(−αdi,j )

wi,j = d−α i,j G,0

2

0.0830 **

(0.0380)

W

Yt−1

ρ

λ

without

1

Table 3: Estimation with single weight matrix

a combination of geographical and economic weights. We present the results in Table 4. We notice that the weight matrix in the last column minimizes the RMSE, given by

$$W_N = a \cdot W^{G} + (1 - a) \cdot W^{E} \qquad (11)$$

We vary $a$ from 0 to 1, and vary $\alpha_G$ and $\alpha_E$ between 0.1 and 4 simultaneously. The lowest RMSE is reached for $a = 0.1$, $\alpha_G = 0.55$, and $\alpha_E = 0.35$. Thus, the resulting final estimate of our compound weight matrix is

$$W_N = 0.1\,W^{G} + 0.9\,W^{GDP} \qquad (12)$$

where $W^{G} = \dfrac{d_{i,j}^{-0.55}}{\pi(W_N^{G})}$ and $W^{GDP} = \dfrac{|GDP_i - GDP_j|^{-0.35}}{\pi(W_N^{E})}$.$^{11}$ Note that the estimated RMSEs are smaller in Table 4 than in Table 5. Therefore, the geographical weight is more accurately calculated using the inverse distance weighting function from equation (3).$^{12}$

A change in a single region associated with any given explanatory variable will affect the region itself as a direct impact and potentially affect all other regions as an indirect impact. Behrens and Thisse (2007) theoretically explained that any change in parameters that directly involves one region can generate spatial spillover effects, so it is unlikely to leave the remaining regions unaffected. Due to the existence of the spatial lag term, it is evident that any changes occurring in one region will be spread to all other regions. To make economic inference from our model, we derive the reduced-form models from equations (1)-(2) as

$$Y_t = (I_N - \lambda W_N)^{-1}\gamma Y_{t-1} + (I_N - \lambda W_N)^{-1}X_{t-1}\beta + (I_N - \lambda W_N)^{-1}(\mu + v_t\tau_N + u_t) \qquad (13)$$

$$u_t = (I_N - \rho W_N)^{-1}\varepsilon_t. \qquad (14)$$

11 The absolute values of the eigenvalues of the matrix $\hat{\lambda}W$ are 0.7984, 0.2289, 0.1600, 0.1548, 0.1180, 0.1010, 0.0984, 0.0712, 0.05338, and 0.0147. The absolute values of the eigenvalues of the matrix $\hat{\rho}W$ are 0.9916, 0.2843, 0.1987, 0.1922, 0.1465, 0.1254, 0.1222, 0.0885, 0.0663, and 0.0183. All absolute eigenvalues are less than 1, which ensures our spatial system is stationary.

12 It is worth noting that the weight matrices used in spatial models are typically assumed to be exogenous. However, the exogeneity assumption may not be reasonable. Kelejian and Piras (2014) discuss an estimation method with an endogenous weight matrix. Similarly, in Case et al. (1993), many of the weight matrices used to capture economic similarities are likely to be endogenous. In our case we also use weight matrices based on economic factors, yet the finally selected $W$ in (12), formed by geographical distances and the real GDP levels, is taken to be exogenous, as geographical factors and GDP levels are taken as given. In that sense, the issue of endogeneity that we face may not be as severe. Given the complexity of our approach for arriving at the final choice of the weight matrix based on a data driven grid search method, we leave any issue of endogeneity for future research.
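The reduced form (13) and the eigenvalue condition for stationarity can be illustrated numerically. The sketch below evaluates only the deterministic part of (13) for one period, with the time effect and the error term left out, and checks that all eigenvalues of $\lambda W$ lie inside the unit circle. The weight matrix, coefficients and data are hypothetical.

```python
import numpy as np


def is_stable(w, coef):
    """True when every eigenvalue of coef * W is below 1 in absolute value."""
    return bool(np.max(np.abs(np.linalg.eigvals(coef * w))) < 1.0)


def reduced_form_mean(y_prev, x_prev, w, lam, gamma, beta, mu):
    """Deterministic part of (13): (I - lam W)^{-1} (gamma y_prev + x_prev beta + mu)."""
    n = w.shape[0]
    rhs = gamma * y_prev + x_prev @ beta + mu
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) - lam * w, rhs)


# Hypothetical three-region example with two explanatory variables.
w = np.array([[0.0, 0.6, 0.4],
              [0.3, 0.0, 0.7],
              [0.5, 0.5, 0.0]])
lam, gamma = -0.8, 0.07
beta = np.array([2.6, 0.14])
mu = np.array([0.01, -0.02, 0.00])
y_prev = np.array([0.010, 0.020, 0.015])
x_prev = np.array([[0.004, 0.010],
                   [0.006, 0.020],
                   [0.005, 0.015]])

stable = is_stable(w, lam)
y_mean = reduced_form_mean(y_prev, x_prev, w, lam, gamma, beta, mu)
```

The vector `y_mean` satisfies the structural relation $y = \lambda W y + \gamma y_{t-1} + X_{t-1}\beta + \mu$, which is a quick way to confirm that the reduced form has been applied correctly.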

Table 4: Estimation results with compound weight matrix, when both geographical and economic weights are calculated from inverse function.

0.4W G + 0.6W emp

λ

0.5W G + 0.5W mov

0.2W + 0.8W mov

0.1W G + 0.9W GDP

αG = 0.35, αE = 0.4

αG = 0.45, αE = 0.15

αG = 0.55, αE = 0.25

αG = 0.55, αE = 0.35

−0.8429***

−0.8964***

−0.9007***

−0.8038***

(0.1644)

ρ

− 0.9988 ***

0.0905 ***

0.1084 ***

(0.0334)

Starts

0.0070 **

Completed

(0.1596)

− 0.9984 ***

(0.1586)

(0.1831)

0.0835 **

(0.0334)

0.0747 **

(0.0333)

0.0059 *

(0.0335)

0.0073 **

0.0070 **

(0.0033)

(0.0032)

(0.0031)

− 0.0080 **

− 0.0084 **

− 0.0078 **

− 0.0070 **

0.0013

− 0.0017

− 0.0008

− 0.0015

(0.0037)

(0.0038)

(0.0040)

(0.0036)

0.0987

0.1314

0.0870

0.0741

(0.1299)

(0.1273)

(0.1286)

(0.1295)

BP UW I N HP I

0.1377 ***

(0.0037)

0.1337 (0.02294)

− 0.0042

− 0.0050 *

pop

2.4379 ***

(0.5707)

unem

(0.0033)

(0.0031)

0.1297 ***

***

(0.0241)

(0.0029)

0.1403 ***

(0.0237)

(0.0251)

− 0.0045

− 0.0038

(0.0028)

(0.0028)

(0.0029)

2.2804 ***

2.6339 ***

(0.5668)

2.6497 ***

(0.5704)

(0.5721)

0.0051

0.0042

0.0046

0.0037

(0.0094)

(0.0096)

(0.0093)

(0.0092)

inc RMSE:σ ˆ

(0.1735)

− 0.9774 ***

(0.0032)

(0.0033)

Rent

(0.1550)

− 0.9979 *** (0.1697)

Yt−1

(0.1528)

0.0029

0.0385

0.0281

0.0241

(0.0371)

(0.0373)

(0.0375)

(0.0372)

0.01968

0.01984

0.01962

0.01949

Notes: ***, **, and * indicate that the estimate is statistically significant at the 1%, 5%, and 10% significance levels, respectively. We use grid search to find the most appropriate $W$. We estimate the model with the compound weight matrix formed by both geographical distance and economic similarities, $W_N = g(d_{i,j}, z_{i,j}, \theta)$, where $\theta = (\alpha_G, \alpha_E, a)$. The geographical weight matrix is calculated by (3), and the economic similarity is calculated by one of (5), (6), (7) and (8). We choose the power parameter $\alpha_G$ in the geographical weight matrix, the power parameter $\alpha_E$ in the economic weight, and the combination parameter $a$ simultaneously. The estimates minimize the RMSE.

Table 5: Estimation results with compound weight matrix, when the geographical weight uses the negative exponential function and all the economic weights use inverse functions.

λ

0.8W G + 0.2W emp

0.9W + 0.1W mig

0.5W G + 0.5W mov

0.4W G + 0.6W GDP

αG = 0.6, αE = 0.11

αG = 0.5, αE = 0.05

αG = 0.5, αE = 0.35

αG = 0.15, αE = 0.5

−0.7607***

−0.8028***

−0.8977***

−0.8092***

(0.1600)

− 0.9693 ***

ρ

(0.1770)

Yt−1

0.1147 ***

(0.0337)

Starts Completed

0.0062 **

− 0.9971 ***

− 0.9973 ***

(0.1785)

(0.1590)

0.1141

0.0920 **

(0.0336)

(0.0332)

0.0061 *

0.0074 **

(0.1721)

− 0.9985 *** (0.1807)

0.0729 **

(0.0336)

0.0074 **

(0.0033)

(0.0032)

(0.0031)

− 0.0077 **

− 0.0077 **

− 0.0075 **

− 0.0076 **

0.0024

0.0023

− 0.0004

− 0.0023

(0.0038)

(0.0038)

(0.0037)

(0.0037)

0.1250

0.1241

0.0876

0.0702

(0.1273)

(0.1268)

(0.1281)

(0.1297)

BP UW I N HP I

0.1434 ***

(0.0023)

(0.0033)

0.1416 ***

0.1349 ***

(0.0033)

0.1404 ***

(0.0237)

(0.0236)

(0.0236)

(0.0253)

− 0.0045

− 0.0045

− 0.0045

− 0.0038

(0.0028)

(0.0028)

(0.0028)

(0.0029)

2.1436 ***

2.1415 ***

2.5286 ***

pop

(0.5637)

unem

(0.5615)

(0.5662)

2.6497 ***

(0.5721)

0.0056

0.0054

0.0056

0.0039

(0.0096)

(0.0096)

(0.0093)

(0.0092)

inc RMSE:σ ˆ

(0.1542)

(0.0033)

(0.0034)

Rent

(0.1624)

0.0382

0.0383

0.0301

0.0228

(0.0374)

(0.0372)

(0.0372)

(0.0371)

0.02004

0.01990

0.01967

0.01958

Notes: ***, **, and * indicate that the estimate is statistically significant at the 1%, 5%, and 10% significance levels, respectively. We use grid search to find the most appropriate $W$. We estimate the model with the compound weight matrix formed by both geographical distance and economic similarities, $W_N = g(d_{i,j}, z_{i,j}, \theta)$, where $\theta = (\alpha_G, \alpha_E, a)$. The geographical weight matrix is calculated by (4), and the economic similarity $z_{i,j}$ is calculated by one of (5), (6), (7) and (8). We choose the power parameter $\alpha_G$ in the geographical weight, the power parameter $\alpha_E$ in the economic weight, and the combination parameter $a$ simultaneously. The estimates minimize the RMSE.

Let $S_k(W)$ be the $N \times N$ partial derivative matrix of the expected value of $Y$ with respect to the $k$th explanatory variable of $X$ in region 1 up to region $N$ at time $t$, given the information available at time $t-1$, denoted by $Q_{t-1}$. From equation (13), we have

$$S_k(W) = \frac{\partial E(Y_t \mid Q_{t-1})}{\partial X_{t-1,k}^{T}} = (I_N - \lambda W_N)^{-1}\beta_k$$

where $I_N$ is the $N \times N$ identity matrix, $k$ indexes the $k$th explanatory variable, and the $(i,j)$th element of $S_k(W)$ is denoted by $[S_k(W)]_{i,j}$. In LeSage and Pace (2009), the indirect impact and the direct impact of changes in the $k$th variable are defined as

$$\frac{\partial E(y_{i,t} \mid Q_{t-1})}{\partial x_{j,t-1,k}} = [S_k(W)]_{i,j}, \quad i \neq j, \qquad \text{and} \qquad \frac{\partial E(y_{i,t} \mid Q_{t-1})}{\partial x_{i,t-1,k}} = [S_k(W)]_{i,i},$$

respectively. The implication is that a non-zero change in the $k$th explanatory variable in region $j$ will affect the housing returns in all other regions for which $[S_k(W)]_{i,j} \neq 0$. The second derivative measures the impact on housing returns in region $i$ at time $t$ from a change in the explanatory variable $x_{i,k}$. In other words, the diagonal elements in $S_k(W)$ represent the direct impacts, and the off-diagonal elements represent the indirect impacts. LeSage and Pace (2009) give a good interpretation for these measures of impacts.

1. Direct impact. The measure of direct impact is $[S_k(W)]_{i,i}$, which is the diagonal element in the matrix $S_k(W) = (I_N - \lambda W)^{-1}\beta_k$.

2. Indirect impact to region $i$, which is the $i$th row of $S_k(W)$ except for the element $[S_k(W)]_{i,i}$, for the $k$th explanatory variable. This indirect impact considers how changes of the $k$th explanatory variable in all other regions influence region $i$.

3. Indirect impact from region $i$, which is measured by the $i$th column of $S_k(W)$ except the element $[S_k(W)]_{i,i}$. This form of indirect impact measures how changes of the $k$th explanatory variable in region $i$ influence all other regions.

Note that the average totals of these two forms of indirect impacts are numerically equal. However, the two measures represent different interpretative viewpoints.
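The three impact measures listed above can be computed directly from $S_k(W) = (I_N - \lambda W)^{-1}\beta_k$. The following sketch uses a hypothetical asymmetric three-region weight matrix and made-up values of $\lambda$ and $\beta_k$; it sums the off-diagonal elements of each row (impact to a region) and each column (impact from a region) and shows that the two averages coincide.

```python
import numpy as np


def impact_matrix(w, lam, beta_k):
    """S_k(W) = (I - lam W)^{-1} beta_k for a scalar coefficient beta_k."""
    n = w.shape[0]
    return beta_k * np.linalg.inv(np.eye(n) - lam * w)


def impact_summary(s_k):
    """Direct impacts and the two forms of total indirect impacts per region."""
    direct = np.diag(s_k).copy()
    indirect_to = s_k.sum(axis=1) - direct      # row i without its diagonal element
    indirect_from = s_k.sum(axis=0) - direct    # column i without its diagonal element
    return direct, indirect_to, indirect_from


# Hypothetical weight matrix and parameter values.
w = np.array([[0.0, 0.6, 0.4],
              [0.3, 0.0, 0.7],
              [0.5, 0.5, 0.0]])
s_k = impact_matrix(w, lam=-0.8, beta_k=2.0)
direct, indirect_to, indirect_from = impact_summary(s_k)

average_to = indirect_to.mean()
average_from = indirect_from.mean()
```

Region by region the two indirect measures differ when $W$ is not symmetric, but their averages agree because both add up all off-diagonal elements of $S_k(W)$.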

We can now apply these to the interpretation of our parameter estimates. The coefficient $\lambda$ captures the spatial dependence of housing returns in other CMAs. As we mentioned earlier, in models containing spatial lags of dependent variables, the interpretation of the parameters becomes richer and more complicated. The spatial lag parameter $\lambda$ and the spatial autocorrelation parameter $\rho$ are estimated to be $-0.8038$ and $-0.9984$. The estimate of $\lambda$ is negative, so the direct impact and the indirect impacts appear to be of opposite signs. The predominant channel of both spatial dependence and spatial autocorrelation is a weighted average of distance and number of migrants between CMAs.

2.8713

9. Montreal, QC

10. Halifax, NS

0.1478 0.1437 0.1520 0.1477

9. Montreal, QC

10. Halifax, NS

Average impact

0.1424

7.Toronto, ON

8. Ottawa, ON

0.1501

0.1480

4. Edmonton, AB

0.1468

0.1465

3.Calgary, AB

6. Winnipeg, MB

0.1455

2. Vancouver, BC

5. Regina, SK

0.1538

direct impact

1. Victoria, BC

CMA

2.7890

2.7147

8. Ottawa, ON

Average impact

2.6897

2.7909

7.Toronto, ON

2.8350

2.7727

4. Edmonton, AB

6. Winnipeg, MB

2.7950

3.Calgary, AB

5. Regina, SK

2.7484

2.7673

2. Vancouver, BC

2.9050

direct impact

1. Victoria, BC

CMA

-1.1531

-1.1547

from

-0.0677

-0.0816

-0.0503

-0.0732

-0.0390

-0.0681

-0.0767

-0.0772

-0.0662

-0.0611

-0.0677

-0.0817

-0.0503

-0.0731

-0.0360

-0.0681

-0.0759

-0.0724

-0.0665

-0.0611

-0.0884

to -0.0882

indirect impact

-1.2779

-1.5417

-0.9502

-1.3830

-0.7360

-1.2869

-1.4337

-1.3669

-1.2552

indirect impact

NHPI

-1.2779

-1.5417

-0.9502

-1.3830

-0.7360

-1.2867

-1.4485

-1.3625

-1.2505

-1.6700

from

to -1.6664

indirect impact

indirect impact

Population growth

0.0786

0.0809

0.0765

0.0787

0.0760

0.0782

0.0800

0.0788

0.0780

0.0775

0.0819

direct impact

0.0074

-0.0360

-0.0435

-0.0268

-0.0390

-0.0207

-0.0363

-0.0404

-0.0385

-0.0354

-0.0326

-0.0471

to

indirect impact

Yt−1

-0.0034

-0.0041

-0.0025

0.0072 0.0076

-0.0037

-0.0020

0.0074

0.0071

-0.0038 -0.0034

0.0075 0.0074

-0.0036

-0.0033

-0.0031

-0.0360

-0.0817

-0.0503

-0.0732

-0.0390

-0.0681

-0.0760

-0.0723

-0.0665

-0.0611

-0.0884

from

indirect impact

-0.0034

-0.0041

-0.0025

-0.0037

-0.0020

-0.0034

-0.0038

-0.0036

-0.0033

-0.0030

-0.0044

from

to -0.0044

indirect impact

indirect impact

0.0074

0.0074

0.0073

0.0077

direct impact

Starts

Table 6: Direct and indirect impact

-0.0082

-0.0085

-0.0080

-0.0082

-0.0079

-0.0082

-0.0083

-0.0082

-0.0081

-0.0080

-0.0085

direct impact

0.0038

0.0045

0.0028

0.0041

0.0022

0.0038

0.0042

0.0040

0.0037

0.0034

0.0049

to

indirect impact

Completed

0.0038

0.0045

0.0028

0.0041

0.0022

0.0038

0.0042

0.0040

0.0037

0.0034

0.0049

from

indirect impact

The estimated coefficient of population growth is 2.6497, which is statistically significant at the 5% significance level. This is the largest number among all estimates, which coincides with the CMHC report that population growth drives household formation, typically the largest component of housing demand. In Table 6 we calculate the direct impact and the two forms of indirect impacts from the change in the explanatory variables. The first column in Table 6 is the direct impact of a 1% increase in the population growth rate, if everything else remains the same. For example, when the population growth rate in Vancouver rises by 1%, Vancouver's own housing return increases by 2.7484% through the direct impact. When there is an equal 1% increase in the population growth rate in all other regions, the housing return in Vancouver drops by 1.1531%. If population growth in Vancouver rises by 1%, housing returns in all other regions drop by 1.1547% on average in response. The average direct impact across all regions associated with a 1% increase in the population growth rate is 2.79%, while the average total indirect impact in either form is $-1.28\%$. The biggest direct and indirect impacts in absolute value in either form (from other regions and to other regions) associated with changes in the population growth rate are in Victoria, BC, while the smallest impact is in Toronto, ON.

The supply side variables, changes in the total number of dwelling starts (Starts) and completions (Completed), are statistically significant at the 10% and 5% significance levels, respectively. When there is a 1% increase in the change of the total number of starts, keeping other factors fixed, the average direct impact is a 0.0074% increase in housing returns, while the average indirect impact in either form is a 0.0034% decrease. Similarly, when there is a 1% increase in the change of the total number of dwelling completions, with everything else unchanged, the average direct impact is a 0.0082% decrease, while the average indirect impact in either form is a 0.0038% increase. Ley and Tutchener (2001) find the supply side variables to have a limited effect over the sample years 1971 to 1996 when they study the housing prices in Canada's eight gateway CMAs. However, in our model, the supply side variables Starts and Completed both show a statistically significant relationship with the housing return.

The growth in the New Housing Price Index (NHPI) is also statistically significant at the 5% significance level, with a value of 0.1403. New houses and resale houses in a particular metropolitan area are demanded by the same population group, and since they both share the same economic environment, we expect them to be positively correlated. Meanwhile, new houses and resale houses do not seem to be substitutes, since their returns are positively related. The interpretation of the estimates of NHPI is similar to that of the growth in population density. When there is a 1% increase in the new housing return in Vancouver, BC, the resale housing return in Vancouver, BC itself increases by 0.1455% through the direct impact. The largest direct impact of a region's own new housing return increase happens in Victoria, BC, while the smallest direct impact appears in Toronto, ON. The average direct impact is 0.15% if there is a 1% increase in new housing returns.

The return in the previous year ($Y_{t-1}$) is statistically significant at the 5% significance level, with a value equal to 0.0747. If the return in the previous year rises by 1%, holding everything else constant, the average direct impact is a 0.0786% increase in the housing return in this year, and the average indirect impact is a 0.0360% decrease in either case. Similar to the stock market, our results indicate that housing prices are sticky. In other words, the price in the previous period will influence the next period's price. This may be attributed to the low frequency of housing transactions, with house buyers relying mostly on previous transaction information.

The estimate of the coefficient on the change of the income variable is not statistically significant even at the 10% significance level. Housing prices rise faster than household income, as most recent reports indicate. Based on our data, for example, the real housing price in Toronto rose about 80.46% from 1992 to 2012, while real household income only increased about 13.33% in the same period. Gallin (2006) uses a panel of 95 U.S. cities over 23 years and does not find any evidence of cointegration of housing prices and income either. Gallin's result implies that house prices do not appear to have a stable long-run equilibrium relationship with fundamentals such as income, and our result further shows that housing returns do not have a significant relationship with income growth, either.

4.2 Arrow-Pratt risk aversion explanation of W It is customary to discuss asset returns usually in a framework that allows for risk. Hence, we proceed to examine the possible link between spatial dependence in housing returns and risk preferences. In the formulation of spatial weights, in order to calculate the weight matrix before standardization, we dened geographical distance as

G D(di,j ) = d−α i,j

(15)

and the economic distance between CMAs as

−αE G(gi,j ) = gi,j , where

i 6= j , and αG , αE > 0.

and gi,j = |GDPi − GDPj |

(16)

Applying grid search, we found the estimate of the compound spatial weight matrix in equation (12), where the geographical distance weights account for 10% and the economic weights for 90%. In that case, the economic similarity in GDP levels between CMAs is the predominant channel in explaining the spatial relations. We could think of the Arrow-Pratt absolute risk aversion measures $A_1(d_{i,j})$ and $A_2(g_{i,j})$ as a measure of sensitivity in terms of return spillover effects between housing markets. The CMA with higher values of $A_1(d_{i,j})$ and $A_2(g_{i,j})$ will be more sensitive to the return spillover effects.

We calculate the Arrow-Pratt measure of absolute risk aversion associated with (15) and (16) as

$$A_1(d_{i,j}) = \frac{-D''(d_{i,j})}{D'(d_{i,j})} = (1 + \alpha_G)\,\frac{1}{d_{i,j}} \tag{17}$$

$$A_2(g_{i,j}) = \frac{-G''(g_{i,j})}{G'(g_{i,j})} = (1 + \alpha_E)\,\frac{1}{g_{i,j}} \tag{18}$$
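As a short check of (17), the differentiation steps for the power form in (15) can be written out as follows. This is only a sketch: it writes $\alpha$ for $\alpha_G$ and $d$ for $d_{i,j}$ to lighten the notation, and the same steps apply to (16) with $\alpha_E$ and $g_{i,j}$.

```latex
\begin{aligned}
D(d)   &= d^{-\alpha}, \\
D'(d)  &= -\alpha\, d^{-\alpha-1}, \\
D''(d) &= \alpha(\alpha+1)\, d^{-\alpha-2}, \\
A(d)   &= -\frac{D''(d)}{D'(d)}
        = \frac{\alpha(\alpha+1)\, d^{-\alpha-2}}{\alpha\, d^{-\alpha-1}}
        = \frac{1+\alpha}{d}.
\end{aligned}
```

Because $\alpha$ cancels everywhere except in the factor $1+\alpha$, the measure falls with distance at rate $1/d$ and rises with the power parameter.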

The geographical distance $d_{i,j}$ and the economic distance $g_{i,j}$ are negatively related to $A_1$ and $A_2$, respectively; the larger the distance $d_{i,j}$, the smaller $A_1(d_{i,j})$ is, and the larger the economic distance, the smaller $A_2(g_{i,j})$. That is, the larger the geographical and economic distance, the smaller the sensitivity will be in terms of return spillover effects. In other words, the areas are more sensitive to return spillover effects from nearby areas, and from regions that have similar GDP values. Consider the value of the power parameters $\alpha_G$ and $\alpha_E$: the higher the power, the more sensitive to the return spillover effects. Many researchers subjectively choose the $\alpha$'s to be either 1 or 2, with $\alpha = 2$ being more popular in gravity models. Here, we choose $\alpha_G$ and $\alpha_E$ to lie between 0.1 and 4 by grid search and arrive at estimates of $\hat{\alpha}_G = 0.55$ and $\hat{\alpha}_E = 0.35$ that minimize the model RMSE. These two estimates are both relatively small in the $[0.1, 4]$ interval, indicating little sensitivity to return spillover effects.
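The grid search described above can be sketched as follows. This is a minimal illustration and not the authors' code: equation (12) and the estimation routine are not reproduced in this section, so the convex 10%/90% combination of row-standardized matrices, the helper names (`power_weights`, `compound_w`, `grid_search`) and the toy objective standing in for the model RMSE are all assumptions.

```python
import itertools
import numpy as np

def power_weights(dist, alpha):
    """Row-standardized inverse-power weights dist_ij^(-alpha) with a zero diagonal."""
    dist = np.asarray(dist, dtype=float)
    w = np.zeros_like(dist)
    off = ~np.eye(dist.shape[0], dtype=bool)   # off-diagonal positions only
    w[off] = dist[off] ** (-alpha)
    return w / w.sum(axis=1, keepdims=True)

def compound_w(geo, econ, alpha_g, alpha_e, share_geo=0.10):
    """Assumed compound form: share_geo * W_geo + (1 - share_geo) * W_econ."""
    return (share_geo * power_weights(geo, alpha_g)
            + (1.0 - share_geo) * power_weights(econ, alpha_e))

def grid_search(geo, econ, rmse_fn, grid):
    """Return the (alpha_g, alpha_e) pair on the grid with the smallest rmse_fn(W)."""
    best = None
    for a_g, a_e in itertools.product(grid, grid):
        score = rmse_fn(compound_w(geo, econ, a_g, a_e))
        if best is None or score < best[0]:
            best = (score, a_g, a_e)
    return best[1], best[2]

# Toy illustration with hypothetical distances for four regions.
rng = np.random.default_rng(0)
geo = rng.uniform(100, 4000, size=(4, 4))
geo = (geo + geo.T) / 2                       # symmetric geographic distances
econ = rng.uniform(1, 50, size=(4, 4))
econ = (econ + econ.T) / 2                    # symmetric economic distances
target = compound_w(geo, econ, 0.55, 0.35)    # pretend this W gives the best fit

def toy_rmse(W):
    """Stand-in for the model RMSE; a real run would re-estimate the model for each W."""
    return float(np.sqrt(np.mean((W - target) ** 2)))

grid = np.round(np.arange(0.1, 4.0 + 1e-9, 0.05), 2)
print(grid_search(geo, econ, toy_rmse, grid))
```

In an actual application `rmse_fn` would re-estimate the spatial dynamic panel model for each candidate W, which is the expensive step; the search loop itself is unchanged.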

This gives an idea to housing investors who would consider investing in a house in locations further away and less similar in GDP levels. The reason is that geographically nearby housing markets and economically similar regions will have similar return comovements, and the shocks from one area will be transmitted to the other nearby and closely linked areas. Suppose a person invests in several houses in different housing markets that are located far away from each other; these may not have similar return comovements, hence they are less likely to be affected at the same time. This finding is similar to what investors commonly know as risk diversification.

The case of negative exponential distance weights, using geographical distance as an example, is given by

$$D_1(d_{i,j}) = \exp(-\alpha d_{i,j}) \tag{19}$$

where the associated $A(d_{i,j}) = -D_1''(d_{i,j}) / D_1'(d_{i,j}) = \alpha$ will be constant. This suggests that distances are not related to the sensitivity of return spillover effects between housing markets, something that is not supported by our results. This argument provides additional evidence to support our choice of formulating the spatial weight matrix from an inverse power function of geographic distance and rules out the negative exponential distance weighting method as an alternative option.
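The contrast between the two weighting forms can also be checked numerically. The sketch below is an illustration under assumptions: the distances are hypothetical, the helper `arrow_pratt` is introduced here only for the example, and the derivatives are approximated by central finite differences rather than taken analytically.

```python
import numpy as np

def arrow_pratt(f, x, h=1e-3):
    """Numerical -f''(x) / f'(x) using central finite differences."""
    d1 = (f(x + h) - f(x - h)) / (2 * h)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / h**2
    return -d2 / d1

alpha = 0.55                                   # geographic estimate reported in the text
distances = np.array([1.0, 2.0, 5.0, 10.0])    # hypothetical distances, arbitrary units

power = arrow_pratt(lambda d: d ** (-alpha), distances)       # inverse power form
expo = arrow_pratt(lambda d: np.exp(-alpha * d), distances)   # negative exponential form

print("power form      :", np.round(power, 4))
print("exponential form:", np.round(expo, 4))
```

Under the power form the measure declines with distance, while under the exponential form it stays at the same value for every distance, which is the property the argument above relies on.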

5 Impulse Response Functions

To give further interpretation to our estimation results we study the impulse response functions (IRF) in two interesting cases, both occurring in 2012Q4. We compute the one-period-ahead forecast of housing returns and the impulse response functions up to six quarters ahead. The two cases are discussed separately in the following two subsections. Section 5.1 discusses the first case, the IRFs of a shock to housing returns, while Section 5.2 discusses the second case, the IRFs of a shock to population growth. The computational details for deriving the IRF and its confidence interval are presented in Appendix C for the first case and in Appendix D for the second.

5.1 The impulse response functions of a shock to the dependent variable (housing returns)

We investigate the impulse responses to conventional shocks to CMA housing returns. We are especially interested in the large CMAs, such as Vancouver, Calgary, and Toronto. In Figure 1, we plot the impulse responses to a positive shock to each of the Vancouver, Toronto and Calgary housing returns, to see how it spreads to other CMAs over time. We plot the impulse response functions in a three-dimensional space, so that we can observe how the effects of a shock progress along the time horizon and the distance horizon simultaneously. Along the distance horizon, CMAs are ordered by their closeness (both geographic and economic) to the shock origin. The rate of decay along the time horizon is captured by the downhill direction going from left to right. Movement going from right to left captures the spatial patterns.

When there is a positive shock to Vancouver housing returns, it increases Vancouver housing returns by 2.96% in that quarter. The positive shock does not last; it increases Vancouver housing returns only instantly. It has a negative instant effect on all other CMAs' housing returns in the quarter when the shock happens. Calgary housing returns decrease the most. The shock does not have any significant impact on other CMAs after quarter one. When there is a positive shock that increases the Calgary quarterly housing return by 4.36% instantly, all other CMAs' housing returns decrease instantly and show no significant effects after quarter one. Housing returns in Edmonton decrease the most in response to this shock. If a one standard deviation shock originates in Toronto, Toronto housing returns increase by 2.30% instantly and the effect dies away in the first quarter. All other CMAs' housing returns decrease instantly, except Vancouver, Calgary, and Edmonton. Montreal housing returns decrease the most. This shock does not have any significant effect after quarter one.

Figure 1: One standard deviation shock on Vancouver, Calgary and Toronto housing returns.

Notes: Impulse responses of a one standard deviation shock to each of the Vancouver, Calgary and Toronto housing returns over time and across regions. The CMAs are ordered by their geographic and economic closeness to the shock origin.

Table 7: Regions and one standard deviation of housing returns for each CMA

Region     CMA                 One stdev. of housing return   One stdev. of population growth
Region 1   1. Victoria, BC     2.84%                          0.12%
Region 1   2. Vancouver, BC    2.64%                          0.19%
Region 2   3. Calgary, AB      3.80%                          0.23%
Region 2   4. Edmonton, AB     3.68%                          0.23%
Region 2   5. Regina, SK       3.81%                          0.20%
Region 2   6. Winnipeg, MB     1.96%                          0.15%
Region 3   7. Toronto, ON      2.20%                          0.10%
Region 3   8. Ottawa, ON       2.06%                          0.13%
Region 3   9. Montreal, QC     2.20%                          0.12%
Region 3   10. Halifax, NS     2.56%                          0.09%

Notes: We group the 10 CMAs into three regions: region one, region two, and region three. Our conventional shocks are one standard deviation of each CMA's housing return, which is reported in the third column. The other positive shock is a one standard deviation shock to the CMA's population growth rate, and its magnitude is reported in the last column.

Overall, the shocks have only an instantaneous effect on an individual CMA's housing return and then fade away quickly. Another view of the results is provided in Figure 2. We plot the instant impact when the shock takes place and examine how the shocks spread across distances. At the same time, we plot the shock origin CMA over time on the same figure, so that we are able to compare the decaying speed along both the time horizon and the distance horizon. In other words, we can observe whether the shock decays faster over time or across space. In each of the following graphs, we plot the IRFs on CMAs ordered by their closeness to the shock origin, together with the 95% confidence interval. (The plot of shock origins with its 95% confidence interval can be found in Figure 9 in Appendix E.) Each graph also shows the IRF of the same shock on the origin over time. We order CMAs by taking into consideration both geographical and economic closeness to the shock origins, which are indicated on the top axis. The bottom axis is the time horizon from quarter zero to quarter six. In all three cases the decaying speed across time is faster than the decaying speed across distance. This conclusion is not surprising. Canada is ranked as the second-largest country in the world by total area, but its population is ranked thirty-seventh, which is relatively low. In addition, large urban centers are located far away from each other. The average distance between our selected 10 CMAs is 1,130 km. In our model, the dominant channel of the spatial linkage is the geographical distance combined with the economic closeness in terms of inter-CMA migrants. Even though the spatial lag parameter is estimated to be statistically significant, the impulse response to a shock does not pass across regions and time significantly after quarter zero.

We also want to investigate a positive shock to several CMAs at the same time. If there is a positive shock to an entire region's housing returns, will it last longer and spread wider? We divide the 10 CMAs into three regions: region one, region two and region three (see Table 7). Region one includes Victoria and Vancouver; region two includes Calgary, Edmonton, Regina and Winnipeg; and region three includes Toronto, Ottawa, Montreal and Halifax. In Figure 3, we plot the impulse response functions of the shock to the entire region one, region two, and region three housing returns. For example, when we impose the positive shock on region one, we increase the housing returns of Victoria and Vancouver by one standard deviation, that is, by 2.84% and 2.64%, respectively. Again, the shocks have only an instantaneous effect, relative to the shock origins, on all other CMAs. When there is a one standard deviation shock to region one, the Victoria and Vancouver housing returns increase by 3.50% and 2.59%, respectively. All other CMAs' housing returns decrease instantly, and the Halifax housing return is affected the most, falling by 1.32%. Similarly, when there is a one standard deviation shock to region two (including Calgary, Edmonton, Regina and Winnipeg), the shock origin CMAs' housing returns increase, and all other CMAs' housing returns decrease instantly. Ottawa is affected most by this shock, with a 1.92% decrease. But the shock does not last even one quarter, and does not spread out. For region three, when there is a one standard deviation shock to the entire region (including Toronto, Ottawa, Montreal and Halifax), the shock origins' housing returns increase instantly, and all other CMAs' housing returns decrease instantly. Victoria has the biggest response to this shock, and its housing returns decrease by 2.93%.

Note that the impulse responses of the shock to the multiple CMAs are sub-additive. The magnitude of the impulse responses of the shocks to multiple CMAs is larger than the simple sum of the IRFs from each shock happening in a single CMA. This also shows that the ten CMAs in Canada have spatial linkages, and that they have spatially indirect impacts on each other. If we compare the impulse responses in Figure 1 and Figure 3, to the respective shock to a single CMA and to multiple CMAs, we find that the shocks have only an instant effect in both cases. Shocks do not persist even one quarter, and do not spread to others after quarter one. Since the shocks have only instant effects, in Figure 4 we also plot the regional shock at horizon zero across CMAs with its 95% confidence band, and the shock origin CMAs across time on the same figure. We order CMAs according to their distance from the shock origins, which are indicated on the top axis. The bottom axis is the time horizon from quarter zero to quarter six.

Figure 2: Impulse responses at horizon zero and over time


Figure 3: Impulse responses of one standard deviation shock to entire region one, region two and region three


Figure 4: Impulse response of entire regions at horizon zero


5.2 The impulse response functions of the shock to population growth

Based on the estimation results in Table 4, the coefficient of the population growth rate is 2.6497, which is the largest among all the estimates. It shows that population growth is the biggest factor affecting residential housing returns, so we are also interested in the impulse response functions to a population growth shock. We impose a one standard deviation shock to the population growth rate, and then we look for the dynamic and spatial patterns of the impulse response functions. (The magnitude of one standard deviation of population growth in each CMA is listed in the last column of Table 7.) The computation details are given in Appendix D.

Figure 5 displays the IRFs of housing returns to a one standard deviation shock to population growth in Vancouver, Calgary, and Toronto, individually. In response to the shock that originates in Vancouver, housing returns in Vancouver itself rise by 0.51% instantly, and the shock becomes insignificant starting from quarter one. Housing returns decrease instantly in the rest of the CMAs at horizon zero, and Calgary decreases the most, by 0.04%. When there is a one standard deviation shock to Calgary population growth, Calgary quarterly housing returns increase by 0.63%. All other CMAs' housing returns decrease instantly, and Edmonton responds the most, with a 0.06% decrease. In the case of a positive shock to Toronto population growth, the housing returns in Toronto itself increase by 0.41% instantly and become insignificant in quarter one. The rest of the CMAs' housing returns decrease instantly at horizon zero, and Halifax is affected the most, with a 0.03% decrease in quarterly housing returns.

Figure 5: Impulse response functions of housing returns to a shock to the population growth

In Figure 6, the effect of a positive population growth shock to a single CMA over time is directly compared to the impact of the same shock on CMAs ordered by their distance from the shock origin. On the top axis, we order CMAs according to both geographical and economic closeness to the shock origin. The bottom axis is the dynamic horizon from quarter one to quarter six. We observe that the decay of the impulse responses across CMAs is faster than the time decay.

Figure 6: Impulse response functions of one standard deviation shock to population growth

In Figure 7, we impose a population growth shock on each of the regions. When there is a positive shock to region one population growth, Victoria and Vancouver population growth increase by 0.12% and 0.19%, respectively. The housing returns of the region one CMAs increase instantly: Victoria quarterly housing returns increase by 0.31% and Vancouver quarterly housing returns increase by 0.49%. All other CMAs' housing returns decrease instantly in response to this shock. Regina is affected the most, with a decrease of 0.06%. In response to the positive shock to region two (including Calgary, Edmonton, Regina, and Winnipeg), the shock origins' housing returns rise instantly and the other CMAs' housing returns decrease instantly. The shock does not have any significant impact that persists to the first quarter. Ottawa quarterly housing returns decrease the most, by 0.23%. The last graph in Figure 7 displays the impulse responses to the positive shock to the entire region three (including Toronto, Ottawa, Montreal and Halifax). The region three housing returns increase instantly, and then become insignificant in quarter one. The shock has only an instant negative impact on the rest of the CMAs at horizon zero, and then dies away. Edmonton quarterly housing returns are affected the most by this shock, with a decrease of 0.07%.

Another view of the region-wide shock to population growth is provided in Figure 8, where the IRFs for the housing return changes are added across regions at horizon zero, with the 95% confidence interval in broken lines. In addition, the impulse responses of the shock origins are plotted over horizons.

The dynamic and spatial impacts of shocks to Canadian residential housing returns are different from what researchers have found in other countries. The shocks in the Canadian housing market fade faster across time and have smaller impacts on other regions spatially. Holly, Pesaran, and Yamagata (2011) provide an analysis of the spatial and temporal diffusion of house prices in the UK and find that a positive shock to London housing prices would spread to other regions gradually and raise prices across the country. In that case, the closer the region in question is to London, the more rapid the response to the shock. Brady (2011), who studied the diffusion of housing prices in 31 California counties across space and over time, also found the diffusion of housing prices across space to last up to two and a half years. The impulse responses of housing returns in Canadian residential housing markets have smaller impacts.

This special feature may be attributed to the Canadian banking system and mortgage regulations. The Canadian banking system is widely considered one of the safest banking systems in the world. Even though Canada's mortgage market performed well through the most recent financial crisis, mortgage regulations have been tightened since 2008. For example, the maximum amortization for new government-backed insured mortgages was reduced from 30 to 25 years; the maximum loan-to-value (LTV) ratio for refinanced mortgages was lowered from 85% to 80%; the minimum down payment on non-owner-occupied properties was raised from 5% to 20%; more stringent eligibility criteria were introduced; and so on. These policies make the Canadian housing market less vulnerable to shocks.


Figure 7: One standard deviation shock to population growth


Figure 8: IRF of housing returns to a one standard deviation shock to population growth


6 Conclusion

This paper uses a dynamic spatial panel data model to analyze the spatial dependence of residential resale housing returns across 10 major Canadian CMAs from 1992Q4 to 2012Q4. We identify statistically significant spatial dependence and spatial autocorrelation. Geographical distance and economic similarity are the predominant channels for both spatial dependence and spatial autocorrelation in the Canadian housing market. Even if the spatial weighting function is pre-specified up to unknown parameters, it has to be objectively determined within the context of the given model. We do that by employing a data-driven grid search approach to arrive at an estimate of our weight matrix that is model-consistent. Furthermore, we investigate the IRFs of housing returns to one-time instant housing return shocks and population growth shocks, and we find that these shocks last only for a short time horizon and do not spread widely across space.

For our future research, we would like to relax the restriction that we have imposed on the admissible spatial weight matrix W that all of its elements have to be non-negative. Using such a model to forecast one-period-ahead returns might result in over-predicting future returns. Therefore, extending the current SDPD parametric model to a semi-parametric one, and letting the data determine the spatial weight function g(W), would allow for a much richer specification. In that case, we would need to tackle additional computational and estimation issues that are worth exploring, including that of endogeneity of the weight matrix.

References

[1] Akari, A.H. and Y. Aydede, 2012. Effects of immigration on house prices in Canada. Applied Economics 44, 1645-1658.
[2] Allen, J., R. Amano, D.P. Byrne and A.W. Gregory, 2006. Canadian city housing prices and urban market segmentation. Bank of Canada Working Paper. Series: 2006-49.
[3] Anselin, L., 1988. Spatial Econometrics: Methods and Models. Kluwer Academic Publishers, Dordrecht, The Netherlands.
[4] Anselin, L., 2003. Spatial externalities, spatial multipliers and spatial econometrics. International Regional Science Review 26(2), 153-166.
[5] Baltagi, B.H., S.H. Song and W. Koh, 2003. Testing panel data regression models with spatial error correlation. Journal of Econometrics 117, 123-150.
[6] Baltagi, B.H., S.H. Song, B. Jung and W. Koh, 2007. Testing panel data regression models with spatial and serial error correlation. Journal of Econometrics 140, 5-51.
[7] Behrens, K. and J.F. Thisse, 2007. Regional economics: A new economic geography perspective. Regional Science and Urban Economics 37(4), 457-465.
[8] Bitter, C., G.F. Mulligan and S. Dall'erba, 2007. Incorporating spatial variation in housing attribute prices: a comparison of geographically weighted regression and the spatial expansion method. Journal of Geographical Systems 9(1), 7-27.
[9] Brady, R.R., 2011. Measuring the diffusion of housing prices across space and over time. Journal of Applied Econometrics 26(2), 213-231.
[10] Breusch, T. and A. Pagan, 1980. The Lagrange multiplier test and its applications to model specification in econometrics. Review of Economic Studies 47, 239-253.
[11] Bruyne, K.D. and J.V. Hove, 2013. Explaining the spatial variation in housing prices: an economic geography approach. Applied Economics 45(13), 1673-1689.
[12] Can, A., 1992. Specification and estimation of hedonic housing price models. Regional Science and Urban Economics 22(3), 453-474.
[13] Case, A.C., 1991. Spatial patterns in household demand. Econometrica 59(4), 953-965.
[14] Case, A.C., H.S. Rosen and J.R. Hines, 1993. Budget spillovers and fiscal policy interdependence: evidence from the States. Journal of Public Economics 52(3), 285-307.
[15] Cliff, A.D. and J.K. Ord, 1969. The problem of spatial autocorrelation. Papers in Regional Science 1, Studies in Regional Science, 25-55, edited by A.J. Scott, London: Pion.

[16] Cliff, A.D. and J.K. Ord, 1973. Spatial Autocorrelation. London: Pion.
[17] Cliff, A.D. and J.K. Ord, 1981. Spatial Process: Models and Applications. London: Pion.
[18] Congdon, P., 2006. A model for non-parametric spatially varying regression effects. Computational Statistics and Data Analysis 50(2), 422-445.
[19] Dacey, M.F., 1965. A review on measures of contiguity for two and k-color maps. Technical Report No. 2, Spatial Diffusion Study. Department of Geography, Evanston: Northwestern University.
[20] Debarsy, N. and C. Ertur, 2010. Testing for spatial autocorrelation in a fixed effects panel data model. Regional Science and Urban Economics 40, 453-470.
[21] Dolde, W. and D. Tirtiroglu, 2002. Housing price volatility changes and their effects. Real Estate Economics 30(1), 41-66.
[22] Dubin, R.A., 1992. Spatial autocorrelation and neighborhood quality. Regional Science and Urban Economics 22(3), 433-352.
[23] Dubin, R.A., 1998. Spatial autocorrelation: a primer. Journal of Housing Economics 7(4), 304-327.
[24] Elhorst, J.P., 2003. Specification and estimation of spatial panel data models. International Regional Science Review 26(3), 244-268.
[25] Elhorst, J.P., 2010. Spatial panel data models. In: Fischer, M.M., Getis, A. (Eds.), Handbook of Applied Spatial Analysis. Springer, Berlin Heidelberg, pp. 377-407.
[26] Farber, S. and M. Yeates, 2007. A comparison of localized regression models in a hedonic house price context. Canadian Journal of Regional Science 29(3).
[27] Fingleton, B., 2001. Equilibrium and economic growth: spatial econometric models and simulations. Journal of Regional Science 41(1), 117-147.
[28] Gallin, J., 2006. The long-run relationship between house prices and income: evidence from local housing markets. Real Estate Economics 34(3), 417-438.
[29] Geary, R.C., 1954. The contiguity ratio and statistical mapping. The Incorporated Statistician 5, 115-145.
[30] Getis, A. and J. Aldstadt, 2004. Constructing the spatial weights matrix using a local statistic. Geographical Analysis 36(2), 90-104.
[31] Glaeser, E.L. and J. Gyourko, 2006. Housing dynamics. NBER Working Papers 12787, National Bureau of Economic Research Inc.
[32] Guirguis, H.S., C.I. Giannikos and L.G. Garcia, 2007. Price and volatility spillovers between large and small cities: a study of the Spanish market. Journal of Real Estate Portfolio Management 13(4), 311-316.

[33] Gupta, R. and S.M. Miller, 2012. The time-series properties of housing prices: a case study of the southern California market. Journal of Real Estate Finance and Economics 44, 339-361.
[34] Haggett, P., A.D. Cliff and A. Frey, 1977. Location Analysis in Human Geography, 2nd edition. New York: Wiley.
[35] Holios, A.J. and J.E. Pesando, 1992. Monitoring price behaviour in the resale housing market: a note on measurement and implications for policy. Canadian Public Policy XVIII, 57-61.
[36] Holly, S., M.H. Pesaran and T. Yamagata, 2006. A spatial-temporal model of house prices in the U.S. Journal of Econometrics 158(1), 160-173.
[37] Hossain, B. and E. Latif, 2009. Determinants of housing price volatility in Canada: a dynamic analysis. Applied Economics 41, 3521-3531.
[38] Karoglou, M., B. Morley and D. Thomas, 2013. Risk and structural instability in U.S. house prices. Journal of Real Estate Finance and Economics 46(3), 424-436.
[39] Kelejian, H.H. and I.R. Prucha, 1998. A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. Journal of Real Estate Finance and Economics 17, 99-121.
[40] Kelejian, H.H. and I.R. Prucha, 2010. Specification and estimation of spatial autoregressive models with autoregressive and heteroscedastic disturbances. Journal of Econometrics 157, 53-67.
[41] Kelejian, H.H. and G. Piras, 2014. Estimation of spatial models with endogenous weighting matrices, and an application to a demand model for cigarettes. Regional Science and Urban Economics 46, 140-149.
[42] Lee, L.F. and J. Yu, 2009. Some recent developments in spatial panel data models. Regional Science and Urban Economics 40(5), 255-271.
[43] Lee, L.F. and J. Yu, 2010a. Estimation of spatial autoregressive panel data models with fixed effects. Journal of Econometrics 154(2), 165-185.
[44] Lee, L.F. and J. Yu, 2010b. A spatial dynamic panel data model with both time and individual fixed effects. Econometric Theory 26(2), 564-597.
[45] Lee, L.F. and J. Yu, 2010c. Estimation of unit root spatial dynamic panel data models. Econometric Theory 26(5), 1332-1362.
[46] LeSage, J. and R.K. Pace, 2009. Introduction to Spatial Econometrics. Taylor & Francis CRC Press, Boca Raton.

[47] Ley, D. and J. Tutchener, 2001. Immigration, globalization and house prices in Canada's gateway cities. Housing Studies 16(2), 199-223.
[48] Meen, G., 1999. Regional house prices and the ripple effect: a new interpretation. Housing Studies 4(6), 733-753.
[49] Miao, H., S. Ramchander and M.C. Simpson, 2011. Return and volatility transmission in U.S. housing markets. Real Estate Economics 39, 701-741.
[50] Miles, W., 2008. Volatility clustering in U.S. home prices. Journal of Real Estate Research 30(1), 73-90.
[51] Miller, N. and L. Peng, 2006. Exploring metropolitan housing price volatility. Journal of Real Estate Finance and Economics 43, 5-18.
[52] Millo, G. and G. Piras, 2012. splm: Spatial panel data models in R. Journal of Statistical Software 47(1), 1-38.
[53] Mutl, J. and M. Pfaffermayr, 2011. The Hausman test in a Cliff and Ord panel model. Econometrics Journal 14, 48-76.
[54] Olmo, J.C., 1995. Spatial estimation of housing prices and locational rents. Urban Studies 32(8), 1331-1344.
[55] Pace, R.K. and O.W. Gilley, 1997. Using the spatial configuration of the data to improve estimation. Journal of Real Estate Finance and Economics 14(3), 333-340.
[56] Pace, R.K., R. Barry and C.F. Sirmans, 1998. Spatial statistics and real estate. Journal of Real Estate Finance and Economics 17(1), 5-13.
[57] Pace, R.K. and J.P. LeSage, 2004. Spatial statistics and real estate. Journal of Real Estate Finance and Economics 29(2), 147-148.
[58] Páez, A., T. Uchida and K. Miyamoto, 2001. Spatial association and heterogeneity issues in land price models. Urban Studies 38(9), 1493-1508.
[59] Pesaran, M.H., 2004. General diagnostic tests for cross section dependence in panels. Cambridge Working Papers in Economics, CESifo 1229.
[60] Tsai, C., M. Chen and T. Ma, 2010. Modeling house price volatility states in the UK by switching ARCH models. Applied Economics 42(9), 1145-1153.
[61] Tse, R.Y.C., 2002. Estimating neighborhood effects in house prices: towards a new hedonic model approach. Urban Studies 39(7), 1165-1180.
[62] Tobler, W.R., 1970. A computer movie simulating urban growth in the Detroit region. Economic Geography 46, 234-40.
[63] Yu, J., R. de Jong and L.F. Lee, 2008. Quasi-maximum likelihood estimators for spatial dynamic panel data with fixed effects when both n and T are large. Journal of Econometrics 146(1), 118-134.
[64] Zhang, C. and Y. Murayama, 2000. Testing local spatial autocorrelation using k-order neighbors. Int J Geographic Information Science 14, 681-692.
[65] Zhang, C. and Y. Murayama, 2003. Evaluation on the prominences of irregular areas based on spatial weight matrices. Geographical Review of Japan 76A, 777-787.
[66] Zhang, Y., X. Hua and L. Zhao, 2012. Exploring determinants of housing prices: a case study of Chinese experience in 1990-2010. Economic Modeling 29(6), 2349-2361.
[67] Zhu, B., R. Fuss and N.B. Rottke, 2013. Spatial linkages in returns and volatilities among U.S. regional housing markets. Real Estate Economics 41(1), 964.

Appendix

A. List of CMAs

Table 8: The list of the selected 10 CMAs

No.   Provinces and Territories   Metropolitan Areas
1     Nova Scotia                 Halifax
2     Quebec                      Montreal
3     Ontario                     Toronto, Ottawa
4     Manitoba                    Winnipeg
5     Saskatchewan                Regina
6     Alberta                     Edmonton, Calgary
7     British Columbia            Vancouver, Victoria

B. Detailed data sources

• Building Permits: CANSIM Table 026-0006.
• Housing Starts: CANSIM Table 027-0052.
• Completed: CANSIM Table 027-0060.
• Construction Union Wage Index (UWI): Table 327-0045. The UWI data is not available for Regina from 1992Q2 to 2007Q1, so the Prairie Region data is used for that period instead.
• NHPI: CANSIM Table 327-0046.
• Rent index: CANSIM Table 326-0020.
• Population and unemployment rate: CANSIM Table 282-0090 for 1992Q2-1995Q4 and Table 282-0109 for 1996Q1-2013Q1. Data for January and February 1996 is not available, so the data for March 1996 is used for 1996Q1 instead of the quarterly average.
• Household income: quarterly and annual national household income are from CANSIM Table 380-0083, and annual CMA-level household income, which is real median after-tax household income, is from CANSIM Table 202-0605. The income data is not available for Regina from 1992 to 1997, so the household income for the province of Saskatchewan is used for that period.

C. The IRF of the shock to housing returns 1.The derivation of the IRF of the shock to housing returns

An impulse response function traces out the magnitude and duration of a variable in response to a shock. We apply this technique to our dynamic spatial panel data model to better understand the eect of shocks on housing returns over time and across spaces. From model (1)-(2), we have

Yt = γYt−1 + λWN Yt + Xt−1 β + µ + vt τN + (IN − ρWN ) −1 εt To simplify the notation, let

IN − λWN = A,

and

IN − ρWN = B .

(20)

Rearranging (20), we

have

Yt = A−1 γYt−1 + A−1 Xt−1 β + A−1 (µ + vt τN ) + A−1 B −1 εt The instant impulse response to a one-time shock in

At period

εt

of magnitude

η

(21)

at time

t

is

IRF0 = A−1 B −1 η

(22)

Yt+1 = A−1 γYt + A−1 Xt β + A−1 (µ + vt+1 τN ) + A−1 B −1 εt+1

(23)

t + 1,

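If we substitute equation (21) into (23), we get

\[
\begin{aligned}
Y_{t+1} ={}& \gamma^2 A^{-2} Y_{t-1} + \gamma A^{-2} X_{t-1}\beta + A^{-1} X_t\beta + \left(\gamma A^{-2} + A^{-1}\right)\mu + \gamma A^{-2} v_t\tau_N + A^{-1} v_{t+1}\tau_N \\
& + \gamma A^{-2}B^{-1}\varepsilon_t + A^{-1}B^{-1}\varepsilon_{t+1}
\end{aligned}
\]

We assume that the shock to $\varepsilon_t$ does not pass on to the variables $X_s$ and $v_s$ for $s \ge t$ or to later disturbances $\varepsilon_s$ for $s > t$, and we hold $X_s$ and $v_s$ unchanged for all periods $s \ge t$. We can therefore clearly observe how housing returns respond to such a shock. A one-time shock in $\varepsilon$ at time $t$ then results in

\[
Y_{t+1} = \gamma^2 A^{-2} Y_{t-1} + \left(\gamma A^{-2} + A^{-1}\right)\left(X_{t-1}\beta + \mu + v_t\tau_N\right) + \gamma A^{-2}B^{-1}\varepsilon_t + A^{-1}B^{-1}\varepsilon_{t+1} \tag{24}
\]

The corresponding impulse response function in period $t+1$ is

\[
IRF_1 = \gamma A^{-2}B^{-1}\eta \tag{25}
\]

Repeating this procedure, we have

\[
\begin{aligned}
Y_{t+2} ={}& \gamma^3 A^{-3} Y_{t-1} + \left(\gamma^2 A^{-3} + \gamma A^{-2} + A^{-1}\right)\left(X_{t-1}\beta + \mu + v_t\tau_N\right) \\
& + \gamma^2 A^{-3}B^{-1}\varepsilon_t + \gamma A^{-2}B^{-1}\varepsilon_{t+1} + A^{-1}B^{-1}\varepsilon_{t+2} \\
Y_{t+3} ={}& \gamma^4 A^{-4} Y_{t-1} + \left(\gamma^3 A^{-4} + \gamma^2 A^{-3} + \gamma A^{-2} + A^{-1}\right)\left(X_{t-1}\beta + \mu + v_t\tau_N\right) \\
& + \gamma^3 A^{-4}B^{-1}\varepsilon_t + \gamma^2 A^{-3}B^{-1}\varepsilon_{t+1} + \gamma A^{-2}B^{-1}\varepsilon_{t+2} + A^{-1}B^{-1}\varepsilon_{t+3} \\
& \vdots \\
Y_{t+h} ={}& \gamma^{h+1} A^{-(h+1)} Y_{t-1} + \left[\gamma^h A^{-(h+1)} + \gamma^{h-1} A^{-h} + \cdots + \gamma A^{-2} + A^{-1}\right]\left(X_{t-1}\beta + \mu + v_t\tau_N\right) \\
& + \gamma^h A^{-(h+1)}B^{-1}\varepsilon_t + \gamma^{h-1} A^{-h}B^{-1}\varepsilon_{t+1} + \cdots + \gamma A^{-2}B^{-1}\varepsilon_{t+h-1} + A^{-1}B^{-1}\varepsilon_{t+h}
\end{aligned}
\]

The corresponding impulse response functions are

\[
\begin{aligned}
IRF_2 &= \gamma^2 A^{-3}B^{-1}\eta \\
IRF_3 &= \gamma^3 A^{-4}B^{-1}\eta \\
&\vdots \\
IRF_h &= \gamma^h A^{-(h+1)}B^{-1}\eta
\end{aligned}
\tag{26}
\]

for $h = 0, 1, 2, \ldots, T$.

In order to construct the 95% confidence interval for each impulse response, we need to compute the standard errors for $IRF_h$. The delta method is used to compute the standard error. We take the derivative of the transformed function with respect to the parameters, and multiply it by the asymptotic variance of the untransformed parameters. To be specific,

\[
Var(IRF_h) = \left(\frac{\partial IRF_h}{\partial\gamma}, \frac{\partial IRF_h}{\partial\lambda}, \frac{\partial IRF_h}{\partial\rho}\right) \Sigma \left(\frac{\partial IRF_h}{\partial\gamma}, \frac{\partial IRF_h}{\partial\lambda}, \frac{\partial IRF_h}{\partial\rho}\right)' \tag{27}
\]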

where $\Sigma$ is the asymptotic variance-covariance matrix of the estimators of $\gamma$, $\lambda$ and $\rho$. We then construct the 95% confidence interval

\[
CI = \widehat{IRF}_h \pm 1.96 \cdot se\left(\widehat{IRF}_h\right) \tag{28}
\]

where $\widehat{IRF}_h$ is the estimate of $IRF_h$. The detailed derivatives of $IRF_h$ can be found below.
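The closed form $IRF_h = \gamma^h A^{-(h+1)}B^{-1}\eta$ and the delta-method interval can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the 3-by-3 weight matrix `W`, the parameter values in `theta_hat`, and the covariance matrix `Sigma_hat` are hypothetical placeholders rather than the paper's estimates, the function names are ours, and the Jacobian is approximated by central finite differences instead of the analytic derivatives. The sketch takes $\Sigma$ to be the variance-covariance matrix of the estimators and reports only the element-wise variances, that is, the diagonal of the $N \times N$ matrix $Var(IRF_h)$.

```python
import numpy as np

def irf_path(gamma, lam, rho, W, eta, horizons):
    """Rows h = 0..horizons of gamma^h A^-(h+1) B^-1 eta."""
    n = W.shape[0]
    A_inv = np.linalg.inv(np.eye(n) - lam * W)   # A = I_N - lambda * W_N
    B_inv = np.linalg.inv(np.eye(n) - rho * W)   # B = I_N - rho * W_N
    out = np.empty((horizons + 1, n))
    out[0] = A_inv @ B_inv @ eta                 # IRF_0
    for h in range(1, horizons + 1):
        out[h] = gamma * (A_inv @ out[h - 1])    # IRF_h = gamma * A^-1 * IRF_{h-1}
    return out

def delta_method_se(theta, Sigma, W, eta, horizons, step=1e-6):
    """Element-wise standard errors from J Sigma J', J by central differences."""
    theta = np.asarray(theta, dtype=float)
    n = W.shape[0]
    J = np.empty((horizons + 1, n, theta.size))
    for k in range(theta.size):
        up, dn = theta.copy(), theta.copy()
        up[k] += step
        dn[k] -= step
        J[:, :, k] = (irf_path(*up, W, eta, horizons)
                      - irf_path(*dn, W, eta, horizons)) / (2 * step)
    # For each horizon h and region n: j' Sigma j, with j the 3-vector of derivatives.
    var = np.einsum("hnk,kl,hnl->hn", J, Sigma, J)
    return np.sqrt(var)

# Hypothetical inputs, for illustration only (not the paper's estimates).
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])                  # row-normalized toy weight matrix
theta_hat = (0.3, 0.2, 0.1)                      # (gamma, lambda, rho)
Sigma_hat = np.diag([0.01, 0.01, 0.01])          # assumed covariance of the estimators
eta = np.array([1.0, 0.0, 0.0])                  # unit shock originating in region 1

irf = irf_path(*theta_hat, W, eta, horizons=6)
se = delta_method_se(theta_hat, Sigma_hat, W, eta, horizons=6)
lower, upper = irf - 1.96 * se, irf + 1.96 * se  # interval as in equation (28)
print(np.round(irf[:, 0], 4))                    # own-region response by horizon
```

The recursion $IRF_h = \gamma A^{-1} IRF_{h-1}$ avoids recomputing matrix powers at each horizon; with $|\gamma| < 1$ and a row-normalized $W$ the responses shrink geometrically with $h$.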

2. Derivatives of $IRF_h$ of the shock to housing returns

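In order to construct the 95% confidence interval, we need to compute the standard errors for the estimates of $IRF_h$, then apply the delta method. The derivatives of $IRF_h$ with respect to the parameters $(\gamma, \lambda, \rho)$ in each period, from $h = 0$ to $h = 6$, are presented below.

In period $t$, $h = 0$, we impose a shock and denote it by $\eta$, then

\[
IRF_0(\gamma, \lambda, \rho) = (I_N - \lambda W_N)^{-1}(I_N - \rho W_N)^{-1}\eta = A^{-1}B^{-1}\eta \tag{29}
\]

The derivative of $IRF_0$ with respect to $(\gamma, \lambda, \rho)$ is

\[
\begin{aligned}
\frac{\partial IRF_0}{\partial\gamma} &= 0; \\
\frac{\partial IRF_0}{\partial\lambda} &= \left[\left(B^{-1}\eta\right)^T \otimes I_N\right] \frac{d\,vec\left(A^{-1}\right)}{d\lambda} \\
&= \left[\left(B^{-1}\eta\right)^T \otimes I_N\right] \cdot \frac{d\,vec\left(A^{-1}\right)}{d\,vec(A)} \cdot \frac{d\,vec\left(I_N - \lambda W_N\right)}{d\lambda} \\
&= \left[\left(B^{-1}\eta\right)^T \otimes I_N\right] \left[-\left(A^{-1}\right)^T \otimes A^{-1}\right] \left[-vec(W_N)\right] \\
&= \left[\left(A^{-1}B^{-1}\eta\right)^T \otimes A^{-1}\right] vec(W_N); \\
\frac{\partial IRF_0}{\partial\rho} &= \left[\eta^T \otimes A^{-1}\right] \cdot \frac{d\,vec\left(B^{-1}\right)}{d\rho} \\
&= \left[\eta^T \otimes A^{-1}\right] \cdot \frac{d\,vec\left(B^{-1}\right)}{d\,vec(B)} \cdot \frac{d\,vec\left(I_N - \rho W_N\right)}{d\rho} \\
&= \left[\eta^T \otimes A^{-1}\right] \left[\left(B^{-1}\right)^T \otimes B^{-1}\right] vec(W_N) \\
&= \left[\left(B^{-1}\eta\right)^T \otimes A^{-1}B^{-1}\right] vec(W_N)
\end{aligned}
\]

where $\otimes$ refers to the Kronecker product, and $vec(W_N)$ is the vectorization of matrix $W_N$, which transforms $W_N$ into a column vector.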

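In period $t + h$, where $h \ge 1$,

\[
IRF_h(\gamma, \lambda, \rho) = \gamma^h (I_N - \lambda W_N)^{-(h+1)}(I_N - \rho W_N)^{-1}\eta = \gamma^h A^{-(h+1)}B^{-1}\eta \tag{30}
\]

For $h \ge 1$, the derivative of $IRF_h$ with respect to $(\gamma, \lambda, \rho)$ is

\[
\begin{aligned}
\frac{\partial IRF_h}{\partial\gamma} &= h\gamma^{h-1} A^{-(h+1)}B^{-1}\eta; \\
\frac{\partial IRF_h}{\partial\lambda} &= \gamma^h \sum_{j=1}^{h+1} \left[\left(A^{-(h+2-j)}B^{-1}\eta\right)^T \otimes A^{-j}\right] vec(W_N); \\
\frac{\partial IRF_h}{\partial\rho} &= \gamma^h \left[\left(B^{-1}\eta\right)^T \otimes A^{-(h+1)}B^{-1}\right] vec(W_N).
\end{aligned}
\]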
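The Kronecker-product expressions above can be checked against finite differences. The sketch below is illustrative only: the asymmetric 3-by-3 weight matrix and the parameter values are hypothetical, and the function names are ours. It relies on the identity $vec(MWx) = (x^T \otimes M)\,vec(W)$ with column-major vectorization, so `vec` stacks the columns of its argument.

```python
import numpy as np
from numpy.linalg import inv, matrix_power

def vec(M):
    # Column-major stacking, matching the vec operator in the text.
    return M.reshape(-1, order="F")

def irf(h, gamma, lam, rho, W, eta):
    n = W.shape[0]
    A_inv = inv(np.eye(n) - lam * W)
    B_inv = inv(np.eye(n) - rho * W)
    return gamma**h * matrix_power(A_inv, h + 1) @ B_inv @ eta

def analytic_derivs(h, gamma, lam, rho, W, eta):
    """Derivatives of IRF_h with respect to (gamma, lambda, rho), h >= 0."""
    n = W.shape[0]
    A_inv = inv(np.eye(n) - lam * W)
    B_inv = inv(np.eye(n) - rho * W)
    Ap = lambda k: matrix_power(A_inv, k)        # A^-k for k >= 1
    x = B_inv @ eta
    if h == 0:
        d_gamma = np.zeros(n)                    # IRF_0 does not depend on gamma
    else:
        d_gamma = h * gamma**(h - 1) * Ap(h + 1) @ x
    d_lam = gamma**h * sum(
        np.kron((Ap(h + 2 - j) @ x)[None, :], Ap(j)) @ vec(W)
        for j in range(1, h + 2)
    )
    d_rho = gamma**h * np.kron(x[None, :], Ap(h + 1) @ B_inv) @ vec(W)
    return d_gamma, d_lam, d_rho

def numeric_derivs(h, gamma, lam, rho, W, eta, step=1e-6):
    """Central finite differences in each of the three parameters."""
    theta = np.array([gamma, lam, rho])
    out = []
    for k in range(3):
        up, dn = theta.copy(), theta.copy()
        up[k] += step
        dn[k] -= step
        out.append((irf(h, *up, W, eta) - irf(h, *dn, W, eta)) / (2 * step))
    return out

# Hypothetical inputs, for illustration only.
W = np.array([[0.0, 0.7, 0.3],
              [0.4, 0.0, 0.6],
              [0.5, 0.5, 0.0]])
gamma, lam, rho = 0.3, 0.2, 0.1
eta = np.array([1.0, 0.0, 0.0])

max_gaps = []
for h in range(4):
    pairs = zip(analytic_derivs(h, gamma, lam, rho, W, eta),
                numeric_derivs(h, gamma, lam, rho, W, eta))
    max_gaps.append(max(np.max(np.abs(a - b)) for a, b in pairs))
print(max_gaps)   # largest analytic-vs-numeric discrepancy at each horizon
```

An asymmetric `W` is used on purpose: with a symmetric matrix, a misplaced transpose in the Kronecker terms could go unnoticed.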

D. The IRF to a population growth shock

1. The derivation of the IRF to the population growth shock

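We first need to compute the impulse response functions of a one-time shock to the explanatory variables $X_{t-1}$. From (1), we have

\[
Y_t = \gamma A^{-1} Y_{t-1} + A^{-1} X_{t-1}\beta + A^{-1}(\mu + v_t\tau_N) + A^{-1}u_t \tag{31}
\]

\[
\begin{aligned}
Y_{t+1} ={}& \gamma^2 A^{-2} Y_{t-1} + \gamma A^{-2} X_{t-1}\beta + A^{-1} X_t\beta + \left(\gamma A^{-2} + A^{-1}\right)\mu \\
& + \gamma A^{-2} v_t\tau_N + A^{-1} v_{t+1}\tau_N + \gamma A^{-2}u_t + A^{-1}u_{t+1}
\end{aligned}
\tag{32}
\]

\[
\begin{aligned}
Y_{t+2} ={}& \gamma^3 A^{-3} Y_{t-1} + \gamma^2 A^{-3} X_{t-1}\beta + \gamma A^{-2} X_t\beta + A^{-1} X_{t+1}\beta \\
& + \left(\gamma^2 A^{-3} + \gamma A^{-2} + A^{-1}\right)\mu + \gamma^2 A^{-3} v_t\tau_N + \gamma A^{-2} v_{t+1}\tau_N + A^{-1} v_{t+2}\tau_N \\
& + \gamma^2 A^{-3}u_t + \gamma A^{-2}u_{t+1} + A^{-1}u_{t+2}
\end{aligned}
\tag{33}
\]

\[
\vdots
\]

\[
\begin{aligned}
Y_{t+h} ={}& \gamma^{h+1} A^{-(h+1)} Y_{t-1} + \gamma^h A^{-(h+1)} X_{t-1}\beta + \gamma^{h-1} A^{-h} X_t\beta + \cdots + \gamma A^{-2} X_{t+h-2}\beta + A^{-1} X_{t+h-1}\beta \\
& + \left(\gamma^h A^{-(h+1)} + \gamma^{h-1} A^{-h} + \cdots + \gamma A^{-2} + A^{-1}\right)\mu \\
& + \gamma^h A^{-(h+1)} v_t\tau_N + \gamma^{h-1} A^{-h} v_{t+1}\tau_N + \cdots + \gamma A^{-2} v_{t+h-1}\tau_N + A^{-1} v_{t+h}\tau_N \\
& + \gamma^h A^{-(h+1)}u_t + \gamma^{h-1} A^{-h}u_{t+1} + \cdots + \gamma A^{-2}u_{t+h-1} + A^{-1}u_{t+h}
\end{aligned}
\tag{34}
\]

where $I_N - \lambda W_N = A$. We denote by the $N \times 1$ vector $\psi$ the shock to population growth at time $t - 1$, and $\beta_p$ is the coefficient on population growth. Then, the corresponding impulse response functions are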


\[
IRF_0 = A^{-1}\psi\beta_p \tag{35}
\]

\[
\begin{aligned}
IRF_1 &= \gamma A^{-2}\psi\beta_p \\
IRF_2 &= \gamma^2 A^{-3}\psi\beta_p \\
&\vdots \\
IRF_h &= \gamma^h A^{-(h+1)}\psi\beta_p
\end{aligned}
\tag{36}
\]

Equation (35) is the response in quarter $t$, and it is also referred to as the impulse response at horizon zero. We also apply the delta method and equation (28) to construct the 95% confidence interval for each IRF. The details of the derivatives required to construct the error bounds are given below.
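The closed form $IRF_h = \gamma^h A^{-(h+1)}\psi\beta_p$ can be cross-checked by simulating the model twice, once with and once without a one-time shock to population growth at $t-1$, and differencing the two paths. The sketch below is illustrative only: it assumes a single regressor (population growth), sets the fixed effects and disturbances to zero because they cancel in the difference, and uses a hypothetical weight matrix and hypothetical parameter values.

```python
import numpy as np

def simulate(Y_init, X_path, gamma, beta_p, A_inv):
    """Iterate Y_s = A^-1 (gamma * Y_{s-1} + X_{s-1} * beta_p); one row per period."""
    Y = [Y_init]
    for X_prev in X_path:
        Y.append(A_inv @ (gamma * Y[-1] + X_prev * beta_p))
    return np.array(Y[1:])                      # rows: Y_t, Y_{t+1}, ..., Y_{t+H}

# Hypothetical inputs, for illustration only.
rng = np.random.default_rng(0)
N, H = 3, 5
W = np.array([[0.0, 0.7, 0.3],
              [0.4, 0.0, 0.6],
              [0.5, 0.5, 0.0]])
gamma, lam, beta_p = 0.3, 0.2, 0.5
A_inv = np.linalg.inv(np.eye(N) - lam * W)      # A = I_N - lambda * W_N
psi = np.array([1.0, 0.0, 0.0])                 # shock to population growth in region 1

Y_init = rng.normal(size=N)                     # Y_{t-1}
X_path = rng.normal(size=(H + 1, N))            # X_{t-1}, X_t, ..., X_{t+H-1}
X_shocked = X_path.copy()
X_shocked[0] += psi                             # one-time shock at t-1 only

# Response = shocked path minus baseline path.
diff = (simulate(Y_init, X_shocked, gamma, beta_p, A_inv)
        - simulate(Y_init, X_path, gamma, beta_p, A_inv))

# Closed form for h = 0..H.
closed = np.array([
    gamma**h * np.linalg.matrix_power(A_inv, h + 1) @ psi * beta_p
    for h in range(H + 1)
])
print(np.max(np.abs(diff - closed)))            # discrepancy between the two routes
```

Because the model is linear, the baseline values of $Y_{t-1}$ and $X_s$ drop out of the difference, which is why the closed form depends only on $\gamma$, $A$, $\psi$ and $\beta_p$.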

2. The derivatives of the IRF to a population growth shock

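In this section, we include the detailed derivatives required by equation (27) to construct the 95% confidence interval of the impulse response functions. This section is similar to the previous section, Appendix C. We calculate the derivatives of the IRF with respect to the parameters $(\gamma, \lambda, \beta_p)$, where $A = I_N - \lambda W_N$, $\beta_p$ is the parameter in front of the population growth, and $\psi$ is an $N \times 1$ vector that denotes the shock to the population growth at horizon zero.

When $h = 0$, $IRF_0 = A^{-1}\psi\beta_p$, and the derivative of $IRF_0$ with respect to $(\gamma, \lambda, \beta_p)$ is

\[
\begin{aligned}
\frac{\partial IRF_0}{\partial\gamma} &= 0; \\
\frac{\partial IRF_0}{\partial\lambda} &= \left[(\psi\beta_p)^T \otimes I_N\right] \frac{d\,vec\left(A^{-1}\right)}{d\lambda} \\
&= \left[(\psi\beta_p)^T \otimes I_N\right] \frac{d\,vec\left(A^{-1}\right)}{d\,vec(A)} \frac{d\,vec\left(I_N - \lambda W_N\right)}{d\lambda} \\
&= \left[(\psi\beta_p)^T \otimes I_N\right] \left[-\left(A^{-1}\right)^T \otimes A^{-1}\right] \left[-vec(W_N)\right] \\
&= \left[\left(A^{-1}\psi\beta_p\right)^T \otimes A^{-1}\right] vec(W_N); \\
\frac{\partial IRF_0}{\partial\beta_p} &= A^{-1}\psi.
\end{aligned}
\]

In period $t + h$, where $h \ge 1$,

\[
IRF_h = \gamma^h A^{-(h+1)}\psi\beta_p
\]

For $h \ge 1$, the derivative of $IRF_h$ with respect to $(\gamma, \lambda, \beta_p)$ is

\[
\begin{aligned}
\frac{\partial IRF_h}{\partial\gamma} &= h\gamma^{h-1} A^{-(h+1)}\psi\beta_p; \\
\frac{\partial IRF_h}{\partial\lambda} &= \gamma^h \sum_{j=1}^{h+1} \left[\left(A^{-(h+2-j)}\psi\beta_p\right)^T \otimes A^{-j}\right] vec(W_N); \\
\frac{\partial IRF_h}{\partial\beta_p} &= \gamma^h A^{-(h+1)}\psi.
\end{aligned}
\]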

E. The IRF of the shock origin CMAs with 95% confidence interval

Figure 9: The IRF of the CMAs with 95% confidence interval
